CrUX methodology

This section documents how CrUX collects and organizes user experience data.


Eligibility

At the core of the CrUX dataset are individual user experiences, which are aggregated into page-level and origin-level distributions. This section documents user eligibility and the requirements for pages and origins to be included in the dataset. All three eligibility criteria (User, Origin, and Page) must be satisfied for an experience to be included in the page-level data available in PageSpeed Insights and the CrUX API. Experiences that meet the User and Origin criteria but not the Page criteria are included in the origin-level data available in all CrUX data sources.

Pages and origins will be automatically included or removed from the dataset if their eligibility changes over time. There is not currently a way to manually submit pages or origins for inclusion.

Publicly discoverable

A page must be publicly discoverable to be considered for inclusion in the CrUX dataset.

A page is determined to be publicly discoverable using the same indexability criteria as search engines.

A page, including the root page used for the origin-level dataset, does not meet the discoverability requirement if any of the following conditions are met:

  • The page is served with an HTTP status code other than 200 (after redirects).
  • The page is served with an HTTP X-Robots-Tag: noindex header or equivalent.
  • The document includes a <meta name="robots" content="noindex"> meta tag or equivalent.

Refer to Google Search Console for an overview of your site's indexing status.
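Taken together, the conditions above amount to a simple predicate. The following is a minimal sketch in Python; the function and its inputs are hypothetical illustrations of the criteria, not CrUX's actual implementation:

```python
def is_publicly_discoverable(status_code, headers, meta_robots):
    """Approximate the CrUX discoverability checks for a single page.

    status_code: final HTTP status code after redirects
    headers: dict of HTTP response headers
    meta_robots: content of the page's <meta name="robots"> tag, or None
    """
    # Condition 1: the page must be served with a 200 status after redirects.
    if status_code != 200:
        return False
    # Condition 2: no noindex (or the equivalent "none") X-Robots-Tag header.
    robots_header = headers.get("X-Robots-Tag", "").lower()
    if "noindex" in robots_header or "none" in robots_header:
        return False
    # Condition 3: no noindex robots meta tag (or equivalent) in the document.
    if meta_robots and (
        "noindex" in meta_robots.lower() or "none" in meta_robots.lower()
    ):
        return False
    return True
```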

Sufficiently popular

A page is determined to be sufficiently popular if it has a minimum number of visitors. An origin is determined to be sufficiently popular if it has a minimum number of visitors across all of its pages. An exact number is not disclosed, but it has been chosen to ensure that we have enough samples to be confident in the statistical distributions for included pages. The minimum number is the same for pages and origins.

Pages and origins that do not meet the popularity threshold are not included in the CrUX dataset.

Origin

An origin represents an entire website, addressable by a URL like https://www.example.com. For an origin to be included in the CrUX dataset it must meet two requirements:

  1. Publicly discoverable
  2. Sufficiently popular

You can verify that your origin is discoverable by running a Lighthouse audit and looking at the SEO category results. Your site is not discoverable if your root page fails the Page is blocked from indexing or Page has unsuccessful HTTP status code audits.

If an origin is determined to be publicly discoverable, eligible user experiences on all of that origin's pages are aggregated at the origin-level, regardless of individual page discoverability. All of these experiences count towards the origin's popularity requirement.

For querying purposes, note that all origins in the CrUX dataset are lowercase.
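Since all origins in the dataset are lowercase, normalizing candidate URLs before querying avoids empty results. A minimal sketch using Python's standard library (the helper name is my own):

```python
from urllib.parse import urlsplit

def to_crux_origin(url):
    """Reduce a full URL to a CrUX-style origin: lowercase scheme and host only."""
    parts = urlsplit(url)
    return f"{parts.scheme.lower()}://{parts.netloc.lower()}"

# The path, query string, and fragment are all dropped, so
# "HTTPS://WWW.Example.com/Products?id=1#top" becomes "https://www.example.com".
```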

Page

The requirements for a page to be included in the CrUX dataset are the same as for origins:

  1. Publicly discoverable
  2. Sufficiently popular

You can verify that a page is discoverable by running a Lighthouse audit and looking at the SEO category results. Your page is not discoverable if it fails the Page is blocked from indexing or Page has unsuccessful HTTP status code audits.

Pages commonly have additional identifiers in their URL including query string parameters like ?utm_medium=email and fragments like #main. These identifiers are stripped from the URL in the CrUX dataset so that all user experiences on the page are aggregated together. This is useful for pages that would otherwise not meet the popularity threshold if there were many disjointed URL variations for the same page. Note that in rare cases this may unexpectedly group experiences for distinct pages together; for example if parameters ?productID=101 and ?productID=102 represent different pages.
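The stripping described above can be approximated with the standard library. This is an illustration of the concept, not CrUX's exact normalization:

```python
from urllib.parse import urlsplit, urlunsplit

def normalize_page_url(url):
    """Drop the query string and fragment so URL variants aggregate together."""
    scheme, netloc, path, _query, _fragment = urlsplit(url)
    return urlunsplit((scheme, netloc, path, "", ""))
```

As the caveat above notes, this also collapses parameters like ?productID=101 that actually identify distinct pages.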

Pages in CrUX are measured based on the top-level page. Pages embedded as iframes are not reported on separately in CrUX, but they do contribute to the metrics of the top-level page. For example, if https://www.example.com/page.html embeds https://www.example.com/frame.html in an iframe, then page.html will be represented in CrUX (subject to the other eligibility criteria) but frame.html will not. And if frame.html has poor CLS, that CLS is included when measuring the CLS of page.html. Because CrUX reports the user's experience, and a user may not even be aware that part of a page is an iframe, the experience is measured at the top-level page, as the user sees it.

A website's architecture may complicate how its data is represented in CrUX. For example, single-page apps (SPAs) may use a JavaScript-based route transition scheme to move between pages, as opposed to traditional anchor-based page navigations. These transitions appear as new page views to the user, but to Chrome and the underlying platform APIs the entire experience is attributed to the initial page view. This is a limitation of the native web platform APIs on which CrUX is built; see How SPA architectures affect Core Web Vitals on web.dev for more information.

User

For a user to have their experiences aggregated in the CrUX dataset, they must meet the following criteria:

  1. Enable usage statistic reporting.
  2. Sync their browser history.
  3. Not have a Sync passphrase set.
  4. Use a supported platform.

The current supported platforms are:

  • Desktop versions of Chrome on the Windows, macOS, ChromeOS, and Linux operating systems.
  • Android versions of Chrome, including native apps using Custom Tabs and WebAPKs.

There are a few notable exceptions that do not provide data to the CrUX dataset:

  • Chrome on iOS.
  • Native Android apps using WebView.
  • Other Chromium browsers (for example Microsoft Edge).

Chrome does not publish data about the proportions of users that meet these criteria. You can learn more about the data we collect in the Chrome Privacy Whitepaper.

Accelerated Mobile Pages (AMP)

Pages built with AMP are included in the CrUX dataset like any other web page. As of the June 2020 CrUX release, pages served via the AMP Cache and/or rendered in the AMP Viewer are also captured, and attributed to the publisher's page URL.

Tools

The CrUX dataset is made available through a variety of tools maintained by Google. Each tool may access CrUX data slightly differently, resulting in varying levels of timeliness and metric support.

Tool | Frequency | Metrics | Dimensions | Historical data | Origin / Page-level
CrUX on BigQuery | Monthly 1 | All metrics | All dimensions | Since 2017 5 | Origin
CrUX Dashboard | Monthly 1 | All metrics | No country dimension | Since 2017 5 | Origin
CrUX API | 28-day average 2 | Subset of key metrics 4 | No country dimension | No | Origin & Page
CrUX History API | Weekly 3 | Subset of key metrics 4 | No country dimension | Previous 25 weeks | Origin & Page
PageSpeed Insights | 28-day average 2 | Subset of key metrics 4 | No effective connection type or country dimensions | No | Origin & Page
PageSpeed Insights API | 28-day average 2 | Subset of key metrics 4 | No effective connection type or country dimensions | No | Origin & Page
Google Search Console | 28-day average 2 | Core Web Vitals | No dimensions | Three months | Page group 6

1 Monthly data is released on the second Tuesday after each monthly collection period. Each collection period covers the last 28 days of the month.
2 28-day rolling average data is updated daily, based on the aggregated data from the previous 28 days.
3 Weekly historical data is released every Monday, containing the 25 most recent 28-day collection periods, each ending on a Saturday.
4 The Core Web Vitals metrics are available in all tools.
5 Not all metrics are available in all monthly tables; see the release notes for details.
6 Search Console groups URLs that provide similar experiences; Core Web Vitals data is shown aggregated by these page groups.

The following sections briefly summarize each tool and how the data can be used.

CrUX on BigQuery

Origin-level CrUX data is available for public querying via BigQuery. Read the guide on Using the Chrome UX Report.

CrUX on BigQuery provides a publicly accessible database of all origin-level data collected by CrUX. It is possible to query any and all origins for which data is collected, analyze any metric that CrUX supports and filter by all available dimensions. Full metric histograms are stored in the BigQuery tables allowing for visualization of performance distributions, including experimental metrics.

The data in BigQuery is updated monthly, with each month's data released on the second Tuesday after the collection period. Page-level data is not available in BigQuery tables, and percentiles are interpolated from coarse histogram data, which results in approximate values.

Use CrUX on BigQuery for analysis across any dimension: origins, countries, dates, form factor and connection type. Read more on the CrUX on BigQuery documentation page.
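As a sketch of what such a query looks like, the following computes the share of fast LCP experiences for one origin. The origin and month are placeholders, the table path follows the dataset's monthly naming convention, and the SQL is shown here only as a string (running it requires a Google Cloud project with BigQuery access):

```python
# Sample BigQuery SQL for the public CrUX dataset; monthly tables are named
# chrome-ux-report.all.YYYYMM. The origin below is a placeholder.
QUERY = """
SELECT
  SUM(IF(bin.start < 2500, bin.density, 0)) / SUM(bin.density) AS fast_lcp
FROM
  `chrome-ux-report.all.202306`,
  UNNEST(largest_contentful_paint.histogram.bin) AS bin
WHERE
  origin = 'https://www.example.com'
"""
```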

CrUX Dashboard

The CrUX Dashboard is a Looker Studio dashboard that allows you to query and render CrUX data into an interactive dashboard, as well as exporting PDF reports.

The dashboard provides visualization of all CrUX metrics in monthly trends, with data available back to 2017. Data can be split by form factor to compare mobile / tablet / desktop performance and performance goals are available to create red-amber-green visualizations. Effective connection type can be shown as a distribution.

The CrUX Dashboard does not support the country dimension, so reports present data from the global dataset only. Page-level data is not available, and percentile values are calculated from coarse histogram data, so they are approximate.

CrUX API

The CrUX API provides programmatic access to CrUX data by page or origin, and can be further filtered by form factor, effective connection type and metrics.

The API provides Web Vitals metrics at both the origin and page level, and the data is updated daily. Metric values are calculated over the previous 28 days as a rolling window. Historical data is available via the separate CrUX History API.

The CrUX API responds more quickly than the PageSpeed Insights API, but does not include the additional Lighthouse data provided by PageSpeed Insights.

Read more in the API documentation.
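A request can be sketched with only the standard library. The API key and origin below are placeholders, and the request is built but not sent:

```python
import json
from urllib import request

API_KEY = "YOUR_API_KEY"  # placeholder: create a key in the Google Cloud console
ENDPOINT = f"https://chromeuxreport.googleapis.com/v1/records:queryRecord?key={API_KEY}"

# Query origin-level data, filtered to phones and a subset of metrics.
body = {
    "origin": "https://www.example.com",
    "formFactor": "PHONE",
    "metrics": ["largest_contentful_paint", "cumulative_layout_shift"],
}

req = request.Request(
    ENDPOINT,
    data=json.dumps(body).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# response = json.load(request.urlopen(req))  # uncomment to send the request
```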

CrUX History API

The CrUX History API provides programmatic access to CrUX historical data by page or origin, and can be further filtered by form factor, effective connection type and metrics.

The API provides Web Vitals metrics at both the origin and page level, and the data is updated weekly. Values are provided for the 25 most recent weekly collection periods, each a 28-day rolling window.

Read more in the History API documentation.

PageSpeed Insights

PageSpeed Insights uses CrUX to present real-user performance data alongside performance opportunities powered by Lighthouse.

The PageSpeed Insights report presents a consolidated view of the Core Web Vitals for the given URL or origin, plus additional diagnostic metrics. Data is presented by desktop and mobile form factors and can be compared with the lab test results to give a better understanding of your page performance.

PageSpeed Insights does not provide historical data, and does not include country or effective connection type dimensions.

PageSpeed Insights API

The PageSpeed Insights API offers programmatic access to the data shown in PageSpeed Insights, including Core Web Vitals data from CrUX.

This API integrates well into existing SEO tooling and workflows, allowing CrUX data to be included in automated reports and analyses. The PageSpeed Insights API responds more slowly than the CrUX API, but includes the additional data provided by Lighthouse.

As in the web version, the PageSpeed Insights API has no historical data and is limited to the Core Web Vitals. Country and effective connection type dimensions are not included.

Search Console

Search Console shows how CrUX data influences the page experience ranking factor by URL and URL group.

Search Console presents Core Web Vitals values as aggregates of groups of similar pages. This provides a quick indication of which sections of a site are potentially impacting the page experience ranking factor.

Data is updated daily and is split by mobile and desktop form factors. A maximum sample of 20 pages per group is presented for further analysis.

Metrics

Metrics in CrUX are powered by standard web platform APIs exposed by browsers. In the BigQuery dataset in particular, this data is aggregated to origin-resolution. Site owners requiring more detailed (e.g. URL-level resolution) analysis and insight into their site performance can use the same APIs to gather detailed real user measurement (RUM) data for their own origins. Note that while all APIs are available in Chrome, other browsers may not support the full set of metrics.

Most metrics are represented as a histogram aggregation, allowing visualization of the distribution and approximation of percentile values.

First Paint

"First Paint reports the time when the browser first rendered after navigation. This excludes the default background paint, but includes non-default background paint. This is the first key moment developers care about in page load - when the browser has started to render the page."

Paint Timing API

First Contentful Paint

"First Contentful Paint reports the time when the browser first rendered any text, image (including background images), non-white canvas or SVG. This includes text with pending webfonts. This is the first time users could start consuming page content."

Paint Timing API

DOM Content Loaded

"The DOMContentLoaded event fires when the initial HTML document has been completely loaded and parsed, without waiting for stylesheets, images, and subframes to finish loading."

MDN

Largest Contentful Paint

"Largest Contentful Paint (LCP) is an important, user-centric metric for measuring perceived load speed because it marks the point in the page load timeline when the page's main content has likely loaded — a fast LCP helps reassure the user that the page is useful."

web.dev/lcp/

Onload

"The load event is fired when the page and its dependent resources have finished loading."

MDN

Cumulative Layout Shift

"Cumulative Layout Shift (CLS) is an important, user-centric metric for measuring visual stability because it helps quantify how often users experience unexpected layout shifts — a low CLS helps ensure that the page is delightful."

web.dev/cls/

First Input Delay

"First Input Delay (FID) is an important, user-centric metric for measuring load responsiveness because it quantifies the experience users feel when trying to interact with unresponsive pages—a low FID helps ensure that the page is usable."

web.dev/fid/

Interaction to Next Paint

"Interaction to Next Paint (INP) is a field metric that assesses responsiveness. INP logs the latency of all interactions throughout the entire page lifecycle. The highest value of those interactions—or close to the highest for pages with many interactions—is recorded as the page's INP. A low INP ensures that the page will be reliably responsive at all times."

web.dev/inp/

Interaction to Next Paint (INP) was added to the CrUX dataset in February 2022. This new metric captures the end-to-end latency of individual events and offers a more holistic picture of the overall responsiveness of a page throughout its lifetime.

Experimental metrics

Experimental metrics are available in the CrUX dataset via BigQuery, with some also available in the CrUX API. These metrics are likely to change regularly as they evolve based on user feedback. Check the release notes to keep up to date on the latest changes.

Time to First Byte

"Time to First Byte (TTFB) is a foundational metric for measuring connection setup time and web server responsiveness in both the lab and the field. It helps identify when a web server is too slow to respond to requests. In the case of navigation requests—that is, requests for an HTML document—it precedes every other meaningful loading performance metric."

web.dev/ttfb/

TTFB is only collected on full page loads, unlike other timers (such as LCP), which are also collected on back/forward navigations and pre-rendering. As such, the sample size of TTFB can be smaller than that of other metrics, and TTFB values should not necessarily be compared directly with them.

Interaction to Next Paint (deprecated)

Important

The Interaction to Next Paint (INP) metric is available both with and without the experimental prefix. The experimental prefix should now be considered deprecated and will be removed in August 2023. The non-prefixed schema should be used going forward.

Popularity

The popularity rank metric is a relative measure of site popularity within the CrUX dataset, measured by the total number of navigations on the origin. Rank is on a log10 scale with half steps (e.g. top 1k, top 5k, top 10k, top 50k, top 100k, top 500k, top 1M, etc.), with each rank excluding the previous (e.g. the top 5k bucket actually contains 4k origins, excluding the top 1k). The upper limit is dynamic as the dataset grows.
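The half-step buckets can be generated mechanically. A small sketch (the function name is my own, and it only illustrates the bucket scheme described above):

```python
def rank_magnitude(position):
    """Return the coarse CrUX rank bucket containing a given popularity position.

    Buckets follow a log10 scale with half steps: 1k, 5k, 10k, 50k, 100k, ...
    """
    bucket = 1000
    while bucket < position:
        # Alternate x5 and x2 to step 1k -> 5k -> 10k -> 50k -> ...
        bucket *= 5 if str(bucket).startswith("1") else 2
    return bucket
```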

Popularity is provided as a guide for broad analysis, e.g. to determine performance by country for the top 1,000 origins.

Notification Permissions

"The Notifications API allows web pages to control the display of system notifications to the end user. These are outside the top-level browsing context viewport, so therefore can be displayed even when the user has switched tabs or moved to a different app. The API is designed to be compatible with existing notification systems, across different platforms."

MDN

For websites that request permission to show users notifications, this metric represents the relative frequency of users' responses to the prompts: accept, deny, ignore, or dismiss.

Dimensions

The CrUX dataset includes dimension data to allow deeper interrogation of the data. Dimensions identify a specific group of data that a record is being aggregated against, e.g. a form factor of "phone" indicates that the record contains information about loads that took place on a mobile device.

Data may not be available for all dimensions, based on eligibility criteria.

Effective Connection Type

Effective Connection Type (ECT) is a web platform API to broadly categorize visitor connection speeds. This dimension in the CrUX dataset allows you to:

  • See a breakdown of connection speeds of real visitors
  • Filter performance data by connection speed

Note that although the specification defines four effective connection types, the majority of visits will likely be on connections faster than 3G and will therefore be classified as 4G:

ECT | Minimum RTT | Maximum downlink | Explanation
offline | N/A | N/A | The network is offline; only cached files can be served.
slow-2g | 2000ms | 50 Kbps | The network is suited only for small transfers, such as text-only pages.
2g | 1400ms | 70 Kbps | The network is suited for transfers of small images.
3g | 270ms | 700 Kbps | The network is suited for transfers of large assets such as high-resolution images, audio, and SD video.
4g | 0ms | N/A | The network is suited for HD video, real-time video, etc.
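Reading the table as a classifier on round-trip time gives a minimal sketch. This is a simplification: the browser's actual estimate also weighs downlink bandwidth and observed network performance:

```python
def effective_connection_type(rtt_ms):
    """Classify a connection by round-trip time, per the minimum RTT thresholds above."""
    if rtt_ms >= 2000:
        return "slow-2g"
    if rtt_ms >= 1400:
        return "2g"
    if rtt_ms >= 270:
        return "3g"
    return "4g"
```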

Form Factor

The form factor dimension allows you to query against three separate form factors:

  • Phone
  • Tablet
  • Desktop

Form factor is inferred from the device User-Agent string.

Country

The country dimension was added to CrUX in 2018. The term "country" is used loosely, as some geographic areas are politically disputed. Values in the country dimension are inferred from users' IP addresses and represented as two-letter country codes as defined by ISO 3166-1.

Country-level datasets are provided in addition to the global dataset, with the standard eligibility requirements applied at a country level. A table is provided for each country, as well as summary tables which include the country code as a column.

Optional dimensions

As of the May 2022 release, the CrUX dataset supports optional dimensions. Previously, a form factor and effective connection type (ECT) combination had to independently meet the sufficiently-popular criterion or it would be excluded from the page or origin record. With this feature, experiences on different ECTs can be combined under their common form factor, with the corresponding ECT value set to NULL. Experiences on different form factors may also be combined, in which case both the ECT and form factor values are NULL.

Previously, form factor and effective connection type were required columns in our BigQuery tables. This meant that when we did not have sufficient coverage to express the histogram densities in specific rows (e.g. form factor = phone, effective connection type = 2G), we dropped the entire origin from the dataset. With optional dimensions, the form factor and effective connection type columns are now optional (NULLABLE), so we are able to publish overall histogram densities in such cases: the effective connection type value may be set to NULL, indicating "all effective connection types", or both effective connection type and form factor may be set to NULL, indicating "all effective connection types" and "all form factors".
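The effect can be illustrated by summing the histogram densities of a form factor's ECT sub-rows into a single row whose ECT is NULL. This is a toy sketch, not the actual pipeline:

```python
def combine_ect_rows(rows):
    """Merge histogram rows that differ only by effective connection type.

    rows: dicts like {"ect": "4G", "densities": [...]} for one form factor.
    Returns one row with ect=None ("all effective connection types") whose
    densities are the element-wise sums of the input rows.
    """
    combined = [0.0] * len(rows[0]["densities"])
    for row in rows:
        for i, density in enumerate(row["densities"]):
            combined[i] += density
    return {"ect": None, "densities": combined}
```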

Data quality

Data in CrUX undergoes a small amount of processing to ensure that it is statistically accurate, well structured and easy to query.

Filtering

The CrUX dataset is filtered to ensure that the presented data is statistically valid. This may exclude entire pages or origins from appearing in the dataset.

In addition to the eligibility criteria applied to origins and pages, further filtering is applied for segments within the data:

Origins or pages having more than 20% of their total traffic excluded due to ineligible combinations of dimensions are excluded entirely from the dataset.

Because the global-level dataset encompasses user experiences from all countries, combinations of dimensions that do not meet the popularity criteria at the country level may still be included at the global level, provided that there is sufficient popularity.

Fuzzing

A small amount of randomness is applied to the dataset to prevent reverse-engineering of sensitive data, such as total traffic volumes. This does not affect the accuracy of aggregate statistics.

Precision

Most metric values within the CrUX dataset are represented as histograms of values and bin sizes, where each bin's value is the fraction of all included experiences that fall within it, and the fractions sum to 1. Bin values are floating point numbers between 0.0001 and 1.0.

Histogram bin widths are normalized to simplify querying and visualizing the data. This means that larger bins may be split into smaller bins, which equally share the original density in order to maintain consistent bin widths.
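Approximating a percentile from such bins means walking the cumulative density and interpolating within the bin that crosses the target. This is a simplified sketch of the idea, not the exact method the tools use:

```python
def estimate_percentile(bins, target):
    """Estimate a percentile from coarse histogram bins.

    bins: list of (start, end, density) tuples with densities summing to 1.
    target: percentile as a fraction, e.g. 0.75 for the 75th percentile.
    """
    cumulative = 0.0
    for start, end, density in bins:
        if density > 0 and cumulative + density >= target:
            # Linearly interpolate within the bin that crosses the target.
            fraction = (target - cumulative) / density
            return start + fraction * (end - start)
        cumulative += density
    return bins[-1][1]  # target beyond the histogram: return the last edge

# Example: a coarse LCP-style histogram in milliseconds.
bins = [(0, 1000, 0.4), (1000, 2500, 0.4), (2500, 4000, 0.2)]
p75 = estimate_percentile(bins, 0.75)  # falls within the second bin
```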

Feedback and support

We would love to hear your feedback, questions, and suggestions to help us improve CrUX. Join the conversation on our public Google Group. We also tweet at @ChromeUXReport with updates.

There are a number of channels to receive support, depending on the type of support required:

  • chrome-ux-report on Stack Overflow: for questions about particular queries.
  • CrUX discussion on Google Groups: for general questions about the dataset.
  • HTTPArchive discussion forum: to share observations about the data.
  • GCP support: for formal BigQuery support.

License

CrUX datasets by Google are licensed under a Creative Commons Attribution 4.0 International License.

