How to Read a RUM Histogram

Reading a histogram

Performance is a distribution, not a single number. A histogram shows how your users' experiences are distributed: each bar counts the users whose experience fell within a given range. Looking at your RUM data as a histogram lets you think about the shape of performance and how you want to influence it. A faster site shifts the shape left; a slower site shifts it right.
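As a rough illustration of how a histogram is built from raw RUM samples, the sketch below groups page load times into fixed-width buckets and counts how many fall into each. The sample values, the 500 ms bucket width, and the function name build_histogram are assumptions made for this example; they do not come from any particular RUM product.

```python
from collections import Counter


def build_histogram(samples_ms, bucket_ms=500):
    """Count samples per fixed-width bucket, keyed by the bucket's lower bound in ms."""
    counts = Counter((int(s) // bucket_ms) * bucket_ms for s in samples_ms)
    return dict(sorted(counts.items()))


# Hypothetical page load times in milliseconds (not real measurements).
load_times_ms = [820, 950, 1100, 1240, 1300, 1450, 1700, 2100, 2600, 4800]

histogram = build_histogram(load_times_ms)

# Print a simple text rendering: one '#' per sample in the bucket.
for lower, count in histogram.items():
    print(f"{lower:>5} ms+ | {'#' * count}")
```

If the site got faster, more samples would land in the lower buckets and the bars would pile up toward the left.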

Understanding Percentiles

Percentiles are key when you want to communicate a single number that represents a population. A percentile, from 0 to 100, is the value at or below which that percentage of your population falls. You don't need to be a statistician to use them. Just think in terms like this:

50th percentile (median)

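The 50th percentile, aka the median, is arguably the easiest to understand and communicate: half of your users had a faster experience and half had a slower one. You can even refer to it loosely as an average if you want, as long as it isn't confused with the arithmetic mean.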

75th percentile

Some popular metrics, such as Core Web Vitals, are reported at the 75th percentile.

95th percentile

If your site is already pretty fast for most users, you can focus on making it fast for your long tail. Five percent of your users may not sound like much, but if your site gets 10M visitors a month, that means 500,000 of those people are having a really poor experience.
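The percentiles described above, and the long-tail arithmetic, can be sketched in a few lines. This example uses the nearest-rank definition of a percentile, which is an assumption: RUM tools may interpolate or use approximations, so their numbers can differ slightly. The sample values and the helper names percentile and visitors_beyond are hypothetical.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))  # 1-based rank into the sorted list
    return ordered[rank - 1]


def visitors_beyond(monthly_visitors, p):
    """Number of visitors whose experience falls beyond the p-th percentile."""
    return monthly_visitors * (100 - p) // 100


# Hypothetical page load times in milliseconds (not real measurements).
load_times_ms = [820, 950, 1100, 1240, 1300, 1450, 1700, 2100, 2600, 4800]

p50 = percentile(load_times_ms, 50)
p75 = percentile(load_times_ms, 75)
p95 = percentile(load_times_ms, 95)

# The long-tail example from the text: 5% of 10M monthly visitors.
slow_tail = visitors_beyond(10_000_000, 95)
print(p50, p75, p95, f"{slow_tail:,}")
```

Note how far the 95th percentile sits from the median in this small sample: a single slow outlier defines the tail, which is why the long tail deserves its own attention.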

Exploring real-world data

Now that you understand what a histogram represents, here are a few common things to look for when trying to understand a population of users.

"There's no way our site is this fast/slow"

It's very common for someone looking at RUM data for the first time to feel a little skeptical. This is especially true if you are coming from a synthetic monitoring background, where you control all of the variables that may impact a user's experience.

Faster pages

Drivers

  • Cached navigation (full browser cache or back/forward navigation)
  • Refresh of the page
  • Text-only version of the page using reader mode or a similar service
  • Other unknown/unfiltered bot traffic
  • Top-of-the-line hardware

Slower pages

Drivers

  • First time visits (empty browser cache)
  • Remote location or under-serviced country or region
  • Network saturation in densely populated areas
  • Other poor network conditions ("The Internet")
  • Airplane wifi
  • Lower-end, underpowered hardware

A lumpy curve to the shape of your histogram

When looking at RUM data, it's not uncommon to run into a distribution that may look something like this:

This is known as a multi-modal distribution. More simply put, the histogram contains more than one group or population, each with different performance characteristics.
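One way to check whether a lumpy histogram hides several populations is to split the samples by a dimension and compare each group's median. The sketch below does this for a browser dimension. The record shape (a dict with "browser" and "load_ms" keys), the values, and the function name median_by_cohort are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median


def median_by_cohort(samples, key):
    """Group samples by the given dimension and return the median load time per group."""
    groups = defaultdict(list)
    for sample in samples:
        groups[sample[key]].append(sample["load_ms"])
    return {cohort: median(values) for cohort, values in groups.items()}


# Hypothetical RUM records (not real measurements).
samples = [
    {"browser": "Firefox", "load_ms": 600},
    {"browser": "Firefox", "load_ms": 700},
    {"browser": "Firefox", "load_ms": 800},
    {"browser": "Chrome", "load_ms": 2000},
    {"browser": "Chrome", "load_ms": 2400},
    {"browser": "Chrome", "load_ms": 2800},
]

medians = median_by_cohort(samples, "browser")
print(medians)
```

Widely separated medians suggest that each group forms its own peak in the combined histogram. The same function works for any other dimension present in the records.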

When we apply a browser-level filter, we can see that the modality is driven by a high number of Firefox users, who appear to be having a much faster experience. Other common cohorts to explore:

Sparse data or data with little variation

Sometimes histograms can look a bit sparse or incomplete.

This may mean the following:

  • You need to look at a larger time window
  • The filter you applied is too restrictive/not representative of your population
  • You aren't sampling enough
  • The data just isn't interesting or relevant
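A quick way to judge whether a histogram is sparse is to count the samples and see how many buckets between the fastest and slowest sample are empty. The sketch below is one possible check; the 500 ms bucket width, the sample values, and the function name sparsity_report are assumptions, and what counts as "too sparse" depends on your own traffic.

```python
def sparsity_report(samples_ms, bucket_ms=500):
    """Summarize how thinly the samples are spread across fixed-width buckets."""
    buckets = {int(s) // bucket_ms for s in samples_ms}
    if not buckets:
        return {"samples": 0, "buckets_spanned": 0, "empty_buckets": 0}
    # Number of buckets between the fastest and slowest sample, inclusive.
    spanned = max(buckets) - min(buckets) + 1
    return {
        "samples": len(samples_ms),
        "buckets_spanned": spanned,
        "empty_buckets": spanned - len(buckets),
    }


# Hypothetical load times in milliseconds from a narrow filter (not real measurements).
report = sparsity_report([300, 450, 2100, 5200, 5300])
print(report)
```

A report with few samples and mostly empty buckets points toward widening the time window, loosening the filter, or raising the sample rate.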

Other times, the data may look less variable than you expected.

In most cases, this just means the data isn't influenced by as many factors. This can be true of metrics like CLS, DNS Lookup, and other page construction metrics like sizes and request counts.

📘

Exploring your own data

John Rauser delivered one of our all-time favorite talks, 'Look at your data', at Velocity Conference in 2011. It remains one of the best explainers on the subject of distributed populations applied to web performance/operations monitoring.

Histogram correlations

Another place you'll see histograms popping up is in correlation charts. One of the unique things we focus on with RUM is the impact of performance on business outcomes. This can be modeled by looking at the distribution for a specific metric and correlating it with a behavior-based metric such as bounce rate, conversion rate, etc.

In the 'Users' dashboard, you can see the impact of metrics like start render on the bounce rate or conversion rate (if defined).

Correlation charts allow you to look at the impact of a specific variable on a metric by holding the other variables constant. What we take from this, and other examples you may create in your custom dashboards, is that there is a strong correlation between performance (start render) and user behavior (bounce rate).
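The idea behind a correlation chart can be sketched by bucketing sessions on a performance metric and computing a behavior metric for each bucket. The example below groups hypothetical sessions by start render time and calculates the bounce rate per bucket. The session data, the 1000 ms bucket width, and the function name bounce_rate_by_bucket are assumptions; this simple version also does not control for other variables, so it illustrates the shape of the analysis rather than how any specific dashboard computes it.

```python
from collections import defaultdict


def bounce_rate_by_bucket(sessions, bucket_ms=1000):
    """Return the bounce rate (0.0 to 1.0) per start render bucket, keyed by lower bound in ms."""
    totals = defaultdict(int)
    bounces = defaultdict(int)
    for start_render_ms, bounced in sessions:
        bucket = (start_render_ms // bucket_ms) * bucket_ms
        totals[bucket] += 1
        bounces[bucket] += int(bounced)
    return {bucket: bounces[bucket] / totals[bucket] for bucket in sorted(totals)}


# Hypothetical sessions as (start render in ms, bounced?) pairs (not real measurements).
sessions = [
    (400, False), (700, False), (900, True), (950, False),
    (1200, False), (1500, True), (1800, False), (1900, True),
    (2300, True), (2600, True), (2900, False), (3500, True),
]

rates = bounce_rate_by_bucket(sessions)
for bucket, rate in rates.items():
    print(f"{bucket:>5} ms+ | bounce rate {rate:.0%}")
```

In this made-up data the bounce rate climbs as start render gets slower, which is the pattern a correlation chart makes visible.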