There are a number of reasons why your Lighthouse scores in SpeedCurve might differ from the scores you see in other tools, such as PageSpeed Insights. Here are a few common reasons, along with our recommendation not to focus too much on getting your scores to "match".
Different Lighthouse versions
PageSpeed Insights sometimes runs the same version of Lighthouse that we do, but from time to time the two may be out of sync. (Note that we typically run the latest release. We are currently extending our testing of Lighthouse v6 due to the many changes included in that release.)
TTI and different test environments
The performance score is strongly influenced by TTI, which can vary quite a bit depending on the test environment. Because Lighthouse v6 changed the metrics and weightings used in the performance score, you also can't directly compare v6 scores with scores from older versions. The Lighthouse team has written some great background on what can cause variability in your scores.
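To see why a single metric like TTI can move the overall score so much, it helps to know that Lighthouse computes the performance score as a weighted average of per-metric scores. The weights below are purely illustrative (the real weights differ between Lighthouse versions), but the sketch shows how a swing in TTI alone shifts the total:

```python
# Illustrative sketch of a weighted performance score.
# The weights here are hypothetical, NOT Lighthouse's actual values,
# which change between Lighthouse versions.

def performance_score(metric_scores, weights):
    """Weighted average of per-metric scores (each 0-100)."""
    total = sum(weights.values())
    return sum(metric_scores[m] * w for m, w in weights.items()) / total

weights = {"FCP": 0.2, "Speed Index": 0.15, "TTI": 0.35, "TBT": 0.3}

fast_tti = {"FCP": 90, "Speed Index": 85, "TTI": 80, "TBT": 75}
slow_tti = dict(fast_tti, TTI=40)  # same page, but TTI score drops 40 points

print(performance_score(fast_tti, weights))  # 81.25
print(performance_score(slow_tti, weights))  # 67.25
```

With TTI weighted this heavily, a 40-point drop in the TTI score alone costs 14 points overall, which is why two environments with different TTI measurements can produce noticeably different scores for the same page.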
At the bottom of the Lighthouse report, network and CPU throttling are listed as "Provided by environment". This is because we apply the throttling at the OS level, which is more accurate than Chrome's built-in throttling. We always use a 3G network speed, and we don't apply any CPU throttling.
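If you run Lighthouse yourself and want it to behave the same way (i.e. trust throttling applied outside of Chrome rather than simulate its own), the equivalent setting in a Lighthouse config is `throttlingMethod: "provided"`. A minimal config fragment might look like this:

```json
{
  "extends": "lighthouse:default",
  "settings": {
    "throttlingMethod": "provided"
  }
}
```

With this setting, Lighthouse reports throttling as "Provided by environment", just as it does in SpeedCurve's reports.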
Separate page loads
The Lighthouse report doesn't reuse the page load that SpeedCurve/WebPageTest performs. It's a separate page load done at the end of the SpeedCurve test, so the metric values will differ. You can't directly compare the metrics you see in the SpeedCurve UI, such as Time to Interactive, with the metrics in the Lighthouse report. Depending on the nature of your page and the network throttling in your test settings, there can be some variation between page loads.
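Because each page load is independent, some spread between runs is expected even with identical settings. One quick way to gauge how much a metric varies is to compare the range of several runs against the median (the values below are hypothetical):

```python
from statistics import median

# Hypothetical TTI values (ms) from several separate page loads of the same URL.
tti_runs = [5120, 4870, 5480, 5010, 5300]

mid = median(tti_runs)
spread = (max(tti_runs) - min(tti_runs)) / mid * 100  # range as % of the median

print(f"median TTI: {mid} ms, spread: {spread:.0f}% of median")
```

A spread of around 10% between otherwise identical runs is not unusual, which is another reason two reports generated from separate page loads will rarely show the exact same numbers.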
Our recommendation: Don't overly focus on wondering why your metrics don't "match"
We've tried to make it easy to compare the recommendations and metrics from these different sources, but we recommend that you don't get too caught up in why your metrics don't "match". The idea is to have them all in one place so you can compare them and decide which to focus on.
You shouldn't really consider any of these metrics as "reality" – that's what RUM is for. Synthetic testing is more about establishing a baseline in a clean and stable environment, and then improving those metrics by X% over time.