
Accuracy Assessment

Open wrynearson opened this issue 2 years ago • 3 comments

Background

Accuracy will most likely be important to the success of this tool, internally and externally. We don't want a tool that's wildly inaccurate, either in the recommended meeting location or in the amount of GHGs estimated to get there.

However, the degree of acceptable inaccuracy is not known. How far from the perfect meeting location is acceptable? What percentage off from the true emissions are acceptable? What is our baseline for emissions accuracy?

To at least ensure that the tool is in the same "ballpark", we can compare our results with other publicly-accessible flight emissions estimates.

Caveats

  1. Our tool assumes direct flights (probably the largest source of inaccuracy) because we don't currently have access to actual flight routes.
  2. We don't have access to the actual aircraft type being flown, so we're making some generalizations.
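To make the direct-flight assumption concrete, here is a minimal sketch of the kind of estimate it implies: great-circle distance between airports multiplied by a flat per-passenger-km emissions factor. The factor and the airport coordinates below are illustrative assumptions for this sketch, not the tool's actual model (real models vary by aircraft type, load factor, and distance band).

```python
from math import radians, sin, cos, asin, sqrt

# Illustrative emissions factor (kg CO2e per passenger-km); an assumption
# for this sketch only, not the value the tool actually uses.
KG_CO2E_PER_PAX_KM = 0.15

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

def direct_flight_emissions_kg(origin, dest):
    """One-way emissions assuming a single direct, great-circle flight."""
    return haversine_km(*origin, *dest) * KG_CO2E_PER_PAX_KM

# Approximate coordinates for GVA and IAD
gva = (46.24, 6.11)
iad = (38.95, -77.46)
print(round(direct_flight_emissions_kg(gva, iad)))
```

Any real itinerary with connections flies farther than the great-circle path and adds extra takeoff/landing cycles, so this style of estimate systematically undercounts distance for connecting routes.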

Test

I'll test with 10 random DS members' locations, all flying one way to IAD (Washington, D.C.). I'll compare with Google Flights and the results from @nerik's Observable Notebook.

I'll search a 1-week window and try to include the itinerary with the fewest transfers (direct if possible), choosing the lowest-CO₂e option among those with the fewest connections. The results from the tool were halved to represent one-way flights. (Is it correct that the tool currently shows round-trip impacts, @nerik?)

Legend: `-` = no direct flight found in Observable; `/` = airport not listed in Observable.

| Team Member | Home Airport | Co₂ordinate (kg CO₂) | Google Flights (kg CO₂e) | Observable (kg CO₂e) | Co₂ordinate vs Google (100% = equal) |
| --- | --- | --- | --- | --- | --- |
| Bob | GVA | 990 | 530 | 645 | 187% |
| Jane | CDG (RNS didn't appear in Observable) | 890 | 319 | 481 | 279% |
| Muhammed | ICN | 1670 | 965 | 1066 | 173% |
| Doug | YVR (YLW didn't appear in Observable) | 500 | 209 | 228 | 239% |
| Katrina | EDI | 840 | 504 | 489 | 167% |
| Paul | LIM | 870 | 387 | - | 225% |
| Sandra | SMF | 570 | 275 | - | 207% |
| Chen | BEY | 1405 | 552 | - | 255% |
| Pablo | TUS | 465 | 196 | / | 237% |
| Phil | GOI | 1990 | 795 | / | 250% |
| **TOTAL** | | 10190 | 4732 | | 215% |
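The per-member ratios and the overall 215% figure can be reproduced directly from the table (the "average" here is the ratio of totals, not the mean of per-member ratios):

```python
# Per-member one-way estimates from the table above:
# airport -> (Co2ordinate kg CO2, Google Flights kg CO2e)
data = {
    "GVA": (990, 530), "CDG": (890, 319), "ICN": (1670, 965),
    "YVR": (500, 209), "EDI": (840, 504), "LIM": (870, 387),
    "SMF": (570, 275), "BEY": (1405, 552), "TUS": (465, 196),
    "GOI": (1990, 795),
}

# Ratio of tool estimate to Google estimate per member
ratios = {airport: tool / google for airport, (tool, google) in data.items()}

# Overall ratio computed on totals (this is the 215% row)
total_ratio = sum(t for t, _ in data.values()) / sum(g for _, g in data.values())

print(min(ratios.values()), max(ratios.values()), total_ratio)
```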

Next steps

Co₂ordinate estimates are 1.6-2.8x those of Google, averaging 2.15x. It'd be worth looking into:

  1. Why the estimates differ.
  2. Whether the range of discrepancy (1.6-2.8x in this test) impacts the recommended meeting location (since some team members' estimated impacts diverge more from the Google Flights results than others').
  3. Whether 1 and 2 are acceptable to users.
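Point 2 above matters because a *uniform* bias (every estimate off by the same factor) can never change which location minimizes total emissions, while *per-member* biases can. A toy sketch with made-up numbers (the airports and values below are hypothetical, not from the test data):

```python
# Sketch with hypothetical numbers: uniform vs per-member bias.
def best_location(emissions_by_location):
    """Pick the candidate meeting airport with the lowest total emissions."""
    return min(emissions_by_location, key=lambda loc: sum(emissions_by_location[loc]))

# kg CO2e per team member for two hypothetical candidate airports
est = {"IAD": [990, 890, 1670], "GVA": [10, 1100, 1900]}

# Uniform 2.15x bias: every total scales equally, so the pick is unchanged.
uniform = {loc: [e * 2.15 for e in vals] for loc, vals in est.items()}
assert best_location(uniform) == best_location(est)

# Per-member biases in the observed 1.6-2.8x range can flip the pick.
skewed = {
    "IAD": [990 * 2.8, 890 * 1.6, 1670 * 1.6],
    "GVA": [10 * 1.6, 1100 * 2.8, 1900 * 2.8],
}
print(best_location(est), best_location(skewed))
```

So the spread of the discrepancy across members, not its overall magnitude, is what could change the recommendation.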

cc @kamicut @nerik @wildintellect @LanesGood

wrynearson avatar Apr 14 '23 07:04 wrynearson

This feels like a Data Science and Quality Assurance task. Perhaps bring in @kathrynberger and the Data Team to work on the data sourcing, algorithm, and coming up with tests to verify our results.

wildintellect avatar Apr 14 '23 15:04 wildintellect

@wrynearson Hey thanks for that, super insightful 👍

is this correct that the tool currently shows round-trip impacts

Yes I actually fixed that slight omission back in January 🙄

  • I am a bit surprised by the gap between the Observable notebook and Google Flights. One would assume that Google Flights uses Google's Travel Impact Model API. Have you picked the lowest or highest value among flights for the same route? It might also be a function of the time of year.
  • Personally, I'm not too bothered by Meet & Greta estimates being 1.6-2.8x those of Google, for the many reasons we've discussed on Slack and elsewhere. However,
  • The range of discrepancy (1.6-2.8x in this test) is more concerning, IMHO.

nerik avatar Apr 18 '23 14:04 nerik

Thanks @nerik. I chose the lowest-emission option among those with the fewest connections. E.g., if there were 5 options with 1 connection, I picked the lowest-emission of the five. I just picked a random day, though, so is it possible Observable is picking up on lower-emission flights from a day I didn't check? That seems unlikely.

Agreed that if we're relatively consistently higher than Google, that's OK – we can defend that Google's model is less holistic. But we'll need to test the range of discrepancy.

@kathrynberger flagged interest in this project, including around more in-depth accuracy assessments. I don't think it makes sense to do that this quarter given we only have 1 sprint of labs work for this, but we can keep that offer open for another time. As long as we're not recommending the wrong location by a wide margin, I think we're OK.

wrynearson avatar Apr 18 '23 15:04 wrynearson