the Creative Commons Attribution 4.0 License.
Improving reanalysis weather for contrail validation by incorporating satellite observations
Abstract. Aviation-induced condensation trails (contrails) contribute significantly to anthropogenic radiative forcing. While navigational contrail avoidance has been proposed as a strategy to mitigate this climate impact, the operational viability of such maneuvers relies on the ability to verify their efficacy. Current verification methodologies often employ contrail models (such as CoCiP) driven by reanalysis weather data; however, these assessments are limited by the variable fidelity of the underlying meteorological datasets. In this work, we address this uncertainty by leveraging satellite observations to refine reanalysis estimates for specific contrail events. We demonstrate that this approach significantly improves the agreement between reanalysis data and in-situ measurements obtained from the IAGOS program, thereby offering a more robust framework for evaluating avoidance strategies.
Competing interests: As denoted by their affiliations, some authors are employed by Google LLC. Google is a technology company that sells computing and machine learning services as part of its business.
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.
Status: open (until 15 Jun 2026)
- AC1: 'Minor data bug in "Improving reanalysis weather for contrail validation by incorporating satellite observations"', Scott Geraedts, 24 Apr 2026
- AC2: 'Comment on jecats-2026-6', Scott Geraedts, 24 Apr 2026
We just discovered a minor data bug in the paper that we thought the community should be made aware of. In a few places when joining 2019 data with 2024 data, in around 15% of cases we accidentally treated a measurement from 2019 and a measurement from 2024 as the same measurement. Fixing this changes the numbers slightly, in a way that we do not feel changes the conclusions of the paper. The performance of S_hybrid is overall slightly improved. The specific changes are as follows:
- Fig 3: the TPR of all the points improves by around 1 percentage point.
- Fig 5a: S_hybrid is slightly improved, moving the dashed line slightly to the right. The "coarse" point is dramatically improved, to the point that it is no longer worse than S_{ensemble mean}, and so we removed it.
- Lines 176-178: The following text is removed because it is no longer true: "To test this we downsampled the nominal weather to the same resolution as the ensemble weather, which we call S_nominal(coarse). We find this performs worse than the ensemble weather, as expected." We still believe that the nominal weather is outperforming the ensemble weather because it is running at higher resolution. We thought (incorrectly) that we had proved this point by downsampling the nominal weather and showing that performance dropped below S_{ensemble mean}. We now haven't proved this, but we still believe that running weather at higher resolution (even if it is downsampled at the end) leads to improved performance.
- Table 2: the value shown is the benefit of combining observations with ensemble weather as opposed to just using ensemble weather. For ERA5, the value improves from 0.018 ± 0.007 to 0.021 ± 0.006.
- Fig 6: S_hybrid is slightly improved, moving the dashed lines slightly to the right.
- Fig 7: (a) all points increase TPR by ~1 percentage point; (b) S_hybrid is slightly improved, moving the dashed lines slightly to the right. Add a new "More Distance" case.
- Table 4: Add a new "More Distance" case.

These corrections will be made in future revisions of the paper.

Citation: https://doi.org/10.5194/jecats-2026-6-AC2
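The comment does not say how the 2019 and 2024 measurements came to be merged. As an illustration only, the sketch below shows one hypothetical way such a collision can arise: a join key that drops the year. All record contents, field layouts, and function names here are invented for the example and are not taken from the paper's pipeline.

```python
from datetime import datetime

# Hypothetical records: (aircraft_id, timestamp, measured_value).
# These values are made up purely to illustrate the failure mode.
records_2019 = [("AC1", datetime(2019, 6, 1, 12, 0), 0.95)]
records_2024 = [("AC1", datetime(2024, 6, 1, 12, 0), 1.10)]


def key_without_year(record):
    """Join key that ignores the year, so different years can collide."""
    aircraft, ts, _ = record
    return (aircraft, ts.month, ts.day, ts.hour, ts.minute)


def key_with_year(record):
    """Join key that uses the full timestamp, so years stay distinct."""
    aircraft, ts, _ = record
    return (aircraft, ts)


def count_collisions(left, right, key):
    """Count records in `right` whose key also appears in `left`."""
    left_keys = {key(r) for r in left}
    return sum(1 for r in right if key(r) in left_keys)


collisions_bad = count_collisions(records_2019, records_2024, key_without_year)
collisions_good = count_collisions(records_2019, records_2024, key_with_year)
print(collisions_bad, collisions_good)
```

With the year-free key the two measurements are treated as one; with the full timestamp they remain separate. A check of this kind (counting cross-year key matches, which should be zero) is one inexpensive guard when combining datasets from different periods.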
RC1: 'Comment on jecats-2026-6', Anonymous Referee #1, 30 Apr 2026
The paper introduces a method to combine satellite-based contrail observations with weather predictions and in-situ measurements for assessing contrail-related weather predictions. The paper demonstrates that this approach improves the agreement between reanalysis data and in-situ measurements and, hence, improves the framework for evaluating avoidance strategies.
The paper is rather straightforward and can be published soon, after clarifying or correcting some text issues, including the technical corrections identified by the authors themselves in their comment.
The only general criticism I would like the authors to address is the following. The method described requires measurements and reanalysis data which are not available quickly after the event considered. What can be done to reduce the time interval between the event to be assessed and the completion of the data analysis? Ideally, a pilot wants to know immediately after the flight how well contrail formation was avoided.
Line 26 -> and so cannot see every flight
Fig 1: the yellow region is hard to see. Also, how can I see that ‘Ensemble 0’ matches the observed contrail better?
Line 66: “is the prior”: a noun seems to be missing after “prior”.
Line 101: Please explain the ‘histogram matching’ method.
Line 75: I cannot believe that IFS relies only on ensembles which use the ‘pressure-level’ data at the rather coarse resolution of 25 hPa.
The two references Driver (2025a and b) actually refer to the same paper.
Citation: https://doi.org/10.5194/jecats-2026-6-RC1
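For readers unfamiliar with the term raised in the comment on Line 101: "histogram matching" commonly refers to quantile mapping, where each value of one sample is replaced by the value at the same empirical quantile of a reference sample. The sketch below is a generic, minimal version of that idea under this assumption. It is not necessarily what the paper does, and the variable names and numbers are hypothetical.

```python
from bisect import bisect_right


def histogram_match(source, reference):
    """Map each source value to the reference value at the same empirical quantile."""
    src_sorted = sorted(source)
    ref_sorted = sorted(reference)
    n_src, n_ref = len(src_sorted), len(ref_sorted)
    matched = []
    for x in source:
        # Empirical CDF of x within the source sample, in (0, 1].
        q = bisect_right(src_sorted, x) / n_src
        # Position of that quantile in the sorted reference sample.
        idx = min(n_ref - 1, max(0, int(round(q * n_ref)) - 1))
        matched.append(ref_sorted[idx])
    return matched


# Hypothetical humidity-like values, invented for illustration.
model_values = [0.60, 0.80, 0.90, 1.00]
observed_values = [0.70, 0.95, 1.05, 1.20]
matched_values = histogram_match(model_values, observed_values)
print(matched_values)
```

After matching, the model values keep their rank order but take on the distribution of the reference sample, which is the property usually wanted when correcting a systematic bias in a modelled field.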
RC2: 'Comment on jecats-2026-6', Anonymous Referee #2, 05 Jun 2026

The paper combines satellite contrail observations with weather ensembles and IAGOS in-situ measurements to assess contrail-formation predictions, and shows the hybrid score agrees better with the in-situ data than weather alone. The approach is sound and the paper is clearly written. I think it can be accepted after a few clarifications.

I also think the broader direction is the interesting part: using contrail detections to inform the weather side, rather than only using weather to predict contrails. That seems worth pursuing well beyond this paper.

Most of my comments are about how in-situ measurements, flights and satellite detections are matched.

First, something I could not reconcile. The chosen distance threshold is 50 km in Eq. 2, but the "Base case" row of Table 4 says 15 km (and 50 km then appears in the separate "Distance" row). These can't both be the configuration used everywhere. Since this threshold defines the neighbourhood that produces the observed/not-observed labels feeding Eq. 1, it matters which one was actually used. Could you confirm the deployed value and make Eq. 2 and Table 4 agree?

On the altitude tolerance: 20 m is at the low end of the searched range (10–300 m). The weather is spaced ~10–25 hPa in the vertical, a few hundred metres near cruise, and supersaturated layers are usually deeper than that, so two waypoints 20 m apart sit in the same grid cell and the same layer. What is the 20 m meant to capture? It may well be the conservative choice given the "any nearby contrail" rule, and Table 4 includes a 50 m variant that looks similar, so I am not asking you to change it, just to say in one line why 20 m.

The satellite sees the contrail 20–30 min after it forms, so detections are advected back to the flight with ERA5 winds. The advection removes the bulk transport. What is left is the wind error over those 20–30 min, of order a few km for a ~1–3 m/s wind error, so small against the matching distance. But it is spatially correlated, not random noise, and it feeds the observed/not-observed label. A line on its likely size, and on whether the synthetic TPR/FPR calibration already accounts for it, would help.

On §3.1: IFS differs from the ERA5 ensemble in resolution and member count as well as in being a forecast, and you already attribute the mixed result to resolution. So I would just say explicitly that reanalysis vs forecast can't be cleanly separated here. S_nominal(coarse) already isolates resolution and is the obvious place to start for a fuller comparison.

One more, minor: the agreement results (~90k IAGOS points, 2019 and 2024) and the energy-forcing results (~430k flights, one day per month, June 2024–May 2025) use two different datasets over different periods. Worth saying so plainly so the reader doesn't read it as one validation set.

Citation: https://doi.org/10.5194/jecats-2026-6-RC2
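The two order-of-magnitude estimates in this comment can be checked with a few lines of arithmetic. The sketch below multiplies the quoted wind error by the quoted contrail age, and converts the quoted pressure spacing to a height using the hydrostatic relation with an ideal-gas density. The cruise pressure of 250 hPa and temperature of 220 K are assumptions chosen for illustration; they do not come from the comment or the paper.

```python
R_DRY_AIR = 287.05  # specific gas constant of dry air, J/(kg K)
GRAVITY = 9.80665   # standard gravity, m/s^2


def residual_displacement_km(wind_error_ms, age_min):
    """Distance covered by a constant wind error over the contrail age."""
    return wind_error_ms * age_min * 60.0 / 1000.0


def layer_thickness_m(dp_hpa, p_hpa=250.0, temp_k=220.0):
    """Hydrostatic thickness dz = dp / (rho * g), with rho = p / (R * T).

    The defaults (250 hPa, 220 K) are assumed cruise-level conditions.
    """
    rho = p_hpa * 100.0 / (R_DRY_AIR * temp_k)
    return dp_hpa * 100.0 / (rho * GRAVITY)


low_km = residual_displacement_km(1.0, 20.0)
high_km = residual_displacement_km(3.0, 30.0)
thin_m = layer_thickness_m(10.0)
thick_m = layer_thickness_m(25.0)
print(f"residual displacement: {low_km:.1f} to {high_km:.1f} km")
print(f"layer thickness: {thin_m:.0f} to {thick_m:.0f} m")
```

Under these assumptions the residual displacement is roughly 1 to 5 km, and a 10–25 hPa spacing corresponds to roughly 260 to 640 m, consistent with "a few km" and "a few hundred metres" as stated above.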
Viewed
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 160 | 80 | 18 | 258 | 19 | 23 |