Introducing Sentinel-2 Global Mosaic
The Sentinel-2 Global Mosaic (S2GM) service is a component of the Copernicus Global Land Service. It provides analysis-ready composites derived from time series of Sentinel-2 surface reflectance observations for a variety of land applications. S2GM products comprise best representative pixels at three spatial resolutions and for compositing periods ranging from one day to one year. Coverage is currently limited to Europe but will very soon be extended to the entire globe.
The inputs to the S2GM service are the official Sentinel-2 Level 2A (L2A) products, which have been corrected for atmospheric influences and thus provide Bottom-of-Atmosphere reflectance values. ESA generates these products in the ground segment using the sen2cor software developed and maintained by DLR/Telespazio. Extensive documentation on the S2GM input products is available on ESA’s web pages, e.g. in the user guide and the technical guide.
The L2A products come with a scene classification, which assigns a qualitative property, e.g. cloud, vegetation, or snow, to each pixel. S2GM uses it to identify valid pixels for mosaic generation: all pixels flagged as cloud (cloud and cirrus) or cloud shadow are removed, as are saturated and defective measurements. The remaining L2A pixels are then fed into the S2GM algorithm. Consequently, the selection of input pixels, and ultimately the outcome of the mosaicking and compositing, depends critically on the validity of the scene classification. Classification errors, such as undetected clouds or misclassified snow, are passed on to the pixel selection algorithm and may thus impair the S2GM product quality considerably.
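The filtering step described above can be sketched as a simple mask over the scene classification layer. This is a minimal illustration, not the S2GM implementation: the class codes follow the sen2cor scene classification legend, while the exact set of excluded classes (`INVALID_SCL`) and the function name `valid_mask` are assumptions made for this example.

```python
import numpy as np

# sen2cor scene classification codes used here:
# 0 = no data, 1 = saturated or defective, 3 = cloud shadow,
# 8 = cloud (medium probability), 9 = cloud (high probability), 10 = thin cirrus.
# Assumption: this is the set excluded before compositing; the actual
# S2GM configuration may differ.
INVALID_SCL = (0, 1, 3, 8, 9, 10)


def valid_mask(scl):
    """Return a boolean array that is True where a pixel may enter compositing."""
    scl = np.asarray(scl)
    return ~np.isin(scl, INVALID_SCL)


# Example: vegetation (4) and snow (11) are kept, cloud (9) and shadow (3) are not.
scl_tile = np.array([[4, 9], [3, 11]])
mask = valid_mask(scl_tile)
```

A cloud that the classification misses keeps a "valid" class code, so it passes this mask unchanged, which is why classification errors propagate directly into the composite.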
In S2GM, two different algorithms are used for compositing, depending on the number of valid observations left after filtering. For more than three valid observations, i.e. for long compositing periods (>monthly), the best representative pixel is chosen by applying a method called Medoid, which is conceptually similar to a mean or a centroid. The Medoid is, however, always a member of the set of observations, like a median in several dimensions. The spectra in S2GM products are therefore real observations rather than synthetically calculated values. This is the key difference from other compositing methods, which are typically applied to Sentinel-2 Level 1C products, i.e. to Top-of-Atmosphere reflectance data, such as WASP, Sentinel-2 Cloudless, or JRC Sentinel2 L1C cloud-free composites.
If the surface changes over time, arguably the most common case, the latter algorithms provide a mixed, computed spectrum of the different observations. The Medoid instead selects one of the observed spectra, namely the one considered most representative because it minimizes the summed distance, across all spectral dimensions, to the other valid observations. One inherent drawback of the Medoid is, however, that the resulting mosaics exhibit higher spatial heterogeneity than those of methods that produce synthetic values. This effect may become particularly apparent for long compositing periods with only a few observations.
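The Medoid selection for a single pixel can be sketched as follows. This is a minimal illustration under stated assumptions: Euclidean distance between spectra is assumed as the distance measure, and the function name `medoid` and the sample reflectance values are made up for the example; the S2GM production code may differ in both respects.

```python
import numpy as np


def medoid(spectra):
    """Return (index, spectrum) of the observation with the smallest
    summed Euclidean distance to all other observations.

    spectra: array-like of shape (n_observations, n_bands).
    """
    spectra = np.asarray(spectra, dtype=float)
    # Pairwise differences between all observations: shape (n, n, bands).
    diffs = spectra[:, None, :] - spectra[None, :, :]
    # Pairwise Euclidean distances: shape (n, n).
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    # The medoid minimizes the sum of distances to all other observations.
    idx = int(dist.sum(axis=1).argmin())
    return idx, spectra[idx]


# Four valid observations of one pixel in two bands (hypothetical values).
# The last one is much brighter, e.g. an undetected cloud.
observations = [
    [0.10, 0.30],
    [0.12, 0.33],
    [0.11, 0.31],
    [0.40, 0.45],
]
index, spectrum = medoid(observations)
```

Here the third observation is selected. The result is one of the input rows, whereas the band-wise mean of the same four observations would be pulled towards the bright outlier and match none of them.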
Figure 1 shows example results of different compositing algorithms (geometric median, median and medoid) applied to the same Landsat 8 data by Roberts et al. 2017. The comparison shows the highest spatial homogeneity for the geometric median and the characteristic spatial variability for the Medoid. It also illustrates, however, that none of the algorithms fully preserves the spatial features identified in the input scenes, marked in orange, red, and blue here.
This is particularly apparent for the feature delineated by the orange line, which is clearly visible in the original scene (panel on the right). Note that the examples are based on a low number of observations (9 inputs with 4 valid observations used for compositing). More observations in longer compositing periods lead to better spatial homogeneity for all of the algorithms.
Figure 1: Comparison of geometric median, median and medoid against a single clear observation. Three different land cover features are marked in red, orange and blue to compare spatial homogeneity between the algorithms. The geometric median shows the best homogeneity and the medoid the least. However, none of the algorithms preserves the spatial features completely, especially the feature marked in orange. (Source: modified after Roberts et al. 2017)
As mentioned above, the advantage of the medoid is that it selects an original observation. Figure 2 compares the spectra produced by the different algorithms with all input spectra, again from the study by Roberts et al. 2017. The Medoid, shown as black dots, always coincides with a real spectrum (black solid line). Only the medoid therefore preserves the exact spectral relation of the measurement. In contrast, applying a geometric median to all bands leads to artificial spectra.
Figure 2: Spectral profiles of observations from all time periods represented by the Landsat 8 image stack for the four pixel locations #1 = (25, 5), #2 = (26, 23), #3 = (26, 24), #4 = (46, 43) within the area shown in Figure 1. Dashed lines are observations removed through the pixel quality assessment. Black solid lines are retained observations, except for the comparative clear observation (LS8 28.09.2014), which is shown in blue. Black dots highlight the medoid (a true pixel observation) and the red line is the calculated geometric median. (Source: modified after Roberts et al. 2017)
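How a computed statistic can yield a spectrum that was never observed is easy to reproduce with a toy example. The sketch below uses the band-wise median rather than the geometric median, purely because it is shorter to write down; the helper `is_observed` and the reflectance values are hypothetical.

```python
import numpy as np


def is_observed(spectrum, observations):
    """Return True if `spectrum` matches one of the rows in `observations`."""
    observations = np.asarray(observations, dtype=float)
    spectrum = np.asarray(spectrum, dtype=float)
    return bool(np.any(np.all(np.isclose(observations, spectrum), axis=1)))


# Three observations of one pixel in two bands (hypothetical values).
observations = np.array([
    [0.1, 0.5],
    [0.2, 0.3],
    [0.3, 0.4],
])

# The band-wise median takes the median of each band independently,
# so the two values can come from different acquisition dates.
bandwise_median = np.median(observations, axis=0)
synthetic = not is_observed(bandwise_median, observations)
```

The band-wise median combines the first band of the second observation with the second band of the third, so the relation between the bands no longer corresponds to any single measurement. A medoid, by construction, always passes the `is_observed` check.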
If fewer than four valid observations are left for compositing, the Medoid does not provide satisfactory results. Instead, a dedicated short-term compositing (STC) method is applied in S2GM. It is comparable to the approach adopted for Web-enabled Landsat data (WELD) and is based on spectral comparisons. If only a single valid observation is available, it is provided as output.
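The choice between the two compositing paths depends only on the number of valid observations per pixel and can be summarised in a few lines. This is an illustrative sketch: the function name `choose_method` and the returned labels are hypothetical, and the handling of pixels with no valid observation is an assumption, since that case is not described here.

```python
def choose_method(n_valid):
    """Map the number of valid observations for a pixel to a compositing path."""
    if n_valid < 0:
        raise ValueError("number of valid observations cannot be negative")
    if n_valid == 0:
        # Assumption: a pixel without any valid observation is left empty.
        return "no_data"
    if n_valid == 1:
        # A single valid observation is passed through as output.
        return "single_observation"
    if n_valid < 4:
        # Two or three valid observations: short-term compositing (STC).
        return "stc"
    # More than three valid observations: Medoid.
    return "medoid"


methods = [choose_method(n) for n in range(6)]
```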
For further information, the reader is kindly referred to the user manual.
Roberts, D.; Mueller, N.; McIntyre, A. High-Dimensional Pixel Composites from Earth Observation Time Series. IEEE Trans. Geosci. Remote Sens. 2017, 55, 6254–6264