2  Pressure map

This second chapter of the basic tutorial covers the main steps to determine the position of a bird from pressure data. This code is a direct implementation of the method introduced in Nussbaumer et al. (2023)

What is a GeoPressureR map?

In GeoPressureR, a map object is a container for a spatio-temporal variable, such as the likelihood of pressure. It is therefore defined by its spatial characteristics (extent and scale) and temporal dimension in terms of stationary periods. The data is stored as a list of matrices: one matrix (or layer) for each stationary period.

See map_create() for technical characteristics of a GeoPressureR map.

2.1 Define geographical and temporal parameters of the map

The first step is to define the characteristics of the map, by setting the following parameter to the tag object.

tag <- tag_set_map(tag,
  extent = c(-100, -68, 0, 40), # coordinates of the map to request (W, E, S, N)
  scale = 2, # request on a 1/2=0.5° grid, coarse, but fast
  known = data.frame(
    stap_id = 1,
    known_lat = 37.286812,
    known_lon = -82.304972
  )
)

Read more about the meaning of these parameters on the documentation of the function tag_set_map(). Here are a few additional indications to select optimal parameters:

  • A smaller map (small scale and tight extent) results in faster computation. However, if the extent is too small and excludes the true position of the bird, the process will still produce a trajectory, but a wrong trajectory. We recommend starting with a coarse scale = 1 or 2 and a large extent and refining these when you have a better idea of the trajectory and have completed the labeling.
  • Using the known position can significantly speeds up the computation as it excludes these stationary periods with long timeseries from the computation.
  • Depending on the precision of the trajectory that you need for your study, include_min_duration can be of great help to reduce the complexity of the labeling and computation.

2.2 Compute pressure maps

We are now ready to create the pressure maps!

To do so, we must match the pressure timeseries of each stationary period with the surface level pressure dataset of ERA5-Land hourly (Copernicus Climate Change Service 2019) for all possible pixels of the maps.

How does the GeoPressureAPI work?

To overcome the challenges of handling the large ERA5 dataset, we perform the mismatch computation on the Google Earth Engine (GEE) server which has access to the ERA5 dataset and directly returns the map of mismatch.

GeoPressureR uses the GeoPressureAPI which serves as an interface to the GEE server.

The function geopressure_map() conveniently performs all the necessary steps, but we outline each step below for a comprehensive understanding.

2.2.1 Compute mismatch maps

tag <- geopressure_map_mismatch(tag,
  max_sample = 100,
  margin = 20,
  thr_mask = 0.95,
  quiet = TRUE
)
  • max_sample reduces the computational time by limiting the number of data-points used in the match. This only impacts long stationary periods where the position is well defined. During labelling, or when accuracy is not critical, it can be convenient to reduce this number between 50 to 100.
  • margin can be reduced to 10 or 20 if your bird does not change elevation level during its stationary period.
  • thr_mask filter map based on absolute pressure threashold already on the GEE server to drastically reduce computational time (see below for details). It generally has little influence and can usually be left at 0.9.
Taking a long time to compute?

This function is the most computationally intensive as it needs to:

  1. Pre-process pressure: the pressure measurements are first smoothed and downscaled to a 1-hour resolution in order to match ERA-5 resolution (see geopressure_map_preprocess()).
  2. Generate requests: Send a single request to the GeoPressureAPI to generate the Google Earth Engine (GEE) URLs, one for each stationary period which can be used to compute the maps on the GEE server. At this stage, no computation has been performed, we just generated the actual code.
  3. Send the requests: Call the URLs in parallel, which will start the computation on the GEE server.
  4. Compute and download the maps: When all requests are sent, we wait for the GEE server to return a geotiff file (map) for each stationary period.

A progress bar will update you on the completion status, but the timing can be tricky to apprehend because of the computational optimization used and variability in the GEE server availability.

This function returns the tag with two maps: - tag$map_pressure_mask \(\textbf{z}_{thr}\) is a GeoPressureR map of the proportion of data-points in the pressure timeseries which correspond to an altitude that falls between the min and max altitude of each grid cell (accounting for the margin parameter). - tag$map_pressure_mse \(\textbf{MSE}\) is a GeoPressureR map of the normalized mean square error between the tag pressure timeseries and ERA5 map. The mean error is removed because we assume no specific altitude of the tag, thus allowing an altitudinal shift of the pressure timeseries. For computational efficiency, \(\textbf{MSE}\) is only computed on the pixels for which \(\textbf{z}_{thr}>thr_{mask}\).

plot(tag, type = "map_pressure_mse")

This is an alternative and identical way to plot a map from a tag.

plot(tag$map_pressure_mask)

2.2.2 Compute likelihood maps

We combine and convert these two maps into a single likelihood map using \[f \propto \exp \left(-w(n) \frac{\textbf{MSE}}{\sigma^2} \right) [\textbf{z}_{thr}>thr_{mask}]\] where \(\sigma\) is the standard deviation of pressure error and \(thr_{mask}\) is the threshold of the mask.

Because the auto-correlation of the timeseries is not accounted for in this equation, we use a log-linear pooling weight \(w(n)=\log(n)/n\), where \(n\) is the number of data-points in the time series. See Probability aggregation for more information on this.

tag <- geopressure_map_likelihood(
  tag,
  sd = 0.5,
  log_linear_pooling_weight = \(n) 4 * log(n) / n
)
Calibrating sd

The standard deviation sd (\(\sigma\)) plays an important role in the spread of the uncertainty of your map. It accounts for (1) error in reanalysis data (may vary spatially, but generally assumed small), (2) sensor error (also assumed low), and (3) bird vertical movement (most significant contribution). A bird moving up and down a tall tree (1 hPa = 10 m) is likely to affect the match.

This value should ideally be calibrated, but it is usually safe to start with a value of 1. We see how to adjust this value in check #4 of the labelling procedure. Swainson’s Warbler tend to stay low with small pressure variation such that a value of 0.5 is more adequate. In addition, following several trial and error test, we found that scaling the log linear pooling function to 4 * log(n) / n provides more satifying result.

The resulting pressure likelihood map can be visualized with:

plot(tag$map_pressure)

Note that the threshold of the mask is performed directly on the GEE server, i.e., with geopressure_map_mismatch(). This allows to compute the MSE only for pixels which are within the threshold, thus reducing the computational cost significantly.

The geopressure_map() function is a wrapper of geopressure_map_mismatch() and geopressure_map_likelihood(). By default, it delete the mask and MSE map to save space (see the keep_mask and keep_mse parameters).