tag <- tag_set_map(tag,
extent = c(-100, -68, 0, 40), # coordinates of the map to request (W, E, S, N)
scale = 2, # request on a 1/2=0.5° grid, coarse, but fast
known = data.frame(
stap_id = 1,
known_lat = 37.286812,
known_lon = -82.304972
)
)
2 Pressure map
This second chapter of the basic tutorial covers the main steps to determine the position of a bird from pressure data. This code is a direct implementation of the method introduced in Nussbaumer et al. (2023)
map
?
In GeoPressureR, a map
object is a container for a spatio-temporal variable, such as the likelihood of pressure. It is therefore defined by its spatial characteristics (extent
and scale
) and temporal dimension in terms of stationary periods. The data is stored as a list of matrices: one matrix (or layer) for each stationary period.
See map_create()
for technical characteristics of a GeoPressureR map
.
2.1 Define geographical and temporal parameters of the map
The first step is to define the characteristics of the map
, by setting the following parameter to the tag
object.
Read more about the meaning of these parameters on the documentation of the function tag_set_map()
. Here are a few additional indications to select optimal parameters:
- A smaller map (small
scale
and tightextent
) results in faster computation. However, if the extent is too small and excludes the true position of the bird, the process will still produce a trajectory, but a wrong trajectory. We recommend starting with a coarsescale = 1 or 2
and a largeextent
and refining these when you have a better idea of the trajectory and have completed the labeling. - Using the
known
position can significantly speeds up the computation as it excludes these stationary periods with long timeseries from the computation. - Depending on the precision of the trajectory that you need for your study,
include_min_duration
can be of great help to reduce the complexity of the labeling and computation.
2.2 Compute pressure maps
We are now ready to create the pressure maps!
To do so, we must match the pressure timeseries of each stationary period with the surface level pressure dataset of ERA5-Land hourly (Copernicus Climate Change Service 2019) for all possible pixels of the maps.
To overcome the challenges of handling the large ERA5 dataset, we perform the mismatch computation on the Google Earth Engine (GEE) server which has access to the ERA5 dataset and directly returns the map of mismatch.
GeoPressureR uses the GeoPressureAPI which serves as an interface to the GEE server.
The function geopressure_map()
conveniently performs all the necessary steps, but we outline each step below for a comprehensive understanding.
2.2.1 Compute mismatch maps
tag <- geopressure_map_mismatch(tag,
max_sample = 100,
margin = 20,
thr_mask = 0.95,
quiet = TRUE
)
-
max_sample
reduces the computational time by limiting the number of data-points used in the match. This only impacts long stationary periods where the position is well defined. During labelling, or when accuracy is not critical, it can be convenient to reduce this number between50
to100
. -
margin
can be reduced to10
or20
if your bird does not change elevation level during its stationary period. -
thr_mask
filter map based on absolute pressure threashold already on the GEE server to drastically reduce computational time (see below for details). It generally has little influence and can usually be left at0.9
.
This function is the most computationally intensive as it needs to:
-
Pre-process pressure: the pressure measurements are first smoothed and downscaled to a 1-hour resolution in order to match ERA-5 resolution (see
geopressure_map_preprocess()
). - Generate requests: Send a single request to the GeoPressureAPI to generate the Google Earth Engine (GEE) URLs, one for each stationary period which can be used to compute the maps on the GEE server. At this stage, no computation has been performed, we just generated the actual code.
- Send the requests: Call the URLs in parallel, which will start the computation on the GEE server.
- Compute and download the maps: When all requests are sent, we wait for the GEE server to return a geotiff file (map) for each stationary period.
A progress bar will update you on the completion status, but the timing can be tricky to apprehend because of the computational optimization used and variability in the GEE server availability.
This function returns the tag
with two maps: - tag$map_pressure_mask
\(\textbf{z}_{thr}\) is a GeoPressureR map
of the proportion of data-points in the pressure timeseries which correspond to an altitude that falls between the min and max altitude of each grid cell (accounting for the margin
parameter). - tag$map_pressure_mse
\(\textbf{MSE}\) is a GeoPressureR map
of the normalized mean square error between the tag pressure timeseries and ERA5 map. The mean error is removed because we assume no specific altitude of the tag, thus allowing an altitudinal shift of the pressure timeseries. For computational efficiency, \(\textbf{MSE}\) is only computed on the pixels for which \(\textbf{z}_{thr}>thr_{mask}\).
plot(tag, type = "map_pressure_mse")
This is an alternative and identical way to plot a map
from a tag
.
plot(tag$map_pressure_mask)
2.2.2 Compute likelihood maps
We combine and convert these two maps into a single likelihood map using \[f \propto \exp \left(-w(n) \frac{\textbf{MSE}}{\sigma^2} \right) [\textbf{z}_{thr}>thr_{mask}]\] where \(\sigma\) is the standard deviation of pressure error and \(thr_{mask}\) is the threshold of the mask.
Because the auto-correlation of the timeseries is not accounted for in this equation, we use a log-linear pooling weight \(w(n)=\log(n)/n\), where \(n\) is the number of data-points in the time series. See Probability aggregation for more information on this.
tag <- geopressure_map_likelihood(
tag,
sd = 0.5,
log_linear_pooling_weight = \(n) 4 * log(n) / n
)
sd
The standard deviation sd
(\(\sigma\)) plays an important role in the spread of the uncertainty of your map. It accounts for (1) error in reanalysis data (may vary spatially, but generally assumed small), (2) sensor error (also assumed low), and (3) bird vertical movement (most significant contribution). A bird moving up and down a tall tree (1 hPa = 10 m) is likely to affect the match.
This value should ideally be calibrated, but it is usually safe to start with a value of 1
. We see how to adjust this value in check #4 of the labelling procedure. Swainson’s Warbler tend to stay low with small pressure variation such that a value of 0.5
is more adequate. In addition, following several trial and error test, we found that scaling the log linear pooling function to 4 * log(n) / n
provides more satifying result.
The resulting pressure likelihood map can be visualized with:
plot(tag$map_pressure)
Note that the threshold of the mask is performed directly on the GEE server, i.e., with geopressure_map_mismatch()
. This allows to compute the MSE only for pixels which are within the threshold, thus reducing the computational cost significantly.
The geopressure_map()
function is a wrapper of geopressure_map_mismatch()
and geopressure_map_likelihood()
. By default, it delete the mask and MSE map to save space (see the keep_mask
and keep_mse
parameters).