Notation
Before diving into coding, let us establish some common notation so we understand what we are estimating.
Ecological data on species density often exhibit zeros. This occurs because species may be patchily distributed or because sampling methods may fail to detect individuals at low densities. Since densities are continuous non-negative quantities, we cannot use discrete probability distributions (e.g., Poisson or Binomial) to model this type of phenomenon. However, standard continuous probability distributions assign zero probability to any specific value (that is, $\Pr(Y = c) = 0$ for any constant $c$ and any continuous random variable $Y$). To address this, we use zero-augmented (Hurdle/Delta) models, which explicitly account for the excess zeros by combining a discrete and a continuous probability distribution through two components:
A probability of observing a zero (or probability of absence): This accounts for the “true” absence or non-detection of the species.
A continuous distribution for positive values: This models the density of the species when it is present.
Formally, let $Y_{t,s}$ be a random variable representing the density (individuals per unit of area) of a focal species at time $t$ and patch/site $s$. We denote realizations (i.e., what we observe) of $Y_{t,s}$ by $y_{t,s}$. We define a zero-augmented probability density function (pdf) as follows:
$$f(y_{t,s}) = \begin{cases} \rho_{t,s}, & \text{if } y_{t,s} = 0, \\ (1 - \rho_{t,s}) \, g(y_{t,s} \mid \mu_{t,s}, \phi), & \text{if } y_{t,s} > 0, \end{cases} \tag{1}$$
where
$\rho_{t,s}$ is the probability of observing a zero density.
$g(\cdot \mid \mu_{t,s}, \phi)$ is the pdf of a continuous probability distribution with expected value (or theoretical mean) $\mu_{t,s}$ and additional parameter $\phi$. This distribution represents the densities of the species when they are present.
In Equation 1, we can choose different probability distributions to specify the pdf $g(\cdot \mid \mu_{t,s}, \phi)$. Popular choices include the Log-Normal, Log-Logistic, and Gamma distributions, which can be parametrized in terms of their mean $\mu_{t,s}$, facilitating modeling (Ye et al. 2021).
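To make Equation 1 concrete, the snippet below is a minimal simulation sketch in plain R (not part of the package interface), assuming the positive part is a Gamma distribution parametrized by its mean `mu` and shape `phi`; all numeric values are illustrative.

```r
## Minimal sketch: simulate from a zero-augmented (hurdle) Gamma distribution.
## rho, mu, and phi are illustrative values, not package defaults.
set.seed(1)

rho <- 0.3   # probability of a zero density (absence / non-detection)
mu  <- 5     # mean density when the species is present
phi <- 2     # shape parameter of the Gamma positive part

n      <- 1000
absent <- rbinom(n, size = 1, prob = rho)            # 1 = zero density
y      <- ifelse(absent == 1, 0,
                 rgamma(n, shape = phi, rate = phi / mu))

mean(y == 0)    # close to rho
mean(y[y > 0])  # close to mu
```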
Species Distribution Models
A typical SDM based on Equation 1 relates the parameters of the zero-augmented distribution to environmental covariates. Let $t = 1, \ldots, T$ represent time points and $s = 1, \ldots, S$ represent patches/sites. The first assumption of SDMs is that the $Y_{t,s}$ are independent random variables when conditioned on the environmental factors. From here on, only SDMs based on (generalized) linear models will be discussed.
We define $\mathbf{x}_{t,s}$ as a vector of covariates related to the probability of observing a zero density at time $t$ and site $s$, and $\mathbf{z}_{t,s}$ as a vector of covariates related to the (theoretical) mean density when the species is present. Note that $\mathbf{x}_{t,s}$ and $\mathbf{z}_{t,s}$ may depend on different sets of environmental factors.
We define linear predictors as follows:
$$\eta_{t,s} = \mathbf{x}_{t,s}^{\top} \boldsymbol{\beta}^{(\rho)}, \qquad \nu_{t,s} = \mathbf{z}_{t,s}^{\top} \boldsymbol{\beta}^{(\mu)},$$
where $\boldsymbol{\beta}^{(\rho)}$ and $\boldsymbol{\beta}^{(\mu)}$ are vectors of regression coefficients. Finally, through the introduction of suitable link functions, $h_1(\cdot)$ and $h_2(\cdot)$, we relate the linear predictors (and consequently the environmental factors) to the probability of absence and the mean density as follows:
$$h_1(\rho_{t,s}) = \eta_{t,s}, \qquad h_2(\mu_{t,s}) = \nu_{t,s}.$$
The link functions ensure that the estimated absence probabilities and mean densities are on the appropriate scale. For instance, $h_1(\cdot)$ should be a function that takes a number between 0 and 1 as input and outputs a real number. Common choices for $h_1(\cdot)$ are the $\operatorname{logit}$ and the complementary log-log ($\operatorname{cloglog}$) functions. The latter is asymmetric and preferred when the proportion of zeros is very high or low. For $h_2(\cdot)$, the $\log$ function is often used as it ensures that the predicted mean densities are positive.
The assumption of independence can be relaxed by introducing random effects into the model.
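The sketch below illustrates the link-function mechanics described above, assuming a logit link for $h_1(\cdot)$ and a log link for $h_2(\cdot)$; the covariate (`temp`) and coefficient values are made up for illustration only.

```r
## Sketch of the SDM link structure (illustrative covariate and coefficients).
set.seed(2)

n_obs <- 50
temp  <- rnorm(n_obs, mean = 10, sd = 2)   # hypothetical environmental covariate

x <- cbind(1, temp)         # design matrix for the absence probability
z <- cbind(1, temp)         # design matrix for the mean density
beta_rho <- c(0.5, -0.2)    # coefficients for the absence component
beta_mu  <- c(1.0,  0.1)    # coefficients for the density component

eta <- drop(x %*% beta_rho)  # linear predictor for absence
nu  <- drop(z %*% beta_mu)   # linear predictor for mean density

rho <- plogis(eta)  # inverse-logit: absence probabilities in (0, 1)
mu  <- exp(nu)      # log link: mean densities are positive
```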
Dynamic Range Models
Similar to SDMs, a DRM based on Equation 1 also relates the parameters of the zero-augmented distribution to environmental covariates. However, DRMs offer a more comprehensive approach by incorporating explicit representations of the underlying biological processes that govern species distributions. This advancement enables the integration of expert knowledge and empirical data regarding species demography, dispersal patterns, and mechanistic interactions with environmental factors (Pagel and Schurr 2012).
The assumptions of our DRM are placed on the mean density $\mu_{t,s}$, with the probability of absence $\rho_{t,s}$ remaining the same as in the SDM. In fact, those assumptions can be interpreted as a non-linear function of the environmental variables. In particular, we make assumptions about the (unobserved) age structure behind the densities at a given time point and patch.
Define $Y_{a,t,s}$ as a random variable representing the unobserved density (individuals per unit of area) for individuals of age $a$, where $a = 1, \ldots, A$. Similarly, we denote the expected density (or theoretical mean for a given age, time, and patch) as:
$$\mu_{a,t,s} = \mathbb{E}[Y_{a,t,s}].$$
We do not observe $Y_{a,t,s}$ and assume the previously defined (overall) density, which we actually observe, to be the sum of the densities across all age groups. In other words, $Y_{t,s} = \sum_{a=1}^{A} Y_{a,t,s}$. Consequently,
$$\mu_{t,s} = \sum_{a=1}^{A} \mu_{a,t,s}.$$
Biological processes and additional assumptions are encoded through the expected “age densities” $\mu_{a,t,s}$. Some key processes we consider are recruitment, survival, and movement. They are described below.
Recruitment
We begin by defining the expected density for recruits (fish at age $a = 1$) at time $t$ and patch $s$ as:
$$\mu_{1,t,s} = r, \tag{5}$$
where $r$ represents the overall mean recruitment per unit of area. The assumption in Equation 5 is quite restrictive. To make the model more realistic, we can replace the constant $r$ with
$$r_{t,s} = \exp\{\mathbf{w}_{t,s}^{\top} \boldsymbol{\beta}^{(r)}\}, \tag{6}$$
where $\mathbf{w}_{t,s}$ is the vector of environmental drivers associated with recruitment at time $t$ and patch $s$, and $\boldsymbol{\beta}^{(r)}$ is a vector of corresponding regression coefficients. This depicts a log-linear relationship between recruitment and the environment.
To account for the influence of time on recruitment (or on the SDM densities), we introduce a random effect $\tau_t$ in Equation 6. In the current version of the model, this random effect is the same across all groups for a given time point; that is, it depends only on $t$. This creates a temporal dependence, where the recruitment or density at one point in time is related to its value at the previous time point. We model this temporal dependence using an autoregressive process of order 1 (AR(1)). Formally, this is defined as:
$$\tau_t = \varphi \, \tau_{t - 1} + \epsilon_t, \tag{7}$$
where the $\epsilon_t$ are independent and identically distributed residual terms from a zero-mean Normal distribution with variance $\sigma^2_{\epsilon}$, and $\varphi$ is the temporal autocorrelation parameter. In this model, the recruitment (or density from Equation 2) at time $t$ is a function of its value at time $t - 1$, plus a random error term.
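As a sketch of how the AR(1) random effect could enter the log-linear recruitment in Equation 6, the code below simulates $\tau_t$ and an environment-driven recruitment series; the autocorrelation, innovation SD, covariate, and coefficients are all assumed values, and the stationary initialization is an assumption of the sketch rather than a stated feature of the model.

```r
## Sketch: AR(1) temporal random effect combined with log-linear recruitment.
set.seed(3)

n_time <- 30
phi_ar <- 0.6   # temporal autocorrelation (assumed value)
sigma  <- 0.3   # SD of the AR(1) innovations (assumed value)

tau    <- numeric(n_time)
tau[1] <- rnorm(1, 0, sigma / sqrt(1 - phi_ar^2))  # stationary start (assumption)
for (t in 2:n_time)
  tau[t] <- phi_ar * tau[t - 1] + rnorm(1, 0, sigma)

sst      <- rnorm(n_time, mean = 12, sd = 1.5)           # hypothetical driver
beta_rec <- c(2, -0.1)                                   # assumed coefficients
recruit  <- exp(drop(cbind(1, sst) %*% beta_rec) + tau)  # expected recruit density
```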
Survival
In the proposed model, the expected density at age $a$, time $t$, and patch $s$ evolves over time according to survival rates denoted $\zeta$. Formally, we have
$$\mu_{a + 1, t + 1, s} = \zeta \, \mu_{a, t, s}, \qquad a = 1, \ldots, A - 1.$$
As with recruitment, the survival rates may vary by patch/site $s$ and time $t$:
$$\zeta_{a, t, s} = \exp\{\mathbf{v}_{t,s}^{\top} \boldsymbol{\beta}^{(\zeta)}\},$$
where $\mathbf{v}_{t,s}$ represents environmental drivers potentially different from those affecting recruitment, and $\boldsymbol{\beta}^{(\zeta)}$ are the associated regression coefficients. Note that, despite the age-specific index $a$, the survival rates are constant across age-groups.
In Equation 8, we may use external information to allow for different survival rates across age-groups. For instance, in fisheries research, stock assessments often provide estimates of age-specific instantaneous fishing mortality rates for several species. It is reasonable, if not logical, to assume that the survival rate for an age-group with a higher fishing mortality is lower. In that case, the equation above may be updated to:
$$\zeta_{a, t, s} = \exp\{\mathbf{v}_{t,s}^{\top} \boldsymbol{\beta}^{(\zeta)} - F_{a,t}\},$$
where $F_{a,t}$ is the instantaneous rate of fishing mortality for age-group $a$ at time $t$.
In our current implementation, we do not allow for a random effect on mortality/survival yet.
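The sketch below propagates age-specific expected densities through time using the survival recursion above, with a log-linear environmental effect and an age-specific fishing mortality; the functional form, covariate, and all values are assumptions for illustration (and, as noted, there is no random effect on survival).

```r
## Sketch: age/time propagation of expected densities via survival rates.
set.seed(4)

n_age  <- 5
n_time <- 10

sst   <- rnorm(n_time, mean = 12, sd = 1.5)  # hypothetical environmental driver
kappa <- c(-0.1, -0.02)                      # assumed survival coefficients
f_at  <- matrix(0.2, n_age, n_time)          # assumed fishing mortality F[a, t]

## survival rate for age a at time t (single patch), log-linear in the driver
surv <- function(a, t) exp(drop(cbind(1, sst[t]) %*% kappa) - f_at[a, t])

mu <- matrix(0, n_age, n_time)  # expected density mu[a, t] for one patch
mu[1, ] <- 3                    # constant recruitment, for simplicity
for (t in 1:(n_time - 1))
  for (a in 1:(n_age - 1))
    mu[a + 1, t + 1] <- surv(a, t) * mu[a, t]

colSums(mu)  # overall expected density per time point (before movement/selectivity)
```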
Movement
Movement is an important demographic process affecting species’ densities. In our package, we implement a simplistic movement routine described as follows. Denote by $S$ the total number of patches in the present study. For each age-group, we define an $S \times S$ movement matrix denoted $\mathbf{M}_a$. The element $(i, j)$ in $\mathbf{M}_a$ represents the probability that individuals of age $a$ move from patch $j$ at time $t$ to patch $i$ at time $t + 1$. The movement matrix is derived from an adjacency matrix $\mathbf{A}$, which encodes connections between patches. In $\mathbf{A}$, the element $(i, j)$ is given by
$$A_{i,j} = \begin{cases} 1 / n_j, & \text{if } i \sim j, \\ 0, & \text{otherwise}, \end{cases}$$
where $n_j$ is the number of neighbors of patch $j$, and $i \sim j$ indicates that $i$ and $j$ are neighbors (i.e., share borders).
In addition to the adjacency matrix $\mathbf{A}$, the user must also indicate the age-groups “allowed” to move. If an age-group is not assumed to move, its movement matrix becomes the identity matrix. Given those inputs, we define $\pi$ as the probability of individuals remaining in the same patch between two time points. The movement matrix is then constructed as:
$$\mathbf{M}_a = \pi \, \mathbf{I}_S + (1 - \pi) \, \mathbf{A},$$
where $\mathbf{I}_S$ is an $S \times S$ identity matrix. In other words, while the probability of remaining in the same patch is fixed, the probability of moving (i.e., $1 - \pi$) is evenly distributed across neighbors. We could extend this setting in at least three ways: (1) allowing $\pi$ to vary with time and patch; (2) having different movement probabilities for different age-groups; and (3) making the movement probabilities vary according to environmental conditions. Any of these extensions, however, would represent a significant increase in computation. Therefore, for now, we opted to stick with the simplest option.
Once a movement matrix is defined, we need to adjust the age-specific expected densities accordingly. Let $\boldsymbol{\mu}_{a,t} = (\mu_{a,t,1}, \ldots, \mu_{a,t,S})^{\top}$ be a vector of length $S$ (the number of patches) representing the expected densities of age $a$ at time $t$ for every patch in the given study. After movement, the expected density is:
$$\tilde{\boldsymbol{\mu}}_{a,t} = \mathbf{M}_a \, \boldsymbol{\mu}_{a,t},$$
where $\mathbf{M}_a$ is the aforementioned movement matrix.
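As an illustration of this construction, the sketch below builds $\mathbf{M}_a$ for four patches arranged on a line and applies it to a vector of expected densities; the neighborhood structure, $\pi$, and density values are made-up inputs.

```r
## Sketch: movement matrix from an adjacency structure (4 patches on a line).
adj <- matrix(0, 4, 4)
adj[cbind(c(1, 2, 2, 3, 3, 4),
          c(2, 1, 3, 2, 4, 3))] <- 1      # 1 if two patches share a border

n_nbr <- colSums(adj)                     # number of neighbors of each patch
A     <- sweep(adj, 2, n_nbr, "/")        # column j holds 1 / n_j for neighbors of j

pi_stay <- 0.8                            # probability of staying in the same patch
M <- pi_stay * diag(4) + (1 - pi_stay) * A  # movement matrix (columns sum to 1)

mu_age   <- c(10, 4, 2, 0)                # expected densities of one age group
mu_moved <- drop(M %*% mu_age)            # densities after movement
sum(mu_moved)                             # movement only redistributes density
```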
Selectivity at age
Our main data sources, trawl surveys, often exhibit age-based selectivity, where younger fish have a lower probability of being caught due to their smaller size. In particular, the probability of capturing an animal of a given length, given that it encounters the gear, is called selectivity at length. Analogously, selectivity at age is the probability of capturing a fish of a certain age, provided it encountered the gear. We represent selectivity at age by the vector $\mathbf{q} = (q_1, \ldots, q_A)^{\top}$, where $A$ is the total number of age groups. Each element $q_a$ of this vector is a value between 0 and 1, inclusive, representing the selectivity for age group $a$. Our package does not estimate $\mathbf{q}$ and, unless it is provided by the user, assumes the selectivity is 1 for every age group.
When a vector $\mathbf{q}$ is provided, our model assumes:
$$\tilde{\mu}_{a,t,s} = q_a \, \mu_{a,t,s},$$
and, consequently,
$$\mu_{t,s} = \sum_{a = 1}^{A} q_a \, \mu_{a,t,s}.$$
Note that our model cannot estimate selectivity; instead, it assumes the selectivity is provided as known and without error. This is, of course, an oversimplification; however, we seldom have readily available the data necessary to estimate selectivity.
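A tiny sketch of the selectivity adjustment above, with made-up selectivity and density values:

```r
## Sketch: applying a user-supplied selectivity-at-age vector.
q      <- c(0.2, 0.6, 0.9, 1, 1)  # selectivity for ages 1..5 (illustrative)
mu_age <- c(8, 5, 3, 2, 1)        # expected density by age, one patch and time

mu_total <- sum(q * mu_age)       # expected (observable) density entering Equation 1
mu_total
```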
Bayesian Inference and Forecasting
To complete the Bayesian model specification, we need to specify prior distributions for all model parameters. This involves defining our prior beliefs about the likely values of these parameters before observing any data.
First, we define the priors for the fixed-effects regression coefficients, typically denoted by $\boldsymbol{\beta}$ with a superscript indicating the model component. The default priors for these coefficients are uncorrelated zero-mean Normal distributions with marginal variances of 1. The parameter $\phi$, which controls the scale of the non-zero part of the pdfs in Equation 1 (e.g., Log-Normal, Gamma, or Log-Logistic), has a Gamma prior with shape and rate parameters equal to 2 and 1, respectively. In general, the scale of this parameter varies with the pdf chosen.
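To visualize what these defaults imply, the sketch below simply draws from the priors stated above (standard Normal for the coefficients and Gamma(2, 1) for $\phi$); it is illustrative only and does not reproduce the package's internal prior setup.

```r
## Sketch: draws from the default priors stated above.
set.seed(5)
beta_draws <- rnorm(10000, mean = 0, sd = 1)      # regression coefficients: N(0, 1)
phi_draws  <- rgamma(10000, shape = 2, rate = 1)  # scale parameter phi: Gamma(2, 1)

quantile(phi_draws, probs = c(0.025, 0.5, 0.975))  # implied prior range for phi
```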
Next, we specify the priors for the parameters associated with the AR(1) process in Equation 7. We take a hierarchical approach, placing a Normal prior on the log of the square root of the conditional variance (i.e., the log of the conditional standard deviation). This prior has a standard deviation of 0.25 and a default mean that favors a model with no AR(1) term. For the autocorrelation term $\varphi$, we employ a Beta prior on its transformation to the unit interval. Lastly, we place the same Beta prior on $\pi$. The priors and default hyperparameters are highlighted in the table below:
| Parameter | Prior | Default hyperparameters |
|---|---|---|
| Regression coefficients ($\boldsymbol{\beta}$) | Normal | mean 0, variance 1 |
| $\phi$ | Gamma | shape 2, rate 1 |
| $\log \sigma_{\epsilon}$ (AR(1) conditional SD, log scale) | Normal | SD 0.25 |
| $\varphi$ (autocorrelation, transformed) | Beta | – |
| $\pi$ | Beta | – |
Denote by $\boldsymbol{\theta}$ a set containing all the parameters associated with the model we have chosen. The posterior distribution of $\boldsymbol{\theta}$ (and the random effects $\boldsymbol{\tau}$) given the observed data $\mathbf{y}$ and environmental factors is
$$\pi(\boldsymbol{\theta}, \boldsymbol{\tau} \mid \mathbf{y}) \propto \pi(\boldsymbol{\tau} \mid \boldsymbol{\theta}) \, \pi(\boldsymbol{\theta}) \prod_{t,s} f(y_{t,s} \mid \boldsymbol{\theta}, \boldsymbol{\tau}), \tag{11}$$
where $\pi(\cdot)$ denotes prior and posterior distributions of its arguments. Importantly, the distribution placed on the random effects also functions as a prior, as we draw samples from the random effects' distribution conditioned on the observed data.
Parameter estimation
The estimation of the model parameters is obtained from the posterior distribution defined in Equation 11. As the posterior distribution is intractable, we use the No-U-Turn sampler (Hoffman and Gelman 2014), a Markov chain Monte Carlo (MCMC) algorithm available in Stan, to draw samples from it. The No-U-Turn algorithm samples all the model parameters jointly and efficiently explores the parameter space using gradients obtained through automatic differentiation. Additionally, it eliminates the need for hand-tuning, making it a highly convenient sampler. We initialize the parameters with samples from their respective prior distributions. The latent random effects are initialized from a standard Gaussian distribution. The number of samples and the warm-up period of the MCMC algorithm are application dependent. We assess the convergence of the chains using the split-$\hat{R}$ diagnostic (Vehtari et al. 2021). Finally, the parameters’ point estimates and 95% credible intervals (CI) are obtained as, respectively, the median and the 2.5% and 97.5% percentiles (unless otherwise stated) of the marginal MCMC samples.
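For a single parameter, the summaries described above reduce to the median and the 2.5% and 97.5% percentiles of its marginal draws, as in the sketch below (the `draws` vector is a placeholder for post-warm-up MCMC samples, not actual Stan output).

```r
## Sketch: point estimate and 95% CI from marginal MCMC draws of one parameter.
set.seed(6)
draws <- rnorm(4000, mean = 0.8, sd = 0.1)  # placeholder for MCMC samples

estimate <- median(draws)                              # point estimate
ci_95    <- quantile(draws, probs = c(0.025, 0.975))   # 95% credible interval
c(estimate = estimate, ci_95)
```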
Model comparison
The assessment of goodness-of-fit (GoF) is carried out using the leave-one-out information criterion (LOOIC; Vehtari, Gelman, and Gabry 2017). Lower LOOIC values indicate a better fit.
Predictions/Forecasting
Given the posterior MCMC samples, we can obtain Monte Carlo samples from the posterior predictive distribution under different environmental conditions and make forecasts. Suppose we wish to compute forecasts $K$ time points ahead. We define a $K$-dimensional variable $\mathbf{Y}^{*}$ representing the densities at these time points. Assuming environmental conditions at these time points are also available, we can generate predictions accounting for uncertainty using the posterior predictive distribution of $\mathbf{Y}^{*}$:
$$\pi(\mathbf{y}^{*} \mid \mathbf{y}) = \int f(\mathbf{y}^{*} \mid \boldsymbol{\theta}, \boldsymbol{\tau}^{*}) \, \pi(\boldsymbol{\tau}^{*} \mid \boldsymbol{\theta}, \boldsymbol{\tau}) \, \pi(\boldsymbol{\theta}, \boldsymbol{\tau} \mid \mathbf{y}) \, \mathrm{d}\boldsymbol{\tau}^{*} \, \mathrm{d}\boldsymbol{\tau} \, \mathrm{d}\boldsymbol{\theta},$$
where $\boldsymbol{\tau}^{*} = (\tau_{T + 1}, \ldots, \tau_{T + K})$ is a realization of the AR(1) process at the new time points, and $\boldsymbol{\theta}$ represents the model parameters. Moreover, the pdf $f(\cdot)$ can be obtained as in Section 3 or Section 2.
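The sketch below composes the pieces above to approximate the posterior predictive distribution $K$ steps ahead for one patch: for each (placeholder) posterior draw, the AR(1) effect is extended forward and a density is simulated from a zero-augmented Gamma model. The `draws` data frame stands in for actual posterior samples and is not the package's output format.

```r
## Sketch: Monte Carlo forecasts K time points ahead for a single patch.
set.seed(7)
K       <- 3
n_draws <- 1000

## placeholder "posterior draws" (illustrative distributions, not real output)
draws <- data.frame(phi_ar = rnorm(n_draws, 0.6, 0.05),      # AR(1) autocorrelation
                    sigma  = abs(rnorm(n_draws, 0.3, 0.05)), # innovation SD
                    tau_T  = rnorm(n_draws, 0, 0.3),         # last fitted random effect
                    rho    = rbeta(n_draws, 3, 7),           # absence probability
                    mu     = rlnorm(n_draws, log(5), 0.1),   # baseline mean density
                    phi    = rgamma(n_draws, 2, 1))          # Gamma shape parameter

y_star <- matrix(NA_real_, n_draws, K)
for (i in seq_len(n_draws)) {
  tau <- draws$tau_T[i]
  for (k in seq_len(K)) {
    tau    <- draws$phi_ar[i] * tau + rnorm(1, 0, draws$sigma[i])  # AR(1) ahead
    mu_k   <- draws$mu[i] * exp(tau)                               # forecast mean
    absent <- rbinom(1, 1, draws$rho[i])
    y_star[i, k] <- if (absent == 1) 0 else
      rgamma(1, shape = draws$phi[i], rate = draws$phi[i] / mu_k)
  }
}

apply(y_star, 2, quantile, probs = c(0.025, 0.5, 0.975))  # forecast intervals
```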
Potential issues with estimation & Closed formulas for special cases
Define
$$\zeta_{a,t,s} = \exp\{\mathbf{v}_{t,s}^{\top} \boldsymbol{\beta}^{(\zeta)} - F_{a,t}\}$$
as the environment-dependent survival for age $a$ at time $t$ and site $s$.
Then, in the scenario without movement, the expected density for age $a$ at time $t$ and site $s$ can be written as:
$$\mu_{a,t,s} = r_{t - a + 1, s} \prod_{j = 1}^{a - 1} \zeta_{j,\, t - a + j,\, s}.$$
Therefore, a closed form for the expected density is:
$$\mu_{t,s} = \sum_{a = 1}^{A} r_{t - a + 1, s} \prod_{j = 1}^{a - 1} \zeta_{j,\, t - a + j,\, s}.$$
Or, with selectivity,
$$\mu_{t,s} = \sum_{a = 1}^{A} q_a \, r_{t - a + 1, s} \prod_{j = 1}^{a - 1} \zeta_{j,\, t - a + j,\, s}.$$
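The closed form above can be checked numerically with a small sketch; `r_ts` and `zeta_ats` below are placeholder recruitment and survival arrays for a single patch, held constant purely for illustration.

```r
## Sketch: closed-form expected density without movement (single patch).
n_age  <- 4
n_time <- 10

r_ts     <- matrix(3, n_time, 1)              # recruitment r[t, s]
zeta_ats <- array(0.7, c(n_age, n_time, 1))   # survival zeta[a, t, s]

mu_closed <- function(t, s) {
  total <- 0
  for (a in 1:min(n_age, t)) {                # only ages recruited since time 1
    surv_prod <- 1
    if (a > 1)
      for (j in 1:(a - 1)) surv_prod <- surv_prod * zeta_ats[j, t - a + j, s]
    total <- total + r_ts[t - a + 1, s] * surv_prod
  }
  total
}

mu_closed(10, 1)   # equals 3 * (1 + 0.7 + 0.7^2 + 0.7^3)
```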