In October 2018, I completed my M.S. thesis, entitled "Methodologies for abundance estimation of moose (Alces alces) and other rare species". The process was incredibly rewarding, despite the numerous obstacles to success, and I'm proud of the skill set I was able to develop during my time here, studying with Dr. Angela Fuller (Cornell University, USGS NY Cooperative Fish and Wildlife Research Unit), Dr. Andy Royle (USGS Patuxent Wildlife Research Center), and Jeremy Hurst (NYSDEC).
The project began with, and always carried, the intent of quantifying the abundance of moose in New York, particularly within the Adirondack Park. Originally, we were going to use spatial capture-recapture (SCR). Ordinary capture-recapture methods for abundance estimation usually involve estimating a probability of detection for the population based on features of the landscape, features of the animal, or features of the observer. This detection probability is used to estimate how many individuals went unseen, and from that the total population. SCR extends this by making detection also conditional on an unobserved attribute of the animal -- the location of its home range (its activity center). Ordinary SCR assumes uniformly distributed activity centers, though this can be modified to depend on known continuous spatial variables to account for some spatial dependence and answer ecological questions at once.
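As a rough illustration of how a detection probability turns a count into a population estimate, here is a minimal sketch of the simplest possible version of the idea: divide the number of individuals detected by the probability of detecting any one of them. The function name and the numbers are hypothetical, chosen only for illustration; real capture-recapture models estimate the detection probability from the data rather than taking it as given.

```python
def estimate_abundance(n_detected: int, p_detect: float) -> float:
    """Scale a count of detected individuals up by a detection probability.

    Assumes every individual has the same, known detection probability,
    which is a simplification of what capture-recapture models do.
    """
    if not 0.0 < p_detect <= 1.0:
        raise ValueError("p_detect must be in (0, 1]")
    return n_detected / p_detect


# Hypothetical numbers, not from the moose survey.
n_detected = 30   # individuals actually observed
p_detect = 0.25   # assumed probability of detecting any one individual

n_hat = estimate_abundance(n_detected, p_detect)
n_unseen = n_hat - n_detected

print(f"Estimated total population: {n_hat:.0f}")
print(f"Estimated individuals that went unseen: {n_unseen:.0f}")
```

The lower the detection probability, the more unseen individuals each observed one is taken to represent.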
We carried out sampling of the moose population in the Adirondacks using detection dogs from Conservation Canines under this methodology, locating moose scats for genetic identification. In 2016, we sampled from June to the end of August across the entirety of the Adirondack Park on a sample of 3 km transects selected by random cluster sampling. When all was said and done, we had collected only 236 scats! This led us to consider ways to improve our sample size.
Moose in New York are at an extremely low density -- no statistics are needed to realize this. A full 2/3 of all of our survey units were devoid of any moose scat, and only a small number of samples came from the survey units we did visit. A better approach would be to select survey units based on whether any moose were present. However, without accounting for this aspect of the sampling, we would end up biasing our estimates.
We specified a statistical model that accounts for this preferential sampling, and results in unbiased estimates of abundance.
Applying this method to the 2017 field survey, we succeeded in improving our sample size by a factor of four -- almost 1,000 samples were collected.
However, there was a greater underlying issue. The samples were not successfully being identified to individual -- the DNA was not amplifying under polymerase chain reaction. This meant that we could not obtain individual identities, and it also meant that we could not use spatial capture-recapture as a method of analysis. We needed a way to estimate abundance of moose without relying on individual identity -- the inference needed to be drawn from GPS-logged movements of the detection dogs, the scat collection locations, and spatially-continuous landscape features.
Abundance estimators from scat surveys have been developed in the past. What is different is that the search patterns of detection dogs are extremely unstructured compared to previously defined survey protocols. Usually, detection probability can be inferred from highly structured survey protocols such as distance sampling (Buckland et al., generally) or plot sampling, but detection dogs do not conform to such survey protocols -- nor are those protocols suited to them, as most of these detection models are based on human visual searching rather than the scent-based search of a dog.
We realized that we needed to account for a few factors to solve this puzzle: How many scats were present before we arrived for the first time? What was the rate of detection of the dogs? What was the rate of accumulation of scats over time? The last bit is crucial. Very generally, setting aside the problem of detectability for a moment, the population of animals in an area can be calculated from the rate of scat accumulation (which I'm calling theta), and the per-capita rate of scat deposition (delta), in the formula below:
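Written out, and assuming theta and delta are measured over the same area and in the same time units, the relationship described above amounts to dividing the accumulation rate by the per-capita deposition rate. The symbol N for abundance is introduced here only for illustration:

```latex
N = \frac{\theta}{\delta}
```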
The rate of scat accumulation had to be scaled by detection probability -- certainly, we were not locating all of the scats on the landscape, even with the dogs' scenting prowess. The detection probability was estimated from situations where the dogs traversed the same grid cell more than once, and scats were collected in each separate replicate observation of that grid cell. Each replicate observation can be thought of as a binomial trial, with some probability of detection and a population that depends on how many scats were there before that replicate observation.
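To make the scaling concrete, here is a minimal sketch of how an observed accumulation rate might be corrected for imperfect detection and then converted to abundance. It is an illustration under simplifying assumptions, not the model used in the thesis: detection probability is treated as a single known constant, and all names and numbers are hypothetical.

```python
def abundance_from_scat(observed_rate: float, p_detect: float, delta: float) -> float:
    """Convert an observed scat accumulation rate into an abundance estimate.

    observed_rate: scats found per day in the study area (hypothetical units)
    p_detect:      assumed probability that a scat present is found
    delta:         assumed per-capita deposition rate (scats per animal per day)
    """
    if not 0.0 < p_detect <= 1.0:
        raise ValueError("p_detect must be in (0, 1]")
    if delta <= 0.0:
        raise ValueError("delta must be positive")
    theta = observed_rate / p_detect  # detection-corrected accumulation rate
    return theta / delta              # animals needed to produce that rate


# Hypothetical values chosen only to make the arithmetic easy to follow.
n_hat = abundance_from_scat(observed_rate=60.0, p_detect=0.5, delta=15.0)
print(f"Estimated abundance: {n_hat:.1f}")
```

Halving the detection probability doubles the corrected accumulation rate, and therefore doubles the abundance estimate, which is why the detection rate matters so much.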
As an example, the dog enters a grid cell -- which has 5 scats to begin with from deposition occurring between visits -- and finds 2 scats. This is a naive detection rate of 0.4. Say the dog enters the grid cell again, and this time finds 1 of the remaining 3; this is a detection rate of 1/3. Our best guess (estimate) of detection probability, given these data, is somewhere in between: about 0.367 if we simply average the two rates. Through many such observations, we can estimate a rate of detection, possibly dependent upon features such as dog identity, handler identity, habitat, or how long the dog track was in the grid cell.
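The arithmetic in this example can be checked with a short sketch. It computes the naive rate for each visit and the simple average of the two. For comparison it also computes a pooled estimate (total found divided by total available), which is the binomial maximum-likelihood estimate if detection probability is constant and the number of scats available on each visit is known. That last condition is an assumption made only for this illustration, since in the field the number of scats present is not observed.

```python
# Each visit: (scats available in the cell, scats found by the dog).
# Numbers come from the worked example in the text.
visits = [(5, 2), (3, 1)]

# Naive detection rate for each visit on its own.
naive_rates = [found / available for available, found in visits]

# Simple average of the per-visit rates.
mean_of_rates = sum(naive_rates) / len(naive_rates)

# Pooled estimate: all scats found over all scats available.
total_available = sum(available for available, _ in visits)
total_found = sum(found for _, found in visits)
pooled_rate = total_found / total_available

print("Per-visit rates:", [round(r, 3) for r in naive_rates])
print("Average of rates:", round(mean_of_rates, 3))
print("Pooled estimate:", round(pooled_rate, 3))
```

The two summaries differ slightly because the pooled estimate gives more weight to the visit with more scats available. Both fall between the two naive rates.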
We applied this method, using spatially-continuous covariates of UTM Northing, elevation, highway kernel density, minor road kernel density, and a categorical habitat covariate (reclassified into conifer, deciduous, mixed, wetland, and 'other').
These covariates were selected based on the ecology of the moose and hypothesized effects in this system -- moose are thermally stressed in summer, and so they would seek habitats such as dense conifer cover and wetlands, and higher elevation regions; moose avoid human activity represented by the road metrics here; finally, moose ought to favor deciduous forest, particularly regenerating stands, because it is their main source of forage in the summer (alongside wetlands). There were four model classes, two of which were non-spatial, and two of which had spatial covariates -- the latter were the "human deterrence model" and the "habitat model".
The "human deterrence model" had the following covariates:
The "habitat model" had the following covariates:
Assessed by DIC, the null model was actually the most favored:
However, the abundance estimates were relatively stable across models -- we predict approximately 600 moose within the Adirondack Park of New York. The spatial distribution of moose, based on the two spatial models, appears to be dominated by elevation, with most of the predicted moose abundance in the High Peaks region of the park and near Lyon Mountain to the north.
Nevertheless, the predictions aligned with what most people knew anecdotally and what we saw in the field. Most of the moose -- by far -- are concentrated up in the northern part of the park. We hardly made any collections at all south of Indian Lake, except for a pocket near Northville. Certainly, the eastern border of the park had no observed moose scats, which is reflected well in the spatial predictions below.
I am very, very glad to have participated in this project. The work was so varied, and I feel that I grew greatly in my professional capacity. My experiences were not limited to developing statistical analyses and programming techniques -- I was able to take on leadership roles, mentoring undergraduates and leading a field crew of 10 personnel into the Adirondack wilderness; to practice public communication through publication, conference presentations, and interaction with landowners; and to manage the practical logistics of quality data collection in wilderness conditions. I hope that my future work will continue to challenge me in as many ways as this project has.