Finding Balance on the Bias-Variance Seesaw
Building models is a tricky business. There are lots of decisions involved and competing motivations. Say we are an ecologist studying owl abundance in a park near our school. Our primary goal may be to have a good understanding of what is going on in our data. We don’t want to miss any important relationships between abundance and measurable factors about the landscape. For example, if we didn’t include tree cover as an explanatory variable, we might end up with an underfit model, since that variable could tell us about the availability of nesting spots for owls.
Fixed, Mixed, and Random Effects: The Ecology Edition
Image Credit: WomEOS, CC BY-SA 2.0, Image Cropped
I’ve written about fixed, mixed, and random effects in linear models before (and others have too) but I think it’s time to approach the topic with some ecology motivation. What do these different types of effects mean to us in the wild and when might we need to use one over the other? Read on to learn more!
It’s All Relative: Measuring Abundance In The Face of Detection Bias
There are many papers out there discussing estimates of abundance and occurrence of a variety of plants and animals. Sometimes you’ll also see references to relative abundance and relative occurrence. What makes researchers go for one estimate over the other? When might you face a similar choice? The goal of this post is to try to shed some light on when you might want to keep things relative.
The Ecological Fallacy: What Does It Have To Do With Us?
Image Credit: Erik Karits, Pixabay licence, Image Cropped
Every once in a while the term “ecological fallacy” gets thrown around to critique a particular study. Some Twitter discussion around this pre-print, which compares COVID-19 mortality to vegetable consumption at a country level, got me thinking about the term again. So let’s go through what it is, why it’s a problem, and why sometimes it can’t be avoided.
Hey You… Take a Sad Estimator and Make it Better: The Rao Blackwell Theorem
Image Credit: Bureau of Land Management, CC BY 2.0, Image Cropped
A common goal of ecologists is to understand the population abundance of a particular species. We might be looking for the California condor as part of assessing how well the recovery project is going. This requires some field work, going out to a variety of sites and counting animals that we see. How do we choose which sites to go to? Even in the era of camera traps, we still need to know where to put our extra set of eyes. It would be a shame to have a particular camera not get any action due to an unlucky placement. We don’t have infinite time and money after all!
Don’t Let Coefficient Interpretation Make an Ass of You
Image Credit: beeveephoto, CC BY-SA 2.0, Image Cropped
Everything that ecologists do – from saving endangered species to projecting climate change impacts – requires ecological data. Sometimes that data can be hard to come by, like when you’re trying to figure out the range of a rare moss. At other times, that data can be smack bang in front of you, but impossible to measure directly. The depth of a lake for instance, or the surface area of a tree. Today, we’ll look at how to overcome that second situation, by using other, more easily obtained covariates to provide an estimate of the property you’re looking for.
Bayesians v. Frequentists: A Tale as Old as Time
In our last stats post, we talked at length about everything that can influence the outcome of a statistical model. The choice of parameters. The choice of data. But one thing we avoided talking about was the choice of the approach to the model itself. And that brings us to the two big approaches in statistical modelling – Bayesian vs. Frequentist.
Model Mis-specification: All The Ways Things Can Go Wrong…
Image Credit: Grand Velas Riviera Maya, CC BY-SA 2.0, Image Cropped
In ecological studies, the quality of the data we use is often a concern. For example, individual animals may be cryptic and hard to detect. Certain sites that we should really be sampling might be hard to reach, so we end up sampling more accessible, less relevant ones. Or it could even be something as simple as recording a raven when we’re really seeing a crow (check out #CrowOrNo if you have problems with that last one). Modeling approaches aim to mitigate the effect of these data collection shortcomings on our results.
However, even if we had perfect data, when we decide how to model that data, we have to make choices that may not match the reality of the scenario we are trying to understand. Model mis-specification is a generic term for when our model doesn’t match the processes that generated the data we are trying to understand. It can lead to biased estimates of covariate effects and incorrect uncertainty quantification.
What’s the Deal with P-Values and Their Friend the Confidence Interval?
After the first edition of Ecology for the Masses’ new Stats Corner, many people requested a discussion of p-values. Ask and you shall receive! And as an added bonus, we’ll also talk about confidence intervals. (Image Credit: Patrick Kavanagh, CC BY 2.0, Image Cropped)