Author Archives: Sara Stoudt

Who Is Simpson And What Does His Paradox Mean For Ecologists?

Edward H. Simpson was a codebreaker at Bletchley Park, the home of Allied codebreakers during the Second World War. While you’d think this would be his claim to fame, perhaps his most lasting contribution is his description of Simpson’s paradox. The paradox describes the phenomenon whereby a relationship within a dataset changes dramatically depending on whether you look at the data by group or all together. The best-known examples of the paradox come from the medical world, along with the famous Berkeley admissions case. But what examples can we have in mind in ecological settings to guide us? Let’s consider the dimensions of penguins’ bills compiled from Palmer Station in Antarctica. If we are interested in the relationship between bill depth and length, we might do a preliminary analysis like the following linear regression.
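To make the paradox concrete, here is a minimal sketch in Python. It is not the post's own analysis and it does not use the Palmer Station measurements: the two groups (`species_a`, `species_b`) and all of the numbers are made up purely for illustration, and `ols_slope` is a hypothetical helper written for this example. The idea is simply to compare the regression slope of depth on length within each group against the slope you get when the groups are pooled.

```python
def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs (simple linear regression)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Made-up bill measurements in mm (assumption: NOT real penguin data).
groups = {
    "species_a": {"length": [36.0, 38.0, 40.0, 42.0], "depth": [17.0, 18.0, 19.0, 20.0]},
    "species_b": {"length": [44.0, 46.0, 48.0, 50.0], "depth": [13.0, 14.0, 15.0, 16.0]},
}

# Slope within each group separately.
group_slopes = {
    name: ols_slope(g["length"], g["depth"]) for name, g in groups.items()
}

# Slope when all observations are pooled and the grouping is ignored.
all_length = [x for g in groups.values() for x in g["length"]]
all_depth = [y for g in groups.values() for y in g["depth"]]
pooled_slope = ols_slope(all_length, all_depth)

for name, slope in group_slopes.items():
    print(f"{name}: slope = {slope:.3f}")
print(f"pooled: slope = {pooled_slope:.3f}")
```

In this toy dataset, depth increases with length inside each group, yet the pooled slope is negative because the longer-billed group has shallower bills overall. That sign flip is the pattern the paradox describes.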

Read more

Let’s Get Meta… The Good Kind

Image Credit: Patrick Kavanagh, CC BY 2.0, Image Cropped

In my last post we talked about using images as data. This time we’ll consider another non-traditional source of data: the results of other investigations. Using results to generate more results? That seems weird… at first. But think about how science progresses. We build on other studies all the time! Sometimes we use others’ findings as a jumping-off point. Other times, studies invite us to see if we can reproduce their findings under new conditions or with respect to our own study site or species of interest.

Read more

Data in Colour: Bringing Photos Into Our Spreadsheets

Image Credit: Shiv’s fotografia, CC BY-SA 4.0, Image Cropped

When I think of the ecological data I typically work with, it usually tells me where plants or animals are, how many of them there are, and how those quantities might change. Most often, these organisms boil down to a few spreadsheet cells. But what if the questions you’re asking are less “where is the organism”, and more “what does it look like”? 

Photographic data is not a new phenomenon for scientists, but thanks to huge leaps in technology (hello, camera phones) it is a booming data source. Community science – whereby members of the general public submit photos of species they’ve happened across – has seen a huge rise in popularity, thanks to apps and community platforms like iNaturalist. As a result, photo data is constantly growing in abundance, and many studies are quickly adapting to take advantage of this data source.

Read more

To Get Great (Statistical) Power, It Takes Great Responsibility

Image Credit: Miss Ophelia, Pixabay licence, Image Cropped

There are a lot of questions in ecological research that ask whether or not something has changed over time, or put more simply, whether two things are different – vegetation levels, climate variables, maybe species diversity.

Suppose we are monitoring nutrient levels in a lake to make sure they stay at levels that are habitable for the fish living there. A policy change was enacted governing what local factories are allowed to dump into the river, and we want to see if there is evidence that nutrient levels have deteriorated in the year following the change compared to the year before.
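One way to frame this before-and-after comparison is a permutation test on the difference in mean nutrient level. The sketch below is only an illustration under stated assumptions: the monthly readings are invented, the units are left unspecified, `permutation_p_value` is a hypothetical helper, and the full post may well use a different test. A two-sided test is used so that no direction of "deterioration" has to be assumed.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(before, after, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = abs(mean(after) - mean(before))
    pooled = list(before) + list(after)
    n_before = len(before)
    at_least_as_extreme = 0
    for _ in range(n_perm):
        # Shuffle the year labels: under "no change", labels are exchangeable.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[n_before:]) - mean(pooled[:n_before]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add one to numerator and denominator so the p-value is never exactly zero.
    return (at_least_as_extreme + 1) / (n_perm + 1)


# Invented monthly readings (assumption: not real monitoring data).
year_before = [4.8, 5.1, 5.0, 4.9, 5.2, 5.0, 4.7, 5.1, 4.9, 5.0, 5.3, 4.8]
year_after = [4.1, 4.4, 4.2, 4.0, 4.5, 4.3, 4.1, 4.4, 4.2, 4.3, 4.6, 4.0]

p_value = permutation_p_value(year_before, year_after)
print(f"difference in means: {mean(year_after) - mean(year_before):.2f}")
print(f"permutation p-value: {p_value:.4f}")
```

A small p-value here says only that a difference this large would be unusual if the policy change had no effect on the readings. It does not account for seasonality or for correlation between consecutive months, both of which real monitoring data would likely have.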

Read more

Does A Modern Ecologist Need To Become A Bayesian?

Image Credit: 2010 Jee & Rani Nature Photography, CC BY-SA 4.0, Image Cropped

This question comes from Marney Pratt (@marney_pratt) as she noted that a recent paper tracking trends in ecology papers shows the use of Bayesian statistics increasing over time. (Before we get going, if you want a refresher about what exactly Bayesian thought entails, check out this previous post.) Anderson et al. say:

Read more

“Wait, What Am I Even Saying?” Communicating Statistics To A Wide Audience

If we write about the statistical methods behind our ecology work, and none of our readers understand it, have we really communicated at all?

This month I’m getting meta. It’s been about a year and a half since I started writing the Stats Corner for this blog with the goal of demystifying some of the statistical methods that are used by ecologists every day. At the same time, I’ve been writing a book with Deborah Nolan called “Communicating with Data: The Art of Writing for Data Science.” The book was released this spring, so it seemed like a good time to reflect on writing about statistics accessibly. 

Read more

The How, Why, and When of Transforming Data

We’ve been out in the field, painstakingly collecting each butterfly and measuring its body length and wingspan. Now is the moment of truth. We’re about to make a plot and see if the assumptions we make about the relationship between the two measurements are backed up by a linear regression. Is the relationship between length and wingspan what we’d expect? Will a linear model be appropriate or are we going to have to break out the heavier machinery?

Read more

“Those Things Are Evil”: Prediction Intervals in Mixed Models

Suppose we study salamanders and want to predict body mass based on their body length. We also want to account for different access to food and differing levels of competition at each site we’ve collected our salamanders from. So we fit a linear model with a random effect for site as we only have samples from a subset of sites. (Want a refresher on random effects? We’ve got you covered.)

Read more

Finding Balance on the Bias-Variance Seesaw

Building models is a tricky business. There are lots of decisions involved and competing motivations. Say we are ecologists studying owl abundance in a park near our school. Our primary goal may be to have a good understanding of what is going on in our data. We don’t want to miss any important relationships between abundance and measurable factors about the landscape. For example, if we didn’t include tree cover as an explanatory variable, we might end up with an underfit model, since that variable tells us about the availability of spots for owls to nest.

Read more