Image Credit: angela n., CC BY 2.0, Image Cropped
The time has come for the Stats Corner to wrap up as Ecology for the Masses closes. I want to thank Sam Perrin for bringing me on board, carving out a place for statistics on the site, and supporting my writing for the past two years. He was always generous with his time helping me brainstorm topics when I was out of ideas and providing helpful feedback to make the posts more engaging and accessible. We all have Sam to thank for the fun pictures that kick off each post as well. 🙂
Title Image Credit: Tony Webster, CC BY-SA 2.0, Image Cropped
Nature is complicated and the environment is vast. How can we possibly learn all there is to know about our surroundings? Aspects of our natural world like population dynamics and life histories influence the very survival of species, but understanding these requires data from long time periods. Luckily, technology and the remarkable commitment of some scientists have meant that we are making progress in data collection by establishing many long-term monitoring networks that collect a variety of information at many different locations. Weather sensors keep track of temperature, precipitation, and wind speed. Air- and water-quality sensors keep track of what is in the air we breathe and the water we drink.
Edward H. Simpson was a codebreaker at Bletchley Park, the home of Allied codebreakers during the Second World War. While you’d think this would be his claim to fame, perhaps his most lasting contribution is his description of Simpson’s paradox. The paradox describes the phenomenon whereby a relationship within a dataset changes dramatically, sometimes even reversing direction, depending on whether you look at the data by group or all together. The best-known examples of the paradox stem from the medical world or the famous Berkeley admissions example. But what examples can we have in mind in ecological settings to guide us? Let’s consider the dimensions of penguins’ bills compiled from Palmer Station in Antarctica. If we are interested in the relationship between the bill depth and length we might do a preliminary analysis like the following linear regression.
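To make the paradox concrete, here is a minimal sketch in Python. The numbers are invented for illustration and are not the Palmer Station measurements; the group labels and the `ols_slope` helper are likewise hypothetical. It fits a least-squares slope of bill depth on bill length within each group, then again on all of the data pooled together.

```python
# Illustrative only: invented numbers, not the Palmer Station measurements.
# Each group pairs bill lengths (mm) with bill depths (mm).
groups = {
    "group A": ([36, 38, 40, 42], [17.0, 17.5, 18.0, 18.5]),
    "group B": ([46, 48, 50, 52], [13.0, 13.5, 14.0, 14.5]),
}


def ols_slope(x, y):
    """Least-squares slope of y regressed on x (simple linear regression)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Slope of depth on length within each group separately.
within_slopes = {name: ols_slope(x, y) for name, (x, y) in groups.items()}

# Slope when the group labels are ignored and all points are pooled.
all_x = [xi for x, _ in groups.values() for xi in x]
all_y = [yi for _, y in groups.values() for yi in y]
pooled_slope = ols_slope(all_x, all_y)

for name, slope in within_slopes.items():
    print(f"{name}: slope = {slope:+.3f}")
print(f"pooled:  slope = {pooled_slope:+.3f}")
```

In this toy dataset the slope is positive within each group but negative once the groups are pooled, because the group with longer bills also has shallower bills overall. That sign flip is the signature of Simpson’s paradox.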
Image Credit: Patrick Kavanagh, CC BY 2.0, Image Cropped
In my last post we talked about using images as data. This time we’ll consider another non-traditional source of data: the results of other investigations. Using results to generate more results? That seems weird… at first. But think about how science progresses. We build on other studies all the time! Sometimes we use others’ findings as a jumping-off point. Other times, studies invite us to see if we can reproduce their findings under new conditions or with respect to our own study site or species of interest.
Image Credit: Shiv’s fotografia, CC BY-SA 4.0, Image Cropped
When I think of the ecological data I typically work with, it usually tells me where plants or animals are, how many of them there are, and how those quantities might change. Most often, these organisms boil down to a few spreadsheet cells. But what if the questions you’re asking are less “where is the organism”, and more “what does it look like”?
Photographic data is not a new phenomenon for scientists, but thanks to huge leaps in technology (hello, camera phones) it is a booming data source. Community science – whereby members of the general public submit photos of species they’ve happened across – has seen a huge rise in popularity, thanks to apps and community platforms like iNaturalist. As a result, photo data is constantly growing in abundance, and many studies are quickly adapting to take advantage of this data source.
Image Credit: Miss Ophelia, Pixabay licence, Image Cropped
There are a lot of questions in ecological research that ask whether or not something has changed over time, or put more simply, whether two things are different – vegetation levels, climate variables, maybe species diversity.
Suppose we are monitoring nutrient levels in a lake to make sure they stay within a range that is habitable for the fish living there. A policy change governing what local factories may dump into the river has been enacted, and we want to see if there is evidence that the nutrient levels have deteriorated in the year following the change when compared to the year before.
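One way to frame this comparison is a permutation test on the difference in mean nutrient level, sketched below in Python. This is only one possible approach and an assumption on my part for illustration. The readings are invented and in arbitrary units, and the sketch assumes that "deteriorated" means the levels went up, so the test is one-sided.

```python
from itertools import combinations

# Invented nutrient readings in arbitrary units, for illustration only.
before = [10, 11, 9, 10, 12, 11]   # year before the policy change
after = [13, 14, 12, 15, 13, 14]   # year after the policy change


def mean(values):
    return sum(values) / len(values)


# Observed difference in means (after minus before).
observed_diff = mean(after) - mean(before)

# Under the null hypothesis of no change, the "before"/"after" labels are
# interchangeable, so we enumerate every way of relabelling the readings.
pooled = before + after
indices = range(len(pooled))

n_splits = 0
n_extreme = 0
for after_idx in combinations(indices, len(after)):
    chosen = set(after_idx)
    relabelled_after = [pooled[i] for i in chosen]
    relabelled_before = [pooled[i] for i in indices if i not in chosen]
    diff = mean(relabelled_after) - mean(relabelled_before)
    n_splits += 1
    # Count relabellings at least as extreme as what we observed
    # (small tolerance guards against floating-point round-off).
    if diff >= observed_diff - 1e-12:
        n_extreme += 1

# One-sided p-value: share of relabellings with a difference this large.
p_value = n_extreme / n_splits
print(f"observed difference = {observed_diff:.2f}")
print(f"one-sided p-value   = {p_value:.4f} ({n_extreme} of {n_splits} splits)")
```

A small p-value here says that a difference this large would rarely arise if the labels "before" and "after" made no difference. With real monitoring data, readings taken close together in time are often correlated, which this simple relabelling scheme ignores.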
Image Credit: 2010 Jee & Rani Nature Photography, CC BY-SA 4.0, Image Cropped
This question comes from Marney Pratt (@marney_pratt) as she noted that a recent paper tracking trends in ecology papers shows the use of Bayesian statistics increasing over time. (Before we get going, if you want a refresher about what exactly Bayesian thought entails, check out this previous post.) Anderson et al. say:
If we write about the statistical methods behind our ecology work, and none of our readers understand them, have we really communicated at all?
This month I’m getting meta. It’s been about a year and a half since I started writing the Stats Corner for this blog with the goal of demystifying some of the statistical methods that are used by ecologists every day. At the same time, I’ve been writing a book with Deborah Nolan called “Communicating with Data: The Art of Writing for Data Science.” The book was released this spring, so it seemed like a good time to reflect on writing about statistics accessibly.