I want to share a story about how a new idea can originate in a chance hallway conversation. The story is about ecological niches.
The ecological niche is something different, but draws on the metaphor of a recessed space that can hold something. The concept was primarily popularized by G.E. Hutchinson in the late 1950s, and used to describe species. For him, a niche is
An area is thus defined, each point of which corresponds to a possible environmental state… an n-dimensional hypervolume is defined, every point in which corresponds to a state of the environment which would permit the species.. to exist indefinitely.
This reasonable concept makes intuitive sense: for some set of variables that are important to species, the niche is the combination of variables in which the species can live. You might imagine that a ‘real’ niche simultaneously depends on many variables. Unfortunately the definition says nothing about how to measure one.
This brings us a few decades forward, to early 2012. One morning I left my desk to get a drink of water in the hallway. I ran into Brian Enquist and Christine Lamanna, who were there talking about ecological niches. They asked me if I knew of any algorithms to find the overlap between two niches. I said I didn’t, but I was sure that the problem was solved, and a quick search would resolve the question. We did that search, and after some hours it became clear that the problem was actually not solved. Conceptual debates over what a niche is had for several decades obscured the issue of how to measure one.
The trouble is that a niche probably has many dimensions, and the mathematics get bad very quickly. Thinking about the geometry of a high-dimensional object is fundamentally different than thinking about a low-dimensional object. We were surprised by how hard the problem seemed, but also that no one had yet solved it. So we set ourselves the challenge of finding a more efficient way to determine the shape and overlap of high-dimensional ecological niches. We decided to do a thresholded kernel density estimation, which amounts to taking a set of observations of all variables, estimating their overall probability density, then slicing through this density to determine the ‘edge’ of the niche. You can see this illustrated in one dimension below.
Finding a computationally fast way to do this was a challenge. Over the next few months, I wrote five or six different algorithms based on different ideas. Most were a complete failure or were computationally too slow to be effective in more than two or three dimensions. Here’s an early version based on range box-intersections.
And here’s another version, some weeks later – after I discovered an idea that would form the basis of our final solution.
The core innovation is that the entire niche space doesn’t have to be sampled. A good algorithm can avoid doing calculations in the vast regions of niche space that are almost surely ‘out’, and focus instead on the regions that might be ‘in’. The tradeoff is that the niche is only defined in a stochastic sense. We only get approximate answers but we can get them quickly, and can easily visualize the shape and volume of an arbitrary complex niche in any number of dimensions.
I was able to get a usable version working in MATLAB, and after another few weeks of work, rewrote all the code in R, a more commonly used statistical programming language. I won’t go into all the details of the underlying mathematics, but will simply say that everything works – and for reasonably sized datasets, answers can be obtained in seconds or minutes.
Here’s an example of what the algorithms can do. Below you are seeing a three-dimensional niche, the functional hypervolume of tree species in the temperate (blue) or tropical biomes (red) of the New World. The overlapped hypervolume is in orange. We are now able to determine exactly what resources and strategies are unique or shared in different parts of the world.
We had a rough time getting the method published – niches are controversial because of their central importance in ecology. But after some discouraging rejections, our paper is now out! Check it out in Global Ecology and Biogeography (PDF version here). The issues around niche concepts and niche measurement are more complex than I have presented in this short blog post, and I encourage you to read the paper for a full exploration of the topic. As for the algorithms, they are freely available as the ‘hypervolume’ package on CRAN and come with documentation and demonstration analyses.
Here’s an example (in R) of how to use the code – just a few lines to analyze and visualize a complex multidimensional dataset.
I hope these algorithms will open up a whole new range of analyses that weren’t possible before. It has been a long road to see our work come out – and all thanks to a lucky hallway conversation, good collaborators, and the stubbornness to believe that some things that look impossible are not.