Why is it so hard to detect species interactions?

organ pipe

The western coast of Sonora, México is populated by a fascinating mixture of desert and dry forest species. Here the organ-pipe cactus (Stenocereus thurberi) rises over a landscape dominated by Bursera, Ruellia, Jatropha, and a range of other common shrubs. Do any of these species care about each other? Do they co-occur entirely by chance, or do they actively interact? Presumably they compete for nutrients and water; they may also share pollinators, or require the same seed dispersers, or mutually suffer when one inadvertently attracts herbivores. But does any of this matter when we simply want to predict the distribution of these species?

To answer that question, consider these two contrasting examples from the coast of Sonora.

habitat

The first example is from a canyon bottom. A palm (Washingtonia robusta (Arecaceae)) is growing adjacent to a fig tree (Ficus insipida (Moraceae)). But their co-occurrence is probably not due to any direct interaction. They are both here because they share similarly high water requirements. They would both be outcompeted in drier environments or be physiologically unable to survive and reproduce in the open desert. We can safely call this the absence of a direct interaction.

parasite

The second example is from a seaside promontory. This is actually not one plant, but two – a Bursera microphylla (Burseraceae) tree, with a hemiparasitic Psittacanthus sonorae (Loranthaceae) rooted onto its branches, happily flowering using stolen water and carbon. In this case the Bursera suffers from the Psittacanthus, and the Psittacanthus cannot live without the Bursera. This is a clear case of a direct interaction.

One of the major goals of ecology is to accurately predict how species will adapt, move, or die as the Earth’s climate begins to change. If species do not interact, the problem is relatively easy: each species can be modeled independently of all others. But if species do interact, the problem becomes difficult very quickly. In these two examples, we know enough about the natural history and physiology of the organisms to determine whether interactions are occurring. But this doesn’t work when the goal is to predict the distribution of all the species on Earth. The problem is that the number of possible pairwise associations between n species is equal to n(n-1)/2. For two species, there is only 1 association to understand. For a thousand species, there are 499,500 possible pairwise associations that need to be understood. These associations provide a lower bound on the number of possible interactions, and there are far too many to ever understand through natural history and expert knowledge alone.
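To get a feel for how fast that count grows, here is a tiny Python sketch of the n(n-1)/2 formula. The function name is just something I made up for illustration.

```python
def pairwise_associations(n_species):
    """Number of distinct species pairs among n_species, i.e. n(n-1)/2."""
    return n_species * (n_species - 1) // 2


for n in (2, 10, 1000):
    print(n, "species ->", pairwise_associations(n), "possible pairwise associations")
# 2 species -> 1 possible pairwise associations
# 10 species -> 45 possible pairwise associations
# 1000 species -> 499500 possible pairwise associations
```

The count grows roughly with the square of the number of species, which is why expert knowledge alone runs out so quickly.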

Back in 2012, my friend Naia Morueta-Holme and I began developing some new statistical approaches to resolve this well-known but unsolved problem. Our idea was that interactions could be determined by subtracting away confounding factors like shared habitat requirements. We proposed to first build models of species’ broad-scale geographic distributions based on climatic tolerances and use them to predict how likely it was for species to co-occur. We then proposed to compare these co-occurrence scores to how often species actually did co-occur in small-scale communities. If two species were found together more often or less often than the regional climatic expectation, then they were positively or negatively associated with each other. We then further subtracted away any indirect associations that arise through other species. This final association could then hopefully be interpreted as an interaction. We also argued that although species associations exist between pairs of species, they are much more useful in a network context, just as individual friendships are less interesting than an entire social network. This let us identify species that (for example) were hubs in communities, actively attracting or repelling other species.
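The logic of those steps can be sketched in a few lines of Python with NumPy. This is a simplified illustration under my own assumptions, not the algorithm in the paper or in the netassoc package: it takes an observed site-by-species presence matrix and a matching matrix of climate-based occurrence probabilities (both assumed to be given), treats the difference as the part climate cannot explain, and uses partial correlations as a stand-in for removing indirect associations. The function names and the hub score are hypothetical.

```python
import numpy as np


def association_network(observed, expected):
    """Sketch: species-by-species associations after removing climate and indirect effects.

    observed: sites x species array of 0/1 presences in local communities.
    expected: sites x species array of occurrence probabilities predicted
              from climate alone (assumed to come from a separate model).
    """
    observed = np.asarray(observed, dtype=float)
    expected = np.asarray(expected, dtype=float)

    # Step 1: subtract the regional climatic expectation from what was observed.
    residuals = observed - expected

    # Step 2: pairwise associations among species across sites.
    corr = np.corrcoef(residuals, rowvar=False)

    # Step 3: partial correlations, which discount associations that run
    # through a third species (computed from the inverse correlation matrix).
    precision = np.linalg.pinv(corr)
    scale = np.sqrt(np.diag(precision))
    partial = -precision / np.outer(scale, scale)
    np.fill_diagonal(partial, 0.0)
    return partial


def hub_scores(network):
    """Sum of absolute association strengths per species (a simple hub measure)."""
    return np.abs(network).sum(axis=0)
```

Positive entries in the returned matrix mean two species turn up together more often than climate would suggest, and negative entries mean less often. A species with a high hub score is associated with many others. Note that a species with no variation in its residuals would produce undefined correlations, so real data would need more care than this sketch takes.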

network framework

Three years later, the framework for doing all this is now published in the journal Ecography as a shared first-author paper: Morueta-Holme, N., Blonder, B., et al., A network approach for inferring species associations from co-occurrence data. I also wrote a free R package, netassoc, that implements the framework.

We used a large dataset of New World plant distributions from the BIEN initiative to explore the framework’s inferences for the trees of the eastern United States. We found that interactions between species are surprisingly rare, and are positive when found. A few species seem to act as key aggregators, like the balsam fir, Abies balsamea (Pinaceae).
network zoom

And you can see this species does happily co-occur with white cedar (Thuja occidentalis (Cupressaceae)) in this Maine forest!

thuja abies

The framework makes these predictions without needing to know the natural history of every possible pairwise interaction between species. We hope that the approach will provide a useful tool for predicting the future distributions of species under climate change.

It doesn’t work perfectly – the error rates in simulations are higher than we’d like, and small changes in inputs can lead to relatively high uncertainty in predictions. It took three years of simulations and arguments and failures to get this far. At times we thought about giving up and abandoning the whole project. But I’m glad we stuck with it. I think that the ‘easy’ co-occurrence data that the framework uses is inherently very limited. We throw a lot of mathematics at the data, and push the limits as far as they can go. It’s a start.

(The desert photos were taken on a New Year’s trip with two other botanists. As the ecologist Rob Colwell has apocryphally but accurately said, “One botanist, half a walk. Two botanists… quarter of a walk. Three botanists… no walk at all.”)