Capturing a River’s Memory

By Nancy Bazilchuk

When Patty Zaradic sees the spiny alien body of a mayfly nymph or the rough geometric tube of a caddisfly nymph, she sees more than an aquatic insect adapted to clear running water. In the patterns and assemblages of these macroinvertebrates, she sees a river’s memory.

“Aquatic insects are like camcorders,” says Zaradic, a postdoctoral fellow at the Stroud Water Research Center as part of The Nature Conservancy’s Smith Fellowship program. “The presence or absence of one of hundreds of possible species can tell us what happened to the stream for months or even a year prior to the sampling date.”

Aquatic biologists have long taken advantage of aquatic insects as “camcorders” of water quality. Unfortunately, much of the complexity of what macroinvertebrates record can’t be detected using traditional statistical analyses and models. Zaradic’s research with artificial neural networks may offer biologists a powerful new tool to interpret more of the information recorded by bottom-dwelling creatures.

Artificial neural networks—computer algorithms modeled on the ganglion networks in animal brains—have been used for more than a decade by physicists, economists, and medical researchers to recognize patterns in nonlinear, nonnormal data. Biologists are now using neural nets to identify species, predict phytoplankton production, and model parameters for brown trout management. Zaradic’s neural network uses freshwater aquatic communities to track the land use cover in a watershed with great precision.

When given aquatic insect sampling data, Zaradic’s neural network can tell her exactly what kind of land use is found in the watershed draining into a stream. And when something goes wrong—a spill from a wastewater treatment plant, the paving of a parking lot—the neural network can detect the change and sound the alarm. The first step in the process is to “train” the network by giving it a large amount of data showing the relationship between macroinvertebrate samples and their associated land use cover.

Zaradic has a powerful ally in amassing this data: the Stroud Water Research Center’s six-year, US$7.2-million project funded by the state of New York and the U.S. Environmental Protection Agency to monitor the watershed that provides New York City’s drinking water. Since 2000, Stroud researchers have been collecting baseline water chemistry data, aquatic insect samples, and land use data from sub-watersheds in a 5,000-km² area in upstate New York (sampling began in 60 sub-watersheds and now occurs in 110). More than 500 species of macroinvertebrates have been identified in the streams.

Using this huge store of empirical data, the neural network generates a complicated model of the relationship between macroinvertebrate assemblages and land use. The network then uses an iterative process to make predictions based on the model and check them against the known land use cover. It can then adjust internal weighting algorithms so that the model’s next predictions are more accurate.
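For readers who want to see the predict, check, and adjust cycle in miniature, here is a rough Python sketch. It is not Zaradic’s model: it trains a single linear unit by gradient descent, which is far simpler than a real neural network, and every number in it is a made-up toy value (the three-taxon samples, the forest-cover fractions, the learning rate) chosen only to illustrate the idea.

```python
# Hypothetical toy data, not from the Stroud project.
# Each row: relative abundance of three taxa at one sampling site.
samples = [
    [0.8, 0.1, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.3, 0.5],
    [0.1, 0.2, 0.7],
]
# Known fraction of forested land upstream of each site (invented values,
# constructed so that an exact linear fit exists).
known_forest = [0.85, 0.75, 0.35, 0.20]


def predict(weights, bias, sample):
    """The model's current guess: a weighted sum of the inputs."""
    return sum(w * x for w, x in zip(weights, sample)) + bias


def mean_squared_error(weights, bias):
    """Average squared gap between predictions and known land use."""
    errors = [predict(weights, bias, s) - t
              for s, t in zip(samples, known_forest)]
    return sum(e * e for e in errors) / len(errors)


def train(epochs=2000, learning_rate=0.1):
    """Repeat: predict, compare with the known answer, nudge the weights."""
    weights, bias = [0.0, 0.0, 0.0], 0.0
    for _ in range(epochs):
        for sample, target in zip(samples, known_forest):
            error = predict(weights, bias, sample) - target
            # Shift each weight against the error, scaled by its input.
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, sample)]
            bias -= learning_rate * error
    return weights, bias


initial_error = mean_squared_error([0.0, 0.0, 0.0], 0.0)
weights, bias = train()
final_error = mean_squared_error(weights, bias)
print(f"error before training: {initial_error:.4f}")
print(f"error after training:  {final_error:.6f}")
```

Each pass through the data shrinks the gap between prediction and known land cover, which is the sense in which the network’s next predictions become more accurate.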

Once the neural network has perfected its model for each sub-watershed, it can process new, routinely collected macroinvertebrate data from the Stroud New York City monitoring project. Because samples are collected from the same locations each time, the output from the network should plot on the line previously generated for the watershed in question. But if something has changed—for example, if a sewage treatment plant has had a spill affecting the macroinvertebrate community—the network will detect this change by plotting the data point as an outlier. This alerts managers to go to the watershed and look for a problem. Zaradic hopes her neural network can someday be used by water managers to evaluate the effectiveness of management practices on dairy farms or in stormwater runoff treatment systems.
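The article does not say how the network decides that a point is an outlier. One simple possibility, sketched below as an assumption rather than as Zaradic’s method, is to compare a new output for a site with that site’s earlier outputs and flag it when it falls more than a few standard deviations away. The function name, the threshold of 3, and the baseline values are all hypothetical.

```python
from statistics import mean, stdev


def is_outlier(history, new_value, threshold=3.0):
    """Flag new_value if it lies far from a site's earlier outputs.

    history: earlier model outputs for the same sampling site.
    threshold: how many standard deviations count as "far" (assumed).
    """
    center = mean(history)
    spread = stdev(history)
    if spread == 0:
        # No variation on record: any different value stands out.
        return new_value != center
    return abs(new_value - center) / spread > threshold


# Invented baseline outputs for one sub-watershed.
baseline = [0.71, 0.69, 0.70, 0.72, 0.68, 0.70]

routine_sample = is_outlier(baseline, 0.71)   # close to past values
after_spill = is_outlier(baseline, 0.45)      # sharp departure

print("routine sample flagged:", routine_sample)
print("post-spill sample flagged:", after_spill)
```

A flag here would not say what went wrong, only that the insect community no longer matches the site’s record, which is the cue for managers to go and look.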

Zaradic says neural networks are ideal for the nonparametric, nonlinear data that often characterize ecological communities, where common species may exist in large numbers along with a few occurrences of rarer species. “The thing about a neural network is that it doesn’t mind an ungainly, unbalanced data set,” she says.

David B. Arscott, New York watershed project coordinator for the Stroud Water Research Center, says one of the great strengths of the network is its ability to use more of the data than standard macroinvertebrate bioassessment tools are able to accommodate. Traditional tools such as species richness indices are often not able to use data for rarer or less-known species.

Thus far, Zaradic’s work has been limited to the New York data set. But she envisions adding new watershed data from other areas to see whether her neural network can build on its understanding of familiar aquatic species and incorporate new information for new species.

While the process is computationally intensive, it doesn’t need a supercomputer to work. Zaradic uses an ordinary desktop computer with a freeware statistical package from the University of Stuttgart, Germany. That ease of use gives Zaradic hope that she can expand the use of her network beyond land use managers by training it to operate with macroinvertebrate samples that have been identified only to family.

“If you can take it down to family-level identification for some insects, then high schools and community groups can do this,” Zaradic says. “They can monitor their local watershed with their own little AI tool.”

About the Author

Nancy Bazilchuk is a freelance writer based in Trondheim, Norway.