How Much Data is Enough?
By Scott Norris
How do you know when you have enough data? That is a difficult question to answer if you don’t know how much you need. We seldom know the basic quanta of conservation with much precision. And when science offers only partial knowledge and statistically informed guesswork, we often make policy decisions based on ideology—or postpone them until we have more information.
But precisely for that reason, say some researchers, it is essential to know the strengths and weaknesses of the data one has in hand. Does a recently observed change in population size represent a short-term fluctuation or a long-term trend? Will another year of monitoring or another demographic study bring any more certainty? When and how can managers say with confidence whether more data are needed to decide a policy question?
A recent analysis of 19 population surveys of the California gray whale (Eschrichtius robustus) conducted since 1967 provides a powerful case study, showing how a combination of population monitoring and probabilistic modeling can assist conservation decision making. The approach to the U.S. Endangered Species Act (ESA) listing and delisting decisions developed by Leah Gerber of the National Center for Ecological Analysis and Synthesis (NCEAS) and Douglas DeMaster of the National Marine Mammal Laboratory focuses explicitly on how data increase in value as they increase in quantity. In the case of the gray whale, the analysis shows precisely at what point sufficient information existed to unambiguously support delisting the population.
The work provides strong validation for the controversial 1994 decision to remove the California gray whale from the list of threatened and endangered species. But the approach also may have broader applications, both in determining ESA classifications for different species and in deciding how much monitoring or data collection is required for other conservation decisions. A central point stressed by Gerber and DeMaster is that managers need to be explicit about what it is they want their data to help them decide.
A second lesson that emerges from the analysis is that although data may accumulate in a linear fashion, the power of data to address policy issues does not. Once decisions have been made on how data are to be interpreted and how much uncertainty is acceptable, the practical value of a given dataset may jump abruptly with a small additional input of information. Alternatively, when uncertainty is high, or when management standards are highly conservative, even a significant increase in data may not be decisive. A quantitative assessment of how data in hand bear on a particular policy question can help determine when data collection efforts should be stepped up, maintained, or discontinued. “The question of how much data is enough should be considered in the context of a particular policy decision,” Gerber says.
The eastern North Pacific population of the California gray whale is a conservation success story in two respects: the population has recovered and the data exist to prove it. The latter is the result of both years of hard work and the fortuitous fact that, unlike most wide-ranging marine mammals, California gray whales actually can be counted on a regular basis. Each year the entire population travels south to Mexico along the California coast. The National Marine Fisheries Service (NMFS) has conducted 19 shore-based surveys to count migrating gray whales over the past 30 years. While observers cannot detect every whale that passes offshore, researchers have devised a variety of statistical means for calibrating the surveys and translating census figures into estimates of total population size.
The survey data show that from the late 1960s to the present, the California gray whale population has been increasing on average by about 2.5 percent per year. The current total of around 26,000 individuals is roughly twice that of the early 1970s, when the gray whale was listed as an endangered species. This long record of abundance information is a precious commodity. “It’s rare to have more than 15 years of data for an endangered species,” Gerber says. “The gray whale dataset is unique in that it documents the unequivocal recovery of a highly visible species and provides some measure of success of the ESA.”
In 1994, when it was clear to most that a strong recovery was underway, the California gray whale was removed from the endangered species list. But public concern over the plight of whales in general made the delisting debate contentious. Controversy arose in part because there were no quantitative guidelines in place to assess how the census data bore on the delisting question. Even without such guidelines, however, the evidence in support of the ruling was substantial. No analyses at the time indicated a strong likelihood of the population going extinct over a significant portion of its range in the foreseeable future. And, DeMaster notes, there was a consensus among many scientists that responsible administration of the ESA demanded delisting. “Many in the marine mammal community believed that if this population continued to be classified as endangered, then the value of ESA classifications in general was seriously in doubt,” he says.
As mandated by the ESA, the delisting decision was reviewed in 1999 after a 5-year period of continued monitoring. An invited panel of experts found no reason to reverse the prior ruling but recommended that monitoring be continued for another 5-year period. In its report, the panel noted that the California gray whale’s migration along a highly populated coastline and concentration in limited breeding areas in Mexico made it vulnerable to a variety of threats, including catastrophic events that could wipe out years of steady gains. The report also pointed out the proven success and ongoing potential of the monitoring program. “Never before,” the report stated, “has there been as good an opportunity to study life history parameters of a cetacean population approaching its carrying capacity—a study which will be very beneficial to research on other, less accessible whale stocks.”
Monitoring for Recovery
For scientists studying the California gray whale, each survey brings more useful information on population trends. But exactly how much information is needed to substantiate a regulatory decision such as delisting? That’s the question Gerber and NCEAS colleague Peter Kareiva thought they could answer by conducting a new analysis of the gray whale census data. Working with DeMaster, Gerber had already helped develop a quantitative approach to ESA classification of another threatened species, the humpback whale (1). The gray whale provided another test for this approach, one in which data were plentiful. “We were interested in examining how robust our approach to ESA classification would be to different amounts of data,” Gerber says.
This was not just an academic exercise. Being able to identify the minimal amount of information needed to persuasively determine a course of action represents an important step toward more objective and efficient ESA administration for other, less intensely studied species. A second related objective of the analysis was to draw attention to the importance of data collected during a period of species recovery. Conservation biologists frequently stress the need for monitoring to identify species in danger, Gerber notes, “but we had seen nothing in the literature about monitoring for recovery.”
The approach to listing decisions devised by Gerber and DeMaster is based on a probability-driven model of population demographics. The first step is to set parameters and thresholds for the model. Based on data from other whale populations, including the endangered northern right whale, the researchers assumed a critical population level of 500, below which extinction would be considered likely. They then set criteria to be used in determining the whale’s categorical status under the ESA. For example, if the population model indicated a greater than 5 percent chance that the whale would fall below the 500 level in the next 10 years, listing as endangered would be warranted. Similarly, a greater than 5 percent chance of such a drop occurring over the next 25 years would result in listing as threatened.
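The decision rule described above can be written down in a few lines. The following is a minimal sketch, not the researchers' code: the function name `classify` and its parameter names are hypothetical, and it assumes the model has already produced the two probabilities of falling below the critical level.

```python
def classify(p_below_10yr, p_below_25yr, risk_threshold=0.05):
    """Map modeled risk onto an ESA category.

    p_below_10yr / p_below_25yr: modeled probability that the population
    falls below the critical level (500 in the article) within 10 or 25
    years. risk_threshold is the 5 percent cutoff quoted in the article.
    """
    if p_below_10yr > risk_threshold:
        return "endangered"
    if p_below_25yr > risk_threshold:
        return "threatened"
    return "not listed"


# Illustrative probabilities only; these are not results from the study.
print(classify(0.10, 0.20))
print(classify(0.01, 0.08))
print(classify(0.00, 0.01))
```

Checking the 10-year horizon first means a population that fails both tests is classified under the stricter category.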
Next, the researchers turned to the gray whale census data to carry out the modeling. Using all 19 surveys, the model generated a range of population growth trajectories that could be used to calculate a range of possible population sizes at the specified intervals of 10 and 25 years. If 95 percent or more of these population sizes exceeded the critical level of 500, no listing action would be recommended. The California gray whale easily passed this test for the entire dataset.
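The article does not spell out the model's internals, so the sketch below substitutes one simple possibility: fit a straight line to the logarithm of the counts, treat the growth rate as normally distributed around the fitted slope, and project many trajectories forward. The function names, the normal assumption, and the counts are all hypothetical; the counts are made-up placeholders, not NMFS survey results.

```python
import math
import random


def fit_log_linear(years, counts):
    """Least-squares fit of ln(count) on year.

    Returns (slope, standard error of the slope); the slope is the
    estimated exponential growth rate per year.
    """
    n = len(years)
    logs = [math.log(c) for c in counts]
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    resid_ss = sum((y - (intercept + slope * x)) ** 2
                   for x, y in zip(years, logs))
    se = math.sqrt(resid_ss / (n - 2) / sxx)
    return slope, se


def prob_below(current_size, slope, se, horizon,
               critical=500, draws=10_000, seed=0):
    """Fraction of simulated trajectories that end below `critical`.

    Each draw samples one growth rate and applies it for `horizon` years.
    This captures estimation uncertainty only; year-to-year environmental
    variation is left out to keep the sketch short.
    """
    rng = random.Random(seed)
    below = 0
    for _ in range(draws):
        rate = rng.gauss(slope, se)
        if current_size * math.exp(rate * horizon) < critical:
            below += 1
    return below / draws


# Made-up placeholder counts for eight annual surveys.
years = [0, 1, 2, 3, 4, 5, 6, 7]
counts = [13000, 13500, 13600, 14200, 14400, 15000, 15100, 15800]

slope, se = fit_log_linear(years, counts)
p10 = prob_below(counts[-1], slope, se, horizon=10)
p25 = prob_below(counts[-1], slope, se, horizon=25)
print(f"growth rate {slope:.4f} +/- {se:.4f}, P(<500): {p10}, {p25}")
```

The two probabilities are the quantities a listing rule with a 5 percent cutoff would be applied to.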
But what if fewer data were available? To determine at what point in the monitoring program enough data existed to support delisting, Gerber and DeMaster re-ran the model using progressively smaller subsamples of the census data. For any given number of abundance estimates, ranging from 19 down to 5, all possible combinations of consecutive estimates were used. For example, 15 different runs of 5 consecutive surveys can be taken from the 19-survey dataset. Distributions of possible population growth rates resulting from each sample were subjected to the risk classification protocol.
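The count of 15 equals 19 − 5 + 1, which is the number of runs of 5 consecutive surveys, so this sketch treats each subsample as a consecutive window. The helper name `consecutive_windows` is hypothetical, and the survey list holds placeholder indices rather than real abundance estimates.

```python
def consecutive_windows(items, size):
    """Return every run of `size` consecutive items, in order."""
    return [items[i:i + size] for i in range(len(items) - size + 1)]


# Placeholder indices standing in for the 19 surveys.
surveys = list(range(1, 20))

for size in (5, 11, 19):
    print(size, len(consecutive_windows(surveys, size)))

# Total number of model runs if every window size from 5 to 19 is used.
total_runs = sum(len(consecutive_windows(surveys, k)) for k in range(5, 20))
print(total_runs)
```

Each window would then be passed through the growth-rate model and the risk classification in turn.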
The results clearly illustrate how incremental increases in data reduce uncertainty and provide a clear-cut basis for regulatory action. None of the datasets of 5 counts or more supported continued listing of the California gray whale as endangered. With only 5 surveys, however, there was no basis for deciding between listing as threatened and delisting. As the number of survey years increased to 10, this uncertainty was greatly reduced; at 11 survey years and above, all of the simulations met the criteria for delisting. Thus the analysis revealed a critical point at which information became sufficient to support the policy action. “If quantitative standards had been used to guide monitoring of gray whales,” Gerber says, “the decision to delist could have been made sooner.”
Getting the Most for Your Money
Gerber and DeMaster believe their framework provides a rigorously quantitative basis for listing/delisting decisions—but not a formulaic one. Their approach is both flexible and inherently conservative about lifting or easing protections. Some components of their model—such as time intervals and threshold probabilities for extinction risk—are variables that must be set by policy makers responsible for the management of particular species. Critical population levels, however, should be based on biology. In making these determinations, there is ample room for scientific and policy debate. But once these management goals and levels of acceptable risk have been set in the context of a population model, Gerber and DeMaster’s hope is that data—and not sentiment—will determine what course of action should be taken.
The precautionary nature of the approach arises from its treatment of population growth rate not as a known quantity but as a random variable with a range of possible values. Uncertainty is present due to possible errors in estimating growth rates and due to the possibility of environmental variation. This uncertainty increases when data are limited, resulting in a broader set of possible growth rates and a higher chance that the population will fall below a given critical threshold. Managers can weigh the time and cost of additional data collection against the expected benefits that additional data might provide: a higher degree of certainty attached to more conservative standards of protection.
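One way to see why limited data widen the range of plausible growth rates is the textbook formula for the standard error of a regression slope, which shrinks as the surveys span more years. The sketch below assumes evenly spaced annual surveys, a straight-line fit to log counts, and a fixed residual scatter; the function names and the 0.1 residual value are hypothetical and are not taken from the study.

```python
import math


def sum_sq_years(n):
    """Sum of squared deviations of n evenly spaced survey years from their mean."""
    years = range(n)
    mean = sum(years) / n
    return sum((y - mean) ** 2 for y in years)


def slope_se(n, residual_sd):
    """Standard error of a least-squares slope fitted to n annual surveys."""
    return residual_sd / math.sqrt(sum_sq_years(n))


# Hypothetical residual scatter of 0.1 on the log scale.
for n in (5, 11, 19):
    print(n, round(slope_se(n, 0.1), 4))
```

Under these assumptions, going from 5 to 11 surveys narrows the growth-rate uncertainty by a factor of about 3.3, which is why a few additional counts can change the classification.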
Probabilistic modeling can also be used to structure research and monitoring activities such that the most useful data are gathered in the shortest amount of time. Gerber notes that once classification criteria have been established in the form of probability statements, it becomes the responsibility of the agency or manager in charge to gather the necessary data before a population can be considered for delisting. “It is precisely for the majority of threatened species for which this kind of long-term data are lacking that this approach will be most useful,” she says. The very exercise of running the model can reveal information gaps and help managers predict at what point in the future data will exist to support more refined or conservative classification decisions.
The approach also highlights the economic value of population monitoring, particularly in cases of species recovery. Although not cheap, monitoring can yield substantial economic benefits if it provides the data necessary for delisting. DeMaster notes that throughout the history of the gray whale monitoring program, administrators often had to fight to secure continued funding. Overall, gathering the data necessary to delist cost the NMFS roughly $660,000. But Gerber says the $60,000 spent on each survey is likely to be far less than the regulatory costs associated with continued management of the California gray whale as endangered.
The gray whale case also reveals how, once a monitoring program is underway, economic arguments for continued monitoring, at least to the point of data sufficiency, may strengthen. If funding had been secured for only 5 years of gray whale monitoring, for example, a $300,000 expenditure would have yielded no clear policy recommendation. The $360,000 spent on six additional counts provided the critical information needed for delisting. The researchers emphasize that the economic benefits of monitoring are strengthened through the application of a quantitative framework. Another way to save money is to pinpoint when monitoring efforts can be scaled back or become less frequent, after a necessary amount of data has been obtained.
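The cost figures quoted in the article follow directly from the per-survey cost and the number of surveys needed. A minimal sketch of that arithmetic, with hypothetical names, assuming a flat $60,000 per survey as the article reports:

```python
COST_PER_SURVEY = 60_000   # dollars per survey, as quoted in the article
SURVEYS_TO_DELIST = 11     # point at which all simulations supported delisting


def monitoring_cost(n_surveys, cost_per_survey=COST_PER_SURVEY):
    """Total cost in dollars of n_surveys at a flat per-survey cost."""
    return n_surveys * cost_per_survey


inconclusive = monitoring_cost(5)
decisive = monitoring_cost(SURVEYS_TO_DELIST)
print(inconclusive, decisive - inconclusive, decisive)
```

The three printed values correspond to the cost of the first 5 surveys, the cost of the 6 additional counts, and the total needed to reach data sufficiency.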
Gerber and DeMaster make no claim that their approach represents the solution to problems of ESA classification and administration. They do believe, however, that development of quantitative frameworks for specifying risk and measuring recovery may be broadly applicable and beneficial across a wide range of cases. In hindsight, the California gray whale case shows how such an approach can help set research priorities, enhance the value of data, validate management decisions, and support allocation of scarce conservation dollars to monitoring programs. Moreover, focusing on critical, measurable attributes of a population offers a detour around the kind of ideological impasses that develop in the absence of definitive data.
1. Gerber, L.R., and D.P. DeMaster. 1999. A quantitative approach to Endangered Species Act classification of long-lived vertebrates: Application to the North Pacific humpback whale. Conservation Biology 13(5):1203-1214.