Google Scales the Ivory Tower
By Nancy Bazilchuk
Whether it’s preserving snow leopards or managing populations of great apes, many of the world’s thorniest conservation problems are found far from university libraries and scholarly resources. Now, a new free search engine called Google Scholar (scholar.google.com) can enable researchers in the remotest places to sift through scholarly publications as easily as a university-based PhD.
Google Scholar was launched in November 2004 and is still in its beta test version. It works much like its widely used parent, Google, which ranks Web pages by analyzing the number and importance of their hyperlinks (see box). Google Scholar searches university websites, academic publications, library holdings, government documents, newspaper articles, and popular scientific magazines. It can also turn up articles that originally appeared in subscription-only scientific publications but have subsequently been reposted—such as when an author provides a free version of his or her work on a personal website.
Peter Kareiva, a Seattle-based lead scientist with The Nature Conservancy, uses Google Scholar every day. Because conservation biologists are often working outside the confines of academia, a search using more traditional databases doesn’t always provide the kinds of information that can be of most use, Kareiva says. Google Scholar search results allow conservation biologists not only to see what other academics are doing but also to monitor what the world at large is doing with their research. “The currency of academia is scholarship, but the currency of conservation is impact,” Kareiva says.
Just typing Kareiva’s name into Google Scholar’s query box gives you a sense of the scope of the search engine. The search retrieves documents from across the conservation biology spectrum, from Kareiva’s publications in peer-reviewed journals such as Ecology, Conservation Biology, and Annual Review of Ecology and Systematics to books he has edited. Each citation includes a “Cited by” link, which allows users to see how each document has been put to work elsewhere. For example, Kareiva’s work with Chinook salmon recovery plans has been cited in journal articles by other academics, used by professors across the globe as a part of coursework, and referred to in a report to Washington State’s Independent Science Panel.
In many ways, Google Scholar functions like a subscription-based search engine. From a university connection, it’s possible to link directly from the Google Scholar citation to the academic journal, provided the university library has a subscription to that journal and has signed up for the service with Google Scholar. Even without a university link, however, users will have some success finding free versions of peer-reviewed articles.
It is this latter feature that has put Google Scholar right into the thick of the debate over who owns scientific knowledge, who has the right to access it, and what that access costs. A subscription to Thomson ISI Web of Science, the gold standard of academic search engines, can cost a university thousands of dollars a year, with subscriptions to peer-reviewed journals an additional, equivalent cost. Some universities, led by the University of California system (which spends US$30 million annually on scholarly periodicals, according to a Wall Street Journal report), are fighting for scholarly publishing houses to reduce their fees or to provide free public access to peer-reviewed papers.
Google Scholar has limitations, as a number of research librarians have documented in a flurry of articles that followed the search engine’s release. Google has been vague about the breadth of the material it searches, saying only that it is working with all major academic publishers. Critics point out that it is important to know the details of what Scholar searches so that users know what they might be missing. In addition, it is difficult to limit searches to articles that have appeared in just one journal—a problem that particularly vexes Péter Jacsó, a professor in the University of Hawaii’s Library and Information Science Program. “It is hopeless to make a pure search for articles published in, say, Science magazine on a given topic,” he writes in an online critique of Scholar (1). “The results are full of journals whose title includes the word science along with other terms such as Information Science & Research or Cognitive Science.”
Another drawback is inherent in the way Google ranks its listings. Because the search engine lists the most cited documents first, the newest information is unlikely to make it to the top of the list.
However, the possibility of finding free content without a library connection is revolutionary. As a Nature Conservancy scientist working in Vermont, John H. Roe isn’t as far-flung as most. Nonetheless, he recently became aware of Google Scholar when he was stranded in the wilds of the Philadelphia airport and needed a source. He was impressed.
1. Péter’s Digital Reference Shelf: Google Scholar (Redux) June 2005 (www.galegroup.com/free_resources/reference/peter)
HOW GOOGLE WORKS
Google and Google Scholar use Googlebot, a Web crawler that archives and indexes webpages—more than 8 billion pages for the main Google search engine and a smaller, undisclosed number of sources for Google Scholar. When you type in your search, an algorithm looks at your search terms and retrieves relevant pages weighted in the following way. Pages with many links pointing to them are considered “authorities” and are given greater weight and higher ranking in a search. Links from these “authorities” to other Web pages also give those pages greater weight in a search—so that a link from Science is given greater weight than a link from a personal homepage. Google Scholar thus gives papers that have been cited many times a higher ranking than papers with fewer citations. Additional weight is given when a paper is referenced by another highly cited article.
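For readers who want to see the idea in miniature, the weighting described in this box can be sketched in a few lines of Python. This is only an illustration of link-based ranking in the general style of the published PageRank method, not Google's or Google Scholar's actual code, which is not public. The function name rank_pages, the damping value of 0.85, the number of passes, and the tiny example web of pages are all assumptions made for the sketch.

```python
def rank_pages(links, damping=0.85, iterations=50):
    """Score pages by who links to (or cites) them.

    links maps each page to the list of pages it links to.
    The damping factor and iteration count are illustrative assumptions.
    """
    # Collect every page that appears as a source or as a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with every page equally weighted.
    score = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page keeps a small baseline weight.
        new_score = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its weight on, split among its links.
                share = damping * score[page] / len(targets)
                for target in targets:
                    new_score[target] += share
            else:
                # A page with no outgoing links spreads its weight evenly.
                share = damping * score[page] / n
                for other in pages:
                    new_score[other] += share
        score = new_score
    return score


# Hypothetical example: three labs link to "science", which links to paper_a.
# A personal homepage that nobody links to links to paper_b.
example_links = {
    "lab1": ["science"],
    "lab2": ["science"],
    "lab3": ["science"],
    "science": ["paper_a"],
    "homepage": ["paper_b"],
}
scores = rank_pages(example_links)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 3))
```

In this toy example, paper_a ends up ranked above paper_b even though each has exactly one incoming link, because its single link comes from a page that is itself heavily linked. That is the "authority" effect the box describes, and the same logic applies when citations stand in for hyperlinks.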
About the Author
Nancy Bazilchuk is a freelance science writer based in Norway.