About Logbook Data

Database description

American whalemen sailed out of ports on the east coast of the United States and in California from the 18th to early 20th centuries, searching for whales throughout the world’s oceans. From an initial focus on sperm whales (Physeter macrocephalus) and right whales (Eubalaena spp.), the array of targeted whales expanded to include bowhead whales (Balaena mysticetus), humpback whales (Megaptera novaeangliae), and gray whales (Eschrichtius robustus). Extensive records of American whaling in the form of daily entries in whaling voyage logbooks contain a great deal of information about where and when the whalemen found whales.

The American Offshore Whaling Log database includes information from 1,381 logbooks from American offshore whaling voyages between 1784 and 1920. These data were extracted from the original whaling logbooks during three separate scientific research projects: the first conducted by Lt. Cmdr. Matthew Fontaine Maury in the 1850s, the second conducted by Charles Haskins Townsend in the 1930s, and the third conducted by a team from the Census of Marine Life project (CoML, www.coml.org) led by Tim Denis Smith between 2000 and 2010. The Maury and Townsend data were assembled by the CoML team from archival sources, and there were limitations on the quality and completeness of the data available.

The data file includes 466,134 data records assembled in a common format suitable for spatial and temporal analysis of American whaling throughout the 19th century. The data fields are described in the Column definitions.

The three projects selected whaling logbooks in different ways, and the detail of the data extracted varies from logbook to logbook. Maury extracted data from most days the vessel was at sea, but often omitted the initial portions of voyages in the North Atlantic and many days when the whalers were whaling within enclosed bodies of water (e.g. the Okhotsk Sea). Townsend extracted data only from the days when whales were struck or killed, leaving potentially large gaps in the sequence of dates and positions for any single voyage. CoML data were collected from all days recorded in the logbooks.
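Because the extraction methods leave runs of missing days, it can help to list the gaps in a voyage's dates before analysing it. The following is a minimal sketch only: the function name, its parameter, and the sample dates are illustrative assumptions, not part of the database or its column definitions.

```python
from datetime import date

def find_date_gaps(entry_dates, min_missing_days=1):
    """Return (last_date_before_gap, first_date_after_gap, days_missing)
    for each run of missing days in one voyage's entry dates."""
    ordered = sorted(set(entry_dates))  # ignore duplicates and input order
    gaps = []
    for earlier, later in zip(ordered, ordered[1:]):
        missing = (later - earlier).days - 1  # days with no entry in between
        if missing >= min_missing_days:
            gaps.append((earlier, later, missing))
    return gaps

# Hypothetical entry dates for a single voyage (not taken from the database).
sample_dates = [date(1845, 3, 1), date(1845, 3, 2), date(1845, 3, 9)]
gaps = find_date_gaps(sample_dates)
```

Here the only gap reported is the one between 2 March and 9 March 1845, with six days missing.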

Spatial and temporal gaps occur in all the data, both by design and because some logbooks did not include entries for some days. In some cases the gaps were filled by interpolation, mostly where the gaps were small but at times for larger gaps as well. Unfortunately, interpolation across larger gaps at times created data points on land, especially when vessels crossed the date line or when logbooks ended well before the vessel returned to its homeport. Not all of these problems have been addressed in these data.
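One way the date line can produce misplaced points is sketched below. This is an illustration under stated assumptions, not the procedure used to build the database: longitudes are assumed to be in decimal degrees east in the range -180 to 180, and both function names are hypothetical.

```python
def naive_interpolate(lon_start, lon_end, fraction):
    """Straight-line interpolation that ignores the date line."""
    return lon_start + fraction * (lon_end - lon_start)

def interpolate_longitude(lon_start, lon_end, fraction):
    """Interpolate along the shorter arc between two longitudes."""
    # Signed shortest difference, wrapped into [-180, 180).
    delta = (lon_end - lon_start + 180.0) % 360.0 - 180.0
    lon = lon_start + fraction * delta
    # Wrap the result back into [-180, 180).
    return (lon + 180.0) % 360.0 - 180.0

# A vessel moving east from 170 E to 170 W, a quarter of the way through the gap.
wrong = naive_interpolate(170.0, -170.0, 0.25)      # 85.0, far to the west
right = interpolate_longitude(170.0, -170.0, 0.25)  # 175.0, near the date line
```

The naive version sweeps the long way around the globe, so an interpolated point can land on a continent even though the vessel only moved 20 degrees of longitude.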

Similarly, both the logbook keepers and the data extractors made mistakes, for example transposition of digits and, rarely, miscoding of hemispheres. Such errors have been corrected where plotting of individual voyage tracklines revealed them during data analyses, but not all of them have been found and corrected.
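Errors of this kind often show up as implausibly long jumps between consecutive positions. The sketch below flags such jumps using the great-circle (haversine) distance. It is one possible screening check, not the method used by the data extractors; the function names, the track layout of (day number, latitude, longitude), and the 400 km per day threshold are all assumptions chosen for illustration.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points in decimal degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))

def flag_suspect_jumps(track, max_km_per_day=400.0):
    """Return indices of points reached faster than max_km_per_day.

    track is a list of (day_number, lat, lon) tuples sorted by day.
    """
    suspects = []
    for i in range(1, len(track)):
        day0, lat0, lon0 = track[i - 1]
        day1, lat1, lon1 = track[i]
        days = max(day1 - day0, 1)  # guard against repeated or zero-day steps
        if haversine_km(lat0, lon0, lat1, lon1) / days > max_km_per_day:
            suspects.append(i)
    return suspects

# Hypothetical track in which day 3 has its longitude hemisphere miscoded
# (68 E instead of 68 W).
track = [(1, 40.0, -70.0), (2, 39.5, -69.0), (3, 39.0, 68.0), (4, 38.5, -67.0)]
suspects = flag_suspect_jumps(track)
```

For this track both the jump into the miscoded point and the jump out of it are flagged (indices 2 and 3), which is the pattern a single bad position typically leaves. Flagged points still need to be checked by eye, since a long gap in the dates can also hide a legitimate passage.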

The data extractors attempted to identify the common name of cetaceans recorded in the logbooks, but this was not always possible because encounters were sometimes described only as unknown or as “whale.” Thus some of the cetaceans encountered are listed here only as “Whale.” The CoML data included encounters with a wider range of cetaceans than the other two sources did. Maury’s data came only from logbooks completed before the early 1850s, so they do not include some species that had not yet been targeted, for example bowhead whales. A more complete description of these data is given in the Materials and Methods section (see page 11) of a research paper authored by some of the CoML team; the paper itself is freely available online as Smith TD, Reeves RR, Josephson EA, Lund JN (2012) Spatial and Seasonal Distribution of American Whaling and Whales in the Age of Sail. PLoS ONE 7(4): e34905. doi:10.1371/journal.pone.0034905.