Corpus linguistics and the description of English

Corpus linguistics and the description of English. By Hans Lindquist. (Edinburgh textbooks on the English language—Advanced.) Edinburgh: Edinburgh University Press, 2009. Pp. xx, 219. ISBN 9780748626151. $28.50.

Reviewed by Susanne Wagner, Chemnitz University of Technology

Lindquist’s book is one of those rare ‘why did this take so long?’ finds that everyone who has taught corpus linguistics (CL), and by necessity taught themselves in the process, has been waiting for. It should become the default compulsory background reading for every student and researcher working in English CL today. Not only is the book written in a wonderfully accessible style, it also includes example studies from all areas of CL that can provide students and scholars with countless ideas for further research.

The first two chapters set the scene by discussing important theoretical and practical concepts as well as the practical handling of corpora (e.g. tagging, statistics). Chs. 3–10 each focus on one research question, but always touch on general and far-reaching issues.

By way of example studies, many of which are based on Mark Davies’ web-based Brigham Young University (BYU) Corpora, L covers corpus-based studies in the fields of semantics (including lexis, collocations and colligations, lexical change, metaphor, and metonymy), grammar (passives, who and whom, grammatical change and grammaticalization, complementation patterns), and sociolinguistics (male versus female language with regard to both active language use and passive features such as the use of gender-marked versus gender-neutral terms). The author’s focus on corpora that are continuously updated and extended (BYU Corpora) or added to (e.g. International Corpus of English, Brown family) ensures that the book will not be outdated in a couple of months (which is an abiding danger of any ‘how-to’ publication in the field). Along the same lines, L also wisely refrains from committing to one particular piece of software for analysis.

The book uniquely integrates historical facts and details with present-day questions, methodological caveats, and countless examples (forty-four figures, 101 tables). The way the author manages to casually include state-of-the-art research findings from areas such as sociolinguistics (e.g. the colloquialization of English and the role of gendered language) in a book of this type is nothing short of brilliant.

Readers are familiarized not only with different types of corpora and their leading representatives, but also with historical background on CL in general (e.g. Otto Jespersen’s grammar as an early corpus-based grammar, the history of the Brown family). The author addresses countless methodological issues, ranging from the pros and cons of corpus-driven versus corpus-based work, via issues of comparability and different types of statistical tests, to the advantages and disadvantages of qualitative versus quantitative studies.

L also addresses the usefulness of messy sources as corpora (particularly the internet and the Oxford English Dictionary quotations database), emphasizing that, as always in corpus studies, the researcher needs to be aware of the messy nature of such sources and the problems it entails. Once such issues are taken into consideration and carefully navigated, investigating corpora truly holds ‘joy and fascination’ (1).