ENGLISH CORPORA MAKING: HISTORICAL OVERVIEW
Keywords:
corpus, concordance, pre-electronic era, computer, generation, modern, megacorpus.
Abstract
The emergence of corpus linguistics was preceded by a centuries-long period in which corpus methods were already in use and text corpora were being created. Because these corpora were stored in non-electronic form and processed by non-automatic methods, a distinct period in the history of corpus linguistics, called the pre-electronic period, can be distinguished. With the invention and widespread use of computers, a new stage in the development of corpus linguistics began: the corpora created in this stage differ from the earlier ones not only in storage format but also in volume. Second-generation corpora are products of the Internet and are distinguished by their large size. Third-generation corpora are likewise large and offer many technological advantages. In this period, a number of new corpora were created, with a total volume of several billion words.