Abundantia Verborum


4. Virtual corpora and corpus linguistics

The typical data source for a case study in Abundantia Verborum is a single corpus or a set of corpora. The reliance on corpora as a data source is an old practice in some specific sub-fields of linguistics, but only in the last decade has it evolved into a generally accepted and applied approach. The methods concerned with creating, evaluating and using corpora have developed into a full-fledged discipline, called corpus linguistics.

In this chapter we sketch the position of Abundantia Verborum in relation to corpus linguistics. The first part gives some general information about corpus linguistics: after a general sketch of current activities in the field, we focus on two issues, namely the format of markup in corpora and strategies for searching in corpora. In the second part we describe the virtual corpus mechanism supported by Abundantia Verborum and examine how, and to what extent, this mechanism can cope with existing markup formats. In the third part we treat the query language of Abundantia Verborum and describe how it relates to existing query strategies. Finally, we summarize the most important observations made in the chapter.

The structure of Chapter 4 is given below.

4. Virtual corpora and corpus linguistics

  1. Corpus linguistics
    1. The broader field
    2. Corpus formats in the past, the future and the present
    3. Search strategies and existing software
  2. Virtual corpora
    1. Virtual corpus views
    2. The corpus preparation tool
    3. Users and encapsulation
  3. The Abundantia Verborum query language
    1. A typed query language
    2. Indexed search versus direct search
    3. Limited versus unlimited access
  4. Summary

