Aspects théoriques et méthodologiques de la représentativité des corpus
In 1982, Francis (1991: 17) defines a corpus as: ‘A collection of texts assumed to be representative of a given language, dialect, or other subset of a language, to be used for linguistic analysis.’ The representativeness of a corpus would then be taken into account by most of the main publications wh...
Main Authors: | Najib Arbach, Saandia Ali |
---|---|
Format: | Article |
Language: | English |
Published: | Cercle linguistique du Centre et de l'Ouest - CerLICO, 2014-05-01 |
Series: | Corela |
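Subjects: | design corpora; methodology in corpus linguistics; representativeness; corpora structure; corpora size |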
Online Access: | http://journals.openedition.org/corela/3029 |
id |
doaj-e78deb95c4c84f4dacaad7418b700243 |
---|---|
record_format |
Article |
doi |
10.4000/corela.3029 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Najib Arbach; Saandia Ali |
title |
Aspects théoriques et méthodologiques de la représentativité des corpus |
publisher |
Cercle linguistique du Centre et de l'Ouest - CerLICO |
series |
Corela |
issn |
1638-573X |
publishDate |
2014-05-01 |
description |
In 1982, Francis (1991: 17) defines a corpus as: ‘A collection of texts assumed to be representative of a given language, dialect, or other subset of a language, to be used for linguistic analysis.’ The representativeness of a corpus would then be taken into account by most of the main publications which dealt with corpus linguistics. This paper aims at defining the concept of representativeness in corpus design and at illustrating its main features as well as the various methods used to achieve it, which will include a discussion on the issues of categorization, sampling or the required size of a corpus. We will try to achieve a better understanding of the concept of representativeness through a review of the related literature on corpus linguistics. The various methods that are proposed and implemented in order to achieve representativeness in corpus design will be discussed and contrasted. The two main methods that will be examined are Biber’s stratification techniques (1993a, 1993b) on the one hand, and the methods represented by Sinclair’s "monitor corpus" (1991, 1996, 2004) on the other hand. Finally, we will address the issue of the required size of a corpus and provide a brief review of the current situation regarding corpus design along with some recommendations for corpus building. |
topic |
design corpora; methodology in corpus linguistics; representativeness; corpora structure; corpora size |
url |
http://journals.openedition.org/corela/3029 |