Aspects théoriques et méthodologiques de la représentativité des corpus

In 1982, Francis (1991: 17) defined a corpus as: ‘A collection of texts assumed to be representative of a given language, dialect, or other subset of a language, to be used for linguistic analysis.’ The representativeness of a corpus was subsequently taken into account by most of the main publications wh...

Bibliographic Details
Main Authors: Najib Arbach, Saandia Ali
Format: Article
Language: English
Published: Cercle linguistique du Centre et de l'Ouest - CerLICO 2014-05-01
Series: Corela
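Subjects: design corpora; methodology in corpus linguistics; representativeness; corpora structure; corpora size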
Online Access: http://journals.openedition.org/corela/3029
id doaj-e78deb95c4c84f4dacaad7418b700243
record_format Article
spelling doaj-e78deb95c4c84f4dacaad7418b700243 2020-11-24T23:48:33Z eng Cercle linguistique du Centre et de l'Ouest - CerLICO Corela 1638-573X 2014-05-01 10.4000/corela.3029
collection DOAJ
language English
format Article
sources DOAJ
author Najib Arbach
Saandia Ali
title Aspects théoriques et méthodologiques de la représentativité des corpus
publisher Cercle linguistique du Centre et de l'Ouest - CerLICO
series Corela
issn 1638-573X
publishDate 2014-05-01
description In 1982, Francis (1991: 17) defined a corpus as: ‘A collection of texts assumed to be representative of a given language, dialect, or other subset of a language, to be used for linguistic analysis.’ The representativeness of a corpus was subsequently taken into account by most of the main publications on corpus linguistics. This paper aims to define the concept of representativeness in corpus design and to illustrate its main features and the various methods used to achieve it, including a discussion of categorization, sampling, and the required size of a corpus. We seek a better understanding of the concept through a review of the related literature on corpus linguistics. The various methods proposed and implemented to achieve representativeness in corpus design are discussed and contrasted. The two main methods examined are Biber’s stratification techniques (1993a, 1993b) on the one hand, and the methods represented by Sinclair’s "monitor corpus" (1991, 1996, 2004) on the other. Finally, we address the issue of the required size of a corpus and provide a brief review of the current state of corpus design, along with some recommendations for corpus building.
topic design corpora
methodology in corpus linguistics
representativeness
corpora structure
corpora size
url http://journals.openedition.org/corela/3029