Syntaxe du français parlé vs. écrit : le cas de la position de l’adjectif épithète par rapport au nom


Bibliographic Details
Main Author: Juliette Thuilier
Format: Article
Language: English
Published: Publications de l’Université de Provence 2013-12-01
Series: TIPA. Travaux interdisciplinaires sur la parole et le langage
Online Access: http://journals.openedition.org/tipa/1066
id doaj-e53f8a4412004c56a27a2273c80b792c
collection DOAJ
issn 2264-7082
description In French, attributive adjectives (A) can appear either before or after the noun (N):

(1) a. une agréable soirée (anteposed)
    b. une soirée agréable (postposed)
    « a nice evening »

We compare the syntax of spoken French (SF) and written French (WF) on the basis of this alternation. We aim to determine in which cases the syntax of this phenomenon differs between SF and WF, and to quantify these differences. The methodology is inspired by the work of Bresnan et al. (2007) and Bresnan and Ford (2010) on the dative alternation in English. Using statistical modeling on data extracted from written and spoken corpora, we test syntactic factors found in the literature (Abeillé and Godard, 1999; Wilmet, 1981; Forsgren, 1978; Blinkenberg, 1933, among others). We assume that statistical tools (logistic regression: Agresti, 2007; mixed-effects models: Gelman and Hill, 2006) allow us to free ourselves from variation due to the sampling of the corpora. Moreover, one advantage of mixed-effects logistic regression is that it is predictive: one can build a model on one set of data and use it to predict the choice between anteposition and postposition on unseen data. In this way, we can evaluate how well the model generalizes beyond the training set. Lastly, we test the significance of interactions between factors in order to determine which syntactic factors behave differently according to the medium (spoken vs. written).

To build our database, we first extracted the attributive As that appear in both positions in the syntactically annotated newspaper corpus French Treebank (FTB, Abeillé and Clément 2004), leaving aside As with post-adjectival dependents. We then extracted the same As from the spoken corpus C-ORAL-ROM (CORAL, Cresti and Moneglia 2005). Besides the variable capturing the medium (SF vs. WF), these data were annotated for 11 variables concerning the syntactic environment of each A in context: (1) the A is coordinated, (2) the A is modified by an adverbial element; the NP contains (3) another A in postposition, (4) another A in anteposition, (5) a relative clause, (6) a PP; the determiner of the NP is (7) demonstrative, (8) possessive, (9) definite; a measure of collocation for (10) the ordered sequence A+N and (11) the ordered sequence N+A (collocations estimated with χ², Manning and Schütze, 1999). We also distinguished two lemmas in context for 5 As: ancien 'ancient/former', pur 'pure', seul 'alone/single', simple 'simple/modest', propre 'own/clean'. The database contains 6612 occurrences of attributive As (4986 in FTB, 1626 in CORAL) representing 170 lemmas, with 68.9% anteposition (67.1% in FTB, 74.3% in CORAL). There is variation across lemmas: for instance, the A unique 'unique' is anteposed in 20% of cases, whereas sérieux 'serious' appears in this position in 51.4% of cases and petit 'small' in 98.6%. Moreover, there is less alternation in the spoken data than in the written data: the 170 lemmas appear in both positions in FTB, while only 56 (72.8% of the 130 lemmas attested in CORAL) of them really alternate in CORAL. This suggests that in spoken French, As tend to behave more rigidly than in written French. One can hypothesize that the more spontaneous the speech, the more the A occurs in its preferred position, that is, its most frequent position.

We used mixed-effects logistic regression to estimate the probability that anteposition is chosen as a function of 12 predictor variables (the 11 syntactic variables and the medium: WF or SF). Building the model consists in estimating the coefficient associated with each variable.
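
As an illustration of the χ² collocation measure mentioned for the ordered sequences A+N and N+A, here is a minimal sketch of the standard Pearson χ² statistic computed on a 2×2 contingency table of bigram counts. The function name, its parameters and all counts below are hypothetical and are not taken from the article; the record does not say how the counts were collected.

```python
def chi_square_ordered_pair(pair_count, first_count, second_count, total_bigrams):
    """Pearson chi-square for an ordered word pair (w1 immediately followed by w2).

    pair_count:    number of bigrams "w1 w2"
    first_count:   number of bigrams whose first word is w1
    second_count:  number of bigrams whose second word is w2
    total_bigrams: total number of bigrams in the corpus
    """
    # Cells of the 2x2 contingency table.
    o11 = pair_count                                  # w1 followed by w2
    o12 = first_count - pair_count                    # w1 followed by another word
    o21 = second_count - pair_count                   # another word followed by w2
    o22 = total_bigrams - first_count - second_count + pair_count  # neither
    denominator = (o11 + o12) * (o11 + o21) * (o12 + o22) * (o21 + o22)
    if denominator == 0:
        # A row or column of the table is empty: no association can be measured.
        return 0.0
    return total_bigrams * (o11 * o22 - o12 * o21) ** 2 / denominator


# Hypothetical counts, invented for illustration only.
# Order A+N (e.g. "juste titre") versus order N+A (e.g. "titre juste").
chi2_a_n = chi_square_ordered_pair(pair_count=30, first_count=40,
                                   second_count=35, total_bigrams=10000)
chi2_n_a = chi_square_ordered_pair(pair_count=1, first_count=35,
                                   second_count=40, total_bigrams=10000)
print(chi2_a_n, chi2_n_a)
```

Because the two orders are scored separately, a pair can be strongly collocational as A+N while showing almost no association as N+A, which is what lets variables (10) and (11) carry different information.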
Besides the predictor variables, also called fixed effects, mixed-effects models can take into account variation in the data by means of random effects. In our case, the adjectival lemmas are the random effect, which models adjectival idiosyncrasies. We built a model with 12 fixed effects, 1 random effect and 11 interactions, one between the medium and each of the 11 syntactic variables. We removed predictors and interactions that were non-significant at the 0.05 level step by step, while keeping in the model non-significant fixed effects for predictors that participated in significant interactions. All the fixed effects as well as 4 interactions were significant and thus contribute to predicting the position of the As. Each coefficient associated with a fixed effect can be interpreted as a preference for a position: a positive coefficient indicates a preference for anteposition and a negative one a preference for postposition. Thus the model shows that the type of determiner influences the position: demonstrative, possessive and definite determiners favor anteposition. Moreover, APs containing coordinated As or adverbial modifiers tend to be postposed, which confirms that speakers tend to place « heavy » APs after the N. The presence of a relative clause, a PP or another A after the N also favors anteposition. Finally, the N with which the A combines affects the choice: the more the A and the N tend to form a collocation in a given order, as in à juste titre 'understandably', the more the sequence tends to occur in that order.

There is also a significant effect of the medium: SF favors postposition compared to WF. Insofar as these syntactic variables favor the same position in WF as in SF, we consider that the phenomenon is syntactically unified across both varieties.
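
The model structure described above (fixed effects, interactions with the medium, and a random intercept per adjectival lemma) can be sketched as a function that turns coefficients into a probability of anteposition. This is only an illustrative sketch: the variable names, the 0/1 coding of the medium (WF as reference level) and every coefficient value are assumptions made for the example, not estimates reported in the article.

```python
import math


def anteposition_probability(features, spoken, intercept, main, interaction,
                             lemma_offset=0.0):
    """Probability of anteposition under a logistic model with a random intercept.

    features:     dict mapping a syntactic variable name to its value
    spoken:       1 for spoken French (SF), 0 for written French (WF); assumed coding
    intercept:    global intercept
    main:         fixed-effect coefficients (syntactic variables and "spoken")
    interaction:  coefficients of the medium x variable interactions
    lemma_offset: random intercept of the adjectival lemma
    """
    eta = intercept + lemma_offset + main.get("spoken", 0.0) * spoken
    for name, value in features.items():
        eta += main.get(name, 0.0) * value                   # fixed effect
        eta += interaction.get(name, 0.0) * value * spoken   # medium interaction
    return 1.0 / (1.0 + math.exp(-eta))  # inverse logit


# Hypothetical coefficients, chosen only to mirror the signs described above:
# positive values favor anteposition, negative values favor postposition.
MAIN = {"spoken": -0.3, "possessive_det": 0.8, "coordinated": -1.5}
INTERACTION = {"possessive_det": 0.4}

p_bare = anteposition_probability({}, 0, 0.5, MAIN, INTERACTION)
p_poss = anteposition_probability({"possessive_det": 1}, 0, 0.5, MAIN, INTERACTION)
p_coord = anteposition_probability({"coordinated": 1}, 0, 0.5, MAIN, INTERACTION)
print(p_bare, p_poss, p_coord)
```

The `lemma_offset` term is what lets an adjective such as petit sit far above the average rate of anteposition and an adjective such as unique far below it while all lemmas share the same fixed-effect coefficients.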
Only one factor does not have the same effect in both media: the presence of an anteposed adjective, which favors anteposition in SF and postposition in WF. The three other interactions with the medium show that the observed effect is strengthened or weakened in SF. First, demonstratives strongly favor anteposition in WF, whereas in SF they have only a weak effect. Second, the possessive determiner and the adverbial modifier tend to be more strongly associated with anteposition in SF.
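
To make the reading of these interactions concrete, the sketch below shows how a main coefficient and a medium interaction coefficient combine into one effect per medium. It assumes WF is the reference level, so the interaction term applies only in SF. The function names and the numeric values are hypothetical, picked only to reproduce the two patterns described (a reversed effect and a weakened effect); they are not the article's estimates.

```python
def effect_by_medium(main_coef, interaction_coef):
    """Log-odds effect of a variable in each medium (WF assumed as reference level)."""
    return {"WF": main_coef, "SF": main_coef + interaction_coef}


def preferred_position(coef):
    """Map the sign of a coefficient to the position it favors."""
    if coef > 0:
        return "anteposition"
    if coef < 0:
        return "postposition"
    return "no preference"


# Hypothetical values: an effect whose sign is reversed in SF
# (the pattern described for the presence of an anteposed adjective).
reversed_effect = effect_by_medium(main_coef=-0.6, interaction_coef=1.1)

# Hypothetical values: an effect that keeps its sign but is weaker in SF
# (the pattern described for demonstrative determiners).
weakened_effect = effect_by_medium(main_coef=1.2, interaction_coef=-0.9)

for label, effects in [("reversed", reversed_effect), ("weakened", weakened_effect)]:
    for medium, coef in effects.items():
        print(label, medium, round(coef, 2), preferred_position(coef))
```

A significant interaction therefore does not by itself mean that the two media disagree: the sign of the combined SF effect has to be compared with the sign of the WF effect.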
topic corpus linguistics
quantitative approach
statistical modeling
French TreeBank
C-ORAL-ROM
volume 29
doi 10.4000/tipa.1066