Syntaxe du français parlé vs. écrit : le cas de la position de l’adjectif épithète par rapport au nom

In French, attributive adjectives (A) can appear either before or after the noun (N): (1) a. une agréable soirée (anteposed) b. une soirée agréable (postposed) « a nice evening ». We compare the syntax of spoken French (SF) and written French (WF) on the basis of this alternation p...


Bibliographic Details
Main Author:	Juliette Thuilier
Format:	Article
Language:	English
Published:	Publications de l’Université de Provence 2013-12-01
Series:	TIPA. Travaux interdisciplinaires sur la parole et le langage
Subjects:	corpus linguistics quantitative approach statistical modeling French TreeBank C-ORAL-ROM
Online Access:	http://journals.openedition.org/tipa/1066

id	doaj-e53f8a4412004c56a27a2273c80b792c
record_format	Article
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Juliette Thuilier
title	Syntaxe du français parlé vs. écrit : le cas de la position de l’adjectif épithète par rapport au nom
publisher	Publications de l’Université de Provence
series	TIPA. Travaux interdisciplinaires sur la parole et le langage
issn	2264-7082
publishDate	2013-12-01
description	In French, attributive adjectives (A) can appear either before or after the noun (N): (1) a. une agréable soirée (anteposed) b. une soirée agréable (postposed) « a nice evening ». We compare the syntax of spoken French (SF) and written French (WF) on the basis of this alternation. We aim to determine in which cases the syntax of this phenomenon differs between SF and WF, and to quantify these differences. The methodology is inspired by the work of Bresnan et al. (2007) and Bresnan and Ford (2010) on the dative alternation in English. Using statistical modeling on data extracted from written and spoken corpora, we test syntactic factors found in the literature (Abeillé and Godard, 1999; Wilmet, 1981; Forsgren, 1978; Blinkenberg, 1933, among others). We assume that statistical tools such as logistic regression (Agresti, 2007) and mixed-effects models (Gelman and Hill, 2006) allow us to free ourselves from variation due to the sampling of the corpora. Moreover, one advantage of mixed-effects logistic regression is that it is predictive: one can build a model on one set of data and use it to predict the choice between anteposition and postposition on unseen data. In this way, we can evaluate how well the model generalizes from the training set. Lastly, we test the significance of interactions between factors in order to evaluate which syntactic factors behave differently according to the medium used (spoken vs. written).

To build our database, we first extracted the attributive As that appeared in both positions in the syntactically annotated newspaper corpus French Treebank (FTB, Abeillé and Clément 2004), leaving aside As with post-adjectival dependents. We then extracted the same As from the spoken corpus C-ORAL-ROM (CORAL, Cresti and Moneglia 2005). Besides the variable capturing the medium used (SF vs. WF), these data were annotated for 11 variables concerning the syntactic environment of each A in context: (1) the A is coordinated, (2) the A is modified by an adverbial element; the NP contains (3) another A in postposition, (4) another A in anteposition, (5) a relative clause, (6) a PP; the determiner of the NP is (7) demonstrative, (8) possessive, (9) definite; a measure of collocation for (10) the ordered sequence A+N and (11) the ordered sequence N+A (collocations estimated with χ2, Manning and Schütze, 1999). We also differentiated two lemmas in context for 5 As: ancien 'ancient/former', pur 'pure', seul 'alone/single', simple 'simple/modest', propre 'own/clean'. The database contains 6612 occurrences of attributive As (4986 in FTB, 1626 in CORAL) representing 170 lemmas, with 68.9% anteposition (67.1% in FTB, 74.3% in CORAL). There is variation across lemmas: for instance, the A unique 'unique' is anteposed in 20% of the cases, whereas sérieux 'serious' appears in this position in 51.4% and petit 'small' in 98.6%. Moreover, there is less alternation in the spoken data than in the written data: the 170 lemmas appear in both positions in FTB, while only 56 of them (72.8% of the 130 lemmas attested in CORAL) really alternate in CORAL. This seems to reveal that in spoken French, As tend to behave in a more fixed way than in the written variant. One can hypothesize that the more spontaneous the speech, the more often the A occurs in its preferred position, that is, its more frequent position.

We used mixed-effects logistic regression to estimate the probability that anteposition will be chosen as a function of 12 predictive variables (the 11 syntactic variables and the medium used: WF or SF). Constructing the model consists in estimating the coefficient associated with each variable. Besides the predictive variables, also called fixed effects, mixed-effects models can take into account variation in the data by means of random effects. In our case, the adjectival lemma is the random effect, which models adjective-specific idiosyncrasies. We built a model with 12 fixed effects, 1 random effect and 11 interactions between the medium and the 11 syntactic variables. We tested all the interactions between the medium and the syntactic variables. We removed predictors and interactions that were non-significant at the 0.05 level step by step, while keeping in the model non-significant fixed effects for predictors that participated in significant interactions. All the fixed effects as well as 4 interactions were significant and thus participate in predicting the position of the As. Each coefficient associated with a fixed effect can be interpreted as a preference for a position: a positive coefficient indicates a preference for anteposition and a negative one a preference for postposition. The model thus shows that the nature of the determiner influences the position: demonstrative, possessive and definite determiners favor anteposition. Moreover, APs containing coordinated As or adverbial modifiers tend to be postposed, which confirms that speakers tend to put « heavy » APs after the N. The occurrence of a relative clause, a PP or another A after the N also favors anteposition. Finally, the N that the A combines with affects the choice: the more the A and the N tend to form a collocation in a given order, as in à juste titre 'understandably', the more the sequence tends to occur in that order.

There is also a significant effect of the medium: SF favors postposition compared to WF. Insofar as each of these syntactic variables favors the same position in WF as in SF, we consider that the phenomenon is syntactically unified across both variants. Only one factor does not have the same effect in both: the presence of an anteposed adjective. It favors anteposition in SF and postposition in WF. The three other interactions with the medium show that the observed effect is strengthened or weakened in SF. First, demonstratives strongly favor anteposition in WF, whereas in SF they have only a weak effect. Second, the possessive determiner and the adverbial modifier tend to be more strongly associated with anteposition in SF.
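As a rough illustration of how the coefficients described above combine, the sketch below computes a probability of anteposition from a linear predictor made of fixed effects, a per-lemma random intercept, and medium-by-factor interaction terms. All numeric values, the variable names and the helper `p_anteposition` are hypothetical and chosen only to show the mechanics; they are not the estimates reported in the article.

```python
import math

def inv_logit(x: float) -> float:
    """Map a log-odds value to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical values for illustration only (NOT the article's estimates).
INTERCEPT = 0.5
FIXED = {
    "medium_spoken": -0.4,        # SF favors postposition relative to WF
    "det_definite": 0.6,          # positive: favors anteposition
    "adj_coordinated": -1.2,      # negative: favors postposition
    "other_anteposed_adj": -0.8,  # effect in WF (the reference level)
}
# Extra amount added to a factor's effect when the medium is spoken.
INTERACTIONS = {
    "other_anteposed_adj": 1.5,   # net effect in SF: -0.8 + 1.5 = +0.7
}
# Random intercepts: one adjustment per adjectival lemma.
LEMMA_INTERCEPT = {"petit": 3.0, "unique": -2.0}

def p_anteposition(lemma: str, features: list[str], spoken: bool) -> float:
    """Probability that the adjective is anteposed under the toy model."""
    eta = INTERCEPT + LEMMA_INTERCEPT.get(lemma, 0.0)
    if spoken:
        eta += FIXED["medium_spoken"]
    for name in features:
        eta += FIXED[name]
        if spoken:
            eta += INTERACTIONS.get(name, 0.0)
    return inv_logit(eta)

if __name__ == "__main__":
    for spoken in (False, True):
        medium = "SF" if spoken else "WF"
        base = p_anteposition("sérieux", [], spoken)
        with_adj = p_anteposition("sérieux", ["other_anteposed_adj"], spoken)
        print(f"{medium}: baseline={base:.3f}, with another anteposed A={with_adj:.3f}")
```

With these made-up values, adding another anteposed adjective lowers the probability of anteposition in WF and raises it in SF, which is the kind of sign reversal an interaction with the medium can express. Fitting such a model to real data would require a mixed-effects regression library rather than hand-set coefficients.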
topic	corpus linguistics quantitative approach statistical modeling French TreeBank C-ORAL-ROM
url	http://journals.openedition.org/tipa/1066
spelling	doaj-e53f8a4412004c56a27a2273c80b792c 2020-11-24T23:06:43Z eng Publications de l’Université de Provence TIPA. Travaux interdisciplinaires sur la parole et le langage 2264-7082 2013-12-01 29 10.4000/tipa.1066


Similar Items