Lemmatization of CODEA data and its use in quantitative analyses of the eñe and the silent hache

Bibliographic Details
Main Author: Hiroto Ueda
Format: Article
Language: Spanish
Published: Editorial Universidad de Sevilla 2019-12-01
Series: Philologia Hispalensis
Online Access: https://revistascientificas.us.es/index.php/PH/article/view/9543
Description
Summary: This article explains a method for lemmatizing old Spanish documents using data from «CODEA», the Corpus de Documentos Españoles Anteriores a 1800 (Sánchez-Prieto et al., 2009), and the analysis tool «LYNEAL» (Letras y Números en Análisis Lingüísticos). Our goal is to present the simplest possible method of lemmatization, easy to perform with a high degree of accuracy. We then present two examples of its use in the historical study of Spanish spelling: the eñe and the silent hache.
ISSN: 1132-0265, 2253-8321