Lexical novelty as a measure of lexical richness in a contemporary oral corpus:
the PRESEEA-Santander corpus
DOI: https://doi.org/10.4151/S0718-09342025011701076
Abstract
Among the most widespread quantitative approaches to the lexicon are studies of the basic lexicon, studies of the available lexicon, and studies that measure lexical richness (Ávila Muñoz 2014). The present study belongs to the last group. Traditionally, the average richness of a text is obtained with the type-token ratio (TTR), which divides the number of different words (types) by the total number of words (tokens) (Capsada and Torruella 2017). This formula is valid for comparing corpora of the same size, but it is unreliable for comparing texts of different lengths, because the number of types grows more slowly than the number of tokens as a text gets longer. Like density, other indices of lexical richness, such as diversity (Baayen 2001), information (Shannon and Weaver 1963) or what we call peculiarity (Baayen 2001, 2008), are not free of problems either. This article proposes frequency novelty (N) as a new index of lexical richness and applies it to the PRESEEA-Santander sociolinguistic corpus along two dimensions: grammatical category and sociolinguistic parameters (sex, age and educational level). According to the analyses, lexical novelty differs notably by grammatical category and only slightly by sociolinguistic criteria, probably so as to guarantee human communication. The lexical statistics were computed with the R packages ggplot and psych.
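As an illustration of the TTR described above and of its sensitivity to text size, the following is a minimal Python sketch. It is not the authors' method or code (the study itself reports using R), and the whitespace tokenizer, the function names and the toy sentence are assumptions made only for this example. The longer text is built by repeating the shorter one, an artificial case chosen to make the size effect obvious.

```python
def tokenize(text):
    # Naive tokenizer (assumption for illustration): lowercase and split on whitespace.
    return text.lower().split()


def type_token_ratio(tokens):
    """Return the number of distinct tokens (types) divided by the total number of tokens."""
    if not tokens:
        raise ValueError("TTR is undefined for an empty text")
    return len(set(tokens)) / len(tokens)


# Toy data (hypothetical): a short text and a longer text made of the same words.
short_text = tokenize("the cat sat on the mat")
long_text = short_text * 10  # ten times as many tokens, no new types

ttr_short = type_token_ratio(short_text)  # 5 types / 6 tokens
ttr_long = type_token_ratio(long_text)    # 5 types / 60 tokens

print(f"TTR, short text: {ttr_short:.3f}")
print(f"TTR, long text:  {ttr_long:.3f}")
```

Both texts use exactly the same vocabulary, yet the longer one receives a much lower TTR. This is the kind of size dependence that makes the ratio unsuitable for comparing texts of different lengths.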
License
Copyright (c) 2025 Revista Signos. Estudios de Lingüística

This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright agreement:
Authors who have a manuscript accepted for publication in this journal agree to the following terms:
Authors retain their copyright and grant the journal the right of first publication of their work under this copyright agreement, which is subject to the Creative Commons Attribution License; this license allows third parties to share the work provided that its author and its first publication in this journal are indicated.
Authors may adopt other non-exclusive license agreements for distribution of the published version of the work (e.g., depositing it in an institutional repository or publishing it in a monographic volume) as long as the initial publication in this journal is indicated.
Authors are allowed and encouraged to disseminate their work via the internet (e.g., in institutional publications or on their website) before and during the submission process, which can lead to interesting exchanges and increase citations of the published work.