

Orthographic, grapho-phonological, and morphological characteristics
of written words from French elementary textbooks







Manulex-Morpho



Understanding Manulex Morpho

The phonetic transcription is based on a simplified phonetic alphabet created for the database (its correspondence with the International Phonetic Alphabet is given on the 'Phonetic codes' page).

General information

• Word spelling
• Phonological code of the word
• Syntactic class (NC: noun; VER: verb; ADJ: adjective; PRO: pronoun; PRE: preposition; CON: conjunction; DET: determiner)
• Frequency of use (according to Manulex; Lété et al., 2004). The frequency index is the frequency of the word per million words (value derived from the F value of the Manulex database)
• Phonological structure CV1 (C=consonant, v=vowel, Y=semi-vowel)
• Phonological structure CV2 (O=occlusive, F=fricative, N=nasal consonant, L=phonemes /l/ and /R/, v=vowel, Y=semi-vowel)
• Presence/absence of consonant cluster in the word (column 'clusterCC'), and identity of CC and CCC clusters (column 'cluster_id')
• Segmentation of the word into graphemes (the '.' character indicates a grapheme boundary)
• Phonological segmentation reflecting the grapheme segmentation
• Grapheme-phoneme associations. This field makes it possible to locate words containing a particular association. '-' links a grapheme to its corresponding phoneme, and '.' separates successive grapheme-phoneme associations. In addition, an opening parenthesis '(' marks the beginning of the word and a closing parenthesis ')' its end: for example, '(ch-S.a-a.r-R)' for the word 'char' /SaR/. These characters help locate words with a particular grapheme-phoneme association at the beginning or end of the word (e.g., searching for '(ch-S.' or '.ch-S)' returns the words containing the association at the beginning or at the end of the word, respectively).
Note: in version 2 of Manulex-Morpho, the grapho-phonemic segmentations differ from the phono-graphemic segmentations, mainly to take silent graphemes into account (see tab 'Changes from ver.1').
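
The search patterns described for the grapheme-phoneme association field can be sketched in Python as follows. This is only an illustration, not a tool shipped with the database: the function names `parse_associations` and `has_association` are hypothetical, and the sketch assumes that neither graphemes nor phoneme codes contain the '.' or '-' characters.

```python
def parse_associations(field):
    """Split a field such as '(ch-S.a-a.r-R)' into (grapheme, phoneme) pairs."""
    field = field.strip()
    if not (field.startswith("(") and field.endswith(")")):
        raise ValueError(f"expected '(' and ')' word delimiters: {field!r}")
    pairs = []
    for chunk in field[1:-1].split("."):      # '.' separates associations
        grapheme, _, phoneme = chunk.partition("-")  # '-' links grapheme and phoneme
        pairs.append((grapheme, phoneme))
    return pairs


def has_association(field, grapheme, phoneme, position="any"):
    """Check whether a word contains grapheme-phoneme at the given position."""
    token = f"{grapheme}-{phoneme}"
    whole_word = field == f"({token})"        # word made of a single association
    if position == "initial":
        return whole_word or field.startswith(f"({token}.")
    if position == "final":
        return whole_word or field.endswith(f".{token})")
    return (grapheme, phoneme) in parse_associations(field)


char_field = "(ch-S.a-a.r-R)"                 # the word 'char' /SaR/
print(parse_associations(char_field))
print(has_association(char_field, "ch", "S", position="initial"))
```

Matching on the delimited string rather than on the bare 'ch-S' substring avoids false hits inside longer graphemes, which is presumably why the field carries the '(' and ')' markers.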


Word length indices

• Number of letters, phonemes, graphemes, and syllables


Frequency and consistency of G-Ph and Ph-G associations, for the whole word and as a function of position in the word (initial, final, internal)

(Notes. Estimates are computed by type and by token. The textual frequency index is the frequency of the word per million words. From ver. 2.4, by-token values are computed using a log transform of word frequency, log10(frequency + 1). In Manulex-Morpho version 2, several graphemes can be coded as final graphemes. For example, the 'd' of the word 'foulard' is considered a final grapheme in 'foulards', before the nominal number inflection. See tab 'Changes from ver.1'.)
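
The difference between by-type and by-token estimates can be illustrated with a minimal Python sketch. The only element taken from the notes above is the weight log10(frequency + 1); how the database aggregates and normalizes the counts is not specified here, so the function `association_counts` is an assumption, and the toy entries use made-up frequencies and, apart from 'char', assumed phoneme codes.

```python
import math


def token_weight(frequency_per_million):
    """Log-transformed weight of a word: log10(frequency + 1)."""
    return math.log10(frequency_per_million + 1)


def association_counts(lexicon, association):
    """Return (by_type, by_token) counts for one (grapheme, phoneme) pair.

    lexicon: iterable of (pairs, frequency_per_million) entries, where
    pairs is a list of (grapheme, phoneme) tuples for one word.
    """
    by_type = 0
    by_token = 0.0
    for pairs, frequency in lexicon:
        occurrences = pairs.count(association)
        by_type += occurrences                       # each word weighs 1
        by_token += occurrences * token_weight(frequency)  # log-weighted
    return by_type, by_token


# Toy entries: frequencies are invented, not values from the database.
toy_lexicon = [
    ([("ch", "S"), ("a", "a"), ("r", "R")], 99.0),   # 'char'
    ([("c", "k"), ("a", "a"), ("r", "R")], 9.0),     # 'car', codes assumed
]

print(association_counts(toy_lexicon, ("a", "a")))
```

With the log transform, a word that is ten times more frequent does not weigh ten times more, which limits the influence of a handful of very frequent words on the by-token values.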

• Average frequency of G-Ph and Ph-G associations
• Frequency of the initial G-Ph and Ph-G association
• Average frequency of internal G-Ph and Ph-G associations (non-initial and non-final)
• Frequency of final G-Ph and Ph-G association(s)

• Average consistency of G-Ph associations
• In initial position
• In internal position (non-initial and non-final)
• In final position

• Average consistency of Ph-G associations
• In initial position
• In internal position (non-initial and non-final)
• In final position
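
As an illustration of what a consistency index can look like, the sketch below uses a common by-type definition: the consistency of a G-Ph association is the share of all occurrences of the grapheme that receive this phoneme, and Ph-G consistency is the mirror image. This definition is an assumption, not a statement of how Manulex-Morpho computes its values, and the sketch ignores position in the word as well as the version 2 rule allowing several final graphemes. The function names and the toy words (with phoneme codes assumed except for 'char') are hypothetical.

```python
from collections import Counter


def consistency_table(words, source=0):
    """Map each (grapheme, phoneme) pair to its by-type consistency.

    words:  iterable of lists of (grapheme, phoneme) pairs, one per word
    source: 0 for G-Ph consistency (grapheme as starting point),
            1 for Ph-G consistency (phoneme as starting point)
    """
    pair_counts = Counter()
    source_counts = Counter()
    for pairs in words:
        for pair in pairs:
            pair_counts[pair] += 1
            source_counts[pair[source]] += 1
    return {pair: n / source_counts[pair[source]]
            for pair, n in pair_counts.items()}


def mean_consistency(pairs, table):
    """Average consistency over the associations of one word."""
    return sum(table[pair] for pair in pairs) / len(pairs)


# Toy lexicon: 'char' as coded in the database, 'chaos' with assumed codes.
char = [("ch", "S"), ("a", "a"), ("r", "R")]
chaos = [("ch", "k"), ("a", "a"), ("o", "o"), ("s", "s")]
gp_table = consistency_table([char, chaos], source=0)

print(gp_table[("ch", "S")])
print(mean_consistency(char, gp_table))
```

Positional indices could follow the same pattern by building one table per position (initial, internal, final) instead of a single table for the whole word.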