Understanding Manulex Morpho
The phonetic transcription is based on a simplified phonetic
alphabet created for the database (the correspondence with the
International Phonetic Alphabet can be found on the page
'Phonetic codes').
General information
• Word spelling
• Phonological code of the word
• Syntactic class (NC: noun; VER: verb; ADJ: adjective; PRO:
pronoun; PRE: preposition; CON: conjunction; DET: determiner)
• Frequency of use (according to Manulex; Lété et al.,
2004). The frequency index is the frequency of the word per
million words (value derived from the F value of the Manulex
database)
• Phonological structure CV1 (C=consonant, v=vowel,
Y=semi-vowel)
• Phonological structure CV2 (O=occlusive, F=fricative, N=nasal
consonant, L=phonemes /l/ and /R/, v=vowel, Y=semi-vowel)
• Presence/absence of consonant cluster in the word (column
'clusterCC'), and identity of CC and CCC clusters (column
'cluster_id')
• Segmentation of the word into graphemes (the '.' character
indicates a grapheme boundary)
• Phonological segmentation reflecting the grapheme segmentation
• Grapheme-phoneme associations. This field makes it possible to
locate words that include a particular association. '-' links a
grapheme to its corresponding phoneme, and '.' separates
grapheme-phoneme associations. In addition, an opening parenthesis
'(' marks the beginning of the word, and a closing parenthesis
')' its end: for example, '(ch-S.a-a.r-R)' for the word 'char'
/SaR/. These characters help locate words with a particular
grapheme-phoneme association at the beginning or end of the word
(e.g., searching for '(ch-S.' or '.ch-S)' returns the list of
words that include the association at the beginning or end of
the word, respectively).
Note: in version 2 of Manulex-Morpho, the grapho-phonemic
segmentations differ from the phono-graphemic segmentations,
mainly to take silent graphemes into account (see
tab 'Changes from ver.1').
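The association format described above can be read programmatically. The following is a minimal sketch, not part of the database tools: the function names `parse_associations` and `has_association` are hypothetical, and the code assumes that neither graphemes nor phoneme codes contain the characters '.', '(' or ')' and that a grapheme never contains '-'. The 'final' test looks only at the last association, so it does not reproduce the version 2 coding in which several graphemes can count as final.

```python
def parse_associations(field):
    """Split a field such as '(ch-S.a-a.r-R)' into (grapheme, phoneme) pairs."""
    if not (field.startswith("(") and field.endswith(")")):
        raise ValueError(f"missing word-boundary parentheses: {field!r}")
    pairs = []
    for unit in field[1:-1].split("."):      # '.' delineates associations
        grapheme, phoneme = unit.split("-", 1)  # '-' links grapheme and phoneme
        pairs.append((grapheme, phoneme))
    return pairs


def has_association(field, grapheme, phoneme, position="any"):
    """Check whether a word contains grapheme-phoneme in the given position."""
    pairs = parse_associations(field)
    target = (grapheme, phoneme)
    if position == "initial":
        return pairs[0] == target
    if position == "final":
        return pairs[-1] == target            # last association only (assumption)
    if position == "any":
        return target in pairs
    raise ValueError(f"unknown position: {position!r}")


char = "(ch-S.a-a.r-R)"                        # example taken from the text above
print(parse_associations(char))
print(has_association(char, "ch", "S", position="initial"))
```

Parsing the field into pairs, rather than searching for raw substrings, avoids accidental matches such as 'h-S' inside 'ch-S'.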
Word length indices
• Number of letters, phonemes, graphemes, and syllables
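As an illustration of how three of these indices relate to the segmentation fields, here is a small sketch. It assumes that the phonological segmentation uses the same '.' separator as the grapheme segmentation (e.g. 'S.a.R' for 'char'), which is a guess about the format, and it ignores silent graphemes. The function name `length_indices` is hypothetical. The number of syllables is not derived here because it cannot be recovered from these fields alone.

```python
def length_indices(spelling, grapheme_seg, phoneme_seg):
    """Count letters, graphemes and phonemes from '.'-separated segmentations."""
    return {
        # Count alphabetic characters only, so hyphens or apostrophes are skipped.
        "letters": sum(ch.isalpha() for ch in spelling),
        "graphemes": len(grapheme_seg.split(".")),
        # Assumed format: one '.'-separated code per phoneme, no silent segments.
        "phonemes": len(phoneme_seg.split(".")),
    }


print(length_indices("char", "ch.a.r", "S.a.R"))
```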
Frequency and consistency of G-Ph and Ph-G associations, for
the whole word and as a function of position in the word
(initial, final, internal)
(Notes. Estimates are made by type and by token. The textual
frequency index is the frequency of the word per million
words. From ver. 2.4, by-token values are computed using a log
transform of word frequency, log10(frequency+1). In
Manulex-Morpho version 2, several graphemes can be coded as
final graphemes. For example, the 'd' of the word 'foulard' is
considered a final grapheme in 'foulards', before the
nominal number inflection. See tab 'Changes from ver.1'.)
• Average frequency of G-Ph and Ph-G associations
• Frequency of the initial G-Ph and Ph-G association
• Average frequency of internal G-Ph and Ph-G associations
(non-initial and non-final)
• Frequency of final G-Ph and Ph-G association(s)
• Average consistency of G-Ph associations
  • In word-initial position
  • In internal position (non-initial and non-final)
  • In word-final position
• Average consistency of Ph-G associations
  • In word-initial position
  • In internal position (non-initial and non-final)
  • In word-final position
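The difference between by-type and by-token estimates can be sketched as follows. This is an illustration only, not the procedure used to build the database. It assumes a common definition of G-Ph consistency (the weight of a grapheme-phoneme association divided by the total weight of all associations sharing that grapheme), counts each occurrence of an association, and ignores position in the word. The toy lexicon, its frequencies, and the phoneme codes for 'chou' and 'écho' are invented. Only the log10(frequency+1) weighting for tokens comes from the notes above.

```python
from collections import defaultdict
from math import log10

# Hypothetical toy lexicon: (association field, frequency per million words).
LEXICON = [
    ("(ch-S.a-a.r-R)", 9.0),    # 'char' (format from the text; frequency invented)
    ("(ch-S.ou-u)", 99.0),      # 'chou' (invented entry)
    ("(é-e.ch-k.o-o)", 9.0),    # 'écho' (invented entry)
]


def gp_consistency(lexicon, by_token=False):
    """Return {(grapheme, phoneme): consistency} under the assumed definition."""
    assoc_totals = defaultdict(float)
    grapheme_totals = defaultdict(float)
    for field, freq in lexicon:
        # By type: every word counts once. By token: log10(frequency + 1).
        weight = log10(freq + 1) if by_token else 1.0
        for unit in field.strip("()").split("."):
            grapheme, phoneme = unit.split("-", 1)
            assoc_totals[(grapheme, phoneme)] += weight
            grapheme_totals[grapheme] += weight
    return {
        pair: total / grapheme_totals[pair[0]]
        for pair, total in assoc_totals.items()
        if grapheme_totals[pair[0]] > 0   # skip graphemes with zero total weight
    }


by_type = gp_consistency(LEXICON)
by_tok = gp_consistency(LEXICON, by_token=True)
print(by_type[("ch", "S")], by_tok[("ch", "S")])
```

In this toy lexicon the association 'ch-S' accounts for 2 of the 3 words containing 'ch', while its by-token value is higher because the more frequent word 'chou' carries more weight. Ph-G consistency would follow the same pattern with phoneme totals in the denominator.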