Structure

Functions related to the structure of words.

lexlib.structure.clusters(words, vowels, sep=None, unique=False, case_sensitive=True)[source]

Separates a list of words into clusters. Clusters are defined as sequences of characters that do not contain any of the characters in the list of vowels.

If sep is defined, it will be used as the delimiter string (for example, with sep=”.”, the word “a.bc.de” will be treated as the three-character sequence [“a”, “bc”, “de”]).

If unique is True, returns each cluster only once. If unique is False (the default), returns each cluster as many times as it occurs.

If case_sensitive is True (the default), uppercase and lowercase characters will be treated as two different characters (e.g., “a” will be seen as different from “A”). If case_sensitive is False, uppercase and lowercase characters will be treated as the same character, and the output will be lowercase (e.g., “a” and “A” will both be treated as “a”).

lexlib.structure.clusters_word(word, vowels, sep=None, case_sensitive=True)[source]

Separates a word into clusters, defined as sequences of characters that do not contain any of the characters in the list of vowels.

If sep is defined, it will be used as the delimiter string (for example, with sep=”.”, the word “a.bc.de” will be treated as the three-character sequence [“a”, “bc”, “de”]).

If case_sensitive is True (the default), uppercase and lowercase characters will be treated as two different characters (e.g., “a” will be seen as different from “A”). If case_sensitive is False, uppercase and lowercase characters will be treated as the same character, and the output will be lowercase (e.g., “a” and “A” will both be treated as “a”).

lexlib.structure.filter_by_nsyll(words, vowels, nsyll, sep=None)[source]

Given a list of words, return a list containing only the words with the desired number of syllables, determined by the number of characters from the vowels list found in that word.

The number of syllables, nsyll can be either an integer or a list of integers. If it is a list, the returned list will contain words of any syllable length included in nsyll.

If sep is defined, it will be used as the delimiter string (for example, with sep=”.”, the word “a.bc.de” will be treated as the three-character sequence [“a”, “bc”, “de”]).

lexlib.structure.get_cv(word, vowels, sep=None)[source]

Calculate the consonant (“C”) and vowel (“V”) structure of the given word. Returns a string of the characters “C” and “V” corresponding to the characters in the word.

vowels – A list of the characters representing vowels.

sep – String used to separate phonemes (if the words are phonological forms). To separate into individual characters, set to None (default).

lexlib.structure.nsyll_list(words, vowels, sep=None)[source]

Count the number of syllables in each word in a words list, determined by the number of characters from the vowels list found in that word. Return a list of (word, nsyll) pairs.

If sep is defined, it will be used as the delimiter string (for example, with sep=”.”, the word “a.bc.de” will be treated as the three-character sequence [“a”, “bc”, “de”]).

lexlib.structure.nsyll_word(word, vowels, sep=None)[source]

Count the number of syllables in a word, determined by the number of characters from the vowels list found in that word.

If sep is defined, it will be used as the delimiter string (for example, with sep=”.”, the word “a.bc.de” will be treated as the three-character sequence [“a”, “bc”, “de”]).