Free and open source Hill Mari analyser gtsvn-mrj

Julia Kuprina and Jack Rueter in cooperation with the Divvun and Giellatekno teams, community members
Software version
Documentation license
SVN Revision: 68217
SVN Date: 2013-01-16 11:31:33 +0200 (Wed, 16 Jan 2013)


This is a free and open source morphological description of Hill Mari.



Analysis symbols

The morphological analyses of Hill Mari word forms are presented in this system using the following symbols. (It is highly recommended to follow existing standards when adding new tags.)
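As a minimal sketch of how such analyses can be consumed, the snippet below splits an analysis string into a lemma and its tags. It assumes the common `lemma+Tag+Tag` layout; the tags `N`, `Sg` and `Nom` are illustrative and are not taken from this document's tag lists.

```python
def split_analysis(analysis):
    """Split 'lemma+Tag1+Tag2' into (lemma, [tags]).

    Assumption: '+' separates the lemma from each tag and does not
    occur inside the lemma itself.
    """
    lemma, *tags = analysis.split("+")
    return lemma, tags


# Illustrative analysis string (the tags are assumed, not from the source).
lemma, tags = split_analysis("кол+N+Sg+Nom")
print(lemma, tags)
```

A real tag set may include multi-part or prefixed tags, so the splitting rule may need adjusting.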

The parts-of-speech are:

The parts of speech are further split up into: Nouns Pronouns Verbs

Usage extents are marked using the following tags:

Forms from older orthographic norms are marked with the following tags:

Nominals are inflected for the following cases and numbers:

Possession is marked as such:

The comparative forms are:

Numerals are classified under:

Verb moods are:

Verb personal forms are:

Other verb forms are:

Special symbols are classified with:

The verbs are syntactically split according to transitivity:

Special multiword units are analysed with:

Non-dictionary words can be recognised with:

Question and Focus particles:

Semantics are classified with:

Derivations are classified by the morphophonetic form of the suffix and by the source and target parts of speech.


To represent phonological variation in word forms, we use the following symbols in the lexicon files:

The following triggers control variation: k loss in am-verbs, and the z to ts alternation.

We have manually optimised the structure of our lexicon using the following flag diacritics to restrict morphological combinatorics:
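To illustrate how flag diacritics restrict combinations, here is a small sketch that checks whether the flags along one path are mutually consistent. It models only the `P` (set), `R` (require), `D` (disallow), `C` (clear) and `U` (unify) operators in simplified form. The feature names `PX` and `NUM` and their values are hypothetical and do not come from this lexicon.

```python
import re

# Matches flags such as @P.PX.ON@ or @D.PX@ (the value is optional).
FLAG = re.compile(r"@([PRDCU])\.([^.@]+)(?:\.([^.@]+))?@")


def path_is_valid(symbols):
    """Return True if the flag diacritics on a path never conflict."""
    features = {}
    for symbol in symbols:
        match = FLAG.fullmatch(symbol)
        if match is None:
            continue  # ordinary symbol, not a flag
        op, feat, val = match.groups()
        current = features.get(feat)
        if op == "P":            # set the feature unconditionally
            features[feat] = val
        elif op == "C":          # clear the feature
            features.pop(feat, None)
        elif op == "R":          # require the feature (optionally a value)
            if current is None or (val is not None and current != val):
                return False
        elif op == "D":          # disallow the feature (optionally a value)
            if val is None:
                if current is not None:
                    return False
            elif current == val:
                return False
        elif op == "U":          # unify: set if unset, else values must agree
            if current is None:
                features[feat] = val
            elif current != val:
                return False
    return True


# Hypothetical flags: the second path is blocked because PX was set to ON.
print(path_is_valid(["@P.PX.ON@", "к", "@R.PX.ON@"]))
print(path_is_valid(["@P.PX.ON@", "к", "@D.PX.ON@"]))
```

In a compiled transducer the runtime performs these checks during lookup. The sketch only mirrors the logic for a single path.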

Word forms in Hill Mari start from the lexeme roots of the basic word classes, or optionally from prefixes. The assumption is that XML files named pos.xml will provide the source material for the initial _LEXICON Pos_ entries in pos.lexc.
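The conversion from XML to lexc entries could look roughly like the sketch below. The XML layout here (an `l` element with `stem` and `contlex` attributes) is a hypothetical schema invented for illustration, as is the lexicon name `N`. The actual pos.xml structure is not described in this document.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the element and attribute names are assumptions.
SAMPLE_XML = """<r>
  <e><l pos="N" stem="кол" contlex="N_КОЛ">кол</l></e>
  <e><l pos="N" stem="олма" contlex="N_ОЛМА">олма</l></e>
</r>"""


def xml_to_lexc(xml_text, lexicon_name):
    """Build a 'LEXICON <name>' block with one 'lemma:stem CONTLEX ;' line per entry."""
    root = ET.fromstring(xml_text)
    lines = [f"LEXICON {lexicon_name}"]
    for lemma_el in root.iter("l"):
        lemma = (lemma_el.text or "").strip()
        stem = lemma_el.get("stem", lemma)   # fall back to the lemma
        contlex = lemma_el.get("contlex")
        if not lemma or not contlex:
            continue  # skip incomplete entries
        lines.append(f"{lemma}:{stem} {contlex} ;")
    return "\n".join(lines)


print(xml_to_lexc(SAMPLE_XML, "N"))
```

The continuation lexicon names `N_КОЛ` and `N_ОЛМА` match the noun lexicons listed in this document. Everything else in the sample input is assumed.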

Small parts of speech

Noun inflection

Hill Mari nouns inflect for case.



LEXICON N_КОЛ кол: кол

LEXICON N_ТӸРВӸ тӹрвӹ: тӹрвӹ

The stem vowel "е" is found with possessor indices and the lative.

LEXICON N_ПОЧТА почта: почта

LEXICON N_ОЛМА олма: олма

PxSg1+NB+CASE singular possessa

Pl Possessor Indices

K ; ! No possessor index
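An entry such as `K ;` adds nothing to either side of the word and simply continues to lexicon `K`. The sketch below imitates that chaining of continuation lexicons with a toy table. The lexicon name `PX`, the tag `+CASE` and the surface strings `-PX` and `-CASE` are placeholders, not real Hill Mari morphemes or the actual lexicon contents.

```python
# Each lexicon holds entries of (analysis side, surface side, continuation).
# "#" ends the word, as in lexc. All suffix strings are placeholders.
LEXICONS = {
    "Root": [("кол", "кол", "PX")],
    "PX": [
        ("+PxSg1", "-PX", "K"),  # placeholder possessor index
        ("", "", "K"),           # like "K ;": no possessor index
    ],
    "K": [("+CASE", "-CASE", "#")],
}


def expand(lexicon="Root", upper="", lower=""):
    """Yield every (analysis, surface) pair reachable from a lexicon."""
    if lexicon == "#":
        yield upper, lower
        return
    for up, low, continuation in LEXICONS[lexicon]:
        yield from expand(continuation, upper + up, lower + low)


pairs = list(expand())
for analysis, surface in pairs:
    print(analysis, surface)
```

The empty entry is what lets a noun reach the case lexicon without a possessor index.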