Morphology

INTRODUCTION TO MORPHOLOGICAL ANALYSER OF ERZYA.

Analysis symbols

The morphological analyses of ERZYA wordforms are presented in this system in terms of the following symbols. (It is highly recommended to follow existing standards when adding new tags.)

  • +TYÄ WORK HAS TO BE DONE
  • %

The parts-of-speech are:

  • +A adjective
  • +Adp adposition
  • +Adv adverb
  • +CS subordinating conjunction
  • +CC coordinating conjunction
  • +Det determiner
  • +Descr descriptive
  • +Interj interjection
  • +N noun
  • +Num numerals
  • +Pcle particle
  • +Po postposition
  • +Pr preposition (in Russian loans)
  • +Pron pronoun
  • +Qnt quantifier
  • +V verb
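
As a rough illustration of how an analysis string built from these symbols can be taken apart, the sketch below splits such a string into a lemma and its tags. The helper name `split_analysis` and the sample string are assumptions made for illustration only; the real tag order is defined by the analyser, not by this example.

```python
def split_analysis(analysis: str) -> tuple[str, list[str]]:
    """Split 'lemma+Tag1+Tag2' into the lemma and a list of '+Tag' symbols."""
    lemma, *tags = analysis.split("+")
    return lemma, ["+" + tag for tag in tags]


# Hypothetical analysis string; the tags are taken from the lists in this document.
lemma, tags = split_analysis("кудо+N+Sg+Ine")
print(lemma, tags)
```

The sketch assumes that the lemma itself contains no plus sign; lemmas that do would need the escaping used in the lexicon files.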

Parts of speech are further split up into:

Adjectives

  • +Adn Adnominal (modifier) !! This is not an NP head like +Pron
  • +Bahuvrihi This is a nominative-case NP used as an adjective
  • +bahuvrihi lower-case variant; to be removed in favour of the upper-case +Bahuvrihi

Adverbs

  • +Ideoph Ideophonic descriptors used to modify the verb, e.g. вырк ливтясь "flit and it flew off". "Ideophone: A vivid representation of an idea in sound. A word, often onomatopoeic, which describes a predicate, qualificative or adverb in respect to manner, colour, sound, smell, action, state or intensity." (Doke 1935: 118)
  • +Manner with reference to type of adverb
  • +Parenthetic parenthetic
  • +Spat spatial
  • +Temp temporal
  • +Iter Iterative form expressing number of times; myv: кавксть, kpv: кыкысь
  • +Mult Multiplicative, two-ply; myv: кавонькирда
  • +Deg Ad-adjective; this marks degree (deprecate +AdA)
  • +Epist epistemic modality marker (the speaker's evaluation/judgment, degree of confidence)
  • +EvidNfh not first-hand келя
  • +EvidFh first-hand
  • +PerifMod peripheral modifier ськамонзо

Interjections

  • +Formulaic

Nouns

  • +Prop proper
  • +CollN used with paired nouns, i.e. COLLECTIVE NOUNS

Particles

Postpositions + Spat, + Temp

Pronouns

  • +Dem demonstrative
  • +Indef indefinite
  • +Dep dependent word requiring the presence of another, e.g. мень
  • +Interr interrogative
  • +Pers personal
  • +Recipr reciprocal
  • +Refl reflexive
  • +Rel relative
  • +Relator relator noun
  • +Sel selective, when selecting from a set of definites
  • +Short тень, теть; эстень
  • +Long монень, тонеть; монстень
  • +Sg1 first person singular
  • +Sg2 second person singular
  • +Sg3 third person singular
  • +Pl1 first person plural
  • +Pl2 second person plural
  • +Pl3 third person plural

Quantifiers (numerals)

Quantifiers and Numerals are classified under:

  • +Appr Approximative numeral кавто-колмо, колмошка two or three NB! do not confuse with Komi case +Apr
  • +AssocColl -ne- ; avide-
  • +Assoc +мезть
  • +Card cardinal + NCard
  • +Coll collective
  • +Distr Distributive
  • +Ord ordinal + NOrd
  • +Exclusive ськамонзо

Nominals are inflected for Number and Case

Number

  • +Sg singular
  • +Pl plural
  • +SP ambiguous for number, general number

Case

  • +Abe abessive
  • +Abl ablative case
  • +Com Comitative "-нек/-нэк"
  • +Cmpr Comparative case form -шка
  • +Dat dative
  • +Ela elative case
  • +Gen genitive case
  • +Ill illative
  • +Ine inessive
  • +Lat lative
  • +Loc Locative "вить ён : вить ёно"
  • +Nom nominative case
  • +Prl prolative "га/ка/ва"
  • +Tra translative: used in similative and depictive constructions to mark what would be a secondary subject: --вармакс оргодсь тосто--
  • +TempCx Temporalis case form "-не/-нэ"
  • +Voc Vocative

Possession and other declension types are marked with:

  • +PxSg1 first person singular
  • +PxSg2 second person singular
  • +PxSg3 third person singular
  • +PxSP3 third person singular or plural with dative only
  • +PxPl1 first person plural
  • +PxPl2 second person plural
  • +PxPl3 third person plural
  • +Def Definite

The comparative forms are:

  • +Comp comparative as opposed to superlative
  • +Superl superlative
  • +Attr Attribute

Verb moods are:

  • +Cond conditional Ындеря- (Derivational)
  • +Conj conjunctional "вОль"
  • +Des desiderative Ыксэль "was about to; wanted to"
  • +Ind indicative
  • +Imprt imperative
  • +Opt optative
  • +Prec precative
  • +Proh prohibitive is distinct from the negation of imperative Иля аварде! `Don't cry' (Proh); Аволь мелявтт, кецяк! `Don't worry, be happy!' (Neg + Imprt)

Infinitive moods

  • +Oblig modality: deontic/directive/obligative андомс: андома , якамс: якама
  • +Delib +Sugg modality: deontic/directive/deliberative I still need the right word for this андомс: андомсат

Tenses in the indicative and infrequently in the conditional

  • +Prs present; in Erzya there is no morphological distinction between present and future
  • +Prt1 Preterite 1
  • +Prt2 Preterite 2 (This is also used in predicate forms not involving a finite verb.)

Verb personal forms are:

  • +ScSg1 * subject conjugation first person singular
  • +ScSg2 * subject conjugation second person singular
  • +ScSg3 * subject conjugation third person singular
  • +ScPl1 * subject conjugation first person plural
  • +ScPl2 * subject conjugation second person plural
  • +ScPl3 * subject conjugation third person plural

Object conjugation

  • +OcSg1 * object conjugation first person singular
  • +OcSg2 * object conjugation second person singular
  • +OcSg3 * object conjugation third person singular
  • +OcPl1 * object conjugation first person plural
  • +OcPl2 * object conjugation second person plural
  • +OcPl3 * object conjugation third person plural
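
The subject and object conjugation tags follow a regular naming pattern (Sc or Oc, then Sg or Pl, then the person). A minimal sketch of decoding that pattern is given below; the function name `decode_conjugation` and the returned labels are assumptions chosen for this example.

```python
import re

# Pattern read off the +ScSg1 ... +OcPl3 tags listed above.
CONJ_TAG = re.compile(r"\+(Sc|Oc)(Sg|Pl)([123])")


def decode_conjugation(tag):
    """Return (role, person, number) for a conjugation tag, or None otherwise."""
    match = CONJ_TAG.fullmatch(tag)
    if match is None:
        return None
    role = "subject" if match.group(1) == "Sc" else "object"
    number = "singular" if match.group(2) == "Sg" else "plural"
    return role, int(match.group(3)), number


print(decode_conjugation("+ScSg1"))
print(decode_conjugation("+OcPl3"))
```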

Other verb forms are

  • +Act * active voice (exo-tradition)
  • +PrsPrc * present participle (only non-contrastive usage)
  • +DemPrc * present participle (both contrastive and non-contrastive)
  • +ActPrcLong %{иы%}й (This is dealt with elsewhere as an active present participle)
  • +ActPrcShort %{иы%} (This is dealt with elsewhere as an active present participle)
  • +ActDemPrc %{иы%}ця (This is dealt with elsewhere as an active present participle)
  • +ConNeg * connegative, main verb complement to Neg, vow-stem
  • +ConNegII * connegative, main verb complement to Neg, cons-stem
  • +Ger * gerund This is used with Der/Ozj and VAbl
  • +Inf * infinitive
  • +Neg * verb of negation эзь, аволь, иля
  • +Prc * participle
  • +VGen * Verb Genitive, genitive form participle
  • +VAbl * Verb Ablative "озадо"
  • +Prc/Telic * Telic participle "саевть"
  • +Der/Abe * ВтОмО
  • +Der/Cmpr * шка
  • +Der/A * adjective derived from N or V
  • +Der/N2A * adjective derived from N
  • +Der/V2A * adjective derived from V
  • +Subst * deverbal nouns retaining verb arguments/gov
  • +PrfPrc

The usage extents are marked using the following tags:

  • +Err/Orth * Substandard
  • +Err/Sub * Substandard
  • +Err/Orth-no-linking-vowel linking vowel is missing
  • +Err/Orth-high-linking-vowel linking vowel is high
  • +Use/Marg * Marginal
  • +Use/-Spell * Exclude from speller
  • +Use/SpellNoSugg * recognized but not suggested in speller
  • +Use/Circ * Circular path
  • +Use/CircN * Circular number path
  • +Use/-Ped * Remove from pedagogical speller
  • +Use/NG * Do not generate, for isme-ped.fst and apertium
  • +Err/Lex * The lemma is not an Erzya word (Deprecating --+Src/F--)
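
To show how the speller-related usage tags might be consumed downstream, the following sketch classifies an analysis by its tags. The function name `speller_status` and the three status labels are hypothetical; only the tag names come from the list above.

```python
def speller_status(tags):
    """Classify an analysis for a speller, based on its +Use/ tags."""
    if "+Use/-Spell" in tags:
        # Excluded from the speller altogether.
        return "excluded"
    if "+Use/SpellNoSugg" in tags:
        # Recognised as correct, but never offered as a suggestion.
        return "accept-only"
    return "accept-and-suggest"


print(speller_status(["+N", "+Sg", "+Nom", "+Use/SpellNoSugg"]))
```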

Dialect tags

  • +Dial/SH * Short forms
  • +Dial/L * Long forms
  • +Dial * No specification Specific to some dialects Rueter 2010: 8
  • +Dial/-C * Not central standard
  • +Dial/C * 1 Central or Kozlovka-Mokshlei
  • +Dial/W * 2 Western or Insar
  • +Dial/NW * 3 North-Western or Alatyr
  • +Dial/SE * 4 South-Eastern or Sura
  • +Dial/M * 5 Mixed or Drakino-Shoksha

Orthography tags

  • +Orth/PhonDeriv * Derivation is phonetic but declension and conjugation morphologic
  • +Orth/PhonInfl * Entire inflection is phonetic 1821, 1920-30
  • +Orth/Colloq Colloquial speech reflected in spelling

Abbreviated words are classified with:

  • +ABBR * Abbreviation
  • +Symbol = independent symbols in the text stream, like £, €, ©
  • +ACR * Acronym

Special symbols

Delimiter marks are classified with:

  • +CLB +PUNCT +LEFT +RIGHT *
  • %^excl *

The verbs are syntactically split according to transitivity:

  • +TV * transitive verb
  • +IV * intransitive verb
  • +NomAg Actor Noun From Verb - Nomen Agentis
  • +NomAct Action Noun From Verb - Nomen Actionis

Auxiliary verbs

  • +Aux *

Special multiword units are analysed with:

  • +Multi

Non-dictionary words can be recognised with:

  • +Guess

Question and Focus particles:

  • +Qst +Foc

Semantic tags

Semantic tags to help disambiguation and syntactic analysis (placed before POS). Borrowed from main/langs/sme/src/morphology/root.lexc

Simplex tags

  • +Sem/Act Activity
  • +Sem/Amount Amount
  • +Sem/Ani Animate
  • +Sem/Aniprod Animal Product
  • +Sem/Body Bodypart
  • +Sem/Body-abstr siellu, vuoig?a, jierbmi
  • +Sem/Build Building
  • +Sem/Build-part Part of Building, like the closet
  • +Sem/Cat Category
  • +Sem/Clth Clothes
  • +Sem/Clth-jewl Jewellery
  • +Sem/Clth-part part of clothes, boallu, sávdnji...
  • +Sem/Ctain Container
  • +Sem/Ctain-abstr Abstract container like bank account
  • +Sem/Ctain-clth
  • +Sem/Curr Currency like dollár, Not Money
  • +Sem/Dance Dance
  • +Sem/Dir Direction like GPS-kursa
  • +Sem/Domain Domain like politics, reindeerherding (a system of actions)
  • +Sem/Drink Drink
  • +Sem/Dummytag Dummytag
  • +Sem/Edu Educational event
  • +Sem/Event Event
  • +Sem/Feat Feature, like Árvu
  • +Sem/Feat-phys Physiological feature, ivdni, fárda
  • +Sem/Feat-psych Psychological feature
  • +Sem/Feat-measr Psychological feature
  • +Sem/Fem Female name
  • +Sem/Food Food
  • +Sem/Food-med Medicine
  • +Sem/Furn Furniture
  • +Sem/Game Game
  • +Sem/Geom Geometrical object
  • +Sem/Group Animal or Human Group
  • +Sem/Hum Human
  • +Sem/Hum-abstr Human abstract
  • +Sem/Ideol Ideology
  • +Sem/Kin Kinship term (special PxSg2 forms),
  • +Sem/Kin_Fem Kinship term (special PxSg2 forms), female
  • +Sem/Kin_Mal Kinship term (special PxSg2 forms), male
  • +Sem/Lang Language
  • +Sem/Mal Male name
  • +Sem/Mat Material for producing things
  • +Sem/Measr Measure
  • +Sem/Money Has to do with money, like wages, not Curr(ency)
  • +Sem/Obj Object
  • +Sem/Obj-clo Cloth
  • +Sem/Obj-cogn Cloth
  • +Sem/Obj-el (Electrical) machine or apparatus
  • +Sem/Obj-ling Object with something written on it
  • +Sem/Obj-rope flexible ropelike object
  • +Sem/Obj-surfc Surface object
  • +Sem/Org Organisation
  • +Sem/Part Feature, oassi, bealli
  • +Sem/Perc-cogn Cognitive perception
  • +Sem/Perc-emo Emotional perception
  • +Sem/Perc-phys Physical perception
  • +Sem/Perc-psych Physical perception
  • +Sem/Plant Plant
  • +Sem/Plant-part Plant part
  • +Sem/Plc Place
  • +Sem/Plc-abstr Abstract place
  • +Sem/Plc-elevate Place
  • +Sem/Plc-line Place
  • +Sem/Plc-water Place
  • +Sem/Pos Position (as in social position job)
  • +Sem/Process Process
  • +Sem/Prod Product
  • +Sem/Prod-audio Audio product
  • +Sem/Prod-cogn Cognition product
  • +Sem/Prod-ling Linguistic product
  • +Sem/Prod-vis Visual product
  • +Sem/Rel Relation
  • +Sem/Route Name of a Route
  • +Sem/Rule Rule or convention
  • +Sem/Semcon Semantic concept
  • +Sem/Sign Sign (e.g. numbers, punctuation)
  • +Sem/Sport Sport
  • +Sem/State
  • +Sem/State-sick Illness
  • +Sem/Substnc Substance, like Air and Water
  • +Sem/Sur Surname
  • +Sem/Fem-Sur Surname female
  • +Sem/Mal-Sur Surname male
  • +Sem/Symbol Symbol
  • +Sem/Time Time
  • +Sem/Tool Prototypical tool for repairing things
  • +Sem/Tool-catch Tool used for catching (e.g. fish)
  • +Sem/Tool-clean Tool used for cleaning
  • +Sem/Tool-it Tool used in IT
  • +Sem/Tool-measr Tool used for measuring
  • +Sem/Tool-music Music instrument
  • +Sem/Tool-write Writing tool
  • +Sem/Txt Text (girji, lávlla...)
  • +Sem/Veh Vehicle
  • +Sem/Wpn Weapon
  • +Sem/Wthr The Weather or the state of ground

Multiple Semantic tags:

  • +Sem/Act_Group
  • +Sem/Act_Plc
  • +Sem/Act_Route
  • +Sem/Amount_Build
  • +Sem/Amount_Semcon
  • +Sem/Ani_Body-abstr_Hum
  • +Sem/Ani_Build
  • +Sem/Ani_Build-part
  • +Sem/Ani_Build_Hum_Txt
  • +Sem/Ani_Group
  • +Sem/Ani_Group_Hum
  • +Sem/Ani_Hum
  • +Sem/Ani_Hum_Plc
  • +Sem/Ani_Hum_Time
  • +Sem/Ani_Plc
  • +Sem/Ani_Plc_Txt
  • +Sem/Ani_Time
  • +Sem/Ani_Veh
  • +Sem/Aniprod_Hum
  • +Sem/Aniprod_Obj-clo
  • +Sem/Aniprod_Perc-phys
  • +Sem/Aniprod_Plc
  • +Sem/Body-abstr_Prod-audio_Semcon
  • +Sem/Body_Body-abstr
  • +Sem/Body_Clth
  • +Sem/Body_Food
  • +Sem/Body_Group_Hum
  • +Sem/Body_Hum
  • +Sem/Body_Mat
  • +Sem/Body_Measr
  • +Sem/Body_Obj_Tool-catch
  • +Sem/Body_Plc
  • +Sem/Body_Time
  • +Sem/Build-part_Plc
  • +Sem/Build_Build-part
  • +Sem/Build_Clth-part
  • +Sem/Build_Edu_Org
  • +Sem/Build_Event_Org
  • +Sem/Build_Org
  • +Sem/Build_Route
  • +Sem/Clth-jewl_Curr
  • +Sem/Clth-jewl_Money
  • +Sem/Clth-jewl_Plant
  • +Sem/Clth_Hum
  • +Sem/Ctain-abstr_Org
  • +Sem/Ctain-clth_Plant
  • +Sem/Ctain-clth_Veh
  • +Sem/Ctain_Feat-phys
  • +Sem/Ctain_Furn
  • +Sem/Ctain_Tool
  • +Sem/Ctain_Tool-measr
  • +Sem/Curr_Org
  • +Sem/Dance_Org
  • +Sem/Dance_Prod-audio
  • +Sem/Domain_Food-med
  • +Sem/Domain_Prod-audio
  • +Sem/Edu_Event
  • +Sem/Edu_Group_Hum
  • +Sem/Edu_Mat
  • +Sem/Edu_Org
  • +Sem/Event_Food
  • +Sem/Event_Hum
  • +Sem/Event_Plc
  • +Sem/Event_Time
  • +Sem/Feat-phys_Tool-write
  • +Sem/Feat-phys_Veh
  • +Sem/Feat-phys_Wthr
  • +Sem/Feat-psych_Hum
  • +Sem/Feat_Plant
  • +Sem/Food_Perc-phys
  • +Sem/Food_Plant
  • +Sem/Game_Obj-play
  • +Sem/Geom_Obj
  • +Sem/Group_Hum
  • +Sem/Group_Hum_Org
  • +Sem/Group_Hum_Plc
  • +Sem/Group_Hum_Prod-vis
  • +Sem/Group_Org
  • +Sem/Group_Sign
  • +Sem/Group_Txt
  • +Sem/Hum_Lang
  • +Sem/Hum_Lang_Plc
  • +Sem/Hum_Lang_Time
  • +Sem/Hum_Obj
  • +Sem/Hum_Org
  • +Sem/Hum_Plant
  • +Sem/Hum_Plc
  • +Sem/Hum_Tool
  • +Sem/Hum_Veh
  • +Sem/Hum_Wthr
  • +Sem/Lang_Tool
  • +Sem/Mat_Plant
  • +Sem/Mat_Txt
  • +Sem/Measr_Time
  • +Sem/Money_Obj
  • +Sem/Money_Txt
  • +Sem/Obj-play
  • +Sem/Obj-play_Sport
  • +Sem/Obj_Semcon
  • +Sem/Clth-jewl_Org
  • +Sem/Org_Rule
  • +Sem/Org_Txt
  • +Sem/Org_Veh
  • +Sem/Part_Prod-cogn
  • +Sem/Perc-emo_Wthr
  • +Sem/Plant_Plant-part
  • +Sem/Plant_Tool
  • +Sem/Plant_Tool-measr
  • +Sem/Plc-abstr_Rel_State
  • +Sem/Plc-abstr_Route
  • +Sem/Plc_Pos
  • +Sem/Plc_Route
  • +Sem/Plc_Substnc
  • +Sem/Plc_Substnc_Wthr
  • +Sem/Plc_Time
  • +Sem/Plc_Tool-catch
  • +Sem/Plc_Wthr
  • +Sem/Prod-audio_Txt
  • +Sem/Prod-cogn_Txt
  • +Sem/Semcon_Txt
  • +Sem/Obj_State
  • +Sem/Substnc_Wthr
  • +Sem/Time_Wthr
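
The names of the multiple semantic tags read as simplex tag names joined by underscores. Under that assumption, a combined tag can be expanded as sketched below; the helper name `split_sem_tag` is hypothetical. Note that the simplex list also contains +Sem/Kin_Fem and +Sem/Kin_Mal, which this naive split would take apart as well.

```python
def split_sem_tag(tag):
    """Expand '+Sem/A_B_C' into ['+Sem/A', '+Sem/B', '+Sem/C']."""
    prefix = "+Sem/"
    if not tag.startswith(prefix):
        raise ValueError(f"not a semantic tag: {tag!r}")
    return [prefix + part for part in tag[len(prefix):].split("_")]


print(split_sem_tag("+Sem/Ani_Build_Hum_Txt"))
```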

Semantics are classified with

  • +Sem/Divinity Divinity (god personified),
  • +Sem/Constellation Constellation,
  • +Sem/Ant Anthroponym
  • +Sem/Fem Anthroponym female
  • +Sem/Mal Anthroponym male
  • +Sem/Patr Patronym
  • +Sem/Fem-Patr Patronym female
  • +Sem/Mal-Patr Patronym male
  • +Sem/Rvr name of river or water way, media of transportation,
  • +Sem/Mnth name of month
  • +Sem/Inanim Inanimate,

Semantic Fields

  • +Field/Agr agricultural
  • +Field/Anat anatomical
  • +Field/Bio biological
  • +Field/Bot botanical
  • +Field/Chem chemical
  • +Field/Geol geological
  • +Field/Gram grammatical
  • +Field/Hist historical
  • +Field/Law law
  • +Field/Mar maritime
  • +Field/Math mathematical
  • +Field/Med medical
  • +Field/Mus musical
  • +Field/Relig church
  • +Field/Tech technical
  • +Field/Zool zoological

Other tags

Verbal arguments

  • +Subj/Zero This is used to mark verbs without a semantic subject

Derivations are classified under the morphophonetic form of the suffix, the source and target part-of-speech.

  • +V→N +V→V +V→A

Homonymy

Der begin

  • +Der In front of every derivation to make it possible to target derivations as a class e.g. in regular expressions etc
  • +Der/VtOmO
  • +Der/stO Deriving adverbs from adjectives A2Adv
  • +Der/ms эрзямс эрзя, истямс истя, вадрямс вадря
  • +Der/shka
  • +Der/GenAttr +Der/Onj genitive attribute derivation of non-nouns
  • +Der/aj vocative
  • +Der/kaj vocative
  • +Der/Ovt * telic deverbal noun also attr
  • +Der/Oms * infinitive illative
  • +Der/OmO * infinitive locative/nominative
  • +Der/OmstO * infinitive elative
  • +Der/OmsO * infinitive inessive
  • +Der/OmdO * infinitive ablative
  • +Der/Omga * infinitive prolative
  • +Der/Oma * modality: deontic/directive/obligative андомс: андома , якамс: якама
  • +Der/Omka * modality: deontic/directive/obligative андомс: андомка , якамс: якамка
  • +Der/Ycja * active (demonstrative) present participle
  • +Der/Y * active short present participle
  • +Der/Yj * active long present participle
  • +Der/Ozj * Gerund
  • +Der/Cond * conditional derivation +Der/Ynderja
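
Since +Der is meant to make derivations targetable as a class in regular expressions, a small sketch of such a pattern may help. The pattern, the function name `derivations` and the sample analysis string are assumptions for illustration; the tags themselves are taken from the list above.

```python
import re

# Matches the bare +Der tag as well as any +Der/xxx tag, up to the next '+'.
DER_TAG = re.compile(r"\+Der(?:/[^+]+)?(?=\+|$)")


def derivations(analysis):
    """Return all derivation tags found in an analysis string."""
    return DER_TAG.findall(analysis)


# Hypothetical analysis string.
print(derivations("андомс+V+Der+Der/Oma"))
```

The lookahead keeps the pattern from matching other tags that merely begin with the same letters, such as +Descr or +Des.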

Declaring noun derivations

  • +Der/pelj

Modifier without noun

  • +Der/MWN Modifier without Noun
  • +Der/Dem Speaker-Oriented Demonstrative

Conjugation of words other than finite verbs

  • +Der/Pr derivation to predicate head, e.g. nominal conjugation
  • +Der/Cop This is not a derivation
  • +Clt/Cop This will replace the nominal conjugation Der/Pr+V
  • +Clt/Cond

Declaring Indefinite Pronoun derivations

  • +Der/koj prefix +Indef in indefinite pronouns
  • +Der/ta prefix +Indef in indefinite pronouns
  • +Der/tago prefix +Indef in indefinite pronouns
  • +Der/Gak suffix +Indef in indefinite pronouns
  • +Der/buti suffix +Indef in indefinite pronouns
  • +Der/Yja suffix +Indef in indefinite pronouns ковия, зярыя

DECLARING NOUN DERIVATIONS

  • +Der/chi adjective-to-noun

the combinatory --Event-- preceding the NP-final noun

  • +Der/OmA verb-to-noun

DECLARING NUMERAL DERIVATIONS

  • +Der/cje +A+Ord
  • +Der/tjks +A+Ord (non-contrastive)

DECLARING DEVERBAL DERIVATIONS OF VERBS

  • +Der/kshnO verb2verb derivation
  • +Der/OkshnOms verb2verb derivation
  • +Der/OvOms verb2verb derivation
  • +Der/OvkshnOms verb2verb derivation
  • +Der/OvtOms verb2verb derivation
  • +Der/Ovtnjems verb2verb derivation
  • +Der/Ozevems verb2verb derivation
  • +Der/Ozevtems verb2verb derivation
  • +Der/Ozevtnjems verb2verb derivation
  • +Der/Ozevkshnems verb2verb derivation
  • +Der/sje this in verb2verb derivation and also in denominal demonstrative --Der/Dem--
  • +Der/nje verb2verb derivation
  • +Der/njems verb2verb derivation
  • +Der/Oncje old orth кудонцесь
  • +Der/Dimin
  • +Der/ka diminutive
  • +Der/NJE This is used in ошке, калнэ and кудыне
  • +Der/nJE diminutive
  • +Der/Ynje diminutive
  • +Der/Ynjka diminutive
  • +Der/Ynjkinje diminutive
  • +Der/ke diminutive in --ке--
  • +Der/kinje diminutive
  • +Der/ks Adv›N
  • +OLang/SME - North Sámi
  • +OLang/SMJ - Lule Sámi
  • +OLang/SMA - South Sámi
  • +OLang/FIN - Finnish
  • +OLang/SWE - Swedish
  • +OLang/NOB - Norw. bokmål
  • +OLang/NNO - Norw. nynorsk
  • +OLang/ENG - English
  • +OLang/MYV - Erzya
  • +OLang/MDF - Moksha
  • +OLang/RUS - Russian
  • +OLang/TAT - Tatar
  • +OLang/UND - Undefined
  • +F - Foreign

Morphophonology

To represent phonologic variations in word forms we use the following symbols in the lexicon files:

And the following triggers to control variation:

  • %{frontHard%} — front harmony hard
  • %{frontSoft%} — front harmony soft
  • %{back%} — back harmony
  • %{backHard%} — back harmony
  • %{dialM%} — for Shoksha and Drakino Dial/M morphology
  • Е3 testing тне тнэ

Special letters in the root that might be useful in dialect research and etymology later

  • Ь3 арсемс: арсе arśems vs арсемс: арЬ3се aŕśems
  • Ӓ3 эрямс: Ӓ3ря
  • Ӓ4 пелемс: пӒ4ль
  • %^Ь2ZERO removes stem-final soft sign
  • %{ое%} inflectional suffix protovowel аволь аволинь
  • %{оеэØ%} Suffix-initial archiphoneme
  • %{уиыØ%} Suffix-initial archiphoneme in dialect

вт%{оеэ%}мО1 suffix-internal archivowel

  • %{оэØ%} inessive, elative; this is the hard/broad s
  • %{ОØ%} Stem-final archiphoneme панго
  • %{ЕØ%} Stem-final archiphoneme тинге
  • %^NoLinkVow — No linking vowel is used only after consonants for error
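
The archiphoneme symbols above appear to spell out their possible realisations between %{ and %}, with Ø standing for zero. Assuming that reading is right, the alternants can be listed as in the sketch below. The function name `archiphoneme_alternants` is hypothetical, and the sketch is only meaningful for archiphoneme symbols, not for triggers such as %{back%}.

```python
def archiphoneme_alternants(symbol):
    """List the alternants named inside a %{...%} symbol; 'Ø' becomes ''."""
    if not (symbol.startswith("%{") and symbol.endswith("%}")):
        raise ValueError(f"not a %{{...%}} symbol: {symbol!r}")
    inner = symbol[2:-2]
    # Assumption: every character inside the braces is one possible realisation.
    return ["" if char == "Ø" else char for char in inner]


print(archiphoneme_alternants("%{оеэØ%}"))
```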

MISC

  • +Cmp/Hyph A tag to indicate that a hyphen was used when compounding

Development tag

  • +WORK
  • +NoVowX
  • ZERO
  • %0
  • %-
  • +Dig1
  • +Dig2
  • +Dig3
  • +Dig4
  • +Rom Roman numerals

Compounding

  • +Cmp Dynamic compound - this tag should always be part of a dynamic compound. It is important for Apertium, and useful in other cases as well.
  • +Cmp/Hyph-Coll with nouns
  • +Cmp/Hyph-Redup with verbs
  • +Cmp/Hyph-Synonym with verbs
  • +Cmp/Hyph-Serial with verbs
  • +Cmp/Hyph-tejems with verbs

Tags

  • +Emphatic
  • +Gr2xxx
  • +Pref Prefix
  • +Exclusive "ансяк" only
  • +Intensive
  • +Intensifier уш
  • +Onom onomatopoeic words
  • +Descr descriptive words

Different focus particles

Focus clitics

  • +Clt/Add Only one additive clitic
  • +Clt/AddGak

Imperative clitics

  • +Clt/Ga редяка Precative +Prec
  • +Clt/Gaja редякая
  • +Clt/Gajatj редякаять
  • +Clt/Gajatja редякаятя
  • +Clt/Gatja редякатя
  • +Clt/Gaka редякака ARE these real?
  • +Clt/Gakaja редякакая ARE these real?
  • +Pred2 secondary predicate. Examples: "Joe came in with his hat on." "Joe came in Joe had his hat on."

Tags distinguishing different versions of the same lemma (before POS)

  • +v1
  • +v2
  • +v3
  • +v4
  • +v5
  • +v6
  • +v7
  • +v8
  • +v9
  • +v10
  • +v11
  • +v12
  • +v13
  • +v14
  • +v15
  • +v16
  • +v17
  • +v18
  • +v19
  • +v20
  • +v21
  • +v22
  • +v23
  • +v24

FUNCTIONS are all upper-case!!!!

  • +ACC +DAT +COM This marks a function not a morpheme
  • +NoPoss used with personal pronouns in oblique cases, where a possessor index is expected

Symbols that need to be escaped on the lower side (towards twolc):

  • »
  • «
  • > (written with square brackets, see the root.lexc file)
  • < (written with square brackets, see the root.lexc file)

Flag diacritics

We have manually optimised the structure of our lexicon using the following flag diacritics to restrict morphological combinatorics - only allow compounds with verbs if the verb is further derived into a noun again:

@P.NeedNoun.ON@ (Dis)allow compounds with verbs unless nominalised
@D.NeedNoun.ON@ (Dis)allow compounds with verbs unless nominalised
@C.NeedNoun@ (Dis)allow compounds with verbs unless nominalised
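
To make the effect of these flags more concrete, here is a minimal sketch of how a sequence of flag diacritics could be evaluated along one path. It follows the usual reading of the P, N, R, D, C and U operators in xfst-style tools, as understood here. The function name `path_allowed` and the flag sequences in the usage lines are assumptions, not taken from the lexicon.

```python
import re

FLAG = re.compile(r"@([PNRDCU])\.([^.@]+)(?:\.([^.@]+))?@")


def _holds(current, value):
    """True if the stored (value, positive) pair is compatible with value."""
    stored, positive = current
    return (stored == value) if positive else (stored != value)


def path_allowed(flags):
    """Return True if the flag diacritics can be traversed in the given order."""
    state = {}  # feature -> (value, is_positive)
    for flag in flags:
        match = FLAG.fullmatch(flag)
        if match is None:
            raise ValueError(f"not a flag diacritic: {flag!r}")
        op, feature, value = match.groups()
        current = state.get(feature)
        if op == "P":      # positive set
            state[feature] = (value, True)
        elif op == "N":    # negative set
            state[feature] = (value, False)
        elif op == "C":    # clear
            state.pop(feature, None)
        elif op == "R":    # require
            if current is None or (value is not None and not _holds(current, value)):
                return False
        elif op == "D":    # disallow
            if current is not None and (value is None or _holds(current, value)):
                return False
        elif op == "U":    # unify
            if current is not None and not _holds(current, value):
                return False
            state[feature] = (value, True)
    return True


# Hypothetical paths: a set NeedNoun flag blocks the path until it is cleared.
print(path_allowed(["@P.NeedNoun.ON@", "@D.NeedNoun.ON@"]))
print(path_allowed(["@P.NeedNoun.ON@", "@C.NeedNoun@", "@D.NeedNoun.ON@"]))
```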

For languages that allow compounding, the following flag diacritics are needed to control position-based compounding restrictions for nominals. Their use is handled automatically if combined with +CmpN/xxx tags. If not used, they will do no harm.

@P.CmpFrst.FALSE@ Require that words tagged as such only appear first
@D.CmpPref.TRUE@ Block such words from entering ENDLEX
@P.CmpPref.FALSE@ Block these words from making further compounds
@D.CmpLast.TRUE@ Block such words from entering R
@D.CmpNone.TRUE@ Combines with the next tag to prohibit compounding
@U.CmpNone.FALSE@ Combines with the prev tag to prohibit compounding
@P.CmpOnly.TRUE@ Sets a flag to indicate that the word has passed R
@D.CmpOnly.FALSE@ Disallow words coming directly from root.

Use the following flag diacritics to control downcasing of derived proper nouns (e.g. Finnish Pariisi -> pariisilainen). See e.g. North Sámi for how to use these flags. There exists a ready-made regex that will do the actual down-casing given the proper use of these flags.

@U.Cap.Obl@ Allowing downcasing of derived names: deatnulasj.
@U.Cap.Opt@ Allowing downcasing of derived names: deatnulasj.

Flags used to identify parts of speech

  • @P.POS.PRON@
  • @P.POS.N@
  • @R.POS.N@
  • @P.POS.V@
  • @R.POS.V@
  • @C.POS@

Flags used with +Clt/Cop nonverbal predication

  • @U.PRED.NO@
  • @U.PRED.YES@
  • @C.PRED@

Flags used with transitivity

  • @U.TRANS.TV@
  • @U.TRANS.IV@
  • @P.TRANS.TV@
  • @P.TRANS.IV@

Flags used with serial verbs

  • @U.CONJ-INF.YES@
  • @U.CONJ-INF.NO@
  • @U.CONJ-TX.NONPAST@
  • @U.CONJ-TX.PRT1@
  • @U.CONJ-TX.PRT2@
  • @U.CONJ-MX.IND@
  • @D.CONJ-MX.IND@ 2012-11-04 should this be --D-- or --N--
  • @U.CONJ-MX.IMP@
  • @U.CONJ-MX.OPT@
  • @U.CONJ-MX.PREC@
  • @U.CONJ-MX.DES@
  • @U.CONJ-MX.CONJ@
  • @U.CONJ-MX.COND@
  • @U.CONJ-CONNEG.YES@
  • @U.CONJ-CONNEG.NO@
  • @U.CONJ-NX.PL@
  • @U.CONJ-NX.SG@
  • @U.CONJ-POSS.1@
  • @U.CONJ-POSS.2@
  • @U.CONJ-POSS.3@
  • @U.CONJ-POSS.2ACC@
  • @U.CONJ-POSS.3ACC@
  • @U.CONJ-PX.10@
  • @U.CONJ-PX.12@
  • @U.CONJ-PX.13@
  • @U.CONJ-PX.15@
  • @U.CONJ-PX.16@
  • @U.CONJ-PX.20@
  • @U.CONJ-PX.21@
  • @U.CONJ-PX.23@
  • @U.CONJ-PX.24@
  • @U.CONJ-PX.26@
  • @U.CONJ-PX.30@
  • @U.CONJ-PX.31@
  • @U.CONJ-PX.32@
  • @U.CONJ-PX.33@
  • @U.CONJ-PX.34@
  • @U.CONJ-PX.35@
  • @U.CONJ-PX.36@
  • @U.CONJ-PX.40@
  • @U.CONJ-PX.42@
  • @U.CONJ-PX.43@
  • @U.CONJ-PX.45@
  • @U.CONJ-PX.46@
  • @U.CONJ-PX.50@
  • @U.CONJ-PX.51@
  • @U.CONJ-PX.53@
  • @U.CONJ-PX.54@
  • @U.CONJ-PX.56@
  • @U.CONJ-PX.60@
  • @U.CONJ-PX.61@
  • @U.CONJ-PX.62@
  • @U.CONJ-PX.63@
  • @U.CONJ-PX.64@
  • @U.CONJ-PX.65@
  • @U.CONJ-PX.66@
  • @R.CONJ-PX.13@
  • @R.CONJ-PX.16@
  • @R.CONJ-PX.23@
  • @R.CONJ-PX.26@
  • @R.CONJ-PX.33@
  • @R.CONJ-PX.36@
  • @R.CONJ-PX.43@
  • @R.CONJ-PX.46@
  • @R.CONJ-PX.53@
  • @R.CONJ-PX.56@
  • @R.CONJ-PX.63@
  • @R.CONJ-PX.66@
  • @P.CONJ.ObjAll@
  • @R.CONJ.ObjAll@
  • @C.CONJ@
  • @P.TLOSS.ON@
  • @R.TLOSS.ON@
  • @P.PossPx.Sg1@
  • @P.PossPx.Sg2@
  • @P.PossPx.Sg3@
  • @P.PossPx.Pl1@
  • @P.PossPx.Pl2@
  • @P.PossPx.Pl3@
  • @U.PossPx.S3@
  • @U.PossPx.SP3@
  • @U.PossPx.Sg1@
  • @U.PossPx.Sg2@
  • @U.PossPx.Sg3@
  • @U.PossPx.Pl1@
  • @U.PossPx.Pl2@
  • @U.PossPx.Pl3@
  • @D.PossPx@
  • @C.PossPx@
  • @P.TNUM.SG@
  • @P.TNUM.PL@
  • @D.TNUM.SG@
  • @D.TNUM.PL@
  • @C.TNUM@

problematic

  • @P.TPERS.1@
  • @P.TPERS.2@
  • @P.TPERS.3@
  • @N.TPERS.1@
  • @N.TPERS.2@
  • @N.TPERS.3@
  • @U.TPERS.1@
  • @U.TPERS.2@
  • @U.TPERS.3@
  • @C.TPERS@
  • @U.CX.ABE@
  • @U.CX.ABL@
  • @U.CX.CMP@
  • @U.CX.COM@
  • @U.CX.DAT@
  • @U.CX.ELA@
  • @U.CX.GEN@
  • @U.CX.ILL@
  • @U.CX.INE@
  • @U.CX.LAT@
  • @U.CX.LOC@
  • @U.CX.NOM@
  • @U.CX.PRL@
  • @U.CX.TRA@
  • @U.CX.TEMP@
  • @N.CX.ILL@
  • @N.CX.INE@
  • @N.CX.LAT@
  • @N.CX.ELA@
  • @C.CX@
  • @P.DNUM.PL@
  • @P.DNUM.SG@
  • @U.DNUM.PL@
  • @U.DNUM.SG@
  • @C.DNUM@
  • @P.NUM.SG@
  • @P.NUM.PL@
  • @D.NUM.SG@
  • @D.NUM.PL@
  • @C.NUM@
  • @U.INDEF.KOI@
  • @U.INDEF.TA@
  • @U.INDEF.TAGO@
  • @U.INDEF.BUTI@
  • @U.INDEF.GAK@
  • @C.INDEF-PRON@
  • @P.INDEF.PREF@
  • @D.INDEF.PREF@
  • @R.INDEF.PREF@
  • @C.INDEF@

This allows or disallows combining with hyphen through loop especially for acronyms 2012-11-04

  • @U.HYPH-COMBO.ACRO@
  • @D.HYPH-COMBO.ACRO@
  • @C.HYPH-COMBO@

Linking vowel for use with Translative

  • @P.LV.ON@
  • @P.LV.OFF@
  • @R.LV.ON@
  • @U.LV.ON@
  • @D.LV.ON@
  • @C.LV@
  • @C.CONJ-INF@
  • @C.CONJ-TX@
  • @C.CONJ-MX@
  • @C.CONJ-CONNEG@
  • @C.CONJ-NX@
  • @C.CONJ-PX@
  • @C.CONJ-POSS@
  • @C.KLOSS@
  • @C.TLOSS@

FLAGS USED WITH COLLECTIVE NOUNS

number

  • @U.DECL-NX.SG@
  • @U.DECL-NX.SP@
  • @U.DECL-NX.PL@
  • @R.DECL-NX.SG@
  • @R.DECL-NX.SP@
  • @R.DECL-NX.PL@

case

  • @U.DECL-CX.NOM@
  • @U.DECL-CX.ACC@
  • @U.DECL-CX.GEN@
  • @U.DECL-CX.DAT@
  • @U.DECL-CX.ABL@
  • @U.DECL-CX.ILL@
  • @U.DECL-CX.INE@
  • @U.DECL-CX.ELA@
  • @U.DECL-CX.LAT@
  • @U.DECL-CX.LOC@
  • @U.DECL-CX.TRA@
  • @U.DECL-CX.PRL@
  • @U.DECL-CX.COM@
  • @U.DECL-CX.TEMP@
  • @U.DECL-CX.ABE@
  • @U.DECL-CX.CMP@
  • @U.DECL-DX.DEF@
  • @U.DECL-DX.INDEF@
  • @U.DECL-DX.PX@

Removal

  • @C.DECL-NX@
  • @C.DECL-DX@
  • @C.DECL-CX@

The word forms in ERZYA start from the lexeme roots of basic word classes, or optionally from prefixes. Here follow all continuation lexica (contlexes), approximately 20:

  • Hyphenated-nouns ; entire serial nouns
  • Hyphenated-verbs ; entire serial verbs

CyrillicFemaleName ; HUNSPELL Type name derivation RussianSurnamesDerive ;

Not a real particle; it can take a clitic седеяк

увол-авол

alo-SPAT-1Arg ; >PO_KAL-LOC