Free and Open source Inupiaq analyser giella-ipk

Divvun and Giellatekno teams, community members
Software version
Documentation license
SVN Revision
$Revision:68217 $
SVN Date
$Date:2013-01-16 11:31:33 +0200 (Wed, 16 Jan 2013) $


This is free and open source Inupiaq morphology.

  • +N +V +Part +Prop +Pron POS
  • +Sg +Du +Pl Number
  • +1Sg +2Sg +3Sg +4Sg Intransitive number Sg
  • +1Du +2Du +3Du +4Du Intransitive number Du
  • +1Pl +2Pl +3Pl +4Pl Intransitive number Pl
  • +1SgO +2SgO +3SgO +4SgO Objective conjugation
  • +1DuO +2DuO +3DuO +4DuO Objective conjugation
  • +1PlO +2PlO +3PlO +4PlO Objective conjugation
  • +Symbol = independent symbols in the text stream, like £, €, ©

4th person still missing in the transitive conjugation

  • +Abs +Rel +Trm +Loc +Abl +Mod Cases
  • +Prs +Prt Tenses
  • +Ind +Int +Cau +ConReal +ConUnreal Modes NB! No Imp
  • +Arch tags for archaic forms. In this pilot just used to indicate twin forms

Symbols that need to be escaped on the lower side (towards twolc):

Literal »
Literal «
  %[%>%]  - Literal >
  %[%<%]  - Literal <
  • +VIK nominalizers
  • +LU +GUUQ +UNA clitics
  • %^TRUNC truncation dummy
  • %^CVCTRUNC dummy for very long truncations
  • %^VCTRUNC dummy for long truncation
  • %^FRIC dummy for fricativizing stem-final consonants. Needed to avoid a general rule that also would affect unwantedly as in *aaġagu for aaqagu. The alternative would have been to postulate truncating flexives with a fricative first consonant (*aiviq -q +ġit) but that is hokus pokus
  • %^EBLOCK dummy to block schwa going to a (aŋutik not *aŋuttak)
  • %^C dummy for intermediate gemination
  • %^DEFRIC dummy when fricatives go stops (amaġuq -> amaqquk) as apposed to %C in niġi+VIK -> niġġivik
  • %^SCHWADEL !dummy with derivatives truncating semi-final schwa

Flag diacritics

  • @P.IV.ON@ Flag - sets value for transitivity to IV
  • @P.TV.ON@ Flag - sets value for transitivity to TV
  • @R.IV.ON@ Flag - reset value for transitivity to IV
  • @R.TV.ON@ Flag - reset value for transitivity to TV
  • @D.IV.ON@ Flag - delete if unsaturated IV flag (=Verb was not IV)
  • @D.TV.ON@ Flag - delete if unsaturated TV flag (=Verb was not TV)

We have manually optimised the structure of our lexicon using following flag diacritics to restrict morhpological combinatorics - only allow compounds with verbs if the verb is further derived into a noun again:

@P.NeedNoun.ON@ (Dis)allow compounds with verbs unless nominalised
@D.NeedNoun.ON@ (Dis)allow compounds with verbs unless nominalised
@C.NeedNoun@ (Dis)allow compounds with verbs unless nominalised

For languages that allow compounding, the following flag diacritics are needed to control position-based compounding restrictions for nominals. Their use is handled automatically if combined with +CmpN/xxx tags. If not used, they will do no harm.

@P.CmpFrst.FALSE@ Require that words tagged as such only appear first
@D.CmpPref.TRUE@ Block such words from entering ENDLEX
@P.CmpPref.FALSE@ Block these words from making further compounds
@D.CmpLast.TRUE@ Block such words from entering R
@D.CmpNone.TRUE@ Combines with the next tag to prohibit compounding
@U.CmpNone.FALSE@ Combines with the prev tag to prohibit compounding
@P.CmpOnly.TRUE@ Sets a flag to indicate that the word has passed R
@D.CmpOnly.FALSE@ Disallow words coming directly from root.

Use the following flag diacritics to control downcasing of derived proper nouns (e.g. Finnish Pariisi -> pariisilainen). See e.g. North Sámi for how to use these flags. There exists a ready-made regex that will do the actual down-casing given the proper use of these flags.

@U.Cap.Obl@ Allowing downcasing of derived names: deatnulasj.
@U.Cap.Opt@ Allowing downcasing of derived names: deatnulasj.

This file gives the start of the Iñupiaq lexicon. The lexicon Root points at the different parts of speech. Each POS has its own file stems/nouns.lexc, etc., which in turn points to affixes/nouns.lexc, etc. POS-changing nominalizers are found in affixes/verbs.lexc and verbalizers in affixes/nouns.lexc It might be a good idea to have noun-ipk-der.txt etc. as well. The common, final lexica, are found in clitics.lexc.


  • Nouns ;
  • Verbs ;
  • Determiners ;
  • Adverbs ;
  • prop ;
  • pron ;
  • part ;
  • Punctuation ;
  • Symbols ;