sjd
Free and Open source Kildin Sami analyser giella-sjd
- Authors
- Divvun and Giellatekno teams, community members
- Software version
- 2012
- Documentation license
- GNU GFDL
- SVN Revision
- $Revision
: 68217 $ - SVN Date
- $Date
: 2013-01-16 11: 31: 33 +0200 (Wed, 16 Jan 2013) $
giella-sjd
This is free and open source Kildin Sami morphology.
Kildin Sámi morphological analyser !
Definitions for Multichar_Symbols
- +Symbol = independent symbols in the text stream, like £, €, ©
- +Use/Elid = Elided substandard (Иванович~Иваныч, новее~новей, чтобы~чтоб, или~иль, коли~коль)
- +Use/Orth = Orthographic substandard
Flag diacritics
@P.NeedNoun.ON@ | (Dis)allow compounds with verbs unless nominalised |
@D.NeedNoun.ON@ | (Dis)allow compounds with verbs unless nominalised |
@C.NeedNoun@ | (Dis)allow compounds with verbs unless nominalised |
For languages that allow compounding, the following flag diacritics are needed
@P.CmpFrst.FALSE@ | Require that words tagged as such only appear first |
@D.CmpPref.TRUE@ | Block such words from entering ENDLEX |
@P.CmpPref.FALSE@ | Block these words from making further compounds |
@D.CmpLast.TRUE@ | Block such words from entering R |
@D.CmpNone.TRUE@ | Combines with the next tag to prohibit compounding |
@U.CmpNone.FALSE@ | Combines with the prev tag to prohibit compounding |
@P.CmpOnly.TRUE@ | Sets a flag to indicate that the word has passed R |
@D.CmpOnly.FALSE@ | Disallow words coming directly from root. |
Use the following flag diacritics to control downcasing of derived proper
@U.Cap.Obl@ | Allowing downcasing of derived names: deatnulasj. |
@U.Cap.Opt@ | Allowing downcasing of derived names: deatnulasj. |
Kildin Saami noun inflection
"class1" = monosyllabic/non-palatalized inflection foot, str-wk, no ablaut/umlaut, palatalization in Ill.Sg
LEXICON 1DB is orthographic subclass with xy-->x and "half-palatalization" in Ill.Sg;
Lexica 2D etc are classes from Kuruch dictionary.
LEXICON 2D like 1D ? these are just dummy for the moment
LEXICON 3D
LEXICON 4D is contract
LEXICON 4DА
LEXICON 5D goes to 1DA
LEXICON OAHPA_N is all the words from OAHPA_N, we just point them to 1DA.
LEXICON ELAN_N is all the words from ELAN_N, we just point them to 1DA.
Case affixes
LEXICON SgNomSuf
LEXICON SgGenAccSuf
LEXICON EssSuf
LEXICON ParSufE
LEXICON ParSufHalfE
LEXICON SgLocSuf
LEXICON SgIllSufA
LEXICON SgIllSufE
LEXICON SgIllSufJE
LEXICON SgIllSufHalfE
LEXICON SgAbeSuf
LEXICON SgComSuf
LEXICON PlNomSuf
LEXICON PlGenSuf
LEXICON PlAccSuf
LEXICON PlIllSuf
LEXICON PlLocSuf
LEXICON PlAbeSuf
LEXICON PlComSuf
- LEXICON Adjective
Just dumping the oahpa adjectives here
Kildin Saami nouns
- LEXICON Noun
class1
class1 orth. дт/ill. ӭ
2. declension (class 2) ! like 1. decl., but with C-clusters
- Ablaut
- куэсськ 2D "тётя / aunt" ;
3. declension (class 7) no stem gradation
4. declension (classes 3, 5) !these are two different declesions:the "puaz-class" and the "cyza-class"
5. declension (classes 4, 6, 9)
6. declension (classes )
ELAN dump
Oahpa dump
Kildin Saami verbs
- LEXICON Verb
- LEXICON NegAux
- LEXICON Verbstems
ELAN dump
Just dumping Oahpa verbs in the rest of the file
Kildin Sámi TWOLC file
Symbol declarations
Alphabet
- а А ӓ Ӓ а̄ А̄ б Б в В г Г д Д
- е Е ё Ё е̄ Е̄ ё̄ Ё̄ ж Ж з З һ Һ
- и И ӣ Ӣ й Й ҋ Ҋ ј Ј к К л Л
- ӆ Ӆ м М ӎ Ӎ н Н ӊ Ӊ ӈ Ӈ о О
- о̄ О̄ п П р Р ҏ Ҏ с С т Т у У
- ӯ Ӯ ф Ф х Х ц Ц ч Ч ш Ш щ Щ
- ъ Ъ ы Ы ь Ь ъ Ъ ҍ Ҍ э Э ӭ Ӭ
- э̄ Э̄ ю Ю ю̄ Ю̄ я Я я̄ Я̄
Trigger symbols
- %^1VOW: 0 Vow trigger for epenthetic -э-
- %^2VOW: 0 Vow trigger
- %^3VOW: 0 Vow trigger
- %^4VOW: 0 Vow trigger
- %^5VOW: 0 Vow trigger
- %^VOWSH: 0
- %^WG: 0 for weak consonant stem gradation
- %^EPHV: 0
- %^DI: 0
- %^CLPAL: 0
- %^CLDPAL: 0 for depalatalizing consonant stems
- %^DPAL: 0 for depalatalizing consonant stems
- %^MON: 0
- %^PAL: 0 for palatalizing consonsant stems
Sets
- C = б Б в В г Г д Д ж Ж з З һ Һ й Й ҋ Ҋ ј Ј к К л Л ӆ Ӆ м М ӎ Ӎ н Н ӊ Ӊ ӈ Ӈ п П р Р ҏ Ҏ с С т Т ф Ф х Х ц Ц ч Ч ш Ш щ Щ ;
- V = а А ӓ Ӓ а̄ А̄ е Е е̄ Е̄ ё Ё ё̄ Ё̄ и И ӣ Ӣ о О о̄ О̄ у У ӯ Ӯ ы Ы э Э ӭ Ӭ э̄ Э̄ ю Ю ю̄ Ю̄ я Я я̄ Я̄ ;
- PALMARK = ь Ь ҍ Ҍ ;
- VPLOS = г Г ;
- VLPLOS = к К ;
- PALCON = ж Ж й Й ҋ Ҋ ј Ј ч Ч ; can only by followed by PALVOW !so-called "palatalized C"
- PALVOW = е Е е̄ Е̄ ё Ё ё̄ Ё̄ и И ӣ Ӣ ю Ю ю̄ Ю̄ я Я я̄ Я̄ ; so-called "palatalized V"
- HPLVOW = ӓ Ӓ ӭ Ӭ ; so-called "half-palatalized V"
- NPLVOW = а А а̄ А̄ о О о̄ О̄ у У ӯ Ӯ ы Ы э Э э̄ Э̄ ; so-called "non-palatalized V"
- Dummy = %^WG %^DI %^DPAL %^MON %> %^PAL %^CLPAL ;
Rule section
Quantitative CG xxy:xy (of nonpal cлusters)e.g. са̄ррн: са̄рн, ко̄ллм: ко̄лм, сыййт: сыйт, etc.
Qualitative CG xx:y (of voiced affricates) i.e. джь : жь, дзь : зь, дз : з
Quantitative CG (of continuants)
ue>ua Ablaut
Epenthetic vowel deletion (npal vowels)e.g. ко̄рэб: ко̄рбэсьт, etc. шоабаш-:шоабш-