rus

Free and open-source Russian analyser giella-rus

Authors: Divvun and Giellatekno teams, community members
Software version: 2012
Documentation license: GNU GFDL
SVN Revision: $Revision:68217 $
SVN Date: $Date:2013-01-16 11:31:33 +0200 (Wed, 16 Jan 2013) $

giella-rus

This is a free and open-source Russian morphology.

Russian tags

Stressed vowels

  • а́ е́ ё́ и́ о́ у́ ы́ э́ ю́ я́ Primary stress (lower)
  • а̀ ѐ ё̀ ѝ о̀ у̀ ы̀ э̀ ю̀ я̀ Secondary stress (lower)
  • А́ Е́ Ё́ И́ О́ У́ Ы́ Э́ Ю́ Я́ Primary stress (upper)
  • А̀ Ѐ Ё̀ Ѝ О̀ У̀ Ы̀ Э̀ Ю̀ Я̀ Secondary stress (upper)
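
The stressed vowels above can be read as a plain vowel followed by a combining accent. The sketch below removes those accents from a string, for example before looking a word up in an unstressed dictionary. It assumes that primary stress is encoded with the combining acute (U+0301) and secondary stress with the combining grave (U+0300); the function name `strip_stress` is hypothetical and not part of giella-rus.

```python
import unicodedata

# Assumption: stress is written with these two combining marks.
ACUTE = "\u0301"  # primary stress
GRAVE = "\u0300"  # secondary stress


def strip_stress(text: str) -> str:
    """Remove primary and secondary stress marks, keeping ё and й intact."""
    # Decompose first so that precomposed letters such as ѐ and ѝ
    # expose their grave accent as a separate code point.
    decomposed = unicodedata.normalize("NFD", text)
    unstressed = decomposed.replace(ACUTE, "").replace(GRAVE, "")
    # Recompose so that е + diaeresis becomes ё again and и + breve becomes й.
    return unicodedata.normalize("NFC", unstressed)
```

Normalising to NFD and back to NFC, rather than deleting the marks directly, is a design choice: it handles both precomposed and decomposed input without touching the diaeresis of ё or the breve of й.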

Symbols that need to be escaped on the lower side (towards twolc); copied from sme:

Markers

  • ¹ ² ³ ⁴ ⁵ ⁶ ⁷ ⁸ ⁹ ⁰ = Used to enumerate homonymous lemmas
  • %> = End-of-stem marker (nominals)
  • %< = End-of-stem marker (verbs)
  • %^F = Fleeting vowel marker
  • %^o %^O = Verbal prefix fleeting vowel
  • %^G = Irregular GenPl marker (to keep ов/ев on n stems, e.g. ов%^G)
  • %^Z = Zero ending (resolves to 0/й/ь)
  • %^M = Verb stem mutation
  • %^D = archiphoneme for д~жд alternation in past passive participles
  • %^T = archiphoneme for т~щ alternation in verbs
  • %^d = archiphoneme for verb stems with -дший past active participles (-сти 7 (-д-) )
  • %^t = archiphoneme for verb stems with -тший past active participles (-сти 7 (-т-) )
  • %^R = archiphoneme for бороть and пороть
  • %^U = Imperative ending (unstressed)
  • %^S = Imperative ending (stressed)
  • %^P = Attenuative comparative prefix: по~
  • %^A = Attenuative comparative prefix: по~
  • %^Y = Verbal prefix вы́-
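
As an illustration of the homonym markers listed above, the following sketch separates a superscript index from a lemma. It assumes that the superscript digits are appended to the end of the lemma, which the list does not state explicitly; the function name `split_homonym_index` and the sample lemmas are hypothetical.

```python
# Superscript digits used to enumerate homonymous lemmas.
SUPERSCRIPTS = "⁰¹²³⁴⁵⁶⁷⁸⁹"
# Translation table from superscript digits to ASCII digits.
TO_ASCII = str.maketrans(SUPERSCRIPTS, "0123456789")


def split_homonym_index(lemma: str):
    """Split a lemma such as 'замок²' into ('замок', 2).

    Assumption: the index, if any, is a run of superscript digits
    at the end of the lemma. Returns None as index when absent.
    """
    base = lemma.rstrip(SUPERSCRIPTS)
    suffix = lemma[len(base):]
    if not suffix:
        return lemma, None
    # int() does not accept superscript digits, so map them to ASCII first.
    return base, int(suffix.translate(TO_ASCII))
```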

POS

  • +A = Adjective
  • +Abbr = Abbreviation
  • +Adv = Adverb
  • +CC = Coordinating conjunction
  • +CS = Subordinating conjunction
  • +Det = Determiner
  • +Interj = Interjection
  • +N = Noun
  • +Num = Numeral
  • +Paren = Parenthetical (вводное слово)
  • +Pcle = Particle
  • +Po = Postposition (ради is the only postposition)
  • +Pr = Preposition
  • +Pron = Pronoun
  • +V = Verb

Sub-POS

  • +All = All: весь
  • +Coll = Collective numerals
  • +Def = Definite
  • +Dem = Demonstrative
  • +Indef = Indefinite: кто-то, кто-нибудь, кто-либо, кое-кто, etc.
  • +Interr = Interrogative: кто, что, какой, ли, etc.
  • +Neg = Negative: никто, некого, etc.
  • +Pers = Personal
  • +Pos = Possessive, e.g. его, наш
  • +Prcnt = Percent
  • +Prop = Proper
  • +Recip = Reciprocal: друг друга
  • +Refl = Reflexive: pronoun себя, possessive свой
  • +Rel = Relativizer, e.g. который, где, как, куда, сколько, etc.
  • +Symbol = independent symbols in the text stream, like £, €, ©

Verbal MSP

  • +Impf +Perf = Imperfective, perfective
  • +IV +TV = Intransitive, transitive (Zaliznjak does not mark trans-only, so transitive verbs all have both TV and IV)
  • +Inf +Imp = Infinitive, imperative (imperatives: 2nd person = читай, 1st person = прочитаем)
  • +Pst +Prs +Fut = Past, present, future
  • +Sg1 +Sg2 +Sg3 = Singular, 1st/2nd/3rd person
  • +Pl1 +Pl2 +Pl3 = Plural, 1st/2nd/3rd person
  • +PrsAct +PrsPss = Participles (+PrsAct+Adv and +PstAct+Adv are used for the verbal adverbs)
  • +PstAct +PstPss = Participles
  • +Pass = Passive
  • +Imprs = Impersonal (cannot have explicit subject)
  • +Lxc = Lexicalized (for participial forms)
  • +Der = Derived (for participial forms)
  • +Der/PrsAct = Derived (for participial forms)
  • +Der/PrsPss = Derived (for participial forms)
  • +Der/PstAct = Derived (for participial forms)
  • +Der/PstPss = Derived (for participial forms)
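
To show how the verbal tags above might be consumed, here is a minimal sketch that splits an analysis string into a lemma and its tags and then groups some verbal tags by category. The `lemma+Tag+Tag` string layout and the tag order in the example are assumptions based on the usual output of lexc-based analysers, not something this page specifies; `parse_analysis`, `describe_verb`, and `VERB_TAG_GROUPS` are hypothetical names.

```python
# Verbal tag categories, taken from the list above.
VERB_TAG_GROUPS = {
    "aspect": {"Impf", "Perf"},
    "transitivity": {"IV", "TV"},
    "tense": {"Pst", "Prs", "Fut"},
    "person_number": {"Sg1", "Sg2", "Sg3", "Pl1", "Pl2", "Pl3"},
}


def parse_analysis(analysis: str):
    """Split 'lemma+Tag1+Tag2' into (lemma, [Tag1, Tag2]).

    Assumption: the lemma itself contains no '+' character.
    """
    lemma, *tags = analysis.split("+")
    return lemma, tags


def describe_verb(analysis: str):
    """Return the lemma and a dict of the verbal features found in the tags."""
    lemma, tags = parse_analysis(analysis)
    features = {}
    for name, values in VERB_TAG_GROUPS.items():
        found = [tag for tag in tags if tag in values]
        if found:
            # Keep the first matching tag per category.
            features[name] = found[0]
    return lemma, features
```

For instance, `describe_verb("читать+V+Impf+TV+Prs+Sg1")` would pick out the aspect, transitivity, tense, and person/number tags from that hypothetical reading.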

Nominal MSP

  • +Msc +Fem +Neu +MFN = grammatical gender, +MFN = gender unspecifiable (pl tantum)
  • +Inan +Anim +AnIn = animacy (+AnIn = ambivalent animacy for non-accusative modifiers)
  • +Sem/Sur +Sem/Pat = Surname (фамилия), Patronymic
  • +Sem/Ant +Sem/Alt = Anthroponym/Given name, Other
  • +Sg +Pl = number
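  • +Nom +Acc +Gen = case: nominative, accusative, genitive
  • +Loc +Dat +Ins = case: locative (prepositional), dative, instrumental
  • +Loc2 +Gen2 +Voc = case: second locative, second genitive (partitive), vocative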
  • +Count = Count (for человек/людей or лет/годов, etc. also шага́/шара́/часа́/etc.)
  • +Ord = Ordinal
  • +Cmpar = Comparative
  • +Sint = Synthetic comparative is possible, e.g. старее
  • +Pred = "Predicate", also used for short-form adjectives
  • +Cmpnd = "Compound", used for compounding adjectives, such as русско-английский
  • +Att = Attenuative comparatives like получше, поновее, etc.

Punctuation

  • +PUNCT = Punctuation
  • +CLB = Clause boundary (TODO: clarify how +SENT and +CLB differ)
  • +SENT = Clause boundary
  • +COMMA = Comma
  • +DASH = Dash
  • +LQUOT = Left quotation
  • +RQUOT = Right quotation
  • +QUOT = "Ambidextrous" quotation
  • +LPAR = Left parenthesis/bracket
  • +RPAR = Right parenthesis/bracket
  • +LEFT = Left parenthesis/bracket/quote/etc.
  • +RIGHT = Right parenthesis/bracket/quote/etc.

Other tags

  • +Prb = +Prb(lematic): затруднительно - предположительно - нет
  • +Fac = Facultative
  • +PObj = Object of preposition (prothetic н: него нее них)
  • +Epenth = epenthesis on prepositions (о~об~обо or в~во)
  • +Leng = Lengthened доброй~доброю (marks less-canonical wordform that has more syllables)
  • +Elid = Elided (Иванович~Иваныч, новее~новей, чтобы~чтоб, или~иль, коли~коль)
  • +Use/NG = Do not generate (used for apertium, etc.)
  • +Use/Obs = Obsolete
  • +Use/Ant = Antiquated "устаревшее"
  • +Err/Orth = Substandard
  • +Err/L2_ii = L2 error: Failure to change ending ие to ии in +Sg+Loc or +Sg+Dat, e.g. к Марие, о кафетерие, о знание
  • +Err/L2_Pal = L2 error: Palatalization: failure to place soft-indicating symbol after soft stem, e.g. земла (compare земля)
  • +Err/L2_FV = L2 error: Presence of fleeting vowel where it should be deleted, e.g. отеца (compare отца)
  • +Err/L2_NoFV = L2 error: Lack of fleeting vowel where it should be inserted, e.g. окн (compare окон)
  • +Err/L2_SRo = L2 error: Failure to change о to е after hushers and ц, e.g. Сашой (compare Сашей)
  • +Err/L2_SRy = L2 error: Failure to change ы to и after hushers and velars, e.g. книгы (compare книги)
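
The +Err/ and +Use/NG tags above lend themselves to simple post-filtering of analyser output. The sketch below separates readings that carry an error tag from those that do not, and drops readings marked as not to be generated. It assumes readings are `lemma+Tag+Tag` strings; the function names and the sample readings are hypothetical and not taken from giella-rus.

```python
def reading_tags(reading: str) -> list:
    """Return the tags of a 'lemma+Tag1+Tag2' reading (assumed format)."""
    return reading.split("+")[1:]


def partition_readings(readings):
    """Split readings into (clean, flagged), where flagged carry an +Err/ tag."""
    clean, flagged = [], []
    for reading in readings:
        if any(tag.startswith("Err/") for tag in reading_tags(reading)):
            flagged.append(reading)
        else:
            clean.append(reading)
    return clean, flagged


def generatable(readings):
    """Drop readings tagged +Use/NG (do not generate)."""
    return [r for r in readings if "Use/NG" not in reading_tags(r)]
```

Matching on the `Err/` prefix rather than on a fixed list keeps the filter working if further error tags are added.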

Key lexicon

  • LEXICON Root
  • Abbreviation ;
  • : %^P%^A Adjective ;
  • Adverb ;
  • Comparative ;
  • Conjunction ;
  • Interjection ;
  • Noun ;
  • Numeral ;
  • Parenthetical ;
  • Particle ;
  • Predicative ;
  • Preposition ;
  • Pronoun ;
  • Verb ;
  • Propernoun ;
  • Punctuation ;
  • Symbols ;
  • LexicalizedParticiple ;