Work Plan

Work Plan with background for Cree FST:

(Potential) orthographical differences and addressing them

  • UofA Native Studies Cree teaching materials (NS)
  • Wolvengrey (dissertation 2011, and personal communications)
  • Wolfart's materials (1973 N.B. Bloomfield legacy, etc.)
  • Okimâsis' text book (2014)
    • writing <i> and <o> in <iy>, <ow> sequences: e.g. <sipiy> vs. <sipîy>
      • generally recognize both short/long <i>, <î>, but output always short <i> before <y> and <o> before <w>, except at morpheme boundaries and unless we have a genuine minimal pair (this would be according to the Ahenakew-Okimâsis convention).
    • orthographical vs. pronunciation discrepancy?
      • <iw> = /ow/: mismatch in lemma, but <i> realized as /i/ in other forms such as imperative and 1Sg.
    • elision: short <i> elision with apostrophe
      • recognize forms with apostrophe, but always generate the underlying <i>.
    • circumflex vs. macron for long vowels
      • circumflexes historically, and still currently, easier to type than macrons.
    • h-joiners (and other epenthetic sounds)
      • present/output the hyphen as the canonical joiner (this does away with the need for explicit use of -h- or other joiners), but nevertheless recognize words not using hyphen joiners (for now). The speaker can then choose, given the phoneme context, which joiner sound to use in the spoken form.
      • exception: -h- instead of -t- joiner <nihayân> (1st person of <ayâw>)
      • -h- also possibly used as an "archaic" variant at the end of pre-verbal morphemes: eh-: ê-, kîh-: kî-, wîh-, (mâcih-, ponih-), etc. (in part representing times when these preverbs were written as separate words).
      • mark all such non-hyphen joiner forms with tag: +Err/Orth (so that they will not be generated by the FST).
    • <e>: represent the /e/ phoneme consistently as short, i.e. <e>.
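The recognition-side conventions above could be approximated in a pre-processing step before lookup. What follows is only a minimal Python sketch under stated assumptions, not part of the FST: the function names are hypothetical, and the glide rule deliberately ignores the minimal-pair exception noted above. Because it only matches a vowel immediately followed by the glide, a hyphen between them (a written morpheme boundary) leaves the vowel untouched.

```python
import unicodedata

COMBINING_MACRON = "\u0304"
COMBINING_CIRCUMFLEX = "\u0302"


def macron_to_circumflex(word: str) -> str:
    """Rewrite macron-marked long vowels with circumflexes."""
    # Decompose so that each diacritic becomes a separate combining character.
    decomposed = unicodedata.normalize("NFD", word)
    swapped = decomposed.replace(COMBINING_MACRON, COMBINING_CIRCUMFLEX)
    # Recompose into single precomposed characters such as "â" and "î".
    return unicodedata.normalize("NFC", swapped)


def shorten_before_glide(word: str) -> str:
    """Naively write short <i> before <y> and short <o> before <w>.

    Assumption (hypothetical simplification): no minimal pair is involved,
    and morpheme boundaries are always written with a hyphen.
    """
    word = unicodedata.normalize("NFC", word)
    return word.replace("îy", "iy").replace("ôw", "ow")


def normalize(word: str) -> str:
    """Apply both conventions in sequence."""
    return shorten_before_glide(macron_to_circumflex(word))
```

For example, `normalize("sipīy")` first yields `sipîy` and then `sipiy`, the spelling that would be output under the Ahenakew-Okimâsis convention.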

Modeling of morphology: background and modeling decisions

  • Various conjunct form prefixes beyond 'ê-': kâ-, etc.
    • historically changed conjunct form replaced by ê-/kâ- and regular conjunct form (e.g. nêpâyan / nipâyan → ê-nipâyan) in Plains Cree (but not in East Cree).
    • conjunct forms expressing some level of antecedent (at discourse level) or TAM and/or evidentiality.
    • worthwhile reference: Clare Cook (2008) "The syntax and semantics of clause-typing in Plains Cree"
    • one solution would be to explicitly mark which grammatical preverb precedes the actual conjunct form, allowing also for the absence of such a preverb in some cases. This grammatical preverb information could also be used for determining the subsequent analysis of the suffix, as well as to set constraints on allowable conjunct-preverb + suffix combinations, resulting in e.g. the following analyses for conjunct forms of nipâw (this is in fact quite similar to the East Cree situation):
ê-nipâyan	PV/e+nipâw+V+AI+Cnj+Prs+1Sg
kâ-nipâyan	PV/ka+nipâw+V+AI+Cnj+Prs+1Sg
nipâyan	nipâw+V+AI+Cnj+Prs+1Sg
  • Subjunctive vs. Future Conditional
    • coding as +Fut+Cond might make more sense than +Subjn.
  • Coding obviation in the context of various third person features
    • Wolvengrey (2011): Proximate 3rd vs. Obviative 3rd and Further Obviative 3rd persons: use 4th and 5th person for these Obviative and Further Obviative forms; moreover, use ambiguity tags for 4th/5th obviated forms as they cannot be differentiated in terms of number (Sg vs. Pl) by any structural cues in the sentential context.
    • Revise tags as: +3Sg, +3Pl, +4Sg/Pl, +5Sg/Pl for Actor, and +3SgO, +3PlO, +4Sg/PlO, +5Sg/PlO for Goal features.
    • Inclusive Plural 2nd/1st (+21Pl) person forms of Future Definite, i.e. -yâhko vs -yahki endings: areal variants?
    • Both equally good, recognize and generate both.
  • Weak, strong, and combined reduplication for vowel initial verb stems
    • Generally, the combined reduplicative prefix is ây-ah-, regardless of the initial vowel.
    • However, for o-initial stems, there appears to be an alternation, with roughly equal frequencies, between ay-/oy- in the case of weak reduplication and âh-/wâh- in the case of strong reduplication. The likely solution is to implement both variants for o-initial verb stems.
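The proposed treatment of reduplication on vowel-initial stems can be illustrated with a small Python sketch. This is an assumption-laden illustration rather than the LEXC implementation: the function name is hypothetical, prefixes are simply concatenated to a citation form with no inflection, only short <o> counts as "o-initial", and the strong o-variant is spelled wâh- as in the corpus forms.

```python
from typing import Dict, List

VOWELS = "aâiîoôê"


def reduplication_variants(stem: str) -> Dict[str, List[str]]:
    """List the reduplicated variants to accept for a vowel-initial stem."""
    if not stem or stem[0] not in VOWELS:
        raise ValueError(f"not a vowel-initial stem: {stem!r}")

    weak = ["ay-"]
    strong = ["âh-"]
    if stem[0] == "o":
        # o-initial stems alternate, so both variants are kept.
        weak.append("oy-")
        strong.append("wâh-")

    return {
        "weak": [prefix + stem for prefix in weak],
        "strong": [prefix + stem for prefix in strong],
        # The combined prefix is the same whatever the initial vowel.
        "combined": ["ây-ah-" + stem],
    }
```

Calling `reduplication_variants("otinam")` gives two weak and two strong variants, while a stem such as `ayâw` gets only the ay- and âh- forms.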

Frequency counts from Wolfart texts:

WEAK with ay-:

kâ-kî-ay-ohtinamihk	kâ-kî-ay-ohtinamihk	+?

WEAK with oy-:

ê-oy-oswâcik	ê-oy-oswâcik	+?
nitati-oy-otâpân	nitati-oy-otâpân	+?
kâ-kî-oy-ohpikihicik	kâ-kî-oy-ohpikihicik	+?

STRONG with âh-:

kî-âh-oskinîkiwak	kî-âh-oskinîkiwak	+?
k-âh-oskinîkicik	k-âh-oskinîkicik	+?
ê-kî-âh-ocawâsimisicik	RdplS+ocawâsimisiw+V+AI+Cnj+Prt+3Pl
kiwî-kakwê-âh-onâpêminâwâw	PV/kakwe+RdplS+onâpêmiw+V+AI+Ind+Fut+Int+2Pl
kâ-âh-otinahk	kâ-âh-otinahk	+?
kiwî-kakwê-âh-onâpêminâwâw	PV/kakwe+RdplS+onâpêmiw+V+AI+Ind+Fut+Int+2Pl
kâ-âh-otinahk	kâ-âh-otinahk	+?
kâ-kî-kakwê-âh-otinikoyâhk.	kâ-kî-kakwê-âh-otinikoyâhk.	+?
nikî-âh-otinâhtikwânân	nikî-âh-otinâhtikwânân	+?
âh-oyôsisimiw;	âh-oyôsisimiw;	+?
âh-oyôsisimiw,	âh-oyôsisimiw,	+?
nikî-pê-âh-otihtikonân,	nikî-pê-âh-otihtikonân,	+?
wî-âh-osâmêyihtam,	wî-âh-osâmêyihtam,	+?
ê-at[i]-âh-ocawâsimisit.	ê-at[i]-âh-ocawâsimisit.	+?

STRONG with -wâh- (or wâh- preverb):

ê-wâh-onâpêmicik.	ê-wâh-onâpêmicik.	+?
ê-wâh-ocihcihkwanapiyâhk.	ê-wâh-ocihcihkwanapiyâhk.	+?
kî-wâh-osîhcikâtêwa,	kî-wâh-osîhcikâtêwa,	+?
nitawi-wâh-ocipitêwak	PV/nitawi+PV/wah+ocipitêw+V+TA+Ind+Prs+3Pl+4Sg/PlO
ê-wâh-onâpêmicik,	ê-wâh-onâpêmicik,	+?
ê-pê-wâh-otihtinikoyâhk	PV/pe+PV/wah+otihtinêw+V+TA+Cnj+Prs+3Sg+1PlO
ê-pê-wâh-otihtinikoyâhk	PV/pe+PV/wah+otihtinêw+V+TA+Cnj+Prs+4Sg/Pl+1PlO
ê-wâh-ocêmikoyâhk	PV/wah+ocêmêw+V+TA+Cnj+Prs+3Sg+1PlO
ê-wâh-ocêmikoyâhk	PV/wah+ocêmêw+V+TA+Cnj+Prs+4Sg/Pl+1PlO
ê-kî-wâh-osîhtamawâcik	PV/wah+osîhtamawêw+V+TA+Cnj+Prt+3Pl+4Sg/PlO
ê-kî-papâmi-wâh-otinât	PV/papami+PV/wah+otinêw+V+TA+Cnj+Prt+3Sg+4Sg/PlO
ê-wâh-otinahk	PV/wah+otinam+V+TI+Cnj+Prs+3Sg
ê-pimi-wâh-ohpahtênaman	ê-pimi-wâh-ohpahtênaman	+?
ê-wâh-ohtohtêcik,	ê-wâh-ohtohtêcik,	+?

Differences in animacy/transitivity types of nouns and verbs: some verbs seem to apply to both NI and NA objects (tomina vs. tominam: AW says that one is VTI and the other VTA; Maskwacîs says the opposite). To be looked into later.

Action items with priorities and assignments


  1. DONE! Implement 3rd person proximate/obviative features as 3rd, 4th, and 5th persons with number-wise ambiguity tags (for 4th and 5th persons), making sure the changes to the LEXC and YAML files are in full agreement (Atticus).
  1. DONE! Add -wici- -m- forms (Atticus)
  2. DONE! Add -ikawi- unspecified actor suffix (Atticus)
  3. DONE! fully implement VTA-5 paradigm (instead of using VTA-1 as the default). According to Arok, VTA-5 is basically the same as VTA-1, with the addition of <i> to the stem in the Immediate Imperative forms (Atticus)
  4. include reciprocals and ensure they show up only in the Singular forms (Atticus)
  5. Go through IICONJ stems and verify with Arok which are SG or PL only. Infrastructure for this is already implemented in affixes/verbs.lexc; one only needs to adjust the coding in the stems file to redirect to relevant continuation lexica. After II verbs, do this process systematically for AICONJ, TICONJ and TACONJ verbs. (Atticus)
  6. implement reduplication for o-initial verb stems as discussed above (Antti)

The following will take more work and research to implement:

  1. DONE! Implementation of <kâ-> prefixed conjunct forms (line 68 in affixes/verbs.lexc). Research to be done to determine how kâ- (and other grammatical preverbs such as kâ-kî-) interacts with the various verb moods and functions (relativizer, infinitivizer), and how we could code this. One solution would be to mark explicitly the grammatical preverb preceding a conjunct form, allowing also for the absence of such a grammatical preverb (cf. above) (Atticus)
  2. Deal with two/three-letter preverb problematics (analysing the reduction of a potential -ta- as a reduced -t-, instead of as an epenthetic -t-). This can be partially solved with requiring hyphens as joiners, as well as with some restrictions on preverb combinatorics. Arok to provide some categorical restrictions, if possible, but otherwise to be explored based on Wolfart corpus data (Antti, Arok)
  3. DONE! Change preverb tags to represent vowel length accurately (to distinguish e.g. maci- 'bad' from mâci- 'start') (Antti)
  4. Allow for analysis of forms with -h- joiners, but not their generation. (TBD)
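The "analyze but do not generate" behaviour for +Err/Orth forms amounts to filtering on the tag when building the generation side. The Python sketch below only illustrates that idea on plain (analysis, surface) pairs. It is not how the FST itself would be compiled: the function names are hypothetical, the position of +Err/Orth at the end of the analysis string is an assumption, and the êh- form in the usage note is an invented example.

```python
from typing import Iterable, List, Tuple

ERR_ORTH = "+Err/Orth"


def is_normative(analysis: str) -> bool:
    """True if the analysis carries no non-standard orthography tag."""
    return ERR_ORTH not in analysis


def generation_pairs(pairs: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Keep only the (analysis, surface) pairs that may be generated."""
    return [(analysis, surface) for analysis, surface in pairs
            if is_normative(analysis)]
```

Given the pairs `("PV/e+nipâw+V+AI+Cnj+Prs+1Sg", "ê-nipâyan")` and `("PV/e+nipâw+V+AI+Cnj+Prs+1Sg+Err/Orth", "êh-nipâyan")`, both would remain available for analysis, but only the first would be kept for generation.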


  1. -ici- 'fellow' forms (Atticus)
  2. MOSTLY DONE! <-m-> in possession (some nouns are not coded for the right continuation lexicon to allow for -m- suffix). Discuss the forms that do this with Arok. If fuzzy, allow for both possession options, but if it is categorically the case that a possessed noun must, or must not, use -m-, then code it as such (Atticus)


  1. Incorporate common contractions of particles in the lexicon as +Err/Orth cases. (Atticus)


  • Make the assessment file for the UCRK analyzer easier to read, then put into Google Drive/Plains Cree Finite State Morphology/WolfartTexts/Analyses. Arok will then go through the top of the list. Perhaps he can see if there are preverbs that categorically cannot occur in certain forms/moods. (Antti)
  • recode flag diacritics without FST regular expressions (the < ... > notation) using Måns' double coding scheme in matching positions on both the left and right sides (Atticus, Antti).
  • Go through the continuation lexicon structure for crk and systematize (in some way) their naming conventions, organization and structure for easier later maintenance (cf. e.g. IICONJ-SG vs. IIPLCONJ). (Atticus)