phonology-morphology
The Somali morphophonological/twolc rules file
Phonological Processes in Somali
Spreading processes
Lenition
ilig ‘tooth’ ~ iligga ‘tooth (Def.)’ ~ ilko ‘teeth (Indef.)’
arag ‘see’ ~ aragtaa ‘2Sg/3SgF sees’ ~ arkaa ‘1Sg/3SgM sees’
Voicing assimilation
Stops assimilate in voicing (or lenis/fortis), particularly across morpheme boundaries:
aragtaa ‘2Sg/3SgF sees’
wararka ‘the news’
buugga ‘the book’
naagta ‘the woman’
jaamacadda ‘the university’
This also holds for the retroflex segment <dh>; however, the sequence <dhdh> is shortened to <dh>:
gabadh ‘(a) girl’
gabadha ‘the girl’
Vowel ablaut
waxaan ‘foc+1Sg’     magac               rah
wuxuu ‘foc+3SgM’     magucu / magacu     ruhu
                     magicii / magicii
Full ablaut appears to be optional in some words.
tag ‘go’       tegi ‘to go’       tegeen ‘they went’
bax ‘leave’    bexi ‘to leave’    bexeen ‘they left’
<k> deletion
<k> is deleted when it follows a back consonant (which is not <k> itself):
wax + ka => waxa
magac + ka => magaca
rah + ka => raha
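The deletion can be sketched in Python as follows. This is only an illustration, not the twolc rule itself: the function name `attach_k_suffix` is hypothetical, and the set of back consonants is an assumption (only x, c and h are attested in the examples above). The digraphs sh and dh are excluded, on the further assumption that their final h does not count as a back consonant.

```python
# Assumed set of "back" consonants; only x, c and h appear in the examples.
BACK_CONSONANTS = {"x", "c", "h", "q", "'"}
# Digraphs that end in the letter h but are assumed not to be back consonants.
NON_BACK_DIGRAPHS = ("sh", "dh")


def attach_k_suffix(stem, suffix):
    """Attach a suffix, deleting its initial k after a stem-final back consonant."""
    if not suffix.startswith("k"):
        return stem + suffix
    ends_in_back = (
        stem[-1:] in BACK_CONSONANTS and not stem.endswith(NON_BACK_DIGRAPHS)
    )
    if ends_in_back:
        return stem + suffix[1:]  # drop the k
    return stem + suffix


for stem in ("wax", "magac", "rah"):
    print(stem, "+ ka =>", attach_k_suffix(stem, "ka"))
```

With the three stems above this prints waxa, magaca and raha, matching the listed forms.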
<l> + <t> -> <sh>
Vowel harmony
Tonal Processes
Reduplication
Somali has two kinds of reduplication: partial and full. Reduplication is typically
af ‘mouth, language’ => afaf ‘... +Pl’
qoys ‘family’ => qoysas ‘... +Pl’
Note: There may be some partial reduplication that also copies the vowel.
qof ‘person' => qofqof cad ‘part' => cadcad
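The two noun patterns shown above can be sketched in Python. The function names are hypothetical, and the sketch assumes that partial reduplication always suffixes -a- plus a copy of the final consonant (it does not cover the possible vowel-copying variant mentioned in the note) and that dh, kh and sh are copied as single consonants.

```python
# Digraphs treated as single consonants when copying (an assumption).
DIGRAPHS = ("dh", "kh", "sh")


def partial_reduplicate(noun):
    """Suffix -a- plus a copy of the final consonant, e.g. af -> afaf."""
    final = noun[-2:] if noun.endswith(DIGRAPHS) else noun[-1:]
    return noun + "a" + final


def full_reduplicate(word):
    """Repeat the whole word, e.g. qof -> qofqof."""
    return word + word


print(partial_reduplicate("af"))    # afaf
print(partial_reduplicate("qoys"))  # qoysas
print(full_reduplicate("qof"))      # qofqof
print(full_reduplicate("cad"))      # cadcad
```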
Adjectives:
quruxsán ‘beautiful’ => qurquruxsán ‘... +Pl’
fudúd ‘easy, light’ => fudfudúd
adág ‘difficult’ => ad'adág
yár ‘a little’ => yaryára
gaabán ‘short’ => gaaggaabán ‘... +Pl’
laabán ‘folded’ => laallaabán
jabán ‘broken’ => jajabán
Irregular adjectives:
dhéer ‘long, tall (sg)’ => dhaadhéer
wéyn ‘big’ => waawéyn
Note: Other possibilities might be found by grepping with ‘^\(.\)aa\1e’, ‘^\(.\)aa\1\1aa’,
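For searching a word list outside of grep, the two patterns quoted in the note translate directly to Python regular expressions. This is a sketch: the pattern labels and the function name `matching_patterns` are made up for illustration, and the word list is assumed to be written without tone accents (an accented é would not match the plain e in the first pattern). Note that `.` matches a single letter, so forms beginning with a digraph such as dh are not found.

```python
import re

# Python equivalents of the two grep patterns; the labels are illustrative only.
PATTERNS = {
    "CaaCe": re.compile(r"^(.)aa\1e"),
    "CaaCCaa": re.compile(r"^(.)aa\1\1aa"),
}


def matching_patterns(word):
    """Return the labels of all patterns that match the start of the word."""
    return [label for label, pattern in PATTERNS.items() if pattern.search(word)]


print(matching_patterns("gaaggaabán"))  # ['CaaCCaa']
print(matching_patterns("waaweyn"))     # ['CaaCe'] (unaccented spelling assumed)
print(matching_patterns("dhaadhéer"))   # [] because of the initial digraph
```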
Person morphemes
The realization of personal suffixes on verbs is a little complex and depends mostly on
Realization of these suffixes is currently all handled by twolc:
          sg     pl
1         {đ}    {ñ}    ==>
2         {ŧ}    0
3 Masc    {đ}    {đ}
3 Fem     {ŧ}

Declension 1
          sg          pl                      sg          pl
1         keenaa      keennaa       1         dhacay      dhacnay
2         keentaa     keentaan      2         dhacday     dhecdeen
3 Masc    keenaa      keenaan       3 Masc    dhacay      dheceen
3 Fem     keentaa                   3 Fem     dhacday

Declension 2
          sg          pl                      sg          pl
1         sameeyay    sameynay      1         kariyay     karinay
2         sameysay    sameyseen     2         karisay     kariseen
3 Masc    sameeyay    sameeyeen     3 Masc    kariyay     kariyeen
3 Fem     sameysay                  3 Fem     karisay

Declension 3
          sg          pl                      sg          pl
1         furay       furannay      1         watay       wadannay
2         furtay      furateen      2         wadatay     wadateen
3 Masc    furay       fureen        3 Masc    watay       wateen
3 Fem     furtay                    3 Fem     wadatay
A short summary:
%{đ%} - 1Sg, 3SgM and 3Pl; realized as 0/y/d in classes
%{ŧ%} - 2Sg, 3SgF, 2Pl; realized as t/s in classes
%{ñ%} - 1Pl; realized as n in all classes
‘special’ here just in case.
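The summary above amounts to a lookup from person and number to an archiphoneme, which can be written out as a small Python table. This is only a restatement of the summary for illustration; the key labels ("1", "3M", "Sg" and so on) and the function name are hypothetical and are not the tags used in the lexicon.

```python
# Person/number -> archiphoneme, following the short summary above.
ARCHIPHONEME = {
    ("1", "Sg"): "{đ}",
    ("3M", "Sg"): "{đ}",
    ("3", "Pl"): "{đ}",
    ("2", "Sg"): "{ŧ}",
    ("3F", "Sg"): "{ŧ}",
    ("2", "Pl"): "{ŧ}",
    ("1", "Pl"): "{ñ}",
}


def person_archiphoneme(person, number):
    """Look up the archiphoneme used for a given person and number."""
    return ARCHIPHONEME[(person, number)]


def persons_sharing(symbol):
    """List every person/number combination written with the given symbol."""
    return [key for key, value in ARCHIPHONEME.items() if value == symbol]


print(person_archiphoneme("1", "Pl"))  # {ñ}
print(persons_sharing("{ŧ}"))
```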
This process could easily be moved out of twolc and handled with flag diacritics alone; why not?
More processes to describe:
- wataa / wadatay dissimilation
- a/e alternation in -e nouns (noun type is a bit more complex too, as corpus reveals)
- vowel deletion ilig, ilko
- e/y alternation in samee- verbs
- w/b alternations
- vowel lengthening magaalo / magaalooyin
Notes:
- If forms disappear but the rule should work, make sure
Alphabet
For ease of reduplication and treating sh as separate from l + t => sh
TODO: write tests for gacmahii / gacmihii, magacii / magicii, magucu / magacu
TODO: tests from these
ilig+N+Masc+Pl+Def+PxSg3M+Abs+Prox il^ik´%>%{N%}a%>híisa# --> ilkihiisa
walaal+N+Masc+Pl+Def+PxSg3M+Abs+Prox walaal´´%>%{N%}a%>híisa# --> walaalihiisa
magac+N+Masc+Sg+Def+Abs+Dist magac´´%>%{N%}kii# --> magicii
magac+N+Masc+Sg+Def+Nom+Prox magac´´%>%{N%}ku# --> magucu
magac+N+Masc+Sg+Def+PxSg3F+Abs+Prox magac´´%>%{N%}kíisa# --> magiciisa
waxa+CS+Foc/R+Sg3M wax%>uu# --> wuxuu
wax+N+Masc+Sg+Def+PxSg3M+Nom+Prox wax´´%>%{N%}kíisu# --> wixiisu / waxiisu
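As a starting point for the TODO, the cases above could be read into structured test data. The following Python sketch assumes one case per line in the order analysis, intermediate form, arrow, surface form(s), with alternative surface forms separated by a slash. The function name `parse_case` is hypothetical, and this is not the format of any existing test framework.

```python
import re

# analysis, intermediate form, arrow ("-->" or "-- >"), surface form(s)
CASE_LINE = re.compile(r"^(\S+)\s+(\S+)\s+--\s*>\s+(.+)$")


def parse_case(line):
    """Split one test line into (analysis, intermediate, [surface forms])."""
    match = CASE_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"not a test line: {line!r}")
    analysis, intermediate, surface = match.groups()
    return analysis, intermediate, [form.strip() for form in surface.split("/")]


print(parse_case("waxa+CS+Foc/R+Sg3M wax%>uu# --> wuxuu"))
```

Each resulting tuple could then be compared against the output of the compiled analyser or generator, whichever direction is being tested.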
List of rules:
Ablaut around back fricatives at morpheme boundaries
Ablaut in declension 7 with final vowel
LT -> SH part 1
LT -> SH part 2
LT -> SH part 2 for V3B
LT -> SH in reduplicative plurals, pt 1
V2 {y} deletion
Definite suffix -k to -g next to preceding voiced
t:0 before d h
t/d alternation - definite suffix, V1
TODO: dukaammo
Declension 2:-C doubling in plural
TODO: This rule is currently conflicting with the next one.
Declension 2:-y insertion after back sounds and fricatives; and after -i
Declension 4:Plural Partial Reduplication
Declension 4:Plural Partial Reduplication %{m}
Declension 4:C2->n in m contexts
-y- insertion in plurals
Disallow yy
Doesn't work with wuxu, where k has been deleted.
Declension 6:final vowel
Declension V1:magacow / magacaabay / magacawday, shorten a
Delete preceding a
AllBoundaries \?: WordChars
Declension 7:{-e} delete
Declension 7:final vowel
Declension 7:final vowel in final position
Verbal Rules
V2A -> V3B derivations, delete -i
V2A -> V3B derivations, depalatalization
V1:a/e umlaut before i/e
V1:a/e umlaut before a
V1, D3:Vowel deletion
V1:Simplification of dhd to dh (t)
V1:n/m when deleted vowel results in C0n
V2A
To be written.
V2B
V2B:y-insertion
V2B:ey before consonants
V2B:ee before y
V2B:Remove e
V3A
V3B
Following rules are for the funky things that happen with 1Sg, 3SgM, 3Pl
wada>d>aa# wada>nn>aa#
wataa wadannaa
fura>d>aa# fura>nn>aa#
furtaa furannaa
Also another type: TODO: merge
sigo+V+IV+1Sg+Ind+Past sig{a}>d>ay#
sigtay
But: obd / cd / hd / qd / xd / yd
V3B:%{o%}:o finally
V3B:%{o%}:0 in d _ d
V3B:d dissimilation
V3B:d dissimilation w/ cor
V3B:d dissimilation 2
V3B:root {a} deletion before suffixes with -d
V3B:otherwise
General Rules
Vowel reduction VVV -> VV
m->n in codas
m->m in codas
Optional m doubling
Velar deletion after Back, but avoiding digraphs
Final lenition.
-ee changes to -ye after other vowels
-ee changes to -e after y
-ee clitic deletes preceding -o
progressive nasal assimilation across morphological boundaries
- primus%>s
- primus00
examples: