130815
sme-sma-mt meeting 15.8.2013
Francis, Lene, Trond.
Agenda
- Documentation
- Linguistics
- Technicalities
Documentation
Now also on http://wiki.apertium.org
Linguistics
ieš:
http://pastebin.com/raw.php?i=kZDbYK1z
Numerals
- Hashes are gone from above
- Case paradigms have not been checked
- Some numerals have case, others not.
1 ^vïjhte<num><sg><ill><attr>$ #vïjhte\<num\>\<sg\>\<ill\>\<attr\> ^vïjhte/vïjhte<num><sg><acc>/vïjhte<num><sg><nom>$^./.<clb>$ 1 ^göökte<num><sg><ill><attr>$ #göökte\<num\>\<sg\>\<ill\>\<attr\> ^göökte/göökte<num><sg><acc>/göökte<num><sg><nom>$^./.<clb>$ 1 ^gellie<num><sg><ill><attr>$ #gellie\<num\>\<sg\>\<ill\>\<attr\> ^gellie/gellie<num><sg><nom>/gellie<pron><indef><sg><nom>$^./.<clb>$ 1 ^akte<num><sg><ill><attr>$ #akte\<num\>\<sg\>\<ill\>\<attr\> ^akte/akte<num><sg><nom>$^./.<clb>$ 1 ^akte<num><sg><gen><der_lágan><a><attr>$ #akte\<num\>\<sg\>\<gen\>\<der_lágan\>\<a\>\<attr\> ^akte/akte<num><sg><nom>$^./.<clb>$ 1 ^golme<num><sg><gen><cmp>$ #golme\<num\>\<sg\>\<gen\>\<cmp\> ^golme/golme<num><sg><acc>/golme<num><sg><nom>$^./.<clb>$
passl
<e><p><l>viiddidit<s n="v"/><s n="tv"/><s n="der_passl"/><s n="v"/><s n="iv"/></l><r>væjranidh<s n="v"/><s n="iv"/></r></p></e>
Sámegiela dilli olggobealde hálddašanguovllu gal ii leat buorre, ja evalueren konkludere dainna ahte hálddašanguovlu ferte viiddiduvvot.
- Ij leah dan hijven tsiehkieh saemien gïelese reeremedajven ålkolen, jïh vuarjasjimmie vihteste reeremedajve tjuara væjranidh.
- 1 Saemiengïelen tsiehkie bæjngolen reeremedajven hågkh ij leah buerie, jïh evalueereme *konkludere dejnie ahte reeremedajve tjuara vijriesovvedh.
- 2 Saemiengïelen tsiehkie bæjngolen reeremedajven hågkh ij leah buerie, jïh evalueereme *konkludere dejnie "ahte" reeremedajve tjuara væjranidh.
- 3 Manne daajram, ahte saemiengïelen tsiehkie bæjngolen reeremedajven hågkh ij leah buerie, jïh evalueereme *konkludere dejnie ahte reeremedajve tjuara væjranidh.
Lemma
Variants getting the same lemma. - looking for the lemma in the bidix.
Fran: make analysis use desc fst.
Technicalities
Declaring symbols
Apertium svn messages as mail
We get source forge mail, but not svn mail. Fran to have a look.
https: //sourceforge.net/auth/subscriptions/
"Apertium: machine translation toolbox svn direct day All artifacts "
Syntax
echo "Mun oasttán láibbi buvddas." | preprocess | usme | lookup2cg | vislcg3 -g src/sme-dis.rle | vislcg3 -g src/smi-syn.rle "<Mun>" "mun" Pron Pers Sg1 Nom @SUBJ> "<oasttán>" "oastit" V TV Ind Prs Sg1 @+FMAINV "<láibbi>" "láibi" Food N Sg Acc @<OBJ "<buvddas>" "buvda" Org Build N Sg Loc @<ADVL "<.>" "." CLB
"<Mus>" "mun" Pron Pers Sg1 Loc @HAB> <====== Gen "<lea>" "leat" V IV Ind Prs Sg3 @+FMAINV "<láibi>" "láibi" Food N Sg Nom @<SPRED "<.>" "." CLB
Results:
echo "Mus lea láibi." | apertium -d . sme-sma Mannesne lea laejpie. $ echo "Mus lea láibi" | apertium -d . sme-sma Mov (lea) laejpie. echo "Mus leat guokte láibbi." | apertium -d . sme-sma Mov (leah) göökte laejpien. <=== laejpieh
http://wiki.apertium.org/wiki/North_S%C3%A1mi_and_South_S%C3%A1mi/Pending_tests
fran@eki: ~/source/apertium/nursery/apertium-sme-sma$ bash pending-tests.sh
Syntactic functions
We copy old gt/sme/src/smi-syn.rle to functions.cg3
it is here:
langs/sme/src/syntax/functions.cg3
genration report
cat generation-report.sme-sma.txt | grep "#"
hash gives
freq postchunk form generated lemma analysed ------------------------------------------------------------------------------------------- 9 ^jienebe<a><comp><sg><nom>$ #jienebe\<a\>\<comp\>\<sg\>\<nom\> ^jienebe/jienebe<pron><indef><sg><nom>$^./.<clb>$ 7 ^rolle<n><sg><acc>$ #rolle\<n\>\<sg\>\<acc\> ^rolle/*rolle$^./.<clb>$ 6 ^njoetseldh<a><attr><cmp>+almetje<n><pl><ill>$ #njoetseldh\<a\>\<attr\>\<cmp\>almetjidie ^njoetseldh/njoetsedh<v><iv><der_l dh><n><sg><nom>/njoetseldh<a><sg><nom>$^./.<clb>$ <e><p><l>eambbo<s n="a"/></l><r>jienebe<s n="a"/></r></p></e> <e><p><l>eanet<s n="a"/></l><r>jienebe<s n="a"/></r></p></e>
find line with output: cat texts/samediggidiedadus_samegiela_birra_2012.sme.txt | apertium -d . sme-sma-postchunk | cat -n | grep "jïjtje<pron><refl><sg><gen><pxpl1>" find input: head -n 17 texts/samediggidiedadus_samegiela_birra_2012.sme.txt | tail -1 | apertium -d . sme-sma-tagger