Meeting_2010-03-15
Contents:
- Meeting setup
- Agenda
- Opening, agenda review, participants
- Updated task status since last meeting
- Oahpa!
- Corpus gathering
- Promoting Divvun
- Future plans, directions and ideas
- Infrastructure
- Linguistics
- Name lexicon/risten.no infrastructure
- Proofing tools
- Other
- Next meeting, closing
- Appendix - task lists for the next week
Meeting setup
- Date: 15.3.2010
- Time: 09.30 Norw. time
- Place: Internet
- Tools: SubEthaEdit, iChat
Agenda
Cf. one of the following, depending on context:
- the upper bar of the SEE window (provided you use the JSPWiki syntax mode)
- the TOC in Forrest-rendered output, like HTML and PDF
Opening, agenda review, participants
Opened at 10:17.
Present:
Absent: none
Agenda accepted as is.
Updated task status since last meeting
Børre
- contact Ávvir about renewed corpus cooperation
- not done
- Jabber server and group chats: user name issues + lost chat history
- nothing new
- implement language switch for static divvun site
- not done
- improve XSL script to transform leaflet Forrest XDocs to an OOo Draw document
- not done
- gt/Makefile remake
- not done
- get translations of thank-you letter
- not done
- make the new SL Server services functional:
- group calendars
- not done
- set up corpus mirroring on the XServe again
- not done
- fix bugs!
- other:
- made hunspell spellers for North, Lule and South Sámi. They had a few bugs
Ciprian
- infrastructure
- make a schema/dtd description of the lexC-file (experiment with
- transform sme-lexC files into XML format
- restructure and clean the script catalogue as suggested in the newsgroup
- todo
- GT web:
- colouring output of the disambiguation
- ongoing
- add a tree visualizer for the dependency trees
- ongoing
- add input help for special characters on the tool sites
- ongoing
- automatise the web statistics
- filter (English, German, etc.) input using language detection tools
- todo
- put a note on the sites that these are NOT MT tools
- todo
- input help for generating wordforms (dropdown menus)
- todo
- Bodø Oahpa:
- finish the MorfaC implementation
- done
- finish the setup of the bdoahpa versions on victorio
- done for the following target languages: smj, smn, sjd, sje, and yrk
- Sandbox Oahpa:
- Numra for Skolt Sámi and Finnish
- todo
- finish the db installation for sb_oahpa
- done, but still needs debugging (the documentation is not up to date: insufficient information, no ordering of the commands, dependencies unclear?)
- Running Oahpa:
- add an Oahpa clock and date exercise (cf. Numra)
- todo
- email notification when the server goes down
- todo
- check the XXX?
- todo
- dictionaries, generally:
- synchronize the source language entries from a specific dictionary with the
- todo (long term process)
- the StarDict on Windows: try the HTML-plugin (that means that users can use
- todo
- try to reduce the dict-size on mac: experiment with xPointer, etc.
- todo
- Fkv:Nob - Nob:Fkv:
- compile the dictionaries for both directions and in both formats (internal
- todo: waiting for Trond and Verena for the data (the deadline apparently no longer applies)
- try to implement a web version of the dictionaries (internal deadline:
- todo: the deadline no longer applies
- KomEngFin:
- test the automatic sorting by Komi alphabet in xsl (as discussed
- todo
- SmeNob:
- incorporate the passives into the last version of the sme:nob
- todo
- SmaNobSwe:
- extend the smanobswe dictionary: waiting for data (incorporate the data from
- waiting for Maja-Lisa
- SjdRus:
- continue the work at the Kildin-Russian dictionary, next internal deadline is
- ongoing
- Corpus:
- check the processing of new corpus documents (error logging during
- todo
- Lexicon workshop
- evaluate the participants' materials
- todo
- write feedback emails to each participant with an individual electronic
- todo
- contact Kimberly Mäkäräinen and ask whether she might be willing to share
- todo
Maja
- add missing sma verbs
- Done - and I'm now adding from PiaList
- check verb Umlaut for several verbs
- Will go through the verb list when I'm looking at Pia's verbs
- Prepare texts about normativity issue to SGL/SGM
- DONE for loanwords substantives, flertall, stavelesestakter-
- more work on sma adjectives
- check the comparison, vokalutvidelser and add from PiaList, and spelling
- contact authors of sma articles in 9-10 eds. of the yearbook
- write a letter to Saemien sïjte
- look at incoming loanwords
- I'll do this when I'm working with the SGMdocs - I don't understand this,
- make a gold standard document from the SÄPO document - proofreading
- Not done yet
- continue gathering sma corpus texts
- Not done yet
- fix bugs!
Sjur
- write letter to Fylkesmannen i Nordland
- not done, and won't do - the case is dismissed
- gold standard testing for sma
- still no PL tools
- difftest for fst and PL speller
- @Barents: plan a meeting/seminar with potential cooperation partners
- meeting moved to Tromsø
- @Barents: plan for The Real Thing
- @TTS: continue public tender process
- gathered examples
- make Leif Åge send out CDs to distribution points
- start Nordplus Sprog project
- wrote to the NordPlus secretary and asked for a contract extension
- Write a formal letter to Davvi Girji about electronic dictionaries
- make XSL script to transform leaflet Forrest XDocs to an OOo Draw document
- name db/risten.no
- follow-up on some Polderland-related bugs: 621, 630, 652
- find and contact the correct person in SD, to get the manuscript for all Sámi
- write new build commands for make
- when the new build infrastructure works as it should, delete the old ones
- fix bugs!
- other things:
- license discussions in news
- dictionary and other discussions in news
- made several of our dictionaries ($GTHOME/words/dicts/) styled with CSS, and
Thomas
- sort out words that do not follow usual Umlaut rules to specific lexica
- continually
- change rules for vhn, vhl and similar in julev sámi hyphen-file
- done
- prepare texts about normativity issue to SGL/SGM
- nothing this week
- Digitise South Sámi books
- worked
- add compound tags for sma
- worked
- fix bugs!
- some work done
Tomi
- test that the new analysers behave like the old ones (= give exactly the same results)
- done more testing
- put together the TTS preprocessing transducers and scripts
- not done
- write new build commands
- working
- when the new build infrastructure works as it should, delete the old ones
- not done
- document how compounding is controlled in the PLX conversion
- not done
- fix double hyphen bugs
- not done
- fix PL smj hyphenator bug
- not done
- fix PL conversion bugs
- not done
- fix bugs!
Trond
- fkv:nob and nob:fkv scheduled for 18.3. release:
- content and webpage update
- The release is postponed until April
- Northern areas
- Progress, but slowly
- MT/Terminology
- Some admin done.
- fix bugs!
Oahpa!
7 different Oahpa!'s now in svn! The outcome of the Bodø seminar. It was a great
Finnish version of Oahpa ready for testing and deployment on the production
Meeting memos can be found at
TODO
- Register oahpa.no (Trond)
- Generate fin/xml/{nouns|verbs|adjectives}.xml, and implement the new Leksa
- From the start: sjd_oahpa Leksa in deu and eng as well
- clock and date for Numra (Trond, Ciprian)
- Numra for Skolt Sámi (Trond, Ciprian)
- implement the Bodø-Oahpa (Ciprian, Trond, Lene)
- email notification when the server goes down (Ciprian)
- Finding a volunteer to translate the sme Leksa lexicon to Swedish (Trond)
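The "email notification when the server goes down" item in the list above could be approached roughly as follows. This is only a minimal sketch, not the planned implementation: the function names, the SMTP host `localhost` and all addresses are assumptions made for illustration, and the check is meant to be run periodically (for instance from cron) on a machine other than the Oahpa server itself.

```python
import smtplib
import urllib.error
import urllib.request
from email.message import EmailMessage


def server_is_up(url, timeout=10.0, opener=urllib.request.urlopen):
    """Return True if the URL answers with an HTTP status below 400."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, or an HTTP error status.
        return False


def build_alert(url, sender, recipient):
    """Build the notification mail (all addresses are hypothetical)."""
    msg = EmailMessage()
    msg["Subject"] = f"Oahpa server not responding: {url}"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(f"The check of {url} failed. Please have a look at the server.")
    return msg


def check_and_notify(url, sender, recipient, smtp_host="localhost",
                     opener=urllib.request.urlopen):
    """Send a mail if the server is down; return True if a mail was sent."""
    if server_is_up(url, opener=opener):
        return False
    with smtplib.SMTP(smtp_host) as smtp:
        smtp.send_message(build_alert(url, sender, recipient))
    return True
```

The `opener` parameter exists only so that the check can be tried out without network access; in normal use it can be left at its default.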
Corpus gathering
Trond: we have been asked to come and present our tools in the first or second
TODO:
- continue gathering sma corpus texts (Maja)
- get sma articles in Š-bláđđi
- the Gun Utsi book is almost there - one contract missing (Maja)
- write formal letter to Davvi Girji (Sjur)
- send a copy of the signed contracts back to the authors, translators and
- find and contact the correct person in SD, to get the manuscript for all Sámi
- get the sma yearbooks from Saemien sïjhte (Maja)
- contact certain sma writers (Børre)
- contact Ávvir about renewed corpus cooperation (Børre)
Promoting Divvun
TODO:
- make leaflet to inform about the project (Børre)
- add InDesign text (Sjur)
- make XSL script to transform Forrest XDocs to an OOo Draw document
- distribute CD version through the library bus, the language centres and common
- make him send out CDs accordingly (Sjur)
- update online download log statistics page (Børre)
Future plans, directions and ideas
See a separate document in plan/strat/5year.jspwiki.
Northern areas project
TODO:
- Attend a beginners' course in Russian (priority: the alphabet!) near you.
- plan a meeting/seminar with potential cooperation partners (Trond, Sjur)
- plan for The Real Thing (Trond, Sjur)
Infrastructure
- License
- Dictionaries, styling and editing -> see Dictionaries further down
- Corpus interface
License
Discussion. All need to be able to read the newsgroup. New Unison licenses:
- Lene
- Maja
- Biret Ánne
- Linda?
- Upgrades:
- Sjur, Tomi
Please read and comment in the newsgroup. Trond and Sjur will get the required
TODO:
- install & configure the Unison news reader (Sjur, Trond)
Corpus interface
Made by Lars Nygård, UiT/Tekstlabben. Based on CWB from Stuttgart. Plus: made by
Our corpus (that is, Oslo) is using http://cwb.sf.net/.
A more user friendly interface is wished for before the Giellagiella
TODO:
- make a simple web search form for the UiT corpus repository (Ciprian, X)
- before May
- check out the new version of CWB with Unicode support (Ciprian)
- fall
Makefile + tag simplification
TODO:
- test that the output from the new transducers is identical to the old one
- working session to go through the remaining issues: 16.3. (Tomi, Sjur)
- write new build commands (Sjur, Tomi)
- when the new build infrastructure works as it should, delete the old ones
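The identity test between old and new transducers listed above can be sketched as a comparison of two saved analysis outputs. This is an illustration under assumptions, not the test actually used in the project: it assumes tab-separated output with the input word in the first column and the analysis in the last column, with blank lines between words. The function names and the sample analyses in the usage note are made up for the example.

```python
from collections import defaultdict


def parse_lookup(text):
    """Map each input word to the set of analyses found for it."""
    analyses = defaultdict(set)
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # blank separator lines carry no analysis
        analyses[fields[0]].add(fields[-1])
    return analyses


def diff_analyses(old_text, new_text):
    """Return {word: (lost, gained)} for every word whose analyses differ."""
    old, new = parse_lookup(old_text), parse_lookup(new_text)
    report = {}
    for word in sorted(set(old) | set(new)):
        lost = old.get(word, set()) - new.get(word, set())
        gained = new.get(word, set()) - old.get(word, set())
        if lost or gained:
            report[word] = (sorted(lost), sorted(gained))
    return report
```

An empty report means the two outputs contain the same set of analyses for every word. Comparing sets rather than raw lines makes the check insensitive to the order in which analyses are printed, which may or may not be what is wanted here.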
General list
- licensing of our source code
- the discussion goes on in news — needs to be finished soon
- fit repository:
- maintenance issues
- FOSS policy and communication of it
- what do we actually want to support?
- structure of repository
Trond will write an e-mail to the fit group explaining our situation.
To accommodate future enhancements in different directions (in rough order of
- test bench for all parts of our language technology efforts
- test bench enhanced, but not yet complete
- improve Forrest i18n support with static sites
- reorganise the documentation:
- differ between target groups
- get better grouping
- decide what to write in Forrest and what in wiki
- update/add missing parts
- migrate lexc lexicons to XML, splitting the task
- Name lexica (the Name project)
- Dictionaries (already in XML, task is to integrate them)
- At least migrate the lexc open POSes (Komi as a pilot case)
- change the look of the documentation web
- corpus content moved to Max Planck repositories? Norsk språkbank?
- update infrastructure to allow content-restricted spellers for special target
TODO:
- make the new SL Server services functional: (Børre)
- group calendars
- Jabber server and group chats: user name issues + lost chat history
- set up corpus mirroring on the XServe again (Børre)
- restructure and clean the script/ directory, using subdirs to categorise
- please act as suggested, cf comments in the news group
- infrastructure remake: (Børre, Ciprian, Sjur, Tomi, Trond)
- make a test-all target that runs all tests we have (Ciprian, Sjur, Trond)
- delayed until we have restructured the make/build process
- define and document testing routines (Ciprian, Sjur, Trond)
- delayed until we have restructured the make/build process
Linguistics
North Sámi
(nothing new, see proofing bugs below)
Lule Sámi
(nothing new, see proofing bugs below)
South Sámi
TODO:
- Prepare texts about normativity issue to SGL/SGM (Maja, Thomas)
- adjectives (Maja with Thomas, Trond, Sjur)
- two competing naming conventions of continuation lexicons
- One naming goes ATTRSUFF-PREDSUFF-STEMTYPE
- One follows the sme convention of naming key adjectives
- There are duplicate lexica
- The comparative issue is still open here and there
- The ATTRSUFF-PREDSUFF-STEMTYPE lexica now go to EVENCOMP and
- add compound tags (Thomas)
- ongoing
- write letter to Fylkesmannen i Nordland (Sjur)
- won't happen
- check verb Umlaut for several verbs (Maja, Sjur, Thomas, Trond)
- done
- clitics: -gih, -gan, -ge, -gænnah (most in use) (Maja, Thomas)
Name lexicon/risten.no infrastructure
TODO:
- find already approved lists, in paper or electronic form (term team)
- convert paper lists to electronic lists (term team)
- convert lists to standard XML (Sjur, Tomi)
- add prepared lists to risten.no (Sjur, Tomi)
- fix i18n bug in risten.no/G5 (so they will work without the proper locale
- fix bugs in lexc2xml; add comments to the log element (Saara)
- finish first version of the editing (Sjur)
- test editing of the xml files. If ok, then: (Sjur, Thomas, Trond)
- make terms-smX.xml <=== automatically from propernoun-sme-lex.xml (add nob
- convert propernoun-($lang)-lex.txt to a derived file from common xml files
- implement data synchronisation between risten.no and
- start to use the xml file as source file
- clean terms-sme.xml such that all names have the correct tag for their use
- merge placenames which are erroneously in different entries: e.g. Helsinki,
- publish the name lexicon on risten.no (Sjur)
- add missing parallel names for placenames (linguists)
- add informative links between first names like Niillas and Nils
Dictionaries
- Dictionaries, styling and editing
Lecture on Thursday, 10.00 Norw. time.
TODO:
- fkv:nob and nob:fkv is still scheduled for 18.3. release:
- content and webpage update (Trond, Verena)
- compile the MacDict and StarDict versions (Ciprian)
- kom:fin-eng
- moved the original kom-lex.xml to the inc-dir and froze it
- split it by pos into the working_file dir, the ONLY place to work with
- now, the lexC files are generated via XSLT sheets, no perl scripts
- adjusted the Makefile
- prepared the pipeline for compiling the mac dict
- todo: make a pipeline for StarDict also (as far as I know, Jaska has)
- set up risten.no on eXist/XServe (as a beta version site) (Sjur)
- set up required infra for smenob on risten.no/XServe (Sjur)
- Continue the dictionary infrastructure discussion (Ciprian, Sjur, Trond)
- end user documentation (how to download and install) (Ciprian, Trond)
- Contact Davvi Girji about cooperation on electronic dictionaries
- developing the mobile phone version of smenob:
- http://pfed.info/wksite/
- Komi
- take out the doublets to a separate file (Ciprian)
- merge the doublets (Jaska, Trond)
- Completing the automaton to some state (Trond, Jaska, Paula)
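The Komi dictionary work also includes a task (in Ciprian's list) to test automatic sorting by the Komi alphabet in XSL. As a point of comparison outside XSL, the expected order can be sketched with an explicit alphabet string. This is a hedged illustration only: the alphabet below is assumed to be the Russian letters plus і and ӧ, digraphs are not treated as separate letters, and the function names are invented for the example.

```python
import unicodedata

# Assumed Komi alphabet order; normalised so that composed letters compare equal.
KOMI_ALPHABET = unicodedata.normalize(
    "NFC", "абвгдеёжзиійклмноӧпрстуфхцчшщъыьэюя"
)
_RANK = {ch: i for i, ch in enumerate(KOMI_ALPHABET)}


def komi_key(word):
    """Sort key: position of each letter in the assumed Komi alphabet."""
    normalized = unicodedata.normalize("NFC", word).lower()
    # Characters outside the alphabet sort after it, ordered by code point.
    return [(_RANK.get(ch, len(_RANK)), ch) for ch in normalized]


def komi_sorted(words):
    """Return the words sorted by the assumed Komi alphabet order."""
    return sorted(words, key=komi_key)
```

A plain code-point sort places ӧ after all basic Cyrillic letters, so a word starting with ӧ ends up after words starting with п. With the key above it sorts directly after о. The same list can be used as reference output when checking the XSL sorting.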
Proofing tools
South Sámi
TODO:
- Go through the phonrules file (Thomas)
- difftest for fst and PL speller (Sjur)
- External beta testers:
- David
- Jovsset
- the Røros group
- gold standard testing (Sjur)
HFST-based proofing tools
The work with Voikko+HFST is moving forward.
Testing
Spelling Error Markup
TODO:
- test new and nested error markup (Sjur)
- nesting still needs to be tested, depends on new ccat feature
Speller testing
TODO:
- test the error type selection feature in ccat (Sjur)
Testing open-source Norwegian spellers
Sjur has invited the open-source group to test their spell-checker using our
We should go to their developer meetings, and present our work and how to work
Speller bugs
List of bugs returned from Polderland:
- 621
- 630
- 652
- 656
- 676
Tag reordering for abbreviations has caused a lot of problems:
smj:
hr. hr. hr+ABBR+Acc
cand.philol. cand.philol. cand.philol+ABBR+N+Acc
Per Per Per+N+Prop+Mal+Sg+Attr
sme:
hr. hr. hr+N+ABBR+Acc
Per Per Per+N+Prop+Mal+Sg+Attr
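One way to spot such tag-order differences is to group abbreviation analyses by the relative order of the tags in question. The following is a small sketch under assumptions: it treats everything before the first `+` as the lemma, tracks only the `N` and `ABBR` tags, and the function names are made up for illustration. It does not decide which order is the correct one.

```python
from collections import defaultdict


def tag_pattern(analysis, tags=("N", "ABBR")):
    """Return the tracked tags in the order they occur in the analysis."""
    parts = analysis.split("+")[1:]  # drop the lemma
    return tuple(tag for tag in parts if tag in tags)


def abbr_patterns(analyses):
    """Group analyses containing ABBR by the order of N and ABBR."""
    groups = defaultdict(list)
    for analysis in analyses:
        pattern = tag_pattern(analysis)
        if "ABBR" in pattern:
            groups[pattern].append(analysis)
    return dict(groups)
```

Run on the analyses quoted above, this gives three different patterns, `("ABBR",)`, `("ABBR", "N")` and `("N", "ABBR")`, which makes the inconsistency between and within the languages visible at a glance.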
Open issues based on test results:
sme
- 399 - missing numerals (plural forms) - still OPEN
- 425 - X not recognised; single letters were left out - still OPEN
- 435 - roman numbers - inflection of single letter numbers
- we should pregenerate all numbers once and for all, and store them in a
- 461 - REGRESSION: missing suggestion (sáhkki)
- 508 - REGRESSION: accepts smj entries (most likely abbreviation missing)
- 520 - REGRESSION: r9 and š9 not defined (abbr. missing)
- 595 - prefix+name without hyphen (ovdaLot instead of ovda-Lot) - still
- 603 - suomabealdi accepted - still OPEN
- 606 - compound-tags LEXICON VUOHTA - still OPEN
- 613 - short gen. as second compound part - still OPEN
- 619 - numerals and pronouns to NAMÁK and SASJ fails - vihttasoarttat
- 629 - a taking part in compounding without hyphen - still OPEN
- only open case has word A-finálaid compounded
- 647 - numerals+NOUN - still OPEN, open case has uppercase letters
- 648 - unmotivated suggestions with numeral+noun - still OPEN
- 661 - REGRESSION: abbr. not recognized
- 709 - sámedikkeválga accepted - OPEN
- 728 - vowel shortening GenCmp+Left-tagged - still OPEN
- 779 - caseforms of pronoun okatahat - still OPEN
- 785 - does not recognize alphabet-abbr+noun - OPEN
- 802 - NEW: multiword propernouns
- 803 - NEW: FINJU- words accepted single-handed
- 804 - NEW: guovttilogát, njealjilogát
- 805 - NEW: Nouns+acronyms
smj
- 435 - roman number - single letter numbers now recognised
- we should pre-generate all numbers once and for all, and store them in a
- please note that inflection of single letter numerals is fine in
- 482 - polardutkamin not recognized - FIXED
- 496 - REGRESSION: unrecognised clitics
- 556 - non-existent word accepted - FIXED
- 594 - lågenanguoktáj not recognized - still OPEN
- 595 - REGRESSION: prefix+name as split comp without hyphen
- 596 - C-giellan is not accepted - still OPEN
- 600 - Gen+hyph compound - FIXED
- 627 - prefix + hyphen does not get accepted - FIXED
- 647 - numerals+NOUN - still OPEN, open case has uppercase letters
- 648 - unmotivated suggestions with numeral+noun - still OPEN
- 650 - REGRESSION: noun prefix+name compound without hyphen
- 652 - UPPERCASE-typos only get acronym-suggestions - still OPEN
- 692 - numeral-variants - all but one fixed (gáktsalågenantjuotakta), but
- 744 - REGRESSION: numerals + clitic
- 803 - NEW: VINJU- words accepted single-handed
- 805 - NEW: Nouns+acronyms
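Bug 435 in both lists above suggests pregenerating all Roman numerals once and storing them. A minimal sketch of the generation step could look like this. The upper limit of 3999 (the largest value in standard notation) and the function names are assumptions, and how the list would be stored and inflected is left open.

```python
# Value/symbol pairs in descending order, including the subtractive forms.
_ROMAN = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman(number):
    """Convert an integer in the range 1..3999 to a Roman numeral."""
    if not 1 <= number <= 3999:
        raise ValueError("number must be between 1 and 3999")
    parts = []
    for value, symbol in _ROMAN:
        count, number = divmod(number, value)
        parts.append(symbol * count)
    return "".join(parts)


def roman_numerals(limit=3999):
    """Return the Roman numerals for 1..limit, in ascending order."""
    return [to_roman(n) for n in range(1, limit + 1)]
```

The single-letter numerals discussed in the bug (I, V, X, L, C, D, M) are part of the generated list, so they would not need separate treatment.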
TODO:
- document how compounding is controlled in the PLX conversion (Tomi)
Hyphenator bugs
Open issues based on test results:
sme
No known issues!
smj
- 670 - Hard hyphen replaced with soft hyphen: 10-biejvvásattja (the word is
sma
Command to test the hyphenator:
preprocess dev/corp/pressemelding.txt | lookup bin/hyph-sma.fst | cut -f2 | \
lookup bin/hyphrules-sma.fst | grep -v '^$' | cut -f2 | uniq | see
TODO:
- fix PL hyphenator errors (Tomi)
- almost done - one smj bug left
Installer changes
TODO:
- test InDesign installer (Sjur)
User documentation
TODO:
- InDesign documentation (Sjur)
- Norwegian translation received from Davvi Girji
1.2 release
Content:
- several smj bug fixes
- lexicalisations
- InDesign Mac & Win
- new OOo beta
- improved installers, at least for Mac, preferably also for Windows
Other
Easter holidays
Holiday for all of us: Thu, Fri, Mon (1.4, 2.4, 5.4)
Name | Vacation |
---|---|
Børre | 29-31 March |
Cip | orthodox or non-orthodox (I celebrate them all) |
Maja | 29-31 March ? |
Sjur | 29-31 March |
Thomas | 29-31 March |
Tomi | Not sure yet |
Trond | Not clear yet (but traveling the week after easter) |
Thursday inhouse seminar
A short (less than 1h) seminar every Thursday at 10 AM. Possible topics:
- next time: XXE and dictionaries (etc)
Spring planning
Topics:
- speller test project -> Sjur, Børre, Thomas, X
- speech synthesis -> Sjur, Trond, BA as a starter
- risten.no -> Ciprian, Sjur, Tomi
- Barents project follow-up meeting -> Trond, Sjur
- sme and smj proofing tools, next version -> Thomas, Tomi, Sjur
- HFST-based proofing tools -> Sjur
- MT terminology project -> Trond, Linda, Fran, Kevin
Dates:
- March 1-14: Bodø (conference,
- Cip in Bodø 2-5
- Lene in Bodø 8-12
- Trond in Bodø 2-4?
- March 23-24: SMA school conference, Trondheim - Cancelled for us
- May 4-5: Lene, Giellagiella
- May: LREC Malta: WS: 17-18, Conf. 19-21, WS 22-23
- July: ACL Uppsala
Text to speech
Tomi has familiarised himself with the code.
TODO:
- put together the preprocessing transducers and scripts (Tomi)
- refine syntax / dependency rules (Biret Ánne)
- continue public tender process (Sjur)
CAT
Autshumato ITE won't work on the Mac because of Apple software security
A-ITE seems to be released as 1.0, we will test it.
TODO:
- update and test (Ciprian, Sjur)
Next meeting, closing
The next meeting is 22.03.2010, 9.30 Norwegian time.
The meeting was closed at 12:31.
Appendix - task lists for the next week
Børre
- contact Ávvir about renewed corpus cooperation
- Jabber server and group chats: user name issues + lost chat history
- implement language switch for static divvun site
- improve XSL script to transform leaflet Forrest XDocs to an OOo Draw document
- gt/Makefile remake
- get translations of thank-you letter
- make the new SL Server services functional:
- group calendars
- set up corpus mirroring on the XServe again
- fix bugs!
Ciprian
- infrastructure
- make a schema/dtd description of the lexC-file (experiment with
- transform sme-lexC files into XML format
- restructure and clean the script catalogue as suggested in the newsgroup
- GT web:
- colouring output of the disambiguation
- add a tree visualizer for the dependency trees
- add input help for special characters on the tool sites
- automatise the web statistics
- filter (English, German, etc.) input using language detection tools
- put a note on the sites that these are NOT MT tools
- input help for generating wordforms (dropdown menus)
- Sandbox Oahpa:
- Numra for Skolt Sámi and Finnish
- debug the installed sb_oahpa
- update, correct, complete the Oahpa documentation site
- Running Oahpa:
- add an Oahpa clock and date exercise (cf. Numra)
- email notification when the server goes down
- check the XXX?
- dictionaries, generally:
- synchronize the source language entries from a specific dictionary with the
- the StarDict on Windows: try the HTML-plugin (that means that users can use
- try to reduce the dict-size on mac: experiment with xPointer, etc.
- Fkv:Nob - Nob:Fkv:
- compile the dictionaries for both directions and in both formats (internal
- try to implement a web version of the dictionaries (internal deadline:
- KomEngFin:
- test the automatic sorting by Komi alphabet in xsl (as discussed
- SmeNob:
- incorporate the passives into the last version of the sme:nob
- SmaNobSwe:
- extend the smanobswe dictionary: waiting for data (incorporate the data from
- SjdRus:
- continue the work at the Kildin-Russian dictionary, next internal deadline is
- Corpus:
- check the processing of new corpus documents (error logging during
- Lexicon workshop
- evaluate the participants' materials
- write feedback emails to each participant with an individual electronic
- contact Kimberly Mäkäräinen and ask whether she might be willing to share
- update and test Autshumato ITE
Maja
- add missing sma verbs
- Prepare texts about normativity issue to SGL/SGM
- more work on sma adjectives
- contact authors of sma articles in 9-10 eds. of the yearbook
- look at incoming loanwords
- make a gold standard document from the SÄPO document - proofreading
- continue gathering sma corpus texts
- fix bugs!
Sjur
- install & configure the Unison news reader
- FST make working session to go through the remaining issues: 16.3.
- XXE & dictionaries workshop Thursday
- update and test Autshumato ITE
- gold standard testing for sma
- difftest for fst and PL speller
- @Barents: plan a meeting/seminar with potential cooperation partners
- @Barents: plan for The Real Thing
- @TTS: continue public tender process
- make Leif Åge send out CDs to distribution points
- start Nordplus Sprog project
- Write a formal letter to Davvi Girji about electronic dictionaries
- make XSL script to transform leaflet Forrest XDocs to an OOo Draw document
- name db/risten.no
- follow-up on some Polderland-related bugs: 621, 630, 652
- find and contact the correct person in SD, to get the manuscript for all Sámi
- write new build commands for make
- when the new build infrastructure works as it should, delete the old ones
- fix bugs!
Thomas
- sort out words that do not follow usual Umlaut rules to specific lexica
- prepare texts about normativity issue to SGL/SGM
- Digitise South Sámi books
- add compound tags for sma
- fix bugs!
Tomi
- FST make working session to go through the remaining issues: 16.3.
- test that the new analysers behave like the old ones (= give exactly the same results)
- put together the TTS preprocessing transducers and scripts
- write new build commands
- when the new build infrastructure works as it should, delete the old ones
- document how compounding is controlled in the PLX conversion
- fix double hyphen bugs
- fix PL smj hyphenator bug
- fix PL conversion bugs
- fix bugs!
Trond
- install & configure the Unison news reader
- fkv:nob and nob:fkv scheduled for April release:
- content and webpage update
- Northern areas
- MT/Terminology
- fix bugs!