Meeting_2008-02-04
Contents:
- Meeting setup
- Agenda
- Opening, agenda review, participants
- Updated task status since last meeting
- Pedagogical software online
- Workshop in Tromsø, end of February
- Documentation
- Corpus gathering
- Infrastructure
- Linguistics
- Name lexicon infrastructure
- Proofing tools
- Other
- Next meeting, closing
- Appendix - task lists for the next five days
Meeting setup
- Date: 4.2.2008
- Time: 09.30 Norw. time
- Place: Internet
- Tools: SubEthaEdit, iChat/Skype
Agenda
Cf. one of the following, depending on context:
- the upper bar of the SEE window (provided you use the JSPWiki syntax mode)
- the TOC in Forrest-rendered output, like HTML and PDF
Opening, agenda review, participants
Opened at 09:43.
Present: Børre, Per-Eric, Sjur, Thomas, Tomi
Absent: Maaren, Trond
Agenda accepted as is.
Updated task status since last meeting
Børre
- start to reorganise the documentation
- not done
- gather sma texts
- not done
- improve forrest stability with i18n, site look
- made some changes to the layout, but they had to be reverted
- set up the Leopard Server features for collaborative support
- not done
- Hunspell lexicon conversion
- debugged propernoun generation, not finished
- InDesign documentation
- not done
- investigate the NSIS installer
- nothing new this week
- release InDesign tools Jan. 30.
- not done
- work on Tromsø Sami workshop paper
- not done
- fix bugs!
- http://giellatekno.uit.no/bugzilla/show_bug.cgi?id=640
Lene
- Ped project
- work on Tromsø Sami workshop paper
Maaren
- Put the list of possible sma corpus sources into a document
- update the Changes document
Per-Eric
- check some unusual and missing words from the last Olavi missing list
- still working on it, maybe ready this week
- keep in contact with Kurt Tore's family about his texts.
- heard nothing yet
- try to find other authors who have smj texts in digital form
- nothing yet
- fix bugs!
- worked on it
Saara
- add new XSL/XML headers for proofing test docs
- Set up ways of adding meta-information for proofing correct corpus docs
- discuss more parallel texts
Sjur
- start to reorganise the documentation
- gather sma texts
- improve forrest stability with i18n, site look
- set up the Leopard Server features for collaborative support
- check the present sma sources
- name db/risten.no
- investigate the NSIS installer
- get hotel rooms in Snåsa
- done
- make a first sma project plan
- publish corpus contracts and project infra as open-source on NoDaLi-sta
- release InDesign tools Jan. 30.
- almost done, release candidate ready by Friday Feb. 1., but not all bugs
- work on Tromsø Sami workshop paper
- updated Polderland tools by Wednesday
- arrived on Thursday, and only speller updates. Hyphenator bugs still open
- final changes and bug fixes by Thursday afternoon
- known ones done, we need the hyph fixes from Polderland before we can analyse
- fix bugs!
- other things:
- identified the non-displaying column in risten.no on the G5 to be
- tried to solve the non-displaying column (iframe) issue
- tried to solve the i18n issue in risten.no on the G5
Thomas
- look at test cases still not behaving properly
- worked some
- release InDesign tools Jan. 30.
- almost
- work on Tromsø Sami workshop paper
- not done
- final changes and bug fixes by Thursday afternoon
- done
- fix bugs!
- worked some
Tomi
- Hunspell lexicon conversion
- not done
- document how compounding is controlled in the PLX conversion
- not done
- release InDesign tools Jan. 30.
- done
- work on Tromsø Sami workshop paper
- not done
- debug %> problem in Hunspell conversion
- done
- fix double hyphen bugs
- worked on it
- new lexicons by Tuesday
- done
- final changes and bug fixes by Thursday afternoon
- done
- final lexicons by Friday morning
- done
- fix bugs!
- done
Trond
- Report the smesmj project
- Start working on the samdoc talk
- sme->smj lexicon conversion to build bilingual lexicon resources
- Reorganise documentation (with Børre and Sjur)
- Gather sma texts (with Børre and Sjur)
- Look at the sma source files (with Sjur)
- Name lexicon project: Test editing xml files (when they are ready for it)
- Make a first sma project plan
- work on Tromsø Sami workshop paper
- fix bugs!
Pedagogical software online
TODO:
- Setting up the user documentation with an external address, and
- get an easy-to-remember URL (UiT/IT)
- More thorough skin, layout, ... (External person within the Ped team,
Workshop in Tromsø, end of February
TODO:
- Presentation of our work
- Basic tools (Sjur, Trond, Thomas)
- Applications (Lene, Sjur)
- Corpus infrastructure (Børre, Saara, Sjur)
- Overall infrastructure ("Makefile") (Sjur, Tomi)
- Plans for future work (Sjur, Trond)
- Relevance for other projects
- Standard written language texts (Trond)
- Existing written dialect texts (Lene, Trond)
- Existing dialect recordings (Lene)
- Turn the text into slides (samdoc08.tex into samdoc08-sem.tex) (Trond)
Documentation
TODO:
- start to reorganise the documentation (Børre, Sjur, Trond)
Corpus gathering
TODO:
- follow up on the smj texts from Kurt Tore (Per-Eric)
- get texts from Sigga Tuolja Sandstrøm (Per-Eric)
- gather sma texts (Børre, Sjur, Trond)
- Put the list of possible corpus sources into a document
Infrastructure
TODO:
- add Jabber account in iChat (all)
- improve forrest stability with i18n, site look (Børre, Sjur, Tomi)
- set up the Leopard Server features for collaborative support - permanent chat
Linguistics
North Sámi
Hyphenation bugs still there, now properly documented by the improved test
Lule Sámi
Hyphenation: same as for sme.
TODO:
- sme->smj lexicon conversion to build bilingual lexicon resources, and
- Add the words when all words are ready.
South Sámi
TODO:
- check the present sources (Sjur, Trond)
Name lexicon infrastructure
TODO:
- fix i18n bug in risten.no/G5 (so they will work without the proper locale
- it works ok locally, set-up / config needs to be checked on the G5; probably
- looked at it
- fix display in column 3 (Sjur)
- it works in Firefox and other Mozilla-based browsers; not in Safari and other
- looked at it
- fix bugs in lexc2xml; add comments to the log element (Saara)
- finish first version of the editing (Sjur)
- test editing of the xml files. If ok, then: (Sjur, Thomas, Trond)
- make terms-smX.xml <=== automatically from propernoun-sme-lex.xml (add nob as
- convert propernoun-($lang)-lex.txt to a derived file from common xml files
- implement data synchronisation between risten.no and
- start to use the xml file as source file
- clean terms-sme.xml such that all names have the correct tag for their use
- merge placenames which are erroneously in different entries: e.g. Helsinki,
- publish the name lexicon on risten.no (Sjur)
- add missing parallel names for placenames (linguists)
- add informative links between first names like Niillas and Nils
Proofing tools
Hunspell
The %> marker does not survive into Hunspell to work as a boundary marker,
TODO:
- debug the missing > marker - the problem is on the Java side (Børre, Tomi)
- add smj to the soup, make sure it works roughly as well as sme
- fix the remaining conversion bugs for sme
- return to smj, and fix whatever is left to fix
- integrate the derivations as separate "continuation lexicons"
Testing
Spelling Error Markup
TODO:
- Set up ways of adding meta-information (source info, used in testing or not,
- test new and nested error markup (Sjur)
Speller bugs
Open issues based on test results:
sme
- 425 - roman number - will not be fixed in 1.0 release - FIXED
- 426 - comp words from Divvun.no - guoktedássásaš accepted - still open
- 536 - speller accepts "impossible" compound-forms, geažideapmigárvu and
- 593 - missing words in beta2 - FIXED
- 595 - prefix+name without hyphen (ovdaLot instead of ovda-Lot)
- 597 - does not recognize nubbelohki - FIXED
- 603 - suomabealdi, norggabealdi accepted
- 606 - speller accepts VUOHTA compound
- 611 - double hyphen sugg still accepted
- 613 - short gen. as second compound part
- 619 - REGRESSION: - numerals and pronouns to NAMÁK and SASJ fails
- 625 - word+footnote - possibly Polderland or MS, or a consequence of allowing
- 627 - prefix + hyphen does not get accepted
- 629 - a taking part in compounding without hyphen
- 631 - numbers starting with 0 - FIXED
- 633 - double hyphens accepted in Word, not by cmdline speller
- 634 - PropGen+hyph+PropGen
- 637 - nai(go) becomes -naj(go) - FIXED
- 641 - numeral+noun compounds
- 642 - noun/adj/proper + hyphen + ain
smj
- 482 - Nuorttalijguovlojn accepted again
- testcase changed, test PASSED
- 607 - acro + hyphen, NRKGA accepted - test pair is wrong, should be corr.
- 615 - actio and actor compounds - FIXED
- 616 - Bispadime-me-ráden - still OPEN
- 618 - dipht. simpl. - FIXED
- 619 - REGRESSION: - numerals and pronouns to NAMÁK and SASJ fails
- 629 - a taking part in compound - still OPEN
- 631 - number compounds starting with 0 - FIXED
- 634 - Prop gen + hyphen + Prop gen
- 641 - numeral+noun compounds
TODO:
- look at test cases still not behaving properly (Thomas, Tomi)
- document how compounding is controlled in the PLX conversion (Tomi)
Hyphenator bugs
Open issues based on test results:
sme
- 468 - Márkomenau -> Polderland
- 548 - duostan -> Polderland
- 549 - missing hyph at word boundary -> Polderland
- 633 - extra hyphen inserted -> Polderland
smj
- 549 - missing hyph at word boundary -> Polderland
- 633 - extra hyphen inserted -> Polderland
- 636 - hyphen before last char -> Polderland
InDesign tools
Near-final tools were released on Friday, Feb. 1, including working user
TODO:
- test twolc hash mark bug solution (Tomi, Trond, Sjur)
- done - it worked fine, and is the only possible solution due to special treatment of this bug in twolc
- fix double hyphen bugs (Tomi)
- new lexicons by Tuesday (Tomi)
- done
- updated Polderland tools by Wednesday (Sjur)
- done, delivered on Thursday
- final changes and bug fixes by Thursday afternoon (Thomas, Sjur, Tomi)
- done
- final lexicons by Friday morning (Tomi)
- done
Windows installer
TODO:
- investigate the NSIS installer (Børre, Sjur)
Releases
TODO:
- update the Changes document (Maaren)
- release InDesign tools Jan. 30. (Børre, Sjur, Thomas, Tomi)
- compile new lexicons (Tomi)
- done
- test (all)
- partially done
- document (Sjur)
- not really
- package and release (Sjur)
- done
Other
South Sámi project startup meeting
- in Snåsa
- 11th - 15th of Feb, kick-off meeting Wednesday 13.
- Participants: SD (incl. Divvun), Nord-Trøndelag fylkeskommune, Snåsa kommune,
We extend the meeting on our part, to have this project's first gathering.
Travel plans - arriving/leaving at Værnes:
- Børre:
- Maaren: Tuesday or Wednesday
- Per-Eric: 11:30 / xxx
- Sjur: Sunday /
- Svenne:
- Thomas:
- Tomi: Sunday /
- Trond: Sunday / Tuesday afternoon
Goal: to be at Værnes Friday afternoon around 14, targeting planes from
TODO:
- get hotel rooms (Sjur)
- done
- make a first sma project plan (Sjur, Trond)
Corpus contracts + open source
TODO:
- publish corpus contracts and project infra as open-source on NoDaLi-sta
Next meeting, closing
The next meeting is 11.2.2008 in Snåsa.
The meeting was closed at 10:40.
Appendix - task lists for the next five days
Børre
- start to reorganise the documentation
- gather sma texts
- improve forrest stability with i18n, site look
- set up the Leopard Server features for collaborative support
- Hunspell lexicon conversion
- InDesign documentation
- investigate the NSIS installer
- release InDesign tools Jan. 30.
- work on Tromsø Sami workshop paper
- fix bugs!
Lene
- Ped project
- work on Tromsø Sami workshop paper
Maaren
- Put the list of possible sma corpus sources into a document
- update the Changes document
Per-Eric
- check some unusual and missing words from the last Olavi missing list
- keep in contact with Kurt Tore's family about his texts.
- try to find other authors who have smj texts in digital form
- fix bugs!
Saara
- add new XSL/XML headers for proofing test docs
- Set up ways of adding meta-information for proofing correct corpus docs
- discuss more parallel texts
Sjur
- start to reorganise the documentation
- gather sma texts
- improve forrest stability with i18n, site look
- set up the Leopard Server features for collaborative support
- check the present sma sources
- name db/risten.no
- investigate the NSIS installer
- publish corpus contracts and project infra as open-source on NoDaLi-sta
- work on Tromsø Sami workshop paper
- fix bugs!
Thomas
- look at test cases still not behaving properly
- work on Tromsø Sami workshop paper
- fix bugs!
Tomi
- Hunspell lexicon conversion
- document how compounding is controlled in the PLX conversion
- work on Tromsø Sami workshop paper
- fix double hyphen bugs
- fix bugs!
Trond
- Report the smesmj project
- Start working on the samdoc talk
- sme->smj lexicon conversion to build bilingual lexicon resources
- Reorganise documentation (with Børre and Sjur)
- Gather sma texts (with Børre and Sjur)
- Look at the sma source files (with Sjur)
- Name lexicon project: Test editing xml files (when they are ready for it)
- Make a first sma project plan
- work on Tromsø Sami workshop paper
- fix bugs!