Meeting_2005-06-13
Meeting setup
- Date: 13.06.2005
- Time: 10.00 Norw. time
- Place: Wherever we are : -)
- Tools: Phone, iChat, SubEthaEdit
Agenda
- Opening, agenda review
- Reviewing the task list from a week ago
- Documentation - divvun.no
- Corpus gathering
- Corpus infrastructure
- Linguistics
- Term db
- Other issues
- Summary, task lists
- Closing
1. Opening, agenda review, participants
Opened at 10.00. Agenda accepted as is.
Present: Børre, Maaren, Sjur, Thomas, Tomi, Trond
Main secretary: Tomi
2. Reviewing the task list from the last meeting
Børre
- Contact Kevin Scannell of an Crúbadán fame, ask if we can use his north sami material.
- We can get his material as either a word list, or a sentence-wise corpus
- He wanted to continue the cooperation (initiated last year, it seems)
- We can get his material as either a word list, or a sentence-wise corpus
- Contact Roy Dragseth about cvs-mailing
- He will do it this week
- He will do it this week
- Work on divvun.no
- Waiting for Tomcat to be upgraded
- Sysadm back from vacation today
- Waiting for Tomcat to be upgraded
- Continue http: //giellatekno.uit.no conversion to forrest ...
- Not a lot of work has been done, but an alpha version is running.
- Not a lot of work has been done, but an alpha version is running.
- Contact Svenska bibelsällskapet
- Nothing done
- Nothing done
- Add issue to forrest issue tracker about utf-8 ihtml documents.
- Check for existing bug reports, it might already be in there
- Nothing done
- Nothing done
- Check for existing bug reports, it might already be in there
- Continue forrest i18n research
- Sometimes abides the =locale=se/smj/fi in the url (the document gets translated). The
- Will contact che-che who is actively pursuing i18n in forrest.
- Sometimes abides the =locale=se/smj/fi in the url (the document gets translated). The
Maaren:
- continue missing list
- find and prepare issues for the language board meeting
- translate Helsinki contract into rough Norwegian; send it to Sjur,
- Asked Elli-Kirsti Nystad to do the translation and she has promised to translate it into Norwegian
Sjur:
- Check the translation of the Helsinki contract with Maaren; forward the task
- n/a
- n/a
- Ask Anne-Britt to update the contract with the University (we are getting 3, not
- not yet done
- not yet done
- write the LIST option to our test tools, as requested in the discussion group.
- not done
- not done
- complete the action summary after our half-year evaluation
- not done
- not done
- fix risten.no bugs
- Took most of the week, many fixes, 9 open.
- Took most of the week, many fixes, 9 open.
- contact Kimmo Koskenniemi about kwic-snt and UTF-8 bug
- not done. We all miss it.
- not done. We all miss it.
- voice group-chat not working to Sámediggi
- Leif Åge has been on vacation
- Leif Åge has been on vacation
- add i18n bug to Forrest issue tracker
- not done
- not done
- add i18n menu as feature request to Forrest issue tracker
- not done
Thomas:
- Begin to prepare issues to giellalávdegotti čoahkkin
- Prepared issues and sent draft to Maaren
- Prepared issues and sent draft to Maaren
- Present issues in dicussion forum
- Done
- Done
- Finish the work on verb valency
- Finished - Congratulations!!!
- Finished - Congratulations!!!
- Start to look at Lule Sámi
- Started
Tomi:
- Start looking at the smj infrastructure, turn it into utf-8, look at whether updates
- Done
- Done
- update catxml to remove sections in unwanted language(s)
- Some problems getting the removal to work
- Some problems getting the removal to work
- Look into shortening of three-part compounds, middle part
- nothing done
- nothing done
- decapitalisation of proper nouns when (if?) compounded, and when derived
- nothing done
Trond
- translate Helsinki contract into rough Norwegian; send it to Sjur,
- Totally forgotten. Did I ever receive it?
- No, I decided to wait till Maaren was back, which was good, since she had already
- No, I decided to wait till Maaren was back, which was good, since she had already
- Worked on linguistics, on bugs, to a certain extent on documentation.
- Totally forgotten. Did I ever receive it?
All:
- The www.divvun.no release.
- The Bugzilla bugs, and a quip or two?
- continue the discussion on the language policy of divvun.no in the newsgroup, esp.
- How much in 5 languages?
- Size of general public info: Trond: only a few paragraphs - about the project, and the
- what the divvun project is
- why it is made, by whom, on whose decision
- what we make, (why we do not make sma)
- how we make it
- how, when, and for whom it eventually will be used, and on what terms and price
- an explanation of our lg policy, and a good set of links.
- what the divvun project is
- should news be in all languages? What about RSS feeds from Forrest?
- How much in 5 languages?
3. Documentation - divvun.no
Deadline 22.6.
Børre will be the main responsible for getting it all in place till the deadline.
Things required before the opening:
- Press release text, conforming to what we actually release (in 3 languages? More?)
- divvun.no front pages up and running, with a nice look and interlinking in place, as
- front page and general info in all languages
- general outline of the project, with deliveries and main timetable
- giellatekno.uit.no front pages up and running, with a nice look and interlinking in
- The actual content of the documentation, our common child, should be gone through with
Requirements for the opening:
- i18n in place? What if we can't get it to work? Subtabs for each language. The submenus
- frontpage contains the points outlined at the last meeting. Cf. this list:
- Information in 5 languages (sme, smj, nob, fin, eng)
- Size of general public info: only a few paragraphs. It should include:
- what the divvun project is
- why it is made, by whom, on whose decision
- organization (project board, part of GIO, with UiT, what else?)
- how, when, and for whom it eventually will be used, and on what terms and price
- what we make
- why we do not make sma (background on separate page, in nob and sme)
- timetable (separate page, all langs.)
- how we make it (separate page, all langs)
- an explanation of our lg policy (separate page, all)
- a good set of links, incl our own newsfeed (RSS)
- what the divvun project is
- Information in 5 languages (sme, smj, nob, fin, eng)
The separate pages:
- The intro page (all lgs)
- sma (in nob and sme)
- timetable (all langs.)
- how we make it (all langs)
- an explanation of our lg policy (all lgs)
Deadlines:
- all docs, orig lang: Wednesday, 15.6.
- translations: Friday, 17.6.
- Alpha site (Tomcat running, present version of the site): Wednesday, 15.6.
- Beta site (all docs in place): Monday, 20.6.
- QA, checking links, i18n, content etc.: Tuesday, 21.6.
- Opening: Wednesday, 22.6.
Authoring
- The intro page: Sjur x5
- sma omission background: Trond x2
- timetable: Sjur x5
- how we make it: Børre x5
- an explanation of our lg policy: Sjur x5
Translations
- eng: Sjur, others?
- sme: Børre (Maaren proofreads)
- smj: Thomas, proofread by e.g. Kåre Tjikkom
- fin: Maaren
Files
Børre will make document stubs in the correct places, with the correct filenames.
4. Corpus gathering
The license contract is being translated. Maaren will check today whether it is finished.
5. Corpus infrastructure
catxml: some problems with section extraction/deletion. Tomi will continue the work on it.
We need routines for handling other formats than Microsoft Word:
- Clean text
- html
- pdf, if possible
- ideosyncratic formats
6. Linguistics
We should postpone the substantial issues to after the divvun release. The bug list should still be followed.
7. Term db / risten.no
Preparing the release, solving bugs.
8. Other issues
Bugzilla for sme and smj parsers.
When we set up Bugzilla, we didn't think of the fact that there would be two languages in
Here comes an overview of status quo, with the sme/smj relevance indicated.
- Corpus
- smj
- sme
- smj
- Disambiguation
- sme disambiguation rules
- smj disambiguation rules
- sme disambiguation rules
- Documentation
- no point in distinguishing between sme and smj here (?), as the lang catalogue is a
- no point in distinguishing between sme and smj here (?), as the lang catalogue is a
- Hyphenation
- Language division postponed
- Language division postponed
- Infrastructure
- lg-independent
- lg-independent
- Lexicon
- sme lexicon
- smj lexicon
- sme lexicon
- Morphophonology
- sme grammar
- smj grammar
- sme grammar
- Preprocessor
- The preprocess file is language independent. The files abbr.txt are language dependent.
- The preprocess file is language independent. The files abbr.txt are language dependent.
- risten.no
- N/A
- N/A
- Spell checking
- We postpone the inner structure of this one, and the question on whether it should be one product or not.
- We postpone the inner structure of this one, and the question on whether it should be one product or not.
- Test
- Largely language independent.
- Largely language independent.
- Tools (Programs other than the analysers and the preprocessors)
- Language independent.
Suggestion: bottom level (components). Agreed.
The sme/smj issue will be handled as indicated, by splitting products according to language
Language(s) of RSS Newsfeed
Forrest can generate RSS news for us, which we will use to broadcast project events. What
- Language
- Interested persons and institutions: e.g. Audun Lona, project board, Skuvlalinux, all beta testers
- If monolingual: nob or sme? If multilingual
- sme for linguistic issues, sme and nob for other issues
- sme for linguistic issues, sme and nob for other issues
- Interested persons and institutions: e.g. Audun Lona, project board, Skuvlalinux, all beta testers
- Content Type
- Major updates to the docs
- test releases (Alpha, Beta, etc.)
- ...
- Major updates to the docs
To be decided by next meeting at latest. We (Børre/Sjur) need to find out what is possible in Forrest.
9. Summary, task list
All:
- The divvun.no release.
- authoring, translations, techn. setup
Børre
- Coordinate the release and the tasks ahead of the release.
- Tomcat/Divvun setup
- Authoring "How we make it"
- i18n issues in Forrest
- Translate Divvun docs to English (with Sjur)
Maaren:
- divvun.no
- Proofread sme translation
- Proofread sme translation
- Linguistics
- The catch 20 list for the lg board
- in CVS under doc/lang/sme/norm_issues.xml
- The catch 20 list for the lg board
Sjur:
- Author divvun.no pages for the opening
- Fix risten.no bugs, prepare that opening
- Translate Divvun docs to English (with Børre)
Thomas:
- Work with lule sami
- buy books - dictonaries and grammars to Romssa kántuvra
- Write documentation
- smj divvun.no-translation, to be proofread later by e.g. Kåre Tjikkom
Tomi:
- Common Makefile for all the languages
- catxml language removal
- Look into shortening of three-part compounds, middle part
- decapitalisation of proper nouns when (if?) compounded, and when derived
Trond
- divvun.no release
- Documentation: Work with Børre on the disamb documentation.
- Write the sma omission text, perhaps look at the translation to English
- Technical documentation: Read through, look at omissions
- Documentation: Work with Børre on the disamb documentation.
- Linguistic work
- Back up Thomas on smj.
- Make a normativity-issues.xml in the doc/lang/sm{ej}/ directories.
- Back up Thomas on smj.
10. Next meeting, closing
20.06.2005 10.00
Closed at 12.08