Meeting_2005-04-18
Contents:
- Meeting setup
- Agenda
- Task list since last meeting:
- 1. Opening, agenda review, participants
- 2. Reviewing the task list from the last meeting
- 3. Documentation - divvun.no
- 4. Corpus gathering
- 5. Corpus infrastructure
- 6. Linguistics
- 7. Term db
- 8. Other issues
- 9. Summary, task lists
- 10. Next meeting, closing
Meeting setup
- Date: 18.04.2005
- Time: 11.00 Norw. time
- Place: Wherever we are : -)
- Tools: Phone, iChat, SubEthaEdit
Agenda
- Opening, agenda review, participants
- Reviewing the task list from a week ago
- Documentation - divvun.no
- Corpus gathering
- Corpus infrastructure
- Linguistics
- Term db
- Other issues
  - Mouse problems
  - Job positions
  - Upgrade to 10.4
  - Physical meeting?
  - Vacation
- Summary, task lists
- Closing
Task list since last meeting:
- Tomi: Move the corpus discussion from newsgroup to forrest
  - has been doing XSLT for docbook to our xml format (docbook2sme.xsl)
  - awaiting final XML format
- Thomas: continue work with verb transitivity. Contact Olavi about
- Maaren: Will work with this project, starting tomorrow.
  - I'll contact Trond and Thomas (telephone meeting Tuesday 12.4.)
- All: discuss our own corpus format in the newsgroup - to be continued.
  - deadline: 25.4.
- Tomi and Børre: identify the input encodings antiword can handle
  - Tomi makes a Perl script, Børre helps out on that one.
- Børre: divvun.no:
  - to set up the documentation and integration with CVS
  - Coordinate with Sjur as well
  - Still missing a project e-mail address. Børre contacts Leif Åge.
- Sjur, Børre: terminology database
  - Still working on it...
- Børre: should make the How-To tab appear as intended:
  - still to be done
- Børre: wiki - will follow up on the UTF-8 problem
  - temporary workaround: use 8-bit encoding for the memos (=MacRoman works correctly)
  - convert all memos to MacRoman
- Trond, Børre, all: fix the links
  - Børre will check if there are still more link problems
  - Lots of links to .txt files are broken. Trond and Børre will take a look at that
- Børre: contact Min Áigi, discussing possible practical arrangements for cooperation.
1. Opening, agenda review, participants
Opened at 11.15 (more than 1 hour late due to an oversight by Sjur). Agenda accepted as is.
Present: Maaren, Sjur, Thomas, Tomi, Trond, Børre
Main secretary: Trond
2. Reviewing the task list from the last meeting
- Tomi: Move the corpus discussion from newsgroup to forrest
  - awaiting final XML format
  - discussion continues in the newsgroup: Tomi and Sjur have discussed it in the thread
- Thomas: continue work with verb transitivity. Contact Olavi about
  - has worked on the verb transitivity all week, and contacted Olavi about Lule Sámi.
- Maaren: Will work with this project, starting tomorrow.
  - I'll contact Trond and Thomas (telephone meeting Tuesday 12.4.)
- All: discuss our own corpus format in the newsgroup - to be continued.
  - deadline: 25.4.
- Tomi and Børre: identify the input encodings antiword can handle
  - Tomi makes a Perl script, Børre helps out on that one.
  - Done
- Børre: divvun.no:
  - to set up the documentation and integration with CVS
  - Contacted Thor Øyvind Johansen. Told about forrest, cochise. Discussed the
  - Coordinate with Sjur as well
  - Still missing a project e-mail address. Børre contacts Leif Åge.
- Sjur, Børre: terminology database
  - Still working on it...
- Børre: should make the How-To tab appear as intended:
  - still to be done.
  - This seems to be a fundamental problem in forrest; we have to return to it.
- Børre: wiki - will follow up on the UTF-8 problem
  - temporary workaround: use 8-bit encoding for the memos (=MacRoman works correctly)
  - convert all memos to MacRoman.
  - Sjur did this.
- Trond, Børre, all: fix the links
  - Børre will check if there are still more link problems
  - Lots of links to .txt files are broken. Trond and Børre will take a look at that
  - Nothing has been done on this issue.
- Børre: contact Min Áigi, discussing possible practical arrangements for cooperation.
  - We have received a very positive letter from Min Áigi. But, to receive texts
  - Metsähallitus (Finland) is waiting for a contract as well.
  - Thomas has forwarded a contract from Lantmäteriverket (Maanmittauslaitos)
  - Trond has sent Thomas the contract between Statens Kartverk and UIT:
  - Thomas has forwarded the contract to Maanmittauslaitos and they have written a licence
3. Documentation - divvun.no
We discussed whether to set up forrest as a standalone on the web server, or to make forrest
The task is to set up the server and to integrate it with the CVS repository.
The deadline for having the site up and running is the end of April.
4. Corpus gathering
Contract prototype! We have two models: Textlaboratoriet in Oslo, and the Helsinki model,
Børre: Contact Kimmo Koskenniemi & Ruth Vatvedt Fjeld
Deadline: The middle of May(?).
We should contact the writers' organisations and suggest that they recommend to
5. Corpus infrastructure
Location of the charset conversion Perl script
Tomi has made a Perl script to convert problematic charsets to UTF-8. Where do we store it? At present the CVS files live in gt/, and we plan to keep the corpus files outside CVS. The question is then where to put corpus-related files that are under CVS: in gt/ with the other CVS files, or in the corpus directory with the other corpus files. The discussion will continue in the newsgroup.
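As an illustration only, the kind of conversion discussed here can be sketched in Python. This is not Tomi's Perl script: the function names `to_utf8` and `convert_file` are made up for the example, and the minutes do not say which source charsets the real script handles. The sketch assumes the source encoding of each file is already known.

```python
def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Decode bytes from a known legacy encoding and re-encode them as UTF-8.

    Hypothetical helper for illustration; may raise UnicodeDecodeError
    if the bytes are not valid in the given source encoding.
    """
    text = raw.decode(source_encoding)
    return text.encode("utf-8")


def convert_file(src_path: str, dst_path: str, source_encoding: str) -> None:
    """Read a file in a legacy encoding and write a UTF-8 copy."""
    with open(src_path, "rb") as src:
        raw = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(to_utf8(raw, source_encoding))
```

For example, `to_utf8(data, "mac_roman")` would convert a memo stored as MacRoman; the encoding name is passed in rather than guessed, since reliable detection is a separate problem.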
6. Linguistics
Thomas: Nothing more than already reported. Work continues on verb transitivity.
Maaren: working on the missing list, which contains a great many misspellings.
/misspelledform/correctspelledform/frequency_of_misspelledform?/
There is a thread for this on the newsgroup, "Misspellings in the corpus material".
One frequent error type is a/á confusion. We could perhaps have a special
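A minimal sketch of how entries in the slash-delimited format noted above might be read, written in Python. It assumes the trailing question mark means the frequency field is optional, which the minutes do not confirm; the names `Misspelling` and `parse_entry` and the sample word pair are made up for the example.

```python
from typing import NamedTuple, Optional


class Misspelling(NamedTuple):
    misspelled: str
    correct: str
    frequency: Optional[int]  # None when no count was recorded


def parse_entry(line: str) -> Misspelling:
    """Parse one '/misspelled/correct/frequency/' entry (frequency optional)."""
    # Drop surrounding whitespace and the leading/trailing slashes, then split.
    fields = line.strip().strip("/").split("/")
    if len(fields) not in (2, 3) or not fields[0] or not fields[1]:
        raise ValueError(f"malformed entry: {line!r}")
    frequency = int(fields[2]) if len(fields) == 3 and fields[2] else None
    return Misspelling(fields[0], fields[1], frequency)
```

With this sketch, `parse_entry("/sami/sámi/12/")` gives a record with frequency 12, and `parse_entry("/sami/sámi/")` gives the same pair with no frequency.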
Compilation error
See bug #69. The parser did not compile on the 17th of April. Many of the errors
7. Term db
Deadline is end of April for the internal beta. Work is in progress, but not done.
8. Other issues
Job positions
- Sámi language technology position vacant in Tromsø (Nordlys today)
- Marit's position soon available for others to apply for
- There will in practice be money for one more position for large parts of the
Mouse problems
Mainly Sjur has problems, Tomi sometimes
Upgrade to 10.4
Leif Åge has ordered Tiger, and he will distribute the installation packages from
Things that we have modified may get broken as we move to 10.4. We should be aware
Physical meeting?
Vacation
To be discussed and decided in the next meeting.
9. Summary, task lists
TODO:
- All:
  - Follow up the corpus format discussion.
  - Decide upon vacation time
  - Views on physical meeting place (May meeting)
- Tomi:
  - Waiting for the corpus xml-format verification, to complete xml-conversion
- Børre:
  - Contact Kimmo Koskenniemi and Ruth Vatvedt Fjeld about contract
  - Follow up on the divvun.no issue, suggest tomcat and .war files
  - Work on the termdb
  - Find a solution for wiki and utf-8
  - Link problems in forrest
  - Contact Min Áigi, discuss technical details
- Maaren: work on the missing list. Anything else?
- Thomas: work with verb transitivity
- Sjur:
  - terminology db
  - text license contract
  - divvun.no
10. Next meeting, closing
25.04.2005 09.30
Closed at 13.00