Meeting_2005-02-28
Meeting setup
- Date: 28.02.2005
- Time: 09.30 Norwegian time
- Place: Wherever we are :-)
- Tools: Phone, iChat, SubEthaEdit
Agenda
- Opening, agenda review
- Reviewing the task list from a week ago (see below)
- Documentation:
- are we ready to remove the old HTML docs?
- is the Wiki format for meeting memos ok?
- Corpus gathering
- Corpus infrastructure
- Linguistics
- Language technology:
- compilation problems on cochise
- Term db
- Other issues:
- Pargas seminar: who is going?
- Summary, task lists
- Closing
Last week's task list:
- Børre: suggest port numbers in the documentation
- Continue .profile discussion in news, conclusion this week
- Børre: mpage & UTF-8
- Sjur: Post a description on how to patch XXE for Alt key support
- Børre: contact Anne-Britt today, and send the letter.
- Børre and Trond: update emacs and xml-mode, check automatic version-stamp
- Sjur: Send e-mail to Trond regarding binary files on the news group
- Børre: wiki support in forrest
- Trond: invite Lars N
- Børre + Trond: set up computers, check forrest in cochise
- Børre: divvun.no
- Tomi/Børre: do some work for Sjur on the termdb
1. Opening, agenda review
2. Reviewing the task list from a week ago (see below)
- Børre: suggest port numbers in the documentation
- Continue .profile discussion in news, conclusion this week
- Børre: mpage & UTF-8
- Sjur: Post a description on how to patch XXE for Alt key support
- Børre: contact Anne-Britt today, and send the letter.
- Børre and Trond: update emacs and xml-mode, check automatic version-stamp
- Sjur: Send e-mail to Trond regarding binary files on the news group
- Børre: wiki support in forrest
- Trond: invite Lars N
- Børre + Trond: set up computers, check forrest in cochise
- Børre: divvun.no
- Tomi/Børre: do some work for Sjur on the termdb
3. Documentation:
- Are we ready to remove the old HTML docs?
- Not until forrest works on cochise, no.
- Marit is fine with emacs in cochise, Børre will add info about XML
- Is the wiki format for meeting memos OK?
4. Corpus gathering
Thomas: Mikael Svonni has promised to give us his lexicon/dictionary. Asbjörg Skåden
Thomas and Trond have discussed a letter to the Swedish ministry of agriculture,
5. Corpus infrastructure
- dir/file structure
- some of the meta information
- processing of incoming texts
- For .doc input: antiword via DocBook to xml is a strong candidate
- For .pdf we have no good solution yet
- For .html input: Not looked at yet, but html > xml should be easy.
- For .txt: Here we are interested in using Lars Nygård's txt structure guesser.
- The encoding issue: We have identified between 5 and 10 different formats,
- Other formats: we don't know which yet
- Thus, the preferred formats so far are (in descending order): *.doc, *.html, *.txt, *.pdf
- target format (suggestions in newsgroup, this discussion is still open)
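The format preference noted in this section could be applied when sorting incoming texts for processing. The following is only a minimal sketch in Python: the helper name `rank_incoming` is hypothetical, and placing unknown formats last is an assumption, since the memo leaves "other formats" open.

```python
from pathlib import Path

# Preference order taken from the memo: .doc, .html, .txt, .pdf.
PREFERRED = [".doc", ".html", ".txt", ".pdf"]


def rank_incoming(filenames):
    """Sort file names by format preference; unknown formats go last.

    Hypothetical helper for illustration, not part of the project tools.
    """
    def key(name):
        ext = Path(name).suffix.lower()
        # Assumption: anything outside the preferred list ranks lowest.
        rank = PREFERRED.index(ext) if ext in PREFERRED else len(PREFERRED)
        return (rank, name)

    return sorted(filenames, key=key)
```

Calling `rank_incoming` on a mixed batch of file names returns the .doc files first and the .pdf files after the .txt files, with files of the same format in alphabetical order.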
6. Linguistics
String categories still to cover:
- html addresses, number formats, number-letter combinations "*CG2" as opposed
- These strings can be covered by regular expressions in lexc number generator,
- For HTML addresses, something like:
http://[a-z]+.[a-z0-9]+.{com,org,etc.}
- Trond files bug reports.
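The HTML address pattern suggested in this section can be tried out quickly outside lexc. The sketch below uses Python's `re` module purely for illustration (the memo's actual target is the lexc number generator). The list of top-level domains is an assumption standing in for "{com,org,etc.}", and the function name `is_web_address` is hypothetical.

```python
import re

# Assumption: these top-level domains stand in for the memo's
# "{com,org,etc.}"; extend the tuple as needed.
TLDS = ("com", "org", "no")

# Same shape as the memo's pattern, with the dots escaped so that they
# match a literal "." only.
URL_RE = re.compile(r"http://[a-z]+\.[a-z0-9]+\.(?:%s)" % "|".join(TLDS))


def is_web_address(token):
    """Return True if the whole token matches the URL pattern."""
    return URL_RE.fullmatch(token) is not None
```

Note that an unescaped "." in a regular expression matches any character, so a pattern written exactly as in the memo would accept more strings than intended if it were used as a Python regex.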
7. Language technology
- lexicon compiles fine on the Mac, but not in cochise
- Trond has written a letter to Ken and Lauri (the authors of the Xerox tools),
8. Term db
9. Other issues:
- Pargas seminar: who is going?
- Sjur and Børre go, one more (Tomi or Thomas)
- topics of the seminar:
- User, developer and normative body perspective.
10. Summary, task lists
- Sjur, Børre, Tomi: terminology database
- Børre: will set up a separate main tab How-To
- All: we move from .profile to .bash_profile; Børre: documentation to be updated
- Børre: mpage & UTF-8 - file a bug in Bugzilla, with the solution postponed
- Børre: update emacs and xml-mode: has a solution, will document
- Børre: check automatic version-stamp in cvs $Id$? As above.
- Trond: will follow up the issue with binary postings in news
- Børre: wiki support in forrest works, will follow up on the UTF-8 problem
- Børre + Trond: set up computers, check forrest in cochise
- Børre: divvun.no
11. Closing
Closed at 10.57