2014-04-08
Project Planning
Participants:
- Hanna
- Michael
Place:
- Freiburg
Time:
- 8 April 2014, 1–3pm
Topic 1: presentation of planned pilot study by Hanna
- Northern Kurdish, ~14,000 words, 7 texts recorded ~50–100 years ago, transcribed and translated (no original audio)
- creating ELAN files of the digitized transcription, divided into sentences (work finished)
- manually annotating according to GRAID conventions (work started)
- analyses (work planned for April)
Topic 2: discussion
- possible improvements to the ELAN tier structure (following other Freiburg corpus projects)
- possible workflows for collaborative work between Freiburg, Bamberg, Tromsø
- language for the second pilot study will be Skolt Saami (Lagercrantz texts digitized by Julia)
Open questions:
- where to store our working copies of the ELAN files: SVN, GitHub, TLA, or other? The preliminary solution is Dropbox.
- can we use computational linguistic tools for (semi-)automatic annotation of Kurdish?
- which other Kurdish languages to include in the final study?
- which other Saami languages to include in the final study?
Tasks:
- Hanna checks copyright issues for the Kurdish texts
- Hanna checks whether any computational linguistic tools are available for Kurdish
- Micha adjusts the ELAN tier structure once the annotations of the first file are completed
Next meeting:
- 30 April, Freiburg (RegEx training together with members of the other Freiburg corpus projects)