Oahpa Seminar2014 Eng


  1. Methodologies for distributed cooperation on language technology
  2. What is Oahpa - the pedagogical philosophy
  3. What linguistic content to include - and how to add it
  4. Practical work
  5. Project planning
  6. How to join Oahpa development with language research

Course planning day by day

Preparing before the course

  • Setting up user accounts, basic SVN.

Day 0:

  • Preliminary course in Unix, SVN, etc., for participants who have not used them before.
  • All participants should have checked out (at least) the ped catalogue and a working version of their own FST on their own machine (cf. the getting started page).
  • Basic SVN course: what it is, how to update, check in, etc.



  • Giellatekno overview (infrastructure, projects, tools, Oahpa) (Heli, Trond)
  • Presentations by the participants about their languages and existing resources (textbooks a.o. teaching materials, corpora, language technology tools)
  • Methodologies for distributed cooperation on language technology
    • If needed: Practical instruction divided into groups
      • Group 1: For the local guru (Heli)
      • Group 2: For the computer illiterate (Jack)
      • Installing, general help (Trond)


Start with Leksa



  • The key role of Leksa in Oahpa (these are the words we use for different purposes)
    • How to choose the vocabulary for Leksa. Textbook word lists, frequency dictionaries.
  • Creating word lists in csv format.
  • Checking in new files and updating the existing ones.
  • svn ci -> Heli updates the db -> online check.

Working versions:

  • myv 124
  • smn 1335
  • yrk 502
  • fkv 63 -- list forthcoming
  • kpv 142

New versions:

  • mrj 229
  • bxr 396
  • mdf 0 <---- lis
  • olo 302

Test files 396

Format for later additions:

lemma __ vartalo __ POS __ contlex __ trans1 __ trans2 __ trans3 __ kirja __ sem

(vartalo = stem, kirja = book; both field names are Finnish.)
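A line in this format could be read into named fields roughly as follows. This is a minimal sketch, not the actual Oahpa import script: it assumes the fields are separated by a literal double underscore, as written above, and that unused translation slots are left empty. The function name parse_entry and the sample values are hypothetical.

```python
# Field names taken from the format line above.
FIELDS = ["lemma", "vartalo", "POS", "contlex",
          "trans1", "trans2", "trans3", "kirja", "sem"]


def parse_entry(line, sep="__"):
    """Split one word-list line into a dict keyed by field name.

    Assumption: fields are separated by a literal double underscore
    and every line carries all nine fields (empty ones included).
    """
    parts = [part.strip() for part in line.split(sep)]
    if len(parts) != len(FIELDS):
        raise ValueError(
            f"expected {len(FIELDS)} fields, got {len(parts)}: {line!r}"
        )
    entry = dict(zip(FIELDS, parts))
    # Collect the non-empty translations in one list for convenience.
    entry["translations"] = [
        entry[key] for key in ("trans1", "trans2", "trans3") if entry[key]
    ]
    return entry


# Hypothetical placeholder values, not real lexicon data.
example = parse_entry(
    "lemma1 __ stem1 __ N __ LEX1 __ word __ __ __ book1 __ SEM1"
)
```

Raising an error on a wrong field count makes malformed lines visible before they are checked in, rather than after the database update.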


  • Drafting the case/number and person/number/tense forms to be included (participants should have thought this through beforehand)
  • Evaluating the fsts
  • Setting up the infrastructure for the respective exercises
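When drafting the case/number forms, it can help to write out the full inventory of combinations before deciding which ones to keep. The sketch below is only an illustration: the case list is a hypothetical placeholder to be replaced with the cases of your own language, and the tag layout (N+Sg+Nom) is an assumption that should be adjusted to whatever your FST actually uses.

```python
from itertools import product

# Hypothetical inventory; replace with the cases and numbers
# of your own language.
CASES = ["Nom", "Gen", "Ill"]
NUMBERS = ["Sg", "Pl"]


def noun_form_tags(cases=CASES, numbers=NUMBERS):
    """Return one tag string per number/case combination.

    The "N+Number+Case" layout is an assumption for illustration.
    """
    return [f"N+{number}+{case}" for number, case in product(numbers, cases)]


for tag in noun_form_tags():
    print(tag)
```

The same approach extends to the person/number/tense forms of verbs by adding further lists to the product.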

Homework for Heli:

  • Set up Morfa-S for the existing files.
  • Homework by Thursday:
    • Think about the productive Morfa-C frames in your language: which cases, which verbs?




  • Evaluating existing Numras
  • How to improve them:
    • Learning how to correct automata
    • Learning how to extend Numra to ordinals, dates, and clock times.

Leksa, Morfa-S

  • Continuing work



  • Setting up Morfa-S. Case list, possible additional menus.
  • Writing Morfa-C frames.
  • Extending Leksa with place names: the names that are different in the indigenous language.


  • General discussion
    • New ideas, thoughts that have come up while implementing Oahpa for your language.
    • Discussing how to integrate Oahpa in language courses. Presenting kursa
  • Workshop: writing grant applications
    • Background: Different groups plan to apply for funding
    • We either brainstorm or (if possible) work in groups on concrete proposals
  • Summing up and future work
    • How to proceed with the development of your Oahpa.