
Vocabulary games

We would like to make a set of vocabulary games. The idea is to divide the vocabulary by part of speech: adjective, verb, noun, and adverb.

Outline

The vocabulary lists may be compiled as follows:

  1. Find the 1000 most common Norwegian nouns and verbs, and the 500 most common adjectives and adverbs.
  2. Translate them by using the ismenob.fst tool (words/dicts/smenob/ismenob.fst).
    1. Problem: ismenob.fst is not very good, since it is just the inverse of smenob.fst. A better tool can be made by converting the underlying XML document from smenob.xml to ismenob.xml, and then building a nobsme.fst from that.
    2. Eventually we would like to take a Sámi frequency list as the starting point, and translate via smenob.fst.
  3. Check whether important Sámi words are missing, and add them.
  4. Divide the word pairs into three groups according to frequency.
  5. In order to be able to reverse the game, we actually need two lists, one sme-nob and one nob-sme, so that variation is limited to one-to-many rather than many-to-many.
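
Steps 4 and 5 could be sketched roughly as below. This is only a minimal illustration, assuming the word pairs are already available as (sme, nob) tuples sorted from most to least frequent. The function names and the sample data are hypothetical and not part of any existing tool.

```python
from collections import defaultdict


def split_by_frequency(ranked_pairs, n_groups=3):
    """Split a list sorted from most to least frequent into n_groups levels."""
    size, extra = divmod(len(ranked_pairs), n_groups)
    groups, start = [], 0
    for i in range(n_groups):
        # The first `extra` groups get one additional item each.
        end = start + size + (1 if i < extra else 0)
        groups.append(ranked_pairs[start:end])
        start = end
    return groups


def build_direction_lists(pairs):
    """Return (sme_to_nob, nob_to_sme): one source word -> all its translations."""
    sme_to_nob = defaultdict(list)
    nob_to_sme = defaultdict(list)
    for sme, nob in pairs:
        if nob not in sme_to_nob[sme]:
            sme_to_nob[sme].append(nob)
        if sme not in nob_to_sme[nob]:
            nob_to_sme[nob].append(sme)
    return dict(sme_to_nob), dict(nob_to_sme)


# Illustrative sample data only (sme, nob), in an assumed frequency order.
pairs = [
    ("viessu", "hus"),
    ("dállu", "hus"),
    ("stohpu", "hus"),
    ("beana", "hund"),
    ("guolli", "fisk"),
]

levels = split_by_frequency(pairs)
sme_to_nob, nob_to_sme = build_direction_lists(pairs)
```

Keeping one dictionary per direction means each game only ever looks up a single source word and gets back the full list of acceptable answers.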

The target program should have the following characteristics:

  1. The user chooses POS, level and direction (nob-sme or sme-nob)
  2. The machine draws an arbitrary set of 10 words
  3. The machine presents the source words, one by one
  4. For each word, the user provides a translation
  5. The machine reacts in the usual way (green = ok, red = wrong, or similarly)
  6. After a completed set of 10, the machine gives statistics and some encouraging words
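
The flow above could look something like the following sketch. It assumes the chosen POS, level and direction have already been resolved into a single dictionary mapping each source word to its accepted translations. The names `play_round` and `summary` are hypothetical, and the green/red feedback is reduced to a boolean per word that a user interface could colour.

```python
import random


def play_round(lexicon, answer_fn, size=10, rng=None):
    """Draw up to `size` source words, ask for each one, return (score, results)."""
    rng = rng or random.Random()
    words = rng.sample(sorted(lexicon), min(size, len(lexicon)))
    results = []
    for word in words:
        answer = answer_fn(word).strip().lower()
        accepted = {t.lower() for t in lexicon[word]}
        # True would be shown as green, False as red.
        results.append((word, answer, answer in accepted))
    score = sum(ok for _, _, ok in results)
    return score, results


def summary(score, total):
    """Very simple statistics plus an encouraging phrase."""
    praise = "Well done!" if score == total else "Keep practising!"
    return f"{score}/{total} correct. {praise}"
```

For an interactive terminal session, one could pass `lambda word: input(word + ": ")` as `answer_fn`. Passing the answer function in as a parameter also makes the round easy to test without a user.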

In addition to the frequency-based games, we should have games organized by semantic class: body parts, food, drinks, professions, etc.

Proper name drill

We now have an overview of parallel names in different municipalities:

  • A list showing the number of parallel names per municipality is found here.
  • The names are found in the words/terms/ folder

Discussion

  • What do you think?
  • We need a spec for the format of the lists. While waiting, smeTABnobCR should be ok.
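
Until a proper spec exists, reading the interim format could be sketched as follows. This assumes "smeTABnobCR" means one pair per line, with the Sámi word, a tab, the Norwegian word, and a line ending. The function name `parse_pairs` is hypothetical.

```python
def parse_pairs(text):
    """Parse 'sme<TAB>nob' lines into a list of (sme, nob) tuples."""
    pairs = []
    # splitlines() accepts CR, LF and CRLF line endings alike.
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) != 2:
            raise ValueError(
                f"line {line_no}: expected 2 tab-separated fields, got {len(fields)}"
            )
        sme, nob = (field.strip() for field in fields)
        pairs.append((sme, nob))
    return pairs
```

Rejecting malformed lines with the line number makes it easier to clean up lists that were edited by hand.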

Lene: Good idea, but we have to remember synonyms and variant forms. We have to build the word list so that the machine accepts all the synonyms, not only a one-to-one match.

nob	sme
hus	viessu, dállu, stohpu
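
One way to accept any of the synonyms is sketched below, assuming each source word maps to a list of accepted translations. The helper names are hypothetical. The Unicode normalization is an added assumption: it guards against letters such as "á" being typed as a base letter plus a combining accent.

```python
import unicodedata


def normalize(word):
    """Compose accents (NFC), trim whitespace, and ignore case."""
    return unicodedata.normalize("NFC", word).strip().casefold()


def is_accepted(answer, translations):
    """True if the answer matches any of the accepted translations."""
    return normalize(answer) in {normalize(t) for t in translations}


# The example entry from the discussion above.
nob_to_sme = {"hus": ["viessu", "dállu", "stohpu"]}
```

With this, `is_accepted("Dállu ", nob_to_sme["hus"])` is true, while an answer that is not in the list is rejected.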

It could also be pedagogically useful to make thematic groups of words, instead of only using the 1000 most common words. Then we could also use them as part of dialogues. If the dialogue takes place in a shop, there could also be a vocabulary game with things that are likely to be found in that shop (e.g. a grocer's).