algorithm_for_dict_turning
Algorithm for turning Oahpa entries
- revert all <l> - <t(f)> pairs, for which the tg HAS a sematics element.
- unify for identical instances of former <t(f)>, now <l>
- remove all <l> (i.e., <e>) for <l> with no stat="pref" tag
- keep the rest of the Norwegian lemmas
- remove multiple stat="pref" tags manually (Comment: This can't be the case!)
Issues
- what about attribute oa?
- what about lacking this attribute?
<e>
<lg>
<l pos="a">aales</l>
</lg>
<apps>
<app name="oahpa">
<sources>
<book name="åa5" />
</sources>
</app>
</apps>
<mg>
<tg>
<semantics>
<sem class="NATURE_A" />
<sem class="HERDING" />
</semantics>
<t dict="yes" oa="yes" pos="a" stat="pref" tcomm="no" xml:lang="nob">rovdyrfritt</t>
<tf dict="no" oa="yes" pos="phrase_a" tcomm="no" xml:lang="nob">fredelig for rovdyr</tf>
<tf dict="no" oa="yes" pos="phrase_a" tcomm="no" xml:lang="nob">fred for rovdyr</tf>
<tf dict="no">fritt for rovdyr</tf>
</tg>
<tg xml:lang="swe">
<t dict="yes" oa="yes" pos="a" tcomm="no" xml:lang="swe"></t>
</tg>
</mg>
</e>
Principle for dictionary conversion.
- Turn smanob to nobsma
- Thereafter, manually remove superfluous <t> synonyms in smanob tg-s
- in the new nobsma, unify according to lemma
- for lemma pairs with conservative/radical Bokmål, pict the (e.g.)
conservative as the lemma, and make a pointer from the radical one
Cf. the following example:
vatn -> vann vann 1. (væske) tjaetsie 2. (innsjø) jaevrie jaevrie innsjø tjaetsie (væske) vann

