algorithm_for_dict_turning

Algorithm for turning Oahpa entries

  1. revert all <l> - <t(f)> pairs, for which the tg HAS a sematics element.
  2. unify for identical instances of former <t(f)>, now <l>
  3. remove all <l> (i.e., <e>) for <l> with no stat="pref" tag
  4. keep the rest of the Norwegian lemmas
  5. remove multiple stat="pref" tags manually (Comment: This can't be the case!)

Issues

  1. what about attribute oa?
  2. what about lacking this attribute?
   <e>
      <lg>
         <l pos="a">aales</l>
      </lg>
      <apps>
         <app name="oahpa">
            <sources>
               <book name="åa5" />
            </sources>
         </app>
      </apps>
      <mg>
         <tg>
            <semantics>
               <sem class="NATURE_A" />
               <sem class="HERDING" />
            </semantics>
            <t dict="yes" oa="yes" pos="a" stat="pref" tcomm="no" xml:lang="nob">rovdyrfritt</t>
            <tf dict="no" oa="yes" pos="phrase_a" tcomm="no" xml:lang="nob">fredelig for rovdyr</tf>
            <tf dict="no" oa="yes" pos="phrase_a" tcomm="no" xml:lang="nob">fred for rovdyr</tf>
            <tf dict="no">fritt for rovdyr</tf>
         </tg>
         <tg xml:lang="swe">
            <t dict="yes" oa="yes" pos="a" tcomm="no" xml:lang="swe"></t>
         </tg>
      </mg>
   </e>

Principle for dictionary conversion.

  1. Turn smanob to nobsma
  2. Thereafter, manually remove superfluous <t> synonyms in smanob tg-s
  3. in the new nobsma, unify according to lemma
  4. for lemma pairs with conservative/radical Bokmål, pict the (e.g.) conservative as the lemma, and make a pointer from the radical one

Cf. the following example:

  
vatn -> vann  

vann
	1. (væske) tjaetsie
	2. (innsjø) jaevrie
	
jaevrie
	innsjø
	
tjaetsie
	(væske) vann