Converting Numbers To Words
This document looks at different ways of expanding numbers to words.
- Whole number converted to an integer: we have this.
- Number converted to a string of digits: we don't have this.
- Number converted as a type of compound: we have this for date and time. We need something similar for money, and to-from expressions.
(In English, there is also the possibility of pairs of digits, as in years or street addresses, but I have not found any examples of when that would be natural for North Sámi).
In Sproat et al. 2001 there is a taxonomy of non-standard words and how they should be expanded/converted to words. This is based on English. Using their taxonomy, this is my suggestion for North Sámi:
Number to integer
The integer solution should be our default for short numbers (fewer than four digits) when we cannot identify the type of number.
Type | Description |
---|---|
NUM | number (cardinal), for amounts, room numbers, book chapters, reindeer herding districts, university courses, law paragraphs |
NORD | ordinal numbers |
NADDR | street address: Davviluohkká 27: Davviluohkká guoktelogičieža |
" | post box: Poastaboksa 208: Poastaboksa guoktečuođigávcci |
PRCT | percentage: 32%: golbmalogiguokte proseantta |
NYER | years. We have some of this already in the date converter, but we need more for expressions like 1980-logut, 80-logut, 1900-logus, and also for case markings. In Sweden and Norway the years from 1100 to 1999 are read out as in the Scandinavian languages, with eleven hundred etc. |
Number to string of digits
The string of digits should probably be our default for longer numbers (more than three digits) when we cannot identify which type of number we have:
NDIG | expands to string of digits |
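The two defaults can be sketched as follows. This is only an illustration of the fallback rule stated in this document, not the existing converter: the function names `default_type` and `digits_to_words` are hypothetical, and the digit names are the ones that occur in the examples in this document.

```python
# Digit names as they appear in the examples in this document.
DIGIT_WORDS = {
    "0": "nulla", "1": "okta", "2": "guokte", "3": "golbma", "4": "njeallje",
    "5": "vihtta", "6": "guhtta", "7": "čieža", "8": "gávcci", "9": "ovcci",
}


def default_type(token: str) -> str:
    """Fallback class for a bare number whose type could not be identified."""
    if not (token.isascii() and token.isdigit()):
        raise ValueError(f"not a plain digit sequence: {token!r}")
    # Fewer than four digits: read as an integer (NUM).
    # Four digits or more: read digit by digit (NDIG).
    return "NUM" if len(token) < 4 else "NDIG"


def digits_to_words(token: str, sep: str = " ") -> str:
    """NDIG expansion: one word per digit, joined by `sep`."""
    return sep.join(DIGIT_WORDS[d] for d in token)


print(default_type("208"))      # NUM
print(default_type("9520"))     # NDIG
print(digits_to_words("9520"))  # ovcci vihtta guokte nulla
```

The NUM branch only names the class; the integer expansion itself is what we already have.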
Phone numbers
NTEL | phone numbers | Pause at white spaces |
- 776 46 742: čiežačiežaguhtta njealljeguhtta čiežanjealljeguokte
- 905 47 317: ovccinullavihtta njeallječieža golbmaoktačieža
If there is no white space, we will have to make default pauses.
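A minimal sketch of the phone number reading, assuming that a pause is written as a space between groups and that digits within a group are run together, as in the two examples above. The function name `expand_phone` and the default group size of two digits for unspaced input are assumptions made for illustration only; which default pauses sound natural still has to be decided.

```python
# Digit names as they appear in the examples in this document.
DIGIT_WORDS = {
    "0": "nulla", "1": "okta", "2": "guokte", "3": "golbma", "4": "njeallje",
    "5": "vihtta", "6": "guhtta", "7": "čieža", "8": "gávcci", "9": "ovcci",
}


def expand_phone(number: str, default_group: int = 2) -> str:
    """Read a phone number digit by digit, pausing at white space."""
    groups = number.split()
    if len(groups) == 1:
        # No white space in the input: fall back to fixed-size groups.
        # The group size is a placeholder, not a linguistic claim.
        digits = groups[0]
        groups = [digits[i:i + default_group]
                  for i in range(0, len(digits), default_group)]
    # One word per group; the space between groups stands for the pause.
    return " ".join("".join(DIGIT_WORDS[d] for d in group) for group in groups)


print(expand_phone("776 46 742"))
# čiežačiežaguhtta njealljeguhtta čiežanjealljeguokte
```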
Zip codes
NZIP | zip code | 9520 Guovdageaidnu: ovcci vihtta guokte nulla |
Some zip codes look like "clear" integers. We could also have them expand to integers, such as
- 9008 Tromsa: ovcciduhátgávcci Tromsa vs. ovcci-nulla-nulla-gávcci Tromsá
- 9100 Kvaløysletta: ovcciduhátčuođi Kvaløysletta vs. ovcci-okta-nulla-nulla
My experience is that people get addresses and phone numbers more easily when you read each digit separately.
Numeral to compound
- NTIME: we have this
- NDATE: we have this
- MONEY of any currency: kr 35: golbmalogivihtta ruvnno
- year + lohku: 1800-logus (we have this)
- number + jahkásaš: 20-jahkásaš, 5-mánnosaš (we have this for some)
- to + from expressions: 1200-1700, 35-45% (must be differentiated)
TODO:
Money
Money is more complicated than this in North Sámi because, as with months, we don't have a shorthand way of saying things using only numerals:
- kr 3,50: golbma ruvnno ja vihttalogi evrre
- kr 155,50: čuođivihttalogivihtta ruvnno ja vihttalogi evrre
What do we do with other currencies? Do we specify the units for each currency, like this:
- $56,50: vihttalogiguhtta dollara ja vihttalogi sentte
- £56,50: vihttalogiguhtta punda ja vihttalogi penni
- €56,50: vihttalogiguhtta euro ja vihttalogi sentte
Or do we just use čuokkis or rihkku:
- $56,50: vihttalogiguhtta čuokkis vihttalogi dollara
- £56,50: vihttalogiguhtta čuokkis vihttalogi punda
- €56,50: vihttalogiguhtta čuokkis vihttalogi euro
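The two options can be compared side by side with a small sketch. Everything named here is hypothetical: `expand_money`, the `per_currency` switch and the `cardinal` callback, which stands in for the integer conversion we already have. The unit and subunit forms are copied from the examples above, and the small lookup table of cardinals only covers numbers that occur in this document.

```python
import re
from typing import Callable

# Unit and subunit forms as they follow a numeral in the examples above.
CURRENCIES = {
    "kr": ("ruvnno", "evrre"),
    "$": ("dollara", "sentte"),
    "£": ("punda", "penni"),
    "€": ("euro", "sentte"),
}

# Currency sign, whole amount, optional two-digit fraction after a comma.
MONEY_RE = re.compile(r"^(kr|[$£€])\s*(\d+)(?:,\s*(\d{2}))?$")


def expand_money(token: str, cardinal: Callable[[int], str],
                 per_currency: bool = True) -> str:
    """Expand a money expression; `cardinal` spells out an integer."""
    match = MONEY_RE.match(token.strip())
    if match is None:
        raise ValueError(f"not a recognised money expression: {token!r}")
    symbol, whole, fraction = match.groups()
    unit, subunit = CURRENCIES[symbol]
    if fraction is None:
        return f"{cardinal(int(whole))} {unit}"
    if per_currency:
        # Option 1: name both the unit and the subunit.
        return (f"{cardinal(int(whole))} {unit} ja "
                f"{cardinal(int(fraction))} {subunit}")
    # Option 2: read the comma as čuokkis and name only the unit.
    return f"{cardinal(int(whole))} čuokkis {cardinal(int(fraction))} {unit}"


# Stand-in for the real integer converter: only numbers attested above.
ATTESTED_CARDINALS = {
    3: "golbma", 35: "golbmalogivihtta", 50: "vihttalogi",
    56: "vihttalogiguhtta", 155: "čuođivihttalogivihtta",
}
cardinal_from_examples = ATTESTED_CARDINALS.__getitem__

print(expand_money("kr 3,50", cardinal_from_examples))
print(expand_money("$56,50", cardinal_from_examples, per_currency=False))
```

The sketch leaves several things open: the unit form after the numeral 1, fractions with a leading zero in the čuokkis reading (for example ,05), and case marking of the whole expression.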
There is also another class, BMONEY, where you have millions and trillions:
- NOK 2.5 mill: guokte čuokkis vihtta miljovnna norgga ruvnno
Year
Concord rules are not the same in these structures as they are otherwise. The numeral is the first part of a compound, and should not change. Compare with golbmaoktalaš (Pekka Sammallahti, personal communication).
- 1600-lohku: guhttanuppelotčuohte-lohku (both are nominative)
- 1600-logu rájes: guhttanuppelotčuohte-logu rájes
- 1600-logus: guhttanuppelotčuohte-logus
- 1600-lohkui: guhttanuppelotčuohtelohkui
- 1600-loguin: guhttanuppelotčuohteloguin
- 1600-loguid: guhttanuppelotčuohteloguid
- 1600-loguide: guhttanuppelotčuohteloguide
- 1950-lohku: ovccinuppelotčuođivihttalot-lohku
- 1950-logu rájes: ovccinuppelotčuođivihttalot-logu rájes
- 1950-logus: ovccinuppelotčuođivihttalot-logus
- 1950-lohkui: ovccinuppelotčuođivihttalot-lohkui
- 1950-loguin: ovccinuppelotčuođivihttalot-loguin
- 1950-loguid: ovccinuppelotčuođivihttalot-loguid
- 1950-loguide: ovccinuppelotčuođivihttalot-loguide
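The point that the numeral stays unchanged while only lohku inflects can be sketched like this. The names `YEAR_STEMS`, `LOHKU_FORMS` and `expand_year_lohku` are hypothetical, the two stems are copied from the paradigms above rather than generated, and the sketch always joins the two parts with a hyphen, which is an assumption since the examples above vary on this.

```python
# Invariant first parts, copied from the paradigms above. A real system
# would generate them from the year instead of looking them up.
YEAR_STEMS = {
    "1600": "guhttanuppelotčuohte",
    "1950": "ovccinuppelotčuođivihttalot",
}

# Forms of lohku that occur in this document.
LOHKU_FORMS = {
    "lohku", "logu", "logus", "lohkui",
    "logut", "loguin", "loguid", "loguide",
}


def expand_year_lohku(token: str) -> str:
    """Expand e.g. '1600-logus'; the case ending comes from the input."""
    year, hyphen, head = token.partition("-")
    if not hyphen or head not in LOHKU_FORMS:
        raise ValueError(f"not a year + lohku expression: {token!r}")
    # The numeral part is never inflected; the head is copied as written.
    return f"{YEAR_STEMS[year]}-{head}"


print(expand_year_lohku("1600-logus"))    # guhttanuppelotčuohte-logus
print(expand_year_lohku("1950-loguide"))  # ovccinuppelotčuođivihttalot-loguide
```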
Number + jahkásaš etc:
To-From-expressions
Numbers separated by a hyphen
(This only includes expressions which are not separated by some word in the text, i.e., not things like Vázzen mánáidskuvlla Mázes 1960 čavčča rájes gitta 1966 geassái.)
Sometimes from-to expressions should be separated by some word, sometimes not. When the numbers refer to years, a word like gitta seems to be required. When the numbers refer to amounts, a word does not always have to intervene between them:
ja borrá 20 000 – 30 000 divresuosa ovtta geasis:
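Before a range can be expanded, its two numbers have to be found and the kind of range has to be guessed, since years and amounts are read differently. The sketch below is one possible way to do that. The function name `classify_range` is hypothetical, and the year test (two four-digit numbers between 1000 and 2100, written without internal spaces) is an assumption made for illustration, not a rule taken from our data.

```python
import re

# Two numbers (possibly with internal spaces, as in "20 000") separated by
# a hyphen or an en dash, optionally followed by a percent sign.
RANGE_RE = re.compile(r"^\s*(\d[\d\s]*?)\s*[-–]\s*(\d[\d\s]*?)\s*(%)?\s*$")

# Hypothetical bounds for "looks like a year".
YEAR_MIN, YEAR_MAX = 1000, 2100


def _looks_like_year(raw: str) -> bool:
    return len(raw) == 4 and raw.isdigit() and YEAR_MIN <= int(raw) <= YEAR_MAX


def classify_range(token: str):
    """Return (kind, start, end), where kind is 'NYER', 'PRCT' or 'NUM'."""
    match = RANGE_RE.match(token)
    if match is None:
        raise ValueError(f"not a hyphenated range: {token!r}")
    raw_start, raw_end, percent = match.groups()
    # Drop thousands separators written as spaces before converting.
    start = int("".join(raw_start.split()))
    end = int("".join(raw_end.split()))
    if percent:
        kind = "PRCT"
    elif _looks_like_year(raw_start) and _looks_like_year(raw_end):
        kind = "NYER"
    else:
        kind = "NUM"
    return kind, start, end


print(classify_range("1200-1700"))        # ('NYER', 1200, 1700)
print(classify_range("35-45%"))           # ('PRCT', 35, 45)
print(classify_range("20 000 – 30 000"))  # ('NUM', 20000, 30000)
```

A range classified as NYER would then get a linking word such as gitta in the expansion, while a NUM range may be read without one.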
Other things
Fractions:
1/4: | njealjádas, njealját-oassi |
3/4: | golbma njealjádasa, golbma njealját-oassi |
1/2: | bealli |
1½: | beannot, okta ja bealli |
+ case marking of these.
Types with case marking:
Type | Example | Expansion | Case |
---|---|---|---|
NUM | SAM-3014:s lea rievdan lohkanmearri. | SAM golbmaduhátnjealljenuppelogis | LOC |
NYER | Lean bargan dáppe 1991 rájes. | ovccinuppelotčuođiovccilogiovtta | GEN |
PRCT | Jávkan lea lassánan 32%:s 40%:i. | golbmalogiguovtti proseanttas njealljelot prosentii | GEN LOC ATTR ILL |
NDATE | Mii váldit vuostá studeantačállosiid 03.12 rájes | juovlamánu goalmmát beaivvi rájes | GEN |
NTIME | Mii fertet geargat 15:00 rádjái | golmma rádjái | GEN |
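When the case is written on the number itself, as in 32%:s or SAM-3014:s, the suffix after the colon can be split off before expansion. A minimal sketch, assuming only the two suffixes attested in the examples above (s for locative, i for illative); the names `CASE_BY_SUFFIX` and `split_case_suffix` are hypothetical. Cases that are governed by a following word, such as rájes or rádjái, are not visible on the token and are not handled here.

```python
import re

# Suffix letters attested in the examples above; other cases need more entries.
CASE_BY_SUFFIX = {"s": "LOC", "i": "ILL"}

# A base ending in a digit (optionally followed by %), a colon, then letters.
SUFFIX_RE = re.compile(r"^(?P<base>.*\d%?):(?P<suffix>[^\W\d_]+)$")


def split_case_suffix(token: str):
    """Return (base, case), with case None when no known suffix is present."""
    match = SUFFIX_RE.match(token)
    if match is None or match["suffix"] not in CASE_BY_SUFFIX:
        return token, None
    return match["base"], CASE_BY_SUFFIX[match["suffix"]]


print(split_case_suffix("32%:s"))       # ('32%', 'LOC')
print(split_case_suffix("40%:i"))       # ('40%', 'ILL')
print(split_case_suffix("SAM-3014:s"))  # ('SAM-3014', 'LOC')
print(split_case_suffix("15:00"))       # ('15:00', None)
```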