meeting-2009-09-10

Kick-off meeting

antti keeps the system
the parts we will have to do for new languages (they are not modules yet)
- letter-to-sound
- how we will model prosody
- phrasing
- how to split training sentences

a big issue is the handling of numbers and abbreviations
inflect numbers for case
tts for prosody
html marking as clues

hmm synthesis
does not glue
make a reverse speech recognition

not tobi
perceptual prominence model
accent
stress on the first syllable

The Antti model for now:

0 - deaccented
1 - weak cont wo
2 - strong foc
3 - emphatic

As comparision: the Trondheim model:

1 - nonfoc
2 - foc
3 - acc

3Pekkakin sen möi
Minä puhun 2Pekastakin

cg-stype rule set:

prom of -1 0 1 on word level
prom of -1 0 1 on syll level

word level

funct w deacc
pron v weak
num adv N adj strong

NP cost can be considered having focus and background


theme rheme
      verb ....
      
      
if theme has two constituents
the first is sprominent
and others are secontary or olc

3Pekka 2kalan 1osta

3-keskiviikkona 2-kaikki tutkijat 1-læhtivæt kotiin

rheme

S O systemic ordering 

deviations from syystemic ordering has prosodic contecequence

if A B in text
sys B A
act A B
=> prominent-B, less-prominent-A

hypothesis for finnish SO
subj < obj < loc < manner < osurce < goal
if any of these is moved to the left in the string
it will be part of the theme and therefor less prominent.

pekka osti KALAN-obj TORILTA-src
Pekka osti torilta-src KALAN-obj

if word order follows the systemic order we cannot say anything
if word order deviates from systemic order (as indicated) we can draw conclusions on theme status and therefore (de)accent accordingly

rule against data driven approch:

pro
- insight to phon
- motivated
- not huge c
- rare
con