ELA Ntiers
This page documents the ELAN tier structures used by our projects.
Intro
The following presents an inventory of both the linguistic and the tier types used by our projects.
- Note that adherance to these structures is necessary to use the ELAN-FST script which automatically adds annotations on word, lemma, morphological categories and part of speech.
ELAN Linguistic Types
Name | Stereotype | Controlled Vocabulary | Purpose |
---|---|---|---|
refT | - | - | used for ref-tiers; no stereotype because it ref-tiers are independent, root nodes, and time-alignable |
orthT | symbolic association | - | used for orth-tiers, exact time-aligned copy of the superordinate ref-tier |
wordT | symbolic subdivision | - | used for word-tiers, overall time-aligned copy of the orth-tier (and thus the ref-tier), but able to be divided into multiple equal parts |
lemmaT | symbolic subdivision | - | used for lemma-tiers, overall equally-spaced, time-aligned copy of the word-tier, but able to be divided into multiple equal parts |
morphT | symbolic subdivision | - | used for morph-tiers, overall equally-spaced, time-aligned copy of the word-tier, but able to be divided into multiple equal parts |
posT | symbolic subdivision | pos | used for pos-tiers, overall equally-spaced, time-aligned copy of the word-tier, but able to be divided into multiple equal parts |
ftT | symbolic association | - | used for free translation tiers, overall time-aligned copy of the orth-tier (and thus the ref-tier) |
noteT | symbolic association | - | used for tiers adding notes to a given parent tiers, overall time-aligned copy of the parent-tier |
langT | symbolic subdivision | languages | used for lang-tier to indicate language(s) being used in the utterance |
ELAN Tiers and Tier Hierarchy
Required for each speaker:
Level | Name | Parent Tier | Linguistic Type | Language | Purpose |
---|---|---|---|---|---|
0 | ref | - | refT | - (numbered) | root node, time-aligned annotation units are set here, each annotation is provided with a unique number here |
-1 | orth | ref | orthT | vernacular | an orthographic transcription is provided here; this provides the input for the FST engine |
-2 | word | orth | wordT | vernacular | tokenized version of the orth-tier; automatically created by ELAN-FST-script |
-3 | lemma | word | lemmaT | vernacular | lemma (or lemmata in case of ambiguities) for word form listed in parent tier; automatically created by ELAN-FST-script |
-3 | morph | word | morphT | English (linguistics) | morphological category (or categories in case of ambiguities) for word form listed in parent tier; automatically created by ELAN-FST-script |
-3 | pos | word | posT | English (linguistics) | part of speech (or parts of speech in case of ambiguities) for word form listed in parent tier; adheres to 'pos'-list of controlled vocabulary; automatically created by ELAN-FST-script |
Optional for each speaker:
Level | Name | Parent Tier | Linguistic Type | Language | Purpose |
---|---|---|---|---|---|
-2 | ft-XYZ | orth | ftT | a relevant lingua franca | provides a free translation of the annotated text; XYZ is replaced with a language code (e.g. eng, rus, etc.); can occur multiple times for multiple lingua francas |
-2 | lang | orth | langT | English | indicates the language being used in an annotated utterance or part of an annotated utterance; the language name is in English; adheres to 'languages'-list of controlled vocabulary |
Optional for any tier or as a root node with its own time-alignment:
Level | Name | Parent Tier | Linguistic Type | Language | Purpose |
---|---|---|---|---|---|
* | note-XYZ | XYZ | noteT | anything | provide unstructured text-based notes for any given parent tier XYZ |
- all tiers for a given speaker are named using the tier name plus the @ symbol plus an short form referring to the relevant speaker, such as ref@JKW, lemma@JKW
ELAN Tier Template Files for Download
Template files (in ELAN .etf format):
- For spoken corpus data
-
Master template (including all possible annotation tiers and linguistic types)
-
PSDP template (including only annotation tiers and linguistic types relevant for PSDP)
- KSDP template (including only annotation tiers and linguistic types relevant for KSDP
- IKDP template (including only annotation tiers and linguistic types relevant for IKDP
-
Master template (including all possible annotation tiers and linguistic types)
- For written corpus data
- Master template (including all possible annotation tiers and linguistic types)
- PSDP template (including only annotation tiers and linguistic types relevant for PSDP)
- KSDP template (including only annotation tiers and linguistic types relevant for KSDP
- IKDP template (including only annotation tiers and linguistic types relevant for IKDP
- Master template (including all possible annotation tiers and linguistic types)