Flask Software
This text is taken from Johnson, Antonsen and Trosterud (2013).
Software
User Interface
Thanks to the open-source community, there are numerous resources available 
which make it easy to produce designs with good cross-browser compatibility. 
Previously, troubleshooting these issues for each individual browser took 
time that one would rather spend on implementation and, ultimately, on 
producing usable software. 
In this case, we used Twitter Bootstrap to get the most for the least 
effort, and it has resulted in an easy-to-use, minimal layout. 
The layout works consistently across all the major browsers 
for desktop operating systems, as well as the most popular mobile browsers. 
Thus, there is no real need to produce code specific to 
Apple's iOS or the Android operating system, or to pay for the licensing 
setup involved with iOS development; we get all of these things for free. 
Server Architecture
Not having to worry about the design meant that there was more time left for 
developing functionality. Our dictionary is based on 
Flask, a light and flexible web 
framework for Python. 
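Flask keeps the boilerplate to a minimum. As a rough sketch of what a view 
looks like (the route and handler below are hypothetical, not the 
dictionary's actual code):

    from flask import Flask, request

    app = Flask(__name__)

    @app.route('/')
    def search():
        # Read the user's query from the search form; a real view would
        # pass the results to a template rather than return a bare string.
        query = request.args.get('q', '')
        return 'Results for: %s' % query

    if __name__ == '__main__':
        app.run()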
As mentioned above, the lexical data used in this application is stored in an 
XML format, with one file per language pair, per direction (thus making
separate files for language 1 to language 2, and language 2 to language 1). 
These files range in size from 2 MB to 5 MB, and are used directly by the 
live site, without the need for a relational database to store the data.  On 
our server, queries end up being quite fast, but to ensure that this 
continues to be true for larger dictionaries, we have also used one of the 
quickest XML libraries currently available for Python, lxml, whose 
benchmarks are publicly available. 
This allows us to simply update the files and restart the service; any new 
lexical entries are then immediately available to users. 
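As a sketch of this arrangement (the file name and XML element names here 
are hypothetical; the real files follow our own schema):

    from lxml import etree

    # Parse the language-pair file once at start-up and keep the tree in
    # memory; updating the lexicon is then just: replace file, restart.
    tree = etree.parse('sme-nob.xml')

    def lookup_lemma(lemma):
        # XPath query against the in-memory tree; no relational database
        # is involved. The element names are illustrative only.
        return tree.xpath('//entry[lemma/text() = $lemma]', lemma=lemma)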
Because the application relies on external tools for lemmatisation and 
tagging, the results of communication with these processes are stored in an 
in-memory cache. All 
analyses are cached by lemma, and all generated forms are cached by lemma and 
tag. Thus, when a future query includes a compound word containing one of these 
lemmas, we can just retrieve base forms from the cache, instead of sending them 
directly to the external tools. 
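A minimal sketch of this caching scheme, with the external FST tools 
abstracted as function arguments (the names here are illustrative, not our 
actual code):

    # Plain in-process dictionaries serve as the cache.
    analyses = {}    # lemma -> analysis results
    generated = {}   # (lemma, tag) -> generated wordform

    def cached_analyse(lemma, run_analyser):
        # run_analyser stands in for a call to the external analyser FST.
        if lemma not in analyses:
            analyses[lemma] = run_analyser(lemma)
        return analyses[lemma]

    def cached_generate(lemma, tag, run_generator):
        # run_generator stands in for a call to the external generator FST.
        key = (lemma, tag)
        if key not in generated:
            generated[key] = run_generator(lemma, tag)
        return generated[key]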
The need for a cache arose in response to the start-up time of some of the 
larger FSTs; once running, however, actual lookups are extremely fast. For 
now, caching is enough, but if usage and load indicate that further 
optimisation is necessary, one possible solution is to keep the external 
tools running in separate processes and simply communicate with them via 
sockets. There are a variety of existing solutions for this, so the work 
required would be minimal. 
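One possible shape of that optimisation, sketched with the standard library 
(the host, port, and newline-terminated protocol are all assumptions here, 
not an implemented design):

    import socket

    def analyse_via_socket(wordform, host='localhost', port=9000):
        # The analyser runs as a long-lived process listening on a local
        # socket, so the FST start-up cost is paid only once.
        with socket.create_connection((host, port)) as conn:
            conn.sendall((wordform + '\n').encode('utf-8'))
            return conn.recv(4096).decode('utf-8')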
Our previous wordform dictionaries required installation in two steps: 
installing a separate dictionary program (StarDict for Windows and Linux; 
the preinstalled Dictionary.app for Mac OS X), and then downloading and 
installing the linguistic files in that program.  New and updated dictionary 
versions required downloading and installing again.  Our new, web-based 
approach naturally avoids all of this, as users only require the URL.  The 
web dictionary may also be updated by the providers at any time, without 
users needing to be aware of or perform the updates themselves. 
Compared to our web dictionary, the wordform dictionaries had one major 
advantage: they could be used to click on words in any application running 
within the operating system in order to get an analysis and definition, whereas 
the similar functionality provided in the web dictionary only works on web 
pages within the browser. However, newer versions of Mac OS X have lost a 
user-friendly means of installing additional dictionaries into the 
preinstalled dictionary application; as such, this has become a point in 
favour of web-based solutions. 
There is also an advantage for the providers of the dictionary, programmers 
and linguists alike. With the previous wordform dictionaries, new versions 
of the software (such as StarDict) required adjustments in the format of the 
dictionary files, and we would often find ourselves weighing whether to add 
more linguistic content or to aim for smaller file sizes. As such, running 
the dictionary on a server with the already existing lexicons is a big step 
forward. 
Having a server-based system also allows us to pay attention to actual usage 
of the systems. To that end, we log all incoming queries along with their 
results in order to detect areas where the dictionary needs expansion, and 
these updates are then available to users as soon as they are made. 
 
Dictionary API
In addition to making the dictionary searchable via a form in the web 
interface, we provide detailed lexical entries in an easily linkable HTML 
format, and in a more bare-bones format, JSON (JavaScript Object Notation). 
JSON is a widely adopted, open standard for communication between 
applications, with a particular focus on web applications. The intent here 
is that data is provided not just 
for our web-based dictionary via the interface that we provide, but that it may 
also be used within external applications, on other websites, and even 
potentially in mobile services. 
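A hedged sketch of what such an endpoint might look like in Flask (the URL 
path and the entry structure are illustrative; they are not the service's 
actual API):

    from flask import Flask, jsonify

    app = Flask(__name__)

    @app.route('/api/lookup/<lang_from>/<lang_to>/<lemma>/')
    def api_lookup(lang_from, lang_to, lemma):
        # In the real application the entry would come from the XML
        # lexicon and the FST tools; a static structure stands in here.
        entry = {'lemma': lemma, 'translations': [], 'paradigm': []}
        return jsonify(entry)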
The data is exposed via a couple of public-facing API endpoints, or URL 
paths, more or less following the REST (Representational State Transfer) 
architectural style. One of the endpoints, which provides detailed word 
entries with inflectional paradigms, has already been included in 
MultiDict's Wordlink, a reading comprehension tool covering many other 
languages and dictionaries. Wordlink is quite nice, but naturally, we had 
some designs of our own for how to use this API. 
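Consuming such an endpoint from an external application is then a few lines 
of standard-library Python (the URL below is a placeholder, not the 
service's real address):

    import json
    from urllib.request import urlopen

    # Fetch a hypothetical entry and decode the JSON response.
    with urlopen('http://example.com/api/lookup/sme/nob/giella/') as response:
        entry = json.load(response)
    print(entry['lemma'])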
 
Example Applications
WordPress Plugin and Cross-browser Bookmarklet
Two of the learning tools already constructed for North Saami are Kursa and 
Oahpa.  Kursa is a free, multimedia-rich set of online course materials in 
North Saami, containing lessons with text and audio recordings, implemented 
in WordPress, a free and open-source blogging tool. 
To go with these learning materials, we have created a plugin for WordPress, 
written in JavaScript, jQuery, and Twitter Bootstrap, which provides access 
to lemmatisation, compound analysis, and lexicon lookup. Users simply 
Alt/Opt + double-click a word, and it is highlighted, with a text bubble 
appearing below it that contains word translations and wordform analyses. 
Users can quickly and easily look up as many words as they need 
to comprehend a text, which removes one of the barriers to reading in a new 
language, namely the need to frequently look up words in a dictionary while 
being unacquainted with their potential "dictionary" word forms.
The modular nature of the core library within the plugin allows it to be 
dropped into several other settings with ease. For example, it could be 
included on a specific page or website, or inserted into any page via a web 
browser plugin.  We have ensured that the library works in the most commonly 
used current web browsers; as such, this functionality is available on 
Windows, Mac OS X, and Linux, in Internet Explorer, Firefox, Chrome, Opera, 
and Safari. 
In addition to the plugin for Kursa, we have produced a cross-browser 
solution which is similar to a browser extension, but is instead accessible 
via a bookmarklet: a bookmark that provides functionality rather than a link 
to a website. This option has proven far preferable to developing (and 
convincing users to install) browser-specific plugins, and "installation" is 
simply a drag-and-drop affair. Thus, when on a page they wish to read, users 
may simply click the bookmarklet, which downloads the plugin source and 
includes it in the HTML document structure of the page. The world of news, 
blogs, and even Facebook thus becomes accessible in all of the language 
pairs that we support.