In-source Documentation
How to write documentation as comments in your source code.
There is now preliminary support for writing structured comments to document the
First come some general notes, then an overview of the differences between lexc, twolc and vislcg3 source files. At the end there is a description of the compilation procedure.
General notes
The basic idea is that comments following a certain format will be extracted and
- Basic rule
- Everything that is supposed to be included in the published
:
!! Some documentation text here.
Such comments will be extracted, and converted to a jspwiki document for further processing.
That is, to write comments that should become part of the public documentation, you first type two exclamation marks, then one space, and then the jspwiki markup you want. To get a heading, you thus type the following:
...some LexC code...
!! !!!Top-level heading
...some LexC code...
In the resulting jspwiki document this is turned into:
!!!Top-level heading
- Ignored comments
- If a single comment char is used, that comment is ignored,
- Formatting convention
- For all source file types, the comments use jspwiki
- Raw copy of source code
- To copy a line of source code as is into the
Example (the extra space in the triple { and } in the example is only needed to avoid double triplets, and should not be included in the actual code):
!! !!Symbols that need to be escaped on the lower side (towards twolc):
!! {{ {
%[%>%] !!= @CODE@ - Literal >
%[%<%] !!= @CODE@ - Literal <
!! }} }
This should give the following jspwiki fragment:
!!Symbols that need to be escaped on the lower side (towards twolc):
{{ {
%[%>%] - Literal >
%[%<%] - Literal <
}} }
In this case we need to encapsulate the multichar symbol declaration within jspwiki source code tags, because otherwise jspwiki will interpret the symbol declaration as links. And we can't escape the bracket using the double bracket notation, because then we are altering the LexC source code. Instead we surround the lines with triple { and }, and just copy the lines in question using the !!= notation.
The full syntax of the markup conventions is described on a separate specification page.
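The extraction rules described above can be illustrated with a minimal Python sketch. It is not the actual extraction tool, which this page does not show; the names `DOC_COMMENT` and `extract_docs` are hypothetical, and the handling of whitespace around `!!=` is an assumption based on the example above.

```python
import re

# Assumed pattern: "!! " starts a documentation comment, "!!=" a raw-copy
# comment. Anything before the marker is treated as source code.
DOC_COMMENT = re.compile(r"^(?P<code>.*?)!!(?P<kind>[= ])\s*(?P<text>.*)$")


def extract_docs(lines):
    """Return the jspwiki lines harvested from an iterable of source lines."""
    harvested = []
    for line in lines:
        match = DOC_COMMENT.match(line.rstrip("\n"))
        if match is None:
            # No "!! " or "!!=" marker: plain code or an ignored "!" comment.
            continue
        text = match.group("text")
        if match.group("kind") == "=":
            # Raw copy: @CODE@ stands for the source code on the same line.
            text = text.replace("@CODE@", match.group("code").strip())
        harvested.append(text)
    return harvested


if __name__ == "__main__":
    sample = [
        "...some LexC code...",
        "!! !!!Top-level heading",
        "! an ordinary comment, ignored",
        " %[%>%]  !!= @CODE@ - Literal >",
    ]
    for wiki_line in extract_docs(sample):
        print(wiki_line)
```

Lines starting with `!!€` or `!!$` do not match the pattern, so test data is left out of the harvested documentation in this sketch.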
LexC notes
Conventions
Each lexicon is documented below the keyword LEXICON. It is possible to use the keyword @LEXNAME@ in the text, where it will be replaced with the actual lexicon name. A typical lexicon could look like the following:
! ================================
!! !!!Nominal inflection sublexica
! ================================
LEXICON N_ODD
!! !!Inflection for odd-syllable nouns: lexicon @LEXNAME@
! -------------------------------------------------------
!
!! Short description of this lexicon, and its purpose.
!
+N+Sg: N_ODD_SG ;
+N+Pl: N_ODD_PL ;
+N: N_ODD_ESS ;
+N+SgNomCmp:e%^DISIMP R ;
+N+SgGenCmp:e%>%^DISIMPn R ;
+N+PlGenCmp:%>%^DISIMPi R ;
+N+Der1+Der/Dimin+N:%»adtj GIERIEHTSADTJE ;
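The @LEXNAME@ replacement can be pictured with a small Python sketch. It is an illustration only: the function name `substitute_lexname` is hypothetical, and it assumes that the name to insert is the one given by the most recent LEXICON line.

```python
import re

# Assumed: a lexicon starts with the keyword LEXICON followed by its name.
LEXICON = re.compile(r"^\s*LEXICON\s+(\S+)")


def substitute_lexname(lines):
    """Harvest "!! " comments, replacing @LEXNAME@ with the current lexicon."""
    current = None  # name of the lexicon seen most recently, if any
    harvested = []
    for line in lines:
        match = LEXICON.match(line)
        if match:
            current = match.group(1)
        _, marker, text = line.partition("!! ")
        if not marker:
            continue  # no documentation comment on this line
        if current is not None:
            text = text.replace("@LEXNAME@", current)
        harvested.append(text)
    return harvested
```

Applied to the lexicon above, the second heading would come out as `!!Inflection for odd-syllable nouns: lexicon N_ODD`.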
Test data
! Test data:
!!€gt-norm: gierehtse # Odd-syllable test
!!€ gierehtse gierehtse+N+Sg+Nom
!!€ gierehtsem gierehtse+N+Sg+Acc
!!$ gieriehtsem gierehtse+N+Sg+Acc # negative test - don't accept this!
!!€ gierehtsen gierehtse+N+Sg+Gen
NB! The negative test data convention is not yet fully functional. For now it is best to avoid it.
Presently, the above test data will give the following yaml file (sans header):
gierehtse: # Odd-syllable test #
  gierehtse+N+Sg+Nom: gierehtse #
  gierehtse+N+Sg+Acc: gierehtsem #
  gieriehtsem: ~gieriehtsem # gierehtse+N+Sg+Acc
  gierehtse+N+Sg+Gen: gierehtsen #
The negative test entry is NOT rendered the way it should be, and this test will fail. The intended output is:
gierehtse: # #
  gierehtse+N+Sg+Nom: gierehtse #
  gierehtse+N+Sg+Acc: [gierehtsem, ~gieriehtsem] #
  gierehtse+N+Sg+Gen: gierehtsen #
This will be fixed in a future version of the test bench.
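To make the intended grouping concrete, here is a minimal Python sketch that collects positive (`!!€`) and negative (`!!$`) test lines per analysis, with negative forms prefixed by `~`. It is not the code of the test bench; the function name `collect_tests` and the exact parsing rules are assumptions made for illustration.

```python
def collect_tests(lines):
    """Map each analysis to its forms; negative forms get a "~" prefix."""
    tests = {}
    for line in lines:
        # Drop a trailing "# comment" before looking at the line.
        body = line.split("#", 1)[0].strip()
        if body.startswith("!!€ "):
            prefix = ""   # positive test: the form must be accepted
        elif body.startswith("!!$ "):
            prefix = "~"  # negative test: the form must be rejected
        else:
            continue      # header lines such as "!!€gt-norm: ..." are skipped
        parts = body[4:].split()
        if len(parts) != 2:
            continue      # not a "form analysis" pair
        form, analysis = parts
        tests.setdefault(analysis, []).append(prefix + form)
    return tests
```

With the test data above, the accusative analysis collects both `gierehtsem` and `~gieriehtsem` in one list, which is the shape of the intended yaml entry.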
Twolc notes
Support for TwolC files is not yet implemented.
Follows the same structure as the LexC comments, except that it documents twol rules instead of lexicons.
A future version might also allow for documentation of Alphabet, Sets and Definitions.
To Be Written.
Twolc test data
Support for TwolC files is not yet implemented.
Similar to LexC, except that the output is turned into twolc test pairs used in the pair-testing tool.
To Be Written.
Xfst script and regex files
Support for Xfst files is not yet implemented.
CG3
Support for CG3 files is not yet implemented.
Compilation procedure
The documentation files are compiled when you run make in the $lang directory ($lang meaning any language directory in langs/). A makefile in the $lang/doc directory governs which source files are harvested for documentation. Links to the generated files are created automatically, in the generated file $lang/doc/Links.jspwiki.
As a default, only the root.lexc file is scheduled for generating a documentation
To compile again (regardless of compilation status), run make -B in $lang/doc.
Check in the converted jspwiki files:
svn ci -m "documentation update" doc/*.jspwiki
Debugging
To find out which file is broken, run (in $lang/doc):
forrest
This will tell you which file is broken, unfortunately without a line number.
To debug the documentation added to e.g som/src/morphology/stems/nouns.lexc,
Debugging procedure
The error messages from forrest are notoriously unhelpful, typically of the type
nouns-affixes.html BROKEN: Couldn't accept input hardbreak ["\n\n"]
The "input hardbreak" message only tells you that there is an error somewhere in the document.
Couldn't accept input emitem ["''"]
In this case the error was italics embedded in monotype:
{{''син син(м)-''}}
Common errors:
- single _ instead of double ones for bold
- single _ inside italics
- stacked formatting symbols, e.g. italics AND bold
- square brackets, which are links in jspwiki
- jumping directly from !!! to ! etc.
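Since forrest reports no line numbers, a small helper that scans a generated jspwiki file for some of these patterns can narrow down the search. The following Python sketch is an assumption-laden illustration, not part of the build: the function name `find_suspects` is hypothetical, and it only checks two of the cases above (italics inside monotype, and unescaped square brackets).

```python
import re

# {{...}} monotype spans, excluding {{{...}}} preformatted blocks.
MONOTYPE = re.compile(r"(?<!\{)\{\{(?!\{)(.*?)\}\}")
# A single [ ... ] pair; "[[" is left alone, as it escapes the bracket.
LINK = re.compile(r"(?<!\[)\[[^\[\]]+\]")


def find_suspects(text):
    """Return (line number, description) pairs for suspicious markup."""
    hits = []
    for number, line in enumerate(text.splitlines(), 1):
        for match in MONOTYPE.finditer(line):
            if "''" in match.group(1):
                hits.append((number, "italics inside monotype"))
        if LINK.search(line):
            hits.append((number, "square brackets (parsed as a link)"))
    return hits
```

Not every hit is an error, since square brackets are also used for intended links; the reported line numbers are only a starting point for manual inspection.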
The best advice if you do not spot the error is to open the broken file