Localisation

[Previously posted on Goodreads 2018-07-26.] The chances are that you’re reading this in a web browser. Perhaps it has a menu bar along the top with words like ‘Bookmarks’ or ‘History’, or perhaps it has a hamburger-style menu that appears when you click on a button with three horizontal lines. However you interact with an application, the instructions are provided in words or pictures (or a combination). Well-known icons, such as a floppy disk or printer, are easy to understand for those familiar with computers, but more complex actions, prompts, and warning or error messages need to be written in words.

For example, if you want to check your email, there might be a message that says ‘1 unread email(s)’ or ‘2 unread email(s)’. If the software is sophisticated, it might be able to say ‘1 unread email’ or ‘2 unread emails’. Naturally, you’ll want this kind of information to be in a language you can understand. Another user may be using the same application in, say, France or Germany, in which case they’ll probably want the messages in French or German.
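The difference between ‘email(s)’ and a properly pluralised message can be sketched in a few lines of Python. This is only an illustration: the function name `unread_message` is made up for this example, and it only covers the English rule.

```python
def unread_message(count: int) -> str:
    """Return an English message whose noun agrees in number with count."""
    # English uses the singular form only when the count is exactly one.
    noun = "email" if count == 1 else "emails"
    return f"{count} unread {noun}"


for n in (1, 2):
    print(unread_message(n))
```

Other languages have different plural rules, so a real application would need to choose the rule according to the language in use rather than hard-coding the English one.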

An application that supports localisation is one that is designed to allow such textual information to be displayed in different languages, and (where necessary) to format certain elements, such as dates or currency, according to a particular region. This support is typically provided in a file that contains a list of all possible messages, each identified by a unique key. Adding a new language is simply a matter of finding someone who can translate those messages and creating a new file with the appropriate name.
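As a rough sketch of the idea, the Python snippet below keeps one set of messages per language, each message identified by a key, and falls back to English when a translation is missing. The keys (`menu.history`, `menu.bookmarks`), the `translate` function and the French wording are all assumptions made for illustration, and a dictionary stands in for the separate files that a real application would load.

```python
# Hypothetical message catalogues: one dictionary per language stands in
# for one message file per language. The keys are invented for this sketch.
MESSAGES = {
    "en": {
        "menu.bookmarks": "Bookmarks",
        "menu.history": "History",
    },
    "fr": {
        # Illustrative translations only.
        "menu.bookmarks": "Marque-pages",
        "menu.history": "Historique",
    },
}

DEFAULT_LANGUAGE = "en"


def translate(key: str, language: str) -> str:
    """Look up key in the requested language, falling back to the default."""
    catalogue = MESSAGES.get(language, {})
    if key in catalogue:
        return catalogue[key]
    # No catalogue for this language, or the key hasn't been translated yet.
    return MESSAGES[DEFAULT_LANGUAGE][key]


print(translate("menu.history", "fr"))
print(translate("menu.history", "de"))  # no German catalogue, so English
```

With this arrangement, supporting another language only requires a new entry in `MESSAGES`; none of the code that calls `translate` has to change.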

The recommended way of identifying a particular language or region is with an ISO code. The ISO 639-1 two-letter code is the most commonly used code to identify root languages, such as ‘en’ for English, ‘fr’ for French and ‘de’ for German. (Languages can also be identified by three-letter codes or numeric codes.) The language code can be combined with an ISO 3166 country code. For example, ‘en-GB’ indicates British English (so a printer dialogue box might ask if you want the ‘colour’ setting), ‘en-US’ indicates US English (‘color’) and ‘fr-CA’ indicates Canadian French (‘couleur’).
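Here is a minimal Python sketch of how an application might split such a code into its language and country parts and use it to pick the right spelling. The names `LanguageTag`, `parse_tag` and `colour_label` are invented for this example, and it only handles the simple ‘language’ or ‘language-country’ forms described above; real-world tags can contain further parts.

```python
from typing import NamedTuple, Optional


class LanguageTag(NamedTuple):
    language: str           # ISO 639-1 code, such as "en"
    country: Optional[str]  # ISO 3166 code, such as "GB", or None


def parse_tag(tag: str) -> LanguageTag:
    """Split a tag such as 'en-GB' into its language and country parts."""
    language, _, country = tag.partition("-")
    # Conventionally the language is lower case and the country upper case.
    return LanguageTag(language.lower(), country.upper() or None)


# The three spellings mentioned in the text.
COLOUR_LABELS = {"en-GB": "colour", "en-US": "color", "fr-CA": "couleur"}


def colour_label(tag: str) -> str:
    """Return the printer dialogue label for the given language tag."""
    parsed = parse_tag(tag)
    if parsed.country is None:
        normalised = parsed.language
    else:
        normalised = f"{parsed.language}-{parsed.country}"
    return COLOUR_LABELS[normalised]


print(parse_tag("en-GB"))
print(colour_label("fr-ca"))
```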

On Friday 20th July 2018, Paulo Cereda presented the newly released version 4.0 of his arara tool at the TeX Users Group (TUG) 2018 conference in Rio de Janeiro. Those of you who have read my LaTeX books may recall that I mentioned arara in Using LaTeX to Write a PhD Thesis and provided further information in LaTeX for Administrative Work. This very useful tool for automating document builds has localisation support for English, German, Italian, Dutch, Brazilian Portuguese and … Broad Norfolk.

Wait! What was that?

Broad Norfolk is the dialect spoken in the county of Norfolk in East Anglia. There’s a video of Paulo’s talk available. If you find it a bit too technical but are interested in the language support, skip to around the 18:50 mark. Below are some screenshots of arara in action. (It’s a command line application, so there’s no fancy point and click graphical interface.)

Here’s arara reporting a successful job (converting the file test.tex to test.pdf) with the language set to Broad Norfolk:

Image of arara output (reproduced below).

For those who can’t see the image, the transcript is as follows:

Hold yew hard, ole partner, I'm gornta hev a look at 'test.tex'
(thass 693 bytes big, that is, and that was last chearnged on
07/26/2018 12:09:08 in case yew dunt remember).

(PDFLaTeX) PDFLaTeX engine ..... THASS A MASTERLY JOB, MY BEWTY
(Bib2Gls) The Bib2Gls sof....... THASS A MASTERLY JOB, MY BEWTY
(PDFLaTeX) PDFLaTeX engine ..... THASS A MASTERLY JOB, MY BEWTY

Wuh that took 1.14 seconds but if thass a slight longer than you
expected, dunt yew go mobbing me abowt it cors that ent my fault.
My grandf'ar dint have none of these pearks. He had to use a pen
and a bit o' pearper, but thass bin nice mardling wi' yew. Dew
yew keep a troshin'!

For comparison, the default English setting produces:

Image of arara output (reproduced below).

For those who can’t see the image, the transcript is as follows:

Processing 'test.tex' (size: 693 bytes, last modified: 07/26/2018
12:09:08), please wait.

(PDFLaTeX) PDFLaTeX engine .............................. SUCCESS
(Bib2Gls) The Bib2Gls software .......................... SUCCESS
(PDFLaTeX) PDFLaTeX engine .............................. SUCCESS

Total: 1.18 seconds

For a bit of variety, I then introduced an error that causes the second task (Bib2Gls) to fail. Here’s the Broad Norfolk response:

Image of arara output (reproduced below).

For those who can’t see the image, the transcript is as follows:

Hold yew hard, ole partner, I'm gornta hev a look at 'test.tex'
(thass 694 bytes big, that is, and that was last chearnged on
07/26/2018 12:23:42 in case yew dunt remember).

(PDFLaTeX) PDFLaTeX engine ..... THASS A MASTERLY JOB, MY BEWTY
(Bib2Gls) The Bib2Gls sof....... THAT ENT GORN RIGHT, OLE PARTNER

Wuh that took 0.91 seconds but if thass a slight longer than you
expected, dunt yew go mobbing me abowt it cors that ent my fault.
My grandf'ar dint have none of these pearks. He had to use a pen
and a bit o' pearper, but thass bin nice mardling wi' yew. Dew
yew keep a troshin'!

For comparison, the default English setting produces:

Image of arara output (reproduced below).

For those who can’t see the image, the transcript is as follows:

Processing 'test.tex' (size: 694 bytes, last modified: 07/26/2018
12:23:42), please wait.

(PDFLaTeX) PDFLaTeX engine .............................. SUCCESS
(Bib2Gls) The Bib2Gls software .......................... FAILURE

Total: 0.91 seconds

Here’s the help message in Broad Norfolk:

Image of arara output (reproduced below).

For those who can’t see the image, the transcript is as follows:

arara 4.0 (revision 1)
Copyright (c) 2012-2018, Paulo Roberto Massa Cereda
Orl them rights are reserved, ole partner

usage: arara [file [--dry-run] [--log] [--verbose | --silent] [--timeout
 N] [--max-loops N] [--language L] [ --preamble P ] [--header]
 | --help | --version]
 -h,--help          wuh, cor blast me, my bewty, but that'll tell
                    me to dew jist what I'm dewun rite now
 -H,--header        wuh, my bewty, that'll only peek at directives
                    what are in the file header
 -l,--log           that'll make a log file wi' orl my know dew
                    suffin go wrong
 -L,--language      that'll tell me what language to mardle in
 -m,--max-loops     wuh, yew dunt want me to run on forever, dew
                    you, so use this to say when you want me to
                    stop
 -n,--dry-run       that'll look like I'm dewun suffin, but I ent
 -p,--preamble      dew yew git hold o' that preamble from the
                    configuration file
 -s,--silent        that'll make them system commands clam up and
                    not run on about what's dewin
 -t,--timeout       wuh, yew dunt want them system commands to run
                    on forever dew suffin' go wrong, dew you, so
                    use this to set the execution timeout (thass in
                    milliseconds)
 -V,--version       dew yew use this dew you want my know abowt
                    this version
 -v,--verbose       thass dew you want ter system commands to hav'
                    a mardle wi'yew an'orl

For comparison, the default English setting produces:

Image of arara output (reproduced below).

For those who can’t see the image, the transcript is as follows:

arara 4.0 (revision 1)
Copyright (c) 2012-2018, Paulo Roberto Massa Cereda
All rights reserved

usage: arara [file [--dry-run] [--log] [--verbose | --silent] [--timeout
 N] [--max-loops N] [--language L] [ --preamble P ] [--header]
 | --help | --version]
 -h,--help          print the help message
 -H,--header        extract directives only in the file header
 -l,--log           generate a log output
 -L,--language      set the application language
 -m,--max-loops     set the maximum number of loops
 -n,--dry-run       go through all the motions of running a
                    command, but with no actual calls
 -p,--preamble      set the file preamble based on the
                    configuration file
 -s,--silent        hide the command output
 -t,--timeout       set the execution timeout (in milliseconds)
 -V,--version       print the application version
 -v,--verbose       print the command output

In case you’re wondering why Broad Norfolk was included, Paulo originally asked me if I could add a slang version of English as an Easter egg. I decided to take advantage of this request to introduce Broad Norfolk to the international TeX community, as the dialect has been sadly misrepresented in film and television, much to the annoyance of those who speak it. As far as we know, arara is the only application that includes Broad Norfolk localisation support. (If you know of any other, please say!)

Having decided to add Broad Norfolk, we needed to consider what code to use. The ISO 3166-1 set of two-letter codes includes a subset of user-assigned codes, which are set aside for in-house use by applications that need to identify non-standard territories. These codes are AA, QM to QZ, XA to XZ, and ZZ. I chose ‘QN’ and decided it’s an abbreviation for Queen’s Norfolk, as the Queen has a home in Norfolk.
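The user-assigned ranges listed above are easy to check in code. The following Python sketch shows one way of doing so; the function name `is_user_assigned` is made up for this example and isn’t taken from arara.

```python
def is_user_assigned(code: str) -> bool:
    """Return True if code is one of AA, QM to QZ, XA to XZ, or ZZ."""
    code = code.upper()
    # Only two-letter codes made up of the letters A to Z qualify.
    if len(code) != 2 or not (code.isascii() and code.isalpha()):
        return False
    first, second = code
    if code in ("AA", "ZZ"):
        return True
    if first == "Q":
        # Only QM to QZ are user-assigned, not the whole of the Q range.
        return "M" <= second <= "Z"
    # Every code from XA to XZ is user-assigned.
    return first == "X"


for candidate in ("QN", "GB", "QL", "XA"):
    print(candidate, is_user_assigned(candidate))
```

‘QN’ falls in the QM to QZ range, so it passes the check, whereas an officially assigned code such as ‘GB’ doesn’t.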

Plural Pronouns

[Originally published on Facebook 2015-09-12.] When I was a child I was taught that when conversing in French with any of my adult Belgian relatives I had to use the plural form “vous” when addressing them (rather than the singular “tu”) since the plural form has to be used in formal contexts. Not many people realise this, but English once had the same rule. The second person singular was “thou” and the plural was “you”. The plural form was used in formal contexts. Prayer books used the informal singular form to denote the closeness encouraged by instructions such as “call God Father” and the symbolic tearing of the temple veil.

Living languages have an interesting fluidity. They evolve with the people who continually use them. The English language is often criticised for its many exceptions to rules. Some of this is caused by the blending of the different languages that have contributed to its evolution, but some of it comes from so many people breaking a rule that the broken rule becomes standard. (How many use “awful” to mean “full of awe”?) Over time, people began to use the formal “you” in increasingly informal contexts to the point where “thou” was considered old-fashioned. Additionally, its retention in prayer books and the change in attitudes towards religion made “thou” seem stuffy and formal.

This use of the plural form in singular contexts to indicate formality can also be seen in the so-called “royal we”. When the Queen addresses the nation, she refers to herself in a formal context using the plural “we”. Over the past decade or so, more and more non-fiction writers have been using the plural third person “they” in a singular context. This is done to avoid any reference to gender and has attracted some criticism from people who feel it’s breaking the language rules, but it actually follows the old linguistic tradition of using the plural instead of the singular in a formal context. So, if you abhor the use of “they” in a singular context, perhaps thou shouldst consider thy use of “you” when addressing only one individual.