Writing a datetime2 Language Module

The article Localisation with tracklang.tex describes the reason why I created the tracklang package. The article Integrating tracklang into Language Packages gives an example of how to integrate tracklang into a language package. The article Using tracklang in Packages with Localisation Features is for those who are writing a package that needs to detect the document’s localisation settings.

This article describes how to write a datetime2 language resource file. I recommend you first read the previous articles (particularly Using tracklang in Packages with Localisation Features) to understand the purpose of tracklang and how it’s designed to allow packages to input the appropriate language resource files.

I originally created both the tracklang and datetime2 packages around the same time, with the initial release of datetime2 occurring around six months after the initial release of tracklang. This means that the original .ldf language resource files don’t use some of the more convenient commands added to tracklang at a later date.

For example, instead of using \TrackLangRequireDialect, datetime2 defines its own internal command that uses \IfTrackedLanguageFileExists. Instead of using \TrackLangRequireResource, datetime2 uses \RequireDateTimeModule. (The new tracklang commands were introduced as a result of developing the datetime2 commands and expanding them for more general purpose use.)

Each datetime2 language file identifies itself with:

\ProvidesDateTimeModule{localeid}[version]

This is analogous to the new \TrackLangProvidesResource added to tracklang v1.3.

The code to redefine the date hook is also quite convoluted. For example, datetime2-en-GB.ldf uses:

\ifcsundef{date\CurrentTrackedDialect}
{%
  \ifundef\dateenglish
  {% do nothing
  }%
  {%
    \def\dateenglish{%
      \DTMifcaseregional
      {}% do nothing
      {\DTMsetstyle{en-GB}}%
      {\DTMsetstyle{en-GB-numeric}}%
    }%
  }%
}%
{%
  \csdef{date\CurrentTrackedDialect}{%
    \DTMifcaseregional
    {}% do nothing
    {\DTMsetstyle{en-GB}}%
    {\DTMsetstyle{en-GB-numeric}}%
  }%
}%

This essentially works like \TrackLangRedefHook except that it doesn’t take mappings into account. With tracklang v1.4, a more compact method would be:

\TrackLangRedefHook
{% new definition of \date... hook:
  \DTMifcaseregional
  {}% do nothing
  {\DTMsetstyle{en-GB}}%
  {\DTMsetstyle{en-GB-numeric}}%
}%
{date}

In time I may add v1.4 as a requirement for datetime2 and update it to use \TrackLangRequireDialect and \TrackLangRequireResource, but it’s currently too early to do that as v1.4 has only just been released at the time of writing. However, if your language files require other features of v1.4 then you may insist on that version as a minimum requirement.

As discussed in the previous article, it’s useful to have a base file for common elements (such as month names). If you have multiple scripts, you may need a base file for each script. Remember that when the file starts with the file identifier (\ProvidesDateTimeModule) each resource file can only be loaded once with \RequireDateTimeModule (just as a package can only be loaded once with \usepackage or \RequirePackage). For example, the file datetime2-serbian.ldf might simply contain the line:

\RequireDateTimeModule{sr-\CurrentTrackedDialectScript}

which will load datetime2-sr-Cyrl.ldf or datetime2-sr-Latn.ldf depending on the script. This provides a fallback in the event that a pre-v1.4 version of tracklang is installed.

Each language file should contain a date-time style with month names (and optionally day of week names), which should be used with the useregional=text option, and a numeric style, which should be used with useregional=numeric. The \DTMifcaseregional code in the date hook is used to determine which style to set. The first case, which corresponds to useregional=false, indicates that the date-time style shouldn’t change when the document localisation changes. In other words, the \date… hook should do nothing.

The naming convention for the locale date-time styles is now localeid for the textual date and localeid-numeric for the numeric date. This allows \DTMtryregional to guess the style name. However, that command was only introduced in datetime2 v1.5.2, and the regionless style provided in datetime2-english.ldf doesn’t define a numeric style but instead just switches to the default style if useregional=numeric, so that’s a legacy case.
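As a rough illustration of this naming convention, the sketch below defines a textual and a numeric date style for a made-up locale. The locale ID xx-YY, the command \DTMxxYYmonthname, the English month names and the date and version in the optional argument are all placeholders rather than anything shipped with datetime2. Only the date part is sketched, using \DTMnewdatestyle and \DTMtwodigits from datetime2.

```latex
% Hypothetical file datetime2-xx-YY.ldf (all xx-YY names are placeholders)
\ProvidesDateTimeModule{xx-YY}[2024/01/01 v0.1]

% Month names via \ifcase only, so the lookup stays expandable
\newcommand*{\DTMxxYYmonthname}[1]{%
  \ifcase#1
  \or January%
  \or February%
  \or March%
  \or April%
  \or May%
  \or June%
  \or July%
  \or August%
  \or September%
  \or October%
  \or November%
  \or December%
  \fi
}

% Textual style: named after the locale ID
% Arguments of \DTMdisplaydate: year, month, day, day of week
\DTMnewdatestyle{xx-YY}{%
  \renewcommand*{\DTMdisplaydate}[4]{%
    ##3\space\DTMxxYYmonthname{##2}\space##1%
  }%
  \renewcommand*{\DTMDisplaydate}{\DTMdisplaydate}%
}

% Numeric style: locale ID followed by -numeric
\DTMnewdatestyle{xx-YY-numeric}{%
  \renewcommand*{\DTMdisplaydate}[4]{%
    \DTMtwodigits{##3}/\DTMtwodigits{##2}/##1%
  }%
  \renewcommand*{\DTMDisplaydate}{\DTMdisplaydate}%
}
```

With both names following the pattern, the date hook can select xx-YY in the second case of \DTMifcaseregional and xx-YY-numeric in the third.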

One of the major issues with the precursor datetime package was that the date commands couldn’t expand. (They were made robust because the definitions contained fragile content.) This causes a problem in certain situations where the date needs to expand, typically because the date needs to be written to an external file (such as hyperref’s bookmarks file or the table of contents file). The datetime2 package was designed with expandable commands in mind (although there are some robust commands). New styles should therefore be designed to be expandable wherever possible.

An exception to this is the style provided by the datetime2-en-fulltext package, but this isn’t a locale-sensitive resource file and the style is documented as being non-expandable. This is an example of an ornate style rather than a practical everyday style.

Here are some general guidelines to help keep styles expandable:

The inputenc package provides a trick to support UTF-8 characters where the first octet is made active and takes the second octet as the argument. This meant that UTF-8 characters expanded to a form involving \IeC when being written to a file. This isn’t a problem for XeLaTeX and LuaLaTeX, which both natively support UTF-8. Therefore the language resource files that use extended Latin or non-Latin characters provided a UTF-8 version for XeLaTeX/LuaLaTeX and an ASCII version for LaTeX/PDFLaTeX. For example, datetime2-danish.ldf has:

\RequirePackage{ifxetex,ifluatex}
\ifxetex
 \RequireDateTimeModule{danish-utf8}
\else
 \ifluatex
   \RequireDateTimeModule{danish-utf8}
 \else
   \RequireDateTimeModule{danish-ascii}
 \fi
\fi

The LaTeX kernel release 2019/10/01 changed the way UTF-8 characters are dealt with so they should now not be expanded when written to a file. This means that the ASCII version can start to be phased out in favour of the UTF-8 version. You can test the LaTeX version with:

\@ifl@t@r\fmtversion{2019/10/01}
% code for new format
{ ... }
% older formats
{ ... } 
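Combining this test with the engine check shown earlier, a transitional loader could look something like the sketch below. This is not the code shipped in datetime2-danish.ldf; it only illustrates one way of preferring the UTF-8 module on newer kernels while keeping the ASCII fallback for older ones. It assumes that @ has its letter category code at this point.

```latex
% Sketch only: choose the UTF-8 module when the kernel is new enough
\@ifl@t@r\fmtversion{2019/10/01}
{% kernel 2019/10/01 or later: UTF-8 is safe to write to files
  \RequireDateTimeModule{danish-utf8}%
}
{% older kernel: fall back to the engine test
  \RequirePackage{ifxetex,ifluatex}%
  \ifxetex
    \RequireDateTimeModule{danish-utf8}%
  \else
    \ifluatex
      \RequireDateTimeModule{danish-utf8}%
    \else
      \RequireDateTimeModule{danish-ascii}%
    \fi
  \fi
}
```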

If your date style ends with a period (full-stop) then you can use \DTMfinaldot in its place. (Requires at least datetime2 v1.5.5.) The starred versions of \DTMdate and \DTMDate locally redefine \DTMfinaldot to do nothing to allow the user to display the date without the terminating punctuation.
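As a minimal sketch, a style whose date ends with a full stop might place \DTMfinaldot at the end of \DTMdisplaydate. The style name xx-YY-dotted is a placeholder, not a style provided by any existing module.

```latex
% Hypothetical style: dd.mm.yyyy followed by a final dot
\DTMnewdatestyle{xx-YY-dotted}{%
  \renewcommand*{\DTMdisplaydate}[4]{%
    % \DTMfinaldot replaces the hard-coded final "."
    \DTMtwodigits{##3}.\DTMtwodigits{##2}.##1\DTMfinaldot
  }%
  \renewcommand*{\DTMDisplaydate}{\DTMdisplaydate}%
}
```

With this style set, \DTMdate*{2015-05-17} would drop only the final dot, since the dots between the numbers are ordinary characters.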

If you want to allow for minor variations to the style that can be set using \DTMlangsetup, then you can provide keys for the given module (which should be identified in the optional argument of \DTMlangsetup). For example, datetime2-en-GB.ldf defines the key daymonthsep, which indicates the separator to use between the day and month for the en-GB style. This can be assigned within the document like this:

\DTMlangsetup[en-GB]{daymonthsep={-}}

This would make the day and month appear as 17-May rather than the default 17 May.

This type of option, which may take any value, is defined in the .ldf file with:

\DTMdefkey{module-name}{key}{definition}

For example, datetime2-en-GB.ldf defines the key daymonthsep like this:

\newcommand*{\DTMenGBdaymonthsep}{\space}
\DTMdefkey{en-GB}{daymonthsep}{\renewcommand*{\DTMenGBdaymonthsep}{#1}}

Note that the associated command is initialised first to its default value (with \newcommand). When the daymonthsep key is used in \DTMlangsetup[en-GB], this associated command is redefined to the given value. This means that:

\DTMlangsetup[en-GB]{daymonthsep={-}}

is equivalent to:

\renewcommand*{\DTMenGBdaymonthsep}{-}

This associated command is used within the en-GB date style (instead of using a hard-coded value).

If you want a boolean key (a key that may only take the values true or false), then you need to define it with:

\DTMdefboolkey{module name}{key}[default]{code}

where default is the value to use if the key is provided without a value, and code is any additional code that needs to be implemented when the key is set. The default state (used if the key isn't set in \DTMlangsetup) can be set with:

\DTMsetbool{module name}{key}{boolean}

(This should be done after the key has been defined.) For example:

\DTMdefboolkey{en-GB}{abbr}[true]{}
\DTMsetbool{en-GB}{abbr}{false}

You can check this value with:

\DTMifbool{module name}{key}{true code}{false code}

For example, the en-GB style has:

\DTMifbool{en-GB}{abbr}{\DTMenglishshortmonthname{##2}}{\DTMenglishmonthname{##2}}%
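From the document author’s side, such a key is set with \DTMlangsetup. The following is a minimal sketch, assuming that datetime2 and its English module are installed; the date is arbitrary.

```latex
\documentclass{article}
\usepackage[en-GB]{datetime2}
% "abbr" without a value uses the key's default value (true),
% so this is the same as abbr=true
\DTMlangsetup[en-GB]{abbr}
\begin{document}
% The en-GB style now takes the first branch of \DTMifbool
\DTMdate{2015-05-17}
\end{document}
```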

You may also have a “choice” key, which only permits certain values:

\DTMdefchoicekey{module name}{key}[assign]{allowed values}{code}

The optional assign may be used to provide commands that store the given value and the corresponding index. Note that this assignment isn't scoped and will override any previous definition of the given commands. The datetime2-en-GB.ldf file uses scratch variables \@dtm@val (to store the value) and \@dtm@nr (for the index) to prevent conflict. If you want to be able to reference the values later (outside of code) then use unique commands rather than scratch ones.

The allowed values argument should be a comma-separated list of allowed values. The code argument is implemented when the given key is set. The supplied value can be obtained with ##1 (which avoids any expansion issues that might occur when trying to use a scratch variable, such as \@dtm@val).

For example:

\newcommand*{\DTMenGBfmtordsuffix}[1]{#1}
\DTMdefchoicekey{en-GB}{ord}[\@dtm@val\@dtm@nr]{level,raise,omit,sc}{%
 \ifcase\@dtm@nr\relax
   \renewcommand*{\DTMenGBfmtordsuffix}[1]{##1}%
 \or
   \renewcommand*{\DTMenGBfmtordsuffix}[1]{%
    \DTMtexorpdfstring{\protect\textsuperscript{##1}}{##1}}%
 \or
   \renewcommand*{\DTMenGBfmtordsuffix}[1]{}%
 \or
   \renewcommand*{\DTMenGBfmtordsuffix}[1]{%
    \DTMtexorpdfstring{\protect\textsc{##1}}{##1}}%
 \fi
}

Note that this first defines a command (used in the date style) with \newcommand and the code part redefines that command according to the value supplied for the key. Rather than performing a string test on the value, it's simpler to test the index with \ifcase. (In the above, 0 corresponds to “level”, 1 corresponds to “raise”, 2 corresponds to “omit” and 3 corresponds to “sc”.)
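In a document, a choice key is set like any other module key. A minimal sketch, again assuming that datetime2 and its English module are installed and using an arbitrary date:

```latex
\documentclass{article}
\usepackage[en-GB]{datetime2}
% "raise" is the second allowed value, so the index is 1
% and the second \ifcase branch redefines \DTMenGBfmtordsuffix
\DTMlangsetup[en-GB]{ord=raise}
\begin{document}
\DTMdate{2015-05-17}
\end{document}
```

A value outside the list level,raise,omit,sc is not permitted for this key.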