9.4 ⁂Using the `datatool` Package for Exams or Assignment Sheets

Since the datatool package and the datatooltk application have already been described in this book, it's worth mentioning that they can also be used to store a database of problems and their associated solutions. This can be done by creating a database with a label field, a question field and an answer field. Other fields can also be added to store, for example, the topic or level of difficulty.

If you already have a file containing probsoln problem definitions, datatooltk can convert it to a datatool database.¹ For example, the mth101.tex file from Example 45 can be imported either using the --probsoln command line option or the File→Import→Import probsoln file menu item in the GUI mode. Figure 9.1 shows the mth101.tex file imported into datatooltk. Since LaTeX is used to assist the conversion, the “pretty-printing” of the code has unfortunately been lost, but this won't affect the typeset output. (This also happens if you use \DTLsaverawdb or \DTLprotectedsaverawdb.)

**Figure 9.1:** Importing a `probsoln` Dataset into `datatooltk`

The import process has created three fields: Label, Question and Answer. Extra fields can be added using the Edit→Column→Insert Column After menu item. For example, in Figure 9.2, I've added a new integer field called Level, where a value of 1 indicates easy, 2 indicates medium difficulty and 3 indicates hard. This database can then be saved as, say, mth101.dbtex and loaded into a document using \DTLloaddbtex, as described in §2.2.2 Loading Data From a .dbtex File. You can add other columns as well, such as a topic.

**Figure 9.2:** New `Level` Column Added

⚠

Note that datatool has a drawback that probsoln doesn't have, and that is the lack of support for verbatim. You can, however, use \lstinputlisting (provided by the listings package [34], described in Volume 2) or \verbatiminput (provided by the verbatim package [85]).

A new boolean variable can be defined using:

\newboolean {⟨name⟩}

defined by the ifthen package, or

\newbool {⟨name⟩}

defined by the etoolbox package, where ⟨name⟩ is the name of the variable. (Note that ⟨name⟩ is not a control sequence.) The state can be set using:

\setboolean {⟨name⟩}{⟨state⟩}

defined by the ifthen package, or

\setbool {⟨name⟩}{⟨state⟩}

defined by the etoolbox package, where ⟨state⟩ may be either true or false. With the etoolbox package, you can also use:

\boolfalse {⟨name⟩}

to set the state to false or

\booltrue {⟨name⟩}

to set the state to true.

The variable's state can be tested using:

\ifthenelse {\boolean {⟨name⟩}}{⟨true⟩}{⟨false⟩}

defined by the ifthen package, or

\ifbool {⟨name⟩}{⟨true⟩}{⟨false⟩}

defined by the etoolbox package.

Note that \newboolean and \newbool both use the same underlying TeX command to define a conditional so they have the same effect. The etoolbox \setbool can be prefixed with \global but ifthen's \setboolean can't.
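As a rough illustration of the commands above, the following sketch defines one flag with each package and then tests them. The flag name isexam is made up for this illustration, and the document assumes that both ifthen and etoolbox are installed. Since both flags are built on the same underlying TeX conditional, either test command should work with either flag:

```latex
\documentclass{article}
\usepackage{ifthen}
\usepackage{etoolbox}

\newboolean{isexam}% defined with ifthen (hypothetical flag name)
\newbool{showanswers}% defined with etoolbox

\setboolean{isexam}{true}
\booltrue{showanswers}

\begin{document}
% Either style of test can be used with either flag:
\ifthenelse{\boolean{showanswers}}{Answers shown.}{Answers hidden.}
\ifbool{isexam}{Exam paper.}{Assignment sheet.}

{% start of a local scope
  % the \global prefix makes this change survive the end of the group:
  \global\setbool{showanswers}{false}%
}
\ifbool{showanswers}{Answers shown.}{Answers hidden.}
\end{document}
```

Without the \global prefix, the flag would revert to true at the closing brace, so the final test would report that the answers are shown.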

It's therefore possible to define your own boolean flag that determines whether or not the solutions should be displayed.

Example 46. Creating a Problem Sheet using datatool

Return to the database shown in Figure 9.2, and suppose it has been saved as mth101.dbtex. It can now be loaded and iterated over to display all the questions:

\documentclass{article}

\usepackage{etoolbox}
\usepackage{datatool}

\newbool{showanswers}
\booltrue{showanswers}

\DTLloaddbtex{\problemDB}{mth101.dbtex}

\begin{document}
\begin{center}\bfseries\Large
Assignment~1\ifbool{showanswers}{ (Solution Sheet)}{}
\end{center}

\begin{enumerate}
\DTLforeach*{\problemDB}
 {\Label=Label,\Question=Question,\Answer=Answer}%
 {%
   \item \Question
   \ifbool{showanswers}{\par\textbf{Solution: }\Answer}{}%
 }
\end{enumerate}

\end{document}

You can download or view this example document.

Alternatively, you could gather all the solutions at the end of the document:

\documentclass{article}

\usepackage{etoolbox}
\usepackage{datatool}

\newbool{showanswers}
\booltrue{showanswers}

\DTLloaddbtex{\problemDB}{mth101.dbtex}

\begin{document}
\begin{center}\bfseries\Large
Assignment~1
\end{center}

\begin{enumerate}
\DTLforeach*{\problemDB}
 {\Label=Label,\Question=Question}%
 {%
   \item \Question
 }
\end{enumerate}

\ifbool{showanswers}
{%
 \section{Solutions}

\begin{enumerate}
\DTLforeach*{\problemDB}
 {\Label=Label,\Answer=Answer}%
 {%
   \item \Answer
 }
\end{enumerate}

}{}
\end{document}

You can, of course, use the exam class or probsoln package with datatool. That way you don't need to define your own boolean variable.
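For instance, a minimal sketch using the exam class might look like the following. This assumes the same mth101.dbtex file as before and uses the class's answers option in place of the showanswers flag. I haven't shown any of the class's other features here, so treat it as a starting point rather than a finished exam paper:

```latex
\documentclass[answers]{exam}% remove "answers" to hide the solutions

\usepackage{datatool}

\DTLloaddbtex{\problemDB}{mth101.dbtex}

\begin{document}
\begin{questions}
\DTLforeach*{\problemDB}
 {\Question=Question,\Answer=Answer}%
 {%
   \question \Question
   % the solution environment is only displayed
   % when the answers option is on:
   \begin{solution}
   \Answer
   \end{solution}
 }
\end{questions}
\end{document}
```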

It may, however, be that you only want a random selection of the questions from the database. While this could be done within the document using commands provided by the datatool package, it's more efficient to do this using datatooltk. That way, the random selection only needs to be done once per problem sheet (possibly repeated after any modifications to the database), which reduces the time taken for TeX to compile the document. The datatooltk application has a number of command line options that can help with this:

--shuffle
Shuffle the rows in the database.
--seed ⟨number⟩
Set the random generator seed to ⟨number⟩.
--shuffle-iterations ⟨number⟩
Sets the number of iterations performed in the shuffle to ⟨number⟩.
--truncate ⟨number⟩
Truncate the database to the first ⟨number⟩ rows. (This option is always performed after the shuffle option, regardless of the option order.)
--filter ⟨key⟩ ⟨operator⟩ ⟨value⟩
Adds a filter. This option may be used multiple times. Here ⟨key⟩ is the column label used by the filter. The ⟨operator⟩ may be one of: eq (equals), ne (does not equal), le (is less than or equal to), lt (is less than), ge (is greater than or equal to), gt (is greater than) or regex (matches the regular expression). In the last case, ⟨value⟩ should be a regular expression as used by java.util.regex.Pattern. In the other cases, ⟨value⟩ may be an integer, real number or string. If the datatype for the column identified by ⟨key⟩ is numerical and ⟨value⟩ is also numerical, then a numerical comparison is used, otherwise a string comparison is used. For example, --filter Level le 2 indicates that the filter should return a true value for any row where the value in the Level column is less than or equal to 2.
Filtering is always applied after shuffling and before truncating (if either of those options have been specified).
--filter-and
The default action in the event of multiple --filter options is to apply a logical “or”. The --filter-and changes this behaviour to apply a logical “and” to all the filter results instead. For example, suppose the database also has a column labelled Topic and you want to select five easy questions from the topic “Algebra”, then you need a logical “and”:

$ datatooltk --in mth101.dbtex --shuffle --filter-and --filter Level eq 1 --filter Topic eq Algebra --truncate 5 --output problems.dbtex

--filter-exclude
When applying any filters, the --filter-exclude option will cause any matching rows to be excluded. (The default behaviour is to exclude non-matching rows.)
--merge ⟨col-label⟩ ⟨filename⟩
Merges the loaded database with the database in the file whose name is given by ⟨filename⟩. The merge is performed by merging each row in ⟨filename⟩ with the row in the database where the column given by the label ⟨col-label⟩ has the same value as the column with the same label in ⟨filename⟩. If no match is found, a new row is added.

With a combination of these options, it's possible to create a database file (called, say, problems.dbtex) that only contains a random subset of the complete database.

Examples:

Select five questions (of any level) at random:

$ datatooltk --in mth101.dbtex --shuffle --truncate 5 --output problems.dbtex

Select two level 1 questions at random:

$ datatooltk --in mth101.dbtex --shuffle --filter Level eq 1 --truncate 2 --output problems.dbtex

Select four non-easy questions at random with the seed set to 2014:

$ datatooltk --in mth101.dbtex --shuffle --seed 2014 --filter Level ne 1 --truncate 4 --output problems.dbtex

The document from Example 46 just needs one line changed, and that's the line that loads the database:

\DTLloaddbtex{\problemDB}{problems.dbtex}

Alternatively, if you want, say, four level 1 questions, two level 2 questions and one level 3 question, you can create three separate databases:

$ datatooltk --in mth101.dbtex --shuffle --filter Level eq 1 --truncate 4 --output problems1.dbtex
$ datatooltk --in mth101.dbtex --shuffle --filter Level eq 2 --truncate 2 --output problems2.dbtex
$ datatooltk --in mth101.dbtex --shuffle --filter Level eq 3 --truncate 1 --output problems3.dbtex

Now you need to load all three databases into your document:

\DTLloaddbtex{\problemDBi}{problems1.dbtex}
\DTLloaddbtex{\problemDBii}{problems2.dbtex}
\DTLloaddbtex{\problemDBiii}{problems3.dbtex}

and iterate over each of them:

\begin{enumerate}
\DTLforeach*{\problemDBi}
 {\Label=Label,\Question=Question,\Answer=Answer}%
 {%
   \item \Question
   \ifbool{showanswers}{\par\textbf{Solution: }\Answer}{}%
 }
\DTLforeach*{\problemDBii}
 {\Label=Label,\Question=Question,\Answer=Answer}%
 {%
   \item \Question
   \ifbool{showanswers}{\par\textbf{Solution: }\Answer}{}%
 }
\DTLforeach*{\problemDBiii}
 {\Label=Label,\Question=Question,\Answer=Answer}%
 {%
   \item \Question
   \ifbool{showanswers}{\par\textbf{Solution: }\Answer}{}%
 }
\end{enumerate}

If you do intend to do this, I suggest you define a command to perform these iterations. For example:

\newcommand{\doquestions}[1]{%
  \DTLforeach*{#1}
   {\Label=Label,\Question=Question,\Answer=Answer}%
   {%
     \item \Question
     \ifbool{showanswers}{\par\textbf{Solution: }\Answer}{}%
  }%
}
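With \doquestions defined in the preamble, the three loops shown earlier should reduce to the following fragment. This assumes the three databases have been loaded as \problemDBi, \problemDBii and \problemDBiii, as above, and that the showanswers flag has been defined:

```latex
\begin{enumerate}
\doquestions{\problemDBi}% level 1 questions
\doquestions{\problemDBii}% level 2 questions
\doquestions{\problemDBiii}% level 3 questions
\end{enumerate}
```

If you later want to change the way each question is displayed, only the definition of \doquestions needs editing.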

If the original database contains, say, two hundred problems, using datatooltk in this way can significantly speed up the document build. Each year you can run the datatooltk commands with a different random generator seed to produce a new assignment sheet or exam paper.

If you prefer to store your problems in a SQL database, you can perform the random selection with the SELECT statement. For example, if the problems are stored in a table called calculus within a database called mth101, then you can select, say, five questions at random using:

$ datatooltk --output problems.dbtex --sqluser username --sqldb mth101 --sql "SELECT * FROM calculus ORDER BY RAND() LIMIT 5"

What if you don't want to select any problems that appeared in the exam paper or assignment sheet in, say, the previous two years? You could add a year column to the original complete database, but this can be tiresome and prone to error if done manually. It could possibly be done by the LaTeX document, but this would require loading the entire database and saving it using \DTLsaverawdb, which means it's pointless using the datatooltk options described above and, as noted earlier, you'd lose any pretty-printing in the code.

Instead, I think it's more practical to keep a separate database containing just the problem labels and the year that problem was selected. This database can be updated by the document, but since any problems that haven't been used in the past two years can be discarded, this database is much smaller than the original database. Let's call this database file mth101-years.dbtex. In the first year, this file won't exist. Recall from Example 33 the \InputIfFileExists command. If the file doesn't exist, a new database can be created using:

\DTLnewdb {⟨db-name⟩}

where ⟨db-name⟩ is the database name.

Example:

\InputIfFileExists{mth101-years.dbtex}
{}% file exists
{\DTLnewdb{mth101-years}}% file doesn't exist

While the main database is iterated over, each question label can be added to the mth101-years database with the current year. To add data, you first need to add a new row to the database using:

\DTLnewrow {⟨db-name⟩}

and then you can add the entries for that row using:

\DTLnewdbentry {⟨db-name⟩}{⟨col-label⟩}{⟨value⟩}

where ⟨col-label⟩ is the column label and ⟨value⟩ is the value for that column. By default, the value isn't expanded. To change this, you first need to use the command:

\dtlexpandnewvalue
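The difference matters when the value is a placeholder command whose definition changes later on. The sketch below is only an illustration: the database name demo, the column labels and the command \currentlabel are all made up. It also uses \dtlnoexpandnewvalue, which is datatool's command for switching back to the default behaviour:

```latex
\documentclass{article}
\usepackage{datatool}

\DTLnewdb{demo}% hypothetical database used only for this illustration
\newcommand*{\currentlabel}{diff:arcsin}% hypothetical placeholder command

\DTLnewrow{demo}
% default: the value is stored unexpanded, as the command \currentlabel
\DTLnewdbentry{demo}{Unexpanded}{\currentlabel}

\dtlexpandnewvalue
% now the value is expanded first, so the text diff:arcsin is stored
\DTLnewdbentry{demo}{Expanded}{\currentlabel}

\dtlnoexpandnewvalue % switch back to the default behaviour

\begin{document}
\renewcommand*{\currentlabel}{changed}
\DTLforeach*{demo}{\A=Unexpanded,\B=Expanded}{\A\ versus \B}
\end{document}
```

The unexpanded entry should follow the redefinition of \currentlabel, whereas the expanded entry should keep the value that was current when the entry was added. That's why the expansion is needed when storing a loop placeholder such as \Label.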

Example 47. Randomly Selecting Problems Not Used in the Past Two Years

(This exercise assumes that the current year is 2014.) Adapting the earlier code from Example 46:

\documentclass{article}

\usepackage{etoolbox}
\usepackage{datatool}

\newbool{showanswers}
\booltrue{showanswers}

\DTLloaddbtex{\problemDB}{mth101.dbtex}

\InputIfFileExists{mth101-years.dbtex}
{}% file exists
{\DTLnewdb{mth101-years}}% file doesn't exist

\begin{document}
\begin{center}\bfseries\Large
Assignment~1\ifbool{showanswers}{ (Solution Sheet)}{}
\end{center}

\dtlexpandnewvalue
\begin{enumerate}
\DTLforeach*{\problemDB}
 {\Label=Label,\Question=Question,\Answer=Answer}%
 {%
   \item \Question
   % add this label to the new database:
   \DTLnewrow{mth101-years}% add a new row
   \DTLnewdbentry{mth101-years}{Label}{\Label}%
   \DTLnewdbentry{mth101-years}{Year}{\number\year}%
   % print the solution if this is the answer sheet:
   \ifbool{showanswers}{\par\textbf{Solution: }\Answer}{}%
 }
\end{enumerate}

At the end of the document, the database needs to be saved:

\DTLsaverawdb{mth101-years}{mth101-years.dbtex}
\end{document}

(You can download or view this document.)

The call to datatooltk can use the --merge command line option. For example, to randomly select five problems:

$ datatooltk --in mth101.dbtex --merge Label mth101-years.dbtex --shuffle --filter-and --filter Year ne 2013 --filter Year ne 2012 --truncate 5 --output problems.dbtex

If the mth101 database doesn't need editing, this call only really needs to be done once a year. However, if you edit the database by removing, adding or swapping rows, you may end up with a different selection, and labels that are no longer selected will still be assigned to the current year. For example, suppose diff:arcsin was selected for this year, but then you add another problem to mth101.dbtex so that now diff:arcsin is no longer selected, but it's still listed in mth101-years.dbtex as having been selected this year. You can fix this using:

$ datatooltk --in mth101-years.dbtex --filter Year eq 2013 --filter Year eq 2012 --output mth101-years.dbtex

This also has the advantage of removing any problems from pre-2012, which trims down the database.

If you use make on a Unix-like system, the Makefile could look something like:

CURRYEAR:=$(shell date +%Y)
LASTYEAR:=$(shell expr $(CURRYEAR) - 1)
YEARBEFORE:=$(shell expr $(CURRYEAR) - 2)

assignmentsheet1.pdf     : assignmentsheet1.tex problems.dbtex
                        pdflatex assignmentsheet1

problems.dbtex  : mth101.dbtex
                datatooltk --in mth101.dbtex \
                --merge Label mth101-years.dbtex \
                --shuffle \
                --filter-and \
                --filter Year ne $(LASTYEAR) \
                --filter Year ne $(YEARBEFORE) \
                --truncate 5 \
                --output problems.dbtex

update          : 
                datatooltk --in mth101-years.dbtex \
                --filter Year eq $(LASTYEAR) \
                --filter Year eq $(YEARBEFORE) \
                --output mth101-years.dbtex

Now, at the start of each year (or after altering the structure of the database in mth101.dbtex) you can use

$ make update

to trim mth101-years.dbtex to just the entries for the previous two years. (There's probably a more efficient way of writing this Makefile, but a discussion of the make utility is beyond the scope of this book. If you want to copy the above code, remember to use the TAB character in the appropriate places. Alternatively, you can download the file from the examples directory.)

Note that the --merge option will be ignored if the file to be merged doesn't exist. (Only a warning message will be displayed on the standard error stream.) This means that the problems.dbtex target will still work the first time it's used, even though the mth101-years.dbtex file doesn't exist yet.

Recall the \marginpar command from Exercise 21. This can be used to, say, display the number of points for a question in the margin. For example, if all questions are worth 20 points, then within the body of \DTLforeach the number of points can be inserted into the margin:

   \item \marginpar{(20 points)}\Question

It may, however, be better to define a command called, say, \points to make it easier to customize. For example, in the preamble:

\newcommand*{\points}[1]{%
  \marginpar{(#1 points)}%
}

Then the body code of \DTLforeach can be simplified:

   \item \points{20}\Question

Now you just need to modify the definition of \points if you want to change the way the points are displayed. For example, if the argument of \points is always an integer, you could check for a single point and change “points” to “point”:

\newcommand*{\points}[1]{%
  \marginpar{(#1 
  \ifnum#1=1\relax
     point%
  \else
     points%
  \fi)}%
}

If the argument may be a decimal number, the datatool package provides the command:

\dtlifnumeq {⟨number 1⟩}{⟨number 2⟩}{⟨true⟩}{⟨false⟩}

which can be used with decimal numbers. For example:

\newcommand*{\points}[1]{%
  \marginpar{(#1 
  \dtlifnumeq{#1}{1}{point}{points})}%
}

Perhaps the points should depend on the difficulty level. For example, 5 points for a level 1 question, 10 points for a level 2 question and 20 points for a level 3 question. The \ifcase command described in §7.3 Displaying a Date can be used to check the level:

 \item
   \ifcase\Level
   \or
     \points{5}%
   \or
     \points{10}%
   \or
     \points{20}%
   \fi
   \Question

Again, you can define a command that will simplify the document code:

\newcommand*{\PointsForLevel}[1]{%
   \ifcase#1
   \or
     \points{5}%
   \or
     \points{10}%
   \or
     \points{20}%
   \fi
}

Now the code in the loop is:

   \item \PointsForLevel{\Level}\Question
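For \Level to be available inside the loop, it has to be included in the assignment list of \DTLforeach. Assuming the database has the Level column shown in Figure 9.2 and that the showanswers flag from Example 46 is defined, the complete loop might look like:

```latex
\begin{enumerate}
\DTLforeach*{\problemDB}
 {\Question=Question,\Answer=Answer,\Level=Level}%
 {%
   % look up the points for this question's difficulty level:
   \item \PointsForLevel{\Level}\Question
   \ifbool{showanswers}{\par\textbf{Solution: }\Answer}{}%
 }
\end{enumerate}
```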

Exercise 26. Creating an Assignment Sheet with the datatool Package

The exercises directory that comes with this book has a database called mth102.dbtex (shown in Figure 9.3). You can download this file or create your own. This database is an amalgamation of the two databases from Example 45 with an extra column labelled “Topic”. The topics are set to either “Basics” or “Theory”. The questions taken from the problems-1stprinciples database have all been given a value of 3 for the level. Create an assignment sheet (or exam paper) that has the questions randomly selected from the mth102 database. There should be two Level 1 questions from the “Basics” topic, one Level 2 question from the “Basics” topic and one Level 3 question from the “Theory” topic. Each question should have the points displayed, using the above point allocation scheme.

**Figure 9.3:** The `mth102` Database

For the More Adventurous:

Adjust the \points command so that it keeps a running total. This total should ideally appear at the start of the document, but as the value isn't known until the end of the document, the information needs to be written to the auxiliary (.aux) file. LaTeX provides the command:

\protected@write {⟨output stream⟩}{⟨init code⟩}{⟨text⟩}

which will write ⟨text⟩ to the file identified by ⟨output stream⟩. The second argument, ⟨init code⟩, is provided for any initialisation that needs to be done prior to writing the text. The output stream for the document's auxiliary file is identified by the command \@auxout. You'll need to wrap the point total up in a command that can be used to reference the total at the start of the next run. Remember to use \protect in ⟨text⟩ to prevent expansion of this helper command.
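The general mechanism can be sketched as follows. This is only one possible approach, not necessarily the one used in the downloadable solution, and it doesn't modify \points. The names \pointtotal, \setpointtotal, \savepointtotal and the counter runningpoints are all made up for this illustration:

```latex
\documentclass{article}

\makeatletter
% \pointtotal holds the value read from the .aux file;
% the ?? fallback is shown until that value is available:
\newcommand*{\pointtotal}{??}
% helper written to the .aux file (global, as the file is read in a group):
\newcommand*{\setpointtotal}[1]{\gdef\pointtotal{#1}}
\newcounter{runningpoints}
% write the current counter value to the .aux file:
\newcommand*{\savepointtotal}{%
  \protected@write\@auxout{}%
   {\protect\setpointtotal{\number\value{runningpoints}}}%
}
\makeatother

\begin{document}
Total points: \pointtotal

% stand-ins for what an adjusted \points command would do:
\addtocounter{runningpoints}{5}
\addtocounter{runningpoints}{10}

\savepointtotal
\end{document}
```

The first LaTeX run should show the ?? fallback, and the second run should pick up the saved total from the auxiliary file. The \protect stops \setpointtotal from being expanded when the text is written, whereas the counter value is deliberately expanded so that the number itself is stored.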

You can download or view a solution to this exercise.

Footnotes

¹ You can't export back to the probsoln format.


This book is also available as A4 PDF or 12.8cm x 9.6cm PDF or paperback (ISBN 978-1-909440-07-4).
