About
Shop
LaTeX
Software
Books
Gallery
News
Contact
Blog
Settings
Account
Latest news 2024-10-15: New blog post: Tales for Our Times Book Launch.


2.2.1 Loading Data From a CSV File

Most spreadsheet applications can export data to a CSV file. By default datatool assumes that the data in this file is separated by commas where the values are optionally delimited with the double-quote character " but if this isn't the case you need to specify the separator using:

\DTLsetseparator{character}

and the delimiter using:

\DTLsetdelimiter{character}

A common alternative is to use the tab character ↹ as a separator but this is awkward to specify in LaTeX, so datatool provides

\DTLsettabseparator

which is the same as \DTLsetseparator{character} where ⟨character⟩ is a tab. Note that this command changes the category code of the tab character. If you later want to treat a tab as a regular space, you can reset the category code after you have loaded the data using:

\DTLmaketabspace

Remember to specify the separator and delimiter characters before you load the file.

For example, suppose your data is saved in a CSV file in the form:

Surname;First Name;Title;Registration Number
|Smith, Jr|;John;Mr;12345
Brown;Jane;Miss;12346
Brown;Andy;12347
Adams;|Z\"oe|
then in your document, before you load this file, you need to write:

\DTLsetseparator{;}\DTLsetdelimiter{|}

For convenience, all the examples in this book assume the default comma separator and double-quote delimiter.

When creating your CSV files be careful of spurious spaces. For example, the following line of data:

Brown , Jane , Miss , 12356

isn't the same as:

Brown,Jane,Miss,12356

If the CSV file contains extended characters, make sure the file was saved with the same encoding as your LaTeX document and use the inputenc [40] and fontenc [59] packages.4 (See Volume 1.) The sample files that accompany this book all use UTF-8 encoding. Make sure you load the inputenc package with the utf8 option before you load any of these comma-separated variable (CSV) files.

Once your data is in a CSV file you can load it into a datatool database using:

\DTLloaddb[options]{db-name}{filename}

where the CSV file is called ⟨filename⟩. The argument ⟨db-name⟩ is the database label, as described above.

The \DTLloaddb command assumes the CSV file either doesn't contain any of TeX's special characters (see Volume 1) or, if it does, they form correct LaTeX code. If this isn't the case, instead of using \DTLloaddb, you can use:

\DTLloadrawdb[options]{db-name}{filename}

This is like \DTLloaddb except that it performs a substitution on nine of the ten special characters. (The backslash always retains its special state.) The mappings are listed in Table 2.1.


Table 2.1: Special character mappings used by \DTLloadrawdb
Character Mapping
% \%
$ \$
& \&
# \#
_ \_
{ \{
} \}
~ \textasciitilde
^ \textasciicircum

You can add extra mappings using:

\DTLrawmap{string}{replacement}

For example, to replace the character £ with \pounds:

\DTLrawmap{£}{\pounds}

(Alternatively use the inputenc package.)

Examples

  1. Suppose your CSV file looks like:
    Experiment,Result
    1,$42.08\pm 0.1$
    2,$48.03\pm 0.2$
    
    In this file, the $ character has been used to indicate in-line maths mode (see Volume 1). This should be left as it is when the data is loaded, so use \DTLloaddb.

  2. Suppose your CSV file looks like:
    Title,Price
    "Duck & Goose's Adventures",$10.00
    "The Return of Duck & Goose",$11.00
    
    Now the characters & and $ are intended literally and should not be interpreted by TeX, so you need to use \DTLloadrawdb to ensure they are converted to the correct commands.

  3. Now suppose your CSV file looks like:
    Title,Price
    "Duck & Goose's Adventures",£10.00
    "The Return of Duck & Goose",£11.00
    
    Again you need to use \DTLloadrawdb but before you do that you may need to define a new mapping for the £ character using \DTLrawmap, as described above. (Or use the inputenc package.)

The commands \DTLloaddb and \DTLloadrawdb can't read data where there are EOL characters within a cell. For example, neither command can read a CSV file that looks like:

Title,Price
"Duck and Goose's
Adventures",10.00
"The Return of
Duck and Goose",11.00

If you have a CSV file in this form, you can use datatooltk to convert the CSV file to a datatool (.dbtex) file, described in the next section.

You may have noticed that both \DTLloaddb and \DTLloadrawdb have an optional argument. This is a key=value list. Available keys are as follows:

noheader
This is a boolean key. The value can be either false (the CSV file has a header row, as in the examples above) or true (the CSV file doesn't have a header row). The default is false. If you want to set this value you may omit =true, so noheader=true is the same as just noheader.

keys
Each column must have a unique label assigned to it. This makes it easier to reference but, like the database label, the column label mustn't contain special characters. The default action of \DTLloaddb and \DTLloadrawdb is to use the value given in the header row as a label. This may be inappropriate, so you can set a different set of labels using this option. The value must be a comma-separated list of labels in the same order as the columns in the CSV file. For example,

\DTLloaddb[keys={title,price}]{books}{booklist.csv}

(note the braces).

You should use this option if the header row contains special characters. If the CSV file has no header and no label has been specified, a default label is generated in the form

where ⟨n⟩ is the column number. By default, \dtldefaultkey is just “Column”, but you can change this by redefining the command (see Volume 1). Note that an empty item in the keys list indicates an empty label for that column.

autokeys
This is a boolean key that was introduced in version 2.22. If true, all the column labels will automatically be assigned the default label \dtldefaultkeyn⟩ described above. This is useful if you have a lot of columns where the header may contain special characters, and you don't want to have to list every column in the keys list. This means that you need to know the column index if you want to reference the data in it.

headers
Each column not only has a unique label assigned to it, but also has a header or title. The column headers are used in commands such as \DTLdisplaydb described in §2.6 Displaying Tabulated Data. The default column headers are taken from the header row in the CSV file but if they aren't appropriate or your file doesn't have a header row, you can use this option to assign headers. As with the keys option, described above, the value must be a comma-separated list of header text in the same order as the columns in the CSV file. For example:

\DTLloaddb[headers={Book Title,Price (\pounds)}]
 {books}% database label
 {booklist.csv}% filename

(note the braces). An empty item in the headers list indicates an empty header for that column. For example:

\DTLloaddb[headers={Book Title,}]{books}{booklist.csv}

indicates that the second column has a blank header.

omitlines
The value must be a non-negative number indicating how many lines to skip at the start of the CSV file. For example, if the CSV file contains two lines of unwanted material at the start, then you need to use omitlines=2.

Examples

  1. Suppose your CSV file called products.csv looks like:
    This is a list of products in my shop.
    Last edited 2014-01-22
    
    Title,Price (£)
    "Duck & Goose's Adventures",10.00
    "The Return of Duck & Goose",11.00
    
    When you load this data into your document, you need to skip the first three lines as they don't form part of the data. You also need to map the pound symbol (£) and the ampersand (&). Additionally, it's a good idea to provide short unique labels to identify the columns:

    \DTLrawmap{£}{\pounds}% add mapping for £ symbol
    \DTLloadrawdb
     [% 
       omitlines=3,% header row is on line 4
       keys={title,price}% column labels
     ]%
     {products}% database label
     {products.csv}% filename
    

  2. Suppose your CSV file called products.csv looks like:
    "Duck and Goose's Adventures",10.00
    "The Return of Duck and Goose",11.00
    
    Here there isn't a header row. You could assign labels as in the previous example:

    \DTLloaddb
     [%
      noheader,% no header row in file
      keys={title,price},% column labels
      headers={Title,Price (\pounds)}% column titles
     ]% 
     {products}% database label
     {products.csv}% filename
    

    However, if you're not interested in referencing any columns (for example, you just want to display the data in a table, as described in §2.6 Displaying Tabulated Data) you can let datatool assign default labels:

    \DTLloaddb
     [% 
       noheader,% no header row in file
       headers={Title,Price (\pounds)}% column titles
     ]% 
     {products}% database label
     {products.csv}% filename
    



Footnotes

... packages.4
Just use fontspec [76] for XeLaTeX.

This book is also available as A4 PDF or 12.8cm x 9.6cm PDF or paperback (ISBN 978-1-909440-07-4).

© 2015 Dickimaw Books. "Dickimaw", "Dickimaw Books" and the Dickimaw parrot logo are trademarks. The Dickimaw parrot was painted by Magdalene Pritchett.

Terms of Use Privacy Policy Cookies Site Map FAQs