Feature Tracker
ID | 37 |
---|---|
Date: | 2016-02-01 10:42:42 |
Status | Open |
Category | datatool |
Summary | Load databases faster |
Description
Hi Nicola,
I think it is possible to significantly speed up the loading of databases from external files.
Currently, each new row is appended to the toks register piecewise: first the row ID is added to the toks register, and then each subsequent entry is inserted inside that row, which means splitting the contents of the toks register in two before every insertion. This is fine if you're adding single entries one at a time.
However, when loading a file with several hundred records, the constant splitting could be avoided. I was thinking a good method would be to collect all the entries of a row in a separate toks register while parsing the current line. Once the entire line has been parsed, the completed row could be appended to the database in a single fast put-right operation, rather than with a split-and-insert near the end of the register for every entry. I haven't written the code to do it, but I would expect the performance to be significantly better.
Note that for my purposes it is unfortunately not an option to use the datatooltk tools.
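To make the accumulate-then-append idea concrete, here is a rough sketch only. Every name in it (`\sketchdb`, `\sketchrow`, `\sketchrowid`, `\sketchstartrow`, `\sketchaddentry`, `\sketchflushrow` and the three marker macros) is made up for illustration and is not one of datatool's internal macros, and the storage layout is an assumption rather than the package's real format.

```latex
\documentclass{article}

% All names below are hypothetical; none of them are datatool internals.
\newtoks\sketchdb     % the whole database, one \sketchrowmarker{...} per row
\newtoks\sketchrow    % scratch register for the row currently being parsed
\newcount\sketchrowid % running row number

% Start a new row: reset the scratch register so it holds only the row ID.
\newcommand{\sketchstartrow}{%
  \advance\sketchrowid by 1\relax
  \sketchrow=\expandafter{\expandafter\sketchid
    \expandafter{\number\sketchrowid}}%
}

% Add one key/value entry to the scratch row. Only the short row register
% is rebuilt here, so the cost does not depend on the database size.
\newcommand{\sketchaddentry}[2]{%
  \sketchrow=\expandafter{\the\sketchrow\sketchentry{#1}{#2}}%
}

% Append the finished row to the database in a single put-right step.
\newcommand{\sketchflushrow}{%
  \sketchdb=\expandafter{\the\expandafter\sketchdb
    \expandafter\sketchrowmarker\expandafter{\the\sketchrow}}%
}

% Give the markers a meaning so the stored rows can be typeset.
\newcommand{\sketchrowmarker}[1]{\par\noindent #1}
\newcommand{\sketchid}[1]{Row #1:}
\newcommand{\sketchentry}[2]{ #1=#2;}

\begin{document}
% Two rows, as they might arrive while parsing two lines of a file.
\sketchstartrow
\sketchaddentry{Name}{Ann}\sketchaddentry{Score}{7}%
\sketchflushrow
\sketchstartrow
\sketchaddentry{Name}{Bob}\sketchaddentry{Score}{9}%
\sketchflushrow

\the\sketchdb
\end{document}
```

The database register is still copied once per row when it is extended, but it is no longer taken apart and rebuilt for every single entry. That is where I would expect the saving to come from.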
MWE
No mwe.tex
Evaluation
Comments
1 comment.
Page permalink: https://www.dickimaw-books.com/featuretracker.php?key=37
Date: 2016-02-01 18:23:10
Hi Morten,
Your suggestions are much appreciated :-) Your original redesign in v2.0 was a big improvement to datatool. I think this is a good idea too. I'll try implementing it when I have the time.