fscanf-speed
fscanf-speed copied to clipboard
How fast is fscanf?
How fast is fscanf?
Suppose you wish to read a large text file containing N floating point
numbers on each line. Such a file that might have been created, for
example, by numpy.savetxt
in python.
What is the fastest way to read such a file? In python, one option is
to use numpy.loadtxt
. Unfortunately, this is quite slow. For a file
with one million rows and five columns:
$ time python -c 'import numpy; numpy.loadtxt("large.txt")'
real 0m7.037s
user 0m6.800s
sys 0m0.230s
I decided to find out much of this was I/O bound and how much was
consumed by actual processing, so I wrote a simple C++ program that
uses fscanf
to read the five floating point numbers. The critical
loops is very simple:
FILE* fd = fopen(path, "r");
while (!feof(fd)) {
fscanf(fd, "%lf %lf %lf %lf %lf\n",
&line[0], &line[1], &line[2], &line[3], &line[4]);
}
fclose(fd);
Here are the results of parsing the same file (one million rows, five columns):
$ c++ -O3 --std=c++11 -o loadtxt loadtxt.cpp
$ time ./loadtxt --fscanf large.txt
real 0m2.712s
user 0m2.673s
sys 0m0.038s
As you can see, it is just shy of a three times as fast as
numpy.loadtxt
. This is of course much less general than
numpy.loadtxt
, not least because the number of columns is
hard-coded.
Now for the surprising part. I imagined that since fscanf has been a part of libc for so very long, it would be about as fast as it is possible to be. Certainly I imagined that the C loop above would be almost entirely I/O bound. But I was wrong.
The first give-away was when I tried using pandas.load_csv
from
python:
time python -c 'import pandas; pandas.read_csv("large.txt", sep=" ", header=None)'
real 0m1.086s
user 0m0.930s
sys 0m0.149s
This gives nearly a three times speedup over the fscanf loop, even
including the overhead of loading up python and importing pandas
! At
this point I became curious about how quickly a lean C program could
parse the text file if instead of using fscanf
I used a custom
ascii-to-floating-point parser. Here are the results:
$ c++ -O3 --std=c++11 -o loadtxt loadtxt.cpp
$ time ./loadtxt --direct large.txt
real 0m0.320s
user 0m0.287s
sys 0m0.032s
That is a further 3x speedup over pandas! At this point we are
almost 10 times faster than the fscanf
loop and a full 30 times
faster than numpy.loadtxt
.