fscanf to read data in from a file. A detailed manual page about the function is here.
The usual approach is that we must first pre-allocate storage of the appropriate type for the data we expect (at least, if we wish to save it for further manipulation). So far I've used char, int and double data.
A difficulty is that we usually don't know how much data we will read from the file. For the sites program (here), when we read scores or counts, the file contains an integer as the first value and that specifies how much data is present. But in general we don't know, so we'll have to allocate storage as we go.
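For the easy case, where the first integer in the file says how many values follow, a minimal sketch might look like the one below. The function name read_counted_ints is my own invention for illustration (it is not taken from the sites program), and I'm assuming the values are whitespace-separated integers.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read a stream whose first integer gives the
   number of values that follow. Returns a malloc'd array (the caller
   frees it) and stores the count in *n_out. Returns NULL if the file
   is malformed or the allocation fails. */
int *read_counted_ints(FILE *fp, size_t *n_out) {
    assert(fp != NULL && n_out != NULL);

    int n;
    if (fscanf(fp, "%d", &n) != 1 || n <= 0)
        return NULL;

    /* now that we know the size, allocate exactly that much */
    int *data = malloc((size_t)n * sizeof *data);
    if (data == NULL)
        return NULL;

    for (int i = 0; i < n; i++) {
        /* fscanf returns the number of items it converted */
        if (fscanf(fp, "%d", &data[i]) != 1) {
            free(data);   /* fewer values than the header promised */
            return NULL;
        }
    }
    *n_out = (size_t)n;
    return data;
}
```

Checking the return value of fscanf on every call is what lets us notice a short or malformed file instead of silently using uninitialized storage.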
fscanf takes a pointer to the storage as an argument. We can either supply a pointer directly (an array name, for example), or pass the address of a variable of the correct type. There are lots of options for fscanf, including the ability to read data in larger chunks, to read data of different types (in a specified order), or to skip certain characters, but I'm not going to worry about those complications here.
In the first example, we don't save the data, just read it and echo it to stdout. The file 'lorem.txt' is the default source, but an alternate can be specified on the command line.
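As a rough sketch of the read-and-echo idea (not necessarily identical to the project's own code), the loop could be written as below. The function name echo_words and the 64-byte buffer size are assumptions of mine.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: echo every whitespace-delimited word from `in`
   to `out`, one word per line. Returns the number of words read. */
int echo_words(FILE *in, FILE *out) {
    assert(in != NULL && out != NULL);

    /* an array name decays to a pointer, so no & is needed here */
    char word[64];
    int count = 0;

    /* %63s caps the read at 63 characters, leaving room for '\0';
       a longer word is simply split across reads */
    while (fscanf(in, "%63s", word) == 1) {
        fprintf(out, "%s\n", word);
        count++;
    }
    return count;
}
```

In main, one would open the source with something like fopen(argc > 1 ? argv[1] : "lorem.txt", "r"), check the result for NULL, call echo_words(fp, stdout), and then fclose the file.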
example1.c:
Output:
We read integers (or floats) by appropriate changes to the arguments to fscanf:
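A sketch of what those changes amount to, under my own naming (sum_ints and sum_doubles are hypothetical helpers, not the original examples): the conversion specifier and the type of the variable whose address we pass must agree, so %d goes with int, %f with float, and %lf with double.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: sum all integers in the stream and store how
   many were read in *n_out. */
long sum_ints(FILE *fp, int *n_out) {
    assert(fp != NULL && n_out != NULL);
    int value;          /* plain variable, so we pass its address */
    long total = 0;
    int n = 0;
    while (fscanf(fp, "%d", &value) == 1) {
        total += value;
        n++;
    }
    *n_out = n;
    return total;
}

/* Same idea for doubles: note %lf (not %f) when the target is a double. */
double sum_doubles(FILE *fp, int *n_out) {
    assert(fp != NULL && n_out != NULL);
    double value;
    double total = 0.0;
    int n = 0;
    while (fscanf(fp, "%lf", &value) == 1) {
        total += value;
        n++;
    }
    *n_out = n;
    return total;
}
```

Both loops stop at end of file or at the first token that cannot be converted, since in either case fscanf returns something other than 1.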
In the third example, we're reading the DNA sequence of E. coli (more than 4 million nucleotides). A rather crude approach is to just allocate an array of sufficient size (N = 5000000 works). A more flexible and less wasteful implementation would appropriately scale up the memory usage as needed. We substitute the following for the second half of the code above (starting at int count = 0;), reading just the first 240 nt:
Output:
Of course, the appropriate source file must exist for this to work.
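For the more flexible approach mentioned above, where memory is scaled up as needed rather than fixed at N, one possibility is to double the buffer with realloc whenever it fills. This is only a sketch: read_sequence is a name I made up, the starting capacity of 16 is arbitrary, and I'm assuming the file holds just sequence characters separated by whitespace (a header line, if present, would need to be skipped first).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read up to `limit` non-whitespace characters
   (limit == 0 means no limit) into a growing, NUL-terminated buffer.
   The caller frees the result. Returns NULL if allocation fails. */
char *read_sequence(FILE *fp, size_t limit, size_t *len_out) {
    assert(fp != NULL && len_out != NULL);

    size_t cap = 16, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    char c;
    /* the leading space in " %c" skips newlines and other whitespace */
    while ((limit == 0 || len < limit) && fscanf(fp, " %c", &c) == 1) {
        if (len + 1 >= cap) {            /* keep one byte for '\0' */
            size_t new_cap = cap * 2;    /* double, don't add a constant */
            char *tmp = realloc(buf, new_cap);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap = new_cap;
        }
        buf[len++] = c;
    }
    buf[len] = '\0';
    *len_out = len;
    return buf;
}
```

Calling read_sequence(fp, 240, &len) would stop after the first 240 nt, while a limit of 0 reads the whole file. Doubling keeps the number of realloc calls small (about 18 doublings from 16 bytes to cover 4 million characters).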
Zipped project files on Dropbox (here)