I worked with CSV data some years back, and I've always been curious about how well Node.js handles CSV files on the backend compared with the likes of Java, .NET, Ruby, or PHP environments.
As it turns out, you can get quite far with surprisingly little code, starting with just two modules.
let fs = require('fs');
let fastcsv = require('fast-csv');
We use the fast-csv module to handle the CSV data (especially useful when the dataset is fairly large, since the module is built with performance in mind) as we read it from the file stream, like so.
let readableStreamInput = fs.createReadStream('./some-csv-table.csv');
let csvData = [];

fastcsv
  .fromStream(readableStreamInput, {headers: true}) // fast-csv v2 API; newer versions expose the same flow as csv.parseStream
  .on('data', (data) => {
    // copy each parsed row into a plain object, keyed by its header name
    let rowData = {};
    Object.keys(data).forEach(current_key => {
      rowData[current_key] = data[current_key];
    });
    csvData.push(rowData);
  })
  .on('end', () => {
    console.log('csvData', csvData);
    console.log('total rows of table', csvData.length);
  });
Here's what the code above is doing:
- First, we create and open a ReadableStream, readableStreamInput, on our CSV file so we can read it from the filesystem.
- We then use the npm module fast-csv to consume readableStreamInput via its fromStream method. In the same call, I pass headers: true because I want to reference each row field's value by its corresponding field name when parsing.
- Then, within our data callback handler, the fast-csv module hands us each line of the CSV input stream so we can decide what to do with it. Normally we can perform a number of interesting operations here, especially when refining and parsing the CSV input data (see the sketch after this list). In this case, I create a new rowData object for each line that comes through, capturing each row's field values as object properties keyed by their respective field names (current_key).
- Once I'm done copying all the column fields in the row, I push that row object onto my csvData array.
- The same process then repeats for the rest of the rows in the CSV file.
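To make that refining step concrete, here's a rough sketch of a data handler that trims stray whitespace and coerces numeric columns, reusing readableStreamInput and csvData from above. The field names price and quantity are made up for illustration:

fastcsv
  .fromStream(readableStreamInput, {headers: true})
  .on('data', (data) => {
    let rowData = {};
    Object.keys(data).forEach(current_key => {
      // trim stray whitespace from every field value
      let value = String(data[current_key]).trim();
      // coerce the (hypothetical) numeric columns into real numbers
      if (current_key === 'price' || current_key === 'quantity') {
        value = Number(value);
      }
      rowData[current_key] = value;
    });
    csvData.push(rowData);
  });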
What's cool about this is that you can immediately serve the parsed CSV data to the front end as JSON, without configuring any extra web server middleware on the backend, just by adding a few more lines of code.
let http = require('http');

// respond to every request with the parsed rows as JSON
let server = http.createServer((req, resp) => {
  resp.writeHead(200, {'content-type': 'application/json'});
  resp.end(JSON.stringify(csvData));
});

server.listen(5050);
console.log('Server listening on port: 5050');
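One caveat: csvData is filled in asynchronously, so a request that arrives before the stream's end event fires would see an empty array. A minimal sketch of one way around that, reusing the same pipeline as above, is to start the server from inside the end handler:

let http = require('http');

fastcsv
  .fromStream(readableStreamInput, {headers: true})
  .on('data', (data) => csvData.push(data))
  .on('end', () => {
    // only start accepting requests once every row has been parsed
    let server = http.createServer((req, resp) => {
      resp.writeHead(200, {'content-type': 'application/json'});
      resp.end(JSON.stringify(csvData));
    });
    server.listen(5050, () => console.log('Server listening on port: 5050'));
  });

With the server up, a plain curl http://localhost:5050 would return the whole table as a JSON array.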
And that’s it!
With this, it leaves room for other possibilities in your web application stack, such as persisting the JSON data into a non-relational database (e.g. MongoDB, Cassandra) or a relational one (e.g. MySQL, Postgres) through some ORM/ODM layer, and then designing your RESTful endpoints on top of it. You end up with a perhaps unsophisticated, but lightweight, architecture you can start with and scale your web app from without much friction.
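As one example, here's a rough sketch of persisting the parsed rows into MongoDB with the official mongodb driver; the connection URL and the csv_demo/rows database and collection names are assumptions for illustration:

let { MongoClient } = require('mongodb');

// assumes a MongoDB instance running locally on the default port
let client = new MongoClient('mongodb://localhost:27017');

client.connect()
  .then(() => client.db('csv_demo').collection('rows').insertMany(csvData))
  .then(result => {
    console.log('inserted rows:', result.insertedCount);
    return client.close();
  })
  .catch(console.error);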
Using libraries from the Node.js ecosystem certainly takes care of all the nitty-gritty, complicated parts of streaming, fetching, and serializing data for users to consume. Its rich package registry offers plenty of useful and, often, simpler abstractions that help you write reliable code.
Happy Coding!