sum-csv

After the last blog categories reorganization, I realized that I talk less about what I do and more about what others do. That make sense, as this blog is part of my learning process and I’m always looking around to find ways to improve myself. Yet, I’d like to start writing more about the little things I do. Writing helps me to reflect upon the how, so eventually I’ll learn more about my thought processes. These are likely to be very small things.

sum-csv

sum-csv is a small utility I have built to help me to crunch some statistics I was working with. I had a complete dataset in a CSV file, but what I wanted was an ordered list of the number of times something happened.

Original CSV:	What I wanted:

A data transformation

This is a small task – my old self whispered. Yet, instead of opening the editor and start coding right away, the first thing I did was drawing things. I am a visual person and drawing helps me to gain understanding. The algorithm I came up with was a succession of mathematical transformations: Which is to say:

transpose the original matrix
eliminate the rows I was not interested in
for each row, group all numerical values (from column 1 onwards) by adding them, to calculate the total
sort the rows by the total

Now, I was prepared to write some code. Amusingly, the gist of it is almost pure English:

d3-array.transpose( matrix ).filter( isWhitelisted ).map( format ).sort( byCount );

Reflection

Creating production-ready code took me four times the effort of devising an initial solution: finding good and tested libraries for some of the operations not built-in in the language such as reading a CSV into a matrix or transpose the matrix itself, creating the tests for being able to sleep well at night, distributing the code in a way which is findable (GitHub/npm) and usable by others/my future self, and, actually, writing the code.

I am not always able to write code as a series of mathematical transformations, but I find pleasure when I do: it is much easier to conceptually proof whether the code is correct. I also like how the code embodied some of the ideas I’m more interested in lately, such as how a better vocabulary helps you to make things simpler.

oandre.gal

sum-csv

sum-csv

A data transformation

Reflection

Like this:

Comments

Leave a Reply Cancel reply