To optimise performance, should I store web data as CSV or JSON?

I'm working with a dataset that's relatively large for web users, especially smartphone users. I'm worried about performance. Which is a bigger problem for users?

  1. Forcing the client's browser to fetch/request a large data file (JSON).
  2. Forcing the client's browser to reformat a smaller file (CSV) into a larger file (JSON) so it can be used.

When I compile the data as JSON, it's about 570KB – far larger than I would normally use. And that's stripped right down (e.g. I've reduced the keys to a single character each).

When I compile the data as CSV, it's about 220KB. However, I then need the browser to reformat it into JSON format anyway.

Here's a tiny example. A CSV file:

"year","birth","101","102","103","104","105"
1981,"Australia",5972,1099,573,747,667
1981,"China",141,4,3,2,2
1981,"India",139,5,4,6,2
1981,"Indonesia",371,9,14,5,6
1981,"Malaysia",838,72,42,11,14 

... compared with the same data as JSON:

[{"year":1981,"birth":"Australia","101":5972,"102":1099,"103":573,"104":747,"105":667},
{"year":1981,"birth":"China","101":141,"102":4,"103":3,"104":2,"105":2},
{"year":1981,"birth":"India","101":139,"102":5,"103":4,"104":6,"105":2},
{"year":1981,"birth":"Indonesia","101":371,"102":9,"103":14,"104":5,"105":6},
{"year":1981,"birth":"Malaysia","101":838,"102":72,"103":42,"104":11,"105":14}]
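
For reference, the browser-side conversion in option 2 might look roughly like the sketch below. The function name csvToObjects is made up for illustration, and the parsing assumes the simple layout shown above: quoted strings, bare numbers, and no commas, quotes or line breaks inside a field. Real-world CSV would need a proper parser.

```javascript
// Minimal sketch: convert simple CSV text into an array of objects.
// Assumption: no commas, quotes or newlines inside quoted fields.
function csvToObjects(csvText) {
    const lines = csvText.trim().split(/\r?\n/);

    const parseCell = (cell) => {
        const trimmed = cell.trim();
        // Quoted cells stay strings; unquoted cells are treated as numbers.
        if (trimmed.startsWith('"') && trimmed.endsWith('"')) {
            return trimmed.slice(1, -1);
        }
        return Number(trimmed);
    };

    const header = lines[0].split(',').map(parseCell);

    return lines.slice(1).map((line) => {
        const cells = line.split(',').map(parseCell);
        const record = {};
        header.forEach((key, i) => { record[key] = cells[i]; });
        return record;
    });
}

const csv = [
    '"year","birth","101","102","103","104","105"',
    '1981,"Australia",5972,1099,573,747,667',
    '1981,"China",141,4,3,2,2'
].join('\n');

const records = csvToObjects(csv);
console.log(records[1].birth, records[1]['101']);
```

One thing to be aware of: keys that look like integers ("101", "102", ...) are enumerated before string keys such as "year", so the property order of the resulting objects will not match the column order.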

TLDR: What's more important for performance: (1) minimising the size of data files, or (2) minimising the amount of data processing the browser must do?

1 answer

  • answered 2019-10-08 04:21 Dai

    Preface:

    I'd argue that what you want to do is a kind of premature micro-optimization ( https://en.wikipedia.org/wiki/Program_optimization ). Most web servers GZip HTTP responses anyway, so as far as the data actually transferred is concerned, the CSV and the expanded JSON representations should end up at roughly the same GZip size: the repeated JSON keys are highly redundant, so both files carry about the same information entropy.
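
    One way to check this for a particular dataset is to compress both representations and compare. The following is only a sketch, assuming Node.js and its built-in zlib module; the rows are synthetic (generated with a small made-up random generator), so the numbers it prints say nothing about the real 570KB file. The gap between the two formats typically narrows after compression, but how far depends on the data.

```javascript
// Node.js sketch: compare raw and gzipped sizes of CSV vs JSON for synthetic rows.
const zlib = require('node:zlib');

// Small deterministic pseudo-random generator so the run is repeatable.
let seed = 42;
function nextInt(max) {
    seed = (seed * 1664525 + 1013904223) % 4294967296;
    return Math.floor((seed / 4294967296) * max);
}

const keys = ['year', 'birth', '101', '102', '103', '104', '105'];
const countries = ['Australia', 'China', 'India', 'Indonesia', 'Malaysia'];

// Synthetic data in the same shape as the question's example.
const rows = [];
for (let i = 0; i < 2000; i++) {
    rows.push([
        1981 + (i % 30),
        countries[i % countries.length],
        nextInt(6000), nextInt(1100), nextInt(600), nextInt(800), nextInt(700)
    ]);
}

// CSV: one header line, strings quoted, numbers bare.
const csvLine = (row) => row.map((v) => (typeof v === 'string' ? `"${v}"` : v)).join(',');
const csv = [csvLine(keys), ...rows.map(csvLine)].join('\n');

// JSON: an array of objects that repeats every key on every row.
// Built by hand so the keys stay in column order.
const json = '[' + rows.map((row) =>
    '{' + row.map((v, i) => JSON.stringify(keys[i]) + ':' + JSON.stringify(v)).join(',') + '}'
).join(',\n') + ']';

const sizes = {
    csvRaw: Buffer.byteLength(csv),
    jsonRaw: Buffer.byteLength(json),
    csvGzip: zlib.gzipSync(csv).length,
    jsonGzip: zlib.gzipSync(json).length
};

console.log(sizes);
console.log('raw ratio (JSON/CSV): ', (sizes.jsonRaw / sizes.csvRaw).toFixed(2));
console.log('gzip ratio (JSON/CSV):', (sizes.jsonGzip / sizes.csvGzip).toFixed(2));
```

    Running the same comparison on the real files (read with fs.readFileSync instead of generated) would show how much of the 570KB vs 220KB difference survives compression.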

    Also, I recommend reading this article from Google (dated June 2019): https://v8.dev/blog/cost-of-javascript-2019 - in short: JavaScript is cheap and you only need to worry about optimization on mobile devices, not desktops/laptops.

    Anyway:

    There are a few other alternatives besides CSV and JSON Objects.

    JSON Arrays:

    One option, which is probably the best of both worlds, is using JSON arrays, like so:

    [
     [ "year","birth","101","102","103","104","105" ],
     [ 1981,"Australia",5972,1099,573,747,667 ],
     [ 1981,"China",141,4,3,2,2 ],
     [ 1981,"India",139,5,4,6,2 ],
     [ 1981,"Indonesia",371,9,14,5,6 ],
     [ 1981,"Malaysia",838,72,42,11,14 ]
    ]
    

    You can use named const array indexes to access each data member:

    const Idx = {
        YEAR: 0,
        BIRTH: 1,
        _101: 2,
        _102: 3,
        _103: 4,
        _104: 5,
        _105: 6
    };
    
    var data = JSON.parse( text ); // the array from above
    
    for( var i = 1; i < data.length; i++ ) { // start at 1 to skip the header row
        var row = data[i];
    
        console.log( "Year: %d, Birth: %s", row[Idx.YEAR], row[Idx.BIRTH] );
    }
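
    If you would rather not hard-code the indexes, the lookup can be derived from the header row that is already the first element of the array. This is just a sketch; columnIndex and cell are names invented for the example, and the inline text stands in for the response body.

```javascript
// Sketch: derive the column lookup from the header row rather than hard-coding it.
const text = '[["year","birth","101","102","103","104","105"],' +
    '[1981,"Australia",5972,1099,573,747,667],' +
    '[1981,"China",141,4,3,2,2]]';

// The first element is the header; everything after it is data.
const [header, ...rows] = JSON.parse(text);

// Map each column name to its position, e.g. "birth" maps to 1.
const columnIndex = new Map(header.map((name, i) => [name, i]));

// Hypothetical helper: read one named cell from a row.
function cell(row, name) {
    return row[columnIndex.get(name)];
}

for (const row of rows) {
    console.log('Year: %d, Birth: %s, 101: %d',
        cell(row, 'year'), cell(row, 'birth'), cell(row, '101'));
}
```

    A Map is used instead of a plain object so that numeric-looking column names such as "101" behave like any other key.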
    

    You could also write your own materialization function to convert each row into an object with named properties:

    function Item( row ) {
        this.year = row[Idx.YEAR];
        this.birth = row[Idx.BIRTH];
    }
    
    var data = JSON.parse( text ); // the array from above
    
    var items = data.slice( 1 ).map( row => new Item( row ) ); // slice( 1 ) skips the header row
    

    Array of constructor calls:

    Another alternative to representing each record as an array within a parent array is to represent each record as a constructor call. However, this will not work with JSON.parse. You must either use eval() (NOT RECOMMENDED), render the data directly into the web page from your server-side generation script, or have the client load it into a <script> element (this is how JSONP works, but it is dangerous, because whatever the server returns is executed as code in your page).

    I use this approach myself when rendering data to a webpage for consumption by third-party data-visualization components like D3 or various other charting libraries:

    function Item( year, birth, _101, _102, _103, _104, _105 ) {
        this.year = year;
        this.birth = birth;
        this._101 = _101;
        this._102 = _102;
        this._103 = _103;
        this._104 = _104;
        this._105 = _105;
    }
    
    var data = [
        new Item( 1981,"Australia",5972,1099,573,747,667 ),
        new Item( 1981,"China",141,4,3,2,2 ),
        new Item( 1981,"India",139,5,4,6,2 ),
        new Item( 1981,"Malaysia",838,72,42,11,14 ),
        // etc
    ];
    
    renderChart( data );
    

    I use this approach when I need to perform client-side transformation of data and I don't want to render two copies of the data in different formats to the response, for example. But as I said, this technique does not work with JSON.parse, because JSON must be purely static data and cannot contain constructor calls.
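
    For the server-side rendering option, the generation step could look something like the sketch below. It assumes a Node.js server; renderItemCalls is a hypothetical helper, not part of any library, and it only produces the text that would be written inside a <script> element on a page that already defines Item.

```javascript
// Node.js sketch of the server-side step: turn rows into the source text of
// an array of constructor calls, ready to be written into a <script> element.
function renderItemCalls(rows) {
    const calls = rows.map((row) => {
        // JSON.stringify yields valid JavaScript literals for strings and numbers.
        // Escaping "<" stops a value such as "</script>" from closing the element early.
        const args = row.map((v) => JSON.stringify(v).replace(/</g, '\\u003c'));
        return '    new Item( ' + args.join(', ') + ' )';
    });
    return 'var data = [\n' + calls.join(',\n') + '\n];';
}

const rows = [
    [1981, 'Australia', 5972, 1099, 573, 747, 667],
    [1981, 'China', 141, 4, 3, 2, 2]
];

// The returned string is what the server would emit into the page.
console.log(renderItemCalls(rows));
```

    Going through JSON.stringify for every argument, rather than concatenating raw values, is what keeps strings containing quotes or markup from breaking the generated script.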