And now, if e’er by chance I put
My fingers into glue,
Or madly squeeze a right-hand foot
Into a left-hand shoe…


– Lewis Carroll

I should start by giving fair warning that this is part-post and part-rant, but I shall try and minimise the ranting. I was very pleased to read Allison Schrager’s excellent piece on the perils of data journalism on Quartz. She argues that journalists using data should be careful of its economic meaning because

“Empirical researchers spend years learning how to apply statistics and countless hours dissecting data. And then even the most experienced, well-intentioned researcher might end up with biased results.”

This is important, because it is easy to be misled and, in turn, easy to mislead those who are unlikely to check how you’ve reached a particular conclusion. There’s now a whole bunch of really interesting data that’s accessible, and we’re getting better at representing it in eye-catching ways that are comprehensible to a much wider audience, so Schrager’s warning is both timely and spot-on. However, it is hard to guard against misinterpretation, even when the conscientious make a point of stating caveats and riders. Sometimes bias or laziness overrides all of these.

To give you an example, I got drawn into a discussion on Facebook about how individual Indian states are ranked on human development indicators, only to realise that the person’s claims were based on a complete misreading of the data. The person’s identity is irrelevant, but their position as a member of the faculty of a premier higher education institution in India is a cause for some consternation.

This individual posted on a friend’s wall that, according to a new UNDP index, many of the states that led in previous rankings had been displaced. They cited this in support of the Gujarat model of development, claiming that the new rankings indicated Gujarat’s true place.

Quote: “Recently UNDP Announced that, they are going to adopt Gujarat governments definition of Development and in new scale Himachal Pradesh, Kerala, Panjab are the least developed states.”

I recently wrote a post on the rankings of Indian States, and was curious, so I looked up the UNDP index.  
     
I found a paper by Suryanarayana, Aggarwal & Prabhu (2011), in which they argue that current Human Development Index (HDI) rankings do not take inequality into account, and propose a new inequality-adjusted index, the IHDI. Read the entire paper here. Some states that do quite well on conventional HDI measures have high levels of inequality, which means that their IHDI score is lower in comparison. The authors only consider 19 states, so these rankings are distinct from a ranking of all the states as given here (with a lovely interactive map; go check it out). Below is Table 3 from the paper, with the scores of each state according to HDI and IHDI, the ratio and loss percentage, the rankings according to each and, finally, the change in ranking when you shift from HDI to IHDI.

As can be seen from the table, most states’ positions in the rankings don’t change hugely. The exceptions are Madhya Pradesh and Uttarakhand, which move down three places each, and Bihar and Orissa, which gain two places each. On a side note, this says a lot about Orissa and Bihar’s development model: they are traditionally poor states that have developed rapidly in the past few years, and it is particularly laudable that they’ve managed to grow without exacerbating inequality.
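
For anyone who wants to check how the columns in a table like this relate to one another, here is a minimal Python sketch. The four states and their scores are made-up placeholders, not figures from the paper, and I am assuming the loss column is defined in the usual UNDP way, as the percentage by which the IHDI falls short of the HDI. The function names are my own.

```python
# Hypothetical placeholder scores as (HDI, IHDI) pairs.
# These are NOT the values from Suryanarayana et al. (2011).
scores = {
    "State A": (0.60, 0.50),
    "State B": (0.55, 0.40),
    "State C": (0.50, 0.42),
    "State D": (0.45, 0.36),
}


def rank_desc(values):
    """Rank names from the highest score (rank 1) to the lowest."""
    ordered = sorted(values, key=values.get, reverse=True)
    return {name: position + 1 for position, name in enumerate(ordered)}


def compare(scores):
    """Build the ratio, loss, rank and rank-change columns for each state."""
    hdi = {state: pair[0] for state, pair in scores.items()}
    ihdi = {state: pair[1] for state, pair in scores.items()}
    hdi_rank = rank_desc(hdi)
    ihdi_rank = rank_desc(ihdi)

    rows = {}
    for state in scores:
        ratio = ihdi[state] / hdi[state]
        rows[state] = {
            "ratio": round(ratio, 3),
            # Assumed definition: loss = (1 - IHDI / HDI) * 100
            "loss_pct": round((1 - ratio) * 100, 1),
            "hdi_rank": hdi_rank[state],
            "ihdi_rank": ihdi_rank[state],
            # Positive means the state climbs when inequality is considered.
            "rank_change": hdi_rank[state] - ihdi_rank[state],
        }
    return rows


if __name__ == "__main__":
    for state, row in compare(scores).items():
        print(state, row)
```

In this toy data, State B has the largest loss to inequality and slips one place, while State C gains one. The point is only that a rank is always relative to an ordering, and a rank of 1 here means the highest score.
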

So what of my learned friend’s contention that Gujarat has beaten Kerala and Punjab, amongst others? I’m still scratching my head over that one, but the first figure in Suryanarayana et al.’s paper may shed some light, though it still requires a great amount of imagination and ingenuity.

The states are arranged in ascending order of their HDIs, as indicated by the notes below the figure, so the least developed states appear first. Perhaps that is what caused the confusion? One would think the height of the bars should give some indication.

I posted some links on the FB thread and tried to make the point that there might have been a misreading of the data, but my arguments failed to have any impact and I gave up. However, the episode has highlighted the trickiness of data representation and the futility, in some cases, of trying to idiot-proof it.