



 


Saturday, September 13, 2014

Statistics, the art of telling official ‘lies’? – Khor Zi Jian



I was browsing the blog of YB Lim Kit Siang and the article “RM5,900 a month income is simply not true” by YB Steven Sim caught my attention.
Sim, MP of Bukit Mertajam, began his article by asking the difference between telling lies and citing statistics.
He answered the question by quoting Mark Twain, “There are three kinds of lies – lies, damned lies and statistics”.
As a Malaysian who received formal education in statistics and is currently working as a statistical analyst in the US, I feel that I have an obligation to respond to this article.
First and foremost, I would like to point out that equating statistics with lies shows a blatant disregard for statisticians and the discipline of statistics.
On the contrary, statistics is a sophisticated field that has broad applications in various disciplines, such as economics, social science, IT, medicine and the pharmaceutical industry.
Sim mentioned on his webpage that he was one of the award winners of the World Economic Forum several years ago.
Anyone who has the opportunity to speak with top-notch economists would have learned from them that a significant portion of their work relies on statistics.
In fact, the sub-field econometrics is closely related to statistics, and many Nobel Prize winners in Economics utilise their extraordinary statistical skills to help them attain major breakthroughs in their research.
The contribution of statistics does not stop at the field of economics.
Last year, Hollywood superstar Angelina Jolie received global attention when she revealed that she underwent a double mastectomy to reduce her risk of breast cancer from 87% to below 5%.
Has anyone ever questioned where the figures (87% and 5%) came from? How could a doctor come up with a figure as precise as 87% (instead of a more general estimate, say 90%)? Answer: medical personnel work together with biostatisticians (statisticians who conduct testing in the medical and biological fields) to get these figures.
In the US, thousands of biostatisticians collaborate with medical researchers, physicians, and pharmacists to analyse the likelihood of patients surviving chronic diseases and conduct clinical trials to test the effectiveness of new drugs.
In fact, politicians like Sim and Lim could benefit greatly from statistics themselves, as statistics can help them better understand the voting preferences of the Malaysian population and enable them to formulate a winning strategy to capture Putrajaya one day.
Anyone who is familiar with the US presidential election would know that statistics played an essential role in the election.
In 2012, the Obama for America analytics team recruited a group of bright data scientists (who later founded Bluelabs after running a successful campaign) to help President Barack Obama analyse the political spectrum across all states and ethnicities.
By building statistical models and utilising innovations in big data technology, the data scientists could quantify the voters’ political beliefs.
As a result, the team could reach out to the right voters, solicit the best donors and channel the campaign resources effectively. It is worth mentioning here that Nate Silver, a Bayesian statistician, gained global fame when he accurately predicted the winner in all 50 states in the 2012 US presidential election.
Considering the enormous contributions of statistics, it is not only unfair to accuse statistics of being a form of lie; such a remark is also ignorant and distasteful. Statistics do not lie, humans do.
So what exactly is the problem behind the issue of RM5,900 household monthly income? The problem lies in the interpretation of statistics.
As Sim mentioned in the article, there is a distinction between mean and median (a basic statistical concept).
In a dataset where outliers are present, the median portrays a better picture of the actual situation, because a few extreme values can pull the mean far away from what a typical observation looks like. Hence, in a nation where there is huge income inequality, the median income level is a better indicator of the population’s earning ability.
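To make the point concrete, here is a small sketch in Python. The ten household incomes below are made up purely for illustration; they are not official figures or real survey data. The sketch only shows how a single very high earner can drag the mean well above what most households in the sample earn, while the median stays close to the middle of the group.

```python
from statistics import mean, median

# Hypothetical monthly household incomes in RM (illustrative only, not real data).
# Nine ordinary households plus one very high earner (the outlier).
incomes = [2000, 2500, 3000, 3500, 4000, 4500, 5000, 6000, 8000, 60000]

mean_income = mean(incomes)      # pulled upwards by the single outlier
median_income = median(incomes)  # middle of the sorted values

# How many households actually earn less than the "average" income?
below_mean = sum(1 for income in incomes if income < mean_income)

print(f"Mean income:   RM{mean_income:,.0f}")
print(f"Median income: RM{median_income:,.0f}")
print(f"Households earning below the mean: {below_mean} of {len(incomes)}")
```

In this made-up sample, nine of the ten households earn less than the mean, which is why quoting the mean alone can give a misleading impression of what a typical household takes home.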
It is not my intention to belittle or criticise Sim or DAP, nor is it my intention to back the numbers provided by the government.
Rather, I would like to use this opportunity to discuss the significance of statistics, as it is unfair to lay the blame on statistics when the actual problem arises from a lack of ability to analyse data effectively.
While I appreciate Sim saying that when used responsibly, statistics provide “useful insights to the reality”, such a consolation is insufficient.
It is already saddening that statistical methodology is underutilised in Malaysia; ridiculing statistics as a low-level tool to help politicians lie is simply rubbing salt into the wound.
The field of statistics is full of young talents who are equipped with strong quantitative skills and the passion to make a positive impact.
Every year, technology giants like Google, Amazon, and Facebook hire dozens of data scientists to help quantify consumer behaviour and make better predictions about their businesses.
In 2012, Harvard Business Review even rated data scientist as the sexiest job of the 21st century. While not all of us will end up pursuing a career in statistics, we can all learn more about statistics to help us make sense of data.
* Khor Zi Jian reads The Malaysian Insider.
