Download json data files below
# The Story and the Code
So recently I undertook a small geospatial assignment where I needed to find the population of Singapore by planning zone. So I went to http://data.gov.sg/ and did some searching.
Here is what I found:
- There is one dataset called “Subzone Census2010” and another called “Region Census2010”, but I just couldn’t download them correctly. Every time I clicked on the SHP file icon, a zip file did download, but it was always only 22 bytes in size, so clearly something was wrong with it and I couldn’t use it directly. You can try it yourself by following the links: subzone census data and region census data.
- There is no total population distribution data, only data about the resident population, which I believe consists of only Permanent Residents and Citizens, based on this report from the National Population and Talent Division. So as an international student, I am not considered a resident yet. :P
- The resident population dataset is called Resident Population by Planning Area/ Subzone, Age Group and Sex (2014), and can be downloaded from the link. I understand that this covers only around 70% of the population, but this dataset is really awesome. By awesome I mean that it is very recent (last updated on 26-SEP-2014) and extremely detailed: it breaks the population down by gender and by age group in very small intervals (5 years).
However, this awesome dataset is in XLS format and I couldn’t do much with it directly, let alone use it inside my web application.
So I decided to clean up the data and transform it into something usable. And of course I used Ruby, since it’s my favorite so far!
I exported the Excel file to CSV using Excel 2010 on my Mac. I picked the 2014 CSV file, since that’s the one relevant to me, then named the CSV file to
I did a bit of searching and found this gem called smarter_csv, and it works like a charm. So the code is
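A minimal sketch of this loading step, under a few assumptions: it uses Ruby's standard `csv` library as a stand-in for smarter_csv (with the gem, the equivalent call would be `SmarterCSV.process` on the file path), and `SAMPLE_CSV`, `header_to_key`, `load_rows` and all the numbers in the sample are made up for illustration. Unlike smarter_csv, this stand-in leaves every value as a string.

```ruby
require "csv"

# Hypothetical sample standing in for the exported sheet. The numbers are
# invented; the real file has many more rows and age-group columns.
SAMPLE_CSV = <<~CSV
  Planning Area,Subzone,Total,0 - 4,5 - 9
  Ang Mo Kio,Cheng San,"28,070","1,230","1,310"
  Bedok,Bedok North,"12,340",560,610
CSV

# Turn a header such as "0 - 4" into a symbol key such as :"0_-_4".
# This only approximates the key style smarter_csv produces.
def header_to_key(header)
  header.strip.downcase.gsub(/\s+/, "_").to_sym
end

# Parse CSV text into an array of hashes, one hash per data row.
def load_rows(csv_text)
  CSV.parse(csv_text, headers: true).map do |row|
    row.to_h.transform_keys { |header| header_to_key(header) }
  end
end

rows = load_rows(SAMPLE_CSV)
puts rows.first.inspect
```

For a real file, `load_rows(File.read(path))` would do the same thing, where `path` points at the exported CSV.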
And the above code produces results like the following
However, this is not exactly what I wanted: the keys have too many dashes in them (not elegant), and I expected the values to be plain numbers, not strings with separators.
So I wrote a simple loop to convert them into the format I wanted, with the following code
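Here is a rough sketch of what such a conversion loop could look like. The input shape (`RAW_ROWS`), the target key format (`:age_0_4`) and the helper names `clean_key`, `clean_count` and `clean_row` are all assumptions for illustration, not the exact names used in the original code. Only the age-group columns are converted here; the total column is deliberately left as a string at this stage.

```ruby
# Hypothetical row in the shape a CSV loader might return: symbol keys with
# dashes (from headers like "0 - 4") and counts stored as strings.
RAW_ROWS = [
  { planning_area: "Ang Mo Kio", subzone: "Cheng San", total: "28,070",
    "0_-_4": "1,230", "5_-_9": 1310 }
].freeze

# Rename a key such as :"0_-_4" to :age_0_4; leave every other key untouched.
def clean_key(key)
  match = key.to_s.match(/\A(\d+)_-_(\d+)\z/)
  match ? :"age_#{match[1]}_#{match[2]}" : key
end

# Convert "1,230" to 1230; pass through values that are already numbers.
def clean_count(value)
  value.is_a?(String) ? value.delete(",").to_i : value
end

# Build a new row: age-group columns get a clean key and a numeric value,
# all other columns (including :total) are copied as they are.
def clean_row(row)
  row.each_with_object({}) do |(key, value), cleaned|
    new_key = clean_key(key)
    cleaned[new_key] = new_key == key ? value : clean_count(value)
  end
end

cleaned_rows = RAW_ROWS.map { |row| clean_row(row) }
puts cleaned_rows.first.inspect
```

Keys that do not match the "number, dash, number" pattern are copied through unchanged, so an open-ended age group would need its own rule.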
So with this snippet, I have converted the number strings into actual numbers by removing the `,` symbol, and also mapped the key names into the format I would like to use.
So now I need to slice the whole data into three sub-groups, namely “total”, “female”, and “male”.
After examining the data for a while, I came up with a simple workaround that helps me achieve the goal.
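One way such a workaround could look, sketched under loud assumptions: the three groups are stacked one after another in the same array, the second and third groups each start with a repeated header row whose `:total` field holds the literal string "Total", and the groups appear in the order total, female, male. The `MARKER` constant, the `split_groups` helper and the sample rows are hypothetical; swap the group names if the sheet is ordered differently.

```ruby
# Assumed text of the repeated header cell that separates the groups.
MARKER = "Total"

# Made-up rows: two data rows per group, separated by repeated header rows.
SAMPLE_ROWS = [
  { subzone: "A", total: "1,000" },
  { subzone: "B", total: "2,000" },
  { subzone: "Subzone", total: "Total" }, # repeated header row
  { subzone: "A", total: "520" },
  { subzone: "B", total: "1,030" },
  { subzone: "Subzone", total: "Total" }, # repeated header row
  { subzone: "A", total: "480" },
  { subzone: "B", total: "970" }
].freeze

# Split the stacked rows into three groups, dropping the marker rows.
def split_groups(rows)
  markers = rows.each_index.select { |i| rows[i][:total] == MARKER }
  raise ArgumentError, "expected exactly 2 marker rows" unless markers.size == 2

  sp1, sp2 = markers # stop points: where the 2nd and 3rd groups begin
  ep = rows.length   # end point: one past the last row
  {
    total: rows[0...sp1],
    female: rows[(sp1 + 1)...sp2],
    male: rows[(sp2 + 1)...ep]
  }
end

groups = split_groups(SAMPLE_ROWS)
groups.each { |name, group_rows| puts "#{name}: #{group_rows.length} rows" }
```

Because the markers are found by comparing the `:total` field against a string, this step has to run while that field is still unconverted.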
`sp1` simply means stop point 1, `sp2` means stop point 2, and `ep` just means end point.
After this I realized that the `total` attribute was still in string format and I hadn’t converted it yet. I couldn’t convert it before the slicing, though, since I was actually using that attribute to find the index points.
So I did another hack on this.
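A minimal sketch of this kind of fix, assuming each row is a hash with a `:total` key that holds a string such as "28,070". The `convert_totals` helper is a hypothetical name, and it returns new rows rather than changing the originals in place.

```ruby
# Return copies of the rows with :total converted from "28,070" to 28070.
# Values that are already numbers survive the round trip through to_s.
def convert_totals(rows)
  rows.map do |row|
    row.merge(total: row[:total].to_s.delete(",").to_i)
  end
end

# Made-up example rows.
example = [
  { subzone: "A", total: "28,070" },
  { subzone: "B", total: 560 }
]
puts convert_totals(example).inspect
```

Running it once per sub-group after the slicing keeps the index points intact.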
As simple as this, I did a hack and converted all the `total` values into actual numbers.
Now let’s just write those arrays into files!
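Writing the arrays out could be sketched like this, using Ruby's standard `json` library. The `write_groups` helper, the one-file-per-group layout and the `<group name>.json` file names are assumptions for illustration.

```ruby
require "json"

# Write each group to "<dir>/<group name>.json" and return the paths written.
# `groups` is assumed to be a hash such as { total: [...], female: [...] }.
def write_groups(groups, dir)
  groups.map do |name, rows|
    path = File.join(dir, "#{name}.json")
    File.write(path, JSON.pretty_generate(rows))
    path
  end
end
```

Calling `write_groups(groups, ".")` would then place the files in the current directory. `JSON.pretty_generate` keeps the output readable; plain `JSON.generate` gives smaller files if size matters for the web application.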
All done. Sit back and have some fun with the data.