Showing posts from November, 2013

Book Review: Practical Data Analysis by Hector Cuesta

I have been reading this book since last week, and now I want to share my thoughts about it. I was excited to review it because I had never heard of most of the tools it features, such as OpenRefine, MongoDB, and MapReduce. The book runs to 360 pages and covers a surprising number of topics. It also comes with a GitHub repository containing all the code.
Practical Data Analysis is all about applying statistical methods in computer science. I find it very useful, since this was not taught in my statistics classes. In college, we only practiced statistics in fields like sociology, psychology, agriculture, economics, chemistry, biology, industrial engineering, and many others, but never in computer science; we only dealt with computing when coding in R or SAS. Hal Varian once said in this video that,
. . . we've got at least hundred statisticians on Google . . .
And I was curious about that. I mean, what are they doing at Google? What statistical tools do they use? Thanks …

R: Mapping Super Typhoon Yolanda (Haiyan) Track

After reading Enrico Tonini's post, I decided to map the track of Super Typhoon Haiyan using OpenStreetMap, maptools, and ggplot2. If mapping with googleVis was possible in only 13 lines, the same can be achieved with the packages I used; but because I played with the aesthetics of the map, I ended up with more than that. The data was taken from Weather Underground and, to be consistent with the units of the JMA Best Track Data, which I used for mapping Typhoon Labuyo (Utor), the wind speed in miles per hour (mph) was converted to knots. So here is the final output,

TD - Tropical Depression
TS - Tropical Storm
TY - Typhoon
STY - Super Typhoon

As for the super typhoon, please don't visit my country again. I would like to thank all who prayed for the Philippines, especially the countries that helped us recover from this tragedy.
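
For reference, the mph-to-knots conversion mentioned in this post can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the code used for the map; the function name and the sample value are made up for the example. It relies on the exact definitions 1 mile = 1.609344 km and 1 nautical mile = 1.852 km.

```python
# Sketch of the mph -> knots conversion (hypothetical helper, not the post's code).
KM_PER_MILE = 1.609344          # exact, by definition of the international mile
KM_PER_NAUTICAL_MILE = 1.852    # exact, by definition of the nautical mile


def mph_to_knots(mph):
    """Convert a wind speed from miles per hour to knots."""
    return mph * KM_PER_MILE / KM_PER_NAUTICAL_MILE


# Illustrative value only; it is not taken from the Weather Underground data.
sample_mph = 100.0
print(round(mph_to_knots(sample_mph), 1))
```

One knot is slightly faster than one mile per hour, so the converted values are always a little smaller than the original mph readings.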

Python: Venn Diagram

A Venn diagram is very useful for visualizing operations between events/sets. In this post, we will learn how to draw one in Python. First, we need to install the matplotlib-venn module. Open the terminal or command prompt, and run the following command: pip install matplotlib-venn

Now that we have it, here are the three set operations I visualized:
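
The original code and figures are not reproduced here, so what follows is only a minimal sketch of how such diagrams might be drawn with matplotlib-venn. It assumes the three operations are union, intersection, and difference; the sets A and B, the helper plot_venn, and the REGIONS table are all made up for illustration.

```python
# Two small illustrative sets (hypothetical, not from the original post).
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

# The three set operations assumed here, using Python's built-in sets.
union = A | B
intersection = A & B
difference = A - B

# Region ids used by venn2: '10' = only A, '01' = only B, '11' = both.
REGIONS = {
    "union": ("10", "01", "11"),
    "intersection": ("11",),
    "difference": ("10",),
}


def plot_venn(a, b, operation):
    """Draw a two-set Venn diagram and shade the regions of one operation."""
    # Imported here so the set arithmetic above works without plotting libraries.
    import matplotlib.pyplot as plt
    from matplotlib_venn import venn2

    diagram = venn2([a, b], set_labels=("A", "B"))
    for region in ("10", "01", "11"):
        patch = diagram.get_patch_by_id(region)
        if patch is None:  # the region is empty, so nothing was drawn for it
            continue
        shaded = region in REGIONS[operation]
        patch.set_color("steelblue" if shaded else "white")
        patch.set_edgecolor("black")
    plt.title(operation)
    plt.show()
```

Calling plot_venn(A, B, "intersection"), for example, shades only the overlap of the two circles; the other two names in REGIONS work the same way.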

R: Mapping Philippine Earthquakes (October 2013)

Last month, on October 15, 2013, at around 8:12 am (Philippine Time), a magnitude 7.2 earthquake hit the island of Bohol, destroying infrastructure and killing hundreds of residents. The Philippine Institute of Volcanology and Seismology, or PhiVolcs, recorded more than 3000 aftershocks, but only a fraction of these are available in its Earthquake Bulletin. There are 448 data points in total for last month's earthquakes, and here is the final output,
Adding a layer for the two-dimensional density of the data points, we have