Jun 25, 2012

Monotonically increasing row ids with MapReduce


Assigning monotonically increasing IDs in a distributed system can be tricky. Here's a way to do it in MapReduce.

Feb 06, 2012

Graph partitioning in MapReduce with Cascading (part 2)


Follow-up to the previous post. This time we're going to partition fully connected graphs, using some criterion to split them into parts.

Jan 29, 2012

Graph partitioning in MapReduce with Cascading (part 1)


Take a big graph represented as files on HDFS and partition it into smaller pieces using an iterative MapReduce-based solution implemented with Cascading.

Nov 29, 2011

Twitter data fun


Mapping your Twitter followers, getting distributions and more.

Oct 23, 2011

Choosing a Twitter handle


How I obtained a three-character Twitter name using Python and the Twitter API.

Oct 21, 2011

Why blog?!


Rant-like observation on why I require this online presence.