Assigning monotonically increasing IDs in a distributed system can be tricky. Here's a way to do it in MapReduce.
A follow-up to the previous post. This time we partition fully connected graphs, using some criterion to split them into parts.
Take a big graph represented as files on HDFS and partition it into smaller pieces using an iterative MapReduce-based solution implemented with Cascading.
How I obtained a three-character Twitter name using Python and the Twitter API.