Need a cheap MapReduce? Amazon EC2 and Hadoop are your answer.
It’s time to re-examine those long-running batch jobs. Could you partition the data to allow for MapReduce? I bet you can. I’ve always wanted an affordable way to fire up 30 servers and run MapReduce operations against giant datasets. It’s confirmed: I’m a dork.
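If you're wondering what "partition the data" looks like in practice, here is a rough single-machine sketch in Python. It is not Hadoop code. The function names (partition, map_partition, reduce_counts, word_count) and the word-count job are my own illustrative assumptions. The point is only that each chunk is processed independently, which is what lets a framework hand chunks to separate servers.

```python
from collections import Counter
from functools import reduce


def partition(records, n_parts):
    """Split records into n_parts independent chunks (round-robin)."""
    return [records[i::n_parts] for i in range(n_parts)]


def map_partition(lines):
    """Map step: count words using only the lines in this chunk."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def reduce_counts(left, right):
    """Reduce step: merge two partial word counts into one."""
    return left + right


def word_count(records, n_parts=3):
    # Hypothetical driver: on a real cluster each map_partition call
    # would run on a different machine; here they run one after another.
    partials = [map_partition(chunk) for chunk in partition(records, n_parts)]
    return reduce(reduce_counts, partials, Counter())


if __name__ == "__main__":
    print(word_count(["a b", "b c", "c c"]))
```

If your batch job can be written so that the map step never needs to look outside its own chunk, it is a good candidate for this treatment.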
Tom White sent me a note this week to inform me that he had implemented a Hadoop file system on top of S3. This file system can be used as a full or partial replacement for HDFS, the Hadoop Distributed File System.
Because bandwidth between EC2 instances and data stored in S3 is not metered or billed, this is a very cost-effective way to process large amounts of data.