Lecture on Large-scale data processing with Hadoop MapReduce
- Date: 02 Dec 2011
- Time: 11:00 to 12:00
- Location: Amolf, 2nd floor, Science Park 104, Amsterdam
BiG Grid and Amsterdam Information Retrieval (AIR) present a guest lecture by Dr. Jimmy Lin on processing large datasets with the open-source MapReduce implementation Apache Hadoop. If you are curious to see how Hadoop can help with your processing needs, and are not afraid of some technical examples, this should be a helpful and interesting talk for you. Jimmy will give special attention to bioinformatics, specifically de novo short-read assembly using the de Bruijn graph approach on Hadoop.
Jimmy received his Ph.D. from MIT and is an associate professor at the University of Maryland, where he holds several positions. He is currently on sabbatical at Twitter. Jimmy is co-author, with Chris Dyer, of the book 'Data-Intensive Text Processing with MapReduce', the most exhaustive source of information on MapReduce currently available. (See http://www.umiacs.umd.edu/~jimmylin/book.html)
Apache Hadoop is an open-source data-processing framework that can leverage the power of very large clusters of commodity computers. It is today's most widely used software for distributed data processing and provides a rich ecosystem of related tools, together with a large, enthusiastic, and helpful developer community. It was used to win the 2009 edition of Jim Gray's Sort benchmark, and it is part of the software behind Watson, the IBM computer system that won the game show Jeopardy! earlier this year.
BiG Grid and SARA offer scientists in the Netherlands access to a prototype Apache Hadoop service, and will have a production-level service ready by the first quarter of 2012. This production service will offer more than 500 cores of processing power and 500 terabytes of storage capacity, and will be made available to all scientists in the Netherlands. For more information, contact us at evert.lammerts@removethis.sara.nl.
More information on:
- Dr. Jimmy Lin: http://ischool.umd.edu/content/jimmy-lin-0
- Hadoop: http://hadoop.apache.org/
- Hadoop @ BiG Grid: http://www.sara.nl/project/hadoop
Directions to venue: http://www.amolf.nl/about-amolf/visitor/#c42

