Thursday, October 29 • 11:50am - 12:30pm
Hadoop on OpenStack: Scaling Hadoop-SwiftFS for Big Data

Elastically scalable big data clusters that respond to varying workload demands while efficiently using and sharing cloud resources are attainable with Hadoop on OpenStack. Achieving this requires separating cluster compute from cluster storage so that compute can scale independently of data. In this session we discuss how OpenStack Swift can serve as the basis for an elastically scalable Hadoop cluster on OpenStack and detail the challenges of using Swift as the primary data store for big data. We describe the cluster storage design and the enhancements to the Hadoop Swift file system implementation that are necessary to achieve performance at big data scale.

Successful approaches to a number of the challenges are presented:

  • Storage architecture design addressing object, block, and transient storage

  • Hadoop SwiftFS enhancements to handle tens of thousands to millions of objects

  • Vendor-specific support for Swift API implementations (Ceph)

  • Tool ecosystem interoperability

Andrew Leamon

Director, Engineering Analysis, Comcast
Drew Leamon started his career at Microsoft while studying Computer Science at Princeton University. In his studies, he delved into Computer Graphics, Artificial Intelligence and Computational Neurobiology. At Microsoft, he collaborated with Microsoft Research on one of the first commercial implementations of collaborative filtering for e-commerce. This was released as Microsoft Site Server: Commerce Edition...

Christopher Power

Principal Engineer, Comcast
At Comcast, I lead the architecture definition and implementation of elastically scalable big data platforms, built on the OpenStack cloud, that support analytics, visualizations, simulations, and machine learning.

Thursday October 29, 2015 11:50am - 12:30pm
