Apache Spark and Its Role in the Enterprise Data Hub


Mike Olson, Chief Strategy Officer, Cloudera
mike.olson@cloudera.com, @mikeolson
© 2014 Cloudera, Inc. All rights reserved.

Spark Unifies and Simplifies Hadoop
• Batch Processing
• Stream Processing
• Machine Learning

Developing and supporting Spark together to ensure customer success

Spark at Cloudera
• October 2013: Databricks and Cloudera partner
• February 2014: Spark support added to CDH
• July 2014: Continuing support & innovation

Spark is a Core Component of Hadoop
Commit Activity, Past 12 Months
• Hadoop Core: 2589
• Spark: 4149
• All Other Ecosystem Projects Shipped by Cloudera: 1243 8

Fully Integrated into CDH
• Integrated and supported part of our platform
• Diverse use cases in production
• Well-trained support and external trainings
[Platform diagram labels: 3rd Party Apps; Batch Processing; Interactive SQL; Search Engine; Machine Learning; Stream Processing; Workload Management; Storage; Filesystem; Online NoSQL]

Customer Adoption
• Search personalization through machine learning investigations
• Fast processing of millions of stock positions and future scenarios
• Genomics research using Spark pipelines
• Predictive modeling of disease conditions

What’s Next?

Cloudera Developer Training for Apache Spark
The only hands-on deep dive into building unified applications with Spark
Public GA: Aug 5, Redwood City

Dell In-Memory Appliances for Cloudera Enterprise
• Simplifies and speeds up complex cluster deployments
• Includes Cloudera Enterprise and ScaleMP's Versatile SMP (vSMP) architecture
• Built on the Intel(R) Xeon(R) processor-based Dell R920 hardware
• Optimized for Spark
Spark as the Standard Processing Engine

Bringing the Communities Together
The Hive and Spark communities are coming together to drive consolidation in the Hadoop ecosystem

Hive on Spark

Architecture
[Architecture diagram labels: Spark (Batch Processing, Stream Processing); Hive (Parser, Metastore, Semantic Analyser, Logical Plan, Optimizer, Task execution layer); MR; Tez; HDFS]

Our SQL on Hadoop Vision
• SQL
• BI and SQL Analytics
• Batch Processing
• Mixed Spark and SQL Applications

Thank you!
Mike Olson
mike.olson@cloudera.com
@mikeolson