Tim Hunter: Build, Scale, and Deploy Deep Learning Pipelines Using Apache Spark

Tim Hunter in the Flowers room of the Turing Building

Abstract: Deep learning has shown tremendous successes, yet it often requires a lot of effort to leverage its power. Existing deep learning frameworks require writing a great deal of code just to run a model, let alone run it in a distributed manner. Deep Learning Pipelines is a Spark Package library that makes practical deep learning simple, building on the Spark MLlib Pipelines API. Leveraging Spark, Deep Learning Pipelines scales out many compute-intensive deep learning tasks. In this talk we dive into:

 – the various use cases of Deep Learning Pipelines such as prediction at massive scale, transfer learning, and hyperparameter tuning, many of which can be done in just a few lines of code (see the sketch after the abstract).
 – how to work with complex data such as images in Spark and Deep Learning Pipelines.
 – how to deploy deep learning models through familiar Spark APIs such as MLlib and Spark SQL to empower everyone from machine learning practitioners to business analysts.
Finally, we discuss integration with popular deep learning frameworks.
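As a taste of the "few lines of code" claim above, here is a minimal transfer-learning sketch with the Deep Learning Pipelines (sparkdl) package. It is not taken from the talk itself: it assumes a running SparkSession, the sparkdl package attached to the cluster, and an illustrative flowers dataset on disk; the helper names (readImages, DeepImageFeaturizer) reflect the 2017-era sparkdl API and may differ in later releases.

    # Transfer learning sketch with Deep Learning Pipelines (sparkdl).
    # Assumptions: SparkSession available, sparkdl installed, image
    # directories at the illustrative paths below.
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import LogisticRegression
    from pyspark.sql.functions import lit
    from sparkdl import DeepImageFeaturizer, readImages

    # Load two classes of images into DataFrames and tag each with a label.
    tulips_df = readImages("/data/flowers/tulips").withColumn("label", lit(1))
    daisy_df = readImages("/data/flowers/daisy").withColumn("label", lit(0))
    train_df = tulips_df.union(daisy_df)

    # A pre-trained InceptionV3 network acts as a fixed feature extractor;
    # a plain MLlib logistic regression is trained on top of its features.
    featurizer = DeepImageFeaturizer(inputCol="image", outputCol="features",
                                     modelName="InceptionV3")
    lr = LogisticRegression(maxIter=20, regParam=0.05, labelCol="label")
    model = Pipeline(stages=[featurizer, lr]).fit(train_df)

    # The fitted PipelineModel scores new images with model.transform(test_df).

The same fitted pipeline is an ordinary MLlib model, which is what makes the deployment story through familiar Spark APIs (including Spark SQL) possible, as discussed in the talk.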

Bio: Timothée Hunter is an engineer at Databricks, the company created by the founders of Apache Spark, and one of the regular contributors to the MLlib project. A graduate of the École Polytechnique with a PhD in artificial intelligence from UC Berkeley, he has developed numerous distributed algorithms with Spark since Apache Spark version 0.2.
