14 Oct 2021, 06:00 PM

How to Share Terabytes of Live Data with Delta Sharing

Technology

In this technical session, Frank Munz of Databricks will introduce Delta Sharing, a Linux Foundation open source solution for sharing massive amounts of live data in a cheap, secure, and scalable way.

 

Delta Sharing integrates with pandas and Apache Spark for the real-time exchange of large data sets, enabling secure data sharing across products for the first time. It leverages modern cloud object stores, such as S3, ADLS, or GCS, to reliably transfer large data sets. An open source reference sharing server is available to get started.
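As a minimal sketch of the consumer side: a recipient receives a profile file from the data provider (containing the sharing server endpoint and a bearer token) and addresses a shared table as `<profile>#<share>.<schema>.<table>`. The file and table names below are hypothetical placeholders, not real shares.

```python
def table_url(profile_path: str, share: str, schema: str, table: str) -> str:
    # A Delta Sharing table is addressed as <profile>#<share>.<schema>.<table>
    return f"{profile_path}#{share}.{schema}.{table}"

# "config.share" is a hypothetical profile file issued by the data provider.
url = table_url("config.share", "my_share", "my_schema", "my_table")

# With the delta-sharing Python package installed and a live sharing server,
# the table can then be loaded directly (commented out here, since it needs
# network access and valid credentials):
#
# import delta_sharing                       # pip install delta-sharing
# df = delta_sharing.load_as_pandas(url)     # small tables, single machine
# sdf = delta_sharing.load_as_spark(url)     # terabyte scale, on a cluster
```

The pandas reader suits exploratory work on a laptop, while the Spark reader lets a cluster pull only the underlying Parquet files it needs from cloud storage.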

 

Within the Databricks ecosystem, Unity Catalog implements the open source Delta Sharing protocol, so you can share data across organizations regardless of system boundaries, compute platforms, or cloud providers.

 

This is a technical session with difficulty level easy/medium (L300).

 

Speaker Bio

 

Dr. Frank Munz is a Staff Developer Advocate at Databricks. He is the published author of three computer science books. Frank built up technical evangelism for Amazon Web Services in Germany, Austria, and Switzerland. He is a certified AWS and GCP Professional Cloud Architect and ML/Data Engineer. Frank fulfilled his dream of speaking at top-notch conferences on every continent (except Antarctica, because it is too cold there). He has presented at conferences such as Devoxx, KubeCon, re:Invent, Voxxed Days, and JavaOne. He holds a PhD in Computer Science from TU Munich.

When Frank is not working, he enjoys travelling in Southeast Asia, skiing in the Dolomites, tapas in Spain, and scuba diving in Australia.