Modern data workloads come in all shapes and sizes: numbers, strings, JSON, images, even whole PDF textbooks. To process this data we still rely on utilities such as ffmpeg for video, jq for JSON, and PyTorch for tensors. However, these tools were not built for large-scale ETL, so we often end up building bespoke data pipelines that orchestrate data movement and custom tooling. If only downloading images, resizing them, and running vision models were as simple as extracting a substring in Spark SQL! Daft (www.getdaft.io) is a next-generation distributed query engine built in Python and Rust. It provides a familiar dataframe interface for easy, performant processing of multimodal data at scale. Join us as we demonstrate how to build a multimodal data lakehouse using Daft on your existing infrastructure (S3, Delta Lake, Databricks, and Spark).
Talk By: Jay Chia, Co-Founder, Eventual Computing
Here’s more to explore:
Big Book of Data Engineering: 2nd Edition: dbricks.co/3Xp...
The Data Team's Guide to the Databricks Lakehouse Platform: dbricks.co/46n...
Connect with us: Website: databricks.com
Twitter: / databricks
LinkedIn: / data…
Instagram: / databricksinc
Facebook: / databricksinc