Building Real Time BI Systems with Kafka, Spark & Kudu: Spark Summit East talk by Ruhollah Farchtchi

Views: 13,915

Spark Summit

One of the key challenges in working with real-time and streaming data is that the format used to capture data is not necessarily the best format for ad hoc analytic queries. For example, Avro is a convenient and popular serialization format that works well for initially bringing data into HDFS. Avro has native integration with Flume and other tools, which makes it a good choice for landing data in Hadoop. But columnar file formats, such as Parquet and ORC, are much better optimized for ad hoc queries that aggregate over large numbers of similar rows.
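
The difference between the two layouts can be sketched in a few lines of plain Python. This is only an illustration of the idea, not how Avro, Parquet, or ORC are implemented: the list of dicts stands in for a row-oriented capture format, the dict of lists stands in for a columnar layout, and the field names (`user_id`, `page`, `latency_ms`) and helper functions are hypothetical.

```python
# Illustrative only: plain Python structures stand in for a row-oriented
# (Avro-like) layout and a columnar (Parquet/ORC-like) layout.
# All field names and values are made up for this sketch.

events = [
    {"user_id": 1, "page": "/home", "latency_ms": 120},
    {"user_id": 2, "page": "/cart", "latency_ms": 340},
    {"user_id": 1, "page": "/pay", "latency_ms": 95},
]


def to_columns(rows):
    """Pivot a list of records into one list of values per field."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def total_latency_rowwise(rows):
    # Row layout: each whole record is visited even though the
    # aggregate needs only one of its fields.
    return sum(row["latency_ms"] for row in rows)


def total_latency_columnar(columns):
    # Columnar layout: the aggregate reads one contiguous list and
    # never touches the other fields.
    return sum(columns["latency_ms"])


columns = to_columns(events)
```

Both functions return the same total. The point is the access pattern: appending a new event is natural in the row layout, while an aggregate over one field reads far less data in the columnar layout.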
