How to Analyze Data from a REST API with Flink SQL

1,797 views

Confluent

A day ago

Comments: 5
@ConfluentDevXTeam 2 months ago
Hey, it's Lucia! Hopping in here to say that if you're working on transforming, cleaning, and connecting data, we've got resources for you. Head over to Confluent Developer and check out more of our demos: cnfl.io/3X9niaV P.S. Like I said in the video, I'm happy to answer questions in the comments. Chime in!
@Fikusiklol 2 months ago
Is this different from ksqlDB and the stream processing that happens there? Sorry for being ignorant, just genuinely confused :)
@ConfluentDevXTeam 2 months ago
Lucia here. That's a great question! Both ksqlDB and Flink SQL can analyze data that is in Kafka topics. However, ksqlDB is built on Kafka Streams, while Flink SQL is an API offered by Apache Flink. You can read more about Flink SQL here: cnfl.io/4bR5ndv
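[Editor's note: for a flavor of what this looks like in practice, here is a minimal sketch of a continuous Flink SQL query over a Kafka-backed table. The table name flights and the column callsign are hypothetical stand-ins, not names from the video.]

  -- Minimal sketch of a continuous Flink SQL query over a Kafka-backed table.
  -- `flights` and `callsign` are hypothetical names, not from the video.
  -- The result updates continuously as new events arrive on the topic.
  SELECT callsign, COUNT(*) AS position_reports
  FROM flights
  GROUP BY callsign;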
@shintaii84 2 months ago
I'm sorry to say, but I did not learn anything besides that we can create a cloud account… I think I could build a script that does this in a few hours with any DB. So what is unique here? Why not just a cron job running my script, saving to Postgres, or even skipping the save and putting events on the topic line by line?
@DaveTroiano 2 months ago
Hello, demo developer Dave here :) Thanks for tuning in! I would call out these uniqueness points, particularly compared to the "script into any db" idea:

1. Ease of use, both in terms of development and deployment. Connector config plus SQL running in a cloud engine is easier than scripting and having to run that script reliably. To your point, neither approach is very difficult if the goal is just to reach a demo point, but the runtime aspect in particular has many big, hard problems lurking underneath once you go beyond a demo (next point...).

2. All of the other solution hardening that you would face doing this for real is a lot easier with this approach than with rolling your own: resilience against REST API flakiness, fault tolerance in the connector infrastructure, logging, etc. Where do you want to build all of the features you'll probably need post-demo, and how much time do you want to spend developing and maintaining them on your own? In other words, many people could get to the equivalent demo point with a script and Postgres pretty quickly, but the marginal effort needed to harden it would be significantly higher.

3. Ad hoc streaming / real-time analytics. This is mostly a response to the question "why not use any DB?" This demo is about getting started, but it then enables real-time answers to ad hoc questions, say a QoS-type question like "how many aircraft are taxiing *currently*, and what's the avg / max taxiing time in the past minute?" (see the sketch after this comment). A cron job and Postgres might work for batch and for answering these kinds of questions after the fact, but the streaming aspect is unique and the reason to be looking at technologies like Kafka and Flink (many more details on the benefits of Flink in this Confluent Developer course: developer.confluent.io/courses/apache-flink/intro/ ). In this example, it takes seconds to get from "data available via API" to "data reflected in a streaming aggregation," given the latency delay inherent in this particular API. Still pretty snappy, and a difficult time-to-live bar to achieve with a cron job and Postgres.

4. This is more of a "coming soon," but I would expect data quality rules to become available for connectors (not supported as of today). This feature would unlock data quality at the source and help developers proactively address REST APIs changing out from under them. (In my experience, REST APIs can be a bit of a wild west when it comes to format reliability.) More here: docs.confluent.io/cloud/current/sr/fundamentals/data-contracts.html. This would be a demo enhancement when that feature becomes available, but I'm thinking ahead to yet another problem that developers would face building a pipeline like this in production, where opting for a managed quality-control feature beats implementing it yourself.

Cheers 🙂 Dave
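[Editor's note: to make point 3 concrete, here is a rough Flink SQL sketch of that ad hoc taxiing question, using a one-minute tumbling window. Everything named here is a hypothetical stand-in for whatever schema the connector actually lands in the topic: the table flight_events, the columns icao24, status, and taxi_time_sec, and the event_time watermark column.]

  -- A minimal sketch, NOT the demo's actual schema: flight_events, icao24,
  -- status, taxi_time_sec, and event_time are hypothetical stand-ins.
  -- Counts distinct taxiing aircraft and the avg/max taxi time per
  -- one-minute tumbling window over the Kafka-backed table.
  SELECT
    window_start,
    window_end,
    COUNT(DISTINCT icao24) AS taxiing_aircraft,
    AVG(taxi_time_sec)     AS avg_taxi_time_sec,
    MAX(taxi_time_sec)     AS max_taxi_time_sec
  FROM TABLE(
    TUMBLE(TABLE flight_events, DESCRIPTOR(event_time), INTERVAL '1' MINUTE))
  WHERE status = 'TAXIING'
  GROUP BY window_start, window_end;

Because it is a continuous query, the answer to "how many aircraft are taxiing right now?" keeps itself up to date as events arrive, which is the part a cron job against Postgres cannot easily replicate.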