Spark Structured Streaming Checkpoint
6:26
Spark Structured Streaming Introduction
10:42
Kafka Fundamentals Part-2
11:56
Kafka Fundamentals Part-1
15:16
DBT Tutorial: DBT Lineage
2:41
9 months ago
DBT Tutorial: Macros in DBT
8:22
9 months ago
DBT Tutorial: Introduction to DBT
14:25
Comments
@RaghavendraRao-hx6js 5 hours ago
I need a complete video on PySpark.
@asntechies8017 1 day ago
Can you create a project video where IoT device data is processed in real time using Kafka Streams? That would be great. Thanks in advance 😊
@anirvandecodes 14 hours ago
Thanks for the idea! I will try to create some project videos.
@asntechies8017 13 days ago
I have a query that you might be able to help with. 1. I am saving data from Kafka to TimescaleDB, but for each offset's messages I need to query the database to get the userId associated with the IoT sensor, so one query is executed per offset processed. This is causing a max-connections error. Is there any solution for that? (For now I have added Redis plus connection pooling, but I don't think it will solve it in the long term.) 2. As the single table grows to 30-40 GB, inserts into TimescaleDB get slower. What should we do to make them fast?
@asntechies8017 13 days ago
Thanks in advance
@anirvandecodes 13 days ago
You should try to batch your queries (the goal is to minimize the number of database calls), or you can copy the data from the database into Databricks. Check out this one: kzbin.info/www/bejne/pGG1hGp6d8Ssb9U
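A minimal sketch of the batching idea, in case it helps: instead of one lookup per offset, collect a micro-batch of messages and resolve all their userIds in a single query. This is only illustrative, assuming confluent-kafka and psycopg2 against TimescaleDB/Postgres; the topic name, table name, and columns (sensor-readings, sensors, sensor_id, user_id) are placeholders, not anything from the video.

```python
import json
import psycopg2
from confluent_kafka import Consumer

# Placeholder connection and consumer settings
conn = psycopg2.connect("dbname=iot user=app password=secret host=localhost")
consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "iot-loader",
    "auto.offset.reset": "earliest",
    "enable.auto.commit": False,
})
consumer.subscribe(["sensor-readings"])

while True:
    # Pull a whole micro-batch instead of handling one offset at a time
    msgs = consumer.consume(num_messages=500, timeout=1.0)
    if not msgs:
        continue
    records = [json.loads(m.value()) for m in msgs if m.error() is None]
    sensor_ids = list({r["sensor_id"] for r in records})

    # One round trip to the database for the entire batch
    with conn.cursor() as cur:
        cur.execute(
            "SELECT sensor_id, user_id FROM sensors WHERE sensor_id = ANY(%s)",
            (sensor_ids,),
        )
        user_by_sensor = dict(cur.fetchall())

    enriched = [{**r, "user_id": user_by_sensor.get(r["sensor_id"])} for r in records]
    # ... write `enriched` to TimescaleDB with one batched INSERT / COPY ...
    consumer.commit()
```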
@asntechies8017 13 days ago
Just subscribed to your channel
@anirvandecodes 13 days ago
Thank you. Please share the playlist with your LinkedIn network; it will help this channel grow.
@asntechies8017 13 days ago
Nice video bro
@anirvandecodes 13 days ago
Thank you so much
@KBEERU 13 days ago
It's short and sweet and very descriptive. I installed everything as per the video, but I encountered an error: Error from git --help: Could not find command, ensure it is in the user's PATH and that the user has permissions to run it: "git". Please let me know how to resolve this error.
@anirvandecodes 13 days ago
Thank you. The git path is not set properly; check out this one: kzbin.info/www/bejne/oqWcoHd3q9V6aqs. Please share the playlist with your network; it will help this channel grow.
@KBEERU 12 days ago
@anirvandecodes Thank you so much
@andriifadieiev9757 15 days ago
Great content, thank you for sharing! Special respect for the GitHub link.
@anirvandecodes 15 days ago
Thank you. Please share the playlist with your LinkedIn network so that it reaches a wider audience.
@ur8946 17 days ago
Hi, how do I set up Kafka? Do we have any video on this?
@anirvandecodes 16 days ago
Yes, check out this video to set up Kafka on Confluent Cloud: kzbin.info/www/bejne/o5qxZYqCnq-hiKc Playlist link: kzbin.info/aero/PLGCTB_rNVNUNbuEY4kW6lf9El8B2yiWEo
@hiteshkaushik7739 18 days ago
Hey, great series, thanks. How can I make the producer produce faster?
@anirvandecodes 18 days ago
There are a few configuration changes you can make.
Batching: set linger.ms (5-50 ms) and increase batch.size (32-128 KB).
Serialization: opt for efficient formats like Avro or Protobuf.
Partitions and proximity: add more partitions and deploy the producer close to the Kafka brokers.
In production, people generally use more scalable solutions than a plain Python producer app; check this out: docs.confluent.io/platform/current/connect/index.html
Do share the playlist with your LinkedIn community.
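To make the batching and compression settings concrete, here is a minimal sketch of a throughput-tuned producer using confluent-kafka. The broker address, topic name, and the exact values are illustrative assumptions, not settings taken from the series.

```python
from confluent_kafka import Producer

# Throughput-oriented settings: let messages accumulate into larger,
# compressed batches instead of sending each record immediately.
producer = Producer({
    "bootstrap.servers": "localhost:9092",  # placeholder broker
    "linger.ms": 20,                        # wait up to 20 ms to fill a batch
    "batch.size": 65536,                    # allow ~64 KB batches
    "compression.type": "lz4",              # fewer bytes per request
    "acks": "1",                            # relax durability slightly for speed
})

for i in range(10_000):
    producer.produce("iot-events", key=str(i), value=f'{{"reading": {i}}}')
    producer.poll(0)  # serve delivery callbacks without blocking the loop

producer.flush()      # block until every buffered message is delivered
```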
@ChinnaDornadula 21 days ago
14 completed ❤
@anirvandecodes 21 days ago
You are making great progress. Please share with your friends and colleagues!
@ChinnaDornadula 21 days ago
13 completed ❤
@ChinnaDornadula 21 days ago
12 completed ❤
@ChinnaDornadula 21 days ago
11 completed ❤
@ChinnaDornadula 21 days ago
10 completed ❤
@ChinnaDornadula 21 days ago
9 completed ❤
@ChinnaDornadula 21 days ago
8 completed ❤
@ChinnaDornadula 21 days ago
6 completed ❤
@ChinnaDornadula 21 days ago
5 completed ❤
@ChinnaDornadula 21 days ago
4 completed ❤
@ChinnaDornadula 21 days ago
3 completed ❤
@ChinnaDornadula 21 days ago
3 completed ❤
@ChinnaDornadula 22 days ago
2 completed ❤
@ChinnaDornadula 22 days ago
1 completed ❤
@mihirit7137 22 days ago
I have copied the yml file into the staging and marts folders, and I am getting a conflict telling me to rename the yml sources. How do we effectively define sources in the models?
@anirvandecodes 22 days ago
Can you share the complete error text and project structure?
@mihirit7137 21 days ago
@anirvandecodes In your video you pasted the yml file containing the sources into all three folders. Since the source is the same for all three files, I just pasted the model sql files inside the folders and kept the yml file outside the folders, and this resolved the error. I believe with the new dbt version you cannot have two yml files with the same source referencing the same table at the same folder level. Currently my folder structure looks like:
models
- staging
- - staging_employee_details.sql
- intermediate
- - intermediate_employee_details.sql
- marts
- - marts_employee_details.sql
- employee_source.yml
In the video you paste the yml file into each of the three folders (staging, intermediate, marts), which gives the naming conflict error. Your videos have been very informative; I went through the whole playlist while struggling to install dbt on my system and understand it. Thank you so much! 😄😄
@anirvandecodes 21 days ago
@mihirit7137 I think you might have the same source name mentioned in two places; take a look into that.
@mihirit7137 21 days ago
@anirvandecodes dbt found two sources with the name "employee_source_EMPLOYEE". Since these resources have the same name, dbt will be unable to find the correct resource when looking for source("employee_source", "EMPLOYEE"). To fix this, change the name of one of these resources:
- source.dbt_complete_project.employee_source.EMPLOYEE (models\marts\marts_employee_source.yml)
- source.dbt_complete_project.employee_source.EMPLOYEE (models\staging\stg_employee_source.yml)
@mihirit7137 21 days ago
Should the source name always be unique?
@Shivanshpandey-c4e 1 month ago
Bro, what if I don't want to share my data with Confluent? Can we do the Confluent Kafka setup on premises?
@anirvandecodes 1 month ago
Absolutely. They call it self-managed Kafka. Check this out: www.confluent.io/get-started/?product=self-managed
@iWontFakeIt 1 month ago
Best dbt playlist, man! I searched a lot throughout YouTube; no one comes close to this clarity of explanation!
@anirvandecodes 1 month ago
Made my day, thank you. Do share with your network.
@iWontFakeIt 1 month ago
@anirvandecodes You deserve it, man!
@Sunnyb-u8g 1 month ago
How do I see the column lineage?
@anirvandecodes 1 month ago
dbt Core does not have out-of-the-box column-level lineage. You can explore column lineage in dbt Cloud, or check out this one: tobikodata.com/column_level_lineage_for_dbt.html
@SnowEra-k9v 2 months ago
Hi @anirvan, thanks for your detailed explanation of dbt concepts; it has helped me a lot.
@anirvandecodes 2 months ago
Glad to hear that. Please share the content with your network.
@VikashKumar0409 2 months ago
Completed the tutorials, I loved them. Please create more tutorial playlists on more topics.
@anirvandecodes 2 months ago
Thank you for the support. Yes, I will be publishing content on Spark Structured Streaming with Kafka.
@VikashKumar0409 2 months ago
Loved your video; it cleared my doubts about sources and models and how we create sources.
@anirvandecodes 2 months ago
Glad it was helpful! Do share with your network.
@guddu11000 2 months ago
Hi, I ran dbt debug from the command prompt and it worked well, but when I run it from PyCharm I get an error: The term 'dbt' is not recognized as the name of a cmdlet, function, script file.
@anirvandecodes 2 months ago
Looks like this is a PyCharm PATH-related issue. Check whether the PATH is picked up properly in PyCharm, or select a different terminal such as Git Bash; you can find more info on Google.
@VikashKumar0409 2 months ago
This error generally comes when the path is not added to the system PATH. Try Stack Overflow or ChatGPT, and you can also try it with Git Bash.
@nguyenkhiem2318 3 months ago
Hey my man, just wanna say thanks for this whole series you did. Extremely helpful to people who are specifically looking for guidance on this new tool. Really appreciate your hard work, man.
@anirvandecodes 3 months ago
Thank you so much, it really made my day :)
@ahmedmohamed-yo2hb 3 months ago
Hello, I have a question: dbt doesn't recognize my model as incremental. I am using incremental modeling to take a snapshot of a table's row count and insert it, to build a time-series table containing the row count for every day.
@anirvandecodes 3 months ago
I will upload a video on snapshots soon, check that out!
@sandeshbidave565 3 months ago
How do I achieve incremental inserts in dbt without allowing duplicates based on specific columns?
@anirvandecodes 3 months ago
You can apply DISTINCT in SQL to remove the duplicates, or use any other deduplication strategy.
@mohammedvahid5099 5 months ago
Please teach dbt integration with Snowflake and how dbt works for the entire SCD Type 2 process. Thank you.
@anirvandecodes 5 months ago
Sure, I will create a video on it.
@reddyreddy-np4zx 5 months ago
Man, amazing work. Can't wait... Subscribed! Do keep the videos coming, please.
@anirvandecodes 5 months ago
Thanks! Will do!
@reddyreddy-np4zx 5 months ago
I was looking for this and you are like a saviour. Thanks
@anirvandecodes 5 months ago
Glad I could help
@hemalathabuddula7923 5 months ago
Hiii
@anirvandecodes 5 months ago
Hello
@divityvali8454 6 months ago
Are you teaching dbt?
@anirvandecodes 6 months ago
Yes, I have a complete dbt playlist here: kzbin.info/aero/PLGCTB_rNVNUON4dyWb626R4-zrLtYfVLa
@jeseenajamal6495 6 months ago
Can you please share the dbt models as well?
@anirvandecodes 6 months ago
Sorry, I lost the model file.
@srinathravichandran8796 7 months ago
Awesome tutorials, keep the good work going... When can we expect tutorials on other tools like Airflow, Airbyte, etc.?
@anirvandecodes 7 months ago
Thank you so much. I have two more dbt videos to complete the playlist; I will plan those after that.
@balakrishna61 7 months ago
Nice explanation
@anirvandecodes 7 months ago
Keep watching
@saketsrivastava84 7 months ago
Very nicely explained
@anirvandecodes 7 months ago
Thank you so much 🙂
@vshannkarind 8 months ago
How do we deploy code from DEV to QA to PRD? Please make a video on this... thank you.
@anirvandecodes 8 months ago
Yes, I am in the process of making a video on how to deploy a dbt project on the cloud. Stay tuned!
@SaiSharanKondugari 8 months ago
Hey Anirvan, thanks for explaining clearly. I am currently learning dbt and came across this question: can we keep multiple WHERE conditions in an incremental load?
@anirvandecodes 8 months ago
Yes, definitely. Think of it as a SQL query with which you are filtering out the data.
@SaiSharanKondugari 8 months ago
@anirvandecodes Hello Anirvan, any code snippet or format suggestion from your end?
@vidmichL 8 months ago
Cool, thanks
@anirvandecodes 8 months ago
No problem!
@SanjayChakravarty-v5f 9 months ago
Hello, I have a question: how do I do insert, update, and delete based on a column other than a date? I am loading from Excel into Postgres and generating a hashed column; every time there is a new or updated record, a new hash key is generated for that column. I am trying to do an incremental update. Here is the select statement: SELECT id, "name", alternate_name, description, email, website, tax_status, tax_id, year_incorporated, legal_status, logo, uri, parent_organization_id, hashed_value FROM organization; My CDC is based on hashed_value. Let's say the name is changed in Excel; when I load the data into Postgres I get a new hash key in the hashed_value column, and similarly for a new record. How do I do my incremental load? Any suggestion?
@anirvandecodes 9 months ago
That concept is called change data capture: you first need to find out which records have changed. There are different techniques; for example, in SQL you can do SELECT * FROM table1 EXCEPT SELECT * FROM table2 to find which rows have changed, and then insert only those records.
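A minimal sketch of that EXCEPT idea against Postgres, purely illustrative: it assumes a staging table named organization_staging with the same columns as organization (both names are placeholders), and it only covers inserting new or changed rows; updating or deleting the old versions would need an extra step.

```python
import psycopg2

# EXCEPT compares whole rows, so a row whose hashed_value changed (or a brand
# new row) survives the set difference and gets inserted into the target.
INCREMENTAL_INSERT = """
INSERT INTO organization
SELECT * FROM organization_staging
EXCEPT
SELECT * FROM organization;
"""

with psycopg2.connect("dbname=mydb user=app host=localhost") as conn:  # placeholder DSN
    with conn.cursor() as cur:
        cur.execute(INCREMENTAL_INSERT)
    # the connection context manager commits the transaction on a clean exit
```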
@lostfrequency89 9 months ago
It's not 1.29 seconds. We also have to include the other steps, right? Like when we add the salt key to both dataframes. That should be counted as the full time.
@anirvandecodes 9 months ago
Hi, in this technique we are not adding salt to both dataframes, which is not needed. We add a random salt to one dataframe and explode the other dataframe over all salt values, and then do the join at the end.
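For anyone reading along, a minimal PySpark sketch of that pattern (not the exact code from the video; the paths, the join_key column, and the number of salts are made-up assumptions): the skewed side gets a random salt, the other side is exploded over every salt value, and the join key becomes (join_key, salt).

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("salted-join-sketch").getOrCreate()

NUM_SALTS = 8  # tune to the degree of skew

# Skewed fact side: add a random salt in [0, NUM_SALTS)
fact = spark.read.parquet("/data/fact")          # placeholder path
fact_salted = fact.withColumn("salt", (F.rand() * NUM_SALTS).cast("int"))

# Smaller side: replicate each row once per salt value
dim = spark.read.parquet("/data/dim")            # placeholder path
dim_exploded = dim.withColumn(
    "salt", F.explode(F.array([F.lit(i) for i in range(NUM_SALTS)]))
)

# Join on the original key plus the salt, then drop the helper column
joined = (
    fact_salted.join(dim_exploded, on=["join_key", "salt"], how="inner")
               .drop("salt")
)
joined.write.mode("overwrite").parquet("/data/joined")
```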
@ionutgabrielepure996 9 months ago
Thank you for the video, very helpful!
@anirvandecodes 9 months ago
Thank you for watching. I am glad it helped you understand the concept.
@syedronaldo4758 9 months ago
So we can use:
'table' -- for truncate and load
'incremental' -- for append or insert
'incremental' with 'unique_key' -- for upsert or merge
Is the above statement right?
@anirvandecodes 9 months ago
Yes, that is correct. Keep watching!