I got lost at the very beginning: from "print(os.listdir())", what I got as my output is very different from what you got.
@stephanpetzchen4608 22 hours ago
confusing
@ankur555 1 day ago
Thanks for the video Soumil. Nice video. I have a similar requirement of reading from Kafka, but in my case I am processing with Glue Streaming and a few transformations. I could not find any transformation in your code. Can I assume that Hudi Streamer is a kind of low-code feature? What should be done if I have some transformations?
@SoumilShah 21 hours ago
You can use the SQL transformer in Hudi to do that.
@muhammadfahad847 1 day ago
Very bad explain
@domfilho444 2 days ago
Great explanation, thanks for sharing. Quick question: can a single Kinesis Data Stream ingest from multiple Atlas Triggers (each trigger configured for one collection), or do I need to create one Kinesis Data Stream per collection? Thanks in advance!
@harishhere574 2 days ago
Hi bro, I've been watching a few of your videos, very enlightening. I'm just confused about integrating MongoDB with Python. Could I contact you somehow to understand it better?
@adilsaju 2 days ago
awesomeness
@althafalthu5554 3 days ago
How to upgrade ELK without data loss
@emonymph6911 4 days ago
reading other people's questions is such a smart way of learning, thanks for sharing your journey
@emonymph6911 4 days ago
wow LSM trees on parquet base... Hudi is going to run circles around the other meta tables.
@SoumilShah 4 days ago
I think the approach is very smart. I have already tested it: there are savings in terms of storage, and writes are also much faster.
@user-zm1ng8zh6r 5 days ago
Great video Soumil
@SoumilShah 5 days ago
Glad you liked it
@polarbear986 6 days ago
You're very talented in explaining. Easy to understand.
@MojilShah 6 days ago
What is the use of ACL here?
@ElSrJuancho 6 days ago
For MoR, in commit time 2, where is F file registered if not in parquet or log?
@electricalsir 7 days ago
thanks
@lakshmiivaturi-o4t 7 days ago
Simply Awesome. Thanks man!
@pateladarsh1463 8 days ago
Nice video, learnt a lot
@praqash 8 days ago
You are a real gem on KZbin. Your content is rare. I learned so much from your videos, content that can otherwise only be found in official documents.
@SoumilShah 8 days ago
Thank you very very much sir
@ankur555 8 days ago
Can we run this code using PySpark?
@SoumilShah 8 days ago
Yes
@ankur555 9 days ago
Nice video, thanks. I tried this on AWS Glue. It is working, but it is not able to generate a Glue catalog based table, and it generated many small files (it is not compacting automatically). Have you tried running this on AWS Glue?
@saurabhshinde7135 9 days ago
Congratulations Soumil, and thank you for the excellent content.
@SoumilShah 9 days ago
Thank you sir
@MrLeanhduclk14 9 days ago
Hi Soumil, perfect solution. In my case, I have some streaming tables like the ones in your demo. After they land on S3, how can I join them for further real-time analytics? Can Flink do it by selecting data from the sink tables and joining them with each other?
@DamosyTheFreckle 9 days ago
nope doesn't work, don't waste your time
@shyamgurunath5876 10 days ago
You will reach more heights soumil… will be there to watch ❤
@SoumilShah 9 days ago
Thank you sir
@surajbhardwaj2599 10 days ago
Sir you are amazing. Thanks for the content...
@SoumilShah 10 days ago
So nice of you
@employedgorilla 10 days ago
You deserve it bro
@SoumilShah 10 days ago
Thanks❤
@emonymph6911 10 days ago
@SoumilShah you're welcome. Please make a video on where in the stack we should build data objects, e.g. the metadata layer or somewhere else. The idea is: if we have to replace tech X with Y when X is outdated and Y is new with improved processing speed, how can we keep our tables intact and unchanged (assuming the storage layer remains unchanged)? A full object rewrite is not fun.
@SoumilShah 10 days ago
@emonymph6911 sure thing!!
@SachinShukla230187 10 days ago
Amazing. I have good experience in Python, but no video gave me the right insight or interest to understand these patterns. Thank you Soumil; because of you I have learnt these things, otherwise I was running here and there.....
@SoumilShah 10 days ago
Thanks a lot. Really, thank you, I mean it.
@MrHatemfaheem 11 days ago
GitHub link not working
@electricalsir 11 days ago
essentially enjoyed
@KartikGautam 11 days ago
Hi Soumil, I am unable to access the PDF. Can you help me with that? Thanks
@electricalsir 13 days ago
good
@melojuan 13 days ago
what a legend!
@harivigsp7934 14 days ago
Can you please post a video on Iceberg DR?
@rigseoservice 14 days ago
Very annoying to watch. The frequent switching between windows is very stressful.
@martingregson7136 14 days ago
Do you bowl as fast as you talk?
@sarathju3867 15 days ago
Thanks for posting this ❤ it
@SoumilShah 15 days ago
Thank you sir
@chandini766 15 days ago
Hi Soumil, thank you for your detailed videos. Could you point to any resource that can help set up IntelliJ for PySpark?
@4BroGame 16 days ago
Hey bro, I cloned a website and opened its code in the VS Code editor, but after editing, only the text changes, not the images. I put my image URL in place of the website's image URL, but after saving and opening it with Live Server, the preview still shows the cloned website's images, and inspect element shows the cloned website's image code, not mine. Why? I have been trying for 6 hours and nothing works. Could you please tell me how I can change the images?
@Vamsikuruva-d8b 16 days ago
After installing, when I try to run the elasticsearch.bat file, it shows an error like "\Java\jdk-21.0.1 was unexpected at this time." But my JDK and Java bin folder paths are set correctly in the environment variables.
@prasantkumarsrivastava5925 16 days ago
yes, please slow down yourself in every respect pls
@krishnendudas8573 17 days ago
Thanks for the video. It's a good one. Do you have any samples related to the scenario where we have to read the Avro data from a Kafka topic and upsert into the Hudi tables?
@BabaiChakraborty-ss8pt 17 days ago
amazing work @soumil. Thanks
@debmidya411 17 days ago
Hi Soumil, thanks for the video. I am using OpenJDK 11 and Python 3.8. I can't see the table printed when I run 'Creating Dataframe from List of Tuples'. I tried Jupyter Notebook as well as the VS Code editor. Any idea?
@IleniaQuintero 18 days ago
Hello, I was looking at your video channel. We may be helping a company that uses secure images to increase supply chain security and help cloud native development. Would you be willing to help try their software, make a video, and help show devs how to use their tools? This is not an offer, but just to start a conversation about your willingness to take on sponsorship. Please provide me with your email if you are interested. You'd have a chance to look at their technology and decide if it's the type of software that you'd be interested in covering in your channel.