I am trying to grab an opportunity on Databricks; glad I found your channel. Your explanations are far better than those trainings
@rajasdataengineering7585 · 3 months ago
Welcome aboard! Thank you
@gulsahtanay2341 · 9 months ago
Explanations couldn't be better! I'm very happy that I found your work. Thank you Raja!
@rajasdataengineering7585 · 9 months ago
Hope it helps you learn the concepts! Thanks
@raviyadav2552 · 12 days ago
I found the explanation very detailed, great work, keep it up sir
@rajasdataengineering7585 · 11 days ago
Thank you, Ravi! Keep watching
@patiltushar_ · 8 months ago
Sir, your way of teaching is fabulous. I learnt Spark earlier, but your teaching is better than that.
@rajasdataengineering7585 · 8 months ago
Thanks and welcome! Glad to hear that
@omprakashreddy4230 · 3 years ago
What an explanation, sir ji!!! Please continue making videos on ADB. Thanks a lot!!
@rajasdataengineering7585 · 3 years ago
Thanks Omprakash. Sure, will post more videos
@shivayogihiremath4785 · a year ago
Superb! Concise content, properly explained! Thank you very much for sharing your knowledge! Please keep up the good work!
@patnaik476 · a year ago
Your videos are lifesavers .. !!
@rajasdataengineering7585 · a year ago
Thank you
@Jaipreksha · a year ago
Excellent explanation. ❤❤
@rajasdataengineering7585 · a year ago
Glad it was helpful!
@VinodKumar-lg3bu · a year ago
Neat explanation, to the point. Thanks for sharing
@rajasdataengineering7585 · a year ago
Glad it was helpful! You are welcome
@Ustaad_Phani · 3 months ago
Nice explanation sir
@rajasdataengineering7585 · 3 months ago
Thank you! Keep watching
@sravankumar1767 · 3 years ago
Nice explanation bro.. simply superb
@battulasuresh9306 · 2 years ago
Raja Sir, I hope these videos are all in a series
@rajasdataengineering7585 · 2 years ago
Yes, all videos are in a series
@kketanbhaalerao · a year ago
Very good explanation!! Really great. Can anyone please share those CSV files / a link? Thanks in advance.
@ravisamal3533 · a year ago
nice explanation!!!!!!!!!
@rajasdataengineering7585 · a year ago
Glad you liked it!
@abhinavsingh1173 · a year ago
Your course is the best. But the problem with your course is that you are not attaching a GitHub link for your sample data and code. I request you, as your audience, please do this. Thanks
@amiyarout217 · a month ago
Yes, please give us a GitHub link
@wolfguptaceo · 19 days ago
How entitled can you be? Did you put in money in this selfless teacher's pocket?
@sumitchandwani9970 · a year ago
Awesome
@rajasdataengineering7585 · a year ago
Thanks!
@shahabshirazi6441 · 2 years ago
Thank you very much! very helpful!
@rajasdataengineering7585 · 2 years ago
Thanks for your comment
@AtilNL · 6 months ago
To-the-point explanation. Thank you sir! Have you tried to import using SQL from a SharePoint location?
@rajasdataengineering7585 · 6 months ago
No, I haven't tried from SharePoint
@NikhilGosavi-go7be · 3 months ago
done
@MrTejasreddy · 2 years ago
Hi Raja, really enjoyed your content; the information is very clear and the explanation clean... one of my friends referred your channel... really nice... but I noticed that some videos are missing from the PySpark playlist... if possible, please check on it. Thanks in advance.
@rajasdataengineering7585 · 2 years ago
Hi Tejas, thank you. A few videos are related to Azure Synapse Analytics, so they might not be part of the PySpark playlist
@nurullahsirca8819 · 6 months ago
Thank you for your great explanation. I love it. How can I reach the data and code snippets? Where do you share them?
@deepanshuaggarwal7042 · 5 months ago
Can you please explain in the video why this many jobs and stages are created? Understanding the internal working of Spark is very necessary for optimisation purposes
@upendrakuraku605 · 2 years ago
Hi bro, it was a nice explanation..👍 Can you please help on the below points to cover in a demo: how to read CSV, TSV, Parquet, JSON and Avro file formats; how to write back; how to add unit tests to check transformation step outputs; how to read the DAG; how to work with Delta tables; how to create clusters
@rajasdataengineering7585 · 2 years ago
Sure Upendra, I shall cover all these topics
@upendrakuraku605 · 2 years ago
@@rajasdataengineering7585 The day after tomorrow I have to give a demo on this; can you please cover this as soon as possible 🙏
@battulasuresh9306 · 2 years ago
Please acknowledge, this will help a lot of people: are all the videos in a series or not?
@rajasdataengineering7585 · 2 years ago
Yes, all videos are in a series
@KumarRaghavendra-u9l · a year ago
Can you send those CSV files? I will try them on my system.
@lalitsalunkhe9422 · 4 months ago
Where can I find the datasets used in this demo? Is there any GitHub repo you can share?
@suman3316 · 3 years ago
Please upload the GitHub link for these files also..
@pcchadra · a year ago
When I run schema_alternate in an ADB notebook it throws the error [PARSE_SYNTAX_ERROR] Syntax error at or near 'string'. (line 1, pos 24). Am I missing something?
@himanshubhat3252 · a year ago
Hi Raja, I have a query: while writing data to CSV format, the CSV file contains a blank/empty last line. (Note: the data is OK, but the blank/empty last line seems to be the default behaviour of Spark.) Is there any way to remove that last blank line while writing the CSV file?
@rajasdataengineering7585 · a year ago
Usually it doesn't create an empty line; there should be a specific reason in your use case, and we would need to analyse more to understand the problem. Using Python code, we can remove the last line of a file.
@himanshubhat3252 · a year ago
@@rajasdataengineering7585 I tried writing a CSV file using PySpark on Databricks; when I downloaded the file to my system and opened it in Notepad++, it showed the last line as blank/empty
@patiltushar_ · 8 months ago
Sir, could you share all those datasets with us for practice purposes? It would be helpful for us.
@ANKITRAJ-ut1zo · a year ago
Could you provide the sample data?
@PraveenKumar-ev1uv · 25 days ago
How to get the opportunity to work on Databricks with PySpark? What are the real-time scenarios to get started with?
@Aspvillagetata · 2 years ago
Hi bro, I am facing some issues reading all the CSV files, and how do I finally write all those files in Delta format? And finally, how can users view the Delta tables in table format?
@rajasdataengineering7585 · 2 years ago
Hi Pinjari, you can keep all the CSV files under a folder and create a dataframe with the Spark reader, then write that dataframe into some other folder in Delta format. Delta format is actually Parquet files internally. After creating the Delta table as above, you can use SQL to do any analytics
@ramsrihari1710 · 2 years ago
Hi Raja, nice video.. quick questions.. What if I want to override the existing schema? Also, if we add a schema in the notebook, will it not be created over and over whenever the notebook is executed? Is there a way to have it executed one time?
@DillipBarad-f1m · 7 months ago
Sir, can we get the practice notebook? Will you share it with us?
@vamsi.reddy1100 · 2 years ago
You did a good thing by removing that intro sound, bro
@subhashkamale386 · 2 years ago
Hi Raja...I have a doubt..I want to read and display a particular column in a dataframe...could you please tell me which command I should use... 1. To read a single column in a dataframe 2. To read multiple columns in a dataframe
@rajasdataengineering7585 · 2 years ago
Hi Subhash, you can use the select command on a dataframe to read specific columns
@subhashkamale386 · 2 years ago
@@rajasdataengineering7585 Could you please send me the command...I am trying different syntaxes but getting an error...I am using the command below: df.select(column name)
@rajasdataengineering7585 · 2 years ago
You can use df.select(df.column_name). There are different approaches to refer to a column in a dataframe; in this method we prefix the dataframe name in front of each column. Try this method and let me know if you still get an error
@subhashkamale386 · 2 years ago
@@rajasdataengineering7585 OK Raja..I am trying this in Databricks...will let you know if it works fine..thanks for your response
@rajasdataengineering7585 · 2 years ago
Welcome
@hkpeaks · a year ago
What is the time required to load billions of rows?
@rajasdataengineering7585 · a year ago
It depends on many parameters; one of the important ones is your cluster configuration
@hkpeaks · a year ago
@@rajasdataengineering7585 My desktop PC can process a 250GB, seven-billion-row CSV kzbin.info/www/bejne/Z3-5YaqhfM-qpbM (for this use case, 1 billion rows/minute)
@keshavofficial4542 · a year ago
Hi bro, how can I find those CSV files?
@nitinpandey4857 · 27 days ago
How does spark.read differ from spark.load?
@البداية-ذ1ذ · 3 years ago
Can you mention full projects done with PySpark?
@areeshkumar-n5e · 9 months ago
Can you provide the sample data as well?
@sachinchandanshiv7578 · 2 years ago
Hi Sir, can you please help in understanding the .egg and .zip files we pass with --py-files in spark-submit jobs. Thanks in advance 🙏
@sk34890 · 11 months ago
Hi Raja, where can we access the files for practice?
@ANJALISINGH-nr6nk · 10 months ago
Can you please share these files with us?
@shaulahmed4986 · 10 months ago
Same request from me as well
@SurajKumar-hb7oc · a year ago
What is the solution if I am reading two files with different column names and different numbers of columns in a single command? I am getting inappropriate output. Please...
@4abdoulaye · 2 years ago
What happens if you read multiple files that do not have the same schema?
@rajasdataengineering7585 · 2 years ago
The rows which don't match the schema will go to the corrupt record column
@4abdoulaye · 2 years ago
@@rajasdataengineering7585 Thanks sir. 😎❤
@rajasdataengineering7585 · 2 years ago
Welcome
@vinoda3480 · 4 months ago
Can I get the files which you worked on for the demo?
@tripathidipak · 11 months ago
Could you please share the sample input files.
@SPatel-wn7vk · 8 months ago
Please provide ideas for making a project using Apache Spark
@NetNet-sn3nd · 2 months ago
Can you share this CSV file on Drive for practice?
@bashask2121 · 3 years ago
Can you please provide sample data in the video description
@rajasdataengineering7585 · 3 years ago
Sure Basha, will provide the sample data
@varun8952 · 2 years ago
@@rajasdataengineering7585 Thanks for sharing the video. Is there any Git link with the datasets and files you used in the tutorial? If so, could you please share it?
@dataengineerazure2983 · 2 years ago
@@rajasdataengineering7585 Please provide sample data. Thank you
@dinsan4044 · a year ago
Hi, could you please create a video on combining the below 3 CSV data files into one dataframe dynamically?

File name: Class_01.csv
StudentID, Student Name, Gender, Subject B, Subject C, Subject D
1, Balbinder, Male, 91, 56, 65
2, Sushma, Female, 90, 60, 70
3, Simon, Male, 75, 67, 89
4, Banita, Female, 52, 65, 73
5, Anita, Female, 78, 92, 57

File name: Class_02.csv
StudentID, Student Name, Gender, Subject A, Subject B, Subject C, Subject E
1, Richard, Male, 50, 55, 64, 66
2, Sam, Male, 44, 67, 84, 72
3, Rohan, Male, 67, 54, 75, 96
4, Reshma, Female, 64, 83, 46, 78
5, Kamal, Male, 78, 89, 91, 90

File name: Class_03.csv
StudentID, Student Name, Gender, Subject A, Subject D, Subject E
1, Mohan, Male, 70, 39, 45
2, Sohan, Male, 56, 73, 80
3, shyam, Male, 60, 50, 55
4, Radha, Female, 75, 80, 72
5, Kirthi, Female, 60, 50, 55
@SurajKumar-hb7oc · a year ago
I am writing code for the same data but getting inappropriate output. What is the solution?
@surajpoojari5182 · 10 months ago
I am not able to create a folder in DBFS from PySpark in Community Edition; please tell me how to do it. I am also not able to delete existing files.
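For the DBFS question above: folder creation and deletion are done with the `dbutils.fs` utilities. These commands only work inside a Databricks notebook, where `dbutils` is predefined, and the paths below are just examples:

```python
# Create a folder in DBFS (parent folders are created as needed)
dbutils.fs.mkdirs("/FileStore/practice_data")

# Inspect the folder
dbutils.fs.ls("/FileStore/practice_data")

# Delete a single file, then the whole folder;
# recurse=True is required to remove a folder and its contents
dbutils.fs.rm("/FileStore/practice_data/old_file.csv")
dbutils.fs.rm("/FileStore/practice_data", recurse=True)
```

The same operations are available as `%fs` magic commands (e.g. `%fs mkdirs /FileStore/practice_data`).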