In this video we discuss storing our final DataFrames into Hive tables and a local MySQL database.
Hive is designed to handle large-scale datasets, including terabytes or even petabytes of data. If your DataFrame is expected to grow significantly in size, or if you already have a large dataset, Hive's distributed storage and query execution make it a good fit.
If your data ecosystem already involves other Hadoop components, such as HDFS, MapReduce, or Spark, Hive integrates well with these technologies. This allows you to leverage the broader Hadoop ecosystem for data processing, analysis, and integration with other tools and frameworks.
If you work with tools or applications that rely on SQL databases, storing your DataFrame in a local SQL database allows for seamless integration and sharing of data with these systems. Examples include reporting tools, business intelligence platforms, or web applications that interface with the database.
Link for notes:
drive.google.c...
part1:
• Implementing Pyspark R...
part2:
• Implementing Pyspark R...
part3:
• Implementing Pyspark R...
part4:
• Implementing Pyspark R...
#azuredatabricks
#dataengineering
#dataanalysis
#pyspark
#pythonprogramming
#python
#sql