Snowflake Vs Databricks - 🏃‍♂️ A Race To Build THE Cloud Data Platform 🏃‍♂️

  164,946 views

Seattle Data Guy

1 day ago

Comments
@SeattleDataGuy 1 year ago
If you need data infra or data strategy help, then feel free to set up a free consultation! - calendly.com/seattledataguy/30-minute-meeting-requirements-yt
@markwan7141 1 year ago
Love this. I'm a Solutions Architect and the information you give out is priceless and accurate. Well done!
@mg00 2 years ago
Micropartitions is just another name for... partitions aka shards, something that all databases use to scale out (because that's the only way to do it). There have been dozens of databases doing this for data warehouse/OLAP uses for decades. Snowflake (like BigQuery) was more powerful because of cloud-native scaling rather than provisioning real hardware, not just separating storage and compute. The other thing was usability from being in the cloud and running on object stores with features like zero-copy clones, sharing datasets across companies, etc.
@permiek 2 years ago
Spark (and therefore Databricks) is really a game changer
@kevintaylor463 1 year ago
Snowflake is for BI and more traditional analytics. Excels in data warehousing, storage, analytics. Data Warehouse engineers and Data Analysts. Databricks is for big data processing (machine learning, AI workloads). Data Engineers & Data Scientists.
@19FizZ91 7 months ago
Ty!
@thomsondcruz5456 2 years ago
Snowflake bills credits per second for each T-shirt size, with no separate VM cost to run the service. Databricks has its DBU cost per second for its service, plus there is an underlying cloud VM cost to run Databricks. This could be a plus or a minus, because Databricks allows you to select the type of VM you want: compute optimised vs. memory optimised.
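To make the two billing models in the comment above concrete, here is a rough Python sketch. All rates and parameter names are hypothetical placeholders, not published prices; the only Snowflake-specific behavior assumed is per-second billing with a 60-second minimum each time a warehouse starts.

```python
def snowflake_cost(seconds, credits_per_hour, price_per_credit):
    """Single charge: credits accrue per second, with a 60-second minimum per start."""
    billed_seconds = max(seconds, 60)
    return billed_seconds / 3600 * credits_per_hour * price_per_credit


def databricks_cost(seconds, nodes, dbu_per_node_hour, price_per_dbu, vm_price_per_hour):
    """Two charges: DBUs paid to Databricks plus VM time paid to the cloud provider."""
    hours = seconds / 3600
    dbu_charge = hours * nodes * dbu_per_node_hour * price_per_dbu
    vm_charge = hours * nodes * vm_price_per_hour
    return dbu_charge + vm_charge


if __name__ == "__main__":
    # Hypothetical 30-minute job; every number below is made up for illustration.
    print("Snowflake:", snowflake_cost(1800, credits_per_hour=4, price_per_credit=3.0))
    print("Databricks:", databricks_cost(1800, nodes=4, dbu_per_node_hour=2.0,
                                         price_per_dbu=0.5, vm_price_per_hour=1.0))
```

The `vm_price_per_hour` argument is where the VM choice mentioned above (compute optimised vs. memory optimised) would show up in the Databricks estimate.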
@brothermalcolm 2 years ago
I must be watching too many of your videos lately - searching for databricks landed me almost straight here
@SeattleDataGuy 2 years ago
That's pretty impressive considering I just uploaded this 👀
@Rex_793 2 years ago
@@SeattleDataGuy you were top on my recommendations too; the youtube algo knows. 🏃
@SeattleDataGuy 2 years ago
Thanks for your support!
@andrew3068 1 year ago
our boy has been hitting the gym. loved the philosophical background on them at the beginning.
@SeattleDataGuy 1 year ago
Just a little
@riffsoffov9291 2 years ago
Thanks for the video. Also thanks for the relevant title, rather than clickbait or one tailored for youtube's algos. I found you because Codestrap mentioned you (youtube's algos are useless for finding people who know what they're talking about).
@alexanderpotts8425 2 years ago
coming from an SSIS background I figured I'd end up landing in snowflake when I got to working with cloud tech, but I've really, really taken to spark and databricks. even if writing pipeline code in notebooks scares me sometimes
@Aussified 2 years ago
SSIS is more aligned with what Azure Data Factory does and ADF is tightly integrated with Azure Databricks.
@pavankumar-ni3my 1 year ago
I primarily do ETL activities for a product based org and would like to know when to use Databricks and Snowflake or rather how DB/Snowflake can help ETL engineers who migrate data from one product to another. I couldn't understand this video properly because I was expecting a one line answer on when one should use Databricks/Snowflake
@agilejro 2 years ago
I feel Streamlit has helped increase the depth and specialty application possibilities for Snowflake while keeping the streamlined usability. Do you feel this is true?
@digithat6496 2 years ago
Been loving Snowpark so far. It's like a wrapper of Spark, but by Snowflake.
@SeattleDataGuy 2 years ago
Any cool use cases that you have implemented? I haven't got to really play as deep in snowpark as I would like.
@digithat6496 2 years ago
@@SeattleDataGuy Yeah, I love the new UDTF functions in Snowpark, where you can call a Python-written UDTF in SnowSQL with your .py files or other packages in it. It's more like macros in dbt. I was able to join multiple tables on different Snowflake servers using Snowpark and push the result to a prod warehouse. I'm thinking of implementing complex UDFs like I did with pandas; hopefully that works.
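For readers unfamiliar with UDTFs, the idea in the comment above can be sketched as a plain Python handler class: a table function receives one input row and may emit several output rows. The class name `SplitWords` and its logic are hypothetical; the only assumption about Snowpark is that a Python UDTF handler is a class with a `process` method that yields tuples.

```python
class SplitWords:
    """Hypothetical UDTF-style handler: emits one output row per word."""

    def process(self, text):
        # Each yielded tuple becomes one row of the function's output table.
        for word in text.split():
            yield (word,)


if __name__ == "__main__":
    # Called directly here, without Snowflake. With a live Snowpark session
    # (assumption, not shown), a class like this would be registered as a
    # UDTF and then invoked from SQL.
    for row in SplitWords().process("snowflake vs databricks"):
        print(row)
```

Keeping the handler as an ordinary class means the row logic can be unit-tested locally before it is registered anywhere.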
@michakrzeminski2857 2 years ago
It's not Spark underneath.
@robwozniak9912 1 year ago
Well done! You very succinctly summarized the essence of both platforms.
@eklok5000 2 years ago
Awesome video and comparison! I was missing some details on their features in the sense of functionality, e.g. Delta Lake might have time travel, while Snowflake has some pretty advanced SQL functionality that one might use for GDPR hashing. I would love to see some content on these "implementation" details. Also, BigQuery vs Snowflake would be very interesting, I guess.
@cyclonus01 2 years ago
Thanks for the overview. Our organization went through a very contentious evaluation internally between these two platforms. Ultimately ended up going with SnowFlake due to internal politics. Would also love to see a video comparing all four: bigquery redshift snowflake databricks
@gj4king1 2 years ago
Which would you have gone with?
@jamaswin88 2 years ago
What would you have gone with without the politics?
@cyclonus01 2 years ago
@@gj4king1 probably DataBricks.
@cyclonus01 2 years ago
@@jamaswin88 probably DataBricks.
@SeattleDataGuy 2 years ago
That'd be an intense video!
@HannayGwyn 1 year ago
"Using tools for what they're for", - yes, 100% ❤
@EverettRo 29 days ago
This is great. Very helpful. Thanks.
@SeattleDataGuy 8 days ago
thank you!
@seasql 2 years ago
You need both Databricks and Snowflake for any enterprise application. Databricks ingests the data from various sources, processes it, and stores it in a database such as Snowflake or Postgres. Snowflake is mainly used for OLAP applications. We cannot store the data in Delta tables in Databricks: it takes more time to return data from a Delta table because of the extra processing steps involved in Spark (jobs, stages, tasks, partitions, and cores). Those steps are not involved in Snowflake, so Snowflake returns data sooner.
@kamusial 2 years ago
have you tried Databricks SQL warehouse?
@user-zv9um9pb6w 2 years ago
You missed core differences. Databricks has a strong, flexible offering that includes ML, geospatial, etc., plus serverless, ephemeral clusters, etc. Snowflake does hosted data warehousing well; that's what makes it good. If you don't want to mess with managing complexity, use Snowflake (with Databricks); if not, use Databricks.
@SreejaThumma 5 months ago
Can you also make a video on the difference between DataBricks, Snowflake and Solix technologies
@janardhand6200 2 years ago
Big like from india.. nice presentation
@badoiuecristian 2 years ago
Databricks is clearly the winner here. Snowflake offers simplicity, but at the cost of customisation. If you have an app and want to optimise the cost of usage by selecting specific hardware for processing big data, you can't do it in Snowflake, as you can't touch the back end. I am certified in both, and by now there isn't even a realistic comparison. Snowflake plays catch-up to Spark. Not even mentioning the Python-native API that you can use with Databricks to do anything, vs. the SQL-like syntax you need to use with Snowflake.
@ryanromero2049 2 years ago
What do you think about Palantir?
@krishnayogi 1 year ago
Indeed, Databricks is giving the big cloud guys a run for their money at this moment.
@alfredsfutterkiste7534 1 year ago
You make however a great argument in favor of Snowflake in general. Simplicity and hassle-free data warehousing are hugely attractive to companies struggling to attract and retain data engineering talent. Just pull in tables with Stitch, Fivetran or Airbyte and let a group of SQL monkeys dbt the data into shape. Pay the higher cloud bill because these SQL monkeys hired obv. can’t optimize queries but it doesn’t matter for the most part. Cheaper than hiring experienced data engineers who can turn all the knobs of your dwh to optimize it.
@neech235 1 year ago
@@alfredsfutterkiste7534 you can do exactly that in Databricks too though- it is actually more flexible to talent diversity as it supports all of the most popular languages used by analysts, including SQL. It has the simplicity that Snowflake offers but also far more depth and customisation
@badoiuecristian 1 year ago
@@alfredsfutterkiste7534 Well, I disagree with the statement that hassle-free is better if you get limited fine-tuning opportunity. Yes, if you work with 2 GB of data you don't care, but as soon as you start working with TBs of data that have to feed into an app a user is frequently using, you feel the limitations. In my humble opinion, you can have simple (aka default) settings but still leave the option for fine-tuning.
@nicky_rads 2 years ago
Nice video detailing two very popular data products, thanks ! Personally a big fan of notebook dev, but as you explain both products are solid and can serve customers well.
@arjunrattan8190 1 year ago
Very informative. Curious, which Netflix show's clip you put in the scaling section?
@rajandalawai5640 1 year ago
For topics like this that are heavy on content, I suggest going a little slower.
@tahirashafmithapuri2686 2 years ago
Good analysis but I want to add as we have been heavy snowflake users. Features like replication, Time-Travel and Data Shares do add a lot of value to our ecosystem.
@SeattleDataGuy 2 years ago
Data sharing is honestly great! I have seen several teams save so much development time because of that feature!
@deemahdee 2 years ago
Can you make any comparisons or parallels with Palantir?? Not sure how Palantir competes with these companies
@SeattleDataGuy 2 years ago
Been working to get access to palantir
@thomasadams6860 2 years ago
I have experience with all 3. Palantir is Databricks but with a UI and a lot more of a walled garden. Snowflake is a good warehouse but starts to fail at massive scales. You need Iceberg or Delta with Trino or Spark to do many trillions of rows.
@SeattleDataGuy 2 years ago
@Thomas Adams I would agree with the Palantir to Databricks similarity. It's the fact that they treat everything like files 😆
@deemahdee 2 years ago
@@thomasadams6860 what do you mean walled garden?
@thomasadams6860 2 years ago
@@deemahdee Palantir wants you to stay strictly in their ecosystem for everything and makes it very difficult to use different things easily.
@isaactucker-rasbury710 1 year ago
Thank you for this video. Looking at options for my team and this is really useful!
@KearlHat 1 year ago
Hi SeattleDataGuy - Which product has the best support? From the vendor, partner & community perspective? Thanks!
@Skandawin78 9 months ago
Excellent video, thanks
@SeattleDataGuy 9 months ago
thank you!
@thomasnourse7506 2 years ago
Snowflake also has clustered warehouses... not more work... and also, the bigger the warehouse the faster performance therefore the same cost, just data retrieved faster... you need to catch up.
@bharanidharanm2653 9 months ago
I got into a LinkedIn thread dispute where a Snowflake executive replied that Snowflake is not proprietary and that, in his view, calling it proprietary is a canard.
@xd4rkneo 8 months ago
One thing that I have not understood so far is whether Databricks can be perceived as a data virtualization tool (since they are not promoting it to be such). Yet to my understanding, Databricks is a perfect example for data virtualization. Am I wrong here?
@juliustuckayo8973 2 years ago
Waking up to another nugget of gold here
@SeattleDataGuy 2 years ago
Glad you enjoyed it!
@HistoricCookie992 2 years ago
Hello SDG, would you suggest IBM Data Engineering followed by IBM Data Warehousing, with GCP last, on Coursera, or the DataCamp Data Engineering track? Time isn't a problem. Yes, I'm switching careers. I'm already learning Python and SQL.
@norbinn 5 months ago
In terms of data consulting, do you find more clients in need of Snowflake or Databricks expertise? Is there any correlation with the size / price point of the project?
@ig2947 1 year ago
Thank you for the great video. So if I have Databricks with Delta Lake, do I need a separate data warehouse? I mean, will I be missing anything by not having a DW?
@scarothers20 2 years ago
Both platforms are top notch. But at the end of the day Databricks takes the cake.. And If done correctly, MUCH more cost effective than Snowflake.
@scarothers20 1 year ago
@Audrey Delou until you're stuck with Snowflake's proprietary data formats. Without being biased, I'll stick with the open source route. Snowflake cost is everyone's main complaint.
@laananihamza1192 1 year ago
Hi there, thank you so much for this great video. Can you please tell me how Databricks/Snowflake could help me as a Power BI/Tableau developer? And why should we consider them instead of a simpler ETL tool like Power Query, for example? Thanks again.
@ashishk81 1 year ago
What does Databricks do differently that GCP, Azure or AWS can't do?
@victoriwuoha3081 2 years ago
Nicely explained.
@SeattleDataGuy 2 years ago
Glad it was helpful!
@JohnKonye 2 years ago
Do much content around Data Security?
@SeattleDataGuy 2 years ago
Not currently. But I am aiming to dig into a few options in this space.
@cargouvu 6 months ago
I am using a pivot on data in snowflake. It's causing the values in the rows to show as strings. How do I call these columns? How can I remove the quotes after the pivot?
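One possible reading of the question above is that Snowflake's PIVOT names each output column after the pivoted value including its single quotes (for example `'JAN'`), so it has to be referenced inside double quotes. The sketch below shows two common ways around that, assuming that reading is right. The table `monthly_sales` and its columns are hypothetical, and the SQL is held in Python strings only so that all examples here stay in one language.

```python
# Option 1: rename the pivot output columns with a table alias column list.
PIVOT_WITH_ALIASES = """
SELECT *
FROM monthly_sales
  PIVOT (SUM(amount) FOR month IN ('JAN', 'FEB', 'MAR'))
  AS p (emp_id, jan, feb, mar)
"""

# Option 2: keep the default names and wrap them in double quotes when selecting.
QUOTED_REFERENCE = """
SELECT emp_id, "'JAN'" AS jan, "'FEB'" AS feb
FROM monthly_sales
  PIVOT (SUM(amount) FOR month IN ('JAN', 'FEB'))
"""


def strip_pivot_quotes(columns):
    """Client-side fallback: drop the single quotes from returned column names."""
    return [name.strip("'") for name in columns]


if __name__ == "__main__":
    print(strip_pivot_quotes(["EMP_ID", "'JAN'", "'FEB'"]))
```

Option 1 keeps the cleanup in the query itself, which tends to be easier for downstream tools than renaming after the fact.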
@houmandehesh740 1 year ago
this is amazing thank you
@lord_voldemort44 2 months ago
could we get an updated video on this lol
@SebastianWallkotter 7 months ago
I must disagree with the diminishing returns in Snowflake. From XS - 4XL I have thus far experienced a consistent inverse linear relationship, i.e., increase the cluster by one t-shirt size -> compute time reduces by 50%. Occasionally performance of the bigger cluster was even better, because more RAM means less disk spill for certain jobs, but I never noticed the inverse. This worked for small jobs (couple thousand rows moved between tables) all the way to our currently largest runs (joining around 1B rows against around 200B+ rows).
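The arithmetic behind this observation can be sketched in a few lines. Standard Snowflake warehouses double their hourly credit rate with each size step (XS = 1 credit per hour up to 4XL = 128), so if runtime really halves with each step, the total credits for a job stay flat. The perfect-scaling assumption is the commenter's experience, not a guarantee, and the function name `ideal_run` is made up for this example.

```python
# Credits per hour for standard warehouses, doubling with each size step.
CREDITS_PER_HOUR = {"XS": 1, "S": 2, "M": 4, "L": 8,
                    "XL": 16, "2XL": 32, "3XL": 64, "4XL": 128}


def ideal_run(xs_minutes, size):
    """Runtime and credits for a job, ASSUMING perfectly linear scaling."""
    rate = CREDITS_PER_HOUR[size]
    minutes = xs_minutes / rate      # each doubling of compute halves the runtime
    credits = minutes / 60 * rate    # shorter runtime cancels the higher rate
    return minutes, credits


if __name__ == "__main__":
    for size in CREDITS_PER_HOUR:
        minutes, credits = ideal_run(64, size)
        print(f"{size:>3}: {minutes:6.2f} min, {credits:.3f} credits")
```

Under this model every size costs the same number of credits and only the wall-clock time changes; any real-world deviation (diminishing returns, or the disk-spill gains mentioned above) shows up as a difference from that flat line. The sketch ignores the 60-second billing minimum.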
@EderFerreiraDias 2 years ago
Great video.
@neuera9556 3 months ago
What is the outcome, is it Snowflake or Databricks?
@qwerty99060 2 years ago
For a beginner data engineer, is AWS or Azure better?
@fa11en1ce 2 years ago
They are the same, but there are more AWS jobs
@tomthetitan101 2 years ago
AWS has some fantastic free learning resources on EdX - I'd start with the cloud practitioner course if you know nothing.
@spencerhan6983 2 years ago
Microsoft runs the enterprise, aws runs the internet. Now it's your choice for your career vision.
@culpritdesign 2 years ago
Azure data factory is very easy to learn and use
@SS-ud6nt 2 years ago
Aws
@josh-cc9oy 2 months ago
Scrap all of this get oracle autonomous DB
@geehaf 2 years ago
Like what you do. :)
@SeattleDataGuy 2 years ago
Thank you!
@JoeG2324 2 years ago
I'm thinking about having my team get started with Databricks, but I'm concerned about the cost. I think we get 3000 a month to spend, but after that we start getting billed. Does anyone know how quickly that bill starts to add up? We're not doing any kind of streaming of data. We'd be mostly using it to run jobs once a day. Some of the queries can take anywhere from 5 to 10 minutes to run, and then they output to various tables. I just don't want my department getting hit with a 30,000 dollar bill or something at month end.
@SeattleDataGuy 2 years ago
Have you talked to a Databricks account executive? When you talk to them, they should give you some perspective on controlling costs. Also, there are things you can do such as auto-termination (similar to Snowflake), which will shut down your clusters when they're not running anything. There are also lots of other things you can do to avoid costs, such as putting limits in place. Personally, I know a lot more about Snowflake cost management (which, based on what I have looked up, is far easier). But overall they both want you to spend as much as possible, but they don't want you to leave. So when you talk to an AE, make sure they give you some documentation on that (otherwise you have a terrible AE). docs.databricks.com/clusters/clusters-manage.html
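As a rough illustration of the auto-termination setting mentioned in this reply, here is a sketch of a cluster specification built as a Python dict and serialized to JSON. The field names `autotermination_minutes` and `autoscale` follow the Databricks Clusters API; the cluster name, the 20-minute idle timeout, the worker counts, and the placeholder runtime and node type are assumptions to replace with your own values.

```python
import json

# Hypothetical cluster spec; the placeholder strings must be filled in before use.
cluster_spec = {
    "cluster_name": "daily-batch",            # made-up name
    "spark_version": "<runtime-version>",     # placeholder, pick from your workspace
    "node_type_id": "<node-type>",            # placeholder, cloud-specific VM type
    "autoscale": {"min_workers": 1, "max_workers": 4},
    "autotermination_minutes": 20,            # shut down after 20 idle minutes
}

# JSON body as it might be sent to the cluster-creation endpoint or pasted in the UI.
payload = json.dumps(cluster_spec, indent=2)

if __name__ == "__main__":
    print(payload)
```

Capping `max_workers` and keeping the idle timeout short are two of the simpler "limits" of the kind described above.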
@JoeG2324 2 years ago
@@SeattleDataGuy hey thanks. yeah, I need to setup a meeting with one of the account manager. hopefully, we can implement it without incurring too many costs. it looks like a great product.
@SeattleDataGuy 2 years ago
@@JoeG2324 Are you going the machine learning route or the data warehouse route? I am curious why the databricks choice vs other options? What was the tipping point or feature that really made it make sense?
@JoeG2324 2 years ago
@@SeattleDataGuy Thanks, we're going the data warehouse route; my team doesn't handle any machine learning. The reason we're going with Databricks is that my company already has a relationship with them, as many other groups in the company use it. They have much larger budgets than my team, so cost is def a concern.
@jimthorstad 2 years ago
@@JoeG2324 Hi Joe, once you schedule your jobs, the Databricks clusters start and stop automatically, and they also auto-scale, so if your data volume varies from day to day you won't have to oversize or risk undersizing your clusters. I'm sure you're going to have fun setting up your first use case. If you have any trouble finding the right contacts at Databricks, feel free to PM me and I'll get you connected.
@gabrieldurkin7366 3 months ago
spark is so 2010... there has to be a better 2024 solution to big data ETL and exploration - lazy execution is not suited to analytical /exploratory work (databricks sits on spark, as does snowpark)
@culpritdesign 2 years ago
Azure synapse is a great contender
@vaibhavs120 7 months ago
But it's expensive as compared to snowflake
@GoldJSilverio 2 years ago
i felt this was a bit unstructured ....
@AnkitYadav-nh5by 2 years ago
Hey SDG, have you looked at Cloudera’s Cloudera Data Platform (CDP). What are your thoughts on that ?
@shreshti82 2 years ago
CDP is based on Spark and other open source solutions. We reviewed it but gave up, as there is a lot of management work. It's better to head to Snowflake / Databricks based on your current data situation, whether it's in the cloud or on-prem.
@SelvarajPrabhuMoorthy 6 months ago
Snowflake is simply structured data for analytics.
@thomasnourse7506 2 years ago
Snowflake is on AWS, Azure, and GCP now.... just FYI
@FranckCHAILLAT 11 months ago
The issue with Snowflake, in my opinion, is that it is too abstract and oversimplified. Honestly, it's good to work with, and you can do great things really fast. But after the rush of migration, when you want to scale and optimize the cost and performance of your workloads because the platform is starting to be used for real, you realize that the only way to get better performance is to change the warehouse size. Of course you can try to tune your queries, you'd say, but the query plan is not very chatty; it's minimalistic, to say the least. You cannot tune the engine to fit your needs, as you can with Spark through partition-related tuning or by playing with different join strategies. You have absolutely no control over things like codecs, compression, or data formats. And when you ask Snowflake support to learn more about what is going on underneath, you understand that it is not going anywhere. My point is that you'll have a very limited number of cost-free options to optimise your jobs and queries. That can be OK for some companies because, at the end of the day, you get what you pay for, but it can also be very frustrating for others.
@echoestakingls1218 2 years ago
PALANTIR?
@SeattleDataGuy 2 years ago
Just filmed a video, need to edit and get approved
@unitycatalog 2 years ago
Databricks is light years ahead of snowflake
@adamesd3699 1 year ago
Different people have different perspectives on this, and it all depends on use case anyway.
@manichand1996 3 months ago
Light year is distance and not time sir
@unitycatalog 3 months ago
@@manichand1996 cope
@Babayagaom 5 days ago
Only thing that can challenge databricks now is fabric
@dave11F 1 year ago
Unfortunate that snowflake does not support the R language.
@user-bs8ku6cg9f 2 years ago
where is palantir??
@SeattleDataGuy 2 years ago
I finished my first video with them, waiting for some reviews from them
@user-bs8ku6cg9f 2 years ago
@@SeattleDataGuy okay, thanks buddy
@TheCheukhin 1 year ago
Databricks completely defeated Snowflake.
@vancediaz7656 2 years ago
✋ p̾r̾o̾m̾o̾s̾m̾
@SeattleDataGuy 2 years ago
Not sure what this means
@plovecricket 3 months ago
This video is waste of time
@kylelarson5074 2 years ago
Both inferior solutions. I don't see either of these companies offering natural language processing applications for no-code operational data analysis, as other more innovative ML- and AI-native companies have.
@SeattleDataGuy 2 years ago
Please do share some of the solutions you prefer!
@kylelarson5074 2 years ago
@@SeattleDataGuy I am not as generous as you are 😉
@SeattleDataGuy 2 years ago
@@kylelarson5074 Well at least you're honest 😆
@vijayrajan5792 2 years ago
I would like to connect & talk with you on LinkedIn.