I like this video. As a data scientist with data engineering responsibilities, I see clear use cases for both tools. We use SageMaker instead of DataBrew, which should point to the differences. I would say that Glue Studio is focused on data. There is no intention of doing anything with it except getting it, transforming it, and storing it (ETL). Job done. DataBrew is there for people who use data. For them, data is the tool, not the product: you want a dataset to get insights or train a model. The intention is different, which also means that the skills and preferences of the users are different.
@vincenthuysmans2137 2 years ago
FYI: AWS Glue Studio also provides a data preview. It looks like they added it after this video was released.
@chriskondiah741 3 years ago
By the way, I love this video. I feel AWS has many redundant tools, and they should start to narrow them down to limit confusion.
@AWSTutorialsOnline 3 years ago
I agree. It does sometimes create confusion due to duplicate capabilities.
@skiran6316 2 years ago
Data lineage is useful when we are collaborating with multiple teams and have multiple sources: lineage is an easier way to track where data is coming from and how it is transformed.
@prathapn01 6 months ago
very informative sir... :)
@vivekjacobalex 3 years ago
OK, thanks for the information. Now I understand: DataBrew is more geared towards data preparation using ML, and Glue is more geared towards job processing using PySpark. The similarity is that both can do GUI-based ETL.
@AWSTutorialsOnline 3 years ago
Glue can do limited ETL to S3 only.
@zpino 2 years ago
Thanks a lot. Very clear.
@AWSTutorialsOnline 2 years ago
Glad it was helpful!
@chriskondiah741 3 years ago
What is the difference between DataBrew and SageMaker Data Wrangler?
@AWSTutorialsOnline 3 years ago
SageMaker Data Wrangler is part of SageMaker Studio, and it can be used to build an end-to-end pipeline along with other pipeline components such as model training, model deployment, etc. DataBrew is also for data scientists, but it is only for feature engineering, nothing else. Hope it helps.
@LittleBoodhaOne 1 year ago
Thank you for this informative video :) I would like to submit a problem that I've experienced in Glue DataBrew; if any of you can help, it would be a blessing. Here's the situation: I would like to filter on a column value that isn't in the sample dataset, and I've found out that the recipe only focuses on the sample dataset. The fact that the sample is limited to 5000 rows max is preventing me from completing my recipe. Does somebody have an idea how to bypass the sample size limit?
@grhaonan 2 years ago
Another key difference is that DataBrew doesn't offer custom transformations, I reckon?
@vincenthuysmans2137 2 years ago
Nope, it doesn't. DataBrew is a no-code solution, whereas Glue Studio is hybrid (low-code/heavy-code).