@HansHelmutKohls • 3 years ago
Really well structured and explained; this is what dbt's own tutorials should be like.
@KahanDataSolutions • 3 years ago
Thank you!
@billal27nulnul • 1 year ago
Thanks for the video. I took over a dbt project, and it seems that whoever I took over from watched this video too. Lucky for me!
@krishnachinta8937 • 3 years ago
I got the error below; could you please advise what caused it? "depends on a source named 'snowflake_sample_data.store_sales' which was not found"
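A note on this error: dbt raises it when a model references {{ source('snowflake_sample_data', 'store_sales') }} but no matching entry exists in a sources .yml file. A minimal declaration that would satisfy the reference might look like the sketch below; the database and schema names are assumptions (Snowflake's sample data ships in SNOWFLAKE_SAMPLE_DATA under the TPC-DS schemas), so adjust them to wherever your store_sales table actually lives.

    version: 2

    sources:
      - name: snowflake_sample_data
        database: snowflake_sample_data
        schema: tpcds_sf10tcl  # assumed; use the schema that holds store_sales
        tables:
          - name: store_sales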
@MrChalak • 4 years ago
Great, I was searching for a dbt tutorial and came across your video. It's really informative 👍👍
@KahanDataSolutions • 4 years ago
Thank you! Glad you found it helpful.
@_saaskey • 4 years ago
This is great!! Keep doing more dbt with Snowflake.. Ty!!
@KahanDataSolutions • 4 years ago
You are very welcome! I plan on it!
@adegbiteogunbode3567 • 1 year ago
Hello David, after following your process bit by bit, this error popped up after dbt run:

11:04:27 Running with dbt=1.5.0
11:04:28 Unable to do partial parsing because saved manifest not found. Starting full parse.
11:04:28 Encountered an error: Parsing Error
Error reading adetola: staging/source.yml - Runtime Error
Syntax error near line 4
------------------------------
1 | version: 2
2 | sources:
3 | -name: northwind
4 | database: analyticproject1
5 | schema: stg_northwind
6 | tables:
7 | -name: customer
Raw Error:
------------------------------
mapping values are not allowed in this context
in "<unicode string>", line 4, column 12

Please kindly advise on how to solve this problem.
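For what it's worth, YAML raises "mapping values are not allowed in this context" here because the list entries are written as "-name:" with no space after the hyphen, so the parser never sees a list item. A corrected source.yml, keeping the names from the error output, would be:

    version: 2

    sources:
      - name: northwind
        database: analyticproject1
        schema: stg_northwind
        tables:
          - name: customer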
@practicalhead • 1 year ago
Thanks for the clean explanations. How can we use dbt-spark to read from and write to a PostgreSQL database schema instead of the Hive warehouse? Is it possible to leverage Spark's distributed processing instead of native PostgreSQL compute? The reason being that not all transformations may be feasible within the Postgres servers, and this will need to scale.
@sunny25984 • 2 years ago
Hi, when I run dbt after creating snowflake_sample_data_store_sales, I get the error below:

10:26:17 Running with dbt=1.1.0
10:26:18 Found 3 models, 4 tests, 0 snapshots, 0 analyses, 181 macros, 0 operations, 0 seed files, 1 source, 0 exposures, 0 metrics
10:26:18
10:26:22 Encountered an error: Runtime Error
Database error while listing schemas in database "DEMO_DB"
Database Error
002043 (02000): SQL compilation error:
Object does not exist, or operation cannot be performed.

Please help.
@KahanDataSolutions • 2 years ago
Hi Sailesh. Based on the error, my guess is the role you're using for dbt doesn't have the right permissions for that DB. Check out your profiles.yml file to see what role you are using by default. Then go to Snowflake and make sure that role has permissions to perform actions on DEMO_DB and any underlying schemas you are trying to build. Hope that helps!
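As a sketch, granting those permissions in Snowflake could look like the following; the role name TRANSFORM is a placeholder, so substitute whatever role your profiles.yml points at:

    grant usage on database DEMO_DB to role TRANSFORM;
    grant usage on all schemas in database DEMO_DB to role TRANSFORM;
    grant create table, create view on schema DEMO_DB.PUBLIC to role TRANSFORM;
    grant select on all tables in schema DEMO_DB.PUBLIC to role TRANSFORM;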
@pranavbharadwaj3350 • 1 year ago
Great work man. Thanks!!
@KahanDataSolutions • 1 year ago
No problem!
@emrea6799 • 3 years ago
Thanks for the video, it's really helpful! Maybe one little comment: it makes no sense to configure the source database once again in the sources yaml file, since we can use only one DB in a project and it is already defined in the profiles yaml file.
@KahanDataSolutions • 3 years ago
Hi Emre, glad you found it helpful! Technically that is true if you always plan to source and deploy your projects within the same DB. However, oftentimes there will be a separation. For example, you may have one database (or several) where your raw data is stored - think of it like a data lake. The pointers to these databases/schemas are what get configured as dbt sources (i.e. what is in the sources.yml). But then you actually deploy your transformed models to a completely different database. That deployment database is what would be set in your profiles.yml at the project level. Note that this could also be changed dynamically via the +database model configuration. Here is a good article about this design w/ dbt - blog.getdbt.com/how-we-configure-snowflake/
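To sketch that separation (all names here are hypothetical): the sources .yml points at the raw database, while the profile target, or a +database override in dbt_project.yml, controls where the models get built.

    # models/staging/sources.yml -- where the raw data lives
    version: 2
    sources:
      - name: raw_sales
        database: RAW_DB
        schema: LANDING
        tables:
          - name: orders

    # dbt_project.yml -- deploy this folder's models to a different database
    models:
      my_project:
        marts:
          +database: ANALYTICS_DB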
@emrea6799 • 3 years ago
@KahanDataSolutions Hi, it would be really nice to have what you describe. Then we would have an ELT tool, and that would be great! But the reality is: dbt is only a "T" (Transform) tool. It simply does not support multiple databases in a project, so it is sufficient to specify the database with its credentials only in the profile file. You can still overwrite that in the project YAML files, but after compiling you should still have only one host and one DB referenced across your project. Bottom line: if the source data for your project is not in your target DB, you need to load it (with a tool like Stitch). (By the way, I tried to connect to 2 DBs and got a "cross-database references are not implemented" error in Postgres :)
@KahanDataSolutions • 3 years ago
@emrea6799 Ah I see... I didn't realize you were using dbt directly on Postgres. Other DB platforms do support cross-DB calls (such as Snowflake). Sounds like you may only need one DB reference in your case if you are unable to do that.
@emrea6799 • 3 years ago
@KahanDataSolutions OK, I see: you mean the Snowflake platform allows you to query the DBs underneath it, as in the video, but other technologies do not allow that, not only Postgres. Besides, normally you have a lot of use cases with cross-platform calls, and for those even Snowflake is not an option; you need an ETL tool.
@abhinavjain1056 • 3 years ago
Hi, is there any way to reference user-defined functions, just like we add tables in the source files?
@KahanDataSolutions • 3 years ago
I think the closest thing in dbt to a custom user-defined function would be the concept of a "macro". Here is some documentation - docs.getdbt.com/docs/building-a-dbt-project/jinja-macros/. With macros you can create custom logic, take inputs, and reference it in any of your models as needed. Hope this helps!
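For example, a small macro used like a scalar function could look like this (the macro, column, and model names are made up for illustration):

    -- macros/cents_to_dollars.sql
    {% macro cents_to_dollars(column_name, precision=2) %}
        round({{ column_name }} / 100.0, {{ precision }})
    {% endmacro %}

    -- models/orders.sql, calling the macro like a UDF
    select
        order_id,
        {{ cents_to_dollars('amount_cents') }} as amount_usd
    from {{ ref('stg_orders') }}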
@megabyte4710 • 3 years ago
This video series is very helpful, thanks. I have a question: using templates, can we loop over multiple tables under the same schema? If yes, please suggest how we can do that. Thank you.
@KahanDataSolutions • 3 years ago
Glad you found it helpful! For the looping question, what are you trying to accomplish with each loop? Depending on your goals, you could create a macro, leverage the information schema to set a variable that holds all tables in a particular schema, and then use a for loop to loop through each table. Within that loop you could execute whatever logic you wanted on the particular table.
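A rough sketch of that pattern, assuming the per-table action is a grant (the schema name, role, and macro name are placeholders):

    {% macro grant_select_on_schema(schema_name, role) %}
      {% set tables_query %}
        select table_name
        from information_schema.tables
        where table_schema = upper('{{ schema_name }}')
      {% endset %}
      {% if execute %}
        {% set results = run_query(tables_query) %}
        {% for table_name in results.columns[0].values() %}
          {# swap in whatever per-table logic you need #}
          {% do run_query('grant select on ' ~ schema_name ~ '.' ~ table_name ~ ' to role ' ~ role) %}
        {% endfor %}
      {% endif %}
    {% endmacro %}

You could then invoke it with something like: dbt run-operation grant_select_on_schema --args '{schema_name: staging, role: reporter}'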
@TubeDirektor • 4 years ago
Unfortunately I could not get through this one. I followed the instructions as-is, but I get an error on the execution of the SQL file:

Object 'DEMO_DB.PUBLIC.STORESALESDATA' does not exist or not authorized.
compiled SQL at target\run\dbt_demo\models\snowflake_sample_data\snowflake_sample_data__StoreSales.sql

Any pointers on what could be going on?
@KahanDataSolutions • 4 years ago
When you look at your profiles.yml file, are you set up to run as a user who has permissions to "create views"? In the previous video, I used my default user who is assigned the ACCOUNTADMIN role, which has universal permissions.
@TubeDirektor • 4 years ago
@KahanDataSolutions Yes, the sample models were created successfully, both the table and the view. That is why I was confused as to why this one failed.
@KahanDataSolutions • 4 years ago
If you go to the file in the "target/" folder that is shown in the error, you can find the actual SQL being executed. Try running that in Snowflake directly and see if you can troubleshoot from there. Perhaps there is a typo somewhere, or try running as a different user.
@TubeDirektor • 4 years ago
@KahanDataSolutions That worked, it was indeed a typo :-) but the error msg was misleading. Thanks for the guidance, very much appreciate it.

19:10:44 | 2 of 3 OK created view model PUBLIC.snowflake_sample_data__StoreSales [SUCCESS 1 in 3.00s]
@KahanDataSolutions • 4 years ago
You bet! Glad it all worked out.
@prudhvik5603 • 1 year ago
Hey, great explanation! Will the new model inherit all the constraints from the source data?
@imohammd2003 • 2 years ago
Thanks for the great video. One quick question: for the target database (dev or prod), you have specified the connection info in profiles.yml. But for the source database, where do we specify the connection information? Thanks!
@KahanDataSolutions • 2 years ago
Thanks for watching! You can set that info in the yml file where you declare your sources. You can set "database" and "schema" as properties, which will then fully qualify the name of the source table when you use the {{ source() }} function in queries. Here is the documentation on the source properties: docs.getdbt.com/reference/source-properties
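A small sketch of those properties in action (all names hypothetical):

    # models/staging/sources.yml
    version: 2
    sources:
      - name: crm
        database: RAW_DB
        schema: CRM_EXPORT
        tables:
          - name: customers

A model that selects from {{ source('crm', 'customers') }} would then compile to RAW_DB.CRM_EXPORT.CUSTOMERS.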
@surjeetkarmakar3188 • 2 years ago
Hi, don't we need to put the model details along with the source in the schema.yml file?
@leiaduva • 2 years ago
From previous videos, I had renamed the models\example folder to \staging. I then created a new folder under models like you did in this current video. However, when I do a dbt run, it says I have 3 models but it only runs "my_first_dbt_model" and "my_second_dbt_model". It doesn't run my new one under snowflake_sample_data. I have the correct permissions. I thought it might be because I had been trying out the materialized settings like you did in your previous videos in the dbt_project.yml. However, I've tried every combination I can and even commented it out.

models:
  Course_Instance:
    # Config indicated by + and applies to all files under models/staging/
    staging:
      +materialized: table
      +schema: staging
    snowflake_sample_data:
      +materialized: view
      +schema: staging
@KahanDataSolutions • 2 years ago
Can you share what the models/ directory looks like in your project?
@leiaduva • 2 years ago
I can't figure out how to put a screenshot here, but I started the entire project over and I'm trying anew. In File Explorer, it's Documents\SnowflakeDemo\models with a subfolder for example and a subfolder for snowflake_sample_data. It's just like yours. I didn't touch the dbt_project.yml file, so it still has:

models:
  SnowflakeDemo:
    # Config indicated by + and applies to all files under models/example/
    example:
      +materialized: view

I tried changing it to this:

models:
  SnowflakeDemo:
    # Config indicated by + and applies to all files under models/example/
    example:
      +materialized: view
    snowflake_sample_data:
      +materialized: view

That didn't help. When I do dbt run, it says it found 3 models, but it only builds 2.
@leiaduva • 2 years ago
@KahanDataSolutions I finally got it to deploy, but I had to delete the snowflake_sample_data subfolder completely. I took the source schema out and pasted it into the schema file with the sample models, and I made snowflake_sample_data__store_sales.sql reside at the same level as the samples. That's the only way it would build. Not sure what is different.