I love all your videos. They really help me out a lot. I am a noob in Airflow. Would you mind making a video about Oracle client library integration in a local Airflow instance? The connection to the Oracle DB only works in thick mode. I run the instance with the Astro CLI in vs-studio with Docker Desktop on Windows. I am a little lost about what to do now. Do I need to create a new Dockerfile? Is there a workaround to skip thick mode?
@thedataguygeorge 11 months ago
Hey, so unfortunately in my testing you'll need to use thick mode, and then edit the Dockerfile to install the Oracle client libraries as part of your Airflow environment. Super annoying to deal with, and I'm praying Oracle updates it sometime soon, but I'm not holding my breath!
@candyskullxoxo4660 11 months ago
I am the user-my8fv4lu5h by the way. Sorry for double posting, but I managed to do it. :D Thanks @thedataguygeorge
@제비-v9r 8 months ago
Thank you so much. They really help me out a lot. I need a video on data migration from Oracle Database to Snowflake using Airflow, with an example, please.
@thedataguygeorge 8 months ago
I gotchu! Will work it into the schedule!
@JeannetteSchuelingkamp 11 months ago
Can you show how to do thick mode in a local Airflow instance? How do I mount the Oracle libraries in my Astro project? :)
@thedataguygeorge 11 months ago
Hey, so you'll need to edit the Docker image as shown to install the Oracle libs as part of the startup process. It's unfortunately the only way I've found that works so far!
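For the Python side of thick mode, here is a minimal sketch. It assumes the Dockerfile has already unpacked the Oracle Instant Client into the image and exported its location in an environment variable; the variable name `ORACLE_CLIENT_LIB_DIR` and both helper functions are hypothetical names chosen for illustration. `oracledb.init_oracle_client()` is the python-oracledb call that switches the driver into thick mode.

```python
import os
from typing import Mapping, Optional

# Hypothetical variable name: set it in the Dockerfile to the directory
# where the Oracle Instant Client was unpacked.
LIB_DIR_ENV_VAR = "ORACLE_CLIENT_LIB_DIR"


def resolve_client_lib_dir(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the configured client library directory, or None if unset or blank."""
    value = env.get(LIB_DIR_ENV_VAR, "").strip()
    return value or None


def enable_thick_mode() -> None:
    """Switch python-oracledb to thick mode. Call once per process, before connecting."""
    import oracledb  # imported lazily so this module loads without the driver installed

    # With lib_dir=None the driver falls back to the system library search path.
    # On Linux the client libraries typically need to be on that search path anyway
    # (for example via ldconfig or LD_LIBRARY_PATH set in the image).
    oracledb.init_oracle_client(lib_dir=resolve_client_lib_dir())
```

Calling `enable_thick_mode()` in a quick test task is one way to check that the image can actually load the client libraries before debugging the Airflow connection itself.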
@snehithjshiju1332 1 year ago
Hi @thedataguy, explained well. I need a video on data migration from MongoDB to MySQL using Airflow, with an example, please.
@thedataguygeorge 1 year ago
Thanks! I'm working on it. Running into some hiccups on the MySQL side, but will get it out ASAP!
@-Saishankar-zp2tg 1 year ago
What if a SELECT query returns a result set of a billion or more records? How can I overcome that problem?
@thedataguygeorge 1 year ago
Chunk it! Instead of selecting everything at once, use dynamic task mapping to create X mapped task instances for X chunks of those records. Then you can upload them all in parallel without any specialized hardware!
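The chunking idea can be sketched in plain Python. The helper below only computes the chunk boundaries; `make_chunks` and the `offset`/`limit` keys are hypothetical names, and the reply does not say how the chunks should be defined, so offset-based ranges are an assumption.

```python
from typing import Dict, List


def make_chunks(total_rows: int, chunk_size: int) -> List[Dict[str, int]]:
    """Split total_rows into consecutive {"offset", "limit"} ranges of at most chunk_size rows."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        {"offset": offset, "limit": min(chunk_size, total_rows - offset)}
        for offset in range(0, total_rows, chunk_size)
    ]


# In a DAG (Airflow 2.3+), each range could feed one mapped task instance, e.g.:
#   transfer_chunk.expand(chunk=make_chunks(total_rows, chunk_size))
# where transfer_chunk is a hypothetical @task that selects and loads one range.
```

Airflow caps the number of mapped instances per task (the `max_map_length` setting, 1024 by default), so for a billion rows the chunk size has to be large enough to stay under that limit, or the limit has to be raised.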
@ayocs2 8 months ago
insert_rows is too slow for large datasets. Is there a faster alternative for loading data?
@thedataguygeorge 8 months ago
Will look into this and get a video out soon!
@shresthaupadhyay5739 1 month ago
Hey, I'm curious: can we transfer 150 million records with that?
@thedataguygeorge 1 month ago
Definitely!
@eunheechoi3745 8 months ago
Do you know how to switch from HTTP to HTTPS in Docker Compose for the Airflow webserver? I have updated the compose file with the SSL-related environment variables and volumes. The webserver container keeps restarting… Any idea? And could you create a video on this?
@thedataguygeorge 8 months ago
Hmmmm, what kind of compute are you running the webserver on?
@eunheechoi3745 1 year ago
Thank you so much! Great
@thedataguygeorge 1 year ago
Thank you! Glad it was helpful!
@eunheechoi3745 1 year ago
Yes, it's helpful. By the way, I am getting a 'conn_id' error saying it's not defined, even though the Oracle connection is already saved via the Airflow UI. Do you have any idea?
@thedataguygeorge 1 year ago
Hmm, maybe check that your conn ID is consistent? Also, Oracle can sometimes be funky, so if you want to be sure it's not a problem with the connection, you can hard-code the connection in the DAG.
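One possible way to hard-code a connection for debugging is sketched below. It relies on Airflow reading connections from `AIRFLOW_CONN_<CONN_ID>` environment variables, which accept a JSON document in Airflow 2.3 and later. The helper name `register_connection` and every value passed to it are placeholders, and this may not be what the reply had in mind. Credentials in DAG code are fine for isolating a problem but should not stay there.

```python
import json
import os


def register_connection(conn_id: str, *, host: str, port: int, login: str,
                        password: str, service_name: str) -> str:
    """Expose an Oracle connection to Airflow through an environment variable.

    Returns the name of the variable that was set.
    """
    env_name = f"AIRFLOW_CONN_{conn_id.upper()}"
    os.environ[env_name] = json.dumps({
        "conn_type": "oracle",
        "host": host,
        "port": port,
        "login": login,
        "password": password,
        "extra": {"service_name": service_name},
    })
    return env_name
```

The `conn_id` passed here must match the one the hook or operator is given exactly, which also covers the consistency check suggested above.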
@eunheechoi3745 1 year ago
@thedataguygeorge Yeah, I ended up just hard-coding it. However, another problem came up: "insert_rows for i, row in enumerate(rows,1):". I've also built Docker images by updating requirement.txt and recreated the webserver and scheduler...
@eunheechoi3745 1 year ago
Also, are you familiar with the error "TypeError: object of type timestamp is not JSON serializable"?
@lookatskykyyy4293 5 months ago
Hi! Do you still see comments here? I have a problem connecting to Oracle: the error log shows that I can't use thin mode. So I tried to install cx_Oracle in the Dockerfile, but that fails to install too.
@thedataguygeorge 5 months ago
What issue are you running into?
@oxente198 4 months ago
cx_Oracle is outdated. Use oracledb.
@RAYENAKKARI-o7h 10 months ago
What version of Airflow is being used, please?
@thedataguygeorge 10 months ago
This was run on 2.6.3!
@apinansornkaewdara1913 1 year ago
Some columns are in datetime format: TypeError: Object of type Timestamp is not JSON serializable
@thedataguygeorge 11 months ago
I believe you'll need to convert it out of Timestamp format. You can add a step in the Python function to do the type conversion as part of the transfer process.
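A minimal sketch of such a conversion step follows. It assumes the rows arrive as sequences of column values and that ISO 8601 strings are acceptable downstream; `to_json_safe` and `convert_rows` are hypothetical helper names. A pandas `Timestamp` is a subclass of `datetime.datetime`, so the same check should catch it.

```python
import json
from datetime import date, datetime
from typing import Any, Iterable, List, Tuple


def to_json_safe(value: Any) -> Any:
    """Turn date/datetime values into ISO 8601 strings; pass everything else through."""
    if isinstance(value, (datetime, date)):
        return value.isoformat()
    return value


def convert_rows(rows: Iterable[Iterable[Any]]) -> List[Tuple[Any, ...]]:
    """Apply to_json_safe to every column of every row."""
    return [tuple(to_json_safe(value) for value in row) for row in rows]
```

Running the rows through `convert_rows` before returning them from the task means `json.dumps` no longer sees raw timestamp objects when the result is serialized.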