Comments
@jedodinho 1 day ago
I have a question. In my project folder I created a venv with a very specific version of Python and pandas, and I need to run this code inside that venv. I couldn't figure out how to make JupyterLab detect the venv and run the code... With the Windows Task Scheduler for .py scripts, for example, I set the path to the Python interpreter I want to use, so I get the correct environment. Is it possible to do that in JupyterLab for my .ipynb files?
@BiInsightsInc 1 day ago
Yes, you can schedule and run your notebooks (.ipynb files) with the JupyterLab scheduler. You cannot run an .ipynb via the Windows Task Scheduler; use Jupyter Scheduler for notebook (.ipynb) files instead. jupyter-scheduler.readthedocs.io/en/latest/
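If the goal is to have JupyterLab use a specific venv, one common approach (a minimal sketch, not from the video; the venv path and kernel name are placeholders) is to register the venv as a named kernel:

```python
# Hypothetical sketch: register an existing venv as a named Jupyter kernel so
# JupyterLab can run notebooks with that interpreter. Paths/names are placeholders.
import subprocess

venv_python = r"C:\projects\myproject\venv\Scripts\python.exe"

# ipykernel must be installed inside the venv itself.
subprocess.run([venv_python, "-m", "pip", "install", "ipykernel"], check=True)

# Expose the venv as a kernel; then pick it from the notebook's kernel menu.
subprocess.run(
    [venv_python, "-m", "ipykernel", "install", "--user",
     "--name", "myproject-venv", "--display-name", "Python (myproject venv)"],
    check=True,
)
```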
@honzajazz 3 days ago
You should also uncomment (remove the #) the line with the wal_level key for the setting to take effect.
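As a quick sanity check after restarting Postgres, you can confirm the value took effect (a minimal sketch; connection details are illustrative placeholders):

```python
# Minimal sketch: confirm wal_level took effect after editing postgresql.conf
# and restarting Postgres. Connection details are placeholders.
import psycopg2

conn = psycopg2.connect(host="localhost", dbname="postgres",
                        user="postgres", password="postgres")
with conn.cursor() as cur:
    cur.execute("SHOW wal_level;")
    print(cur.fetchone()[0])  # expect 'logical' for CDC/Debezium-style setups
conn.close()
```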
@afk4dyz 7 days ago
Password as an environment variable is an absolute game changer.
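For anyone looking for the pattern, a minimal sketch (the variable name DB_PASSWORD and the connection string are illustrative):

```python
# Minimal sketch: read a database password from an environment variable
# instead of hard-coding it. The variable name and connection string are placeholders.
import os

db_password = os.environ.get("DB_PASSWORD")
if db_password is None:
    raise RuntimeError("DB_PASSWORD is not set")

conn_str = f"postgresql://etl:{db_password}@localhost:5432/AdventureWorks"
```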
@11folders 7 days ago
Excellent Demo! Is there a way to specify more granular details about the source? For instance, I may want to specify the page or section heading for a source, especially if I want to cite it as a reference.
@BiInsightsInc 7 days ago
Thanks. I think you can print the page number if it's part of the metadata stored in the vector DB along with the chunk identifier. You'd need to extract/store that information while preparing your source data.
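A minimal sketch of that idea, assuming a LangChain-style loader (PyPDFLoader keeps the page index in each document's metadata; the file name is a placeholder):

```python
# Hedged sketch: keep the page number in each chunk's metadata so it can be
# cited alongside the answer. Assumes a LangChain-style pipeline.
from langchain_community.document_loaders import PyPDFLoader

docs = PyPDFLoader("report.pdf").load()   # one Document per page
# ...split, embed, and store `docs` in your vector DB; the metadata travels along...

for doc in docs[:3]:
    print(doc.metadata.get("source"), "page", doc.metadata.get("page"))
```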
@mdanowarhossain3802 7 days ago
Where are the .pkl files?
@BiInsightsInc 7 days ago
You save your trained model as a pickle file. Here is the link to the notebook where we save the model to disk as a pickle file. github.com/hnawaz007/pythondataanalysis/blob/main/ML/Email%20Spam%20Classifier.ipynb
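The core of it looks roughly like this (a sketch; the toy training data and file name are illustrative, not from the notebook):

```python
# Sketch: train a trivial model, save it as a .pkl file, and load it back.
# The toy data and file name are placeholders for the notebook's real model.
import pickle
from sklearn.naive_bayes import MultinomialNB

model = MultinomialNB().fit([[0, 1], [1, 0]], [0, 1])   # stand-in training step

with open("spam_classifier.pkl", "wb") as f:
    pickle.dump(model, f)

with open("spam_classifier.pkl", "rb") as f:
    loaded_model = pickle.load(f)

print(loaded_model.predict([[0, 1]]))
```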
@ashwinkumar5223 8 days ago
I mean, can we use a website instead of a PDF file?
@BiInsightsInc 8 days ago
Yes, you can use a website as a source. You'd need a web scraper component to scrape data from the web and format it for your RAG app. Here is an article on this subject: medium.com/@iankelk/rag-detective-retrieval-augmented-generation-with-website-data-5a748b063040
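A minimal sketch of such a scraper component (requests + BeautifulSoup; the URL is a placeholder, and real sites may need robots.txt checks and better HTML cleanup):

```python
# Hedged sketch: pull a page's text so it can be chunked and embedded for RAG.
import requests
from bs4 import BeautifulSoup

resp = requests.get("https://example.com/docs/page", timeout=10)
resp.raise_for_status()

soup = BeautifulSoup(resp.text, "html.parser")
text = " ".join(p.get_text(strip=True) for p in soup.find_all("p"))
print(text[:500])   # feed `text` into your chunking/embedding step
```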
@ashwinkumar5223 8 days ago
Can we do RAG with a single website as the source to interact with?
@BiInsightsInc 8 days ago
Yes, we can use a website as a data source.
@Croat955 9 days ago
Great video, it helped me a lot.
@huyvu4741 9 days ago
Is it free?
@BiInsightsInc 9 days ago
Yes, there is an open source version!
@ivanl7786 13 days ago
Hello! That's a great explanation, thanks! Please tell me how the data transfer is carried out? Are we using the RAM of the server where Python is installed or are we using the RAM of the server where PostgreSQL is installed? I want to understand if this scenario is suitable if there is a table with 30 million rows on the SQL Server side?
@BiInsightsInc 12 days ago
Hey there, in this use case we utilized pandas, and it loads the data into the memory of the server where Python is installed. So you would need to make sure the data fits in that server's memory, load it in batches, or use a chunking strategy to load your data. Hope this helps.
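A minimal sketch of the chunked approach (connection string, table name, and chunk size are illustrative):

```python
# Hedged sketch: stream a very large table in chunks instead of pulling all
# rows into the Python server's memory at once.
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("postgresql://etl:password@localhost:5432/AdventureWorks")

for chunk in pd.read_sql_query("SELECT * FROM sales.orders", engine, chunksize=100_000):
    # process/load each batch here instead of holding the whole table in RAM
    chunk.to_csv("orders.csv", mode="a", header=False, index=False)
```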
@thecarlostheory 15 days ago
Thank you for the help! Very useful! Leave private models!
@user-ke8lb9fu5h 16 days ago
Thank you sir, you helped me understand Airflow. I did the same thing, following the same process but from MySQL (extract-load -> transformation -> load) with the free employees database, and I shared it on my GitHub and LinkedIn, tagging this video.
@larsh8560 17 days ago
Quite cool. Is this just "magic" of DBT or is this normal in other tools also? (asking as someone who remembers SCDs as painful in SSIS when I worked with it years ago)
@BiInsightsInc 17 days ago
This is the dbt "magic". In other tools you need to go through multiple steps and sometimes need custom code to achieve this.
@mahraneabid 17 days ago
Hi sir, the edited model can't be seen by Ollama. When I call ollama list in CMD it displays only ollama3.1. Why?
@BiInsightsInc 12 days ago
If you do not see the custom model in your Ollama ecosystem, then check the model file to make sure it's correct. Here is an example of a custom model file from OpenWebUI. openwebui.com/m/darkstorm2150/Data-Scientist:latest
@mahraneabid 17 days ago
When it said "would you like me to break down the sales by product" and you responded with yes, will it perform the action it mentioned or not?
@BiInsightsInc 12 days ago
It may work if the SQL model is able to generate SQL for the question. You can try it and let us know if this extended option works.
@rafaelg8238 18 days ago
Great video, thanks.
@BiInsightsInc 18 days ago
Glad you liked it!
@hashimraza422 19 days ago
Hi Boss - Thanks for sharing this. Can you help regarding this? We are looking for exactly this for our Prod and Dev environments.
@BiInsightsInc 1 day ago
Hi there, you can shoot me an email here: [email protected]. We can discuss what your requirements are and take it from there.
@HashimRaza-k4k 19 days ago
Hi Boss - Thanks for your time and effort. I need to learn this process; can you help in this regard?
@abhisheksaini5563 21 days ago
I have to connect to an on-prem SQL Server using Cloud Composer; how do I install the driver on Cloud Composer?
@BiInsightsInc 18 days ago
Hi there, the GCP Composer worker's pod image runs on Ubuntu, so you can install the drivers on the Composer image. I haven't tried it personally, but here is a link on installing the SQL Server ODBC drivers: stackoverflow.com/questions/60346440/google-composer-how-do-i-install-microsoft-sql-server-odbc-drivers-on-environme
@laophan4591 24 days ago
I'd like to create a custom folder to keep the CSV on the host and link it with the Airflow image. I found that this practice is called mounting a volume, done by adding: - ${AIRFLOW_PROJ_DIR:-.}/custom_folder:/opt/airflow/custom_folder => but after that, when I check by running ls -a, the terminal does not show the folder. Could you please help?
@BiInsightsInc 23 days ago
You can mount a directory into the Airflow image under the volumes section. Make sure the directory exists in the location where your docker-compose file is located. In the following example we mount the dags folder from the local directory into the Airflow image: volumes: - ./dags:/opt/airflow/dags
@penishilton6940 25 days ago
ty king
@adilsaju 26 days ago
AMAZING
@ambernaz6793 27 days ago
Hi Nawaz, could you please guide me if I want to load data into Power BI? How would the code be different? I am new to this field and I am learning ETL and data pipelining. Thank you.
@BiInsightsInc 27 days ago
Hi there, you load the data into a storage layer, i.e. flat files, a database, or a data lake. So you can use any of the pipelines to load data into a storage layer; Power BI then reads the data from that storage layer. I have a Power BI series. Here is the link: kzbin.info/aero/PLaz3Ms051BAnnlZfFxXs3ezSVM54OlYBr
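For instance, a minimal sketch of landing a dataframe in a database table that Power BI can then connect to (the file, table, and connection string are illustrative):

```python
# Hedged sketch: load transformed data into a storage layer (a Postgres table
# here); Power BI then uses that table as its data source.
import pandas as pd
from sqlalchemy import create_engine

df = pd.read_csv("sales.csv")
engine = create_engine("postgresql://etl:password@localhost:5432/warehouse")
df.to_sql("sales", engine, schema="public", if_exists="replace", index=False)
```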
@ambernaz6793 27 days ago
@BiInsightsInc Thank you
@michaelaustin1638 27 days ago
Awesome video! How did you get the various categories when creating a model?
@BiInsightsInc 27 days ago
Thanks. Those are defaults in OpenWebUI. You can select relevant categories for a custom model.
@ankit7918 28 days ago
What laptop specification is best to run ML models? Please give some laptop recommendations.
@BiInsightsInc 27 days ago
You can pick up a Lenovo laptop that has at least 12 GB of VRAM, four RAM slots upgradable to 128 GB of RAM, and 3 NVMe slots. Here is one that fits the description: www.lenovo.com/us/en/p/laptops/thinkpad/thinkpadp/thinkpad-p16-gen-2-(16-inch-intel)/21fa0027us
@user-kk8xf1jc8h 28 days ago
I got this error: Data extract error: ('28000', "[28000] [Microsoft][SQL Server Native Client 11.0][SQL Server]Login failed for user 'etlnew'. (18456) (SQLDriverConnect); [28000] [Microsoft][ODBC Driver Manager] Invalid connection string attribute (0); [28000] [Microsoft][SQL Server Native Client 11.0][SQL Server]Login failed for user 'etlnew'. (18456)") even though I have changed Server Authentication too.
@BiInsightsInc 28 days ago
Check your connection string to make sure it is valid. Also, make sure your credentials are valid and that you can connect to the database via SQL Server Management Studio or DBeaver. Enable Mixed Mode authentication if you haven't already: in SQL Server Management Studio, go to Properties > Security > Server Authentication and check the option "SQL Server and Windows Authentication mode".
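A quick way to test the login and connection string in isolation (a sketch; the driver name, server, and password are placeholders to match your setup):

```python
# Hedged sketch: test the SQL Server login/connection string outside the pipeline.
import pyodbc

conn_str = (
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=localhost;DATABASE=AdventureWorks;"
    "UID=etlnew;PWD=your_password;"
)
with pyodbc.connect(conn_str, timeout=5) as conn:
    print(conn.cursor().execute("SELECT @@VERSION").fetchone()[0])
```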
@laophan4591 28 days ago
Appreciate your video, I wish I had known about it earlier!
@jeanchindeko5477 29 days ago
3:18 Apache Iceberg has ACID transactions out of the box; it's not Nessie that brings ACID transactions to Iceberg. In the Iceberg specification, the catalog only has knowledge of the list of snapshots; the catalog doesn't track the list of individual files that are part of a commit or snapshot.
@Pattypatpat7122 1 month ago
This was great, much easier on my Windows machine than my Linux machine for a change. Just a question: your table definitions in the video for AdventureWorks don't appear to be the same as the ones available on the Microsoft site for versions 2019 or 2022. I created some dummy tables based on the table definitions in your GitHub, but obviously my dummy data doesn't relate, so I can't properly test whether this model is generating the correct SQL. Do you have a link to the database you were using?
@BiInsightsInc 28 days ago
Hi, the database used in the video is developed on top of AdventureWorks 2019. Here are the links to the repo and the series: github.com/hnawaz007/dbt-dw hnawaz007.github.io/mds.html
@aniketrele7688 1 month ago
Hi, are the connector name and topic name always the same? Can you name your topic something else? It would be helpful to have multiple topics for one connector. Thanks in advance.
@BiInsightsInc 1 month ago
Hi there, no, your connector name can be different from your topic name. You can have multiple connectors read from the same topic.
@junaidmalik660 1 month ago
Thanks a lot for the detailed video. I want to ask about the accuracy of the results: is it accurate or not for big datasets?
@BiInsightsInc 1 month ago
The results are good on various data sizes. However, you should be careful with the data size: PandasAI uses a generative AI model to understand and interpret natural language queries. The model has a token limit, and if your data exceeds that limit then it won’t be able to process your request.
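One rough way to gauge this up front (a sketch, assuming an OpenAI-style tokenizer via tiktoken; the file name is a placeholder):

```python
# Rough, hedged check: estimate how many tokens the serialized dataframe would
# consume, to compare against the model's context limit.
import pandas as pd
import tiktoken

df = pd.read_csv("sales.csv")
enc = tiktoken.get_encoding("cl100k_base")
n_tokens = len(enc.encode(df.to_csv(index=False)))
print(f"~{n_tokens} tokens if the whole dataframe were sent to the model")
```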
@tiagovianez8482 1 month ago
Teacher, where is the source of this data? I would like to insert it into my database. In my case I will insert it into PostgreSQL, run the ETL, and write it to S3. Could you provide me with the source?
@BiInsightsInc 1 month ago
Hi, the data source is an MS SQL Server sample database called AdventureWorks. You can download and restore it. I have a tutorial on how to install SQL Server and restore this database. Here is the link: kzbin.info/www/bejne/m2bQp6KBqrtmrtU
@krishnarajuyoutube 1 month ago
Can we run Llama 3 locally on any simple VPS server, or do we need GPUs?
@BiInsightsInc 1 month ago
Hi, you'd need a GPU to run an LLM. By the way, VPS servers can have GPUs.
@diwaspoudel7 1 month ago
Hi there, do you have a Docker Compose YAML file containing the MSSQL connection?
@BiInsightsInc 1 month ago
Yes, I have done a video on it where I install additional SQL Server providers and connect to SQL Server. Here is the link: kzbin.info/www/bejne/qmXLZampirGqfKc&lc=UgxQFElBNgK2dwKo5kV4AaABAg
@mohdmuqtadar8538 1 month ago
Great video. What if the response from the database exhausts the context window of the model?
@BiInsightsInc 1 month ago
Thanks. If you are hitting the model's maximum context length, then you can try the following. 1. Choose a different LLM that supports a larger context window. 2. Brute force: chunk the document and extract content from each chunk. 3. RAG: chunk the document and only extract content from the subset of chunks that look “relevant”. A minimal chunking sketch is shown below. Here is an example of these approaches from LangChain: js.langchain.com/v0.1/docs/use_cases/extraction/how_to/handle_long_text/
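A minimal sketch of the chunking step for options 2 and 3 (LangChain's text splitter; the chunk sizes and input file are illustrative):

```python
# Hedged sketch: split an oversized response into chunks that each fit the
# model's context window. `long_text` stands in for the large database response.
from langchain_text_splitters import RecursiveCharacterTextSplitter

long_text = open("db_response.txt").read()   # placeholder for the big response

splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=200)
chunks = splitter.split_text(long_text)

for chunk in chunks:
    pass  # send each chunk to the LLM (brute force) or only the "relevant" ones (RAG)
```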
@GordonShamway1984 1 month ago
Wonderful as always and just in time. Was going to build a similar use case that auto generates database docs for business users next week. This comes in handy🎉 Thank you again and again
@BiInsightsInc 1 month ago
Glad it was helpful! Happy coding.
@KevinHa-wg8qv 1 month ago
Hi. I encountered this error when trying to add the Debezium connector via an API call. Would you please help? Thanks. Failed testing connection for jdbc:postgresql://localhost:5432/AdventureWorks with user 'etl': Connection to localhost:5432 refused. Check that the hostname and port are correct and that the postmaster is accepting TCP/IP connections. [io.debezium.connector.postgresql.PostgresConnector]
@ryanschraeder8681 1 month ago
What happens if you kill the Airflow webserver or localhost? Will the DAG still run on the schedule you specified?
@BiInsightsInc 1 month ago
If the services are down, then the DAG won't run. The webserver is just the UI; it is the scheduler (and its workers) that trigger and execute DAG runs, so make sure those services stay up for the DAG to execute on schedule.
@gustavoleo 1 month ago
Namaste Haq!!! Thank you so much for making this video and also for sharing your repo. I'm a bit confused about how you build the connection string; would you mind sharing it? I had checked your Connect to SQL Server with Python notebook as well, but didn't realize what is incorrect in my ConnectionStringCredentials()!
@BiInsightsInc 1 month ago
Thanks. The connection strings are defined in the secrets.toml file. I have covered it in the initial videos. You can watch them here: kzbin.info/www/bejne/r2rSoHyAbNacmLc kzbin.info/www/bejne/pJrQnWR7qLKsb9E
@dltHub 1 month ago
❤ Thank you for this amazing video!
@rafaelg8238 1 month ago
Great video 👏🏻
@cvarak3 1 month ago
Hi, would you suggest this method to extract data from an active Postgres table that has ~5 billion rows? If not, do you have any videos on what method you would suggest to extract from Postgres to S3? Thanks! (Tried with Airbyte but it keeps failing.)
@BiInsightsInc 1 month ago
Hi, if you have a Kafka cluster running, then you can stream data from Postgres to Kafka. A cluster can handle a large dataset. You can stand up your own or utilize Confluent Cloud. Once this setup is in place, configure an S3 sink connector. I have covered that in the following video: kzbin.info/www/bejne/oJDHdoimi56medE
@danielvoss2483 2 months ago
Great job, keep going 👍
@coolkillakhan 2 months ago
I love you!