Hi Jordan, thanks for the video! You mentioned running vLLM and Ollama on a single GPU. How do you prevent vLLM from claiming all available VRAM? Can memory be allocated dynamically?
@jordannanos · 26 days ago
@kyudechama Ollama does allocate memory dynamically, but vLLM does not. I just use Docker to restrict which GPU(s) the vLLM runtime has access to.
@jordannanos · 26 days ago
@kyudechama And more specifically, the NVIDIA Container Toolkit within Docker.
@kyudechama · 26 days ago
@jordannanos Thanks for the quick response. Would you mind sharing the Docker command you used to deploy?
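For reference, a minimal sketch of what such a command might look like, assuming vLLM's official OpenAI-compatible image; the image tag, model, and port are illustrative, not Jordan's exact setup:

```bash
# Pin the vLLM container to GPU 0 via the NVIDIA Container Toolkit.
# Image tag, model, and port are illustrative.
docker run -d --name vllm \
  --gpus '"device=0"' \
  -p 8000:8000 \
  vllm/vllm-openai:latest \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --gpu-memory-utilization 0.5  # fraction of VRAM vLLM pre-allocates (default 0.9)
```

The `--gpus` flag is what the NVIDIA Container Toolkit uses to pin the container to specific devices, and `--gpu-memory-utilization` caps how much VRAM vLLM grabs up front, since vLLM pre-allocates rather than allocating dynamically.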
I used the standard Open WebUI pipeline, and it looks like I can't put more than one table in the DB_table field. That's too big a downside! Did you come across a solution?
@martinsmuts2557 · 4 months ago
Hi Jordan, thanks. I am missing the steps where you created the custom "Database Rag Pipeline with Display". On the Pipelines page you completed the database details and set the Text-to-SQL model to Llama3, but where do you configure the connection between the pipeline valves and "Database Rag Pipeline with Display" so that it shows up as a selectable option?
@jordannanos · 4 months ago
@martinsmuts2557 It's a single .py file that is uploaded to the pipelines container. I'll cover that in more detail in a future video.
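For context, a hypothetical skeleton of such a .py file, following the class structure the Open WebUI Pipelines framework expects; the valve names and defaults are illustrative, not the actual contents of Jordan's pipeline:

```python
"""Hypothetical skeleton of a single-file Open WebUI pipeline.

Valve names and defaults are illustrative, not Jordan's actual code.
"""
from typing import Generator, Iterator, List, Union

from pydantic import BaseModel


class Pipeline:
    class Valves(BaseModel):
        # Fields here surface as editable "valves" on the Pipelines admin page
        DB_HOST: str = "localhost"
        DB_PORT: int = 5432
        DB_USER: str = "postgres"
        DB_PASSWORD: str = ""
        DB_DATABASE: str = "mydb"
        DB_TABLE: str = "orders"
        TEXT_TO_SQL_MODEL: str = "llama3"

    def __init__(self):
        # This name is what appears in the Open WebUI model dropdown
        self.name = "Database RAG Pipeline with Display"
        self.valves = self.Valves()

    def pipe(
        self, user_message: str, model_id: str, messages: List[dict], body: dict
    ) -> Union[str, Generator, Iterator]:
        # A real implementation would build a text-to-SQL query engine
        # (e.g. with llama-index) from the valves and return its result.
        return f"Received: {user_message}"
```

Once the file is uploaded to the pipelines container and the container is connected to Open WebUI, the pipeline's `name` shows up alongside regular models, which is what makes it selectable.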
@KunaalNaik · 4 months ago
@jordannanos Do create this video soon!
@jordannanos · 4 months ago
@KunaalNaik @martinsmuts2557 Just posted a video reviewing the code: kzbin.info/www/bejne/n325qnidrayVnZY. The repo is here: github.com/JordanNanos/example-pipelines
@swarupdas8043 · 4 months ago
Hi. Could you link us to the source code of the pipeline?
@jordannanos · 4 months ago
Code is here: github.com/JordanNanos/example-pipelines. Video reviewing the code: kzbin.info/www/bejne/n325qnidrayVnZY
@RedCloudServices · 3 months ago
Jordan, thanks. I have a single-GPU RunPod setup. Would you recommend just adding a Dockerized PostgreSQL to the existing pod? And is the Python code using LangChain stored in the pod's pipeline settings? This sort of reminds me of AWS serverless Lambda, but simpler.
@jordannanos · 3 months ago
@RedCloudServices If you'd like to save money, I would run Postgres in Docker on the same VM you've already got; that also simplifies networking. Over time you might want to start/stop those services independently (say, during an upgrade to Docker or the VM), or scale them independently. In that case you might want a separate VM for your DB and another for your UI, or you might consider running Kubernetes. And yes, the Python code is all contained within the pipelines container. It uses llama-index, not LangChain (though you could use LangChain too); that's just a choice I made.
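A minimal sketch of the "Postgres in Docker on the same VM" option, assuming the default port and a named volume for persistence; the password, container name, and image tag are illustrative:

```bash
# Postgres in Docker on the same VM as Open WebUI; credentials and
# volume name are illustrative.
docker run -d --name postgres \
  -e POSTGRES_PASSWORD=example \
  -p 5432:5432 \
  -v pgdata:/var/lib/postgresql/data \
  postgres:16
```

Because both containers sit on the same host, the pipeline can reach the database without any cross-VM networking, which is the simplification Jordan is pointing at.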
@jordannanos · 3 months ago
@RedCloudServices In other words, you'll need to pip install the packages that the pipeline depends on inside the pipelines container. Watch the other video I linked for more detail on how to do this.
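For example, something along these lines, assuming the container is named `pipelines`; the container name and package list are assumptions about typical text-to-SQL dependencies, not Jordan's exact setup:

```bash
# Install the pipeline's dependencies inside the running pipelines
# container (container name and package list are assumptions).
docker exec -it pipelines pip install llama-index sqlalchemy psycopg2-binary
```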
@RedCloudServices · 3 months ago
@jordannanos Yep! Just watched it. I just learned Open WebUI does not allow vision-only models or multimodal LLMs like Gemini. I was hoping to set up a pipeline using a vision model 🤷‍♂️. Also, it's not clear how to edit or set up whatever vector DB it's using.
@peter102 · 4 months ago
Nice video, saw the link on Twitter. My question: is there a way to speed up the results after you ask it a question?
@jordannanos · 4 months ago
Yes, I'm working to improve both the LLM response time and the SQL query time.
@renatopaschoalim1209 · 3 months ago
Hey Jordan! Can I modify your pipelines to work with SQL Server?
@jordannanos · 3 months ago
@renatopaschoalim1209 Yes. It's tested with Postgres and MySQL, but if you know how to connect to SQL Server with Python, you'll be able to use the pipeline.
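A minimal sketch of one way to make that connection with SQLAlchemy and the pyodbc driver; the host, credentials, database name, and ODBC driver version are assumptions:

```python
# Illustrative SQL Server connection with SQLAlchemy + pyodbc;
# host, credentials, and ODBC driver version are assumptions.
from sqlalchemy import create_engine, text

engine = create_engine(
    "mssql+pyodbc://user:password@myserver:1433/mydb"
    "?driver=ODBC+Driver+18+for+SQL+Server"
)
with engine.connect() as conn:
    # Simple connectivity check before wiring the engine into a pipeline
    print(conn.execute(text("SELECT 1")).scalar())
```

If the pipeline already builds its queries through SQLAlchemy, swapping the connection string like this is usually the main change needed to target SQL Server instead of Postgres or MySQL.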