Hi, great video brother! I have one question; I'm not sure whether I'll be able to explain it properly, but here it is: I just want to know in how many ways we can deploy a model, like locally, or on AWS used as an API service like Groq?
@anandsahu6767 (21 days ago)
Like, if I want to use my own API for RAG-like purposes, what are the ways we can do it?
@CoderzColumn (21 days ago)
Thanks for the feedback. I think you mean: in how many different ways can we access a Large Language Model (LLM)? You can run one locally and access it there, or you can access one through an API.

When you run an LLM on your own computer, it uses your computer's resources (CPU, GPU, RAM, etc.). This can be slow, as LLMs are large and have high system requirements.

The other way to access an LLM is through an API. In that case, the LLM is deployed on a server, and when you make an HTTP request to a particular API URL (with your query), it returns a response. This is fast because the model runs on a remote, powerful machine and you are only sending an HTTP request.

There are many LLM providers, like OpenAI (ChatGPT), Anthropic (Claude), Mistral, Google (Gemini), etc. They all provide their LLMs through APIs, but these LLMs are proprietary, not open source. The LLMs you can run on your own computer are generally open-source ones (Llama models from Meta, Phi models from Microsoft, etc.). You can also deploy these open-source LLMs on an AWS, Google Cloud, or Azure cloud instance; you'll have to use Flask or Django to create an API to access them over the internet, and you'll have to pay the cloud instance rent to keep the LLM running there.

Groq is a company that designs a hardware + software stack that can run LLMs faster. They have deployed a few open-source models on their servers and made them available through APIs. They don't have any proprietary LLM; they provide open-source LLMs through APIs. Generally, proprietary LLMs (ChatGPT, Gemini, Claude, etc.) perform a little better than open-source LLMs (Llama, Phi, etc.), but the open-source models are catching up fast.
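To make the two access patterns concrete, here is a rough Python sketch. It assumes the ollama and groq packages are installed; the model names and the placeholder API key are examples, not recommendations:

    # Local: the model runs on your own machine's CPU/GPU
    # (assumes an Ollama server is running locally).
    import ollama

    local = ollama.chat(
        model="llama3.1",  # example open-source model pulled locally
        messages=[{"role": "user", "content": "What is RAG?"}],
    )
    print(local["message"]["content"])

    # Remote API: the model runs on the provider's hardware
    # (assumes a Groq account; the key below is a placeholder).
    from groq import Groq

    client = Groq(api_key="YOUR_GROQ_API_KEY")
    remote = client.chat.completions.create(
        model="llama-3.1-8b-instant",  # example name; check Groq's current model list
        messages=[{"role": "user", "content": "What is RAG?"}],
    )
    print(remote.choices[0].message.content)

Either way the calling code is only a few lines; the difference is whose hardware does the work.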
@anandsahu6767 (21 days ago)
@@CoderzColumn Thanks for your clarification and response.
@anandsahu6767 (21 days ago)
@@CoderzColumn So, I can deploy a model on AWS SageMaker, and then from Lambda and API Gateway on AWS I can use it as an API. Is that correct?
@CoderzColumn (21 days ago)
I am not 100% sure how to do it with SageMaker or what kinds of instances it provides. If a powerful instance that can run LLMs is available there, then it might be possible; I just don't know the exact steps. What I do know is that you can deploy an LLM on a powerful AWS compute instance and then make it available through an API; you'll have to design the REST API using a Python web framework like Django, Flask, or FastAPI. So yes, you can deploy open-source models to the cloud (AWS, Google Cloud, Azure, or any other cloud provider) and make them available through APIs. It'll be faster compared to running locally, but you'll have to pay the high price of a powerful cloud instance.
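As a rough sketch of the "wrap the model in a REST API" part (FastAPI is my example choice; the /generate endpoint name and the ollama backend are assumptions, not something from the video):

    from fastapi import FastAPI
    from pydantic import BaseModel
    import ollama  # stand-in for whatever runtime serves the model on the instance

    class Query(BaseModel):
        query: str

    app = FastAPI()

    @app.post("/generate")
    def generate(q: Query):
        # Forward the user's query to the locally hosted model and return its answer.
        resp = ollama.chat(
            model="llama3.1",
            messages=[{"role": "user", "content": q.query}],
        )
        return {"answer": resp["message"]["content"]}

    # Run on the instance with: uvicorn main:app --host 0.0.0.0 --port 8000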
@theSeoulist (25 days ago)
Really good explanation. Much appreciated.
@CoderzColumn (24 days ago)
Thanks for the feedback. Really appreciate it.
@none-hr6zh (28 days ago)
How do I use search tools, like internet search, in the agent loop?
@CoderzColumn (24 days ago)
You can define a function that takes a query as input, searches the internet, and returns the result. The function can then be linked to the agent, which will generate the input query for it while answering the user's query.
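A minimal sketch of such a tool, assuming LangChain-style tool binding and the duckduckgo_search package (both are example choices, not necessarily what the video uses):

    from langchain_core.tools import tool
    from duckduckgo_search import DDGS

    @tool
    def internet_search(query: str) -> str:
        """Search the internet and return snippets of the top results."""
        results = DDGS().text(query, max_results=3)
        return "\n".join(r["body"] for r in results)

    # A tool-calling chat model can then decide when to invoke the search,
    # e.g. llm_with_tools = llm.bind_tools([internet_search])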
@canalderegistros (1 month ago)
Thank you, sir.
@CoderzColumn (24 days ago)
Welcome!
@LennyBrans (1 month ago)
You sir, are awesome!
@CoderzColumn (24 days ago)
Thank you kindly!
@djstr0b3 (1 month ago)
Thanks for this awesome video. Plotting really made me understand the reason for doing HyDE a lot better.
@CoderzColumn (1 month ago)
Glad you liked it!
@kienletrung8983 (1 month ago)
Thanks sir <3
@CoderzColumn (1 month ago)
You're welcome
@paulchan6818 (1 month ago)
How do I switch to the terminal to type panel serve ....ipynb? Thank you.
@AnimalsOfLove (2 months ago)
Extremely useful! Thank you for such a wonderful video
@CoderzColumn (2 months ago)
Glad you enjoyed it!
@shyterinn (2 months ago)
Thank you, such a great video. If later on I want to update the code, should I re-upload it to update the web app?
@CoderzColumn (2 months ago)
Yeah. You'll need to update the code of the files that you changed and then restart the server. To restart the server, go to the "Web" tab of PythonAnywhere, select your app on the right, and click the "Reload app_name" button. This will reload the app with the new code.
@ersineser7610 (2 months ago)
Thank you for sharing the video and code. Greetings from İstanbul, Türkiye.
@CoderzColumn (2 months ago)
Thanks for watching!
@zakariaabderrahmanesadelao3048 (2 months ago)
The pace at which AI is advancing is insane; I have yet to go through Graph RAG.
@AjarnSpencer (2 months ago)
I cloned a GGUF and have it connected to my own self-made web UI, but I cannot figure out how to add a local-docs upload folder to the web UI that will train the LLM GGUF. Can you explain that one sometime?
@duhitaa9 (2 months ago)
Great effort sir. Please continue this series to make the input indicator value widget. Please try to make part 2 of your video below 🎉 kzbin.info/www/bejne/mpylc5V9i6uag6Msi=wrvrd5dQ-hBb9YEN
@mudiagaakpubi8983 (2 months ago)
How did you design the page navigation buttons?
@gauravgarg-wc4zl (2 months ago)
Great! I have tried running the same using Ollama with Mistral.
@CoderzColumn (2 months ago)
That's great 👍. Try to extend it and make something useful from it. I covered only the very basics in the video, but you can do much more.
@Ngozimefendja (2 months ago)
Thank you, and I need help on how to create KPI cards.
@ayushyadav-bm2to (2 months ago)
Nice yaar, I like Chroma but this server-side thing is not my forte.
@bantuthomas (2 months ago)
Your tutorial is truly mind-blowing. And your shared code on github was a great blessing. Thanks for all your effort. I'm sure I'll be watching your channel a lot in the future.
@vladimirdiadichev6140 (2 months ago)
Damn ГКН, it won't let me watch.
@pouria1361 (2 months ago)
Nice video and many thanks, but could you create a video about Bootstrap and external CSS?
@bantuthomas (2 months ago)
Thanks for a super tut. Really informative and coherently explained.
@CoderzColumn (2 months ago)
Glad it was helpful! Thanks for taking the time to comment!!
@dysonspreybar4903 (2 months ago)
What would be a good way to make overlapping charts? For example, if we overlapped a schedule, offset by the longest task, multiple times one after the other on the same chart?
@gauravgarg-wc4zl (2 months ago)
Great! I have tested it from Google Colab; to install pygraphviz, I had to run the following first:
!apt-get install graphviz libgraphviz-dev pkg-config
@theexposer9483 (2 months ago)
Great video Mr. Solanki, you have explained it well. I wish to migrate from Streamlit to a framework that does not execute code on every change in state. Streamlit does not support multitasking, background tasks, or streaming data from one page to another. Will Panel take care of such issues? Keep up the good work.
@CoderzColumn (2 months ago)
Thanks for the feedback, appreciate it. I think that with Panel you should be able to execute background tasks. It is based on Bokeh server, which lets us execute functions at specified intervals (this can be used for background tasks). Multitasking needs to be tried; I am not 100% sure about it.
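A minimal sketch of that periodic-callback approach, using Panel's pn.state.add_periodic_callback (the one-second period and the counter widget are just for illustration):

    import panel as pn

    pn.extension()

    counter = pn.widgets.StaticText(value="0")

    def tick():
        # Background-style work: runs on the server once per second,
        # independent of any user interaction.
        counter.value = str(int(counter.value) + 1)

    pn.state.add_periodic_callback(tick, period=1000)  # period is in milliseconds

    pn.Column("Ticks since start:", counter).servable()
    # Run with: panel serve app.py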
@itsyndicate25 (2 months ago)
Can you make this an API?
@opokuandrew5716 (2 months ago)
I am trying to implement this project, but I am facing errors in the code under ## Individual Column Stats and ## Histogram. The first one gives KeyError: None, and the second gives the error: unexpected attribute 'plot_width' to figure; similar attributes are outer_width, width or min_width. Kindly help me resolve them.
@assavinkengkart9141 (2 months ago)
Does this support a dashboard that works with a MySQL database? The application needs to do CRUD against a MySQL database. If so, how do I set it up? Thank you very much.
@xspydazx (3 months ago)
This concept is excellent! However, here's how it could be refined:

Think should be a tool, allowing the agent to self-query for a plan, methodology, or the next step in a sequence. Observe should return the tool call, and Action should represent the selected tool or action to be executed. By treating each component as a tool, we can create any methodology or chain of thought. For example:

- Plan creation: this could be a tool, functioning as a prompt that queries itself with a task and returns a plan. The prompt manages this tool within the chain.
- Code refinement: similarly, this could be a tool where the model queries itself with code to refine. The only change would be in the prompt, whether it's a tool, chain, or graph. This allows for specialized refiners with strict requirements for correcting or industrializing code.

When designing processes for the agent, we could keep template plans in the planner. For example, if the user wants to repair some code, the planner could offer tool choices and hand out a boilerplate from a set of pre-defined templates. These templates could be stored as docstrings and selected by the model, with tools accessing the required files without going out of bounds. Open code execution and system command execution should always require human intervention due to potential system risks, hence the use of tools with defined boundaries.

A planner could also have access to tool collections, which can be loaded into the agent to perform tasks, essentially passing a toolbox to the model. The central controller bot would access all these tools. Given a menu of tasks, the model could return tool collections, plans, and start nodes, allowing the agent to pass a custom state to a chain and return its output. For example, if tasked to create an app, it would query the general planner bot, which would direct it to the correct planner. That planner would provide the agent with tools, requirements, and the start node, enabling the agent to execute the correct graph (tool) with a state and expected outcome.

Initially, I built this process similarly to yours, but after reflecting on the above, I realized an even higher perspective: our front end should resemble Dialogflow, adding a personality layer. While it may not be intelligent, it provides the necessary constraints on top of the low-level processes we're currently working on. We should recognize that there are many layers beyond this, not just a UI. The ReAct process (the thought process and selected methodology for a task) is great for long, in-depth tasks with a graph-based structure, but it's not ideal for general chat. Hence we need a simple front end with keyword detection and response: a dialog manager.

Even RAG (Retrieval-Augmented Generation) should be plugged in at this higher level, as it's not required unless dealing with in-depth tasks. Basic chat history can keep the model personalized and up to date. RAG serves as short/long-term memory, which, after optimization, can be fine-tuned into the main LLM. However, it should still be a tool, used only when necessary. A state is enough to perform a task, and chunking and embedding for similarity should be reserved for in-depth queries, not for downloading and summarizing data. Summarization only requires a prompt chain, making it essentially a self-query tool.

Sub-agents are tools that self-query the model, with a clean chat history used solely for tool execution. Only the exchange between the agent and the tool is saved; internal processing within the tool is not returned to the agent, only the final output. This can be surfaced using a verbose mode, where two responses are created: a verbose one with extensive logging and custom tripwire exceptions, or a simple output that preserves a minimum token exchange between agents (i.e., only query, state, and output).

Tools can also function as conditional nodes, allowing input-based branching within the tool. For example, a refiner tool could call a coder tool and return the refined output to the calling agent. The coder node always returns data to the refiner tool unless explicitly told to return the output without refinement. This avoids nodes that cannot return direct output, enabling many nodes in a graph to be executed as tools.

I hope this enhancement to your process resonates with you and perhaps guides your next steps. You may find this approach less memory-intensive, as tools unload after execution, and the agent can be a very small model, with the tools providing the intelligence. A moderate-sized model (7-14B parameters) would be ideal, especially since the same model might be sub-called multiple times (e.g., a refiner or planner that may call a sub-planner). The latest models have been trained on function calling, tool use, and planning.
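For what it's worth, here is a tiny dependency-free Python sketch of the "every step is a self-query tool" idea described above. Everything here is hypothetical: llm is a stub standing in for a real model call, and the tool names are made up:

    from typing import Callable, Dict

    def llm(prompt: str) -> str:
        # Stub standing in for a real model call.
        return f"<model output for: {prompt[:40]}...>"

    def think(task: str) -> str:
        # "Think" as a self-query tool: the model plans by prompting itself.
        return llm(f"Produce a step-by-step plan for: {task}")

    def refine(plan_or_code: str) -> str:
        # "Refine" as a self-query tool with its own strict prompt.
        return llm(f"Refine the following, fixing problems only:\n{plan_or_code}")

    TOOLS: Dict[str, Callable[[str], str]] = {"think": think, "refine": refine}

    def agent(task: str) -> str:
        plan = TOOLS["think"](task)   # the agent observes only the tool's final output
        return TOOLS["refine"](plan)  # action: the next selected tool

    print(agent("repair my parser"))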
@arirajuns (3 months ago)
Thank you for sharing the step-by-step guide. Really appreciate your efforts.
@CoderzColumn (3 months ago)
Glad it was helpful! Thanks for taking the time to comment!!!
@deepakjain4481 (3 months ago)
You don't even try to explain it; it's as if we already know this topic.
@redblues9566 (3 months ago)
Many thanks, from South Korea.
@CoderzColumn (3 months ago)
Really appreciate your feedback!!! Thank you!!!
@redblues9566 (3 months ago)
@@CoderzColumn I did the same, but the OHLC chart doesn't show. The bokeh library doesn't work in Django, with no error message. Could you possibly tell me why?
@CoderzColumn (3 months ago)
If you have properly followed the tutorial, then the chart should appear without any issue. Also, I hope you included the script tag in the HEAD of the HTML page with the proper Bokeh version; it should match the bokeh library installed on your PC:

<script src="cdn.bokeh.org/bokeh/release/bokeh-3.3.4.min.js" crossorigin="anonymous"></script>

Other than this, I hope you included the chart tags this way:

<div class="card-body">
  {{ scripts | safe }}
  {{ hist_chart_div | safe }}
</div>

I would suggest first creating a simple Django app to understand this, and then trying it in your app.
@redblues9566 (3 months ago)
@@CoderzColumn Thank you very much for your reply. I'll check out the bokeh version in my venv.
@redblues9566 (3 months ago)
I solved that problem. The cause was a difference in bokeh versions. Thank you very much, my teacher.
@nissarmd8054 (3 months ago)
How do I modify the grid distance to display all dates in mplfinance?
@nissarmd8054 (3 months ago)
Incredibly clear and well-structured, making the complex information easy to understand, Thank you SUNNY SOLANKI.
@gomgom330 (3 months ago)
So it just runs in our local environment, with no internet needed? But what if we share it on Streamlit Cloud, can it run there? As we know, Streamlit Cloud doesn't have high resources or even a GPU.
@CoderzColumn (3 months ago)
Yes, it runs on our local computer and does not need the internet. Sharing it on Streamlit Cloud is not possible with Ollama, though, because Streamlit does not let you run Ollama there, and as you say, it does not have high resources or a GPU. You can deploy Ollama on some cloud instance (Google Cloud or AWS) and then access it through an API; you then change the code to access Ollama through that cloud instance's URL (the base_url parameter). That's one solution. Another solution would be to use the OpenAI REST API, but it is not free. Recently, I have started using the Groq REST API, which is free. They provide the latest open-source models like Llama 3.1, though there is a request limit (30 requests per minute). Also, their models are the full versions, not quantized ones like Ollama's, and hence more accurate.
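As a sketch of that base_url swap, assuming the ollama Python client (the address below is a placeholder for your cloud instance, not a real endpoint):

    from ollama import Client

    # Point the client at the Ollama server running on your cloud instance.
    client = Client(host="http://YOUR-CLOUD-INSTANCE:11434")

    resp = client.chat(
        model="llama3.1",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(resp["message"]["content"])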
@gomgom330 (3 months ago)
@@CoderzColumn Ahh, I see, so we can build and host our model on a server that has high resources, then use it as a REST API for the Streamlit web app that we share. Thanks for explaining.
@CoderzColumn (3 months ago)
Yeah. You can use it with platforms other than Streamlit too because, just like the OpenAI API, your models are available through an API. You can even create models for custom uses. But I feel the cloud instances are a bit pricey; you'll need a good instance with 16/32/64 GB of RAM and plenty of GPU RAM, and the cost increases quickly with the specs. If you have some spare system with a GPU lying around, then you can use that as well. You'll have to keep it on all the time, and initially you'll have to adjust a few firewall settings to make it accessible to the outside world on the internet. This can hopefully be cheaper, but you'll still have to pay for the electricity to keep the system on (it has a GPU, hence more electricity usage).
@gomgom330 (3 months ago)
@@CoderzColumn Yep, you're right. Even if we make our own server to host the REST API, we can use it for any use case. Thanks for the response.
@aiduongnguyen5653 (3 months ago)
Hi CoderzColumn, you are so generous to share and explain coding so clearly. Hope you are okay and keep pursuing your career successfully in the near future <3. Thanks for your free tutorials.
@CoderzColumn (3 months ago)
Thanks for your feedback. Really appreciate it.
@ShammaAshraf-m6x (3 months ago)
SO GLAD YOU POSTED THIS THANK YOU SM!!!
@CoderzColumn (3 months ago)
Thanks for your feedback. Really appreciate it.
@johanhernansanchezvillano5212 (3 months ago)
Hi sir. I can see that you can combine matplotlib with widgets. Can widgets be combined with Plotly? Thanks for these videos.
@CoderzColumn (3 months ago)
Yes, you can definitely use it with Plotly; it'll work fine. Also, I think that with Plotly it's better to wrap the figure object inside a FigureWidget. Check out this article: plotly.com/python/getting-started/#jupyter-notebook-support
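A minimal notebook sketch of that idea, wrapping the figure in Plotly's FigureWidget and driving it from an ipywidgets slider (the sine-wave data is just for illustration):

    import numpy as np
    import plotly.graph_objects as go
    import ipywidgets as widgets

    x = np.linspace(0, 10, 200)
    fig = go.FigureWidget(data=[go.Scatter(x=x, y=np.sin(x))])

    freq = widgets.FloatSlider(value=1.0, min=0.5, max=5.0, description="Freq")

    def on_change(change):
        # FigureWidget updates the rendered chart in place; no re-display needed.
        fig.data[0].y = np.sin(freq.value * x)

    freq.observe(on_change, names="value")
    widgets.VBox([freq, fig])  # shows the slider and the live chart together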
@johanhernansanchezvillano5212 (3 months ago)
@@CoderzColumn You are nice! Thank you for your help. Greetings from Perú.
@CoderzColumn (3 months ago)
No probs. Hope my answer helps you solve the problem.
@EduardoAndreePS (3 months ago)
Great video! Do you know if there's a way to upload the resulting code from a Streamlit app using Ollama to streamlit.share and make it work?
@JavierSalazar-i4f (3 months ago)
Wonderful content, I learnt a lot.
@CoderzColumn (3 months ago)
Glad you enjoyed it!
@shivambhargava3092 (4 months ago)
Superb video. How to deploy it in production?
@jakpoa (4 months ago)
Nice video... just what I'm looking for. I use this Dash-Plotly setup when I need real-time updates, with at most a 5-minute refresh. Do you know how much this will cost monthly? Just to have some idea.
@CoderzColumn (4 months ago)
If you have set max_instance in the config file to 1, then it won't cost you anything. But if you increase the instance count, then it'll charge based on instance usage. I have included info about the instance-pricing pages in the video; please feel free to check them to learn more about the charges. But with just one instance running, it falls under the free tier, hence no charges.
@redblues9566 (4 months ago)
Thanks teacher. 😍😍😍
@CoderzColumn (4 months ago)
Thanks for the comment.
@14gustavo14 (4 months ago)
Nice video, thanks for this
@CoderzColumn (4 months ago)
Thanks for the feedback. Appreciate it.
@user-zg3vr8hg9h (4 months ago)
Is there a way for the answer to show a timestamp that corresponds to a summarized point?
@CoderzColumn (4 months ago)
I think that when you load transcripts, you can load them in chunks and then create the vector store from those chunks. That way you can retrieve timestamps from the retrieved relevant documents, since the start time of each chunk is available as metadata. It'll require a bit of code changes. It won't be exact, but it will be near the answer.
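A rough sketch of that chunk-with-timestamp idea, assuming the youtube-transcript-api package and LangChain Document objects ("VIDEO_ID" and the 60-second window are placeholders; the exact fetch call may differ by package version):

    from youtube_transcript_api import YouTubeTranscriptApi
    from langchain_core.documents import Document

    # Each entry looks like {'text': ..., 'start': ..., 'duration': ...}.
    entries = YouTubeTranscriptApi.get_transcript("VIDEO_ID")

    window, docs, buf, start = 60.0, [], [], 0.0
    for e in entries:
        if not buf:
            start = e["start"]            # remember where this chunk begins
        buf.append(e["text"])
        if e["start"] - start >= window:  # close the chunk after ~60s of video
            docs.append(Document(page_content=" ".join(buf), metadata={"start": start}))
            buf = []
    if buf:
        docs.append(Document(page_content=" ".join(buf), metadata={"start": start}))

    # Index `docs` in your vector store; each retrieved document then carries
    # doc.metadata["start"], the timestamp to show next to the summarized point.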