Yes, a comparison between dash and streamlit would be useful as well as a discussion about the deployment (local webserver, or cloud-based, security concerns etc)
@MikeOnlineable 7 months ago
Definitely interesting!
@kslader8 7 months ago
I'd add in your thoughts around OAuth2 integration, deployment to cloud infrastructure, and scaling to support growing into a larger UI/UX framework. My big issue has always been figuring out the authentication and deployment piece with Streamlit. That said, Streamlit keeps getting better so fast that maybe all these things are simple now. Edit: after watching your whole video, it sounds like you have the same problem as me... With Dash you can plug in your own decorator pattern pretty easily for authentication (at least that is what I do), and it wasn't hard for me to structure the application in a way that has grown to be pretty big at this point, running on AWS EKS.
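A rough, framework-agnostic sketch of the decorator idea, not the commenter's actual code. `require_role`, `User`, `NotAuthorized`, and the `session` dict are all hypothetical names; in a real Dash app the user would more likely come from the Flask session or a request header than from a plain dict.

```python
from dataclasses import dataclass, field
from functools import wraps
from typing import Callable


@dataclass
class User:
    """Hypothetical user record; a real app would load this from its auth backend."""
    name: str
    roles: set = field(default_factory=set)


# Assumption: stands in for per-request session storage (e.g. a Flask session).
session: dict = {}


class NotAuthorized(Exception):
    """Raised when the current user may not call the wrapped function."""


def require_role(role: str) -> Callable:
    """Decorator factory: run the wrapped callback only if the user has `role`."""
    def decorator(func: Callable) -> Callable:
        @wraps(func)
        def wrapper(*args, **kwargs):
            user = session.get("user")
            if user is None or role not in user.roles:
                raise NotAuthorized(f"{func.__name__} requires role {role!r}")
            return func(*args, **kwargs)
        return wrapper
    return decorator


@require_role("analyst")
def update_revenue_chart(region: str) -> str:
    # Placeholder for a dashboard callback that would normally return a figure.
    return f"chart for {region}"
```

Keeping the check in one decorator means each callback states its required role on a single line, which is one way an app can grow without scattering auth logic.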
@klmcwhirter 7 months ago
+1 on the dash vs streamlit comparison. Right before I retired last year I was trying to help some folks on a business team (yep, they had developers - who knew) with streamlit challenges. But I lacked context to be of much help. I would really enjoy hearing more of your experience with these technologies and the evaluation process.
@rafiullah-zz1lf 7 months ago
The thing I like about Arjan is that he always has something from practical coding. Not like other YouTubers who just teach from books, which is sometimes not as useful as Arjan. Keep it up, love u ❤
@ArjanCodes 7 months ago
Thank you so much for the kind words! I'm glad you enjoy the content :)
@buchi8449 7 months ago
In our projects, we use oauth2-proxy + Streamlit. We offload user auth to Azure AD and oauth2-proxy. This approach is simple but sufficient for us, as we usually just need to restrict access to the dashboards. In addition, oauth2-proxy can map claims in OIDC tokens returned by IdP, like email, groups, etc., to HTTP headers sent to upstream apps. Streamlit has an API (private one...) to extract values in HTTP headers. We can use this combination to implement more complex RBAC in Streamlit apps.
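A minimal sketch of the header-based RBAC idea described above, under stated assumptions: the header names `X-Forwarded-Email` and `X-Forwarded-Groups` assume oauth2-proxy is configured to pass them upstream, and `PAGE_ACCESS` is a made-up policy table. How the headers are read inside Streamlit is deliberately left out; the functions only take a plain mapping.

```python
from typing import Mapping

# Assumption: oauth2-proxy is configured to forward these headers upstream.
EMAIL_HEADER = "X-Forwarded-Email"
GROUPS_HEADER = "X-Forwarded-Groups"

# Hypothetical policy: which dashboard pages each IdP group may open.
PAGE_ACCESS = {
    "finance": {"revenue", "costs"},
    "marketing": {"campaigns"},
}


def identity_from_headers(headers: Mapping[str, str]) -> tuple[str, set[str]]:
    """Return (email, groups) from proxy headers; header names are case-insensitive."""
    lowered = {key.lower(): value for key, value in headers.items()}
    email = lowered.get(EMAIL_HEADER.lower(), "")
    raw_groups = lowered.get(GROUPS_HEADER.lower(), "")
    groups = {g.strip() for g in raw_groups.split(",") if g.strip()}
    return email, groups


def allowed_pages(groups: set[str]) -> set[str]:
    """Union of the pages granted to each of the user's groups."""
    pages: set[str] = set()
    for group in groups:
        pages |= PAGE_ACCESS.get(group, set())
    return pages
```

This only makes sense if the app is reachable exclusively through the proxy; otherwise a client could set these headers itself.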
@kslader8 7 months ago
I will have to find time to test this :)
@thepackbot 7 months ago
Cool!
@fvgoya 7 months ago
It would be awesome to see a tutorial about building a dashboard like that.
@DaveParr 7 months ago
12:14 Curious about the architecture choice for MongoDB. AFAIK it's an unlikely contender because the nature of large tables of data lends itself to a relational schema, and many DBs are OLAP-optimized. Am I missing something? 🤔
@andreacazzaniga8488 7 months ago
I'm guessing it's about not wanting to cope with future changes in the LinkedIn / YouTube APIs.
@durand101 7 months ago
I've been using dash + plotly for the last 5 years and have been really impressed with that combo, and even use it in production. Would love to know why you decided on streamlit over dash.
@nicktids 7 months ago
We use Streamlit for fast prototyping, and Dash with auth, as it's just Flask on the back end and can be extended easily. Or you just rock any front end / back end combo with Plotly charts and htmx callbacks to change what is viewed. You lose some functionality from Dash filtering and callbacks, but you make up for it because it is the tech stack you want.
@AceofSpades5757 7 months ago
I used both before and streamlit is way easier to get off the ground and has much better UI by default IMHO
@melissastrong8656 7 months ago
Excellent, clear video. I particularly enjoyed the "Microsoft Hell" graphic.
@ArjanCodes 7 months ago
Glad you enjoyed it!
@carlosdavila2831 7 months ago
You can put filters on a fixed side bar and scroll the contents with streamlit.
@ArjanCodes 7 months ago
Good to know!
@DaveParr 7 months ago
Thought exactly the same thing
@bernidacruz 7 months ago
Your videos are consistently excellent and serve as a valuable resource for learning to code. Regarding the topic you discussed about complementary topics to dashboards, I've recently come across Keycloak and I'm curious about how to implement it to secure an application and connect it to the company's LDAP (though I'm not entirely sure about the LDAP part). I found your content to be very informative and helpful, and I appreciate you taking the time to share your knowledge with the community. Please let me know if you have any insights or recommendations on integrating Keycloak and LDAP into an application. I'd be very interested to learn more. Thank you for your great work!
@papeya 7 months ago
Thanks for yet another helpful video! And right on time: I just switched positions from a government-run company to a consulting-heavy company, so figuring out pretty ways to display data just became a high priority for me :D As someone who has only used Dash before, I'd love a deeper dive into the differences from Streamlit!
@badcosmonaut7323 7 months ago
Like many others have mentioned using a database is usually the best option for authorization. Store current key, and then the table will build up in size over time, so I have a simple job that runs once a day that deletes all expired authorizations.
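A small sketch of that pattern using SQLite from the standard library. The table layout and the function names (`grant`, `is_authorized`, `purge_expired`) are assumptions for illustration; the daily scheduling itself (cron or similar) is not shown.

```python
import sqlite3
import time
from typing import Optional


def create_schema(conn: sqlite3.Connection) -> None:
    # expires_at is a Unix timestamp in seconds.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS authorizations ("
        "user_id TEXT NOT NULL, auth_key TEXT NOT NULL, expires_at REAL NOT NULL)"
    )


def grant(conn: sqlite3.Connection, user_id: str, auth_key: str,
          ttl_seconds: float, now: Optional[float] = None) -> None:
    """Store a key that is valid for `ttl_seconds` from `now`."""
    now = time.time() if now is None else now
    conn.execute(
        "INSERT INTO authorizations VALUES (?, ?, ?)",
        (user_id, auth_key, now + ttl_seconds),
    )
    conn.commit()


def is_authorized(conn: sqlite3.Connection, user_id: str, auth_key: str,
                  now: Optional[float] = None) -> bool:
    """True only if a matching, unexpired row exists."""
    now = time.time() if now is None else now
    row = conn.execute(
        "SELECT 1 FROM authorizations "
        "WHERE user_id = ? AND auth_key = ? AND expires_at > ?",
        (user_id, auth_key, now),
    ).fetchone()
    return row is not None


def purge_expired(conn: sqlite3.Connection, now: Optional[float] = None) -> int:
    """The once-a-day cleanup job: delete expired rows, return how many were removed."""
    now = time.time() if now is None else now
    cursor = conn.execute("DELETE FROM authorizations WHERE expires_at <= ?", (now,))
    conn.commit()
    return cursor.rowcount
```

Because `is_authorized` already filters on `expires_at`, the purge job only keeps the table small; a missed run does not let expired keys through.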
@TomzaBKewl 7 months ago
On your point about PowerBI and lack of integration with Rest APIs. Apart from a few big services like Salesforce (to make it more accessible to organisations without any data engineering capability), I think the expectation from Microsoft is that you have some kind of ELT process to extract data from your APIs and load the data to a database, and most likely transform it so that it fits a model that is optimised for BI. Power BI has good integration with things like Snowflake and Synapse, so it's great if you can get all your data there first.
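To make the extract-load-transform idea concrete, here is a toy sketch with SQLite standing in for the warehouse. The extraction step (the API call) is omitted and records are passed in directly; the field names `day`, `channel`, and `views`, and the table names, are invented for the example.

```python
import json
import sqlite3
from typing import Iterable, Mapping


def load_raw(conn: sqlite3.Connection, records: Iterable[Mapping]) -> None:
    """Load step: store each API record untouched as a JSON string."""
    conn.execute("CREATE TABLE IF NOT EXISTS raw_events (payload TEXT NOT NULL)")
    conn.executemany(
        "INSERT INTO raw_events (payload) VALUES (?)",
        [(json.dumps(record),) for record in records],
    )
    conn.commit()


def transform(conn: sqlite3.Connection) -> None:
    """Transform step: rebuild a flat table shaped for reporting."""
    conn.execute("DROP TABLE IF EXISTS daily_views")
    conn.execute("CREATE TABLE daily_views (day TEXT, channel TEXT, views INTEGER)")
    totals: dict = {}
    for (payload,) in conn.execute("SELECT payload FROM raw_events").fetchall():
        event = json.loads(payload)
        key = (event["day"], event["channel"])
        totals[key] = totals.get(key, 0) + int(event["views"])
    conn.executemany(
        "INSERT INTO daily_views VALUES (?, ?, ?)",
        [(day, channel, views) for (day, channel), views in sorted(totals.items())],
    )
    conn.commit()
```

A BI tool would then point at `daily_views` rather than at the API or the raw payloads.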
@El_Hectornauta 7 months ago
An excellent video Arjan. When I started in the world of data I've used Dash. Made me learn many things for my first job (unfortunately the API I used stopped getting support for a critical endpoint and the learning project died with it). We only needed authentication when using Streamlit, so streamlit-authenticator was a solution. For some little details we used the name of the user as "role" (so it wasn't a "proper" solution, more like a hack).
@j0584924 7 months ago
Have you considered Apache Superset, Grafana and Metabase?
@Ahmad2131993 7 months ago
It would be awesome to see a tutorial about building a dashboard like that: how to make better use of Streamlit, or how to use another Python library to create a dashboard.
@roaldkleiveland 7 months ago
Sure - would be nice to see videos on streamlit and dash. Maybe you also could do flet? I really love that one 😊
@personabrahamaudu 7 months ago
I don't know if this is the best approach, but for the auth issues, you could use the DB to hold user auth details (maybe hashed with SHA-256), verify this data upon login, and then store the user permissions temporarily using st.session_state, which automatically clears when the page is reloaded.
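One possible shape for this, sketched with the standard library only. Note the swap: instead of a bare SHA-256 digest it uses salted PBKDF2-HMAC-SHA256, since a single fast hash is generally considered weak for passwords. The table layout and function names are hypothetical, and `session` is any dict-like object; in a Streamlit app `st.session_state` could presumably be passed in its place.

```python
import hashlib
import hmac
import os
import sqlite3
from typing import MutableMapping

PBKDF2_ITERATIONS = 200_000  # Assumption: tune this for your own hardware.


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a slow, salted hash of the password."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS)


def create_schema(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users ("
        "username TEXT PRIMARY KEY, salt BLOB, pw_hash BLOB, permissions TEXT)"
    )


def create_user(conn: sqlite3.Connection, username: str, password: str,
                permissions: str) -> None:
    """Store a new user; `permissions` is a comma-separated string."""
    salt = os.urandom(16)
    conn.execute(
        "INSERT INTO users VALUES (?, ?, ?, ?)",
        (username, salt, hash_password(password, salt), permissions),
    )
    conn.commit()


def login(conn: sqlite3.Connection, session: MutableMapping,
          username: str, password: str) -> bool:
    """Verify credentials and, on success, put the permissions into the session."""
    row = conn.execute(
        "SELECT salt, pw_hash, permissions FROM users WHERE username = ?",
        (username,),
    ).fetchone()
    if row is None:
        return False
    salt, stored_hash, permissions = row
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(hash_password(password, salt), stored_hash):
        return False
    session["user"] = username
    session["permissions"] = set(permissions.split(","))
    return True
```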
@marcosoliveira8731 7 months ago
It would be very interesting to find out your takes on Streamlit and Dash.
@biftheunderstudy 7 months ago
In addition to using the sidebar for filters, you can make use of st.columns and containers to size the components.
@CorruptoGrande 7 months ago
Hi Arjan, there is streamlit-authenticator for authentication. I don't think it has an authorization layer, i.e. different user roles, but this should be possible to add quite easily via JWT claims, which become available in the user session info. I had the requirement to do authentication against LDAP, so I took the authenticator's session and cookie handling alongside the ldap3 module to implement LDAP authentication in Streamlit. Hope that helps.
@MicheleHjorleifsson 7 months ago
Would love to see a deeper dive into this and Taipy
@carlesmolins3269 7 months ago
I want more about dashboards, especially integrating them in wider web-apps
@walid7189 7 months ago
I was thinking to request this for a long time since I think that taipy is relevant... Can you make a video about ETL/ELT pipelines? You referenced this structure in one of your previous old videos but don't remember ever having the chance to come across something like this. Like, what are the common practices? How to structure the components of the app (each process is a separate service? multiple services/servers?) How to manage processes/threads within the pipeline? How to handle complex data filtering of all sorts (let's say a stream of data that has to split & sent to an ml model)?
@santoshbhattarai2527 7 months ago
Streamlit and Dash sound like a good idea to explore.
@Gigusx 7 months ago
4:04 no you shouldn't have! The time you've spent testing other solutions is what cleared up for you guys what's important for you in the dashboard, UI, UX, things you definitely want to have in there, things that you definitely don't want to, the shortcomings of tools that are already there, etc. It'd be hard to get that clarity if you just jumped right into building your own 😉
@timvogt7088 7 months ago
We are building a data lake for detailed sales information on AWS. Gemini suggests Power BI, but personally I would build the data backend and ETL part and use QuickSight, and later decide between Power BI and the Python way. QuickSight makes visuals easy for the business, and ETL in, for example, Glue is done with SQL or Python. So we let the business build the data views, and data science does the ETL, or we as devs do it with Python in Glue. We're baked into AWS and build for complex materials and complex demand, so AWS is a choice we made. For the dashboard we had built a custom API from our REST data, but it took too long to finish.
@tommisaltiola 7 months ago
I'm setting up Superset for canned BI visualizations. It's built on Python. And Streamlit for sharing ad hoc exploratory visualizations.
@abdelghafourfid8216 7 months ago
For authentication we had a special system in the company that uses Kerberos for the implicit flow and client ID/secret for the explicit flow. There was already an internal TypeScript library that exposes it, so we had to build a Streamlit extension that uses that TypeScript package on the front end to handle the auth and send the client token to the backend, where we decode the token to identify the user and handle the authorization logic.
@magatamass 7 months ago
This is really, really nice and useful. Thanks.
@ArjanCodes 7 months ago
Glad it was helpful!
@NewyJimmy 7 months ago
would be awesome if you could go into more detail on Taipy
@kedrickperkins331 6 months ago
I am curious whether a comparison video between Streamlit and Plotly was made?
@FilmonTheMystic 7 months ago
What would be the best way to store data in the cloud and make it immediately accessible by a python script? I am collecting and storing fiber optic sensor data.
@brainforest88 7 months ago
Hi @ArjanCodes I use streamlit-ldap-authenticator to authenticate against our LDAP/Active Directory server. For Authorization I use a data model in the database plus casbin. Roles are assigned to the users in the database. I configured casbin for RBAC and in the policies I defined which role has what access to an object. So this portion is outside of the code and better maintainable.
@MaxMustermann-on2gd 7 months ago
Have you been building dashboards using Streamlit and a database in the backend? Were you querying the data directly from the database? I wonder how you would use Streamlit in this scenario if the amount of data gets bigger, like a few mm rows or even 10s of mm rows. Ad hoc querying directly from the DB is not performant, and loading this data into memory from flat files also appears wrong to me. This is, in my opinion, where Power BI and Microsoft Hell shine. Any suggestions?
@brainforest88 7 months ago
@@MaxMustermann-on2gd I hear you. I do develop dashboards. I am reading data from a database, and the user can edit the data or create new datasets. I have one page with some analyses, which are also read from the DB. It's not lightning fast but good enough for me. Once deployed, I was surprised how low the memory footprint is (approx. 200 MB), and because the DB is not far away, the performance is quite decent.
@personabrahamaudu 7 months ago
You could use the st.container component to fix the parts you don’t want to scroll with the rest of the page
@martin.thogersen 7 months ago
We're self-hosting with ShinyProxy/Docker to get LDAP auth out of the box. But it spins up a container per user session, so it's dang slow and doesn't scale, and I wouldn't recommend it. It does guarantee complete user isolation though. I'm not sure if Streamlit can guarantee complete thread safety.
@arthenik 7 months ago
Maybe a comparison with NiceGUI would be more interesting, given that it’s meant to integrate with Pydantic and FastAPI?
@takhs91 7 months ago
Any thoughts on apache superset?
@tommybrecher7742 7 months ago
@ArjanCodes, do a GitHub action video. automatic linters, release actions, deployment via actions. Spice it up :)
@jumper0122 7 months ago
Dang I was really hoping this would go over Power BI's Python widget -- it has one but I have yet to mess around with it
@hubstrangers3450 7 months ago
Thank you....
@MaxMustermann-on2gd 7 months ago
I wonder how you would use Streamlit in this scenario if the amount of data gets bigger, like a few mm rows or even 10s of mm rows. Ad hoc querying directly from the DB is not performant, and loading this data into memory from flat files also appears wrong to me. This is, in my opinion, where Power BI and Microsoft Hell shine. Any suggestions? Having said that, for these amounts of data I believe some kind of OLAP backend is needed. But how would you stitch this together with Python/Streamlit? Power BI (as much as I don't like it either) offers this out of the box...
@biftheunderstudy 7 months ago
10's of millions of rows is not all that much data to store in memory. Depending on the use case I will either store the data in parquet files or query a database for it. There are many tricks to play here, using cache is a very important aspect and streamlit makes it fairly easy.
@MaxMustermann-on2gd 7 months ago
@@biftheunderstudy Yeah, I know it's not that much as in big data, but still enough to not query it every time from SQL Server (this is what my org is using). Parquet files should be fast enough, you are saying? Additional question: if the data is stored in a relational data model (with some of the basic fact tables at 10s of mm rows), would you then essentially replicate this data model in Parquet files and query and join them within your Streamlit app? Or how would you go about it? Or would you spin up some kind of analytical backend that does the actual computation? I know of Streamlit's cache. But as mentioned above, it's not just a matter of keeping the data in memory; you may also be dealing with some kind of data model, and you cannot denormalize it completely. Thanks
@biftheunderstudy 7 months ago
@@MaxMustermann-on2gd there are many ways to approach this. For example, build an access layer in the database such that you don't need to do any joins when querying. In almost all the cases that I've had to do this, the most performant approach was to read in the disaggregated data either from a SQL query or a flat file using streamlit's caching decorators. Then, I would do any aggregations from that cached dataset, being careful not to allow mutating the original in the process.
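The load-once, aggregate-from-cache idea can be sketched without Streamlit. Here `functools.lru_cache` is a standard-library stand-in for Streamlit's caching decorators (such as `st.cache_data`), and `SAMPLE_CSV` is made-up data standing in for a SQL query result or a flat file.

```python
import csv
import io
from collections import defaultdict
from functools import lru_cache

# Assumption: stands in for a SQL query result or a flat file on disk.
SAMPLE_CSV = "region,amount\nEU,10\nEU,5\nUS,7\n"


@lru_cache(maxsize=1)
def load_rows() -> tuple:
    """Load the disaggregated rows once; a tuple of tuples cannot be mutated by callers."""
    reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
    return tuple((row["region"], float(row["amount"])) for row in reader)


def total_by_region(rows: tuple) -> dict:
    """Aggregate into a new dict, leaving the cached rows untouched."""
    totals: dict = defaultdict(float)
    for region, amount in rows:
        totals[region] += amount
    return dict(totals)
```

Returning an immutable structure from the cached loader is one simple way to respect the "don't mutate the original" caveat: every aggregation has to build its own result.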
@MaxMustermann-on2gd 7 months ago
@@biftheunderstudy ok thank you. yeah tbh, i have never tried it with these amounts of data only smaller ones and I was worried it wouldn't be performant enough. (the already compressed powerbi data model is 600mb in size, so a lot more when decompressed in memory) my org/Department is using power bi only for Dashboards, but i always wanted to throw some python Dashboard in as well. 😅 as much as i don't like powerbi/DAX, i have to admit being able to do data querying, transformation, modelling and complex computation on top in just one ecosystem out of the box is still a pretty good selling point.
@biftheunderstudy 7 months ago
@@MaxMustermann-on2gd Also, parquet is definitely fast enough. You can read in a 100MB file in much less than a second. But, you should not try to replicate your data model in parquet files, joining the data in python will be slow
@diegovargas3853 7 months ago
What's the reason to use Mongo instead of a sql database?
@nicolameoli 7 months ago
You can just put those filters in the sidebar with st.sidebar.
@Mjhapp 7 months ago
Yep +1 on st.sidebar. Plus you could get away with something like the free tier of fivetran and not mess with hand-coding the etl
@DataDestination 7 months ago
No one has mentioned Gradio yet. Isn't it an alternative and even more flexible than streamlit? 🤔
@AlanBerman 7 months ago
I'm very curious how you landed on MongoDB + custom dashboard rather than something like Grafana + InfluxDB or Prometheus... not that this isn't a great/fun project, but feels like you went straight from expensive commercial solutions and skipped over the FOSS options out there?
@pixaim69 7 months ago
I was wondering the same thing.
@redbaronnomads3223 7 months ago
Metabase is web-based, easy to structure, and you can just point it at the database. They also have embedded dashboards…
@aquasoc60 7 months ago
Does anyone have a link to this github repo ?
@andyvandenberghe6364 1 month ago
tip #4 is like common sense data engineering.
@AceofSpades5757 7 months ago
That's too bad. I really love Streamlit. I don't think it'd be too hard to wrap it in another web app, but that would be pretty frustrating.
@aflous 7 months ago
First?
@ArjanCodes 7 months ago
Yes! 🙌
@pixaim69 7 months ago
Do you mean first useful comment?
@FocusAccount-iv5xe 6 months ago
Sponsored videos should be disclaimed during the intro. Feels deceptive to not mention it until 6 minutes into the video.
@ArjanCodes 6 months ago
At 0:35 I explicitly say that this is a sponsored video. Also, there's an overlay at the start of the video specifying that this video contains sponsored/promoted material. Am I missing something here?