Thanks for the video! Would be great to also see how you would write it in a real application.
@rankala 9 months ago
I would like to point out that there are database extensions for GIS data, such as PostGIS for Postgres, so in fact you could query a database. Other databases also have extensions or native features.
@interviewpen 9 months ago
Yes, for our vector-based data this is a good solution. However, for raster data we don’t have any direct equivalent. We sort of glossed over this in the interest of time, so really good thoughts here!
@pmshadow 9 months ago
Very good and explanatory video, thank you very much. I am currently building an internal data platform, and I was going to use Prefect on a VM, but after seeing your video I believe the best way to go would be Prefect + Dask Scheduler + Dask Workers on Azure Kubernetes Service. Does that make sense to you? Then I could benefit from autoscaling of the workers. Thanks again!
@interviewpen 9 months ago
Yep, that sounds like a great solution! There are also fully managed solutions like Snowflake and Databricks, if that suits your use case. Thanks for watching!
@pieter5466 9 months ago
This made me wonder whether systems like Hadoop and MapReduce are still used/built.
@interviewpen 9 months ago
Hadoop MapReduce could absolutely be used in place of Spark/Dask as our distributed data processing cluster. However, it would take a lot of manual work to build the types of aggregations we would need from scratch. Good point!
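To illustrate the "from scratch" point, here is a minimal single-process sketch in plain Python of the map, shuffle, and reduce steps that a simple per-key average would need. The record layout and all function names are hypothetical and not taken from the video; real Hadoop jobs would also need job configuration and serialization, which this leaves out.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical input records: (region, measurement). Not from the video.
records = [
    ("north", 2.0),
    ("south", 4.0),
    ("north", 6.0),
    ("south", 8.0),
    ("east", 5.0),
]


def map_phase(record):
    """Emit (key, (sum, count)) so partial results can be combined later."""
    region, value = record
    return region, (value, 1)


def shuffle(pairs):
    """Group mapped values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(values):
    """Combine (sum, count) pairs for one key into a single (sum, count)."""
    return reduce(lambda a, b: (a[0] + b[0], a[1] + b[1]), values)


def mean_by_key(rows):
    """Run map -> shuffle -> reduce, then turn each (sum, count) into a mean."""
    grouped = shuffle(map(map_phase, rows))
    totals = {key: reduce_phase(values) for key, values in grouped.items()}
    return {key: total / count for key, (total, count) in totals.items()}


means = mean_by_key(records)
print(means)
```

Even this small aggregation needs three hand-written phases, whereas higher-level engines like Spark or Dask expose the same operation as a group-by followed by a mean.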
@yashpandey7433 9 months ago
Did something similar but on a very large scale at PayPal.
@interviewpen 9 months ago
Cool cool!
@ocean645 1 month ago
Hi, what exactly is this subject? Is it data science?
@interviewpen 1 month ago
This is system design: we’re considering what services and infrastructure to use to solve a high-level problem. Thanks for watching!