Will existing data in Postgres be synced to Hudi too, or just the changes made since the stream was created?
@padam_discussion 6 months ago
Interesting video... great
@SoumilShah 8 months ago
great video
@onehouseHQ 8 months ago
Glad you enjoyed it!
@HoorayforOranges 9 months ago
Thank you so much for this. This is the only video I could find that takes a real deep dive into the data without propaganda towards any one candidate.
@JG-zu6nq 1 year ago
Mistake at 22:41: there's no limitation that you "can't cross over the boundary" in a query when you do partition evolution in Iceberg.
@kjweller 1 year ago
You can cross the boundary, but the query predicates need to be right to get the same performance across both partition schemes.
@JG-zu6nq 1 year ago
@kjweller What exactly does that mean? One just has to write select * from table where ts > timestamp '2023-08-21 00:00:00', and even if the partitioning was evolved from, say, daily to hourly on 08/25, that will work and prune the partitions.
@kjweller 1 year ago
@JG-zu6nq Take an example: you were partitioning by date (daily) and you want to evolve this to partition by userId, or vice versa. A query with only one of the predicates will be efficient just for the section of the data partitioned on that column. Partition evolution works great when moving between different aggregate levels of the same value, but struggles across different values.
@paulfunigga 1 year ago
@kjweller What about schema evolution? Your article says that Hudi's schema evolution is good only on Spark SQL. What if I use Hudi with Trino? Is schema evolution going to be bad? Also, is Hudi good with Trino at all? In Trino's Slack channel they said that they prioritize Iceberg.
@paulfunigga 1 year ago
@kjweller Also, in your "which format to choose" section, why didn't you add another point: Hudi's table services are managed, unlike those of Iceberg and Delta Lake. I think that's a big thing.