Well, you can build a fact_transaction table that includes both purchase and return events; that way you can track both activities in one fact table.
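A minimal sketch of that idea, using SQLite through Python only because it runs anywhere. The column names (`event_type`, `amount`, `event_date`, and so on) and the sample rows are made up for illustration; they are not from the video.

```python
import sqlite3

# Hypothetical schema: one row per event, with event_type telling
# purchases and returns apart.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE fact_transaction (
        transaction_id INTEGER PRIMARY KEY,
        order_id       INTEGER NOT NULL,
        vendor_id      INTEGER NOT NULL,
        event_type     TEXT NOT NULL CHECK (event_type IN ('purchase', 'return')),
        amount         REAL NOT NULL,
        event_date     TEXT NOT NULL
    )
""")

# Illustrative data: two purchases and one return for the same vendor.
rows = [
    (1, 100, 7, "purchase", 50.0, "2024-01-01"),
    (2, 101, 7, "purchase", 30.0, "2024-01-02"),
    (3, 100, 7, "return", 50.0, "2024-01-10"),
]
conn.executemany("INSERT INTO fact_transaction VALUES (?, ?, ?, ?, ?, ?)", rows)


def vendor_summary(connection):
    """Return (vendor_id, purchased_total, returned_total) per vendor."""
    query = """
        SELECT vendor_id,
               SUM(CASE WHEN event_type = 'purchase' THEN amount ELSE 0 END),
               SUM(CASE WHEN event_type = 'return' THEN amount ELSE 0 END)
        FROM fact_transaction
        GROUP BY vendor_id
        ORDER BY vendor_id
    """
    return connection.execute(query).fetchall()


print(vendor_summary(conn))
```

Both activities come out of a single scan with conditional aggregation, so there is no join between separate purchase and return tables.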
@gravenguan 1 year ago
Anyway, it's been a long time since I saw such a thorough data engineering interview question. Keep it up!
@vibhavaribellutagi9738 25 days ago
If the return policy is 30 days, a 30-day rolling window doesn't make sense: that window keeps moving, always covering the last 30 days from the current date. Instead we need a fixed 30 days counted from the time of the order.
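To make the difference concrete, here is a small sketch comparing the two checks. The function names are hypothetical, and treating day 30 as still eligible is an assumption; the actual policy boundary is not stated in the thread.

```python
from datetime import date, timedelta

RETURN_WINDOW = timedelta(days=30)  # assumed policy length


def is_return_eligible(order_date, return_date):
    """Fixed window: anchored to the order, independent of today's date."""
    return order_date <= return_date <= order_date + RETURN_WINDOW


def in_rolling_window(order_date, today):
    """Rolling window: only orders from the last 30 days before today."""
    return today - RETURN_WINDOW <= order_date <= today


order = date(2024, 1, 1)
returned = date(2024, 1, 20)
report_day = date(2024, 3, 1)

# The return happened within 30 days of the order, so it is valid...
print(is_return_eligible(order, returned))
# ...but a report run later with a rolling filter no longer sees the order.
print(in_rolling_window(order, report_day))
```

The fixed check gives the same answer no matter when the job runs, which is what you want for backfills and reprocessing.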
@ShikhaShah2722 4 months ago
If your table is partitioned by vendor_id, this will cause skew: as discussed very early on, not every vendor will have sales. Wouldn't it be better to bucket the data by vendor_id and partition by date/interval?
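A toy sketch of the two layouts in plain Python, just to show the shape of the problem. `NUM_BUCKETS`, `layout_key`, and the sample rows are hypothetical, and `vendor_id % NUM_BUCKETS` is a stand-in for whatever hash the engine really uses. In Spark the equivalent would go through `DataFrameWriter.partitionBy` and `bucketBy`.

```python
from collections import Counter

NUM_BUCKETS = 4  # hypothetical bucket count


def layout_key(event_date, vendor_id):
    """Partition by date, then bucket by a hash of vendor_id (modulo here)."""
    return (event_date, vendor_id % NUM_BUCKETS)


# Illustrative rows of (event_date, vendor_id): vendor 1 dominates sales.
rows = (
    [("2024-01-01", 1)] * 5
    + [("2024-01-01", 2)]
    + [("2024-01-02", 1)] * 3
    + [("2024-01-02", 6)]
)

# Partitioning by vendor_id: one partition per vendor, sized by its sales.
by_vendor = Counter(vendor_id for _, vendor_id in rows)

# Partitioning by date with vendor buckets: the bucket count per date is
# capped at NUM_BUCKETS, and small vendors can share a bucket.
by_layout = Counter(layout_key(d, v) for d, v in rows)

print(by_vendor)
print(by_layout)
```

Note that bucketing does not split a single huge vendor across buckets; what it buys you is a bounded number of files per date instead of one partition per vendor, many of them tiny or empty.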
@JashRadia 4 months ago
Why do we need Spark Streaming between Kafka and the landing table if the goal is just to copy the data as is, with no transformations required?
@thomazsuzuki1026 4 months ago
If you are not doing transformations, you will not need it (only if you have connector issues). Kafka alone is more than enough.