AWS Tutorials - Streaming Data Ingestion in Amazon Redshift

Views: 8,912

AWS Tutorials

Days ago

Comments: 31

@parthasarathibarman9862 2 years ago

Trust me, you're doing an amazing job touching so many lives! Thank you for all your contributions.

@AWSTutorialsOnline 2 years ago

Thank you so much!

@varra19 2 years ago

@AWSTutorialsOnline Awesome videos, thanks a lot. If possible, please make a video on creating an AWS MSK cluster and ingesting from it into Redshift.

@PrakashReddyK 2 years ago

Awesome, this came out last week. Thank you very much for your content.

@AWSTutorialsOnline 2 years ago

Glad you enjoy it!

@varra19 2 years ago

@AWS Tutorials Awesome videos, thanks a lot. If possible, please make a video on creating an AWS MSK cluster and ingesting from it into Redshift.

@Mustafa-yk8lk 2 years ago

We now have Redshift as a stream consumer to optimize the pipeline, but I think it is still in preview!

@AWSTutorialsOnline 2 years ago

I think it is still in preview.

@subhamaybhattacharyya 2 years ago

Thanks for the great content. However, I am unable to create the materialized view and get the following error: ERROR: Operations on local objects in external schema are not enabled. [ErrorId: 1-62f52b1e-5ea3ac1d18bf9a1234e86e4f] Am I missing a step? I have followed all the steps in the tutorial.

@이재진삐약 2 years ago

Thanks for the great video! Can you also show a demo for S3 --> AWS Kinesis Data Streams --> Redshift? Thanks in advance.

@sr852008 2 years ago

So sir, once data is read into the materialized view, does it vanish from the Kinesis data stream? Or can we also read the same data with other AWS components or tools?

@AWSTutorialsOnline 2 years ago

Data in Kinesis is purged only after the configured retention period. Until then, any consumer can read the data.
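
The retention rule described in this reply can be sketched in a few lines of Python. This is a minimal illustration, assuming the 24-hour default retention period of Kinesis Data Streams; `is_still_readable` is a hypothetical helper, not part of any AWS SDK, and the timestamps are made up.

```python
from datetime import datetime, timedelta, timezone

# Assumption: 24 hours, the default retention period of a Kinesis data stream.
DEFAULT_RETENTION_HOURS = 24


def is_still_readable(arrival_time, now, retention_hours=DEFAULT_RETENTION_HOURS):
    """Return True while a record is still inside the retention window.

    Reading a record does not remove it; only age relative to the
    retention period decides whether consumers can still see it.
    """
    return now - arrival_time < timedelta(hours=retention_hours)


# Illustrative timestamps (hypothetical values).
now = datetime(2022, 8, 11, 12, 0, tzinfo=timezone.utc)
fresh = now - timedelta(hours=3)
stale = now - timedelta(hours=30)

print(is_still_readable(fresh, now))
print(is_still_readable(stale, now))
```

Raising the retention period on the stream widens the window for every consumer, which is why the sketch takes `retention_hours` as a parameter.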

@YEO19901 2 years ago

Is it required to insert the data into another staging layer the moment we first query it from the data stream, or will the data persist in the data stream until we manually purge it?

@AWSTutorialsOnline 2 years ago

In Kinesis, the data is immutable. It remains there for the configured retention period. When you refresh the materialized view, it fetches the latest data from the Kinesis data stream.

@nehabhopale9694 A year ago

Hi, I don't want the data under a single payload column. I want the data pushed into individual columns (e.g. vibrations as one column). How can we achieve this?
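
One possible approach, sketched here rather than taken from the video, is to extract each JSON field into its own column when defining the materialized view. The Python helper below only builds the SQL text. It assumes the payload is UTF-8 JSON and uses the Redshift functions `JSON_EXTRACT_PATH_TEXT` and `FROM_VARBYTE`; the view name, stream reference, the `temperature` field, and the default column names (`kinesis_data`, `approximate_arrival_timestamp`) are assumptions, so check the names your Redshift release exposes.

```python
def build_flattened_view_sql(
    view_name,
    stream_ref,
    fields,
    payload_column="kinesis_data",                     # assumption: varies by release
    timestamp_column="approximate_arrival_timestamp",  # assumption: varies by release
):
    """Build a CREATE MATERIALIZED VIEW statement with one column per JSON field.

    Field names are interpolated directly, so pass only trusted identifiers.
    """
    columns = [timestamp_column]
    for field in fields:
        columns.append(
            f"JSON_EXTRACT_PATH_TEXT(FROM_VARBYTE({payload_column}, 'utf-8'), "
            f"'{field}') AS {field}"
        )
    select_list = ",\n       ".join(columns)
    return (
        f"CREATE MATERIALIZED VIEW {view_name} AS\n"
        f"SELECT {select_list}\n"
        f"FROM {stream_ref};"
    )


# Hypothetical names, for illustration only.
sql = build_flattened_view_sql(
    "sensor_view", "kds.sensor_stream", ["vibrations", "temperature"]
)
print(sql)
```

The extracted values come back as text, so a cast per column would be needed wherever a numeric type is wanted.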

@tracyding4906 2 years ago

I didn't see the ingestion part. Do you run the Python code on EC2?

@AWSTutorialsOnline 2 years ago

I run the ingestion code from a SageMaker notebook. Sample code is available via a link in the description. And yes, you can run the Python code from EC2. You need to set up authorization using an EC2 instance profile, an access/secret key profile, or Cognito.
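
For readers without the linked sample at hand, here is a minimal sketch of what such ingestion code might look like. It is not the author's sample. It assumes boto3 is installed wherever `send_to_stream` is called; the stream name, the `device_id` field, and the reading values are hypothetical.

```python
import json


def build_kinesis_records(readings, partition_key_field="device_id"):
    """Turn plain dicts into the record shape that put_records expects."""
    return [
        {
            "Data": json.dumps(reading).encode("utf-8"),
            "PartitionKey": str(reading[partition_key_field]),
        }
        for reading in readings
    ]


def send_to_stream(stream_name, records, profile_name=None):
    """Send records to a Kinesis data stream with boto3.

    With profile_name=None, boto3 falls back to its default credential
    chain (for example an EC2 instance profile); pass a profile name to
    use a named access/secret key profile instead.
    """
    import boto3  # imported lazily so the builder above runs without boto3

    session = boto3.Session(profile_name=profile_name)
    client = session.client("kinesis")
    return client.put_records(StreamName=stream_name, Records=records)


# Hypothetical reading, for illustration only.
records = build_kinesis_records([{"device_id": "sensor-1", "vibrations": 0.42}])
print(records[0]["PartitionKey"])
```

Calling `send_to_stream("my-stream", records)` from EC2 or a SageMaker notebook would then rely on whichever credentials that environment provides.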

@hsz7338 2 years ago

Thank you for the video, it is great. I think this is a nice feature. My question is similar to @Gunjan Jain's, but not about the data purge. I am wondering whether Redshift persists the data automatically during the process, so that we don't have to rely on the Kinesis data stream for data retention.

@AWSTutorialsOnline 2 years ago

A materialized view does not store data but provides a way to query data from Kinesis in real time. So you need to process the data through the materialized view within the retention period of the Kinesis data stream; otherwise the data is no longer available for processing.

@YEO19901 2 years ago

It seems Approximatearrivaltimestamp is what we will use to move the data from the materialized view to another staging layer.

@AWSTutorialsOnline 2 years ago

@YEO19901 Only if you want to persist the data. If all you want is to process data within the retention period, you don't need to persist anything.
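
If persisting is wanted, the timestamp-based move mentioned above is usually done as a high-watermark copy. The sketch below shows the idea with in-memory lists standing in for the materialized view and the staging table. The `arrival` key stands in for the arrival timestamp column, and all names and values are hypothetical.

```python
def rows_after_watermark(rows, watermark):
    """Select only the rows that arrived after the last persisted timestamp."""
    return [row for row in rows if row["arrival"] > watermark]


def persist_increment(view_rows, staging, watermark):
    """Copy new rows into staging and return the updated watermark."""
    new_rows = rows_after_watermark(view_rows, watermark)
    staging.extend(new_rows)
    # Keep the old watermark when nothing new arrived.
    return max((row["arrival"] for row in new_rows), default=watermark)


# Integers stand in for arrival timestamps (hypothetical data).
view_rows = [
    {"arrival": 1, "payload": "a"},
    {"arrival": 2, "payload": "b"},
    {"arrival": 3, "payload": "c"},
]
staging = []

watermark = persist_increment(view_rows, staging, watermark=1)
watermark_again = persist_increment(view_rows, staging, watermark)
print(watermark, len(staging))
```

In SQL, the same pattern would be an insert into the staging table that selects from the view with a filter on the timestamp column being greater than the stored watermark.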

@tracyding4906 2 years ago

@AWSTutorialsOnline A materialized view in SQL Server stores the data.

@tracyding4906 2 years ago

@AWSTutorialsOnline Materialized views are especially useful for queries that are predictable and repeated over and over. Instead of performing resource-intensive queries on large tables, applications can query the pre-computed data stored in the materialized view. The data is stored in the materialized view.

@anderson.rsantana 2 years ago

Great, something that you barely see in the documentation. Thanks.

@AWSTutorialsOnline 2 years ago

Well, to be honest, it is documented, but the documentation was published only a week ago, which is why you didn't come across it.

@dude0001tube 2 years ago

Is there a way to deserialize events in AWS MSK (Kafka) that are Avro serialized? Specifically when the schema is stored in AWS Glue Schema Registry?

@AWSTutorialsOnline 2 years ago

I need to check. Glue does support streaming data with Kafka, but I have not had a chance to work with it.

@dude0001tube 2 years ago

@AWSTutorialsOnline I got a response from AWS that MSK and Avro deserialization are on the roadmap for Redshift Streaming Ingestion. We already use Spark Streaming to write MSK Avro-serialized messages into an S3 data lake, so I can confirm that does indeed work. I was hoping to build a PoC that goes directly to Redshift for a materialized view, but will have to wait until later this year. Thank you for the video and the reply!