Time Series Anomaly Detection with LSTM Autoencoders using Keras & TensorFlow 2 in Python

64,418 views

Venelin Valkov

Days ago

Subscribe: bit.ly/venelin-...
Complete tutorial + source code: www.curiousily...
GitHub: github.com/cur...
📖 Read Hacker's Guide to Machine Learning with Python: bit.ly/Hackers-...
Detect anomalies in S&P 500 daily closing price. Build LSTM Autoencoder Neural Net for anomaly detection using Keras and TensorFlow 2.

Comments: 90
@MichalMonday 2 years ago
If anyone has a problem with the plot statements at the end, it helped when I used: scaler.inverse_transform(test[TIME_STEPS:].close.values.reshape(1,-1)).reshape(-1), and scaler.inverse_transform(anomalies.close.values.reshape(1,-1)).reshape(-1)
@boomtrack5176 2 years ago
Great thank you
@Wissam-rk7tv 1 year ago
Thank you so much. Do you have an idea of how to prepare our data in the case of a multivariate analysis with redundant dates, for example predicting temperature in several regions? (We don't have a unique key.)
@CodeEmporium 4 years ago
This is gold. I'm doing something similar for work. Glad I discovered this channel. Subscribed! Looking forward to more content!
@mikhailb8026 3 years ago
Dear Venelin, you are training your model using labels (y_train) that are the t+1 timestamps for each training sequence (X_train), but an autoencoder is supposed to be trained with labels that are the same as the input training sequence, which means you should use model.fit(X_train, X_train), I guess. Could you kindly explain why you use this training scheme and call it an autoencoder?
@franziskahuber9664 4 years ago
Doing my bachelor's thesis on this. Very helpful for gaining an overview of the topic, thank you!
@leoparada69 4 years ago
Great tutorial. Just wanted to point out that the problem at 24:50 is in the way the mean absolute error is calculated: np.abs(X_train_pred, X_train) is not the same as np.abs(X_train_pred - X_train)
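The difference pointed out here can be seen in a small NumPy sketch. The array values are made up for illustration. The second positional argument of `np.abs` is the `out` array, so `np.abs(pred, actual)` computes `|pred|` and stores it in `actual` rather than comparing the two arrays.

```python
import numpy as np

# Hypothetical predictions and ground truth, chosen only for illustration.
pred = np.array([1.0, -2.0, 3.0])
actual = np.array([1.5, -1.0, 2.0])

# Correct: element-wise absolute difference, then the mean.
mae = np.mean(np.abs(pred - actual))

# Buggy form: the second positional argument is `out`, so this writes
# |pred| into `scratch` and never compares the two arrays.
scratch = actual.copy()
wrong = np.abs(pred, scratch)
```

Note that the buggy call also overwrites its second argument, which is why a copy is used in the sketch.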
@christalone1693 3 years ago
Appreciate all your help man, it's really made a difference in how quickly I've learned a lot of these concepts! You are the best.
@vigneshpadmanabhan 4 years ago
This is exactly what I wanted to learn. Would you be able to do the same for a multistep multivariate time series, identifying the anomalies and forecasting? Thanks!
@Breno9629 3 months ago
Hey Mr. Venelin, thank you for the video. If you allow me to ask some questions: why do we have to pass both X and y when training the model? Is the model reconstructing the original sequence and also trying to predict the next value based on the 30 values provided? (I am asking because I was expecting that we would pass the same sequence, similar to what we do with a vanilla autoencoder.) It seems that we input a sequence and try to predict the next value for it while also reconstructing the initial sequence. When we calculate the error, it is based on the reconstruction, am I right? Thank you in advance!
@maziarkasaeiroodsari6473 4 years ago
Thanks for the tutorial. Question: Why do you create a target label (y) when you are doing an unsupervised analysis?
@cedricvillani8502 3 years ago
Because the 🐐 Goat told him so
@martintabikh494 4 years ago
Hi, why do we use y_train to fit the model and not X_train? It is an autoencoder, right? So we train the model to reproduce the input, which is X_train.
@maziarkasaeiroodsari6473 4 years ago
Yes, that is indeed very confusing in this tutorial. Why create a target label when you are doing an unsupervised analysis?
@arielgroisman4724 2 years ago
Same question here, this doesn't look like an autoencoder architecture. It should be model.fit(X_train, X_train,...)
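The shape argument behind this thread can be sketched with NumPy alone. The windowing helper below is an assumption about what `create_dataset` does (sliding windows plus a next-step target), and the series is a stand-in for the scaled closing prices, not the tutorial's data.

```python
import numpy as np

def create_dataset(values, time_steps):
    """Slice a 1-D series into overlapping windows and next-step targets."""
    X, y = [], []
    for i in range(len(values) - time_steps):
        X.append(values[i:i + time_steps])
        y.append(values[i + time_steps])
    # Add a trailing feature axis: (samples, time_steps, 1).
    return np.array(X)[..., np.newaxis], np.array(y)

series = np.arange(40, dtype=float)  # stand-in for scaled closing prices
X_train, y_train = create_dataset(series, time_steps=30)

# A reconstruction autoencoder is trained as model.fit(X_train, X_train):
# input and target share one shape. y_train holds one value per window,
# which only suits a model that outputs a single forecast step.
reconstruction_target = X_train
```

Under these assumptions `X_train` is three-dimensional while `y_train` is one-dimensional, which is the mismatch the commenters describe.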
@donwoodlock15 3 years ago
Thank you for the tutorial. There is one piece I didn't understand. The shape of y_test is 380, so I was thinking that the model would make 380 predictions, but the shape of the predictions (y_test_pred) is 380*30. Is it making 30 predictions per date? For example, does it use the prior 30 days as the input sequence and predict the next 30 days? I was also thinking that, since y_train holds a single closing price per day, the model would be trained to predict only one value per date, not 30. Can you clarify?
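One likely reading, assuming the decoder emits a value for every time step of the window, is that each of the 380 windows gets a 30-step reconstruction, and the per-window anomaly score comes from averaging the error over those 30 steps. The arrays below are random stand-ins, not model output.

```python
import numpy as np

rng = np.random.default_rng(0)
n_windows, time_steps = 380, 30

# Stand-ins for the test windows and the model's reconstruction of them.
X_test = rng.normal(size=(n_windows, time_steps, 1))
X_test_pred = X_test + rng.normal(scale=0.1, size=X_test.shape)

# Averaging the absolute error over the time axis collapses the 30
# per-step values into one score per window.
test_mae_loss = np.mean(np.abs(X_test_pred - X_test), axis=1)
```

With this reduction there is again one number per date, which is what gets compared against the threshold.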
@pratiksingh2840 4 years ago
Great work Venelin. Very clean step-by-step explanation. Keep it up
@hipphipphurra77 4 years ago
I am wondering a little bit what we gain from detecting historical anomalies. It is like knowing that last week's weather probably had an anomaly. What we need is a prediction (not of the future price, that is not enough) of the future performance. If we can't have this, then we would at least like to have a prediction of future anomalies.
@xRandom112 4 years ago
Great tutorial, it's really noticeable that you know what you're doing. Keep it up
@gcvictorgc 4 years ago
Thanks for this! Could you elaborate on your choice of loss function? Would you do things differently if you had >1 features (multivariate time series)? Cheers
@xenophon167 2 years ago
Excellent video, thanks a lot! However, I would like to see an extension of this example using multiple features. I tried to extend it using more features with no luck so far.
@conduit242 4 years ago
Uhh... this is non-stationary data, you need to remove the trend or you'll get these bogus results. LSTMs assume stationarity. Convert it to daily percentage change for stock data.
@dafliwalefromiim3454 3 years ago
Exactly, in the case of time series data the samples are too autocorrelated; it can't be modelled directly without removing the trend. Hi Rob, can I talk to you one to one, please? My contact: gautamk2017@email.iimcal.ac.in
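The transformation suggested in this thread can be sketched as follows. The prices are hypothetical, and log returns are shown only as a common alternative, not something the commenters proposed.

```python
import numpy as np

# Hypothetical closing prices with an upward trend.
close = np.array([100.0, 102.0, 101.0, 105.0, 110.0])

# Daily percentage change: (p_t - p_{t-1}) / p_{t-1}.
pct_change = np.diff(close) / close[:-1]

# Log returns carry the same information: exp(log_return) - 1 == pct_change.
log_return = np.diff(np.log(close))
```

Either series is one element shorter than the price series, so the dates need to be shifted by one row when the result is plotted against the original index.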
@shyamkarthikrameshbabumis5367 3 years ago
Really helpful for my time series problem related to climate change, thank you!
@BeCorbie 4 years ago
Very helpful tutorial! I have to do something similar for university and this helps a lot! :)
@ayushpantdeptofcs1635 4 years ago
How do we take multivariate features to perform anomaly detection? I.e. X1, X2, X3 as the input and we want to predict the future Y value.
@vigneshpadmanabhan 4 years ago
I would like to know the same thing
@massiivelli4267 4 years ago
Create a train/test set with all features. Then fit the scaler separately for train[[X1,X2,X3]] and train[y]. Then call create_dataset like this: X_train, y_train = create_dataset(train, train[y], time_steps). Note that train (X_train) contains X1, X2, X3 and y (of the past N time steps), while y_train contains only the y to predict. The rest should be pretty much the same.
@iLoveBrezels 4 years ago
@massiivelli4267 could you further explain where I get train[y] from? Let's say my three features are cpu, ram and hd usage. I did train['cpu'] = scaler.fit_transform(train['cpu']) and likewise for the other two. Where do I get train[y] from? What do I pass into X_train, y_train = create_dataset(train, ?????, time_steps)?
@massiivelli4267 4 years ago
@iLoveBrezels train[y] is the variable you want to predict. In other words, it is the thing you want to know. So in a time series situation, it is normally the value of something at a specific time in the future, based on the past values.
@maziarkasaeiroodsari6473 4 years ago
@massiivelli4267 The thing is: in autoencoders, you are not predicting anything. You shouldn't need any target, as this is unsupervised!
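For the reconstruction setting raised in the last reply, a multivariate input only changes the size of the last axis. The sketch below is a minimal illustration under stated assumptions: `create_windows` is a hypothetical helper, the data is random, and the three columns merely stand in for features such as cpu, ram and hd usage.

```python
import numpy as np

def create_windows(features, time_steps):
    """Stack sliding windows from a (n_rows, n_features) array."""
    return np.stack([features[i:i + time_steps]
                     for i in range(len(features) - time_steps)])

# Hypothetical data: 100 rows of three features (e.g. cpu, ram, hd usage).
rng = np.random.default_rng(42)
raw = rng.uniform(0, 100, size=(100, 3))

# Standardise each column with its own mean and standard deviation.
mean, std = raw.mean(axis=0), raw.std(axis=0)
scaled = (raw - mean) / std

X = create_windows(scaled, time_steps=30)  # shape (70, 30, 3)
# For a reconstruction autoencoder the target is X itself; no separate y.
```

In a real pipeline the scaling statistics would come from the training split only and be reused for the test split.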
@DanBarbatti 4 years ago
Hi, great tutorial. Trying to use your code with some of my data. The only change was the number of time steps. Getting shape incompatibility errors when I try to use y_train in the fit method. Also using Keras 2.2.4 and TensorFlow 1.13.1... Any advice?
@FRUXT 2 years ago
I get an ad every 3 minutes... Apart from that, excellent video. However, the anomalies detected don't seem abnormal to me. It's more abnormal when the change is big and sudden.
@mariaclaradantas5419 3 years ago
This tutorial helped me a lot! Thank you!!
@studyhub3950 1 year ago
Firstly, thanks. My question is: when the input is 30*1, meaning 30, how can the output be 64? In an autoencoder we compress the data and then decode, for example 30 to 15 to 10, then decode.
@wijdanchoukri775 2 years ago
Thank you so much for this, this is what I wanted to learn
@Nofakeable 2 years ago
That was a really well done video, thank you!
@shilpashivamallu9056 3 years ago
In order to predict the next 8 hours, what needs to be changed in the code? Should TIME_STEPS be 8? How does the model identify whether it is in hours or days? Thanks
@nielspalmans6237 3 years ago
Is there a way to do what you did on data consisting of multiple attributes rather than just one?
@vahidjoudakian8649 2 years ago
Very informative, thank you
@AnirudraDiwakar 4 years ago
This is very nicely explained. Thank you sir.
@nataliagromova961 2 years ago
Very cool 👍 Was it helpful for you to predict the stock price in real life?
@tangibleoxygen1986 4 years ago
Note: the shared Colab notebook also gives the exact same error. Hence I double-checked my lines against yours. Any help would be super beneficial
@rhithickm2689 3 years ago
At around 24:05, it should be np.abs(x - y) and not np.abs(x, y), right?
@idotsuk 4 years ago
Standard scaling doesn't work well here since the S&P 500 is increasing (test samples are strictly larger). But I guess batch normalization makes up for it when only looking at 30 days. Maybe it'd make sense to scale with a logarithmic function of the date?
@douglaszechin3233 4 years ago
Wouldn't it perform better if you used return_state in the encoder and passed the result as initial_state in the decoder? It seems that your approach passes just the last output of the LSTM, which doesn't carry much information...
@marouaslafa7571 4 years ago
Very good tutorial. Can you do another one about anomaly detection in images? It would be very interesting
@vamsikrishnabhadragiri402 3 years ago
Why did we use a time distributed dense layer? Any specific reason?
@awaisumar5125 4 years ago
How can we actually feed real-time test data to this model to get real-time predictions? Is there any tutorial or link for this?
@Cyberfako 2 years ago
You are great! That helped well, so thx 🙏
@harshitbhargav 4 years ago
Does that also work for multiple time series of varied length?
@prathameshpradipdatar2003 4 years ago
Great walkthrough with the code!
@tangibleoxygen1986 4 years ago
Hi, thanks for such a detailed and compact explanation. I need some help with the .fit() method. I am getting an error: expected time_distributed_ to have 3 dimensions, but got array with shape (7752, 1). I checked GitHub and the error logs. The shape of X_train is still (7752, 30, 1). Is there any solution?
@ismailwafaadenwar3254 4 years ago
In the fit() method, change y_train to X_train as well. Theoretically, in autoencoders, X is the input and X is the output
@vishwasgowda 4 years ago
First of all, thank you for the video tutorial. I am curious whether you could do a video on how to set up an email system for when the value reaches the anomaly threshold. The idea is to raise an alarm before something bad happens. You could also point me to where I can get ideas for setting up an alarm system. Thank you
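A minimal sketch of such an alert, using only the Python standard library, might look like the following. The function name `build_alert`, the addresses, and the scores are all hypothetical; the message is only built here, not sent.

```python
from email.message import EmailMessage

def build_alert(scores, threshold, recipient):
    """Return an EmailMessage if any score exceeds the threshold, else None."""
    flagged = [(i, s) for i, s in enumerate(scores) if s > threshold]
    if not flagged:
        return None
    msg = EmailMessage()
    msg["Subject"] = f"{len(flagged)} anomaly score(s) above {threshold}"
    msg["To"] = recipient
    msg["From"] = "alerts@example.com"  # placeholder sender
    msg.set_content("\n".join(f"index {i}: score {s:.2f}" for i, s in flagged))
    return msg

alert = build_alert([0.4, 2.3, 1.1, 2.0], threshold=1.9,
                    recipient="ops@example.com")
# Sending is left out; smtplib.SMTP(...).send_message(alert) would do it
# once a real mail server is configured.
```

Because reconstruction error is computed on data that has already arrived, this kind of alert fires when an anomaly is observed rather than before it happens.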
@priyankadas7102 4 years ago
Excellent content on your channel. Thanks
@TOM-cd1zb 4 years ago
Hi there, my val_loss is constant from the first epoch, so it is overfitting. Any tips?
@mindbodyzaid7814 2 years ago
Why do you need to create a "y" dataset if for autoencoders "x" should be mapped to "x"?
@nguyenanhnguyen7658 2 years ago
This is cool!
@blackisfav7222 4 years ago
How do I convert this into a script where we give it data and the output is the anomalies? I mean an end-to-end, function-based script rather than Jupyter notebooks
@mohammedghouse235 3 years ago
Amazing video. Could you also do the same anomaly detection on oil production profiles?
@suyashsonawane4690 4 years ago
I tried to implement this on a multivariable dataset but it doesn't work; the last layer gives an incompatible shape error
@AdityalikeThe 4 years ago
Same with me, did you find a solution to that?
@MrProzaki 4 years ago
Same here xD, still looking for a solution... that I can understand.
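One plausible cause of this error, assuming the final per-time-step Dense layer was left at a single output unit, is that its output width no longer matches the number of input features. The NumPy sketch below imitates such a layer as a matrix product on the last axis; the sizes (64 units, 3 features) and the random arrays are illustrative only.

```python
import numpy as np

n_samples, time_steps, units, n_features = 8, 30, 64, 3
rng = np.random.default_rng(1)

# Stand-in for the decoder LSTM output: one 64-d vector per time step.
decoded = rng.normal(size=(n_samples, time_steps, units))

# A Dense layer applied per time step is a matrix product on the last axis.
# Its output width must equal n_features, otherwise the result cannot be
# compared with the (n_samples, time_steps, n_features) input windows.
weights = rng.normal(size=(units, n_features))
bias = np.zeros(n_features)
reconstruction = decoded @ weights + bias
```

In Keras terms this would correspond to giving the last Dense layer as many units as there are features, and fitting with the input windows as the target.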
@shreyasshinde5451 2 years ago
@Venelin Valkov Thanks for the video and the great explanation. I am working with multiple features (multiple attributes) for anomaly detection. Could you provide any sample code or reference for that? Would be great. Thanks :)
@ogochukwuujunwa4680 2 years ago
Please can you change the font on your system to make the text legible
@2guestuser 4 years ago
Fantastic tutorial!
@sagar8460830871 3 years ago
Can we do this for multiple variables?
@maxmag76 3 years ago
Thank you so much for the nice video.
@blackisfav7222 4 years ago
Consider the behaviour of user logins and find the anomalies
@mp3311 2 years ago
I get the ValueError: Expected 2D array, got 1D array instead at scaler.inverse_transform(test[TIME_STEPS:].close). How could I fix this?
@gnn816 2 years ago
Hello there, did you manage to solve this problem? I am facing the same issue.
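The error arises because scikit-learn transformers expect a two-dimensional array of shape (n_samples, n_features), while a single pandas column is one-dimensional. The sketch below imitates what a standard scaler's inverse transform does using plain NumPy; the mean, scale and values are hypothetical.

```python
import numpy as np

# Hypothetical scaler statistics learned from a single 'close' column.
mean, scale = 2500.0, 400.0

scaled_close = np.array([-0.5, 0.0, 1.25])  # 1-D, like a pandas Series

# One feature means one column, hence reshape(-1, 1).
column = scaled_close.reshape(-1, 1)

# What inverse_transform does for a standard scaler: x * scale + mean.
restored = (column * scale + mean).reshape(-1)
```

With the real scaler, the equivalent call would presumably be `scaler.inverse_transform(test[TIME_STEPS:].close.values.reshape(-1, 1)).reshape(-1)`, assuming the scaler was fitted on the `close` column alone.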
@dibyakantaacharya4104 4 years ago
Can I run this on image datasets? What would the code be?
@doudi0101 4 years ago
Very interesting, thank you!
@Bruno.FERGANI 4 years ago
Thanks Venelin for the tutorial! 👍 The preferred way to use TensorFlow 2.x on Colab is via the %tensorflow_version magic: colab.research.google.com/notebooks/tensorflow_version.ipynb
@abhijeet6989 3 years ago
Dear Sir, Greetings!! Thank you very much for guiding us throughout the tutorial. Kindly help me solve the issue below. I am getting an error here:
    THRESHOLD = 1.9
    test_score_df = pd.DataFrame(index=test[TIME_STEPS:].index)
    test_score_df['loss'] = test_mae_loss
    test_score_df['threshold'] = THRESHOLD
    test_score_df['anomaly'] = test_score_df.loss > test_score_df.threshold
    test_score_df['close'] = test[TIME_STEPS:].close
Errors are:
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    in ()
          1 THRESHOLD = 1.9
          2 test_score_df = pd.DataFrame(index=test[TIME_STEPS:].index)
    ----> 3 test_score_df['loss'] = test_mae_loss
          4 test_score_df['threshold'] = THRESHOLD
          5 test_score_df['anomaly'] = test_score_df.loss > test_score_df.threshold
    3 frames
    /usr/local/lib/python3.6/dist-packages/pandas/core/frame.py in __setitem__(self, key, value)
       3042         else:
       3043             # set column
    -> 3044             self._set_item(key, value)
       3045
       3046     def _setitem_slice(self, key: slice, value):
    /usr/local/lib/python3.6/dist-packages/pandas/core/frame.py in _set_item(self, key, value)
       3118         """
       3119         self._ensure_valid_index(value)
    -> 3120         value = self._sanitize_column(key, value)
       3121         NDFrame._set_item(self, key, value)
       3122
    /usr/local/lib/python3.6/dist-packages/pandas/core/frame.py in _sanitize_column(self, key, value, broadcast)
       3766
       3767             # turn me into an ndarray
    -> 3768             value = sanitize_index(value, self.index)
       3769             if not isinstance(value, (np.ndarray, Index)):
       3770                 if isinstance(value, list) and len(value) > 0:
    /usr/local/lib/python3.6/dist-packages/pandas/core/internals/construction.py in sanitize_index(data, index)
        746     if len(data) != len(index):
        747         raise ValueError(
    --> 748             "Length of values "
        749             f"({len(data)}) "
        750             "does not match the length of index "
    ValueError: Length of values (7752) does not match the length of index (380).
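The two lengths in the message suggest, though the comment does not confirm it, that `test_mae_loss` was computed from the 7752 training windows rather than the 380 test windows. The sketch below uses random stand-in arrays with those sizes and a hypothetical helper, `mae_per_window`, to show a length check that would catch the mix-up before the DataFrame assignment.

```python
import numpy as np

rng = np.random.default_rng(7)
time_steps = 30

# Hypothetical window counts matching the numbers in the traceback.
X_train = rng.normal(size=(7752, time_steps, 1))
X_test = rng.normal(size=(380, time_steps, 1))

def mae_per_window(pred, actual):
    """Mean absolute error over the time axis, one value per window."""
    return np.mean(np.abs(pred - actual), axis=1).reshape(-1)

# The test scores must come from the test windows; zeros stand in for
# model.predict(X_test) here.
test_mae_loss = mae_per_window(np.zeros_like(X_test), X_test)

# Guard against assigning train-sized scores to a test-sized index.
assert len(test_mae_loss) == len(X_test)
```

If the assertion fails in a notebook, re-running the cell that computes the test predictions from `X_test` is the first thing to check.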
@cyrusazamfar6220 2 years ago
You are copying stuff from another screen and STILL, you messed it up :) 🤣
@AlonAvramson 1 year ago
If it were profitable, would you still invest time to create and publish a video?
@blackisfav7222 4 years ago
Any help for login-time-based anomalies?
@adityahpatel 2 years ago
In an autoencoder you should do .fit(x,x), not .fit(x,y)
@skorpio3110 1 year ago
Did you get any answer for that? I'm confused too
@farhanjavid6474 6 months ago
😍😍😍😍😍😍😍
@susantisisteminformasi4154 4 years ago
Hello, noob here. Why GPU?
@FrancescoLucrezia 3 years ago
There is a course on Coursera with content identical to this video. So someone is plagiarizing. The course on Coursera is a paid one: www.coursera.org/projects/anomaly-detection-time-series-keras
@cedricvillani8502 3 years ago
And then TradingView stomped on him. Go there, learn Pine Script, make money. Then go on his Patreon and give him money. OK GO GO GO
@Raaj_ML 1 year ago
Sorry, this is fundamentally wrong... An autoencoder should try to reconstruct 30 time steps from the 30 input time step values... Here you are mixing a forecast of the 31st value with an autoencoder... It should be trained with target values the same as the input values.
@kacperogorek3958 3 years ago
You are close