Multivariate Time Series Prediction with LSTM and Multiple features (Predict Google Stock Price)

62,941 views

Comments: 116

@arturoroche3782 4 years ago

01:58 Download data 04:39 Import libraries 05:19 Load data (Read file) 05:36 Select features 06:33 Data preprocessing 07:44 Scaling data 08:37 Data structure 10:38 Creating model

@kingki1953 1 year ago

Thanks dude

@benjaminlynch9958 2 years ago

A big part of the problem with the results at the end of the test period was how the training data was scaled. This is a common problem with stock prediction models because, with enough time, stock prices will almost always exceed the upper bound of the original scaled range. The only way around this is to scale the input data by a different method or to retrain (and rescale) the model on a regular basis. Failure to do this will see the model ‘predict’ the maximum possible value over and over again once the actual price has exceeded that maximum, which is what we see at the end of this video. The other thing to know is that the opening price is *NOT* a market-derived price but is in fact determined by a conscious decision of a single individual - the market maker - for each stock. There are academic studies that show that opening prices are ‘manipulated’ (in a very legal way) by the market maker, and as such it is advisable to take this into account with projects such as this. Most people choosing to predict stock prices focus on the closing price for exactly this reason.
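A small sketch of the scaling issue described above, using made-up prices and a hand-written min-max scaler (the function names are hypothetical, and the tutorial's own scaler may differ). It shows a later price landing outside the range learned from training data, and day-over-day returns as one possible alternative input that does not depend on the price level.

```python
def fit_min_max(values):
    """Return (lo, hi) learned from the training values only."""
    return min(values), max(values)

def scale(value, lo, hi):
    """Map value relative to the training range; training data lands in [0, 1]."""
    return (value - lo) / (hi - lo)

def pct_returns(prices):
    """Day-over-day relative change, which does not depend on the price level."""
    return [(curr - prev) / prev for prev, curr in zip(prices, prices[1:])]

train_prices = [100.0, 120.0, 150.0, 200.0]   # made-up training prices
later_price = 260.0                           # made-up price after training ends

lo, hi = fit_min_max(train_prices)
scaled_later = scale(later_price, lo, hi)     # above 1.0: outside what the model saw
returns = pct_returns(train_prices + [later_price])
```

Refitting the scaler on a rolling window, as the comment suggests, is the other option; which one works better here is not something the thread settles.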

@folashadeolaitan6222 2 years ago

Thank you for this video, sir. Please, I have the following questions.
1. Why are we creating a second scaler object for the open price, which we want to predict? Is it not part of the training_set which we supplied to the first scaler?
sc = StandardScaler()
training_set_scaled = sc.fit_transform(training_set)
sc_predict = StandardScaler()
sc_predict.fit_transform(training_set[:, 0:1])
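One likely reason for the second scaler, sketched here with NumPy instead of scikit-learn: the model outputs a single column, and undoing the scaling needs the mean and standard deviation of that column alone. The data below is made up, and the per-column mean/std arithmetic is a stand-in for what StandardScaler computes (population standard deviation, which is also NumPy's default).

```python
import numpy as np

# Made-up data: 4 rows, 3 features; column 0 plays the role of 'Open'.
training_set = np.array([[10.0, 1.0, 100.0],
                         [12.0, 2.0, 110.0],
                         [14.0, 3.0, 120.0],
                         [16.0, 4.0, 130.0]])

# Statistics for all features: one mean/std per column (like `sc`).
mean_all = training_set.mean(axis=0)
std_all = training_set.std(axis=0)
scaled = (training_set - mean_all) / std_all

# Statistics for the target column only (like `sc_predict`).
mean_open = training_set[:, 0:1].mean(axis=0)
std_open = training_set[:, 0:1].std(axis=0)

# A model would output one scaled value per sample, shape (n, 1).
fake_predictions = scaled[:, 0:1]

# The one-column statistics invert the (n, 1) array directly.
restored = fake_predictions * std_open + mean_open
```

The second scaler does not change what the model is trained on; the inputs still come from the array scaled with all columns.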

@mihirbhawsar3150 3 years ago

Why is the data not split 70-30 (x_train, y_train, x_test, y_test)?
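For reference, a minimal sketch of a chronological 70-30 split, assuming the rows are already in time order. The helper name is hypothetical and this is not code from the tutorial.

```python
def chronological_split(rows, train_fraction=0.7):
    """Split without shuffling: the earliest rows train, the latest rows test."""
    cut = round(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]

rows = list(range(10))          # stand-in for 10 time-ordered observations
train, test = chronological_split(rows)
```

Keeping the test rows strictly later than the training rows avoids evaluating on data from before the training period ends.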

@vijayendrasdm 3 years ago

Shouldn't shuffle be set to False in the model.fit step for time series data?

@Mipetz38 3 years ago

Same question. My second question is whether a windowing system is actually being used.

@devanshgupta5059 1 year ago

Thank you so much for this tutorial. There aren't many multivariate examples on the internet, especially with your style of explanation. Greatly appreciated.

@DataScienceGarage 1 year ago

Thank you for such feedback! Really appreciate it! :)

@NiranjanBallal 4 years ago

Shouldn't the shape of y be (3857, 60), because you are predicting the value for 60 days?

@noya-san1118 3 years ago

Good point. He is not predicting the projection for the next 60 days, just what it will be 60 days later.
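The difference between the two target layouts discussed above can be sketched on a made-up one-column series. The first list follows the slicing quoted elsewhere in this thread (one value, n_future steps after the window); the second is a hypothetical multi-step variant with one value per future step.

```python
import numpy as np

series = np.arange(20, dtype=float)   # made-up scaled target column
n_past, n_future = 5, 3

y_single, y_multi = [], []
for i in range(n_past, len(series) - n_future + 1):
    # One value, n_future steps after the end of the input window.
    y_single.append(series[i + n_future - 1:i + n_future])
    # All n_future values that follow the input window.
    y_multi.append(series[i:i + n_future])

y_single = np.array(y_single)   # shape (13, 1)
y_multi = np.array(y_multi)     # shape (13, 3)
```

A model trained on the second layout would need n_future output units instead of one; that change is an assumption here, not something shown in the video.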

@scott7948 3 years ago

Hi, the future predictions don't actually work. If you look at the future predictions, they are just the last training set predictions. So if you predict 8 time steps into the future, it takes the last 8 training set prediction numbers. Does anyone know a way to resolve this?

@senthilkumara3653 4 years ago

Hello Prof. Wonderful video. Thanks for sharing. I tried using the same code and dataset. I find: 1.) Step 2: I have to modify training_set = dataset_train.to_numpy(), since .as_matrix() has been deprecated. 2.) I get an error in step 6, visualizing the predictions ("ValueError: view limit minimum -34770.450000000004 is less than 1 and is an invalid Matplotlib date value. This often happens if you pass a non-datetime value to an axis that has datetime units"). Please suggest a solution.

@econometrics4423 4 years ago

I'm facing the same problem. Please let me know if you find a solution.

@AutoitTutorialReturn 3 years ago

@econometrics4423 Run this code after you run step 6:
------------------------------
# Parse training set timestamp for better visualization
dataset_train = pd.DataFrame(dataset_train, columns=cols)
dataset_train.index = datelist_train
dataset_train.index = pd.to_datetime(dataset_train.index)
------------------------------
It works for me.

@lucreziatosato360 2 years ago

Hi, did you find any solution?

@lucreziatosato360 2 years ago

Found it. You just need to put the last cell of code before the penultimate one; everything works fine then.

@malepatirahul7339 3 years ago

Hello. In the model predictions, I did not understand the part model.predict(X_train[-n_future:]). Can anyone explain why it was written with -n_future?

@shvprkatta 4 years ago

Hello doc. For every 90 time steps of X_train data, what is the corresponding y_train data? I mean, is it the 91st value of the y_train data? Please clarify.

@serdilanlunlu1419 1 year ago

Why don't we split the data into train and test sets?

@neharai4330 4 years ago

It is not giving future predictions. The future predictions for, let's say, 5 days are the same as the last 5 days of the train set predictions. Please look into it, it's confusing.

@AkarshVerma 4 years ago

@noya-san1118 3 years ago

@Dong Hoon Oh the code is not wrong, it just can't predict the future based on the input.

@ivanarcos9329 3 years ago

@noya-san1118 Is there any way to do it? Please, if someone can contribute any way, I thank them.
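One possible approach, sketched on made-up data and assuming the windowing quoted elsewhere in this thread (a window ending at row i - 1 is paired with the target at row i + n_future - 1). Under that pairing, the last windows in X_train already have targets inside the known data, so predicting on them repeats the training range. Windows that end on the most recent rows have targets beyond the data; `X_future` below is a hypothetical name for them.

```python
import numpy as np

n_past, n_future = 4, 2
data = np.arange(10, dtype=float).reshape(-1, 1)   # made-up scaled data, rows 0..9

# Training windows: every window here has a known target row.
train_idx = range(n_past, len(data) - n_future + 1)
X_train = np.array([data[i - n_past:i] for i in train_idx])
train_target_rows = [i + n_future - 1 for i in train_idx]

# Windows ending on the most recent rows: their targets lie past the data.
future_idx = range(len(data) - n_future + 1, len(data) + 1)
X_future = np.array([data[i - n_past:i] for i in future_idx])
future_target_rows = [i + n_future - 1 for i in future_idx]
```

With a trained model, `model.predict(X_future)` would then give one value for each of rows 10 and 11 in this toy setup. Whether those forecasts are any good is a separate question from whether they refer to future dates.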

@turkial-harbi2919 2 years ago

I think there is also a problem with reshaping the data for the model. The train dataset is dropping the last feature.

@marvinvalentin6386 1 year ago

Hi, this is a good tutorial. Thanks. How can I extract the predicted values so that they are written beside the actual values? There would be 2 columns, 1 for the predicted and 1 for the actual. Thanks.
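A minimal sketch of one way to do this with the standard library, assuming the predictions have already been converted back to price units. All values and dates below are made up, and an in-memory buffer stands in for a real file.

```python
import csv
import io

dates = ["2020-01-02", "2020-01-03", "2020-01-06"]   # made-up dates
actual = [101.0, 102.5, 103.0]                       # made-up actual values
predicted = [100.4, 102.9, 103.6]                    # made-up model output

buffer = io.StringIO()                               # stand-in for open("out.csv", "w")
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["date", "actual", "predicted"])
for row in zip(dates, actual, predicted):
    writer.writerow(row)

table = buffer.getvalue()
```

The same pairing works with a DataFrame if pandas is already in use; the important part is that both lists are aligned on the same dates.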

@user-sj2qs3cl8o 1 year ago

I think your video is very good. It helped me a lot. I perused from start to finish and followed the tutorial using stock data from other companies. I have a question. When you look at the last graph, the last part of the red graph (PREDICTIONS_FUTURE) and the last part of the orange graph (PREDICTION_TRAIN) always match. When I checked the tail, the value probably matches. I'm asking you a question because I'm not good enough. I'm sorry. Thank you so much.

@yangjun330 4 years ago

Thanks. But how do I make "multi-input one-output" or "multi-input multi-output" models? That is, I need to use several other features (e.g. a, b, c) to predict d.

@ardagon385 3 years ago

Just use multi-input to multi-output and select the output you need.
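A sketch of the "a, b, c predict d" case on made-up data, where the target column is kept out of the inputs. `make_windows` and its parameters are hypothetical names, not part of the tutorial.

```python
import numpy as np

def make_windows(data, feature_cols, target_col, n_past, n_future):
    """Build (samples, n_past, features) inputs and (samples, 1) targets."""
    X, y = [], []
    for i in range(n_past, len(data) - n_future + 1):
        X.append(data[i - n_past:i, feature_cols])       # only the chosen inputs
        y.append(data[i + n_future - 1, target_col])     # value n_future steps ahead
    return np.array(X), np.array(y).reshape(-1, 1)

data = np.arange(40, dtype=float).reshape(10, 4)   # made-up columns a, b, c, d
X, y = make_windows(data, feature_cols=[0, 1, 2], target_col=3,
                    n_past=3, n_future=1)
```

Here X has shape (7, 3, 3) and y has shape (7, 1), so a network with a single output unit would fit this layout.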

@WahranRai 3 years ago

13:29 I don't like your shuffle=True in model.fit. For time series we must keep the time sequence of the data.

@YohaneesHutagalung 3 years ago

Started training, and the result is: Epoch 00010: val_loss did not improve from inf, Epoch 00010: early stopping. CPU times: user 49 s, sys: 3.53 s, total: 52.5 s. Wall time: 31.2 s. I changed lr, but it still didn't change.

@academicexpert1108 4 years ago

Here the future predictions are exactly the same as the training predictions. That means it takes the last train predicted data and plots them as future predictions, not actual predictions.

@DataScienceGarage 4 years ago

No, look at the tutorial again :)

@DataScienceGarage 4 years ago

Training predictions are displayed just for checking whether we predicted correctly. Or could you clarify your note?

@academicexpert1108 4 years ago

I'm sorry if I am wrong. But as I notice, for PREDICTIONS_FUTURE you take data from X_train(-n_future:0). That means the last x values of the X_train data. Even though the date index changes to the future, there are no new predicted values for the future. If your n_future is 5 and you check the last 5 values of PREDICTIONS_FUTURE and PREDICTION_TRAIN, both are the same.

@DataScienceGarage 4 years ago

@academicexpert1108 I will check it and try to answer you in more detail soon. I developed this code carefully, but I will double-check your concern.

@academicexpert1108 4 years ago

@DataScienceGarage OK, thanks a lot

@madhu1987ful 2 years ago

Also, I have a question: can we use the same scaler for the features and the target?

@yangtian1138 4 years ago

Thank you for your code and video Dr. Vytautas! I am just confused about the train_x, do you only include the first 4 features for training instead of all the 6 features?

@YeungLorentz 4 years ago

I guess something is wrong there...

@victoriarobinson9717 3 years ago

@YeungLorentz Out of curiosity, what do you think he should have done? I am currently struggling with my dataset, where I have just 2 features, and I can't determine whether my train matrix should include the index date column, just the two features (the one I want to predict and the feature that impacts it), or just the one feature that impacts it.

@janlippemeier6194 3 years ago

Perfect video. I could easily adapt it for predicting the power signal of an industrial machine. The provided Jupyter Notebook turned out to be especially valuable for me. Keep up the good work.

@DataScienceGarage 3 years ago

Thank you for watching, appreciate your feedback! :)

@dannous 3 years ago

Why do you replace commas in each cell? There are no commas in the CSV, and this is very CPU-intensive for long dataframes.

@dr.ahmedal-adaileh2901 3 years ago

Thanks for sharing. However, it would have been much more helpful if you had shown the code a bit more clearly ;) I mean a better zoom-in.

@ravibengeri1507 4 years ago

Hi Sir, what approach should we follow when the target variable follows a sigmoid, logistic, or S-curve with respect to time? Should we still apply time series methods? If so, which algorithm should we choose, given that multiple variables affect the target variable?

@harshaannehal9919 3 years ago

First of all, it is an amazing video. I loved how you explained everything. I was wondering if you could show how to display the data in TensorBoard.

@machib77 4 years ago

Can you make an example using returns? I think it would be interesting if you could model something that tries to correctly guess the direction of the future price (whether it's going up or down, rather than by how much).
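As a rough sketch of what that input and label preparation could look like (made-up prices, hypothetical helper names, and no claim about how well such a model would perform):

```python
def simple_returns(prices):
    """Relative change from each price to the next."""
    return [(curr - prev) / prev for prev, curr in zip(prices, prices[1:])]

def direction_labels(returns):
    """1 if the price went up, 0 if it fell or stayed flat."""
    return [1 if r > 0 else 0 for r in returns]

prices = [100.0, 102.0, 101.0, 101.0, 105.0]   # made-up closing prices
returns = simple_returns(prices)
labels = direction_labels(returns)
```

With labels like these, the task becomes binary classification, so the output layer and loss would need to change accordingly.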

@chadgregory9037 3 years ago

if you're trying to guess directions, you are going about markets the WRONG way. You should be learning how to manage risk, and participate without losing. If you can play the game and not lose, you are guaranteed to win. Just don't lose.

@edu1113 4 years ago

A silly question, but can you predict past prices as well with this model? For example, I want to know the values of missing data in the past. Can I estimate the values using the same model? I'm working on gap-filling for time series data.

@chadgregory9037 3 years ago

You should look into GANs and VAEs for missing data.

@awaisumar5125 4 years ago

How did we select 64 inputs? Can you please explain it a little more?

@NikhilKumar-md2fb 4 years ago

What theme are you using for Jupyter Notebook? It looks better than the default.

@DataScienceGarage 4 years ago

It is Jupyter Notebook in Visual Code.

@jackflavell4191 3 years ago

A bunch of stuff is deprecated, and when I fix it the training data is somehow set to the 70s. Anyone else get this?

@raymondcruzin6213 4 years ago

hello, I would like to see how it can be represented in a diagram or flow diagram. thank you

@gaurangsharma9819 4 years ago

Thanks! BTW, how are you converting the list to a 2D array? I am stuck at that particular place, right before model preparation.

@jadonsumit786 11 months ago

There is a problem: you have already included the labels in X_train, and you are also repeating the same data in y_train. In short, you are just trying to make relations between the 3D and 2D arrays with all the same scaled values.

@fliederblumen1843 4 years ago

hello, can we do multivariate LSTM prediction without using the target feature as input feature? that means here remove the column of 'open' as input variable, only using 'high, low, close, adj close' as the 4 features to predict 'open' with LSTM?

@chadgregory9037 3 years ago

Lol, you can put whatever the fuk u want into a network, even the number of times your cat meows. Who knows, maybe there is a quantum connection between your cat and the market.

@hubertnguyen8855 3 years ago

Hi, I think there is one mistake in your code of y_train. It should be: y_train.append(training_set_scaled[i:i+n_future,0]) not [i+n_future: n_future:1,0]. Thanks, correct and explain if I'm wrong!!!

@DataScienceGarage 3 years ago

Hi! Thank you very much for the comment! I will check your suggestion, it's worth trying, and I'll come back with an answer.

@HealthyFoodBae_ 4 years ago

Could we create a Real Time Time Series program using LSTM?

@crypto_peng 4 years ago

How do you build those slides (PPT)? Pretty useful sharing, thanks.

@saidtojiboev9673 2 years ago

Why did you start with 5 features in the beginning but then use only 4?
for i in range(n_past, len(training_set_scaled) - n_future + 1):
    X_train.append(training_set_scaled[i - n_past:i, 0:dataset_train.shape[1] - 1])
    y_train.append(training_set_scaled[i + n_future - 1:i + n_future, 0])
Here, for X_train, you took 4 features with the code 0:dataset_train.shape[1] - 1. Could you please tell me why you dropped 1 feature?
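The effect of that column slice can be checked on a dummy array. The end index of a slice is exclusive, so `0:shape[1] - 1` selects every column except the last one. The array below is made up, and why the last column was left out is not stated in this thread.

```python
import numpy as np

scaled = np.zeros((100, 5))      # made-up: 100 rows, 5 feature columns
n_past = 90
i = n_past

window_as_quoted = scaled[i - n_past:i, 0:scaled.shape[1] - 1]   # 4 columns
window_all_columns = scaled[i - n_past:i, 0:scaled.shape[1]]     # 5 columns
```

If all five features are wanted, the second slice keeps them, and the model's input shape has to match.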

@raymondcruzin6213 4 years ago

Hi, I just noticed that Volume was dropped as a feature in your model. Can you compare univariate vs. multivariate? Thanks for the video... :)

@jeffpicapiedra 3 years ago

Really nice video, it helped me learn so much about the implementation process of the LSTM. At the end you say something about feeding the model with more stocks to try to improve the performance. I'm wondering how you could reshape the data to add more stocks with the same multiple features. I'm actually doing work on this topic, and this advice would help me so much. Thank you.

@DataScienceGarage 3 years ago

Thanks a lot for such feedback!

@HealthyFoodBae_ 4 years ago

Thank you! How do we show output of actual/predicted values

@indzy 3 years ago

If you understand how it works, you've already got that.

@raymondcruzin6213 4 years ago

Hi Dr., I am currently working on a similar project and want to implement a multivariate LSTM for predicting CPU resource utilization, memory usage, running processes, etc. Can you make a multistep output that predicts Open, Close, High and Volume together? Great work!

@DataScienceGarage 4 years ago

Hi! I am going to do another tutorial with multistep output. I am not able to do it now, but I hope it will be released in the near future.

@raymondcruzin6213 4 years ago

@DataScienceGarage My data set has similar timestamps, with a 3-minute window. How can I adapt your code for it?

@raymondcruzin6213 4 years ago

@DataScienceGarage In your code, or generally, how do we tune or improve our prediction methods with a multivariate LSTM?

@madhu1987ful 2 years ago

Excellent well structured video. Can u also make a video on using darts library for multivariate, multiple time series forecasting?

@bintangsaputra4684 2 years ago

Hi sir, thank you for the tutorial, it really helps me with my thesis! However, I get an error when converting the prediction (y_pred_future) into a dataframe. Here is the error: ValueError: Shape of passed values is (13, 5), indices imply (13, 1). I still cannot find the answer. I hope you can help me, thanks!

@gomes8335 1 year ago

The model was trained to have 5 columns and you are only providing it one. So that's the problem.

@zainhajhamad4725 4 years ago

Hey Dr., I want to ask: what platform are you working in?

@pratikgehlot9516 3 years ago

Visual studio

@sanjay-g9o 4 years ago

Thanks Dr. Vytautas, nicely explained

@DataScienceGarage 4 years ago

Thanks for feedback! :)

@hamidgholizadegan1285 4 years ago

Thanks for sharing your great video. I am using multivariate time series prediction with LSTM and multiple features to predict the next 90 days. Before fitting the LSTM I scale the values, and after prediction my prediction array is a 3-dimensional array in Python, say of shape (90, 14, 1); call it forecast. When I want to invert the scaling to get the actual numbers, I get an error about the 3-dimensional array, because the inverse transform takes a 2-dimensional array. What should I do? The shape of my forecast is (90, 14, 1). I got this error: ValueError: Found array with dim 3. Estimator expected

@chadgregory9037 3 years ago

Is it predicting 14 days forward, for each day, for 90 days? If so, you need to melt or un-melt it, or whatever they call it.
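One common way around the 2-D requirement is to flatten the forecast to a single column, invert the scaling, and restore the original shape. The sketch below assumes the scaler was fitted on a single target column. The mean and standard deviation are made up, and the arithmetic stands in for a call such as `scaler.inverse_transform(flat)`.

```python
import numpy as np

mean, std = 50.0, 10.0                      # made-up statistics of the target column
forecast_scaled = np.zeros((90, 14, 1))     # same shape as in the question
forecast_scaled[0, 0, 0] = 1.5

flat = forecast_scaled.reshape(-1, 1)       # (1260, 1): a 2-D, single-column layout
flat_restored = flat * std + mean           # stand-in for the scaler's inverse transform
forecast = flat_restored.reshape(forecast_scaled.shape)
```

If the scaler was fitted on several columns instead, the flattened array would need that many columns, which this sketch does not cover.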

@ayommuharrom7489 4 years ago

Thanks, good tutorial. How about the Echo State Network method? Maybe you could implement it and make a comparison.

@laithhaleem5940 2 years ago

Thanks for this tutorial. But it's not multivariate prediction. You just extracted the Open price from the dataset with this code: sc_predict.fit_transform(training_set[:, 0:1]). So you actually depended on just one variable.

@serdilanlunlu1419 2 years ago

Great tutorial, thanks a lot sir !!

@mehedihossain6312 4 years ago

How do I check accuracy?

@marinakurmanova-368 4 years ago

for regression check this approach: alphascientist.com/model_evaluation.html
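For a regression target such as a price, error metrics are the usual substitute for "accuracy". A minimal sketch of two standard ones, mean absolute error and root mean squared error, on made-up numbers (this is not taken from the linked page):

```python
import math

def mae(actual, predicted):
    """Mean absolute error: average size of the miss."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error: penalises large misses more heavily."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))

actual = [10.0, 12.0, 14.0]       # made-up true values
predicted = [11.0, 12.0, 12.0]    # made-up predictions

mae_value = mae(actual, predicted)
rmse_value = rmse(actual, predicted)
```

Both should be computed on data the model did not see during training, after converting predictions back to the original units.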

@jacknaneek1681 3 years ago

Can't read the code on the screen. Too small.

@jacknaneek1681 3 years ago

I see the github link with all code. Thank you!

@titlov123 4 years ago

please post the code

@DataScienceGarage 4 years ago

Today or tomorrow...

@DataScienceGarage 4 years ago

Posted in description, please check.

@j.k.priyadharshini9753 5 months ago

deep learning models are like solving a mystery.🥴🥴

@dannous 3 years ago

You define datelist_future_ but it is never used.

@zainhajhamad4725 4 years ago

Such an amazing video! Thank you!!

@DataScienceGarage 4 years ago

Thanks!

@zainhajhamad4725 4 years ago

@DataScienceGarage Hello Dr., when we start training, we get an error: "OverflowError: cannot convert float infinity to integer". What could we do to solve this problem? I appreciate your help, thank you🙏

@jayjohnnyjay8792 4 years ago

thanks for the video!

@kanshkansh6504 4 years ago

Thanks for the tutorial

@DataScienceGarage 4 years ago

Thank you for watching!

@DataScienceGarage 4 years ago

Sorry that the video quality is not the best, I had some issues with the recorder. Full code with explanations on Github: github.com/vb100/multivariate-lstm/blob/master/LSTM_model_stocks.ipynb. Thanks for watching!

@rajeshmourya4624 3 years ago

Thank you!!!

@chadgregory9037 3 years ago

Lol IT LITERALLY LEARNED MARKOVIAN GUESSES HAHA!!!

@imarticuslearning6071 3 years ago

It's a very nice video, but it's not multivariate. You are extracting only Google's stock price and predicting it, so how can you say it is multivariate? It's purely univariate.

@DataScienceGarage 3 years ago

Hey! A Multivariate time series has more than one time-dependent variable. Each variable depends not only on its past values but also has some dependency on other variables. This dependency is used for forecasting future values. Source: www.analyticsvidhya.com/blog/2018/09/multivariate-time-series-guide-forecasting-modeling-python-codes/

@tomanderson3608 2 years ago

Can anyone explain what is happening / how this code block works?
# train_gold_scaled is my equivalent of training_set_scaled
# train_gold_df is my equivalent of dataset_train
for i in range(n_past, len(train_gold_scaled) - n_future + 1):
    X_train.append(train_gold_scaled[i - n_past:i, 0:train_gold_df.shape[1] - 1])
    y_train.append(train_gold_scaled[i + n_future - 1: i + n_future, 0])
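A walk-through of that loop on a tiny made-up array may help. Each pass takes the n_past rows before row i as one input sample and pairs it with column 0 of the row that lies n_future - 1 rows after row i. The array and sizes below are chosen only so the result is easy to inspect.

```python
import numpy as np

# Made-up scaled data: 8 rows, 3 columns. Column 0 is the target.
data = np.arange(24, dtype=float).reshape(8, 3)
n_past, n_future = 3, 2

X_train, y_train = [], []
for i in range(n_past, len(data) - n_future + 1):
    # Input: rows i - n_past .. i - 1, every column except the last.
    X_train.append(data[i - n_past:i, 0:data.shape[1] - 1])
    # Label: column 0 of row i + n_future - 1, kept as a length-1 slice.
    y_train.append(data[i + n_future - 1:i + n_future, 0])

X_train, y_train = np.array(X_train), np.array(y_train)
```

Here i runs over 3, 4, 5, 6, giving X_train of shape (4, 3, 2) and y_train of shape (4, 1). The first sample covers rows 0 to 2 and its label comes from row 4.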

@dutchy5752 2 years ago

He just deleted my comment. Train and future overlap, as you can see in the code below. There is no future prediction in this code.
# Perform predictions
predictions_future = model.predict(X_train[-n_future:])
predictions_train = model.predict(X_train[n_past:])