Why Transformers Fail at Time Series: Why Do Simple Models Beat Transformers at TSF?

236 views

Devansh: Chocolate Milk Cult Leader

8 months ago

When Transformers were first gaining attention, the world lost its mind over their potential for Time Series Forecasting (TSF). Unfortunately, Transformers never quite lived up to the hype. So, what went wrong?
To quote the authors of, "TSMixer: An All-MLP Architecture for Time Series Forecasting"- "The natural intuition is that multivariate models, such as those based on Transformer architectures, should be more effective than univariate models due to their ability to leverage cross-variate information. However, Zeng et al. (2023) revealed that this is not always the case - Transformer-based models can indeed be significantly worse than simple univariate temporal linear models on many commonly used forecasting benchmarks. The multivariate models seem to suffer from overfitting especially when the target time series is not correlated with other covariates."
The problems for Transformers don't end here. The authors of "Are Transformers Effective for Time Series Forecasting?" demonstrated that Transformer models could be beaten by a very simple linear model. When analyzing why Transformers failed, they pointed to multi-head self-attention as a potential reason.
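To make the "very simple linear model" idea concrete, here is a minimal sketch of the concept (my own illustration, not the paper's code): forecast each series with a single linear map from the lookback window to the forecast horizon, fit by least squares. The function names and the toy trend series are my own choices.

```python
import numpy as np

def fit_linear_forecaster(series, lookback, horizon):
    """Fit one linear map W: R^lookback -> R^horizon on sliding windows."""
    X, Y = [], []
    for t in range(len(series) - lookback - horizon + 1):
        X.append(series[t:t + lookback])                      # input window
        Y.append(series[t + lookback:t + lookback + horizon]) # target window
    X = np.asarray(X)   # (n_windows, lookback)
    Y = np.asarray(Y)   # (n_windows, horizon)
    # Append a bias column and solve the least-squares problem.
    Xb = np.hstack([X, np.ones((len(X), 1))])
    W, *_ = np.linalg.lstsq(Xb, Y, rcond=None)
    return W

def predict(W, window):
    """Forecast the next `horizon` values from the last lookback window."""
    return np.concatenate([window, [1.0]]) @ W

# Usage: on a noiseless linear trend, the fit is recovered almost exactly.
series = np.arange(100, dtype=float)
W = fit_linear_forecaster(series, lookback=8, horizon=4)
pred = predict(W, series[-8:])   # close to [100, 101, 102, 103]
```

A model this simple has almost no capacity to overfit, which is exactly the property that lets it beat much larger Transformers on several standard TSF benchmarks.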
"More importantly, the main working power of the Transformer architecture is from its multi-head self-attention mechanism, which has a remarkable capability of extracting semantic correlations between paired elements in a long sequence (e.g., words in texts or 2D patches in images), and this procedure is permutation-invariant, i.e., regardless of the order. However, for time series analysis, we are mainly interested in modeling the temporal dynamics among a continuous set of points, wherein the order itself often plays the most crucial role."
To learn more about their research and Transformers in TSF tasks, I would suggest reading the article below.
Reach out to me
Use the links below to check out my other content, learn more about tutoring, reach out to me about projects, or just to say hi.
Small Snippets about Tech, AI and Machine Learning over here
AI Newsletter- artificialintelligencemadesim...
My grandma’s favorite Tech Newsletter- codinginterviewsmadesimple.su...
Check out my other articles on Medium: rb.gy/zn1aiu
My YouTube: rb.gy/88iwdd
Reach out to me on LinkedIn. Let’s connect: rb.gy/m5ok2y
My Instagram: rb.gy/gmvuy9
My Twitter: / machine01776819
Are Transformers effective for TSF- artificialintelligencemadesim...
For more details, sign up for my free AI newsletter, AI Made Simple- artificialintelligencemadesim...
If you want to take your career to the next level, use the discount below for 20% off a 1-year subscription to my premium tech publication, Tech Made Simple.
Using this discount drops the prices:
800 INR (10 USD) → 640 INR (8 USD) per month
8000 INR (100 USD) → 6400 INR (80 USD) per year (533 INR/month)
Get 20% off for 1 year- codinginterviewsmadesimple.su...
Catch y'all soon. Stay Woke and Go Kill all

Comments: 1

@andreszapata4972 (1 month ago)
I'm trying to develop a simple transformer architecture neural network to predict a 'line', but it doesn't perform well on data outside of the training set. I'm not sure what to do. Sometimes the network fits the training data well, but it's unable to generalize. I also tried using an LSTM, but the same issue occurs. What can I do? Keep in mind that I want to use the network to train it with different data, so the 'line' is my starting point