LSTM
Long Short-Term Memory — a type of recurrent neural network designed to learn long-range dependencies in sequential data.
LSTM networks, introduced in 1997 by Hochreiter and Schmidhuber, mitigated the vanishing gradient problem that plagued standard RNNs: gradients in ordinary RNNs shrink exponentially as they propagate back through time, while the LSTM's gated memory cell provides a path along which they flow largely unchanged. This lets the network remember important information over many time steps and forget irrelevant details.
An LSTM cell maintains a persistent cell state and uses three gates (input, forget, output) to control what information is written to it, what is retained, and what is exposed as the hidden state. This design made LSTMs the dominant architecture for sequence tasks throughout the 2010s.
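The gating mechanism can be sketched as a single time step in NumPy. This is a minimal illustration, not a production implementation; the weight layout (four gate blocks stacked in one matrix) and all variable names are assumptions chosen for clarity.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step (illustrative sketch).

    x       : input vector, shape (n_in,)
    h_prev  : previous hidden state, shape (n_hid,)
    c_prev  : previous cell state, shape (n_hid,)
    W       : weights, shape (4*n_hid, n_in + n_hid), gate blocks
              stacked as [input, forget, output, candidate]
    b       : bias, shape (4*n_hid,)
    """
    n_hid = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b
    i = sigmoid(z[:n_hid])              # input gate: how much new info to write
    f = sigmoid(z[n_hid:2 * n_hid])     # forget gate: how much old state to keep
    o = sigmoid(z[2 * n_hid:3 * n_hid]) # output gate: how much state to expose
    g = np.tanh(z[3 * n_hid:])          # candidate values to write to the cell
    c = f * c_prev + i * g              # new cell state: gated mix of old and new
    h = o * np.tanh(c)                  # new hidden state
    return h, c

# Tiny usage example with random weights (hypothetical sizes).
rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
W = rng.standard_normal((4 * n_hid, n_in + n_hid)) * 0.1
b = np.zeros(4 * n_hid)
h, c = lstm_step(rng.standard_normal(n_in), np.zeros(n_hid), np.zeros(n_hid), W, b)
```

The key line is the cell-state update `c = f * c_prev + i * g`: because the old state is multiplied elementwise rather than pushed through a squashing nonlinearity, gradients can flow back through many steps without vanishing when the forget gate stays near 1.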
LSTMs powered machine translation (Google Translate pre-2016), speech recognition, and early language models. They've been largely displaced by transformers, but remain useful in specific low-latency or memory-constrained applications.