Deep Learning Recurrent Neural Networks In Python Lstm Gru And More Rnn Machine Learning Architectures In Python And Theano Machine Learning In Python ðŊ
They can remember information for hundreds of steps, making them ideal for text generation, speech recognition, and complex time series. GRU (Gated Recurrent Unit) GRUs are a simpler, faster alternative to LSTMs. They merge the forget and input gates into a single "update gate" and combine the cell state with the hidden state. GRUs perform similarly to LSTMs on many tasks but with fewer parameters.
In Python (with Theano-style tensors), a naive implementation looks like:
h_t = T.tanh(T.dot(x_t, W_xh) + T.dot(h_prev, W_hh) + b_h) They can remember information for hundreds of steps,
h_t = tanh(W_x * x_t + W_h * h_t-1 + b)
import numpy as np from keras.models import Sequential from keras.layers import GRU, Dense def generate_sine_wave(seq_length, num_samples): X, y = [], [] for _ in range(num_samples): start = np.random.uniform(0, 4*np.pi) seq = np.sin(np.linspace(start, start + seq_length, seq_length + 1)) X.append(seq[:-1].reshape(-1, 1)) y.append(seq[-1]) return np.array(X), np.array(y) GRUs perform similarly to LSTMs on many tasks
from keras.models import Sequential from keras.layers import LSTM, GRU, SimpleRNN, Dense, Embedding from keras.preprocessing import sequence max_features = 20000 maxlen = 100 # truncate reviews to 100 words batch_size = 32 Build model model = Sequential() model.add(Embedding(max_features, 128, input_length=maxlen)) model.add(LSTM(128, dropout=0.2, recurrent_dropout=0.2)) # or GRU(128) model.add(Dense(1, activation='sigmoid')) Compile (Theano backend) model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy']) Train model.fit(x_train, y_train, batch_size=batch_size, epochs=5, validation_data=(x_val, y_val))
| Architecture | # Gates | Cell State | Best for | |--------------|---------|------------|-----------| | Simple RNN | 0 | No | Very short sequences | | LSTM | 3 | Yes | Long dependencies, complex data | | GRU | 2 | No | Smaller datasets, faster training | While Theano is no longer actively developed (it was a pioneer, but most have moved to TensorFlow/PyTorch), many legacy systems and research codebases still use it. Here's how you'd build an LSTM for sentiment analysis using Theano with the Keras 1.x API: While standard neural networks treat each input as
Recurrent Neural Networks (RNNs) are the powerhouse behind most modern breakthroughs in sequence dataâthink speech recognition, machine translation, time series forecasting, and even music generation. While standard neural networks treat each input as independent, RNNs have a "memory" that captures information from previous steps.