Pay Attention

Binary Classification in Supervised Learning

Date Created: May 9, 2024

Using a synthetic dataset clustered around two centers, I build a basic neural network and introduce L1 and L2 regularization, loss functions, and gradient descent.

Read Here

RNNs, GRUs, and LSTMs

Date Created: May 15, 2024

I simulate a noisy sine wave, then build RNN, GRU, and LSTM architectures from scratch and use them to denoise it.

Read Here

Transformer Architecture

Date Created: May 25, 2024

After discussing how the Query, Key, and Value matrices are initialized, I explain the attention process and use a feedforward network to learn a saw wave.

Read Here