Pay Attention

Exploring and annotating the latest research in Generative AI

Binary Classification in Supervised Learning

Date Created:

Using a synthetic dataset of points clustered around two centers, I build a basic neural network and introduce L1 and L2 regularization, loss functions, and gradient descent.

Read Here

RNNs, GRUs, and LSTMs

Date Created:

I simulate a noisy sine wave and then build RNN, GRU, and LSTM architectures from scratch to denoise it.

Read Here

Transformer Architecture

Date Created:

After discussing how the Query, Key, and Value matrices are initialized, I explain the attention process, using a feedforward network to learn a sawtooth wave.

Read Here