ML Projects
Train deep learning models end-to-end on small datasets in a guided, GPU-powered notebook. Run cells, inspect outputs, and reveal solutions when you're stuck.
Every project runs on a dedicated GPU kernel — Premium only
MNIST Digit Classifier (PRO)
Train a small convolutional neural network (CNN) to classify handwritten digits end-to-end: download MNIST, build DataLoaders, define the CNN, train it on a GPU, and evaluate with accuracy and a confusion matrix. Variables persist across cells, so you download once and use the data everywhere.
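The evaluation step can be sketched in plain Python. This is a minimal illustration of accuracy and a confusion matrix, not the project's own code; the helper names and the 3-class toy labels are hypothetical stand-ins for real MNIST predictions.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count predictions: rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal (correct predictions)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


# Hypothetical labels for a 3-class toy problem (not real MNIST output).
y_true = [0, 1, 2, 2, 1, 0, 2]
y_pred = [0, 1, 2, 1, 1, 0, 2]

cm = confusion_matrix(y_true, y_pred, num_classes=3)
acc = accuracy(cm)
print(cm)
print(f"accuracy = {acc:.3f}")
```

Off-diagonal cells show which classes get confused with each other, which a single accuracy number hides.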
Digits Classifier with a PyTorch MLP (PRO)
A fast, fully in-memory project: load scikit-learn's 8x8 `digits` dataset, build a tiny multilayer perceptron in PyTorch, train it on the GPU, and evaluate. Great for seeing the full train/eval loop without any dataset download.
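A rough sketch of that train/eval loop, assuming `torch` and `scikit-learn` are installed. The layer sizes, learning rate, epoch count, and full-batch updates are illustrative choices, not the project's actual settings.

```python
import torch
from torch import nn
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

torch.manual_seed(0)
device = "cuda" if torch.cuda.is_available() else "cpu"

# 1797 samples, 64 features (flattened 8x8 images), pixel values in 0..16.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X / 16.0, y, test_size=0.2, random_state=0, stratify=y
)

X_train = torch.tensor(X_train, dtype=torch.float32, device=device)
X_test = torch.tensor(X_test, dtype=torch.float32, device=device)
y_train = torch.tensor(y_train, dtype=torch.long, device=device)
y_test = torch.tensor(y_test, dtype=torch.long, device=device)

# Assumed architecture: one hidden layer of 32 units.
model = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 10)).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

# Full-batch training for brevity; the dataset is small enough to fit at once.
for epoch in range(50):
    model.train()
    optimizer.zero_grad()
    loss = loss_fn(model(X_train), y_train)
    loss.backward()
    optimizer.step()

model.eval()
with torch.no_grad():
    test_logits = model(X_test)
    test_acc = (test_logits.argmax(dim=1) == y_test).float().mean().item()

print(f"final train loss = {loss.item():.4f}, test accuracy = {test_acc:.3f}")
```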
CIFAR-10 Image Classifier (PRO)
Train a small convolutional neural network to classify 32x32 color images across 10 classes (airplane, automobile, bird, cat, ...). Download CIFAR-10, build DataLoaders, define a CNN with batch normalization, train it on a GPU, and evaluate with accuracy and a confusion matrix. Variables persist across cells, so you download once and use the data everywhere.
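One way a small CNN with batch normalization might look for 32x32 RGB inputs. The class name `SmallCNN`, the channel counts, and the layer layout are assumptions for illustration, not the project's architecture; a random batch stands in for CIFAR-10 so nothing is downloaded.

```python
import torch
from torch import nn


class SmallCNN(nn.Module):
    """Hypothetical two-block CNN: conv -> batch norm -> ReLU -> max pool."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.MaxPool2d(2),  # 32x32 -> 16x16
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.MaxPool2d(2),  # 16x16 -> 8x8
        )
        self.classifier = nn.Linear(64 * 8 * 8, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(torch.flatten(x, start_dim=1))


model = SmallCNN()
model.eval()  # batch norm uses running statistics in eval mode
with torch.no_grad():
    # Random batch of 4 images standing in for real CIFAR-10 data.
    logits = model(torch.randn(4, 3, 32, 32))
print(logits.shape)
```

Each output row holds one score per class; `logits.argmax(dim=1)` gives the predicted class index.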
Fashion-MNIST Classifier (PRO)
Classify 28x28 grayscale clothing images (T-shirts, trousers, sneakers, and more) with a small CNN. Same end-to-end flow as MNIST, but the ten classes are real-world garments, so predictions read as human labels. Download once, train on a GPU, and evaluate.
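Turning predicted class indices into readable labels could look like the sketch below. The class order follows the standard Fashion-MNIST label mapping; the helper `to_labels` and the sample indices are hypothetical.

```python
# Standard Fashion-MNIST label order (index -> garment name).
FASHION_MNIST_CLASSES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]


def to_labels(pred_indices):
    """Map integer class predictions to human-readable names."""
    return [FASHION_MNIST_CLASSES[i] for i in pred_indices]


# Hypothetical model predictions for three images.
labels = to_labels([9, 1, 7])
print(labels)
```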
Kuzushiji-MNIST Classifier (PRO)
Classify 28x28 grayscale images of cursive Japanese hiragana (10 classes) with a small CNN. It is a drop-in, slightly harder cousin of MNIST with the same end-to-end flow: download, build DataLoaders, train on a GPU, and evaluate with accuracy and a confusion matrix.
Olivetti Faces Recognition (PRO)
A compact face-recognition project on scikit-learn's Olivetti dataset: 400 grayscale 64x64 photos of 40 people. Load it in memory, build a small CNN, train it on a GPU, and evaluate. No large download is needed; scikit-learn fetches the data for you.
Text Classification with TF-IDF (PRO)
Classic NLP without a neural network: classify forum posts into topics using TF-IDF features and a linear classifier. Fetch a few categories of the 20 Newsgroups dataset, turn raw text into TF-IDF vectors, train a Logistic Regression model, and evaluate with accuracy and a confusion matrix. Variables persist across cells, so you vectorize once and reuse the features everywhere.
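The vectorize, train, and evaluate steps can be sketched with scikit-learn. To avoid a download, this uses a tiny made-up corpus with two invented topics (`hockey`, `space`) in place of 20 Newsgroups; the texts and labels are purely illustrative.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix

# Toy stand-in corpus (hypothetical; the project uses 20 Newsgroups posts).
train_texts = [
    "the goalie made a great save in the hockey game",
    "the team scored in overtime on the ice",
    "the rocket launch reached orbit around the moon",
    "nasa plans a new satellite and shuttle mission",
]
train_labels = ["hockey", "hockey", "space", "space"]

test_texts = [
    "the hockey team won the game",
    "the satellite entered orbit",
]
test_labels = ["hockey", "space"]

# Fit the vocabulary and IDF weights on training text only.
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(train_texts)

clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, train_labels)

# Reuse the fitted vectorizer so test vectors share the same columns.
X_test = vectorizer.transform(test_texts)
preds = clf.predict(X_test)

acc = accuracy_score(test_labels, preds)
cm = confusion_matrix(test_labels, preds, labels=["hockey", "space"])
print(acc)
print(cm)
```

Calling `transform` rather than `fit_transform` on the test text keeps the feature space fixed and avoids leaking test vocabulary into training.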
Mini-GPT: A Char-Level Transformer (PRO)
Build a tiny GPT-style transformer from scratch (token and positional embeddings, masked multi-head self-attention, and a transformer block), then train it as a character-level language model on a small built-in corpus and generate new text. Small enough to train in seconds on a GPU, complete enough to show how a real LLM works under the hood.
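The masking idea can be shown with a single attention head in PyTorch. This is a simplified sketch, not the project's implementation: it omits multiple heads, embeddings, and the surrounding block, and the sizes `T` and `d` and the random projection matrices are arbitrary assumptions.

```python
import math
import torch

torch.manual_seed(0)
T, d = 5, 8  # sequence length and model width (arbitrary toy sizes)

x = torch.randn(T, d)  # stand-in for token + positional embeddings
Wq, Wk, Wv = (torch.randn(d, d) / math.sqrt(d) for _ in range(3))
q, k, v = x @ Wq, x @ Wk, x @ Wv

# Scaled dot-product scores between every pair of positions.
scores = (q @ k.T) / math.sqrt(d)

# Causal mask: position i may attend only to positions <= i.
mask = torch.tril(torch.ones(T, T, dtype=torch.bool))
scores = scores.masked_fill(~mask, float("-inf"))

weights = torch.softmax(scores, dim=-1)  # masked entries become exactly 0
out = weights @ v
print(weights)
```

Because future positions get zero weight, each output row depends only on the current and earlier characters, which is what lets the model be trained to predict the next character.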
Text Generation with GPT-2 (PRO)
Load the pretrained GPT-2 language model and its tokenizer from Hugging Face, tokenize a prompt, and generate text with greedy, sampling, and top-k / temperature decoding. See how decoding settings reshape the output, and inspect the model's next-token distribution. The model downloads into the GPU sandbox on first use and stays loaded for the session.
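How these decoding settings act on a next-token distribution can be sketched without loading a model. The logits below are invented for a 4-token vocabulary, and the helper functions are illustrative rather than Hugging Face APIs.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; lower temperature sharpens the peak."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def greedy(logits):
    """Pick the single highest-scoring token."""
    return max(range(len(logits)), key=lambda i: logits[i])


def top_k_probs(logits, k, temperature=1.0):
    """Keep the k highest-logit tokens, zero out the rest, renormalize."""
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    keep = set(order[:k])
    masked = [logits[i] if i in keep else float("-inf") for i in range(len(logits))]
    return softmax(masked, temperature)


def sample(probs, rng):
    """Draw one token index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical next-token logits over a 4-token vocabulary.
logits = [2.0, 1.0, 0.5, -1.0]
rng = random.Random(0)

greedy_token = greedy(logits)
probs_k2 = top_k_probs(logits, k=2, temperature=0.7)
sampled_token = sample(probs_k2, rng)
print(greedy_token, probs_k2, sampled_token)
```

Greedy decoding always returns the same token, while sampling varies from run to run; top-k and temperature control how much of the tail that sampling is allowed to explore.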
Sentiment Analysis with DistilBERT (PRO)
Use a DistilBERT model fine-tuned on SST-2 to classify the sentiment of sentences. Tokenize text with a WordPiece tokenizer, run batched inference on the GPU, read off positive/negative probabilities, and compare predictions against expected labels. The model downloads into the sandbox on first use and stays loaded for the session.
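WordPiece splits a word by greedy longest-match-first lookup, marking continuation pieces with `##`. The sketch below illustrates that idea on a tiny made-up vocabulary; the real tokenizer ships a much larger learned vocabulary and also handles normalization and special tokens, which are not shown here.

```python
def wordpiece_tokenize(word, vocab, unk="[UNK]"):
    """Greedy longest-match-first split of a single word into subword pieces."""
    tokens = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        # Try the longest remaining substring first, then shrink from the right.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation of a previous piece
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return [unk]  # no piece matches: the whole word is unknown
        tokens.append(piece)
        start = end
    return tokens


# Hypothetical toy vocabulary, for illustration only.
toy_vocab = {"un", "##believ", "##able", "good", "##ness"}

print(wordpiece_tokenize("unbelievable", toy_vocab))
print(wordpiece_tokenize("goodness", toy_vocab))
print(wordpiece_tokenize("xyz", toy_vocab))
```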