Real Project Notebooks (Google Colab)

These intermediate-to-advanced projects use real datasets and are designed to be copied, pasted, and run in Google Colab.

Real Dataset Policy

  • Every notebook uses a real public dataset.
  • Every notebook includes a Dataset Source and Download Instructions block.
  • Each block includes: official source URL, auto-download command, and manual fallback instructions.
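The three parts of such a block (source URL, auto-download, manual fallback) can be sketched as follows. This is only an illustration of the policy, not the code the notebooks actually ship: the function name `fetch_dataset`, the constant names, and the local path are hypothetical, and the Tiny Shakespeare URL is used purely as an example source.

```python
from pathlib import Path
import urllib.error
import urllib.request

# Hypothetical values for illustration; each notebook would state its own.
SOURCE_URL = (
    "https://raw.githubusercontent.com/karpathy/char-rnn/"
    "master/data/tinyshakespeare/input.txt"
)
LOCAL_PATH = Path("data/tinyshakespeare.txt")


def fetch_dataset(url: str, dest: Path, timeout: float = 30.0) -> bool:
    """Try the automatic download; return True if `dest` is available."""
    if dest.exists():
        # Already downloaded, or uploaded by hand via the manual fallback.
        return True
    dest.parent.mkdir(parents=True, exist_ok=True)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            payload = response.read()
        dest.write_bytes(payload)
        return True
    except (urllib.error.URLError, OSError) as err:
        # Manual fallback: tell the reader exactly what to do instead.
        print(f"Automatic download failed: {err}")
        print("Manual fallback:")
        print(f"  1. Download the file from {url}")
        print(f"  2. Upload it to the Colab session as {dest}")
        return False
```

In a notebook cell this would be called as `fetch_dataset(SOURCE_URL, LOCAL_PATH)`. Because the function returns early when the file already exists, re-running the cell after a manual upload succeeds without touching the network.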

Notebook Portfolio

| Notebook | Focus | Dataset | Open in Colab | GitHub |
| --- | --- | --- | --- | --- |
| 01-image-classification-fmnist-cnn.ipynb | Image classification baseline with NeuroGebra ModelBuilder | Fashion-MNIST (Zalando) | Open | View |
| 02-image-classification-observatory-pro.ipynb | Image classifier diagnostics with AdaptiveLogger, health warnings, tiered storage, dashboard | Fashion-MNIST (Zalando) | Open | View |
| 03-gan-mnist-baseline.ipynb | GAN baseline with NeuroGebra activation bridge into PyTorch | MNIST | Open | View |
| 04-gan-mode-collapse-diagnostics.ipynb | GAN instability and mode-collapse diagnostics with Observatory Pro | MNIST | Open | View |
| 05-diffusion-image-denoising-baseline.ipynb | Diffusion-inspired noise prediction baseline | CIFAR-10 | Open | View |
| 06-diffusion-math-scheduler-deep-dive.ipynb | Diffusion schedule math, forward-process visualization, reproducibility fingerprint | CelebA (primary), CIFAR-10 fallback | Open | View |
| 07-core-nlp-spam-classifier.ipynb | Core NLP spam detection with NeuroGebra ModelBuilder | SMS Spam Collection (UCI) | Open | View |
| 08-core-nlp-sentiment-classifier.ipynb | Core NLP sentiment classification with bridge-assisted training and adaptive warnings | IMDb Large Movie Review | Open | View |
| 09-small-language-model-charlm-from-scratch.ipynb | Small language model from scratch (character-level) | Tiny Shakespeare | Open | View |
| 10-small-language-model-charlm-tuning.ipynb | Character LM tuning and fingerprint-based comparison | WikiText-2 | Open | View |
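
Notebook 06 lists CelebA as its primary dataset with CIFAR-10 as a fallback. One way such a primary/fallback choice could be expressed is to try a list of loaders in order and keep the first that succeeds. The helper below is a hypothetical sketch of that pattern, not the notebook's actual code; the name `load_first_available` is an assumption.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def load_first_available(
    loaders: Sequence[Tuple[str, Callable[[], T]]],
) -> Tuple[str, T]:
    """Return (name, dataset) from the first loader that succeeds."""
    errors: List[str] = []
    for name, loader in loaders:
        try:
            return name, loader()
        except Exception as err:  # broad on purpose: any failure triggers the fallback
            errors.append(f"{name}: {err}")
            print(f"Could not load {name} ({err}); trying the next option.")
    # Every option failed, so surface all collected reasons at once.
    raise RuntimeError("No dataset could be loaded: " + "; ".join(errors))
```

Assuming the notebook uses torchvision, the loaders might be `lambda: torchvision.datasets.CelebA("data", download=True)` followed by `lambda: torchvision.datasets.CIFAR10("data", download=True)`. Returning the name alongside the data lets later cells record which dataset was actually used.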

Dataset Sources

  • Fashion-MNIST: https://github.com/zalandoresearch/fashion-mnist
  • MNIST: http://yann.lecun.com/exdb/mnist/
  • CIFAR-10: https://www.cs.toronto.edu/~kriz/cifar.html
  • CelebA: https://mmlab.ie.cuhk.edu.hk/projects/CelebA.html
  • SMS Spam Collection: https://archive.ics.uci.edu/dataset/228/sms+spam+collection
  • IMDb Reviews: https://ai.stanford.edu/~amaas/data/sentiment/
  • Tiny Shakespeare: https://raw.githubusercontent.com/karpathy/char-rnn/master/data/tinyshakespeare/input.txt
  • WikiText-2: https://blog.salesforceairesearch.com/the-wikitext-long-term-dependency-language-modeling-dataset/