Real Project Notebooks (Google Colab)
These intermediate-to-advanced projects use real datasets and are designed to be copied, pasted, and run in Google Colab.
Real Dataset Policy
- Every notebook uses a real public dataset.
- Every notebook includes a Dataset Source and Download Instructions block.
- Each block includes: official source URL, auto-download command, and manual fallback instructions.
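The auto-download and manual-fallback pattern described above can be sketched as a small helper. This is only an illustration under stated assumptions, not the code the notebooks actually ship: the function name `fetch_dataset` and its parameters are hypothetical, and it uses only the Python standard library.

```python
import urllib.error
import urllib.request
from pathlib import Path


def fetch_dataset(url: str, dest: Path, manual_hint: str) -> bool:
    """Try to download `url` to `dest`; on failure, print manual instructions.

    Hypothetical helper for illustration only. Returns True if `dest`
    is available afterwards and False if the auto-download failed.
    """
    dest = Path(dest)
    if dest.exists():
        # A previous run already cached the file, so skip the download.
        return True

    dest.parent.mkdir(parents=True, exist_ok=True)
    # Download to a temporary name so a partial file is never mistaken
    # for a complete dataset.
    tmp = dest.with_name(dest.name + ".part")
    try:
        urllib.request.urlretrieve(url, tmp)
        tmp.replace(dest)
        return True
    except (urllib.error.URLError, OSError) as exc:
        tmp.unlink(missing_ok=True)
        print(f"Auto-download failed: {exc}")
        print(f"Manual fallback: {manual_hint}")
        return False
```

In a notebook, such a helper might be called with one of the official source URLs from Dataset Sources, a local path such as `Path("data/input.txt")`, and a `manual_hint` that tells the reader to download the file in a browser and upload it to the Colab session. Checking the return value lets later cells stop early instead of failing on a missing file.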
Notebook Portfolio
| Notebook | Focus | Dataset | Open in Colab | GitHub |
|---|---|---|---|---|
| 01-image-classification-fmnist-cnn.ipynb | Image classification baseline with NeuroGebra ModelBuilder | Fashion-MNIST (Zalando) | Open | View |
| 02-image-classification-observatory-pro.ipynb | Image classifier diagnostics with AdaptiveLogger, health warnings, tiered storage, dashboard | Fashion-MNIST (Zalando) | Open | View |
| 03-gan-mnist-baseline.ipynb | GAN baseline with NeuroGebra activation bridge into PyTorch | MNIST | Open | View |
| 04-gan-mode-collapse-diagnostics.ipynb | GAN instability and mode-collapse diagnostics with Observatory Pro | MNIST | Open | View |
| 05-diffusion-image-denoising-baseline.ipynb | Diffusion-inspired noise prediction baseline | CIFAR-10 | Open | View |
| 06-diffusion-math-scheduler-deep-dive.ipynb | Diffusion schedule math, forward-process visualization, reproducibility fingerprint | CelebA (primary), CIFAR-10 fallback | Open | View |
| 07-core-nlp-spam-classifier.ipynb | Core NLP spam detection with NeuroGebra ModelBuilder | SMS Spam Collection (UCI) | Open | View |
| 08-core-nlp-sentiment-classifier.ipynb | Core NLP sentiment classification with bridge-assisted training and adaptive warnings | IMDb Large Movie Review | Open | View |
| 09-small-language-model-charlm-from-scratch.ipynb | Small language model from scratch (character-level) | Tiny Shakespeare | Open | View |
| 10-small-language-model-charlm-tuning.ipynb | Character LM tuning and fingerprint-based comparison | WikiText-2 | Open | View |
Dataset Sources
- Fashion-MNIST: https://github.com/zalandoresearch/fashion-mnist
- MNIST: http://yann.lecun.com/exdb/mnist/
- CIFAR-10: https://www.cs.toronto.edu/~kriz/cifar.html
- CelebA: https://mmlab.ie.cuhk.edu.hk/projects/CelebA.html
- SMS Spam Collection: https://archive.ics.uci.edu/dataset/228/sms+spam+collection
- IMDb Reviews: https://ai.stanford.edu/~amaas/data/sentiment/
- Tiny Shakespeare: https://raw.githubusercontent.com/karpathy/char-rnn/master/data/tinyshakespeare/input.txt
- WikiText-2: https://blog.salesforceairesearch.com/the-wikitext-long-term-dependency-language-modeling-dataset/