Scope
End-to-end pipeline: from dataset cleaning and curation to model design, training, and evaluation on cheese image data.
Weeks 1–14

Featured Research
A semester-long summer undergraduate research project guided by Dr. Xiang Ma, focused on building a high-quality cheese image dataset and training modern vision models to classify cheese varieties using PyTorch and university HPC resources.
ML Stack
VS Code, Python, Git/GitHub, Google Colab, and PyTorch as the core stack for development and experiments.

Dataset Focus
Multiple cheese datasets from Kaggle plus a small, curated dataset collected later in the project for evaluation.

HPC GPU
Model training on the university's high-performance computing GPU servers after prototyping in Colab.

The project follows a structured weekly plan: learning the tools, cleaning and understanding the cheese datasets, designing a model that fits the GPU cluster, and evaluating how well it generalizes.
This research project investigates how modern image classification models handle real-world cheese images. The work starts with tool ramp-up (VS Code, Python, Git, Colab), moves through classic datasets like MNIST to understand core ideas, and then applies those principles to cheese images sourced from Kaggle and a small custom dataset.
The emphasis is not just on accuracy, but on building a clean, reliable dataset and understanding where and why models fail.
Early weeks focus on building a solid foundation: using VS Code as the main editor (with Python and Markdown support), following Python tutorials, and learning Git & GitHub for version control. MNIST is used as a “hello world” for computer vision to understand dense networks and convolutional neural networks before touching the cheese data.
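The MNIST "hello world" stage can be sketched as a small PyTorch CNN. This is an illustrative architecture, not the project's exact model; the layer sizes here are assumptions chosen to fit 28×28 grayscale digits.

```python
import torch
import torch.nn as nn

# A minimal CNN for 28x28 grayscale MNIST digits: two conv/pool blocks
# followed by a linear classifier over ten digit classes.
class SmallCNN(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),  # 28x28 -> 14x14
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),  # 14x14 -> 7x7
        )
        self.classifier = nn.Linear(32 * 7 * 7, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        return self.classifier(x.flatten(1))

model = SmallCNN()
logits = model(torch.randn(8, 1, 28, 28))  # dummy batch of 8 images
print(logits.shape)  # torch.Size([8, 10])
```

Comparing this against a plain fully connected network on the same data is what makes the value of convolutions concrete before moving to cheese images.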
Once the basics are solid, the work shifts to PyTorch on Google Colab for prototyping models and experimenting with training loops.
Reproduce basic fully connected and CNN models on MNIST to understand training dynamics, loss curves, and what “good” performance looks like on a clean benchmark.
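A basic training loop of the kind prototyped in Colab can be sketched as follows. Synthetic random batches stand in for MNIST here so the sketch runs offline; the real experiments would use a DataLoader over torchvision.datasets.MNIST.

```python
import torch
import torch.nn as nn

# Simple fully connected network and a bare-bones training loop.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU(),
                      nn.Linear(128, 10))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

losses = []
for step in range(20):
    x = torch.randn(32, 1, 28, 28)      # fake image batch
    y = torch.randint(0, 10, (32,))     # fake labels
    optimizer.zero_grad()
    loss = criterion(model(x), y)       # forward pass + loss
    loss.backward()                     # backpropagation
    optimizer.step()                    # parameter update
    losses.append(loss.item())

print(f"first loss {losses[0]:.3f}, last loss {losses[-1]:.3f}")
```

Logging the loss each step is what makes loss curves, and the notion of "good" performance on a clean benchmark, visible in practice.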
Apply cleaning tools and manual inspection to remove mislabeled, non-cheese, blurry, or duplicate images. Align all images to a shared size to prepare them for training.
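The dedup-and-resize step might look like the sketch below: byte-identical duplicates are detected by hashing file contents, and the survivors are resized to one shared resolution with Pillow. The 224×224 target and PNG-only globbing are assumptions for illustration; perceptual hashing and blur scoring would extend this, and mislabeled or non-cheese images still need manual review.

```python
import hashlib
from pathlib import Path
from PIL import Image  # Pillow

def dedupe_and_resize(src: Path, dst: Path, size=(224, 224)) -> list:
    """Copy unique images from src to dst, resized to a shared size."""
    dst.mkdir(parents=True, exist_ok=True)
    seen = set()
    kept = []
    for path in sorted(src.glob("*.png")):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if digest in seen:  # exact byte-level duplicate -> skip
            continue
        seen.add(digest)
        out = dst / path.name
        Image.open(path).convert("RGB").resize(size).save(out)
        kept.append(out)
    return kept
```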
Design a PyTorch model (starting from CNN backbones and transfer learning) sized to fit the university’s GPU cluster. Consider trade-offs between model size, accuracy, and compute cost. Large-scale transformer-style models are explored conceptually as future extensions.
Train the model on the HPC GPU server, monitor performance, and refine via hyperparameter tuning, regularization, and architecture tweaks. Later in the project, evaluate the trained model on a small, newly collected cheese dataset.
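The final held-out evaluation reduces to a loop like the sketch below: run the trained model over batches of the newly collected cheese set and report top-1 accuracy. The function is generic; the model and data loader names are whatever the training code produces.

```python
import torch

@torch.no_grad()  # no gradients needed at evaluation time
def evaluate(model, batches) -> float:
    """Return top-1 accuracy of `model` over (inputs, labels) batches."""
    model.eval()
    correct = total = 0
    for x, y in batches:
        preds = model(x).argmax(dim=1)   # predicted class per image
        correct += (preds == y).sum().item()
        total += y.numel()
    return correct / total
```

Breaking accuracy down per class afterwards is what reveals where and why the model fails, which varieties it confuses, and whether the curated evaluation set exposes a distribution shift from the Kaggle training data.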