Cheese Image Classification with Machine Learning

University of Wisconsin–Eau Claire · Research with Dr. Xiang Ma · Summer 2025

Phase 1: Foundations · Phase 2: Dataset & Cleaning · Phase 3: Modeling & HPC · Phase 4: Evaluation & Documentation

This project investigates how well modern image classification models can recognize different cheese varieties from real-world images. The work follows a structured 14-week tutorial designed by Dr. Xiang Ma. It starts with tool onboarding (Python, VS Code, Git, Colab) and classic datasets such as MNIST, then progresses to cheese-specific datasets from Kaggle, data cleaning, and baseline model design in PyTorch.

My contribution focused on data quality and reproducible experiments: understanding what a “good” image dataset looks like, cleaning and organizing the cheese images, and helping implement and run baseline models on the university’s high-performance computing (HPC) resources.

14 Weeks: Structured research tutorial

PyTorch: Primary deep learning framework

My Responsibilities

  • Completed foundational work with Python, Git/GitHub, VS Code, and Google Colab following the Week 1–2 tutorial.
  • Studied MNIST and CNN architectures to understand core image classification concepts before moving to cheese data.
  • Explored Kaggle cheese datasets, inspected class labels, and applied quality criteria: no mislabeled images, consistent sizes, minimal blur, and no duplicates.
  • Helped clean and standardize the datasets using Python scripts, image inspection tools, and example notebooks for automated checks (e.g., CleanVision and Gemini-based cheese recognition).
  • Assisted in designing and running baseline PyTorch models for cheese classification on Colab and the university’s HPC GPU server.
  • Documented the weekly workflow, including data preprocessing decisions and experiment configurations, to support future students.
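
Two of the quality criteria listed above, duplicates and class organization, lend themselves to simple automated checks. The following is a minimal sketch only, not the project's actual cleaning scripts. It assumes a hypothetical folder layout of `root/<class_name>/<image file>` and an assumed set of image extensions, and it uses only the Python standard library. It catches byte-identical copies; near-duplicates, blur, and mislabeled images need dedicated tools such as the CleanVision checks mentioned above.

```python
import hashlib
from collections import defaultdict
from pathlib import Path
from typing import Dict, List

# Assumption: which file extensions count as images in the dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def is_image(path: Path) -> bool:
    """Return True for regular files with an assumed image extension."""
    return path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS


def find_exact_duplicates(root: Path) -> List[List[Path]]:
    """Group image files under root whose raw bytes are identical."""
    groups = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if is_image(path):
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[digest].append(path)
    # Keep only digests shared by more than one file.
    return [paths for paths in groups.values() if len(paths) > 1]


def images_per_class(root: Path) -> Dict[str, int]:
    """Count images in each class folder (assumes root/<class_name>/<image>)."""
    counts = {}
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(1 for f in class_dir.iterdir() if is_image(f))
    return counts
```

Calling `find_exact_duplicates(Path("data/cheese"))` (a hypothetical path) returns groups of identical files to review before training, and `images_per_class` gives a quick view of class balance. Reviewing duplicates by hand rather than deleting them automatically keeps the cleaning decisions easy to document.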