I researched how convolutional neural networks (CNNs) can classify cheese types from images. I focused on
data quality and reproducibility: collecting and cleaning the data, building an automated
pipeline, and training and evaluating models with GPU acceleration. The project provides a maintainable baseline for
future computer vision work in food technology.
My Responsibilities
- Collected images from Kaggle and other sources; organized class labels.
- Wrote Python scripts (PIL + CleanVision) to detect duplicate, blurry, corrupt, and mislabeled images.
- Integrated the Google Gemini API to flag questionable images and reduce dataset noise.
- Produced raw → filtered → cleaned dataset versions with hashes/manifests for reproducibility.
- Submitted and monitored GPU jobs on the university HPC cluster, logging accuracy, loss, and resource usage.
- Experimented with learning rate, batch size, and number of epochs; documented the results in weekly reports and a final summary.
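
The hash and manifest step for the dataset versions could look roughly like the sketch below. It is an illustration only, not the project's actual script: it uses just the Python standard library (no PIL or CleanVision), and it assumes a hypothetical folder layout of one subfolder per cheese class. The function names, the file-extension list, and the manifest fields are all made up for the example. Because every file gets a SHA-256 digest, byte-identical duplicates fall out of the same pass; near-duplicates and blurry images would still need a dedicated tool.

```python
import hashlib
import json
from collections import defaultdict
from pathlib import Path

# Assumption: which file extensions count as images in this dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> list[dict]:
    """Record path, label, size, and hash for every image under root.

    Assumption: images live in root/<class_name>/<file>, so the parent
    folder name is used as the class label.
    """
    entries = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            entries.append({
                "path": path.relative_to(root).as_posix(),
                "label": path.parent.name,
                "bytes": path.stat().st_size,
                "sha256": sha256_of(path),
            })
    return entries


def exact_duplicates(manifest: list[dict]) -> list[list[str]]:
    """Group paths whose file contents are byte-identical."""
    by_hash = defaultdict(list)
    for entry in manifest:
        by_hash[entry["sha256"]].append(entry["path"])
    return [paths for paths in by_hash.values() if len(paths) > 1]


def write_manifest(manifest: list[dict], out_path: Path) -> None:
    """Write the manifest as stable, human-readable JSON."""
    out_path.write_text(json.dumps(manifest, indent=2, sort_keys=True), encoding="utf-8")
```

Writing one such manifest per dataset version (raw, filtered, cleaned) makes it possible to diff versions and to confirm that a training job ran against exactly the files it was meant to use.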