archivedSep 1, 2025

Images Captioned to Minecraft

A deep learning project exploring how natural language descriptions map to Minecraft gameplay clips.

PythonTensorFlowHugging FaceDeep Learning

Highlights

  • - Designed a data pipeline pairing gameplay clips with human-written captions.
  • - Prototyped CNN and transformer-based models in Google Colab.
  • - Planned retrieval and evaluation experiments for caption-to-clip matching.

Context

This project investigates multimodal learning by connecting visual game environments with natural language descriptions.

Current Work

Model architectures and training pipelines are still evolving. Results and a formal write-up are in progress.