smolvlm2-captioner

Applies SmolVLM2-2.2B-Instruct multimodal model to video frames for image captioning.