Unpacking Multimodal Language Models in VQA: Llava’s Interpretability
Nov 24, 2024 · 3 min read

Arxiv: https://arxiv.org/abs/2411.10950v1
PDF: https://arxiv.org/pdf/2411.10950v1.pdf
Authors: Sophia Ananiadou, Zeping Yu
Published: 2024-11-17

Understanding Llava's Contribution to Visual Question Answering

The paper, "Understanding Multimodal LLM...