Gabi Dobocan · blog.telepat.io · Nov 24, 2024
Unpacking Multimodal Language Models in VQA: LLaVA's Interpretability
arXiv: https://arxiv.org/abs/2411.10950v1 · PDF: https://arxiv.org/pdf/2411.10950v1.pdf · Authors: Sophia Ananiadou, Zeping Yu · Published: 2024-11-17
Understanding LLaVA's Contribution to Visual Question Answering. The paper, "Understanding Multimodal LLM...
Tags: CLIP
Taehyeong Lee · jsonobject.hashnode.dev · Jun 17, 2024
How to Deploy LLaVA on Amazon SageMaker for Real-Time Image Analysis
Introduction: LLaVA (Large Language and Vision Assistant) is an open-source model classified as an LMM (Large Multimodal Model). It is being developed through a collaboration between Microsoft and various research institutions. One of its strengths...
Tags: LLaVA
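The second post's title describes deploying LLaVA as a SageMaker real-time endpoint. As a rough sketch of that general pattern (not the post's exact steps), the snippet below deploys a packaged model with the sagemaker Python SDK; the S3 artifact path, container versions, instance type, and request payload shape are all illustrative assumptions.

```python
# Minimal sketch of a SageMaker real-time deployment, assuming the model has
# already been packaged as an S3 tarball with a custom inference script.
# The S3 path, container versions, instance type, and payload shape are
# illustrative assumptions, not details taken from the post.
import sagemaker
from sagemaker.huggingface import HuggingFaceModel

role = sagemaker.get_execution_role()  # IAM role with SageMaker permissions

model = HuggingFaceModel(
    model_data="s3://my-bucket/llava/model.tar.gz",  # hypothetical artifact
    role=role,
    transformers_version="4.37",  # must match an available HF container image
    pytorch_version="2.1",
    py_version="py310",
)

# Create a real-time HTTPS endpoint backed by a single GPU instance.
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
)

# The request format depends on the bundled inference script; a common
# convention is a JSON body carrying the image (base64 or URL) and a prompt.
result = predictor.predict({
    "image": "https://example.com/cat.jpg",  # hypothetical input
    "question": "What is in this image?",
})
print(result)

# Tear the endpoint down when finished to stop billing.
predictor.delete_endpoint()
```

A GPU instance type such as ml.g5.2xlarge is typical for multimodal models of LLaVA's size, but the right choice depends on the model variant and latency requirements.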