Feb 10 · 3 min read · In my journey to dive deeper into multimodal AI systems, I decided to fine-tune BLIP-2, a powerful vision-language model, on the Flickr8k dataset to generate image captions. What made this even more exciting was integrating LoRA (Low-Rank Adaptatio...
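The excerpt above mentions fine-tuning with LoRA. As a toy illustration of the core idea (a hypothetical, stdlib-only sketch, not the article's actual code): instead of updating a full weight matrix `W`, LoRA trains two small matrices `A` (r × d_in) and `B` (d_out × r) and adds their product, scaled by `alpha / r`, to the frozen weights.

```python
# Hypothetical toy sketch of LoRA's core update rule; real fine-tuning
# would use a library such as PEFT on top of the BLIP-2 weights.

def matmul(X, Y):
    """Multiply two matrices represented as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def lora_forward(W, A, B, x, alpha=16, r=2):
    """Compute y = (W + (alpha / r) * B @ A) @ x, with x a plain vector."""
    scale = alpha / r
    delta = matmul(B, A)  # low-rank update, shape d_out x d_in
    W_eff = [[w + scale * d for w, d in zip(w_row, d_row)]
             for w_row, d_row in zip(W, delta)]
    return matmul(W_eff, [[v] for v in x])  # x as a column vector
```

Because only `A` and `B` are trained, the number of trainable parameters drops from `d_out * d_in` to `r * (d_out + d_in)`, which is what makes LoRA fine-tuning of large models tractable.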
Feb 3 · 21 min read · It's Been a While Sorry for being away for so long. I've been head-down strengthening my software engineering foundation—consuming rather than producing. After a while, I started feeling this pull to balance things out by building again. I just finis...
Oct 27, 2025 · 3 min read · 📝 Quick Summary: JoyCaption is an open-source Visual Language Model (VLM) designed for image captioning. It aims to provide a free, uncensored alternative to existing models like ChatGPT, enabling the training and fine-tuning of diffusion models on ...
Sep 29, 2025 · 2 min read · We’ve come a long way from traditional LLMs (Large Language Models) that could only understand textual data. But now, the AI world is rapidly evolving — and VLMs (Vision-Language Models) are leading the charge. 🔥 Unlike LLMs,...
Sep 25, 2025 · 7 min read · 🌍 The Problem: Brittle UI Test Automation Automated UI testing has long been a cornerstone of software quality. Frameworks like Selenium, Cypress, and Playwright have powered countless regression suites, CI/CD pipelines, and release processes. But d...
Feb 22, 2025 · 13 min read · In this blog post, we explore the intricacies of fine-tuning the Qwen2.5-7B-VL-Instruct model—a state-of-the-art multi-modal transformer designed for both text and image understanding. We will delve into the model’s architecture and its applications,...
Jan 11, 2025 · 6 min read · 🌟 Introduction Visual Language Models (VLMs) are revolutionizing the way machines understand and interact with the world. By combining the power of large language models (LLMs) with vision encoders, VLMs enable natural language interaction with visu...
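The excerpt above describes the standard VLM recipe: a vision encoder turns an image into embeddings, a projection maps them into the LLM's token-embedding space, and the LLM consumes them alongside text tokens. A hypothetical, stdlib-only toy sketch of that data flow (the component names and dimensions here are illustrative, not from any real model):

```python
# Toy sketch of the vision-encoder -> projection -> LLM-input pipeline
# typical of VLMs. All components are stand-ins for illustration only.

def vision_encoder(image_patches):
    """Stand-in encoder: embed each patch as a fixed-size vector (dim 2)."""
    return [[sum(p) / len(p), max(p)] for p in image_patches]

def project(patch_embeddings, weight):
    """Linear projection of patch embeddings into the LLM embedding space."""
    return [[sum(w * e for w, e in zip(row, emb)) for row in weight]
            for emb in patch_embeddings]

def vlm_inputs(image_patches, text_token_embeddings, weight):
    """Concatenate projected image embeddings with text-token embeddings,
    forming the sequence the language model would attend over."""
    return project(vision_encoder(image_patches), weight) + text_token_embeddings
```

The key design point the article teases: because the projection lands image features in the same embedding space as text tokens, the LLM can attend over both with no change to its architecture.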
Dec 8, 2024 · 7 min read · Introduction The evolution of vision-language models has been nothing short of remarkable. From their early stages of independently handling images and text to their current ability to seamlessly integrate the two, these models have reached new heigh...
Aug 2, 2024 · 6 min read · What is a Vision Language Model? Vision Language Models (VLMs) are changing the game for multimodal content creation and interaction by bridging the gap between visual and textual cognition. Ongoing research in VLMs advances multimodal architecture to ...