@luisciber

What Happens When an LLM "Thinks": Tokens, Logits, and Sampling

Feb 15 · 20 min read · You send a prompt. The model "thinks" for a few seconds, then generates a response that seems intelligent. What happened in that time? If your way of implementing systems with LLMs is based on using a high-level framework plus OpenAI APIs and derivati...

The Patient Who Thought He Had Microservices (Spoiler: He Didn’t)

Sep 25, 2025 · 3 min read · Last Friday, a case arrived at my practice that reminded me why I specialized in systems architecture. “Doctor,” said the Tech Lead, clearly exhausted, “our system is sick. We started with microservices two years ago, everything seemed perfect on pap...

From Raw C++ to “Plug-and-Play” AI: Why the Human Spark Is Still the Real Engine of the Future

Aug 7, 2025 · 2 min read · There was a time — not so long ago — when, if you wanted to do anything interesting with Artificial Intelligence, you first had to pledge allegiance to linear algebra, become a C++ samurai, and accept that your idea would probably live and die on a s...

How to Build a RAG System Locally

Apr 13, 2024 · 13 min read · What is a RAG system? A Retrieval-Augmented Generation (RAG) system combines the capabilities of a large language model with a retrieval component that can fetch relevant documents or passages from a corpus. This powerful combination allows the langu...

Android mobile development without emulators or physical connections.

Jan 27, 2022 · 4 min read · The first thing... The problem: I am a mobile application developer. I develop apps for both Android and iOS using Flutter. One of the main problems I had when I started in mobile development was the high RAM consumption that em...