The supermarket analogy for embeddings is one of the clearest mental models I've seen for explaining vector spaces to non-ML folks. When I was building a RAG pipeline last year, the jump from understanding Word2Vec's static embeddings to working with contextual embeddings from transformer models was exactly the conceptual shift you describe -- the same word needing different coordinates depending on its neighbors. That history from Bag-of-Words through to modern dense representations really helps ground why retrieval quality depends so heavily on embedding choice.
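That static-vs-contextual shift can be made concrete with a toy sketch (hand-made 3-d vectors and a crude neighbor-averaging stand-in for a transformer, purely illustrative, not a real model):

```python
import numpy as np

# Toy vocabulary with hand-made 3-d vectors (illustrative numbers only).
static = {
    "bank":    np.array([1.0, 0.0, 0.0]),
    "river":   np.array([0.0, 1.0, 0.0]),
    "deposit": np.array([0.0, 0.0, 1.0]),
}

def static_embed(word, sentence):
    # Word2Vec-style: one fixed vector per word, context ignored.
    return static[word]

def contextual_embed(word, sentence):
    # Crude stand-in for a transformer: mix the word's vector with its neighbors'.
    neighbors = [static[w] for w in sentence if w != word]
    return 0.5 * static[word] + 0.5 * np.mean(neighbors, axis=0)

s1 = ["river", "bank"]
s2 = ["bank", "deposit"]

# Static: "bank" has identical coordinates in both sentences.
assert np.allclose(static_embed("bank", s1), static_embed("bank", s2))
# Contextual: the same word lands at different coordinates per sentence.
assert not np.allclose(contextual_embed("bank", s1), contextual_embed("bank", s2))
```

The second assertion is the whole point: once neighbors feed into the representation, "bank" near "river" and "bank" near "deposit" occupy different spots in the space, which is exactly why retrieval quality tracks embedding choice.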