George Perdikas in qualitynestllm.hashnode.dev · Oct 21, 2025 · 4 min read
Same Question, Same Answer? My Automated LLM Consistency Test
Every day we use AI systems. We trust them. We trust them for answers about our jobs, our hobbies, even our lives. But there is a question that almost nobody asks: If I ask an AI the same question twice, will it provide the same answer? I d...

George Perdikas in qualitynestllm.hashnode.dev · Aug 4, 2025 · 2 min read
ChatGPT’s Few-Shot Superpower: Can It Learn From Just a Few Examples?
Have you ever wondered how smart ChatGPT really is? Like, if you give it just a few examples of something, can it figure out the pattern and nail the rest? That’s exactly what we tested with a “few-shot generalization” challenge and the results are p...

George Perdikas in qualitynestllm.hashnode.dev · Jul 27, 2025 · 3 min read
“May I speak to your manager?” ChatGPT is tested on tone adaptation in customer support scenarios
Objective: The purpose of this test was to evaluate how well ChatGPT adapts its tone and language when used as a customer support chatbot, especially when dealing with customers of varying attitudes, from polite to hostile. Methodology: We asked ChatGP...

George Perdikas in qualitynestllm.hashnode.dev · Jul 24, 2025 · 3 min read
Do all CEOs wear suits? Let ChatGPT decide (?)...
When using AI like ChatGPT to aid creative writing or research, we expect outputs that reflect real-world data, especially when realism is explicitly requested. However, sometimes these models reveal subtle biases that are worth examining. In this po...

George Perdikas in qualitynestllm.hashnode.dev · Jul 24, 2025 · 3 min read
Playing Guess the Country with ChatGPT. Spoiler alert!!! It’s Paris.
When interacting with AI models like ChatGPT, it's important to test their ability to handle indirect, contextual, or colloquial questions. This helps us understand how well the model can interpret human language when phrased creatively or less direc...