Vibhor Gupta · vibhorgupta.hashnode.dev · Dec 17, 2024
Understanding Latency Metrics: Key Indicators for Application Performance and User Satisfaction
Importance of Latency Metrics: Latency metrics are crucial for evaluating the performance of an application, as they provide insights into how quickly the system responds to user requests. These metrics are particularly important for understanding th...
System Design

Mike Vincent · mike-vincent.hashnode.dev · Dec 9, 2024
Speed Demon: LLMs’ 600ms Race to Appear Human
The future of AI isn’t about bigger models or smaller models; it’s about speed. The race to achieve responses in under 600 milliseconds. That’s the benchmark separating AI interactions that feel mechanical from those that feel human. This latency thre...
AI

Abu Precious O. · btere.hashnode.dev · Dec 3, 2024
Understanding ML Inference Latency and ML Services Latency
In the world of machine learning (ML), achieving quick results is critical, especially in real-time applications like autonomous driving, recommendation systems, and interactive voice assistants. But often, discussions about ML performance focus on a...
low-latency

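Taejung Heo · blog.aqudi.me · Nov 24, 2024
Resolving a Response Delay Issue When Using AWS Lambda with a Connection Pool
Background. "Developer! The screen takes way too long to load!!" This week I built a sales statistics feature and deployed it to the development server, and during testing an urgent call suddenly came in. I was told the API was too slow, so I checked how slow it really was: it took a whopping 10 to 16 seconds. Was there a configuration problem because I had developed it in a brand-new repository? Was the delay caused by cross-VPC communication? All sorts of thoughts ran through my head, so I measured the execution time of each module and the actual query time, but 10~16...
AWS
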
Maxat Akbanov · maxat-akbanov.com · Oct 18, 2024
AWS Route 53: Latency-based Routing Policy
Latency-based routing (LBR) in AWS Route 53 is designed to route end-user requests to the AWS region that provides the lowest latency. This routing policy ensures that users are connected to the closest and fastest endpoint (in terms of network laten...
aws, Devops

Kavya · portkey-llm-elo-rating.hashnode.dev · Oct 14, 2024
⭐ Reducing LLM Costs & Latency with Semantic Cache
Implementing semantic cache from scratch for production use cases. Vrushank Vyas · Jul 11, 2023 · 5 min. Image credits: Our future AI overlords. (No, seriously, Stability AI) Latency and Cost are significant hurdles for developers building on top of Large...
semantic cache

Kristie Lockhart · digitalserviceguide.hashnode.dev · Oct 12, 2024
How to Achieve the Best Internet Speeds for Gaming in 2024
As the gaming landscape continues to evolve with immersive graphics and online multiplayer experiences, having the best internet speed is more critical than ever. Lag and connection issues can ruin your gaming session, whether you’re competing in an ...
Downloadspeeds

OBULIPURUSOTHAMAN K · obulipurusothaman.hashnode.dev · Oct 8, 2024
Distributed Caching
Caching is used to temporarily store copies of frequently accessed data in high-speed storage layers (such as RAM) to reduce latency and load on the server or database. When your dataset size is small, it’s usually enough to keep all the cache data ...
Distributed Caching

Subhanshu Mohan Gupta · blogs.subhanshumg.com · Sep 27, 2024
Optimizing Kubernetes Node Placements Based on User Footprint and Latency
Welcome to Part I of my Kubernetes series, where we dive into Optimizing Node Placements Based on User Footprint and Latency. In today’s world of global-scale applications, every millisecond counts. We’ll explore how to strategically place Kubernetes...
Mastering Kubernetes: Revolutionizing Cloud-Native Operations
Microservices

Arslan Haroon · arslanharoon.hashnode.dev · Sep 27, 2024
Latency and Throughput in system design
In system design and computer networks, latency and throughput are two key performance metrics that describe the efficiency and speed of a system. Though related, they measure different aspects of system performance and are often considered together ...
System Design