samyak jain in clickhousenu.hashnode.dev · Apr 16 · 3 min read
The Architect’s Guide to ClickHouse MergeTree Engines: Beyond the Basics
ClickHouse is often hailed as the "speed demon" of analytical databases. But that speed isn't magic; it’s largely due to the MergeTree engine family. If you’re building agentic systems or real-time an...
samyak jain in clickhousenu.hashnode.dev · Apr 9 · 3 min read
Clickhouse - Powering agentic systems with millisecond queries at petabyte scale
In today’s data-driven world, speed matters. Whether you're analyzing user behavior, processing logs, or building real-time dashboards, traditional databases can struggle under massive workloads. That...
samyak jain in dataskills.hashnode.dev · Oct 11, 2023 · 4 min read
Apache Hadoop - Getting Started (Understanding the Basics)
Hadoop is an open-source software framework for storing and processing Big Data in a distributed manner on large clusters of commodity hardware. Hadoop is licensed under the Apache v2 license. Hadoop was developed based on the paper written by...
samyak jain in dataskills.hashnode.dev · Jan 7, 2022 · 2 min read
Apache Airflow - Getting Started
What is Apache Airflow? Apache Airflow is an open-source project that was created in 2014 at Airbnb by Maxime Beauchemin and published in June 2015. Apache Airflow is an open-source platform designed to programmatically author, schedule, and monitor...
samyak jain in dataskills.hashnode.dev · Jun 4, 2021 · 3 min read
Big Data Processing - How it All Started
It started with an interesting thought: making the whole internet searchable. Mike Cafarella and Doug Cutting started working on the development of an open-source web search engine. A search engine that can index billions of pages, back then projec...