Sr. Data Engineer working mostly on data and observability problems. Writing mostly about data and cloud, sometimes productivity and other musings.
Mentoring, Tech chats, Ideas, Investments
Jun 8, 2024 · 5 min read · In today's data-driven world, real-time data processing and analytics have become crucial for businesses to stay competitive. Apache Hudi (Hadoop Upserts Deletes and Incrementals) is an open-source data management framework that provides efficient data ingest...
Feb 15, 2024 · 6 min read · Introduction: Large Language Models (LLMs) have revolutionized the field of Natural Language Processing (NLP). These models, such as GPT-4, are designed to understand and generate human-like text. In this post, we will delve into how to work with LLMs...
Jan 5, 2024 · 4 min read · Credit card fraud is a significant concern for financial institutions, as it can lead to considerable monetary losses and damage customer trust. Real-time fraud detection systems are essential for identifying and preventing fraudulent transactions as...
Oct 31, 2023 · 3 min read · In a production ETL (extract, transform, load) pipeline, it is often helpful to use environment variables to store sensitive information, such as database credentials or API keys. This allows you to keep that information separate from yo...