Stellar Cyber · stellarcyber.hashnode.dev · Jul 15, 2024
Bring Your Own Data Lake: Do It The Right Way
Having spent a significant amount of time in the SIEM industry, I’ve seen patterns and evolutions that define the landscape. One of the most notable changes has been the shift from traditional, monolithic SIEM deployments to more flexible, scalable s...
Tags: Open XDR

Debashis Adak · adak.hashnode.dev · Jun 29, 2024
Databricks Variant Data
The VARIANT data type is a recent introduction in Databricks (available in Databricks Runtime 15.3 and above) designed specifically for handling semi-structured data. It offers an efficient and flexible way to store and process this kind of data, whi...
Tags: big data

StarRocks Engineering · starrocks.hashnode.dev · May 31, 2024
How WeChat’s Lakehouse Design Efficiently Handles Trillions of Records
About WeChat: WeChat is the world’s largest standalone mobile app, serving over 1.3 billion monthly active users as a platform for instant messaging, social media, and mobile payments. To support its unprecedented and rapidly expanding user base, WeCh...
Tags: lakehouse

StarRocks Engineering · starrocks.hashnode.dev · May 28, 2024
5 Brilliant Lakehouse Architectures from Tencent, WeChat, and More
Your data lakehouse promised flexibility, scalability, and greater cost-effectiveness, but you'd consider yourself lucky if it could deliver at least two of those three most of the time. Your experience isn't unique. In fact, it's all too common. In ...
Tags: Data-lake

Manuel Schmidbauer · deltaload.hashnode.dev · May 24, 2024
Building a poor man's data lake for Shopify data
Inspired by a recent blog post, I decided to experiment with various technologies and build a small data lake for Shopify data. In this project, the following technologies are used: Data Ingestion: dlthub. I use the dlt connector to push data from the ...
Tags: Poor Man's Data Lake, Data-lake

StarRocks Engineering · starrocks.hashnode.dev · May 24, 2024
Comparison of the Open Source Query Engines: Trino and StarRocks
In this post, we want to compare Trino, the popular distributed query engine that runs analytical queries over big volumes of data with interactive latencies, with StarRocks. Sources of Information: We’ve consulted StarRocks committers (Heng Zhao, Star...
Tags: Databases

Sumit Mondal · sumit007.hashnode.dev · May 18, 2024
Understanding AWS Security Lake: A Comprehensive Guide with Hands-On Example
Introduction to AWS Security Lake: AWS Security Lake is a robust service designed to centralize security data from diverse sources into a dedicated data lake, making it easier for organizations to manage, analyze, and derive insights from their securi...
Tags: AWS - HandsOn, AWS

Alex Merced · alexmerced.hashnode.dev · Mar 28, 2024 · 1 like
Great Blogs on DataOps for Apache Iceberg Lakehouses
DataOps, short for Data Operations, represents the seamless orchestration of people, processes, and technology to enhance the quality and reduce the cycle time of data analytics. At the heart of this approach is data versioning, a critical practice t...
Tags: data lakehouse

Alex Merced · alexmerced.hashnode.dev · Mar 6, 2024 · 1 like
The Apache Iceberg Lakehouse: The Great Data Equalizer (disrupting the Snowflake/Databricks status quo)
Get an Early Release Copy of Apache Iceberg the Definitive Guide. Follow this tutorial to create a Data Lakehouse on your Laptop. Iceberg Lakehouse Engineering Video Playlist. In the dynamic realm of data platform development, competition among vendors ...
Tags: snowflake

Kiran Reddy for Databricks - PySpark · databricks-pyspark-blogs.hashnode.dev · Mar 1, 2024 · 10 likes
Strategies for Retrieving Files From Azure Cloud in Databricks
Introduction: When searching on your preferred search engine for accessing ADLS Gen 2 from a Databricks notebook, it will probably give you this great reference. In this blog, we aim to consolidate and simplify these methods, providing clear instructi...
Tags: ADLS