Mehul Kansal · mehulkansal.hashnode.dev · Aug 26, 2024
Week 14: Delta Lake on Azure Databricks 🚀
Hey there! 👋 In this week's blog, we’ll delve into the specific challenges associated with traditional Data Lakes and demonstrate how Delta Lake, an open-source storage layer, effectively addresses these issues, with hands-on examples using Azure Da...
Azure
Mehul Kansal · mehulkansal.hashnode.dev · Aug 19, 2024
Week 13: Azure Databricks Essentials 💡
Hey there, data enthusiasts! 👋 This week's blog delves into the fundamental aspects of setting up and using Azure Databricks, guiding you through the creation of a Databricks instance, understanding cluster types, and exploring essential features li...
Azure
Mudassar Khanidataisgold.hashnode.dev · Aug 19, 2024
How does Delta Lake enhance Databricks?
Delta Lake enhances Databricks by adding powerful features and capabilities that address many common challenges in data engineering and analytics. Specifically, Delta Lake brings improvements in data reliability, performance, and management to the Da...
Data-lake
Mudassar Khanidataisgold.hashnode.dev · Aug 13, 2024
How does Spark handle big data?
Apache Spark handles big data through a combination of distributed computing, in-memory processing, and efficient data management techniques. Here's a breakdown of how Spark manages large-scale data: 1. Distributed Computing: Cluster Management: Spa...
#apache-spark
Cloud Tuned · cloudtuned.hashnode.dev · Aug 4, 2024
Databricks: Transforming Data Analytics and Machine Learning
In the era of big data and advanced analytics, businesses need robust platforms to harness the power of their data. Databricks has emerged as a leading solution for data engineering, data s...
Databricks
Rahul Rathod · codeok.hashnode.dev · Jul 14, 2024
The Revolutionary Journey of Apache Spark: From Academic Roots to Industry Dominance
In the world of big data, speed and efficiency are paramount. Among the many technologies that have emerged to address these needs, Apache Spark stands out as a revolutionary force. Born from academic innovation and nurtured by a growing community, S...
apache
Pawan Kumar · devopsapk.hashnode.dev · Jul 13, 2024
Understanding Data Dependency Management in Databricks
When working with Databricks, you may often need to incorporate third-party dependencies to enhance your code's functionality. Whether you're using Databricks with Scala or Python, importing external jars or modules is essential for leveraging additi...
Azure
Chandrasekar (Chan) Rajaram · cr88.hashnode.dev · Jul 7, 2024
Secure Databricks Access to Azure Data Lake Gen2 via Service Principal and Azure Key Vault
Introduction: In the world of big data analytics, securing access to your data storage is paramount. As organizations increasingly adopt cloud-based solutions, the need for robust, scalable, and secure data access mechanisms becomes crucial. This blog...
Databricks
Debashis Adak · adak.hashnode.dev · Jun 29, 2024
Databricks Variant Data
The VARIANT data type is a recent introduction in Databricks (available in Databricks Runtime 15.3 and above) designed specifically for handling semi-structured data. It offers an efficient and flexible way to store and process this kind of data, whi...
big data
Joubin Najmaie · datafragments.com · Jun 21, 2024
Week of June 17 2024 - Mindmap Recap
June 17, 2024: Databricks - Issues with Excel Library in Clusters. An issue was encountered with the crealytics:spark-excel library in Databricks. This Spark plugin is essential for reading and writing Excel files within Databricks. However, we observ...
Databricks