Arpit Tyagi · dataminds.hashnode.dev · 9 hours ago
Copy data from the Data Lake to the SQL Database, deleting any existing data each time before loading the new data
Step 1: Check the existing data in the employee table. Step 2: Change the data manually in the CSV file in the data lake; we need to check whether this change will be visible in the table or not (it definitely will be by the end). Step 3: Settings in the "Sink" opti...
Tags: Azure Data Factory
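The delete-then-reload pattern described above (in ADF this is typically a pre-copy script on the sink) can be sketched outside ADF as well. Below is a minimal standalone Python sketch, assuming pandas, SQLAlchemy and an ODBC driver are available; the file name, table name and connection string are placeholders, not taken from the post.

```python
# Minimal sketch of "truncate, then reload" from a CSV into a SQL table.
# Assumes pandas + SQLAlchemy + an ODBC driver; paths and credentials are placeholders.
import pandas as pd
from sqlalchemy import create_engine, text

CSV_PATH = "employee.csv"  # hypothetical file pulled from the data lake
CONN_STR = "mssql+pyodbc://user:pass@server/db?driver=ODBC+Driver+18+for+SQL+Server"

engine = create_engine(CONN_STR)
df = pd.read_csv(CSV_PATH)

with engine.begin() as conn:
    # Equivalent of clearing the target before the copy (ADF: a pre-copy script).
    conn.execute(text("TRUNCATE TABLE employee"))

# Load the fresh CSV contents into the now-empty table.
df.to_sql("employee", engine, if_exists="append", index=False)
print(f"Reloaded {len(df)} rows into employee")
```

In ADF itself, the truncate step would presumably live in the sink settings the post's Step 3 is pointing at, rather than in client code like this.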
Arpit Tyagi · dataminds.hashnode.dev · 10 hours ago
Data Lake to SQL DB Data Movement (or CSV to SQL Table data movement)
Step 1: Do we have the employee table in SQL? Step 2: Use the dataset that points to the file in the Data Lake. Step 3: Set up the Source and Sink in the Copy activity: Source - it will point to the dataset which is pointing to the employee file ...
Tags: Azure Data Factory, Azure
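For the plain CSV-to-SQL copy above, here is a rough Python equivalent of Step 1 (check that the employee table exists) followed by the copy itself; again a sketch with placeholder names and connection details rather than an ADF pipeline.

```python
# Sketch: verify the target table exists, then append the CSV rows to it.
import pandas as pd
from sqlalchemy import create_engine, inspect

engine = create_engine(
    "mssql+pyodbc://user:pass@server/db?driver=ODBC+Driver+18+for+SQL+Server"
)

# Step 1: do we have the employee table in SQL?
if not inspect(engine).has_table("employee"):
    raise SystemExit("Table 'employee' not found - create it before copying")

# Steps 2-3: the "source" is the CSV from the data lake, the "sink" is the SQL table.
source_df = pd.read_csv("employee.csv")  # hypothetical file name
source_df.to_sql("employee", engine, if_exists="append", index=False)
```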
samhita sarkar · samhita-sarkar.hashnode.dev · Nov 11, 2024
Deciding When to Normalize or Denormalize Data for Best Results
Normalisation and denormalisation are essentially design patterns in databases that determine how data is stored and used. Storage pattern: the key difference between normalisation and denormalisation is their storage patterns, which are essentially...
Tags: Databases
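To make the storage-pattern difference concrete, here is a small illustrative sketch (the tables and values are invented for this example, not taken from the article): the same order data kept normalised as two related tables versus denormalised as one wide table.

```python
# Illustration: the same data stored normalised (two tables) vs denormalised (one wide table).
import pandas as pd

# Normalised: customers and orders live in separate tables, linked by customer_id.
customers = pd.DataFrame({"customer_id": [1, 2], "name": ["Ada", "Lin"], "city": ["Pune", "Oslo"]})
orders = pd.DataFrame({"order_id": [10, 11, 12], "customer_id": [1, 1, 2], "amount": [250, 90, 420]})

# Reads need a join, but each customer's details are stored exactly once.
joined = orders.merge(customers, on="customer_id")

# Denormalised: one wide table, customer details repeated on every order row.
denormalised = joined[["order_id", "name", "city", "amount"]]
print(joined)
print(denormalised)
```

The normalised form pays with a join at read time; the denormalised form pays with redundancy and more awkward updates, which is the trade-off the article is weighing.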
PULKIT KAPOOR · blog.pulkitkapoor.com · Nov 11, 2024
Understanding Amazon Redshift Data Sharing Features
AWS Redshift is the data-warehousing solution provided by Amazon Web Services. As part of the data warehouse offering, Amazon Redshift provides a very useful feature known as data sharing. In this short article, we will get an understanding of when to...
Tags: AWS
Constantin Lungu · datawise.dev · Nov 9, 2024
For quality work, understand your data grain first
There's one concept that is central when working with data and goes beyond specific technology or SQL dialects. You just can't miss it. I'm talking about the grain: the granularity of the data. It's essentially answering the question: what does one ro...
Tags: Learning Journey, SQL
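As a quick illustration of grain (invented data, not from the article): the same sales rows kept at order-line grain, then aggregated to order grain and to daily grain; each aggregation changes what one row represents.

```python
# Grain: what does one row represent? Same data, three grains.
import pandas as pd

# Finest grain here: one row per order line.
order_lines = pd.DataFrame({
    "order_id": [1, 1, 2, 3],
    "day": ["2024-11-01", "2024-11-01", "2024-11-01", "2024-11-02"],
    "amount": [20.0, 5.0, 12.5, 7.0],
})

# Coarser grain: one row per order.
per_order = order_lines.groupby(["order_id", "day"], as_index=False)["amount"].sum()

# Even coarser: one row per day.
per_day = order_lines.groupby("day", as_index=False)["amount"].sum()

print(per_order)  # 3 rows: grain = order
print(per_day)    # 2 rows: grain = day
```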
Kliment Merzlyakov · klimmy.hashnode.dev · Nov 8, 2024
Robust evaluation of binary variable average
TL;DR: Some metrics are an average of a binary variable (0/1, False/True), for example conversion rate or churn rate. These metrics might not represent the actual value when there is a small sample size beneath them (one web session with one conversion lea...
Tags: data-engineering
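One common remedy for that small-sample problem is to report an interval instead of the raw average; the sketch below implements the Wilson score interval, which is one standard choice and not necessarily the exact method the article recommends.

```python
# Wilson score interval for a binary-variable average (e.g. a conversion rate).
# With 1 success out of 1 trial the point estimate is 100%, but the interval is very wide.
import math

def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """95% Wilson score interval for a proportion."""
    if trials == 0:
        return (0.0, 1.0)
    p = successes / trials
    denom = 1 + z**2 / trials
    centre = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))

print(wilson_interval(1, 1))      # roughly (0.21, 1.00): far from a confident 100%
print(wilson_interval(100, 1000)) # roughly (0.083, 0.120): tight around 10%
```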
Alex Merced · alexmerced.hashnode.dev · Nov 8, 2024
Intro to SQL using Apache Iceberg and Dremio
Blog: What is a Data Lakehouse and a Table Format? | Free Copy of Apache Iceberg: The Definitive Guide | Free Apache Iceberg Crash Course | Lakehouse Catalog Course | Iceberg Lakehouse Engineering Video Playlist. Introduction: SQL (Structured Query Language) h...
Tags: SQL
Jitender Kaushik · jitenderkaushik.com · Nov 8, 2024
Exploring Microsoft Fabric: Notebooks vs. Spark Jobs and How Java Fits In
Microsoft Fabric offers a versatile platform for data processing, blending interactive notebooks with powerful Spark jobs. While both tools serve different purposes, understanding their distinctions can optimize your workflows, especially with Java c...
Tags: Microsoft Fabric notebook
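One way Java code commonly surfaces inside a PySpark notebook is through the py4j gateway that Spark already runs; the sketch below uses Spark's internal _jvm handle, which is an assumption about the mechanism on my part rather than anything Fabric-specific from the post.

```python
# Sketch: reaching Java from a PySpark notebook through the py4j gateway Spark already runs.
# In Fabric/Synapse notebooks a SparkSession named `spark` is pre-created; here we build one
# locally so the snippet is self-contained (requires pyspark and a local JVM).
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[*]").appName("java-from-notebook").getOrCreate()

jvm = spark._jvm  # internal py4j handle into the driver's JVM (not a public, stable API)

# Call plain Java from Python: read the JVM's Java version via java.lang.System.
print("Driver JVM runs Java", jvm.java.lang.System.getProperty("java.version"))

# Packaged Java/Scala code is normally attached as a JAR and invoked the same way, e.g.
# jvm.com.example.MyJob.run(...)  # hypothetical class, shown only as a pattern
spark.stop()
```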
Jitender Kaushik · jitenderkaushik.com · Nov 7, 2024
Why Don't Notebooks Support Java? A Closer Look at Language and Platform Compatibility
Notebooks, like Jupyter, are powerful tools for data science, exploratory analysis, machine learning, and even educational purposes. They allow users to mix code, visualisations, and documentation in one interactive environment. But despite their pop...
Tags: Java
Jitender Kaushik · jitenderkaushik.com · Nov 6, 2024
"Hello World" in Python, Java, and Scala: A Quick Dive into Spark Data Analysis
The "Hello World" program is the simplest way to demonstrate the syntax of a programming language. By writing a "Hello World" program in Python, Java, and Scala, we can explore how each language introduces us to coding concepts, and then delve into t...
Tags: Java
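For the Python leg of that comparison, a minimal sketch of the jump from "Hello World" to a first Spark step, assuming pyspark is installed and a local JVM is available; the Java and Scala versions in the post presumably follow the same shape against the same Spark APIs.

```python
# "Hello World", then the smallest step from printing text to analysing data with Spark.
from pyspark.sql import SparkSession

print("Hello, World!")  # the classic first program

# Spin up a local Spark session and run a trivial analysis on an in-memory dataset.
spark = SparkSession.builder.appName("hello-spark").master("local[*]").getOrCreate()
df = spark.createDataFrame([("hello", 1), ("world", 2)], ["word", "count"])
df.groupBy("word").sum("count").show()
spark.stop()
```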