Eklemis Santo Ndun · eknotes.app · Dec 4, 2024
Understanding OLTP, Master Data, and OLAP: Key Differences and Use Cases
In the realm of data management and processing, terms like OLTP, Master Data, and OLAP frequently surface. While they are interconnected in the data ecosystem, each serves a distinct purpose. This article aims to demystify these concepts, compare the...
1 like · data analysis

Nextwebb · nextwebb.hashnode.dev · Nov 29, 2024
Avoiding Pitfalls in Amazon S3: Handling Case Sensitivity in Python Workflows
When working with Amazon S3, it’s easy to overlook an important nuance: case sensitivity. While bucket names are case-insensitive, object keys (file paths) are case-sensitive. This distinction can lead to unexpected bugs in your workflows. For instan...
63 reads · AWS s3

Eklemis Santo Ndun · eknotes.app · Nov 28, 2024
Enhancing Data Confidence with Idempotent Pipelines
In our previous discussion on data pipelines, we explored how data moves from collection to processing and delivery, transforming raw information into valuable insights. Now, let's delve deeper into a concept that significantly enhances the reliabili...
idempotence

Arpit Tyagi · dataminds.hashnode.dev · Nov 12, 2024
Copy data from the Data Lake to the SQL Database, deleting any existing data each time before loading the new data:
Step 1: Check the existing data in the employee table: Step 2: Change the data manually in the csv file in data lake: we need to check whether this change will be visible in the table or not (definitely at the end). Step 3: Settings in “Sink” opti...
7 likes · Azure Data Factory

BuzzGK · buzzgk.hashnode.dev · Nov 7, 2024
Data Pipeline Design 101
In recent years, data pipeline design has undergone a significant transformation. The traditional approach of moving data from OLTP to OLAP databases has given way to more complex and diverse pipelines. Today's data pipelines integrate components fro...
data pipeline

BuzzGK · buzzgk.hashnode.dev · Nov 4, 2024
Enhancing Data Integrity with Traceability
Data pipelines involve multiple sources and technologies, making it challenging to uphold data integrity and compliance. This is where data traceability proves invaluable. By tracking data movement and logging access or modifications, traceability en...
data pipeline

Sujit Nirmal · blackshadow.hashnode.dev · Oct 24, 2024
Introduction to Chatbots and GenAI
Chatbots are AI-driven programs designed to simulate conversation with human users. They can be used for customer service, personal assistance, and more. Generative AI, on the other hand, involves using AI models to generate new content, such as text...
chatbot

Arpit Tyagi · dataminds.hashnode.dev · Oct 23, 2024
How to Copy SQL Database Tables Post-Join in Azure Data Lake via ADF!!
Step 1: Create an instance of Azure Data Factory: Step 2: Set up Linked Services: Step 3: Create a dataset for both (source-SQL Database and destination-Azure Data Lake): Step 4: Build the Data Pipeline with the help of datasets: Step 5: Test ...
10 likes · Azure Data Factory · Azure

KAPUPA HAAMBAYI · datasmithery.hashnode.dev · Oct 22, 2024
A Day In The Life of a Super Azure Data Engineer
When I was starting out my data engineering journey, I often imagined what it would be like to work as a great data engineer, especially in a fast-paced, data-driven environment like manufacturing. Fun fact: I actually worked for a manufacturing com...
#techinmanufacturing

Gyuhang Shim · plto001.hashnode.dev · Oct 14, 2024
Lambda vs Kappa Architecture in Data Pipeline (Korean)
Lambda Architecture components. Batch Layer: periodically processes large volumes of historical data (e.g., daily or hourly). This ensures high accuracy and data completeness, and it handles complex data transformations. Speed Layer: processes real-time data to deliver low-latency results. The Batch Layer processing the same data...
kappa architecture