Madhav Ganesan · madhavganesan.hashnode.dev · Nov 17, 2024 · Introduction to Big Data Analysis · Data refers to raw, unprocessed facts, statistics, or information collected for reference, analysis, and processing. They are of different formats: Structured Data: Organized in a defined format, such as rows and columns in a database Unstructured Da... · hadoop
Rajni Rethesh for Middleware - Be Productive, Not Busy! · middlewarehq.com · Oct 11, 2024 · Apache Hadoop Dora Metrics: Disorganized Workflow with Prolonged Lead Times · From social media posts and sensor data on machinery to financial transactions, tax info, defense stats, public health updates, and medical records – there's a whole lot of data flying around in the online world. This data can come in all shapes and ... · 100 Days of Dora Metrics Case Studies · apache hadoop
Rajat Srivastava · algowithrajat.hashnode.dev · Sep 28, 2024 · Big Data · With the rise of social media, online shopping, streaming services, etc., vast amounts of information are generated with each click and interaction we make online. This data can provide valuable insights to companies to improve their produc... · big data
Shivanshi Singh · shivanshi770.hashnode.dev · Sep 17, 2024 · Hadoop vs. Spark: Which Big Data Framework is Right for You? · Introduction: In the world of Big Data, Hadoop and Spark are two of the most powerful and widely used frameworks. Both offer robust solutions for handling large-scale data processing, but they differ in how they approach and solve Big Data challenges.... · hadoop
Mudassar Khanidataisgold.hashnode.dev · Aug 13, 2024 · How does Databricks compare to Hadoop? · Databricks and Hadoop are both powerful platforms for processing and analyzing large datasets, but they have different architectures, capabilities, and approaches to handling big data. Here's a comparison between the two: 1. Architecture Databricks:... · hadoop
yugal kishore · yugalkishore.hashnode.dev · Aug 8, 2024 · PageRank using mapreduce on hadoop · This is a guide to building a PageRank project using Hadoop for data analysis; you can also use it as a cloud computing project. To see how to install Hadoop, you can refer to Code With Arjun’s guide: https://codewitharjun.medium.com/install-hadoop-on-ub... · 2 likes · 36 reads · hadoop
Maseed Irfan ali · irfandataengineer.hashnode.dev · Jul 31, 2024 · Mastering Data Management with Hive: Internal vs. External Tables · When it comes to data management in Hive, understanding the difference between internal and external tables is crucial. Here’s a comprehensive guide on how to efficiently manage your d... · hive
Rahul Rathod · codeok.hashnode.dev · Jul 14, 2024 · The Revolutionary Journey of Apache Spark: From Academic Roots to Industry Dominance · In the world of big data, speed and efficiency are paramount. Among the many technologies that have emerged to address these needs, Apache Spark stands out as a revolutionary force. Born from academic innovation and nurtured by a growing community, S... · apache
Abhishek Jaiswal · dataplumber.hashnode.dev · Jun 29, 2024 · Data Democratization.. · In today's data-driven world, data democratization is a game-changer for businesses. By making data accessible to all employees, regardless of their technical expertise, companies can harness the power of information to drive efficiency, innovation, ... · 1 like · data-engineering
Abhishek Jaiswal · dataplumber.hashnode.dev · Jun 27, 2024 · ETL Process with Generative AI. · The Extract, Transform, Load (ETL) process is a cornerstone of modern data management, crucial for ensuring data is accurately moved from source to destination. Generative AI has emerged as a powerful tool to optimize and streamline this process, enh... · 1 like · etl-pipeline