Madhav Ganesan · madhavganesan.hashnode.dev · Nov 24, 2024
Introduction to Hadoop:)
Hadoop is an open-source software framework designed to handle and process large volumes of data across distributed computing environments. It is designed to be scalable, fault-tolerant, and capable of handling vast amounts of data efficiently. Clust...
hadoop

Madhav Ganesan · madhavganesan.hashnode.dev · Nov 17, 2024
Introduction to Big Data Analysis
Data refers to raw, unprocessed facts, statistics, or information collected for reference, analysis, and processing. They are of different formats: Structured Data: Organized in a defined format, such as rows and columns in a database. Unstructured Da...
hadoop

Rajni Rethesh for Middleware - Be Productive, Not Busy! · middlewarehq.com · Oct 11, 2024
Apache Hadoop Dora Metrics: Disorganized Workflow with Prolonged Lead Times
From social media posts and sensor data on machinery to financial transactions, tax info, defense stats, public health updates, and medical records – there's a whole lot of data flying around in the online world. This data can come in all shapes and ...
100 Days of Dora Metrics Case Studies
apache hadoop

Rajat Srivastava · algowithrajat.hashnode.dev · Sep 28, 2024
Big Data
With the rise of social media, online shopping, streaming services, etc., vast amounts of information are generated with each click and interaction we make online. This data can provide valuable insights to companies to improve their produc...
big data

Shivanshi Singh · shivanshi770.hashnode.dev · Sep 17, 2024
Hadoop vs. Spark: Which Big Data Framework is Right for You?
Introduction: In the world of Big Data, Hadoop and Spark are two of the most powerful and widely used frameworks. Both offer robust solutions for handling large-scale data processing, but they differ in how they approach and solve Big Data challenges....
hadoop

Mudassar Khanidataisgold.hashnode.dev · Aug 13, 2024
How does Databricks compare to Hadoop?
Databricks and Hadoop are both powerful platforms for processing and analyzing large datasets, but they have different architectures, capabilities, and approaches to handling big data. Here's a comparison between the two: 1. Architecture Databricks:...
hadoop

yugal kishore · yugalkishore.hashnode.dev · Aug 8, 2024
PageRank using mapreduce on hadoop
This is a guide to building a PageRank project using Hadoop for data analysis; you can also use it as a cloud computing project. To see how to install Hadoop, you can refer to Code With Arjun’s guide: https://codewitharjun.medium.com/install-hadoop-on-ub...
2 likes · 36 reads
hadoop

Maseed Irfan aliirfandataengineer.hashnode.dev · Jul 31, 2024
Mastering Data Management with Hive: Internal vs. External Tables
When it comes to data management in Hive, understanding the difference between internal and external tables is crucial. Here’s a comprehensive guide on how to efficiently manage your d...
hive

Rahul Rathod · codeok.hashnode.dev · Jul 14, 2024
The Revolutionary Journey of Apache Spark: From Academic Roots to Industry Dominance
In the world of big data, speed and efficiency are paramount. Among the many technologies that have emerged to address these needs, Apache Spark stands out as a revolutionary force. Born from academic innovation and nurtured by a growing community, S...
apache

Abhishek Jaiswal · dataplumber.hashnode.dev · Jun 29, 2024
Data Democratization..
In today's data-driven world, data democratization is a game-changer for businesses. By making data accessible to all employees, regardless of their technical expertise, companies can harness the power of information to drive efficiency, innovation, ...
1 like
data-engineering