qtmuniao.hashnode.dev
Scaling Big Data Processing in the Cloud: Proven Practices with Spark & Ray
Tags: cloud-data-processing, large-scale-data, spark, ray, cloud-architecture, data-engineering, python-dataprocessing, devops-cloud, kubernetes, distributed-systems
Table of Contents Introduction Sharing 2.1 Environment Sharing Across Machines 2...
Nov 19, 2025 · 8 min read

qtmuniao.hashnode.dev
3 basic sorting algorithms
Bubble sort, selection sort, and insertion sort are the three most basic sorting algorithms. They seem very different at first glance, but they share the same underlying principle. The process can be split into two steps: Partiti...
Nov 16, 2023 · 1 min read

qtmuniao.hashnode.dev
Large-Scale Data Systems (1): DataSet
To explore large-scale data systems, from storage to processing and from interface to implementation, I'll be writing a series of articles. This is the first one: Dataset. Logically, we treat all the data to be processed as a dataset. Representa...
Sep 28, 2023 · 1 min read

qtmuniao.hashnode.dev
Step by step, dissecting the most challenging part of the database transaction: Isolation
When it comes to database transactions, most people's first reaction is ACID. However, the four properties are not equal in importance or complexity. The hardest one to understand is Isolation (I). One primary reason for t...
Sep 23, 2023 · 7 min read

qtmuniao.hashnode.dev
LevelDB Data Structures Series I: Skip List
I'd heard a lot about LevelDB, and when I finally got a chance to skim through the code on a whim, I found that it deserves its reputation. If you're interested in storage engines, if you want to use C++ gracefully, or if you want to learn how to organize code, I recomme...
Sep 23, 2023 · 12 min read