@thanhtrung5763

Trung Thành

@thanhtrung5763 · Ho Chi Minh City, Vietnam · Joined May 2026

About

Nothing here yet.

Available for

Nothing here yet.

Trung Thành's blogs

Spark · thanh-de.hashnode.dev · 2 posts


Recently published

Trung Thành · thanh-de.hashnode.dev · May 6 · 6 min read

I spent 6 hours studying PySpark join strategies. Here's what I learned

match keys between two tables and boom, you get results. That mindset worked fine in SQL databases. Then I started working with Spark on large datasets and my jobs started failing, timing out, or grinding for hours. The reality: Spark join performanc...

Trung Thành · thanh-de.hashnode.dev · May 6 · 4 min read

I spent 8 hours learning Spark partitioning and bucketing. Here's what I discovered

s one thing I've noticed: most Spark pipelines waste 30-60% of their compute time reading data they don't need or shuffling data that could have been pre-organized. During my recent deep-dive, I spent 8 hours learning two important optimization techn...
