Turboline LTD · blog.turboline.ai · May 6, 2024
AI-Based Data Transformation: A Comparison of LLM-Generated PySpark Code (Using GPT-4)
Welcome to the first installment of our series comparing data transformation code generated by various large language models (LLMs). In this series, we aim to explore how different AI models approach data engineering and analytical tasks under ident...
Tag: chatgpt
Yash Srivastava · blog.yashsrivastava.link · Jan 18, 2023
Basic Spark RDD transformations
RDDs (resilient distributed datasets) are the basic unit of storage in Spark. You can think of an RDD as a collection distributed over multiple machines. Most of the time, higher-level structured APIs are used in Spark applications, which under the hood g...
Tag: spark