PySpark Documentation - PySpark 3.1.2 documentation
PySpark is an interface for Apache Spark in Python. It not only allows you to write Spark applications using Python APIs, but also provides the PySpark shell for interactively analyzing your data in a distributed environment. PySpark supports most of Spark's features such as Spark SQL, DataFrame, Streaming, MLlib (Machine Learning) and Spark Core.
https://spark.apache.org/docs/latest/api/python/
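Speeding up slow PySpark JSONL loads by providing a schema
Supplying a schema through StructType lets Spark skip the extra pass over the data that it would otherwise make to build the metadata (that is, to infer the schema).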
https://junbuml.ee/pyspark-jsonl-schema

Seonglae Cho