
PySpark

Creator: Seonglae Cho
Created: 2021 Jul 10 5:41
Editor: Seonglae Cho
Edited: 2024 May 4 9:5
Refs
from pyspark import SparkContext
masterurl = 'spark://:7077'
sc = SparkContext(masterurl, 'myapp')
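The snippet above leaves the host part of the master URL blank. As a minimal sketch, assuming a standalone master listening on the default port 7077, the URL can be built from a host name and passed to `SparkSession.builder`, the entry point used in recent PySpark versions. The helper names `standalone_master_url` and `create_session` are hypothetical and do not come from the original note.

```python
def standalone_master_url(host: str, port: int = 7077) -> str:
    """Build a spark://HOST:PORT URL for a standalone master."""
    if not host:
        raise ValueError("host must be a non-empty hostname or IP address")
    return f"spark://{host}:{port}"


def create_session(master_url: str, app_name: str = "myapp"):
    """Create (or reuse) a SparkSession connected to the given master."""
    # Imported lazily so the URL helper can be used without pyspark installed.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .master(master_url)
        .appName(app_name)
        .getOrCreate()
    )
```

For example, `spark = create_session(standalone_master_url("localhost"))` would connect to a master on the local machine ("localhost" is an assumed host). The underlying `SparkContext` is then available as `spark.sparkContext`, and `spark.stop()` releases the connection.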
 
 
 
 
Data Science (Personal Site)
https://dkharazi.github.io/blog/spark-standalone/
PySpark Documentation - PySpark 3.1.2 documentation
PySpark is an interface for Apache Spark in Python. It not only allows you to write Spark applications using Python APIs, but also provides the PySpark shell for interactively analyzing your data in a distributed environment. PySpark supports most of Spark's features such as Spark SQL, DataFrame, Streaming, MLlib (Machine Learning) and Spark Core.
https://spark.apache.org/docs/latest/api/python/
Speeding up slow PySpark JSONL loading by providing a schema
Supplying a schema through StructType avoids the extra pass over the data that Spark would otherwise make to build the metadata (infer the schema).
https://junbuml.ee/pyspark-jsonl-schema
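The schema approach described above can be sketched as follows. This is an illustration under assumptions, not the linked post's code: the field names `id` and `text`, their types, and the helper names `build_schema` and `read_jsonl` are all hypothetical. When no schema is given, `DataFrameReader.json` goes through the input once to determine the schema; passing one up front skips that pass.

```python
# Hypothetical field layout: (name, Spark type name, nullable).
EXAMPLE_FIELDS = [
    ("id", "long", False),
    ("text", "string", True),
]


def build_schema(fields):
    """Turn (name, type_name, nullable) tuples into a StructType."""
    # Imported lazily so this module can be imported without pyspark.
    from pyspark.sql.types import LongType, StringType, StructField, StructType

    type_map = {"long": LongType, "string": StringType}
    return StructType(
        [
            StructField(name, type_map[type_name](), nullable)
            for name, type_name, nullable in fields
        ]
    )


def read_jsonl(spark, path, schema):
    """Read a JSON Lines file with an explicit schema, skipping inference."""
    # DataFrameReader.json treats each line as one JSON record by default.
    return spark.read.schema(schema).json(path)
```

With an active session `spark`, a call such as `read_jsonl(spark, "data.jsonl", build_schema(EXAMPLE_FIELDS))` would return a DataFrame with the declared columns; the file name is a placeholder.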
 
 
 
 

Copyright Seonglae Cho