Apache PySpark: PySpark ML (DataFrame-based)

DataFrame-based machine learning API with pipelines and feature transformers.

Documentation: https://spark.apache.org/docs/latest/api/python/reference/pyspark.ml.html

Other Resources
Pipeline Guide: https://spark.apache.org/docs/latest/ml-pipeline.html