Jean Yves – Medium

Jean Yves in Towards Data Science

Delight: The New & Improved Spark UI & Spark History Server is now Generally Available

Delight is a free, hosted, cross-platform monitoring dashboard for Apache Spark with memory and CPU metrics that will hopefully delight…

9 min read · May 5, 2021

Jean Yves in Data Mechanics

Spark on Kubernetes Made Easy: How Data Mechanics Improves On The Open-Source Version

If you’re looking for a high-level introduction to Spark on Kubernetes, check out The Pros And Cons of Running Spark on Kubernetes, and…

5 min read · Apr 28, 2021

Jean Yves in Towards Data Science

Run your R (SparklyR) workloads at scale with Spark-on-Kubernetes

Tutorial: How to build the right Docker image, start your Spark session, and run at scale!

4 min read · Jan 25, 2022

Jean Yves in Towards Data Science

Improve Apache Spark performance with the S3 magic committer

Achieve up to 65% performance gain using the latest S3 magic committer from Spark 3.2 and Hadoop 3.3!

7 min read · Jan 20, 2022

Jean Yves in Towards Data Science

Apache Spark 3.2 Release — What’s New For Spark-on-Kubernetes

Apache Spark 3.2 was released in October 2021 (see release notes) and it is now available for Data Mechanics customers, and for anyone who…

8 min read · Nov 4, 2021

Jean Yves in Towards Data Science

Tutorial: Running PySpark inside Docker containers

In this article we’re going to show you how to start running PySpark applications inside of Docker containers, by going through a…

4 min read · Oct 28, 2021

Jean Yves in Towards Data Science

Optimized Docker Images for Apache Spark — Now Public on DockerHub

They include Spark, Python, Scala, Java, Hadoop, and fast connectors to S3, GCS, Azure Data Lake, Delta Lake, Snowflake, and other sources…

4 min read · May 12, 2021

Jean Yves in Towards Data Science

The Story of a Migration from EMR to Spark on Kubernetes

The goals of our migration, the architecture we targeted, the technical challenges we encountered, and the results we achieved.

5 min read · Apr 27, 2021

Jean Yves in Towards Data Science

Apache Spark 3.1 Release: Spark on Kubernetes is now Generally Available

With the Apache Spark 3.1 release in March 2021, the Spark on Kubernetes project is now officially declared as production-ready and…

8 min read · Mar 30, 2021

Jean Yves in Towards Data Science

Highlights of Data + AI Summit 2020 (formerly Spark Summit)

Recent developments with Spark 3.0, Spark-on-Kubernetes going GA, PySpark usability improvements, and more.

7 min read · Dec 15, 2020

Jean Yves

Co-Founder @ Data Mechanics, the cloud-native Spark platform. Senior Product Manager @ Spot.io, building Ocean for Spark. Former software engineer @ Databricks.
