LakeSail rewrites Apache Spark in Rust, cuts costs by 94%

LakeSail says its Rust rewrite of Spark drops the JVM, keeps Spark APIs intact, and trims benchmarked costs by 94% on TPC-H workloads.

Nina Kowalski · 2 min read
Source: X (formerly Twitter)

LakeSail has rebuilt Apache Spark in Rust and is pitching the result, called Sail, as a drop-in replacement that keeps Spark SQL and DataFrame jobs running without code rewrites. The company says Sail is Spark Connect-compatible, removes the JVM entirely, and ran 8x faster at 94% lower cost in benchmark testing.

The practical pitch is narrow and familiar to anyone running Spark in anger: keep the same APIs, move the endpoint, and leave the rest alone. LakeSail says migration often comes down to changing the address passed to SparkSession.builder.remote(...), so that existing PySpark jobs, pipelines, and notebooks point at Sail instead of a JVM-backed cluster.
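As a rough illustration of that endpoint swap, the sketch below builds a Spark Connect address and hands it to SparkSession.builder.remote(...). The host, the port 50051, and the SAIL_REMOTE environment variable are assumptions made for this example and are not taken from LakeSail's announcement; the function that opens the session needs PySpark with Spark Connect support installed and a reachable server.

```python
import os

# Assumption: a Spark Connect-compatible server (for example Sail) is listening
# at this address. Host and port are placeholders for illustration only.
DEFAULT_REMOTE = "sc://localhost:50051"


def remote_url() -> str:
    """Return the Spark Connect address, overridable via a hypothetical env var."""
    return os.environ.get("SAIL_REMOTE", DEFAULT_REMOTE)


def get_session():
    """Open a Spark Connect session against whatever remote_url() returns.

    PySpark is imported lazily so the address helper works without it installed.
    """
    from pyspark.sql import SparkSession

    return SparkSession.builder.remote(remote_url()).getOrCreate()
```

With a server running, a job would call get_session() and then use the usual API, for example spark.sql("SELECT 1 AS ok").show(). The rest of the job code stays as it was, which is the point of the pitch.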

Under the hood, LakeSail frames Sail as a Rust-native distributed compute engine built on Apache DataFusion and Apache Arrow. The company says that choice is what lets it target batch processing, streaming, and compute-heavy AI workloads in one system, while avoiding garbage-collection pauses and JVM startup overhead. LakeSail also says Sail supports Delta Lake and Apache Iceberg, and that deployments run in the customer’s AWS account in a BYOC model.

The company is still young enough that its ambitions matter as much as its benchmarks. Futuriom described LakeSail in March 2026 as a three-year-old San Francisco startup led by CEO and cofounder Shehab Amin, with a goal of becoming a Spark replacement for batch, stream, and AI workloads. Public project pages now show some real ecosystem motion too: PyPI lists pysail 0.6.4 with a June 6, 2026 release date, and describes it as a Rust-written Spark replacement.

The headline numbers still deserve the fine print LakeSail gives them. The 94% cost figure comes from derived TPC-H benchmarks, and the company says actual savings will vary by workload. That leaves the central question for Rust practitioners intact: whether this is more than a sharp benchmark and a clean API story, or a meaningful step toward Rust-native data infrastructure that can stand in for Spark without asking users to rebuild their pipelines from scratch.
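One way to read the two headline figures together is a back-of-the-envelope check, sketched below. It assumes, purely for illustration, that benchmark cost is runtime multiplied by an hourly infrastructure price; LakeSail has not said the figures were derived this way. Under that assumption an 8x speedup alone would account for 87.5% savings, and a 94% reduction would imply the hourly price also fell to roughly 0.48 of the baseline.

```python
def savings_from_speedup(speedup: float) -> float:
    """Fraction of cost saved if only runtime changes (the hourly price is equal)."""
    return 1.0 - 1.0 / speedup


def implied_price_ratio(speedup: float, cost_reduction: float) -> float:
    """Hourly price ratio (new / old) implied by a speedup and a total cost reduction.

    Assumes cost = runtime * hourly price, so
    cost_ratio = price_ratio / speedup.
    """
    cost_ratio = 1.0 - cost_reduction
    return cost_ratio * speedup


speed_only = savings_from_speedup(8.0)          # 0.875
price_ratio = implied_price_ratio(8.0, 0.94)    # roughly 0.48
```

The gap between 87.5% and 94% is why the company's caveat matters: part of the saving would have to come from something other than raw speed, such as smaller or cheaper instances, and that part is the most workload-dependent.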

This article was produced by Prism’s automated news system from verified source data, official records, and press releases, then run through automated quality and moderation checks before publishing. The system is built and supervised by the people who set the standards it runs under.
