AI That Outsmarts the Experts: doubleAI's WarpSpeed Rewrites GPU Code

📊 Key Data
  • 3.6x average performance increase over NVIDIA's hand-optimized cuGraph library
  • 100% of algorithms tested run faster with WarpSpeed's code
  • 55% of kernels achieved >2x speedup
🎯 Expert Consensus

Experts are likely to view this as a significant breakthrough in AI's ability to autonomously optimize complex GPU code, potentially transforming fields bottlenecked by expert knowledge.

TEL AVIV, Israel – March 02, 2026 – In a bold announcement that challenges the boundaries of artificial intelligence and human expertise, Israeli startup doubleAI has unveiled WarpSpeed, a system it claims has autonomously surpassed world-class human engineers in one of computing’s most complex fields: GPU performance optimization.

The company reported that its new “Artificial Expert” system rewrote and re-optimized every kernel in NVIDIA’s cuGraph library, a foundational tool for graph analytics used worldwide. The result, by doubleAI's measurements, was a 3.6x average speedup over the existing code, which top-tier NVIDIA engineers have hand-tuned for nearly a decade. The optimized library, called doubleGraph, has been released on GitHub as a drop-in replacement that promises these gains with no code changes required from developers.

A New Benchmark in AI Performance

The performance metrics released by doubleAI are striking. According to the company, 100% of the algorithms tested run faster with WarpSpeed's code, and 55% of the individual kernels achieved more than a twofold improvement. These gains were reportedly validated across three different NVIDIA GPU architectures—the A100, L4, and A10G—suggesting the improvements are robust and widely applicable.
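For readers who want to see how headline figures of this kind are typically derived, here is a minimal Python sketch. It is not doubleAI's benchmarking code: the kernel names and timings are invented purely for illustration, the function `summarize_speedups` is hypothetical, and the choice of an arithmetic mean is an assumption, since the announcement does not say which kind of average it uses.

```python
from statistics import mean


def summarize_speedups(baseline_ms, optimized_ms):
    """Return per-kernel speedups and three headline-style aggregates."""
    # Speedup = old runtime / new runtime, so values above 1.0 mean "faster".
    speedups = {name: baseline_ms[name] / optimized_ms[name] for name in baseline_ms}
    values = list(speedups.values())
    return {
        "speedups": speedups,
        "mean_speedup": mean(values),  # arithmetic mean (an assumption)
        "share_faster": sum(s > 1.0 for s in values) / len(values),
        "share_over_2x": sum(s > 2.0 for s in values) / len(values),
    }


# Hypothetical timings in milliseconds; NOT measurements from the article.
baseline_ms = {"kernel_a": 12.0, "kernel_b": 8.0, "kernel_c": 30.0, "kernel_d": 5.0}
optimized_ms = {"kernel_a": 4.0, "kernel_b": 5.0, "kernel_c": 6.0, "kernel_d": 4.0}

summary = summarize_speedups(baseline_ms, optimized_ms)
print(summary)
```

With these made-up numbers every kernel is faster, half exceed 2x, and the arithmetic mean is pulled upward by the single 5x result. That sensitivity is one reason independent benchmarks often report a geometric mean or a median alongside the average.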

What makes this achievement particularly significant is the nature of the challenge. GPU performance engineering is an esoteric discipline, often described as a “dark art.” Unlike many problems where AI has shown success, this domain breaks the mold. While AI has famously conquered games and even excelled on competitive coding platforms, those tasks often benefit from abundant training data, simple solution verification, and short, logical steps. GPU optimization has none of these luxuries.

Publicly available, truly optimized CUDA kernels are scarce, numbering only a few thousand, which starves data-hungry AI models. Furthermore, verifying the correctness of a new kernel is profoundly difficult; simple comparisons are insufficient, and multiple valid solutions can exist. Finally, achieving peak performance requires a long and intricate chain of interdependent decisions involving memory layout, caching strategies, and hardware-specific scheduling. It’s a domain where even the most advanced AI coding agents, such as Gemini CLI and Claude Code, have reportedly failed, producing incorrect or buggy code nearly 40% of the time in doubleAI's tests.
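The point about multiple valid solutions can be made concrete with connected components, a common graph algorithm. Two correct implementations may assign different label values to the same groups of vertices, so an element-by-element comparison reports a mismatch even though both answers are right. The Python sketch below is an illustration only; `same_partition` and the sample label arrays are hypothetical and are not taken from cuGraph or doubleGraph.

```python
def same_partition(labels_a, labels_b):
    """True if two labelings group vertices identically, ignoring label values."""
    if len(labels_a) != len(labels_b):
        return False
    a_to_b, b_to_a = {}, {}
    for a, b in zip(labels_a, labels_b):
        # Each label in one output must map to exactly one label in the other.
        if a_to_b.setdefault(a, b) != b or b_to_a.setdefault(b, a) != a:
            return False
    return True


# Hypothetical outputs for a 5-vertex graph with components {0,1,2} and {3,4}.
reference = [0, 0, 0, 3, 3]
candidate = [2, 2, 2, 4, 4]  # same grouping, different label values
wrong = [2, 2, 4, 4, 4]      # vertex 2 placed in the wrong component

print(reference == candidate)                # naive comparison rejects a valid answer
print(same_partition(reference, candidate))  # structural comparison accepts it
print(same_partition(reference, wrong))      # and still rejects a real error
```

A checker of this kind compares what the output means rather than its exact bytes, which is the sort of judgment a naive diff cannot make.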

Introducing 'Artificial Expert Intelligence'

doubleAI is framing this breakthrough not as a step toward the much-hyped Artificial General Intelligence (AGI), but as the dawn of a new, perhaps more immediately practical paradigm: Artificial Expert Intelligence (AEI).

“The real question isn’t ‘can AI code?’ — it’s ‘can AI become an expert?’” said Prof. Amnon Shashua, the company's co-founder and CEO, in the press release. “Humanity’s progress is bottlenecked by experts. If we can copy and paste expertise into the world, the impact is transformative.”

The science underpinning WarpSpeed is built on two novel algorithmic concepts designed to tackle the domain's inherent difficulties. The first, Diligent Learning, is a method for efficiently searching vast and complex design spaces to find high-quality solutions even when training data is sparse. The second, PAC Reasoning and Verification, is a sophisticated methodology for ensuring code correctness without a pre-existing “answer key.” It uses one AI component to generate challenging test cases and another to automatically validate the results, bootstrapping a reliable self-verification system. This creates a virtuous cycle, or flywheel, where better verification leads to better training data, which in turn builds stronger AI experts.
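doubleAI has not published implementation details in this announcement, so the following Python sketch should not be read as PAC Reasoning and Verification itself. It only illustrates the general generate-and-validate pattern described above, using plain randomized testing: one function generates test graphs, and another validates breadth-first-search distances by checking properties that only a correct answer satisfies, so no reference output is needed. All names here (`random_graph`, `valid_bfs_distances`, `bfs_distances`) are hypothetical, and the graphs are assumed to be undirected.

```python
import random
from collections import deque

UNREACHED = -1


def random_graph(rng, max_vertices=8):
    """Generator role: emit a small random undirected graph as adjacency lists."""
    n = rng.randint(1, max_vertices)
    adj = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < 0.3:
                adj[u].add(v)
                adj[v].add(u)
    return [sorted(neighbors) for neighbors in adj]


def valid_bfs_distances(adj, source, dist):
    """Validator role: accept dist only if it has the properties of BFS distances.

    Assumes symmetric (undirected) adjacency lists and integer distances.
    """
    if len(dist) != len(adj) or dist[source] != 0:
        return False
    for u, neighbors in enumerate(adj):
        if dist[u] == UNREACHED:
            # An unreached vertex must not touch any reached vertex.
            if any(dist[v] != UNREACHED for v in neighbors):
                return False
            continue
        if dist[u] < 0 or (u != source and dist[u] == 0):
            return False
        # Every reached non-source vertex needs a neighbor one step closer.
        if u != source and not any(dist[v] == dist[u] - 1 for v in neighbors):
            return False
        # Distances across any edge may differ by at most one.
        if any(abs(dist[u] - dist[v]) > 1 for v in neighbors):
            return False
    return True


def bfs_distances(adj, source):
    """Candidate under test: a straightforward queue-based BFS."""
    dist = [UNREACHED] * len(adj)
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if dist[v] == UNREACHED:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


rng = random.Random(0)
failures = []
for _ in range(200):
    adj = random_graph(rng)
    source = rng.randrange(len(adj))
    if not valid_bfs_distances(adj, source, bfs_distances(adj, source)):
        failures.append((adj, source))

# A 5-cycle (0-1-4-3-2-0): vertex 4 is two hops from vertex 0, not three.
cycle = [[1, 2], [0, 4], [0, 3], [2, 4], [1, 3]]
print(len(failures))
print(valid_bfs_distances(cycle, 0, [0, 1, 1, 2, 2]))
print(valid_bfs_distances(cycle, 0, [0, 1, 1, 2, 3]))
```

The validator never computes the right answer itself. It only checks conditions that a wrong answer must violate, which is one simple way to verify output without an answer key.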

From Lab to GitHub: The Community Response

In a move to back up its claims, doubleAI has made its optimized library, doubleGraph, publicly available on GitHub. It is presented as a seamless, drop-in replacement for NVIDIA's cuGraph, allowing any developer using the library to potentially realize significant performance gains instantly. This open-source release shifts the conversation from a corporate announcement to a public, verifiable test.

However, while the code is now in the wild, the industry awaits independent, third-party validation of the performance claims. Currently, all published benchmarks originate from doubleAI itself. The developer and high-performance computing communities will be the ultimate arbiters, putting doubleGraph through its paces in real-world applications. The repository's issue tracker, pull requests, and forks will serve as a live scorecard of its adoption and reliability. Initial documentation notes that the library is based on a recent version of cuGraph and is currently limited to single-GPU configurations, indicating a focused but potentially expanding scope.

The NVIDIA Connection and Broader Implications

The context of this announcement is made more interesting by doubleAI's backing. The company recently raised a massive $200 million Series A round with a list of high-profile investors that includes NVIDIA itself. This financial relationship adds a layer of complexity to the narrative. While NVIDIA is an investor, the doubleGraph GitHub repository explicitly states the project is “Not affiliated with, endorsed by, or sponsored by NVIDIA Corporation.” This suggests a financial bet on a promising technology rather than a direct strategic collaboration on this specific project.

If WarpSpeed's AEI model proves to be as effective as claimed, the implications extend far beyond GPU libraries. The company has made it clear that cuGraph was a stress test. By proving its mettle in a domain where data is scarce, validation is hard, and the human-engineered baseline is elite, doubleAI argues that AEI can be applied anywhere expertise is the primary bottleneck. The company has its sights set on other high-stakes fields like drug discovery, advanced chip design, cybersecurity, and climate technology.

The announcement positions WarpSpeed as a tool to close the persistent gap between rapidly advancing hardware and the software that struggles to keep pace. For decades, the full potential of new silicon has been locked away, accessible only to a small cadre of engineers who can master its intricacies. By automating this expertise, AEI could unlock latent performance in hardware already deployed, accelerating scientific research and enabling applications that were previously computationally out of reach. The true test for this technology now lies in its adoption and performance in real-world applications beyond the company's own benchmarks.
