monday.com Deploys Nebius Token Factory to Scale Open-Source AI Inference

Investor M.V. Cunha revealed monday.com is deploying Nebius Token Factory to cut open‑source AI inference costs and latency using Nebius AI Cloud 3.0 "Aether".

Marcus Chen•3/8/2026•3 min read

Published 03:01 AM

Investor M.V. Cunha said monday.com is "an enterprise customer of Nebius ($NBIS), deploying their Token Factory to optimize open‑source AI model inference for efficiency and cost savings," adding, "This partnership highlights monday.com’s push into advanced AI infrastructure amid competitive pressures." That investor disclosure is the only supplied source that names monday.com as a Nebius customer; no Nebius press material in the supplied set lists monday.com by name.

Nebius unveiled Nebius Token Factory in a press release datelined Amsterdam on November 5, 2025, positioning the product as a production inference platform built on Nebius AI Cloud 3.0 "Aether". Nebius said, "AI projects often scale faster than the teams around them. Nebius Token Factory streamlines the post‑training lifecycle, turning open‑source model weights into optimized, production‑ready systems with guaranteed performance and transparent cost per token."

The vendor attached specific figures to its tooling and performance claims: "Integrated fine‑tuning and distillation pipelines allow teams to adapt large open models to their own data while cutting inference costs and latency by up to 70%," and added that "optimized models can be deployed to production endpoints instantly, without manual infrastructure setup." In the same November 5 release, Nebius also stated that Token Factory is "validated by benchmarks including MLPerf® Inference."

The press release listed supported models and hosting options, saying Token Factory "supports all major open models, including DeepSeek, GPT‑OSS by OpenAI, Llama, NVIDIA Nemotron and Qwen, and also offers customers the option to host their own models." Parts of the syndicated text list NVIDIA Nemotron twice, a repetition that appears in the supplied material.

Customer and partner endorsements accompanied the launch. Alex Mashrabov, founder and CEO of Higgsfield AI, said, "Running inference at scale with healthy economics requires efficient on‑demand and autoscaling capabilities. Nebius was the only provider that met our requirements — reducing overhead, simplifying management, and enabling us to deliver faster, more cost‑efficient AI in production." Julien Chaumond, CTO at Hugging Face, added, "Hugging Face and Nebius share the same mission of making open AI accessible and scalable. By partnering with Nebius Token Factory, we’ve been able to provide faster and more reliable inference for developers building on large open‑source models."

Independent coverage in Windowsforum framed Token Factory as "a full‑throated attempt to carve out a position in the AI cloud race" and listed enterprise use cases including customer‑facing conversational AI with sub‑second response requirements, vertical LLMs for healthcare and finance, high QPS embedding and search services, code generation tooling, and vision‑and‑language multimodal applications. Windowsforum also cautioned that "the difference between compelling vendor narratives and operational reality hinges on independent validation. Nebius’ performance claims — sub‑second latency at hyperscale, 99.9% uptime, and dramatic inference cost reductions — are achievable in particular configurations, but enterprises must insist on proof through rigorous PoCs, audited compliance documentation, and contractual protections."

Windowsforum further reported company context that supports Nebius’s commercial push: a headquarters in Amsterdam, a public listing on a major U.S. exchange, a September 2025 multi‑year GPU infrastructure agreement with a global cloud company, and public filings indicating substantial capital raises to scale data center footprint.

The key unresolved item for monday.com employees and observers is confirmation: the investor disclosure is the sole supplied source naming monday.com as a Nebius customer, and the scope, timeline, models in use, measured cost or latency reductions, and contractual SLAs remain unconfirmed in the supplied materials. Developments that would settle these questions include direct confirmation from monday.com or Nebius, MLPerf® Inference benchmark reports with their configurations, PoC results that reproduce the cited "up to 70%" gains, and any contract announcements that clarify commercial terms and SLAs.