Baseten Launches New Inference Products to Accelerate MVPs into Production Applications

Baseten announces first platform expansion powered by the Baseten Inference Stack: APIs for open-source AI models and features for training models to improve inference performance

Baseten, the leader in mission-critical inference, today announced the public launch of Baseten Model APIs and the closed beta of Baseten Training. These new products enable AI teams to seamlessly transition from rapid prototyping to scaling in production, building on Baseten’s proprietary inference stack.

In recent months, new releases of DeepSeek, Llama, and Qwen models have erased the quality gap between open and closed models, and organizations are more incentivized than ever to use open models in their products. Many AI teams, however, have been limited to testing open models at low scale because of the insufficient performance, reliability, and economics offered by model endpoint providers. While these shared model endpoints are easy to get started with, their deficiencies have fundamentally gated enterprises’ ability to convert prototypes into high-functioning products.

Baseten’s new products, Model APIs and Training, solve two critical bottlenecks in the AI lifecycle. Both products are built using Baseten’s Inference Stack and Inference-optimized Infrastructure, which power inference at scale in production for leading AI companies like Writer, Descript, and Abridge. Using Model APIs, developers can instantly access open-source models optimized for maximum inference performance and cost-efficiency to rapidly create production-ready minimum viable products (MVPs) or test new workloads.
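For illustration, the snippet below is a minimal sketch of what calling an open model through Model APIs might look like, assuming the endpoint is OpenAI-compatible; the base URL, model identifier, and environment variable name are placeholders rather than values confirmed by the announcement.

    # Minimal sketch (not from the announcement): assumes Baseten's Model APIs
    # expose an OpenAI-compatible chat-completions endpoint. The base URL,
    # model slug, and env var name are illustrative placeholders.
    import os
    from openai import OpenAI

    client = OpenAI(
        api_key=os.environ["BASETEN_API_KEY"],       # API key issued by Baseten (assumed name)
        base_url="https://inference.baseten.co/v1",  # assumed OpenAI-compatible endpoint
    )

    response = client.chat.completions.create(
        model="deepseek-ai/DeepSeek-V3",  # hypothetical open-model identifier
        messages=[{"role": "user", "content": "Summarize this launch in one sentence."}],
    )
    print(response.choices[0].message.content)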

“In the AI market, your number one differentiator is how fast you can move,” said Tuhin Srivastava, co-founder and CEO of Baseten. “Model APIs give developers the speed and confidence to ship AI features knowing that we’ve handled the heavy lifting on performance and scale.” Baseten Model APIs enable AI engineers to test open models with a confident scaling story in place from day one. As inference volume increases, Model APIs customers can easily move to Dedicated Deployments, which provide greater reliability, performance, and economics at scale.

"With Baseten, we now support open-source models like DeepSeek and Llama in Retool, giving users more flexibility for what they can build,” said DJ Zappegos, Engineering Manager at Retool. “Our customers are creating AI apps and workflows, and Baseten's Model APIs deliver the enterprise-grade performance and reliability they need to ship to production."

Customers can also use Baseten’s new Training product to rapidly train and fine-tune models, yielding superior inference performance, quality, and cost-efficiency for their workloads. Unlike traditional training solutions that operate in siloed research environments, Baseten Training runs on the same production-optimized infrastructure that powers its inference. This coherence ensures that models trained or fine-tuned on Baseten will behave consistently in production, with no last-minute refactoring. Together, the latest offerings enable customers to get products to market more rapidly, improve performance and quality, and reduce costs for mission-critical inference workloads.

These launches reinforce Baseten’s belief that product-focused AI teams must care deeply about inference performance, cost, and quality. “Speed, reliability, and cost-efficiency are non-negotiables, and that’s where we devote 100 percent of our focus,” said Amir Haghighat, co-founder and CTO of Baseten. “Our Baseten Inference Stack is purpose-built for production AI because you can’t just have one piece work well. It takes everything working well together, which is why we ensure that each layer of the Inference Stack is optimized to work with the other pieces.”

“Having lifelike text-to-speech requires models to operate with very low latency and very high quality,” said Amu Varma, co-founder of Canopy Labs. “We chose Baseten as our preferred inference provider for Orpheus TTS because we want our customers to have the best performance possible. Baseten’s Inference Stack allows our customers to create voice applications that sound as close to human as possible.”

Teams can start with a quick MVP and seamlessly scale it to a dedicated, production-grade deployment when needed, without changing platforms. An enterprise can prototype a feature on Baseten Cloud, then graduate to its own private clusters or on-prem deployment (via Baseten’s hybrid and self-hosted options) for greater control, performance tuning, and cost optimization, all with the same code and tooling. This “develop once, deploy anywhere” capability directly results from Baseten’s Inference-optimized Infrastructure, which abstracts the complexity of multi-cloud and on-premise orchestration for the user.
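As a sketch of that pattern, the snippet below keeps the application code identical and selects the target endpoint through configuration; the URLs and environment variable names are illustrative assumptions, not documented values, and the only premise is that each deployment exposes the same OpenAI-compatible interface.

    # Illustrative only: the same client code runs against a shared endpoint
    # during prototyping and a dedicated or self-hosted deployment later.
    # URLs and environment variable names are placeholders.
    import os
    from openai import OpenAI

    base_url = os.environ.get(
        "INFERENCE_BASE_URL",
        "https://inference.baseten.co/v1",  # placeholder: shared endpoint for prototyping
    )
    # For production, point INFERENCE_BASE_URL at the dedicated deployment's URL
    # (e.g., a private-cluster endpoint); the application code does not change.
    client = OpenAI(api_key=os.environ["BASETEN_API_KEY"], base_url=base_url)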

The news follows a year of considerable growth for the company. In February, Baseten announced the close of a Series C funding round co-led by IVP and Spark, bringing its total venture capital funding to $135 million. It was also recently named to the Forbes AI 50 2025, a list of the pre-eminent privately held AI companies that featured several companies for which Baseten powers 100 percent of inference, including Writer and Abridge.

About Baseten

Baseten is the leader in infrastructure software for high-scale AI products, offering the industry's most powerful AI inference platform. Committed to delivering exceptional performance, reliability, and cost-efficiency, Baseten is on a mission to help the next great AI products scale. Baseten is backed by top-tier investors including IVP, Spark, Greylock, Conviction, Base Case, and South Park Commons. Learn more at Baseten.co.
