Penguin Solutions Introduces Industry's First Production-Ready CXL-Based KV Cache Server

By: Penguin Solutions, Inc. via Business Wire
March 16, 2026 at 4:37 PM EDT

Penguin Solutions MemoryAI KV cache server, an 11 TB memory appliance, enables efficient deployment of enterprise-scale AI inference

Penguin Solutions, Inc. (Nasdaq: PENG), the AI factory platform company, today announced the Penguin Solutions MemoryAI™ KV cache server, the industry's first production-ready KV cache server that uses CXL memory technology to address the critical "memory wall" challenge in AI inferencing. The server delivers up to 11 TB of CXL-based memory engineered to optimize the performance of enterprise-scale inference, including agentic AI. The result is lower latency, higher throughput, more efficient use of GPU clusters, consistent achievement of stringent service-level agreements (SLAs), and faster time-to-first-token (TTFT).

This press release features multimedia. View the full release here: https://www.businesswire.com/news/home/20260316416248/en/

Penguin Solutions MemoryAI KV cache server is the industry's first production-ready KV cache server that utilizes CXL memory technology to address the critical "memory wall" challenge in AI inferencing. The innovative solution delivers up to 11 TB of CXL-based memory engineered to optimize performance of enterprise scale inference, including agentic AI.

While model training and tuning are primarily compute-bound and occur episodically, inference and agentic AI workloads run continuously, are memory-bound and latency-sensitive, and are fundamentally different. Inference demands are typically 30% compute driven (GPU) and 70% memory driven (RAM), which raises the need for memory capacity and, when that capacity is lacking, causes performance bottlenecks and GPU idle time. To accelerate memory-dependent AI processes, Penguin’s MemoryAI KV cache server increases memory capacity by integrating 3 TB of DDR5 main memory and up to eight 1 TB CXL Add-in Cards (AICs).
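The capacity figures above add up as 3 TB of DDR5 plus eight 1 TB CXL cards, and a rough sizing sketch shows how such a pool might translate into cached tokens. The model shape used here (80 layers, 8 KV heads, head dimension 128, 2-byte values) and the use of decimal terabytes are illustrative assumptions, not figures from Penguin Solutions.

```python
def total_memory_tb(ddr5_tb=3, cxl_cards=8, tb_per_card=1):
    """Total appliance memory: DDR5 main memory plus CXL add-in cards."""
    return ddr5_tb + cxl_cards * tb_per_card


def kv_bytes_per_token(num_layers, num_kv_heads, head_dim, bytes_per_value=2):
    """KV cache bytes stored per token; the leading 2 covers keys and values."""
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_value


def max_cached_tokens(capacity_tb, bytes_per_token):
    """Upper bound on cached tokens, assuming 1 TB = 10**12 bytes and no overhead."""
    return int(capacity_tb * 10**12) // bytes_per_token


if __name__ == "__main__":
    capacity = total_memory_tb()  # 3 + 8 * 1
    # Hypothetical model shape, chosen only for illustration.
    per_token = kv_bytes_per_token(num_layers=80, num_kv_heads=8, head_dim=128)
    print(f"capacity: {capacity} TB")
    print(f"KV bytes per token: {per_token}")
    print(f"max cached tokens: {max_cached_tokens(capacity, per_token)}")
```

Real deployments would hold fewer tokens than this bound, since part of the memory serves the operating system, metadata, and fragmentation.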

“CXL-enabled KV cache technology delivers faster time-to-first-token, reduced time per output token, and increased overall end-to-end token throughput,” said Phil Pokorny, chief technology officer at Penguin Solutions. “These critical performance improvements enable enterprise-scale inferencing across many users who expect low latency and timely access to AI-generated insights. The introduction of Penguin’s MemoryAI KV cache server is designed to help enterprises sustain these performance improvements and consistent service standards as model size, context windows, precision requirements, and concurrency demands continue to grow.”

By significantly expanding the memory available to GPUs, the server enables organizations to mitigate GPU memory bandwidth limits, reduce redundant re-compute operations, and optimize clusters for inference performance. This increased system efficiency also enables organizations to train larger models and process expansive datasets faster.

Benefits of the Penguin Solutions MemoryAI KV cache server in Cluster Design

With expanded, disaggregated memory, the server offers several operational benefits:

  • Support for larger context size and concurrency: Penguin’s MemoryAI KV cache server is particularly crucial for enterprise-scale tasks requiring large context windows and minimal latency, including real-time financial news parsing, retrieval-augmented generation (RAG) over massive 10-K datasets, and regulatory compliance analysis.
  • Flexibility to tier cluster memory: CXL-based KV cache delivered by the server creates a new tier of cluster memory to supplement existing high bandwidth memory (HBM) and system DRAM, delivering speeds 10x faster than NVMe-based approaches. This provides new flexibility in offloading KV data for faster access.
  • Compatibility with NVIDIA Dynamo: The solution is compatible with NVIDIA Dynamo, NVIDIA's software architecture for KV cache memory offloading.
  • Cost and power efficiency: The server enables organizations to maximize the efficient use of GPUs by adding large memory pools and optimizes clusters by right-sizing GPUs and memory. Additionally, the solution provides efficient operation, drawing less power than equivalent GPU servers.
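The memory-tiering idea in the list above can be sketched as a toy cache in which entries evicted from a faster tier are demoted to the next slower one, and only entries that fall off the slowest tier must be recomputed. This is a minimal illustration under stated assumptions: the class name `TieredKVCache`, the tier names, and the entry-count capacities are hypothetical, and the code does not represent NVIDIA Dynamo's API or Penguin Solutions' implementation.

```python
from collections import OrderedDict


class TieredKVCache:
    """Toy LRU cache with ordered tiers, fastest first (e.g. HBM, DRAM, CXL)."""

    def __init__(self, tiers):
        # tiers: list of (name, capacity_in_entries), fastest tier first.
        self.names = [name for name, _ in tiers]
        self.capacity = [cap for _, cap in tiers]
        self.stores = [OrderedDict() for _ in tiers]

    def _insert(self, level, key, value):
        # Past the slowest tier the entry is dropped and must be recomputed.
        if level >= len(self.stores):
            return
        store = self.stores[level]
        store[key] = value
        store.move_to_end(key)  # mark as most recently used
        if len(store) > self.capacity[level]:
            old_key, old_value = store.popitem(last=False)  # evict the LRU entry
            self._insert(level + 1, old_key, old_value)     # demote it one tier

    def put(self, key, value):
        # Drop any stale copy, then place the entry in the fastest tier.
        for store in self.stores:
            store.pop(key, None)
        self._insert(0, key, value)

    def get(self, key):
        # Search fastest to slowest; on a hit, promote to the fastest tier.
        for level, store in enumerate(self.stores):
            if key in store:
                value = store.pop(key)
                self._insert(0, key, value)
                return value, self.names[level]
        return None, "miss"


if __name__ == "__main__":
    cache = TieredKVCache([("hbm", 1), ("dram", 1), ("cxl", 2)])
    for prefix in ["a", "b", "c", "d"]:
        cache.put(prefix, f"kv-blocks-for-{prefix}")
    print(cache.get("a"))  # served from the slowest tier instead of recomputed
```

A larger slow tier means fewer entries fall off the end, which is the sense in which added capacity reduces redundant recomputation.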

The Penguin Solutions MemoryAI KV cache server builds upon Penguin Solutions’ legacy of innovation and expertise in high-performance computing. Customers are already deploying the solution to optimize cluster performance and meet demanding latency SLAs for production AI workloads.

Explore Penguin Solutions’ MemoryAI KV cache server page or visit booth #1031 at the NVIDIA GTC AI Conference and Expo March 16-19, 2026, in San Jose, Calif.

MemoryAI and Penguin Solutions are trademarks or registered trademarks of Penguin Solutions, Inc. or its affiliates. All other trademarks are the property of their respective owners.

About Penguin Solutions

The most transformative technological advancements are often the hardest to deploy and optimize. Penguin Solutions, the AI factory platform company, has the innovative technologies, skills, experience, and partnerships needed to turn your AI ambitions into reality.

In addition to our AI capabilities, Penguin Solutions offers memory and LED solutions serving a wide range of high-performance and specialized applications.

For more information, visit https://www.penguinsolutions.com.

View source version on businesswire.com: https://www.businesswire.com/news/home/20260316416248/en/

Contacts

PR Contact
Maureen O’Leary
Corporate Communications, Penguin Solutions
1-602-330-6846
pr@penguinsolutions.com
