Qualcomm's ARM racks challenge AI competition in the data center
Briefly

"Two variants will be released, the AI200 and the AI250, the former in 2026 and the latter in 2027. With both solutions, Qualcomm is entering the data center market for AI inferencing, or the daily running of AI models. The AI200 focuses on low total cost of ownership (TCO) and optimizes performance for large language models (LLMs) and multimodal models (LMMs). Each card has 768 GB of LPDDR memory, which provides more capacity for less money."
"The AI250 goes one step further with an innovative memory architecture based on near-memory computing. This places computing power close to the physical memory chips. This delivers more than 10x higher effective memory bandwidth and much lower energy consumption. This enables so-called disaggregated AI inferencing, which increases hardware efficiency. This splits the calculation of the AI tokens and the processing of the initial prompt."
Inferencing will account for the lion's share of AI calculations. Qualcomm is entering the data center AI inferencing market with two rack-scale systems: the AI200 in 2026 and the AI250 in 2027. The AI200 targets low total cost of ownership and optimizes performance for large language and multimodal models, offering 768 GB of LPDDR per card for greater capacity at lower cost. The AI250 employs near-memory computing to place compute close to memory, delivering over 10x higher effective memory bandwidth at much lower energy use and enabling disaggregated inferencing. Both racks use direct liquid cooling, PCIe for scale-up, Ethernet for scale-out, and confidential computing, and each draws about 160 kW.
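The 160 kW figure also gives a quick way to size a deployment. The facility size and PUE overhead in this arithmetic are assumptions; only the per-rack draw comes from the article.

```python
# Quick arithmetic on the quoted 160 kW/rack figure: how many racks fit a
# given facility power budget? The PUE overhead value is an assumption.

RACK_KW = 160  # per-rack draw quoted in the article

def racks_for_budget(facility_mw: float, pue: float = 1.2) -> int:
    """Racks that fit once cooling/overhead (assumed PUE) is accounted for."""
    usable_kw = facility_mw * 1000 / pue
    return int(usable_kw // RACK_KW)

print(racks_for_budget(10))  # -> 52 racks in an assumed 10 MW hall
```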
Read at Techzine Global