Google’s TurboQuant jolts memory stocks, but demand outlook seen intact

By Jo He-rim, March 27, 2026, 16:34

New AI algorithm targets inference memory bottleneck; Analysts say lower costs could ultimately drive chip demand
Google (AP-Yonhap)

Google’s new AI memory compression algorithm sent memory chip stocks sliding this week, raising concerns over potential demand disruption. Analysts, however, say the long-term outlook remains intact, as the algorithm could open new applications that ultimately boost demand.

The tech giant earlier in the week introduced TurboQuant, an algorithm designed to reduce the working memory required for artificial intelligence inference by compressing key-value cache — the memory that allows large language models to retain context without repeated computation.
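In rough terms, a KV cache trades memory for compute: each decoded token's key and value projections are stored so that attention over earlier tokens never has to recompute them, which is why the cache grows with context length and strains memory. A minimal toy sketch of that mechanism (illustrative only, not Google's implementation; the dimensions, weights and function names here are invented):

```python
import numpy as np

# Toy single-head attention with a KV cache: key/value projections of
# past tokens are stored once and reused at every later decoding step.

d = 4  # head dimension (toy size)
rng = np.random.default_rng(0)
Wk, Wv = rng.standard_normal((d, d)), rng.standard_normal((d, d))

k_cache, v_cache = [], []

def decode_step(x):
    """Project the new token, append to the KV cache, and attend over it."""
    k_cache.append(x @ Wk)
    v_cache.append(x @ Wv)
    K, V = np.stack(k_cache), np.stack(v_cache)   # (t, d) each
    scores = K @ x / np.sqrt(d)                   # query = current token
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                      # softmax over cached keys
    return weights @ V

for _ in range(3):                                # three decoding steps
    out = decode_step(rng.standard_normal(d))

# The cache grows linearly with context: 3 tokens -> 3 cached K/V pairs.
print(len(k_cache))  # 3
```

Compressing this ever-growing store of keys and values, rather than recomputing or discarding it, is the bottleneck TurboQuant targets.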

Following Tuesday's announcement, shares of Samsung Electronics and SK hynix fell 4.71 percent and 6.23 percent, respectively, on Thursday. US-listed memory firms also dropped sharply: the Philadelphia Semiconductor Index slid 4.8 percent on Thursday, Nvidia fell 4.2 percent, Micron Technology dropped 6.97 percent and Sandisk plunged 11.02 percent.

TurboQuant
Google's graphic for TurboQuant (Google Research blog)

Google says TurboQuant can reduce AI memory usage by up to sixfold without compromising accuracy, while delivering processing speeds up to eight times faster than Nvidia’s H100 graphics processing unit under certain conditions.

According to Google, the data compression technique restructures data into a simpler form so it can be efficiently compressed while preserving most of the core information. It then applies an additional step to correct small errors from the initial compression, helping maintain accuracy in tasks such as language processing and search.
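That two-step pattern, a lossy first pass followed by a second pass that encodes the leftover error, resembles residual quantization. A toy sketch under that assumption (not TurboQuant itself; the uniform quantizer and sizes are invented for illustration):

```python
import numpy as np

# Toy residual quantization: coarsely quantize a vector, then quantize
# the remaining error so the reconstruction stays close to the original.

rng = np.random.default_rng(1)
x = rng.standard_normal(8)

def quantize(v, levels=4):
    """Uniform quantization to a small number of levels (lossy)."""
    lo, hi = v.min(), v.max()
    step = (hi - lo) / (levels - 1)
    return np.round((v - lo) / step) * step + lo

coarse = quantize(x)              # first, lossy pass
residual = quantize(x - coarse)   # second pass encodes the leftover error
reconstructed = coarse + residual

err_one_pass = np.abs(x - coarse).max()
err_two_pass = np.abs(x - reconstructed).max()
print(err_two_pass < err_one_pass)  # True: the correction pass shrinks the error
```

The design intuition is that the residual spans a much narrower range than the original values, so even a crude second quantization pass recovers most of the lost accuracy at little extra storage cost.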

The company said the technology could eventually be applied to its Gemini AI model and search services.

Cloudflare CEO Matthew Prince described the breakthrough as Google’s "DeepSeek moment," likening it to how China’s DeepSeek demonstrated that high-performance AI models could be built at significantly lower costs.

The development has raised concerns that improved memory efficiency could dampen demand for DRAM and high-bandwidth memory (HBM) chips, key components underpinning AI workloads.

Demand impact limited
Samsung Electronics HBM4 (Samsung Electronics)

Still, analysts said the market reaction may be overdone, citing uncertainty around commercialization and broader demand dynamics.

"At this stage, it is still a research paper that has yet to be validated, so its impact on actual memory demand appears minimal," said Yoo Hoi-jun, a professor at the KAIST Graduate School of AI Semiconductor.

"Efforts to reduce KV (key-value) cache have been ongoing and cutting KV cache does not necessarily translate into lower usage of HBM."

Kim Rok-ho, an analyst at Hana Securities, echoed the view, noting that it remains unclear how Google’s simulation results will translate into real-world deployment and how quickly the technology can scale.

"Memory price forecasts for the second and third quarters have been revised upward following the first quarter, and with supply shortages expected to persist throughout the year, the possibility of stronger-than-expected DRAM price increases remains," Kim said.

Others pointed to a longer-term rebound, noting that lower memory requirements could reduce the cost of running AI systems and accelerate adoption — ultimately boosting chip demand.

"Low-cost AI technologies such as TurboQuant are likely to lower barriers to adoption and significantly expand overall demand," said Kim Dong-won, head of research at KB Securities.

"That would lead to increased computing workloads and higher memory content, ultimately positioning memory-chip makers as the biggest beneficiaries of the expanding AI ecosystem."

Google is expected to present further details on TurboQuant at the International Conference on Learning Representations (ICLR) 2026 in Rio de Janeiro, starting April 23. The research was conducted by researchers at Google and Google DeepMind, together with Han In-su, an assistant professor of electrical engineering at the Korea Advanced Institute of Science and Technology.

Copyright © The Korea Herald. Unauthorized reproduction and redistribution prohibited.