KAIST researchers say new GPU tech doubles AI service speed

By Lee Jae-lim, July 8, 2024, 19:34

A GPU equipped with the Compute Express Link (CXL) technology developed by KAIST [KAIST]

A group of researchers from KAIST and chip design startup Panmnesia succeeded in more than doubling a GPU's memory processing speed with the goal of addressing the memory limitations of advanced GPUs that are used to train AI.

The team, led by Professor Jung Myoung-soo, says its Compute Express Link (CXL) technology can run AI services 2.36 times faster than existing GPU memory expansion technologies, KAIST said Monday.

CXL is a protocol that enables high-speed, high-capacity data transfer between processors, including GPUs, and memory. Panmnesia's CXL-Enabled AI Accelerator was unveiled at CES 2024 in January, and its new CXL-GPU was unveiled last week.

The research will be presented at the upcoming USENIX Annual Technical Conference in Santa Clara, which begins Wednesday.

Companies commonly train AI systems on multiple GPUs to access the memory required, significantly increasing the cost of developing any given model.

CXL, however, enables the GPU to access external memory in the same way it accesses internal memory, significantly increasing the capacity available in a more cost-effective manner. It essentially allows CPUs and GPUs to “share” memory without copying or moving data.

The KAIST team noticed performance slowdowns in CXL-GPU devices when processing data from memory chips. To address this, the team developed a technology that lets the memory chips autonomously read stored data without waiting for the main processor to complete the transaction, and that retrieves data for the GPU more quickly.

“This technology can accelerate the market opening of CXL-GPU, significantly reducing the memory expansion costs for Big Tech companies operating large-scale AI services,” Jung said in a statement.

The successful adoption of this technology, however, will depend on how easily Panmnesia's solution can be integrated into existing hardware and whether GPU developers are on board with the standard.

BY LEE JAE-LIM [lee.jaelim@joongang.co.kr]

Copyright © Korea JoongAng Daily. Unauthorized reproduction and redistribution prohibited.
