Kakao's open-source AI model can interpret both words and pictures

Lee Jae-lim, Jan. 19, 2024, 17:37

Kakao unveiled its Honeybee multimodal AI model, though its KoGPT 2.0 remains under wraps.
Kakao Brain's multimodal source code "Honeybee" was released on the open-source platform GitHub on Friday. [KAKAO BRAIN]

Kakao unveiled its multimodal AI model “Honeybee” for the first time on Friday at a conference hosted by the Ministry of Science and ICT. The tech giant's hyperscale language model KoGPT 2.0, however, remains under wraps.

Kakao’s incoming CEO Chung Shin-a presented the source code as she discussed the company's plans for developing AI models and services.

C-suite executives from fields including platforms, telecommunications, beauty, TV and robotics attended the conference, which focused on government policies and collaborations related to AI. Executives from Samsung, LG, Doosan Robotics, Naver and Amorepacific were also in attendance.

Honeybee's code base was released to developers through GitHub on the same day, according to Kakao’s research subsidiary Kakao Brain.

The source code itself is not a large language model (LLM), but rather a module that can be plugged into other LLMs. LLMs that implement it would become multimodal, gaining the ability to comprehend both image and text prompts.

For instance, if a user feeds a picture of two basketball players on a court to a Honeybee-integrated LLM and asks “how many times did the player on the left win?” in English, the model could interpret both the image and the text to produce an appropriate response.

Honeybee achieved top scores in performance tests on several global multimodal evaluation benchmarks, including MME, MMBench and SEED-Bench.

Kakao Brain believes that Honeybee could be an innovative education tool, as users can interact with it by submitting an image and a text query together, though the exact use cases for Honeybee have yet to be officially specified.

“We are deliberating on adapting Honeybee to a variety of services,” said Kakao Brain CEO Kim Il-do in a statement. “We will seamlessly put more effort on research and development (R & D) to come up with a more perfected AI model.”

Kakao is a relative latecomer to the global race for AI supremacy that OpenAI's ChatGPT catalyzed last year. Kakao initially promised to release KoGPT 2.0 last year but has since repeatedly postponed its release amid various allegations of internal friction and shady dealings surrounding its acquisition of K-pop agency SM Entertainment.

Korean companies such as Naver, Korea’s largest portal site, and LG AI Research rolled out LLMs HyperCLOVA X and Exaone last year, respectively. Those models are being adapted to a variety of services across online platforms and financial firms.

BY LEE JAE-LIM [lee.jaelim@joongang.co.kr]

Copyright © Korea JoongAng Daily. Unauthorized reproduction and redistribution prohibited.
