Naver ventures into AI image search business

2024. 8. 22. 11:15


Naver announced that it will add visual processing capabilities to its AI agent, CLOVA X. The image shows the ‘chart understanding’ feature of HyperCLOVA X. [Courtesy of Naver Corp.]

South Korean platform giant Naver Corp. is set to integrate image recognition capabilities into its conversational artificial intelligence (AI) agent service, CLOVA X. Global tech giants such as OpenAI Inc. and Google LLC are moving toward more sophisticated chatbots by developing multimodal AI that can simultaneously understand and process various forms of data, such as text, images, and speech. Naver, meanwhile, is focusing on a free model to lock in users.

According to sources in the information technology (IT) industry on Wednesday, Naver plans to add multimodal functionality to CLOVA X later in August 2024, allowing it to recognize images and answer questions about them. This feature may be available to general CLOVA X users.

Users could, for example, upload a photo of a math problem and ask CLOVA X for the answer, or ask it to write a poem related to the image. This would be a step up from text-based interactions to image-based ones. CLOVA X could previously handle documents in formats such as PDF, TXT, HWP, and DOCX and engage in related conversations, but it could only recognize the text within them, which limited interactions involving charts or graphs. Users will now be able to request tasks such as creating proposals based on graph data, allowing for more specialized functions.

Naver is also testing image editing features on CLOVA X that allow users to delete or modify parts of uploaded images. These features, which include changing image backgrounds and altering the colors of clothing, will be rolled out to all users in stages after further refinement. The exact release date for the editing features has yet to be determined.

LG AI Research recently introduced an image-based query response agent in the trial version of its generative AI service Chat EXAONE 3.0, but this service is restricted to LG employees and no official release date has been set yet.

Meanwhile, major global tech firms have already integrated image search AI into their services. Google’s AI chatbot Gemini supports image uploads for Q&A, and similar features are available in OpenAI’s ChatGPT and Anthropic’s Claude chatbots.

Global tech companies are also developing AI agents capable of recognizing speech and video.

“The integration of image search AI enhances usability and accessibility at the input stage, making image-based searches more useful,” Ha Jung-woo, head of Naver Cloud’s AI Innovation Center, said.

Meanwhile, image search and analysis AI is not limited to conversational AI agents. Google introduced Google Lens, which provides real-time information about images via smartphone cameras. Image search AI is expected to be particularly powerful in chatbots and e-commerce, allowing users to easily find similar products by uploading pictures of the clothing or other items they want.


Originally published by Maeil Business Newspaper (매일경제).
