Open source AI platforms gain attention as AI voice market grows
Voice AI does not yet draw the attention that large language models (LLMs) do. But it is considered essential for the upcoming multimodal era, in which AI that handles text, images, and voice is integrated.
According to multiple sources in the information technology (IT) industry, Kyutai, a non-profit AI research lab based in France, recently unveiled Moshi, its self-developed voice AI model. Moshi is available for free, along with its code. The model is built on a language model called Helium, which has 7 billion parameters, the adjustable weights often likened to synapses in the human brain.
It can also run without an internet connection, so it can be stored and used on smartphones or tablets, unlike OpenAI’s voice AI, which is cloud-based. Moshi’s voice generation time is only 0.2 seconds, faster than OpenAI’s GPT-4o, which takes 0.23 to 0.32 seconds.
Kyutai Chief Executive Officer Patrick Perez emphasized in a recent interview with Maeil Business Newspaper that his company will make AI easily accessible for everyone, while noting that research on Moshi and other multimodal foundation models will continue.
Kyutai is currently viewed as the French counterpart to OpenAI. It was co-founded in November 2023 by the iliad Group, CMA CGM Group, and Schmidt Futures, led by former Google CEO Eric Schmidt, with a total investment of 300 million euros. Within six months, a core team of eight developed a voice AI that rivals OpenAI’s, is capable of very natural conversations, and is available for online trials.
Other organizations have also released voice AI as open source, with notable examples including Meta, Coqui, Mozilla, and the Kaldi project.
Meta earlier unveiled MMS, capable of recognizing and generating speech in over 4,000 languages. A significant advantage of MMS is its ability to learn from unlabeled data, without requiring labeled training examples. For their part, Mozilla’s DeepSpeech has improved GPU efficiency, and Coqui has launched fast real-time voice recognition and text-to-speech conversion.
Both DeepSpeech and Coqui are open source, and the rationale for distributing AI in this format is to gain a first-mover advantage. Unlike closed models such as OpenAI’s GPT or Anthropic’s Claude, open source allows anyone to access and use the source code for free. This makes the technology accessible to a broader user base and helps avoid dependence on particular closed models. Companies that develop the models can build ecosystems around open source, encouraging many developers to adopt the technology and taking the lead in standardizing it.
“The AI market is not solely driven by closed models like OpenAI or Anthropic,” an industry insider said. “Open source models are also demonstrating sufficiently good performance.”
The closed sector is also actively developing voice AI. OpenAI recently launched an updated voice mode for ChatGPT that improves usage in 50 languages, including Korean and Japanese, and is currently available to paid users in Korea.
OpenAI’s voice mode allows users to adjust the AI’s speech speed and can recognize the speaker’s emotions. The company has also refined the Korean voice output to sound more natural and supports nine different voices. Google also unveiled its AI voice assistant, Gemini, in August 2024. The assistant has been optimized for mobile environments and offers ten voices that differ in tone and style.
According to market research firm Mordor Intelligence, the voice recognition market is projected to grow to $42.08 billion in 2029 from $14.95 billion in 2024. With the advancement of AI, it is expected to be widely adopted across various sectors, including smart homes and IoT, customer service and call centers, healthcare, automotive and navigation, educational tools, gaming and entertainment, banking and finance, legal and administrative services, accessibility support, and translation services.
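The growth implied by the Mordor Intelligence projection can be expressed as a compound annual growth rate (CAGR). The short sketch below is only an illustration of that arithmetic: the function name `cagr` is hypothetical, and it assumes the growth compounds evenly over the five years from 2024 to 2029, which the projection itself does not state.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Return the compound annual growth rate as a fraction (0.10 means 10%)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Figures from the article, in billions of US dollars.
market_2024 = 14.95
market_2029 = 42.08

# Assumption: even compounding across the five years between 2024 and 2029.
rate = cagr(market_2024, market_2029, years=2029 - 2024)
print(f"Implied annual growth: {rate:.1%}")
```

Under that assumption, the projection works out to growth of roughly 23 percent a year.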
Copyright © Maeil Business Newspaper & mk.co.kr. Unauthorized reproduction, redistribution, and use for AI training are prohibited.