Israeli AI firm targets CX market with ‘digital human’ version of ChatGPT
ChatGPT shows how large language models can produce text on a variety of subjects in under a minute. But what if that information could be delivered in real time by a photorealistic human presenter, or turned into video content within minutes?
At this year’s Mobile World Congress, D-ID – an Israeli firm with a background in facial recognition – announced new real-time streaming capabilities for its proprietary text-to-video API, which allows users to create a photorealistic chatbot that can interact with people in real time or generate videos on the information they seek within minutes.
Chat.D-ID ‘studio’ is essentially a mix of APIs that have been added to the firm’s existing text-to-video platform. Besides the new real-time streaming animation feature, added ingredients now include conversational AI, OpenAI’s generative AI ChatGPT and Stable Diffusion’s image-generating API.
This combination allows end users to upload an avatar to D-ID’s online studio platform or select one of the firm’s existing photorealistic avatars based on real people who have licensed their image to the company.
Users can then ask questions and, in less than a minute, receive a video in which a photorealistic human delivers the answers.
Yaniv Levi, the firm’s product marketing VP, said that this use case was perfect for marketing and sales presentations – saving firms “thousands” on video content creation or online training.
The firm also claims that, thanks to its real-time streaming capabilities, the tech enables call centre service providers to launch ‘digital human’ chatbots which can interact with consumers in a more human, engaging and effective way than a text bot – it’s this use case that has brought the firm to MWC.
At a demo on D-ID’s stand at Israel’s booth in Hall 5, a digital human chatbot appears on a consumer TV as the viewer seeks a customer service advisor to deal with the TV’s faulty picture.
The answers are given in real time (in any language required) and the lip synching is particularly impressive.
Levi claims the firm has several dozen enterprise customers and that the studio platform is currently “going viral”, with a new subscription happening “every five seconds”.
To date, the product VP says there have been over 100 million visits to the online platform and it currently has “hundreds of thousands” of users.
As well as being available online as a platform, the real-time streaming animation technology behind chat.D-ID is being released to developers via an API, the firm added.
According to Levi, the enterprise ‘one stop shop’ package of text generator, image generator and video creation costs $1,500 per month, which comes with 8,000 video minutes per year.
D-ID started out in 2017 with a facial recognition blocker technology before pivoting into generative AI three years ago. The firm still applies its legacy technology in its latest product to prevent misuse of the chat.D-ID platform.
At MWC the firm also announced that it has joined a framework for the ethical and responsible development, creation and sharing of synthetic media/generative AI initiated by Partnership on AI (PAI) and backed by a cohort of launch partners, including the BBC, The New York Times, Meta, Amazon and IBM Watson.