Gautam Jha is the Co-Founder & CTO of Kalpa Labs, an SF-based, YC-backed startup building large-scale foundational speech models. Voice is quickly becoming a primary interface for enterprise software, ...
Google LLC’s DeepMind artificial intelligence unit today rolled out a new text-to-speech model called Gemini 3.1 Flash TTS.
OpenAI has today introduced a suite of advanced audio models and tools through its API, designed to empower developers in creating sophisticated, voice-driven applications. These updates include ...
Deepgram, the real-time AI infrastructure company underpinning the Voice AI economy, today announced the general availability (GA) of Flux Multilingual, expanding its conversational speech recognition ...
Real-time voice artificial intelligence startup Deepgram Inc. today announced the general availability of Flux Multilingual, ...
Companies are betting big on generative AI to gain a competitive edge. But adoption challenges remain. According to a recent survey from EY, a significant portion of businesses looking to embrace ...
As part of its fantastic body of work on speech and voice models, Apple has just published a new study that takes a very human-centric approach to a tricky machine learning problem: not just ...
Amazon researchers have unveiled the largest text-to-speech AI model to date, which they claim shows "emergent" qualities that enhance its ability to speak even complex sentences naturally.
The global speech and voice recognition market is projected to grow from $20 billion in 2023 to over $53 billion by 2030. That number sounds impressive until you look at how the industry is actually ...
Large language models powered by artificial intelligence differ in how they classify and respond to hate speech, according to recent findings by Annenberg School for Communication researchers. In a ...