Voicemail’s definitely not dead.
I can’t find any voicemail services that work the way I want them to, though, so I started building my own: Twilio to handle the incoming phone call + ElevenLabs for text-to-speech + AssemblyAI for speech-to-text + Trestle Smart CNAM API for identifying the caller. I’ll open-source the code once it’s ready.
Seems awfully overcomplicated. Why not just use some TwiML verbs like <Say> and <Gather>?
Twilio’s TTS isn’t as good as ElevenLabs’, and its transcription isn’t as good as AssemblyAI’s. AssemblyAI can also pull key details out of the message (e.g. people’s names, company names, callback numbers) and IIRC it’s quite a bit cheaper than Twilio’s transcription.
Plus now I can put “AI engineer” on my resume, lol. A lot of “AI” is all about gluing other people’s work together, and that’s exactly what I’m doing.
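To make the two approaches in this thread concrete, here is a minimal sketch of the TwiML each one might return to Twilio, built with Python's standard library only. It is not the poster's actual code: the function names, URLs, and the 120-second limit are all hypothetical. The second variant assumes the ElevenLabs greeting has already been generated and hosted as an audio file, and that a separate webhook (not shown) fetches the recording and hands it to AssemblyAI.

```python
import xml.etree.ElementTree as ET

XML_HEADER = '<?xml version="1.0" encoding="UTF-8"?>'


def builtin_voicemail_twiml(greeting: str, action_url: str) -> str:
    """TwiML-only approach: Twilio speaks the greeting and transcribes the reply."""
    response = ET.Element("Response")
    # <Gather input="speech"> lets Twilio do the speech recognition itself
    # and posts the result to action_url (hypothetical endpoint).
    gather = ET.SubElement(
        response,
        "Gather",
        {"input": "speech", "action": action_url, "speechTimeout": "auto"},
    )
    say = ET.SubElement(gather, "Say")
    say.text = greeting
    return XML_HEADER + ET.tostring(response, encoding="unicode")


def custom_voicemail_twiml(
    greeting_audio_url: str,
    recording_callback_url: str,
    max_seconds: int = 120,  # arbitrary illustrative limit
) -> str:
    """Glue approach: play a pre-generated greeting, then record raw audio."""
    response = ET.Element("Response")
    # Assumption: the greeting was synthesized ahead of time (e.g. with
    # ElevenLabs) and is hosted somewhere Twilio can fetch it.
    play = ET.SubElement(response, "Play")
    play.text = greeting_audio_url
    # Twilio notifies recording_callback_url when the recording is ready;
    # that handler (not shown) would pass the audio to a transcription service.
    ET.SubElement(
        response,
        "Record",
        {
            "maxLength": str(max_seconds),
            "playBeep": "true",
            "recordingStatusCallback": recording_callback_url,
        },
    )
    return XML_HEADER + ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    print(builtin_voicemail_twiml("Please leave a message.", "https://example.com/gathered"))
    print(custom_voicemail_twiml("https://example.com/greeting.mp3", "https://example.com/recorded"))
```

The trade-off is the one described above: the first version keeps everything inside Twilio, while the second only uses Twilio for call handling and leaves voice and transcription quality to other services.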