Managing audio is a key part of modern business. Zoom meetings, YouTube videos, and podcasts, for example, all involve listening to, adjusting, and transcribing speech.
There’s no shortage of sophisticated audio tools. I’ve tested a selection of them that claim to use artificial intelligence. Here are my findings, arranged by use case:
- Transcribing,
- Podcast tools,
- Meetings, webinars,
- Text-to-speech.
AI-powered Audio Tools
Transcribing. AudioNotes uses artificial intelligence to convert speech into text. Upload an audio file or speak to the tool, and it will transcribe and summarize the recording. Transcription is available in 30 languages, but summaries are in English only. My tests for this post were in English.
AudioNotes automatically saves all transcripts and summaries in users’ dashboards. Unfortunately, it does not save the original audio alongside the text. Users can tag notes to find them quickly and can share them with other registered users.
Use the tool to create transcripts of videos or podcasts or to record ideas and outlines. In a live chat, an AudioNotes rep told me an iPhone app is coming in “two to three weeks.”
AudioNotes offers a free, limited plan and a blizzard of paid plans under “Personal,” “Pro,” and “PodNotes” categories. Each has multiple pricing models ranging from $49 per year to $249 per month.
Recorder, a free Google app for Android devices, is a close alternative to AudioNotes.
Podcast tools. Podcastle is a multi-feature tool for creating better podcasts. It uses AI to:
- Improve audio quality by removing background noises,
- Create podcast transcripts and outlines,
- Detect and remove filler words — e.g., “um,” “ah,” “like,” “you know.”
The free “Basic” plan includes three hours of audio, limited access to the editing tools, and a watermark on the transcriptions, among other limitations. The paid plans, “Storyteller” and “Pro,” cost $11.99 and $23.99 per month, respectively. Both have extensive features and capacity.
Close alternatives to Podcastle are Auphonic, Descript, and Adobe Podcast.
Meetings, webinars. Otter is an AI assistant that automatically generates meeting transcripts and summaries. Invite Otter to your meetings on Zoom, Microsoft Teams, or Google Meet. It will turn voices into text and capture slides.
Add comments to the transcript and share it with your group. The summary resembles a table of contents: clicking a section takes you to that spot in the recorded audio, making recordings easy to navigate.
Otter is also a helpful tool for recording and transcribing podcasts.
Otter’s free “Basic” plan includes 300 transcription minutes per month, with a limit of 30 minutes per conversation. “Pro” and “Business” plans cost $8 and $20 per month, respectively, billed annually.
MeetGeek is a close alternative to Otter.
Text-to-speech. Murf is an AI voice generator that turns text into speech. It’s handy for creating video voiceovers, such as for YouTube and TikTok, and audio versions of articles. Paste the text into the tool, and it will generate the audio.
Murf offers multiple voices: male, female, educator, developer, and more. The differences were stark in my testing, with some voices much better than others, so listen to a few before selecting one. Then add pauses and adjust the speed and pitch as needed.
Murf’s free plan includes 10 minutes of voice generation and three users. “Basic,” “Pro,” and “Enterprise” plans are $19, $26, and $99 per user per month, respectively, billed annually. Each offers progressively more features.
Speechify is a close alternative to Murf.