Posted on September 27, 2020 at 10:50 pm
I needed to add a voice-over to a video tutorial, so I looked for text-to-speech web services with realistic human voices and accents that could automatically convert a few phrases to speech and export them as MP3 audio files, ready to insert into the video timeline.
Here is what I have found:
TTSMP3.com is so far the best free (and paid) text-to-speech converter. It is very easy to use, and it also offers affordable paid plans in case you need to convert more than 3,000 characters. Through its supported SSML tags you can add breaks/pauses (in seconds), emphasize words, whisper, and change the speed and pitch. It uses Amazon Polly for the TTS processing.
* Best English voice: US English / Matthew
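The SSML features mentioned above (pauses, emphasis, speed, pitch, whisper) can be sketched in a few lines of Python. This is only an illustration: the helper names (`pause`, `emphasize`, `prosody`, `whisper`, `speak`) are made up for this example, the tags are the ones documented for Amazon Polly, and support for individual tags may vary by voice and by service.

```python
from xml.sax.saxutils import escape


def pause(seconds: float) -> str:
    """Return a break tag; the length is given in seconds, emitted in ms."""
    return f'<break time="{int(round(seconds * 1000))}ms"/>'


def emphasize(text: str, level: str = "strong") -> str:
    """Wrap text in an emphasis tag (level: strong, moderate, reduced)."""
    return f'<emphasis level="{level}">{escape(text)}</emphasis>'


def prosody(text: str, rate: str = "medium", pitch: str = "medium") -> str:
    """Change the speaking speed (rate) and pitch of the wrapped text."""
    return f'<prosody rate="{rate}" pitch="{pitch}">{escape(text)}</prosody>'


def whisper(text: str) -> str:
    """Whispered speech; this tag is specific to Amazon Polly."""
    return f'<amazon:effect name="whispered">{escape(text)}</amazon:effect>'


def speak(*parts: str) -> str:
    """Join already-built SSML fragments inside the root speak element."""
    return "<speak>" + "".join(parts) + "</speak>"


ssml = speak(
    escape("Welcome to this tutorial."),
    pause(1.5),
    emphasize("Save"),
    escape(" your work first, "),
    prosody("then follow along slowly.", rate="slow", pitch="low"),
    pause(0.5),
    whisper("This part is optional."),
)
print(ssml)
```

The printed string can be pasted into a converter that accepts SSML input. Plain text is passed through `escape` so that characters such as `&` do not break the markup.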
Kukarella text-to-speech converter is another good service I have used in various video tutorials. What is nice about it is that it uses text-to-speech cloud services from several providers, including Amazon Polly, Microsoft Azure Cognitive Services, Google Text-to-Speech API, and IBM Watson Text to Speech (TTS). That means there are many voices to choose from, and the monthly paid plan also unlocks premium voices (Neural for Azure or WaveNet for Google) that are more realistic than the basic ones.
* Best English voice: (Microsoft) Guy B. – En – US (paid plan)
Wideo Text to Speech is another very good web service for converting text to speech (MP3) right from your web browser. Personally, I have found the English voices “[en-US] Mike Stevens -S” (tag is en-US-Standard-D) and “[en-US] -S” (tag is en-US-Standard-I) very realistic, and I have used them successfully in a few video tutorials. The service appears to use the Google Text-to-Speech API to convert text to MP3.
During my personal tests, I have found that the voices “Guy (Neural) – Male” and “Amy (Neural) – Female” from the Microsoft Azure Text to Speech service are the most realistic ones. Additionally, selecting “Customer Support” as the voice style made them perfect for my video tutorials. The service also supports many Speech Synthesis Markup Language (SSML) tags.
On the Google Text-to-Speech API, I have found that the English voices “en-US-Wavenet-C – Female”, “en-US-Wavenet-J – Male”, “en-US-Standard-I – Male”, and “en-US-Standard-D – Male” are the most realistic ones. There are definitely others, though, so you need to test them to find out which one works well for your project.
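To give an idea of how one of these Google voices is selected programmatically, here is a minimal Python sketch of the JSON body that the Cloud Text-to-Speech REST method `text:synthesize` expects, plus the step that turns the base64 `audioContent` field of the response into MP3 bytes. The function names are hypothetical, no network call is made, and the response shown is a stand-in rather than real API output.

```python
import base64
import json


def build_synthesize_request(text: str, voice_name: str) -> dict:
    """Build the request body for a given voice, e.g. "en-US-Wavenet-C"."""
    # Voice names start with the language code ("en-US"), so reuse it.
    language_code = "-".join(voice_name.split("-")[:2])
    return {
        "input": {"text": text},
        "voice": {"languageCode": language_code, "name": voice_name},
        "audioConfig": {"audioEncoding": "MP3"},
    }


def extract_mp3_bytes(response: dict) -> bytes:
    """Decode the base64 audio returned in the "audioContent" field."""
    return base64.b64decode(response["audioContent"])


body = build_synthesize_request(
    "Hello and welcome to this tutorial.", "en-US-Wavenet-C"
)
print(json.dumps(body, indent=2))

# Stand-in for a real API response, used only to show the decoding step.
fake_response = {
    "audioContent": base64.b64encode(b"not real audio").decode("ascii")
}
mp3_bytes = extract_mp3_bytes(fake_response)
print(len(mp3_bytes), "bytes decoded")
```

In real use, the body would be sent as an authenticated POST request to the API, and the decoded bytes would be written to an `.mp3` file for the video timeline.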