Sound to speech with Praat

For converting sound files to speech for the speech synthesizer, we have a number of options:

After struggling with and further exploring the conversion of vocals for my Bad Apple demo, I can add a new option:

  • Praat by the Phonetic Sciences group of the University of Amsterdam, combined with my conversion tool ConvertPraatToLpc

Praat (which is Dutch for "talk") is a powerful phonetics program with many techniques and algorithms to analyze and synthesize speech. You can operate it from a GUI or from scripts. It's available for the major platforms (e.g. on Debian/Ubuntu: sudo apt install praat).


Two features in Praat are relevant: a choice of algorithms to extract the pitch and a choice of algorithms to compute LPC coefficients. I have created:

  • A Praat script lpc.praat that reads a specified WAV file, analyzes the speech, and writes a Praat pitch file and a Praat LPC file. The script is a single file in my Bad Apple demo.
  • A command-line tool ConvertPraatToLpc that reads a pair of these files and writes a binary LPC file that is suitable for our TMS5200 speech synthesizer. The tool is one of my video tools, written in Java.
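As background on what "computing LPC coefficients" involves, here is a minimal sketch of the textbook autocorrelation method with the Levinson-Durbin recursion. It is only an illustration: it is not taken from lpc.praat or ConvertPraatToLpc, it is not necessarily the algorithm that Praat is configured to use there, and the function names are made up for this example. The recursion also yields reflection coefficients, which is the kind of parameter that (to my knowledge) the TMS5200 family works with; sign conventions for them differ between texts.

```python
def autocorrelation(frame, max_lag):
    """Return [r[0], ..., r[max_lag]] for one (already windowed) frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations for autocorrelation values r.

    Convention assumed here: error filter A(z) = 1 + a[1]/z + ... + a[p]/z^p.
    Returns (a, k, err): predictor coefficients (a[0] is 1.0),
    reflection coefficients (k[0] is unused), and residual energy.
    """
    a = [1.0] + [0.0] * order
    k = [0.0] * (order + 1)
    err = r[0]
    for m in range(1, order + 1):
        if err <= 0.0:
            break  # silent or degenerate frame: leave the rest at zero
        acc = r[m] + sum(a[j] * r[m - j] for j in range(1, m))
        k[m] = -acc / err
        prev = a[:]
        for j in range(1, m):
            a[j] = prev[j] + k[m] * prev[m - j]
        a[m] = k[m]
        err *= 1.0 - k[m] * k[m]
    return a, k, err


if __name__ == "__main__":
    # Ideal autocorrelation of a one-pole signal x[n] = 0.9 * x[n-1] + noise.
    r = [0.9 ** lag for lag in range(3)]
    a, k, err = levinson_durbin(r, 2)
    print(a, k, err)  # a[1] comes out close to -0.9, a[2] close to 0
```

In a real analysis you would cut the sound into short overlapping frames, apply a window to each, and run this per frame; Praat does all of that for you, along with the pitch extraction.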

The Bad Apple build script illustrates the flow. For example:

praat --run lpc.praat \
  /tmp/input.wav \
  /tmp/output.PraatPitch \
  /tmp/output.PraatLPC \
  250 550 0.02 0.40 0.20 0.20 0.03

java ConvertPraatToLpc \
  -addstopframe \
  /tmp/output.PraatPitch \
  /tmp/output.PraatLPC \
  0.4 0.6 \

The header of the script and the documentation of the tool explain their respective parameters. You can also step through the Praat script interactively and explore the results.


I feel like we can still push speech extraction further. For example, converting WAV files of the high-quality speech from the speech dictionary, with any of these programs, doesn't produce anything close to the original speech, even though it could/should. I'm curious about your experiences and thoughts.


2 hours ago, Ksarul said:

I seem to remember that either @Stuart or @Willsy has/had access to one of the hardware speech conversion devices TI used to make LPC speech BITD. If so, it might help inform us as to the differences between the old school speech conversion hardware and the newer stuff.

'Twas @Stuart
