First Spear posted February 18, 2017:
Hey all. I am trying to find supplemental info and examples of how the GI SP0256 (used in the Intellivoice) was programmed and used for playback. My Bingle searches haven't been fruitful yet, so here I am. I'm looking to learn how it was used and developed for BITD. All ideas appreciated. Thanks!
intvnut posted February 18, 2017:
You can find the SP0250 Applications Manual online. The SP0256 in the Intellivoice uses a modified SP0250 vocal tract model (VTM) with a 4-bit microsequencer that unpacks the coefficient data into the VTM. The engineering spec is on Papa Intellivision. Frank Palazzolo and I have separately reverse engineered the opcode encodings, and I documented them on Spatula City.

jzIntv includes a disassembler for speech data (dasm0256), a reference voice driver (examples/library/ivoice.asm), and a reference Intellivoice implementation. I also wrote a utility called VTinker that lets you tweak VTM parameters.

The main missing piece is encoding the LPC-12 parameters for the vocal tract model. That's where all the secret sauce is.
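To make "encoding the LPC-12 parameters" concrete: classic LPC analysis fits a 12th-order linear predictor to each frame of audio, typically via the Levinson-Durbin recursion on the frame's autocorrelation. The sketch below is illustrative only; it does not model the SP0256's actual coefficient quantization, frame format, or the second-order-section form its VTM uses.

```python
import math

def autocorr(x, max_lag):
    # Biased autocorrelation estimates r[0..max_lag] of one frame.
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    # Solve the Toeplitz normal equations for the LPC coefficients.
    # Returns (a, e): predictor coefficients a[1..order] and residual energy e.
    a = [0.0] * (order + 1)
    e = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / e                       # reflection coefficient, |k| < 1
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        e *= (1.0 - k * k)                # prediction error shrinks each order
    return a[1:], e

# Example: analyze one frame of a synthetic voiced-like signal.
# The sample rate and frequencies are made up for illustration.
fs = 10000
frame = [math.sin(2 * math.pi * 200 * t / fs)
         + 0.3 * math.sin(2 * math.pi * 700 * t / fs)
         for t in range(240)]
r = autocorr(frame, 12)
coeffs, err = levinson_durbin(r, 12)      # LPC-12: twelve predictor coefficients
```

The hard part BITD (and still today) isn't this math; it's quantizing the resulting filter to the chip's limited coefficient precision without audible artifacts.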
intvnut posted February 18, 2017 (edited):
FWIW, the way speech encoding worked BITD for these chips is that the voice actor would speak the phrases, perhaps in multiple takes, and that would get recorded and digitized. The data would then get edited and fed to LPC encoder tools, which produce the encoded speech data, perhaps compressed. That gets fed through the speech synthesizer to see how it sounds. Based on the outcome, you'd edit either the source material, the encoder parameters, or the encoded output to improve the result. Wash, rinse, repeat until you like the samples.

One of my old supervisors worked in the product group that used a slightly newer generation of the same technology in answering-machine type products. For the prerecorded phrases in ROM, he said they went through many editing cycles. They had professional encoder/listener experts who could listen for encoding artifacts and know exactly what parameters to tweak to get the audio sounding good. It's rather tricky.
+DZ-Jay posted February 18, 2017 (quoting intvnut's post above):
Thanks for all the details. It's good to have all these resources listed in one place. Also, I seem to remember you working on an LPC encoder in the past. Is that work complete? If so, are you planning on publishing it?

-dZ.