First Spear posted February 18, 2017:
Hey all. I am trying to find supplemental info and examples of how the GI SP0256 (used in the Intellivoice) was programmed and used for playback. My Bingle searches haven't been fruitful yet, so here I am. I'm looking to learn how it was used and developed for BITD. All ideas appreciated. Thanks!
intvnut posted February 18, 2017:
You can find the SP0250 Applications Manual online. The SP0256 in the Intellivoice uses a modified SP0250 vocal tract model (VTM) with a 4-bit microsequencer that unpacks the coefficient data into the VTM. The engineering spec is on Papa Intellivision. Frank Palazzolo and I have separately reverse engineered the opcode encodings, and I documented them on Spatula City.

jzIntv includes a disassembler for speech data (dasm0256), a reference voice driver (examples/library/ivoice.asm), and a reference Intellivoice implementation. I also wrote a utility called VTinker that lets you tweak VTM parameters.

The main missing piece is encoding the LPC-12 parameters for the vocal tract model. That's where all the secret sauce is.
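To make "encoding the LPC-12 parameters" concrete: classic LPC analysis fits a 12th-order linear predictor to each frame of audio, typically via the Levinson-Durbin recursion on the frame's autocorrelation. The sketch below is illustrative only; it does not model the SP0256's actual coefficient quantization, frame format, or the second-order-section form its VTM uses.

```python
import math

def autocorr(x, max_lag):
    # Biased autocorrelation estimates r[0..max_lag] of one frame.
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    # Solve the Toeplitz normal equations for the LPC coefficients.
    # Returns (a, e): predictor coefficients a[1..order] and residual energy e.
    a = [0.0] * (order + 1)
    e = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / e                       # reflection coefficient, |k| < 1
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        e *= (1.0 - k * k)                # prediction error shrinks each order
    return a[1:], e

# Example: analyze one frame of a synthetic voiced-like signal.
# The sample rate and frequencies are made up for illustration.
fs = 10000
frame = [math.sin(2 * math.pi * 200 * t / fs)
         + 0.3 * math.sin(2 * math.pi * 700 * t / fs)
         for t in range(240)]
r = autocorr(frame, 12)
coeffs, err = levinson_durbin(r, 12)      # LPC-12: twelve predictor coefficients
```

The hard part BITD (and still today) isn't this math; it's quantizing the resulting filter to the chip's limited coefficient precision without audible artifacts.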
intvnut posted February 18, 2017 (edited):
FWIW, the way speech encoding worked BITD for these chips is that the voice actor would speak the phrases, perhaps in multiple takes, and that would get recorded and digitized. The data would then get edited and fed to LPC encoder tools, which produce the encoded speech data, perhaps compressed. That gets fed through the speech synthesizer to see how it sounds. Based on the outcome, you'd edit either the source material, the encoder parameters, or the encoded output to improve the result. Wash, rinse, repeat until you like the samples.

One of my old supervisors worked in the product group that used a slightly newer generation of the same technology in answering-machine type products. For the prerecorded phrases in ROM, he said they went through many editing cycles. They had professional encoder/listener experts who could listen for encoding artifacts and know exactly what parameters to tweak to get the audio sounding good. It's rather tricky.
+DZ-Jay posted February 18, 2017 (quoting intvnut's post above):
Thanks for all the details. It's good to have all these resources listed in one place. Also, I seem to remember you working on an LPC encoder in the past. Is that work complete? If so, are you planning on publishing it?

-dZ.