Looking for help on TMS5220 Circuit & Software Design and Usage


coolio


I am trying to create a project that interfaces with a TMS5220 speech synthesizer chip at the hardware level, and I am not having much success. I am able to generate a lot of noise, just not the expected speech-like sounds. I need some help figuring out what I might be doing wrong with the circuit interfacing with the TMS5220 or even the software. I know this isn't strictly a TI-99/4A question, but I honestly don't know where else to ask. Before I dump my questions here, I just want to double check if this would be a good place to ask.

 

Thanks!


It would be tough to separate the TMS5220 from the TI-99/4A!  You are most welcome to post questions here. 

 

Are you using a Speech ROM, or sending raw data to the TMS5220?

 

Here are some resources to consult:

 

Thierry Nouspikel's TI Tech Pages

http://www.unige.ch/medecine/nouspikel/ti99/speech.htm

 

Peripherals Theory of Operation - section 13, Speech Synthesizer

ftp://ftp.whtech.com/datasheets and manuals/Hardware/Texas Instruments/PHP1200 Peripheral Expansion System/Peripheral Expansion System Theory Of Operation and Technical Traning Guide 1982-09-03.pdf

 

Bunyard Hardware Manual - Much of the same material (same author!)

https://archive.org/details/tibook_hardware-manual-for-the-texas-instruments-994a

 

Some Speech code and documents which I collected from my father:

https://gitlab.com/FarmerPotato/speech

 

 

Datasheets in https://ftp.whtech.com/datasheets and manuals/Datasheets - TI/

Of relevance:


TMS5200 Databook

 

TMS5220.PDF

 

TMS5220C

https://ftp.whtech.com/datasheets and manuals/Datasheets - TI/TMS5220C (like 5200 in 99-4 speech) manual.pdf

 

Early TMC0285 (rough.. somehow the file is empty on ftp.whtech.com)

https://usermanual.wiki/Manual/TMC0285SpeechSynthesisProcessor.1384298344/view

 

 

 

The TMS5200s are functionally similar.  0285 was the early part number for the "Home Computer Chip" as the Speak & Spell's TMC0281 evolved. It appears in the Speech Synthesizer peripheral with various markings. 


The differences among 5200s won't prevent you from getting recognizable speech.    
 

LPC strings coded for the Home Computer, 0285 or 5200, will still be recognizable on a 5220 or 5220C.

 


Thanks. Yes, I have studied Thierry Nouspikel's TI Tech Pages very closely over the years. My question gets into a bit more detail. For context, I am trying to operate the TMS5220 from my custom homebrewed breadboard CPU (learn more about that here). The only thing I am trying to do is use the Speak External command, so there is no integration with a Speech ROM. When I try to write external LPC speech data to the TMS5220, sound is produced, but it is very garbled. If I vary the OSC input between the 8 kHz and 10 kHz sampling speeds, I can just make out some of the words I would expect, but the result is too full of static and too garbled.

 

In order to satisfy the timing requirements of the "Write Cycle for External Speech Data" identified in the TMS5220 data sheet, I have built circuitry that does the following when I want to write data to the command register or FIFO buffer:

  1. Bring /WS low
  2. Immediately present the byte to write on the data bus (with D0 being MSB and D7 being LSB)
  3. Wait for the /READY line to go high, then continue to hold both /WS low and the data bus value valid
  4. When the /READY line returns low, then release /WS (bring it high) and return the data bus to high-Z state.
  5. Rinse and repeat
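
The five steps above can be sketched in C roughly as follows. This is only an illustration of the sequence as described in this post, not a verified driver: the `tms_pins` structure and its four callbacks are hypothetical stand-ins for whatever port accesses your hardware uses, and the `sim_*` functions are a toy model (not a timing model of the chip) so the sketch can be compiled and stepped through on a PC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical pin-level interface; replace with real port accesses. */
typedef struct {
    void (*set_ws)(bool level);       /* drive /WS: false = low (asserted) */
    void (*drive_bus)(uint8_t value); /* put a byte on D0..D7 */
    void (*release_bus)(void);        /* return the data bus to high-Z */
    bool (*ready_is_high)(void);      /* sample the /READY pin */
} tms_pins;

/* One write cycle, following the five steps described in the post. */
static void tms_write_byte(const tms_pins *p, uint8_t value)
{
    assert(p != 0);
    p->set_ws(false);               /* 1. bring /WS low */
    p->drive_bus(value);            /* 2. present the byte */
    while (!p->ready_is_high()) { } /* 3. wait for /READY to go high */
    while (p->ready_is_high()) { }  /* 4. wait for /READY to return low */
    p->set_ws(true);                /*    then release /WS */
    p->release_bus();               /*    and the data bus */
}

/* Toy stand-in for the chip (assumption: it latches as /READY falls). */
static uint8_t sim_bus, sim_latched;
static bool sim_ws = true, sim_bus_driven;
static int sim_busy_polls;

static void sim_set_ws(bool level)
{
    if (sim_ws && !level)
        sim_busy_polls = 3;         /* /READY stays high for a few polls */
    sim_ws = level;
}

static void sim_drive_bus(uint8_t v) { sim_bus = v; sim_bus_driven = true; }
static void sim_release_bus(void)    { sim_bus_driven = false; }

static bool sim_ready_is_high(void)
{
    if (sim_busy_polls == 0)
        return false;               /* idle: /READY is low */
    if (--sim_busy_polls == 0) {
        sim_latched = sim_bus;      /* byte is latched as /READY falls */
        return false;
    }
    return true;
}
```

To try it, fill a `tms_pins` with the four `sim_*` functions and call `tms_write_byte(&pins, 0x60)`; on real hardware the two busy-wait loops would normally also need a timeout.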

I have verified with a logic analyzer that my circuit is in fact doing that, so that's not the question here - unless this sequence is wrong, then that's the answer ;-). However, I am noting peculiar behavior with the /READY line that I don't readily understand. 

 

First, I note in Thierry Nouspikel's notes on the TMS5220 that the /READY line can be used as a sort of flow control. That is, after writing the first 16 bytes into the FIFO buffer, when I write the 17th, the /READY line will be held high until the FIFO buffer has consumed some data and can make room for more. I do in fact see this happening in the logic analyzer. For each of the first 16 bytes, the /READY line is held high for about 20 µs, which is in line with the data sheet on t_w(R). On the 17th byte, the /READY line is held high for about 9.5 ms, then returns low, and my circuitry releases /WS and the data bus. During that time, the talking begins (the sound is not intelligible, but I can't tell if it is the first 16 bytes that are causing the problem).

This leads to my first question: when the /READY line is held high for so long on that 17th byte, is the byte I was trying to write accepted just before the /READY line returns low? Or did that write attempt fail, so that I need to retry the 17th byte once the /READY line returns low?

 

Then, after that 17th byte, the byte writes never return to a t_w(R) (/READY pulse width) of less than 23 µs as indicated in the data sheet. In all subsequent write cycles the /READY line is held high for about 200 µs per byte for a few bytes, followed by a long pause of 10-20 ms. Basically it looks like the speech data is consumed a few bytes at a time (3-8 bytes, it varies), each byte takes about 200 µs to consume, and then that packet causes a longer pause measured in tens of milliseconds, during which I assume speech sound is being generated. My question then is whether this timing pattern sounds about right.

 

Finally, I have sample LPC speech data that is about 300 bytes long. The /INT line is held high (inactive) until about halfway through that data, at which point it gets brought low periodically. I am taking this as a sign that something is not going as I expect, because it would indicate that either the talking has stopped or the FIFO buffer is low or empty. But the first time /INT asserts, the /READY line is still being held high, so I am a bit confused by this. I suspect this means my data writing process above and its dependence on the /READY line for flow control might not be right, but I cannot find what might be wrong per the data sheet and Thierry Nouspikel's tech page.

 

Any advice here would be much appreciated.


This (https://chrisacorns.computinghistory.org.uk/docs/Acorn/Manuals/Acorn_SpeechSystemUG.pdf) might be helpful, describing the 5220 speech synth in the BBC Micro. Lots of lovely low-level detail about loading data in "Section Seven". The 5220 speech synth is interfaced through a 6522 which presumably enables the processor to carry on processing while waiting for the 5220 to do its stuff.


This is how it's documented by TI: send in 16 bytes (as READY allows), then poll the status register (again as READY allows) and watch for the Buffer Low bit. When you see it, the buffer is half empty, so send in another 8 bytes and continue polling (or send whatever you have left if it is fewer than 8, and you're done).
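
A minimal sketch of that feeding scheme in C, under stated assumptions: `tms_io` and its two callbacks are hypothetical and are assumed to handle the READY handshake themselves, the Buffer Low mask of 0x40 assumes D0 lands in the most significant bit of the byte you read, and the `sim_*` functions are a toy FIFO model (one byte consumed per status read) used only to exercise the loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TMS_CMD_SPEAK_EXTERNAL 0x60u /* x110xxxx */
#define TMS_STATUS_BL          0x40u /* Buffer Low, D0 treated as the MSB */
#define TMS_FIFO_SIZE          16u
#define TMS_REFILL_CHUNK       8u

/* Hypothetical bus accessors; each is assumed to wait on READY itself. */
typedef struct {
    void    (*write)(uint8_t value);
    uint8_t (*read_status)(void);
} tms_io;

static void tms_speak_external(const tms_io *io, const uint8_t *lpc, size_t len)
{
    size_t sent = 0;
    assert(io != 0 && lpc != 0);

    io->write(TMS_CMD_SPEAK_EXTERNAL);

    /* Prime the FIFO with up to 16 bytes. */
    while (sent < len && sent < TMS_FIFO_SIZE)
        io->write(lpc[sent++]);

    /* Top up 8 bytes at a time whenever Buffer Low is reported. */
    while (sent < len) {
        if (io->read_status() & TMS_STATUS_BL) {
            size_t n = len - sent;
            if (n > TMS_REFILL_CHUNK)
                n = TMS_REFILL_CHUNK;
            while (n--)
                io->write(lpc[sent++]);
        }
    }
}

/* Toy FIFO model for experimenting on a PC; not a model of the real chip. */
static unsigned sim_fill, sim_total, sim_overflow, sim_cmd_seen;

static void sim_write(uint8_t v)
{
    if (!sim_cmd_seen) {                 /* first byte must be the command */
        sim_cmd_seen = (v == TMS_CMD_SPEAK_EXTERNAL);
        return;
    }
    if (sim_fill == TMS_FIFO_SIZE)
        sim_overflow = 1;                /* a 17th byte arrived too early */
    else
        sim_fill++;
    sim_total++;
}

static uint8_t sim_read_status(void)
{
    if (sim_fill > 0)
        sim_fill--;                      /* pretend one byte was consumed */
    return (uint8_t)(sim_fill <= 8 ? TMS_STATUS_BL : 0);
}
```

The point of the structure is that a byte is never written into a full FIFO, so the long READY stall on a 17th byte never occurs. Real code would also watch the Talk Status bit so it can stop if the chip stops talking early.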


Timing: 

 

Clocked at 40 speech frames/sec, each 25 ms period consumes from 5-51 bits. Your observed 10-20 ms long pause seems about right.

300 bytes is a big sample, but OK. Have you inspected it frame by frame? (I provide one such decoder, in C, in my gitlab above.) That will tell you what bitrate to expect.
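
As a rough worked example of what those figures imply for refill deadlines, here is a small C sketch. The three constants are simply the numbers quoted in this post (40 frames/sec, 5 to 51 bits per frame); the function names are made up for illustration, and the result is only as good as those inputs.

```c
#include <assert.h>

#define FRAMES_PER_SECOND 40 /* figure quoted in the post */
#define MIN_FRAME_BITS     5 /* figure quoted in the post */
#define MAX_FRAME_BITS    51 /* figure quoted in the post */

/* Bit rate if every frame had the given length. */
static int bits_per_second(int frame_bits)
{
    return frame_bits * FRAMES_PER_SECOND;
}

/* Milliseconds of speech covered by `bytes` of FIFO data,
 * if every frame is `frame_bits` long. */
static double fifo_ms(int bytes, int frame_bits)
{
    assert(frame_bits > 0);
    double frames = (bytes * 8.0) / frame_bits;
    return frames * 1000.0 / FRAMES_PER_SECOND;
}
```

With these inputs the bit rate ranges from 200 to 2040 bits/s. In the worst case of 51 bits per frame, a full 16-byte FIFO covers about 63 ms of speech, and the 8 bytes left when the buffer goes low cover about 31 ms, which is roughly the time available to deliver the next 8 bytes.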
 

I've only driven the 5200 from a 4A, using the ISR code by John Philips (also in my directory.) This works like others say above: speech interrupt occurs, handler code checks status, feeds in  8 bytes, done. Repeat. 


My wild guess is that your setup garbles a byte in the FIFO on the 17th write that is blocked on READY. Try loading 16 then wait for the chip to say it wants more and send another 8. Either on interrupt, or you poll status.

 

 

 

 


Some general thoughts, based on my experience simulating the TMS5200 in software (Java code to be released, derived from the canonical code in MAME). It seems you have implemented and probably checked most of them, and some probably don't apply.

 

On the input side:

  • You mustn't forget the initial SPEAK EXTERNAL command (0x40).
  • The input is a contiguous stream of bits, packed in a stream of bytes.
  • The bit order is easy to get wrong, although it would consistently result in pure noise.
  • The LPC coding tables are different for the TMS5200 and the TMS5220, but not dramatically.
  • You probably want to end with a STOP frame (0xf) to leave the chip in a clean state.
  • I also get garbled speech when underflowing the LPC buffer.
  • You could start with a simple vowel sample of 16 bytes, e.g. stuffing it with compact REPEAT frames.
  • I assume the TMS9900 doesn't resend the 17th byte either when it is held up by the /READY line.
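
Since bit order is one of the easy things to get wrong, a small helper for flipping the bits in each byte can be handy when experimenting. This is a generic sketch, not part of any of the tools mentioned in this thread; whether your data needs reversing at all depends on how the LPC file was produced and on how D0..D7 are wired to your CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reverse the bit order of one byte: bit 0 <-> bit 7, bit 1 <-> bit 6, ... */
static uint8_t reverse_bits(uint8_t b)
{
    b = (uint8_t)(((b & 0xF0u) >> 4) | ((b & 0x0Fu) << 4)); /* swap nibbles */
    b = (uint8_t)(((b & 0xCCu) >> 2) | ((b & 0x33u) << 2)); /* swap pairs */
    b = (uint8_t)(((b & 0xAAu) >> 1) | ((b & 0x55u) << 1)); /* swap neighbors */
    return b;
}

/* Apply the reversal to a whole LPC buffer. */
static void reverse_bits_in_place(uint8_t *buf, size_t len)
{
    assert(buf != 0 || len == 0);
    for (size_t i = 0; i < len; i++)
        buf[i] = reverse_bits(buf[i]);
}
```

Running the same sample both ways is a quick test: per the note above, the wrong order should consistently give pure noise rather than half-recognizable speech.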

On the output side:

  • In software, you can use the analog output or the more accurate digital output.
  • In my experience, getting the order of the bits in the digital output wrong can result in vaguely recognizable speech.

14 hours ago, FarmerPotato said:

My wild guess is that your setup garbles a byte in the FIFO on the 17th write that is blocked on READY. Try loading 16 then wait for the chip to say it wants more and send another 8. Either on interrupt, or you poll status.

Thanks! So I changed the control flow of my setup to this (after sending the speak external command):

  1. Set N to 16
  2. Repeat for N bytes or until input buffer is done:
    • Bring /WS low
    • Immediately present the byte to write on the data bus (with D0 being MSB and D7 being LSB)
    • Wait for the /READY line to go high, then continue to hold both /WS low and the data bus value valid
    • When the /READY line returns low, then release /WS (bring it high) and return the data bus to high-Z state.
  3. Wait for /INT line to go low. Check the status register on the TMS5220
  4. If status register indicates talking has stopped:
    • If input buffer not done yet, send speak external command and set N to 16. Go to step #2.
    • If input buffer done, send stop frame and go to END
  5. If status register indicates buffer is low, set N to 8 and go to step #2
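
Steps 3 to 5 boil down to a small decision on the status byte, which can be sketched in C as below. The bit masks assume the status register is read with D0 as the most significant bit (TS = 0x80, BL = 0x40, BE = 0x20); the enum and function names are invented for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TMS_STATUS_TS 0x80u /* Talk Status */
#define TMS_STATUS_BL 0x40u /* Buffer Low */
#define TMS_STATUS_BE 0x20u /* Buffer Empty */

typedef enum {
    ACT_WAIT,       /* still talking, buffer not low: nothing to do yet */
    ACT_SEND_8,     /* step 5: buffer low, send the next 8 bytes */
    ACT_RESTART_16, /* step 4a: talking stopped early, restart with 16 */
    ACT_FINISH      /* step 4b: talking stopped and all input was sent */
} next_action;

/* Decide what to do after reading the status register. */
static next_action decide(uint8_t status, bool input_done)
{
    if (!(status & TMS_STATUS_TS))
        return input_done ? ACT_FINISH : ACT_RESTART_16;
    if (status & TMS_STATUS_BL)
        return ACT_SEND_8;
    return ACT_WAIT;
}

/* How many bytes the next pass through step 2 should send (N). */
static unsigned bytes_to_send(next_action a)
{
    switch (a) {
    case ACT_RESTART_16: return 16;
    case ACT_SEND_8:     return 8;
    default:
        assert(a == ACT_WAIT || a == ACT_FINISH);
        return 0;
    }
}
```

Keeping the decision in a pure function like this makes it easy to check against values captured with a logic analyzer.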

 

This certainly produces better results, as the rhythm of the noise resembles the speech rhythm of the sound sample I converted to LPC data, but it is still very garbled and not intelligible. That now makes me wonder whether my LPC data is any good. Does anybody have LPC data designed to be sent with the speak external command on the TMS5220?

 

 

 


8 hours ago, Eric Lafortune said:

 

  • You mustn't forget the initial SPEAK EXTERNAL command (0x40).

I thought the speak external command was x110xxxx, which would be 0x60. Isn't 0x40 the "load address" command for an address of 0x0?

 


There are a bunch of sample LPC strings in my gitlab. Mostly from game cartridges. (released and unreleased.)

 

How did you make your LPC data? 
 

Dumb question: have you verified that the actual bytes are on the data bus? And yeah, TI numbered from D0 as most significant.

What impedance do you present to the (analog) output? Crazy idea: run the output to the input of an SN76489, as in the 99/4A.

Apart from the 4A, TI always had filter and amp stages in the output.

You'll see a pre-emphasis filter option on the front end of LPC coding, like in Blue Wizard. This gives more attention to the higher, more essential speech frequencies. (Starting with Speak & Spell) TI would recommend an analog output stage to amplify the low end.

 


 


 

 


static unsigned char press_fire [] = { // From Parsec
    0x10, 0x80, 0x58, 0x43, 0x9B, 0x6A, 0x8A, 0x67, 
    0x46, 0xC8, 0xD9, 0xEA, 0xD1, 0x4C, 0x8E, 0x8C, 
    0x13, 0x96, 0x5B, 0x6B, 0xAA, 0x76, 0xD9, 0x5A, 
    0xC2, 0x24, 0x69, 0xD3, 0x85, 0x3A, 0x85, 0x1C, 
    0x03, 0x74, 0x59, 0x66, 0x01, 0x0B, 0x44, 0xC0, 
    0x03, 0x14, 0x60, 0x40, 0x56, 0xAE, 0x43, 0xD5, 
    0xD3, 0x23, 0xB4, 0x18, 0x2D, 0xCD, 0xEC, 0x88, 
    0xF0, 0x64, 0x7C, 0x74, 0xBB, 0x22, 0x22, 0x52, 
    0xCE, 0xD1, 0x5D, 0x4F, 0x6B, 0x2F, 0xD9, 0x47, 
    0xF7, 0xCD, 0xBD, 0xBC, 0xE8, 0x1C, 0xDD, 0x27, 
    0xF7, 0xB6, 0x92, 0x73, 0x74, 0x9F, 0xCC, 0xD7, 
    0x52, 0xEE, 0xD2, 0x7D, 0x91, 0x1C, 0x4F, 0x39, 
    0x43, 0xF7, 0x85, 0x73, 0xB4, 0xE4, 0x0C, 0xDD, 
    0x57, 0xAE, 0x96, 0x92, 0xDC, 0xF4, 0xD0, 0xA8, 
    0x93, 0x57, 0x6A, 0xD3, 0x6D, 0x92, 0x6C, 0x76, 
    0xC9, 0x55, 0x4F, 0xBA, 0xB7, 0x0E, 0xD9, 0x1A, 
    0xDB, 0x1B, 0x02, 0x88, 0x3A, 0x84, 0x03, 0x14, 
    0x00, 0x40, 0x80, 0x54, 0xE1, 0x00, 0x01, 0xF8, 
    0xCC, 0x35, 0x00, 0xDF, 0x8F, 0x26, 0xD9, 0x1B, 
    0xB5, 0x8E, 0xB7, 0x3C, 0xB5, 0xA9, 0x65, 0x9D, 
    0x98, 0xC8, 0x4E, 0x65, 0xB8, 0xAD, 0x60, 0x14, 
    0x6B, 0xE8, 0x54, 0x4E, 0x9B, 0x1E, 0x0D, 0x57, 
    0xCD, 0xDB, 0x9E, 0x7A, 0xD1, 0x93, 0x86, 0xDE, 
    0x4E, 0x4A, 0xDE, 0x20, 0x86, 0x21, 0x96, 0x63, 
    0xE9, 0x84, 0x01, 0xD3, 0xB0, 0x35, 0x33, 0x4A, 
    0xF5, 0xAE, 0x61, 0xB2, 0xCD, 0x69, 0x35, 0x2B, 
    0x18, 0x0B, 0xF7, 0x44, 0x9D, 0xED, 0xEA, 0x18, 
    0x57, 0x0B, 0x7B, 0x55, 0x57, 0xA5, 0x5C, 0x66, 
    0xF2, 0xC5, 0x9C, 0xB5, 0xF2, 0x14, 0xD3, 0x0F, 
    0x73, 0xC5, 0x0F
};
 


12 minutes ago, coolio said:

I thought the speak external command was x110xxxx, which would be 0x60. Isn't 0x40 the "load address" command for an address of 0x0?

 

You're right -- it's 0x60 indeed. This byte doesn't go into the LPC FIFO buffer. The bit order of the commands and the LPC data remain tricky. On the TI-99, you can send the command byte as-is, and LPC software will already reverse the LPC bits for you, so you can send the resulting LPC bytes unchanged. If it helps, my VideoTools project has ConvertLpcToText and ConvertTextToLpc to convert between binary and readable/editable LPC files.
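
To make the command encoding concrete, here is a small C sketch that classifies a command byte by its three command bits, with D0 treated as the most significant (and ignored) bit. Only the 0x60 and 0x40 patterns are confirmed in this thread; the other entries follow my reading of the TMS5220 data sheet and should be checked against your copy. The enum and function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    CMD_NOP,
    CMD_READ_BYTE,       /* x001xxxx */
    CMD_READ_AND_BRANCH, /* x011xxxx */
    CMD_LOAD_ADDRESS,    /* x100xxxx, low nibble carries address data */
    CMD_SPEAK,           /* x101xxxx */
    CMD_SPEAK_EXTERNAL,  /* x110xxxx */
    CMD_RESET            /* x111xxxx */
} tms_command;

/* Classify a command byte; the top bit and low nibble do not select it. */
static tms_command tms_decode_command(uint8_t byte)
{
    switch ((byte >> 4) & 0x7u) {
    case 1:  return CMD_READ_BYTE;
    case 3:  return CMD_READ_AND_BRANCH;
    case 4:  return CMD_LOAD_ADDRESS;
    case 5:  return CMD_SPEAK;
    case 6:  return CMD_SPEAK_EXTERNAL;
    case 7:  return CMD_RESET;
    default: return CMD_NOP; /* patterns 0 and 2 */
    }
}
```

So 0x60 and 0xE0 both decode as Speak External, while 0x40 decodes as Load Address with a nibble of 0, which matches the correction above.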


20 hours ago, JasonACT said:

static unsigned char press_fire [] = { // From Parsec
    0x10, 0x80, 0x58, 0x43, 0x9B, 0x6A, 0x8A, 0x67, 
...

Thanks! I used this bit stream and it seemed to work, but the speech still sounds poor. Listen to the audio here:

 

 

 

It was the Parsec "Press Fire to Begin" sound, I assume, but the audio quality is very poor. I wonder if the poor quality is caused by the audio amplifier I am using. It's just a simple LM386 circuit similar to the second one documented here. As a comparison, here is an audio file of my own that I ran through TMS-Express:

 

 

Both samples have static noise problems. 

 

For clarity, I am using the audio filtering circuit in Thierry Nouspikel's documentation of the TI Speech Synthesizer circuit. Specifically, the SPEAKER line from the TMS5220 is connected to a 0.22 µF capacitor and a 1.8 kΩ resistor to ground, with a 1 µF capacitor inline with the audio line. This then feeds the LM386 audio amp, which is driving an 8 ohm, 0.5 W speaker.

 

Also, since making the recommended change to use the interrupt line as the cue for more data, the logic analyzer picture looks much better (the /READY line goes high for only 20 µs at a time). So, like I said, I wonder how much of this is my audio circuit now.

 

Any pointers would be much appreciated.

 

Thanks!

 

 


OK, progress here. I moved away from the LM386 audio amp to some computer speakers with their own amplifier, and that fixed the audio quality issue. The Parsec sample @JasonACT provided sounds just like it did on my TI-99/4A, with one exception. The speech synthesis gets about 3/4 of the way through the data and then just halts. What I hear is "Press fire t<bzzzt>". About the time the sound goes south, the TS (talk status) bit on the TMS5220 goes to 0. But the thing is, I don't get an interrupt prior to this indicating that the buffer is low on data. The process always stops at the same point in the data stream, meaning the talking stops before all the data is sent. I am not sure how to debug this. Has anyone experienced anything like this before?

 

I looked at @Asmusr's TMS5220 handling code for some of the games he published the source to. I see that in his code, instead of waiting on the TMS5220 interrupt pin to send 8 more bytes to the FIFO buffer, he polls the BL (buffer low) status bit. I changed my code to do that and get the exact same results (about 3/4 of the data stream is successful, and then talking stops).

 

I also tried restarting: when I get a premature TS=0, I reissue the speak external command and pick up the data transmission where it previously left off. This doesn't help.

 

Any thoughts or ideas on this? Thanks!

 


  • 2 months later...

OK, I have made progress here. After literally months of trying (not every day, just during "hobby time") I came to the conclusion that my problem was the breadboard on which I was building my system (too much noise and capacitance). So I moved my hardware design to a PCB and got it working. Yay! The software approach for the speak external command was to send the first 16 bytes, then poll the TMS5220 status register for the buffer low status and send 8 more bytes each time it appears. Works perfectly. And the beauty of my hardware design is that I don't have to put the CPU into a wait state while the TMS5220 latches onto the data (as indicated by the READY line). I'll share more on the details of this design later (I plan to make a video about it).

 

Now I have a new problem: what are the best practices for encoding custom audio into LPC bitstream? I am using TMS Express to do the conversion from a WAV file to an LPC bitstream, and that in general works, but frequently the audio from the TMS5220 is just awful, notably during times like inter-word silences. Here is an example: 

 

Note that in the second sample (the classic Daisy song) there is a large amount of static in between words and at the "tail" of words. 

 

Any advice on the best practices when LPC encoding custom audio would be appreciated!

 

Thanks!

 

 

 

 


From documents I've seen recently from the 1980s: LPC was always cleaned up by a human. The human editor would step through the frames (i.e. play from 0-20, 0-21, etc.) to find syllable boundaries and word boundaries. They would then chop out noise frames. Another necessity was smoothing discontinuities in pitch or other parameters.
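
As one possible illustration of "smoothing discontinuities in pitch", here is a 3-point median filter over a list of per-frame pitch values in C. To be clear, this is a swapped-in, generic technique and not what the 1980s editors or any tool named in this thread actually did. It assumes the frames have already been decoded into an array of pitch indices and that a pitch of 0 marks an unvoiced frame, which is left untouched.

```c
#include <assert.h>
#include <stddef.h>

/* Median of three values. */
static int median3(int a, int b, int c)
{
    if (a > b) { int t = a; a = b; b = t; } /* now a <= b */
    if (b > c) b = c;                       /* b = min(b, c) */
    return a > b ? a : b;                   /* max(a, min(b, c)) */
}

/* Replace isolated pitch outliers with the median of their neighborhood.
 * Edge frames, unvoiced frames (pitch 0), and frames next to an unvoiced
 * frame are copied unchanged. `in` and `out` must not overlap. */
static void smooth_pitch(const int *in, int *out, size_t n)
{
    assert(in != 0 && out != 0);
    for (size_t i = 0; i < n; i++) {
        int edge = (i == 0 || i + 1 == n);
        if (edge || in[i] == 0 || in[i - 1] == 0 || in[i + 1] == 0)
            out[i] = in[i];
        else
            out[i] = median3(in[i - 1], in[i], in[i + 1]);
    }
}
```

For example, the pitch sequence 30, 31, 60, 32 becomes 30, 31, 32, 32: the single jump to 60 is removed while the surrounding contour is kept. Whether that actually sounds better still has to be judged by ear.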

 

I don't know anything about how it was done after 1985.  I haven't done this myself. 

 


The "hello world" sounds considerably better than the "Daisy, Daisy" song. From what I understand, the speech synth works better with some voices than with others. It may be that the synth can't handle singing like this, where each word is very elongated. I'd be trying to optimise the quality for 'normal' speech, which is what the synth was designed for.


For audio-to-speech, it's important to start from clean speech audio: avoid poor recordings, background noise, background instruments, echoes, overdubbing,... LPC speech encoding is based on a simple model of a vocal tract, so best suited for simple voices. High-pitched voices and songs are more difficult because of the sample frequency and the limited number of available LPC frequencies.

 

It then helps to play with the options of the tool that you are using (QBox Pro, Blue Wizard, python_wizard, TMS Express, Praat) : notably frequency range of voiced frames, stability of frequencies, threshold between voiced and unvoiced frames, pre-emphasis, frame sizes. In my experience, this requires a lot of trial and error because of the combinatorial explosion of possible settings.

 

I still need to document and release my own audio-to-speech tool, which I created in the hopes of solving this problem. It includes an exact simulation of the TMS5200/TMS5220, to automatically tune the results and clean up clicks and pops resulting from quirks/bugs in the processor. It turned out to be a rabbit hole with months of research in academic papers and technical implementations. The problem remains non-trivial. It did help to create much better results for the vocals of the Bad Apple demo, also to be released (as version 2). Let me aim for October.
