cph1776, posted April 16, 2016:
Hello all. I'm running Speecoder v. 1.1E on Win99/4A v3.009. The program loads fine, but when I try to decode something, I don't get the full list of frames (sets of bits sent to the synthesizer).

I select "Decode," then "Vocabulary," and input a vocabulary word (such as "HELLO") at the prompt. Then I go back to "Decode" and select "Examine." It only shows one line, with the number 15 (=1111, the "End of Data" code for the speech synthesizer) and "7998 bytes free" (out of 8000). Printing the list just shows the same thing: the 15 and nothing else.

Anything I'm missing here? (I haven't tried it on MESS or V9T9 yet. It'll be quite some time before it gets to my "real" TI 99/4A, as I still need to verify that I have a working disk drive.)

Thanks in advance for any help.
+mizapf, posted April 16, 2016:
Oh, nice to see. I suppose you heard the "HELLO"? I can't tell how good the emulation of Win99/4A is in that respect, but in MESS it worked the last time I tried.
cph1776 (Author), posted April 16, 2016:
Hallo Herr Zapf! Thanks for replying. Yes, the synth did speak. I'm figuring it might have something to do with Win994a's speech emulation code.

I'll try MESS. It's a bit more of a pain to use (the command line syntax is hairy, and I don't have a front-end for it yet), but its emulated speech does sound better than Win994a's.
+mizapf, posted April 16, 2016:
> I'll try MESS. It's a bit more of a pain to use (the command line syntax is hairy ...)

like

mess ti99_4a -cart editass -peb:slot2 32kmem -peb:slot3 speech -peb:slot8 hfdc -flop1 yourdisk.dsk

Save it to a script or batch file, and just forget the command line.
cph1776 (Author), posted April 16, 2016:
Just what the doctor ordered. I tried it in MESS and it worked fine. I've yet to play around with the parameters to see what sounds it can make; I will do so later.

Question: When I do a CALL SPGET and get the speech data, is that the same data that SPEECODER displays?

Once again, thanks for your help!
+mizapf, posted April 16, 2016:
> Question: When I do a CALL SPGET and get the speech data, is that the same data that SPEECODER displays?

Depends on the meaning of "same". The data you get with SPGET is the speech ROM content, grouped into bytes. The speech synthesizer works on a bit basis to feed its filters. SPEECODER's job is to decode the byte sequence into the separate values and to encode it again after changing.

For instance, for "A", SPGET delivers the following byte sequence:

96 0 28 167 138 206 37 167 42 221 ...

Strip the first three bytes. The remaining bytes can be shown as this bit sequence:

10100111 10001010 11001110 00100101 10100111 00101010 11011101 ...

However, the speech ROM is read in the reverse direction. Turn around each byte:

11100101 01010001 01110011 10100100 11100101 01010100 10111011 ...

Energy is 4 bits, repeat is 1 bit, pitch is 6 bits, K1 is 5, K2 is 5, K3 to K7 are 4 bits each, and K8 to K10 are 3 bits each:

1110 0 101010 10001 01110 0111 0100 1001 1100 1010 101 010 010 1110 1 1...

and SPEECODER shows this:

14 0 42 17 14 7 4 9 12 10 5 2 2
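The decoding steps in this post can be sketched in a few lines of Python. This is only an illustration, not SPEECODER's actual code: the names `FIELDS`, `reverse_bits`, and `decode_first_frame` are made up for the example, and it handles only a single full 50-bit frame like the one shown for "A".

```python
# Field widths as listed in the post: energy, repeat, pitch, K1..K10.
FIELDS = [("energy", 4), ("repeat", 1), ("pitch", 6),
          ("K1", 5), ("K2", 5), ("K3", 4), ("K4", 4), ("K5", 4),
          ("K6", 4), ("K7", 4), ("K8", 3), ("K9", 3), ("K10", 3)]


def reverse_bits(byte):
    """Turn a byte around: bit 0 becomes bit 7 and so on."""
    return int(f"{byte:08b}"[::-1], 2)


def decode_first_frame(spget_bytes):
    """Decode the first frame of an SPGET-style byte list (hypothetical helper)."""
    payload = spget_bytes[3:]  # strip the three leading bytes
    bits = "".join(f"{reverse_bits(b):08b}" for b in payload)
    values, pos = [], 0
    for _name, width in FIELDS:
        values.append(int(bits[pos:pos + width], 2))
        pos += width
    return values


sample = [96, 0, 28, 167, 138, 206, 37, 167, 42, 221]
frame = decode_first_frame(sample)
print(frame)  # [14, 0, 42, 17, 14, 7, 4, 9, 12, 10, 5, 2, 2]
```

The printed list matches the values SPEECODER shows for "A". Shorter frame types (stop codes, repeat frames) would need extra handling that this sketch leaves out.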
cph1776 (Author), posted April 16, 2016:
OK, I get it. I had written a BASIC program that did a CALL SPGET, converted the returned string into a string of bits, then tried to arrange them by frames as described in the TMS5220 datasheet. I knew something was wrong when I started getting "End of data" codes (1111) in the middle of the speech string. It didn't make sense.

I did know enough to strip off the first byte, which is always a 96 (>60) - the "Speak External" command. But I didn't know about the other two leading bytes, or that the rest of the bytes had to be reversed. I'll add that to my program and see if that helps.

Thanks again for the advice!
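Arranging the bits by frames can be sketched as below. This is a rough Python illustration, not the BASIC program described in the post. It assumes the TMS5220 frame rules as I understand them from the datasheet: energy 15 is the stop code, energy 0 is a 4-bit silent frame, a set repeat bit ends the frame after the pitch, and pitch 0 marks an unvoiced frame that carries only K1 to K4. The function name `walk_frames` and the tuple layout are invented for the example.

```python
K_WIDTHS = [5, 5, 4, 4, 4, 4, 4, 3, 3, 3]  # bit widths of K1..K10


def walk_frames(bits):
    """Split a string of '0'/'1' characters (already in synthesizer order) into frames."""
    frames = []
    pos = 0

    def take(width):
        nonlocal pos
        if pos + width > len(bits):
            raise ValueError("ran out of bits before a stop code")
        value = int(bits[pos:pos + width], 2)
        pos += width
        return value

    while True:
        energy = take(4)
        if energy == 15:            # 1111: end of data
            frames.append(("stop",))
            return frames
        if energy == 0:             # silent frame, nothing else follows
            frames.append(("silence",))
            continue
        repeat = take(1)
        pitch = take(6)
        if repeat:                  # reuse the previous K values
            frames.append(("repeat", energy, pitch))
            continue
        count = 4 if pitch == 0 else 10
        ks = [take(w) for w in K_WIDTHS[:count]]
        frames.append(("voiced" if pitch else "unvoiced", energy, pitch, ks))


# One full frame, two repeat frames, then the stop code.
demo = "1110" "0" "101010" "10001" "01110" "0111" "0100" "1001" "1100" "1010" "101" "010" "010"
demo += "1110" "1" "101010" * 1 + "1110" "1" "101010" + "1111"
frames = walk_frames(demo)
```

A 1111 that turns up where an energy field is expected really does end the data, so seeing it mid-string is a sign that the bits were read in the wrong order or from the wrong starting byte.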
cph1776 (Author), posted April 17, 2016:
That did the trick! (Reversing the bytes, that is.) Thanks for your help.

There are two things I'm interested in doing with the synthesizer:

a. Develop a way to convert a WAV file into the codes used by the speech synthesizer. QBOX does this, but it's a bigger pain to set up than MESS :-) and it is poorly supported. If I can use a program like SOX to convert a WAV to LPC, then another program to convert the LPC file into one the TI understands, that would be a good thing.

b. Develop a way to get the speech synthesizer to say any word, without resorting to the TE II (can't be used with Extended Basic) or speech support subroutines (take too long to load, space on disk, etc.) There were some early BASIC programs that allowed the user to cut apart the speech data strings and concatenate them, but that solution ignored the underlying bit structure of the frames, and wasn't all that flexible (or good-sounding).
+OLD CS1, posted April 17, 2016:
SoX does LPC? SoX does LPC! What is the relationship between the LPC format of the 5220 and the LPC-10E spec supported by SoX? (Man, I really love SoX! I have to see if the newest version has been ported to the Amiga.)
+mizapf, posted April 17, 2016:
> I did know enough to strip off the first byte, which is always a 96 (>60) - the "Speak External" command. But I didn't know about the other two leading bytes, or that the rest of the bytes had to be reversed.

Section 5.2 ("FIFO Buffer"):

"... As required by the synthesis section, data is shifted out serially starting with the LSB from the "First-In" byte. When this byte has been exhausted, the stack ripples down one byte and begins shifting from the next "First-In" byte."

I actually forgot about the second and third byte of the SPGET output at first, but remembered when I showed the binary representation. I think the second and third byte are just a length word for the following speech data.
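The FIFO behaviour quoted here can be mimicked with a small Python generator. This is a sketch for illustration only; `lsb_first_bits` is a made-up name. It shows that shifting each byte out from its least significant bit gives the same stream as reversing every byte and reading it left to right.

```python
def lsb_first_bits(data):
    """Yield the bits of each byte starting with the LSB, byte by byte."""
    for byte in data:
        for i in range(8):
            yield (byte >> i) & 1


# First two speech bytes of "A" from earlier in the thread.
stream = "".join(str(bit) for bit in lsb_first_bits([167, 138]))
print(stream)  # 1110010101010001
```

The first four bits, 1110, are the energy value 14 of the first frame.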
cph1776 (Author), posted April 17, 2016:
> I actually forgot about the second and third byte of the SPGET output at first, but remembered when I showed the binary representation. I think the second and third byte are just a length word for the following speech data.

I checked it with a LEN function: the second byte is 0, and the third is the length of the actual speech data (not counting the three leading bytes mentioned).
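A short Python sketch of splitting off those three leading bytes follows. Treating bytes two and three together as a 16-bit length, high byte first, is an assumption based on the guess above; the thread only confirms that the second byte was 0 and the third matched the data length. `split_spget` is a hypothetical helper name.

```python
def split_spget(data):
    """Return (command, length, speech_bytes) from an SPGET-style byte list."""
    command = data[0]                   # 96 (>60), "Speak External"
    length = (data[1] << 8) | data[2]   # assumed: high byte, then low byte
    return command, length, data[3:]


# Truncated "A" data from earlier in the thread (only the first bytes are known).
command, length, speech = split_spget([96, 0, 28, 167, 138, 206, 37, 167, 42, 221])
print(command, length)  # 96 28
```

With a complete string, `len(speech)` should equal `length`, which is the check described in this post.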
cph1776 (Author), posted April 17, 2016:
> SoX does LPC? SoX does LPC! What is the relationship between the LPC format of the 5220 and the LPC-10E spec supported by SoX? (Man, I really love SoX! I have to see if the newest version has been ported to the Amiga.)

Yes, SOX does LPC. However, the frames are not directly compatible with the TI SS chip (the bits corresponding to pitch, filter coefficients, etc. are arranged differently).

So far, I haven't found anything definitively showing how the bits in LPC-10E are arranged, although the information must be out there somewhere. I *think* the bits may be organized as shown on pg. 17 of this pdf: http://my.fit.edu/~vkepuska/ece5525/lpc_paper.pdf, but I haven't verified it.
+Ksarul, posted April 17, 2016 (edited April 17, 2016 by Ksarul):
You might want to try and find a copy of A Practical Handbook for Speech Encoders, as that seems to be a deep-dive into all of the encoders current when it was written, including LPC-10. Here's a PowerPoint from 2004 with a detailed teardown of the specification (general bitstream breakout is on slide 7). Note that LPC-10e is also called STANAG 4198 and FS-1015, so when you add those two names to your search string, you might find more.

I just went hunting and turned up a copy of STANAG 4198 online. Annex B contains the entire coding standard and explains the bitstream. If you need the STANAG, just hunt it up, as the resulting document comes back as a watermarked copy from the source I found. It is a free download, but it requires registration.

FS-1015 was also later changed to FIPS-137, but that document was obsoleted back in 2000, and I haven't been able to find a copy of it online. It looks like the source I found for the STANAG is the only one out there, as all roads lead back to them.
cph1776 (Author), posted April 17, 2016:
> You might want to try and find a copy of A Practical Handbook for Speech Encoders, as that seems to be a deep-dive into all of the encoders current when it was written, including LPC-10. Here's a PowerPoint from 2004 with a detailed teardown of the specification (general bitstream breakout is on slide 7). Note that LPC-10e is also called STANAG 4198 and FS-1015, so when you add those two names to your search string, you might find more. I just went hunting and turned up a copy of STANAG 4198 online. Annex B contains the entire coding standard and explains the bitstream. If you need the STANAG, just hunt it up, as the resulting document comes back as a watermarked copy from the source I found. It is a free download, but it requires registration.

Thanks. I downloaded the STANAG 4198, and have a copy on my computer now. Page C-3 has the actual ordering of the bits, which I have not seen anywhere else. This should be very useful in making a LPC10->TMS5220 filter.

One day I will look at how the K values are derived. It looks like a lot of math--solving matrices and such. I haven't touched any of that math since my college days, mid-1980s!
cph1776 (Author), posted April 18, 2016:
Well, folks, I just wrote an Extended BASIC program that can generate arbitrary speech sounds. Right now it just generates 21 frames (about 1/2 second) of the "EEE" sound (as in the word "meet").

Here is the source. It could probably be optimized to run a bit faster, but it does work...

600 !SPEECHLAB = TRY TO CREATE A SPEECH SOUND
605 ! "eee" SOUND
610 EN$="1100" ! ENERGY=12
620 RP$="0" ! NOT A REPEAT FRAME
630 PT$="101000" ! PITCH=40
640 K1$="10100" ! K1=20
641 K2$="10000" ! K2=16
642 K3$="0011" !K3=3
643 K4$="0010" !K4=2
644 K5$="0101" !K5=5
645 K6$="1101" !K6=13
646 K7$="1011" !K7=11
647 K8$="101" !K8=5
648 K9$="010" !K9=2
649 K10$="011" !K10=3
650 X1$=EN$&RP$&PT$&K1$&K2$&K3$&K4$&K5$&K6$&K7$&K8$&K9$&K10$
660 BC=0 :: CH$=CHR$(0)
670 FOR X=1 TO LEN(X1$):: BV=VAL(SEG$(X1$,X,1)):: CALL SETBIT(CH$,BC,BV)
675 BC=BC+1 :: NEXT X
679 ! REPEAT 20 FRAMES
680 RP$="1"
682 X2$=EN$&RP$&PT$
683 X2$=RPT$(X2$,20)
690 X2$=X2$&"1111" !STOP CODE
700 FOR X=1 TO LEN(X2$):: BV=VAL(SEG$(X2$,X,1)):: CALL SETBIT(CH$,BC,BV)
701 BC=BC+1 :: NEXT X
709 !FLIP BITS FOR CALL SAY
710 C2$="" :: FOR X=1 TO LEN(CH$):: A=ASC(SEG$(CH$,X,1)):: CALL FLIP(A,B)
712 C2$=C2$&CHR$(B):: NEXT X
720 C2$="`"&CHR$(0)&CHR$(LEN(C2$))&C2$
730 CALL SAY(,C2$)
735 ACCEPT A$ :: IF A$<>"N" THEN 730

The program relies on two subprograms: FLIP, which reverses bits in a byte, and SETBIT, which sets/clears bits in a string:

9000 SUB FLIP(A,B)
9001 !REVERSES BIT PATTERN IN A, OUTPUT IN B
9005 B=0
9010 FOR X=0 TO 7 :: M=2^(7-X):: N=2^X
9030 IF (A AND M)THEN B=B OR N
9080 NEXT X
9090 SUBEND
30000 SUB SETBIT(CH$,BIT,A)
30010 B1=INT(BIT/8+1)
30011 L=LEN(CH$):: IF B1>L THEN CH$=CH$&RPT$(CHR$(0),B1-L)
30020 R=(BIT/8-INT(BIT/8))*8
30030 M=2^(7-R)
30040 V=ASC(SEG$(CH$,B1,1))
30050 IF A=1 THEN V=V OR M
30060 IF A=0 THEN V=V AND(NOT M)
30070 CH$=SEG$(CH$,1,B1-1)&CHR$(V)&SEG$(CH$,B1+1,255)
30999 SUBEND

You're welcome to play around with this, if you're interested.

One thing that I noticed, and that almost threw me off a bit, is that in MESS, the command CALL SAY("E") will sound as intended, the "ee" sound in "meet." On the other hand, using Win994a, CALL SAY("E") sounds like the "e" sound in "met." Hmm....
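For anyone who wants to check the byte string off the TI, here is a rough Python equivalent of what the listing assembles. It is a sketch under assumptions, not a port: `reverse_bits` and `build_speech_string` are invented names, the bit strings are copied from lines 610 to 649 and 682, and the three-byte header (96, 0, length) follows the layout discussed earlier in the thread.

```python
def reverse_bits(byte):
    """Mirror the 8 bits of a byte, like the FLIP subprogram."""
    return int(f"{byte:08b}"[::-1], 2)


def build_speech_string(first_frame, repeat_frame, repeats):
    """Pack frame bit strings into bytes ready for a CALL SAY direct string."""
    bits = first_frame + repeat_frame * repeats + "1111"  # stop code last
    bits += "0" * (-len(bits) % 8)                        # pad the final byte with zeros
    payload = bytes(reverse_bits(int(bits[i:i + 8], 2))
                    for i in range(0, len(bits), 8))
    # Header: Speak External command, then the assumed length bytes.
    return bytes([0x60, 0x00, len(payload)]) + payload


# EN, RP, PT, K1..K10 as in the BASIC listing.
FIRST = "1100" "0" "101000" "10100" "10000" "0011" "0010" "0101" "1101" "1011" "101" "010" "011"
REPEAT = "1100" "1" "101000"

speech = build_speech_string(FIRST, REPEAT, 20)
print(len(speech), list(speech[:4]))  # 38 [96, 0, 35, 163]
```

The 274 bits (50 + 20 × 11 + 4) fill 35 bytes, so the whole string is 38 bytes long, well inside the 255-character limit of an Extended BASIC string.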
+mizapf, posted April 18, 2016:
Hehe ... I could have a look for my two first Extended Basic programs which were published in the German TI Revue magazine around 1986; I called them "DECODE" and "ENCODE". It's not hard to guess what happened to that idea later.

BTW, I wrote SPEECODER during my basic military service 1988 / 89. Not much to do in the evenings.
cph1776 (Author), posted April 20, 2016:
> BTW, I wrote SPEECODER during my basic military service 1988 / 89. Not much to do in the evenings.

I think my last serious programming (not just games or telecomm) was probably in 1983. Then I "rediscovered" the TI in 1990 and wrote a few more programs for it until 1995 or 1996. Then I discovered the emulators in early 1997.
cph1776 (Author), posted April 24, 2016:
Hello, all. I'm currently developing a perl script to convert the LPC-10 files from SOX into the format used by the TI chip. I want to do a bit more testing before I turn it loose on the world, though. Maybe in a week or so, depending on other obligations...

Meanwhile, I did move a couple of the speech demo files made with QBOX from Win99/4A to MESS. They sound much better--more bass in the MESS speech synthesizer.