
Sample playback at 60hz


Tursi


Earlier this year I finally found a page that described FFTs in a way that made sense enough to me, and I started playing with audio conversion. I know it's been done before, but this is something I've wanted to attempt for many years.

 

After lots and lots of experimenting, I hit an unexpected result - I got (I think) recognizable voice out of it. There's a lot of work left before it's useful, but I wanted to post a sample.

 

 

It only has any luck right now on high-pitched voices ('Let it Go' and 'Smile Smile Smile' work best), but it was interesting enough that I thought I'd share a teaser. The converter itself is still not working well enough for a release (and perhaps it will never be more than a novelty; it's still very noisy).

 

The sample heard here takes 4606 bytes. It plays back by changing the sound generator settings 60 times a second, 11 bytes each time, so the raw rate is 660 bytes per second; the 4606 bytes is the 'compressed' size. The output is too random to compress well, and in real usage it probably would not be compressed, but I had the compressed playback toolchain already written ;)
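To make the 11-bytes-per-frame arithmetic concrete, here is a minimal sketch of how one frame of settings could be packed. It assumes the standard SN76489-style command bytes (three tone voices at 2 bytes each, three tone volumes, one noise control byte, one noise volume byte); the actual layout used by the converter in the zip may differ, and the names `pack_frame`, `FRAMES_PER_SECOND` and `BYTES_PER_FRAME` are made up for illustration.

```python
# Hypothetical frame layout, assuming SN76489-style command bytes.
# This is NOT taken from the converter's source; it only shows one way
# that 3 tones + 3 volumes + noise + noise volume adds up to 11 bytes.

FRAMES_PER_SECOND = 60
BYTES_PER_FRAME = 11
RAW_BYTES_PER_SECOND = FRAMES_PER_SECOND * BYTES_PER_FRAME  # 660


def pack_frame(dividers, tone_volumes, noise_control, noise_volume):
    """Pack one 1/60 s frame of sound chip settings into command bytes.

    dividers      -- three 10-bit tone dividers (0..1023)
    tone_volumes  -- three 4-bit attenuations (0 = loudest, 15 = silent)
    noise_control -- 3-bit noise type/rate value
    noise_volume  -- 4-bit attenuation for the noise voice
    """
    out = bytearray()
    for channel, divider in enumerate(dividers):
        # First byte: latch bit, channel number, low 4 bits of the divider.
        out.append(0x80 | (channel << 5) | (divider & 0x0F))
        # Second byte: high 6 bits of the divider.
        out.append((divider >> 4) & 0x3F)
    for channel, attenuation in enumerate(tone_volumes):
        out.append(0x90 | (channel << 5) | (attenuation & 0x0F))
    out.append(0xE0 | (noise_control & 0x07))
    out.append(0xF0 | (noise_volume & 0x0F))
    return bytes(out)


frame = pack_frame([254, 127, 85], [0, 2, 4], 5, 15)
print(len(frame), "bytes per frame,", RAW_BYTES_PER_SECOND, "bytes per second raw")
```

Under that assumed layout, every frame rewrites all eleven settings whether they changed or not, which is where the fixed 660 bytes per second comes from.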

 

The program (and source to prove it) is here. I didn't include the WAV file since it bloats the zip, but I did include the VGM, and you can play that in any VGM player to show it's not faked. ;) The file SOUNDSAMPLE in the zip is a TIFILES-compatible file ready to copy to emulation or real hardware. (I haven't tried it on real hardware. If anyone does, can we get a video of that?)

 

convertsoundsample.zip


Mixed results; I've run a lot of different things through it. The results are good enough to tell me that the concept is working and that a theory I had many years ago, that speech is possible with 60hz playback, holds up. But they are not good enough to be usable in any way, and it's difficult to make progress. I need to write a visualizer so that I can see, in real time, the DFT results and the pitches selected for the TI, so I can debug the selection code and look for patterns. (For instance, the noise detection doesn't work at all and I'd like to see why not. I'm also not convinced the volume detection works.) FWIW, this is a few steps along the path from my first attempt ;)
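For anyone following along, here is a rough sketch of what a "pick the pitches for the TI" step could look like. It is not the selection code from the converter; it assumes the usual TI tone formula (frequency = 111860.8 / divider, with a 10-bit divider), 2 dB per attenuation step, and a simple "three loudest peaks win" rule. The function names and the `(frequency, magnitude)` peak format are invented for the example.

```python
import math

# Assumption: tone frequency in Hz = 111860.8 / divider (3.579545 MHz / 32).
TONE_CLOCK_HZ = 111860.8


def frequency_to_divider(freq_hz):
    """Nearest 10-bit divider for a target frequency, clamped to 1..1023."""
    divider = round(TONE_CLOCK_HZ / freq_hz)
    return max(1, min(1023, divider))


def magnitude_to_attenuation(magnitude, full_scale):
    """Map a linear magnitude to a 4-bit attenuation (2 dB per step)."""
    if magnitude <= 0:
        return 15  # silent
    ratio = min(magnitude / full_scale, 1.0)
    db_down = -20.0 * math.log10(ratio)
    return min(15, round(db_down / 2.0))


def select_voices(peaks, full_scale, voices=3):
    """Keep the loudest peaks and convert each to (divider, attenuation).

    peaks is a list of (frequency_hz, magnitude) pairs, e.g. from a DFT.
    """
    loudest = sorted(peaks, key=lambda p: p[1], reverse=True)[:voices]
    return [
        (frequency_to_divider(freq), magnitude_to_attenuation(mag, full_scale))
        for freq, mag in loudest
    ]


peaks = [(440.0, 1.0), (880.0, 0.5), (1320.0, 0.1), (2000.0, 0.05)]
for divider, attenuation in select_voices(peaks, full_scale=1.0):
    print(divider, attenuation)
```

Note that with a 10-bit divider the lowest reachable tone under this formula is about 109 Hz, so anything below that gets clamped in this sketch.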

 

Even simple things like chip tunes don't come out well. They are recognizable, but there's a lot of pitch warble. I suspect it's caused by misalignment in the sine waves causing a single tone to become two or three depending on the exact cycle, but the visualizer will make it a lot easier to prove that theory. (I have some compensation in there for that: I sample a longer period than I'm actually going to use, but... yeah. Lots of experiments going on in there.) That said, it does fine on a single-tone pure sine wave, which at least suggests the code is on the right track.
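One way to read the "single tone becomes two or three" theory is as ordinary spectral leakage: a tone that does not complete a whole number of cycles inside the analysis window spreads across neighbouring DFT bins. The sketch below is only an illustration of that effect with a naive DFT; the window length of 128 samples and the 0.5 threshold are arbitrary choices for the example, not values from the converter.

```python
import cmath
import math

N = 128  # samples per analysis window (arbitrary for this demo)


def tone(cycles_per_window):
    """A sine wave that completes the given number of cycles in N samples."""
    return [math.sin(2 * math.pi * cycles_per_window * i / N) for i in range(N)]


def dft_magnitudes(samples):
    """Naive O(n^2) DFT, returning magnitudes for bins 0 .. n/2 - 1."""
    n = len(samples)
    return [
        abs(sum(s * cmath.exp(-2j * math.pi * k * i / n)
                for i, s in enumerate(samples)))
        for k in range(n // 2)
    ]


def strong_bins(samples, threshold=0.5):
    """Bins whose magnitude is at least `threshold` times the largest bin."""
    mags = dft_magnitudes(samples)
    peak = max(mags)
    return [k for k, m in enumerate(mags) if m >= threshold * peak]


print("aligned tone:   ", strong_bins(tone(10.0)))
print("misaligned tone:", strong_bins(tone(10.5)))
```

The aligned tone lands in a single bin, while the half-bin-off tone shows up as two comparably strong neighbours. If the selection code then alternates between those neighbours from frame to frame, that would sound like warble, though that part is a guess.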

 

I did a lot of research as well. Even the best musical converters I could find did not do a very good job, certainly not usable. But I found an interesting discussion somewhere about a research project that used this technique for speech. Playing with their tool, though, you needed at least 6 voices before it was any good. All that makes me wonder if this is a dead-end approach anyway and may not get much better. But it's fun to keep tweaking...

 

This was the page I found that covered how FFTs work - I meant to post it: http://www.earlevel.com/main/2002/08/31/a-gentle-introduction-to-the-fft/


It only has any luck right now on high pitched voices ('Let it Go' and 'Smile Smile Smile' work best), but it was interesting enough I thought I'd share a teaser. The converter itself is still not working well enough for a release (and perhaps it will never be more than a novelty, it's very noisy still).

 

Even with the noisy quality, you have succeeded in lodging this particular song in my brain for the past two hours! I can't let it go!!!


I've downloaded but won't be able to read it tonight.

 

I suspect the answer is yes, though... I did a test once where I replaced the sine waves in an FM synthesis emulator with square waves to see if the same concepts would work on TI-like chips, and it sounded just fine. Heck, in THIS experiment I tried running the DFT using square waves instead of sine waves, and that actually worked too (only slightly less accurately). I thought it might be a better match for the chip, but the end result didn't sound better.

 

Phase control is our big loss... I had to give up a hope I'd had for years about stacking the voices to get more resolution in volume control for sample playback. When I actually did it, the lack of phase control resulted in complex and unpredictable waveforms instead of the stacking I'd hoped for.
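A toy illustration of why phase matters for stacking, assuming two idealized square-wave voices at the same frequency and volume (real chip output is messier than this): depending only on the relative phase, the sum can double, cancel, or land somewhere in between. The period of 16 samples and the function names are arbitrary choices for the demo.

```python
PERIOD = 16  # samples per square-wave cycle (arbitrary for this demo)


def square_voice(i, offset=0):
    """Idealized square wave: +1 for the first half of each cycle, else -1."""
    return 1 if ((i + offset) % PERIOD) < PERIOD // 2 else -1


def stacked(offset, length=2 * PERIOD):
    """Sum of two equal voices, the second shifted by `offset` samples."""
    return [square_voice(i) + square_voice(i, offset) for i in range(length)]


for offset in (0, 4, 8):
    levels = sorted(set(stacked(offset)))
    print("offset", offset, "-> output levels", levels)
```

With no way to set or even know the offset on the real chip, the stacked result is whichever of these cases you happen to get, which fits the unpredictable waveforms described above.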

