The 2600 isn't widely hailed for its luxurious sound.
The TIA chip, which is responsible for the sound of the 2600 as well as the video, was designed with fairly minimal audio capabilities. On its own it can make fairly simple tones, rumbles, and white-noise-type sounds.
Similar to how you need to "race the beam" to draw anything useful with a 2600, you also need to "ride the speaker" to add interesting audio texture to your sound effects, changing the frequency and volume content as the sound plays.
Further complicating the business of making interesting sounds is the fact that there are only a handful of frequencies to pick from in TIA's upper range, stemming from TIA's base-frequency-divider design: each available pitch is the base frequency divided by a whole number, so the highest pitches sit far apart. If you want a high-pitched sound that shifts around, the transitions just aren't going to sound smooth.
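To see why the upper range is so sparse, here is a rough sketch of the divider arithmetic. The clock value and the extra divide-by-2 for the pure square-wave tone are assumptions based on commonly cited NTSC TIA figures, not something measured for this post, and the function names are made up for illustration.

```python
import math

# Assumed NTSC figures (not from the post): the TIA audio clock is roughly
# the 3.579545 MHz colour clock divided by 114, and the pure square-wave
# tone divides that by a further 2.
TIA_AUDIO_CLOCK_HZ = 3_579_545 / 114
PURE_TONE_DIVIDER = 2


def tia_pure_tone_hz(audf):
    """Approximate pitch in Hz for a 5-bit frequency value (0..31)."""
    if not 0 <= audf <= 31:
        raise ValueError("frequency value must be in 0..31")
    return TIA_AUDIO_CLOCK_HZ / PURE_TONE_DIVIDER / (audf + 1)


def step_in_semitones(audf):
    """Size of the pitch jump between audf and audf + 1, in semitones."""
    ratio = tia_pure_tone_hz(audf) / tia_pure_tone_hz(audf + 1)
    return 12 * math.log2(ratio)
```

Under these assumptions, stepping the frequency value from 0 to 1 drops the pitch a full octave, while stepping from 30 to 31 moves it only about half a semitone. That is the lumpy top end described above.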
One of the biggest hurdles to sound-coding on the 2600 surprisingly comes from wetware limitations, not hardware inadequacies. Our brains treat a sound more or less as a whole unit, rather than a collection of individual frequencies. Even when playing back a sound over and over, most people will have difficulty naming more than basic changes in the fundamental frequency, and even then they can do so only over a relatively large timescale.
In this sense, it's more intuitive for us to race the beam than it is to ride the speaker.
One solution to this problem is playing back a sound sample. That way the coder remains blissfully unaware of the frequency content, but still can reproduce the sound.
The problem with sample playback is that it needs to happen frequently and continuously. On the 2600 that translates to a complicated kernel design (or one that blanks the screen while playing), and the sample data eats up valuable ROM space.
I set out to investigate an alternative approach to sound reproduction by writing an FFT to TIA sound converter. The converter would generate lower-rate samples that would contain TIA frequency information. The plan was for this to be a kind of middle-ground between simple sound generation and expensive sample playback.
FFT stands for Fast Fourier Transform. Without going over a bunch of theory, it's a way of picking out what frequencies are present in a length of data, and what strengths they're present at.
So the converter opens the WAV file and performs a sliding window FFT on it - in other words, it figures out which frequencies are present in each 1/60th of a second chunk.
Then it looks at which frequency is loudest in each chunk, and finds the nearest TIA frequency match.
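The two steps above can be sketched roughly like this. It is not the converter's actual code: it uses a slow naive DFT instead of a real FFT so it needs no libraries, it steps through whole chunks with no overlap, and the TIA pitch formula (an NTSC audio clock of about 31.4 kHz, divided by 2 for the square-wave tone) is an assumption. All function names are hypothetical.

```python
import cmath
import math

# Assumed NTSC pure-tone formula; not taken from the converter itself.
TIA_PURE_TONE_BASE_HZ = 3_579_545 / 114 / 2


def tia_hz(audf):
    """Approximate pitch for a 5-bit TIA frequency value (0..31)."""
    return TIA_PURE_TONE_BASE_HZ / (audf + 1)


def dft_magnitudes(chunk):
    """Magnitudes for bins 0..N/2 (naive DFT standing in for an FFT)."""
    n = len(chunk)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(chunk))
        mags.append(abs(acc))
    return mags


def loudest_frequency(chunk, sample_rate):
    """Frequency in Hz of the strongest bin, skipping the DC bin."""
    mags = dft_magnitudes(chunk)
    k = max(range(1, len(mags)), key=lambda i: mags[i])
    return k * sample_rate / len(chunk)


def nearest_audf(freq_hz):
    """TIA frequency value whose pitch is closest to freq_hz."""
    return min(range(32), key=lambda audf: abs(tia_hz(audf) - freq_hz))


def convert(samples, sample_rate, frames_per_second=60):
    """One TIA frequency value per 1/60th-of-a-second chunk."""
    chunk_len = sample_rate // frames_per_second
    result = []
    for start in range(0, len(samples) - chunk_len + 1, chunk_len):
        chunk = samples[start:start + chunk_len]
        result.append(nearest_audf(loudest_frequency(chunk, sample_rate)))
    return result
```

One thing the sketch makes visible: with 1/60th of a second per chunk, the bins land 60 Hz apart, so low pitches can only be resolved very coarsely.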
There are a few things that are wrong with this approach overall.
First, it completely ignores the fact that a sound is made up of more than one frequency. This makes the implementation easier, but limits the kind of sounds we can accurately represent.
Second, it uses square wave TIA frequencies to represent the sine waves in the FFT, which potentially changes the character of the sound a bit.
Third, it completely ignores the phase of the frequencies. Phase differences between related frequencies tend to provide a lot of texture to sounds.
But since my design goal wasn't to perfectly reproduce the sound, I figured it would be worth trying to see what kind of results I would get.
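To put a number on the second caveat, this small sketch measures the harmonics of an ideal square wave. A sine has only its fundamental, while the square wave also carries the odd harmonics at roughly 1/3, 1/5, and so on of the fundamental's strength. Real TIA output isn't an ideal square wave, so treat this as an approximation; the function name is made up for illustration.

```python
import cmath
import math


def harmonic_amplitudes(period_samples, max_harmonic):
    """Amplitudes of harmonics 1..max_harmonic for one period of a +1/-1 square wave."""
    n = period_samples
    half = n // 2
    wave = [1.0 if t < half else -1.0 for t in range(n)]
    amps = []
    for k in range(1, max_harmonic + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(wave))
        amps.append(2 * abs(acc) / n)  # scale so a unit sine would read 1.0
    return amps
```

Calling harmonic_amplitudes(64, 5) gives values close to 4/pi, 0, 4/(3*pi), 0, 4/(5*pi): the even harmonics vanish and the odd ones fall off as 1/k.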
I've attached a zipfile below that contains 2600 binaries, batari Basic files, and the original WAV files. The 2600 binaries play the sounds on startup, and then play again if you press the fire button on the joystick.
The pacman-eats-a-ghost sample is pretty close to what I was hoping for as a result. Comparing the TIA sound playback in Stella to the original WAV shows that the converter nicely simplified the sound. It's not the same sound, but it's close in character.
Pacman-dies is interesting because it highlights TIA's weakness in the upper range.
The original sound sample starts off at a higher pitch, and our simplified tonal sample doesn't do a great job of representing it until the sound gets a fair bit lower, near the end of the sample.
Haha is a short sample of a laugh. The original sample had fundamental frequencies below the range of my FFT routine, so it picked up on upper harmonics instead. Because of that the result isn't accurate, but it has a neat texture that would make a good sound effect for an underwater animal or weird space alien. [edit - a low pass filter, as suggested by batari, allowed the routine to find the lower fundamental!]
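For anyone wanting to try the low pass trick on their own samples, here is a minimal sketch of one way to do it before the frequency analysis. The filter type is a stand-in: this is a simple one-pole design, not necessarily the one used for the laugh, and the function name and cutoff_hz parameter are made up for illustration.

```python
import math


def low_pass(samples, sample_rate, cutoff_hz):
    """One-pole low-pass: each output moves a fraction of the way toward the input."""
    dt = 1.0 / sample_rate
    rc = 1.0 / (2 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)  # smoothing factor between 0 and 1
    out = []
    y = 0.0
    for x in samples:
        y += alpha * (x - y)
        out.append(y)
    return out
```

Running the sample through something like low_pass(samples, sample_rate, 300) turns down the upper harmonics, which gives a loudest-frequency picker a better chance of landing on the fundamental.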
Dkintro is a sample of the Donkey Kong level intro tune. I didn't plan to use the converter as a tool for whole songs, but I think the result here shows that a specialized FFT-to-TIA-song-data converter might be feasible.
With a fair bit of extra code the converter should be able to compare FFT results with FFT data representing TIA's various possible outputs, instead of trying to match fundamental frequencies. This would allow the converter to better represent more complex sounds and noise.
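The comparison idea might look something like this sketch. None of it exists in the converter yet: the templates table (one magnitude spectrum per TIA waveform and frequency setting) is hypothetical and would have to be measured or generated first, and cosine similarity is just one possible way to score a match.

```python
import math


def cosine_similarity(a, b):
    """Similarity of two magnitude spectra, ignoring overall loudness."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_tia_setting(target_spectrum, templates):
    """Return the key of the template spectrum that best matches the target.

    templates is a hypothetical dict mapping a (waveform, frequency) setting
    to the magnitude spectrum that setting produces.
    """
    return max(templates,
               key=lambda s: cosine_similarity(target_spectrum, templates[s]))
```

Because the score ignores overall loudness, volume would still need to be picked separately for each chunk.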
It's a work in progress at this point, but if nothing else comes of it, hopefully I've inspired other 2600 coders to consider the texture of the sounds in their own creations.
[edit - low pass filtered laugh result added]