OK, I'll bite
You really are conflating a bunch of different things and painting a worse picture than reality (not that vanilla ST sample playback is stellar...).
4 bits of amplitude is certainly coarser than 8 bits, but given good samples and a decent sample rate, it can still sound fairly good. For instance, although CDs are indeed composed of 16-bit samples (well, almost always, but that's a topic for another day), those samples are often played through a 1-bit DAC. Yes, that DAC runs at a much higher sample rate than our lowly ST sound chip, but by the same argument you present, 4 bits is WAY more than 1, so the ST is obviously better than CD quality?
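To make the 1-bit point a bit more concrete, here's a rough Python sketch of a first-order delta-sigma modulator, which is the general idea behind 1-bit DACs. It's only an illustration under my own assumptions, not how any particular CD player is built; the function name `delta_sigma_1bit` and the input value are made up for the example.

```python
def delta_sigma_1bit(value, n_steps):
    """Encode a constant value in [-1, 1] as a stream of +1/-1 outputs.

    Hypothetical helper: a first-order delta-sigma loop that tracks the
    running error between the input and the 1-bit output.
    """
    integrator = 0.0
    bits = []
    for _ in range(n_steps):
        # The 1-bit quantizer only knows "high" or "low".
        out = 1.0 if integrator >= 0 else -1.0
        bits.append(out)
        # Accumulate the difference between what we wanted and what we sent.
        integrator += value - out
    return bits


bits = delta_sigma_1bit(0.3, 1000)
average = sum(bits) / len(bits)  # lands very close to 0.3
```

Every output is just +1 or -1, yet the average over many fast steps tracks the input closely. That's the trade: fewer bits, far more samples per second.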
That said, the ST isn't massively oversampling on playback, so the limited quantization levels certainly can come into play, again depending on the original samples. Much like mixing colors or multiplexing sprites, toggling the amplitude between levels 'fast enough' can create the illusion of additional quantization levels. Also, keep in mind that we really perceive loudness as power averaged over time, so there's a time factor involved as well.
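Here's a small Python sketch of the toggling idea, just to show the arithmetic. It assumes an idealized output where only the average matters; the function name and parameters are hypothetical, and nothing here models the real chip's timing or its volume curve.

```python
def toggled_average(low_level, high_level, high_num, denom, steps):
    """Average level when the output spends high_num/denom of the time on
    high_level and the rest on low_level.

    Hypothetical helper: uses an integer accumulator so the toggling
    pattern is spread evenly over time.
    """
    acc = 0
    total = 0
    for _ in range(steps):
        acc += high_num
        if acc >= denom:
            acc -= denom
            total += high_level   # emit the higher of the two levels
        else:
            total += low_level    # emit the lower of the two levels
    return total / steps


# Toggling between 4-bit levels 7 and 8:
half = toggled_average(7, 8, 1, 2, 1000)     # 7.5
quarter = toggled_average(7, 8, 1, 4, 1000)  # 7.25
```

Neither 7.5 nor 7.25 exists as a 4-bit level, but if the toggling is fast compared to what the ear (and the analog output stage) can follow, that in-between average is roughly what you hear.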
Regarding filtering, that's really going to come down to how the original samples were crafted and at what rate they are played back. So long as the original samples don't have any frequency content above 1/2 the playback rate, you won't have any aliasing, and you won't have to rely on filters to address it.
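For the half-the-playback-rate rule, here's a quick Python sketch of where a pure tone ends up after sampling. The 8000 Hz rate is just a number I picked for the example, not a claim about any particular ST replay routine, and `alias_frequency` is a hypothetical helper.

```python
def alias_frequency(tone_hz, playback_rate_hz):
    """Frequency at which a pure tone appears after sampling at the given rate.

    Hypothetical helper: tones at or below half the rate come through
    unchanged; anything above folds back into the 0..rate/2 band.
    """
    folded = tone_hz % playback_rate_hz
    if folded > playback_rate_hz / 2:
        folded = playback_rate_hz - folded
    return folded


rate = 8000                               # assumed playback rate
ok = alias_frequency(3000, rate)          # 3000: below rate/2, unchanged
folded = alias_frequency(5000, rate)      # 3000: above rate/2, folds back
wrapped = alias_frequency(9000, rate)     # 1000: above the rate itself
```

So a 5000 Hz component in the source would show up as a bogus 3000 Hz tone at this rate. If the samples are prepared so nothing sits above 4000 Hz in the first place, there's nothing to fold back.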
None of that is to say that the vanilla ST is some kind of sample-driven powerhouse. It certainly is not. But given the hardware limitations, the end result depends greatly on the source material and techniques, and only somewhat on the actual sound chip (and in truth, this is because of the limitations of the sound chip).