# How to play 1-bit samples

Topic: Development

At Robsy's MSX Workshop you can download a tiny package that includes the asMSX source code to replay 22,050 or 11,025 Hz WAV files on MSX using the keyboard click function. A C program that converts 8-bit mono WAV files to the 1-bit format used by the replayer is also included, along with an example ROM.

This technique has been used by many well-known Japanese games, such as Runemaster II and also in Dutch games, such as Oh Shit!

Oh! You have used the s**t work! I meant "word". Sorry

actually, you did ^_^

Did I? Oh, f**k then! I'm sooooo sorry

Why would anyone want to play 1-bit samples on MSX? There is one exception, which I explained here: www.msx.org/forumtopic3562p12.html
...but this method does not need any conversion.

If you want to play samples on MSX, use PSG. This way you get 4bit or even 5.5bit. (Explained here: www.msx.org/forumtopic663p7.html)

Well, I agree that 4-bit PCM is far better in quality BUT it also multiplies the sample size by a factor of 4. That is a problem for tiny MSX1 productions, I am afraid. Anyway, today I will do some tests with 4-bit PCM using the PSG and sample compression.

Reeeaaalllyy cool! I'm going to do my own "Here's johnny, franky, willy and joey!" heheh, or "I repeat, this is our -last- attack".

it's
"this is joey, paul, willie and frankie" *rolleyes*

sorry, oh-shit purist here ;P

PLEASE USE THE PSG AND DATA COMPRESSION!!!
You can store data using a very simple compressor (DPCM at 1 or 2 bits) and play at 4bits!!! The quality is very good and the algorithm easy and well known

PS
Having a prediction filter of 3 taps means 3 multiplications and two additions per sample. Using simplified (non-adaptive) taps you can replace the multiplications with shifts.
With a prediction filter of 3-4 taps you can easily reach a sampling rate of 8 kHz.
Storing data at 1bit you get 4bit quality!!!
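To make the idea concrete, here is a minimal 2-bit DPCM sketch in Python. It is not the poster's code: it uses the previous output as the only predictor (one tap instead of the 3 taps mentioned above), and the step table `STEPS`, the start level and the function names are all assumptions chosen for illustration.

```python
# Hypothetical step table: one signed step per 2-bit code (an assumption).
STEPS = (-3, -1, 1, 3)


def clamp(level):
    """Keep a value inside the 4-bit PSG volume range 0..15."""
    return max(0, min(15, level))


def dpcm_encode(samples, start=8):
    """Encode 4-bit samples as 2-bit codes using a previous-sample predictor."""
    codes, pred = [], start
    for s in samples:
        # Pick the code whose reconstructed level lands closest to the sample.
        code = min(range(4), key=lambda c: abs(s - clamp(pred + STEPS[c])))
        pred = clamp(pred + STEPS[code])
        codes.append(code)
    return codes


def dpcm_decode(codes, start=8):
    """Rebuild 4-bit levels from the 2-bit codes (this is what a player would do)."""
    out, pred = [], start
    for c in codes:
        pred = clamp(pred + STEPS[c])
        out.append(pred)
    return out
```

The decoder only needs one table lookup, one addition and a clamp per sample, which is why this kind of scheme is attractive on a Z80. A real encoder would probably tune the step table to the material.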

Yes, I've been reading the MAP info and now I have a 4-bit PCM replayer too. Again, the problem is the size. The quality, of course, does improve. But for speech there is little difference between 1-bit and 4-bit. For sounds, especially if they have echo, 4-bit is far better. I have converted all the Windows 11 kHz sounds to MSX and, in my opinion, they sound better on the MSX

cool, mp3? hehe.

Nice!

Very nice!!

@PITPAN
The size is not an issue! If you use DPCM you can store 2 bits per sample and play at 4 bits per sample with almost the same quality as without compression. DPCM needs very few operations and is very simple to implement. If you want, I could send you some example code so you can evaluate an upgrade of your readwav.

MP3s? A sick idea for a new app using ethernet: use a PC server to decode & downsample MP3s, stream them to the MSX and play them back via PSG (or even other audio devices)

@Graw:
There’s also an article about playing samples on the PSG on the MSX Assembly Page.

4 bits is "impressive": http://webs.ono.com/WYZ/4BIT.bin :)

Impressive

@PITPAN
First of all, before thinking about DPCM: in order to properly convert from 8-12-16bits to 4bits, you need to choose which bits are the best to be preserved (or more generally, you need to choose a scale factor which minimises the signal distortion during the conversion to 4bit).
The most significant bits are the best ones to preserve only if the input signal uses the full dynamic range.
In general your conversion program (8-12-16bits to 4bits) should scale the input by a coefficient, quantise to 4bits, scale the result back, and compute the mean square error (MSE) between the two (input and quantised) signals. Change the coefficient and try again until you find the minimum MSE, which gives the best 4bit conversion. This is the heavy way to compute the best 4bit conversion.
If you restrict the scaling to shifts of one bit at a time, the “easy and go” way is to follow this algorithm:

Let X be the current sample
Let S = 0 be a 16 bit accumulator
Define a mask M of for zeros like this …11100001111…

next sample:
Do Z = M and X
Do S=S+Z*Z
Go to the next sample until the end of the input file

Now you have the first MSE in S. Shift M by one position and redo everything from the beginning.
You get a new MSE, and so on.
At the end you have an MSE for each mask position. If the input has 16 bits you get 16-4+1 = 13 MSE values, corresponding to the 13 mask positions.
The minimum MSE corresponds to the mask that marks the best 4 bits to preserve.
Pick them: they are the best 4bit conversion of your file.
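The mask search described above can be sketched in Python as follows. This is only an illustration: it assumes unsigned 8-bit input samples by default, uses an unbounded Python integer instead of a 16-bit accumulator, and the name `best_window` is made up for this example.

```python
def best_window(samples, width=8, keep=4):
    """Try every position of a keep-bit window inside a width-bit sample.

    Returns (shift, error), where shift is the position of the lowest kept
    bit and error is the sum of the squared discarded parts (Z*Z in the post).
    """
    full = (1 << width) - 1
    best = None
    for shift in range(width - keep + 1):
        window = ((1 << keep) - 1) << shift   # the bits we keep
        mask = full & ~window                 # M: zeros over the kept bits
        err = sum((x & mask) ** 2 for x in samples)
        if best is None or err < best[1]:
            best = (shift, err)
    return best
```

For a quiet recording whose samples never exceed 15, the search returns shift 0, meaning the four lowest bits carry all the information. For a file that only uses the upper nibble it returns shift 4. Note that bits above the window are counted as error here, but on playback dropping them causes wrap-around rather than plain clipping.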

ERRATA CORRIGE
Define a mask M of for zeros like this …11100001111…
Must become
Define a mask M of 4 zeros like this …11100001111…

>in order to properly convert from 8-12-16bits to 4bits, you need to choose which bits are the best to be preserved (or more generally, you need to choose a scale factor which minimises the signal distortion during the conversion to 4bit).
>Only if the input signal has a full dynamic range are the most significant bits the best to be preserved.

As the PSG outputs linear 4bit sample data, I can see no reason to select anything other than the most significant bits of the 8, 12 or 16 bit sample data, as you can't replay anything more accurate than 4 bits can reproduce anyway.

So... to compress, I think you should take the 4bit data and try to compress it to 1-3 bits. I have not studied whether it could work on 4bit samples, but there is a special subset of DPCM called ADPCM that is used, for example, in MSX-Audio samples. Maybe these sources could be useful: www.mccm.aktu.nl/millennium/milc/asm/topic_3.htm

This is an untested idea written in a non-existent language that could MAYBE work for packing a 4bit sample into 2 bits. ... I have not had enough time to think about this...

```
CASE TWOBITS=&B00 THEN D=D*2:IF ABS(D)<1 THEN D=1
CASE TWOBITS=&B01 THEN D=D/2:IF ABS(D)<1 THEN D=0
CASE TWOBITS=&B10 THEN D=-SGN(D):IF D=0 THEN D=-1
CASE TWOBITS=&B11 THEN D=D
OUT=OUT+D
IF OUT>15 THEN OUT=15:D=0
IF OUT<0 THEN OUT=0:D=0
SOUND 8,OUT
```

Why select bits other than the most significant ones?
In general one or two of the top bits could be zero all the time, or almost all the time.
It all depends on the level of the signal when it was recorded and converted into digital samples.
Quantization has two effects:
when you cut the highest levels of the signal you get "clipping";
when you cut the low levels of the signal you get "granular noise".
If the msbits are zero always or very often, they could be less “significant” than you think.
You could accept a certain amount of clipping in order to get a finer quantization of the low-level bits.

About the PSG output, I am not so sure it is linear; the datasheet of the AY-3-8913 says that the DAC is logarithmic!! In this case my strategy for computing optimal quantization is not incorrect.
Actually, selecting 4 adjacent bits is also incorrect! The correct strategy is to find the closest reproduction level to the current sample!!!

ERRATA CORRIGE
this case my strategy for computing optimal quantization is not incorrect
must be
this case my strategy for computing optimal quantization is incorrect

Basically it works well but you need a sampling frequency 2-3 times higher than the Nyquist rate.

Ah, now I know, what you mean! ... but you can't do it like this...

It is true that it is a good idea to adjust the volume of the sample to maximum before converting it, but if you simply ignore the MSBs you will not get clipping: the sample will overflow and make the speaker jump to the opposite side. That is something we don't want to happen...

You should use a good multiplication factor (for example 1.5) and, if the sample overflows, simply set it to maximum or minimum (depending on which side the sample overflowed).

>Basically it works well but you need a sampling frequency 2-3 times higher than the Nyquist rate.

What is the Nyquist rate? Do you mean that I need 4-6 bits to store a 4bit sample? That doesn't sound like very good packing...

>You should use a good multiplication factor (for example 1.5) and, if the sample overflows, simply set it to maximum or minimum (depending on which side the sample overflowed).

Correct! This is what I meant when I said:

[…] you need to choose a scale factor which minimises the signal distortion during the conversion to 4bit [...].
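The difference between clamping an overflowing sample and simply dropping its top bits is easy to see in a few lines of Python. This sketch assumes unsigned 8-bit samples centred on 128 and keeps the top 4 bits after scaling; the gain of 1.5 is the example value from the quote, and both function names are hypothetical.

```python
def to_4bit_saturating(sample, gain=1.5):
    """Scale around the midpoint, clamp to 0..255, then keep the top 4 bits."""
    scaled = int(round((sample - 128) * gain)) + 128
    clamped = max(0, min(255, scaled))
    return clamped >> 4


def to_4bit_wrapping(sample, gain=1.5):
    """Same scaling, but overflow bits are just thrown away (the bad case)."""
    scaled = int(round((sample - 128) * gain)) + 128
    return (scaled & 0xFF) >> 4
```

A loud sample of 250 becomes 15 with saturation but 3 with wrap-around, which is the jump to the opposite side described earlier in the thread. Samples that do not overflow give the same result in both versions.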

The Nyquist rate is the lowest sampling rate that permits accurate reconstruction of a sampled analog signal. It is equal to twice the maximum frequency present in the analog signal.
If you digitize human voice at phone quality the maximum frequency is (on average) 4 kHz, thus you can sample at 8 kHz. If you digitize music or songs, the maximum frequency can easily reach 11 kHz, thus the sampling rate must be 22 kHz.
When you digitize at a rate lower than the Nyquist rate you get "aliasing", which consists, roughly speaking, of a distortion of the high frequencies. If you are forced to adopt a given sampling rate (due to your HW limitations), the correct procedure is to digitize at the proper (higher) sampling frequency, filter the signal to remove the high frequencies that would be affected by aliasing, and then downsample the filtered signal to the rate your hardware requires.
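As a rough illustration of "filter, then downsample", the Python sketch below halves the sampling rate by averaging each pair of neighbouring samples. A two-sample average is only a very crude low-pass filter, chosen here for brevity; a real converter would use a proper filter. The function names are made up for this example.

```python
def nyquist_rate(max_freq_hz):
    """Lowest sampling rate that still captures max_freq_hz."""
    return 2 * max_freq_hz


def downsample_by_2(samples):
    """Average each pair of samples (crude low-pass), keeping one value per pair."""
    return [(samples[i] + samples[i + 1]) // 2
            for i in range(0, len(samples) - 1, 2)]
```

For the input `[0, 200, 0, 200]`, a tone at the highest frequency the original rate can hold, simply taking every second sample would give `[0, 0]`, while the filtered version gives `[100, 100]`, the true average level.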

About your Adaptive Delta Modulation: if the sampling rate isn’t higher than the Nyquist rate you get an effect known as “slope distortion” (often called slope overload). The player cannot follow the slope of the signal because the step can only change slowly. If the sampling rate is higher than the minimum you get a better slope reconstruction.

This thread should become a forum topic, do you agree?

PS
IF OUT>15 THEN OUT=15:D=0
should become
IF OUT>15 THEN OUT=15:D=15

PPS
What about the DAC response in the PSG? It is very NON linear!!!
Does anyone have an analytic expression for the analog levels produced corresponding to the 16 values of the volume?
This is the first step to have a correct 4bit quantization ! !
Choosing 4 adjacent bits from the input is totally WRONG!!!

As far as I can understand, the analog levels of the PSG are given by

y = 1.41^n / 1.41^15

where n=0:15 is the amplitude value in the register.

Try this MATLAB line and compare the result with the PSG datasheet

n=0:15;y=1.41.^n;plot(n,y/max(y),'o');grid

Last update
The basis in the DAC levels is sqrt(2)

thus you have

y = sqrt(2)^n

with n = 0:15

>What about the DAC response in the PSG? It is very NON linear!!!

Yeah, actually you are right, I think I was mixing it up with the SCC.

`IF OUT>15 THEN OUT=15:D=15`

No, D is the delta, so this would cause OUT to be increased by 15! Actually it is maybe better not to touch D at all and just do the clipping. This way the packing routine can "take some speed".

>CASE TWOBITS=&B11 THEN D=D

Here we can also have a special case, for example:
CASE TWOBITS=&B11 AND D=0 THEN D=2
When D=0, the routine can still use the &B01 bit combination to keep D at 0...

>y = sqrt(2)^n

Hmm... If it is like this, then nf was right (greez to Bandwagon). It is possible to play 8bit (or maybe even 10bit) samples on PSG! It seems that I really must rip their sample routine someday. :9

>Hmm... If it is like this, then nf was right (greez to Bandwagon) It is possible to play 8bit (or maybe even 10bit) samples on PSG! It seems, that I really must rip their sample routine someday. :9

I confirm that the response of the DAC is like:

y = sqrt(2)^n / sqrt(2)^15

but this means that you get a 4bit quantizer with a larger dynamic range than you would expect if it were linear.

Actually the dynamic range of the DAC is

20* log10 (MaxVal/MinVal)

Thus our 4bit DAC has

20*log10(sqrt(2)^15/sqrt(2)^1) = 42.1442 dB

that corresponds to the dynamic range of a 7bit linear quantizer

20*log10(2^7/2^0) = 42.1442 dB
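The two figures above can be checked with a couple of lines of Python. This is just a numerical check of the formulas as written in the thread; the helper name `dynamic_range_db` is made up.

```python
import math


def dynamic_range_db(max_val, min_val):
    """Dynamic range in dB for a given ratio of largest to smallest level."""
    return 20 * math.log10(max_val / min_val)


# PSG levels 1..15 with a step of sqrt(2) per level, as in the thread.
psg_db = dynamic_range_db(math.sqrt(2) ** 15, math.sqrt(2) ** 1)
# A 7-bit linear quantizer, as in the thread.
linear7_db = dynamic_range_db(2 ** 7, 2 ** 0)
```

Both values come out at about 42.14 dB, because sqrt(2)^14 equals 2^7.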

Now the trick is to find a nice quantizer for converting linear 8-12bit samples into the non-linear 4bit samples.

As you can see, scaling the input to the right level before quantization becomes a real and important issue now.
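One straightforward quantizer is the "closest reproduction level" approach mentioned earlier in the thread: build the table of 16 output levels and pick the nearest one for every input sample. The Python sketch below assumes unsigned 8-bit input, takes the level formula y = sqrt(2)^n / sqrt(2)^15 at face value, and measures distance in linear amplitude; `LEVELS` and `quantize` are names invented for this example.

```python
import math

# Normalised PSG output levels according to the formula in the thread.
LEVELS = [math.sqrt(2) ** (n - 15) for n in range(16)]


def quantize(sample, full_scale=255):
    """Return the PSG volume (0..15) whose level is closest to the sample."""
    x = sample / full_scale
    return min(range(16), key=lambda n: abs(LEVELS[n] - x))
```

With this table a half-scale sample of 128 maps to volume 13 rather than 8, and a quarter-scale sample of 64 maps to 11, which shows how much of the 4-bit range is spent on quiet levels. On a real MSX the 256-entry result would normally be precomputed into a lookup table by the converter.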

This news is scrolling away from us... Let's continue here:
http://www.msx.org/forumtopicl663.html