Crystal clean PCM 8bit samples on the poor PSG

Page 1/7
| 2 | 3 | 4 | 5 | 6

By ARTRAG

Enlighted (4772)

ARTRAG's picture

29-12-2005, 13:09

Ok this is time to share a nice routine for playing 8bit PCM samples on the PSG without the ugly spikes that are generated by the previous conventional techniques that play sample by sample each PCM sample using a 3 channels table.

Go to http://map.tni.nl/articles/psg_sample.php for details on the “standard” solution.

In a word, the standard solution associates each 8bit sample to a triplet of channel levels suitably chosen in order to minimize the error of the PSG overall output level with respect to the given sample. As the sum of the 3 channels leads to 608 disjoint values, you would expect a the output quality of a 9bit quantizer, but, unfortunately, the resulting levels are oddly spaced, so the theorical quality corresponds in dB to about 48dB of SNR, which is turn is equivalent to a 8bit uniform quantizer.

Problem:
The “standard” solution has a very annoying problem: the z80 cannot change the 3 PSG channels at the same time, this implies that, during the transition between two successive samples, the PSG passes trough undefined states where some channels are from the previous triplet and some channels are from the triplet corresponding to the next sample.

The result is that, at each transition between two successive samples, the PSG can produce 1 or 2 spikes of short duration. This effect, repeated at each sample transition, becomes a very annoying wideband noise that sensibly reduce the overall quality of the output, far below the optimal 48dB of SNR.

Solution:
The sole solution is to control the PSG trough every transition.

As the z80 can change a single channel at time and each channel has 16 values, the PSG can be modelled at any time as a state machine with 256 states, 16 inputs, and 608 outputs.
The 256 states are the 16*16 values of the old channels (those that don’t change at the current time), the 16 inputs are the allowed values of the current channel, the 608 outputs are the sum of the volume values from the “state” and of the volume value from the current input.
Each sample transition corresponds to three steps of the state machine.
Assume ABC the current values playing and A’B’C’ the new values.
The transitions are:

ABC -> ABC’
ABC’-> AB’C’
AB’C’-> A’B’C’

So, between the level ABC corresponding to the previous sample and the level A’B’C’ corresponding to the next sample, we get :
ABC
ABC’
AB’C’
A’B’C’

Assume x be the sequence of input samples, we want to minimize the mean square error between the x and the output sequence of levels in any single transition, this implies we need to develop a search algorithm that explores all the sequences of state transitions of the PSG corresponding to the allowed outputs, and finds the best sequence of channel levels that minimize the mean square error.

The algorithm is already existing, and is the classical Viterbi algorithm, I do not run into details here, but in case I will go back on this classical topic.

The output of such an algorithm is the sequence of levels for each channel that encodes the input sequence x.

Now the new problem: each triplet is 4*3= 12 bit : a huge waste of space!
Solution run length encoding.
The best solution should be a RLE based on nibbles, as some "detail channels" can have very long bursts of non equal symbols. Due to lack of time and of energy I did as you can see in the following.
I know that the implementation could be improved, but I cannot fit any a better solution in 447 t cycles (8KHz of PCM player).
Enjoy

Login or register to post comments

By ARTRAG

Enlighted (4772)

ARTRAG's picture

29-12-2005, 13:10


Fs          equ 8192  ;     alternative 8000 or 8192

;------------------
     MACRO  PsgW reg,value
            ld a,reg    ; select AY channel A volume register ; 7+1
            out (#A0),a                                       ; 11+1
            ld  a,value                                       ; 7+1
            out (#A1),a ; and write back to the AY            ; 11+1
            ENDM                                              ; total 36+4 = 40T
;------------------

        MACRO WaitXX
            if (Fs==8000)
                .3 nop          ; 3*5 = 15
            else
            if  (Fs==8192)
                nop             ; 5
            else
                assert 0
            endif
            endif
            ENDM            ; total 15 or 5
            
        MACRO Wait68 
            .2 ex (sp),hl   ; 2*20 =40
            .4 nop          ; 4*5 = 20
            cp 0            ; 8    
            ENDM            ; total 68

;------------------

        MACRO next  reg
            ld  a,(hl)      ; 8
            .4 rrca         ; 4*5
            and 15          ; 8
            ld  reg,a       ; 5
            ld  a,(hl)      ; 8
            and 15          ; 8
            exx             ; 5
            ld  reg,a       ; 5
            exx             ; 5
            inc hl          ; 7
            ENDM            ; tot = 79

;------------------

        OUTPUT wavplay.com
       
        ORG 100h

START:
        di
        PsgW 0,255
        PsgW 1,255
        PsgW 2,255
        PsgW 3,255
        PsgW 4,255
        PsgW 5,255
        PsgW 6,255
        PsgW 7,10111111B


;-------------------------------------
; IN   HL - Encoded sample start address
;      HL'- Sample length (#pcm saples)
;-------------------------------------

        ld  hl, SAMPLE+2
        ld  bc,#0101
        ld  de,bc
        exx
        ld  hl, (SAMPLE)
        ld  c,#A1           
        

.LOOP:      
        exx             ; 5
        dec b           ; 5
        ld  a,15         ; 8
        and b            ; 5
        jp nz,waitA     ; 11 
        next  b         ; 79
1       dec d           ; 5
        ld  a,15         ; 8
        and d            ; 5
        jp nz,waitB     ; 11 
        next  d         ; 79
2       dec e           ; 5
        ld  a,15         ; 8
        and e            ; 5
        jp nz,waitC     ; 11 
        next  e         ; 79
3       exx             ; 5
        
        ld  a,8         ; 8
        out (#a0),a     ; 12
        inc a           ; 5
        out (c),b       ; 14
        out (#a0),a     ; 12
        inc a           ; 5
        out (c),d       ; 14
        out (#a0),a     ; 12
        out (c),e       ; 14
            
        WaitXX          ; 15 or 5
        
        dec hl          ; 7
        ld  a,h         ; 5
        or  l           ; 5
        jp  nz,.LOOP    ; 11
                        ; tot  = 447T or 437T
        ret

waitA   Wait68          ; 68
        jp   1B         ; 11
                        ; tot = 79

waitB   Wait68          ; 68
        jp   2B         ; 11
                        ; tot = 79
                        
waitC   Wait68          ; 68
        jp   3B         ; 11
                        ; tot = 79


;       b7b6b5b4|b3b2b1b0
;       NumTick |LevVolum


SAMPLE:
        include "C:\MATLAB6p5\work\out.txt"

FINISH:

By ARTRAG

Enlighted (4772)

ARTRAG's picture

29-12-2005, 13:14

To whom is interested in going deeper in the way the Viterbi optimization is performed

[y,Fs,nbit,opt]=wavread('bluetooth_input.wav');

%load handel

y = (y-min(y)) /(max(y)-min(y)) * 1.15;                % input tra 0 e 1.15

f = 3579545;
fpcm = Fs;

tp = f/fpcm; % 447 = 8000Hz, 437 = 8192Hz
dt1 = (12+5+14)/tp;
dt2 = (12+14)/tp;
dt3 = 1-dt1-dt2;

N = 3*length(y);

x = zeros(1,N);
x(1:3:end) = y;
x(2:3:end-3) = y(1:end-1)*(1-dt1)+y(2:end)*dt1;
x(3:3:end-3) = y(1:end-1)*(1-dt1-dt2)+y(2:end)*(dt1+dt2);
x(end-2:end)= y(end);

n=[0:15]';
vol=2.^(n/2)/2^7.5;
vol(1)=0;

nxtS = uint8(zeros(256,16));
for i=0:15
    for j=0:15
        for in=0:15
            nxtS(i*16+j+1,in+1) = uint8(j*16 + in);
        end
    end
end

curV = zeros(256,16);
for i=0:15
    for j=0:15
        for in=0:15
            curV(i*16+j+1,in+1) = vol(i+1)+vol(j+1)+vol(in+1);
        end
    end
end    

Stt = uint8(zeros(256,N));
Itt = uint8(zeros(256,N));

L  = zeros(256,1);
St = uint8(ones(256,1));
It = uint8(ones(256,1));

for t =1:N
    Ln = inf*ones(256,1);
    if mod(t-1,1000)==0
        proc = fix(t/N *100)
    end
    for cs=0:255
        for in=0:15
            cv = curV(cs+1,in+1);
            ns = double(nxtS(cs+1,in+1));

            switch mod(t-1,3)
                case 0
                    Ltst = L(cs+1)+dt1*abs(x(t)-cv)^2;
                case 1
                    Ltst = L(cs+1)+dt2*abs(x(t)-cv)^2;
                case 2
                    Ltst = L(cs+1)+dt3*abs(x(t)-cv)^2;
            end

            if  Ln(ns+1) >= Ltst
                Ln(ns+1) = Ltst;
                St(ns+1) = cs;
                It(ns+1) = in;
            end
        end
    end
    L = Ln;
    Stt(:,t) = St;
    Itt(:,t) = It;
end

[l,i] = min(L);

P = uint8(zeros(1,N));
I = uint8(zeros(1,N));

P(N) = Stt(i,N);
I(N) = Itt(i,N);
for t = (N-1):-1:1
   P(t) = Stt(double(P(t+1))+1,t);
   I(t) = Itt(double(P(t+1))+1,t);
end

V = zeros(1,N);
for t = 1:N
   V(t) = curV(double(P(t))+1,double(I(t))+1); 
end

er = sum(abs(x(1:3:end)-V(1:3:end)).^2)*dt1+...
     sum(abs(x(2:3:end)-V(2:3:end)).^2)*dt2+...
     sum(abs(x(3:3:end)-V(3:3:end)).^2)*dt3

en = sum(abs(x(1:3:end)).^2)*dt1+...
     sum(abs(x(2:3:end)).^2)*dt2+...
     sum(abs(x(3:3:end)).^2)*dt3

disp('SNR db=')
SNR = 10*log10(en/er)



j=1; 
A = 0;  
B = 0;  
C = 0;
la = 1; 
lb = 1; 
lc = 1;
ja=1;   
jb=2;   
jc=2;
s = [];

for i=1: length (I)
    switch mod(i-1,3)
        case 0
            if A==I(i)
                la=la+1;
                if la==17
                    s(ja)=16*16+double(A);
                    la=1;
                    ja=j;
                    j=j+1;
                end
            else
                s(ja)=16*la+double(A);
                la=1;
                ja=j;
                j=j+1;
                A=I(i);
            end
        case 1
            if B==I(i)
                lb=lb+1;
                if lb==17
                    s(jb)=16*16+double(B);
                    lb=1;
                    jb=j;
                    j=j+1;
                end
            else
                s(jb)=16*lb+double(B);
                lb=1;
                jb=j;
                j=j+1;
                B=I(i);
            end
        case 2
            if C==I(i)
                lc=lc+1;
                if lc==17
                    s(jc)=16*16+double(C);
                    lc=1;
                    jc=j;
                    j=j+1;
                end
            else
                s(jc)=16*lc+double(C);
                lc=1;
                jc=j;
                j=j+1;
                C=I(i);
            end
    end
    
end

disp('actual len=')
disp(length(s))

I0 = I(1:3:end);
I1 = I(2:3:end);
I2 = I(3:3:end);

i0 = find(diff(double(I0)));
str0=[];
for j=0:(length(i0)-1)
    if j==0
        t = I0(1:i0(j+1));
    else
        t = I0(i0(j)+1:i0(j+1));
    end
    str0(j+1,1) = length(t);
    str0(j+1,2) = I0(i0(j+1));
end

i1 = find(diff(double(I1)));
str1=[];
for j=0:(length(i1)-1)
    if j==0
        t = I1(1:i1(j+1));
    else
        t = I1(i1(j)+1:i1(j+1));
    end
    str1(j+1,1) = length(t);
    str1(j+1,2) = I1(i1(j+1));
end

i2 = find(diff(double(I2)));
str2=[];
for j=0:(length(i2)-1)
    if j==0
        t = I2(1:i2(j+1));
    else
        t = I2(i2(j)+1:i2(j+1));
    end
    str2(j+1,1) = length(t);
    str2(j+1,2) = I2(i2(j+1));
end
disp('best len=')
disp(length(str0)+length(str1)+length(str2))
disp ('ratio =')
disp(length(s)/(length(str0)+length(str1)+length(str2)))

% i=20 ; [  bitand(s(1:i)',240)/16 bitand(s(1:i)',15) ]
% [ str0(1:10,:) str1(1:10,:) str2(1:10,:) ]


t = zeros(N,1);

t(1:3:end) = 0   +       [1:N/3];
t(2:3:end) = dt1 +       [1:N/3];
t(3:3:end) = dt1 + dt2 + [1:N/3];

figure(1)
subplot(2,1,1)
plot(t,V,t,x,'r',t,x,'.g')
subplot(2,1,2)
plot(t,V-x)

np = fix(sum(fix(s/16))/3)-6;

[fid, MESSAGE] = fopen('out.txt','W');
fprintf(fid,'   dw    %6d\n',np);
fprintf(fid,'   db    %3d\n',s);
fclose(fid);

% ASM replayer simulator

Vp=[];t=1;
ChA=0;ChB=0;ChC=0;

b=1;d=1;e=1;
hl=1;
hlp = np;
while (hlp>0)
	b=b-1;
	if b==0
        b = fix(s(hl)/16);
        bp = bitand(s(hl),15);
        hl=hl+1;
	end
	d=d-1;
	if  d==0
        d = fix(s(hl)/16);
        dp = bitand(s(hl),15);
        hl=hl+1;
	end
	e=e-1;
	if  e==0
        e = fix(s(hl)/16);
        ep = bitand(s(hl),15);
        hl=hl+1;
    end

	ChA = bp;
	Vp(t) = vol(ChA+1)+vol(ChB+1)+vol(ChC+1); t=t+1;
	ChB = dp;
	Vp(t) = vol(ChA+1)+vol(ChB+1)+vol(ChC+1); t=t+1;
	ChC = ep;
	Vp(t) = vol(ChA+1)+vol(ChB+1)+vol(ChC+1); t=t+1;
    hlp = hlp-1;
end

figure(2)
plot(V)
hold on;
plot(Vp,'.r')

By ro

Prophet (3700)

ro's picture

29-12-2005, 13:27

kewl, I think. 'ave to test it. but sound kewl. thanx Atrag

By ARTRAG

Enlighted (4772)

ARTRAG's picture

29-12-2005, 13:33

I have sent two small examples at 'downloads@msx.org'
and I am preparing some others short encodedd wavs

By ro

Prophet (3700)

ro's picture

30-12-2005, 08:13

looking forward 2 it!

By ARTRAG

Enlighted (4772)

ARTRAG's picture

31-12-2005, 16:40

I have done a good number of tests and I have found that the main problem
of this approach is the size of the data.

My RLE imlementation is very poor as my data (at least some channels) can
be very irregular.
Sometimes the result, wrt an 8bit input, is 1.6 times the input, i.e. it is worst than
not encoding at all, that results 1.5 the original lenght....

I have found also that if I trade off SNR with # transition the size can drastically
decrease to 0.9 times the original sive but the SND tends to reduce of 2-3dB.

I will post soon the new algorithm for trading size with SNR.

By snout

Ambassador (14863)

snout's picture

31-12-2005, 21:22

Sorry for the delay on putting the files online, it have been extremely busy days... as soon as I recover from my new-years hangover, I'll put 'em in our downloads section! Smile

By ARTRAG

Enlighted (4772)

ARTRAG's picture

01-01-2006, 05:55

Take your time.
With jltursan we discovered that the msdos pcmenc.exe could not work on all the PC
due to the need of a .dll from matlab.
Probably (also for increasing the processing speed) it would be required a new version
of pcmenc.exe developed directly in C (I compiled the matlab program).

By dvik

Prophet (2176)

dvik's picture

01-01-2006, 10:21

Hi ARTRAG,

Is the ideas in your new PSG sample player something totally new or a refinement of what was discussed in http://www.msx.org/forumtopic663p105.html ? It sounds like its somewhat new in the sense you time it quite well and then using an adaptive way of changing samples. I'm really looking forward to hear the result.

By ARTRAG

Enlighted (4772)

ARTRAG's picture

01-01-2006, 22:56

This is a further development more that a refinement.
The approach is completely different from the one we developed (that one was a "sample by sample" encoding,
this one is a "sequence" encoding) but I have planned this solution (the fact that the sequence of PSG levels can
be found with a Viterbi algorithm) when we discovered the "spike" problem.

If you are interested send me an email and I'll send you some stuff to have a
taste of the whole thing.

Page 1/7
| 2 | 3 | 4 | 5 | 6
My MSX profile