Voice synthesis on ISR

Page 5/25

By Louthrax

Prophet (2083)

29-05-2016, 19:24

So the idea here is to analyze a source .WAV file with a Fourier or fast Fourier transform, get the main frequency or frequencies, and generate short SCC samples that are updated every 1/60 s?

By ARTRAG

Enlighted (6250)

29-05-2016, 20:23

Interested in seeing the source? If you use MATLAB, no problem... (originally written in French)

By ARTRAG

Enlighted (6250)

29-05-2016, 20:33

The ASM part is trivial; these are the PSG and PSG+SCC routines called from the ISR:

		
;-------------------------------------
		
ISRPsg:
                ld      a,(SfxOn)
                or      a
                call    nz,ReplayerUpdatePsg
                ret		
ISRScc:
                ld      a,(SfxOn)
                or      a
                call    nz,ReplayerUpdateScc
                ret				
;-------------------------------------
		
		
ReplayerUpdateScc:
		ld	a,(PAG_FRAME)
		ld	(Bank2),a
		
		call	en_scc				; scc in page 2
                ld      a,3Fh
                ld      (sccBank3),a		; scc registers in page 2

		ld      a,00011111b     	; scc channels 1-5 active
                ld      (988Fh),a
		
		LD	HL,(PNT_FRAME)
		LD	A,[HL]
		cp	080h					; frame terminator
		JR	Z,SCCReplayerMute
		
		ld	de,PSG_REG+0
		ld	bc,3*2
		ldir
		ld	de,9880h 
		ld	bc,5*2
		ldir

		ld	de,PSG_REG+8
[3]		call	logvol			; sjasm repeat prefix: logvol runs 3 times, once per PSG volume
		ld	de,988Ah 
		ld	bc,5
		ldir
		
		LD	(PNT_FRAME),HL

		call 	ROUT
		
		call	en_slot		
		ret
		
ReplayerUpdatePsg:
		ld	a,(PAG_FRAME)
		ld	(Bank2),a
		
		LD	HL,(PNT_FRAME)
		LD	A,[HL]
		cp	080h					; frame terminator
		JR	Z,PSGReplayerMute

		ld	bc,5*2
		add	hl,bc					; skip 5 channels		

		ld	de,PSG_REG+0
		ld	bc,3*2
		ldir

		ld	bc,5
		add	hl,bc					; skip 5 channels		

		ld	de,PSG_REG+8
[3]		call	logvol			; sjasm repeat prefix: logvol runs 3 times, once per PSG volume

		LD	(PNT_FRAME),HL
	
ROUT:		
		XOR     A
		LD      BC,11*256+$A1
		LD      HL,PSG_REG
LOUT:	        OUT     [0xA0],A
		INC     A
		OUTI 
		JR      NZ,LOUT
		RET
		
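; extract one PSG volume (upper nibble of the data byte) and store it at (de)
logvol:
		ld	a,(hl)
[4]		RLCA 				; sjasm repeat prefix: 4 rotations bring the upper nibble down
		and  15
		ld 	(de),a
		inc de
		inc hl
		ret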

;-------------------------------------
; Mute replayer
;-------------------------------------
SCCReplayerMute:

                xor      a
                ld      (988Fh),a		; all SCC channels inactive

PSGReplayerMute:
                xor      a
                ld      (SfxOn),a

		LD	HL,PSG_REG
		LD	DE,PSG_REG+1
		LD	BC,14
		LD	[HL],A
		LDIR					; all PSG channels inactive

		LD   A,10011000B		; **** JUST IN CASE ****
		LD   [PSG_REG+7],A

		LD   A,31
		LD   [PSG_REG+6],A
		
		jr	ROUT


;-------------------------------------
; Initialize the scc
;-------------------------------------
SccInit:

                xor		a         ; Do not reset phase when freq is written
                ld  	(98E0h),a			; on SCC
                ld  	(98C0h),a			; cover SCC+ in SCC mode

                ld      de,9800h
        	ld		a,4
		
1:             ld      hl,sinewave
                ld      bc,32
                ldir
        	dec	a
        	jr	nz,1b
                ret
sinewave:
                db	0,24,48,70,89,105,117,124,127,124,117,105,89,70,48,24,0,231,207,185,166,150,138,131,129,131,138,150,166,185,207,231
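
The `sinewave` table above appears to be one period of a sine sampled at 32 points, scaled to 127, rounded down and stored as unsigned bytes in two's complement. A minimal Python sketch that reproduces it, assuming that reading is right (`scc_sine_table` is a hypothetical helper, not part of the replayer):

```python
import math

def scc_sine_table(length=32, amplitude=127):
    """One sine period as unsigned bytes holding signed 8-bit samples."""
    table = []
    for k in range(length):
        sample = math.floor(amplitude * math.sin(2 * math.pi * k / length))
        table.append(sample & 0xFF)  # two's complement: -25 becomes 231
    return table

print(scc_sine_table())
```

SccInit copies this table four times starting at 9800h, which fills the waveform RAM of SCC channels 1 to 4; on a plain SCC, channel 5 shares the waveform of channel 4.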

By ARTRAG

Enlighted (6250)

29-05-2016, 20:31

The data frame looks like this:

    dw 67,110,30,28,26,35,45,31 
    db 0x81,0x81,0x81,0x91,0x91,0x92,0x92,0x92 
    dw 31,30,621,143,47,62,67,373 
    db 0x81,0x91,0x91,0x92,0x92,0xA2,0xB3,0xE9 

There are 8 frequencies (actually periods) and 8 volumes (the lower nibble is coded for the SCC, the upper one for the PSG).
This way the same data is used in both cases, with or without an SCC detected.
The tones are sorted by increasing relevance, so when the PSG is alone it uses the last 3 tones.
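
To make the layout concrete, here is a minimal Python sketch that unpacks one frame and picks the tones each chip would play, following the replayer routines posted earlier in the thread. The function names are hypothetical, and the 24-byte layout (8 little-endian words followed by 8 bytes) is read off the `dw`/`db` lines.

```python
import struct

def parse_frame(raw):
    """Split a 24-byte frame into 8 periods, 8 PSG volumes and 8 SCC volumes."""
    periods = list(struct.unpack_from("<8H", raw, 0))  # the 8 dw words
    packed = raw[16:24]                                 # the 8 db bytes
    psg_vols = [b >> 4 for b in packed]                 # upper nibble
    scc_vols = [b & 0x0F for b in packed]               # lower nibble
    return periods, psg_vols, scc_vols

def assign_tones(raw, scc_present):
    """Return (psg_tones, scc_tones) as lists of (period, volume) pairs."""
    periods, psg_vols, scc_vols = parse_frame(raw)
    if scc_present:
        # PSG takes the first 3 tones, the SCC the remaining 5.
        psg = list(zip(periods[:3], psg_vols[:3]))
        scc = list(zip(periods[3:], scc_vols[3:]))
    else:
        # The PSG alone plays the last 3 tones, the most relevant ones.
        psg = list(zip(periods[5:], psg_vols[5:]))
        scc = []
    return psg, scc

# First frame from the post above.
frame = struct.pack("<8H", 67, 110, 30, 28, 26, 35, 45, 31) + bytes(
    [0x81, 0x81, 0x81, 0x91, 0x91, 0x92, 0x92, 0x92])
print(assign_tones(frame, scc_present=False))
```

With the first frame shown above, every byte carries a PSG volume of 8 or 9 and an SCC volume of 1 or 2, so the same 24 bytes serve both configurations.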

By Metalion

Paladin (1009)

29-05-2016, 20:37

Louthrax wrote:

So the idea here is to analyze a source .WAV file with a Fourier or fast Fourier transform, get the main frequency or frequencies, and generate short SCC samples that are updated every 1/60 s?

That concept I can understand.
FFT speaks to me.
Is that the method used?

ARTRAG wrote:

Interested in seeing the source? If you use MATLAB, no problem... (originally written in French)

I didn't know you spoke French, ARTRAG.
It's a lovely surprise...! (originally written in French)

By ARTRAG

Enlighted (6250)

29-05-2016, 20:46

The real problem is how to generate the data from a speech sample.
My MATLAB script basically does this:

- read a WAV file
- band-pass filter the input to roughly 500 Hz-6000 Hz
- segment the audio into chunks of 1/60 s
- compute the power spectrum of each chunk (via FFT)
- find the 8 biggest local maxima of the power spectrum, making sure they do not mask each other
- encode their frequencies and amplitudes as MSX periods and volumes

Simple, but with some manual tweaking on the encoding side.
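
The steps above can be sketched in plain Python for a single 1/60 s chunk. This is only an illustration under simplifying assumptions: it uses a naive DFT instead of an FFT, skips the band-pass filter and the masking check, and all function names are hypothetical.

```python
import cmath
import math

def magnitude_spectrum(chunk):
    """Naive DFT magnitudes for bins 0..N/2-1 (O(N^2), fine for one short chunk)."""
    n = len(chunk)
    mags = []
    for k in range(n // 2):
        acc = 0j
        for t, x in enumerate(chunk):
            acc += x * cmath.exp(-2j * math.pi * k * t / n)
        mags.append(abs(acc))
    return mags

def strongest_peaks(mags, count=8):
    """Return (bin, magnitude) for the `count` largest local maxima, strongest first."""
    peaks = [(k, mags[k]) for k in range(1, len(mags) - 1)
             if mags[k] > mags[k - 1] and mags[k] > mags[k + 1]]
    peaks.sort(key=lambda p: p[1], reverse=True)
    return peaks[:count]

def analyse_chunk(chunk, sample_rate, count=8):
    """Return (frequency_hz, amplitude) pairs for the strongest spectral peaks."""
    n = len(chunk)
    mags = magnitude_spectrum(chunk)
    return [(k * sample_rate / n, m / n) for k, m in strongest_peaks(mags, count)]

# Two test tones at 600 Hz and 1500 Hz in a 200-sample chunk at 12 kHz.
n = 200
chunk = [math.sin(2 * math.pi * 10 * t / n) + 0.5 * math.sin(2 * math.pi * 25 * t / n)
         for t in range(n)]
print(analyse_chunk(chunk, 12000, count=2))
```

Dividing the magnitude by the chunk length yields half the sine amplitude; only the ratio between peaks matters, since the volumes are later normalised to the loudest peak.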

By Manuel

Ascended (15756)

29-05-2016, 21:10

Are the Matlab scripts compatible with GNU Octave? If so, then everyone could use these scripts to generate sample data. See https://www.gnu.org/software/octave/

By ARTRAG

Enlighted (6250)

29-05-2016, 21:36

No idea. This is the main script; it also runs the assembler at the end.
Just replace the path to the WAV files with your own.
Someone with Octave could tell us whether it works.


close all;
clear;
path = 'wav\SKYJAGUAR\';


names = dir([path '*.wav']);
nfiles = size(names,1);

for ii = 1:nfiles 
    ii
    name = [ path names(ii).name];

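    % Note: wavread was removed from recent MATLAB and Octave releases;
    % there, [Y,FS] = audioread(name) is the replacement.
    [Y,FS,NBITS] = wavread(name);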
    if size(Y,2)>1
     X = Y(:,1)+Y(:,2);
    else
     X = Y;
    end

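    % Band-pass edges. butter() expects Wn normalised to the Nyquist rate
    % (FS/2), so 450/FS and 6000/FS correspond to 225 Hz and 3000 Hz.
    Wn = [450/FS, 6000/FS];
    
    [Bbp,Abp]=butter(5,Wn); 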

    Tntsc = 1/60;

    Nntsc = fix(Tntsc*FS);

    Nblk = fix(length(X)/Nntsc);
    X = X(1:Nblk*Nntsc);
    L = length(X);

    t = [1:Nntsc]';

    Y = zeros(Nblk*Nntsc,1);
    f = zeros(Nblk,8);
    a = zeros(Nblk,8);
    p = zeros(1,8);
    
    X = filter(Bbp,Abp,X);
    
    for i=1:Nblk
        x = X((i-1)*Nntsc+1:i*Nntsc);

        XF = abs(fft(x));
        [pks,locs]= findpeaks(XF(1:round(Nntsc/2)),'SORTSTR','descend'); 

        if size(pks,1)<8
            y = zeros(1,8);
            y(1:length(pks)) = pks;
            pks = y;
            y = round(Nntsc/2)-7:round(Nntsc/2);
            y(1:length(locs)) = locs;
            locs = y;
        end
        
         pks = pks(8:-1:1);
         locs = locs(8:-1:1);
        
        y = zeros(size(x));
        freq = zeros(1,8);
        amp  = zeros(1,8);
        
        for  ti=1:8
            j = locs(ti);   
            freq(ti) = (j-1)/Nntsc*FS;
            amp(ti) = abs(XF(j))/Nntsc;
            y = y + amp(ti)*(sin(2*pi*freq(ti)*t/FS+p(ti)));
            p(ti) = 2*pi*freq(ti)*t(end)/FS;
        end

        Y((i-1)*Nntsc+1:i*Nntsc) =  y;
        f(i,:) = freq;
        a(i,:) = amp;
    end

%    sound(X,FS)
%    sound(Y,FS)

    TP = uint16(3579545./(32*f));

    m = max(a(:));
    nscc = uint8(a/m*15);

    npsg = 2*log2(a/m)+15;
    npsg(isinf(npsg))=0;
    npsg = uint8(ceil(npsg));

    n = npsg*16+nscc;		% in the same byte psg and scc volumes
	
    fid = fopen([name 'frm_scc3.txt'],'w');
    for i = 1:Nblk
        fprintf(fid,'    dw %d,%d,%d,%d,%d,%d,%d,%d \n',TP(i,1),TP(i,2),TP(i,3),TP(i,4),TP(i,5),TP(i,6),TP(i,7),TP(i,8));
        fprintf(fid,'    db 0x%s,0x%s,0x%s,0x%s,0x%s,0x%s,0x%s,0x%s \n',dec2hex(n(i,1),2),dec2hex(n(i,2),2),dec2hex(n(i,3),2),dec2hex(n(i,4),2),dec2hex(n(i,5),2),dec2hex(n(i,6),2),dec2hex(n(i,7),2),dec2hex(n(i,8),2));
    end
    fclose(fid);

end

fid = fopen('frm_scc3.txt','w');
fprintf(fid,'nfiles: equ  %d \n\n',nfiles);

fprintf(fid,'   page 0\n');

fprintf(fid,'frames: \n');


for ii = 1:nfiles
    fprintf(fid,'   dw frame%d\n',ii-1);
    fprintf(fid,'   db :frame%d\n',ii-1);
end

for ii = 1:nfiles 
    fprintf(fid,'   page 1..31\n');
    name = [ path names(ii).name];
    fprintf(fid,'frame%d: \n',ii-1);
    fprintf(fid,'   include %s \n',[name 'frm_scc3.txt']);
    fprintf(fid,'   db	080h\n');
end
fclose(fid);

!sjasm -Iasm -s sccLOFI3.asm
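
The encoding step near the end of the script (`TP`, `nscc`, `npsg`, `n`) can be restated as a short Python sketch. It assumes the same formulas as the MATLAB code and mimics the rounding and saturation of `uint16()` and `uint8()`; the function names are hypothetical.

```python
import math

MSX_CLOCK_HZ = 3579545  # same constant as in the script

def to_period(freq_hz):
    """Tone period clock / (32 * f), rounded and saturated to 16 bits like uint16()."""
    if freq_hz <= 0:
        return 0xFFFF  # a division by zero saturates in MATLAB
    return min(0xFFFF, int(MSX_CLOCK_HZ / (32 * freq_hz) + 0.5))

def scc_volume(amp, max_amp):
    """Linear 4-bit volume for the SCC."""
    return int(amp / max_amp * 15 + 0.5)

def psg_volume(amp, max_amp):
    """Logarithmic 4-bit volume for the PSG: two steps per halving of amplitude."""
    if amp <= 0:
        return 0  # the script maps -Inf to 0
    return max(0, min(15, math.ceil(2 * math.log2(amp / max_amp) + 15)))

def pack_volume(amp, max_amp):
    """PSG volume in the upper nibble, SCC volume in the lower nibble."""
    return psg_volume(amp, max_amp) * 16 + scc_volume(amp, max_amp)

print(to_period(1000.0), hex(pack_volume(0.6, 1.0)))
```

The factor 2 in front of `log2` reflects the PSG's logarithmic volume scale, where each step changes the amplitude by roughly a factor of the square root of 2, while the SCC volume is treated as linear.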

By sd_snatcher

Prophet (3068)

29-05-2016, 21:38

This nifty method seems to be doable even on the OPL soundchips without PCM support, like the OPLL and OPL3. Is that right?

By ARTRAG

Enlighted (6250)

29-05-2016, 21:48

Totally true! This is why I was asking for snippets to detect the OPLL and play tones at variable volumes and frequencies. The OPLL is still a bit obscure to me, but it should be perfectly suitable for the purpose.
