Voice synthesis on ISR

Page 8/30

By ARTRAG

Enlighted (6406)

30-05-2016, 09:34

Mixing, say, 4 harmonics for 4 channels of 32 samples in software seems quite heavy...
That is 3x32x4 multiplications per frame, not counting the additions.
Maybe one could use only 2 harmonics, quantize the relative amplitudes, and store precomputed waves in ROM.

By Grauw

Ascended (9181)

30-05-2016, 09:51

You could do what FM chip synthesis does internally: operate in logarithmic space, where you can use additions instead of multiplications, followed by a final lookup-table transform back to linear space.
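
As a rough illustration of the log-space idea, here is a minimal Python sketch. The table resolution (`STEPS_PER_OCTAVE = 32`) and the 8-bit value range are assumptions made for the example, not numbers from this thread, and the layout is not that of any particular FM chip. It only shows a product computed with one addition of logarithms plus a final exponential lookup.

```python
import math

STEPS_PER_OCTAVE = 32             # hypothetical log resolution
MAX_LOG = 8 * STEPS_PER_OCTAVE    # log2(255) is just under 8 octaves

# Log table: value 1..255 -> quantized log2. Entry 0 is a placeholder.
LOG_TAB = [0] + [round(math.log2(v) * STEPS_PER_OCTAVE) for v in range(1, 256)]
# Exp table: log index -> linear value, sized for the sum of two logs.
EXP_TAB = [2 ** (i / STEPS_PER_OCTAVE) for i in range(2 * MAX_LOG + 1)]

def log_mul(a, b):
    """Approximate a*b for integers 0..255 with one addition and three lookups."""
    if a == 0 or b == 0:
        return 0                  # zero has no logarithm, handle it separately
    return round(EXP_TAB[LOG_TAB[a] + LOG_TAB[b]])
```

With this table size each product is off by up to roughly 2%, because both logarithms are rounded to the nearest step. A finer table reduces the error at the cost of ROM.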

By ARTRAG

Enlighted (6406)

30-05-2016, 10:06

I was thinking that I could store scaled versions of the main tone and subsample them to get its higher harmonics. This would cost, say, 128x32 bytes, but I would avoid all multiplications. Using logarithmic volumes one could end up with far fewer waves, say 16 (the PSG approach).

The CPU load per frame would be 32*(n-1)*4 additions, where n is the number of higher harmonics to reproduce.
The other operations would be indirect accesses to the right scaled wave and to the subsamples corresponding to the current harmonic.
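
The scheme above can be sketched in Python for a single channel. Everything numeric here is an assumption for illustration: 16 levels spaced about 3 dB apart (PSG-like), and a peak amplitude of 31 so that four harmonics summed together still fit in a signed byte. The names `WAVES`, `scaled_sine` and `mix` are hypothetical.

```python
import math

N = 32        # samples per SCC-style waveform
LEVELS = 16   # assumed PSG-like logarithmic volume levels, ~3 dB per step

def scaled_sine(level):
    """One sine period scaled for a volume level (0 = silent, 15 = full)."""
    if level == 0:
        return [0] * N
    gain = 2 ** ((level - (LEVELS - 1)) / 2)   # level 15 -> 1.0, each step down ~ -3 dB
    # With a peak of only 31, the lowest few levels round to all zeros.
    return [round(31 * gain * math.sin(2 * math.pi * i / N)) for i in range(N)]

WAVES = [scaled_sine(v) for v in range(LEVELS)]   # precomputed "ROM" table, 16 x 32

def mix(volumes):
    """volumes[k-1] is the level of harmonic k. Returns 32 mixed samples."""
    out = [0] * N
    for k, vol in enumerate(volumes, start=1):
        wave = WAVES[vol]          # indirect access to the right scaled wave
        idx = 0
        for i in range(N):
            out[i] += wave[idx]    # additions only, no sample multiplication
            idx = (idx + k) & (N - 1)   # step by k through the table: harmonic k
    return out
```

For example, `mix([15, 12, 0, 0])` would combine the fundamental at full level with a quieter second harmonic. Stepping the index by k is what "subsampling" amounts to here: harmonic k completes k periods in the same 32 output samples.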

By Grauw

Ascended (9181)

30-05-2016, 10:03

You could use logarithmically scaled versions so that you can just add them, at the cost of reducing the volume resolution to just 16 steps (like the PSG). Also note that you could possibly keep just one full-volume sine wave and shift it to get the logarithmic versions.
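
A small Python sketch of the shifting idea, assuming signed 8-bit samples and a 32-sample sine (the names `FULL` and `shifted` are made up for the example):

```python
import math

N = 32
# One full-volume sine period, signed 8-bit range.
FULL = [round(127 * math.sin(2 * math.pi * i / N)) for i in range(N)]

def shifted(wave, k):
    """Halve the amplitude k times using arithmetic right shifts only."""
    # Note: >> rounds toward minus infinity, so negative samples
    # end up one unit larger in magnitude than positive ones.
    return [s >> k for s in wave]
```

Each shift halves the amplitude, which is about 6 dB per step, so shifting alone gives coarser volume steps than a table with one wave per volume level.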

I wonder how expensive it would be to implement the FFT in Z80 code if you use a logarithmic scale to avoid the multiplications.

By ARTRAG

Enlighted (6406)

30-05-2016, 10:08

Yes

By ARTRAG

Enlighted (6406)

30-05-2016, 10:09

No idea about the feasibility of the FFT, but it seems too complex to me.

By wouter_

Champion (429)

30-05-2016, 13:22

ARTRAG wrote:

Mixing in software say 4 harmonics for 4 channels of 32 samples seems quite heavy...
It does 3x32x4 multiplications per frame... to not count additions ...

My idea was to do all these calculations in the encoder, so also store the waveforms (like how you now store frequencies and volumes). But I see now that this would require more storage than a full 8-bit 8 kHz sample. So ignore this idea ;-)

Grauw wrote:

I wonder how expensive it would be to implement the FFT in Z80 code if you use a logarithmic scale to avoid the multiplications.

Log-scale makes multiplication cheap but addition expensive. FFT needs both multiplication and addition.
You can partly work around this by having two log<->linear lookup tables, but these conversions very quickly accumulate rounding errors.
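
To make the trade-off concrete, here is a hedged Python sketch. The resolution (`STEPS = 16` log steps per octave) and the function names are assumptions for the example, and `math.log2` and `2 **` stand in for the two lookup tables. Multiplication stays a single integer addition, while every addition needs a trip to linear space and back, with a rounding on the way back.

```python
import math

STEPS = 16   # hypothetical log resolution: steps per octave

def to_log(x):
    """Linear value (> 0) to quantized log index (the linear->log table)."""
    return round(math.log2(x) * STEPS)

def to_lin(l):
    """Quantized log index back to linear (the log->linear table)."""
    return 2 ** (l / STEPS)

def log_mul(la, lb):
    return la + lb                           # cheap and exact in the log domain

def log_add(la, lb):
    return to_log(to_lin(la) + to_lin(lb))   # two conversions out, one rounded back

def accumulate(values):
    """Sum positive values while staying in the log domain between steps."""
    acc = to_log(values[0])
    for v in values[1:]:
        acc = log_add(acc, to_log(v))        # rounds again on every addition
    return to_lin(acc)
```

Comparing `accumulate(xs)` with `sum(xs)` for a longer list shows how the per-addition rounding can build up, which is the concern for an FFT with its many chained additions.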

By ARTRAG

Enlighted (6406)

30-05-2016, 20:20

The best solution seems to be to have 15 waves scaled according to logarithmic volumes (zero volume is not represented).
One byte could pack the amplitudes of two harmonics.
The same 15 waves can be subsampled by 2, 3, 4, etc. to get the higher harmonics.
The encoder should:
- compute the peaks in the amplitude spectrum; call P the set of peaks and Pf the set of their frequencies
Repeat 4 times
- find the highest peak in P and call f0 its frequency in Pf; store its SCC amplitude, then remove f0 from Pf and the peak from P
- look for any f in Pf in the range 2*f0 +/- df
- if present, store its relative amplitude as a logarithmic volume, then remove f from Pf and its amplitude from P
- do the same for 3*f0 and 4*f0
Endrepeat
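
The encoder steps above can be sketched in Python as follows. This is only one possible reading of the outline: the function names, the 3 dB step size, the use of level 0 to mean "harmonic absent", and the choice of the closest candidate when several peaks fall inside the +/- df window are all assumptions. Peak amplitudes are assumed to be positive.

```python
import math

def log_volume(rel_amp, levels=16, db_per_step=3.0):
    """Quantize a relative amplitude in (0, 1] to a level 1..15 (assumed 3 dB steps)."""
    if rel_amp <= 0:
        return 0
    db = 20 * math.log10(min(rel_amp, 1.0))          # 0 dB or below
    return max(1, min(levels - 1, round(levels - 1 + db / db_per_step)))

def group_peaks(peaks, df, voices=4, harmonics=(2, 3, 4)):
    """peaks: list of (frequency, amplitude) pairs.
    Returns up to `voices` tuples (f0, amplitude, [levels for 2*f0, 3*f0, 4*f0])."""
    remaining = list(peaks)
    result = []
    for _ in range(voices):
        if not remaining:
            break
        f0, a0 = max(remaining, key=lambda p: p[1])   # highest remaining peak
        remaining.remove((f0, a0))
        levels = []
        for h in harmonics:
            near = [p for p in remaining if abs(p[0] - h * f0) <= df]
            if near:
                f, a = min(near, key=lambda p: abs(p[0] - h * f0))
                remaining.remove((f, a))
                levels.append(log_volume(a / a0))     # amplitude relative to f0
            else:
                levels.append(0)                      # harmonic not found
        result.append((f0, a0, levels))
    return result

def pack(level_a, level_b):
    """Pack two 4-bit levels into one byte."""
    return (level_a << 4) | level_b
```

With the made-up input `[(100, 1.0), (200, 0.5), (305, 0.25), (750, 0.8)]` and `df = 10`, the peaks at 200 and 305 are absorbed as the second and third harmonics of 100, and 750 becomes a voice of its own with no harmonics.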

By ARTRAG

Enlighted (6406)

30-05-2016, 21:21

Techniques for Harmonic Sinusoidal Coding

https://www.google.it/url?sa=t&source=web&rct=j&url=http://w...

By ARTRAG

Enlighted (6406)

31-05-2016, 00:33

Reading chapter 3 from page 54, it seems I shouldn't start from the highest peak but from the one at the lowest frequency...
Interesting reading; there is a lot to improve in this coder for MSX.
