Wow, this gives quite a quality improvement (with the -o1 option), much cleaner samples indeed (and nicer waveforms in the visualizers). I will experiment a little further with other samples, eager to see what is possible!
Great to hear! I’m looking forward to Awesome version 2! (though it might lose some “character” )
Do you not get the same results with -o2? What type of samples is that with? Do you notice a difference between those that show a clear fundamental (one primary peak and valley) and those that consist primarily of overtones (two or more peaks and valleys)?
If so, it could be worth thinking about selecting the phase of an overtone if the fundamental has low power, or perhaps the pitch detection should reject low-power fundamentals more eagerly (with the additional benefit that you get higher wave resolution)…
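To make the overtone-selection idea above concrete, here is a minimal sketch in JavaScript. Everything here is an illustrative assumption, not the encoder's actual method: the function name, the fixed harmonic count, and the 0.1 power-ratio threshold are all made up for the example.

```javascript
// Hypothetical sketch: if the fundamental bin has low power relative to
// its overtones, pick the strongest overtone as the phase reference
// instead. All names and thresholds here are illustrative assumptions.
function pickReferenceHarmonic(spectrum, f0Bin, maxHarmonic = 4, minRatio = 0.1) {
  // Collect power at the fundamental and its first few overtones.
  const harmonics = [];
  for (let h = 1; h <= maxHarmonic; h++) {
    const bin = f0Bin * h;
    if (bin >= spectrum.length) break;
    harmonics.push({ harmonic: h, power: spectrum[bin] });
  }
  const strongest = harmonics.reduce((a, b) => (b.power > a.power ? b : a));
  const fundamental = harmonics[0];
  // Keep the fundamental unless it is weak compared to the strongest
  // overtone; otherwise align phase to that overtone.
  return fundamental.power >= minRatio * strongest.power ? fundamental : strongest;
}
```

For a sample dominated by overtones (weak bin at the fundamental, strong bin at the second harmonic), this would select the second harmonic as the alignment reference.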
@ARTRAG Thanks for the pointers! I had found PEFAC cited in the source code (am about to read it, I read YIN yesterday), I’ll add RAPT to my reading list as well! From what I see so far PEFAC works based on the spectrogram and uses the presence of overtones to its advantage, like I was curious about before, so definitely interesting. It does inspire me with a bit more confidence than the time-domain autocorrelation-based algorithms.
Oh, the PEFAC paper was great. It’s pretty short, the method is relatively simple and easy to follow, and it answers a lot of questions that came to mind. Like that RAPT is also a time-domain algorithm that performs worse than YIN, so probably not super interesting, and also that applying temporal continuity weights to the selection gives a big accuracy improvement and suppresses octave errors (which RAPT also does, so still worth a read; this can be applied to PEFAC too). The configuration of the spectrogram and how it’s interpolated to log scale are also clearly described.
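The temporal-continuity idea can be sketched as a small dynamic-programming pass over per-frame pitch candidates: prefer strong candidates, but penalize large jumps between frames. This is only my JavaScript toy illustration of the concept; the cost function and the `jumpPenalty` weight are assumptions, not the paper's actual terms.

```javascript
// Hypothetical sketch of temporal-continuity weighting: choose one pitch
// candidate per frame by dynamic programming, trading per-frame candidate
// strength against log-frequency jumps between consecutive frames.
function trackPitch(frames, jumpPenalty = 1.0) {
  // frames: array of frames, each an array of { freq, strength } candidates.
  let paths = frames[0].map(c => ({ score: c.strength, track: [c.freq] }));
  for (let t = 1; t < frames.length; t++) {
    paths = frames[t].map(c => {
      // Best predecessor: high accumulated score, small octave distance.
      let best = null;
      for (const p of paths) {
        const prev = p.track[p.track.length - 1];
        const jump = Math.abs(Math.log2(c.freq / prev));
        const score = p.score + c.strength - jumpPenalty * jump;
        if (!best || score > best.score) {
          best = { score, track: [...p.track, c.freq] };
        }
      }
      return best;
    });
  }
  // Return the pitch track with the highest total score.
  return paths.reduce((a, b) => (b.score > a.score ? b : a)).track;
}
```

On a frame sequence where the middle frame's strongest candidate is an octave error, the jump penalty makes the tracker stay on the continuous pitch, which is the octave-error suppression described in the paper.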
In my limited experiments with voicebox I remember that PEFAC performed better than RAPT.
Moreover, its implementation exposed the parameters I needed to change, so for me the choice was easy.
Are you working on a new implementation of the encoder?
My version needs the Matlab runtime libraries, and that is a very high barrier to its adoption and use.
If you release a version that doesn't need that cumbersome overhead, it could help spread the use of this encoding method, and it could be bundled with Trilotrackr and Realfun 3, which include the use of its samples.
It’s more like exploring, to see how it works, learn, and if I can find ways to improve it. I’m not sure if there will be a final implementation.
I’m on macOS so I can’t run the .exe conveniently, and I don’t have Matlab either. So instead I’m building some experiments of my own in JavaScript. But I think your executable is quite accessible for most people even though they need to install the Matlab runtime.
Tbh Python or C++ would be a better language choice, since they have good signal-processing libraries (e.g. numpy/scipy and JUCE). But I do most of my build scripts in JS, and there is some fun and learning to be had implementing it myself. A web-hosted conversion tool in combination with a MAP article could be interesting, but I’m not sure if I will get that far.
Here it is: Awesome-update
Very cool, noticeable improvement to sample clarity!
Sounds f*cking fantastic. WOW!
Much cleaner now! Amazing work!
Yea! Are you ready? ;-) You are applying effects to sampled speech too.
The "Awesome" sample is modulated in real time.
Can you also use samples as if they were instruments?
WOW!!! Is this version already available for trial?