
CGO - Adventure Time: Distant Lands BMO (Simone Giertz)

Description
Link - A 1-minute dataset, 20 kHz. There aren't a lot of voice lines, so I had to squeeze out as much clean data as I could: denoise, remove SFX, de-click, and spectral repair.
Comments

just a note: spectral repair isn't that useful, ur better off just not using it. u can also compress ur audio with a ratio of 2.0:1 to lower or remove some leftover noise and make the dataset much more consistent. Also use phasing and de-ess.
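For reference, here is a minimal sketch of what a 2.0:1 ratio does to levels. It is only the static gain curve, with no attack or release smoothing, so it is not what a real compressor plugin does sample by sample. The function name `compress` and the -20 dB threshold are assumptions for illustration; the comment above only gives the ratio.

```python
import numpy as np

def compress(samples, threshold_db=-20.0, ratio=2.0):
    """Static downward compression (hypothetical helper, no attack/release).

    Every dB above the threshold is reduced to 1/ratio dB,
    so at 2:1 a peak 20 dB over the threshold ends up 10 dB over it.
    """
    eps = 1e-12  # avoids log10(0) on silent samples
    level_db = 20.0 * np.log10(np.abs(samples) + eps)
    # How far each sample sits above the threshold (0 if below it).
    over_db = np.maximum(level_db - threshold_db, 0.0)
    # Gain reduction needed so the overshoot shrinks by the ratio.
    gain_db = -over_db * (1.0 - 1.0 / ratio)
    return samples * 10.0 ** (gain_db / 20.0)

# A full-scale sample is pulled down; a quiet one is left alone.
audio = np.array([1.0, 0.01, -1.0])
squashed = compress(audio)
```

With the assumed -20 dB threshold, a full-scale sample (0 dB) comes out at -10 dB, while anything under the threshold passes through unchanged.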

I use spectral repair just to remove some tiny imperfections caused by SFX separation (they are not even audible)

Usually it does more harm than good cuz it introduces more inconsistent frequencies. GANs don't like that. You can test it out yourself. To compare, u should look at the gradients and see how they adjust more smoothly on the unrepaired dataset than on the repaired one. Just know that just bc u can't hear said repaired frequencies, that doesn't mean the GAN can't be confused by them

Also if u ripped from yt it's advisable to train with 32k rather than 40k :3

I meant that even before repairing, it was at an inaudible level and outside of the voice frequency range. No, it was ripped from HMAX.

still, u may not hear it but the gan sure can see it thru the mel spectrogram. Regardless, mind showing me a close-up of the spectrogram in rx?

oh yeah thats rough

u should def train with 32k

What do you mean by rough?

the cut off is much lower than 40k
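The cutoff check discussed above can also be done without eyeballing a spectrogram. This is a rough sketch, assuming the audio is already loaded as a mono NumPy array: it finds the highest frequency that still carries energy and picks the smallest training rate whose Nyquist frequency (half the sample rate) covers it. The function names, the -60 dB floor, and the candidate rates are all assumptions for illustration, not part of any training tool.

```python
import numpy as np

def estimate_cutoff_hz(samples, sample_rate, floor_db=-60.0):
    """Highest frequency whose magnitude is within floor_db of the spectral peak."""
    windowed = samples * np.hanning(len(samples))  # reduce spectral leakage
    spectrum = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    level_db = 20.0 * np.log10(spectrum / spectrum.max() + 1e-12)
    above = np.nonzero(level_db > floor_db)[0]
    return float(freqs[above[-1]])

def suggest_training_rate(cutoff_hz, candidates=(32000, 40000, 48000)):
    """Smallest candidate rate whose Nyquist frequency covers the cutoff."""
    for rate in sorted(candidates):
        if rate / 2 >= cutoff_hz:
            return rate
    return max(candidates)

# Synthetic stand-in for a clip: tones at 1 kHz and 12 kHz, 1 second at 44.1 kHz.
sr = 44100
t = np.arange(sr) / sr
clip = np.sin(2 * np.pi * 1000 * t) + 0.5 * np.sin(2 * np.pi * 12000 * t)
cutoff = estimate_cutoff_hz(clip, sr)
rate = suggest_training_rate(cutoff)
```

On real clips the floor would need tuning, since leftover noise above the true cutoff can push the estimate higher than what the spectrogram shows.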