Description
Dataset was 1:41 long (actually 14 seconds of audio, repeated several times).
Comments

okay
14 seconds dataset 😭
we need to make it sing the covers that were made with sentence mixing
im gonna do ai morshu scatman now
exactly
instead of sentence mixing it will be ai
if AI was this good in 2010 internet would have gone crazy
he cut EVERY SINGLE LETTER
since he has like 5 lines

wonder what it would sound like if you train it on sentence mixed voicelines and overfit it as much as possible
🤔 should i

if you know how to sentence mix an entire dataset (or take existing ones like the ltg one i made above lol) then sure
i know how to do it by muscle memory, that's what i always did in 2015

based
seems like this model was made on an old version of rvc
i hate this
ill retrain it on crepe
made dataset with a new technique
i dont know if this will make it better or worse
best loss rate ive ever seen in my entire life

I used this notebook: https://colab.research.google.com/drive/1fc09UzPha3n-q6e5WFwUhr6u5EwOWM9p

And I trained it on crepe too
weirdly enough it doesn't work on the latest version of mangio's rvc fork
throws errors

Huh, that's odd
plus pretty weird since it has a total_fea file

That fork is used in the notebook
which got deprecated almost 3 weeks ago
ill post this V3, maybe will make an eventual V4 if it turns out good with sentence mixed vocals
i tried a new method for small datasets so let's see how it turns out, training to 300 epochs

I put V2 in the name because I trained it using the V2 notebook

So technically this is V1
rename it to v1 lol i thought this was a v2 of the model
Morshu (Link: The Faces of Evil) V1 RVC (300 Epochs)

I don't want to mislead anyone into thinking this was trained with RVC V1
rvc v1 doesn't exist

There should be an RVC V2 tag
what
there is no rvc v2?
i dont understand lol
you mean secondary base models?

Yes, I think it trains a different way as well
only for smaller datasets with smaller epoch count
it trains "faster"
like a v1 500 epochs is the same as v2 300 epochs
it needs fewer epochs (?)
im training with v1
just rename it to Morshu (Link: The Faces of Evil) RVC-2 (300 Epochs)
cause v2 is usually used for models

Morshu (Link: The Faces of Evil) RVC-2 (300 Epochs)
also the audio was pretty low quality so i used a 32k target sample rate

If you want to make a better quality Morshu dataset, I recommend Elevenlabs
i will never touch elevenlabs, i hate it as a whole

Based

I think someone else did it before but I can't be bothered to find it
alright it's at 250 epochs, 50 left to go
mhm

Generate all phonemes and use that for sentence mixing
yes
high iq
it sounds like sentence mixing 😭
even tho it's ai

Morshu (From Link: The Faces of Evil) (RVC v2) 300 Epoch
Right
Could try