Description
hi, 9 min dataset, 32k sample rate, 200 epochs, snowiev3 pretrain, rmvpe, no index cause i never use that honestly
Comments

egirl voice inflation
I'm not gonna lie blyuv, your last 3 models have sounded like the exact same person lol
🤣 🤣 🤣

sounds like merged voices to me
yeahhh

i have so many female voices at this point

i can just merge them and make infinite

i just dont post them bc people dont credit anymore

or try reselling them
i merge different ones, idk why they sound the same in the end
but i merge and then train them in rvc

if you merge similar voices they'll sound kinda the same
rvc has too many limits honestly, i think it's because of that
if you don't have a merge folder with hundreds of different merges are you really a realtime enjoyer
let's be honest here
this is 3 merged models i made and then remade in rvc again, using an inference of the merge as the new dataset
but i think rvc has so many pitch and tone limits that they end up sounding the same
ah that's why
you should just post the merge
instead of making a new model of the inference of the merge
cuz it sounds worse that way
yeah i know
but im not giving away like good models or merges xD
so i just post what i recycle
like this one
ah, fair enough
i heard they're gonna release a new "good" pretrain
But like, what kinda voices are you merging. Like 3 egirl models together, or are you merging different kinda voices?
this one is 3 different vtubers
yeahhhh
that explains it
it sounds good on singing at least owo
merging gives the best dividends when you're merging voices that are pretty distinct from one another, but then again I am assuming what kinda vtubers you're using as data xD
haha to be honest i just pick random good quality vtuber voices
dont even know the names i just listen to the sound of the voices lol
lmao
based
and pick 3 or 4 make dataset make model and merge them
the lazy part is just cleaning the audio
yeah it's pretty tedious
Titan, but it's still just a general pretrain so it's not going to be huge or anything. ~~I am finetuning a pretrain for English female voices in particular~~
like idk how it's going to turn out, but my hope is that it gives better results for choosing a specialty for that data.
rvc makes me feel so frustrated ;-; i hope they release a new tech soon
what's wrong with it
i mean its good but once you get used to it, you realize it has a lot of limitations

play around with diff-svc
and not good quality parts

way better for singing

and overall vocal range
tbh i just use it for realtime
i dont care about the singing or inference lol

realtime has its limitations

since rvc models are trained on voice

any non voice sounds will fuck up

like coughing, laughing, sneeze and more
what limitations are you talking about in terms of realtime?

prob those i mentioned
the pitch and tone is not the same as the dataset

also tone change

like whispering
Well I can manage laughing with realtime, but not like full on belly laughs
more like giggles
when you train a model it doesnt detect all the sounds of the voice even if you boost all the dynamics

dont expect the model to translate emotions if you talk act girly

some guys use it talking like a boy

it becomes just a guy with a girl voice
i know, im not saying it's bad but it's not the same as the dataset
its never the same as dataset

of course

it wont ever be 1:1
yeah, that is a problem
yeah thats why i say it has limitations
we call that... a tomboy

i wouldnt say its a technology limitation

it does what it's supposed to do

it doesnt convert a TTS to a human voice

nor someone with no emotion to something with emotion
But like, I only use merges anyways so I don't really care about it lining up with the source audio 100%
yeah im not talking about the emotion part
as long as it sounds like I want it to after the fact
but yes the pitch tone and quality
its never the same as the dataset
like if you clone your own voice
it wont sound the same
xD
similar and accurate yes
but not the same
rvc gives you its best guess after you train
basically lol
and to get back to this, the solution is not to act like something you're not, but to sculpt a voice towards your particular speaking style 🤔
by merging different types of speakers
into a ratio that just vibes with your speech patterns
~~unless you think every woman is a bubbly valley girl lmao~~
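For anyone wondering what the "ratio" part means in practice: a merge is roughly a weighted average of the models' parameters. Here's a minimal sketch, assuming the models share the same architecture and parameter names. `merge_weights` and the plain dict-of-arrays layout are hypothetical stand-ins for illustration, not the file format or API of any actual RVC tool.

```python
import numpy as np

def merge_weights(models, ratios):
    """Blend several models' parameters with a weighted average.

    models: list of dicts mapping parameter name -> array (hypothetical layout)
    ratios: one non-negative number per model, e.g. [3, 1] for a 75/25 blend
    """
    if len(models) != len(ratios):
        raise ValueError("need one ratio per model")
    total = float(sum(ratios))
    if total <= 0:
        raise ValueError("ratios must sum to a positive number")
    weights = [r / total for r in ratios]  # normalize so the weights sum to 1

    keys = models[0].keys()
    for m in models[1:]:
        if m.keys() != keys:
            raise ValueError("models must share the same parameter names")

    merged = {}
    for k in keys:
        arrays = [np.asarray(m[k], dtype=np.float64) for m in models]
        if any(a.shape != arrays[0].shape for a in arrays):
            raise ValueError(f"shape mismatch for parameter {k!r}")
        # Weighted sum of the same parameter across all models
        merged[k] = sum(w * a for w, a in zip(weights, arrays))
    return merged
```

So `merge_weights([a, b, c], [2, 1, 1])` would lean 50% towards `a` and 25% each towards `b` and `c`. Trying a new ratio is just another call, which is why testing ratios this way is so much cheaper than retraining on a combined dataset each time.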
yeah i mean all im saying is even if you clone your own voice it wont sound the same, not the talking part and the voice acting, i mean the pitch and tone would never be the same, that is the limitation im referring to
~~ I was talking to MrModz~~
I get what'cha mean.
101 comments damn XD

You should try training all the datasets together instead of merging

gives better results imo
it'd be super tedious to test different ratios of voices that way

yea

but better sounding results
example?


thats a model combining 3 voices in the training phase
i tried that too but idk why i get better results with merges, maybe it's my voice or idk
on inference it sounds really good, but realtime is another story

well do the voices you combine sound similar?
i dont remember, but i know i did that experiment xD
yeah I don't think it's better than merges 100%
Like you can prob get similar results with both methods
but it's like way more of a pain in the ass having to train a new model each time

i agree but just from what ive done training them has gotten me better results
oh yeah blyuv, have you messed around with using plugins on your models to make them sound more realistic
like what plugins
I see you in the voice chat here a lot so I know you use real time quite a bit, but idk if you're interested in making it sound more like a real mic
i tried to use some vst
but idk why i prefer more virgin audio xD
Equalization on the voice to quiet the robotic parts of the voice and make other parts louder, convolutions on your model to imitate the sound of a bad mic more, and like bit crushers to make it sound less unnaturally clear
yeah that's fair enough, I just use them on the model while I'm playing games or whatever
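To give a rough idea of what two of those effects do to the samples, here's a minimal sketch, assuming float audio in the range -1 to 1. `bitcrush` and `convolve_ir` are hypothetical helper names for illustration, not how any particular VST plugin is implemented, and the impulse response would be something you record or download yourself.

```python
import numpy as np

def bitcrush(signal, bits=8, hold=1):
    """Quantize to `bits` of depth and optionally hold each sample for `hold` frames."""
    x = np.clip(np.asarray(signal, dtype=np.float64), -1.0, 1.0)
    levels = 2 ** (bits - 1)
    crushed = np.round(x * levels) / levels  # snap each sample to the nearest step
    if hold > 1:
        # Sample-and-hold: repeat every hold-th sample to fake a lower sample rate
        idx = (np.arange(len(crushed)) // hold) * hold
        crushed = crushed[idx]
    return np.clip(crushed, -1.0, 1.0)

def convolve_ir(signal, impulse_response):
    """Colour the signal with an impulse response (e.g. a cheap mic), trimmed to input length."""
    x = np.asarray(signal, dtype=np.float64)
    wet = np.convolve(x, impulse_response)[: len(x)]
    peak = np.max(np.abs(wet))
    # Only scale down if the result would clip
    return wet / peak if peak > 1.0 else wet
```

Something like `bitcrush(convolve_ir(voice, mic_ir), bits=12)` would chain the two. For realtime use you'd still want actual plugins, since this processes a whole buffer at once.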

How would i go about doing this?
makes it so no-one can tell its ai
Well first you need something like Elgato Wave Link or Voicemeeter to apply plugins to your virtual cable

I have Voicemeeter
also mares do you know how to change the sample rate in w-okada on the client side?
i have my windows on 44khz my mic at 44khz and everything on 44khz but when i output audio from w-okada its 48k
idk, I don't use okada
mine output to 44khz just fine on go_realtime_gui.bat
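This doesn't answer where the setting lives in the client, but for context on the mismatch itself: going from 48 kHz to 44.1 kHz is a 147/160 resampling step. Here's a minimal sketch, assuming mono float audio. `resample_linear` is a hypothetical helper that uses plain linear interpolation as a crude stand-in for the filtered resampling a real audio tool would use.

```python
from fractions import Fraction
import numpy as np

def resample_linear(signal, src_rate, dst_rate):
    """Resample by linear interpolation (rough; no anti-aliasing filter)."""
    x = np.asarray(signal, dtype=np.float64)
    if src_rate == dst_rate or len(x) == 0:
        return x.copy()
    n_out = int(round(len(x) * dst_rate / src_rate))
    src_times = np.arange(len(x)) / src_rate   # timestamp of each input sample
    dst_times = np.arange(n_out) / dst_rate    # timestamp of each output sample
    return np.interp(dst_times, src_times, x)

# 44100 / 48000 reduces to 147 / 160
ratio = Fraction(44100, 48000)
```

One second of 48 kHz audio (48000 samples) comes out as 44100 samples. If something in the chain is converting like this behind your back, that's an extra processing step, so it's worth getting every device onto the same rate if you can.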
Well you get all these plugins then

I usually save models from others under "Model (by Author)"
and for the EQ what you really wanna do is basically soften the spectrum on the sides here, since a lot of the unnatural clearness on voices I have noticed comes from like, 50 Hz to 200 Hz.

do i need to install another thing thats not Voicemeeter? bc there is no plugins menu that i can find. unless im dumb
I think you need Voicemeeter Banana

i do have that
But if you have an Elgato mic just use Wave Link imo
Yeah I don't use that, so idk. I just know other people have used plugins on it.

i dont
But yeah here's an example of what it sounds like with all of those plugins on a voice.
I usually don't have the bit crushing on it from Cymatics Origin, but I had that on too just for example's sake.
Like most voice things compress your voice enough where you prob don't need the bitcrushing with a good voice
Tbh you should make some guides about how to clean datasets.
Huh?
I don't have any great secrets, I only train models on data that already has zero background noise lol
I mean, an updated guide about dataset cleanup and plugin usage for cleanup.
🐢 👍
teamongus
And specifically these plugins were for real time, but you're right people could probably apply some of them to inferenced audio