TF2 Spy | Dennis Bateman_e16-GPT_e15-SoVITS

Name: TF2 Spy | Dennis Bateman_e16-GPT_e15-SoVITS | AI RVC Model
Brand: Weights
Rating: 4.6 (2 reviews)

English, GPT Sovits, TTS / Realtime, Fictional

lusbert_

1 year ago

👀

168

Description

TLDR: the SoVITS is 15e with GPT being 16e, and GPT is DPO trained

notes (for people who wanna learn more stuff about my experience with GPT-SoVITS):

dataset size is 21 minutes
batch size 2 on GPT and SoVITS
SoVITS Learning Rate 0.4
15 epochs SoVITS took around 40 minutes
16 epochs GPT 27 minutes with DPO
saving freq 2

at first i trained GPT to 20 epochs but the last epoch saved was e18, prob cause my saving freq was 3, so ye be sure to look out for that. as for the SoVITS model it was fine, most likely since i trained to 15 epochs and not 20. so it's prob smth to do with whether the modulo of the epoch count is 0 or not, so like [epoch count] mod [saving freq], with mod being % in python. in my case i trained to 20 epochs with a saving freq of 3, and 20 % 3 is 2 aka non 0, so thats BAD :3 so just make sure the mod is always 0 so that you train to the full epoch count

i didnt like the model when i first trained without DPO so i retrained the GPT model with DPO. it also went from using 6GB VRAM to using 8GB, and this time it took 35 minutes

also apparently in SOME cases the default pretrained GPT model is better, often where new words are said or the punctuation is different i'd say? but im not really sure
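The saving-frequency issue described above can be sketched in a few lines of Python. This is only a rough model of the behavior observed here, assuming a checkpoint is written only on epochs where the epoch number is a multiple of the saving frequency. It is not taken from the GPT-SoVITS source, and the function names are made up for illustration.

```python
# Rough sketch of the observed checkpoint behavior (an assumption,
# not GPT-SoVITS code): a checkpoint is only written when
# epoch % save_every == 0. All names here are hypothetical.

def last_saved_epoch(total_epochs: int, save_every: int) -> int:
    """Return the last epoch that would get a checkpoint."""
    if total_epochs < 0 or save_every <= 0:
        raise ValueError("total_epochs must be >= 0 and save_every > 0")
    # Epochs past the last multiple of save_every are trained but never saved.
    return total_epochs - (total_epochs % save_every)


def next_aligned_total(total_epochs: int, save_every: int) -> int:
    """Round total_epochs up to the nearest multiple of save_every."""
    if total_epochs < 0 or save_every <= 0:
        raise ValueError("total_epochs must be >= 0 and save_every > 0")
    return -(-total_epochs // save_every) * save_every


if __name__ == "__main__":
    # The GPT run from the notes: 20 epochs, saving freq 3.
    print(last_saved_epoch(20, 3))    # 18, the last two epochs are lost
    # The runs that worked out: 15 epochs / freq 3 and 16 epochs / freq 2.
    print(last_saved_epoch(15, 3))    # 15
    print(last_saved_epoch(16, 2))    # 16
    # An epoch count that would not waste any training with freq 3.
    print(next_aligned_total(20, 3))  # 21
```

So before starting a run, checking that `total_epochs % save_every == 0` (or bumping the epoch count up to the next multiple) should avoid training epochs that never get saved, at least under the assumption stated above.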

Comments

lusbert_

1 year ago

wall texting in question

lusbert_

1 year ago

it actually got the accent pretty well ngl

realpikachuwu

1 year ago

yap

realpikachuwu

1 year ago

joke

realpikachuwu

1 year ago

but still yap

malik

1 year ago

Bro you don’t have to write a fking article

lusbert_

1 year ago

also when it says "Two, I shall" it somehow connects the two words

lusbert_

1 year ago

idfc i was going to explain my experience i put a TLDR and those who care can read the rest

lusbert_

1 year ago

since if someone wanted to do this on their own they'd be able to benefit from a good explanation

Kimid

1 year ago

he isnt wrong about SD3 never going to come out

lusbert_

1 year ago

true

lusbert_

1 year ago

spy only speaks facts

lusbert_

1 year ago

for SD 3 i had to say `ess dee three` instead

Kimid

1 year ago

its been "so long" since any updates

lusbert_

1 year ago

eh like 3 days of radio silence and then a new sudden update

lusbert_

1 year ago

but fyi the researchers who started Stable Diffusion at Stability AI resigned so expect slower development

Kimid

1 year ago

Ik its "Forever" for the people who expect them to release it immediately

lusbert_

1 year ago

well they are teasing the people either way so there's more impatience on top of that

Kimid

1 year ago

which sucks kinda

lusbert_

1 year ago

yep

Kimid

1 year ago

by the time its out I bet RVC V3 would be out for a week or 2

lusbert_

1 year ago

nah i expect RVC to be fully abandoned by then cause RVC v3 pretrains didnt make much improvements either way

lusbert_

1 year ago

and RVC is slowly dying

Kimid

1 year ago

Oh man that sucks

lusbert_

1 year ago

hence the server is dying with it same for help chats

lusbert_

1 year ago

ye but well people move from one thing to another thing when they get bored

lusbert_

1 year ago

me when only 20 models every day while old AI hub had like 400 every day or even more? idfk i cant remember

Kimid

1 year ago

I feel like sovits models are easier and require less effort to train, you dont need to clean the audio nearly as much

Kimid

1 year ago

I meant GPT SoVITS

lusbert_

1 year ago

oh ye

lusbert_

1 year ago

thats for sure its easier to train as well

lusbert_

1 year ago

like in terms of being faster

Kimid

1 year ago

much faster

Kimid

1 year ago

I wonder if its possible to fix the GPT part

lusbert_

1 year ago

wdym

Kimid

1 year ago

Make it quicker to train

lusbert_

1 year ago

GPT is already super fast..

lusbert_

1 year ago

well it depends on the dataset size and if DPO or not but still

Kimid

1 year ago

How do you disable DPO

lusbert_

1 year ago

in my tests it was always faster than SoVITS training

lusbert_

1 year ago

its a check mark in the webui by default its disabled

scruffygamer (RVC Commissions)

1 year ago

RVC is dying? shit

scruffygamer (RVC Commissions)

1 year ago

welp gotta wait for someone to make a successor to it

Kimid

1 year ago

Oh ok

lusbert_

1 year ago

GPT-SoVITS imo but well its TTS but well for me TTS is just easier and better and more useful

scruffygamer (RVC Commissions)

1 year ago

it doesnt sound that great to me. not as versatile as RVC

Kimid

1 year ago

It sucks but I think we should still make RVC models

scruffygamer (RVC Commissions)

1 year ago

i mean dont get me wrong it's better than tacotron

lusbert_

1 year ago

here "enable DPO" https://i.postimg.cc/8zvTJBcV/image.png

Kimid

1 year ago

Oh yeah that is disabled

lusbert_

1 year ago

ye by default its that

scruffygamer (RVC Commissions)

1 year ago

i went from talknet, to sovits, to rvc v1, to v2

lusbert_

1 year ago

it gets the VRAM usage higher by like 2GB iirc

lusbert_

1 year ago

idfk where it'll go next

scruffygamer (RVC Commissions)

1 year ago

give it some time i'm sure a new AI will make RVC v2 look bad in comparison lol

lusbert_

1 year ago

RVC is more like for singing tbh GPT-SoVITS has higher potential i think

scruffygamer (RVC Commissions)

1 year ago

how so

Kimid

1 year ago

All that retraining everyone's gonna have to do

scruffygamer (RVC Commissions)

1 year ago

how is it better than a singing AI

lusbert_

1 year ago

its just faster and also does better with accent

lusbert_

1 year ago

no no i mean speaking

lusbert_

1 year ago

TTS speaking on GPT-SoVITS and RVC for singing

Kimid

1 year ago

And inflection and prosody

lusbert_

1 year ago

for my use cases TTS is better

lusbert_

1 year ago

lemme google

lusbert_

1 year ago

oh reasonable

Kimid

1 year ago

Yeah

lusbert_

1 year ago

imo GPT-SoVITS with RVC can be even better honestly

lusbert_

1 year ago

i should try it one time but too lazy rn

Kimid

1 year ago

Oh yeah that works very well i tried it and it helped more

lusbert_

1 year ago

ye i'd expect it so cause like GPT-SoVITS can carry the accent and RVC can carry the "voice" of it

Kimid

1 year ago

GPT Sovits>Elevenlabs for sure

scruffygamer (RVC Commissions)

1 year ago

I think Elevenlabs is still more realistic but GPT Sovits is more versatile

Kimid

1 year ago

Yeah its more realistic but GPT SoVITS is free and open source and you can fine tune

gabrielpika145

1 year ago

ElevenLabs sometimes doesn't get the voice right when you AI clone a voice, and it's higher quality. GPT SoVITS can create the most accurate voices through TTS, surpassing Tacotron2 and other TTS AI software.

Kimid

1 year ago

I tried doing Mario's voice through it with 90 percent style exaggeration and it doesn't sound as good as SoVITS

gabrielpika145

1 year ago

From what I heard from my friend, Mario was the most difficult voice to make accurately with AI software, and SoVITS somewhat solved it, because his pitch/tone goes up and down while most voices stay closer to their pitch range.

gabrielpika145

1 year ago

I don't think RVC will die per se imo, but it is slowly fading away in the models section, though it's still popular regardless. I think it's especially useful as a voice changer and a singing tool.

Kimid

1 year ago

RVC is still good for covers and somewhat voice conversion

gabrielpika145

1 year ago

and even for other sounds and sfxs like drums

Kimid

1 year ago

Oh yeah that too

Kimid

1 year ago

I'm probably still going to make RVC models

gabrielpika145

1 year ago

I'm still gonna make them because after all I still have a huge list of models I've got to train. I just wish I could get a new GPU or a way to train using my computer's CPU.

gabrielpika145

1 year ago

because I'm able to run RVC GUI with RMVPE using my CPU (thanks to my friend modding it) and RVC Realtime using my GPU or CPU, I think, and I haven't bumped into any problems at all. I only have an old GPU from 2017 that has around 2GB of VRAM.

RC4

1 year ago

who actually uses TT2

RC4

1 year ago

Thats one of the worst tts voice cloning i've ever seen

gabrielpika145

1 year ago

Uberduck, Fakeyou and the ai streams really

gabrielpika145

1 year ago

but they don't know that GPT SoVITS exists

gabrielpika145

1 year ago

at least I find the strokes funny

RC4

1 year ago

echelon knows

scruffygamer (RVC Commissions)

1 year ago

i used to make models for uberduck

scruffygamer (RVC Commissions)

1 year ago

good times

Kimid

1 year ago

Did you use tacotron?

scruffygamer (RVC Commissions)

1 year ago

yep

Kimid

1 year ago

I bet that had to get lots of data

scruffygamer (RVC Commissions)

1 year ago

i've made pretty decent models off of less than 30 seconds

Kimid

1 year ago

like a ton and probably took forever to train

Kimid

1 year ago

Oh wow I didnt think tacotron could do that I thought you have to have alot of data

scruffygamer (RVC Commissions)

1 year ago

having a lotta data is good too

scruffygamer (RVC Commissions)

1 year ago

havent touched tacotron in quite a while

Kimid

1 year ago

I mean hours and hours

scruffygamer (RVC Commissions)

1 year ago

eh like 3 hours usually

scruffygamer (RVC Commissions)

1 year ago

or less

Kimid

1 year ago

Oh that isnt as bad as I thought

gabrielpika145

1 year ago

I attempted to make my first TT2 model through uberduck but that went so bad. I did try again but that failed too. Now my friend makes the TT2 models for me instead.

scruffygamer (RVC Commissions)

1 year ago

i have yet to train with GPT Sovits but i'll def give it a go in the near future

gabrielpika145

1 year ago

but we made them for a collaborative ai stream project and so far its going pretty well.

gabrielpika145

1 year ago

and yes it would take over 9-12 hours to train each of these models which he locally trained on his Linux computer or his main idk

scruffygamer (RVC Commissions)

1 year ago

does weights.gg not have GPT sovits support yet?