Serval - Honkai: Star Rail

Fictional, RVC v2, English
__june
1 year ago
👀 1k, 👍 12, 🪄 280

Description

(CV: Natalie Van Sistine) 40K, 8 hop length. Trained on 28 minutes of in-game dialogue.

Comments

__june
1 year ago

chose 1000 epochs bc it had less artifacting

__june
1 year ago

can't do covers cause im abroad rn lol

dolyfin
1 year ago

wait isnt this over trained to oblivion

__june
1 year ago

listen to this

dolyfin
1 year ago

what steps was e250 at

dolyfin
1 year ago

thats strange that it sounds better?

dolyfin
1 year ago

is that inferenced over the original voiceline?

__june
1 year ago

yes

__june
1 year ago

the voice sounds fine even at 1000 epochs so thats why i kept it

__june
1 year ago

¯\_(ツ)_/¯

__june
1 year ago

also because i was using mangio to infer

__june
1 year ago

with rmvpe these problems are usually non-existent

__june
1 year ago

i was testing using worst-case scenario (in terms of sibilant artifacts)

__june
1 year ago

240 epochs was 27k so im assuming 250 was at around 28/29k
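(The estimate above can be checked with a quick linear extrapolation. This is only a sketch: it assumes the step count grows by a constant amount per epoch and treats the rounded "27k" figure as exact. The function name `estimate_steps` is made up for illustration and is not part of any RVC tool.)

```python
def estimate_steps(known_epoch: int, known_steps: float, target_epoch: int) -> float:
    """Linearly extrapolate the step count at target_epoch.

    Assumes every epoch contributes the same number of steps.
    """
    steps_per_epoch = known_steps / known_epoch
    return steps_per_epoch * target_epoch


# Figures from the comment: epoch 240 was at roughly 27k steps.
print(estimate_steps(240, 27_000, 250))
```

Under those assumptions epoch 250 lands at about 28k steps, which matches the 28/29k guess.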

dolyfin
1 year ago

i think the graph smoothing was too much

dolyfin
1 year ago

do you have a 0.95 graph still

__june
1 year ago

nope

dolyfin
1 year ago

unfortunate then

__june
1 year ago

it does display the trend properly though

__june
1 year ago

it's not like this is an exact science so

dolyfin
1 year ago

sometimes the graph skews up at the start quite a bit when at 0.999
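(A rough illustration of why heavy smoothing can keep the start of a loss curve skewed upward. This is a generic exponential moving average seeded with the first value, which is one common way such smoothing is done; the exact formula used by the graphing tool in this thread is not stated and may differ. The loss values below are made up.)

```python
def ema_smooth(values, weight):
    """Exponential moving average seeded with the first value.

    weight close to 1.0 means heavier smoothing (slower to react).
    """
    smoothed = []
    last = values[0]
    for v in values:
        last = weight * last + (1 - weight) * v
        smoothed.append(last)
    return smoothed


# Hypothetical curve: one high loss at the start, then flat at 1.0.
losses = [10.0] + [1.0] * 99

heavy = ema_smooth(losses, 0.999)  # smoothing level mentioned in the thread
light = ema_smooth(losses, 0.95)   # the lighter setting dolyfin asked about

print("0.999 smoothing, last point:", heavy[-1])
print("0.95 smoothing, last point:", light[-1])
```

With this toy data the 0.999 curve is still sitting close to the initial spike after 100 points, while the 0.95 curve has already settled near the flat value.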

__june
1 year ago

nah its a normal trend for all these voices

__june
1 year ago

sampo natasha tingyun graphs btw

__june
1 year ago

all at 0.999 smoothing

dolyfin
1 year ago

seems like there is always 2 dips

__june
1 year ago

yep

dolyfin
1 year ago

although a 50k serval one might be good

dolyfin
1 year ago

ill have to test more

dolyfin
1 year ago

idk why my models are quite fine at low epoch

__june
1 year ago

i was using mangio-crepe to infer, which struggles a lot when it comes to sibilants

__june
1 year ago

rmvpe usually has no issues with that

dolyfin
1 year ago

although i dont think inferencing over the original audio is a good idea cause it might overfit?

__june
1 year ago

yeah it's just an easy way to determine how "faithful" the model is

__june
1 year ago

that's why i add multiple examples

__june
1 year ago

like a singing test, a tts test, and the original audio test

__june
1 year ago

to show the model's stability

Samples

1. Singing (Male, English)
2. Singing (Female, English)
3. Singing (Dry) (Female, English)
4. Singing (High) (Female, English)
5. Singing 2 (Male, English)
6. Singing (Dry) (Male, English)
7. Singing (Dry, High) (Male, English)
