Description
Rediscovered this show thanks to theodd1sout. I vaguely remember watching this show as a kid a few times, and from what little I remember, these 2 were my favorites. Was originally just gonna do one of them, but then I quickly realized "oh shit, I'm gonna have to do both." Datasets are both 1 minute (give or take), taken from a clip compilation I found that focused on them specifically; luckily, it had a bit of singing data I could use. Though I didn't remember what they sounded like, I worked it out with a combination of my memory and context. Zak: https://huggingface.co/Xhepyxopila/MiscCartoonModelsRVC/resolve/main/Zak.zip Wheezie: https://huggingface.co/Xhepyxopila/MiscCartoonModelsRVC/resolve/main/Wheezie.zip
Comments

Rediscovered this show thanks to theodd1sout. I vaguely remember watching this show as a kid a few times, and from what little I remember, these 2 were my favorites. Was originally just gonna do one of them, but then I quickly realized "oh shit, I'm gonna have to do both." Datasets are both 1 minute (give or take), taken from a clip compilation I found that focused on them specifically; luckily, it had a bit of singing data I could use. Though I didn't remember what they sounded like, I worked it out with a combination of my memory and context. Zak: https://huggingface.co/Xhepyxopila/MiscCartoonModelsRVC/resolve/main/Zak.zip Wheezie: https://huggingface.co/Xhepyxopila/MiscCartoonModelsRVC/resolve/main/Wheezie.zip

I don't usually share examples like this, but I liked this one enough to share it here. Made FnF chromatics using these models + some audio SaltyDKDan recorded so people could make a chromatic scale from his voice.

i REALLY love how this one turned out
Here's a comparison to the previous Zak and Wheezie models. https://discord.com/channels/1089076875999072296/1141713417930031195 https://discord.com/channels/1089076875999072296/1141712665270558870
Wheezie: (+6 for both instrumental and vocals)
Interesting results...
I'm not sure if it's due to the dataset, or RVC v2 48k, or whatever else could be causing it, but the Zhepy models sound better than the Clementine ones.
Like there's more detail in the voice.

the weird thing about this to me is that they used 48k RMVPE with a 9x longer dataset and somehow my models sound better
It could be a few things.
1. Singing data might not be in the other model's dataset.
2. RMVPE isn't as good with detail compared to mangio-crepe.
3. 48k could cause issues with some datasets, causing artifacts.
4. Whatever source and cleanup were used.
Checking a talking test, it's the same result here.
The _1 tests are yours,
and the standard-named ones are Clementine's.

honestly i couldn't tell much of a difference at first

but then i listened closer
Would be curious to see, if you make an Ord and Cassie model, whether they have similar differences.

maybe at some point i will