@Tamasg I wonder where NVSP/TGSB got the main voice from?

danestange@caneandable.social

@Tamasg I wonder where NVSP/TGSB got the main voice from? He sounds very interesting. Can his voice be mathematically turned into someone else's? Could his s's, k's, p's, d's, t's and sh's be turned into faidy ones like wintalker's? I have so many random thoughts. Could it get female voices like wintalker or DECtalk have? The trouble is there's not much open formant TTS research in this field in 2026, but I'd imagine you've already thought about all this stuff. It's not about faidy, there's some magic to how Mark made the unvoiced sounds sound so human at 11k upped to 22.

tamasg@mindly.social

@danestange yeah, that's like, baked into the formant tables. Speechplayer and Speechbox still sound a bit different, but if you compared the two, we just have less clickiness in vowels and maybe a smoother quality to that voice. The overall personality of the voice is the same though: you could probably give someone Speechplayer and Speechbox and they would recognize Speechbox's voice right away as Speechplayer's for that reason. Ha. I tried importing the MacinTalk formants, but the problem is that what their frame gets in data and what ours gets can differ, since MacinTalk uses different bandwidths and their resonators are not set up the same way as ours. So you would actually need to scale the phoneme tables from theirs to ours, and then it probably would have that Fred personality. It would be a total dead end though, as it's so not legal. Something worth trying out locally, and if you grabbed Xcode you could technically build your own Speechbox copy, but it requires that stupid $99 developer account so you can sign it with your own developer keys and such, bla. Eventually having an "import phonemes and packs" button on that editor tab will solve that one.
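
To make the bandwidth point concrete, here is a minimal sketch of a textbook Klatt-style second-order resonator. It is not the actual Speechbox, Speechplayer or MacinTalk code, and the function names and the numbers (500 Hz, 60 Hz vs 120 Hz, 22050 Hz) are made up for illustration. It only shows that the same formant frequency with a different bandwidth gives different filter coefficients and a different peak level, which is why a formant table tuned for one synth would need rescaling before it sounds right in another.

```python
import cmath
import math


def resonator_coeffs(freq_hz, bandwidth_hz, sample_rate_hz):
    """Coefficients for y[n] = a*x[n] + b*y[n-1] + c*y[n-2].

    Standard Klatt-style digital resonator, normalised to unity gain at DC.
    """
    t = 1.0 / sample_rate_hz
    c = -math.exp(-2.0 * math.pi * bandwidth_hz * t)
    b = 2.0 * math.exp(-math.pi * bandwidth_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    a = 1.0 - b - c
    return a, b, c


def peak_gain(freq_hz, bandwidth_hz, sample_rate_hz):
    """Magnitude of the resonator's response at its own centre frequency."""
    a, b, c = resonator_coeffs(freq_hz, bandwidth_hz, sample_rate_hz)
    z_inv = cmath.exp(-1j * 2.0 * math.pi * freq_hz / sample_rate_hz)
    return abs(a / (1.0 - b * z_inv - c * z_inv * z_inv))


if __name__ == "__main__":
    # Hypothetical formant: same centre frequency, two different bandwidths.
    for bw in (60.0, 120.0):
        print(bw, resonator_coeffs(500.0, bw, 22050.0), peak_gain(500.0, bw, 22050.0))
```

The narrower bandwidth produces a taller, sharper peak, so copying only the formant frequencies across while the bandwidths and resonator setup differ would change the balance between formants.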

tamasg@mindly.social

@danestange fixed it. The issue was in rate compensation, which had this at 10. You can even go into your editor tab, find en-us as the language, and find this setting: semivowelMs: 10
Change it to 30. Then test "do you" in Classic Pitch, it will work far better.

danestange@caneandable.social

@Tamasg Holy fucklet! So wow, this thing is gonna be insane!! I wonder what the eSpeak accent will sound like? This might mean English accents can happen, and Speechbox's Speechplayer can finally drive a Scottish accent. I'm so curious and excited to see this insanity if you push it or are satisfied with it, unless you end up handcrafting and hand-tuning each language still, which, hmm. It's all up to you, obviously. I am so excited man, this is one of my biggest obsessions apart from openclaw. I wonder if I could ask openclaw these questions, given it has the repo on my Mac? Hmm. Though you already have your own agent and stuff like this and have gone way into the weeds. You know more about it than an AI might. You also have plans for it and fun things that us users have no idea about.

danestange@caneandable.social

@Tamasg Wow, thanks for this! You're awesome!!

tamasg@mindly.social

@danestange well, this is how you nail a bug down! That's why I love how your brain works: you give such good perceptual detail about what's sounding off, where, and how it sounds to your ears that pinning this one down was like a 30 minute thing in the end. Really not bad, and now everyone who uses Classic Pitch will benefit. But yeah, my Claude Code has so many markdown and memory files, 200 lines of memory, it knows about every bug, the 20 ways I tried to solve the diphthong shimmer saga, what would change for future features and how... At this point that project folder is just a markdown set for a brand new Formant Synthesis textbook, ha. So many learnings, wisdom, and experiments recorded in there, it's insane. But it also makes Claude Code instantly know what my problems and issues are when I open it in the project.
