Here's what the raw data of the unit selection db file for SpeakEasy sounds like.

rommix0@mindly.social

Here's what the raw data of the unit selection db file for SpeakEasy sounds like. It gives you a good idea of how unit selection synthesis works.

datajake1999@dragonscave.space

@rommix0 If my theory about the engine using MBROLA as a backend is correct, what we are actually hearing are the building blocks for a diphone based voice. When I listened to the clip you posted, it sounded familiar, and after listening to the clip of the synth itself, I remembered listening to the raw data of EN1 a while back.

rommix0@mindly.social

@datajake1999 in a way, yeah. It's like MBROLA, but not actually MBROLA since SpeakEasy is proprietary.

borrisinabox@fwoof.space

@rommix0 is that 8-bit linear PCM?

rommix0@mindly.social

@BorrisInABox nah, it's 16 bit. I'm sure the data has subheaders.

borrisinabox@fwoof.space

@rommix0 Wow, really? Sounds very 8-bit or less in that clip.

rommix0@mindly.social

@BorrisInABox you would think that

x0@dragonscave.space

@rommix0 @BorrisInABox Or some kind of compression.

rommix0@mindly.social

@x0 @BorrisInABox yeah like adpcm

alexchapman@tweesecake.social

@rommix0 @datajake1999 Yeah I definitely recognise that as diphone based synthesis.

spacepup@mastodon.stickbear.me

@rommix0 i think it's diphone, specifically, the MBROLA en1 database

rommix0@mindly.social

@spacepup Nah. en1 is English but with a foreign speaker. The speaker used for SpeakEasy is not a foreigner.

alan@dragonscave.space

@rommix0 sounds like a fucked up vocal warmup

keao@caneandable.social

@rommix0 what is speak easy?

rommix0@mindly.social

@keao tts synth
