@ginsenshi @kaveinthran @Alan @BTyson look for the word "raw", next to it is a "download raw content" I think.
tamasg@mindly.social
Posts
-
How can I upload audio here so that people can play directly? -
Well, I think there's only one solution, folks: adding a notBeforeFlags condition to the allophone rule engine. This would let word-final aspirated stop rules say "only fire when the next phoneme ISN'T a vowel" - so "blank" gets breathiness but "back up" keeps its full aspiration. Useful for other languages too anyway, maybe.
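Here is a rough sketch of how such a condition could behave. It is not the engine's actual code: only the notBeforeFlags name comes from the post, and everything else (the Phoneme and Rule classes, the "vowel" flag, the tiny vowel set) is a hypothetical stand-in for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical, deliberately tiny vowel inventory for the demo.
VOWELS = {"a", "e", "i", "o", "u", "æ", "ʌ", "ɪ", "ə"}


@dataclass
class Phoneme:
    symbol: str
    word_final: bool = False

    @property
    def flags(self):
        # Assumed flag names; a real engine would carry richer flags.
        return {"vowel"} if self.symbol in VOWELS else {"consonant"}


@dataclass
class Rule:
    phonemes: set
    word_final: bool = True
    # Stand-in for the proposed notBeforeFlags condition.
    not_before_flags: set = field(default_factory=set)

    def fires(self, seq, i):
        cur = seq[i]
        if cur.symbol not in self.phonemes:
            return False
        if self.word_final and not cur.word_final:
            return False
        nxt = seq[i + 1] if i + 1 < len(seq) else None
        # Block the rule when the next phoneme carries a forbidden flag.
        if nxt is not None and nxt.flags & self.not_before_flags:
            return False
        return True


def words_to_phonemes(words):
    """Flatten a list of words (each a list of symbols), marking word-final ones."""
    seq = []
    for word in words:
        for j, sym in enumerate(word):
            seq.append(Phoneme(sym, word_final=(j == len(word) - 1)))
    return seq


rule = Rule(phonemes={"p", "t", "k"}, not_before_flags={"vowel"})


def fired(words):
    """Return the symbols the rule fires on, in order."""
    seq = words_to_phonemes(words)
    return [seq[i].symbol for i in range(len(seq)) if rule.fires(seq, i)]


print(fired([["b", "l", "æ", "ŋ", "k"]]))         # "blank"   -> ['k']
print(fired([["b", "æ", "k"], ["ʌ", "p"]]))       # "back up" -> ['p']
```

In "back up" the /k/ is word-final but sits before a vowel, so the rule is skipped and the stop keeps its aspiration. The utterance-final /p/ still matches.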
-
Safe cases (all good):
• "blank", "cat", "stop", "milk", "help" — word-final stop followed by silence or end of utterance. No aspiration token exists. Breathiness injects cleanly. ✓
• "crisp", "act", "fact", "craft" — final stop in a coda cluster. Only the LAST stop (/p/ in crisp, /t/ in act) is word-final. Earlier stops in the cluster aren't. ✓
• "blanket", "stopping" — stops before vowels within a word. NOT word-final. Rules don't fire. ✓
• "grab", "bad", "dog" — voiced stops. Rules list only [p, t, k]. No match. ✓
• "match", "watch" — /tʃ/ is a single affricate token, not "t". No match. ✓
• "lapse", "box", "tax" — /p/ and /k/ are before /s/ within the word, NOT word-final. /s/ is word-final but isn't in our rule's phoneme list. ✓
The problem case: "back up", "stop it"
Well fuck. -
Hmm. Claude was "Affectionately pondering aspiration scaling mechanisms" - how can you even ponder that in an affectionate way, and why would you want to? Those are my two questions there.
-
@amir @kaveinthran @Alan @BTyson yeah I know, Synfonica LLC - that's where I thought Susan Hertz now works. She was leading that successor, but as I recall it must have been 2021 or so when we heard of it starting. 5 years on and yeah, no news and very little on the web about the company itself, besides her profile that still lists it as a place she's working at.
-
@kaveinthran @Alan @BTyson you can always change the inflection scale by adding:
legacyPitchInflectionScale: 0.5 (should be less inflective)
legacyPitchInflectionScale: 1.4 (should be more inflective)
With this, people can tune how "Eloquence" it sounds while keeping that same curve formula -
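To show what "same curve, different amount of inflection" could mean, here is a minimal sketch. It assumes the scale simply multiplies each pitch point's distance from a base pitch, in Hz. That is a guess at the behaviour, not the actual legacy pitch code, and scale_inflection and the sample contour are made up for the example.

```python
def scale_inflection(pitches_hz, base_hz, scale):
    """Scale each pitch point's excursion from base_hz; the curve shape is unchanged."""
    return [base_hz + (p - base_hz) * scale for p in pitches_hz]


# Hypothetical four-point contour around a 120 Hz base pitch.
contour = [120.0, 140.0, 110.0, 100.0]

flatter = scale_inflection(contour, 120.0, 0.5)   # [120.0, 130.0, 115.0, 110.0]
livelier = scale_inflection(contour, 120.0, 1.4)  # peaks higher, valleys lower

print(flatter)
print(livelier)
```

A scale of 1.0 returns the contour untouched, which is why values below 1 sound flatter and values above 1 sound more animated.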
@kaveinthran @Alan @BTyson Haha Claude is telling me we have it as the classic pitch: "Ha! Yeah, that's literally your legacy pitch mode. It's the original NV Speech Player / ipa.py pitch algorithm - the "smooth formula-based pitch curves from April 2014" IS calculatePhonemePitches from ipa.py, which IS calculatePitchesLegacy in your C++ frontend. Same math, same declination, same stress accents, same clause-type final contours." @fastfinge @ppatel
-
@kaveinthran @Alan @BTyson this is definitely a great use of it! I'm glad I can now compare both side by side with a proper add-on! Haha. I still like TGSpeechBox for the smoother, less clicky voice on consonants, and the only reason I even added multilingualness was because it only supported UK English. Our UK is almost the same, so now I can remove some rules to fix the last few things in it and have the add-on to tune with. So yeah, thanks for making that.
-
@kaveinthran @Alan @BTyson Oooh, it sounds almost like our Classic pitch with the Eloq one, I think? Hmm. The standard one is definitely the same as Espeak_style, but I think "classic" is the Eloq one I tried to port after Brandon mentioned it. Will have to compare both side by side a bit more. For formant sharpness, I will probably make 0.9 the default (what we have as 45% on the slider), which removes that thump on words like "glottal" and the tube sound but at least doesn't make it squishy like compact. Maybe that's a better default.
-
@kaveinthran @Alan @BTyson yeah, I think we can add it as one of the pitch modes, next to Espeak_style and Classic. Just the state of Speechplayer sounding the same and either too muffled or too much like a tube makes me too sad to continue it ever again. I'm not enough of a synth expert to make everyone happy, lol. Guess what sucks is I can hear both: now at formant sharpness 40 (which is 1.0) it's muffled, but 50 is when we get more thumps. So yeah, both camps are correct in some way, sigh.
-
Gosh wow. Firefox crashes, it restores itself to the "edit release" page, and now, everything attaches twice to the release? That's just an odd one. What the heck, it shouldn't resubmit data like that.
-
@Lino0876 yeah, all versions from 2023.2 to 2026.1 are supported. I run it through each installer before releasing to make sure we can work on both types at the same time and nothing broke.
Haha and yeah, same thought LOL! -
TGSpeechBox v2.80 is here after quite some testing!
DSP v7 brings per-formant transition control. Formant frequencies now move independently of amplitude during transitions, meaning smoother consonant-to-vowel movement without the mushiness that usually comes with longer crossfades.
The new allophone rule engine lets language packs define phonological rules entirely in YAML, like intervocalic flapping, dark L, unreleased stops, and more. No C++ needed. Just add rules and the engine handles position, stress, and neighbor context automatically.
Special coarticulation gives vowels natural coloring from surrounding consonants - rhotic F3 lowering for American R, labial rounding, alveolar fronting. Small shifts that add up to noticeably more natural speech.
The Fujisaki pitch model has been rewritten with exponential declination. Long sentences no longer hit an awkward pitch floor mid-utterance.
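For anyone wondering why exponential declination avoids the pitch floor, here is a simplified sketch. It is not the TGSpeechBox code and it leaves out the phrase and accent commands a full Fujisaki model has. The function names and the numbers (120 Hz start, 90 Hz floor, 2 second time constant) are assumptions picked for the demo.

```python
import math


def linear_declination(t, start_hz, slope_hz_per_s, floor_hz):
    """Straight-line fall, clamped once it reaches the floor."""
    return max(floor_hz, start_hz - slope_hz_per_s * t)


def exponential_declination(t, start_hz, floor_hz, tau_s):
    """Decay toward the floor; it gets closer over time but never lands on it."""
    return floor_hz + (start_hz - floor_hz) * math.exp(-t / tau_s)


for t in (0.0, 1.0, 2.0, 3.0, 4.0, 6.0):
    lin = linear_declination(t, 120.0, 10.0, 90.0)
    exp = exponential_declination(t, 120.0, 90.0, 2.0)
    print(f"t={t:.1f}s  linear={lin:.1f} Hz  exponential={exp:.1f} Hz")
```

With these numbers the linear version is stuck at 90 Hz from 3 seconds onward, while the exponential one is still gently falling at 6 seconds.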
New platform: Linux AARCH64 for Raspberry Pi and ARM64 machines!
The NVDA driver fixes race conditions that could randomly interrupt speech. The tgSBPhonemeEditor now supports editing allophone and coarticulation rules directly in its GUI; the add-on does not support this yet.
To those who found the prior SpeechBox too sibilant, the Cluster Timing rules now enabled on English languages should help with this, along with the allophone rules. To those who find it too soft now, remove them from your pack and test.
https://github.com/tgeczy/TGSpeechBox/releases/download/v-280/tgSpeechBox-2026-v280.nvda-addon
https://github.com/tgeczy/TGSpeechBox/releases/download/v-280/TGSBPhonemeEditor-v280.zip
https://github.com/tgeczy/TGSpeechBox/releases/download/v-280/tgspeechbox-linux-x86_64-v280.tar.gz
https://github.com/tgeczy/TGSpeechBox/releases/download/v-280/tgspeechbox-linux-aArch_64-v280.tar.gz -
Things I hate about Espeak's English: It says "resource" not as re-source but melds the e sound into the R, and for whatever reason it says "semi" not as Eloquence's "sem-mee" but the "sem-ih" way you would call a truck. Both of these get me every time.