TGSpeechBox v3.0-beta2 is here

    tamasg@mindly.social
    wrote last edited by
    #1

    TGSpeechBox v3.0-beta2 is here!
    Big changes: SAPI is now a single DLL! No more juggling speechPlayer.dll, nvspFrontend.dll, and libespeak-ng.dll. Runtime code cut nearly in half.
    Per-voice settings on mobile! Each voice remembers its own tuning when you switch.
    Diphthong quality leap: bandwidths now interpolate alongside frequencies during glides, fixing the "shaky" vowel quality reported by users. Onset settle time lets resonators establish before the glide begins. Adaptive hold scales with formant distance so narrow diphthongs don't stall and wide ones don't smear.
    Fixes: limiter no longer pumps at pitch rate (goodbye "same page" grittiness), stops no longer sound like affricates at low volume (/t/→/tʃ/ and /p/→/f/ gone), quote-aware clause splitting fixes intonation through dialogue, and iOS clicking from rapid VoiceOver swiping is resolved.
    Also: GOAT monophthongization, velar word-final boost, sample rate picker on iOS, pause mode on Android, and a bunch more.
    https://github.com/tgeczy/TGSpeechBox/releases/download/v-300b2/tgSpeechBox-v300b2.nvda-addon
    https://github.com/tgeczy/TGSpeechBox/releases/download/v-300b2/tgsbPhonemeEditorWin32-v300b2.zip
    https://github.com/tgeczy/TGSpeechBox/releases/download/v-300b2/TGSpeechSapiSetup-v300b2.exe
    https://github.com/tgeczy/TGSpeechBox/releases/download/v-300b2/TGSpeechBox-v300b2.apk
    https://github.com/tgeczy/TGSpeechBox/releases/download/v-300b2/tgspeechbox-linux-x86_64-v-300b2.tar.gz
    https://github.com/tgeczy/TGSpeechBox/releases/download/v-300b2/tgspeechbox-linux-aarch64-v-300b2.tar.gz
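
To make the diphthong changes above more concrete, here is a minimal sketch of the idea in Python. It is not TGSpeechBox's actual code: the names `Formant`, `adaptive_hold_frames`, and `glide`, the frame counts, and the 600 Hz reference distance are all hypothetical, and the direction of the hold scaling (shorter hold for narrow diphthongs, longer for wide ones) is an assumption based on the description. It only shows bandwidth being interpolated together with frequency after an onset settle and hold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Formant:
    freq_hz: float  # resonator center frequency
    bw_hz: float    # resonator bandwidth


def adaptive_hold_frames(start, end, base_hold=4, ref_distance_hz=600.0,
                         min_hold=1, max_hold=8):
    """Hypothetical rule: hold time grows with the formant distance.

    Narrow diphthongs get a short hold so they do not stall;
    wide ones get a longer hold so the onset is not smeared.
    """
    distance = abs(end.freq_hz - start.freq_hz)
    scaled = round(base_hold * distance / ref_distance_hz)
    return max(min_hold, min(max_hold, scaled))


def glide(start, end, total_frames, settle_frames=2):
    """Return one Formant per frame for a start -> end diphthong glide."""
    if total_frames < 1:
        raise ValueError("total_frames must be at least 1")
    # Onset settle plus adaptive hold: the resonator sits on the start target.
    hold = settle_frames + adaptive_hold_frames(start, end)
    hold = min(hold, total_frames - 1)  # always leave at least one glide frame
    glide_frames = total_frames - hold
    track = [start] * hold
    for i in range(1, glide_frames + 1):
        t = i / glide_frames
        # Bandwidth is interpolated alongside frequency, not switched abruptly.
        track.append(Formant(
            freq_hz=start.freq_hz + t * (end.freq_hz - start.freq_hz),
            bw_hz=start.bw_hz + t * (end.bw_hz - start.bw_hz),
        ))
    return track
```

Calling `glide(Formant(700, 90), Formant(300, 50), 20)` gives 20 frames that stay on the start target during the settle and hold, then move frequency and bandwidth together toward the end target.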

      chrisduffley@mastodon.chrisduffley.com
      wrote last edited by
      #2

      @Tamasg OK, is this just me or is it sounding more UK/Scottish on certain phonemes? "Five" sounds somewhat more Scottish, at least that's the one I've noticed the most. I don't know if it really sounds authentically UK now though lol.

        tamasg@mindly.social
        wrote last edited by
        #3

        @ChrisDuffley Really good one to flag. This is with En-GB? Perhaps some of the tuning to get "eight" to sound better messed with that, hmm. Worth looking over the changelog if so 😄

          chrisduffley@mastodon.chrisduffley.com
          wrote last edited by
          #4

          @Tamasg Yep indeed. Also, the iOS version doesn't use the newer phoneme table NVDA is using, so the R's stick out quite a bit, in "Sierra" and "Zero" for sure. Also, speaking of which, the o in "zero" is quite, hmm, I can't even explain it. But it's a bit lighter than it used to be. For some reason I always compare to regular NVSpeechPlayer and eSpeak, as they have the best British phoneme table out there, in my opinion.

            chrisduffley@mastodon.chrisduffley.com
            wrote last edited by
            #5

            @Tamasg LOL, another one. The "OU" in thousand is quite odd, the end of it is kind of long. Just find it really interesting lol.

              kaveinthran@disabled.social
              wrote last edited by
              #6

              @ChrisDuffley @Tamasg While you are using the ngb table, how do words like "sharp" and "wrap" sound to you?

                chrisduffley@mastodon.chrisduffley.com
                wrote last edited by
                #7

                @Tamasg A few other things. Australian, which I switched to for the first time: the o is so British, really. Is the "u" in "public" supposed to be so, like, I can't describe it? It should have more of a mouth-opening-ish "uh" in it, shouldn't it? Yeah, you're talking to a stickler for accents who can't even properly fix a phoneme🤣

                  chrisduffley@mastodon.chrisduffley.com
                  wrote last edited by
                  #8

                  @kaveinthran @Tamasg "Sharp" sounds like "shop" with a slightly more opened-up o here, and "wrap" is a bit more closed, I guess. It's really hard to describe stuff like this lol.

                    tamasg@mindly.social
                    wrote last edited by
                    #9

                    @ChrisDuffley Gosh, Australian is going to need a lot of work. I know there are Hillenbrand phoneme tables for US English, but I wonder if anything like that exists for Australian, gathered from actual speakers. Something like that would really help. Right now it's more like a bad mix of US and UK, which was about as much as I could pull out of the research, but I don't have all the patterns for which sounds go which way. Like, flaps are more US, but vowel centralization is definitely different and can be US on some. So yeah, it's honestly one of the trickiest languages, but I hope I can improve it. The other UK issues might be related to diphthong collapse though. I'll try disabling it and see, because it does do some shifting of the bandwidths, and if it's shifting the onset of that sound it could absolutely become more Scottish. Same with something like "thousand", where you glide the ou-w sound together a bit.

                      kaveinthran@disabled.social
                      wrote last edited by
                      #10

                      @ChrisDuffley @Tamasg At least for me, in beta 1, I couldn't hear the "ep" sound in those words. US English works well though. When you compare with eSpeak, you can hear the "ep" sound. A fun way to do this is to have both eSpeak and TGSpeechBox read the first 10 sentences or the first page of the first Harry Potter book.

                        kaveinthran@disabled.social
                        wrote last edited by
                        #11

                        @ChrisDuffley @Tamasg In many words with the ngb table, the p sound at the end of a word is nearly silent.

                          kaveinthran@disabled.social
                          wrote last edited by
                          #12

                          @ChrisDuffley @Tamasg You can also compare with the classic NVSpeechPlayer for ENGB; it's the gold standard. I wonder, could we use the NVSpeechPlayer phonemes for ENGB, since they should already be stable?

                            tamasg@mindly.social
                            wrote last edited by
                            #13

                            @kaveinthran @ChrisDuffley Not really. Other languages have slightly altered those phonemes by now, so reverting them would be a bad move. Also, there are passes like coarticulation that make a difference to the final sound, even if we did so. It's a lot more complex machinery interacting together.
