Whoa. UTF-8 is older now than ASCII was when UTF-8 was invented.

27 Posts 18 Posters 0 Views
  • mo@mastodon.ml

    @vathpela IMHO, redundancy and/or checksums should be implemented on a different layer, not in the text encoding.

    There are many, many ways to keep bits from getting corrupted, each applicable in different cases, and forcing one particular method into the text encoding itself is... meh.

    Same for compression, btw. For some texts (CJK in particular) UTF-8 is sub-optimal, but even basic deflate makes it compact enough.

    TL;DR: UTF-8 is not perfect, but the benefit of having one encoding for every text outweighs that.

    @tek
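
To make the size point concrete, here is a minimal sketch that compares UTF-8 and UTF-16 byte counts for a CJK sample, before and after deflate (via Python's `zlib`). The sample string and the `encoded_sizes` helper are made up for illustration, and the repeated sample compresses far better than real prose would, so treat the numbers as a rough indication only.

```python
import zlib

def encoded_sizes(text):
    """Return raw and deflate-compressed byte counts for two encodings."""
    sizes = {}
    for codec in ("utf-8", "utf-16-le"):
        raw = text.encode(codec)
        sizes[codec] = {
            "raw": len(raw),
            "deflate": len(zlib.compress(raw, 9)),
        }
    return sizes

# Hypothetical sample: a short CJK sentence repeated many times.
# Repetition exaggerates how well deflate does compared with real text.
sample = "統一碼是一種文字編碼標準。" * 200
sizes = encoded_sizes(sample)

for codec, counts in sizes.items():
    print(codec, counts)
```

Every character in this sample sits in the range that needs 3 bytes in UTF-8 and 2 bytes in UTF-16, which is the overhead mentioned above; after compression the gap mostly stops mattering.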

    mansr@society.oftrolls.com wrote (#18):

    @mo @vathpela @tek Variable length encoding adds a little complexity at the input and output stages, but I think the benefits outweigh that, especially the 8-bit compatibility that allows a lot of software to work (at least to some extent) unmodified.
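
A small sketch of what the variable-length encoding and the 8-bit compatibility look like in practice. The sample strings and the `utf8_lengths` helper are illustrative assumptions, not anything from the thread.

```python
def utf8_lengths(text):
    """Number of UTF-8 bytes used by each character of text."""
    return [len(ch.encode("utf-8")) for ch in text]

# One character from each length class: U+0061, U+00E9, U+20AC, U+1F600.
sample = "aé€😀"
lengths = utf8_lengths(sample)

# Pure ASCII text encodes to exactly the same bytes in both encodings.
ascii_identical = "plain ASCII".encode("utf-8") == "plain ASCII".encode("ascii")

# Every byte of a multi-byte sequence has its high bit set, so it can
# never be mistaken for an ASCII byte such as "/" or a newline.
non_ascii_bytes = [b for ch in sample if ord(ch) > 0x7F for b in ch.encode("utf-8")]
all_high = all(b >= 0x80 for b in non_ascii_bytes)

# Byte-oriented code that knows nothing about UTF-8 can therefore still
# split on an ASCII delimiter without cutting a character in half.
path = "dossier/résumé.txt".encode("utf-8")
parts = path.split(b"/")

print(lengths, ascii_identical, all_high, parts)
```

This is the property that lets a lot of software written for 8-bit text keep working, at least for passing strings through and splitting on ASCII delimiters.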

      jaddle@toot.community wrote (#19):

      @tek
      And yet, my bank still won't let me add a contact (for etransfers) with an accent in their name.

        enno@mastodon.gamedev.place wrote (#20):

        @tek @loke @vathpela there is a BOM defined for UTF-8, as pointless as that may seem, and it's screwing up that whole beautiful ASCII compatibility whenever someone uses it.
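
For reference, a minimal sketch of how the UTF-8 BOM (the bytes `EF BB BF`) gets in the way of ASCII-oriented tooling, using Python's standard `codecs` module. The shebang line is just an assumed example of content where the first bytes matter.

```python
import codecs

BOM = codecs.BOM_UTF8            # the three bytes EF BB BF

plain = "#!/bin/sh\n".encode("utf-8")
with_bom = BOM + plain

# Without the BOM the file is byte-for-byte ASCII; with it, it is not.
is_ascii_plain = plain.isascii()
is_ascii_bom = with_bom.isascii()

# Anything that inspects the first bytes (here, a shebang check) is thrown off.
starts_with_shebang = with_bom.startswith(b"#!")

# The plain "utf-8" codec keeps the BOM as a U+FEFF character,
# while "utf-8-sig" strips it on decode.
decoded_utf8 = with_bom.decode("utf-8")
decoded_sig = with_bom.decode("utf-8-sig")

print(is_ascii_plain, is_ascii_bom, starts_with_shebang)
print(repr(decoded_utf8), repr(decoded_sig))
```

Decoding with `utf-8-sig` is one way to tolerate input that may or may not carry a BOM, since it also accepts data without one.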

          alper@rls.social wrote (#21):

          @tek MySQL will still happily mangle it.

            loke@functional.cafe wrote (#22):

            @enno @tek @vathpela I'd go as far as saying it's actively harmful. There are exactly zero cases when it's useful, and it will actively mess things up in most cases.

            But, of course, Windows applications tend to add them at times.

              vathpela@infosec.exchange wrote (#23):

              @glent @ahltorp @mxk @tek do y'all just not believe people still have to deal with actual UARTs, or what?

                mxk@hachyderm.io wrote (#24):

                @vathpela @glent @ahltorp @tek I do work with actual UARTs, but only for debugging purposes, as a fallback when SSH fails.
                That doesn't stop me from considering UTF-8 a net benefit.

                  ahltorp@mastodon.nu wrote (#25):

                  @vathpela @glent @mxk But even if it’s raw UART with no layer in between, it’s no more of a problem than with ASCII or ISO 8859, if you don’t count the larger surface area of a wide character, which is sort of unavoidable.
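
A rough sketch of the "larger surface area" point: flipping a single bit inside a multi-byte character damages that one character, and the decoder picks up again at the next byte. The sample text and the flipped bit are arbitrary choices for illustration, and no real UART is involved here.

```python
text = "abc é xyz"
data = bytearray(text.encode("utf-8"))

# "é" is encoded as C3 A9. Flip the high bit of the continuation byte,
# as a single-bit line error might, turning A9 into 29.
index = data.index(0xA9)
data[index] ^= 0x80

# Decode leniently: invalid input becomes U+FFFD instead of raising.
decoded = bytes(data).decode("utf-8", errors="replace")

print(repr(decoded))
```

The text before and after the hit character comes through unchanged, which is the sense in which the damage is comparable to a corrupted byte in ASCII or ISO 8859, just with more bits per character exposed.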

                    vathpela@infosec.exchange wrote (#26):

                    @mxk @glent @ahltorp @tek I agree, but I also think it could and should have improved.

                      vathpela@infosec.exchange wrote (#27):

                      @ahltorp @glent @mxk we could have made the whole situation better, but we didn't.
