Heads up you code maintainers who take submissions from people, delete unicode characters.

Tags: infosec, foss, github
16 Posts 6 Posters 0 Views
  • dalias@hachyderm.ioD dalias@hachyderm.io

    @ChuckMcManis @dangoodin In particular, the example text shown has no "invisible characters" in it, and PUA (*private* use area, not public) characters generally show up as a replacement character like � or a hex code, not a blank, much less a zero-width glyph, unless you're using a font that assigns them for some particular use.

    j0057@hachyderm.io
    wrote last edited by
    #6

    @dalias @ChuckMcManis @dangoodin Judging by the decoder snippet, the string between the backticks contains U+FE00 VARIATION SELECTOR-1 to U+FE0F VARIATION SELECTOR-16, and U+E0100 to U+E01EF hold the remaining variation selectors, 17 to 256. So we can theoretically choose between 256 variants of an emoji!

    https://en.wikipedia.org/wiki/Variation_Selectors_%28Unicode_block%29
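
A de-fanged sketch of how those 256 selectors could carry arbitrary bytes, for anyone who wants to poke at it. It assumes byte value n rides on variation selector n+1 (so 0 to 15 land in U+FE00..U+FE0F and 16 to 255 in U+E0100..U+E01EF); the function names `hide` and `reveal` are made up for illustration, and nothing here calls `eval`.

```python
from typing import Optional


def byte_to_selector(b: int) -> str:
    """Map one byte value (0-255) to a variation selector character."""
    if not 0 <= b <= 255:
        raise ValueError("byte value out of range")
    # Assumed layout: 0-15 -> U+FE00..U+FE0F, 16-255 -> U+E0100..U+E01EF.
    return chr(0xFE00 + b) if b < 16 else chr(0xE0100 + b - 16)


def selector_to_byte(ch: str) -> Optional[int]:
    """Inverse mapping; returns None for anything that is not a selector."""
    cp = ord(ch)
    if 0xFE00 <= cp <= 0xFE0F:
        return cp - 0xFE00
    if 0xE0100 <= cp <= 0xE01EF:
        return cp - 0xE0100 + 16
    return None


def hide(data: bytes) -> str:
    """Encode bytes as a string made only of variation selectors."""
    return "".join(byte_to_selector(b) for b in data)


def reveal(text: str) -> bytes:
    """Recover the bytes, skipping every character that is not a selector."""
    decoded = (selector_to_byte(ch) for ch in text)
    return bytes(b for b in decoded if b is not None)


if __name__ == "__main__":
    payload = hide(b"print('hello')")
    # The string has one code point per byte but usually renders as nothing.
    print(len(payload), repr(reveal(payload)))
```

The point of the sketch is only that a string which looks empty in most editors can round-trip real content; whether it renders as blank depends on the font and the editor.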

      dalias@hachyderm.io
      wrote last edited by
      #7

      @j0057 @ChuckMcManis @dangoodin It's possible there is some actual nefarious thing going on, but that Aikido's slopbot writing the blog just completely botched the explanation of it...

        j0057@hachyderm.io
        wrote last edited by
        #8

        @dalias @ChuckMcManis @dangoodin Calling `eval` on an apparently empty string being decoded is definitely suspicious and nefarious; it should never pass any decent code review.

          dalias@hachyderm.io
          wrote last edited by
          #9

          @j0057 @ChuckMcManis @dangoodin Yeah, this looks like a complete non-issue.

          • chuckmcmanis@chaos.socialC chuckmcmanis@chaos.social

            Heads up you code maintainers who take submissions from people, delete unicode characters. See this: https://arstechnica.com/security/2026/03/supply-chain-attack-using-invisible-code-hits-github-and-other-repositories/ Yes, people put back doors in code using unicode characters that don't show up on the screen. #infosec #foss #github

            chuckmcmanis@chaos.social
            wrote last edited by
            #10

            Y'all are gonna force me to write a POC and then we'll all be in trouble.

              chuckmcmanis@chaos.social
              wrote last edited by
              #11

              @j0057

              I don't disagree with this, but how about calling eval() on a non-empty and innocuous string? Said string being only one or two regexes away from being not innocuous? How about your web service's JSON parameter list, which when 'touched' by the magic regex has more parameters in it than you thought? People don't sanitize the strings that their own code sends them, and perhaps that is unwise.

              @dalias @dangoodin

                ewhac@mastodon.social
                wrote last edited by
                #12

                @ChuckMcManis I'm curious to see a de-fanged example, so that I can see how Vim and Neovim display it, and also how it appears when run through `od -ax`.
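
In the same spirit as piping a file through `od`, here is a minimal Python sketch that lists every code point in a string with its UTF-8 bytes and Unicode name, so characters an editor renders as nothing still show up. The helper name `describe` and the sample string are assumptions for illustration, not anything from the article.

```python
import unicodedata


def describe(text: str) -> list[str]:
    """Return one line per code point: U+XXXX, UTF-8 bytes in hex, name."""
    lines = []
    for ch in text:
        name = unicodedata.name(ch, "<no name>")
        # surrogatepass keeps the dump working even on lone surrogates.
        utf8 = ch.encode("utf-8", "surrogatepass").hex(" ")
        lines.append(f"U+{ord(ch):04X}  {utf8:<11}  {name}")
    return lines


if __name__ == "__main__":
    # A visible letter followed by two variation selectors.
    sample = "x\ufe00\U000e0100"
    for line in describe(sample):
        print(line)
```

Run against a suspect file's contents, any line naming a VARIATION SELECTOR or a private-use code point where plain ASCII was expected is worth a closer look.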

                  dalias@hachyderm.io
                  wrote last edited by
                  #13

                  @ChuckMcManis @j0057 @dangoodin Applying the mapping to it is super sus already, and even presence of eval at all is sus. I wouldn't accept a PR with eval without a detailed explanation of why it's not practical without eval.

                    ewhac@mastodon.social
                    wrote last edited by
                    #14

                    @ChuckMcManis Also: What bozo thought it was a good idea to silently transcode the Private Use Area down to the ASCII range and then interpret it? Is this transcoding mandated by the Unicode standard, or just something "clever" they did on their own?

                    • chuckmcmanis@chaos.socialC chuckmcmanis@chaos.social

                      @labbatt50 And now they have LLMs that will generate code using only unicode characters.

                      labbatt50@mastodon.world
                      wrote last edited by
                      #15

                      @ChuckMcManis

                      Wasn't thinking. Absolutely, you're right on the mark.

                        peterrenshaw@ioc.exchange
                        wrote last edited by
                        #16

                        @ChuckMcManis Back when Flickr was built (written in Perl), Stewart Butterfield had a Unicode scanner on all user input. There’s an animal book where he describes the filter. I remember writing one in very slow Python calling iconv (before Python defaulted to UTF-8 source encoding, PEP 3120). <https://www.yourtechstory.com/2024/08/15/stewart-butterfield-the-visionary-behind-flickr-and-slack/>
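
A rough idea of what a scanner like that might look like today, as a sketch only. It is not the Flickr filter; the choice of what to flag (variation selectors plus the general categories Cf, Co and Cn) and the name `scan` are assumptions made for illustration.

```python
import unicodedata

# Assumed policy: format characters, private use, and unassigned code points.
SUSPECT_CATEGORIES = {"Cf", "Co", "Cn"}


def is_variation_selector(cp: int) -> bool:
    """True for U+FE00..U+FE0F and U+E0100..U+E01EF."""
    return 0xFE00 <= cp <= 0xFE0F or 0xE0100 <= cp <= 0xE01EF


def scan(text: str) -> list[tuple[int, int, str]]:
    """Return (line, column, code point) for each suspect character, 1-based."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            cp = ord(ch)
            if is_variation_selector(cp) or unicodedata.category(ch) in SUSPECT_CATEGORIES:
                findings.append((line_no, col, f"U+{cp:04X}"))
    return findings


if __name__ == "__main__":
    submitted = "const s = `\ufe00\ufe01`;\nrun(s);\n"
    for line_no, col, code_point in scan(submitted):
        print(f"line {line_no}, col {col}: {code_point}")
```

This flags rather than deletes, since emoji sequences legitimately use U+FE0F and U+200D; a maintainer would probably want to review hits in source files instead of stripping them blindly.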
