"Fun bug of the month, mesa edition, episode may"

25 Posts 11 Posters 33 Views
    karolherbst@chaos.social
    wrote last edited by
    #1

    "Fun bug of the month, mesa edition, episode may"

    so if you do "uint64_t some_var = 1 << 31;" in C you get "0xffffffff80000000" as the value, because that's super obvious and not confusing at all.

    It's pretty funny getting reminded how non-intuitive and broken C is from time to time.
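
A minimal sketch of where the upper 32 set bits come from, assuming a platform with a 32-bit int. Since `1 << 31` itself overflows a signed int, the sketch starts from `INT32_MIN` (the value GCC and clang produce for it in practice) and shows only the well-defined conversion step. The helper names `widen_signed` and `show_widened` are made up for illustration.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: convert a 32-bit signed value to uint64_t the way
 * the assignment `uint64_t v = <int expression>;` does. Converting a
 * negative value to an unsigned type is defined as adding 2^64, which
 * looks like sign extension on two's complement machines. */
static uint64_t widen_signed(int32_t x)
{
    return (uint64_t)x;
}

/* Print the widened value in hex so the upper 32 bits are visible. */
static void show_widened(int32_t x)
{
    printf("%" PRId32 " -> 0x%016" PRIx64 "\n", x, widen_signed(x));
}
```

Calling `show_widened(INT32_MIN)` from `main` prints `-2147483648 -> 0xffffffff80000000`, the value from the post.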

      trilader@chaos.social
      wrote last edited by
      #2

      @karolherbst For my understanding: that's default int promotion plus sign extension when widening to 64 bits? Would 1L << 31L fix this, or are there other pitfalls with that?

        karolherbst@chaos.social
        wrote last edited by
        #3

        @trilader yeah sure, but any competent and modern language would type the constant to what's expected, not make it int32 by default, because that's just broken imho.

        Like any new language doing that today would be considered broken on arrival.

          trilader@chaos.social
          wrote last edited by
          #4

          @karolherbst Yeah. Things like this make me think someone needs to invent -fbackwards-compatible-bs=off

            jann@infosec.exchange
            wrote last edited by
            #5

            @karolherbst I think that's UB? see C99 6.5.7 "Bitwise shift operators" - the LHS is signed and the result of the computation is not representable in the result type
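
A rough sketch of that rule as a runtime check, in case it helps to see it spelled out. It assumes int has no padding bits, and the function name `int_shl_is_defined` is hypothetical. Under C99 6.5.7 a left shift of a signed value is defined only when the count is in range, the value is nonnegative, and value × 2^count fits in the result type.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical checker: true if `value << count` is defined for int
 * operands under the C99 6.5.7 rule cited above.
 * Assumes int has no padding bits. */
static bool int_shl_is_defined(int value, int count)
{
    const int width = (int)(sizeof(int) * CHAR_BIT);

    if (count < 0 || count >= width)
        return false;            /* shift count out of range */
    if (value < 0)
        return false;            /* negative left operand */
    /* value * 2^count must fit in int, i.e. value <= INT_MAX >> count. */
    return value <= (INT_MAX >> count);
}
```

With a 32-bit int, `int_shl_is_defined(1, 30)` is true and `int_shl_is_defined(1, 31)` is false, which is the case from the original post.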

              jann@infosec.exchange
              wrote last edited by
              #6

              @karolherbst but apparently gcc has decided to not treat it as UB, except when using UBSAN: https://gcc.gnu.org/onlinedocs/gcc/Integers-implementation.html

                karolherbst@chaos.social
                wrote last edited by
                #7

                @jann yeah technically it's UB, but there is only so much you can optimize with a 1-2 instruction pattern that it doesn't really matter in practice, because most impls will do the same (more or less).

                Like there is UB and then there is UB.

                  jann@infosec.exchange
                  wrote last edited by
                  #8

                  @karolherbst yeah, I guess my point is that, for the code you showed, a C compiler would be well within its rights to refuse to build that code or complain about it, so this is not entirely the language's fault

                    pinskia@hachyderm.io
                    wrote last edited by
                    #9

                    @jann @karolherbst

                    It was not UB in C90. That is why GCC does not treat it as UB without ubsan ...

                      karolherbst@chaos.social
                      wrote last edited by
                      #10

                      @jann ohh it's totally the language's fault even if it weren't UB, because that's just the worst way to specify this.

                      Like it's just a design bug really. And whether this is UB or not won't change that.

                        trilader@chaos.social
                        wrote last edited by
                        #11

                        @puppethead @karolherbst When not using the U (or L) suffix 1<<31 triggers clang's -Wshift-sign-overflow warning. However that warning is not enabled by default and gcc doesn't support it at all.

                          david_chisnall@infosec.exchange
                          wrote last edited by
                          #12

                          @karolherbst @jann

                          It’s UB in the general case because, if the operand is not a constant, you want to lower it to a shift instruction, but C works with targets that have different number representations. Ones' complement, two's complement, and sign-magnitude (an explicit sign bit) are all permitted, and each gives different behaviour if you flip the top bit.

                          For wider shifts, different ISAs had different semantics for shifts wider than the register, so C made that fully undefined.

                          This combination lets you lower source-level shifts to a shift instruction.

                          C also doesn’t mandate that this be constant evaluated unless the result is used as a constant, so there’s no way to force implementations to diagnose the UB at compile time for this case. But, as a QoI issue, it is permitted and compilers should.
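
For variable shift counts, one way to stay clear of both undefined cases is to shift an unsigned value and handle oversized counts explicitly. This is only a sketch, and `shl64_total` is a hypothetical name, not something from Mesa or the standard library. It trades the single shift instruction for a compare and select.

```c
#include <stdint.h>

/* Hypothetical helper: left shift with a defined result for every count.
 * The operand is unsigned, so there is no sign bit to overflow into, and
 * counts of 64 or more yield 0 instead of undefined behaviour. */
static uint64_t shl64_total(uint64_t value, unsigned count)
{
    return count < 64 ? value << count : 0;
}
```

Here `shl64_total(1, 31)` gives 0x80000000 and `shl64_total(1, 64)` gives 0.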

                            ewhac@mastodon.social
                            wrote last edited by
                            #13

                            @trilader @karolherbst "Um, actually..."

                            I believe it would work as expected with:

                            1U << 31;

                            The unexpected part is that sign extension from 32 to 64 bits takes place before reinterpretation to unsigned. The C99 standard is admittedly opaque on this point. If you make the rvalue unsigned as well, then you get the (presumably) expected result.

                            (Just tested it with GCC 13.3 -- it works.)
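
A small sketch of that fix, assuming unsigned int is at least 32 bits wide. The function name `bit31_unsigned_literal` is made up for illustration.

```c
#include <stdint.h>

/* 1U has type unsigned int, so the shift cannot overflow a signed type,
 * and the result is zero-extended when converted to uint64_t.
 * Assumes unsigned int is at least 32 bits wide. */
static uint64_t bit31_unsigned_literal(void)
{
    uint64_t v = 1U << 31;
    return v;
}
```

The function returns 0x80000000 with the upper 32 bits clear. Note that this only covers bits 0 to 31 when unsigned int is 32 bits; `1U << 32` would be undefined again.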

                              trilader@chaos.social
                              wrote last edited by
                              #14

                              @ewhac @karolherbst Yes. In the other leg of the thread I also posted that clang even warns you about the behavior of 1 << 31 without a U or L suffix, provided you enable the right warning, which is off by default and which GCC doesn't have. Another leg notes that GCC catches this at runtime with ubsan enabled.

                                pavel@social.kernel.org
                                wrote last edited by
                                #15
                                @karolherbst C is 1973 or so. If you believe it is confusing, try assembly :-).
                                  pavel@social.kernel.org
                                  wrote last edited by
                                  #16
                                  @trilader @karolherbst Yes, int promotion. 1L<<31L would still be broken on 32-bit architectures (as long is 32 bits there). You'd need 1LL or something, AFAICT.
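
A sketch of the difference between the two suffixes. Both function names are hypothetical. The first reports whether `1L << 31` fits in long on the current target, and the second relies on the guarantee that long long is at least 64 bits wide.

```c
#include <limits.h>
#include <stdbool.h>

/* True if `1L << 31` is representable, i.e. long is wider than 32 bits.
 * This is false on targets where long is 32 bits. */
static bool long_can_hold_bit31(void)
{
    return LONG_MAX > 0x7fffffffL;
}

/* long long is at least 64 bits wide, so this shift is always defined. */
static long long bit31_long_long(void)
{
    return 1LL << 31;
}
```

`bit31_long_long()` returns 2147483648 everywhere, while the result of `long_can_hold_bit31()` depends on the target's data model.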
                                    pavel@social.kernel.org
                                    wrote last edited by
                                    #17
                                    @trilader @karolherbst Actually, the right solution would be uint64_t some_var = ((uint64_t)1) << 31; AFAICT.
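
A minimal sketch of that approach wrapped in a helper. The name `bit64` is hypothetical. Casting the 1 before shifting means the shift is performed in uint64_t no matter how wide int or long are, so every bit from 0 to 63 is reachable.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: mask with only bit `n` set, for n in 0..63.
 * Casting the 1 first makes the shift happen in uint64_t regardless of
 * how wide int or long are on the target. */
static uint64_t bit64(unsigned n)
{
    assert(n < 64);              /* larger counts would be undefined */
    return (uint64_t)1 << n;
}
```

`bit64(31)` yields 0x80000000 and `bit64(63)` yields 0x8000000000000000. The standard `UINT64_C(1)` macro from `<stdint.h>` is another way to write a 64-bit constant 1 in place of the cast.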
                                      karolherbst@chaos.social
                                      wrote last edited by
                                      #18

                                      @david_chisnall @jann at least C23 fixes one part of this by requiring two's complement for integers.

                                      But also, I just wish C would mandate that constants are just assumed to be of the "expected" type, because in 99.999999% of all cases a programmer really meant the obvious thing with "uint64_t x = 1 << 31".

                                      But I guess we'll just keep those horrible semantics C has in a couple of areas, because nobody wants to fix those things, because "it could break things".

                                        karolherbst@chaos.social
                                        wrote last edited by
                                        #19

                                        @pavel @trilader the actual right solution would be to fix the language 😛

                                          blp@framapiaf.org
                                          wrote last edited by
                                          #20

                                          @karolherbst @david_chisnall @jann That particular change would break a lot!
