Skip to content
  • Categories
  • Recent
  • Tags
  • Popular
  • World
  • Users
  • Groups
Skins
  • Light
  • Brite
  • Cerulean
  • Cosmo
  • Flatly
  • Journal
  • Litera
  • Lumen
  • Lux
  • Materia
  • Minty
  • Morph
  • Pulse
  • Sandstone
  • Simplex
  • Sketchy
  • Spacelab
  • United
  • Yeti
  • Zephyr
  • Dark
  • Cyborg
  • Darkly
  • Quartz
  • Slate
  • Solar
  • Superhero
  • Vapor

  • Default (Cyborg)
  • No Skin
Collapse
Brand Logo

CIRCLE WITH A DOT

  1. Home
  2. Uncategorized
  3. Are there any regex implementations out there that support intersection and complement operators in the regex syntax itself?

Are there any regex implementations out there that support intersection and complement operators in the regex syntax itself?

Scheduled Pinned Locked Moved Uncategorized
8 Posts 3 Posters 0 Views
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • bradlarsen@infosec.exchangeB This user is from outside of this forum
    bradlarsen@infosec.exchangeB This user is from outside of this forum
    bradlarsen@infosec.exchange
    wrote last edited by
    #1

    Are there any regex implementations out there that support intersection and complement operators in the regex syntax itself?

    (These operators appear to have been omitted in regular expressions by historical accident, tracing back to a paper from Kleene in 1951, where he skipped them because they added no expressive power to the formalism he was working on)

    pervognsen@mastodon.socialP d6@merveilles.townD 2 Replies Last reply
    0
    • bradlarsen@infosec.exchangeB bradlarsen@infosec.exchange

      Are there any regex implementations out there that support intersection and complement operators in the regex syntax itself?

      (These operators appear to have been omitted in regular expressions by historical accident, tracing back to a paper from Kleene in 1951, where he skipped them because they added no expressive power to the formalism he was working on)

      pervognsen@mastodon.socialP This user is from outside of this forum
      pervognsen@mastodon.socialP This user is from outside of this forum
      pervognsen@mastodon.social
      wrote last edited by
      #2

      @bradlarsen It's not just by historical accident. Intersection/complement blows up the state space of even an NFA-based matcher. https://cstheory.stackexchange.com/questions/52062/intersection-non-emptiness-problem-over-regular-expressions-and-nfa

      pervognsen@mastodon.socialP bradlarsen@infosec.exchangeB 2 Replies Last reply
      0
      • pervognsen@mastodon.socialP pervognsen@mastodon.social

        @bradlarsen It's not just by historical accident. Intersection/complement blows up the state space of even an NFA-based matcher. https://cstheory.stackexchange.com/questions/52062/intersection-non-emptiness-problem-over-regular-expressions-and-nfa

        pervognsen@mastodon.socialP This user is from outside of this forum
        pervognsen@mastodon.socialP This user is from outside of this forum
        pervognsen@mastodon.social
        wrote last edited by
        #3

        @bradlarsen Although if you're just trying to match top-level intersections or complements against particular inputs, you can run different DFAs/NFAs in parallel on the same input and combine their accept/reject results with the appropriate boolean operator.

        bradlarsen@infosec.exchangeB 1 Reply Last reply
        0
        • bradlarsen@infosec.exchangeB bradlarsen@infosec.exchange

          Are there any regex implementations out there that support intersection and complement operators in the regex syntax itself?

          (These operators appear to have been omitted in regular expressions by historical accident, tracing back to a paper from Kleene in 1951, where he skipped them because they added no expressive power to the formalism he was working on)

          d6@merveilles.townD This user is from outside of this forum
          d6@merveilles.townD This user is from outside of this forum
          d6@merveilles.town
          wrote last edited by
          #4

          @bradlarsen i have a scala implementation that does this: https://github.com/non/antimirov

          the real goal was being able to test regexes for equality/inclusion (i.e. subset/superset). it uses partial derivatives of regular expressions to try to do this efficiently. it does fairly well most of the time but in the worst case it is very slow

          bradlarsen@infosec.exchangeB 1 Reply Last reply
          0
          • pervognsen@mastodon.socialP pervognsen@mastodon.social

            @bradlarsen Although if you're just trying to match top-level intersections or complements against particular inputs, you can run different DFAs/NFAs in parallel on the same input and combine their accept/reject results with the appropriate boolean operator.

            bradlarsen@infosec.exchangeB This user is from outside of this forum
            bradlarsen@infosec.exchangeB This user is from outside of this forum
            bradlarsen@infosec.exchange
            wrote last edited by
            #5

            @pervognsen right! At the top level, it's straightforward to use boolean operations in the host language, possibly with multiple calls to regex operations, to approximate complement / intersection. No change in algorithmic complexity doing that

            1 Reply Last reply
            1
            0
            • R relay@relay.infosec.exchange shared this topic
            • pervognsen@mastodon.socialP pervognsen@mastodon.social

              @bradlarsen It's not just by historical accident. Intersection/complement blows up the state space of even an NFA-based matcher. https://cstheory.stackexchange.com/questions/52062/intersection-non-emptiness-problem-over-regular-expressions-and-nfa

              bradlarsen@infosec.exchangeB This user is from outside of this forum
              bradlarsen@infosec.exchangeB This user is from outside of this forum
              bradlarsen@infosec.exchange
              wrote last edited by
              #6

              @pervognsen is that page discussing a slightly different problem though? That PSPACE-complete problem discussed there is to determine if the intersection of DFAs is empty. This seems like a stronger property than whether a particular input string is accepted by all the DFAs (how you might imagine implementing intersection in a DFA-based regex engine). Or am I overlooking something?

              pervognsen@mastodon.socialP 1 Reply Last reply
              0
              • bradlarsen@infosec.exchangeB bradlarsen@infosec.exchange

                @pervognsen is that page discussing a slightly different problem though? That PSPACE-complete problem discussed there is to determine if the intersection of DFAs is empty. This seems like a stronger property than whether a particular input string is accepted by all the DFAs (how you might imagine implementing intersection in a DFA-based regex engine). Or am I overlooking something?

                pervognsen@mastodon.socialP This user is from outside of this forum
                pervognsen@mastodon.socialP This user is from outside of this forum
                pervognsen@mastodon.social
                wrote last edited by
                #7

                @bradlarsen You run into the same problem with intersection/complement if you try to "internalize" the product construction as a single DFA/NFA the way we do with the other regular expression operations. Once you have a single DFA/NFA you can check its equivalence to the empty set in linear time and space by searching for a path from the initial states to accepting states.

                1 Reply Last reply
                0
                • d6@merveilles.townD d6@merveilles.town

                  @bradlarsen i have a scala implementation that does this: https://github.com/non/antimirov

                  the real goal was being able to test regexes for equality/inclusion (i.e. subset/superset). it uses partial derivatives of regular expressions to try to do this efficiently. it does fairly well most of the time but in the worst case it is very slow

                  bradlarsen@infosec.exchangeB This user is from outside of this forum
                  bradlarsen@infosec.exchangeB This user is from outside of this forum
                  bradlarsen@infosec.exchange
                  wrote last edited by
                  #8

                  @d6 how slow is "very slow"? Would it realistically be able to handle equality testing of small variations to patterns like this?

                  https://github.com/praetorian-inc/noseyparker/blob/2e6e7f36ce36619852532bbe698d8cb7a26d2da7/crates/noseyparker/data/default/builtin/rules/phpmailer.yml#L7-L12

                  1 Reply Last reply
                  1
                  0
                  Reply
                  • Reply as topic
                  Log in to reply
                  • Oldest to Newest
                  • Newest to Oldest
                  • Most Votes


                  • Login

                  • Login or register to search.
                  • First post
                    Last post
                  0
                  • Categories
                  • Recent
                  • Tags
                  • Popular
                  • World
                  • Users
                  • Groups