Regex spiffification: free stuff!
I thought it could be nice if you could use a string (as opposed to a compiled regular expression) as the second operand of a concatenation or union operation, but I left that for later.
Keeping the implementation minimal has several benefits. One of them is that my current implementation already allows this:
>>> a = re.compile(r"a|b")
>>> a + "c|d"
<Regex object for '(?:a|b)(?:c|d)'>
This is probably not a good idea, though. Consider this example:
>>> a = re.compile("A")
>>> a | "B|C" + "X|Y"
<Regex object for 'A|B|CX|Y'>
Since the + operator takes precedence over |, the second and third operands are concatenated as plain strings first, and the expression becomes a | "B|CX|Y" instead of a | "(B|C)(X|Y)". That kind of subtlety is not something I'd want to wrestle with in real code.
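To make the pitfall concrete, here is a minimal sketch of a regex wrapper with + and | support. The Regex class and its _source helper are hypothetical stand-ins written for this illustration, not the actual implementation discussed above; they only mimic the outputs shown in the examples.

```python
import re


class Regex:
    """Hypothetical stand-in for the regex object in this post (an assumption)."""

    def __init__(self, pattern):
        self.pattern = pattern
        self.compiled = re.compile(pattern)

    @staticmethod
    def _source(other):
        # Accept either another Regex or a plain pattern string.
        return other.pattern if isinstance(other, Regex) else other

    def __add__(self, other):
        # Concatenation: wrap both sides in non-capturing groups.
        return Regex("(?:%s)(?:%s)" % (self.pattern, self._source(other)))

    def __or__(self, other):
        # Union: join both sides with a bare "|".
        return Regex("%s|%s" % (self.pattern, self._source(other)))

    def __repr__(self):
        return "<Regex object for %r>" % self.pattern


a = Regex("A")

# Python evaluates "B|C" + "X|Y" first, as ordinary string concatenation,
# so Regex.__add__ never gets a chance to add the groups.
pitfall = a | "B|C" + "X|Y"

# Making the middle operand a Regex forces Regex.__add__ to run.
intended = a | (Regex("B|C") + "X|Y")
```

With this sketch, pitfall matches "B" or "CX" but not "BX", while intended matches "BX" as one would expect. The only way to get the intended result is to remember to wrap at least one operand of every + in a Regex, which is exactly the kind of bookkeeping that is easy to forget.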
Something else that came for free (as in beer, lunch, Tibet and/or Willy) was this thought: what, exactly, is the benefit of adding + and | to regex objects if you still have to compile the subexpressions first? Why not just keep them as strings and do things like re.compile("%s|%s%s" % (a, b, c)) instead of a | b + c?
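For comparison, a rough sketch of the strings-only approach using nothing but the standard re module. The group helper is a hypothetical name introduced here; it is needed because bare string formatting does no grouping on its own.

```python
import re


def group(pattern):
    # Hypothetical helper: wrap a pattern in a non-capturing group.
    return "(?:%s)" % pattern


a, b, c = "A", "B|C", "X|Y"

# Plain formatting splices the strings together verbatim.
plain = re.compile("%s|%s%s" % (a, b, c))

# Grouping the subexpressions by hand keeps each alternation contained.
grouped = re.compile("%s|%s%s" % (a, group(b), group(c)))
```

Here plain.pattern is 'A|B|CX|Y' and grouped.pattern is 'A|(?:B|C)(?:X|Y)', so the strings-only route works, but only if you do the grouping yourself.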
I'll let that sink in for a bit.
Next up: no, less.