I thought it could be nice if you could use a string (as opposed to a compiled regular expression) as the second operand of a concatenation or union operation, but I left that for later.
Keeping the implementation minimal has several benefits, one being that my current implementation already allows exactly that:
>>> a = re.compile(r"a|b")
>>> a + "c|d"
<Regex object for '(?:a|b)(?:c|d)'>
This is probably not a good idea, though. Consider this example:
>>> a = re.compile("A")
>>> a | "B|C" + "X|Y"
<Regex object for 'A|B|CX|Y'>
Because the + operator takes precedence over |, the second and third operands are concatenated as plain strings first, and the expression becomes a | "B|CX|Y" instead of a | "(B|C)(X|Y)". That kind of subtlety is not something I'd want to wrestle with in real code.
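To make the pitfall concrete, here is a minimal sketch. Rx is a hypothetical stand-in class, not my actual implementation, and it groups both sides of a union (which the output above does not), but the precedence problem is the same:

```python
import re


class Rx:
    """Hypothetical stand-in for a regex object that supports + and |."""

    def __init__(self, pattern):
        self.pattern = pattern
        self.compiled = re.compile(pattern)

    @staticmethod
    def _source(other):
        # Accept either another Rx or a plain pattern string.
        return other.pattern if isinstance(other, Rx) else other

    def __add__(self, other):
        # Concatenation: wrap both sides in non-capturing groups.
        return Rx("(?:%s)(?:%s)" % (self.pattern, self._source(other)))

    def __or__(self, other):
        # Union: wrap both sides in non-capturing groups as well.
        return Rx("(?:%s)|(?:%s)" % (self.pattern, self._source(other)))

    def __repr__(self):
        return "<Rx %r>" % self.pattern


a = Rx("A")

# Python evaluates "B|C" + "X|Y" first, as ordinary string concatenation,
# so Rx.__or__ only ever sees the single string "B|CX|Y".
pitfall = a | "B|C" + "X|Y"

# Wrapping the first string operand makes + dispatch to Rx.__add__ instead.
intended = a | Rx("B|C") + "X|Y"

print(pitfall.compiled.fullmatch("BX"))   # no match
print(intended.compiled.fullmatch("BX"))  # match
```

The only difference between the two expressions is whether the left operand of + is a str or an Rx, which is easy to miss when reading the code.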
Something else that came for free (as in beer, lunch, Tibet and/or Willy) was this thought: what, exactly, is the benefit of adding | to regex objects if you still have to compile the subexpressions first? Why not just keep them as strings and do things like re.compile("%s|%s%s" % (a, b, c)) instead of a | b + c?
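For what it's worth, the plain-string route needs the same care with grouping. A small sketch, using the operands from the earlier example as assumed values for a, b and c:

```python
import re

# Assumed example values; any subpattern containing | behaves the same way.
a, b, c = "A", "B|C", "X|Y"

# Naive interpolation: the alternations inside b and c leak into the whole pattern.
naive = re.compile("%s|%s%s" % (a, b, c))

# Explicit non-capturing groups keep b and c together as units.
grouped = re.compile("%s|(?:%s)(?:%s)" % (a, b, c))

print(naive.pattern)            # A|B|CX|Y
print(grouped.pattern)          # A|(?:B|C)(?:X|Y)
print(naive.fullmatch("BX"))    # None
print(grouped.fullmatch("BX"))  # a match object
```

So the string version is shorter to set up, but the grouping becomes the caller's job.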
I'll let that sink in for a bit.
Next up: no, less.