Regex spiffification: retreat and regroup

By Filip Salomonsson; published on October 03, 2006.

Uhm, where was I on the spiffification issue?

Right.

Operations on regex objects with strings as the "other" operand didn't seem like a very good idea, and operations on two regex objects with different flags set can't easily be handled consistently.
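To make the flags problem concrete, here is a minimal sketch using only the stock re module (Python 3 syntax; the patterns "spam" and "eggs" are made up for illustration). It glues two compiled patterns together via their pattern attribute and shows that whichever flags the combined expression gets, one of the operands ends up behaving differently than it did on its own.

```python
import re

a = re.compile("spam", re.IGNORECASE)  # case-insensitive
b = re.compile("eggs")                 # case-sensitive

# Gluing the pattern strings together forces a single choice of flags
# for the whole expression, so one operand changes meaning either way.
strict = re.compile("(?:%s)|(?:%s)" % (a.pattern, b.pattern))
loose = re.compile("(?:%s)|(?:%s)" % (a.pattern, b.pattern), re.IGNORECASE)

# a on its own matches "SPAM", but the flag-less combination does not.
print(bool(a.match("SPAM")), bool(strict.match("SPAM")))

# b on its own rejects "EGGS", but the case-insensitive combination accepts it.
print(bool(b.match("EGGS")), bool(loose.match("EGGS")))
```

Neither choice is obviously the right one, which is roughly why an operator like | has no consistent answer here.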

That pretty much leaves us with operations on two compiled regex objects. Is the potential cosmetic benefit of

a = re.compile("A")
b = re.compile("B")
a_or_b = a | b

really so much better than

import re

a = "A"
b = "B"
a_or_b = re.compile("(?:%s)|(?:%s)" % (a, b))

which works right out of the box?

Well, there is the slight benefit that a and b in the first scenario can also be used separately - they're compiled patterns already. No need for a string holding the pattern and a compiled counterpart.
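For comparison, the string-based version can be wrapped up in a small helper. This is only a sketch: any_of is a hypothetical name, not something the re module provides, and it assumes every alternative is given as a plain pattern string. It also shows the bookkeeping cost mentioned above, since using one piece on its own means compiling it separately.

```python
import re


def any_of(*patterns):
    """Hypothetical helper: compile an alternation of pattern strings."""
    # Each alternative gets its own non-capturing group, as in the
    # "(?:%s)|(?:%s)" example above.
    return re.compile("|".join("(?:%s)" % p for p in patterns))


a = "A"
b = "B"
a_or_b = any_of(a, b)

# Using a piece on its own still means compiling it separately.
a_only = re.compile(a)

print(a_or_b.pattern)  # (?:A)|(?:B)
```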

To eliminate the need for that separate string, all we need is to add a string representation to the regex objects. I've got that:

>>> a = re.compile("(a|b)+cd*")
>>> print a
(?:(a|b)+cd*)

Perhaps this is all that is really needed? (Though I'd gladly throw in a better __repr__ just because it's pretty.)
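One way to get that behaviour without touching the re module itself might look like the sketch below. It is not the patch described above: Regex is a hypothetical wrapper class, written in Python 3 syntax, that leans on the pattern attribute compiled patterns already carry and hands everything else off to the compiled object.

```python
import re


class Regex:
    """Hypothetical wrapper; not part of the re module."""

    def __init__(self, pattern, flags=0):
        self._compiled = re.compile(pattern, flags)

    def __str__(self):
        # Wrap in a non-capturing group so the text can be embedded
        # in a larger pattern without changing precedence.
        return "(?:%s)" % self._compiled.pattern

    def __repr__(self):
        return "Regex(%r)" % self._compiled.pattern

    def __getattr__(self, name):
        # Delegate match(), search(), etc. to the compiled pattern.
        return getattr(self._compiled, name)


a = Regex("(a|b)+cd*")
print(a)  # (?:(a|b)+cd*)

# The string form drops straight into a larger pattern.
b = Regex("B")
a_or_b = re.compile("%s|%s" % (a, b))
```

With this, a and b are usable on their own (a.match("acd") works through the delegation) and as building blocks, with no separate string to keep around. Flags are deliberately ignored when building the string form, so the caveat about mixed flags still applies.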

Pause for thought.

Next up: just do it