Inconsistent split Behavior in Python

Saturday, November 05, 2011.

Here's a futile but cathartic bug report I filed against Python recently.

In Python, string.split and re.split both take an optional maxsplit argument that limits the number of splits performed. This is unlike Perl's split builtin, whose limit caps the number of pieces returned. But it makes sense, I guess, and consistency between the two languages is not something I'd necessarily expect.
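
To make the "number of splits" semantics concrete, here is a small sketch. The sample string is made up for illustration. A limit of 2 means two splits, so both functions hand back up to three pieces.

```python
import re

text = "a,b,c,d"  # made-up sample input

# maxsplit counts splits, not pieces: 2 splits yield 3 pieces.
print(text.split(",", 2))               # ['a', 'b', 'c,d']
print(re.split(",", text, maxsplit=2))  # ['a', 'b', 'c,d']
```

So for ordinary positive limits, the two functions agree with each other.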

However, consistency within a language...a reasonable expectation, no?

The inconsistency lies in how string.split and re.split handle the edge cases of "do an unlimited number of splits" and "don't do any splits." The two agree that "unlimited splits" is the default. They don't agree on how to interpret the value of an explicit maxsplit parameter.

              maxsplit=0         maxsplit=-1
string.split  no splits          unlimited splits
re.split      unlimited splits   no splits
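
The table can be checked with a few lines at the interpreter. This is a sketch using a made-up sample string, and the results shown are what CPython gives. The negative-maxsplit behavior of re.split in particular is observed behavior, not something I'd assume is guaranteed.

```python
import re

text = "a b c"  # made-up sample input

# String split: 0 means "no splits", -1 means "unlimited".
print(text.split(" ", 0))   # ['a b c']
print(text.split(" ", -1))  # ['a', 'b', 'c']

# re.split: 0 means "unlimited", while a negative value
# (as observed in CPython) results in no splits at all.
print(re.split(" ", text, maxsplit=0))   # ['a', 'b', 'c']
print(re.split(" ", text, maxsplit=-1))  # ['a b c']
```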

I think string.split is doing the sensible thing here.

Of course, the "bug" has zero chance of being fixed at this point. I pretty much just filed it to create a search result for others similarly bitten, annoyed, or both.

Posted by Alan on Saturday, November 05, 2011.