Possibly the most interesting anti-pattern I saw was: a_list_of_words = "my list...

oddthink · on July 9, 2014

I do that all the time in the interpreter, especially when slicing pandas DataFrame objects, e.g.:

    df_subset = df['date buyer nwidgets'.split()]

That is far easier to type than the explicit list, with all its punctuation. Now, it's definitely weird that they did a `split(" ")` rather than just using the default, but the idea is the same.

I do try to strip stuff like that out before I put it into a script, replacing it with the explicit list, but I'm never sure if that actually improves anything. It's not as if the explicit list is any easier to read.

klibertp · on July 9, 2014

It's not weird, it's wrong:

    In [1]: "a    string     r".split(" ")
    Out[1]: ['a', '', '', '', 'string', '', '', '', '', 'r']

    In [2]: "a    string     r".split()
    Out[2]: ['a', 'string', 'r']

glomph · on July 9, 2014

In their case, though, that wouldn't have been a problem, as the words were each separated by a single space.

manoDev · on July 9, 2014

Split without an argument is, for space-separated input, equivalent to:

    >>> filter(None, " quick   hack for  split".split(" "))
    ['quick', 'hack', 'for', 'split']

klibertp · on July 9, 2014

Yeah, or re.split(r'\s+', ...), or something - the issue here is that you need to know about this.
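To make the differences concrete, here is a quick sketch comparing the three spellings. It assumes Python 3 (where filter returns an iterator, hence the list() call), and the sample string is made up for illustration:

```python
import re

# Hypothetical sample: leading space, a tab, a newline, a double space,
# and a trailing space.
sample = " quick \t hack\nfor  split "

# Default split(): any run of whitespace is one separator, edges are trimmed.
default_split = sample.split()

# filter(None, ...) drops the empty strings, but only " " acted as a
# separator, so the tab and the newline survive inside the tokens.
filtered_split = list(filter(None, sample.split(" ")))

# re.split handles every kind of whitespace, but leading and trailing
# whitespace leave empty strings at both ends of the result.
regex_split = re.split(r"\s+", sample)

print(default_split)
print(filtered_split)
print(regex_split)
```

So the three only agree when the input contains nothing but single or repeated spaces between words and no whitespace at the edges.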

From the comment a few levels up, I understood that the code calling str.split with a " " argument gave no sign that whoever wrote it knew about its semantics. If they did, and it really was what they intended, then of course it's completely OK; but if not, it can easily lead to bugs.

For example, if the user is required to input several ints separated with whitespace, this:

    map(int, input_str.split())

will raise only in expected cases, while this:

    map(int, input_str.split(" "))

can lead to rejecting correct input just because someone pressed space twice. It's very frustrating for the user, too, because whitespace is hard to spot visually.
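A minimal sketch of that failure mode, assuming Python 3; parse_ints and the sample input are hypothetical names made up for this illustration:

```python
def parse_ints(text, sep=None):
    """Parse integers from text; sep=None means str.split()'s default behaviour."""
    return [int(token) for token in text.split(sep)]


user_input = "1  2 3"  # note the double space after the 1

# Default split: the run of two spaces counts as a single separator.
print(parse_ints(user_input))

# Explicit " ": the double space yields an empty token, and int("") raises.
try:
    parse_ints(user_input, " ")
except ValueError as exc:
    print("rejected:", exc)
```

The input is perfectly reasonable from the user's point of view, yet the second call rejects it.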

So, I don't know if this qualifies as an antipattern, but if I saw .split(" ") instead of .split() in the code, I'd at the very least expect a comment explaining why it's used.

blossoms · on July 9, 2014

I don't mean to be pedantic, but a list (I am assuming df is a list) requires an int.

(That sentence I wrote about using hashable types need not apply, sorry!)

If you ran that code, you would get this error:

    TypeError: list indices must be integers, not list

gipp · on July 9, 2014

That's a pandas dataframe (idiomatically denoted df), not a list. It has funky slicing properties, and he's selecting columns of the dataframe in a perfectly valid way.

bluecalm · on July 9, 2014

This is actually useful: you may want to experiment with different list_of_words in the future and typing the words between [" ", " ", " "] is time consuming. It's also less readable.

coldtea · on July 9, 2014

How is this a real "anti-pattern"?

An inefficient or bizarre way to do it, maybe.

Anti-pattern is supposed to mean something more, though.

In this case there are no adverse effects and no ambiguity -- so I guess the programmer was just too lazy to construct the list.

james2vegas · on July 9, 2014

perhaps that line was written by someone used to Perl, where they would have had

    @a_list_of_words = qw/my list of words/;

there

dozzie · on July 9, 2014

Or a Rubyist:

    a_list_of_words = %w{my list of words}

rbonvall · on July 10, 2014

You'd be sure if they wrote:

    >>> qw = str.split
    >>> qw('my list of words')
    ['my', 'list', 'of', 'words']

_ondq · on July 9, 2014

Why even use a list here? Tuples are for immutable/constant data.

    a_tuple_of_words = ("my", "tuple", "of", "words")

or

    a_tuple_of_words = "my", "tuple", "of", "words"

dozzie · on July 9, 2014

...because it's a list? Tuples were supposed to have a structure (at least that's what all the rest of the world thinks of them), so iterating through a combination of apples, cars and languages makes no sense whatsoever.

But yes, Python entirely misses the point of tuples, treating them as read-only lists.

http://dozzie.jogger.pl/2014/04/11/python-tuples-the-useless...

pdonis · on July 10, 2014

> Tuples were supposed to have a structure (at least that's what all the rest of the world thinks of them)

No, a structure is, you know, a structure -- what C calls a struct. Python calls it a namedtuple. If some people call it just a tuple, well, that's a difference in terminology, but it doesn't mean Python is confused about the concepts, it's just using terminology you're not used to.

Also, if we're going to be pedantic about the meaning of data types, your blog post is wrong about lists. You say "position in the list doesn't matter", but that means ordering doesn't matter, and an unordered collection of similar objects is a set, not a list. Python makes this distinction clear: a list is ordered, a set is not.
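A tiny illustration of that distinction, assuming Python 3; the flag values are invented for the example:

```python
# Two lists with the same elements in different positions.
flags_a = [True, False, False]
flags_b = [False, False, True]

# Lists compare element by element, so position is part of their identity.
lists_equal = flags_a == flags_b

# Sets ignore both order and multiplicity, so the same data compares equal.
sets_equal = set(flags_a) == set(flags_b)

print(lists_equal, sets_equal)
```

The lists differ and the sets do not, which is exactly the information that ordering carries.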

dozzie · on July 10, 2014

> [...] that's a difference in terminology, but it doesn't mean Python is confused about the concepts

No, it means exactly this. The term "tuple" and its use predates Python. Sorry, no banana.

> [...] your blog post is wrong about lists. You say "position in the list doesn't matter", but that means ordering doesn't matter

Oh, so what's the difference in meaning of element True on position 1 and element True on position 20? Position in list doesn't matter if we're talking about meaning of the elements.

pdonis · on July 10, 2014

> The term "tuple" and its use predates Python.

References, please? And not mathematical references; programming references. C was using the keyword "struct" long before Python to refer to what you are calling a tuple.

> what's the difference in meaning of element True on position 1 and element True on position 20?

The fact that the index is 1 instead of 20. Both elements have the same type, and might well refer to the same property of some sequence of things; but the index being 1 instead of 20 means the element True is describing that property relative to the first item in some sequence, instead of the 20th item. That's why position in the list makes a difference: the ordering of the items, as well as the type of the items, carries information.

(Of course, in Python the list items don't even have to be of the same type; but most uses of Python lists in practice that I've seen do assume that all the elements are "the same kind of thing".)

fjh · on July 10, 2014

> References, please? And not mathematical references; programming references.

ML had tuples more than a decade before Python existed.

_ondq · on July 9, 2014

I'm not sure what you're getting at here, or what you are expecting tuples to be like. They can have as much "structure" as you need--they're just a collection.

Lighter-weight, immutable collections have a use case. The code in OP appears to be one where it makes sense. I follow the rule where variables are mutable IFF they need to be mutable.
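For anyone following along, a short sketch of what "immutable" buys you here and where it stops, assuming Python 3; the variable names are made up:

```python
words_list = ["my", "list", "of", "words"]
words_tuple = ("my", "tuple", "of", "words")

# A list accepts item assignment.
words_list[0] = "your"

# A tuple does not: item assignment raises TypeError.
try:
    words_tuple[0] = "your"
    tuple_was_modified = True
except TypeError:
    tuple_was_modified = False

# Caveat: the immutability is shallow. A mutable object stored inside
# a tuple can still be changed in place.
nested = ("label", [1, 2])
nested[1].append(3)

print(words_list, words_tuple, tuple_was_modified, nested)
```

So a tuple of strings is effectively constant, while a tuple holding lists or dicts only fixes which objects it refers to.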

dozzie · on July 9, 2014

Pythonistas use tuples as if they were mere lists, just immutable. This is clearly displayed by Python's own interface.

For the rest of the world, tuples are not immutable lists. They are tuples, i.e. collections of "objects" that could share nothing about their type. Tuples often are not even iterable! (Erlang, Haskell)

The fact that tuples in Python can have as much structure as one wants is derived from dynamic typing, not from the tuples' nature. You could say the same about Python's lists.

This is a really subtle issue. It takes knowing more languages to see it clearly.

maxerickson · on July 9, 2014

Have you looked at named tuples? They shipped in the Python standard library sometime in the last few years (they are at least in 3.3) and are clearly intended for storing structured data.

A typical rule of thumb in Python land is that heterogeneous data probably belongs in a tuple, so practice goes a little further than immutable lists.
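A minimal sketch of a named tuple holding heterogeneous data, assuming Python 3; the Order type and its fields are hypothetical, loosely borrowed from the column names mentioned earlier in the thread:

```python
from collections import namedtuple

# Hypothetical record type: each position has a name and its own meaning.
Order = namedtuple("Order", ["date", "buyer", "nwidgets"])

order = Order(date="2014-07-09", buyer="alice", nwidgets=3)

# Fields are reachable by name or by position, and the record unpacks
# like any other tuple.
buyer_by_name = order.buyer
buyer_by_index = order[1]
date, buyer, nwidgets = order

# Still immutable: _replace returns a new record instead of modifying this one.
bigger_order = order._replace(nwidgets=5)

print(order)
print(bigger_order)
```

Here the position of each element determines what it means, which is the "structure" sense of a tuple rather than the "immutable list" sense.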

I think you could improve your demonstration of the usage in the standard library by examining a random selection of usages to try to find out what is typical. But maybe you already looked at more than you talk about in the article (and I understand that this might not be an interesting use of your time).

dozzie · on July 10, 2014

No, I haven't looked at them. Python 2 has had them since release 2.6, so they're out of my reach for any practical purpose at the moment (I need to preserve compatibility with Python 2.4).

> A typical rule of thumb in Python land is that heterogeneous data probably belongs in a tuple, so practice goes a little further than immutable lists.

The problem with Python tuples is that they mix two things: immutable lists and a container for heterogeneous data. It's the same situation as with JavaScript's objects.

_ondq · on July 9, 2014

If it's so subtle, does it matter? This sounds like you just have a problem with the word "tuple" applied to an object that behaves differently from tuples in a statically-typed language.

Would you feel better if they named it "ImmutableList" instead?

jholman · on July 9, 2014

Can't speak for GP, but I would [feel better with that name].

(Although I agree with you that statically-typed-language-tuples don't seem to make sense in Python.)

But hey... Python's weird choice of how to name the ImmutableList could be worse, right?

For example, someone could be malicious enough to call their general-purpose associative array a "hash", just because a hashmap (note: not a hash) is a good implementation for large associative arrays. Wow, that'd be hilariously misleading, wouldn't it? Good times!

Or imagine someone was silly enough to name their auto-resizing arrays "vectors", even though in all previously existing contexts a "vector" is a sort of thing which absolutely cannot be meaningfully resized/extended. Ha. Think of the tiny cognitive burden placed on generations of future programmers-who-study-math, trying to juggle these two very-similar-but-distinct concepts, multiplied by the number of such future programmers. Amazing practical joke, right?

/rant

wtetzner · on July 10, 2014

Not really that important, but I think map is a better name than associative array.

dozzie · on July 10, 2014

No, it's not a problem with the word "tuple" behaving differently from statically-typed languages. It's a problem with the word "tuple" behaving differently from all the rest of the world.

Yes, I would feel better if it was named "ImmutableList" or any other way that is not misleading about the purpose.

jl6 · on July 9, 2014

I'm sorry, I don't see clearly what a tuple should be. What would be different about Python tuples if they were true tuples?

alextgordon · on July 9, 2014

In most languages you can't usually:

1. Iterate over a tuple

2. Convert a list to a tuple

3. Construct a tuple of a length not known at compile-time

Python allows these because "why not?" but it does break their "one and only one way to do it" rule and confuses beginners a hell of a lot.
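For reference, all three operations in the list above look like this in Python. This is only a sketch assuming Python 3, and squares is a hypothetical helper:

```python
# A list whose contents are only known at runtime.
words = "my list of words".split()

# 2. Convert a list to a tuple.
as_tuple = tuple(words)

# 1. Iterate over a tuple like any other sequence.
lengths = [len(word) for word in as_tuple]


# 3. Construct a tuple whose length depends on a runtime value.
def squares(n):
    return tuple(i * i for i in range(n))


print(as_tuple, lengths, squares(4))
```

None of these would type-check in a language where the tuple's length and element types are part of its static type.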

There are definitely borderline cases. For instance, should a Vector be a list or a tuple? A Vec3 type is obviously a tuple, but a large Vector destined for BLAS is obviously a list.

dragonwriter · on July 9, 2014

> Python allows these because "why not?"

No, it allows them because the distinction that those restrictions are founded on is only useful in statically-typed languages, and Python isn't statically typed.

> For instance, should a Vector be a list or a tuple?

A real vector/array should be its own data type (probably implemented in a C, or similar low-level, extension) that happens to implement the interface expected of an indexable, iterable collection, neither a list nor a tuple.
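A rough pure-Python sketch of that idea, assuming Python 3. The Vector class below is hypothetical and uses the standard library's array module as a stand-in for a real C extension:

```python
from array import array
from collections.abc import Sequence


class Vector(Sequence):
    """Hypothetical numeric vector: neither a list nor a tuple, but it
    implements the indexable, iterable Sequence interface."""

    def __init__(self, values):
        # Store the values as packed C doubles.
        self._data = array("d", values)

    def __getitem__(self, index):
        if isinstance(index, slice):
            return Vector(self._data[index])
        return self._data[index]

    def __len__(self):
        return len(self._data)

    def __repr__(self):
        return "Vector(%r)" % list(self._data)

    def dot(self, other):
        if len(self) != len(other):
            raise ValueError("vectors must have the same length")
        return sum(a * b for a, b in zip(self, other))


v = Vector([1, 2, 3])
w = Vector([4, 5, 6])
print(v, v[0], list(v[1:]), v.dot(w))
```

The Sequence base class supplies iteration, membership tests, index and count from just __getitem__ and __len__, so code written against "any sequence" accepts a Vector without caring whether it is a list or a tuple.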

dozzie · on July 10, 2014

> [this] distinction [...] is only useful in a statically-typed languages

...like Erlang.

dragonwriter · on July 10, 2014

Yeah, while snarkily made, it's a good point that Erlang does make use of it without being statically typed.

There is a deep difference in language approach between Python and Erlang here that goes beyond the use of tuples. Erlang, while dynamically typed, has a deep concern for types in its pattern matching system, which it uses to make path decisions, while Python is very much centered on using dynamic OO techniques -- how objects respond to messages -- to do that.

So I'd still say it's the same kind of deep language approach difference at work.

jl6 · on July 10, 2014

So why does Python distinguish between a list and a tuple at all?

dragonwriter · on July 10, 2014

Because the distinction between a mutable list and an immutable list is still meaningful in a dynamic language like Python.
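One place where the distinction shows up in practice is hashing: a tuple of hashable items can be a dict key or a set member, and a list cannot. A small sketch, assuming Python 3, with invented names:

```python
# A tuple of hashable items is itself hashable, so it works as a dict key.
treasure_map = {(3, 4): "treasure"}
found = treasure_map[(3, 4)]

# A list is mutable and therefore unhashable: using one as a key fails.
try:
    broken_map = {[3, 4]: "treasure"}
    list_key_ok = True
except TypeError:
    list_key_ok = False

print(found, list_key_ok)
```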

hueving · on July 10, 2014

One is mutable, one is not. Performance. Also, look up namedtuple. Useful for returning multiple values.

Peaker · on July 9, 2014

Python tuples are used in both ways.

Even in Haskell, though, people often write all kinds of type-class magic to allow "iterating" over a tuple. For example, a Binary instance over a tuple wants to call "put" on each element.

Haskell's (Oleg's) HList is basically a tuple with iteration/list-like operations.

dragonwriter · on July 9, 2014

The distinction between "tuple" and "immutable list" doesn't make any sense outside of a statically-typed language, since the only difference is what other values a particular value is type-compatible with.

dozzie · on July 10, 2014

Yes, of course it doesn't. You have just made the whole of Erlang vanish. Or is it statically typed?...

blossoms · on July 9, 2014

NB: a tuple of only one item requires a trailing comma, and a tuple of zero items is written as ().
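A quick sketch of that gotcha, assuming Python 3; the variable names are made up:

```python
one_item = ("word",)     # the trailing comma makes this a tuple
not_a_tuple = ("word")   # the parentheses alone do nothing: this is a str
bare = "word",           # the comma is what matters, not the parentheses
empty = ()               # the empty tuple is the one case with no comma

# The classic bug: iterating over the accidental string yields characters.
items_from_tuple = [item for item in one_item]
items_from_str = [item for item in not_a_tuple]

print(type(one_item), type(not_a_tuple), type(bare), type(empty))
print(items_from_tuple, items_from_str)
```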