Python best practices

bayareaguy · on July 9, 2008

Most of that stuff is a bit silly but I learned python before it supported verbose regular expressions so I found this example useful.

  finder = re.compile(r"""
    ^ \s*         # start at beginning+ opt spaces
    ( [\[\]] )    # Group 1: opening bracket
    \s*           # optional spaces
    ( [-+]? \d+ ) # Group 2: first number
    \s* , \s*     # opt spaces+ comma+ opt spaces
    ( [-+]? \d+ ) # Group 3: second number
    \s*           # opt spaces
    ( [\[\]] )    # Group 4: closing bracket
    \s* $         # opt spaces+ end at the end
    """, flags=re.VERBOSE)

ken · on July 9, 2008

I love regular expressions as much as anybody, but I never understood why people wrote code like that.

r"\s" is repeated 6 times in there, each time with a comment. Is there a reason nobody uses DRY when writing a regexp?

    spaces = r"\s*"
    number = r"([-+]?\d+)"
    bracket = r"([\[\]])"
    finder = re.compile(spaces.join(["^", bracket, number, ",", number, bracket, "$"]))
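A quick way to check that the joined version really is the same pattern as the verbose one is to run both over a few inputs. This is only a sketch for Python 3; the sample strings and the `groups` helper are made up for illustration and appear nowhere in the thread.

```python
import re

# Verbose form, as posted in the first comment.
verbose = re.compile(r"""
    ^ \s*          # start at beginning + optional spaces
    ( [\[\]] )     # Group 1: opening bracket
    \s*            # optional spaces
    ( [-+]? \d+ )  # Group 2: first number
    \s* , \s*      # optional spaces + comma + optional spaces
    ( [-+]? \d+ )  # Group 3: second number
    \s*            # optional spaces
    ( [\[\]] )     # Group 4: closing bracket
    \s* $          # optional spaces + end of string
    """, flags=re.VERBOSE)

# Joined form, as posted in the reply.
spaces = r"\s*"
number = r"([-+]?\d+)"
bracket = r"([\[\]])"
joined = re.compile(spaces.join(["^", bracket, number, ",", number, bracket, "$"]))

# Hypothetical test inputs: two that should match, two that should not.
samples = ["[1,2]", "  ] -3 , +42 [  ", "[1;2]", "[1, 2"]


def groups(pattern, text):
    """Return the captured groups, or None when the text does not match."""
    match = pattern.match(text)
    return match.groups() if match else None


results = [(groups(verbose, s), groups(joined, s)) for s in samples]
for sample, (a, b) in zip(samples, results):
    print(repr(sample), a, a == b)
```

Both patterns accept either bracket on either side, which suits interval notation such as `]1, 2]`.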

delackner · on July 10, 2008

That is the most beautiful bit of RE I have seen in ages. Thank you.

I always do one-off RE transformations of text in vi or even just sed, but this alone is reason enough to start writing my more complicated RE one-offs in Python.

sah · on July 9, 2008

This is an excellent overview of how Python is usually written by people who write (and read) a lot of it.

I hardly ever see coding conventions documents that do such a good job of capturing the popular conventions without inserting quirky personal preferences. I clicked through expecting something to point and laugh at, but was pleasantly surprised!

tdavis · on July 9, 2008

Always use 4 spaces as indent

My god how I wish everyone did this... would make life so much easier.

bkovitz · on July 9, 2008

Awww, I like 3! 3 makes the tab levels very distinct, without wasting space. Of course it's just a matter of personal preference. But, is 4 really a widespread standard in the Python world?

danohuiginn · on July 10, 2008

"is 4 really a widespread standard in the Python world?" yes. http://www.python.org/dev/peps/pep-0008/

bkovitz · on July 10, 2008

I am so sad to see this. Thanks for posting it, though. Maybe I'll...ghhk..ekkk...switch. I went through the entire PEP and found that I'm following every other convention but two (multiple imports on one line, and I seldom write docstrings).

spydez · on July 9, 2008

What's wrong with 3?

IMO, indent is personal preference, and as long as you're 100% consistent, and stay away from the devil-spawn tab character, you're ok.

cheponis · on July 10, 2008

I read a study that showed that 2,3, or 4-space indents were equivalent for readability. Therefore, I always use 2.

yogipatel · on July 9, 2008

> if x is None: ...
> if items is None: ...

None's truth value is false, so the above are equivalent to:

  if x: ...
  if items: ...

Empty sequences and mappings are also considered false, so you don't need to

  if len(items): ...

Instead, you should

  if items: ...

On a side note, does it annoy the hell out of anyone else that a sequence's length is len(foo) instead of foo.length() (or size, count, etc)?

sah · on July 9, 2008

There's a reason why "if x is None" is idiomatic when "if not x" is shorter:

  >>> x = ''
  >>> if x is None:
  ...   print 'x is None'
  ...
  >>> if not x:
  ...   print 'x is empty'
  ...
  x is empty
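The same distinction matters most when None is used as a "nothing was passed" marker. A minimal sketch in Python 3, where `describe` is a hypothetical helper invented for this example:

```python
def describe(x=None):
    """Tell a missing value apart from an empty one."""
    if x is None:      # identity test: only None itself gets here
        return "missing"
    if not x:          # truth test: '', [], {} and friends get here
        return "empty"
    return "present"


print(describe())       # missing
print(describe(""))     # empty
print(describe([]))     # empty
print(describe("abc"))  # present
```

With only `if not x`, the first three calls would be indistinguishable.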

yogipatel · on July 9, 2008

Good point. I was only thinking of sequence types, and failed to see that it would break with strings. I also somehow forgot to include "not" in the if statements! Pretty sad for a first post...

etal · on July 9, 2008

Most of the builtins have a corresponding __magic_attribute__ for objects that support it: bool -> foo.__bool__, len -> foo.__len__, iter -> foo.__iter__. This is usually because some syntactic sugar or other built-in function/method relies on it being there. Example: "for x in foo" relies on foo having __iter__ (or an equivalent, maybe).
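As a rough illustration of those hooks, here is a Python 3 sketch with a hypothetical `Deck` class. One caveat on naming: the truth-value hook is spelled `__bool__` in Python 3 and `__nonzero__` in Python 2, and when neither is defined Python falls back to `__len__`.

```python
class Deck:
    """Hypothetical container used only to show the hooks."""

    def __init__(self, cards):
        self._cards = list(cards)

    def __len__(self):
        # Called by len(deck); also used for truth testing
        # because this class defines no __bool__.
        return len(self._cards)

    def __iter__(self):
        # Called by iter(deck), and therefore by "for card in deck".
        return iter(self._cards)


deck = Deck(["ace", "king"])
size = len(deck)                 # goes through __len__
names = [card for card in deck]  # goes through __iter__
empty_is_false = not Deck([])    # falls back to __len__() == 0
```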

Erwin · on July 9, 2008

bool(x) may sometimes do more than x is None. If x is not None, then truth value is determined by calling __nonzero__ or __len__. For an ordinary sequence type that's fine, but some years ago I had code like this:

if not self.db: self.db = bsddb.hashopen(...).

I just couldn't find out why my process spent valuable seconds apparently reading in the entire bsddb database into memory at random times -- but that's because bool(self.db) above turned into len(self.db.keys()) != 0
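The effect is easy to reproduce without bsddb. In this Python 3 sketch, `SlowStore` is a made-up stand-in for a database handle whose `__len__` is expensive; it just counts how often it gets called.

```python
class SlowStore:
    """Hypothetical stand-in for a handle with a costly __len__."""

    def __init__(self):
        self.len_calls = 0

    def __len__(self):
        self.len_calls += 1  # imagine a full scan of every key here
        return 3


db = SlowStore()

if not db:               # truth test: calls __len__ behind the scenes
    db = SlowStore()
calls_after_truth_test = db.len_calls

if db is None:           # identity test: never touches __len__
    db = SlowStore()
calls_after_identity_test = db.len_calls

print(calls_after_truth_test, calls_after_identity_test)
```

So for "has this been opened yet?" checks, `if self.db is None:` avoids the hidden call.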

attack · on July 8, 2008

My comments:

> x=5 || x = 5

Noooo. That first one is backwards. Extraneous spaces annoy me to no end. Makes it a pain to search for things too.

On the other hand, using newlines to break things up at commas for example, is great. But that's not applicable here.

> class fooclass: ... || class Fooclass(object): ...

Is this a joke?

> d = dict() || frequences = {}

How can you say that longer names are always better? (Is that part of the message here?) Usually most variables are throw-away so using short names should be the most common case. Similar to mathematics.

> # Use iter* methods when possible

This should mention that thread-safety is probably the most directly relevant use case for the non-iterable methods.

> # coding: latin

Better to use this:

     #!/usr/bin/env python
     # -*- coding: UTF-8 -*-

aston · on July 9, 2008

Though I am a lowly Python noob, I'll tell you it's generally very bad practice to use a mutable as a default argument value. The reason is that a function's default is only ever initialized once. Case in point:

  >>> def f(l=[]):
  ...     l.append(0)
  ...     print l
  ... 
  >>> f()
  [0]
  >>> f()
  [0, 0]
  >>> f()
  [0, 0, 0]

Unless you 1) actually want the appearance of a 'static' local variable or 2) are really careful to make a copy of the mutable before messing around with it, you'll get yourself into trouble.
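The usual way around this is to default to None and build the list inside the function. A minimal Python 3 sketch; `append_zero` is just an illustrative name:

```python
def append_zero(items=None):
    """Append 0 to items, creating a fresh list when none is given."""
    if items is None:
        items = []  # a new list on every call, not one shared default
    items.append(0)
    return items


print(append_zero())     # [0]
print(append_zero())     # [0]
print(append_zero([1]))  # [1, 0]
```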

nostrademons · on July 9, 2008

I always took that as a case against mutating arguments, not against defaults. You still have the same problem if you pass in an argument:

  >>> MY_CONSTANT = ['foo', 'bar', 'baz']
  >>> def f(l):
  ...     l.append(0)
  ...     print l
  ...
  >>> f(MY_CONSTANT)
  ['foo', 'bar', 'baz', 0]
  >>> f(MY_CONSTANT)
  ['foo', 'bar', 'baz', 0, 0]

I would've rewritten f as:

  def f(l=[]):
      print l + [0]

...unless you specifically want f to mutate its caller's variables and have documented it as such.

Generally, I try to avoid mutating objects unless a.) I just created the object within the function or b.) it's specifically intended as a "long lived" data structure, i.e. something that survives multiple user interactions. For everything else, I try to use the non-mutating operations (slicing, concatenation, list comprehensions) or make an explicit copy of the argument.
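A sketch of the non-mutating style in Python 3. `with_zero` is a hypothetical name, and the empty-tuple default is one possible choice for an immutable default, not something taken from the comment:

```python
MY_CONSTANT = ["foo", "bar", "baz"]


def with_zero(items=()):
    """Return a new list ending in 0, leaving the argument untouched."""
    return list(items) + [0]


first = with_zero(MY_CONSTANT)
second = with_zero(MY_CONSTANT)
print(first)        # ['foo', 'bar', 'baz', 0]
print(second)       # ['foo', 'bar', 'baz', 0]
print(MY_CONSTANT)  # ['foo', 'bar', 'baz']
```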

BrandonM · on July 9, 2008

> class fooclass: ... || class Fooclass(object): ...

> Is this a joke?

No... it is generally recommended that class names are capitalized, and any classes you create are supposed to inherit from object. This mainly comes into effect when using super().

In Python 3K, I'm pretty sure that all classes will inherit from object without having to explicitly say it.

Erwin · on July 9, 2008

More importantly, property setters will silently do nothing in old-style classes: the assignment just goes through without calling the setter, while getters work fine.
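For reference, this is roughly what a working setter looks like on a new-style class. The sketch is Python 3, where every class inherits from object implicitly, so the old-style failure cannot be reproduced there. `Temperature` and its limit check are invented for the example.

```python
class Temperature:
    """Hypothetical class; in Python 3 it is new-style automatically."""

    def __init__(self):
        self._celsius = 0.0

    @property
    def celsius(self):
        return self._celsius

    @celsius.setter
    def celsius(self, value):
        # On a new-style class, assignment is routed through here.
        if value < -273.15:
            raise ValueError("below absolute zero")
        self._celsius = float(value)


t = Temperature()
t.celsius = 25           # runs the setter, stores 25.0
try:
    t.celsius = -300     # rejected by the setter
    rejected = False
except ValueError:
    rejected = True
```

On a Python 2 old-style class, the second assignment would simply replace the attribute and the check would never run.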

DocSavage · on July 9, 2008

> Extraneous spaces annoy me to no end.

It is recommended in the "official" python style guide to surround the assignment operator with a single space: http://www.python.org/dev/peps/pep-0008/

attack · on July 9, 2008

And look how complicated they make it. Sometimes spaces, sometimes not, how confusing. And tell me that their "Use spaces around arithmetic operators" example doesn't make you want to puke:

          i = i + 1
          submitted += 1
          x = x * 2 - 1
          hypot2 = x * x + y * y
          c = (a + b) * (a - b)

Who writes like that?? I very strongly disagree.

sah · on July 9, 2008

"Who writes like that??"

The authors of:

BitTorrent: https://develop.participatoryculture.org/trac/democracy/brow...

Django: http://code.djangoproject.com/browser/django/trunk/django/ut...

Pylons: http://pylonshq.com/project/pylonshq/browser/pylons/util.py

Twisted: http://twistedmatrix.com/trac/browser/trunk/twisted/python/u...

BrandonM · on July 9, 2008

I do. I think it looks a lot better and more clear. It certainly looks a lot like what I would write down on paper, and that is one of the inherent qualities of Python in general.

BrandonM · on July 9, 2008

The one place that I don't follow the guideline (unless it's the guideline and I just don't know it... it rarely comes up, so I haven't bothered to check) is on array indices. I do write:

  a[i+1]

albertcardona · on July 9, 2008

Same here. I conceptualize it as part of a code compactness strategy: array indices are part of the same item, and thus I use syntax (like yours) that suggests inlining.

bkovitz · on July 9, 2008

Same here, for the same reason.

bkovitz · on July 9, 2008

I do. I find that code with the spaces around assignment and arithmetic operators is way easier to read.

thorax · on July 9, 2008

I do all of these (even in interactive shell) except I have a weak spot for this sort of notation:

    i+=2

pmorici · on July 8, 2008

I think he was saying use the curly braces instead of the dict() constructor.

d0mine · on July 10, 2008

Instead of:

  tot = x + y

Better:

  sum_ = x + y

And I can't see anything wrong with:

  freqs = {}
  for c in "abracadabra":
      try:
          freqs[c] += 1
      except KeyError:
          freqs[c] = 1

Or:

  nested = [[1, 2, 3], [4], [5, 6]]
  flattened = sum(nested, [])

It is worth mentioning dict.setdefault():

  indices = {}
  for i, c in enumerate("abracadabra"):
      indices.setdefault(c, []).append(i)
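For comparison, the standard library has ready-made tools for the same three jobs. This is a Python 3 sketch of alternatives, not a claim about what the article recommends. Note that `collections.Counter` arrived in later Python releases than the ones current when this thread was written, and that `sum(nested, [])` builds a new list at every step, so it gets slow on long inputs.

```python
from collections import Counter, defaultdict
from itertools import chain

word = "abracadabra"

# Same result as the setdefault loop: positions of each character.
indices = defaultdict(list)
for i, c in enumerate(word):
    indices[c].append(i)

# Same result as the try/except KeyError loop: character counts.
freqs = Counter(word)

# Same result as sum(nested, []), without the repeated list copies.
nested = [[1, 2, 3], [4], [5, 6]]
flattened = list(chain.from_iterable(nested))

print(dict(indices))
print(freqs["a"], freqs["b"])
print(flattened)
```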

hooande · on July 9, 2008

Just in time for the start of our python project

mleonhard · on July 9, 2008

VERY COOL! http://psyco.sourceforge.net/introduction.html

hsmyers · on July 8, 2008

Biased but interesting. Seems to me that there are only two examples of 'strange' here: the 'while(<>)' and the regex. Interesting thing about the regex: Python itself uses Perl-compatible regexes, so throw that item on the trash heap. As for the other, it's shorthand for: grab a file from the command line, open it, feed each line (\n terminated) to $_ (it could just as easily have been a declared variable, but $_ is always there) and parse it with the regex. While you're at it, throw away the first and second parts, set $_ to the third part and continue. After that, get another string from the file. All of this is covered in Wrox's Perl for Beginners as well as elsewhere. I've done reasonable projects in both languages. I don't have anything bad to say about Python; I just wonder where all of the 'have to prove my language is better' crap is coming from...

BrandonM · on July 9, 2008

What are you talking about?