Python best practices (fantascienza.net)
76 points by alexk on July 8, 2008 | 35 comments


Most of that stuff is a bit silly but I learned python before it supported verbose regular expressions so I found this example useful.

  finder = re.compile(r"""
    ^ \s*         # start at beginning+ opt spaces
    ( [\[\]] )    # Group 1: opening bracket
    \s*           # optional spaces
    ( [-+]? \d+ ) # Group 2: first number
    \s* , \s*     # opt spaces+ comma+ opt spaces
    ( [-+]? \d+ ) # Group 3: second number
    \s*           # opt spaces
    ( [\[\]] )    # Group 4: closing bracket
    \s* $         # opt spaces+ end at the end
    """, flags=re.VERBOSE)


I love regular expressions as much as anybody, but I never understood why people wrote code like that.

r"\s" is repeated 6 times in there, each time with a comment. Is there a reason nobody uses DRY when writing a regexp?

    spaces = r"\s*"
    number = r"([-+]?\d+)"
    bracket = r"([\[\]])"
    finder = re.compile(spaces.join(["^", bracket, number, ",", number, bracket, "$"]))
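For anyone who wants to check that the joined version really is the same pattern, here is a minimal sketch (Python 3 syntax; the sample strings and the helper name `groups_or_none` are made up for illustration). It compiles both forms and compares what they capture:

```python
import re

# Verbose form, as in the parent comment (comments stripped for brevity).
verbose = re.compile(r"""
    ^ \s*
    ( [\[\]] )
    \s*
    ( [-+]? \d+ )
    \s* , \s*
    ( [-+]? \d+ )
    \s*
    ( [\[\]] )
    \s* $
    """, flags=re.VERBOSE)

# Joined form: optional whitespace is inserted between every piece.
spaces = r"\s*"
number = r"([-+]?\d+)"
bracket = r"([\[\]])"
joined = re.compile(spaces.join(["^", bracket, number, ",", number, bracket, "$"]))

# Hypothetical inputs: two that should match, two that should not.
samples = ["[1,2]", "  ] -3 , +40 [ ", "[1 2]", "(1,2)"]


def groups_or_none(pattern, text):
    """Return the captured groups, or None when the text does not match."""
    m = pattern.match(text)
    return m.groups() if m else None


results = [(groups_or_none(verbose, s), groups_or_none(joined, s)) for s in samples]
```

Each pair in `results` holds the verbose result and the joined result for one sample, so comparing the two halves of every pair is enough to see whether the patterns agree on these inputs.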


That is the most beautiful bit of RE I have seen in ages. Thank you.

I always do one-off RE transformations of text in vi or even just sed, but this alone is reason enough to start writing my more complicated RE one-offs in Python.


This is an excellent overview of how Python is usually written by people who write (and read) a lot of it.

I hardly ever see coding conventions documents that do such a good job of capturing the popular conventions without inserting quirky personal preferences. I clicked through expecting something to point and laugh at, but was pleasantly surprised!


Always use 4 spaces as indent

My god how I wish everyone did this... would make life so much easier.


Awww, I like 3! 3 makes the tab levels very distinct, without wasting space. Of course it's just a matter of personal preference. But, is 4 really a widespread standard in the Python world?


"is 4 really a widespread standard in the Python world?" yes. http://www.python.org/dev/peps/pep-0008/


I am so sad to see this. Thanks for posting it, though. Maybe I'll...ghhk..ekkk...switch. I went through the entire PEP and found that I'm following every other convention but two (multiple imports on one line, and I seldom write docstrings).


What's wrong with 3?

IMO, indent is personal preference, and as long as you're 100% consistent, and stay away from the devil-spawn tab character, you're ok.


I read a study that showed that 2-, 3-, or 4-space indents were equivalent for readability. Therefore, I always use 2.


> if x is None: ...
> if items is None: ...

None's truth value is false, so the above are equivalent to:

  if x: ...
  if items: ...
Empty sequences and mappings are also considered false, so you don't need to

  if len(items): ...
Instead, you should

  if items: ...
On a side note, does it annoy the hell out of anyone else that a sequence's length is len(foo) instead of foo.length() (or size, count, etc)?


There's a reason why "if x is None" is idiomatic when "if not x" is shorter:

  >>> x = ''
  >>> if x is None:
  ...   print 'x is None'
  ...
  >>> if not x:
  ...   print 'x is empty'
  ...
  x is empty
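
To make the difference concrete, a small sketch in Python 3 syntax (the helper name `describe` is just for illustration) that runs both checks over a handful of values:

```python
def describe(x):
    """Report the result of the identity check and the truthiness check for x."""
    return {"is_none": x is None, "is_falsy": not x}


# Several values are falsy without being None; [0] is a non-empty list, so it is truthy.
checks = {repr(v): describe(v) for v in [None, "", 0, [], {}, "text", [0]]}
```

Only `None` passes the `is None` check, while the empty string, zero and the empty containers are all falsy. That is why `if x is None` is the safer test when an empty value is a legitimate input.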


Good point. I was only thinking of sequence types, and failed to see that it would break with strings. I also somehow forgot to include "not" in the if statements! Pretty sad for a first post...


Most of the builtins have a corresponding __magic_attribute__ for objects that support it: bool -> foo.__nonzero__ (renamed __bool__ in Python 3), len -> foo.__len__, iter -> foo.__iter__. This is usually because some syntactic sugar or other built-in function/method relies on it being there. Example: "for x in foo" relies on foo having __iter__ (or, failing that, __getitem__).
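
A rough sketch of that mapping, written for Python 3 (where the truth hook is spelled `__bool__`). The classes `Countdown` and `Squares` are invented for the example:

```python
class Countdown:
    """Toy container that implements the hooks behind len(), iter() and bool()."""

    def __init__(self, start):
        self.start = start

    def __len__(self):            # used by len(obj)
        return self.start

    def __iter__(self):           # used by "for x in obj" and iter(obj)
        return iter(range(self.start, 0, -1))

    def __bool__(self):           # used by bool(obj) and "if obj:"
        return self.start > 0


class Squares:
    """No __iter__ here: iteration falls back to __getitem__ with 0, 1, 2, ...
    and stops when IndexError is raised."""

    def __getitem__(self, i):
        if i >= 4:
            raise IndexError(i)
        return i * i


c = Countdown(3)
length = len(c)
items = list(c)
truth = bool(c)
squares = [x for x in Squares()]
```

The second class is there to show the older fallback: an object with only `__getitem__` can still be looped over.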


bool(x) may sometimes do more than x is None. If x is not None, then truth value is determined by calling __nonzero__ or __len__. For an ordinary sequence type that's fine, but some years ago I had code like this:

if not self.db: self.db = bsddb.hashopen(...).

I just couldn't find out why my process spent valuable seconds apparently reading in the entire bsddb database into memory at random times -- but that's because bool(self.db) above turned into len(self.db.keys()) != 0
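
The same trap can be reproduced without bsddb. This is only a sketch in Python 3 syntax: `FakeDB` is a hypothetical stand-in for a handle whose length is expensive to compute, not the real bsddb object:

```python
class FakeDB:
    """Hypothetical database handle: defines __len__ but no __bool__."""

    def __init__(self, keys):
        self._keys = list(keys)
        self.len_calls = 0

    def __len__(self):
        self.len_calls += 1       # imagine a full scan of the database here
        return len(self._keys)


db = FakeDB(["a", "b"])

if not db:                        # truthiness test: falls back to __len__
    pass
calls_after_truth_test = db.len_calls

if db is None:                    # identity test: never touches __len__
    pass
calls_after_identity_test = db.len_calls
```

The counter goes up after the truthiness test and stays put after the identity test, which is the argument for writing `if self.db is None` when all you want to know is whether the handle has been opened.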


My comments:

> x=5 || x = 5

Noooo. That first one is backwards. Extraneous spaces annoy me to no end. Makes it a pain to search for things too.

On the other hand, using newlines to break things up at commas for example, is great. But that's not applicable here.

> class fooclass: ... || class Fooclass(object): ...

Is this a joke?

> d = dict() || frequences = {}

How can you say that longer names are always better? (Is that part of the message here?) Usually most variables are throw-away so using short names should be the most common case. Similar to mathematics.

> # Use iter* methods when possible

This should mention that thread-safety is probably the most directly relevant use case for the non-iterable methods.

> # coding: latin

Better to use this:

     #!/usr/bin/env python
     # -*- coding: UTF-8 -*-


Though I am a lowly Python noob, I'll tell you it's generally very bad practice to use a mutable as a default argument value. The reason is that a function's default is only ever initialized once. Case in point:

  >>> def f(l=[]):
  ...     l.append(0)
  ...     print l
  ... 
  >>> f()
  [0]
  >>> f()
  [0, 0]
  >>> f()
  [0, 0, 0]
Unless you 1) actually want the appearance of a 'static' local variable or 2) are really careful to make a copy of the mutable before messing around with it, you'll get yourself into trouble.
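
The usual workaround is a `None` sentinel. A minimal sketch in Python 3 syntax, with function names invented for the comparison:

```python
def append_zero_shared(items=[]):
    """The default list is created once, at definition time, and reused."""
    items.append(0)
    return items


def append_zero_fresh(items=None):
    """A new list is created on every call that omits the argument."""
    if items is None:
        items = []
    items.append(0)
    return items


# Copy the shared results so later calls do not change what we recorded.
shared_first = list(append_zero_shared())
shared_second = list(append_zero_shared())

fresh_first = append_zero_fresh()
fresh_second = append_zero_fresh()
```

The shared version keeps growing between calls, while the sentinel version starts from an empty list each time.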


I always took that as a case against mutating arguments, not against defaults. You still have the same problem if you pass in an argument:

  >>> MY_CONSTANT = ['foo', 'bar', 'baz']
  >>> def f(l):
  ...     l.append(0)
  ...     print l
  ...
  >>> f(MY_CONSTANT)
  ['foo', 'bar', 'baz', 0]
  >>> f(MY_CONSTANT)
  ['foo', 'bar', 'baz', 0, 0]

I would've rewritten f as:

  def f(l=[]):
      print l + [0]
...unless you specifically want f to mutate its caller's variables and have documented it as such.

Generally, I try to avoid mutating objects unless a.) I just created the object within the function or b.) it's specifically intended as a "long lived" data structure, i.e. something that survives multiple user interactions. For everything else, I try to use the non-mutating operations (slicing, concatenation, list comprehensions) or make an explicit copy of the argument.
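
As a sketch of the non-mutating style (Python 3 syntax; `with_zero` is a made-up name and it returns the list instead of printing it), the function builds a new list and leaves the caller's object alone:

```python
MY_CONSTANT = ["foo", "bar", "baz"]


def with_zero(items=()):
    """Return a new list with 0 appended; the argument is never modified."""
    return list(items) + [0]


first = with_zero(MY_CONSTANT)
second = with_zero(MY_CONSTANT)
no_args = with_zero()
```

Using an immutable tuple as the default sidesteps the shared-default issue entirely, since there is nothing to mutate by accident.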


> class fooclass: ... || class Fooclass(object): ...

> Is this a joke?

No... it is generally recommended that class names are capitalized, and any classes you create are supposed to inherit from object. This mainly comes into effect when using super().

In Python 3K, I'm pretty sure that all classes will inherit from object without having to explicitly say it.
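
For what it looks like once that lands, here is a small sketch in Python 3 syntax (class names are illustrative). A class written without an explicit base still derives from `object`, and `super()` works on it:

```python
class Base:                       # no explicit (object), still a new-style class
    def greet(self):
        return "base"


class Child(Base):
    def greet(self):
        # Zero-argument super() is Python 3 syntax.
        return "child+" + super().greet()


is_new_style = issubclass(Base, object)
greeting = Child().greet()
```

In Python 2 the equivalent call would be spelled `super(Child, self).greet()`, and it only works when `Base` inherits from `object`.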


More importantly, property setters silently do nothing in old-style classes: assignment just goes straight through to the attribute without calling the setter, while getters work fine.
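
Here is a sketch of a property whose setter actually runs, using a class that inherits from `object`. It is written for Python 3, where the old-style failure mode can no longer be reproduced, and the `Temperature` class is invented for the example:

```python
class Temperature(object):
    """New-style class: assignment to .celsius goes through the setter."""

    def __init__(self, celsius):
        self.celsius = celsius    # routed through the setter below

    @property
    def celsius(self):
        return self._celsius

    @celsius.setter
    def celsius(self, value):
        if value < -273.15:
            raise ValueError("below absolute zero")
        self._celsius = float(value)


t = Temperature(20)
t.celsius = 25                    # setter validates and converts to float
try:
    t.celsius = -300              # rejected by the setter
    rejected = False
except ValueError:
    rejected = True
```

In an old-style Python 2 class the last assignment would have been stored without complaint, which is the silent failure described above.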


> Extraneous spaces annoy me to no end.

It is recommended in the "official" python style guide to surround the assignment operator with a single space: http://www.python.org/dev/peps/pep-0008/


And look how complicated they make it. Sometimes spaces, sometimes not, how confusing. And tell me that their "Use spaces around arithmetic operators" example doesn't make you want to puke:

          i = i + 1
          submitted += 1
          x = x * 2 - 1
          hypot2 = x * x + y * y
          c = (a + b) * (a - b)
Who writes like that?? I very strongly disagree.



I do. I think it looks a lot better and more clear. It certainly looks a lot like what I would write down on paper, and that is one of the inherent qualities of Python in general.


The one place that I don't follow the guideline (unless it's the guideline and I just don't know it... it rarely comes up, so I haven't bothered to check) is on array indices. I do write:

  a[i+1]


Same here. I conceptualize it as part of a code compactness strategy: array indices are part of the same item, and thus I use syntax (like yours) that suggests inlining.


Same here, for the same reason.


I do. I find that code with the spaces around assignment and arithmetic operators is way easier to read.


I do all of these (even in interactive shell) except I have a weak spot for this sort of notation:

    i+=2


I think he was saying use the curly braces instead of the dict() constructor.


Instead of:

  tot = x + y
Better:

  sum_ = x + y
And I can't see anything wrong with:

  freqs = {}
  for c in "abracadabra":
      try:
          freqs[c] += 1
      except KeyError:
          freqs[c] = 1
Or:

  nested = [[1, 2, 3], [4], [5, 6]]
  flattened = sum(nested, [])
It is worth mentioning dict.setdefault():

  indices = {}
  for i, c in enumerate("abracadabra"):
      indices.setdefault(c, []).append(i)
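
For comparison, a sketch of the standard-library alternatives to those three snippets, in Python 3 syntax. One caveat on the flattening example: `sum(nested, [])` copies the accumulated list at every step, so it gets slow on long inputs, whereas `itertools.chain.from_iterable` does not have that problem:

```python
from collections import Counter, defaultdict
from itertools import chain

# Frequency count without try/except.
freqs = Counter("abracadabra")

# Same grouping as the setdefault() loop, with the default supplied by the dict.
indices = defaultdict(list)
for i, c in enumerate("abracadabra"):
    indices[c].append(i)

# Flatten one level of nesting in linear time.
nested = [[1, 2, 3], [4], [5, 6]]
flattened = list(chain.from_iterable(nested))
```

None of this makes the originals wrong; these are just the spellings that tend to show up in newer code.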


Just in time for the start of our python project



Biased but interesting. Seems to me that there are only two examples of 'strange' here: the 'while(<>)' and the regex. Interesting thing about the regex: Python itself uses Perl-compatible regexes, so throw that item on the trash heap. As for the other, it's shorthand for: grab a file from the command line, open it, feed each line (\n terminated) to $_ (it could just as easily have been a declared variable, but $_ is always there) and parse it with the regex. While you're at it, throw away the first and second parts, set $_ to the third part and continue. After that, get another string from the file. All of this is covered in Wrox's Perl for Beginners as well as elsewhere. I've done reasonable projects in both languages. Don't have anything bad to say about Python, just wonder where all of the 'have to prove my language is better' crap is coming from...


What are you talking about?



