That is the most beautiful bit of RE I have seen in ages. Thank you.
I always do one-off RE transformations of text in vi or even just sed, but this alone is reason enough to start writing my more complicated RE one-offs in Python.
This is an excellent overview of how Python is usually written by people who write (and read) a lot of it.
I hardly ever see coding conventions documents that do such a good job of capturing the popular conventions without inserting quirky personal preferences. I clicked through expecting something to point and laugh at, but was pleasantly surprised!
Awww, I like 3! 3 makes the tab levels very distinct, without wasting space. Of course it's just a matter of personal preference. But, is 4 really a widespread standard in the Python world?
I am so sad to see this. Thanks for posting it, though. Maybe I'll...ghhk..ekkk...switch. I went through the entire PEP and found that I'm following every other convention but two (multiple imports on one line, and I seldom write docstrings).
Good point. I was only thinking of sequence types, and failed to see that it would break with strings. I also somehow forgot to include "not" in the if statements! Pretty sad for a first post...
Most of the builtins have a corresponding __magic_attribute__ for objects that support it: bool -> foo.__bool__, len -> foo.__len__, iter -> foo.__iter__. This is usually because some syntactic sugar or other built-in function/method relies on it being there. Example: "for x in foo" relies on foo having __iter__ (or an equivalent, maybe).
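To make that concrete, here is a minimal sketch. `Playlist` is a made-up class, not from any library; it only shows which hook each builtin ends up calling. On Python 2 the truth hook is spelled `__nonzero__`, so the sketch aliases it.

```python
class Playlist(object):
    """Hypothetical container used only to show which hooks the builtins call."""

    def __init__(self, songs):
        self._songs = list(songs)

    def __len__(self):
        # len(playlist) ends up here
        return len(self._songs)

    def __iter__(self):
        # iter(playlist) and "for song in playlist" end up here
        return iter(self._songs)

    def __bool__(self):
        # bool(playlist) ends up here on Python 3
        return bool(self._songs)

    __nonzero__ = __bool__  # Python 2 spelling of the same hook


playlist = Playlist(["a", "b", "c"])
count = len(playlist)
songs = [song for song in playlist]
```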
bool(x) may sometimes do more than x is None. If x is not None, then the truth value is determined by calling __nonzero__ or __len__. For an ordinary sequence type that's fine, but some years ago I had code like this:
if not self.db: self.db = bsddb.hashopen(...).
I just couldn't find out why my process spent valuable seconds apparently reading the entire bsddb database into memory at random times -- but that's because bool(self.db) above turned into len(self.db.keys()) != 0.
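A minimal sketch of that trap, assuming Python 3 (where the hook is `__bool__`, with `__len__` as the fallback). `SlowStore` is a made-up stand-in for the database handle, not the real bsddb API; it just counts how often its length is requested.

```python
class SlowStore(object):
    """Hypothetical stand-in for a database handle whose length is expensive."""

    def __init__(self):
        self.len_calls = 0

    def __len__(self):
        # Imagine this scanning every key on disk.
        self.len_calls += 1
        return 0


store = SlowStore()

if not store:          # truth test falls back to __len__
    pass
calls_after_truth_test = store.len_calls

if store is None:      # identity test, no __len__ call
    pass
calls_after_identity_test = store.len_calls
```

So when the only question is "has this been opened yet?", `is None` is probably the safer test.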
Noooo. That first one is backwards. Extraneous spaces annoy me to no end. Makes it a pain to search for things too.
On the other hand, using newlines to break things up (at commas, for example) is great. But that's not applicable here.
> class fooclass: ... || class Fooclass(object): ...
Is this a joke?
> d = dict() || frequences = {}
How can you say that longer names are always better? (Is that part of the message here?) Usually most variables are throw-away so using short names should be the most common case. Similar to mathematics.
> # Use iter* methods when possible
This should mention that thread-safety is probably the most directly relevant use case for the non-iter* methods, since they return a separate list rather than iterating over the live dict.
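A rough sketch of why the snapshot matters, assuming Python 3, where `items()` is lazy like the old `iteritems()` and `list(d.items())` plays the role of the old list-returning `items()`. It is single-threaded: deleting inside the loop stands in for another thread changing the dict, and the `counts` data is made up.

```python
counts = {"a": 1, "b": 2, "c": 3}

# Lazy iteration: changing the dict's size mid-loop raises RuntimeError.
lazy_failed = False
try:
    for key in counts:
        if counts[key] > 1:
            del counts[key]
except RuntimeError:
    lazy_failed = True

# Snapshot iteration: the loop walks a separate list, so deleting is fine.
counts = {"a": 1, "b": 2, "c": 3}
for key, value in list(counts.items()):
    if value > 1:
        del counts[key]
```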
Though I am a lowly Python noob, I'll tell you it's generally very bad practice to use a mutable as a default argument value. The reason is that a function's default is only ever initialized once. Case in point:
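A minimal sketch of the pitfall (the function names `append_to` and `append_to_fixed` are made up for illustration), along with the usual `None` workaround:

```python
def append_to(item, bucket=[]):      # the default list is created once, at def time
    bucket.append(item)
    return bucket


first = append_to(1)
second = append_to(2)                # same list as before, now holding both items


def append_to_fixed(item, bucket=None):
    if bucket is None:               # make a fresh list on every call instead
        bucket = []
    bucket.append(item)
    return bucket


third = append_to_fixed(1)
fourth = append_to_fixed(2)
```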
Unless you 1) actually want the appearance of a 'static' local variable or 2) are really careful to make a copy of the mutable before messing around with it, you'll get yourself into trouble.
...unless you specifically want f to mutate its caller's variables and have documented it as such.
Generally, I try to avoid mutating objects unless a.) I just created the object within the function or b.) it's specifically intended as a "long lived" data structure, i.e. something that survives multiple user interactions. For everything else, I try to use the non-mutating operations (slicing, concatenation, list comprehensions) or make an explicit copy of the argument.
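Something like the following, as a rough sketch (the function names are made up for illustration). The first function changes its caller's list; the rest build new objects and leave the argument alone.

```python
def with_extra_mutating(items, extra):
    items.append(extra)          # changes the caller's list in place
    return items


def with_extra(items, extra):
    return items + [extra]       # concatenation builds a new list


original = [1, 2, 3]
result = with_extra(original, 4)
doubled = [n * 2 for n in original]  # list comprehension, also a new list
tail = original[1:]                  # slicing copies the selected items
```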
> class fooclass: ... || class Fooclass(object): ...
Is this a joke?
No... it is generally recommended that class names are capitalized, and any classes you create are supposed to inherit from object. This mainly comes into effect when using super().
In Python 3K, I'm pretty sure that all classes will inherit from object without having to explicitly say it.
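For what it's worth, a minimal sketch of the recommended form (class names are made up). It uses the two-argument `super()` so that it runs on both Python 2 and Python 3.

```python
class Base(object):                  # explicit "object" makes this new-style on Python 2
    def describe(self):
        return "base"


class Child(Base):
    def describe(self):
        # The two-argument form of super() works on Python 2 and Python 3.
        return "child of " + super(Child, self).describe()


description = Child().describe()
```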
More importantly, property setters will silently do nothing in old-style classes: the assignment just goes through without calling the setter, while getters work fine.
And look how complicated they make it. Sometimes spaces, sometimes not, how confusing. And tell me that their "Use spaces around arithmetic operators" example doesn't make you want to puke:
i = i + 1
submitted += 1
x = x * 2 - 1
hypot2 = x * x + y * y
c = (a + b) * (a - b)
I do. I think it looks a lot better and more clear. It certainly looks a lot like what I would write down on paper, and that is one of the inherent qualities of Python in general.
The one place that I don't follow the guideline (unless it's the guideline and I just don't know it... it rarely comes up, so I haven't bothered to check) is on array indices. I do write:
Same here. I conceptualize it as part of a code compactness strategy: array indices are part of the same item, and thus I use syntax (like yours) that suggests inlining.
Biased but interesting. Seems to me that there are only two examples of 'strange' here: the 'while(<>)' and the regex. Interesting thing about the regex: Python itself uses Perl-style regex, so throw that item on the trash heap. As for the other, it's shorthand for: grab a file from the command line, open it, feed each line (\n-terminated) to $_ (it could just as easily have been a declared variable, but $_ is always there) and parse it with the regex. While you're at it, throw away the first and second parts, set $_ to the third part and continue. After that, get another string from the file. All of this is covered in Wrox's Perl for Beginners as well as elsewhere. I've done reasonable projects in both languages. I don't have anything bad to say about Python; I just wonder where all of the 'have to prove my language is better' crap is coming from...