Ten quirky things about Python
January 2008
Just thought I’d share a bunch of neat (and weird) things I’ve noticed about the Python programming language:
- You can chain comparisons, as in assert 3.14 < pi < 3.15. It’s a neat equivalent of assert pi > 3.14 and pi < 3.15 that you can’t do in most other languages.
- Ints don’t overflow at 31 (or 32) bits, they just get promoted to longs automatically. And long in Python doesn’t mean 64 bits, it means arbitrarily long (albeit somewhat slower). In fact, it looks like in Python 3000 there won’t even be the int/long distinction.
- Default values are only evaluated once, when the def statement is executed, not on each call. Try def func(A=[]): A.append(42); return A and the A-list will grow between calls. The Python tutorial has more.
- When concatenating strings, ''.join(list) is much faster than for x in list: s += x. In fact, the join is O(N) whereas the += is O(N²). There’s been a lot of debate about making this faster, and it looks like it should be faster in Python 2.5, but my tests show otherwise. Any ideas why?
- The syntax of print >>file, values is just plain weird. Not to mention the spacing “features” of print. I’m glad to hear that for Python 3000 they’re making print a function, and one with more sensible habits.
- You can create a one-element tuple with (x,). Tuples are normally written (x, y, z), but if you go (x) Python sees it as just a parenthesised value.
- And for all those times you reference methods of integer literals, you can go (5).__str__. You’d think it’d be just 5.__str__, but the parser thinks the 5. is a float and then gets stuck.
- You can use properties instead of getter and setter functions. For example, serial.baudrate = 19200 can set serial._baud as well as running some code to set your serial port’s bit rate.
- An else: clause after a for loop will be executed only if the loop doesn’t end via break. Quite useful for search loops and the like; in other languages you often need an extra test after the loop.
- You tell me yours. Ha! You thought you were going to escape. :-)
Comments
she 16 Jan 2008, 22:28
I’m sure people will come and dislike you for showing quirks, but then again nothing can beat critique in the same spirit, like C++ FQAs to C++ FAQs ;)
JJM 16 Jan 2008, 22:29
def f1(b):
    if b:
        return 1
    else:
        return 0
def f2(b):
    return 1 if b else 0
def f3(b):
    return b and 1 or 0
def f3bug(b):
    return b and 0 or 1
# must use f1 or f2 idiom because bool(0) == False
JJM 16 Jan 2008, 22:31
whitespace? quirk? nah
RMD 16 Jan 2008, 23:14
- Bool values are actually strange integers.
>>> a==b
True
>>> a+b
2
So you think a and b are ones? Ha, ha!
>>> str(a)==str(b)
False   <---- quirk!!! :-)
>>> a
1
>>> b
True
… surprise!
- It gets even quirkier when you mix bools and ints as dictionary keys:
>>> a,b = {},{}
>>> a[True], b[1] = 1, 1
>>> a[1] , b[True] = 42, 42
>>> a, b
({True:42}, {1:42})
Quirky!
rt 16 Jan 2008, 23:22
Actually, for creating tuples it is the comma that decides it, not the parentheses plus comma, e.g.
>>> x = 1,
>>> x
(1,)
>>> len(x)
1
JM 16 Jan 2008, 23:53
Whitespace supports spaces, tabs or… a mix of both.
perl hacker 16 Jan 2008, 23:53
Yay – nice to see that there are some of the coolest features of perl in python :)
JM 16 Jan 2008, 23:55
Provides access to real threads but… the GIL only allows one thread (in most cases) to execute at a time.
ajordan 16 Jan 2008, 23:58
A variable is defined outside a function and used in several places within that function. Now, if you do one assignment of that variable within the function to something of a different type, the type of the variable changes for all uses within the function, even the ones before the assignment! I call that “spooky action at a distance”.
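One plausible reading of this comment, sketched in Python 3 syntax: an assignment anywhere in a function body makes that name local for the whole body, so reads that come before the assignment stop seeing the outer variable. The names counter, read_only and read_then_assign are made up for illustration.

```python
counter = 10  # module-level name


def read_only():
    # No assignment to `counter` in this function, so the name
    # resolves to the module-level variable.
    return counter + 1


def read_then_assign():
    # The assignment two lines down makes `counter` local for the WHOLE
    # body, so this read happens before the local has any value.
    value = counter + 1
    counter = "now a string"
    return value


print(read_only())  # 11

try:
    read_then_assign()
except UnboundLocalError as exc:
    print("UnboundLocalError:", exc)
```

Which names are local is decided when the function is compiled, not as each line runs, which is what makes it feel like action at a distance.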
Alfie 17 Jan 2008, 00:22
Perl can also do For::Else ;p
http://search.cpan.org/dist/For-Else/lib/For/Else.pm
JJM 17 Jan 2008, 01:47
@rt: dicts can only use hashable types as keys.
id(1) != id(True)
hash(1) == hash(True)
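A short sketch of why the dictionaries in RMD's example behave that way, in Python 3 syntax. It relies only on built-ins; the variable names a and b simply mirror the comment above.

```python
# bool is a subclass of int, and True compares equal to 1.
print(issubclass(bool, int))     # True
print(True == 1, True + True)    # True 2
print(hash(True) == hash(1))     # True
print(str(True), str(1))         # True 1

# Keys that are equal and hash equal share one dict slot. The key object
# inserted first is kept; a later assignment only replaces the value.
a = {}
a[True] = 1
a[1] = 42
print(a)                         # {True: 42}

b = {}
b[1] = 1
b[True] = 42
print(b)                         # {1: 42}
```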
Justin 17 Jan 2008, 01:54
In fact, you don’t even need parentheses to make a tuple: the comma defines a tuple.
So, this produces a tuple of length 1 as well:
1,
Tuples are defined by the comma, not by the parentheses :-)
David 17 Jan 2008, 02:28
If you want to print WITHOUT a newline you can just add an extra comma after the arguments to ‘print’. Examples.
print 'Hello'
print ', World'
# Will end up on TWO lines.
print 'Hello', # <-- notice the comma
print ', World'
# Will end up on ONE line.
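For comparison, a sketch of the same thing with the print function that Python 3 ended up with. The helper name one_line is invented here, and writing into an io.StringIO buffer is only a way to show the exact characters produced.

```python
import io


def one_line():
    out = io.StringIO()
    # end='' replaces the default trailing newline; file= takes the place
    # of the old "print >>file, values" syntax.
    print('Hello', end='', file=out)
    print(', World', file=out)
    return out.getvalue()


print(repr(one_line()))  # 'Hello, World\n'
```

As far as I recall, the Python 2 trailing-comma form also inserts a space before the next printed item, which is one of the spacing habits the print function makes explicit through its sep and end arguments.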
Jonathan Mark 17 Jan 2008, 03:27
(1) The depth limit, normally 1000, on recursions.
(2) The need to pass ‘self’ as a parameter when calling an object’s member functions.
(3) The leading and trailing double underscores look funny.
(4) One-line lambdas but no anonymous blocks as in Ruby.
Peter 17 Jan 2008, 03:48
How about this one?
import decimal
x = decimal.Decimal(1)
x < 2.0 # returns False!
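A hedged note on how this looks today: to my knowledge, the result above came from Python 2's fallback ordering between unrelated types, and later versions (2.7 and 3.2 onward) compare a Decimal and a float by numeric value instead. A minimal sketch, assuming Python 3.2 or newer:

```python
from decimal import Decimal

x = Decimal(1)
print(x < 2.0)                 # True on Python 3.2+
print(x == 1.0)                # True

# The comparison is exact, so the binary float 0.1 is not equal to
# the decimal value one tenth.
print(Decimal("0.1") == 0.1)   # False
```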
Joe 17 Jan 2008, 04:09
Hey what does python say is 4 divided by 5? 0
How many underscores does it take to create a basic class? 4 for each item
How many orders of magnitude slower is python than Java or C++? 1-2
How many drag & drop gui designers and IDEs are there for python similar in class to Visual Studio, Netbeans, or Eclipse? 0
Shannon 17 Jan 2008, 05:04
How can “for x in list: s += x” be O(n^2)? Seems rather unlikely…
paddy3118 17 Jan 2008, 05:07
Some of the quirks, I too would say are quirks. Some in fact are under review/changed in Python3.0; but some are just the consistent way that Python does things that may not be the same as other languages, such as dividing two integers to produce an integer – why not? How many orders of magnitude slower is Java or C++ development compared to python? 1-2 ;-) How many drag & drop 500Mb GUI designers IDE’s and repetitive code generators are required for Python development? – None!
SB 17 Jan 2008, 05:13
Joe, some of your comments are misleading. Care to provide examples / explanations?
Reggie Drake 17 Jan 2008, 05:28
Closed over variables are (unintentionally, it seems) read-only. Say you want to write a function that creates a counter function, something like this:
def makeCounter():
    current = 0
    def counter():
        current += 1
        return current
    return counter
Because assigning to a variable will create that variable in the current scope (unless declared global), current += 1 will create a new variable (and then complain that it is unbound), instead of using the existing variable in the parent scope.
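Python 3 later added the nonlocal statement for exactly this case. A minimal sketch, assuming Python 3 (the snake_case name make_counter is just a restyling of the example above):

```python
def make_counter():
    current = 0

    def counter():
        nonlocal current   # rebind the variable in the enclosing scope
        current += 1
        return current

    return counter


counter = make_counter()
print(counter(), counter(), counter())  # 1 2 3
```

Each call to make_counter creates a separate current, so independent counters do not interfere with one another.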
Ben 17 Jan 2008, 06:37
Great comments, guys. Peter, that Decimal quirk sure is quirky!
Sorry about the code formatting, guys — I’ve fixed up a bunch of them. I’m using WordPress with PHP Markdown, but the combo isn’t exactly perfect (understatement).
In my tests it looks like you can insert nice code, but you have to precede it with a normal text sentence/paragraph and then a blank line. (Oh, and each code line prefixed with 4 spaces, as per Markdown syntax.) Let’s try it:
# there was a blank line just above this
def square(n):
    return n * n
Paddy3118 17 Jan 2008, 06:47
You can store function state in function attributes.
def makeCounter():
    def counter():
        counter.current += 1
        return counter.current
    counter.current = 0
    return counter
Although most people would use a class in Python.
— Paddy.
abhik 17 Jan 2008, 07:10
About string concatenation:
python strings are immutable. So when you do “for x in list: s+=x” a new string is created len(list) times. ''.join(list) only creates a string once. I don’t quite see how the first case is O(n^2) though. They’re both O(n) in terms of number of concatenations but the first case involves many more mallocs.
omegian 17 Jan 2008, 07:24
O(n^2) because the memory allocations go like this: 1 12 123 1234 12345 123456 1234567 12345678
As you can see, the memory footprint grows quadratically: n(n+1)/2 = O(n^2)
Ben 17 Jan 2008, 07:45
Omegian, yeah, that’s right. Though isn’t the speed related more to the “copying footprint” than the memory footprint? Allocating the memory isn’t the N^2 part, but the copying is.
When you do ''.join('a' for i in range(N)) it has to copy only N bytes. But when you do for i in range(N): s += 'a' it has to copy 1+2+3+...+N bytes = O(N^2) bytes, as you’ve calculated for the memory footprint.
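The arithmetic can be checked with a small cost model. This is only a sketch that counts bytes under the assumption that every += copies the whole string so far; it does not measure what a real interpreter does, and both function names are invented for illustration.

```python
def bytes_copied_by_join(n):
    # Model: join sizes the result once and copies each 1-byte piece once.
    return n


def bytes_copied_by_naive_concat(n):
    # Model: step k builds a fresh string of length k, copying k bytes.
    total = 0
    length = 0
    for _ in range(n):
        length += 1
        total += length
    return total


for n in (10, 100, 1000):
    naive = bytes_copied_by_naive_concat(n)
    assert naive == n * (n + 1) // 2   # closed form of 1 + 2 + ... + n
    print(n, bytes_copied_by_join(n), naive)
```

Multiplying n by 10 multiplies the naive count by roughly 100, which is the quadratic growth being discussed.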
Ben 17 Jan 2008, 08:52
Inspired by keernerd’s comment on programming.reddit, “{}.get as switch” could be quite a useful construct, for things like:
for i in range(5):
    print {0: 'no things', 1: 'one thing'}.get(i, '%d things' % i)
It’s fairly clear, I reckon. BTW, some of the other comments over at reddit are pretty interesting, too.
Tom 17 Jan 2008, 09:51
This page has been added to TechZe.us. Feel free to submit future content to the list.
Sven Helmberger 17 Jan 2008, 12:08
@Joe:
1. In ipython, 4/5 is 0. That divides an integer by an integer, which gives 0 in most programming languages. One might argue that / should convert to float, but I’m not sure if that would work. 4./5 is 0.80000000000000004, which shows the usual float representation bugs.
2. Syntax nitpicking?
3. psyco can speed up python, C extensions can speed up python, and jython should be comparable in speed to similar java code. But speed is usually not the reason you go for python…
4. wxglade, glade, and I bet there are others.
Peter Hosey 17 Jan 2008, 18:48
There’s been a lot of debate about making this faster, and it looks like it should be faster in Python 2.5, but my tests show otherwise. Any ideas why?
Because the optimization only applies to string literals ('foo' + 'bar' + 'baz'). If you add to a non-literal, such as a variable containing a string, you don’t get the optimization.
Ben 17 Jan 2008, 19:57
Hi Peter, if you look at the Python-dev discussion and at the changelog and patch, it seems clear that s=s+t and s+=t are the cases it speeds up, not actually the string literal case you mention. So I guess I’m still stuck as to why my Python 2.5.1 doesn’t do s+=t in linear time … :-)
Reid 17 Jan 2008, 22:29
For loops and if blocks don’t create their own scope. i.e.:
ls = []
for x in range(5):
    ls.append(lambda: x)
for l in ls:
    print l(),
Writes: 4 4 4 4 4. You would expect: 0 1 2 3 4
This is because for loops don’t create their own scope, and when you create the lambda, it keeps a pointer to x, which is the same x as all the others.
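A sketch of the effect and one common workaround, in Python 3 syntax. Binding the loop value as a default argument is a widely used idiom, not something from the comment itself, and the list names late and bound are made up.

```python
late = []
for x in range(5):
    late.append(lambda: x)        # x is looked up when the lambda is CALLED
print([f() for f in late])        # [4, 4, 4, 4, 4]

bound = []
for x in range(5):
    bound.append(lambda x=x: x)   # the default captures the value right now
print([f() for f in bound])       # [0, 1, 2, 3, 4]
```

The default-argument trick works because default values are evaluated once, when each lambda is created.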
arkanes 18 Jan 2008, 10:31
Continual string appending is still O(N^2), the optimization just gives a (significant) constant factor speedup.
It still has to do a resize and a reallocation for every string append, what the optimization does is save the destruction of the old string object and the creation of a new one. Try using a global instead of the local (so something else holds a reference to it and the optimization can’t be used) to see the difference.
Ben 18 Jan 2008, 10:43
Ah, thanks arkanes. I didn’t look at the ceval.c patch closely enough. :-( Below is a test with seconds elapsed for s+=x with s as a local and as a global.
---- N = 10000
plus 0.000
plusglobal 0.063
---- N = 20000
plus 0.000
plusglobal 0.281
---- N = 40000
plus 0.016
plusglobal 1.109
---- N = 80000
plus 0.031
plusglobal 4.891
---- N = 160000
plus 0.125
plusglobal 60.797
So it’s not an order change, but yeah, sure is a speedup. :-)
JMC 18 Jan 2008, 16:34
some of these are pretty cool features. thanks guys!
verte 24 Apr 2008, 18:38
Jonathan Mark: “The need to pass ’self’ as a parameter when calling an object’s member functions.”
You don’t need to pass it, it is passed automatically. you need to receive it. The only other way to do so is to add a new reserved keyword to the language, which is ugly.
Joe: “Hey what does python say is 4 divided by 5? 0”
Fixed in 3.0, or you can “from __future__ import division” today. Use // to get integer division again. Note that the behaviour you mention is exactly how integer division works in most languages.
“How many drag & drop gui designers and IDEs are there for python similar in class to Visual Studio, Netbeans, or Eclipse? 0”
While I’m not a fan of drag & drop gui designers and haven’t used either, what about Glade? and… Eclipse? I’ve no doubt that there is a reasonable way to use Visual Studio with IronPython, too.
There are a few quirks that usually take a few mistakes for people to learn. These stem from ‘every name is a reference’, mutability, and scoping rules, mostly. For example, a = 3; b = a; a = 7 leaves b == 3. However, a = [3];b = a; a[0] = 7 leaves b == [7].
This is because = rebinds the left hand side to point at the object referred to on the right hand side. In the first case, a is rebound to refer to a new integer object with value 7, and in the latter case a and b still refer to the same list, but the first element of the list has been rebound to the new integer object.
A lot of expressiveness comes from this mechanism. You end up with no desire for an ugly ‘pointer type’, and you don’t have to worry about immutable objects like strings changing under your feet. But it seems like this trips up a lot of new Python programmers.
Also, name binding occurs in the local scope, so a = 3 binds the name a to 3 in the local scope. However, you can refer to and even mutate objects referred to by names in other scopes without any extra effort- eg, if a = [3] in the global scope, and in a function with no local name a, you do a[0] = 7, it will change the object referred to by the global name. Although if you use raw integers as in the above example, it will bind a new local variable a instead. To bind the global name, you will have to tell the function to treat that name global. [This is quite neat in that functions can’t accidentally trip on ‘global’ variables without knowing of their existence.]
[and unfortunately, what I’ve written is a bit thick for a Python beginner, although the concept is not.]
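The rules described in this comment can be sketched in a few lines of Python 3. The names items, count and the three functions are invented for illustration.

```python
# Rebinding a name versus mutating the object it refers to.
a = 3
b = a
a = 7                 # rebinds a; b still refers to the int 3
print(b)              # 3

a = [3]
b = a
a[0] = 7              # mutates the one list that a and b both refer to
print(b)              # [7]

items = [3]
count = 3


def mutate_items():
    items[0] = 7      # no local `items`: mutates the module-level list


def rebind_local():
    count = 7         # binds a new local name; the global is untouched
    return count


def rebind_global():
    global count      # opt in to rebinding the module-level name
    count = 7


mutate_items()
print(items)          # [7]
rebind_local()
print(count)          # 3
rebind_global()
print(count)          # 7
```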
tolo 15 Aug 2015, 02:12
>>> a = 100
>>> b = 1000
>>> a is 100 and b is 1000
False
>>> a is 100
True
>>> b is 1000
False
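What this shows is the difference between identity and equality. CPython happens to cache small integers, which is an implementation detail, so is can give different answers for 100 and 1000. A minimal sketch in Python 3 that avoids relying on it (the variable names are arbitrary):

```python
a = 1000
b = int("1000")        # built at run time rather than from a literal

print(a == b)          # True: the values are equal

# `a is b` asks whether both names refer to the very same object. The
# answer depends on interpreter caching, so do not rely on it for numbers.
print(a is b)

nothing = None
print(nothing is None)  # True: identity is the right test for singletons
```

Recent CPython versions also emit a SyntaxWarning when is is used with a literal, which catches this mistake early.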