************************************
  Idioms and Anti-Idioms in Python
************************************

:Author: Moshe Zadka

This document is placed in the public domain.


.. topic:: Abstract

   This document can be considered a companion to the tutorial. It shows how to use
   Python, and even more importantly, how *not* to use Python.


Language Constructs You Should Not Use
======================================

While Python has relatively few gotchas compared to other languages, it still
has some constructs which are only useful in corner cases, or are plainly
dangerous.


from module import \*
---------------------


Inside Function Definitions
^^^^^^^^^^^^^^^^^^^^^^^^^^^

``from module import *`` is *invalid* inside function definitions. While many
versions of Python do not check for the invalidity, that does not make it any
more valid, any more than having a smart lawyer makes a man innocent. Never use
it like that. Even in versions where it was accepted, it made the function
execute more slowly, because the compiler could not be certain which names were
local and which were global. In Python 2.1 this construct causes warnings, and
sometimes even errors.


At Module Level
^^^^^^^^^^^^^^^

While it is valid to use ``from module import *`` at module level, it is usually
a bad idea. For one, it loses an important property Python otherwise has ---
you can find where each top-level name is defined with a simple "search" function
in your favourite editor. You also open yourself to trouble in the future, if
some module grows additional functions or classes.

One of the most awful questions asked on the newsgroup is why this code::

   f = open("www")
   f.read()

does not work. Of course, it works just fine (assuming you have a file called
"www"). But it does not work if somewhere in the module, the statement ``from os
import *`` is present. The :mod:`os` module has a function called :func:`open`
which returns an integer. While it is very useful, shadowing builtins is one of
its least useful properties.

Remember, you can never know for sure what names a module exports, so either
take what you need --- ``from module import name1, name2``, or keep them in the
module and access on a per-need basis --- ``import module; print module.name``.


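The shadowing described above can be checked directly. The following is a
minimal sketch, written for Python 3 (unlike the Python 2 snippets in this
document); it runs the star import in a scratch dictionary named ``namespace``
(a name chosen here for illustration) so that the surrounding code is not
affected.

```python
import builtins
import os

# Run the star import in a scratch dictionary so that the names of the
# surrounding module stay untouched.
namespace = {}
exec("from os import *", namespace)

# After the star import, the name "open" refers to os.open,
# not to the builtin open.
shadowed = namespace["open"]
print(shadowed is os.open)
print(shadowed is builtins.open)
```

Any later call to ``open("www")`` in such a namespace would reach
:func:`os.open`, which expects flags as a second argument and returns a file
descriptor rather than a file object.
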
When It Is Just Fine
^^^^^^^^^^^^^^^^^^^^

There are situations in which ``from module import *`` is just fine:

* The interactive prompt. For example, ``from math import *`` makes Python an
  amazing scientific calculator.

* When extending a module in C with a module in Python.

* When the module advertises itself as ``from import *`` safe.


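One common way for a module to limit what a star import exposes is to define
``__all__``. Whether that is what a given module means by "safe" is an
assumption; check its documentation. The sketch below (Python 3) builds a
throwaway module called ``shapes``; the module and its contents are hypothetical
and exist only for this illustration.

```python
import sys
import types

# Build a hypothetical module in memory instead of writing a file.
shapes = types.ModuleType("shapes")
exec(
    "__all__ = ['area']\n"
    "def area(width, height):\n"
    "    return width * height\n"
    "helper = 1\n",
    shapes.__dict__,
)
sys.modules["shapes"] = shapes

# Star-import it into a scratch dictionary and list what arrived.
scratch = {}
exec("from shapes import *", scratch)
imported = sorted(name for name in scratch if name != "__builtins__")
print(imported)

# Remove the throwaway module again.
del sys.modules["shapes"]
```

Only the names listed in ``__all__`` are copied; ``helper`` stays behind in the
module.
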
Unadorned :keyword:`exec`, :func:`execfile` and friends
-------------------------------------------------------

The word "unadorned" refers to use without an explicit dictionary, in which
case those constructs evaluate code in the *current* environment. This is
dangerous for the same reasons ``from import *`` is dangerous --- it might step
over variables you are counting on and mess up things for the rest of your code.
Simply do not do that.

Bad examples::

   import sys

   for name in sys.argv[1:]:
       exec "%s=1" % name

   def func(s, **kw):
       for var, val in kw.items():
           exec "s.%s=val" % var  # invalid!

   # Runs handler.py in the current namespace; it may overwrite anything.
   execfile("handler.py")
   handle()

Good examples::

   import sys

   d = {}
   for name in sys.argv[1:]:
       d[name] = 1

   def func(s, **kw):
       for var, val in kw.items():
           setattr(s, var, val)

   # Run handler.py in its own dictionary, then pick out the name we need.
   d = {}
   execfile("handler.py", d, d)
   handle = d['handle']
   handle()


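The effect of the explicit dictionary can be seen in a small sketch. It is
written for Python 3, where ``exec`` is a function, and it uses an inline string
named ``handler_source`` in place of a real ``handler.py``; both the string and
its contents are made up for this illustration.

```python
# Hypothetical handler code; in practice this would be read from a file.
handler_source = (
    "def handle():\n"
    "    return 'handled'\n"
    "status = 'overwritten'\n"
)

status = "original"

# Execute the code in its own dictionary instead of the current namespace.
namespace = {}
exec(handler_source, namespace)

# Pull out only the name that is needed.
handle = namespace["handle"]
result = handle()
print(result, status, namespace["status"])
```

The assignment to ``status`` inside the executed code lands in ``namespace`` and
leaves the caller's ``status`` alone.
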
from module import name1, name2
-------------------------------

This is a "don't" which is much weaker than the previous "don't"s but is still
something you should not do unless you have good reasons to. The reason it is
usually a bad idea is that you suddenly have an object which lives in two
separate namespaces. When the binding in one namespace changes, the binding in
the other will not, so there will be a discrepancy between them. This happens
when, for example, one module is reloaded, or changes the definition of a
function at runtime.

Bad example::

   # foo.py
   a = 1

   # bar.py
   from foo import a
   if something():
       a = 2 # danger: foo.a != a

Good example::

   # foo.py
   a = 1

   # bar.py
   import foo
   if something():
       foo.a = 2


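The discrepancy between the two namespaces can be reproduced in a few lines.
This Python 3 sketch creates a stand-in ``foo`` module in memory (a hypothetical
module, registered in ``sys.modules`` only for the duration of the example)
rather than relying on a ``foo.py`` file.

```python
import sys
import types

# Stand-in for foo.py: a module with a single top-level name.
foo = types.ModuleType("foo")
foo.a = 1
sys.modules["foo"] = foo

# This copies the current binding into the importing namespace.
from foo import a

# Rebinding the name inside foo does not touch the copy made above.
foo.a = 2
print(a, foo.a)

# Remove the stand-in module again.
del sys.modules["foo"]
```

Code that reads ``foo.a`` sees the new value, while code that uses the imported
``a`` keeps seeing the old one.
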
except:
-------

Python has the ``except:`` clause, which catches all exceptions. Since *every*
error in Python raises an exception, using ``except:`` can make many programming
errors look like runtime problems, which hinders the debugging process.

The following code shows a great example of this::

   try:
       foo = opne("file") # misspelled "open"
   except:
       sys.exit("could not open file!")

The second line triggers a :exc:`NameError`, which is caught by the except
clause. The program will exit, and you will have no idea that this has nothing
to do with the readability of ``"file"``.

The example above is better written as::

   try:
       foo = opne("file") # will be changed to "open" as soon as we run it
   except IOError:
       sys.exit("could not open file")

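The difference between the two versions can be observed side by side. The
sketch below is for Python 3, where :exc:`IOError` is an alias of
:exc:`OSError`; the function names ``open_narrow`` and ``open_bare`` are
invented for this comparison, and the misspelling ``opne`` is deliberate.

```python
def open_narrow(path):
    try:
        return opne(path)  # deliberate typo for "open"
    except OSError:
        return "could not open file"


def open_bare(path):
    try:
        return opne(path)  # same typo
    except:
        return "could not open file"


# With the narrow clause the typo surfaces as a NameError.
try:
    open_narrow("file")
    narrow_result = "no error"
except NameError:
    narrow_result = "NameError surfaced"

# With the bare clause the typo is reported as a file problem.
bare_result = open_bare("file")
print(narrow_result)
print(bare_result)
```
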
There are some situations in which the ``except:`` clause is useful: for
example, in a framework when running callbacks, it is good not to let any
callback disturb the framework.


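A callback runner of the kind mentioned above might look like the following
Python 3 sketch. The function ``run_callbacks`` and the sample callbacks are
hypothetical. As a design choice of this sketch, it catches :exc:`Exception`
rather than using a bare ``except:``, so that :exc:`KeyboardInterrupt` and
:exc:`SystemExit` still propagate.

```python
import traceback


def run_callbacks(callbacks, event):
    """Call every callback; a failing one is logged, not fatal."""
    results = []
    for callback in callbacks:
        try:
            results.append(callback(event))
        except Exception:
            # Report the problem and keep the framework running.
            traceback.print_exc()
            results.append(None)
    return results


def shout(event):
    return event.upper()


def broken(event):
    raise RuntimeError("callback failed")


results = run_callbacks([shout, broken, shout], "ping")
print(results)
```

The traceback of the failing callback is still printed, so the error is not
hidden; it just does not stop the remaining callbacks.
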
Exceptions
==========

Exceptions are a useful feature of Python. You should learn to raise them
whenever something unexpected occurs, and catch them only where you can do
something about them.

The following is a very popular anti-idiom::

   def get_status(file):
       if not os.path.exists(file):
           print "file not found"
           sys.exit(1)
       return open(file).readline()

Consider the case where the file gets deleted between the time the call to
:func:`os.path.exists` is made and the time :func:`open` is called. In that
case the last line will raise an :exc:`IOError`. The same would happen if *file*
exists but has no read permission. Since testing this on a normal machine with
existing and non-existing files makes it seem bugless, the test results will
look fine, and the code will get shipped. Then an unhandled :exc:`IOError`
escapes to the user, who has to watch the ugly traceback.

Here is a better way to do it::

   def get_status(file):
       try:
           return open(file).readline()
       except (IOError, OSError):
           print "file not found"
           sys.exit(1)

In this version, *either* the file gets opened and the line is read (so it
works even on flaky NFS or SMB connections), or the message is printed and the
application aborted.

Still, :func:`get_status` makes too many assumptions --- that it will only be
used in a short running script, and not, say, in a long running server. Sure,
the caller could do something like::

   try:
       status = get_status(log)
   except SystemExit:
       status = None

but that is clumsy. So, try to use as few ``except`` clauses in your code as
possible --- those will usually be a catch-all in the :func:`main`, or inside
calls which should always succeed.

So, the best version is probably::

   def get_status(file):
       return open(file).readline()

The caller can deal with the exception if it wants (for example, if it tries
several files in a loop), or just let the exception filter upwards to *its*
caller.

The last version is not very good either --- due to implementation details, the
file would not be closed when an exception is raised until the handler finishes,
and perhaps not at all in non-C implementations (e.g., Jython). A version that
closes the file explicitly avoids this::

   def get_status(file):
       fp = open(file)
       try:
           return fp.readline()
       finally:
           fp.close()


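A caller that tries several files in a loop, as mentioned earlier in this
section, could be sketched as follows. The sketch is for Python 3 and uses the
``with`` statement, which closes the file in the same way as the
``try``/``finally`` form. The helper ``first_status`` and the temporary file
names are assumptions made for the demonstration.

```python
import os
import tempfile


def get_status(path):
    # The with statement closes the file even if readline() raises.
    with open(path) as fp:
        return fp.readline()


def first_status(paths):
    """Return the status line of the first readable file, or None."""
    for path in paths:
        try:
            return get_status(path)
        except OSError:
            continue  # try the next candidate
    return None


# Demonstration with one missing file and one real temporary file.
fd, existing = tempfile.mkstemp()
os.close(fd)
with open(existing, "w") as out:
    out.write("ok\n")
missing = existing + ".missing"

status = first_status([missing, existing])
nothing = first_status([missing])
os.remove(existing)
print(repr(status), nothing)
```

Here :func:`get_status` stays free of error handling, and the decision about
what a missing file means is left to the caller.
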
Using the Batteries
===================

Every so often, people seem to rewrite things that are already in the Python
library, usually poorly. While the occasional module has a poor interface, it is
usually much better to use the rich standard library and data types that come
with Python than to invent your own.

A useful module very few people know about is :mod:`os.path`. It always has the
correct path arithmetic for your operating system, and will usually be much
better than whatever you come up with yourself.

Compare::

   # ugh!
   return dir+"/"+file
   # better
   return os.path.join(dir, file)

More useful functions in :mod:`os.path`: :func:`basename`, :func:`dirname` and
:func:`splitext`.

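A short sketch of these functions working together is shown below (it runs
under Python 3). The path components ``logs`` and ``status.txt`` are
placeholders, not files the example needs to exist.

```python
import os.path

# Build a path with the separator of the current operating system.
path = os.path.join("logs", "status.txt")

# Take it apart again.
name = os.path.basename(path)
folder = os.path.dirname(path)
stem, extension = os.path.splitext(name)

print(path, name, folder, stem, extension)
```

None of these calls touches the file system; they only manipulate strings
according to the platform's path rules.
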
There are also many useful builtin functions people seem not to be aware of for
some reason: :func:`min` and :func:`max` can find the minimum/maximum of any
sequence whose items can be compared, for example, yet many people write their
own :func:`max`/:func:`min`. Another highly useful function is :func:`reduce`. A
classical use of :func:`reduce` is something like::

   import sys, operator
   nums = map(float, sys.argv[1:])
   print reduce(operator.add, nums)/len(nums)

This cute little script prints the average of all numbers given on the command
line. The :func:`reduce` call adds up all the numbers, and the rest is just some
pre- and postprocessing.

Along the same lines, note that :func:`float`, :func:`int` and :func:`long` all
accept arguments of type string, and so are suited to parsing --- assuming you
are ready to deal with the :exc:`ValueError` they raise.


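Parsing with :func:`float` and averaging with :func:`reduce` can be combined as
in the sketch below. It targets Python 3, where :func:`reduce` lives in
:mod:`functools`. The helpers ``parse_numbers`` and ``average`` are names made
up here, and silently skipping bad input is one possible policy, not the only
reasonable one.

```python
import operator
from functools import reduce


def parse_numbers(words):
    """Convert strings to floats, skipping anything float() rejects."""
    numbers = []
    for word in words:
        try:
            numbers.append(float(word))
        except ValueError:
            pass  # not a number; ignore it in this sketch
    return numbers


def average(numbers):
    # reduce() adds up the values; the division does the rest.
    return reduce(operator.add, numbers) / len(numbers)


numbers = parse_numbers(["1", "2.5", "abc", "4"])
mean = average(numbers)
print(numbers, mean)
```

Note that ``average`` raises an exception for an empty list, which a caller
should be prepared to handle.
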
Using Backslash to Continue Statements
======================================

Since Python treats a newline as a statement terminator, and since statements
are often longer than is comfortable to put on one line, many people do::

   if foo.bar()['first'][0] == baz.quux(1, 2)[5:9] and \
           calculate_number(10, 20) != forbulate(500, 360):
       pass

You should realize that this is dangerous: a stray space after the ``\`` would
make this line wrong, and stray spaces are notoriously hard to see in editors.
In this case, at least it would be a syntax error, but if the code was::

   value = foo.bar()['first'][0]*baz.quux(1, 2)[5:9] \
           + calculate_number(10, 20)*forbulate(500, 360)

then it would just be subtly wrong.

It is usually much better to use the implicit continuation inside parentheses.
This version is bulletproof::

   value = (foo.bar()['first'][0]*baz.quux(1, 2)[5:9]
           + calculate_number(10, 20)*forbulate(500, 360))

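Both points can be tried out in a small Python 3 sketch. The first part
compiles an ``if`` statement whose backslash is followed by a stray space, and
the second part uses parentheses instead. The variable names and numbers are
placeholders chosen for the illustration.

```python
# An "if" statement whose backslash is followed by a stray space.
broken = "if 1 == 1 and \\ \n        2 == 2:\n    pass\n"

try:
    compile(broken, "<example>", "exec")
    stray_space_rejected = False
except SyntaxError:
    stray_space_rejected = True

# Inside parentheses no backslash is needed, so there is nothing
# for a stray space to break.
width, height, depth, count = 2, 3, 4, 5
value = (width * height
         + depth * count)

print(stray_space_rejected, value)
```
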