symbian-qemu-0.9.1-12/python-2.6.1/Modules/gc_weakref.txt
changeset 1 2fb8b9db1c86
0:ffa851df0825 1:2fb8b9db1c86
       
     1 Intro
       
     2 =====
       
     3 
       
     4 The basic rule for dealing with weakref callbacks (and __del__ methods too,
       
     5 for that matter) during cyclic gc:
       
     6 
       
     7     Once gc has computed the set of unreachable objects, no Python-level
       
     8     code can be allowed to access an unreachable object.
       
     9 
       
    10 If that can happen, then the Python code can resurrect unreachable objects
       
    11 too, and gc can't detect that without starting over.  Since gc eventually
       
    12 runs tp_clear on all unreachable objects, if an unreachable object is
       
    13 resurrected then tp_clear will eventually be called on it (or may already
       
    14 have been called before resurrection).  At best (and this has been an
       
    15 historically common bug), tp_clear empties an instance's __dict__, and
       
    16 "impossible" AttributeErrors result.  At worst, tp_clear leaves behind an
       
    17 insane object at the C level, and segfaults result (historically, most
       
    18 often by setting a new-style class's mro pointer to NULL, after which
       
    19 attribute lookups performed by the class can segfault).
       
    20 
       
    21 OTOH, it's OK to run Python-level code that can't access unreachable
       
    22 objects, and sometimes that's necessary.  The chief example is the callback
       
    23 attached to a reachable weakref W to an unreachable object O.  Since O is
       
    24 going away, and W is still alive, the callback must be invoked.  Because W
       
    25 is still alive, everything reachable from its callback is also reachable,
       
    26 so it's also safe to invoke the callback (although that's trickier than it
       
    27 sounds, since other reachable weakrefs to other unreachable objects may
       
    28 still exist, and be accessible to the callback -- there are lots of painful
       
    29 details like this covered in the rest of this file).
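
A minimal sketch of the "reachable weakref W to unreachable O" case described
above, assuming a current CPython 3 interpreter.  The names Node and
collect_cycle_with_live_weakref are made up for illustration and are not part
of gcmodule.c.  It only shows the observable result: the callback runs, and
by then W no longer leads to O.

```python
import gc
import weakref


class Node(object):
    """Plain instance type; instances can be weakly referenced."""


def collect_cycle_with_live_weakref():
    seen = []

    def callback(ref):
        # When the callback runs, the weakref has already been cleared,
        # so it cannot be used to reach (or resurrect) the referent.
        seen.append(ref() is None)

    o = Node()
    o.me = o                      # self-cycle: only cyclic gc can reclaim O
    w = weakref.ref(o, callback)  # W is held by this frame, so W is reachable
    del o                         # O is now unreachable cyclic trash

    gc.collect()
    return seen, w() is None
```

Calling collect_cycle_with_live_weakref() should return ([True], True): the
callback fired exactly once, and it saw a dead weakref.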
       
    30 
       
    31 Python 2.4/2.3.5
       
    32 ================
       
    33 
       
    34 The "Before 2.3.3" section below turned out to be wrong in some ways, but
       
    35 I'm leaving it as-is because it's more right than wrong, and serves as a
       
    36 wonderful example of how painful analysis can miss not only the forest for
       
    37 the trees, but also miss the trees for the aphids sucking the trees
       
    38 dry <wink>.
       
    39 
       
    40 The primary thing it missed is that when a weakref to a piece of cyclic
       
    41 trash (CT) exists, then any call to any Python code whatsoever can end up
       
    42 materializing a strong reference to that weakref's CT referent, and so
       
    43 possibly resurrect an insane object (one for which cyclic gc has called-- or
       
    44 will call before it's done --tp_clear()).  It's not even necessarily that a
       
    45 weakref callback or __del__ method does something nasty on purpose:  as
       
    46 soon as we execute Python code, threads other than the gc thread can run
       
    47 too, and they can do ordinary things with weakrefs that end up resurrecting
       
    48 CT while gc is running.
       
    49 
       
    50     http://www.python.org/sf/1055820
       
    51 
       
    52 shows how innocent it can be, and also how nasty.  Variants of the three
       
    53 focussed test cases attached to that bug report are now part of Python's
       
    54 standard Lib/test/test_gc.py.
       
    55 
       
    56 Jim Fulton gave the best nutshell summary of the new (in 2.4 and 2.3.5)
       
    57 approach:
       
    58 
       
    59     Clearing cyclic trash can call Python code.  If there are weakrefs to
       
    60     any of the cyclic trash, then those weakrefs can be used to resurrect
       
    61     the objects.  Therefore, *before* clearing cyclic trash, we need to
       
    62     remove any weakrefs.  If any of the weakrefs being removed have
       
    63     callbacks, then we need to save the callbacks and call them *after* all
       
    64     of the weakrefs have been cleared.
       
    65 
       
    66 Alas, doing just that much doesn't work, because it overlooks what turned
       
    67 out to be the much subtler problems that were fixed earlier, and described
       
    68 below.  We do clear all weakrefs to CT now before breaking cycles, but not
       
    69 all callbacks encountered can be run later.  That's explained in horrid
       
    70 detail below.
       
    71 
       
    72 Older text follows, with some later comments in [] brackets:
       
    73 
       
    74 Before 2.3.3
       
    75 ============
       
    76 
       
    77 Before 2.3.3, Python's cyclic gc didn't pay any attention to weakrefs.
       
    78 Segfaults in Zope3 resulted.
       
    79 
       
    80 weakrefs in Python are designed to, at worst, let *other* objects learn
       
    81 that a given object has died, via a callback function.  The weakly
       
    82 referenced object itself is not passed to the callback, and the presumption
       
    83 is that the weakly referenced object is unreachable trash at the time the
       
    84 callback is invoked.
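
To illustrate what a callback actually receives, here is a small sketch that
does not involve cyclic gc at all.  It assumes CPython's reference counting,
so the callback runs as soon as the last strong reference is dropped.  Thing
and callback_argument_demo are hypothetical names.

```python
import weakref


class Thing(object):
    """Plain instance type; instances can be weakly referenced."""


def callback_argument_demo():
    received = []

    def callback(ref):
        # The argument is the weakref object, not the referent.
        received.append(ref)

    obj = Thing()
    w = weakref.ref(obj, callback)
    del obj  # last strong reference gone; CPython runs the callback here

    # The callback got the weakref itself, and that weakref is now dead.
    return len(received) == 1 and received[0] is w and w() is None
```

callback_argument_demo() should return True.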
       
    85 
       
    86 That's usually true, but not always.  Suppose a weakly referenced object
       
    87 becomes part of a clump of cyclic trash.  When enough cycles are broken by
       
    88 cyclic gc that the object is reclaimed, the callback is invoked.  If it's
       
    89 possible for the callback to get at objects in the cycle(s), then it may be
       
    90 possible for those objects to access (via strong references in the cycle)
       
    91 the weakly referenced object being torn down, or other objects in the cycle
       
    92 that have already suffered a tp_clear() call.  There's no guarantee that an
       
    93 object is in a sane state after tp_clear().  Bad things (including
       
    94 segfaults) can happen right then, during the callback's execution, or can
       
    95 happen at any later time if the callback manages to resurrect an insane
       
    96 object.
       
    97 
       
    98 [That missed that, in addition, a weakref to CT can exist outside CT, and
       
    99  any callback into Python can use such a non-CT weakref to resurrect its CT
       
   100  referent.  The same bad kinds of things can happen then.]
       
   101 
       
   102 Note that if it's possible for the callback to get at objects in the trash
       
   103 cycles, it must also be the case that the callback itself is part of the
       
   104 trash cycles.  Else the callback would have acted as an external root to
       
   105 the current collection, and nothing reachable from it would be in cyclic
       
   106 trash either.
       
   107 
       
   108 [Except that a non-CT callback can also use a non-CT weakref to get at
       
   109  CT objects.]
       
   110 
       
   111 More, if the callback itself is in cyclic trash, then the weakref to which
       
   112 the callback is attached must also be trash, and for the same kind of
       
   113 reason:  if the weakref acted as an external root, then the callback could
       
   114 not have been cyclic trash.
       
   115 
       
   116 So a problem here requires that a weakref, that weakref's callback, and the
       
   117 weakly referenced object, all be in cyclic trash at the same time.  This
       
   118 isn't easy to stumble into by accident while Python is running, and, indeed,
       
   119 it took quite a while to dream up failing test cases.  Zope3 saw segfaults
       
   120 during shutdown, during the second call of gc in Py_Finalize, after most
       
   121 modules had been torn down.  That creates many trash cycles (esp. those
       
   122 involving new-style classes), making the problem much more likely.  Once you
       
   123 know what's required to provoke the problem, though, it's easy to create
       
   124 tests that segfault before shutdown.
       
   125 
       
   126 In 2.3.3, before breaking cycles, we first clear all the weakrefs with
       
   127 callbacks in cyclic trash.  Since the weakrefs *are* trash, and there's no
       
   128 defined-- or even predictable --order in which tp_clear() gets called on
       
   129 cyclic trash, it's defensible to first clear weakrefs with callbacks.  It's
       
   130 a feature of Python's weakrefs too that when a weakref goes away, the
       
   131 callback (if any) associated with it is thrown away too, unexecuted.
       
   132 
       
   133 [In 2.4/2.3.5, we first clear all weakrefs to CT objects, whether or not
       
   134  those weakrefs are themselves CT, and whether or not they have callbacks.
       
   135  The callbacks (if any) on non-CT weakrefs (if any) are invoked later,
       
   136  after all weakrefs-to-CT have been cleared.  The callbacks (if any) on CT
       
   137  weakrefs (if any) are never invoked, for the excruciating reasons
       
   138  explained here.]
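
The difference between CT and non-CT weakrefs can be observed from Python.
The following is a sketch under the assumption of a current CPython 3; Node
and trash_vs_live_weakref_callbacks are illustrative names only.  One weakref
is stored on the trash object itself (so it is trash too), the other is held
by the calling frame.

```python
import gc
import weakref


class Node(object):
    """Plain instance type; instances can be weakly referenced."""


def trash_vs_live_weakref_callbacks():
    fired = []

    o = Node()
    o.me = o  # self-cycle, so O can only be reclaimed by cyclic gc

    # CT weakref: referenced only from O, so it becomes trash along with O.
    o.inner = weakref.ref(o, lambda ref: fired.append("inner"))

    # Non-CT weakref: referenced from this frame, so it stays reachable.
    outer = weakref.ref(o, lambda ref: fired.append("outer"))

    del o
    gc.collect()
    return fired
```

trash_vs_live_weakref_callbacks() should return ["outer"]: the callback on
the weakref that was itself trash is dropped without being run.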
       
   139 
       
   140 Just that much is almost enough to prevent problems, by throwing away
       
   141 *almost* all the weakref callbacks that could get triggered by gc.  The
       
   142 problem remaining is that clearing a weakref with a callback decrefs the
       
   143 callback object, and the callback object may *itself* be weakly referenced,
       
   144 via another weakref with another callback.  So the process of clearing
       
   145 weakrefs can trigger callbacks attached to other weakrefs, and those
       
   146 latter weakrefs may or may not be part of cyclic trash.
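
The chain reaction described above can be reproduced without gc.  This is a
sketch assuming a standard (reference-counted) CPython build; Target,
chained_callback_demo, cb1 and cb2 are hypothetical names.  Dropping a
weakref throws away its callback unexecuted, but decref'ing that callback
object can fire a different callback.

```python
import weakref


class Target(object):
    """Plain instance type; instances can be weakly referenced."""


def chained_callback_demo():
    log = []

    def cb1(ref):
        log.append("cb1 ran")

    def cb2(ref):
        log.append("cb2 ran: cb1 was freed")

    target = Target()
    w1 = weakref.ref(target, cb1)  # w1's callback is cb1
    w2 = weakref.ref(cb1, cb2)     # cb1 is itself weakly referenced

    del cb1  # w1 now holds the only strong reference to cb1
    del w1   # w1 dies before target: cb1 is discarded unexecuted,
             # but freeing cb1 triggers cb2 via w2

    return log, w2() is None
```

chained_callback_demo() should return (["cb2 ran: cb1 was freed"], True).
Note that target is still alive throughout; cb1 never runs.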
       
   147 
       
   148 So, to prevent any Python code from running while gc is invoking tp_clear()
       
   149 on all the objects in cyclic trash,
       
   150 
       
   151 [That was always wrong:  we can't stop Python code from running when gc
       
   152  is breaking cycles.  If an object with a __del__ method is not itself in
       
   153  a cycle, but is reachable only from CT, then breaking cycles will, as a
       
   154  matter of course, drop the refcount on that object to 0, and its __del__
       
   155  will run right then.  What we can and must stop is running any Python
       
   156  code that could access CT.]
       
   157                                      it's not quite enough just to invoke
       
   158 tp_clear() on weakrefs with callbacks first.  Instead the weakref module
       
   159 grew a new private function (_PyWeakref_ClearRef) that does only part of
       
   160 tp_clear():  it removes the weakref from the weakly-referenced object's list
       
   161 of weakrefs, but does not decref the callback object.  So calling
       
   162 _PyWeakref_ClearRef(wr) ensures that wr's callback object will never
       
   163 trigger, and (unlike weakref's tp_clear()) also prevents any callback
       
   164 associated *with* wr's callback object from triggering.
       
   165 
       
   166 [Although we may trigger such callbacks later, as explained below.]
       
   167 
       
   168 Then we can call tp_clear on all the cyclic objects and never trigger
       
   169 Python code.
       
   170 
       
   171 [As above, not so:  it means never trigger Python code that can access CT.]
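
The bracketed corrections above point out that Python code does run while gc
is reclaiming trash.  A sketch of the case they mention, an object with a
__del__ method that is not in a cycle but is reachable only from CT, assuming
a current CPython 3 (Leaf, Node and finalizer_runs_during_collection are
made-up names; automatic collection is switched off only to make the
before/after comparison deterministic):

```python
import gc


class Leaf(object):
    def __init__(self, log):
        self.log = log

    def __del__(self):
        self.log.append("Leaf.__del__ ran")


class Node(object):
    """Plain instance type used to build the cycle."""


def finalizer_runs_during_collection():
    log = []
    was_enabled = gc.isenabled()
    gc.disable()  # keep automatic collections out of the comparison
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a  # a <-> b form the cycle
        a.leaf = Leaf(log)       # Leaf hangs off the cycle, not in it
        del a, b                 # the whole clump is now unreachable

        before = list(log)       # nothing has been finalized yet
        gc.collect()             # reclaiming the cycle finalizes Leaf
        return before, log
    finally:
        if was_enabled:
            gc.enable()
```

finalizer_runs_during_collection() should return ([], ["Leaf.__del__ ran"]).
The __del__ method here touches only the reachable list it was given, which
is the kind of Python code that is safe to run.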
       
   172 
       
   173 After we do that, the callback objects still need to be decref'ed.  Callbacks
       
   174 (if any) *on* the callback objects that were also part of cyclic trash won't
       
   175 get invoked, because we cleared all trash weakrefs with callbacks at the
       
   176 start.  Callbacks on the callback objects that were not part of cyclic trash
       
   177 acted as external roots to everything reachable from them, so nothing
       
   178 reachable from them was part of cyclic trash, so gc didn't do any damage to
       
   179 objects reachable from them, and it's safe to call them at the end of gc.
       
   180 
       
   181 [That's so.  In addition, now we also invoke (if any) the callbacks on
       
   182  non-CT weakrefs to CT objects, during the same pass that decrefs the
       
   183  callback objects.]
       
   184 
       
   185 An alternative would have been to treat objects with callbacks like objects
       
   186 with __del__ methods, refusing to collect them, appending them to gc.garbage
       
   187 instead.  That would have been much easier.  Jim Fulton gave a strong
       
   188 argument against that (on Python-Dev):
       
   189 
       
   190     There's a big difference between __del__ and weakref callbacks.
       
   191     The __del__ method is "internal" to a design.  When you design a
       
   192     class with a del method, you know you have to avoid including the
       
   193     class in cycles.
       
   194 
       
   195     Now, suppose you have a design that has no __del__ methods but
       
   196     that does use cyclic data structures.  You reason about the design,
       
   197     run tests, and convince yourself you don't have a leak.
       
   198 
       
   199     Now, suppose some external code creates a weakref to one of your
       
   200     objects.  All of a sudden, you start leaking.  You can look at your
       
   201     code all you want and you won't find a reason for the leak.
       
   202 
       
   203 IOW, a class designer can out-think __del__ problems, but has no control
       
   204 over who creates weakrefs to his classes or class instances.  The class
       
   205 user has little chance either of predicting when the weakrefs he creates
       
   206 may end up in cycles.
       
   207 
       
   208 Callbacks on weakref callbacks are executed in an arbitrary order, and
       
   209 that's not good (a primary reason not to collect cycles with objects with
       
   210 __del__ methods is to avoid running finalizers in an arbitrary order).
       
   211 However, a weakref callback on a weakref callback has got to be rare.
       
   212 It's possible to do such a thing, so gc has to be robust against it, but
       
   213 I doubt anyone has done it outside the test case I wrote for it.
       
   214 
       
   215 [The callbacks (if any) on non-CT weakrefs to CT objects are also executed
       
   216  in an arbitrary order now.  But they were before too, depending on the
       
   217  vagaries of when tp_clear() happened to break enough cycles to trigger
       
   218  them.  People simply shouldn't try to use __del__ or weakref callbacks to
       
   219  do fancy stuff.]