I’m looking for documents that describe in detail how Python garbage collection works.
I’m interested in what is done in each step. What objects are in each of the three generations? What kinds of objects are deleted in each step? What algorithm is used to find reference cycles?
Background: I’m implementing some searches that have to finish in a small amount of time. When the garbage collector starts collecting the oldest generation, it is “much” slower than in other cases, and the search then takes longer than its time budget allows. I’m looking for a way to predict when it will collect the oldest generation and how long that will take.
It is easy to predict when it will collect the oldest generation with get_threshold(), and that can also be manipulated with set_threshold(). But I don’t see an easy way to decide whether it is better to force a collection with collect() or to wait for the scheduled one.
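One way to get numbers for that decision is to time a forced full collection yourself. The following is only a minimal sketch: the Node class and the time_full_collection() helper are made up for illustration, and the timing you see depends entirely on how many tracked objects your process holds.

```python
import gc
import time


class Node:
    """Hypothetical object that refers to itself, so only the cycle collector can free it."""

    def __init__(self):
        self.ref = self


def time_full_collection():
    """Force a collection of the oldest generation; return (seconds, unreachable objects found)."""
    start = time.perf_counter()
    unreachable = gc.collect(2)  # 2 = oldest generation; younger ones are collected too
    return time.perf_counter() - start, unreachable


gc.disable()  # keep automatic collections from running while garbage is created
try:
    for _ in range(10_000):
        Node()  # each instance becomes cyclic garbage immediately

    thresholds = gc.get_threshold()  # (threshold0, threshold1, threshold2)
    counts = gc.get_count()          # current per-generation counters
    elapsed, unreachable = time_full_collection()
finally:
    gc.enable()

print("thresholds:", thresholds)
print("counts before forced collection:", counts)
print(f"full collection took {elapsed:.6f}s and found {unreachable} unreachable objects")
```

If the measured time fits into an idle gap between searches, forcing collect() there (and possibly raising the thresholds with set_threshold()) may be preferable to waiting for the scheduled collection; if not, this at least tells you how much time to budget.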
There’s no definitive resource on how Python does its garbage collection (other than the source code itself), but those 3 links should give you a pretty good idea.
The source is actually pretty helpful. How much you get out of it depends on how well you read C, but the comments are very helpful. Skip down to the collect() function at https://github.com/python/cpython/blob/master/Modules/gcmodule.c and the comments explain the process well (albeit in very technical terms).
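Besides reading the source, you can observe collections at runtime. The gc module exposes gc.callbacks (Python 3.3+), a list of functions called with a phase ("start" or "stop") and an info dict containing the generation being collected. The sketch below uses that hook to record how long each collection takes; the names durations and record_gc are only illustrative.

```python
import gc
import time
from collections import defaultdict

durations = defaultdict(list)  # generation -> list of collection times in seconds
_started = {}                  # generation -> start timestamp of the running collection


def record_gc(phase, info):
    """gc callback: remember when a collection starts and how long it ran."""
    generation = info["generation"]
    if phase == "start":
        _started[generation] = time.perf_counter()
    elif phase == "stop" and generation in _started:
        durations[generation].append(time.perf_counter() - _started.pop(generation))


gc.callbacks.append(record_gc)
try:
    gc.collect(2)  # stand-in for a real workload; forces an oldest-generation collection
finally:
    gc.callbacks.remove(record_gc)

for generation, times in sorted(durations.items()):
    print(f"generation {generation}: {len(times)} collection(s), slowest {max(times):.6f}s")
```

Left installed during a realistic run, a callback like this shows how often the oldest generation is actually collected and how expensive it is for your workload.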