digitalmars.D - Example of Why Reference Counting is Important

reply Vijay Nayar <madric gmail.com> writes:
**TL;DR**: Reference Counting is a frequent topic of discussion 
in the D community. The motivations aren't always clear. In the 
context of machine learning, reference counting is valuable due 
to the low memory capacity of graphics cards, and the need to 
free unused memory as quickly as possible.

Every year, especially at DConf, there's a lot of talk about 
various approaches and difficulties regarding Reference Counting. 
Taken without context, it has plenty of pros and cons 
(https://en.wikipedia.org/wiki/Garbage_collection_%28computer_science%29#Reference_counting), 
but the reasons to go with reference counting over tracing or 
escape analysis weren't very clear to me.

However, in the course of my work, I came across a clear use-case 
that others might find illuminating. The context is machine 
learning, where large numbers of partial derivatives of an error 
function must be computed with respect to hundreds of thousands, 
if not millions, of parameters. Due to the high number of matrix 
operations involved, this work is primarily vectorized and 
executed on graphics cards.

While working on this, I came across the following passage in a 
paper on PyTorch (https://pytorch.org/), a well-known machine 
learning library:

https://openreview.net/pdf?id=BJJsrmfCZ
 **Memory management** The main use case for PyTorch is training 
 machine learning models on GPU. As one of the biggest limitations 
 of GPUs is low memory capacity, PyTorch takes great care to make 
 sure that all intermediate values are freed as soon as they 
 become unneeded. Indeed, Python is well-suited for this purpose, 
 because it is reference counted by default (using a garbage 
 collector only to break cycles).
 PyTorch’s Variable and Function must be designed to work well in 
 a reference counted regime. For example, a Function records 
 pointers to the Function which consumes its result, so that a 
 Function subgraph is freed when its retaining output Variable 
 becomes dead. This is opposite of the conventional ownership for 
 closures, where a closure retains the closures it invokes (a 
 pointer to the Function which produces its result.)
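
To make the "freed as soon as they become unneeded" part concrete, here is a minimal sketch in plain Python. It is not PyTorch code: `Node`, `Variable`, and `build_output` are hypothetical names, and the ownership model is deliberately simplified so that a single output object is the only thing keeping a small graph alive. It relies on CPython's reference counting, and the cyclic collector is disabled to show that no collector pass is involved.

```python
import gc
import weakref


class Node:
    """Hypothetical graph node holding strong references to the nodes it depends on."""

    def __init__(self, name, inputs=()):
        self.name = name
        self.inputs = tuple(inputs)


class Variable:
    """Hypothetical output value: the only long-lived owner of its graph."""

    def __init__(self, producer):
        self.producer = producer


def build_output(freed):
    """Build a three-node chain and return only the output that retains it."""
    load = Node("load")
    matmul = Node("matmul", [load])
    loss = Node("loss", [matmul])
    for node in (load, matmul, loss):
        # Log the node's name when it is reclaimed, without keeping it alive.
        weakref.finalize(node, freed.append, node.name)
    return Variable(loss)


gc.disable()  # rule out the cyclic collector; only reference counting remains
freed = []
output = build_output(freed)
freed_while_alive = list(freed)  # nothing is reclaimed while `output` exists

del output  # the last reference to the graph goes away here
freed_after_del = sorted(freed)  # the whole chain is reclaimed immediately
gc.enable()
```

With a tracing collector, the three nodes would stay in memory until the next collection cycle, which is exactly the delay that hurts when the memory in question is on a GPU.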
I hope this provides some clarity and insight to others.
Aug 10 2023
parent Vijay Nayar <madric gmail.com> writes:
On Thursday, 10 August 2023 at 13:33:21 UTC, Vijay Nayar wrote:
 **TL;DR**: Reference Counting is a frequent topic of discussion 
 in the D community. The motivations aren't always clear. In the 
 context of machine learning, reference counting is valuable due 
 to the low memory capacity of graphics cards, and the need to 
 free unused memory as quickly as possible.
And just for additional reference, the discussion of the garbage collector design in CPython (reference counting paired with a cyclic garbage collector to find reference cycles) can be found here: https://devguide.python.org/internals/garbage-collector/
"CPython" is just the reference implementation of the Python language; it's akin to `dmd` for D.
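
For anyone wondering why the cyclic collector is needed at all, here is a small sketch of the case reference counting alone cannot handle. The `Node` class and `make_cycle` function are hypothetical names used only for illustration, and the behaviour shown assumes CPython.

```python
import gc
import weakref


class Node:
    """Hypothetical object that can point at another Node."""

    def __init__(self, name):
        self.name = name
        self.other = None


def make_cycle(freed):
    """Create two objects that reference each other, then drop all outside references."""
    a = Node("a")
    b = Node("b")
    a.other = b
    b.other = a
    # Log each name when the object is reclaimed, without keeping it alive.
    weakref.finalize(a, freed.append, "a")
    weakref.finalize(b, freed.append, "b")


gc.disable()  # no automatic collector passes; only reference counting runs
freed = []
make_cycle(freed)
# Each object is still referenced by the other, so neither count reaches zero.
freed_by_refcount = list(freed)

gc.collect()  # an explicit pass of the cyclic collector finds the unreachable pair
freed_by_collector = sorted(freed)
gc.enable()
```

This is also why the PyTorch authors care about which direction the pointers go: as long as the graph has no cycles, everything is reclaimed by reference counting alone and nothing waits for a collector pass.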
Aug 11 2023