Are all objects returned by value rather than by reference?
I am coding in Python trying to decide whether I should return a numpy array (the result of a diff on some other array) or return numpy.where(diff), which is a smaller array but requires that little extra work to create. Let's call the method where this happens methodB.
I call methodB from methodA. The rub is that I won't necessarily always need the where() result in methodA, but I might. So is it worth doing this work inside methodB, or should I pass back the (much larger memory-wise) diff itself and then only process it further in methodA if needed? That would be the more efficient choice assuming methodA just gets a reference to the result.
So, are function results ever not copied when they are passed back to the code that called that function?
I believe that when methodB finishes, all the memory in its frame will be reclaimed by the system, so methodA has to actually copy anything returned by methodB into its own frame in order to be able to use it. I would call this "return by value". Is this correct?
Assignment never copies data. If you have a function foo that returns a value, then an assignment like result = foo(arg) never copies any data. (You could, of course, have copy operations in the function's body.) Likewise, return x does not copy the object x.
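A minimal sketch of that claim, using a hypothetical foo and a module-level list (created) that exists only so the object built inside the function can be compared with what the caller receives:

```python
created = []  # illustration only: keeps a handle on the object built inside foo


def foo(arg):
    x = [arg, arg * 2]  # a new list object is created here
    created.append(x)   # remember the very same object (no copy)
    return x            # hands back a reference to x, not a copy


result = foo(21)
alias = result          # plain assignment: a second name for the same object

print(result is created[0])  # True: the caller got the object made in foo
print(alias is result)       # True: assignment did not copy either
```

The is operator compares object identity, so both checks would fail if a copy were made anywhere along the way.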
Your question lacks a specific example, so I can't go into more detail.
edit: You should probably watch the excellent Facts and Myths about Python names and values talk.
Yes, you are correct. In Python, arguments are always passed by value, and return values are always returned by value. However, the value being returned (or passed) is a reference to a potentially shared, potentially mutable object.
There are some types for which the value being returned or passed may be the actual object itself, e.g. integers. However, the difference between the two can only be observed for mutable objects, which integers are not, and de-referencing an object reference is completely transparent, so you will never notice the difference. To simplify your mental model, you may just assume that arguments and return values are always passed by value (this is true anyhow), and that the value being passed is always a reference (this is not always true, but you cannot tell the difference, so you can treat it as a simple performance optimization).
Note that passing / returning a reference by value is in no way similar to (and certainly not the same thing as) passing / returning by reference. In particular, it does not allow you to mutate the name binding in the caller / callee, as pass-by-reference would.
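The difference can be seen in a small sketch (the function names rebind and mutate are made up for illustration): rebinding the parameter inside the callee leaves the caller's name untouched, while mutating the shared object is visible to the caller.

```python
def rebind(items):
    # Rebinds the local name only; the caller's binding is unaffected.
    items = [0]


def mutate(items):
    # Mutates the shared object; the caller sees the change.
    items.append(4)


data = [1, 2, 3]

rebind(data)
after_rebind = list(data)  # snapshot taken for comparison

mutate(data)
after_mutate = list(data)

print(after_rebind)  # [1, 2, 3]
print(after_mutate)  # [1, 2, 3, 4]
```

With true pass-by-reference, rebind would have replaced the caller's data with [0].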
This particular flavor of pass-by-value, where the value is typically a reference, is the same in e.g. ECMAScript, Ruby, Smalltalk, and Java. It is sometimes called "call by object sharing" (coined by Barbara Liskov, I believe), "call by sharing", "call by object", and, specifically within the Python community, "call by assignment" (thanks to @timgeb) or "call by name-binding" (thanks to @Terry Jan Reedy), not to be confused with call by name, which is again a different thing.
So roughly your code is:

    def methodA(arr):
        x = methodB(arr)
        ...

    def methodB(arr):
        diff = somefn(arr)
        # return diff  or
        # return np.where(diff)

arr is a (large) array whose reference is passed to methodB. No copies are made.
diff is a similar size array that is generated in methodB. If that is returned, it will be referenced by x in methodA. No copy is made in returning it.
If the where array is returned instead, diff is discarded when methodB returns. Assuming it doesn't share a data buffer with some other array (such as arr), all the memory that it occupied is recovered.
But as long as memory isn't tight, returning diff instead of the where result won't be more expensive. Nothing is copied during the return.
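As a rough sketch of that trade-off, assuming np.diff stands in for the unspecified somefn and using hypothetical names method_a, method_b and need_indices, the larger array can be returned as-is and where() applied only when the caller actually needs it:

```python
import numpy as np


def method_b(arr):
    # Assumption: np.diff plays the role of the question's "diff" step.
    return np.diff(arr)  # returns a reference to the new array; no copy


def method_a(arr, need_indices):
    diff = method_b(arr)          # binds a name to the returned array
    if need_indices:
        return np.where(diff)[0]  # pay for where() only when it is needed
    return diff


arr = np.array([1, 1, 2, 2, 2, 5])
indices = method_a(arr, need_indices=True)
full = method_a(arr, need_indices=False)

print(indices)  # [1 4]
print(full)     # [0 1 0 0 3]
```

The cost of this design is that diff stays alive in method_a for as long as something refers to it, which only matters if memory is tight.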
A numpy array consists of a small object wrapper with attributes like dtype. It also has a pointer to a potentially large data buffer. Where possible numpy tries to share buffers, but readily makes new ndarray objects. Thus there's an important distinction between a view, which shares another array's buffer, and a copy, which has its own.
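The buffer sharing described above can be observed with np.shares_memory. This is a small illustration with made-up variable names, not a complete account of when numpy shares buffers:

```python
import numpy as np

base = np.arange(10)

view = base[2:6]             # basic slicing: new ndarray object, same data buffer
copied = base[2:6].copy()    # explicit copy: new ndarray object, new data buffer

print(view is base)                    # False: a distinct wrapper object
print(np.shares_memory(base, view))    # True
print(np.shares_memory(base, copied))  # False

view[0] = 99                 # writes through to the shared buffer
print(base[2])               # 99
print(copied[0])             # 2: the copy is unaffected
```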
I see what I missed now: Objects are created on the heap, but function frames are on the stack. So when methodB finishes, its frame will be reclaimed, but that object will still exist on the heap, and methodA can access it with a simple reference.