Pass By Ref Hierarchy Argument Essay

Many answers here (and in particular the most highly upvoted answer) are factually incorrect, since they misunderstand what "call by reference" really means. Here's my attempt to set matters straight.

TL;DR

In simplest terms:

  • call by value means that you pass values as function arguments
  • call by reference means that you pass variables as function arguments

In metaphoric terms:

  • Call by value is where I write down something on a piece of paper and hand it to you. Maybe it's a URL, maybe it's a complete copy of War and Peace. No matter what it is, it's on a piece of paper which I've given to you, and so now it is effectively your piece of paper. You are now free to scribble on that piece of paper, or use that piece of paper to find something somewhere else and fiddle with it, whatever.
  • Call by reference is when I give you my notebook which has something written down in it. You may scribble in my notebook (maybe I want you to, maybe I don't), and afterwards I keep my notebook, with whatever scribbles you've put there. Also, if what either you or I wrote there is information about how to find something somewhere else, either you or I can go there and fiddle with that information.

What "call by value" and "call by reference" don't mean

Note that both of these concepts are completely independent of, and orthogonal to, the concept of reference types (which in Java is all types that are subtypes of Object, and in C# all class types), or the concept of pointer types like in C (which are semantically equivalent to Java's "reference types", simply with different syntax).

The notion of reference type corresponds to a URL: it is both itself a piece of information, and it is a reference (a pointer, if you will) to other information. You can have many copies of a URL in different places, and they don't change what website they all link to; if the website is updated then every URL copy will still lead to the updated information. Conversely, changing the URL in any one place won't affect any other written copy of the URL.

Note that C++ has a notion of "references" (e.g. int&) that is not like Java and C#'s "reference types", but is like "call by reference". Java and C#'s "reference types", and all types in Python, are like what C and C++ call "pointer types" (e.g. int*).


OK, here's the longer and more formal explanation.

Terminology

To start with, I want to highlight some important bits of terminology, to help clarify my answer and to ensure we're all referring to the same ideas when we are using words. (In practice, I believe the vast majority of confusion about topics such as these stems from using words in ways that do not fully communicate the meaning that was intended.)

To start, here's an example in some C-like language of a function declaration:

And here's an example of calling this function:

Using this example, I want to define some important bits of terminology:

  • is a function declared on line 1 (Java insists on making all functions methods, but the concept is the same without loss of generality; C and C++ make a distinction between declaration and definition which I won't go into here)
  • is a formal parameter to , also declared on line 1
  • is a variable, specifically a local variable of the function , declared and initialized on line 2
  • is also an argument to a specific invocation of on line 3

There are two very important sets of concepts to distinguish here. The first is value versus variable:

  • A value is the result of evaluating an expression in the language. For example, in the function above, after the line , the expression has the value.
  • A variable is a container for values. A variable can be mutable (this is the default in most C-like languages), read-only (e.g. declared using Java's final or C#'s readonly) or deeply immutable (e.g. using C++'s const).

The other important pair of concepts to distinguish is parameter versus argument:

  • A parameter (also called a formal parameter) is a variable which must be supplied by the caller when calling a function.
  • An argument is a value that is supplied by the caller of a function to satisfy a specific formal parameter of that function.
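
To make the four terms concrete, here is a minimal Python sketch. The names foo, param, bar, and arg are hypothetical placeholders chosen for illustration, and Python stands in for the C-like language used above.

```python
def foo(param):          # foo is a function; param is its formal parameter
    return param + 1     # param + 1 is an expression; evaluating it yields a value


def bar():
    arg = 1              # arg is a local variable of bar, holding the value 1
    result = foo(arg)    # the value of arg is the argument for this call to foo
    return arg, result


print(bar())  # (1, 2)
```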

Call by value

In call by value, the function's formal parameters are variables that are newly created for the function invocation, and which are initialized with the values of their arguments.

This works exactly the same way that any other kinds of variables are initialized with values. For example:

Here and are completely independent variables -- their values can change independently of each other. However, at the point where is declared, it is initialized to hold the same value that holds -- which is .

Since they are independent variables, changes to do not affect :

This is exactly the same as the relationship between and in our example above, which I'll repeat here for symmetry:

It is exactly as if we had written the code this way:

That is, the defining characteristic of what call by value means is that the callee ( in this case) receives values as arguments, but has its own separate variables for those values from the variables of the caller ( in this case).
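
The same behaviour can be sketched in Python. This is an illustration under assumptions: all names are hypothetical, and it relies on Python integers being immutable, so that rebinding a name behaves like assigning to an independent variable in a C-like language.

```python
def increment(param):
    # param is a fresh variable initialized from the argument's value.
    param += 1               # rebinds the callee's variable only
    return param


def caller():
    arg = 1
    another_variable = arg   # initialized with the value that arg holds
    another_variable += 1    # does not affect arg
    seen_by_callee = increment(arg)
    return arg, another_variable, seen_by_callee


print(caller())  # (1, 2, 2)
```

The caller's arg is still 1 after both the local assignment and the call.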

Going back to my metaphor above, if I'm and you're , when I call you, I hand you a piece of paper with a value written on it. You call that piece of paper . That value is a copy of the value I have written in my notebook (my local variables), in a variable I call .

(As an aside: depending on hardware and operating system, there are various calling conventions about how you call one function from another. The calling convention is like us deciding whether I write the value on a piece of my paper and then hand it to you, or if you have a piece of paper that I write it on, or if I write it on the wall in front of both of us. This is an interesting subject as well, but far beyond the scope of this already long answer.)

Call by reference

In call by reference, the function's formal parameters are simply new names for the same variables that the caller supplies as arguments.

Going back to our example above, it's equivalent to:

Since is just another name for -- that is, they are the same variable, changes to are reflected in . This is the fundamental way in which call by reference differs from call by value.

Very few languages support call by reference, but C++ can do it like this:

In this case, doesn't just have the same value as , it actually is (just by a different name) and so can observe that has been incremented.

Note that this is not how any of Java, JavaScript, C, Objective-C, Python, or nearly any other popular language today works. This means that those languages are not call by reference, they are call by value.
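
A common litmus test for call by reference, not taken from the answer itself, is whether a swap function can exchange the caller's variables. A minimal Python sketch with hypothetical names suggests that it cannot:

```python
def swap(a, b):
    # Rebinds only the callee's own variables a and b.
    a, b = b, a
    return a, b


def try_swap():
    x, y = "left", "right"
    swapped_inside = swap(x, y)
    # x and y still hold what they held before the call.
    return x, y, swapped_inside


print(try_swap())  # ('left', 'right', ('right', 'left'))
```

Under true call by reference, x and y would have been exchanged once swap returned.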

Addendum: call by object sharing

If what you have is call by value, but the actual value is a reference type or pointer type, then the "value" itself isn't very interesting (e.g. in C it's just an integer of a platform-specific size) -- what's interesting is what that value points to.

If what that reference type (that is, pointer) points to is mutable then an interesting effect is possible: you can modify the pointed-to value, and the caller can observe changes to the pointed-to value, even though the caller cannot observe changes to the pointer itself.

To borrow the analogy of the URL again, the fact that I gave you a copy of the URL to a website is not particularly interesting if the thing we both care about is the website, not the URL. The fact that your scribbling over your copy of the URL doesn't affect my copy of the URL isn't a thing we care about (and in fact, in languages like Java and Python the "URL", or reference type value, can't be modified at all, only the thing pointed to by it can).

Barbara Liskov, when she invented the CLU programming language (which had these semantics), realized that the existing terms "call by value" and "call by reference" weren't particularly useful for describing the semantics of this new language. So she invented a new term: call by object sharing.

When discussing languages that are technically call by value, but where common types in use are reference or pointer types (that is: nearly every modern imperative, object-oriented, or multi-paradigm programming language), I find it's a lot less confusing to simply avoid talking about call by value or call by reference. Stick to call by object sharing (or simply call by object) and nobody will be confused. :-)

Programming languages use evaluation strategies to determine when to evaluate the argument(s) of a function call (for function, also read: operation, method, or relation) and what kind of value to pass to the function. For example, call by reference specifies that a function application evaluates the argument before it proceeds to the evaluation of the function's body and that it passes two capabilities to the function, namely, the ability to look up the current value of the argument and to modify it via an assignment statement.[1] The notion of reduction strategy in lambda calculus is similar but distinct.

In practical terms, many modern programming languages have converged on a call-by-value/call-by-reference evaluation strategy for function calls (C#, Java). Some languages, especially lower-level languages such as C++, combine several notions of parameter passing. Historically, call by value and call by name date back to ALGOL 60, a language designed in the late 1950s. Call by reference is used by PL/I and some Fortran systems.[2] Purely functional languages like Haskell, as well as non-purely functional languages like R, use call by need.

The evaluation strategy is specified by the programming language definition, and is not a function of any specific implementation.

Strict evaluation

In strict evaluation, the arguments to a function are always evaluated completely before the function is applied.

Under Church encoding, eager evaluation of operators maps to strict evaluation of functions; for this reason, strict evaluation is sometimes called "eager". Most existing programming languages use strict evaluation for functions.

Applicative order

Applicative order (or leftmost innermost[3][4]) evaluation refers to an evaluation strategy in which the arguments of a function are evaluated from left to right in a post-order traversal of reducible expressions (redexes). Applicative order is a call-by-value evaluation.

Call by value

Call by value (also referred to as pass by value) is the most common evaluation strategy, used in languages as different as C and Scheme. In call by value, the argument expression is evaluated, and the resulting value is bound to the corresponding variable in the function (frequently by copying the value into a new memory region). If the function or procedure is able to assign values to its parameters, only its local variable is assigned—that is, anything passed into a function call is unchanged in the caller's scope when the function returns.

Call by value is not a single evaluation strategy, but rather the family of evaluation strategies in which a function's argument is evaluated before being passed to the function. While many programming languages (such as Common Lisp, Eiffel and Java) that use call by value evaluate function arguments left-to-right, some evaluate functions and their arguments right-to-left, and others (such as Scheme, OCaml and C) leave the order unspecified.
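
Evaluation order becomes observable once argument expressions have side effects. The sketch below uses Python, which is not among the languages listed above but whose language reference specifies left-to-right evaluation; the helper noisy is hypothetical.

```python
def noisy(label, log):
    # Record when this argument expression is evaluated, then return its label.
    log.append(label)
    return label


def combine(first, second, third):
    return first + second + third


def run():
    log = []
    result = combine(noisy("a", log), noisy("b", log), noisy("c", log))
    return result, log


print(run())  # ('abc', ['a', 'b', 'c'])
```

In a language that leaves the order unspecified, the log could legitimately come out in a different order.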

Implicit limitations

In some cases, the term "call by value" is problematic, as the value which is passed is not the value of the variable as understood by the ordinary meaning of value, but an implementation-specific reference to the value. The effect is that what syntactically looks like call by value may end up rather behaving like call by reference or call by sharing, often depending on very subtle aspects of the language semantics.

The reason for passing a reference is often that the language technically does not provide a value representation of complicated data, but instead represents them as a data structure while preserving some semblance of value appearance in the source code. Exactly where the boundary is drawn between proper values and data structures masquerading as such is often hard to predict. In C, an array (of which strings are special cases) is a data structure but the name of an array is treated as (has as value) the reference to the first element of the array, while a struct variable's name refers to a value even if it has fields that are vectors. In Maple, a vector is a special case of a table and therefore a data structure, but a list (which gets rendered and can be indexed in exactly the same way) is a value. In Tcl, values are "dual-ported" such that the value representation is used at the script level, and the language itself manages the corresponding data structure, if one is required. Modifications made via the data structure are reflected back to the value representation, and vice versa.

The description "call by value where the value is a reference" is common (but should not be understood as being call by reference); another term is call by sharing. Thus the behaviour of call by value Java or Visual Basic and call by value C or Pascal are significantly different: in C or Pascal, calling a function with a large structure as an argument will cause the entire structure to be copied (except if it's actually a reference to a structure), potentially causing serious performance degradation, and mutations to the structure are invisible to the caller. However, in Java or Visual Basic only the reference to the structure is copied, which is fast, and mutations to the structure are visible to the caller.
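
The difference can be imitated in Python, where only references are ever passed. In this sketch copy.deepcopy stands in for the structure copy that C or Pascal would perform on the call; the function names are hypothetical.

```python
import copy


def rename(record):
    # Mutates whatever object the parameter refers to.
    record["name"] = "changed"


def call_with_copy(record):
    # Imitates C/Pascal structure passing: the callee works on a full copy.
    rename(copy.deepcopy(record))
    return record["name"]


def call_with_shared_reference(record):
    # Java/Visual Basic style: only the reference is copied.
    rename(record)
    return record["name"]


print(call_with_copy({"name": "original"}))              # original
print(call_with_shared_reference({"name": "original"}))  # changed
```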

Call by reference

Call by reference (also referred to as pass by reference) is an evaluation strategy where a function receives an implicit reference to a variable used as argument, rather than a copy of its value. This typically means that the function can modify (i.e. assign to) the variable used as argument—something that will be seen by its caller. Call by reference can therefore be used to provide an additional channel of communication between the called function and the calling function. A call-by-reference language makes it more difficult for a programmer to track the effects of a function call, and may introduce subtle bugs.

Many languages support call by reference in some form or another, but comparatively few use it as a default. FORTRAN II is an early example of a call-by-reference language. A few languages, such as C++, PHP, Visual Basic .NET, C# and REALbasic, default to call by value, but offer special syntax for call-by-reference parameters. C++ additionally offers call by reference to const. Rust also offers call by reference, but defaults to immutable (const) references[5]. Mutable references have a similar syntax to immutable references.

Call by reference can be simulated in languages that use call by value and don't exactly support call by reference, by making use of references (objects that refer to other objects), such as pointers (objects representing the memory addresses of other objects). Languages such as C and ML use this technique. It is not a separate evaluation strategy (the language calls by value), but it is sometimes referred to as call by address (or pass by address). In ML, references are type- and memory-safe.
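
The technique can be sketched in Python with an explicit reference cell, loosely in the spirit of an ML ref. The Ref class is a hypothetical helper written for this illustration, not a standard library type.

```python
class Ref:
    """A mutable cell that refers to one value."""

    def __init__(self, value):
        self.value = value


def modify(p, q):
    p = 27          # p is a plain value: only the local parameter changes
    q.value = 27    # q is a reference passed by value: the shared cell changes


def demo():
    a = 1
    b = Ref(2)
    modify(a, b)
    return a, b.value


print(demo())  # (1, 27)
```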

A similar effect is achieved by call by sharing (passing an object, which can then be mutated), used in languages like Java, Python and Ruby.

In purely functional languages there is typically no semantic difference between the two strategies (since their data structures are immutable, so there is no possibility for a function to modify any of its arguments), so they are typically described as call by value even though implementations frequently use call by reference internally for the efficiency benefits.

Following is an example that demonstrates call by reference in E:

    def modify(var p, &q) {
        p := 27 # passed by value: only the local parameter is modified
        q := 27 # passed by reference: variable used in call is modified
    }

    ? var a := 1
    # value: 1
    ? var b := 2
    # value: 2
    ? modify(a, &b)
    ? a
    # value: 1
    ? b
    # value: 27

Following is an example of call by address that simulates call by reference in C:

    void modify(int p, int* q, int* r) {
        p = 27;  // passed by value: only the local parameter is modified
        *q = 27; // passed by value or reference, check call site to determine which
        *r = 27; // passed by value or reference, check call site to determine which
    }

    int main() {
        int a = 1;
        int b = 1;
        int x = 1;
        int* c = &x;
        modify(a, &b, c); // a is passed by value, b is passed by reference by creating a pointer (call by value),
                          // c is a pointer passed by value
                          // b and x are changed
        return 0;
    }

Call by sharing

Call by sharing (also referred to as call by object or call by object-sharing) is an evaluation strategy first named by Barbara Liskov et al. for the language CLU in 1974.[6] It is used by languages such as Python,[7] Iota,[8] Java (for object references), Ruby, JavaScript, Scheme, OCaml, AppleScript, and many others. However, the term "call by sharing" is not in common use; the terminology is inconsistent across different sources. For example, in the Java community, they say that Java is call by value.[9] Call by sharing implies that values in the language are based on objects rather than primitive types, i.e. that all values are "boxed".

The semantics of call by sharing differ from call by reference: "In particular it is not call by value because mutations of arguments performed by the called routine will be visible to the caller. And it is not call by reference because access is not given to the variables of the caller, but merely to certain objects". So e.g. if a variable was passed, it is not possible to simulate an assignment on that variable in the callee's scope[11]. However, since the function has access to the same object as the caller (no copy is made), mutations to those objects, if the objects are mutable, within the function are visible to the caller, which may appear to differ from call by value semantics. Mutations of a mutable object within the function are visible to the caller because the object is not copied or cloned — it is shared. For example, in Python, lists are mutable, so:

    def f(l):
        l.append(1)

    m = []
    f(m)
    print(m)

outputs [1] because the append method modifies the object on which it is called.

Assignments within a function are not noticeable to the caller, because, in these languages, passing the variable only means passing (access to) the actual object referred to by the variable, not access to the original (caller's) variable. Since the rebound variable only exists within the scope of the function, the counterpart in the caller retains its original binding. Compare the Python mutation above with this code that binds the formal argument to a new object:

    def f(l):
        l = [1]

    m = []
    f(m)
    print(m)

outputs [], because the statement l = [1] reassigns a new list to the variable rather than to the location it references.

For immutable objects, there is no real difference between call by sharing and call by value, except if object identity is visible in the language. The use of call by sharing with mutable objects is an alternative to input/output parameters:[12] the parameter is not assigned to (the argument is not overwritten and object identity is not changed), but the object (argument) is mutated.

Although this term has widespread usage in the Python community, identical semantics in other languages such as Java and Visual Basic are often described as call by value, where the value is implied to be a reference to the object.

Call by copy-restore

Call by copy-restore (also referred to as copy-in copy-out, call by value result, or call by value return, as termed in the Fortran community) is a special case of call by reference where the provided reference is unique to the caller. This variant has gained attention in multiprocessing contexts and remote procedure calls: if a parameter to a function call is a reference that might be accessible by another thread of execution, its contents may be copied to a new reference that is not; when the function call returns, the updated contents of this new reference are copied back to the original reference ("restored").

The semantics of call by copy-restore also differ from those of call by reference where two or more function arguments alias one another; that is, point to the same variable in the caller's environment. Under call by reference, writing to one will affect the other; call by copy-restore avoids this by giving the function distinct copies, but leaves the result in the caller's environment undefined depending on which of the aliased arguments is copied back first—will the copies be made in left-to-right order both on entry and on return?
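
Python has neither strategy built in, so the following is only a simulation under stated assumptions: a hypothetical Ref cell stands in for a caller's variable, and the restore order is made an explicit parameter so that its effect on aliased arguments is visible.

```python
class Ref:
    """A mutable cell standing in for a caller's variable."""

    def __init__(self, value):
        self.value = value


def body(x, y):
    x.value += 1
    y.value += 10


def call_by_reference(var_x, var_y):
    # The callee works directly on the caller's cells.
    body(var_x, var_y)


def call_by_copy_restore(var_x, var_y, restore_order=("x", "y")):
    # Copy in: the callee gets distinct temporaries.
    tmp = {"x": Ref(var_x.value), "y": Ref(var_y.value)}
    body(tmp["x"], tmp["y"])
    # Copy out: write the temporaries back in the chosen order.
    targets = {"x": var_x, "y": var_y}
    for name in restore_order:
        targets[name].value = tmp[name].value


def aliased(call, **kwargs):
    v = Ref(1)
    call(v, v, **kwargs)   # both arguments alias the same variable
    return v.value


print(aliased(call_by_reference))                               # 12
print(aliased(call_by_copy_restore))                            # 11
print(aliased(call_by_copy_restore, restore_order=("y", "x")))  # 2
```

With aliasing, call by reference accumulates both updates, while copy-restore keeps whichever copy is written back last.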

When the reference is passed to the callee uninitialized, this evaluation strategy may be called call by result.

Partial evaluation

In partial evaluation, evaluation may continue into the body of a function that has not been applied. Any sub-expressions that do not contain unbound variables are evaluated, and function applications whose argument values are known may be reduced. In the presence of side-effects, complete partial evaluation may produce unintended results; for this reason, systems that support partial evaluation tend to do so only for "pure" expressions (expressions without side-effects) within functions.

Non-strict evaluation

In non-strict evaluation, arguments to a function are not evaluated unless they are actually used in the evaluation of the function body.

Under Church encoding, lazy evaluation of operators maps to non-strict evaluation of functions; for this reason, non-strict evaluation is often referred to as "lazy". Boolean expressions in many languages use a form of non-strict evaluation called short-circuit evaluation, where evaluation returns as soon as it can be determined that an unambiguous Boolean will result—for example, in a disjunctive expression where true is encountered, or in a conjunctive expression where false is encountered, and so forth. Conditional expressions also usually use lazy evaluation, where evaluation returns as soon as an unambiguous branch will result.
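
Short-circuit behaviour is easy to observe in Python by logging which operands are evaluated. The helper probe is hypothetical and exists only to record evaluation.

```python
def probe(label, result, log):
    # Record that this operand was evaluated, then return its value.
    log.append(label)
    return result


def run():
    log = []
    disjunction = probe("a", True, log) or probe("b", True, log)
    conjunction = probe("c", False, log) and probe("d", True, log)
    chosen = probe("e", 1, log) if True else probe("f", 2, log)
    return disjunction, conjunction, chosen, log


print(run())  # (True, False, 1, ['a', 'c', 'e'])
```

The operands labelled b, d, and f are never evaluated.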

Normal order

Normal-order (or leftmost outermost) evaluation is the evaluation strategy where the outermost redex is always reduced, applying functions before evaluating function arguments.

In contrast, call by name does not evaluate inside the body of an unapplied function.

Call by name

Call by name is an evaluation strategy where the arguments to a function are not evaluated before the function is called—rather, they are substituted directly into the function body (using capture-avoiding substitution) and then left to be evaluated whenever they appear in the function. If an argument is not used in the function body, the argument is never evaluated; if it is used several times, it is re-evaluated each time it appears. (See Jensen's Device.)

Call-by-name evaluation is occasionally preferable to call-by-value evaluation. If a function's argument is not used in the function, call by name will save time by not evaluating the argument, whereas call by value will evaluate it regardless. If the argument is a non-terminating computation, the advantage is enormous. However, when the function argument is used, call by name is often slower, requiring a mechanism such as a thunk.
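
Python evaluates arguments eagerly, but call by name can be approximated by passing zero-argument functions (thunks) and calling them wherever the parameter is used. The sketch below makes that assumption explicit; all names are hypothetical.

```python
def call_by_name_twice(thunk):
    # Each use of the parameter re-evaluates the argument expression.
    return thunk() + thunk()


def ignore(thunk):
    # The argument is never used, so it is never evaluated.
    return "unused"


def diverge():
    while True:   # a non-terminating computation
        pass


counter = {"evaluations": 0}


def expensive():
    counter["evaluations"] += 1
    return 21


print(call_by_name_twice(lambda: expensive()))  # 42
print(counter["evaluations"])                   # 2
print(ignore(lambda: diverge()))                # unused
```

The last call returns even though its argument would never terminate if it were evaluated.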

An early use was ALGOL 60. Today's .NET languages can simulate call by name using delegates or Expression<T> parameters. The latter results in an abstract syntax tree being given to the function. Eiffel provides agents, which represent an operation to be evaluated when needed. Seed7 provides call by name with function parameters.

Call by need

Call by need is a memoized variant of call by name where, if the function argument is evaluated, that value is stored for subsequent uses. If the argument is side-effect free, this produces the same results as call by name, saving the cost of recomputing the argument.
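
A memoized thunk can be sketched in a few lines of Python. The Lazy class here is a hypothetical helper written for illustration; it is not a standard library type.

```python
class Lazy:
    """A memoized thunk: evaluates its expression at most once."""

    def __init__(self, thunk):
        self._thunk = thunk
        self._evaluated = False
        self._value = None

    def force(self):
        if not self._evaluated:
            self._value = self._thunk()
            self._evaluated = True
            self._thunk = None   # drop the closure once the value is cached
        return self._value


def use_twice(arg):
    return arg.force() + arg.force()


evaluations = []


def expensive():
    evaluations.append("run")
    return 21


print(use_twice(Lazy(expensive)))  # 42
print(len(evaluations))            # 1
```

The argument is used twice but evaluated once; an argument that is never forced is never evaluated at all.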

Haskell is a well-known language that uses call-by-need evaluation. Because evaluation of expressions may happen arbitrarily far into a computation, Haskell only supports side-effects (such as mutation) via the use of monads or uniqueness types. This eliminates any unexpected behavior from variables whose values change prior to their delayed evaluation.

In R, all arguments are passed by call-by-need. R allows arbitrary side-effects in call-by-need arguments.

Lazy evaluation is the most commonly used implementation strategy for call-by-need semantics, but variations exist—for instance optimistic evaluation.

.NET languages implement call by need using the type Lazy<T>.

Call by macro expansion

Call by macro expansion is similar to call by name, but uses textual substitution rather than capture-avoiding substitution. With incautious use, macro substitution may result in variable capture and lead to undesired behavior. Hygienic macros avoid this problem by checking for and replacing shadowed variables that are not parameters.
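
Python has no macro system, so the following only imitates textual substitution with a string template and exec. It is a sketch under that assumption, with hypothetical names, meant to show how variable capture arises.

```python
SWAP_MACRO = "tmp = {a}; {a} = {b}; {b} = tmp"


def expand_and_run(a, b, variables):
    # Textual substitution: paste the argument names into the macro body,
    # then execute the resulting statements in a copy of the caller's names.
    namespace = dict(variables)
    exec(SWAP_MACRO.format(a=a, b=b), namespace)
    return {name: namespace[name] for name in variables}


print(expand_and_run("x", "y", {"x": 1, "y": 2}))      # {'x': 2, 'y': 1}
print(expand_and_run("tmp", "y", {"tmp": 1, "y": 2}))  # {'tmp': 2, 'y': 2}
```

In the second call the caller's variable tmp collides with the macro's own temporary, so the swap silently fails.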

Nondeterministic strategies

Full β-reduction

Under full β-reduction, any function application may be reduced (substituting the function's argument into the function using capture-avoiding substitution) at any time. This may be done even within the body of an unapplied function.

Call by future

Call by future (also referred to as parallel call by name) is a concurrent evaluation strategy where the value of a future expression is computed concurrently with the flow of the rest of the program by one or more promises. When the value of the future is needed, the main program blocks until the future has a value (the promise or one of the promises finishes computing, if it has not already completed by then).

This strategy is non-deterministic, as the evaluation can occur at any time between creation of the future (i.e., when the expression is given) and use of the future's value. It is similar to call by need in that the value is only computed once, and computation may be deferred until the value is needed, but it may be started before. Further, if the value of a future is not needed, such as if it is a local variable in a function that returns, the computation may be terminated part-way through.

If implemented with processes or threads, creating a future will spawn one or more new processes or threads (for the promises), accessing the value will synchronize these with the main thread, and terminating the computation of the future corresponds to killing the promises computing its value.

If implemented with a coroutine, as in .NET async/await, creating a future calls a coroutine (an async function), which may yield to the caller, and in turn be yielded back to when the value is used, cooperatively multitasking.
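
The thread-based variant can be approximated with Python's concurrent.futures module. This is a minimal sketch: slow_square and add_with_future are hypothetical names, and a single worker thread plays the role of the promise.

```python
from concurrent.futures import ThreadPoolExecutor


def slow_square(n):
    return n * n


def add_with_future(executor, n, m):
    # Creating the future starts evaluating the argument concurrently.
    future = executor.submit(slow_square, n)
    other = m + 1                    # the caller keeps working meanwhile
    return future.result() + other   # blocks until the value is available


with ThreadPoolExecutor(max_workers=1) as executor:
    print(add_with_future(executor, 6, 5))  # 42
```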

Optimistic evaluation

Optimistic evaluation[13] is another variant of call by need in which the function's argument is partially evaluated for some amount of time (which may be adjusted at runtime), after which evaluation is aborted and the function is applied using call by need. This approach avoids some of the runtime expense of call by need, while still retaining the desired termination characteristics.

Notes

  1. Essentials of Programming Languages by Daniel P. Friedman and Mitchell Wand, MIT Press 1989–2006
  2. Some Fortran systems use call by copy-restore.
  3. "Lambda Calculus" (PDF). Cs.uiowa.edu. Retrieved 2013-08-18.
  4. "applicative order reduction definition of applicative order reduction in the Free Online Encyclopedia". Encyclopedia2.thefreedictionary.com. Retrieved 2013-08-18.
  5. https://doc.rust-lang.org/book/second-edition/ch04-02-references-and-borrowing.html
  6. Liskov, Barbara; Atkinson, Russ; Bloom, Toby; Moss, Eliot; Schaffert, Craig; Scheifler, Craig; Snyder, Alan (October 1979). "CLU Reference Manual" (PDF). Laboratory for Computer Science. Massachusetts Institute of Technology. Retrieved 2011-05-19.
  7. Lundh, Fredrik. "Call By Object". effbot.org. Retrieved 2011-05-19.
  8. "Iota Language Definition". CS 412/413 Introduction to Compilers. Cornell University. 2001. Retrieved 2011-05-19.
  9. "Java is Pass-by-Value, Dammit!". Retrieved 2016-12-24.
  10. Note: in CLU language, "variable" corresponds to "identifier" and "pointer" in modern standard usage, not to the general/usual meaning of "variable".
  11. "CA1021: Avoid out parameters". Microsoft.
  12. Ennals, Robert; Jones, Simon Peyton (August 2003). "Optimistic Evaluation: a fast evaluation strategy for non-strict programs".

References

  • Abelson, Harold; Sussman, Gerald Jay (1996). Structure and Interpretation of Computer Programs (Second ed.). Cambridge, Massachusetts: The MIT Press. ISBN 978-0-262-01153-2.
  • Baker-Finch, Clem; King, David; Hall, Jon; Trinder, Phil (1999-03-10). "An Operational Semantics for Parallel Call-by-Need" (ps). Research report. Faculty of Mathematics & Computing, The Open University. 99 (1).
  • Ennals, Robert; Peyton Jones, Simon (2003). Optimistic Evaluation: a fast evaluation strategy for non-strict programs (PDF). International Conference on Functional Programming. ACM Press.
  • Ludäscher, Bertram (2001-01-24). "CSE 130 lecture notes". CSE 130: Programming Languages: Principles & Paradigms.
  • Pierce, Benjamin C. (2002). Types and Programming Languages. MIT Press. ISBN 0-262-16209-1.
  • Sestoft, Peter (2002). Mogensen, T; Schmidt, D; Sudborough, I. H., eds. Demonstrating Lambda Calculus Reduction (PDF). The Essence of Computation: Complexity, Analysis, Transformation. Essays Dedicated to Neil D. Jones. Lecture Notes in Computer Science. 2566. Springer-Verlag. pp. 420–435. ISBN 3-540-00326-6.
  • "Call by Value and Call by Reference in C Programming". Call by Value and Call by Reference in C Programming explained.