Deprecating &$references
PHP-style references (&$variable
) have been a source of confusion and
unsoundness in the Hack type system (read on to learn why). After a long
effort, we are now finally ready to remove them from the Hack language.
We expect to remove support for PHP-style references from HHVM in about 2–4 weeks.
Migrating your code
By far the most common use case for PHP references is passing function arguments by reference. Inout parameters are built to address this use case without relying on PHP references. See also Migrating to inout parameters.
We have updated all built-in functions with reference parameters (such as
sort()
, array_pop()
, or preg_match()
with a &$matches
argument) to use
inout parameters instead. In some cases, the function that takes an inout
argument has a new name (e.g. preg_match_with_matches()
). You can migrate your
calls automatically using HHAST v4.21.7 or newer:
1
hhast-migrate --ref-to-inout
For references other than function parameters, the
Ref
class can be used
to explicitly achieve reference semantics when necessary. A common example is
needing to share mutable state between a lambda and enclosing function:
1
2
3
4
5
6
7
8
9
10
11
function sort_and_count_comparisons(inout vec<int> $a): int {
$comparisons = Ref(0);
\usort(
inout $a,
($left, $right) ==> {
$comparisons->value++;
return $left - $right;
}
);
return $comparisons->value;
}
What are PHP references?
PHP references allow the same variable to be accessed using multiple different names. Changes done via one reference are visible through all references to the same variable. Let’s look at a simple example (all examples use PHP7 unless noted otherwise, as Hack no longer supports most of reference behaviors):
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
$var = 'abc';
$other_var = &$var; // $other_var is now an alias to $var
$one_more_var = &$other_var; // $one_more_var is also an alias to $var
$arr['key'] = &$var; // references can be stored inside arrays
$obj->prop = &$var; // and in object properties
$other_var = 'def'; // $var, $other_var, $arr['key'] and $obj->prop now contain 'def'
foreach($arr as &$x) { // iterate over array by reference
++$x; // modifying $x modifies $arr elements
}
function &returns_ref(&$arr) {
return $arr['key']; // references can be returned from functions
}
// functions can take parameters by reference
function increment(&$value) {
$value += 1;
}
References are widely used in PHP standard library—sort()
,
array_pop()
, preg_match()
to name a few most frequently used examples.
References can be confusing
Let’s look at another example:
1
2
3
4
5
6
7
$arr = array(0, 1, 2);
foreach ($arr as &$v) {
$v += 1;
}
foreach ($arr as $v) {
var_dump($v);
}
What is the produced output, and more importantly why does it work this way? (Spoilers).
This is almost never intended behavior, and the resulting issue is very easy to miss. As part of deprecating foreach-by-reference support at Facebook we have discovered multiple bugs, following this pattern.
References are unsound
PHP references are a major blocker for making the Hack type system sound (i.e. reaching the point where Hack typechecker is correct about types 100% of time). PHP references were designed for a dynamically typed language and they cannot be modeled in Hack type system without significantly restricting their power.
The unsoundness problem is best illustrated by another example (3v4l):
1
2
3
4
5
6
7
8
9
10
11
12
13
14
class Foo {
public $prop;
}
$foo = new Foo();
$str = 'abc'; // $str is a string
$foo->prop = &$str;
var_dump($foo->prop); // "abc"
$arr = array();
$foo->prop = $arr;
var_dump($foo->prop); // array()
var_dump($str); // array() - surprise! $str is now an array
HHVM wins
PHP references are a significant source of complexity and performance overhead in HHVM. Moreover, references make it possible for Hack code to observe a number of runtime implementation details (e.g. reference counting), blocking HHVM team from exploring more efficient alternatives.
Multiple efforts in HHVM have already taken advantage of references removal. All of these provided significant measurable CPU improvements when we deployed them to Facebook servers:
- Property type enforcement—binding references to typed properties is disallowed
- Hack arrays—references to Hack array elements are disallowed
- improvements around handling function calls in HHVM internals—made possible by requiring reference (and inout) parameters to be annotated at call sites
With references removal now complete, we expect more wins down the line.