Marco Pivetta (Ocramius)

Fast PHP Object to Array conversion

A couple of months ago, I found a forgotten feature of PHP itself.

Apparently, it is possible to cast objects to arrays like this:

<?php

class Foo
{
    public $bar = 'barValue';
}

$foo = new Foo();

$arrayFoo = (array) $foo;

var_dump($arrayFoo);

This will produce something like:

array(1) {
    ["bar"]=> string(8) "barValue"
}

Private and Protected properties

If we start adding private and protected properties to our Foo class, things get very interesting:

<?php

class Foo
{
    public $bar = 'barValue';
    protected $baz = 'bazValue';
    private $tab = 'tabValue';
}

$foo = new Foo();

$arrayFoo = (array) $foo;

var_dump($arrayFoo);

In this case, the output looks like this:

array(3) {
    ["bar"]=> string(8) "barValue"
    ["*baz"]=> string(8) "bazValue"
    ["Footab"]=> string(8) "tabValue"
}

Weird: $baz is copied to array key '*baz' and $tab is copied to 'Footab'...

Let's try accessing those keys:

<?php

var_dump($arrayFoo['*baz']);
var_dump($arrayFoo['Footab']);

Something even stranger happens here: we get two notices.

Notice: Undefined index: *baz in [...] on line [...]
NULL

Notice: Undefined index: Footab in [...] on line [...]
NULL

I actually spent some time trying to understand why this was happening, and even the debugger was failing me! Then I tried using var_export:

<?php

var_export($arrayFoo);

The output is quite interesting:

array (
    'bar' => 'barValue',
    '' . "\0" . '*' . "\0" . 'baz' => 'bazValue',
    '' . "\0" . 'Foo' . "\0" . 'tab' => 'tabValue',
)

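Null characters are used as delimiters between the visibility scope of a property and its name! A protected property's key is a null character, an asterisk, another null character and then the property name; a private property's key uses the declaring class name in place of the asterisk.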

Those are some really strange results, and they give us some insight into how PHP actually keeps us from accessing private and protected properties.
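
With the real keys known, the earlier lookups can be repeated successfully. The following is a minimal sketch reusing the Foo class from above; the only thing that changes is that the keys are written in double quotes, so that "\0" is interpreted as an actual null character:

```php
<?php

class Foo
{
    public $bar = 'barValue';
    protected $baz = 'bazValue';
    private $tab = 'tabValue';
}

$arrayFoo = (array) new Foo();

// Double quotes are required: in single quotes, '\0' would be a literal
// backslash followed by a zero, not a null character.
var_dump($arrayFoo["\0*\0baz"]);   // string(8) "bazValue"
var_dump($arrayFoo["\0Foo\0tab"]); // string(8) "tabValue"

// The invisible null characters make the key longer than it looks on screen:
// "*baz" appears to be 4 characters, but the key is 6 bytes long.
var_dump(strlen(array_keys($arrayFoo)[1])); // int(6)
```

This also explains the notices above: '*baz' and 'Footab' simply are not the keys that var_dump appeared to print.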

Direct property read attempt

What happens if we try to directly access the $foo properties with this new trick?

<?php

var_dump($foo->{"\0*\0baz"});
var_dump($foo->{"\0Foo\0tab"});

It looks like the engine was patched after PHP 5.1 to prevent this (an undocumented break), since we get a fatal error:

Fatal error: Cannot access property started with '\0' in [...] on line [...]

Too bad! That would have had interesting use cases. The change makes sense though, since we shouldn't touch internal state without explicitly using an API that cries out "I do things with your object's state!".
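
For comparison, reflection is one API that is explicit about what it does. This post does not name a specific API, so take the following only as a sketch of what such an explicit access could look like, assuming a Foo class with a single private property:

```php
<?php

class Foo
{
    private $tab = 'tabValue';
}

$foo = new Foo();

$property = new ReflectionProperty(Foo::class, 'tab');

// Before PHP 8.1, private properties must be made accessible explicitly.
// From 8.1 onwards the call has no effect, so it is skipped here.
if (PHP_VERSION_ID < 80100) {
    $property->setAccessible(true);
}

var_dump($property->getValue($foo)); // string(8) "tabValue"
```

Nobody reading this code can miss that it reaches into the private state of $foo, which is exactly the point.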

Some notes and suggestions

  • This way of accessing properties via array conversion is quite useful when it actually makes sense to access an object's internal state. Don't use it otherwise.
  • It is safe to use, since any future behaviour change would have to be documented. I provided a test for php-src in a pull request to protect this kind of usage.
  • You should probably not re-map the private properties to simple names such as tab, since the same private property name can be declared at several inheritance levels, which would cause collisions in key names.
  • You may have already noticed that I work a lot with internal object states: that doesn't mean that you should too.
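
The collision risk from the notes above can be sketched as follows. ParentFoo, ChildFoo and the simpleName() helper are hypothetical names made up for this illustration; they are not part of any library:

```php
<?php

class ParentFoo
{
    private $tab = 'parentTab';
}

class ChildFoo extends ParentFoo
{
    private $tab = 'childTab';
}

$data = (array) new ChildFoo();

// Both private properties survive the cast, because the declaring class
// name is part of each key ("\0ParentFoo\0tab" and "\0ChildFoo\0tab").
var_dump(count($data)); // int(2)

// Hypothetical helper: strip the "\0scope\0" prefix to get the bare name.
function simpleName($key)
{
    $pos = strrpos($key, "\0");

    return $pos === false ? $key : substr($key, $pos + 1);
}

$flattened = [];

foreach ($data as $key => $value) {
    $flattened[simpleName($key)] = $value;
}

// Only one 'tab' entry is left: one of the two values was silently lost.
var_dump(count($flattened)); // int(1)
```

Keeping the full keys, null characters included, avoids the problem entirely.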

I'm currently writing a small library called GeneratedHydrator to take advantage of this behaviour and the one that I described in my previous blog post. That should spare you from doing these kinds of dangerous things with PHP yourself :-)

Tags: php, oop