I want to look at two PHP functions: eval and exec. They’re so often thrown under the sensible-developers-never-use-these bus that I sometimes wonder how many awesome applications we miss out on.
Like every other function in the standard library, these have their uses. They can be abused. Their danger lies in the amount of flexibility and power they offer even the most novice of developers.
Let me show you some of the ways I’ve seen these used, and then we can talk about safety precautions and moderation.
Dynamic Class Creation
The first time I ever saw dynamic class creation was in the bowels of CodeIgniter. At the time, CodeIgniter was using it to create ORM classes. eval is still used to rewrite short open tags for systems that don’t have the feature enabled…
More recently, though, my friend Adam Wathan tweeted about using it to dynamically create Laravel facades. Take a look at what the classes usually look like:
namespace Illuminate\Support\Facades;

class Artisan extends Facade
{
    protected static function getFacadeAccessor()
    {
        return "Illuminate\Contracts\Console\Kernel";
    }
}
This is from github.com/laravel/framework/blob/5.3/src/Illuminate/Support/Facades/Artisan.php
These facade classes aren’t facades in the traditional sense, but they do act as static references to objects stored in Laravel’s service locator class. They provide an easy way to refer to objects defined and configured elsewhere, and have benefits over traditional static (or singleton) classes. One of these benefits is in testing:
public function testNotificationWasQueued()
{
    Artisan::shouldReceive("queue")
        ->once()
        ->with(
            "user:notify",
            Mockery::subset(["user" => 1])
        );

    $service = new App\Service\UserService();
    $service->notifyUser(1);
}
…and though these facades are simple to create, there are a lot of them. That’s not the kind of code I find interesting to write. It seems Adam felt the same way when he wrote the tweet.
So, how could we create these facade classes dynamically? I haven’t seen Adam’s implementation code, but I’m guessing it looks something like:
function facade($name, $className) {
    if (class_exists($name)) {
        return;
    }

    eval("
        class $name extends Facade
        {
            protected static function getFacadeAccessor()
            {
                return $className::class;
            }
        }
    ");
}
That’s a neat trick. Whether or not you use (or even like) Laravel facades, I’m guessing you can see the benefits of writing less code. Sure, this probably adds to the execution time of each request, but we’d have to profile performance to decide if it even matters much.
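To try the idea without a Laravel installation, here is a minimal sketch. The Facade class below is a stand-in I made up so the example runs on its own (it is not Laravel's base class), and the accessor method and the name check are my additions rather than anything from Adam's code. The check and var_export keep unexpected characters out of the evaluated source:

```php
<?php

// Stand-in for Laravel's Facade base class (an assumption, so the sketch runs alone).
abstract class Facade
{
    abstract protected static function getFacadeAccessor();

    // Hypothetical helper that exposes the accessor for demonstration.
    public static function accessor()
    {
        return static::getFacadeAccessor();
    }
}

function facade($name, $className)
{
    if (class_exists($name, false)) {
        return;
    }

    // Refuse anything that isn't a plain class name before it reaches eval.
    if (!preg_match('/^[A-Za-z_][A-Za-z0-9_]*$/', $name)) {
        throw new InvalidArgumentException("Invalid class name: {$name}");
    }

    // var_export returns a correctly quoted PHP string literal.
    $accessor = var_export($className, true);

    eval("
        class {$name} extends Facade
        {
            protected static function getFacadeAccessor()
            {
                return {$accessor};
            }
        }
    ");
}

facade("Artisan", "Illuminate\\Contracts\\Console\\Kernel");

print Artisan::accessor(); // Illuminate\Contracts\Console\Kernel
```

Calling facade a second time with the same name does nothing, because the class already exists.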
I’ve used a similar trick before, when working on a functional programming library. I wanted to create the supporting code to enable me to write applications using fewer (if any) classes. This is what I used eval to create:
functional\struct\create("person", [
    "first_name" => "string",
    "last_name" => "string",
]);

$me = person([
    "first_name" => "christopher",
    "last_name" => "pitt",
]);
I wanted these structures to be accessible from anywhere, and self-validating. I wanted the freedom to be able to pass in an array of properties, and the control to be able to reject invalid types.
The following code uses many instances of ƒ. It’s a Unicode character I’ve used as a sort of pseudo-namespace for properties and methods I can’t make private, yet don’t want others to access directly. It’s interesting to know that most Unicode characters are valid characters for method and property names.
To allow this, I created the following code:
abstract class ƒstruct
{
    /**
     * @var array
     */
    protected $ƒdef = [];

    /**
     * @var array
     */
    protected $ƒdata = [];

    /**
     * @var string
     */
    protected $ƒname = "structure";

    public function __construct(array $data)
    {
        foreach ($data as $prop => $val) {
            $this->$prop = $val;
        }

        assert($this->ƒthrow_not_all_set());
    }

    private function ƒthrow_not_all_set()
    {
        foreach ($this->ƒdef as $prop => $type) {
            $typeIsNotMixed = $type !== "mixed";
            $propIsNotSet = !isset($this->ƒdata[$prop]);

            if ($typeIsNotMixed and $propIsNotSet) {
                // throw exception
            }
        }

        return true;
    }

    public function __set($prop, $value)
    {
        assert($this->ƒthrow_not_defined($prop));
        assert($this->ƒthrow_wrong_type($prop, $value));

        $this->ƒdata[$prop] = $value;
    }

    private function ƒthrow_not_defined(string $prop)
    {
        if (!isset($this->ƒdef[$prop])) {
            // throw exception
        }

        return true;
    }

    private function ƒthrow_wrong_type(string $prop, $val)
    {
        $type = $this->ƒdef[$prop];

        $typeIsNotMixed = $type !== "mixed";
        $typeIsNotSame = $type !== type($val);

        if ($typeIsNotMixed and $typeIsNotSame) {
            // throw exception
        }

        return true;
    }

    public function __get($prop)
    {
        // "class" is a virtual property that exposes the struct's name
        if ($prop === "class") {
            return $this->ƒname;
        }

        assert($this->ƒthrow_not_defined($prop));

        if (isset($this->ƒdata[$prop])) {
            return $this->ƒdata[$prop];
        }

        return null;
    }
}

function type($var) {
    $checks = [
        "is_callable" => "callable",
        "is_string" => "string",
        "is_integer" => "int",
        "is_float" => "float",
        "is_null" => "null",
        "is_bool" => "bool",
        "is_array" => "array",
    ];

    foreach ($checks as $func => $val) {
        if ($func($var)) {
            return $val;
        }
    }

    if ($var instanceof ƒstruct) {
        return $var->class;
    }

    return "unknown";
}

function create(string $name, array $definition) {
    if (class_exists("\\ƒ" . $name)) {
        // throw exception
    }

    $def = var_export($definition, true);

    // \$ keeps the property names literal; $def and $name are interpolated
    $code = "
        final class ƒ$name extends ƒstruct {
            protected \$ƒdef = $def;
            protected \$ƒname = '$name';
        }

        function $name(array \$data = []) {
            return new ƒ$name(\$data);
        }
    ";

    eval($code);
}
This is similar to code found at github.com/assertchris/functional-core
There’s a lot going on here, so let’s break it down:
- The ƒstruct class is the abstract basis for these self-validating structures. It defines __get and __set behavior that includes checks for presence and validity of the data used to initialize each struct.
- When a struct is created, ƒstruct checks if all required properties have been provided. That is, every property must be set unless its type is mixed.
- As each property is set, the value provided is checked against the expected type for that property.
- All of these checks are designed to work with (and wrapped in) calls to assert. This means the checks are only performed in development environments.
- The type function is used to return predictable type strings for the most common types of variables. In addition, if the variable is an instance of a ƒstruct subclass, the ƒname property value is returned as the type string. This means we can define nested structures as easily as: create("account", ["holder" => "person"]). A caveat is that the pre-defined types (like "int" and "string") will always be resolved before structures of the same name.
- The create function uses eval to create new subclasses of ƒstruct, containing the appropriate class name, ƒname, and ƒdef. var_export takes the value of a variable and returns the syntax string form of it.
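To make the last point concrete, here is a small sketch of what var_export does with the definition array from the person example. The variable names are only for illustration:

```php
<?php

$definition = [
    "first_name" => "string",
    "last_name" => "string",
];

// Passing true as the second argument returns the source instead of printing it.
$source = var_export($definition, true);

print $source . "\n";

// Evaluating that source gives back an identical array.
$copy = eval("return {$source};");

var_dump($copy === $definition); // bool(true)
```

This round trip is what lets create embed the definition in the generated class without building the array syntax by hand.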
The assert function is usually disabled in production environments by setting zend.assertions to 0 (or -1) in php.ini. If you’re not seeing assertion errors where you expect them, check the value of this setting.
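If you'd rather check from code than go digging through php.ini, something like this sketch would do. The assertions_enabled function is a hypothetical helper, not part of the library above:

```php
<?php

// Hypothetical helper. zend.assertions can be:
//   1  assertions are compiled and executed
//   0  assertions are compiled but skipped at runtime
//  -1  assertions are not compiled at all (the usual production value)
function assertions_enabled()
{
    return (int) ini_get("zend.assertions") === 1;
}

if (assertions_enabled()) {
    print "assertions are enabled\n";
} else {
    print "assertions are disabled\n";
}
```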
Domain Specific Languages
Domain Specific Languages (or DSLs as they’re usually referred to) are alternative programming languages that express an idea or problem domain well. Markdown is an excellent example of this.
I’m writing this post in Markdown, because it allows me to define the meaning and importance of each bit of text, without getting bogged down in the visual appearance of the post.
CSS is another excellent DSL. It provides many and varied means of addressing one or more HTML elements (by a selector), so that visual styles can be applied to them.
DSLs can be internal or external. Internal DSLs use an existing programming language as their syntax, but they are uniquely structured within that syntax. Fluent interfaces are a good example of this:
Post::where("is_published", true)
    ->orderBy("published_at", "desc")
    ->take(6)
    ->skip(12)
    ->get();
This is an example of some code you might see in a Laravel application. It’s using an ORM called Eloquent, to build a query for a SQL database.
External DSLs use their own syntax, and need some kind of parser or compiler to transform this syntax into machine code. SQL syntax is a good example of this:
SELECT * FROM posts WHERE is_published = 1
ORDER BY published_at DESC
LIMIT 12, 6;
The above PHP code should approximately render to this SQL code. It’s sent over the wire to a MySQL server, which transforms it into code servers can understand.
If we wanted to make our own external DSL, we would need to transform custom syntax into code a machine can understand. Short of learning how assembler works, we could translate custom syntax into a lower-level language. Like PHP.
Imagine we wanted to make a language that was a super-set language. That means the language would support everything PHP does, but also a few extra bits of syntax. A small example could be:
$numbers = [1, 2, 3, 4, 5];

print_r($numbers[2..4]);
How could we convert this into valid PHP code? I answered this exact question in a previous post, but the gist of it is by using code similar to:
function replace($matches) {
    return '
        call_user_func(function($list) {
            $lower = '.explode('..', $matches[2])[0].';
            $upper = '.explode('..', $matches[2])[1].';

            return array_slice(
                $list, $lower, $upper - $lower
            );
        }, $'.$matches[1].')
    ';
}

function parse($code) {
    // \$ matches a literal dollar sign; a bare $ would anchor to the end of the line
    $replaced = preg_replace_callback(
        '/\$(\S+)\[(\S+)\]/', 'replace', $code
    );

    eval($replaced);
}

parse('
    $numbers = [1, 2, 3, 4, 5];
    print_r($numbers[2..4]);
');
This code takes a string of PHP-like syntax and parses it by replacing new syntax with standard PHP syntax. Once the syntax is standard PHP, the code can be evaluated. It essentially does an inline code replacement, which is only possible when code can be executed dynamically.
To do this without the eval function, we’d need to build a compiler. Something that takes high-level code and gives back low-level code. In this case, it would need to take our PHP super-set language code, and give back valid PHP code.
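As a rough sketch of that compile step, the function below returns the translated source instead of running it. The compile name and the stricter pattern (numeric bounds only) are my own choices for illustration, not taken from the earlier post:

```php
<?php

// Hypothetical helper: rewrites $list[a..b] into an array_slice call
// and returns the resulting PHP source instead of evaluating it.
function compile($code)
{
    return preg_replace_callback(
        '/\$(\w+)\[(\d+)\.\.(\d+)\]/',
        function ($matches) {
            $length = $matches[3] - $matches[2];

            return sprintf(
                'array_slice($%s, %d, %d)',
                $matches[1], $matches[2], $length
            );
        },
        $code
    );
}

$source = compile('$numbers = [1, 2, 3, 4, 5]; return $numbers[2..4];');

print $source . "\n";
// $numbers = [1, 2, 3, 4, 5]; return array_slice($numbers, 2, 2);

// The compiled source is ordinary PHP, so it could be written to a file.
// Here it is evaluated only to show the result, which holds 3 and 4.
print_r(eval($source));
```

Because the output is plain PHP source, it can be saved with file_put_contents and loaded later, so eval is only needed when you want to run the result immediately.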
Parallelism
Let’s take a look at another maligned core function: exec. Perhaps more decried than even eval, exec is denounced by all but the more adventurous developers. And I have to wonder why.
In case you’re unfamiliar, exec works like this:
exec("ls -la | wc -l", $output);

print $output[0]; // number of lines ls printed (roughly one per entry in the current dir)
exec is a way for PHP developers to run an operating system command, in a new sub-process of the current script. With a little bit of prodding, we can actually make this sub-process run completely in the background:
exec("sleep 30 > /dev/null 2> /dev/null &");
To do this, we redirect stdout and stderr to /dev/null and add an & to the end of the command we want to run in the background. There are many reasons you’d want to do something like this, but my favorite is to be able to perform slow and/or blocking tasks away from the main PHP process.
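A quick way to convince yourself that the call really doesn't block is to time it. This sketch assumes a Unix-like system where sleep and /dev/null exist, and background_command is a hypothetical helper name:

```php
<?php

// Hypothetical helper: appends the redirects and the trailing & so
// that exec does not wait for the command to finish.
function background_command($command)
{
    return $command . " > /dev/null 2> /dev/null &";
}

$started = microtime(true);

exec(background_command("sleep 2"));

$elapsed = microtime(true) - $started;

// exec returns straight away instead of waiting two seconds.
printf("returned after %.2f seconds\n", $elapsed);
```

Leave out the redirects and exec waits for the command's output, so the same call takes the full two seconds.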
Imagine you had a script like this:
foreach ($images as $image) {
    $source = imagecreatefromjpeg($image["src_path"]);
    $icon = imagecreatetruecolor(64, 64);

    // the destination image comes first, then the source
    imagecopyresampled(
        $icon, $source, 0, 0, 0, 0,
        64, 64, $image["width"], $image["height"]
    );

    imagejpeg($icon, $image["ico_path"]);

    imagedestroy($icon);
    imagedestroy($source);
}
This is fine, for a few images. But imagine hundreds of images, or dozens of requests per second. Traffic like that could easily affect server performance. In cases like these, we can isolate slow code and run it in parallel (or even remotely) to user-facing code.
Here’s how we could run the slow code:
exec("php slow.php > /dev/null 2> /dev/null &");
We could even take it a step further by generating a dynamic script for the PHP command-line interface to run. To begin with, we can install SuperClosure:
require __DIR__ . '/vendor/autoload.php';

use SuperClosure\Serializer;

function defer(Closure $closure) {
    $serializer = new Serializer();
    $serialized = $serializer->serialize($closure);

    $autoload = __DIR__ . '/vendor/autoload.php';

    // var_export quotes and escapes both values as PHP string literals
    $raw = '
        require ' . var_export($autoload, true) . ';

        use SuperClosure\Serializer;

        $serializer = new Serializer();
        $serialized = ' . var_export($serialized, true) . ';

        call_user_func(
            $serializer->unserialize($serialized)
        );
    ';

    $encoded = base64_encode($raw);

    $script = 'eval(base64_decode(\'' . $encoded . '\'));';

    exec('php -r "' . $script . '"', $output);

    return $output;
}

$output = defer(function() {
    print "hi";
});
Why do we need to hard-code a script (to run in parallel) when we could just dynamically generate the code we want to run, and pipe it directly into the PHP binary?
We can even combine this exec trick with eval, by encoding the source code we want to run, and decoding it upon execution. This makes the command to start the sub-process much neater overall.
We can even add a unique identifier, so that the sub-process is easier to track and kill:
function defer(Closure $closure, $id = null) {
    // create $script

    if (is_string($id)) {
        $script = '/* id:' . $id . ' */' . $script;
    }

    $shh = '> /dev/null 2> /dev/null &';

    exec(
        'php -r "' . $script . '" ' . $shh,
        $output
    );

    return $output;
}
Staying Safe
The main reason so many developers dislike and/or advise against eval and exec is that their misuse leads to far more disastrous outcomes than, say, misusing count.
I’d suggest, instead of listening to these folks and immediately dismissing eval and exec, you learn how to use them securely. The main thing you want to avoid is using them with unfiltered user-supplied input.
Avoid at all costs:
exec($_GET["op"] . " " . $_GET["path"]);
Try instead:
$op = $_GET["op"];
$path = $_GET["path"];

if (allowed_op($op) and allowed_path($path)) {
    $clean = escapeshellarg($path);

    if ($op === "touch") {
        exec("touch {$clean}");
    }

    if ($op === "remove") {
        exec("rm {$clean}");
    }
}
…or better yet: avoid putting any user-supplied data directly into an exec command! You can also try other escaping functions, like escapeshellcmd. Remember that this is a gateway into your system. Anything the user running the PHP process is allowed to do, exec is allowed to do. That’s why it’s often intentionally disabled on shared hosting.
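The allowed_op and allowed_path functions in the snippet above are left undefined, so here is one way they might look. Treat this as a sketch under my own assumptions: the BASE_DIR value is made up, and the optional $base parameter exists only to make the check easy to try against another directory.

```php
<?php

// Assumed location of the files we are willing to touch or remove.
const BASE_DIR = "/var/www/uploads";

function allowed_op($op)
{
    // Compare against a fixed list, with strict type checking.
    return in_array($op, ["touch", "remove"], true);
}

function allowed_path($path, $base = BASE_DIR)
{
    if (!is_string($path) || $path === "") {
        return false;
    }

    $name = basename($path);

    // Reject names that point at a directory link, or that a
    // command could mistake for an option (like "-rf").
    if ($name === "." || $name === ".." || $name[0] === "-") {
        return false;
    }

    // Resolve symlinks and ".." segments in the directory part. The
    // file itself may not exist yet, which is fine for "touch".
    $directory = realpath(dirname($path));
    $root = realpath($base);

    if ($directory === false || $root === false) {
        return false;
    }

    return $directory === $root
        || strpos($directory, $root . DIRECTORY_SEPARATOR) === 0;
}
```

escapeshellarg stops the shell from interpreting the path, but it does nothing about where the path points or whether it looks like an option, which is why the checks above happen first.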
As with all PHP core functions, use these in moderation. Pay special attention to the data you allow in, and avoid unfiltered user-supplied data in exec. But, don’t avoid the functions without understanding why you’re avoiding them. That’s not helpful for you or the people who will learn from you.