Preprocessing PHP
This is a cool trick for implementing self-preprocessing in PHP using the __halt_compiler() function and the GNU CPP utility. It originally appeared as a placeholder on another one of my domains but was moved here.
    01  <?php
    02
    03  #define __CPP__ true
    04  #ifndef __CPP__
    05  $tmp_file = __FILE__ . "c";
    06
    07  if (filemtime(__FILE__) > filemtime($tmp_file))
    08  {
    09      exec("cpp -E -Wp,-w,-P " . __FILE__ . " -o " . $tmp_file);
    10  }
    11
    12  include($tmp_file);
    13  __halt_compiler();
    14  #endif
    15
    16  #define make_ip4(...) implode('.', array(__VA_ARGS__))
    17
    18  echo "Making Address: one.two.three.four = " . make_ip4('one','two','three','four') . "\n";
    19  echo "Making Address: one.two.three = " . make_ip4('one','two','three') . "\n";
    20  echo "Making Address: one.two = " . make_ip4('one','two') . "\n";
    21  echo "Making Address: one = " . make_ip4('one') . "\n\n";
    22
    23  ?>
Output on a unix-compatible system:
    zab@wickedphp [~] > dir test.*
    -rw-r----- 1 zab developers 818 Nov 10 17:43 test.php
    -rw-r----- 1 zab developers 434 Nov 10 22:14 test.phpc
    zab@wickedphp [~] > php test.php
    Making Address: one.two.three.four = one.two.three.four
    Making Address: one.two.three = one.two.three
    Making Address: one.two = one.two
    Making Address: one = one
Discussion
Mr. Zablocky, did you just preprocess a preprocessor language?
The above code example checks for a preprocessed PHP file and compares its timestamp with that of the current script. If need be, it invokes the C preprocessor using itself as the input. The output is saved under the same filename with the letter "c" added to the extension.
Obviously, lines 3, 4, and 14 are interpreted as comments by PHP, since a leading # starts a single-line comment, even though they contain preprocessor directives. These three lines do not mean anything until line 9, when the cpp utility is called into action.
Observe also that line 9 is not even reached if it does not need to be, that is, when the filemtime() check on line 7 fails. The check fails if the temporary file named on line 5 is at least as new as the original source file. Because of this check, the script is only preprocessed the first time it is called, and again whenever the source file has been modified since.
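The timestamp check can be pulled out into a small helper. The following is only a sketch under my own assumptions: the function names needs_preprocessing() and build_cpp_command() are hypothetical and do not appear in the original script. It treats a missing temp file as "stale" (which avoids the warning filemtime() raises for a nonexistent file) and shell-escapes both paths before they reach exec().

```php
<?php
/**
 * Decide whether the preprocessed copy must be regenerated.
 * Hypothetical helper, not part of the original script.
 */
function needs_preprocessing(string $source, string $target): bool
{
    clearstatcache(); // stat results are cached within a request
    // A missing target always needs a build.
    if (!is_file($target)) {
        return true;
    }
    // Same comparison as line 7 of the script.
    return filemtime($source) > filemtime($target);
}

/**
 * Build the cpp command line with shell-escaped paths.
 * The flags are copied from line 9 of the script.
 */
function build_cpp_command(string $source, string $target): string
{
    return 'cpp -E -Wp,-w,-P ' . escapeshellarg($source)
         . ' -o ' . escapeshellarg($target);
}

// Usage sketch with throwaway files.
$source = tempnam(sys_get_temp_dir(), 'src');
$target = $source . 'c';

var_dump(needs_preprocessing($source, $target)); // bool(true): target missing
touch($target, time() + 10);                     // make the target newer
var_dump(needs_preprocessing($source, $target)); // bool(false): target is fresh
echo build_cpp_command($source, $target), "\n";

unlink($source);
unlink($target);
```

In the script itself, the helper would be called as needs_preprocessing(__FILE__, $tmp_file) in place of the bare comparison on line 7.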
No matter what, when the original script is called, nothing past line 13 is even looked at by PHP. This is due to the __halt_compiler() call, a language construct which by definition stops PHP from parsing the file any further. The commonly cited usage for this construct is to embed installation media (such as zip or tar data) inside the script file, theoretically making an install set cleaner.
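To see the embedding use case in isolation, here is a minimal sketch. It assumes a throwaway demo file generated at runtime (the file contents and variable names are mine, purely for illustration). The demo file reads everything after its own __halt_compiler() call using the __COMPILER_HALT_OFFSET__ constant, which PHP defines for files that contain the construct.

```php
<?php
// Hypothetical demo file: a one-statement script followed by raw data.
// Everything after __halt_compiler(); is never parsed as PHP.
$demoSource = "<?php\n"
    . "return trim(file_get_contents(__FILE__, false, null, __COMPILER_HALT_OFFSET__));\n"
    . "__halt_compiler();\n"
    . "raw payload, not PHP: <?php this would be a syntax error";

// Write the demo file, run it, and collect the value it returns.
$demoFile = tempnam(sys_get_temp_dir(), 'halt');
file_put_contents($demoFile, $demoSource);
$payload = include $demoFile;
unlink($demoFile);

echo $payload, "\n";
```

The preprocessing trick uses the same property the other way around: what follows __halt_compiler() is hidden from PHP, and cpp is the only program that ever reads it.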
According to PHP, this is the script without transclusion:
    01  <?php
    02
    03  #define __CPP__ true
    04  #ifndef __CPP__
    05  $tmp_file = __FILE__ . "c";
    06
    07  if (filemtime(__FILE__) > filemtime($tmp_file))
    08  {
    09      exec("cpp -E -Wp,-w,-P " . __FILE__ . " -o " . $tmp_file);
    10  }
    11
    12  include($tmp_file);
    13  // __halt_compiler(); would be here
Where PHP takes over again is the include() statement on line 12. This statement loads the preprocessed PHP file (named on line 5) into the current scope (just before __halt_compiler()).
During Preprocessing
When cpp is called on line 9, the current script is preprocessed and saved as a new script file. During this process, everything from line 4 through 14 is blanked out due to the definition on line 3. The echo statements that begin on line 18 are rewritten according to the macro defined on line 16. The entire file is stripped of comments, but whitespace is preserved.
Take a closer look at line 16. This definition (useless to PHP) states that all instances of make_ip4(...) are to be changed to implode('.', array(__VA_ARGS__)). Here we are defining a function (officially, a macro) using the C preprocessor that can be used anywhere after it is defined (but not across included files). Once cpp is done with the script file, there will be no instances of "make_ip4()" left, all of them having been converted to the more complex implode() call.
The expansion of "make_ip4()" changes a statement such as make_ip4(3, 5); to something like implode('.', array(3, 5)); when preprocessed. The possibilities are not astronomical here, but interesting.
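For comparison, the same result can be had without cpp by using a native variadic function. This is only an illustrative sketch: join_ip4() is a hypothetical name of my own, chosen so that it does not collide with the make_ip4 macro. The second echo spells out by hand what the macro expansion produces.

```php
<?php
// Hypothetical native counterpart to the make_ip4(...) macro:
// collects any number of arguments and joins them with dots.
function join_ip4(string ...$parts): string
{
    return implode('.', $parts);
}

// Native variadic call.
echo join_ip4('one', 'two', 'three', 'four'), "\n"; // prints one.two.three.four

// What cpp turns make_ip4('one','two') into, written out by hand.
echo implode('.', array('one', 'two')), "\n";       // prints one.two
```

The difference is when the work happens: the macro is expanded textually before PHP ever sees the file, while the function is called at run time.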
Here is what the code looks like now:
    01  <?php
    02
    03  #define __CPP__ true
    04  #ifndef __CPP__
    05  $tmp_file = __FILE__ . "c";
    06
    07  if (filemtime(__FILE__) > filemtime($tmp_file))
    08  {
    09      exec("cpp -E -Wp,-w,-P " . __FILE__ . " -o " . $tmp_file);
    10  }
    11
    12  // include($tmp_file); would be here
    13  echo "Making Address: one.two.three.four = " . implode('.', array('one','two','three','four')) . "\n";
    14  echo "Making Address: one.two.three = " . implode('.', array('one','two','three')) . "\n";
    15  echo "Making Address: one.two = " . implode('.', array('one','two')) . "\n";
    16  echo "Making Address: one = " . implode('.', array('one')) . "\n\n";
    17  // __halt_compiler(); would be here
If you were to dump the contents of the test.phpc file (the temp file named on line 5), you would see something similar to the four expanded echo statements in the last example. Altogether, the last example comprises the final code as executed by PHP. Of course, PHP does some further processing before execution begins, but that is beyond the scope of this article.
Summary
PHP does not currently allow preprocessor directives like this. It is arguable that such things make any language more complex and harder to read. However, I feel that the same can be argued about PHP, and then again about HTML. All of these are techniques for filtering some type of symbolic data and producing something different from it.
As of right now, I cannot come up with any practical use for preprocessing PHP, but I thought the trick was wholly shareable.
Good luck, and happy coding!
