I was reminded by a recent article of an obscure and dangerous property of PHP’s preg_replace() function which can lead to code execution in some not-all-that-uncommon circumstances. I recently found some code vulnerable to this attack in the wild, so I thought I’d put together a quick writeup for pentesters and PHP coders who may not be familiar with the danger.
Let’s start with a code example:
<code><?php $in = 'Somewhere, something incredible is waiting to be known'; echo preg_replace($_GET['replace'], $_GET['with'], $in); ?></code>
The code will take a user-supplied regular expression and replace whatever it matches with a user-supplied string. So if we were to call preg_replace.php?replace=/Known/i&with=eaten, the script would perform a case-insensitive regex search (the i modifier) and echo Somewhere, something incredible is waiting to be eaten. Seems safe enough, right?
You’ve just been owned
The above code is vulnerable to code injection because it fails to account for dangerous PCRE modifiers in the input string. Most modifiers are quite harmless and let you do things like case-insensitive and multi-line searches. One modifier, “e”, is different: it makes preg_replace() substitute any backreferences into the replacement string, evaluate the result as PHP code, and use the return value as the replacement. Let me just restate:
Setting the e regex modifier will cause PHP to execute the replacement value as code.
Why does PHP have this option? I have no idea, and it has been deprecated since PHP 5.5.0 (and removed entirely in PHP 7.0.0) because of its recklessly insecure nature. Many people are still using PHP 5.2 or moving on to PHP 5.3, and even where the option is deprecated it still works (it only raises a deprecation message, at a log level turned off by default), so the issue will be around for a while yet.
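For code that used the e modifier for legitimate reasons, the PHP manual points to preg_replace_callback() as the replacement. The snippet below is a minimal sketch of that migration; the pattern and the sample string are made up for illustration, and the legacy call appears only in a comment for comparison.

```php
<?php
// Legacy form, for comparison only (relies on the e modifier):
//   preg_replace('/\b(\w)(\w*)/e', "strtoupper('\\1') . '\\2'", $in);

$in = 'waiting to be known';

// The callback receives the match array, so no string is ever evaluated as code.
$out = preg_replace_callback(
    '/\b(\w)(\w*)/',
    function (array $m) {
        return strtoupper($m[1]) . $m[2];
    },
    $in
);

echo $out, "\n"; // prints "Waiting To Be Known"
```

Because the replacement logic lives in a real function rather than in a string, user input that ends up in the subject can only ever be data.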
Exploiting the code
To exploit the code, all the attacker has to do is provide some PHP code to execute, craft a regular expression which replaces some or all of the string with that code, and set the e modifier on the regular expression.
That’s all there is to it. If the injected code is a system() call, the complete command output is displayed when the call executes, followed by the string with the match replaced with the last line of output (which is what system() returns).
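When reviewing or testing code like this, it can help to check whether a user-supplied pattern carries the e modifier at all. The following is a rough sketch that executes nothing. The helper names extract_modifiers() and uses_eval_modifier() are hypothetical, the sample values stand in for a request such as replace=/known/e, and the parser assumes a non-bracket delimiter like / or #.

```php
<?php
/**
 * Return the modifier string of a delimited pattern, or null when no
 * closing delimiter is found. Hypothetical helper, not a PHP built-in;
 * assumes the same character opens and closes the pattern.
 */
function extract_modifiers(string $pattern): ?string
{
    $pattern = ltrim($pattern); // leading whitespace is ignored before the delimiter
    if ($pattern === '') {
        return null;
    }

    $delimiter = $pattern[0];
    $len = strlen($pattern);
    for ($i = 1; $i < $len; $i++) {
        if ($pattern[$i] === '\\') {
            $i++;               // skip the escaped character
            continue;
        }
        if ($pattern[$i] === $delimiter) {
            return (string) substr($pattern, $i + 1);
        }
    }
    return null;
}

/** True when the pattern asks for the replacement to be evaluated. */
function uses_eval_modifier(string $pattern): bool
{
    $modifiers = extract_modifiers($pattern);
    return $modifiers !== null && strpos($modifiers, 'e') !== false;
}

var_dump(uses_eval_modifier('/Known/i')); // bool(false)
var_dump(uses_eval_modifier('/known/e')); // bool(true)
```

A check like this is a review aid rather than a fix: the safer design is not to let users supply delimiters and modifiers in the first place.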
The null byte problem
It’s common to find code in which the coder manually sets the regex delimiters and any modifiers they want to use:
<code><?php $in = 'Somewhere, something incredible is waiting to be known'; echo preg_replace('/' . $_GET['replace'] . '/i', $_GET['with'], $in); ?></code>
This code seems safe because the attacker can no longer end the regular expression with their own modifier. Safety is an illusion, however, because of the way preg_replace() handles null bytes. By passing in a “spoofed” end delimiter and e modifier followed by a null byte chaser, the end delimiter and modifiers in the code never get processed.
Instead, the attacker’s string – with the execution modifier – terminates the expression and leads to code execution.
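The way the pattern is put together can be sketched as follows. The request value is hypothetical (what replace=known/e%00 would decode to), and the truncation step is only a simplified model of the behaviour described above, not a reproduction of PHP’s internals. Nothing here calls preg_replace().

```php
<?php
// Hypothetical attacker-controlled value: spoofed delimiter, e modifier, null byte.
$replace = "known/e\0";

// The vulnerable script wraps the value in its own delimiters and modifier.
$pattern = '/' . $replace . '/i';

// Make the null byte visible when printing.
echo str_replace("\0", '\0', $pattern), "\n"; // prints /known/e\0/i

// Simplified model: everything from the first null byte onwards is ignored,
// so the coder's own closing delimiter and i modifier are never seen.
$seen = explode("\0", $pattern, 2)[0];
echo $seen, "\n";                             // prints /known/e

// A cheap guard: refuse any input that contains a null byte.
function contains_null_byte(string $value): bool
{
    return strpos($value, "\0") !== false;
}

var_dump(contains_null_byte($replace));       // bool(true)
```

Rejecting null bytes closes this particular trick, but it does not stop an attacker who can still inject an unescaped delimiter.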
Securing the code
PHP provides a handy function named preg_quote() which will quote any nasty characters in the input string and prevent code injection. Even when using this function, be aware that your custom regex delimiter may fall outside the scope of the function. Use the second parameter to specify the delimiter which you’ve chosen:
<code><?php $in = 'Somewhere, something incredible is waiting to be known'; echo preg_replace('#' . preg_quote($_GET['replace'], '#') . '#', $_GET['with'], $in); ?></code>
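To see what the quoting does to a hostile value, here is a small sketch. The input string is a made-up example of an attacker trying to close the pattern early and append the e modifier.

```php
<?php
$in = 'Somewhere, something incredible is waiting to be known';

// Hypothetical hostile input: spoofed end delimiter followed by the e modifier.
$replace = 'known#e';

$quoted = preg_quote($replace, '#');
echo $quoted, "\n"; // prints known\#e

// The input is now matched literally, so nothing in $in matches it.
$pattern = '#' . $quoted . '#i';
var_dump(preg_replace($pattern, 'eaten', $in) === $in); // bool(true)

// Regex syntax is neutralised as well: the dot and star become literals.
echo preg_quote('kno.*', '#'), "\n"; // prints kno\.\*
```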
Using preg_quote() renders all regex characters inert, so if you need to let users supply some regular expression syntax, you’ll need to escape your delimiter by hand. Be very careful though, as this approach is error prone: you’ll need to account for the escape character as well, otherwise the attacker can just escape your escaping with their own escape character.
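One way to do the hand-escaping is to walk the input and track escape sequences, rather than blindly prefixing every delimiter. The sketch below takes that approach. escape_delimiter() is a hypothetical helper, not a PHP built-in, and it assumes a single-character, non-bracket delimiter. It is meant to illustrate the idea, not to be a vetted sanitiser.

```php
<?php
/**
 * Escape unescaped delimiters in a user-supplied regex fragment while
 * leaving the rest of the regex syntax usable. Hypothetical helper.
 */
function escape_delimiter(string $fragment, string $delimiter = '#'): string
{
    if (strpos($fragment, "\0") !== false) {
        throw new InvalidArgumentException('Null byte in pattern');
    }

    $out = '';
    $len = strlen($fragment);
    for ($i = 0; $i < $len; $i++) {
        $ch = $fragment[$i];
        if ($ch === '\\') {
            if ($i + 1 === $len) {
                // A lone trailing backslash would escape our closing delimiter.
                throw new InvalidArgumentException('Trailing backslash in pattern');
            }
            // Copy the whole escape sequence untouched and skip past it.
            $out .= $ch . $fragment[++$i];
        } elseif ($ch === $delimiter) {
            $out .= '\\' . $ch; // neutralise a bare delimiter
        } else {
            $out .= $ch;
        }
    }
    return $out;
}

$in = 'Somewhere, something incredible is waiting to be known';

// Regex syntax still works for the user...
echo preg_replace('#' . escape_delimiter('k\w+n') . '#i', 'eaten', $in), "\n";
// prints "Somewhere, something incredible is waiting to be eaten"

// ...but a spoofed delimiter is escaped, even behind an escaped backslash.
echo escape_delimiter('\\\\#e'), "\n"; // prints \\\#e
```

Consuming each backslash together with the character after it is what defeats the “escape your escaping” trick: a delimiter is only treated as already escaped when an odd number of backslashes precedes it.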
The implications of this issue stretch far and wide. Its subtle yet deadly nature makes it an easy vulnerability to miss when developing and reviewing code. Be careful out there, and always think about how you use your input.
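preg_quote
(PHP 4, PHP 5)
preg_quote - Quote regular expression characters
Description
string preg_quote ( string $str [, string $delimiter = NULL ] )
preg_quote() takes str and puts a backslash in front of every character that is part of the regular expression syntax. This is useful when you have a run-time string that needs to be matched as a regular expression.
The special regular expression characters are: . \ + * ? [ ^ ] $ ( ) { } = ! < > | : -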
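Parameters
str
- The input string.
delimiter
- If the optional delimiter is specified, it will also be escaped. This is useful for escaping the delimiter used by the PCRE functions. The / is the most commonly used delimiter.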
Return Values
Returns the quoted string.
Changelog
Version | Description |
---|---|
5.3.0 | The - character is now quoted. |
Examples
Example #1 preg_quote() example
<?php
$keywords = '$40 for a g3/400';
$keywords = preg_quote($keywords, '/');
echo $keywords; // returns \$40 for a g3\/400
?>
Example #2 Italicizing a word within some text
<?php
// In this example, preg_quote($word, '/') keeps the asterisks literal so
// they do not take on their special meaning in the regular expression.
$textbody = "This book is *very* difficult to find.";
$word = "*very*";
$textbody = preg_replace("/" . preg_quote($word, '/') . "/",
                         "<i>" . $word . "</i>",
                         $textbody);
?>
Notes
Note: This function is binary safe.
When reposting, please credit: jinglingshu's blog » The unexpected dangers of preg_replace()