For example, the filter regexp:
\bfoo\b
wouldn't match at all in some feeds. Now, with the preg_match Unicode modifier ("u") enabled, it works as expected.
I don't know if there are any other (bad?) consequences to enabling Unicode, but I have had the patch running for several months now, and all my filter regexps (with and without "\b") seem to work properly.
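To illustrate the kind of difference the "u" modifier makes, here is a minimal standalone sketch. It is not taken from tt-rss: the keyword, the sample title, and the variable names other than $reg_exp and $title are made up for the example, and it assumes the script is saved as UTF-8 and run in the default C locale. It also shows one side effect I am aware of: with "u", preg_match() returns false when the subject is not valid UTF-8.

```php
<?php
// Hypothetical sample data: a non-ASCII keyword and a UTF-8 title containing it.
$reg_exp = '\bété\b';
$title   = "Un été à Paris";

// Byte mode (no "u"): each "é" is two bytes that \w does not treat as
// word characters, so \b finds no boundary around "été".
$byte_match = preg_match("/$reg_exp/i", $title);

// Unicode mode ("u"): "é" is treated as a letter, so \b sits where expected.
$unicode_match = preg_match("/$reg_exp/iu", $title);

// Possible side effect: a subject that is not valid UTF-8 (here a lone
// Latin-1 "é" byte) makes preg_match() fail instead of returning 0 or 1.
$invalid       = "caf\xE9 foo";
$invalid_match = preg_match('/\bfoo\b/iu', $invalid);
$invalid_error = (preg_last_error() === PREG_BAD_UTF8_ERROR);

var_dump($byte_match, $unicode_match, $invalid_match, $invalid_error);
```

Since the filter code calls preg_match() with "@", such a failure would be silent: the rule would simply not match for a feed that delivers badly encoded text.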
Here is the patch:
diff --git a/include/rssfuncs.php b/include/rssfuncs.php
index 32bc69819b..ccc6d51545 100644
--- a/include/rssfuncs.php
+++ b/include/rssfuncs.php
@@ -1382,29 +1382,29 @@
 switch ($rule["type"]) {
     case "title":
-        $match = @preg_match("/$reg_exp/i", $title);
+        $match = @preg_match("/$reg_exp/iu", $title);
         break;
     case "content":
         // we don't need to deal with multiline regexps
         $content = preg_replace("/[\r\n\t]/", "", $content);
-        $match = @preg_match("/$reg_exp/i", $content);
+        $match = @preg_match("/$reg_exp/iu", $content);
         break;
     case "both":
         // we don't need to deal with multiline regexps
         $content = preg_replace("/[\r\n\t]/", "", $content);
-        $match = (@preg_match("/$reg_exp/i", $title) || @preg_match("/$reg_exp/i", $content));
+        $match = (@preg_match("/$reg_exp/iu", $title) || @preg_match("/$reg_exp/iu", $content));
         break;
     case "link":
-        $match = @preg_match("/$reg_exp/i", $link);
+        $match = @preg_match("/$reg_exp/iu", $link);
         break;
     case "author":
-        $match = @preg_match("/$reg_exp/i", $author);
+        $match = @preg_match("/$reg_exp/iu", $author);
         break;
     case "tag":
         foreach ($tags as $tag) {
-            if (@preg_match("/$reg_exp/i", $tag)) {
+            if (@preg_match("/$reg_exp/iu", $tag)) {
                 $match = true;
                 break;
             }