[Patch] enable unicode for filter regexps


Post by Gravemind » 01 Oct 2016, 16:34

I had issues with some of my filter regexps not matching certain feeds. I looked into it and found that at least the "\b" rule needs unicode enabled to work properly on some feeds (probably unicode-encoded feeds? I don't know).

For example, the filter regexp:
[code]\bfoo\b[/code]
would not match some feeds at all. Now, with unicode enabled in preg_match, it works as expected.
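A minimal sketch of how the "u" modifier changes what "\b" sees. It assumes a PHP build whose PCRE has Unicode property support and the default "C" locale. The sample title is made up for illustration; it is not one of the feeds that actually failed, and it shows only one direction in which the two modes can disagree:

```php
<?php
// Hypothetical feed title; "\xC3\xA9" is the UTF-8 encoding of "é".
$title = "caf\xC3\xA9foo bar";

// Byte mode: the two bytes of "é" are not word characters, so \b
// reports a boundary in the middle of "caféfoo" and the filter matches.
$byteMode = preg_match('/\bfoo\b/i', $title);

// UTF-8 mode: "é" is decoded as a letter, so "foo" is not a separate
// word here and the filter does not match.
$utf8Mode = preg_match('/\bfoo\b/iu', $title);

var_dump($byteMode);  // int(1)
var_dump($utf8Mode);  // int(0)
```

Either way, without "u" the word boundary is decided per byte rather than per character, which is why results on non-ASCII feeds can be surprising.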

I don't know if there are any other (bad?) consequences to enabling unicode, but I have had the patch running for several months now, and all my filter regexps (with and without "\b") seem to work properly.

Here is the patch:
[code]
diff --git a/include/rssfuncs.php b/include/rssfuncs.php
index 32bc69819b..ccc6d51545 100644
--- a/include/rssfuncs.php
+++ b/include/rssfuncs.php
@@ -1382,29 +1382,29 @@

 switch ($rule["type"]) {
 case "title":
-    $match = @preg_match("/$reg_exp/i", $title);
+    $match = @preg_match("/$reg_exp/iu", $title);
     break;
 case "content":
     // we don't need to deal with multiline regexps
     $content = preg_replace("/[\r\n\t]/", "", $content);

-    $match = @preg_match("/$reg_exp/i", $content);
+    $match = @preg_match("/$reg_exp/iu", $content);
     break;
 case "both":
     // we don't need to deal with multiline regexps
     $content = preg_replace("/[\r\n\t]/", "", $content);

-    $match = (@preg_match("/$reg_exp/i", $title) || @preg_match("/$reg_exp/i", $content));
+    $match = (@preg_match("/$reg_exp/iu", $title) || @preg_match("/$reg_exp/iu", $content));
     break;
 case "link":
-    $match = @preg_match("/$reg_exp/i", $link);
+    $match = @preg_match("/$reg_exp/iu", $link);
     break;
 case "author":
-    $match = @preg_match("/$reg_exp/i", $author);
+    $match = @preg_match("/$reg_exp/iu", $author);
     break;
 case "tag":
     foreach ($tags as $tag) {
-        if (@preg_match("/$reg_exp/i", $tag)) {
+        if (@preg_match("/$reg_exp/iu", $tag)) {
             $match = true;
             break;
         }
[/code]


Re: [Patch] enable unicode for filter regexps

Post by fox » 03 Oct 2016, 18:46

i don't think this should break anything (although i can be wrong), it would be cool if you refiled this as a gitlab merge request.
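One case that may be worth checking before refiling, shown as a sketch rather than a report of an observed problem: with the "u" modifier, preg_match() rejects a subject that is not valid UTF-8 and returns false instead of 0 or 1. The sample string below is hypothetical, and whether feed content ever reaches the filter code in a non-UTF-8 encoding is an assumption, not something established in this thread:

```php
<?php
// Hypothetical content: "\xE9" is "é" in ISO-8859-1, which is not valid UTF-8.
$content = "caf\xE9 foo";

// Byte mode matches the raw bytes as they are.
$byteMode = preg_match('/\bfoo\b/i', $content);

// UTF-8 mode refuses the malformed subject and returns false.
$utf8Mode = preg_match('/\bfoo\b/iu', $content);
$badUtf8 = (preg_last_error() === PREG_BAD_UTF8_ERROR);

var_dump($byteMode);  // int(1)
var_dump($utf8Mode);  // bool(false)
var_dump($badUtf8);   // bool(true)
```

Since false is falsy, a filter hitting this case would simply not match; checking preg_last_error() is one way to tell that apart from an ordinary non-match.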

