mb_convert_encoding(): Illegal character encoding specified

Striker21
Bear Rating Trainee
Posts: 42
Joined: 27 Oct 2015, 00:30

mb_convert_encoding(): Illegal character encoding specified

Postby Striker21 » 04 Sep 2016, 15:06

Hey all, I'm trying to fix some warnings in my tiny log. Any thoughts on what causes this one and whether it's fixable? It doesn't look critical, but a warning less is a warning less.

VPS, Ubuntu 14.04, PostgreSQL, TT-RSS v16.8 (main git)

http://www.xboxlife.dk/rss/reviews.rss


Code:

mb_convert_encoding(): Illegal character encoding specified
1. classes/feedparser.php(18): mb_convert_encoding(

http://www.xboxlife.dk/

Sun, 04 Sep 2016 13:00:00 +0200
http://backend.userland.com/rss/
PHP
Reviews
http://static.xboxlife.dk/i/xboxlife_88x18.gif
[email protected]
[email protected]
da-dk

fox
^ me reading your posts ^
Posts: 6318
Joined: 27 Aug 2005, 22:53
Location: Saint-Petersburg, Russia

Re: mb_convert_encoding(): Illegal character encoding specified

Postby fox » 04 Sep 2016, 15:39

i'm not sure if there's a way to check if the encoding is supported by mbstring, if not the easiest way would be to hush the warning with a @. it shouldn't break anything.
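
A minimal sketch of one way to check support up front, assuming a case-insensitive match against `mb_list_encodings()` is good enough. The helper name `mbstring_supports()` is hypothetical and not part of tt-rss. Aliases (alternate names mbstring accepts for the same encoding) are not covered by this list, so a valid alias would be reported as unsupported here.

```php
<?php
// Hypothetical helper: true if $name matches one of the canonical
// encoding names reported by mbstring, ignoring case.
function mbstring_supports(string $name): bool {
    // mb_list_encodings() returns names such as "UTF-8" and "ISO-8859-1",
    // so both sides are lowercased before comparing.
    $known = array_map('strtolower', mb_list_encodings());
    return in_array(strtolower($name), $known, true);
}
```

A check like this also avoids relying on `@`. As far as I know, PHP 8 and later raise a `ValueError` for an unrecognized encoding name instead of a warning, and `@` does not suppress exceptions.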

Striker21
Bear Rating Trainee
Posts: 42
Joined: 27 Oct 2015, 00:30

Re: mb_convert_encoding(): Illegal character encoding specified

Postby Striker21 » 04 Sep 2016, 17:56

Not sure, maybe through mb_detect_encoding and mb_list_encodings, but if nothing is broken it might not be worth it. Maybe I'll throw in a @ as suggested.

JustAMacUser
Bear Rating Overlord
Posts: 373
Joined: 20 Aug 2013, 23:13

Re: mb_convert_encoding(): Illegal character encoding specified

Postby JustAMacUser » 04 Sep 2016, 18:13

Code:

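function normalize_encoding($data) {
      if (preg_match('/^(<\?xml[\t\n\r ].*?encoding[\t\n\r ]*=[\t\n\r ]*["\'])(.+?)(["\'].*?\?>)/s', $data, $matches) === 1) {
         $enc = strtolower($matches[2]);

         // mb_list_encodings() returns mixed-case names, so compare in lowercase.
         // Fall back to mbstring's auto-detection when the declared encoding is unknown.
         if (!in_array($enc, array_map('strtolower', mb_list_encodings())))
            $enc = 'auto';

         // Convert to UTF-8, then rewrite the XML declaration to match.
         $data = mb_convert_encoding($data, 'UTF-8', $enc);

         $data = preg_replace('/^<\?xml[\t\n\r ].*?\?>/s', $matches[1] . "UTF-8" . $matches[3] , $data);
      }

      return $data;
}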

I'm not at my computer to test this or even submit a merge request, but I think this change might fix it. It makes the mb functions try to figure out the encoding on their own if the declared one is not in the accepted list.

My experience with the mb_detect... is that it sucks; if memory serves me you have to give it a list of encodings to check against, which doesn't even make sense since you don't know the encoding. Recognizing character sets is just frustrating, period.

e: I also don't know how expensive mb_list_encodings() is, if it's really slow, it makes more sense to just store the results as a class property once. I'm going to guess it's fast, but hey, you never know.
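
The "store the results once" idea could look roughly like the sketch below. The class name `EncodingCache` and its `has()` method are hypothetical, chosen only for illustration. The list is lowercased and flipped into a lookup table on first use, so later calls are a single `isset()`.

```php
<?php
// Hypothetical helper class: caches the mbstring encoding list once per process.
final class EncodingCache {
    /** @var array<string,int>|null lowercase encoding name => index, built lazily */
    private static $known = null;

    public static function has(string $name): bool {
        if (self::$known === null) {
            // Build the lookup table only on the first call.
            self::$known = array_flip(array_map('strtolower', mb_list_encodings()));
        }
        // isset() is true for every stored key, including the one mapped to 0.
        return isset(self::$known[strtolower($name)]);
    }
}
```

Whether the caching is worth it depends on how often the check runs. For a check done once per feed parse, calling `mb_list_encodings()` directly is probably fine.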

fox
^ me reading your posts ^
Posts: 6318
Joined: 27 Aug 2005, 22:53
Location: Saint-Petersburg, Russia

Re: mb_convert_encoding(): Illegal character encoding specified

Postby fox » 05 Sep 2016, 14:50

i don't think auto is a good idea because it can corrupt data if mbstring screws up detection (probably? idk), mb_list_encodings() fits however. since this is done once per feed parse i don't think it should be too slow.

https://tt-rss.org/gitlab/fox/tt-rss/co ... b0e91f9dad
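
A sketch of the "no auto" variant described above, not necessarily what the linked commit does. The function name `normalize_encoding_strict()` is hypothetical. It converts only when the declared encoding matches a name from `mb_list_encodings()`. Leaving the document completely untouched otherwise (declaration included) is an assumption made here for safety.

```php
<?php
// Hypothetical variant: convert to UTF-8 only if mbstring lists the declared encoding.
function normalize_encoding_strict(string $data): string {
    $pattern = '/^(<\?xml[\t\n\r ].*?encoding[\t\n\r ]*=[\t\n\r ]*["\'])(.+?)(["\'].*?\?>)/s';

    if (preg_match($pattern, $data, $matches) !== 1) {
        return $data; // no XML declaration with an encoding attribute
    }

    $enc = strtolower($matches[2]);
    $known = array_map('strtolower', mb_list_encodings());

    if (!in_array($enc, $known, true)) {
        return $data; // unknown to mbstring: leave the bytes alone, no guessing
    }

    $converted = mb_convert_encoding($data, 'UTF-8', $enc);

    // Rewrite the declaration so it matches the converted content.
    $out = preg_replace('/^<\?xml[\t\n\r ].*?\?>/s', $matches[1] . 'UTF-8' . $matches[3], $converted);

    return $out === null ? $data : $out; // preg_replace() returns null on error
}
```

With this shape, a feed declaring an encoding mbstring does not list passes through unchanged and no warning is raised.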

