Specifying codepage for single feed

Request new functionality here
fhoshino
Bear Rating Trainee
Posts: 13
Joined: 10 Apr 2013, 09:45

Post by fhoshino » 10 Apr 2013, 09:49

I'd like to request an option to specify the codepage for a single feed (not for writing to the database).
One of my feeds is not in UTF-8, so I sometimes get garbled strings.

phz
Bear Rating Disaster
Posts: 77
Joined: 18 Mar 2013, 18:32

Re: Specifying codepage for single feed

Post by phz » 10 Apr 2013, 21:52

The problem is most likely that the feed announces the wrong charset. That is a bug in the specific feed.

If you post a link to the feed in question, people can look at it and perhaps try to figure out what's wrong.

fhoshino
Bear Rating Trainee
Posts: 13
Joined: 10 Apr 2013, 09:45

Re: Specifying codepage for single feed

Post by fhoshino » 10 Apr 2013, 21:55

I've posted on G+, but there was no answer.
https://plus.google.com/109766076672185 ... DnEwWAxRnX

phz
Bear Rating Disaster
Posts: 77
Joined: 18 Mar 2013, 18:32

Re: Specifying codepage for single feed

Post by phz » 10 Apr 2013, 22:17

http://www.rthk.org.hk/rthk/news/rss/c_expressnews.xml — this is the feed in question for those who wonder.

It seems to be Chinese characters in big5 encoding, and it announces itself as that as well. The `file` tool identifies the file as ISO 8859-1, but that involves quite a bit of guessing on its part, and running the file through `iconv` with big5 as the source encoding seems to give perfectly consistent output.
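
For anyone who wants to repeat the `iconv` check without a shell, here is a rough Python sketch of the same idea: a strict decode either succeeds for the whole byte string or fails at the first invalid sequence. The function name `decodes_cleanly` and the sample headline are made up for illustration; they are not taken from the feed or from TT-RSS.

```python
def decodes_cleanly(raw: bytes, encoding: str = "big5") -> bool:
    """Return True if every byte sequence in raw is valid in the given encoding."""
    try:
        raw.decode(encoding)  # error handling is "strict" by default
    except UnicodeDecodeError:
        return False
    return True


# Illustrative sample only: a short Chinese string encoded as big5.
sample = "即時新聞".encode("big5")
print("valid as big5: ", decodes_cleanly(sample, "big5"))
print("valid as utf-8:", decodes_cleanly(sample, "utf-8"))
```

Python's `latin-1` codec accepts any byte string, so a successful ISO 8859-1 decode proves very little by comparison. A clean strict big5 decode is a stronger, though still not conclusive, signal.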

I actually don't know why it doesn't work as it should. Perhaps someone with more internal knowledge on how encoding is handled in TT-RSS can give more info on whether TT-RSS is to "blame" here.

One weird and technically convoluted hack could be to set up a local translation script that polls the feed and converts it to UTF-8 on the fly, but I guess that is not really an option for most people.
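
A minimal sketch of what such a translation script could do, in Python, assuming the feed really is big5 throughout. It only covers the conversion step: decode, rewrite the encoding named in the XML declaration, re-encode as UTF-8. The name `transcode_feed` is hypothetical, and fetching the feed and serving the result to TT-RSS are left out.

```python
import re

# Matches the encoding attribute of a leading XML declaration, e.g.
# <?xml version="1.0" encoding="big5"?>
_DECL_ENCODING = re.compile(r'''^(<\?xml[^>]*?encoding=)(["'])[^"']*\2''')


def transcode_feed(raw: bytes, source_encoding: str = "big5") -> bytes:
    """Convert feed bytes from source_encoding to UTF-8.

    Raises UnicodeDecodeError if raw is not valid in source_encoding,
    which is deliberate: a silent fallback would hide a broken feed.
    """
    text = raw.decode(source_encoding)
    # Keep the declaration truthful about the new encoding.
    text = _DECL_ENCODING.sub(r"\g<1>\g<2>UTF-8\g<2>", text, count=1)
    return text.encode("utf-8")
```

Run from cron, something like this could fetch the feed with `urllib.request.urlopen(url).read()`, pass the bytes through `transcode_feed`, and write the result to a file that TT-RSS subscribes to instead of the original URL.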

fox
^ me reading your posts ^
Posts: 6318
Joined: 27 Aug 2005, 22:53
Location: Saint-Petersburg, Russia

Re: Specifying codepage for single feed

Post by fox » 10 Apr 2013, 22:31

There will be no per-feed charset dropdowns. This should be properly fixed by contacting the publisher and asking them to fix their fucking feed. If that is not possible, this could be handled by a plugin (there's a hook for that).

LifeWOutMilk
Bear Rating Disaster
Posts: 52
Joined: 02 Apr 2013, 21:57

Re: Specifying codepage for single feed

Post by LifeWOutMilk » 10 Apr 2013, 23:42

Image

The feed appears to be fine. It doesn't look like the issue is with tt-rss.

phz
Bear Rating Disaster
Posts: 77
Joined: 18 Mar 2013, 18:32

Re: Specifying codepage for single feed

Post by phz » 11 Apr 2013, 09:52

LifeWOutMilk wrote:Image

The feed appears to be fine. It doesn't look like the issue is with tt-rss.

Actually testing the feed to see if you can reproduce the bug!? Oh, I never… :-D

Well, that's great then. As I said, the feed seems to represent itself as big5, as it should.

As for the original poster, here are some questions to help narrow down the error:
  • Which TT-RSS version are you using? Make sure it is reasonably recent.
  • Which browser are you using (which version, and on what OS)? If you paste the URL of the XML feed directly into the browser, can you see the correct characters? How about if you "View source"? Can you try accessing the feed via TT-RSS in another browser?

fhoshino
Bear Rating Trainee
Posts: 13
Joined: 10 Apr 2013, 09:45

Re: Specifying codepage for single feed

Post by fhoshino » 11 Apr 2013, 11:27

phz wrote:http://www.rthk.org.hk/rthk/news/rss/c_expressnews.xml — this is the feed in question for those who wonder.

It seems to be Chinese characters in big5 encoding, and it announces itself as that as well. The `file` tool identifies the file as ISO 8859-1, but that involves quite a bit of guessing on its part, and running the file through `iconv` with big5 as the source encoding seems to give perfectly consistent output.

I actually don't know why it doesn't work as it should. Perhaps someone with more internal knowledge on how encoding is handled in TT-RSS can give more info on whether TT-RSS is to "blame" here.

One weird and technically convoluted hack could be to set up a local translation script that polls the feed and converts it to UTF-8 on the fly, but I guess that is not really an option for most people.


As I mentioned, the feed is sometimes rendered correctly, but sometimes it isn't.
Yes, the feed is big5, and it declares itself as big5.
I'm on trunk code (git source). I'm using Firefox Nightly on Windows 8 x64, and it renders the XML file correctly without a problem.
My server is a WAMP stack running on a Windows 7 x86 machine.

fhoshino
Bear Rating Trainee
Posts: 13
Joined: 10 Apr 2013, 09:45

Re: Specifying codepage for single feed

Post by fhoshino » 11 Apr 2013, 11:29

fox wrote:There will be no per-feed charset dropdowns. This should be properly fixed by contacting the publisher and asking them to fix their fucking feed. If that is not possible, this could be handled by a plugin (there's a hook for that).


I guess there is no way to make them fix the feed (a government organization which is reluctant to change anything for the public).
Could you guide me on how to set up the plugin?

phz
Bear Rating Disaster
Posts: 77
Joined: 18 Mar 2013, 18:32

Re: Specifying codepage for single feed

Post by phz » 11 Apr 2013, 18:17

fhoshino wrote:As I mentioned, the feed is sometimes rendered correctly, but sometimes it isn't.

Try saving a snapshot of the feed on an occasion when it doesn't work, so that others can check it.
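
One way to take such a snapshot, sketched in Python: write the raw response bytes to a timestamped file without decoding them, so whatever encoding problem is present is preserved exactly. `save_snapshot` and the `snapshots` directory are illustrative names, and getting the bytes (for example with `urllib.request.urlopen`) is assumed to happen elsewhere.

```python
from datetime import datetime, timezone
from pathlib import Path


def save_snapshot(raw: bytes, directory: str = "snapshots") -> Path:
    """Write raw feed bytes, untouched, to a UTC-timestamped .xml file."""
    out_dir = Path(directory)
    out_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    path = out_dir / f"feed-{stamp}.xml"
    # write_bytes avoids any text-mode re-encoding or newline translation.
    path.write_bytes(raw)
    return path
```

Saving bytes rather than text matters here: opening the file in text mode would apply the platform's default encoding and could mask or alter the very problem being investigated.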

fhoshino
Bear Rating Trainee
Posts: 13
Joined: 10 Apr 2013, 09:45

Re: Specifying codepage for single feed

Post by fhoshino » 11 Apr 2013, 18:28

phz wrote:
fhoshino wrote:As I mentioned, the feed is sometimes rendered correctly, but sometimes it isn't.

Try saving a snapshot of the feed on an occasion when it doesn't work, so that others can check it.

1.png (136.71 KiB)

It works right sometimes, but sometimes it doesn't.

