I think there's something else wrong: I always get invalid JSON when pressing save, even when the feedcleaner window is blank.
I messaged espn mid last week with no reply yet; I have a feeling they ignored it. The feed works in GR, feedly, tor and inoreader; ttrss is the only reader I've tried that it doesn't work in.
Plugin ff_FeedCleaner
Re: Plugin ff_FeedCleaner
Thanks for the plugin, I just set it up on the newly updated tt-rss. Messed a bit with regex, looks like it's working.
roshambo wrote:So far I have:
Code: Select all
{
    "#^http://feeds.feedburner\\.com/1500espn/sportswire/all\\#" : {
        "type" : "regex",
        "pattern" : "/\x80\x99/",
        "replacement" : ""
    },
    ...
}
Which results in an invalid JSON. Any help would be appreciated.
I would guess you need to use double backslashes in your patterns:
Code: Select all
"pattern" : "/\\x85\\x94/"
roshambo wrote:always get invalid JSON when pressing save, even when the feedcleaner window is blank
Try entering
Code: Select all
{}
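For anyone checking this outside tt-rss: the config has to be valid JSON, and JSON only allows a handful of backslash escapes, so a lone \x is rejected while \\x decodes to a single backslash that the regex engine then sees. An empty string is not valid JSON either, whereas {} is. Here is a minimal Python sketch of those cases; it uses Python's json module as a stand-in for whatever check the plugin runs, and is_valid_json is a made-up helper name.

```python
import json

def is_valid_json(text):
    """Return True if text parses as JSON (hypothetical helper, not part of the plugin)."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

# Raw strings keep the backslashes exactly as they would be typed into the config box.
single = r'{"pattern": "/\x80\x99/"}'    # \x is not a JSON escape
double = r'{"pattern": "/\\x80\\x99/"}'  # \\ is an escaped backslash

print(is_valid_json(single))  # False
print(is_valid_json(double))  # True
print(is_valid_json(""))      # False: a blank config is not JSON
print(is_valid_json("{}"))    # True: an empty object is

# After decoding, the pattern holds single backslashes again.
print(json.loads(double)["pattern"])
```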
Re: Plugin ff_FeedCleaner
Double backslashes fixed it, thanks. I read the whole section on regex syntax on php.net and used this regex tester: http://www.solmetra.com/scripts/regex/index.php. Every delimiter and meta-character I tried tested out okay there, but still no luck.
Code: Select all
{
    "^http://feeds\\.feedburner\\.com/1500espn/sportswire/all" : {
        "type" : "regex",
        "pattern" : "[â]",
        "replacement" : ""
    }
}
Should remove just the 2 instances of â if I understand correctly but ttrss still errors at 'Entity 'acirc' not defined'
Re: Plugin ff_FeedCleaner
roshambo wrote:Should remove just the 2 instances of â if I understand correctly but ttrss still errors at 'Entity 'acirc' not defined'
Because it doesn't.
roshambo wrote:I read the whole section on regex syntax on php.net
OK. You should have noted then that the regexes need delimiters, so try it with
Code: Select all
"#^http://feeds\\.feedburner\\.com/1500espn/sportswire/all#" : {
    "type" : "regex",
    "pattern" : "[â]",
    "replacement" : ""
}
Works for me at least.
Re: Plugin ff_FeedCleaner
Thanks, that worked. I realize now that the feed has to update before I can test, which makes sense; I wasn't waiting before.
Re: Plugin ff_FeedCleaner
Version 0.8 was just released. It includes changes in the configuration format, and while old-style configurations should still work, all users are encouraged to switch to the new style. Details can be found in the README on the GitHub page.
Re: Plugin ff_FeedCleaner
I want to extract the URL embedded in a feed from Google RSS. Some of these URLs are http, others https. An example of what Google provides me with:
https://www.google.com/url?rct=j&sa=t&url=http://thecork.ie/2016/03/16/cork-airport-get-two-new-routes-to-southampton-and-leeds-bradford/&ct=ga&cd=CAIyGzI5ZDZjMWRhMzczNzBlOTU6aWU6ZW46SUU6Ug&usg=AFQjCNGaSw-_EEoppiW7fQFFjFKSbcISEQ
The part I want to extract is the embedded URL that follows "url=" and ends before "&ct". I have pretty much worked out that the code I need is:
[
    {
        "URL": "www.google.com",
        "type": "regex",
        "pattern": "/^http\S+url=|&ct\S+/",
        "replacement": ""
    }
]
The problem I'm getting now is one of an "invalid JSON". I have read the earlier posts, though I don't think there is anything I need to "escape". Any feedback would be of great help.
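If it is not obvious which character is the problem, a JSON parser will usually say where it gave up. A rough Python sketch, assuming the config is pasted in as text (first_json_error is a hypothetical helper, and the plugin itself may report errors differently):

```python
import json

def first_json_error(text):
    """Return (line, column, message) of the first JSON syntax error, or None if text parses.

    Hypothetical helper for checking a config before pasting it into tt-rss.
    """
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return (err.lineno, err.colno, err.msg)
    return None

# The config as typed, with single backslashes in the pattern on line 5.
config = r'''[
  {
    "URL": "www.google.com",
    "type": "regex",
    "pattern": "/^http\S+url=|&ct\S+/",
    "replacement": ""
  }
]'''

# Reports the line and column where parsing failed.
print(first_json_error(config))

# Doubling every backslash makes the same config parse cleanly.
print(first_json_error(config.replace("\\", "\\\\")))
```

Here the parser stops on line 5, at the single backslash in \S. A typographic quote pasted in place of a straight one would be flagged in the same way.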
Re: Plugin ff_FeedCleaner
I've sorted the JSON issue. I believe the pattern ought to have been
[
    {
        "URL": "www.google.com",
        "type": "regex",
        "pattern": "/^http\\S+url=|&ct\\S+/",
        "replacement": ""
    }
]
The configuration will now save BUT it is still not working for me.
Re: Plugin ff_FeedCleaner
I've managed to get this working partially. What's not clear, even from the instructions, is whether ff_feedcleaner can make two changes to one URL simultaneously. I want to crop the first and last parts of a redirected URL to extract a URL contained within it. The sub-URL is bracketed on the left by "url=" and on the right by "&ct". Using the code snippet below I can remove everything to the left of the sub-URL, but not what is to the right of it.
[
    {
        "URL_re" : "#www\\.google\\.ie#",
        "type" : "regex",
        "pattern": "/http\\S+url=|&ct\\S+/",
        "replacement": ""
    }
]
I have checked and rechecked this code online with Debuggex. I've tested it with TT-RSS too. I'm not even sure ff_feedcleaner can achieve what I'm trying to do.
Re: Plugin ff_FeedCleaner
dlohan wrote:I've managed to get this working partially. What's not clear, even from the instructions, is whether ff_feedcleaner can make two changes to one URL simultaneously.
Programmer here. The plugin can do two or more changes if you can pack them into one regex. If you can't, you can split your changes into several regexes and put these in the config; they will be applied in order (roughly speaking).
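To illustrate "applied in order": a small Python sketch, not the plugin's actual code. It assumes each rule has a pattern and a replacement as in the configs posted above, uses Python's re.sub in place of PHP's regex functions, and drops the /.../ delimiters because Python does not use them. apply_rules is a hypothetical name.

```python
import re

def apply_rules(text, rules):
    """Apply each regex rule to text in list order (a sketch, not the plugin's code)."""
    for rule in rules:
        text = re.sub(rule["pattern"], rule["replacement"], text)
    return text

# Two separate rules instead of one combined regex.
rules = [
    {"pattern": r"^http\S+url=", "replacement": ""},  # cut everything up to and including url=
    {"pattern": r"&ct=\S*$", "replacement": ""},      # then cut from &ct= to the end
]

link = ("https://www.google.com/url?rct=j&sa=t&url=http://thecork.ie/2016/03/16/"
        "cork-airport-get-two-new-routes-to-southampton-and-leeds-bradford/"
        "&ct=ga&cd=CAIyGzI5ZDZjMWRhMzczNzBlOTU6aWU6ZW46SUU6Ug"
        "&usg=AFQjCNGaSw-_EEoppiW7fQFFjFKSbcISEQ")

print(apply_rules(link, rules))
```

Because the second rule only runs on what the first one left behind, each pattern can stay simple.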
dlohan wrote:I have checked and rechecked this code online with Debuggex. I've tested it with TT-RSS too. I'm not even sure ff_feedcleaner can achieve what I'm trying to do.
There is a preview pane titled Show Diff (might not be the most suitable name, I guess). You can see there what the plugin does on the XML level, but you need Unix's diff for now.
Final word from me on this matter: I think you (and forum subscribers) would be better off if you coded your own plugin for this purpose, since the structure of URL query parameters is not that well suited to regexes and PHP's standard library has functions that deal with exactly that: parse_url and parse_str, if memory serves.
- fox
Re: Plugin ff_FeedCleaner
op, try (a)|(b) or something?
Re: Plugin ff_FeedCleaner
Thanks Feader & Fox.
I'll try what is suggested. If I do manage to get a fix I'll post it here.
Appreciate what you said too Feader that custom programming might be the best option.
Re: Plugin ff_FeedCleaner
Speculating right now, but I think I know where the problem is. I have been using:
[
    {
        "URL_re" : "#www\\.google\\.ie#",
        "type" : "regex",
        "pattern": "/http\\S+url=|&ct\\S+/",
        "replacement": ""
    }
]
The http\\S+url= part works fine, stripping everything to the left of url= (including url= itself), but the part on the right (i.e. &ct\\S+) does not. It seems to have something to do with the first character of this part being an ampersand, which may be clashing with the regex. It's a step closer. I think I'm nearly there if I can isolate this last issue.
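One possible explanation, not confirmed anywhere in this thread: if the plugin works on the feed's XML (as the Show Diff description suggests), every & in the link would appear there as &amp;, so &ct never occurs literally. The Python sketch below illustrates the effect with re.sub, which, like PHP's preg_replace, replaces every match by default. The optional (?:amp;)? group in the second pattern is an untested suggestion, and the delimiters are dropped because Python does not use them.

```python
import re
from html import escape

link = ("https://www.google.com/url?rct=j&sa=t&url=http://thecork.ie/2016/03/16/"
        "cork-airport-get-two-new-routes-to-southampton-and-leeds-bradford/"
        "&ct=ga&cd=CAIyGzI5ZDZjMWRhMzczNzBlOTU6aWU6ZW46SUU6Ug"
        "&usg=AFQjCNGaSw-_EEoppiW7fQFFjFKSbcISEQ")
target = ("http://thecork.ie/2016/03/16/"
          "cork-airport-get-two-new-routes-to-southampton-and-leeds-bradford/")

# Assumption: inside feed XML each & is written as &amp;.
xml_link = escape(link, quote=False)

original = r"http\S+url=|&ct\S+"
tolerant = r"http\S+url=|&(?:amp;)?ct\S+"  # accepts both & and &amp;

print(re.sub(original, "", link) == target)      # True: both ends stripped on the plain link
print(re.sub(original, "", xml_link) == target)  # False: the &amp;ct... tail is left behind
print(re.sub(tolerant, "", xml_link) == target)  # True
```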
Re: Plugin ff_FeedCleaner
I'm guessing it's because you're using an OR operator (the vertical bar, |). The preg function is not internally recursive so it matches the first half and that's it. If the first half was missing, it would strip the second half. Anyway, this is really the wrong approach, as has been mentioned, just use the parse_url/parse_str functions and be done with it.
Code: Select all
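function get_url_query_string_value( $link ) {
    // Pull out just the query string (everything after the "?").
    $qs = parse_url( $link, PHP_URL_QUERY );
    if ( $qs ) {
        // Split the query string into an associative array of parameters.
        $parts = array();
        parse_str( $qs, $parts );
        // Use the embedded URL only if it is present and looks like a valid URL.
        if ( array_key_exists( 'url', $parts ) && filter_var( $parts['url'], FILTER_VALIDATE_URL ) ) {
            return $parts['url'];
        }
    }
    // Otherwise fall back to the original link.
    return $link;
}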
Code above is untested, but should be complete enough to do what you want if you wrap it in the plugin class.
If you insist on using regex, then after matching the contents of url simply create a subsequent filter entry for the feed cleaner that does a regex .* to strip the rest (I don't know what all is available in the feed cleaner plugin so I'm generalizing here).
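For trying the same logic without a tt-rss install, here is a rough Python port of the function above using urllib.parse. The scheme/host check is only an approximation of FILTER_VALIDATE_URL (it accepts just http and https), and the sample link is the Google redirect posted earlier in the thread.

```python
from urllib.parse import urlsplit, parse_qs

def get_url_query_string_value(link):
    """Return the URL carried in the 'url' query parameter, or link unchanged if there is none."""
    params = parse_qs(urlsplit(link).query)
    candidates = params.get("url", [])
    if candidates:
        inner = urlsplit(candidates[0])
        # Rough stand-in for PHP's FILTER_VALIDATE_URL: require a web scheme and a host.
        if inner.scheme in ("http", "https") and inner.netloc:
            return candidates[0]
    return link

link = ("https://www.google.com/url?rct=j&sa=t&url=http://thecork.ie/2016/03/16/"
        "cork-airport-get-two-new-routes-to-southampton-and-leeds-bradford/"
        "&ct=ga&cd=CAIyGzI5ZDZjMWRhMzczNzBlOTU6aWU6ZW46SUU6Ug"
        "&usg=AFQjCNGaSw-_EEoppiW7fQFFjFKSbcISEQ")

print(get_url_query_string_value(link))
print(get_url_query_string_value("http://example.com/plain-link"))
```

Links without a usable url parameter come back unchanged, which mirrors the fallback in the PHP version.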