*UNOFFICIAL* Frontier Forums Developer Posts RSS feed

TL;DR - It could easily be another week before I have this working and handling everything again.

I started work on the new scraping code today. I'm at best halfway done with it, and I'm taking the rest of the day off from it. The curious can check the 'xf2' branch in the GitHub repo (which I linked somewhere in a previous post).

Small differences in the data available in the activity list make me grateful I was given access to the forum's API (although it doesn't contain an equivalent to the per-user activity list, so that bit is still HTML scraping). Re-creating exactly the same experience might be a little tricky, particularly when it comes to the details of post content that includes quotes. I'm going to have to investigate Parse::BBCode to turn the BBCode version of posts that the API spits out into appropriate HTML for the RSS feed and search interface.
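
For illustration only, here is a minimal Python sketch of the kind of BBCode-to-HTML conversion described above. It is not the tracker's code and it does not use Parse::BBCode; the tag table, the `bbcode_to_html` name and the `<cite>` markup for quote attribution are all assumptions made for the example. It handles only a few simple paired tags plus `[quote]`, including nested quotes.

```python
import html
import re

# ASSUMPTION: a deliberately tiny tag table; real forum BBCode has many more tags.
SIMPLE_TAGS = {"b": "strong", "i": "em", "u": "u"}

# Matches an innermost [quote]...[/quote] pair (the body may not contain
# another opening [quote] tag), with an optional ="name" attribute.
QUOTE_RE = re.compile(
    r'\[quote(?:="?([^"\]]*)"?)?\]((?:(?!\[quote[=\]]).)*?)\[/quote\]',
    re.IGNORECASE | re.DOTALL,
)


def _quote_to_html(match):
    who = match.group(1)
    # ASSUMPTION: attribution is rendered as a <cite> line inside the blockquote.
    cite = f"<cite>{who} said:</cite>" if who else ""
    return f"<blockquote>{cite}{match.group(2)}</blockquote>"


def bbcode_to_html(text):
    """Convert a small subset of BBCode to HTML (hypothetical helper)."""
    # Escape raw HTML first so that only the tags generated below are live markup.
    out = html.escape(text, quote=False)
    for bb, tag in SIMPLE_TAGS.items():
        out = re.sub(
            rf"\[{bb}\](.*?)\[/{bb}\]",
            rf"<{tag}>\1</{tag}>",
            out,
            flags=re.IGNORECASE | re.DOTALL,
        )
    # Convert quotes innermost-first until nothing changes, so nesting survives.
    previous = None
    while previous != out:
        previous = out
        out = QUOTE_RE.sub(_quote_to_html, out)
    return out
```

Calling `bbcode_to_html('[quote="Sarah"]Hi [b]all[/b][/quote]ok')` wraps the quoted text in a `<blockquote>` with a `<cite>` line for the quoted poster. Unknown tags are left in the output untouched, which is the safe default for a feed.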

I also found a whole bunch of duplicate posts I had due to various past forum shenanigans with changing the format of URLs. I think I have all that cleaned up now, but still need to go back and make sure my 'guid_url' data is something that will work on thread-starting posts for old posts.
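
A rough sketch of how duplicates caused by changing URL formats could be collapsed, assuming each URL carries a numeric post id. The `#post-<id>` fragment is the form seen in links in this thread; the old-style `p=<id>` query parameter, the `example.invalid` URLs and the function names are assumptions for illustration, not the tracker's actual cleanup.

```python
import re
from urllib.parse import parse_qs, urlsplit


def post_key(url):
    """Return the numeric post id found in a post URL, or None if there is none."""
    parts = urlsplit(url)
    # New-style links end in a fragment such as "#post-7717969".
    match = re.fullmatch(r"post-?(\d+)", parts.fragment)
    if match:
        return int(match.group(1))
    # ASSUMPTION: old-style links carry the id as a "p" query parameter.
    values = parse_qs(parts.query).get("p")
    if values and values[0].isdigit():
        return int(values[0])
    return None


def dedupe(urls):
    """Keep the first URL seen for each post id; URLs without an id are keyed by themselves."""
    seen = set()
    kept = []
    for url in urls:
        key = post_key(url)
        if key is None:
            key = url
        if key in seen:
            continue
        seen.add(key)
        kept.append(url)
    return kept
```

Keying on the post id rather than the full URL means a change of URL scheme no longer produces a second database row for the same post.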
 
More work on this today. I might have beaten the code into shape, but out of an abundance of caution I'm now going to run just the developer instance for a day or so (perhaps until the end of Monday to ensure some new posts), to see if anything else shakes out.

Tentatively the feed might be back in operation Monday evening (UK time).

I still need to look at the search interface and make sure it's outputting correct URLs for threads, posts and posters. As part of that a load of "start of a new thread" posts need their DB entries tweaking as well.
 
The feed is live again, and there'll be over 100 new posts in it. For now it's generating the feed with the last 28 days of posts in it, whereas normally I only put the last 7 days in. I'll likely change that back in a month's time so as to give all consumers a chance to pick up the posts from the gap in service.
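
The 28-day versus 7-day window amounts to a cutoff on post timestamps. A minimal sketch, assuming each post is a dict with a timezone-aware `posted` datetime (the field name and `posts_in_window` are hypothetical):

```python
from datetime import datetime, timedelta, timezone


def posts_in_window(posts, days, now=None):
    """Return the posts whose 'posted' timestamp falls within the last `days` days."""
    if now is None:
        now = datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    return [post for post in posts if post["posted"] >= cutoff]
```

Switching the feed back later is then just a matter of passing `days=7` instead of `days=28`.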
 
One Planet Coaster post briefly snuck through as I had an incorrect id in ed-devtracker-forums-ignored.json. I've now corrected that file (and removed the offending post from the live .rss files) and reviewed all the other entries as well.
 
Excellent news, thanks Athan.
Just to let you know it's flagging up when a staff member has given a Like - for example https://forums.frontier.co.uk/threa...-npc-to-fight-with.507635/page-3#post-7717969 is flagged up with 'Sarah' as the author (she posted a Like) even though Ashenfox is the author of the post. Not sure if this is intentional or not.
Hmmm, I thought I'd made it exclude any reactions at all. Thanks for the heads up, I'll take a look.
 
Right, I've loosened up that regex a little, and it correctly detects those three Reactions as being such. I've now removed them from the database, and will shortly edit them out of the live RSS files. (Edit: there were 17 such in total, all removed now.)
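
To illustrate what a "loosened" reaction filter might look like: the sketch below assumes activity entries read something like "X reacted to Y's post in the thread Z with Like." That wording, the pattern and the `is_reaction` name are assumptions, not the actual regex in the tracker.

```python
import re

# ASSUMPTION: reaction entries contain "reacted to ... with ...".
# The pattern is deliberately loose about what sits between the two phrases.
REACTION_RE = re.compile(r"\breacted\s+to\b.*\bwith\b", re.IGNORECASE | re.DOTALL)


def is_reaction(activity_text):
    """Return True if an activity entry looks like a reaction rather than a post."""
    return REACTION_RE.search(activity_text) is not None
```

Entries for which `is_reaction` returns True would be skipped before anything is written to the database or the RSS files.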
 
Heads up, I just found a brain-fart where RSS generation was using a hard-coded 'base_url' value, rather than the configured one. It's this that was causing all the link URLs to have a '//' in them. I've now corrected this to use the configured forum_base_url value, which means all the link/guid values will change on the next generation and cause all posts to be seen as new. I felt it better to 'rip the plaster off' on this now, rather than have it come back to bite me in the future.
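
A small sketch of joining a configured base URL and a path so that the result has exactly one slash at the seam, whether or not either side already has one. `join_url` is a hypothetical helper for illustration, not the tracker's code.

```python
def join_url(base_url, path):
    """Join a base URL and a path with exactly one '/' between them."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")
```

Normalising both sides at the join guards against a doubled slash as well as a missing one, regardless of how the configured value is written.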
 
Now all the pre-migration links are missing the slash between "forums.frontier.co.uk" and "showthread.php".
Yes, there's some fix up I still need to do. I'll take a look at it tomorrow.

Note that some very old threads just plain don't exist any more (due to forum deletion), so if you're ever using the search interface don't be too surprised if you hit a forum error on clicking through (but right now that's true of just over a thousand thread-starter posts which I'll also be finding a way to fix up).
 
I'm aware that the fulltext feed isn't interpreting some BBCodes, such as:

Code:
[CENTER]
[ATTACH]
[COLOR]
[INDENT]
I'll take a look at this Soon™, no ETA.
 
@Athan / Miggy - you are the saviour of my ED life. I haven't been able to play properly for ages so I've been using your feed to keep me up to date with events, especially the dev posts. When the feed stopped working I thought that I would be lost so much that not even the Fuel Rats could save me.... but you got it fixed and now I'm back in touch!

Honestly, thanks. If we ever meet I'll get you a beverage of your choosing :)
 
The feed host is down due to a lack of bill payment (not my responsibility). No ETA on its return at this time.

Edit: it was back a short while ago.
 
I'm also now aware of the forum's ability to show, in a user's activity stream, when they have set a 'status'. I'm pondering how to handle this.

On the one hand we now know BrettC is away until 28th April, which can be handy if any 'dev' sets it. On the other hand that's a profile post and not associated with a forum and I need to do some code fix up to avoid spewing errors when it encounters such activity.... It'll also slightly mess with the search index.
 
I've just set the new 'Planet Zoo' forums to be ignored. A couple of posts snuck through into the feed first, which is what alerted me to these new forums. Those two posts are now deleted from the posts database, and the feeds have been re-generated without them.

This is the small cost of not wanting to miss any ED related posts, i.e. I don't want to whitelist the ED forums in case I miss the addition of a new forum and miss relevant posts. I'd much rather have the occasional non-ED post make it into the feed due to new forums I then ignore.
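
The ignore-list approach described above can be sketched as follows. The JSON layout (an `ignored_forum_ids` array), the `forum_id` field and the function names are assumptions; the real format of ed-devtracker-forums-ignored.json isn't shown in this thread.

```python
import json


def load_ignored(json_text):
    """Parse an ignore list. ASSUMPTION: {"ignored_forum_ids": [<int>, ...]}."""
    return set(json.loads(json_text)["ignored_forum_ids"])


def keep_post(post, ignored):
    """Keep any post whose forum is not explicitly ignored.

    A forum that has never been seen before is therefore kept by default,
    which is the trade-off described above.
    """
    return post["forum_id"] not in ignored
```

With a whitelist the test would be inverted (`post["forum_id"] in allowed`), and posts in a brand-new forum would be silently dropped until the list was updated.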
 