*UNOFFICIAL* Frontier Forums Developer Posts RSS feed

There have been some forum changes that are breaking my scraper. I've got code to work around one (the forum now puts text like " 4 replies and 242 views" inside the element of class=date). But the links in the Activity Lists are now also broken: they no longer contain the anchor to the specific post within the thread.

I've PM'd BrettC about both and hopefully he'll fix things.
 
I've just updated the code, and the 'about' page linked to in the OP here, to use an HTTPS URL. This is also using miggy.org rather than www.miggy.org. The old URLs will continue to work. You can take your pick of any combination of HTTP or HTTPS and www.miggy.org or miggy.org, but the feed itself now uses https://miggy.org/... throughout, so if you don't update the feed URL in your RSS reader you may find it carping about a mismatch.
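For anyone who rewrites feed URLs with a script, here is a minimal Python sketch of mapping any of the accepted HTTP/HTTPS and www/non-www variants onto the https://miggy.org/... form. The function name and the example path are made up for illustration; this is not code from the scraper itself.

```python
from urllib.parse import urlsplit, urlunsplit

CANONICAL_SCHEME = "https"
CANONICAL_HOST = "miggy.org"
ACCEPTED_HOSTS = {"miggy.org", "www.miggy.org"}


def canonical_feed_url(url: str) -> str:
    """Rewrite an http/https, www/non-www variant to https://miggy.org/...

    URLs on any other host are returned unchanged.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in ACCEPTED_HOSTS:
        return url
    # Keep path, query and fragment; only scheme and host are normalised.
    return urlunsplit(
        (CANONICAL_SCHEME, CANONICAL_HOST, parts.path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    # "/example/feed.rss" is a hypothetical path, not the real feed location.
    print(canonical_feed_url("http://www.miggy.org/example/feed.rss"))
```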

Oh, and as per the changelog I've also started putting " (<Forum Name>)" text on the end of item titles.
Over the past few days the feed seems to have been a little broken: some posts are appearing with no (<Forum Name>), and I have noticed that the posts this happens to also have no preview of the news item content.

Edit: seems I was posting as you were looking into it
 
Yup, it's all down to the changes that appear to have happened sometime in the last 24 hours. It's causing mis-parsing of the Activity List, so older posts are being treated as new again (their links are missing the XXXXXX on #postXXXXXX, which made my code think they're new posts).
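As an illustration of that failure mode, here is a rough Python sketch of one defensive way to handle links whose #postXXXXXX anchor has gone missing: skip them rather than announce them as new. It is not the scraper's actual code; the function names and the example URLs in the comments are assumptions.

```python
import re
from typing import Optional, Set
from urllib.parse import urlsplit

# Matches a fragment of the form "post123456" (from "#post123456").
POST_ANCHOR = re.compile(r"^post(\d+)$")


def post_id_from_link(url: str) -> Optional[str]:
    """Return the numeric post ID from a #postXXXXXX anchor, or None."""
    match = POST_ANCHOR.match(urlsplit(url).fragment)
    return match.group(1) if match else None


def is_new_post(url: str, seen_ids: Set[str]) -> bool:
    """Report a post as new only if it has an ID we have not seen before."""
    post_id = post_id_from_link(url)
    if post_id is None:
        # Anchor missing: we cannot tell which post this is, so do not
        # treat it as new (that is what produces repeats in the feed).
        return False
    if post_id in seen_ids:
        return False
    seen_ids.add(post_id)
    return True


if __name__ == "__main__":
    seen: Set[str] = set()
    # Hypothetical URLs, for illustration only.
    print(is_new_post("https://forums.example/thread/1#post123456", seen))
    print(is_new_post("https://forums.example/thread/1", seen))
```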
 
The issue with the post URLs seems to have been fixed.

Do note that because the URLs now include the thread title in text form it means my scraper has seen a lot of older posts as now-new, as it uses exactly that URL to know if it's seen a post before. So there's been a splurge of ~50 posts repeated in the feed.
 
Has been working smoothly for past few days.

+rep, and I would like to say thank you for such a useful scraper for those of us that use RSS readers.
 
I've just done another run of my "any new dev IDs?" script and nothing turned up.

Note I did see the few weird duplicates earlier today. Given how they happened I can only think the forum Activity List is to blame. It looked like Sandro Sammarco posted the same thing twice, but on checking the relevant URLs the posts are different. My scraper would only have gotten the content directly out of the Activity List.
 
Thanks to Garix for the heads up about Dominic Corner being a 'new' dev to track.

His ID is actually one I'd checked with a script before, but back then he must not have yet had a custom title so I didn't pick up on the account. Please do contact me if you spot any posts by (not QA or Support) FDev accounts that you think should be added to the RSS feed.
 
If you came here to complain about a sudden spam of old posts in your feed then hold fire. From my changelog:

2017-03-10 17:20 UTC
I've just finished work to use what I hope are truly unique and static GUIDs for each post. Before this change if someone on the Frontier Forums edited the title of a topic then the scraper would use the changed URL as the permaLink and thus put it out as an additional new post. Where several posts had already been made within the topic this would cause spam after the edit.

I now strip out the embedded topic title from the URL, using just those parts that are necessary to reach it on the forums. A side-effect of setting this up is that the feed just got spammed with a huge amount of not-actually-new posts due to changing over to using these better GUIDs in the permaLink property of each post. This will be a one-off thing.
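The idea in that changelog entry can be sketched in Python roughly as follows. This is only an illustration, not the scraper's real code: it assumes a URL shape of .../<thread id>-<Topic-Title>?p=<post id>#post<post id>, and the host name, IDs and function name are invented.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Assumed shape of the last path segment: "<numeric thread id>-<title slug>".
TITLE_SLUG = re.compile(r"/(\d+)-[^/]*$")


def stable_guid(url: str) -> str:
    """Drop the topic-title slug so that title edits do not change the GUID."""
    parts = urlsplit(url)
    path = TITLE_SLUG.sub(r"/\1", parts.path)
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


if __name__ == "__main__":
    # Same post before and after a topic title edit (hypothetical URLs).
    before = "https://forums.example/showthread.php/1234-Old-Title?p=5678#post5678"
    after = "https://forums.example/showthread.php/1234-New-Edited-Title?p=5678#post5678"
    print(stable_guid(before) == stable_guid(after))
```

With the title removed, both URLs reduce to the same string, so a set of already-seen GUIDs no longer treats the edited topic's posts as new.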
 
No worries. To be honest, in Feedly, all that seemed to happen was that the times on the posts all appeared the same. The actual order was correct.

Many thanks for your work on this. I use it daily.
 
And there was one more duplicate as I corrected a little more code for the GUID changes. Should be all good now (famous last words).
 
looks like the miggy.org site is down. :( (haven't been able to access it all morning)
I'm hoping this is just a temporary blip, and it's not going away permanently.
 
Yes, there was a problem with the host and I wasn't around to notice it. Eventually one of the other admins rebooted it. There'll be a short downtime later today to move back to the 4.4.x kernel series, as 4.9.x has exhibited an issue three times now.
 
Awesome!
Thanks for the quick response. :)
 
No change to the feed, but the search (yes, folks, remember I have a search facility too, great for finding that dev post you can only vaguely remember! It's linked in my sig here) has just been updated to allow for searches in thread titles as well as the existing precis field.

I'm also intending to work on grabbing the full text of each post, including back-filling the historical posts I have stored. Then I'll update the search to work on these, hopefully with an option to include/exclude any quoted text.
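One possible shape for the include/exclude option, sketched in Python. It assumes the full post text is stored as HTML and that quoted text sits inside blockquote elements; neither is confirmed here, and the class and function names are invented for the example.

```python
from html.parser import HTMLParser
from typing import List


class PostTextExtractor(HTMLParser):
    """Collect the text of a post, optionally skipping <blockquote> content."""

    def __init__(self, include_quotes: bool) -> None:
        super().__init__()
        self.include_quotes = include_quotes
        self._quote_depth = 0  # > 0 while inside (possibly nested) quotes
        self._chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "blockquote":
            self._quote_depth += 1

    def handle_endtag(self, tag):
        if tag == "blockquote" and self._quote_depth > 0:
            self._quote_depth -= 1

    def handle_data(self, data):
        if self.include_quotes or self._quote_depth == 0:
            self._chunks.append(data)

    def text(self) -> str:
        # Collapse runs of whitespace into single spaces.
        return " ".join(" ".join(self._chunks).split())


def searchable_text(post_html: str, include_quotes: bool = False) -> str:
    """Return the text to index for search, with or without quoted text."""
    parser = PostTextExtractor(include_quotes)
    parser.feed(post_html)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    html = "<blockquote>old words</blockquote><p>my reply</p>"
    print(searchable_text(html))
    print(searchable_text(html, include_quotes=True))
```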
 