*UNOFFICIAL* Frontier Forums Developer Posts RSS feed

I've been beavering away on this fulltext thing. The scraper has been collecting the fulltext, in addition to what it already collected, for around a week now. I've also run a script to backfill all the still-reachable posts I had archived. If a new post is found but the 'click through' to the fulltext fails then the post will be stored without the fulltext. The alternative was to not store the post at all, and risk the same failure repeating until the post had dropped off the relevant member's activity list, losing it entirely. This does mean that the fulltext RSS feed below might sometimes contain posts with only precis text. I'll know when this has happened and will re-run the backfill script to ensure the archive has the full text.
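
A minimal Python sketch of that fallback, for illustration only. The function names, the record fields and the `fetch_fulltext` callable are all hypothetical and not the tracker's actual code; the point is just that a failed click-through still yields a stored record, flagged for a later backfill.

```python
from typing import Callable, Dict, List, Optional


def build_record(post: Dict[str, str],
                 fetch_fulltext: Callable[[str], str]) -> Dict[str, Optional[str]]:
    """Return a record to store, with fulltext if the click-through worked."""
    record: Dict[str, Optional[str]] = dict(post)
    try:
        record["fulltext"] = fetch_fulltext(post["url"])
    except Exception:
        # Store the post anyway; only the precis is available for now.
        record["fulltext"] = None
    return record


def needs_backfill(records: List[Dict[str, Optional[str]]]) -> List[Dict[str, Optional[str]]]:
    """Pick out stored records that are still missing their fulltext."""
    return [r for r in records if r["fulltext"] is None]
```

A backfill run would then retry `fetch_fulltext` for whatever `needs_backfill` returns.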

I've just tweaked things to generate an extra .rss file that utilises the fulltext. This output comes with the same HTML class names/ids as in the forum HTML, but most readers will strip all the attributes out, so I've not attempted to style things as per the forums. Instead I change the blockquote element that's there into a div so that I can then change the divs that contain post quotes into blockquotes. I'm also stripping out the inline images used for emoticons, mostly because they carry an attribute I'd otherwise have to strip to get the RSS to validate. I may re-visit that.
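
A rough Python sketch of that element swap, using only the standard library. The class names `bbcode_quote` and `inlineimg` are assumptions about the forum markup, the code assumes well-nested HTML, and none of this is the tracker's actual implementation.

```python
from html import escape
from html.parser import HTMLParser

QUOTE_CLASS = "bbcode_quote"   # assumed class on divs that hold post quotes
EMOTICON_CLASS = "inlineimg"   # assumed class on emoticon images


class FeedRewriter(HTMLParser):
    """Swap blockquote/div roles and drop emoticon images."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []
        self.stack = []  # output names of currently open div/blockquote tags

    def _emit(self, tag, attrs, closing=""):
        rendered = "".join(
            f" {k}" if v is None else f' {k}="{escape(v, quote=True)}"'
            for k, v in attrs
        )
        self.out.append(f"<{tag}{rendered}{closing}>")

    def _is_emoticon(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        return tag == "img" and EMOTICON_CLASS in classes

    def handle_starttag(self, tag, attrs):
        if self._is_emoticon(tag, attrs):
            return
        classes = (dict(attrs).get("class") or "").split()
        new_tag = tag
        if tag == "blockquote":
            new_tag = "div"
        elif tag == "div" and QUOTE_CLASS in classes:
            new_tag = "blockquote"
        if tag in ("div", "blockquote"):
            self.stack.append(new_tag)  # remember what the close tag must be
        self._emit(new_tag, attrs)

    def handle_startendtag(self, tag, attrs):
        if not self._is_emoticon(tag, attrs):
            self._emit(tag, attrs, " /")

    def handle_endtag(self, tag):
        if tag in ("div", "blockquote") and self.stack:
            tag = self.stack.pop()
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def rewrite_for_feed(html: str) -> str:
    parser = FeedRewriter()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

The stack is there so that each closing tag is renamed to match whatever its opening tag was turned into.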

Anyway, if anyone wants to test with the new fulltext RSS feed the URL is: https://miggy.org/games/elite-dangerous/devtracker/ed-dev-posts-fulltext.rss

I've switched my personal Tiny Tiny RSS instance over to using this now and will be monitoring it. I still have a few things on my TODO list for it, like making sure that any forum code elements are rendered in a sane manner (but that's likely just changing an enclosing div into a pre, as I can't apply specific styling).

The old feed URL remains unchanged and will continue to contain only the 'precis' text that's available from the forum member Activity Lists.

Next up will be updating the search UI/code to allow for searching in the fulltext, not just the precis and title.
 
A user pointed out that my per-item timestamps were claiming to be in UTC, but were actually UTC + 1.

After a while tracking down what was going on, this turned out to be due to a code refactor I did recently (to allow some code re-use in separate scripts), which caused the actual scraping code to ignore the session/cookies set up by the part of the script that handles login.

So that code is fixed, and I've updated the timestamp in the database on all the affected posts. I've also re-generated both the old and new fulltext feeds so the timestamps are correct in them now.
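
For illustration, a small Python sketch of that kind of correction. It assumes every affected timestamp was stored labelled as UTC but was exactly one hour ahead; the function names are hypothetical and the example date is made up.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

ASSUMED_OFFSET = timedelta(hours=1)  # stored values were UTC + 1


def correct_timestamp(stored: datetime) -> datetime:
    """Shift a mislabelled 'UTC' timestamp back to real UTC."""
    return stored - ASSUMED_OFFSET


def rss_pubdate(when: datetime) -> str:
    """Format a UTC datetime in the RFC 822 style used by RSS pubDate."""
    return format_datetime(when.astimezone(timezone.utc), usegmt=True)
```

Re-generating a feed would then mean running each corrected timestamp through something like `rss_pubdate`.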
 
I've just updated all the URLs in the OP to utilise https://ed.miggy.org/ rather than any old version. These new URLs get rid of the "games/elite-dangerous/" component in the URL paths, making them much shorter.

I should have done this year(s) ago.

I'm about to go edit the HTML files and what goes in the RSS files to match this as well.
 
I've just removed Drew Wagar from the monitored forum accounts. He's no longer actively writing for ED and his posts aren't really relevant to the game any more. Someone let me know if this changes and I'll consider adding him back.
 
I've just merged a pull request that someone else kindly put together to add Will Flanagan (new Community Manager) to the scraped accounts. Their posts should start appearing shortly.
 
The host that I run the dev tracker on will be undergoing some extended maintenance one weekend this month. This involves an OS upgrade, which I have already performed on other machines, so I'm 99% confident it won't take more than 4 hours, but expect some downtime. I'll try to remember to post here when I actually start the maintenance. If I don't get the work done on a given weekend, expect it to happen the next one. I am currently expecting to perform the work this coming Saturday, 10th February 2018.

So if https://ed.miggy.org/ is unreachable you'll know why.

P.S. I did remove Dale Emasiri from the tracker the other day, as per the changelog on https://ed.miggy.org/devposts.html - he no longer works at Frontier.
 
So, I forgot to say I was starting the work, but this is me now saying it looks like I've finished it, or at least anything that would affect the web services and scraper for the RSS feed.
 
As per the new changelog entry on the feed home page:

2018-03-28 15:45 UTC
I've just implemented ignoring of selected forums. This is initially the current four "Jurassic World: Evolution" forums. Once I'm confident this is working without causing issues I may expand that list so that any forum not specific to ED is ignored.
 
I've now blacklisted all the new JWE related forums, so posts in them will no longer appear in this RSS feed.
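
A minimal Python sketch of ignoring selected forums. The forum identifiers and the `forum` field are placeholders invented for the example, not the real forum names or the tracker's data model.

```python
from typing import Dict, Iterable, List

# Placeholder identifiers; the real ignore list would name the JWE forums.
IGNORED_FORUMS = frozenset({
    "jwe-forum-1",
    "jwe-forum-2",
    "jwe-forum-3",
    "jwe-forum-4",
})


def filter_posts(posts: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Drop posts made in any ignored forum before they reach the feed."""
    return [p for p in posts if p.get("forum") not in IGNORED_FORUMS]
```

Keeping the list in one place means expanding it later is a one-line change per forum.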
 