*UNOFFICIAL* Frontier Forums Developer Posts RSS feed

The forum is now enforcing HTTPS for the login script, so I had to quickly change my collector. It's running again now, but for about 30 minutes it wouldn't have been collecting anything because the login was failing. There don't seem to have been any posts in that time period anyway.
 
Thanks for this very useful tool.

One request - any chance to get the quoted texts into the feed as well? Lots of posts go like "Yes, that's true" or similar and when I follow the link back to the forum the related posting is actually quoted in the dev's post. Any chance to scrape that as well?
 
No. I'm simply scraping what's in each user's Activity listing. To do anything else would mean doing an extra HTTP request per entry to pull the contents, and certainly when the forums are under heavy load the script struggles enough as it is.
 
@Athan
Your RSS feed is buggy: It's encoding Windows-1252 characters (e.g. 0x93, aka LEFT DOUBLE QUOTATION MARK) using weird XML escape codes (e.g. & #xC2;& #x93). Check out any of the "Combat Logging" posts.

The 0xC2 looks like the start of a UTF-8 sequence, but it's not the one which corresponds to the desired character. And in any case, it would be wrong to encode UTF-8 like that.
 
Indeed. I'm looking into this now.

The annoying thing is that I thought the Perl modules I'm using were taking care of this. Either one of them has a bug or it's mis-guessing the incoming encoding.

Hmmm, I'm being told the page is in ISO-8859-1, but indeed those are Windows-1252 characters that don't even exist in ISO-8859-1 according to Wikipedia. So I'm not sure what I can even do about this, given the FD forums/web server are flat out lying.
 
I'm not surprised, when Frontier's RSS feed implies it's using UTF-8 but is actually using Windows-1252.

I ran a different RSS feed which claimed to be ISO-8859-1 but was Windows-1252. In my case I checked for invalid/unused ISO-8859-1 characters that exist in Windows-1252 (0x80 to 0x9F), and if any were present I changed the assumed encoding to Windows-1252...
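
For anyone wanting to try the same check, here is a minimal sketch in Python (not the Perl the collector actually uses). The function name `decode_forum_bytes` and its `declared` parameter are made up for illustration; the only thing taken from the description above is the 0x80 to 0x9F test.

```python
def decode_forum_bytes(raw: bytes, declared: str = "iso-8859-1") -> str:
    """Decode page bytes, distrusting a declared ISO-8859-1 charset.

    Hypothetical helper: name and signature are assumptions for this sketch.
    """
    # In ISO-8859-1 the range 0x80-0x9F holds only C1 control codes, which
    # real text almost never contains. Windows-1252 uses that range for
    # smart quotes, dashes and similar punctuation.
    has_c1_range_bytes = any(0x80 <= b <= 0x9F for b in raw)

    if declared.lower() in ("iso-8859-1", "latin-1", "latin1") and has_c1_range_bytes:
        # A few bytes in this range (e.g. 0x81) are unassigned in cp1252,
        # so replace them rather than raising an error.
        return raw.decode("cp1252", errors="replace")

    # Otherwise trust the declared charset.
    return raw.decode(declared)
```

With this, `decode_forum_bytes(b"\x93Yes\x94")` comes back with proper curly quotes around "Yes", while plain Latin-1 input with no bytes in that range is decoded exactly as declared.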
 
So, forcing Windows-1252 encoding when I decode what the forum throws at me does lead to proper UTF-8 internally, including in my database. But then XML::RSS (via the HTML::Entities encoding routines) insists on giving me an entity-escaped “ rather than the more straightforward literal “, and browsers don't seem to like the former, despite it decoding to the same thing according to http://software.hixie.ch/utilities/cgi/unicode-decoder/utf8-decoder .

I'm loath to start writing my own encode_cb routine for XML::RSS, but will do more digging to see if I can get things to behave in a manner that makes browsers happy.

It'd still be so much simpler if Sandro Sammarco just didn't use stupid Windows 'smart quotes' or whatever these are.
 
OK, gave up and wrote a custom encode_cb routine (which basically just tells encode_entities_numeric to encode ONLY the characters that have to be escaped for the HTML encoding to work, and thus leaves UTF-8 alone). I'm going to let that sit on the dev version for now and will look tomorrow at ensuring this does actually work and then put it live (with the hacky use of Windows-1252 only triggered by non-ISO-8859-1 characters as suggested).
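
The same idea can be sketched in Python, purely as an illustration: this is not the Perl encode_cb routine itself, and `encode_minimal` is a made-up name. It escapes only the markup-significant characters and passes everything else, including smart quotes, through untouched, on the assumption that the feed is then written out as UTF-8.

```python
from xml.sax.saxutils import escape


def encode_minimal(text: str) -> str:
    """Escape only what XML/HTML needs; leave non-ASCII characters alone.

    Hypothetical helper for this sketch, not part of XML::RSS.
    """
    # escape() handles &, < and > itself; the extra mapping adds the
    # double quote so the result is also safe inside attribute values.
    return escape(text, {'"': "&quot;"})


if __name__ == "__main__":
    sample = 'Fish & "chips" \u201cquoted\u201d <b>'
    # Encode to UTF-8 only at the point of writing the feed out.
    print(encode_minimal(sample).encode("utf-8"))
```

The point of the design is that characters outside ASCII never get turned into numeric escapes at all, so there is nothing for a browser or feed reader to mis-decode.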
 
OK, all the hackiness for charset and stupid Windows characters is now live. It'll only affect the feed the next time there's a new dev post, though.
 
Search!

Remember folks, check http://www.miggy.org/games/elite-dangerous/devposts.html for news about changes, including both bugfixes and new/changed features.

Also, I've just finished writing the first cut of a page to search the contents of the database this has resulted in. Simply go here and enter some words to search on. As per the notes, if you simply put in a few words it will only return posts containing all of the words.

The first useful posts in there are from around 10th October 2014. Some are before that but only because the user profiles contained very old posts.

Any feedback appreciated in the usual manner.
 

wolverine2710

Tutorial & Guide Writer
In the past your excellent tool/thread was added to EDCodex. Before EDCodex was released on the 17th of August, you received (between the 5th and 10th of August) a PM with an invitation and a special link. After registering and logging in you would automatically become owner of your entry. According to the admin tool(s) you haven't used the special link (yet). Perhaps you missed the PM or have been (temporarily) away from ED. It's also possible you chose not to claim your entry. Note: it's also possible to assign another commander editing rights for your entry. In either case please send us a PM. You can find your EDCodex entries here and here.

Alternative way to get ownership
The special link will cease to function in the (near) future, for security reasons. Should you want to become owner of your entry after that point, you can use the "Claim ownership" button. In that case please send biobob or myself a PM with the email address you used for registration, for verification. You can also use this procedure if you no longer have the PM.

What is EDCodex:
It's a website with a database of currently approx. 215+ tools, threads, websites and videos for ED. Anyone can, and is encouraged to, add entries there. EDCodex is and should be community-driven. EDCodex companion thread. It's equally suited to PCs, tablets and smartphones and has RSS feeds.

With kind regards,
Biobob
Wolverine2710
 
Just checked IDs from 110948 to 113092, no new dev accounts (i.e. they all had the game-rank titles that are for normal users).

Remember, PM/email me any dev accounts you think I'm not scraping - I'll check and add as necessary. NB: I chose to specifically not scrape the QA and Support accounts as they were mostly "Thanks for this report" posts which quickly got tedious.
 
HTTPS now supported

I've just updated the code, and the 'about' page linked to in the OP here, to an HTTPS URL. This is also using miggy.org rather than www.miggy.org. The old URLs will continue to work. You can take your pick of any combination of HTTP or HTTPS and www.miggy.org or miggy.org, but the feed itself now uses https://miggy.org/... throughout, so if you don't update the feed URL in your RSS reader you may find it carping about a mismatch.

Oh, and as per the changelog I've also started putting " (<Forum Name>)" text on the end of item titles.
 
This is such a useful facility I don't know what I'd do without it.

Unfortunately my RSS reader wants to delete all the contents if I change the feed URL, so I've left it as it seems happy enough.

As for putting " (<Forum Name>)" on the end of item titles: is it not possible to use the Category field, or is that dependent on the source? (As you can tell, I'm no RSS expert.)
 