In-Development TradeDangerous: power-user trade optimizer

I'm running the Bessel Gateway run right now with some debug lines added. I've seen the recursion level go all the way up to 11 a couple of times, and finally got a freeze at "YENISTANI/Johri Terminal". Let's see what's going on here.
 
[Image: Spinal_Tap_-_Up_to_Eleven.jpg]
 
So, the thing is still running. I can see that it never actually freezes; it just gets really slow at some stations. Really, really slow.

From what I'm seeing, I'd say the best thing to do would be to set a minimum profit/ton. No idea what a good default would be, but I do know it'd have to be a ratio to work for little Sidewinders and giant Anacondas, not a fixed number.

I'm recording the run; I'll upload the video once the run is complete so you guys can see what I mean.
 
+rep all around (where it let me). What a set of posts to wake up to. Going to get my morning tea and reread it all with an appropriately caffeinated brain. Looks like you are tracking it down, though.

Now, I know that I was away from the game for a couple of years and that the database will have grown in that time, but then again so has my CPU speed, and it really feels like TD runs much more slowly than it did in 2016.

If there are, in fact, bits of old buggy or unnecessary recursion sitting unfixed or unoptimised in the code, that would certainly explain the slowdown.
 
Granted, the database has been updated, but running Tromador's call with the non-recursion version finished relatively quickly and the "unsupplied" version does return more credits than the supplied version.

I did snapshot the DB @ https://drive.google.com/open?id=1mMW-T2lLno9CHQwmWKr1N5wbzzoycSNn if you want a known test case. Just be aware that this is a zip up of the server's version of TD and that there are a couple of "undocumented" changes which make the server run correctly, so you may wish to replace the plugin with a fresh copy, and if you want to run the listener, check the options JSON.
 

Regarding that, none of the changes you made for the server will affect testing, because they only affect the plugin and the listener. Neither the listener nor the plugin is used on a trade run calculation, and the changes you made don't affect TD itself.

Also, updating the data is exactly what we don't want to do wrt testing this particular bug, because updating the data might make the bug go away for the particular use case we're testing with.

tl;dr: Doesn't matter; TD isn't changed, just the EDDB stuff. Data update = bad for bug testing.
 
Okay, I figured out what this thing does.
It takes the first item, which has the highest profit/ton, and tries to fill the hold with it.
If the hold isn't full, it tries to fill what's left with the second item, then the third, and so on, until the hold is full, there are no more items, or there's not enough money to buy any more stuff.
^^^ This is where the recursion comes in.
It gets the total profit from this whole process, and checks to see if the total profit from this combination is better than the current "best" combination's profit.

It then goes on to the second item, repeating the entire process.
^^^ This is where the loop comes in.
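
For readers who haven't dug into tradecalc.py, here is a rough, hypothetical Python sketch of the shape being described (the names and structure are mine, not TD's actual code): an outer loop over which item to start with, and a recursive call to fill whatever space and credits remain.

```python
# Hypothetical sketch of the loop-plus-recursion described above; NOT TD's actual code.
def best_fill(items, space, credits):
    """items: (name, profit_per_ton, cost_per_ton, supply), sorted by profit/ton desc."""
    best_profit, best_load = 0, []
    for i, (name, profit, cost, supply) in enumerate(items):
        # Buy as much of this item as space, credits and supply allow.
        qty = min(space, supply, credits // cost if cost > 0 else space)
        if qty <= 0:
            continue
        # Recurse to fill whatever space and credits remain with the later items.
        sub_profit, sub_load = best_fill(items[i + 1:], space - qty, credits - qty * cost)
        total = qty * profit + sub_profit
        if total > best_profit:
            best_profit, best_load = total, [(name, qty)] + sub_load
    return best_profit, best_load
```

Written this way, the branching is visible: every item at a station can open a fresh recursion over all the items after it, which is why the observed recursion depth climbs with cargo size and why some stations take so long. It also only ever buys the maximum affordable quantity of each item, which is the "misses many possible solutions" point raised below.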

That does not sound like the optimal solution to the problem. Unless I'm missing something, the problem is a bounded knapsack problem (BKP), and two loops, one of which is recursive, is a brute-force way to solve it, and not a very good one, since it misses many possible solutions.

There are a number of published non-brute-force solutions to the BKP in Python; perhaps one of those could be adapted to TD and run much faster.

[Edit] See the solution 'A more Pythonic solution?' at the bottom of the page at https://codereview.stackexchange.co...ming-solution-to-knapsack-problem/20581#20581

[Edit] This is not a pure bounded knapsack problem, since the BKP does not take into account being able to afford the items that are selected. Still, what TD currently does is a brute-force solution.
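
For reference, here is the shape of the classic single-constraint bounded-knapsack DP that those published Python solutions implement. It is a generic sketch of mine, not taken from the linked page, and, per the edit above, it does not capture the credit constraint on its own.

```python
# Generic single-constraint bounded-knapsack DP (illustrative sketch only).
def bounded_knapsack(items, capacity):
    """items: (value, weight, bound) triples; returns the best achievable total value."""
    best = [0] * (capacity + 1)
    for value, weight, bound in items:
        # Split the bound into power-of-two lots so each lot becomes a 0/1 decision;
        # any quantity from 0..bound is then reachable using O(log bound) lots.
        lot = 1
        while bound > 0:
            take = min(lot, bound)
            bound -= take
            lot *= 2
            lot_weight, lot_value = take * weight, take * value
            for cap in range(capacity, lot_weight - 1, -1):
                candidate = best[cap - lot_weight] + lot_value
                if candidate > best[cap]:
                    best[cap] = candidate
    return best[capacity]
```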
 
YES, but…

What we have here seems to be a BKP with TWO constraints: capacity and credits. If we view the knapsack as the available credits and the weights as the costs, we are still constrained by ship capacity. If we view the knapsack as ship capacity and the weights as one ton each, we are still constrained by available credits. This is non-standard, and even Pisinger's minimal algorithm, which seems to be the state of the art for the exact BKP, will not address it directly.

Regarding Pisinger, I attached his paper earlier, and his ANSI C code for BOUKNAP.c is available via the Wayback Machine and is freely usable for non-commercial purposes, which this is. However, I could not get it to compile (some WinMain issues, etc.). Stack Overflow user Andy Jones gives a good overview of the algorithm, which may be enough for someone to code something similar in Python.

Mark, the code you link to seems to be for the 0/1 KP. Now, Martello & Toth (1990) do solve the BKP by expanding the collection of items into discrete subsets (of one or more) and turning the result into a 0/1 problem, but Pisinger (1994) points out that with a large number of items this becomes unwieldy (n > ~50). In the TD case we deal with supply in the thousands at times, so the 0/1 conversion is just out.
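
Purely to make the "two knapsacks" idea concrete, here is an illustrative DP sketch of my own over both constraints at once. It assumes the credit axis has been bucketed into coarse steps; with real balances in the millions of credits the table would be far too large, so treat this as a picture of the structure rather than a proposal.

```python
# Illustrative two-constraint DP (capacity in tons x credits in coarse buckets).
def two_constraint_fill(items, capacity, credit_steps, step_size):
    """items: (profit_per_ton, cost_per_ton, supply); each unit occupies one ton."""
    # best[c][m] = best profit using at most c tons and at most m*step_size credits.
    best = [[0] * (credit_steps + 1) for _ in range(capacity + 1)]
    for profit, cost, supply in items:
        cost_steps = -(-cost // step_size)        # round the cost up so we never overspend
        for _ in range(min(supply, capacity)):    # one pass per ton of this item
            for c in range(capacity, 0, -1):
                for m in range(credit_steps, cost_steps - 1, -1):
                    candidate = best[c - 1][m - cost_steps] + profit
                    if candidate > best[c][m]:
                        best[c][m] = candidate
    return best[capacity][credit_steps]
```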

[Edit]: We edit conflicted, it seems. It took me a while to write this post :)

I am obsessing a little bit (I found a cheap copy of the Kellerer et al. (2004) textbook Knapsack Problems at AbeBooks, and should get it within the next two weeks), but the way I see it now, we have three possible routes:


  1. We continue solving the exact problem on the entire universe of possibilities
    1. We live with Oliver's brute force method and suffer.
    2. We develop a more efficient algorithm
      1. We "merge" the constraints as per Pisinger (1995, section 1.3.1), and then use an efficient BKP algorithm like Pisinger (1994).
      2. We directly modify Pisinger (1994) by simultaneously fitting two knapsacks, one for capacity and one for cost, and take the "best" allocation that fits into both knapsacks. If we do this, we may be able to publish a paper :). I'm thinking "The Journal of Elite Economics and Intra-Galactic Trade".
  2. We solve the exact problem on a restricted universe
    1. By limiting the universe of trades through a profit-per-ton, supply, or age limitation, the brute-force method will converge, but be sub-optimal compared to the (intractable) solution for the entire universe.
  3. We solve an approximate problem on the entire universe
    1. We can limit the recursion to a maximum depth.
    2. We can forget the recursion and just fill based on the ordered list as provided (see the sketch after this list).
      • Here, we lose the case where 2, 3, 4, 5 are more profitable than 1, 2, 3; is this worth it?
    3. We implement Pisinger's Linear Bounded Knapsack problem (Pisinger 1994, Section 3) by finding the break item (under both capacity & credit constraint) and filling the ship up to that item and whatever of that item fits.
      • This too suffers from the case where a perturbation of the core set (dropping an item before the break point and replacing it with more of the break item and the items after it, which may be more profitable) would result in a more profitable load (e.g. {1, 2, 3, 5, 6} is more profitable than {1, 2, 3, 4}), but it should be more efficient than limiting the recursion, although we won't know until it's programmed.
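
As promised in option 3.2 above, here is a minimal sketch of the no-recursion ordered fill: a single pass over the list (already sorted by profit/ton, best first), buying as much of each item as space, credits and supply allow. Names are hypothetical; this is my illustration, not existing TD code.

```python
# Hypothetical sketch of option 3.2: no recursion, one linear pass over the sorted list.
def greedy_fill(items, space, credits):
    """items: (name, profit_per_ton, cost_per_ton, supply), sorted by profit/ton desc."""
    load, total_profit = [], 0
    for name, profit, cost, supply in items:
        if space <= 0:
            break
        qty = min(space, supply, credits // cost if cost > 0 else space)
        if qty <= 0:
            continue                  # can't afford this one; a cheaper item may still fit
        load.append((name, qty))
        total_profit += qty * profit
        space -= qty
        credits -= qty * cost
    return total_profit, load
```

As the sub-point under 3.2 notes, this loses the cases where skipping an expensive item in favour of several later ones pays better, but it runs in linear time.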

Thoughts?


References:
  • Kellerer, H., Pferschy, U., Pisinger, D. (2004). Knapsack Problems. Springer, Berlin, Heidelberg. ISBN 3540402861.
  • Martello, S., Toth, P. (1990). Knapsack Problems: Algorithms and Computer Implementations. Wiley-Interscience Series in Discrete Mathematics and Optimization. Wiley. ISBN 0471924202.
  • Pisinger, D. (1994). "A Minimal Algorithm for the Bounded Knapsack Problem." Technical Report 94/27, University of Copenhagen. http://www.diku.dk/~pisinger/94-27.ps
  • Pisinger, D. (1995). "Algorithms for Knapsack Problems." University of Copenhagen. http://hjemmesider.diku.dk/~pisinger/95-1.pdf
 
There is another option under 1:

We implement the use of an external integer-programming solver (like cvxpy; an example of a 0/1 formulation is linked here) and formulate the multi-constraint knapsack problem in a way that can be passed to the solver. The problem with this is that TD would then need numpy, scipy, and various solvers. The latter are often difficult to install (especially under Windows), and any small deviation from the precise problem will lead to who knows what. I'm not necessarily recommending it, but it is a possible enhancement to TD.
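
A rough sketch, under the assumption that cvxpy plus a mixed-integer-capable backend (e.g. GLPK_MI or CBC) is installed, of how that formulation might look; the function and variable names are mine:

```python
import numpy as np
import cvxpy as cp

def mip_fill(profit, cost, supply, capacity, credits):
    """profit/cost/supply: per-ton 1-D numpy arrays, one entry per commodity."""
    x = cp.Variable(len(profit), integer=True)       # tons bought of each commodity
    constraints = [
        x >= 0,
        x <= supply,                                 # bounded by station supply
        cp.sum(x) <= capacity,                       # every ton takes one ton of hold
        cost @ x <= credits,                         # cannot spend more than we have
    ]
    problem = cp.Problem(cp.Maximize(profit @ x), constraints)
    problem.solve(solver=cp.GLPK_MI)                 # needs a MIP-capable solver installed
    return problem.value, np.rint(x.value).astype(int)
```

The formulation itself is the easy part; as noted above, the dependency and installation burden is the real objection.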
 
Let me step back from the (nevertheless interesting) discussion of implementing knapsack algorithms for our particular situation and, at the risk of sounding like a middle-management executive, think outside the box. Aka vague handwaving.

Taking Eyeonus' description of how it operates.

It takes the first item, which has the highest profit/ton, and tries to fill the hold with it.

If it does fill the hold, well and good and that's the end of the calculation. We have a full load of the most profitable item, move on. (If it considers anything else, then it's broken).
This is the commonest scenario by a very long way, especially for smaller ships; it only really becomes a big deal on larger ships, where more and more different items may be needed to fill the hold.

So, in approximate terms, we only really have an issue with multi-commodity trades on large-hold ships.

Thus, when talking about what is or is not optimal in such scenarios, is it really worth the time (human and CPU) to generate the holy grail of a 100% optimal solution? Especially when, even if the multi-commodity trade A-->B is suboptimal, the next single-commodity trade B-->C is likely to be optimal.
Furthermore, how many credits' difference are we actually talking about in an approximated solution vs the optimal solution? Especially as this is a big-ship problem, does the commander really care all that much if one awkward-to-calculate trade isn't quite 100%? I can't honestly say I've seen any other tool generate multi-commodity trades at all, so even a suboptimal solution is likely to be best of breed when it comes down to it.

Reducing the workload to an approximated solution seems like a good idea, then; maybe limiting the recursion depth, as has been suggested, is the way to do it.
Alternatively, perhaps there are other approaches (and I really don't know what TD does under the bonnet, so I'm not claiming any of this is feasible), like reducing the scope of the problem. Break it into chunks: chances are that if you bung the first few most profitable things in the hold you've got a winner, so do that and then, if you really want to, worry about what space is left. Maybe find a sweet spot where running Oliver's full algorithm on a smaller hold is worth it, and approximate anything larger, which is therefore going to be more complex.

I suppose the thrust of what I'm saying here is that perhaps obsessing over this perfect solution really, honestly, isn't a good way forward; even if you do find the algorithm, will it be complete in a timely fashion?
Instead, I do think that concentrating on finding the best approximation which gives a big speed boost to the application with the least loss of accuracy is the thing to do.

Put yourself into the average user's shoes. Which do you care about more?
A: You make a few more credits from a perfect answer delivered slowly.
B: You get quick answers and can get on with the game.

In fact, thinking about playing the game generally - getting quick answers is more likely to generate credits than a lot of calculation to get perfect routes. Waiting for your trade tool to give perfectly calculated answers is time which could be spent trading - and time is money.
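
For what it's worth, the recursion-depth cap suggested above would be a small change to the hypothetical best_fill sketch from earlier in the thread; the cap simply limits how many different commodities a load may mix. Again, this is my illustration, not TD's code.

```python
# Hypothetical: same shape as best_fill, but a load never mixes more than `depth` commodities.
def best_fill_capped(items, space, credits, depth=3):
    best_profit, best_load = 0, []
    for i, (name, profit, cost, supply) in enumerate(items):
        qty = min(space, supply, credits // cost if cost > 0 else space)
        if qty <= 0:
            continue
        sub_profit, sub_load = 0, []
        if depth > 1:                    # stop branching once the cap is reached
            sub_profit, sub_load = best_fill_capped(items[i + 1:], space - qty,
                                                    credits - qty * cost, depth - 1)
        total = qty * profit + sub_profit
        if total > best_profit:
            best_profit, best_load = total, [(name, qty)] + sub_load
    return best_profit, best_load
```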
 
No arguments, Tromador, and that is what general options 2, and even better 3, suggest in my longer post. I'm all for an approximate answer which requires no recursion. It can be as simple as what I naïvely tried originally: forget the recursion, decrement the space/credits, and just run the for loop. This is similar to (and may be exactly, though I don't know yet) the Linear Bounded Knapsack problem (LBKP) approximation, where you just pack the hold with the most profitable elements until you exhaust space/credits. IF you could use fractional tons, it would be the optimal solution. The complexity comes in when we are restricted to integers. But the sub-optimal solution is, in general, very close to the optimal, and is solvable in linear time.

Based on the code, we can implement that as a "new" solver and make it the default, but allow masochists to use the recursive version should they desire, the way that Oliver's "bruteforce" algorithm can still be called by adjusting line 483 in tradecalc.py or by passing a fit variable at line 1103 of run_cmd.py.
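
A hypothetical sketch of that linear fill with fractional tons allowed, i.e. the break item (the first one that no longer fits entirely) taken fractionally. When only the cargo-space constraint binds, this is the textbook fractional-knapsack optimum alluded to above; since we cannot buy fractions of a ton, the integer version simply drops the fractional part.

```python
# Hypothetical sketch: the linear fill with a fractional quantity allowed for the break item.
def fractional_fill(items, space, credits):
    """items: (name, profit_per_ton, cost_per_ton, supply), sorted by profit/ton desc."""
    load, total_profit = [], 0.0
    for name, profit, cost, supply in items:
        if space <= 0 or credits <= 0:
            break
        qty = min(space, supply, credits / cost if cost > 0 else space)
        if qty <= 0:
            continue
        load.append((name, qty))         # qty may be fractional for the break item
        total_profit += qty * profit
        space -= qty
        credits -= qty * cost
    return total_profit, load
```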
 
IF you could use fractional tons, it would be the optimal solution.

Why can't you? Do fractional tons, round everything down to the nearest int (so we don't exceed supply) and do a (presumably quicker to complete) calculation for the few tons of space left.

Even so, if the suboptimal solution is much quicker, I think it would be preferable regardless. TD's downfall is speed, not precision.

(edit: On reflection, I may have misread the quote, as not referring to the previous sentence, in which case, never mind).
 
For completeness: we cannot buy 0.79 tons of an item. Rounding down the optimal linear programming (LP) solution does not necessarily lead to the optimal integer programming (IP) solution. It depends on the simplex, where a "corner" representing an integral solution in the parameter space sticks out further than the "best" option shaved down to the nearest integers. Pisinger (1995) gives an example in Figure 1.1, where the optimal LP solution is x_1…x_9 plus 0.35 of x_10, but the optimal integer solution is x_1…x_6 + x_9 + x_10 + x_12. Buying the extra 0.65 of x_10 means selling x_7 and x_8 and making up the remainder with x_12. Figuring THAT out is the heart of the complexity of IP. BUT, just buying x_1…x_9 and then filling up the rest with whatever you can is a CLOSE solution (in general, there are cases where it isn't) and that may be all we need, as you say.

Overall, I agree: I believe the vast majority of TD users would prefer a fast approximation to a slow exact solution. We do something similar now by restricting the universe through supply, per-ton, and age limits.
 
As an example, the optimal integer solution to the set of equations in the image below is actually at (0, 5); nowhere near the LP-optimal solution.

LP optimum: 2.25 * 5 + 3.75 * 8 = 41.25
Rounded down: 2 * 5 + 3 * 8 = 34
(2, 4) is not feasible: 2 * 5 + 4 * 8 = 42, but it violates the constraint, since 2 * 5 + 4 * 9 = 46 > 45
Integer optimum: 0 * 5 + 5 * 8 = 40!

[Image: Relaxation02.jpg]
 
BUT, just buying x_1…x_9 and then filling up the rest with whatever you can is a CLOSE solution
So close, I suspect (by the time you've got down to the 6th commodity), that the fiddling small change you are dealing with for the rest of the load won't be enough to own one Pu.[1]

in general, there are cases where it isn't
Example? How esoteric are such edge cases? (are they sufficiently common that edge case is doing them an injustice?)

Overall, I agree: I believe the vast majority of TD users would prefer a fast approximation to a slow exact solution. We do something similar now by restricting the universe through supply, per-ton, and age limits.
We can, via supplied arguments to the script, of course. But compare (for example) the speed at which EDDB can spit out a simple route with the time TD takes for a similar job. Now, I know from experience there are routes where EDDB shrugs and tells me "impossible!", whilst TD comes up with something, sometimes something pretty decent, but man alive it runs slowly.

Now, of course, this whole recent discussion was brought about by a couple of test cases we found where TD appears to lock up, or at least disappear so far up its recursive rear end that it can see its own tonsils, and so the natural and expected response is to look for and fix the bug.
After seeing this go back and forth, I think that's completely the wrong approach. More and more, I think what's needed is not something which fixes Oliver's brute-force algorithm, but something which spits out a "good enough" answer as speedily as possible.

[1] As if you didn't know - but anyway. http://hitchhikers.wikia.com/wiki/Triganic_Pu
 
Some equations and graphs

Sure, but if we go for the optimal LP solution, round off, and then add a third term (as in filling the rest of the hold with whatever), we can still make 40 in, oh, three ways I can see, and equal the optimal IP solution. That might be coincidental on the numbers you've chosen, but it illustrates the point of being *near enough*.
(or if this is wrong, then the numbers are too abstracted from the game for my comprehension).
 
The example is a 2D example when all you have is 2 options. I find it somewhat more difficult to envision 20-dimensional simplexes (which is what a 20-item optimization must traverse) than ningis ;)
 
And that is a good summary of the problem. Computing a solution in n-dimensional space is never going to be trivial or quick.
 