A decent machine for '98! Be careful with RAM: different editions of Windows support different maximum amounts of RAM, and I don't just mean 32 vs 64 bit either.
The Microsoft site has the details. For example, Windows 7 (64-bit) Home Premium can only use 16GB. I only mention this as it was something I had not considered.
Windows 8 is beginning to look like a good upgrade (bar the touch screen-a-like-xbox-tiles-thing.) Better searching, copying and no doubt lots of other background improvements. Hmmm something to ponder.
And it was only a couple of months ago that I finally retired my old ultraportable laptop: a Pentium 233 MMX, 80MB RAM, 7-inch screen, which I was using as my home mail/SSH/file server (hey, it's got a built-in UPS...). It came with Win95 but has been running a variety of Linux distros, and Gentoo for the last 10+ years.
I'll be upgrading and dual booting my main machine for Elite, so Windows 8 also has the advantage that the new System Builder license is a lot more forgiving of hardware changes than the Win7 OEM license (ie you can change the motherboard without having to prove to Microsoft that you're not a pirate) - very handy for those of us who are not buying an ultra-rig outright but are hoping to see how we get on and then upgrade as we go.
I understand the memory management is the big under-the-covers improvement from 7 to 8.
And to hark back to earlier posts, yeah, the O/S will use multiple threads even if games don't, so the original Windows NT for example had the memory heap routines split across about 4 or 5 threads (eg one thread was just there to zero pages, another coalesced free blocks etc). But most games will largely tend to do their own memory management using object pools and fixed size heaps to avoid heap management or garbage collection (for any games written in C# etc).
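To make the object pool idea concrete, here's a minimal sketch in Python. The Bullet and BulletPool names and their fields are made up for illustration, and a real engine would do this in C++ or C# with far more care, but the shape is the same: allocate everything up front, then recycle objects instead of creating and destroying them mid-game.

```python
class Bullet:
    """Hypothetical game object; the fields are illustrative only."""
    __slots__ = ("x", "y", "active")

    def __init__(self):
        self.x = 0.0
        self.y = 0.0
        self.active = False


class BulletPool:
    """Fixed-size pool: all objects are created once, then reused."""

    def __init__(self, capacity):
        # Pay the allocation cost once, at load time.
        self._free = [Bullet() for _ in range(capacity)]

    def acquire(self, x, y):
        # No allocation here, so nothing new for a garbage collector to track.
        if not self._free:
            return None  # pool exhausted; the caller decides what to do
        bullet = self._free.pop()
        bullet.x, bullet.y, bullet.active = x, y, True
        return bullet

    def release(self, bullet):
        # Ignore objects that are already back in the pool.
        if not bullet.active:
            return
        bullet.active = False
        self._free.append(bullet)

    def available(self):
        return len(self._free)
```

The pool never grows, so the worst case is a refused request rather than a surprise allocation (or GC pause) in the middle of a frame.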
And while I'm at it, hyper-threading is not so much "hardware based thread switching" - any thread switch between threads in the same process just involves switching register values including instruction and stack pointers. The point about hyper threading is that when a CPU stalls for a few cycles, say to wait for a cache-miss (or the FPU pipeline stalls, or there's a branch misprediction and the instruction pipeline stalls), then rather than stalling the entire pipeline, HT allows the core to switch to another thread and hopefully use other parts of the pipeline productively to make use of those lost cycles.
That is, one core pretends to be two. The two share a single set of execution resources, so they don't add up to two real cores, but the gain comes from the core using part of its pipeline to process another thread when some part of the pipeline is stalled, hence it's trying to make use of idle resources at a sub-core level (like a human reader swapping between 2 browser tabs while they wait for the next page to load in one rather than staring out the window instead).
Now in CPU intensive work (I used to work on large computational finance codebases that ran on grids of thousands of machines - we used to swap staff with games companies) you work hard at the lowest levels to avoid these kinds of stalls, unrolling loops (Duff's device is a classic example, even if it misfires sometimes) and turning computations around to avoid FPU pipeline stalls in particular. Summing an array of FP numbers the naive way is a great way to waste most of your clock cycles (13 out of 14 on some older CPUs, meaning your 2.8GHz chip was effectively running like a 200MHz processor). Similarly you use your own memory management not just to avoid GC interruptions, but to manage your cache alignment and maximise cache usage (don't kick your own data out of the cache if you can avoid it). And all that's before you get to SSE instruction sets, lock free threading and offloading work to the GPU...
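Here's a rough sketch of the summing point, in Python purely to show the shape of the code. The function names are mine, and Python's interpreter overhead will hide any pipeline effect, so don't expect a speedup from running this; the restructuring only pays off in compiled code (C, C++ and the like). The first function is just the 13-out-of-14 arithmetic from above.

```python
def effective_clock_hz(clock_hz, useful_cycles, total_cycles):
    """Clock rate you effectively get if only useful_cycles out of
    every total_cycles do real work."""
    return clock_hz * useful_cycles / total_cycles


def naive_sum(values):
    # One accumulator: each add needs the result of the previous add,
    # so a pipelined FPU sits waiting between them.
    total = 0.0
    for v in values:
        total += v
    return total


def multi_accumulator_sum(values, lanes=4):
    # Several independent accumulators: consecutive adds no longer depend
    # on each other, so in compiled code the FPU can overlap them.
    acc = [0.0] * lanes
    for i, v in enumerate(values):
        acc[i % lanes] += v
    # Combine the partial sums at the end.
    total = 0.0
    for partial in acc:
        total += partial
    return total


# 13 of every 14 cycles wasted on a 2.8GHz chip leaves about 200MHz of work.
print(effective_clock_hz(2.8e9, 1, 14))
```

One thing to watch: reordering floating point additions can change the rounding, so the two sums may differ in the last few bits on real data.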
Hence many carefully written bits of CPU intensive code will not see much if any benefit from HT as there may not be many stalls where it counts, whereas your more normal bog-standard code (written for a VM and JIT compiled) typically will see boosts from HT as it will stall quite a lot and HT will recover some of the waste cycles.
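If it helps, here's a toy model of that argument in Python. It is very much a simplification and not how real hardware schedules anything: each thread is a string of cycles, "W" for a cycle that needs the execution unit and "S" for a stall cycle (say a cache miss), and the core can do one "W" per cycle. The function names and the trace format are my own invention.

```python
def cycles_back_to_back(threads):
    # No HT: run each thread to completion in turn, stalls and all.
    return sum(len(t) for t in threads)


def cycles_shared_core(threads):
    # Toy HT: stalls of all threads tick down in parallel, but only one
    # thread may use the execution unit in any given cycle.
    pos = [0] * len(threads)
    cycles = 0
    while any(p < len(t) for p, t in zip(pos, threads)):
        cycles += 1
        issued = False
        for i, t in enumerate(threads):
            if pos[i] >= len(t):
                continue  # this thread has finished
            if t[pos[i]] == "S":
                pos[i] += 1  # a stall cycle elapses regardless
            elif not issued:
                pos[i] += 1  # this thread gets the execution unit
                issued = True
    return cycles


stally = ["WSSSWSSS", "WSSSWSSS"]  # bog-standard code, lots of stalls
tight = ["WWWWWWWW", "WWWWWWWW"]   # hand-tuned code, no stalls

print(cycles_back_to_back(stally), cycles_shared_core(stally))
print(cycles_back_to_back(tight), cycles_shared_core(tight))
```

In this model the stally pair finishes in far fewer cycles when sharing the core, while the tight pair gains nothing at all, which is the point about optimised code not benefiting from HT.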
So don't expect a 2-core-4-thread i5 (with HT) to run like a 4-core-4-thread (non HT) i5 on anything like an optimised codebase if multiple threads are involved (clockspeeds and cache size and microarchitecture differences etc notwithstanding).
And don't expect a 4-core-8-thread i7 (with HT) to offer much improvement over a 4-core-4-thread i5 (non HT) from the HT component - the differences there will be due to cache and clock speeds and microarchitecture improvements between generations.
Personally I stick with AMD chips anyway as I get more bang per buck building my own mid-market machines (X4 965 at the moment). In practice, I've been better able to do things like upgrade CPUs on the same cheaper motherboard, and put the cash saved into more RAM, SSDs, nicer monitors, and I'm expecting to be able to get an NVidia 650 Ti Boost in the new year for a little over £100 and I suspect that'll do me fine...
Sorry - that was only going to be a quick reply...