Double Buffered

A Programmer’s View of Game Design, Development, and Culture

Can EVE Evolve?

Posted by Ben Zeigler on October 5, 2008

Massively has an article about EVE Online’s server architecture and their plans for the future. The article is a great overview, and matches the notes I took from a GDC ’07 presentation CCP gave. I’m frankly impressed with CCP’s ability to reach a max concurrent of 40k players, but I really don’t think there’s much room left for improvement. EVE’s population is growing just slowly enough that they can keep up with it, but the fact that they’ve started putting in zone limits shows that even they realize this. Why does the EVE model work, and why can’t it go much farther?

First of all, they obviously have some solid programmers. Getting 40k concurrent on one database server is impressive, ESPECIALLY one based on SQL. My understanding is that they have some hardcore SQL programmers who write a lot of logic in highly optimized stored procedures. But they still needed to buy a military-grade static RAM hard drive to keep up, and I know they’ve been getting help from IBM and other companies to get performance as high as possible. So database performance is stretched near breaking, but isn’t actually the current problem.

The problem is the performance on their application servers, or SOL servers. These are the ones that handle combat and all of the player interaction, and these have always been incredibly lagged. But, how are they able to get a few thousand people in a zone in the first place, without resorting to the client-side heavy method used by WoW (WoW does almost everything client side, which is why hacked servers are possible)? The answer is that it’s heavily optimized for automation. For instance, ship movement is not synced every frame, but is instead sent down only when players actually change parameters. In normal movement, the client solves complicated differential equations to predict the location, which works perfectly when you’re mining.
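To make the "only send updates when parameters change" idea concrete, here is a minimal Python sketch of that kind of client-side prediction. It is not CCP's actual code or movement model: the `MovementUpdate` fields, the `predict` function, the one-dimensional position, and the simple first-order response equation dv/dt = (v_target - v) / tau are all assumptions made for illustration. The point is only that once the server and client agree on the equation and the last command, the client can compute the ship's position at any later time without further network traffic.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class MovementUpdate:
    """State the server sends only when a player changes course or speed."""
    t0: float        # server time at which the command took effect
    x0: float        # position at t0 (1D for brevity)
    v0: float        # velocity at t0
    v_target: float  # commanded velocity
    tau: float       # hypothetical time constant for how fast the ship responds


def predict(update: MovementUpdate, now: float) -> tuple[float, float]:
    """Return (position, velocity) at `now` from the closed-form solution
    of dv/dt = (v_target - v) / tau, with no per-frame sync needed."""
    dt = now - update.t0
    decay = math.exp(-dt / update.tau)
    dv = update.v0 - update.v_target
    v = update.v_target + dv * decay
    x = update.x0 + update.v_target * dt + dv * update.tau * (1.0 - decay)
    return x, v


# One message from the server, then the client predicts on its own.
update = MovementUpdate(t0=0.0, x0=0.0, v0=0.0, v_target=100.0, tau=5.0)
for t in (0.0, 1.0, 5.0, 30.0):
    print(t, predict(update, t))
```

A miner who sets a course and leaves it alone costs the server one message. A pilot in combat who changes heading every second invalidates the prediction every second, so each change becomes a new `MovementUpdate` that has to go out to everyone nearby.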

When does this model break down? It breaks down in the most complicated, hardest to optimize and yet most important part of the game: combat. During combat players are constantly changing movement and using powers, which kills all of their optimizations. I’m sure they’ve done work since launch, but interaction between players has always been deemphasized. From the GDC talk, I learned that the original version of combat in EVE was entirely deterministic, and it took a LOT of complaining from designers to make combat fun at all. So, the entire EVE architecture is designed to optimize highly parallel, noninteractive processes in the vein of a supercomputer. So how are they proposing they fix the performance problems with combat, which is the least parallel computing activity I know of?

As mentioned in the article, they want to fix this by… adding in a bunch of supercomputer features. The main thing they’re working on now is setting up InfiniBand network interconnects to make it easier to swap processes between physical machines. I guess the idea is to split up the over-taxed zones between several physical machines, but this is going to be fiendishly complicated. My understanding is that large fleet battles include a large variety of connections between players, so splitting these up across machines, even with a fast interconnect, means that anything involving connections between players on different physical machines is going to be slow. They’re also going to have to rewrite a large chunk of their code.
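A rough way to see why splitting one battle across machines hurts is to count how many player-to-player interactions would cross a machine boundary. The Python sketch below does that under loudly stated assumptions: every pair of ships interacts (a worst case for a dense fleet fight), ships are dealt out to machines round-robin, and `cross_machine_fraction` and `round_robin` are hypothetical helpers, not anything from CCP's design.

```python
from itertools import combinations


def round_robin(ships, machines):
    """Hypothetical placement: deal ships out to machines in turn."""
    return {ship: i % machines for i, ship in enumerate(ships)}


def cross_machine_fraction(assignment, interactions):
    """Fraction of interactions whose two ships live on different machines."""
    interactions = list(interactions)
    if not interactions:
        return 0.0
    crossing = sum(1 for a, b in interactions if assignment[a] != assignment[b])
    return crossing / len(interactions)


# Assumed worst case: every ship in the fight can affect every other ship.
ships = list(range(8))
all_pairs = list(combinations(ships, 2))
for machines in (1, 2, 4):
    fraction = cross_machine_fraction(round_robin(ships, machines), all_pairs)
    print(machines, "machine(s):", round(fraction, 3))
```

In this all-pairs case, adding machines makes a larger share of the interactions remote, not a smaller one. A smarter placement than round-robin helps only if the battle has natural clusters of ships that mostly interact with each other.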

Parallelizing multiplayer combat across different processes and physical machines is an insanely complicated task, and I frankly don’t have much confidence that CCP will actually be able to do it effectively. I could be proven wrong; we’ll see if EVE is still having horrible combat performance problems in a year.

6 Responses to “Can EVE Evolve?”

  1. Noah said

    I think the basic model of their SOL nodes will hold up well for the foreseeable future. Stackless uses a similar concurrency model to Erlang, and those guys have been doing automated task migration for years (mostly to allow for hardware failure; it just switches to a backup CPU seamlessly). Stackless already supports [de]serializing tasklets, so adding support to the VM to transparently move them around should be relatively easy. MOSIX (and openMOSIX) has some pretty good algorithms for knowing when to initiate a relocation and such, not that I doubt CCP could figure it out. I think a harder problem will be DB consistency. They already have a lot of limits in the game as to how fast you can send commands; my guess is that they use a decentralized cache with an eventual consistency model. The further you try to scale a non-transactional system like that, the more likely people will find ways to exploit it. Improving the client-side prediction will also be hard, but I think that will come over time. Given how much they are pushing for Ambulation, I think a big part of their future strategy may be to draw people away from constant gate-camping (and other forms of PvP combat) and into more social activities.

  2. JZig said

    Yeah, I agree that moving self-contained tasklets between physical machines works fine, and there are many supercomputer solutions to such a problem.

    I just don’t think combat can really be an isolated tasklet, and be remotely efficient. There are too many interconnections of data for it to really be isolated in the way that would be easy to switch it around.

  3. CrazyKinux said

    Though I have a very limited knowledge of the technical aspects you’re all discussing, I’m finding this post and its comments very interesting.

    @Noah – You make a good point about influencing players (mostly the Carebears, Industrialists, RPers and Socializers) to stay in station through Ambulation. It’ll be interesting to see where they focus their upcoming expansions, with Midas and Ambulation not being PvP-focused.

    I’m looking forward to seeing if they’re able to pull this off, as they’ve been doing for the past 5 years.

    CrazyKinux

    P.S.: JZig, hope you don’t mind if I include a link to this post in my EVE Speedlinking on Friday?

  4. JZig said

    Sure, link away.

  5. YoMma said

    Interesting article. Always good to get the viewpoint of someone with real knowledge and experience of the way these things work.

  6. Whaledawg said

    “The answer is that it’s heavily optimized for automation. For instance, ship movement is not synced every frame, but is instead sent down only when players actually change parameters. In normal movement, the client solves complicated differential equations to predict the location, which works perfectly when you’re mining.”

    They did the same thing with Shadowbane and had the same results. In that game you did character movement by clicking on the ground and the server pathed you to the location. But if you got a bunch of people together for a battle it quickly became a slideshow.

    However, that game was primarily PvP. If all there is to do in your game is siege towns and that’s laggy as hell, you have a problem. At least EVE gives you a lot to do within their model.
