Star Citizen Discussion Thread v12

Source: https://youtu.be/TSzUWl4r2rU


My Amateur Summary:

Old Server Meshing idea scrapped:

Because it's crazy...




Replaced by the 'Replication Layer':
A system which will magically connect a ton of clients, a ton of servers, and the persistence database etc. And which is not crazy at all. Oh no.



A single server will have authority for any given entity:
Allocated on a first-come-first-served basis, and transmitting that simulation to other servers via the RL. Authority can migrate between servers depending on changing circumstances.



There will be multiple shards:
They've scrapped the single shard idea.




Bonus stuff:



---

Introduction: [20s]

  • OCS is a thing.
  • Dynamic entities are nested. "Additionally to the static hierarchy of object containers there are also all the dynamic entities which bring the universe to life. NPCs, an interactive vending machine, and of course players and spaceships. Most of these entities are made of a hierarchy as well. For example a player has this body and undersuit and armor attached to it, and they are all child entities of the player. The streaming system treats these mini hierarchies as streaming groups to make sure that an object like a spaceship is always streamed in as one unit."




'Streaming and Replication': [2m50s]
  • Streaming bubbles: "When a player connects we create a so-called streaming bubble around that player and object containers as well as streaming groups that are visible from the player's point of view are considered inside this bubble. Any object container that is inside the bubble will stream its content and any streaming group within the bubble will also be streamed in on the server and then replicated to the client."
  • "Entities are considered inside the streaming bubble if their projected screen size on a virtual 1080p plane is larger than 5 pixels based on distance of the player. So while a large object like a moon will be considered inside the bubble from far away, a small object like a ship will only be considered inside when it's much closer to the player."
  • "When the player starts moving across the universe, entities that leave the streaming bubble will become unbound and the replication layer will remove these entities from the client. Entities that enter the streaming bubble will get bound to the client, which causes the network layer to replicate these entities to the client, effectively streaming them in. We call this technique entity bind culling because streaming on the client is driven by the network layer binding and unbinding entities."
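
To make the two quotes above concrete, here is a rough Python sketch of the screen-size test and the bind/unbind step. It is only an illustration: the 60 degree field of view, the pinhole projection, and the function names are my own assumptions. The video only gives the "5 pixels on a virtual 1080p plane" rule.

```python
import math


def projected_pixels(diameter_m, distance_m, fov_deg=60.0, screen_height_px=1080):
    """Approximate on-screen size in pixels of an object of the given diameter.

    Assumes a simple pinhole camera with a vertical field of view of fov_deg
    (an assumed value, not stated in the video).
    """
    if distance_m <= 0:
        return float("inf")
    # Distance to the virtual screen plane, measured in pixels.
    focal_px = (screen_height_px / 2) / math.tan(math.radians(fov_deg) / 2)
    return diameter_m / distance_m * focal_px


def in_bubble(diameter_m, distance_m, threshold_px=5.0):
    """True if the object projects to more than threshold_px pixels."""
    return projected_pixels(diameter_m, distance_m) > threshold_px


def bind_cull(bound, visible):
    """Return (to_bind, to_unbind) as sets of entity ids.

    bound: ids currently replicated to the client.
    visible: ids that are inside the streaming bubble right now.
    """
    return visible - bound, bound - visible
```

With these assumed numbers a 500 km moon passes the test from 50,000 km away, while a 30 m ship fails at 10 km and only passes once it is within a few kilometres.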


'Problems to Solve': [4m38s]



  • "This model works quite well on the client, however it doesn't scale well on the server... The more clients we try to match to a given game server, the likelihood of a player being at every single location increases, and that basically nullifies the benefit of server-side streaming."
  • "So how do we solve this? The answer should be simple. Allow multiple instances of the game server to work together so they can split up the work. Well it's not quite that simple."

'Architecture': [5m19s]

  • Current dedicated game servers

'Replication Layer': [6m4s]

  • "The goal of server meshing is to allow multiple DGS instances to work together and divide simulation cost between each server and the mesh. In the best case we can scale this to infinity by adding more nodes to the mesh."
  • "If you want to mesh these servers together we need to find an efficient way to synchronize data between each server. With our current architecture, depending on the vision of the simulation and the overlap, this would require a lot of synchronization points between each node. It's an exponential problem, as in the worst case each node would need to talk to each other node in the mesh, severely limiting our ability to scale it."
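
A quick back-of-the-envelope for the worst case described above, where every node talks to every other node. The function names are mine. Strictly speaking the number of pairwise links grows quadratically rather than exponentially, but the scaling problem is the same. For contrast, the second function counts links when each node holds a single connection to a central layer.

```python
def full_mesh_links(nodes):
    """Pairwise connections when every node talks to every other node."""
    return nodes * (nodes - 1) // 2


def hub_links(nodes):
    """Connections when each node talks only to one central layer."""
    return nodes


for n in (10, 100, 1000):
    print(n, full_mesh_links(n), hub_links(n))
```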


[Otherwise known as the 'Chris is a Moron' slide ;)]

  • "To solve this issue we are separating simulation and replication."
  • "The replication layer has two major functions. It holds the state of every entity in memory, and replicates the state to clients but also to server nodes."
  • "I said server nodes because in this setup the traditional dedicated game server becomes a game server node. This server node connects to the replication layer, very similar to a client, and only a subset of entities are replicated to that server node."



  • "Replication to server nodes is controlled by the network bind culling algorithm that we saw earlier, and it's driven by streaming bubbles, and it works very similar to how it works on clients."
  • "The server node has certain[?] streaming bubbles assigned to it which will cause the replication layer to replicate entities from these streaming bubbles to the server node."
  • "Contrary to a player's client the server node has the additional responsibility to execute server-side authoritative code for those entities. Controlling AI, doing damage calculations, etc etc. The result of the simulation is then written back from the server node to the replication layer, and from there it is replicated to all connected clients and other server nodes."



  • "Since streaming bubbles can overlap entities may be replicated to multiple server nodes, exactly the same way how they are currently replicated to multiple clients if players are at the same location. To avoid two server nodes trying to simulate the same entity only one server node can have authority over any given entity, and only that server is allowed to write entity state back to the replication layer. This is usually the first server node who replicated the entity and other server nodes will only run client code on those entities."



  • "Authority can transfer between server nodes. For example if an entity leaves the streaming bubble of the current authoritative server it is then transferred to the next server node that has this entity currently streamed in. Further, authority can be transferred between server nodes on demand in order to load balance the mesh."
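
Here is a minimal sketch of the authority rules quoted above: the first node to receive an entity gets authority, only that node may write state back, and authority moves to the next node that still has the entity streamed in. The class and method names are hypothetical, and on-demand transfers for load balancing are left out.

```python
class ReplicationLayer:
    """Toy model of single-writer authority (all names are assumptions)."""

    def __init__(self):
        self.replicated_to = {}  # entity id -> node ids, in replication order
        self.authority = {}      # entity id -> node id allowed to write
        self.state = {}          # entity id -> last written state

    def replicate(self, entity, node):
        nodes = self.replicated_to.setdefault(entity, [])
        if node not in nodes:
            nodes.append(node)
        # First come, first served: keep any existing authority.
        self.authority.setdefault(entity, node)

    def unreplicate(self, entity, node):
        """Called when the entity leaves this node's streaming bubbles."""
        nodes = self.replicated_to.get(entity, [])
        if node in nodes:
            nodes.remove(node)
        if self.authority.get(entity) == node:
            if nodes:
                # Hand over to the next node that still has the entity.
                self.authority[entity] = nodes[0]
            else:
                del self.authority[entity]

    def write_state(self, entity, node, new_state):
        """Accept a write only from the authoritative node."""
        if self.authority.get(entity) != node:
            return False
        self.state[entity] = new_state
        return True
```

For example, if nodes A and B both have a ship streamed in and A got it first, a write from B is rejected until A drops the ship, at which point B takes over.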

'Shards & Persistent Streaming': [9m8s]

  • "Since we now mesh multiple server instances together to simulate a shared state of the universe we no longer call this instance but instead we call it shard."
  • "A shard is still a unique version of the universe, and we still have multiple shards running in parallel. However, the server mesh will lift our current hard limit of 50 players and it will enable us to steadily increase the number of players we can support within one shard."
  • "There are some fundamental differences between a shard and an instance. And for this, we need to take a closer look at the replication layer and talk a little bit about persistent streaming." [Chin emote here...]
  • NB there seems to be a replication layer for each shard, going by this graphic:



  • "The entire state of the universe is stored in a graph database. We call this entity graph and it's an evolution of the original iCache."



'Persistence': [11m13s]

  • Benoit on the graph format for storing data in the entity graph etc. Blah blah.
  • "Our objective is to be able to save the state of the replication layer, which includes all entities in a given universe shard, in order to provide a truly persistent world, where actions you take as a player can influence environments in the game world, permanently."
  • "Each of these entity nodes holds properties with regard to what the entity represents in the game. What class of object it is, the item type, its legal owner, orientation, and of course, the very precise physical location within the game world."
  • Much blah about how the graph system is so much better than columns. Easy to serialise & transfer, data won't be lost etc.
  • "The system retrieves a constant ordered stream of mutations from the replicant scribes that are part of the replication layer and are enqueued in durable queues, to ensure that no message is lost even if the service is unavailable or paused."
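
A toy illustration of the "ordered stream of mutations in a durable queue" idea from the last quote. Everything here is an assumption for illustration: the names are made up, and a plain in-memory list stands in for real durable storage. The point is only that the writer remembers how far it has read, so mutations queued while it was paused are applied later, in order.

```python
class MutationLog:
    """Append-only, ordered log of entity mutations (in-memory stand-in)."""

    def __init__(self):
        self._entries = []

    def append(self, entity_id, prop, value):
        seq = len(self._entries)  # sequence number doubles as the offset
        self._entries.append((seq, entity_id, prop, value))
        return seq

    def read_from(self, offset):
        return self._entries[offset:]


class GraphWriter:
    """Applies mutations in order and remembers how far it has got."""

    def __init__(self):
        self.entities = {}    # entity id -> {property: value}
        self.next_offset = 0  # first log entry not yet applied

    def drain(self, log):
        applied = 0
        for seq, entity_id, prop, value in log.read_from(self.next_offset):
            self.entities.setdefault(entity_id, {})[prop] = value
            self.next_offset = seq + 1
            applied += 1
        return applied
```

If the writer is down while a bottle is moved twice, both mutations wait in the log, and the next `drain` leaves the bottle at its latest position.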

'Seeding': [19m10s]



  • So the Replication Layer [RL] starts with every entity loaded into it in their default / 'initial' state. (Universe / solar system / planets / station etc. From unstreamable conceptual entities, to static stations, to dynamic items like doors, ship components, bottles etc)
  • "As you play the game and go about with your ship, your playing character entity moves from location to location, getting attached to new zones as you travel. Your player aggregate is itself part of the entity graph, and your location and state are persisted by the replication layer scribes to the entity graph of your given territory."
  • "When you interact with dynamic objects and their properties change, the state of that entity will not persist until this instance of the database is undeployed." [So dynamic objects only saved in place periodically etc?]
  • "There are in fact multiple copies of the universe that are seeded at a given time. We call those shards. Each shard is a unique copy of the game world, complete with all of its entities and unique states. Think of it as an alternate universe." [Oh well, so much for Clive's single shard universe]
  • "Dynamic entities that have been modified in each dimension will have different states. The bottle on the bar was moved or the door was destroyed, might not be in the same state between shards."
  • A way to gain scalability as playerbase grows. "Even if each shard database is itself clustered and can grow substantially past a single machine there is a point where multiple clusters are needed." [IE one version can't hold all the entities, seems to be the admission, and/or handle all the info networked between multiple machines, once it grows beyond a certain capacity?]
  • "In order to select the correct universe shard for you to play on, using multiple data points like your friends' location, your active party, your last game session, and/or which shards still have items on it that you own. This is to ensure as much as possible that you end up on the same shard you expect to be as a player, in order to provide a seamless game experience." - "It would be terrible if you lost items you used when you were in a given shard versus another, or if your character was bound to a shard forever. To alleviate this, the system includes the concept of stowed and unstowed entities."
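
The shard selection quote above lists the data points but not how they are combined. Purely as a guess at the shape of it, a scoring function might look like the sketch below. The weights and all names are invented for illustration and nothing in the video confirms them.

```python
def pick_shard(shards, party_shard=None, friend_shards=(), last_shard=None,
               item_shards=()):
    """Pick the shard with the highest score from the listed data points.

    The weights are invented. Ties go to the earliest shard in the list.
    """
    friends = list(friend_shards)

    def score(shard):
        total = 0
        if shard == party_shard:
            total += 8                    # active party
        total += 2 * friends.count(shard)  # each friend on that shard
        if shard == last_shard:
            total += 3                    # last game session
        if shard in item_shards:
            total += 4                    # owned items left on that shard
        return total

    return max(shards, key=score)
```

With these made-up weights, a player whose party is on one shard lands there even if their last session was on another.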

'Benefits and Challenges' [25m39s]

  • Synchronisation issue dumped: Don't have the issue of synchronisation between servers [in the old plan etc]. "Each server node has one single connection to the replication layer which is used to push and get updates for entities of interest."
  • Servers like current clients: "The second advantage is that the same streaming and replication logic that we already use for clients can be applied to servers and that server nodes will only stream in a small area which will greatly increase performance"
  • 'Static to start...': "The first version of this technology will contain a static server mesh instead of the fully dynamic mesh that we saw earlier. The static mesh assigns server nodes to predefined sections of the solar system. This will reduce the amount of authority transfer that game code has to address in this first release." [GULPS]
  • Loads of current stuff will need redesign: "So you can imagine that this will affect things like missions that currently are spawned locally on the game server. These now need to be spawned globally within the shard and also persist their state. So all services that are attached to missions, whether it's the Quantum system in the back end, or the Quasar tools, need to know about the concept of a shard. This also goes deep into like things that are mechanical, like you know getting global chat to work on a server. That concept now needs to be extended to the shard where this will probably push us to implement this as a location-based chat for example. And so many teams in the company now need to change their feature to take into account the meshing technology that's behind it."
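
For the static mesh mentioned in the list above, a sketch of what "server nodes assigned to predefined sections" could boil down to: a fixed lookup table, with an authority transfer only when an entity crosses from one node's section to another's. The region names are borrowed from locations mentioned in this thread and the node ids are made up. The real partitioning has not been described.

```python
# Hypothetical fixed assignment of regions to server nodes.
STATIC_MESH = {
    "Orison": "node-1",
    "Port Olisar": "node-1",
    "ArcCorp": "node-2",
    "New Babbage": "node-3",
}


def node_for(region):
    """Look up which server node simulates a region."""
    return STATIC_MESH[region]


def count_transfers(route):
    """Count authority transfers for an entity travelling along a route."""
    nodes = [node_for(region) for region in route]
    return sum(1 for a, b in zip(nodes, nodes[1:]) if a != b)
```

In this toy table a trip from Orison to Port Olisar to New Babbage costs one transfer, because the first two stops share a node.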

---

SUMMARY OF SORTS:

So the Replication Layer...
  • Is connected to every player client in that shard.
  • Is connected to every single server in that shard.
  • Is connected to the entity graph data store etc. (Of the precise location, orientation etc of all of the things...)

Conclusion: The replication layer is the new promised land.
Sooo...will it work? I showed it to a few of the more technically minded of the cows, they just looked blank and crapped a bit 🤷‍♂️
 
Depends, this looks like a message bus, a pretty typical element to introduce when many distributed processes need to publish / receive from many other processes, in real time (or close to it). It is a common pattern for state replication among multiple systems.

Yeah the principle seems fine. My amateur cynicism kinda lies in these areas though:

  • The 'dynamic' implementation of this would still involve loads of authority swapping, as bubbles intersect and players move between them. Tricky stuff.
  • Even in a 'static' version, there's still a lot of twitch data being shuttled back and forth between servers when at scale. (Org X is in a firefight with Orgs Y & Z, all handled by their own servers due to hitting the player cap. Servers X, Y & Z are all flinging twitch data between themselves. Via the intermediary of the RL. That's a lot of transfer...)
  • How does the RL itself function? Will it bog down when reaching max throughput etc?

Essentially it still feels like there's a lot being asked of that black box. I don't see it being able to perform 'Server Meshing' style miracles, as it's been painted historically by the almighty thumb. (1000s in the same vicinity etc).

But doing the static variant, and being better than what they've got (with some unique downsides of its own)? Ay, maybe ;)
 
It will. But it won't. It is really not a "deep dive". The level of detail is on par with whiteboard drawings you do when you are in an early conceptualisation meeting of a system that you are building.

Early decades!
The idiotic comments under the YT vid indicate that either the fans have no idea what they're talking about or it's just auto generated fake. I'd prefer the autogenerated fake, because it'd mean there aren't so many stupid morons out there.
 
It will. But it won't. It is really not a "deep dive". The level of detail is on par with whiteboard drawings you do when you are in an early conceptualisation meeting of a system that you are building.

Early decades!
Which is about where Turbulent are, I suppose. Having inherited the task late in 2020, they scrapped all of CIG's nonsense and started from scratch. Not a position I'd envy... taking apart a Franken-engine of a project to insert something that should have been done at the start. I wish them luck :)
 
New armour set for the 400i owners...

gttqiajf6os71.jpg
 
So for tier 0, how will they define a location? Is it going to be a series of boxes in space, or will it be like ED does with their drop out/in of SC, where you clearly have some form of instance/shard transition?
 
Yeah the principle seems fine. My amateur cynicism kinda lies in these areas though:

  • The 'dynamic' implementation of this would still involve loads of authority swapping, as bubbles intersect and players move between them. Tricky stuff.
  • Even in a 'static' version, there's still a lot of twitch data being shuttled back and forth between servers when at scale. (Org X is in a firefight with Orgs Y & Z, all handled by their own servers due to hitting the player cap. Servers X, Y & Z are all flinging twitch data between themselves. Via the intermediary of the RL. That's a lot of transfer...)
  • How does the RL itself function? Will it bog down when reaching max throughput etc?

Essentially it still feels like there's a lot being asked of that black box. I don't see it being able to perform 'Server Meshing' style miracles, as it's been painted historically by the almighty thumb. (1000s in the same vicinity etc).

But doing the static variant, and being better than what they've got (with some unique downsides of its own)? Ay, maybe ;)

(Usual disclaimers apply, I have never built an MMO backend)

You are generally right in your intuition. A "layer" like this is a black box, designed specifically for this pattern and will have a lot of internal complexity abstracted away. Fortunately this means that CIG does not have to build their own version, because honestly, they could not. There are open source tools, like Apache Kafka, that can do it for them. There are also platforms / languages designed specifically for the telecom industry, like Erlang, that are focused on similar use cases - making sure that metric craploads of messages are delivered across complex networks in real time. AWS and other cloud providers have their own implementations as well.

Will they scale? Yes, they can scale to deliver crazy amounts of messages per second.

There is a tradeoff, of course - the ordering of messages. Scaling a system of this kind "horizontally" means that a single "topic", or "a stream of messages about one thing, however that thing is defined", gets "partitioned", or divided into several parallel substreams. There can be many partitions in each topic, and the more there are, the more "parallelised" the entire system is in terms of publishing and consuming. It can process more messages per second.

Unfortunately, the ordering of messages is guaranteed only on the level of a single partition. Sometimes this does not matter, sometimes it does. It won't matter if messages get delivered out of order but they are about two players that have no apparent influence on one another. I am walking in Port Olisar, enjoying the view, you are getting killed by an elevator on Arccorp - it does not matter much if those events are registered by the rest of the system in order. But if we are in a dog fight or trying to snipe one another, suddenly ordering means a lot. Were you faster than me in pulling the trigger? Did your ship cross path with my missile? Did I push a cargo cart fast enough towards the ramp of your ship to destroy it before you managed to take off? This is where you need to maintain stricter ordering. The "denser" the situation is (are we taking part in a 1000-player Battle of the Random Settlement?), the trickier it is as more messages need to be distributed between clients and various processes constituting the backend.
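
A small Python sketch of the partitioning idea described above, in the spirit of key-based partitioning in message buses such as Apache Kafka. The function names and the use of `zlib.crc32` as the hash are stand-ins of mine, not any real library's partitioner. Messages with the same key always land in the same partition and keep their relative order there. Nothing orders messages that sit in different partitions.

```python
import zlib


def partition_for(key, num_partitions):
    """Stable mapping from a message key to a partition index."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def publish(messages, num_partitions):
    """Distribute (key, payload) messages across partitions, in arrival order."""
    partitions = [[] for _ in range(num_partitions)]
    for key, payload in messages:
        partitions[partition_for(key, num_partitions)].append((key, payload))
    return partitions


messages = [
    ("dogfight-1", "A pulls trigger"),
    ("port-olisar", "player enjoys the view"),
    ("dogfight-1", "B pulls trigger"),
    ("dogfight-1", "missile hits B"),
]
partitions = publish(messages, 4)
dogfight = partitions[partition_for("dogfight-1", 4)]
print([payload for key, payload in dogfight if key == "dogfight-1"])
```

The catch is choosing the key. Everything that must be ordered together (a whole dogfight, say) has to share one key and therefore one partition, which is exactly where the parallelism runs out in a dense battle.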

This is why I always said that SC, as advertised, was impossible. Too many ordered messages in real time would be required for it not to feel completely janky and unreliable.

Other games sacrifice various aspects like the number of objects tracked (magical inventory, nothing "physicalised", simplified physics), use hit scanning instead of fully modelled ballistics, do not offer real time destruction (only pre-baked instances), or, like EVE, slow down time so that every client can catch up with the action during a large battle. The last option is not viable for SC, of course.

Interestingly, the fact that CIG are now advertising this "replication layer" does not actually say anything about whether they scrapped previous concepts or not. Whatever they designed earlier, they would still, with high probability, need to use such a layer, as it is the only viable pattern that avoids exponential growth in the number of direct connections between various processes.

Either the message bus was always there and they have failed to mention it as obvious, or it was not included in the previous concept of server meshing. To my best judgement, the second option would mean they had absolutely (and I mean, absolutely) no idea what they were doing.
 
Cheers, very interesting :)

Either the message bus was always there and they have failed to mention it as obvious, or it was not included in the previous concept of server meshing. To my best judgement, the second option would mean they had absolutely (and I mean, absolutely) no idea what they were doing.

That's the impression Reindell gives by calling it a "new layer" etc:

Instead of just meshing multiple dedicated game servers together, and have them synchronize state between each other, we are introducing a new layer called replication layer.

But I think what's more telling is that we've heard about every other element here before. Every arcane aspect from the 'Entity Graph' (né iCache), to Entity Bind Culling has been flagged and discussed ad nauseam by CIG. But this is the first time we're hearing about the 'Replication Layer'.

(So on the competence front, draw your own conclusions ;))
 
So for tier 0, how will they define a location? Is it going to be a series of boxes in space, or will it be like ED does with their drop out/in of SC, where you clearly have some form of instance/shard transition?

The older theorising had it as being larger areas like planets etc initially:

Clive Johnson said:
The first version of server meshing will be where servers are assigned to fixed regions of space, such as a planet or a moon (for future reference we're calling this initial version static server meshing). The boundaries between these servers will be so far out into space that it is unlikely anyone will be able to pass through them other than when quantum travelling between locations. That will reduce what types of entities can pass between servers, and how often they will do so. It will also give us a good chunk of time to perform the server transition so that we can monitor the performance of this process and make improvements without it being too disruptive to gameplay. (source)

But Clive seems to have disappeared now, so have Chris's version instead...

Croberts said:
So the key with Server Meshing is it, this is allowing multiple servers to handle different areas inside a star system. Uh, and, even later stages of it, multiple servers working together to handle even one area. Like you know an Orison or a New Babbage or something like that (source)
 
Sooo...will it work? I showed it to a few of the more technically minded of the cows, they just looked blank and crapped a bit 🤷‍♂️
I showed it to a few devmates who are in the gaming industry, one who's a server developer, and they all said something like "yeah, that looks fine and doable, because it's very much like many games and other online services already do. And the time to get it up and running, modify existing code and assets to use it, dunno... 1-2 years dependent on how much they've done so far? But the way they talk it sounds like they've just changed from an old plan to this new one, so I'd guess 18 months."
 