A Software Engineer Reacts To Star Citizen's Graph Database Woes

Views: 4,216

Free Dissociation - Kevin Riggle


In which I attempt to provide some background context from a software engineering perspective to the recent post from Chris Roberts about the struggles Star Citizen is currently having with the graph database underneath the Persistent Entity Streaming service.
CitizenCon presentation on persistent entity streaming from 2021: • CitizenCon 2951: Serve...
‪@BoredGamerUK‬ video with Chris Roberts's response talking about the graph database: • Chris Roberts Promises...
The podcast episode I mention in the video:
• HE LOCKED PLAYERS OUT ...

Comments: 61
@StarCitizenJorunn 8 months ago
I enjoyed the perspective. Most of us are NOT experts in the field and it already sounded pretty complex, so having more outside folks that are in that field saying "CIG has bitten off 5 of the hardest problems in computer science to solve all at once" gives credence to how long this is taking and how hard it is.
@Euler0311 7 months ago
Now that CitizenCon has passed and we have seen the first working demo of server meshing, I would love to hear a follow-up discussing what was shared. Cheers!
@nisserov 6 months ago
This!
@free-dissociation 6 months ago
Here you go! Thanks for the encouragement to do this :) kzbin.info/www/bejne/sGrdoppjYrKhpas
@adriankoch964 2 months ago
Compute problems nowadays are not about "make x work" but "make x work at a certain scale" and choosing a technology where the compute effort scales closer to linear instead of double exponential or so. So going from 100 to 1000 players with a linear "cost" would mean ok I need 10x servers for this, but double exponential would mean very quickly each player added would require one extra server in terms of compute and interconnect bandwidth. If the complexity effort of the solution is double exponential, for example, going from 100 to 1000 players would cost the same compute/bandwidth effort as 100,000 players in a linearly scaling solution. While it is great to see they finally have a
@Sypheara 6 months ago
My left ear enjoyed this
@Mercurio-Morat-Goes-Bughunting 6 months ago
Hi Kevin, yet another Star Citizen enthusiast here. I did a lot of private research, a couple decades back, into what we now call "Graph Databases". I had similar objectives to CiG in terms of data access speeds, relational complexity, and data volume, which is why I decided to write my own system from scratch, and that had to start with a lot of experimental research. My experience with companies in the business of database development is the assumption that either volume or relational complexity is low, which is why their products are always as slow as a wet week when subjected to real world data volumes. A big part of the problem is that, as relational complexity increases, maintaining the performance of a linear data stream on disk becomes geometrically more complex and time consuming wherever you are using variable length fields and, by extension, records. It's doable, but using DDR, SSD or HDD tech to store graph data is like trying to mow the lawn with a pair of secateurs. That's also doable, it's just that a lawnmower is a much better tool for the job. Neural network memory and neural network SSDs are the right hardware for a graph database. Sadly, no-one anticipated that application demand for graph data would break the linear data storage paradigm, and the neural network card is just another IT industry myth, not unlike the quantum chip. Persistent Entity Streaming is a separate issue from graph data structures. The problem being solved here is transacting an account which is open to another entity (i.e. being accessed by more than one entity at a time, e.g. client and server modifying the same record simultaneously). This comes back to the tremendously high maintenance cost of graph data when forced into indexed linear streams. So, what you need to do is defragment the data (just like defragging an old-fashioned hard disk) during down-time slots which aren't predictable or individually long enough to keep up.
Taken together, there is enough time, but you can't break up something like this, so what to do? The solution is to run a pair of databases which allow one to be defragged while the other is updated, with overflow transactions logged to a transaction database which updates the offline database between defrag/"maintenance" tasks. When the offline database update status catches up to the online database status, transactions are switched to the transaction database while the offline and online databases are switched, then the completely maintained database goes online while the former online database goes into a maintenance/update cycle until it catches up with the current online database update cycle. This keeps transactions in play even when something goes wrong at the back end or one of the back-end server layers drops out, and it also keeps the graph data optimised for access without delaying transaction processing. Then we have server meshing, which is a separate beastie from both persistent entity streaming and the graph database. This is where we talk about in-game server boundaries and, to be sure, functional Persistent Entity Streaming is very important for entities crossing server boundaries for reasons similar to what you state but, wait, there's more. Server boundaries are also a source of actor bugs, such as non-player actors failing to respond to players inside their aggro radius because the player is on the other side of a server boundary and no-one thought to code for that. MindArk's Entropia Universe had this bug for well over a decade and, last time I logged in, it was still a problem. Coaxing non-player actors into navigating across server boundaries is also rather challenging in view of the fact that even just propagating NavMesh across uGrids is quite difficult and requires manual "finalisation" to link across uGrids in the design tools for some of Bethesda's games (e.g. Fallout and Elder Scrolls franchises).
I think that a two way database with transaction/replication layer PES just might do the trick, in the case of Star Citizen.
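Editor's sketch of the two-database swap described in the comment above. This is a toy, single-threaded illustration in Python, not anything CIG or the commenter has published: the names `Store`, `DualStore`, and `maintain_and_swap` are hypothetical, and sorting the keys merely stands in for real defragmentation.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """One copy of the data; `applied` counts log entries already replayed."""
    data: dict = field(default_factory=dict)
    applied: int = 0


class DualStore:
    """Hypothetical two-copy store sharing one append-only transaction log."""

    def __init__(self):
        self.log = []            # append-only list of (key, value) writes
        self.online = Store()    # serves reads and takes new writes
        self.offline = Store()   # free for maintenance; catches up from the log

    def write(self, key, value):
        # Every write is logged first, then applied to the online copy.
        self.log.append((key, value))
        self._catch_up(self.online)

    def read(self, key):
        return self.online.data.get(key)

    def _catch_up(self, store):
        # Replay only the log entries this copy has not seen yet.
        for key, value in self.log[store.applied:]:
            store.data[key] = value
        store.applied = len(self.log)

    def maintain_and_swap(self):
        # Stand-in for defragmentation: rebuild the offline copy in key order.
        self.offline.data = dict(sorted(self.offline.data.items()))
        # Bring the maintained copy up to the current log position.
        self._catch_up(self.offline)
        # Both copies now sit at the same log position, so they can trade roles.
        self.online, self.offline = self.offline, self.online
```

A real system would also have to handle writes arriving during the swap and failures mid-replay, which this sketch deliberately ignores.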
@thirdworldrider6991 6 months ago
do the engineers at sc know this? you should tell them
@free-dissociation 6 months ago
Super-useful comment, thank you. Very good to hear from someone who's been deeper in the weeds on this than I have. What's neural network memory? Not something I've encountered before, and I'm curious what it's supposed to be, even if like the eternal dream of the quantum chip it remains vaporware ;)
@Mercurio-Morat-Goes-Bughunting 6 months ago
Hi @free-dissociation. Neural network memory is something researchers were playing around with 25 years ago. Instead of arranging data in a linear fashion with association by proximity, it assigns multiple, non-linear relationships between data at a hardware level to mimic the way natural neural networks hold information, such as the central nervous system or the brain. The problem is that it's a really poor fit for the traditional approach to data management and, particularly, project management for data solutions because it's poorly understood and there are a lot of magical expectations which come with the territory and miss the useful opportunities for application. More recent interest in graph database systems on increasingly demanding scales seems to have renewed interest in the area, though, and I think I heard this caught nVidia's interest over the past couple years.
@Mercurio-Morat-Goes-Bughunting 6 months ago
Hi @thirdworldrider6991. CiG recently ran face-first into the problem so I'm pretty sure they know. I just hope they find a solution I couldn't see.
@signalamplifier 6 months ago
@free-dissociation NN memory is basically approximating data with a NN, as far as I know. Check out NeRF, where they approximate 3D radiance fields with a NN. In the end, different data structures are used to approximate radiance fields right now, because NNs are too slow to evaluate.
@LostTerminalVideos 6 months ago
My left ear loved this!
@moonryder203 8 months ago
Hey Kevin, this was a very cool video on this subject. It seems CIG is building something very special that has never been done before. You seem excited about this and maybe you should reach out to CIG, they could use smart engineers like yourself to help build this massive project. 🙂 It is amazing and I look forward to the day this monster is fully working.
@sc_cintara 6 months ago
I had the same thought as you, regarding using a graph database like Neo4j. My next thought was that they don't really need a graph database for this; it is just a tree structure, not even a DAG. So perhaps they should go with a simple, very fast, very scalable NoSQL database like Cassandra. There are many NoSQL DBs that could do the job, but I have found Cassandra to be very reliable and it is extremely scalable as well as redundant. Since their trees are shallow and wide they can store one node with its children per key in Cassandra and do one request per level they walk. The main issue with Cassandra is eventual consistency; they would need to find ways to avoid locking and roll-backs as far as possible and do it manually in the game servers when it can't be avoided, but if you want wide scaling, you just can't get full ACID.
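Editor's sketch of the "one node with its children per key, one request per level" layout from the comment above. A plain Python dict stands in for the key-value store, so no Cassandra API is used or implied; the entity names and the `multi_get` helper are hypothetical.

```python
# Hypothetical layout: one row per parent key, holding its direct children.
# A dict stands in for the wide-column store; nothing here talks to Cassandra.
rows = {
    "zone:stanton": ["ship:1", "ship:2"],
    "ship:1": ["player:a", "turret:1"],
    "ship:2": [],
    "player:a": ["pistol:1"],
    "turret:1": [],
    "pistol:1": [],
}


def multi_get(keys):
    """Stand-in for one batched read; returns {key: list of children}."""
    return {k: rows.get(k, []) for k in keys}


def load_subtree(root):
    """Walk the tree breadth-first, issuing one batched read per level."""
    levels = []
    frontier = [root]
    requests = 0
    while frontier:
        levels.append(frontier)
        fetched = multi_get(frontier)
        requests += 1
        # The next level is every child of every node in the current level.
        frontier = [child for key in frontier for child in fetched[key]]
    return levels, requests


levels, requests = load_subtree("zone:stanton")
print(levels, requests)
```

With a shallow, wide tree the number of round trips tracks the depth rather than the number of entities, which is the property the comment relies on.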
@Sypher474 6 months ago
"limits of the available technology" is a quote that jumped out at me. CIG have pretty regularly found that edge in many other areas, and pushed past it by building their own solutions, so I'm relatively confident they will continue to do exactly that in this database realm. Definitely not my wheelhouse, but they have basically infinite money to throw at the problem, and what seems to be an insane team. We'll see 🤷‍♂
@Richard0110 8 months ago
Thank you for the technical details. CIG seems to have tried to implement the theoretical solution without understanding the practical limitations. They will have a hard road ahead of them if they keep doing things this way.
@free-dissociation 7 months ago
I suspect they understood some of what they've bitten off, but I'm not sure if they understood the whole thing. They, or the people they're working with at Neo4j, are going to get some fascinating conference talks or papers out of this if it works.
@cvsmith122 7 months ago
Honestly can't watch this as the audio is only on the left channel; please remix the audio to a dual mono or mono mix.
@theamericanaromantic 7 months ago
Great stuff. Please do a post-CitizenCon 2023 video!
@machoalright 6 months ago
You uploaded this with mono sound? And it's also very soft. I am just a system admin, with only 25 years of experience. But man.. you should be able to figure that one out.
@Lokislav 6 months ago
A Software Engineer Speaks in My Left Ear For 20 Minutes Causing a Headache OUCH BRO
@sc_cintara 6 months ago
The community really needs a youtuber to go through the SQ42/PU monthly reports engineering section...
@StuartGT 8 months ago
Really interesting video, thanks for that! I really hope you get your chance to chat with the server/database devs at CitizenCon, and look forward to a follow-up video with your thoughts from those chats
@RogerValor 6 months ago
I think if your stereo craps up you should re-encode the audio in mono for a quick win :D Luckily I have a mono button on my Voicemeeter Banana. I find your insight interesting, especially since I have followed techniques to optimize spatial data since that first day I wrote a library to load Ultima Online maps as a kid, but my day job became more about working with relational DBs, so I only followed simulating solar systems as pet projects, and never thought about graph DBs at scale. However, to me it still seems like falling over the other edge of the cliff to go into arbitrary nesting as the main database structure, while in reality they could probably build in-between layers as caches, as the scaling is not *entirely* arbitrary. While it impresses me how they try the idealistic approach in many problems, there is a skill in knowing how to make a good layered abstraction for a domain problem, so I am a bit critical about some of what they said or do, as it sounds to me less like the smartest garage-genius way and more like the most businessy way to solve the issue. But I know by now that you can achieve the same result in both ways, and also fail in both ways, so let's hope for their success, given what they have is already impressive.
@Ic3q4 19 days ago
do you know which specific one they use? bc i can't remember
@Darthgenius 7 months ago
Thanks for the video, hope you enjoyed citcon.
@JohnMcclaned 6 months ago
memgraph is the only thing i can think of that tries to do graph in real-time
@RN1441 7 months ago
Listening to this video makes me think that the recent characterization by Roberts, 'The database isn't performing well enough, especially when we do real time backup', is an incomplete one. It sounds like they are relying on it to be a real time tool when it is not.
@free-dissociation 7 months ago
Yeah that's my worry. I could be wrong about their architecture, and we'll find out if it works in due time, but if my understanding is right, then they may wind up fully stuck on this and need to radically rethink their approach.
@SETHthegodofchaos 7 months ago
I don't think that's right. The backup would run maybe once a day like explained in the video. But of course your DB still has to be available, so in that sense it would still have to do that backup in "real-time"? But when the backup locks entries and the game is also trying to lock and apply transactions and it turns out some of these locks take a long time, then of course that leads to a negative impact on the actual real-time transactions from the game servers.
@free-dissociation 7 months ago
@SETHthegodofchaos There are various strategies for this. "Real-time backup" could be a lay-person friendly way of saying replication, like a leader-follower setup with a leader graph database node that receives and processes writes and a couple follower nodes that serve as read replicas and only process read requests, but which in the current incarnation of Neo4j could still put unacceptable load on the main leader node. Or they could mean a nightly backup strategy like you suggest. But my hunch is that it's something closer to the former.
@SETHthegodofchaos 7 months ago
@free-dissociation i see, thanks for clarifying 👍
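Editor's sketch of the leader-follower idea discussed in this sub-thread. It is a toy in-process model in Python and says nothing about how Neo4j or CIG actually replicate; the `Cluster` class and its methods are hypothetical. The point it illustrates is that shipping writes to followers is work the leader still has to do, and that followers can serve stale reads until replication runs.

```python
class Node:
    """One database node holding its own copy of the data."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class Cluster:
    """Hypothetical leader-follower cluster: writes go to the leader, reads to followers."""

    def __init__(self, follower_count=2):
        self.leader = Node("leader")
        self.followers = [Node(f"follower-{i}") for i in range(follower_count)]
        self.pending = []    # writes accepted by the leader but not yet shipped
        self._next = 0       # round-robin cursor for reads

    def write(self, key, value):
        self.leader.data[key] = value
        self.pending.append((key, value))

    def replicate(self):
        # The leader ships every pending write to every follower, so its
        # replication work grows with (pending writes x followers).
        shipped = 0
        for follower in self.followers:
            for key, value in self.pending:
                follower.data[key] = value
                shipped += 1
        self.pending.clear()
        return shipped

    def read(self, key):
        # Reads are spread across followers and may be stale until replicate() runs.
        follower = self.followers[self._next % len(self.followers)]
        self._next += 1
        return follower.data.get(key)
```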
@RN1441 7 months ago
@free-dissociation The previous two times they've encountered this or similar problems their approach was to design a caching layer to sit between the database and the game. This didn't work with pCache because they said they didn't understand their stat structures well enough at the time, so they tried again with iCache which ultimately didn't work. I'm not sure if this will be 'third time's the charm' or 'those who fail to learn from their history....'
@SETHthegodofchaos 7 months ago
I am aware of the difference between using indexes for looking up relationships vs just pointer hopping in graph DBs. Also the heavy use of linked lists to have property-type behavior. You did mention issues such as data access, e.g. cache locality (probably linked to cache misses, I assume?). Is there anything else noteworthy about graph DBs in terms of the technical aspects that you know of? I am curious about this topic, so any info is much appreciated :)
@free-dissociation 7 months ago
Full disclosure up front that I am not deep in this area and so there are probably a lot of graph specialists who could give you more specifics. That said, cache misses are the problem, exactly. Think about the big problems that graph databases have to solve all the time, things like A* (shortest path search), and how to do that while keeping all of the required information as close to the CPU as possible. Here at least I believe CIG are using a directed acyclic graph, which lets them optimize and parallelize some, but many graph algorithms in the degenerate cases (e.g. cyclic graphs) don't parallelize _AT ALL_, so you can't just throw more compute at them, either CPU compute like with MapReduce or Hadoop, or GPU compute like with AI/ML/LLMs. The state of the art for many graph queries as far as I know is still just to build a big enough computer with a fast-enough single thread on the processor to be able to store the entire graph structure in RAM and return results using the single processor thread in acceptably close to real-time.
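Editor's sketch of A* over a graph held entirely in memory, to make the reply above concrete. The four-node graph, its coordinates, and the straight-line heuristic are made up for illustration; this is a textbook single-threaded A*, not a claim about what any graph database runs internally. Every step hops to whichever node the priority queue yields next, which is the access pattern that tends to defeat caches on large graphs.

```python
import heapq
import math

# Hypothetical in-memory graph: node -> (x, y) position, node -> neighbours.
positions = {"A": (0, 0), "B": (1, 0), "C": (1, 1), "D": (2, 0)}
edges = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}


def distance(a, b):
    """Straight-line distance, used as both edge cost and heuristic."""
    (ax, ay), (bx, by) = positions[a], positions[b]
    return math.hypot(ax - bx, ay - by)


def a_star(start, goal):
    """Return the cheapest path from start to goal, or None if unreachable."""
    # Heap entries are (estimated total cost, cost so far, node, path so far).
    frontier = [(distance(start, goal), 0.0, start, [start])]
    best = {start: 0.0}
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        for nxt in edges[node]:
            new_cost = cost + distance(node, nxt)
            # Only keep a route to nxt if it beats the best one seen so far.
            if new_cost < best.get(nxt, math.inf):
                best[nxt] = new_cost
                heapq.heappush(
                    frontier,
                    (new_cost + distance(nxt, goal), new_cost, nxt, path + [nxt]),
                )
    return None


print(a_star("A", "D"))
```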
@SETHthegodofchaos 7 months ago
@free-dissociation yeah it's a complex topic. Regardless, thanks for your input. Much appreciated 🙏
@sharxbyte 6 months ago
I am very interested to see the response to someone who has professional experience with this, as I'm only an amateur with software and databases. Thank you so much for making this.
@CosmicD 7 months ago
I'm not a programmer myself, but from the way they explained how groups of nested data (ship, ship components, ship weapons, players that inhabit the ship, the weapons on the players etc.) will always transfer from one zone to another, I've tried to imagine all kinds of instances where your ship can basically be pulled apart by entering a new zone. As thousands of ships will cross zones all the time, it's not hard to imagine the structure suddenly losing cohesion, like a few ship components or the player's head suddenly going missing when it has transferred to another zone? And then: oopsie, we see a few glitches. I really hope they do get this streamlined.
@aidancuite6919 7 months ago
The idea with a graph database is that instead of having to transfer the connection (relationship) between all items, you only need to move one edge to represent the whole shift. Obviously this is an idealized explanation, but in their case I don't really imagine things are getting transferred at a super granular level every time an operation happens (which would be necessary for a more standard SQL database).
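Editor's sketch of the "move one edge" idea from the reply above, using a plain parent-pointer map in Python. The entity names and the `zone_of` and `move` helpers are hypothetical, and this is the idealized picture the commenter describes rather than CIG's actual data model.

```python
# Hypothetical containment graph: each entity points at the thing that holds it.
parent = {
    "ship:1": "zone:A",
    "player:a": "ship:1",
    "pistol:1": "player:a",
    "turret:1": "ship:1",
}


def zone_of(entity):
    """Follow containment edges upward until reaching a root (the zone)."""
    while entity in parent:
        entity = parent[entity]
    return entity


def move(entity, new_parent):
    """Re-parent one entity; everything nested inside it moves implicitly."""
    parent[entity] = new_parent


move("ship:1", "zone:B")
print(zone_of("pistol:1"))
```

Only the ship's edge is rewritten; the player, pistol, and turret rows are untouched, yet all of them now resolve to the new zone.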
@sebhirsch4010 6 months ago
Stereo!
@thomasip9938 8 months ago
Pretty obvious to software engineers that graph data structures, while elegant in the academic sense, are not suited to real time use cases such as games, especially not one at the scale of Star Citizen. I hope they figure out something soon.
@free-dissociation 8 months ago
I mean AIUI graphs are all over inside game engines, and they work great at the scales they're used at there, but going from that to the scale of Star Citizen is... a tall order. We've come a long way on graph databases at scale and in real time since Google first launched, but I'm not surprised to hear CIG are still struggling with it, because they're still such a specialized tool. And I also hope they're able to work the bugs out and get everything stabilized soon.
@SETHthegodofchaos 7 months ago
@free-dissociation what is AIUI?
@free-dissociation 7 months ago
@SETHthegodofchaos as i understand it
@EdsEnemy 8 months ago
fix your mic balance and volume
@bareloto 7 months ago
Somewhere in the simulation of the game sounded so much like Neo in matrix 2 waiting for the train man to save him. That what happened then ? :P :P
@CoryTheSimmons 7 months ago
I wonder if Dgraph would've scaled to Star Citizen's scope. Roberts should've hired Manish R. Jain.
@RayTieRom 6 months ago
Excuse my ignorance, wouldn't server meshing solve this issue? Essentially dividing the workload and database requirements for each server, no?
@Tarkovclips-mw7ur 6 months ago
no sound
@merclord 6 months ago
I hope more experts like you put out videos recognizing the "impossible" hurdles that CIG has overcome. As an original backer of the game, who has become sick and tired of hearing all the jokes and pessimism surrounding the project, it is refreshing to hear a professional programmer that actually knows WTF he's talking about give an honest opinion on what CIG has accomplished. And yes while the server meshing demo was earth-shattering, and tear jerking, it would never have been possible without the graph database.
@thirdworldrider6991 6 months ago
conclusion. get google engineers to help them on the project, since google solved the problem of scaling graph databases?
@TheHorodateur 6 months ago
Audio is horrible
@memorycl 6 months ago
Fairly sure you're correct about neo4j...simply based on the number of version updates that have been pushed in the past 2 years vs previous years :) en.wikipedia.org/wiki/Neo4j#Release_history