Replay: It's here, but what took so long?
Published: 2024-09-04
Author: Teddi
After many years of complaints, begging, prodding, and poking, we finally have replays on [BB]!
However, the question remains: what took so long?
But wait a minute, other servers (including in CS:S and CS:GO!) have had replays for years!
If we look at how most other servers handle replay systems (often called a WR Bot), they usually just record runs similarly to how we do, but only keep the most recent data available. On top of that, they don't offer much customization with the data they collect. For games like CS:S or CS:GO this approach works fine. But for Garry's Mod, where we have a lot more control and flexibility, why not aim for something better?
This became my wishlist:
- Store as many replays as possible. I want every All-Time on the leaderboard logged and recorded. If it isn't, how is a run valid?
- I want to be able to store historical records of replays. It would be neat if we could show players a history of their runs, or at the very least how they've improved to become an All-Time great.
- I want to offer more control over the camera. If players want to analyse something frame-by-frame, this should be possible.
- I want to offer more control over the playback. If players want to slow down the replay, this should be possible. Speed it up? Let's go Sonic.
- It should be possible to jump to any part of the run with an instant click. None of this having to wait.
- Players should be able to independently watch a replay of a run. No sharing bots, no needing bots. Bots shouldn't even be part of the equation. We have control of the player camera.
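To make the playback items on that list a bit more concrete, here is a minimal sketch in Python of a playback cursor that supports variable speed and instant seeking. It is only an illustration and not how [BB] actually implements it: the `PlaybackCursor` class, its fields, and the 100 frames-per-second recording rate are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class PlaybackCursor:
    """Hypothetical playback state for one viewer watching one replay."""

    frame_count: int          # number of recorded frames in the replay
    tick_rate: float = 100.0  # assumed recording rate in frames per second
    speed: float = 1.0        # 1.0 = real time, 0.5 = slow motion, 2.0 = fast forward
    position: float = 0.0     # fractional frame index

    def _clamp(self) -> None:
        # Keep the cursor inside the recorded range.
        last_frame = max(self.frame_count - 1, 0)
        self.position = min(max(self.position, 0.0), float(last_frame))

    def advance(self, dt: float) -> int:
        """Move forward by dt seconds of wall-clock time; return the frame to show."""
        self.position += dt * self.tick_rate * self.speed
        self._clamp()
        return int(self.position)

    def seek(self, fraction: float) -> int:
        """Jump straight to a point in the run (0.0 = start, 1.0 = end)."""
        fraction = min(max(fraction, 0.0), 1.0)
        self.position = fraction * max(self.frame_count - 1, 0)
        return int(self.position)


cursor = PlaybackCursor(frame_count=1001)
cursor.seek(0.5)      # jump to the middle of the run
cursor.speed = 0.5    # half speed
cursor.advance(1.0)   # one second of real time later
```

Because each viewer would own their own cursor, nobody has to share a bot or wait for a run to loop back around.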
That being said, when you try to create a more expansive system, you're bound to run into the same challenges others have faced. There's a reason why many servers only keep the latest record, limit how many people can record, or use opt-in timers. It all comes down to resource usage - it can become a major issue if not managed carefully.
Storage Woes
The biggest issue for years was our disk drives. Back when we were in Texas we were working with 2x 2TB HDDs - yep, not SSDs, just old-fashioned hard drives. If I even tried to open a file larger than 50MB while players were online, I'd instantly hear complaints like, "Omg, lag?!". So while we technically had the storage space, we didn't have the speed to match. Writing multiple files to disk at the same time could've easily caused an I/O bottleneck, especially with the OS and other services running on the server.
Since then, we've upgraded to a new server chassis with SSDs! The speed boost is great, but we've traded storage space for it - now we're down to just 250GB. Sure, we can write files without causing lockups, but we'll burn through that storage quickly, making it a premium resource. So what other options do we have?
Cloud
For years, [BB] has experimented with different cloud providers in various ways. At one point, we even hosted our FastDL on AWS, which was super fast but also a quick way to rack up unmanageable costs. Given the volume of replays we want to store and distribute, relying on a cloud solution like AWS just isn't sustainable for the long haul.
These days, we take advantage of the Bandwidth Alliance between Cloudflare and Backblaze to create a CDN-like setup for FastDL. I've looked into using this system to store our replay data, and overall, it works pretty well. However, there can be a slight "lag" when loading a file that's not cached yet. Another issue is that Backblaze sometimes goes into prolonged maintenance, where you can read data but can't write new data. This is fine for FastDL but not ideal for constantly uploading replay files. Cost-wise, though, it's great: $6 per TB and free egress bandwidth when routed through Cloudflare! The only drawback is the reliability.
A Competitor Emerges
Back in 2021, Cloudflare announced R2, their answer to Amazon's S3, which promises $0 egress fees (!!!) with 99.999999999% (eleven 9s) durability at a cost of $0.015 per GB per month, or around $15 a month for a TB of storage. The only issue? R2 wasn't widely available yet. It wouldn't be until the back end of 2022 that it became generally available, although it was still missing some useful features.
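As a quick sanity check on those numbers, here is a small Python sketch that turns the quoted prices into a monthly bill. The two prices come from this post; the function names are made up for the example, and it assumes 1 TB = 1000 GB, which is what makes $0.015 per GB come out to $15 per TB.

```python
# Storage prices quoted in this post, in USD per month.
R2_PRICE_PER_GB = 0.015         # Cloudflare R2
BACKBLAZE_PRICE_PER_TB = 6.00   # Backblaze, via the Bandwidth Alliance

GB_PER_TB = 1000  # assumption: decimal terabytes


def r2_monthly_cost(stored_gb: float) -> float:
    """Monthly R2 storage cost for the given number of GB (egress is $0)."""
    return stored_gb * R2_PRICE_PER_GB


def backblaze_monthly_cost(stored_gb: float) -> float:
    """Monthly Backblaze storage cost for the given number of GB."""
    return stored_gb / GB_PER_TB * BACKBLAZE_PRICE_PER_TB


for stored_gb in (250, 1000):
    print(
        f"{stored_gb} GB: "
        f"R2 ${r2_monthly_cost(stored_gb):.2f}, "
        f"Backblaze ${backblaze_monthly_cost(stored_gb):.2f}"
    )
```

This only covers storage. It leaves out per-request charges and any free tier, so treat it as a rough comparison rather than a real invoice.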
To recap so far, our options for developing a replay system were:
- Use our own server storage, but we'd run out of space quickly.
- Use a reliable cloud storage provider (AWS), but we'd run out of money quickly.
- Copy existing systems, which would probably never advance beyond "good enough."
- Wait for Cloudflare R2 to become available and hope it delivers on its promises.
So, I sat on my hands with option #4. If R2 didn't turn out to be good enough, then option #3 would be the fallback. The original plan was to start testing R2 and see how it performed during Q3 2023, but other things got in the way and that work got pushed back to Summer 2024.
The Final Push
About two weeks before Replays were set to launch, I scrapped the entire Replay system I'd been working on for years. It had become fragmented over the past 3-5 years with different ideas and goals. I wasn't happy with it, and so I decided to start fresh. If we were going to make this work, we'd do it right with as few preconceived notions as possible. After about an hour of work, I had a prototype that was already better than the old system. A bit more effort and I had something I was actually happy with, even if it was just the recording side of things.
Secondary Concerns
Another concern I had during this time was how to get the data to the player. When Replay development first began, the current [BB] API didn't exist. While we likely would have built something to handle this, we would have been limited by srcds' internal network speeds, which are around 20kbit/s. Additionally, we can only send 64KB of data through the net system in one go, meaning we'd have to be careful about how much data we send at once. We'd need to split it up and stream it properly to avoid issues like buffer overflows or net stream lockups.
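To put those two limits in perspective, here is a rough Python sketch of what splitting a replay into 64KB pieces would look like, and how long the transfer would take at around 20kbit/s. It is for illustration only and is not the Garry's Mod net library: the function names and the 1 MB replay size are made up, and it treats 64KB as 65,536 bytes.

```python
from typing import Iterator

MAX_MESSAGE_BYTES = 64 * 1024   # the 64KB per-message limit mentioned above
LINK_BITS_PER_SECOND = 20_000   # roughly 20kbit/s, as quoted above


def split_into_chunks(payload: bytes, chunk_size: int = MAX_MESSAGE_BYTES) -> Iterator[bytes]:
    """Yield consecutive slices of payload, each at most chunk_size bytes."""
    for offset in range(0, len(payload), chunk_size):
        yield payload[offset:offset + chunk_size]


def transfer_seconds(num_bytes: int, bits_per_second: int = LINK_BITS_PER_SECOND) -> float:
    """Best-case time to push num_bytes through a link of the given speed."""
    return num_bytes * 8 / bits_per_second


replay = bytes(1_000_000)  # hypothetical 1 MB replay file
chunks = list(split_into_chunks(replay))

print(f"{len(chunks)} messages, about {transfer_seconds(len(replay)):.0f} seconds")
```

With these made-up numbers, a single 1 MB replay becomes 16 messages and takes around 400 seconds to arrive, before you even account for message overhead or other traffic. A real implementation would also keep each chunk a little under the limit to leave room for headers.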
Having the web API really solves this issue. You're no longer capped by the internal network speeds - just the API speed and your local internet speed. From the game server's perspective, that data doesn't even exist! Since it completely bypasses srcds, stability and performance shouldn't be affected any more than they already are with replay recording.
Ultimately
The rest is history, but I hope this gives you some insight into why it took so long to get Replays out. It wasn't for lack of effort, but primarily a lack of a solid storage solution. I hope you enjoy the new system and I can't wait to see all the replays set with new All-Times!