Wait… railway runs on GCP? Didn’t they make a whole thing about not “building a ...

miniman1337 · 2026-05-20T01:13:57 1779239637

From the blog, linked via the Wayback Machine: "From Day 1, we had this notion at the forefront.

The other notion that we have intuited is that you can’t build a cloud on another cloud. We have devoted years of practice running our own metal (and playing well with other clouds) to make sure that Railway’s business, which invariably becomes your customer’s business, is as rock solid as possible."

dlcarrier · 2026-05-20T04:16:42 1779250602

I'm not familiar with Railway, so this might not make any sense, but it's possible they were using their own hardware but managing it with Google accounts. It's not uncommon for a company's offsite human-to-human communications to fail when there's a Google outage or ban, so it's not unexpected to have the same interference with human-to-machine or machine-to-machine communications.

MrDarcy · 2026-05-20T01:46:04 1779241564

That’s strange. When I interviewed with the founder a few years ago, he told me they were on AWS and wanted to move to Firecracker.

eoswald · 2026-05-20T01:07:20 1779239240

Yep, and this is why I'm pissed. They lied. They're completely dependent on GCP. So, I gotta do some research; I need something a little more stable (and less dependent on one company's whims) than this. This is bad for them, because it really strikes at the heart of their 'big claim': peaceful software deployments. This is chaos.

ndneighbor · 2026-05-20T01:18:37 1779239917

Yea, I mean, that's the whole MO of our platform and we failed at that. So yea, that's disappointing and more so for our customers.

I can provide an explanation about the GCP dependency. Yes, we do host workloads off GCP, and we have been able to build a good business by performing a cloud exit. However, we were worried that we would have a circular dependency on our own cloud. I don't think we expected to get auto-modded out of our own account, hence we left our DB on CloudSQL.

It was never our intent to deceive people about whether we own our own destiny with our business. After the last GCP issue (when we got auto-rate-limited, which was bad, but survivable), we were assured that this scenario wouldn't happen, but it seems like we have further work to do. Apologies.

fontain · 2026-05-20T01:24:55 1779240295

I’m very sympathetic and understand that decisions are easy to criticize in hindsight, but leaving your database in GCP while moving everything else to your own data centres seems so backwards that I can’t even begin to imagine how it could happen. Was this really an intentional design decision?

arjie · 2026-05-20T01:37:33 1779241053

I have exactly the same architecture. You can easily administer a postgres/mysql on your own infrastructure, but it's also the one thing where backups and availability are super strict. I can easily support multi-region in Google Cloud or AWS and that's way harder to do on-prem, and it's also hard to handle the replication story as safely as with Google Cloud. The hope is that GCP et al. give you safety and availability for the control plane stuff and you can run your data plane on-prem.

At $2m/mo spend, this kind of thing is insane. GCP has never been the most reliable of clouds but this is pretty awful. I would never have expected this.

ahofmann · 2026-05-20T04:42:56 1779252176

I have kind of the same architecture. I host multiple dedicated servers and VPS instances in the Hetzner "cloud", but all of these connect to a few hosted databases provided through Hetzner's web hosting packages for like 20 bucks a month. It sounds insane, but the one thing that absolutely needs to stay online is the database, so not hosting this myself makes sense. And since Hetzner has apparently tuned their dirt cheap databases pretty well, we can hammer them pretty hard without any problems.

ndneighbor · 2026-05-20T01:33:38 1779240818

> decisions are easy to criticize in hindsight

I mean, the pain we have caused our customers ultimately proves you correct. That said, we made our decisions with the information and constraints that we knew at that moment in time. Railway has hosts in AWS, GCP, and co-los, so coordinating those workloads in a fully distributed manner would be ideal, but at the end of the day, we didn't foresee that we would have our project deleted just like that.

(Even though we did get assurances from them in 2024 that it wouldn't happen again, although we only got auto-rate-limited that time.)

csw-001 · 2026-05-20T02:04:56 1779242696

Thanks for getting things back up (genuinely mean that, btw). Upon logging back in I was prompted to promise I'm not deploying naughty things (I'm not). Was this in response to GCP detecting illegal (prohibited) behavior from something deployed via Railway?

ndneighbor · 2026-05-20T02:20:32 1779243632

Actually, when I made the TOS check, I put that in Redis. That + the feature flags got reset.

r_lee · 2026-05-20T01:42:30 1779241350

could you clarify, did an automated process by Google delete a GCP project/account/resource(s)? like, what exactly were you seeing when trying to get access or see what happened?

ndneighbor · 2026-05-20T01:58:32 1779242312

They deleted our GCP project sans warning. Still working out the details, but that's how this whole thing began.

yen223 · 2026-05-20T02:06:21 1779242781

this is easily explained by "database migrations are incredibly difficult and very risky"

purduemike · 2026-05-20T02:46:37 1779245197

Why CloudSQL? Why not AlloyDB for stability?