If it were that expensive to host the API, they wouldn't have done it for over a decade. Reddit's popularity was at least partially built on the labor of third-party devs building mobile apps, moderation tools, useful bots, all kinds of stuff.
The bottom line is they'll happily pull the rug out from under those third-party devs if it means they can pump their numbers for their IPO.
The framing in the question is wrong. Reddit gets ~80 comments per second on average. An API that returns comments with IDs after n would let you reasonably scrape all new comments with 1 request per second (or one every 10 seconds if the API returned up to 1000 results, which would be entirely reasonable). They get ~10 posts per second, so add another 1 request per second for that. The load is trivial. Storing the content is also trivial; the entire history of Reddit fits very comfortably on a single consumer-tier SSD.
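To put rough numbers on that, here is a minimal Python sketch of the arithmetic and of the "give me everything after ID n" polling loop. The names `seconds_between_requests`, `fetch_after`, and `drain` are hypothetical, the in-memory list stands in for a real endpoint, and the page size of 100 is my assumption for the "1 request per second" case; none of this is Reddit's actual API.

```python
def seconds_between_requests(items_per_second: float, page_size: int) -> float:
    """Longest polling interval that still keeps up with the feed.

    A page of `page_size` items covers page_size / items_per_second
    seconds of new content, so polling at least that often loses nothing.
    """
    return page_size / items_per_second


def fetch_after(feed: list[int], last_id: int, limit: int) -> list[int]:
    """Hypothetical endpoint: up to `limit` IDs strictly greater than last_id.

    `feed` is assumed to be sorted in ascending ID order.
    """
    return [item_id for item_id in feed if item_id > last_id][:limit]


def drain(feed: list[int], page_size: int) -> tuple[list[int], int]:
    """Cursor-style polling: keep asking for 'everything after the last ID seen'.

    Returns the collected IDs and the number of requests made,
    including the final request that comes back empty.
    """
    last_id = 0
    seen: list[int] = []
    calls = 0
    while True:
        page = fetch_after(feed, last_id, page_size)
        calls += 1
        if not page:
            break
        seen.extend(page)
        last_id = page[-1]  # advance the cursor to the newest ID seen
    return seen, calls


if __name__ == "__main__":
    # ~80 comments/s is the figure quoted above; page sizes are assumptions.
    print(seconds_between_requests(80, 100))
    print(seconds_between_requests(80, 1000))
    ids, calls = drain(list(range(1, 2501)), 1000)
    print(len(ids), calls)
```

At ~80 comments per second, a 1000-item page covers 12.5 seconds of comments, so polling once every 10 seconds keeps up with room to spare.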
The infrastructure they provide is easily replicable at almost no cost (really it's just the bandwidth that costs any money at all). The community curation and moderation is done by volunteers. The content is all from the users. Reddit Inc is providing almost none of the value, and is just benefiting from network effects. People go there because people go there.
Your question isn't the question you should be asking.
Please answer the question:
How is Reddit supposed to prevent scraping when it is legal to do so, and they have an incentive to appear in search engines? Considering that scraping is legal and WILL happen, why not opt to reduce the load by offering an API?
If consuming their content by scraping were easy, nobody would be complaining about the APIs being taken away... It's very easy for them to make web scraping impractical if they want to. Don't be deluded.
This content is text; it is extremely cheap to host. The site itself is not particularly resilient, so there is not a lot of overhead there. 10 euros a month is an absolutely ridiculous price for what probably doesn't even cost them 20 cents per user per month.
There’s a real, non-negligible cost to Reddit hosting the content.
The bottom line is there are solutions here.
- High volume API users pay up and subsidize the cost of everyone else
OR
- All Reddit users pay a monthly 9.99 subscription and the API stays the same.
OR
- A not-for-profit, let’s say the Internet Archive, takes ownership and asks the Reddit community for donations (like Wikipedia does)