It's incredibly difficult to optimize performance around. With REST, I know how ...

andrewingram · on Nov 7, 2020

It's not that hard. For sure it's not trivial, but the thing you just said about REST is equally true for GraphQL -- you can literally solve performance in the same way. Ultimately, fulfilling the data requirements for a use case boils down to composing the results of a bunch of I/O calls; the only real difference between REST and GraphQL is how these calls are triggered and how their results are composed.

initplus · on Nov 7, 2020

Sure, but the design of GraphQL, and the majority of the GraphQL backend tooling, lends itself to performance issues much more strongly than a boring REST API. I don't know if there are alternative backend tools, but my experience has been quite painful.

The "graph" part - GraphQL seems to want you to build these big interconnected webs of types. But there isn't good tooling on the backend to deal with resolving relationships between types efficiently. Someone decides they need user.friends.pets.names, and now my backend code is doing 3 DB round trips because my GraphQL server lib doesn't understand it can turn the relationship between the types into a JOIN. I decide that's no good and the query needs to be faster, but the alternative is that I have to do the JOIN every time, even if the client didn't actually request friends. And if I'm going to overfetch from the DB, might as well send that data to the client anyway?
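To make the round-trip count concrete, here is a minimal sketch of the user.friends.pets.names case using Python's built-in sqlite3. The table layout, the `query` helper, and both function names are assumptions made up for illustration; they are not taken from any GraphQL server library.

```python
import sqlite3

# Hypothetical schema: users, a friends link table, and pets owned by users.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users   (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE friends (user_id INTEGER, friend_id INTEGER);
    CREATE TABLE pets    (id INTEGER PRIMARY KEY, owner_id INTEGER, name TEXT);
    INSERT INTO users   VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO friends VALUES (1, 2), (1, 3);
    INSERT INTO pets    VALUES (1, 2, 'rex'), (2, 3, 'tom'), (3, 3, 'kit');
""")

stats = {"round_trips": 0}


def query(sql, params=()):
    """Run one statement and count it as one DB round trip."""
    stats["round_trips"] += 1
    return conn.execute(sql, params).fetchall()


def pet_names_per_level(user_id):
    """One query per level of user -> friends -> pets, as separate resolvers would issue."""
    user = query("SELECT id FROM users WHERE id = ?", (user_id,))
    if not user:
        return []
    friend_ids = [row[0] for row in query(
        "SELECT friend_id FROM friends WHERE user_id = ?", (user[0][0],))]
    if not friend_ids:
        return []
    marks = ",".join("?" * len(friend_ids))
    rows = query(
        f"SELECT name FROM pets WHERE owner_id IN ({marks}) ORDER BY id",
        friend_ids)
    return [row[0] for row in rows]


def pet_names_joined(user_id):
    """The same data fetched with a single hand-written JOIN."""
    rows = query(
        """
        SELECT p.name
        FROM users u
        JOIN friends f ON f.user_id = u.id
        JOIN pets p ON p.owner_id = f.friend_id
        WHERE u.id = ?
        ORDER BY p.id
        """,
        (user_id,))
    return [row[0] for row in rows]
```

Resetting `stats["round_trips"]` before each call shows three round trips for the per-level version and one for the JOIN, with the same rows returned.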

My experience was that because of how backend GraphQL servers are written, it becomes very challenging to reason about performance. It's hard to think about how your nice type resolvers are going to interact, and fan out into some nightmare n+1 query problem performance pit.

I know there are some better tools than this, which I sadly haven't tried...

andrewingram · on Nov 7, 2020

But if you were fetching the same data with a REST API (one that doesn't allow inclusion of relation fields the way the JSON-API spec does), you'd be doing the same amount of I/O and have the same horrible n+1 problem, just between the client and server rather than inside the server, so it's spread across numerous requests rather than just one.

Tools like Dataloader solve the n+1 problem quite effectively, and they've been around for about 5 years, making it a solved problem for the overwhelming majority of GraphQL's open source existence. If you look at the examples I wrote in this article (https://andrewingram.net/posts/optimising-graphql-request-wa...), in the waterfall charts each "call" is roughly equivalent to one simple (and usually easy to optimise) database query. I don't want to unduly trivialise the work in getting Dataloader properly integrated, but I will say it doesn't take a lot to start seeing the fruits of your effort.
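As a rough illustration of the batching idea behind tools like Dataloader, here is a minimal synchronous sketch. It is not the Dataloader API: `BatchLoader`, `want`, `get`, `fetch_pets`, and the in-memory `PETS_BY_OWNER` table are all hypothetical names, and the real library defers its batch asynchronously rather than relying on an explicit collect step.

```python
from typing import Callable, Dict, Hashable, List


class BatchLoader:
    """Collects keys first, then resolves all pending keys with one batch call."""

    def __init__(self, batch_fn: Callable[[List[Hashable]], List[object]]):
        self._batch_fn = batch_fn          # takes a list of keys, returns values in order
        self._pending: List[Hashable] = []
        self._cache: Dict[Hashable, object] = {}

    def want(self, key: Hashable) -> None:
        """Register interest in a key without fetching it yet."""
        if key not in self._cache and key not in self._pending:
            self._pending.append(key)

    def get(self, key: Hashable) -> object:
        """Return the value for a key, flushing every pending key in one batch."""
        self.want(key)
        if self._pending:
            keys, self._pending = self._pending, []
            self._cache.update(zip(keys, self._batch_fn(keys)))
        return self._cache[key]


# Stand-in for a pets table; each call to fetch_pets models one DB query.
PETS_BY_OWNER = {2: ["rex"], 3: ["tom", "kit"], 4: []}
calls: List[List[int]] = []


def fetch_pets(owner_ids):
    calls.append(list(owner_ids))
    return [PETS_BY_OWNER.get(owner, []) for owner in owner_ids]


def naive(friend_ids):
    """n+1 shape: one fetch per friend."""
    return {f: fetch_pets([f])[0] for f in friend_ids}


def batched(friend_ids):
    """Batched shape: register every key, then a single fetch serves all of them."""
    loader = BatchLoader(fetch_pets)
    for f in friend_ids:
        loader.want(f)
    return {f: loader.get(f) for f in friend_ids}
```

For three friends, `naive` records three entries in `calls` while `batched` records one, and both return the same mapping.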

akra · on Nov 8, 2020

Not quite. There is an advantage to being "chatty" that GraphQL proponents don't often mention IMO. You can load balance each of those join requests, cache those GET requests even if they make up only a part of the overall query, shard the storage to multiple servers, rate limit, etc. Spreading requests is usually a good thing; with HTTP/2 the overhead of those extra requests is extremely low as well. It allows sharded processing with approx equal load between nodes at the backend.

It's much easier to define SLAs and manage your traffic profile/platform when the load of each "query" is definitely quantifiable and granular - unlike a GraphQL query.

Then people come to me and say - well you can pre-can your GraphQL queries so that you know which queries are going to be run therefore you have known performance. At which point I say - why not just make a specialised REST endpoint with that query inside it?
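For readers unfamiliar with the pre-canned approach mentioned here, a minimal sketch of the idea: the server only runs queries it registered ahead of time, and clients send an ID instead of query text. `QueryAllowlist`, `register`, and `lookup` are hypothetical names, and using a SHA-256 hash of the query text as the ID is an assumption for this example, not a description of any specific server.

```python
import hashlib
from typing import Dict


class QueryAllowlist:
    """Maps a stable ID to each query text registered ahead of time."""

    def __init__(self) -> None:
        self._by_id: Dict[str, str] = {}

    def register(self, query_text: str) -> str:
        """Store a known query and return its ID (hex SHA-256 of the text)."""
        query_id = hashlib.sha256(query_text.encode("utf-8")).hexdigest()
        self._by_id[query_id] = query_text
        return query_id

    def lookup(self, query_id: str) -> str:
        """Return the registered query text, or refuse anything unknown."""
        try:
            return self._by_id[query_id]
        except KeyError:
            raise PermissionError("unknown query id") from None
```

With this in place the set of runnable queries is finite and known in advance, which is the property being compared against a specialised REST endpoint.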

I see it as a good backend-for-frontend adapter technology where a bulk query can be sharded into individual smaller ones using core backend services for large use-case specific views. If a query uses too many resources then the GraphQL server and/or a client ID can be rate limited by the backend servers, etc. Which is why I think it's mainly a JS thing to date IMO - it doesn't solve many core problems for most backends - it solves issues for typical front-end devs who can't/don't want to write server-side code.

fogetti · on Nov 8, 2020

I think you are confused about what the parent comment said. The point is that the n+1 problem is usually solved at the database level in the case of REST, and the endpoint also changes in tandem to reflect the DB changes. There is no client and server fan-out as you describe.

And the second thing is that solving something by slapping an unknown library into the stack instead of using the DB tools which are readily available out of the box doesn't make much sense IMO.

initplus · on Nov 8, 2020

Sure, I'd have the same n+1 problem. But this time the frontend devs will notice it when they try to use the API. With GraphQL our frontend team doesn't reason about the backend performance characteristics of their queries until they kill perf on prod.

andrewingram · on Nov 8, 2020

At this point it's sounding more like you don't trust your front-end devs. Fixing that is going to be more critical than any technological choice you make.

kikimora · on Nov 8, 2020

I don't get the "not that hard" part. With REST I have a finite set of queries to optimise. With GraphQL I have an unlimited set of queries. How am I supposed to optimise for what I don't know?

One good example - 'users' and 'comments' tables with a natural relationship. 'Comments' grows to an unreasonable size and is split into different tables: fresh comments stay in 'comments', old comments are moved to another database with cheaper storage. With REST it is easy, I get a query like 'get user A's comments from X till Y'. I have to deal with two dates and, based on those, figure out how to fetch the data.

With GraphQL I can get a query like this, but also something like 'get user A's comments from user.signup_date.year till user.signup_date.year + 1'. I suppose I would have to deal with the query AST to figure out whether it queries archive dates or not. This sounds 100x more complicated compared to REST.
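The REST case described above, where the handler receives two concrete dates and decides where to read from, could be sketched roughly as follows. The cutoff date, the store names, and `stores_for_range` are assumptions for illustration only; the sketch covers the two-known-dates case and says nothing about how a GraphQL server would arrive at those dates.

```python
from datetime import date
from typing import List

# Assumed split point: comments before this date live in the archive database.
ARCHIVE_CUTOFF = date(2019, 1, 1)


def stores_for_range(start: date, end: date,
                     cutoff: date = ARCHIVE_CUTOFF) -> List[str]:
    """Return which stores must be read for an inclusive [start, end] range."""
    if start > end:
        raise ValueError("start must not be after end")
    stores = []
    if start < cutoff:
        stores.append("archive")    # part of the range predates the cutoff
    if end >= cutoff:
        stores.append("comments")   # part of the range is on or after the cutoff
    return stores
```

A range entirely after the cutoff touches only 'comments', one entirely before it touches only the archive, and a range straddling the cutoff has to read both.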

RedShift1 · on Nov 7, 2020

If you create those indexes for REST APIs, you can just as well create those indexes for GraphQL APIs?

GordonS · on Nov 7, 2020

It's not just about indexes though, it's also about hand-crafted database queries.

RedShift1 · on Nov 7, 2020

GraphQL doesn't exclude hand-crafted database queries. If that is what's needed, you can perfectly well make that fit with GraphQL. And if for some reason it won't fit in GraphQL, you could still create a one-off REST endpoint.

kikimora · on Nov 8, 2020

And also introduce all the REST tooling in addition to GraphQL. At this point people often start thinking "why didn't I start with REST in the first place?"

dudul · on Nov 7, 2020

What prevents you from doing that with GraphQL? You're still getting parameters to use in your DB queries.

calrain · on Nov 8, 2020

Being able to predict how your customers use your API has a strong impact on how you deliver the data behind the API.

If we're talking about highly interconnected objects in an API, it would be expensive (from a DB point of view) to arbitrarily add indexes for the long-shot case that someone may use a selector on that property.

Equally, when data is highly interconnected, by designing an API that focuses on particular consumption patterns, I'm able to add custom functionality, pre-processing, or highly customised queries/indexes to ensure the data is read as fast as possible.

Poorly performing APIs are bad for the consumer, bad for the database, and bad for the art of data.

There's a balance between centralization and independence in software models that is so hard to get right.

Microservices, GraphQL, and Docker Swarms are examples of technologies that, if done right, can be enabling, and if done wrong, will pull the ship to the bottom of the ocean.