Hacker Timesnew | past | comments | ask | show | jobs | submitlogin

It's a bit high level to mention this, but it doesn't have the one query problem I see constantly. Paging using offset and limit. I swear every app I've worked on uses it somewhere. And it's horrifically inefficient 90% of the time.


Simply depends on pagination depth.

Unfortunately there aren't any solutions that generalize as well as LIMIT + OFFSET which is why we use it. In fact, other solutions usually take quit a bit of custom tailoring if you want Prev/Next and deep page jumping.


Can you elaborate on the issues with that? I've used that pattern and haven't found any major woes (yet), but also don't usually paginate complicated queries.


In postgres and mysql, at least, the database has to re-scan data every time you run a limit/offset query query. It gets progressively slower as your offset increases.

The efficient way to handle it is to set a lower limit on the pkid (or other atomically increasing row) of the last record fetched, and fetch in ascending order e.g.

    SELECT * from my_table where id > (last_row_id_seen) ORDER BY id asc limit 20;
Then you have an indexed jump to the correct rows to return. It's pretty straightforward to abstract a batched database iterator with this pattern for use application-wide (usually can be done in ~20-50 loc), but no ORM that I'm aware of supports this pattern natively.

Do note that, by extension, it's impossible to efficiently batch iterate over tables that don't have a unique, orderable key in postgres and mysql.


What if you need to order by something other than id?


Then, generally, give the user a bunch of filtering options and limit page depth to less than ~1000 entries


Guys behinds PostgreSQL / CitusDB wrote awesome article on which methods of pagination work how well and where they fail.

https://www.citusdata.com/blog/2016/03/30/five-ways-to-pagin...

Highly recommended read.


DB has to calculate every record from 0-> offset . A user can frequently crash a server by just going to page 100+ in a big list since the database has to load all that data.

Easiest way to prevent this is add enough filtering options that it doesn't inconvenience users to limit depth to page 10 or something


Probably still better than putting all the rows in a temp table with row numbers and then selecting where row number is between x and y, which I have also seen.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: