Random ordering via order_by("?")

Using order_by('?') can be very inefficient if you have lots of rows in the table. Moving the randomness to the application layer will probably give significant a performance improvement.

Using order_by('?') can be very slow as to achieve random ordering the database has to:

  • Generate a random number for each row.
  • Scan the numbers for the smallest one.
  • Read the one row we're interested in.

Let that sink in: generating random numbers is slow and scanning is not quick. We're doing that for every row in the table just to read one row! If there are a lot of rows then expect a performance impact. In short, databases are not good at random. Applications are though. So consier splitting responsibilities between the database and the application:

  • At database level determine the number of rows via objects.count()
  • At application level get a random number between 0 and that count.
  • At database level select the row at that index.

This will mean two database reads are performed, but that will be significantly quicker than the database copying the table and ordering all the rows randomly.

If our GitHub code review bot spots this issue in your pull request it gives this advice:

django-doctorbotsuggested changes just now
views.py
1
+
def get_random_item():

Using order_by('?') can be very inefficient if you have lots of rows in the table. Moving the randomness to the application layer will probably give significant a performance improvement.

Read more
Suggested changes
+
from random import randint
+
-
    return MyModel.objects.order_by('?')[0]
+
    index = randint(0, MyModel.objects.count() -1)
+
    return model.objects.all()[index]
Commit suggestion
2
+
    return MyModel.objects.order_by('?')[0]
Update views.py

Instantly check if you have this issue for free

    Works with tools you use

    Read about how it works.