order_by('?') can be very inefficient if the table has lots of rows. Moving the randomness to the application layer will probably give a significant performance improvement.
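order_by('?') can be very slow because, to achieve random ordering, the database has to generate a random number for every row and then scan and order the whole table by those numbers.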
Let that sink in: generating random numbers is slow, and scanning is not quick either. The database does both for every row in the table just to read one row! If there are a lot of rows, expect a performance impact. In short, databases are not good at random, but applications are. So consider splitting responsibilities between the database and the application: let the database count the rows and fetch a single one, and let the application choose which one at random.
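A minimal sketch of that split, assuming a Django-style queryset that supports count() and integer indexing. The helper name get_random_row is hypothetical and not part of Django or Code Review Doctor:

```python
import random


def get_random_row(queryset):
    """Return one randomly chosen row from a queryset, or None if it is empty.

    Hypothetical helper: `queryset` is assumed to behave like a Django
    QuerySet, i.e. it has count() and supports integer indexing.
    """
    # First read: ask the database only for the number of rows.
    count = queryset.count()
    if count == 0:
        return None

    # The application, not the database, picks the random position.
    index = random.randrange(count)

    # Second read: indexing a Django QuerySet fetches just that one row
    # (LIMIT 1 with an OFFSET) instead of ordering the whole table.
    return queryset[index]
```

With a hypothetical model named Book, this could be called as get_random_row(Book.objects.all()). If rows can be deleted between the two reads, the index may fall past the end of the table and raise IndexError, so callers that care may want to catch that and retry.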
This will mean two database reads are performed, but on a large table that will be significantly quicker than the database copying the table and ordering all the rows randomly.
If our GitHub code review bot spots this issue in your pull request, it gives this advice:
Code Review Doctor will run this check by default. No configuration is needed, but the check can be turned on or off using the check code inefficient-order-by-random in your pyproject.toml file.