There is a table:
create table log ( id serial not null constraint user_pkey primary key, date timestamp with time zone default now() not null constraint log_pkey primary key, user_id integer constraint log_user_id_fkey references "user" on delete set null, ... ); create index log_date on log (date DESC); When sampling only from it:
SELECT * FROM "log" WHERE date > '1.1.2018' ORDER BY date DESC; sorting goes by index.
Index Scan using log_date on log (cost=0.29..2330.05 rows=35689 width=341) (actual time=0.024..16.422 rows=35782 loops=1) Index Cond: (date > '2018-01-01 00:00:00+03'::timestamp with time zone) Planning time: 0.589 ms Execution time: 18.088 ms But when merging:
SELECT * FROM "log" JOIN "user" ON user.id = log.user_id WHERE date > '1.1.2018' ORDER BY log.date DESC; sorting is already without an index.
Sort (cost=3805.74..3894.96 rows=35689 width=509) (actual time=22.159..24.455 rows=13064 loops=1) Sort Key: log.date DESC Sort Method: quicksort Memory: 9573kB -> Merge Join (cost=0.43..1107.08 rows=35689 width=509) (actual time=0.012..9.642 rows=13064 loops=1) Merge Cond: (u.id = log.user_id) -> Index Scan using user_pkey on "user" u (cost=0.14..5.35 rows=59 width=168) (actual time=0.002..0.029 rows=61 loops=1) -> Index Scan using log_user_id on log (cost=0.29..2451.96 rows=35689 width=341) (actual time=0.006..6.190 rows=13065 loops=1) Filter: (date > '2018-01-01 00:00:00+03'::timestamp with time zone) Planning time: 0.564 ms Execution time: 25.899 ms Is this normal behavior? How can I persuade the scheduler to sort by index?
Neither in the documentation nor in Google did not find information.
Postgresql 9.6.
UPD. As mike noted in the comment, the left merge solves the problem.
Sort (cost=4816.65..4902.76 rows=34443 width=523) (actual time=48.332..55.717 rows=35000 loops=1) Sort Key: log.date DESC Sort Method: quicksort Memory: 21961kB -> Hash Left Join (cost=3.33..2221.04 rows=34443 width=523) (actual time=0.063..19.772 rows=35000 loops=1) Hash Cond: (log.user_id = "user".id) -> Seq Scan on log (cost=0.00..1957.54 rows=34443 width=347) (actual time=0.008..9.118 rows=35000 loops=1) Filter: (date > '2018-01-01 00:00:00+03'::timestamp with time zone) -> Hash (cost=2.59..2.59 rows=59 width=168) (actual time=0.030..0.030 rows=61 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 21kB -> Seq Scan on "user" (cost=0.00..2.59 rows=59 width=168) (actual time=0.004..0.013 rows=61 loops=1) Planning time: 0.967 ms Execution time: 62.681 ms In this particular query, the optimizer was right, but it helped dramatically on the combat query. And mike's statement helped that in one query one index per table is used.