Friday, November 16, 2012

Notes on index-only scans

One of the most important performance features in Postgres 9.2 is index-only scans: the ability for certain types of queries to be performed without retrieving data from tables, potentially greatly reducing the amount of I/O needed. I recently completely overhauled the Index-only scans PostgreSQL wiki page, so that the page is now targeted at experienced PostgreSQL users that hope to get the most out of the feature.

My apologies to the authors of the feature, Robert Haas, Ibrar Ahmed, Heikki Linnakangas and Tom Lane, if my handling of the topic seems to focus on the negatives. Any reasonable article about any given index-only scan implementation would have to extensively discuss that implementation's limitations. Any discussion of Postgres index-only scans that focussed on the positives would be much shorter, and would essentially just say: "Index-only scans can make some of your queries go much faster!".

2 comments:

  1. Peter, I think it's important to stress, that
    index-only scan available only for indexes, which index data can be used to reproduce column data. GiST and GIN can't guarantee that, but some opclasses actually satisfy above condition, for example, btree_gist and btree_gin, or ranges. It's current am infrastructure that currently prevents index-only scan for these opclasses.

    Also, index-only count may be introduced to support fast counting for all am's.

    ReplyDelete
  2. Thanks Oleg. I will modify the page to emphasize just what a non-lossy SP-GiST operator class might be.

    Since the limitation is in the am infrastructure (that is, GIN and GiST index access methods), was there any particular reason why Tom decided to support index-only scans with the SP-GiST access method first? Was that just an arbitrary selection, or are SP-GiST opclass index-only scans in some way more useful or likely to be used?

    ReplyDelete