Saturday, April 4, 2015

Abbreviated keys for numeric to accelerate numeric sorts

Andrew Gierth's numeric abbreviated keys patch was committed recently. This commit added abbreviation/sortsupport for the numeric type (the PostgreSQL type which allows practically arbitrary precision, typically recommended for representing monetary values).

The encoding scheme that Andrew came up with is rather clever - it has an excellent tendency to concentrate entropy from the original values into the generated abbreviated keys in real-world cases. As far as accelerating sorts goes, numeric abbreviation is at least as effective as the original text abbreviation scheme. I easily saw improvements of 6x-7x with representative queries that did not spill to disk (i.e. that used quicksort). In essence, the patch makes sorting numeric values almost as cheap as sorting simple integers, since an integer comparison of abbreviated keys is often all that is actually required during sorting proper. (The abbreviated keys compare as integers, except that the comparison is inverted to comport with how abbreviation builds abbreviated values from numerics as tuples are copied into local memory ahead of sorting - see the patch for exact details.)
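
For readers who have not seen abbreviated keys before, the general idea can be sketched in a few lines of Python. This is only a toy illustration and not the encoding from Andrew's patch: the `abbreviate` function, the `DIGITS` and `BIAS` constants, and the "biased exponent plus a few leading digits" layout are all assumptions made up for this example, and the sketch does not reproduce the inverted comparison mentioned above. What it does show is the contract that any abbreviation scheme must honour: differing abbreviated keys decide the comparison cheaply, and equal abbreviated keys fall back to the full (authoritative) comparison.

```python
from decimal import Decimal
from functools import cmp_to_key

DIGITS = 4     # leading digits kept in the abbreviated key (toy choice)
BIAS = 1000    # exponent bias; the sketch only accepts |exponent| < BIAS


def abbreviate(d: Decimal) -> int:
    """Map a finite Decimal to an int such that a < b implies abbr(a) <= abbr(b)."""
    if not d.is_finite():
        raise ValueError("toy scheme handles finite values only")
    if d.is_zero():
        return 0
    if not -BIAS < d.adjusted() < BIAS:
        raise ValueError("exponent out of range for this toy scheme")
    sign, digits, _ = d.as_tuple()
    # Keep the first DIGITS significant digits, zero-padded on the right.
    lead = (digits + (0,) * DIGITS)[:DIGITS]
    mantissa = int("".join(map(str, lead)))
    weight = d.adjusted() + BIAS              # always >= 1 after the range check
    key = weight * 10**DIGITS + mantissa
    return -key if sign else key


def compare(a, b):
    """Compare (abbreviated_key, value) pairs."""
    if a[0] != b[0]:
        # Cheap path: the integer comparison is conclusive.
        return -1 if a[0] < b[0] else 1
    # Tie: the abbreviated keys are inconclusive, so compare the full values.
    return (a[1] > b[1]) - (a[1] < b[1])


def sort_numeric(values):
    # Abbreviated keys are built once, up front, as values are copied in.
    pairs = [(abbreviate(v), v) for v in values]
    pairs.sort(key=cmp_to_key(compare))
    return [v for _, v in pairs]
```

In this toy scheme `Decimal("12.5")` and `Decimal("12.50001")` share an abbreviated key, so that pair is resolved by the full comparison, while most other pairs never touch the original values at all.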



Separately, over lunch at pgConf.US in New York, Corey Huinker complained that a routine data warehousing CREATE INDEX operation took far too long. The indexes in question were built on a single text column. I suggested that Corey check out how PostgreSQL 9.5 performs, where this operation is accelerated by text abbreviation, often very effectively.

Corey chose an organic set of data that could be taken as a reasonable proxy for how PostgreSQL behaves when he performs these routine index builds. This was a table with 18 million rows. In all cases maintenance_work_mem was set to 64MB, meaning that an external tape sort was always required. Apparently, on PostgreSQL 9.4, without abbreviation, the CREATE INDEX took 10 minutes and 19 seconds in total. On PostgreSQL 9.5, with identical settings, it took only 51.3 seconds - a 12x improvement! This was a low-cardinality, pre-sorted column, but if anything that is a less compelling case for abbreviation - I think that the improvements could sometimes be even greater when using external sorts on big servers with fast CPUs. Further organic benchmarks of abbreviated key sorts are very welcome. Of course, there is every reason to imagine that abbreviation would now improve things just as much, if not more, with large numeric sorts that spill to disk.

Future work

With numeric abbreviation committed, and support for the "datum" case likely to be committed soon, you might assume that abbreviation as a topic on the pgsql-hackers development mailing list had more or less played out (the "datum" sort case is used by things like "SELECT COUNT(DISTINCT FOO) ..." - this is Andrew Gierth's work again). You might reasonably surmise that it would be nice to have support for the default B-Tree opclasses of one or two other types, like character(n), but that's about it: abbreviation clearly isn't much use for complex/composite types, so we're almost out of interesting types to abbreviate. However, I think that work on abbreviated keys is far from over. Abbreviation as a project is only more or less complete as a technique to accelerate sorting, and that's likely to be only half the story (Sorry Robert!).

I intend to undertake research on using abbreviated keys within internal B-Tree pages in the next release cycle. Apart from amortizing the cost of the comparisons that are required to service index scans, I suspect that storing abbreviated keys inline in the ItemId array of internal B-Tree pages can greatly reduce the number of cache misses. Watch this space!
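
To give a rough feel for why this might help, here is a small Python sketch of a binary search over a page that keeps a fixed-size abbreviated key next to each slot. It is purely illustrative and does not describe any existing or proposed PostgreSQL code: the `InternalPage` class, the `abbreviate_bytes` function (first 8 bytes of the key as a big-endian integer), and the `full_key_reads` counter are all hypothetical names invented for this example. The counter stands in for the out-of-line key accesses that would risk a cache miss on a real page.

```python
def abbreviate_bytes(key: bytes) -> int:
    """First 8 bytes, zero-padded, as a big-endian int (order-preserving for bytes)."""
    return int.from_bytes(key[:8].ljust(8, b"\x00"), "big")


class InternalPage:
    """Toy page: abbreviated keys stored inline, full keys stored 'elsewhere'."""

    def __init__(self, keys):
        self.full_keys = sorted(keys)
        self.abbrev = [abbreviate_bytes(k) for k in self.full_keys]
        self.full_key_reads = 0   # how often the full key had to be fetched

    def _cmp(self, i, probe, probe_abbrev):
        a = self.abbrev[i]
        if a != probe_abbrev:
            # Decided from the inline array alone.
            return -1 if a < probe_abbrev else 1
        # Tie on the abbreviated key: fetch and compare the full key.
        self.full_key_reads += 1
        k = self.full_keys[i]
        return (k > probe) - (k < probe)

    def lower_bound(self, probe: bytes) -> int:
        """Index of the first key >= probe, by binary search."""
        probe_abbrev = abbreviate_bytes(probe)
        lo, hi = 0, len(self.full_keys)
        while lo < hi:
            mid = (lo + hi) // 2
            if self._cmp(mid, probe, probe_abbrev) < 0:
                lo = mid + 1
            else:
                hi = mid
        return lo
```

When the keys on a page differ within their first 8 bytes, a search in this sketch fetches a full key only when it lands on a slot whose abbreviated key matches the probe exactly; keys that share a long common prefix would tie much more often, which is one reason the choice of encoding matters.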


3 comments:

  1. Great improvement! Now I'd like to see abbreviated keys for jsonb :)

    1. Do you have some abbreviation schemes in mind that you think will create a good base case, or at least some common case?

  2. I think that it would probably be useful to use abbreviation of GIN storage types within the internal pages of GIN B-Trees. What I envisage for B-Tree is probably easily generalized to do the same for GIN (although perhaps that should be taken with a grain of salt - I don't know too much about GIN).

    I tend to think that abbreviating jsonb itself isn't likely to work out.
