Saturday, April 4, 2015

Andrew Gierth's numeric abbreviated keys patch was committed recently. This commit added abbreviation/sortsupport for the numeric type (the PostgreSQL type which allows practically arbitrary precision, typically recommended for representing monetary values).

The encoding scheme that Andrew came up with is rather clever - it has an excellent tendency to concentrate entropy from the original values into the generated abbreviated keys in real-world cases. As far as accelerating sorts goes, numeric abbreviation is at least as effective as the original text abbreviation scheme. I easily saw improvements of 6x-7x with representative queries that did not spill to disk (i.e. that used quicksort). In essence, the patch makes sorting numeric values almost as cheap as sorting simple integers, since that is often all that is actually required during sorting proper: the abbreviated keys compare as integers, except that the comparison is inverted to comport with how abbreviation builds abbreviated values from numerics as tuples are copied into local memory ahead of sorting (see the patch for exact details).
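To illustrate the general shape of the technique, here is a minimal sketch in C - invented names, not code from the patch or from PostgreSQL: each value is condensed into a machine-word-sized integer up front, the sort's comparator usually only compares those integers, and only a tie forces the authoritative comparison of the full values.

#include <stdint.h>
#include <string.h>

typedef struct SortItem
{
    int64_t     abbrev;     /* abbreviated key, generated once per value */
    const char *full;       /* the full, authoritative value */
} SortItem;

/* authoritative comparison of the full values (plain strings here) */
static int
full_compare(const char *a, const char *b)
{
    return strcmp(a, b);
}

/* comparator used by the sort: integers first, full values only on a tie */
static int
abbrev_compare(const void *va, const void *vb)
{
    const SortItem *a = va;
    const SortItem *b = vb;

    if (a->abbrev < b->abbrev)
        return -1;
    if (a->abbrev > b->abbrev)
        return 1;
    return full_compare(a->full, b->full);
}

Abbreviated-key ties are the only case where the original values are touched during the sort proper, which is why an encoding scheme that concentrates entropy into the abbreviated keys matters so much.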
Friday, January 23, 2015
Abbreviated keys: exploiting locality to improve PostgreSQL's text sort performance
On Monday, Robert Haas committed a patch of mine that considerably speeds up the sorting of text in PostgreSQL. This was the last and the largest in a series of such patches: the patch that adds "abbreviated keys". PostgreSQL 9.5 will have big improvements in sort performance.

In realistic cases, CREATE INDEX operations on text are over 3 times faster than in PostgreSQL 9.4. Not every such utility operation or data warehousing query involving a big sort is sped up by that much, but many will be.
This was a piece of work that I spent a considerable amount of time on over the past few months. It's easy to justify that effort, though: sorting text is a very fundamental capability of any database system. Sorting is likely the dominant cost when creating B-Tree indexes and performing CLUSTER operations, and, most obviously, for the sort nodes required by many plans executed in the service of queries with ORDER BY or DISTINCT clauses, or of aggregates using the GroupAggregate strategy. Most of the utility statements that need to perform sorts must do so while holding a very disruptive lock on the target relation (CREATE INDEX CONCURRENTLY is a notable exception), so quite apart from the expense of the sort itself, the duration of sorts often strongly influences how long a production system is seriously disrupted.
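For a flavour of where a text abbreviated key might come from, here is a hedged sketch of the simplest possible case, roughly the "C" collation (the real implementation also has to deal with collations and other details): pack the leading bytes of each string into an unsigned integer, most significant byte first, so that comparing two keys as plain integers agrees with memcmp() on those prefixes.

#include <stdint.h>
#include <string.h>

/*
 * Toy abbreviation for text (illustration only): pack up to the first 8
 * bytes of the string into a uint64_t, most significant byte first.
 * Comparing two such keys with < and > then agrees with memcmp() on those
 * leading bytes, so most comparisons during the sort never touch the full
 * strings at all.
 */
static uint64_t
abbreviate_text(const char *s)
{
    uint64_t    key = 0;
    size_t      len = strlen(s);
    size_t      n = len < sizeof(key) ? len : sizeof(key);

    for (size_t i = 0; i < n; i++)
        key |= (uint64_t) (unsigned char) s[i] << (8 * (sizeof(key) - 1 - i));

    return key;
}

Ties on the abbreviated key (strings sharing a long common prefix, for instance) still fall back to a full comparison, which is why the technique shines on realistic inputs where the leading bytes carry plenty of entropy.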
Sunday, March 23, 2014
What I think of jsonb
Unsurprisingly, there has been a lot of interest in the jsonb type, which made it into the upcoming 9.4 release of Postgres. I was initially a reviewer of jsonb, although since I spent weeks polishing the code, I was ultimately credited as a co-author.
Jsonb is a new datatype for Postgres. It is distinct from the older json datatype in that its internal representation is binary, and in that it is internally typed. It also makes sophisticated nested predicates within queries on jsonb indexable. I've occasionally described the internally-typed scalar values as having “shadow types” unknown to the core SQL parser. This has several implications. For example, if you sort two jsonb values containing only scalar numbers, the implementation invokes the numeric comparator (which the jsonb default B-Tree opclass comparator is defined in terms of). The on-disk representation of jsonb reuses the representations of existing core types; JSON primitive numbers, for example, are stored internally as numerics. Plus, JSON objects are de-duplicated by key on input, and optimized for cheap binary searches within a single jsonb. Still, like the earlier json type, jsonb in every sense “speaks JSON”. There are some limitations on what can be represented as a jsonb number, but those are exactly the same limitations that apply to the core numeric type (plus some limitations imposed by the JSON RFC, such as not accepting NaN values). I hope it suffices to say that these limitations are virtually irrelevant, and that many implementations have similar or worse limitations. All of these minor implementation-defined restrictions are explicitly anticipated and allowed for by the recent JSON RFC, RFC 7159.
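As a toy illustration of the point about de-duplicated, search-optimized objects - this is not jsonb's actual on-disk layout, and the struct and functions are invented - storing an object's keys sorted and de-duplicated at input time is what makes later key lookups a cheap binary search rather than a scan of the whole document:

#include <stdlib.h>
#include <string.h>

typedef struct KV
{
    const char *key;
    const char *value;
} KV;

static int
kv_cmp(const void *a, const void *b)
{
    return strcmp(((const KV *) a)->key, ((const KV *) b)->key);
}

/* 'pairs' must already be sorted by key, as a jsonb-like type would ensure */
static const char *
lookup(const KV *pairs, size_t npairs, const char *key)
{
    KV          probe = {key, NULL};
    const KV   *hit = bsearch(&probe, pairs, npairs, sizeof(KV), kv_cmp);

    return hit ? hit->value : NULL;
}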
Wednesday, August 1, 2012
Sorting improvements in PostgreSQL 9.2: the case for micro-optimisation
There has been much discussion of performance improvements in the upcoming 9.2 release of PostgreSQL. Recently, I noticed that Regina Obe and Leo Hsu's new book, "PostgreSQL: Up and Running", prominently lists "Sorting improvements that improve in-memory sorting operations by as much as 20%" as a performance feature of that release. While that description is about right, I'm not sure the improvement warrants such prominent placement, at least in sheer terms of its likely impact on the performance of production PostgreSQL systems - we packed a lot of great performance improvements into 9.2. The likely reason it was picked up on in the book, and the real reason for this blogpost, is the story behind the development of the optimisation, which I for one find interesting and worth sharing. It's more interesting from the perspective of someone with a general interest in systems programming or PostgreSQL's design philosophy than from that of a casual user, though. If you're a casual user, the short version is that simple queries that perform in-memory sorting of integers and floats will be about 23% faster.
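The story is the subject of the post, but the principle behind this kind of micro-optimisation is easy to demonstrate: in an in-memory sort, per-comparison overhead dominates, so shaving even a small constant cost off each comparison pays. The sketch below - generic C, not the actual 9.2 patch - contrasts qsort(), which pays for an indirect call through a function pointer on every comparison, with a specialised sort whose integer comparison is inlined straight into the loop.

#include <stddef.h>
#include <stdlib.h>

/* Generic path: every comparison is an indirect call through a pointer. */
static int
cmp_int(const void *a, const void *b)
{
    int         x = *(const int *) a;
    int         y = *(const int *) b;

    return (x > y) - (x < y);
}

static void
sort_generic(int *v, size_t n)
{
    qsort(v, n, sizeof(int), cmp_int);
}

/*
 * Specialised path: the element type is known, so the comparison is an
 * inlined integer compare inside the loop, with no per-comparison call
 * overhead.  (An insertion sort keeps the sketch short; a real
 * implementation would specialise a quicksort in the same way.)
 */
static void
sort_int_specialized(int *v, size_t n)
{
    for (size_t i = 1; i < n; i++)
    {
        int         key = v[i];
        size_t      j = i;

        while (j > 0 && v[j - 1] > key)
        {
            v[j] = v[j - 1];
            j--;
        }
        v[j] = key;
    }
}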
Monday, June 4, 2012
Towards 14,000 write transactions per second on my laptop
Postgres 9.2 will have many improvements to both read and write scalability. Simon Riggs and I collaborated on a performance feature that greatly increased the throughput of small write transactions. Essentially, it accomplishes this by reducing contention on an internal lock called WALWriteLock. When an individual backend/connection holds this lock, it is empowered to write WAL from wal_buffers, an area of shared memory that temporarily holds WAL until it is written, and ultimately flushed, to persistent storage.
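The following is a toy sketch of the group-flush idea only - not PostgreSQL code; the names, the locking scheme, and the I/O stub are all invented for illustration. Whoever holds the lock flushes all of the WAL that has accumulated so far, not just its own commit record, so backends queued behind it usually find their record already durable by the time they acquire the lock, and return without doing any I/O of their own.

#include <pthread.h>
#include <stdint.h>

typedef struct WalState
{
    pthread_mutex_t wal_write_lock; /* stand-in for WALWriteLock */
    uint64_t    insert_lsn;         /* WAL copied into the buffers so far */
    uint64_t    flush_lsn;          /* WAL already durable on disk */
} WalState;

/* stand-in for the real work: write WAL from the buffers, then fsync */
static void
write_and_fsync_up_to(WalState *w, uint64_t target)
{
    (void) w;
    (void) target;
}

void
ensure_flushed(WalState *w, uint64_t my_commit_lsn)
{
    pthread_mutex_lock(&w->wal_write_lock);
    if (w->flush_lsn < my_commit_lsn)
    {
        /*
         * Flush everything available, on behalf of everyone waiting.  (In
         * real code, insert_lsn is maintained under its own locking;
         * simplified here.)
         */
        uint64_t    target = w->insert_lsn;

        write_and_fsync_up_to(w, target);
        w->flush_lsn = target;
    }
    pthread_mutex_unlock(&w->wal_write_lock);
}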
Thursday, March 29, 2012
Much improved statement statistics coming to Postgres 9.2
There is a tendency for people with an interest in improving database performance to imagine that it mostly boils down to factors outside of their application - the hardware, operating system configuration, and database settings. While these are obviously crucially important, experience suggests that in most cases, by far the largest gains are to be had by optimising the application's interaction with the database. Doing so invariably involves analysing which queries are being executed in production, what they cost, and what their significance is to the application or business process that the database supports.
PostgreSQL has had a module available in contrib since version 8.4 - pg_stat_statements, originally developed by Takahiro Itagaki. The module blames execution costs on queries, so that bottlenecks in production can be isolated to points in the application. It does so by providing a view that is continually updated, giving real-time statistical information. Here is an example from the Postgres 9.2 docs:
Saturday, August 6, 2011
Clang now builds Postgres without additional warnings
I'm happy to report that as of this evening, Clang builds PostgreSQL cleanly, apart from a single remaining warning that also occurs when building with GCC and is actually caused by a bug in GNU Flex that the Flex developers don't seem to want to fix. On GCC 4.6, the warning looks like this:
In file included from gram.y:12962:0:
scan.c: In function ‘yy_try_NUL_trans’:
scan.c:16246:23: warning: unused variable ‘yyg’ [-Wunused-variable]
With Clang, however, it looks like this:
scan.c:16246:23: warning: unused variable 'yyg' [-Wunused-variable]
struct yyguts_t * yyg = (struct yyguts_t*)yyscanner; /* This var may be unused depending upon options. */
^
Note that the "^" is directly underneath the offending variable "yyg" on the terminal emulator that generated this warning.
Thursday, July 28, 2011
Could Clang displace GCC generally? Part II: Performance of PostgreSQL binaries
This is the second in a two-part series on Clang. If you haven't already, you'll want to read my original post on the topic, Could Clang displace GCC among PostgreSQL developers? Part I: Intro and compile times.
So, what about the performance of PostgreSQL binaries themselves when built with each compiler? I had heard contradictory reports about the performance of Clang-built binaries. In Belgium, Chris Lattner said that Clang-built binaries could perform better, but a number of independent benchmarks suggested that Clang was generally behind, with some notable exceptions. I asked my 2ndQuadrant colleague and PostgreSQL performance expert Greg Smith to suggest a useful benchmark to serve as a good starting point for comparing Postgres performance when built with Clang against performance when built with GCC. He suggested that I apply Jeff Janes' recent patch for pgbench, which he'd reviewed. It stresses the executor, and therefore the CPU, quite effectively, rather than table locks or IPC mechanisms. The results of this benchmark were very interesting.
Saturday, July 23, 2011
Could Clang displace GCC among PostgreSQL developers? Part I: Intro and compile times
Back in February, I attended FOSDEM, the free and open source software developers' European meeting. Most of my time was spent manning the PostgreSQL stand and networking with fellow members of the community; I often find the “hall track” of conferences is of most interest.
On this occasion though, there was one talk in particular that I really wanted to see: Chris Lattner's “LLVM and Clang: Advancing Compiler Technology”. I was certainly not alone in having the subject pique my interest, as it filled what was apparently Belgium's largest auditorium to capacity, and I was joined by a number of other PostgreSQL people, including some senior community members. Chris' slides are available from here:
http://www.scribd.com/doc/48921683/LLVM-Clang-Advancing-Compiler-Technology