Thursday, March 29, 2012

Much improved statement statistics coming to Postgres 9.2

There is a tendency for people with an interest in improving databases performance to imagine that it mostly boils down to factors outside of their application - the hardware, operating system configuration, and database settings. While these are obviously crucially important, experience suggests that in most cases, by far the largest gains are to be had by optimising the application’s interaction with the database. Doing so invariably involves analysing what queries are being executed in production, their costs, and what the significance of the query is to the application or business process that the database supports.

PostgreSQL has had a module available in contrib since version 8.4 - pg_stat_statements, originally developed by Takahiro Itagaki. The module blames execution costs on queries, so that bottlenecks in production can be isolated to points in the application. It does so by providing a view that is continually updated, giving real-time statistical information. Here is an example from the Postgres 9.2 docs:

Saturday, January 28, 2012

Power consumption in Postgres 9.2

One of the issues of major concern to CPU vendors is optimising the power consumption of their devices. In a world where increasingly, computing resources are purchased in terms of fairly abstract units of work, and where, when selecting the location of a major data-centre, the local price of a kilowatt hour is likely to be weighed just as heavily as the wholesale price of bandwidth, this is quite understandable.

Globally, data centres consumed between 1.1 and 1.5 percent of electricity in 2010 (Source: Koomey). The economic and ecological importance of minimizing that number is fairly obvious.

Saturday, August 6, 2011

Clang now builds Postgres without additional warnings

I'm happy to report that as of this evening, Clang builds PostgreSQL without any warnings, apart from a single remaining warning that also occurs when building with GCC, which is actually a bug in GNU Flex that the Flex developers don't seem to want to fix. On GCC 4.6, the warning looks like this:

In file included from gram.y:12962:0:
scan.c: In function ‘yy_try_NUL_trans’:
scan.c:16246:23: warning: unused variable ‘yyg’ [-Wunused-variable]

With Clang, however, it looks like this:

scan.c:16246:23: warning: unused variable 'yyg' [-Wunused-variable]
    struct yyguts_t * yyg = (struct yyguts_t*)yyscanner; /* This var may be unused depending upon options. */
                      ^
Note that the "^" is directly underneath the offending variable "yyg" on the terminal emulator that generated this warning.

Thursday, July 28, 2011

Could Clang displace GCC generally? Part II: Performance of PostgreSQL binaries

This is the second in a two-part series on Clang. If you haven't already, you'll want to read my original post on the topic, Could Clang displace GCC among PostgreSQL developers? Part I: Intro and compile times.

So, what about the performance of PostgreSQL binaries themselves when built with each compiler? I had heard contradictory reports of the performance of binaries built with Clang. In Belgium, Chris Lattner said that Clang built binaries could perform better, but a number of independent benchmarks suggested that Clang was generally behind, with some notable exceptions. I asked 2ndQuadrant colleague and PostgreSQL performance expert Greg Smith to suggest a useful benchmark to serve as a good starting point for comparing Postgres performance when built with Clang to performance when built with GCC. He suggested that I apply Jeff Janes' recent patch for pgbench that he'd reviewed. It stresses the executor, and therefore the CPU quite effectively, rather than table locks or IPC mechanisms. The results of this benchmark were very interesting.