Thursday, March 29, 2012

Much improved statement statistics coming to Postgres 9.2

There is a tendency for people with an interest in improving database performance to imagine that it mostly boils down to factors outside of their application - the hardware, operating system configuration, and database settings. While these are obviously crucially important, experience suggests that in most cases, by far the largest gains are to be had by optimising the application's interaction with the database. Doing so invariably involves analysing which queries are executed in production, what they cost, and how significant each query is to the application or business process that the database supports.

PostgreSQL has had a module available in contrib since version 8.4 - pg_stat_statements, originally developed by Takahiro Itagaki. The module blames execution costs on queries, so that bottlenecks in production can be isolated to particular points in the application. It does so by providing a view that is continually updated, giving real-time statistical information.
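
For anyone who hasn't used the module before, the setup is worth sketching. This is a minimal outline using the settings documented for 9.1/9.2 (check the docs for your exact version); the library has to be loaded at server start, because it reserves shared memory, and the extension is then created in the database you want to monitor:

# postgresql.conf - the module must be preloaded, since it uses shared memory
shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.max = 1000      # optional: number of distinct statements tracked
pg_stat_statements.track = all     # optional: also track statements inside functions

-- after restarting the server, in the target database:
CREATE EXTENSION pg_stat_statements;

With that in place, the view can be queried at any time. Here is an example from the Postgres 9.2 docs: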


bench=# SELECT pg_stat_statements_reset();

$ pgbench -i bench
$ pgbench -c10 -t300 bench

bench=# \x
bench=# SELECT query, calls, total_time, rows, 100.0 * shared_blks_hit /
               nullif(shared_blks_hit + shared_blks_read, 0) AS hit_percent
          FROM pg_stat_statements ORDER BY total_time DESC LIMIT 5;
-[ RECORD 1 ]---------------------------------------------------------------------
query       | UPDATE pgbench_branches SET bbalance = bbalance + ? WHERE bid = ?;
calls       | 3000
total_time  | 9.60900100000002
rows        | 2836
hit_percent | 99.9778970000200936
-[ RECORD 2 ]---------------------------------------------------------------------
query       | UPDATE pgbench_tellers SET tbalance = tbalance + ? WHERE tid = ?;
calls       | 3000
total_time  | 8.015156
rows        | 2990
hit_percent | 99.9731126579631345
-[ RECORD 3 ]---------------------------------------------------------------------
query       | copy pgbench_accounts from stdin
calls       | 1
total_time  | 0.310624
rows        | 100000
hit_percent | 0.30395136778115501520
-[ RECORD 4 ]---------------------------------------------------------------------
query       | UPDATE pgbench_accounts SET abalance = abalance + ? WHERE aid = ?;
calls       | 3000
total_time  | 0.271741999999997
rows        | 3000
hit_percent | 93.7968855088209426
-[ RECORD 5 ]---------------------------------------------------------------------
query       | alter table pgbench_accounts add primary key (aid)
calls       | 1
total_time  | 0.08142
rows        | 0
hit_percent | 34.4947735191637631

This is an extremely useful feature. Historically, however, its usefulness was severely limited by the fact that it differentiated queries based solely on their query strings. That effectively restricted the module to code that mostly or entirely used prepared statements. If non-prepared statements were used, what you ended up with looked something like this:

-[ RECORD 1 ]---------------------------------------------------------------------
query       | UPDATE pgbench_branches SET bbalance = bbalance + 5 WHERE bid = 3;
calls       | 1
total_time  | 0.002
rows        | 1
hit_percent | 99.9778970000200936
-[ RECORD 2 ]---------------------------------------------------------------------
query       | UPDATE pgbench_branches SET bbalance = bbalance + 4343 WHERE bid = 42;
calls       | 1
total_time  | 0.0044
rows        | 1
hit_percent | 99.9778970000200936
-[ RECORD 3 ]---------------------------------------------------------------------
query       | UPDATE pgbench_branches SET bbalance = bbalance + -2329 WHERE bid = 4543;
calls       | 1
total_time  | 0.003
rows        | 1
hit_percent | 99.9778970000200936
-[ RECORD 4 ]---------------------------------------------------------------------
query       | UPDATE pgbench_branches SET bbalance = bbalance + 9005 WHERE bid = 7392;
calls       | 1
total_time  | 0.005
rows        | 1
hit_percent | 99.9778970000200936

Pretty soon, the shared memory area that stores these statements is filled with a great many entries, each perhaps differing only in the constants used in a single execution of the query, and with no way to aggregate the information to usefully inform optimisation efforts.

Why not just always use prepared statements? Well, some applications do. Prepared statements have historically been well supported by JDBC, for example, so Java apps tended to be okay. However, there are practical reasons to avoid them. Prepared statements can easily regress in performance compared to equivalent unprepared queries, since the optimizer cannot necessarily make as much use of statistics such as most common values (MCVs) as it can when planning around known constants. There are also limitations in some popular ORMs; Rails' ActiveRecord, for example, only recently added prepared statement support. Ad-hoc queries are generally never prepared either. The traditional solution was log-parsing utilities, which analyse log files after the fact and output reports. I have found some of these tools rather awkward, with resource requirements high enough that log analysis often has to run on a separate server, processing hundreds of megabytes a day of verbose log output. Postgres logs are naturally very verbose when every single query's execution is logged.
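
To make the distinction concrete, here is a minimal sketch of what preparing a statement means at the SQL level (drivers like JDBC typically do the equivalent at the protocol level; the statement name here is invented for illustration):

PREPARE update_balance (int, int) AS
    UPDATE pgbench_branches SET bbalance = bbalance + $1 WHERE bid = $2;

-- Each execution supplies different constants, but all of them are tied to
-- the single prepared query text, which classic pg_stat_statements can
-- aggregate under one entry.
EXECUTE update_balance(5, 3);
EXECUTE update_balance(4343, 42);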

Enter pg_stat_statements normalisation. Last night, Tom Lane committed the following patch of mine, which was sponsored by Heroku:

Improve contrib/pg_stat_statements to lump "similar" queries together.

pg_stat_statements now hashes selected fields of the analyzed parse tree to assign a "fingerprint" to each query, and groups all queries with the same fingerprint into a single entry in the pg_stat_statements view. In practice it is expected that queries with the same fingerprint will be equivalent except for values of literal constants. To make the display more useful, such constants are replaced by "?" in the displayed query strings.

This mechanism currently supports only optimizable queries (SELECT, INSERT, UPDATE, DELETE). Utility commands are still matched on the basis of their literal query strings.

There remain some open questions about how to deal with utility statements that contain optimizable queries (such as EXPLAIN and SELECT INTO) and how to deal with expiring speculative hashtable entries that are made to save the normalized form of a query string. However, fixing these issues should require only localized changes, and since there are other open patches involving contrib/pg_stat_statements, it seems best to go ahead and commit what we've got.

Peter Geoghegan, reviewed by Daniel Farina

I would like to acknowledge the invaluable assistance of both Tom and Daniel in bringing this project to maturity.
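
To see what the normalisation buys you, here is a quick sketch of the new behaviour (the output is abridged and illustrative, not captured from a real session): execute the same query with different constants, and the view collapses the executions into a single entry, with the constants replaced by "?" in the displayed query string.

bench=# SELECT pg_stat_statements_reset();
bench=# UPDATE pgbench_branches SET bbalance = bbalance + 5 WHERE bid = 3;
bench=# UPDATE pgbench_branches SET bbalance = bbalance + 4343 WHERE bid = 42;
bench=# SELECT query, calls FROM pg_stat_statements WHERE query LIKE 'UPDATE%';
                               query                                | calls
--------------------------------------------------------------------+-------
 UPDATE pgbench_branches SET bbalance = bbalance + ? WHERE bid = ?  |     2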

I have benchmarked this feature, and found that it imposes an overhead of about 1% - 2.5% on pgbench's default TPC-B style workload. This is rather good, and only marginally worse than the hit taken when using prepared statements with classic pg_stat_statements.
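
For anyone wanting to reproduce that sort of measurement, a minimal sketch is to run the same pgbench workload against a server with and without the module loaded, and compare the reported transactions per second (the flags here are my choice, not necessarily those used for the figures above):

$ pgbench -i bench            # initialise the TPC-B style tables
$ pgbench -c 10 -T 60 bench   # baseline run, module not loaded
# set shared_preload_libraries = 'pg_stat_statements', restart, then:
$ pgbench -c 10 -T 60 bench   # compare the reported tps figures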

As if that wasn't good enough, pg_stat_statements has been made even more useful for the upcoming 9.2 release by the efforts of others. A patch by Ants Aasma was recently committed to expose I/O timings at query granularity through pg_stat_statements, and Robert Haas wrote a patch to expose the blocks dirtied and written by each statement through pg_stat_statements too.
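
Putting these together, here is a sketch of the kind of query that becomes possible in 9.2 (column names are as documented for the 9.2 view; note that track_io_timing must be enabled for the timing columns to be non-zero):

bench=# SELECT query, calls,
               blk_read_time, blk_write_time,            -- I/O timings, in milliseconds
               shared_blks_dirtied, shared_blks_written  -- write activity per statement
          FROM pg_stat_statements
         ORDER BY blk_read_time + blk_write_time DESC LIMIT 5;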

I expect the uptake of pg_stat_statements to really climb after the release of Postgres 9.2.

7 comments:

  1. Congratulations, Peter. I'm delighted that we were able to sponsor this work. It was a surprisingly difficult and subtle piece of work, but I can't wait to be able to take advantage of it in 9.2.

  2. This is great stuff. Congratulations, Peter.

  3. That's a great feature. It also has an unintended consequence that I can no longer use pg_stat_statements as a diagnostic tool for my specific case.

    I've inherited a codebase that never used prepared statements and binds. I've been slowly refactoring the code to use prepared statements. The way I identified which queries were actually getting called, and so which were first on the hit list for refactoring, was by looking at pg_stat_statements.

    Since the literals in the queries would always change, a count=1 in pg_stat_statements probably meant an unbound query. And since pg_stat_statements rotates out the contents of its pseudotable, I knew that the query had been recently executed.

    So, this patch will kill that ability for me. Oh well. Maybe I'll just switch to finding queries that have been called a lot and don't have a dollar sign in the query column.

    Thanks for the work. This will be a win for optimization. I definitely could have used this when I first got my ugly code and was looking for slow queries.

  4. @Andy

    Thanks.

    While I did originally have a legacy compatibility mode, that was ultimately opposed by others. However, the fix in your case is simple - there's nothing (for now) to stop you from building a 9.1 pg_stat_statements against 9.2.

    Technically it's unsupported, but I can think of no reason why it wouldn't work.
