PostgreSQL has had a module available in contrib since version 8.4 - pg_stat_statements, originally developed by Takahiro Itagaki. The module attributes execution costs to individual queries, so that bottlenecks in production can be isolated to specific points in the application. It does so by providing a continually updated view of real-time statistics. Here is an example from the Postgres 9.2 docs:
bench=# SELECT pg_stat_statements_reset();

$ pgbench -i bench
$ pgbench -c10 -t300 bench

bench=# \x
bench=# SELECT query, calls, total_time, rows,
               100.0 * shared_blks_hit / nullif(shared_blks_hit + shared_blks_read, 0) AS hit_percent
          FROM pg_stat_statements
          ORDER BY total_time DESC
          LIMIT 5;
-[ RECORD 1 ]---------------------------------------------------------------------
query       | UPDATE pgbench_branches SET bbalance = bbalance + ? WHERE bid = ?;
calls       | 3000
total_time  | 9.60900100000002
rows        | 2836
hit_percent | 99.9778970000200936
-[ RECORD 2 ]---------------------------------------------------------------------
query       | UPDATE pgbench_tellers SET tbalance = tbalance + ? WHERE tid = ?;
calls       | 3000
total_time  | 8.015156
rows        | 2990
hit_percent | 99.9731126579631345
-[ RECORD 3 ]---------------------------------------------------------------------
query       | copy pgbench_accounts from stdin
calls       | 1
total_time  | 0.310624
rows        | 100000
hit_percent | 0.30395136778115501520
-[ RECORD 4 ]---------------------------------------------------------------------
query       | UPDATE pgbench_accounts SET abalance = abalance + ? WHERE aid = ?;
calls       | 3000
total_time  | 0.271741999999997
rows        | 3000
hit_percent | 93.7968855088209426
-[ RECORD 5 ]---------------------------------------------------------------------
query       | alter table pgbench_accounts add primary key (aid)
calls       | 1
total_time  | 0.08142
rows        | 0
hit_percent | 34.4947735191637631
This is an extremely useful feature. However, its usefulness has historically been highly limited by the fact that it differentiated queries based solely on their query string. This effectively limited the use of the module to code that mostly or entirely used prepared statements. If non-prepared statements were used, what you ended up with was something that looked like this:
-[ RECORD 1 ]---------------------------------------------------------------------
query       | UPDATE pgbench_branches SET bbalance = bbalance + 5 WHERE bid = 3;
calls       | 1
total_time  | 0.002
rows        | 1
hit_percent | 99.9778970000200936
-[ RECORD 2 ]---------------------------------------------------------------------
query       | UPDATE pgbench_branches SET bbalance = bbalance + 4343 WHERE bid = 42;
calls       | 1
total_time  | 0.0044
rows        | 1
hit_percent | 99.9778970000200936
-[ RECORD 3 ]---------------------------------------------------------------------
query       | UPDATE pgbench_branches SET bbalance = bbalance + -2329 WHERE bid = 4543;
calls       | 1
total_time  | 0.003
rows        | 1
hit_percent | 99.9778970000200936
-[ RECORD 4 ]---------------------------------------------------------------------
query       | UPDATE pgbench_branches SET bbalance = bbalance + 9005 WHERE bid = 7392;
calls       | 1
total_time  | 0.005
rows        | 1
hit_percent | 99.9778970000200936

Pretty soon, the shared memory area that stores these statements is filled with a great many entries, each perhaps differing only in the constants used in a single execution of the query, and with no way to aggregate the information to usefully inform optimisation efforts.
Why not just always use prepared statements? Well, some applications do. Prepared statements have historically been well supported by JDBC, for example, so Java apps tended to be okay. However, there are practical reasons to avoid them. Prepared statements can easily suffer performance regressions compared to equivalent unprepared queries, since the optimizer won't necessarily be able to use as much information about MCVs (most common values) and so on as it can when planning with the actual constants. There are also limitations in some popular ORMs; Rails' ActiveRecord, for example, only added prepared statement support recently. Ad-hoc queries are generally never prepared either. The traditional solution was log-parsing utilities, which analyse log files after the fact and output reports. I have found some of these tools to be rather awkward, with high resource requirements that often necessitate running log analysis on a separate server to process what can be hundreds of megabytes a day of verbose log output. Postgres logs are naturally very verbose when every single query's execution is logged.
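To make the distinction concrete, here is a minimal sketch of the two styles, using SQL-level PREPARE/EXECUTE purely for illustration (most drivers do the equivalent at the protocol level; the table and values are just borrowed from pgbench):

-- Unprepared: the constants are embedded directly in the query text,
-- so each execution arrives at the server as a distinct string.
UPDATE pgbench_branches SET bbalance = bbalance + 5 WHERE bid = 3;
UPDATE pgbench_branches SET bbalance = bbalance + 4343 WHERE bid = 42;

-- Prepared: the query text is fixed and only the parameters vary
-- from one execution to the next.
PREPARE update_branch(int, int) AS
    UPDATE pgbench_branches SET bbalance = bbalance + $1 WHERE bid = $2;
EXECUTE update_branch(5, 3);
EXECUTE update_branch(4343, 42);
DEALLOCATE update_branch;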
Enter pg_stat_statements normalisation. Last night, Tom Lane committed the following patch of mine, which was sponsored by Heroku:
Improve contrib/pg_stat_statements to lump "similar" queries together.
pg_stat_statements now hashes selected fields of the analyzed parse tree to assign a "fingerprint" to each query, and groups all queries with the same fingerprint into a single entry in the pg_stat_statements view. In practice it is expected that queries with the same fingerprint will be equivalent except for values of literal constants. To make the display more useful, such constants are replaced by "?" in the displayed query strings.
This mechanism currently supports only optimizable queries (SELECT, INSERT, UPDATE, DELETE). Utility commands are still matched on the basis of their literal query strings.
There remain some open questions about how to deal with utility statements that contain optimizable queries (such as EXPLAIN and SELECT INTO) and how to deal with expiring speculative hashtable entries that are made to save the normalized form of a query string. However, fixing these issues should require only localized changes, and since there are other open patches involving contrib/pg_stat_statements, it seems best to go ahead and commit what we've got.
Peter Geoghegan, reviewed by Daniel Farina
I would like to acknowledge the invaluable assistance of both Tom and Daniel in bringing this project to maturity.
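To make the effect concrete: the four unprepared pgbench_branches updates shown earlier all share a fingerprint, and so accumulate under a single entry, with the constants swapped out for "?" in the displayed query string. Illustratively (this record is mocked up from the earlier figures, not captured from a real run), the view entry would look something like:

-[ RECORD 1 ]---------------------------------------------------------------------
query       | UPDATE pgbench_branches SET bbalance = bbalance + ? WHERE bid = ?;
calls       | 4
total_time  | 0.0144
rows        | 4
hit_percent | 99.9778970000200936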
I have benchmarked this feature, and found that it implies an overhead of about 1% - 2.5% on pgbench's default TPC-B style workload. This is rather good, and only marginally worse than the hit taken when using prepared statements with classic pg_stat_statements.
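If you want a rough sense of the overhead on your own hardware, the measurement is just the default pgbench workload run with and without the module loaded. A minimal sketch - the scale factor, client count and duration here are arbitrary, not the exact settings I used:

# postgresql.conf - the module must be loaded at server start
shared_preload_libraries = 'pg_stat_statements'

# Initialise and run the default TPC-B style workload
$ createdb bench
$ pgbench -i -s 10 bench
$ pgbench -c 10 -T 60 bench

# Repeat with shared_preload_libraries left empty and compare the tps figures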
As if that wasn't good enough, pg_stat_statements has been made even more useful by the efforts of other people for the upcoming 9.2 release. A patch by Ants Aasma was recently committed to expose I/O timings at the query granularity through pg_stat_statements. Robert Haas wrote a patch to expose blocks dirtied and written by statements through pg_stat_statements too.
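With those patches in, and track_io_timing enabled, the per-query I/O figures should be queryable straight from the view. A sketch against the 9.2 column set (treat the query itself as illustrative):

-- Requires track_io_timing = on for the blk_*_time columns to be populated
SELECT query,
       calls,
       blk_read_time,        -- time spent reading blocks, in milliseconds
       blk_write_time,       -- time spent writing blocks, in milliseconds
       shared_blks_dirtied,
       shared_blks_written
  FROM pg_stat_statements
 ORDER BY blk_read_time + blk_write_time DESC
 LIMIT 10;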
I expect the uptake of pg_stat_statements to really climb after the release of Postgres 9.2.
Congratulations, Peter. I'm delighted that we were able to sponsor this work. It was a surprisingly difficult and subtle piece of work, but I can't wait to be able to take advantage of it in 9.2.
This is great stuff. Congratulations, Peter.
Thanks guys!
Well done mate!
Cheers Gabriele.
That's a great feature. It also has an unintended consequence that I can no longer use pg_stat_statements as a diagnostic tool for my specific case.
I've inherited a codebase that never used prepared statements and binds. I've been slowly refactoring the code to use prepared statements. The way I identified which queries were actually getting called, and so first on the hit list for refactoring, was by looking at pg_stat_statements.
Since the literals in the queries would always change, a count=1 in pg_stat_statements probably meant an unbound query. And since pg_stat_statements rotates out the contents of its pseudotable, I know that the query had been recently executed.
So, this patch will kill that ability for me. Oh well. Maybe I'll just switch to finding queries that have been called a lot and don't have a dollar sign in the query column.
Thanks for the work. This will be a win for optimization. I definitely could have used this when I first got my ugly code and was looking for slow queries.
@Andy
Thanks.
While I did originally have a legacy compatibility mode, that was ultimately opposed by others. However, the fix in your case is simple - there's nothing (for now) to stop you from building a 9.1 pg_stat_statements against 9.2.
Technically it's unsupported, but I can think of no reason why it wouldn't work.
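For anyone wanting to try that, a contrib module can usually be built out-of-tree against an installed server with PGXS; something roughly like the following, assuming the 9.1 pg_stat_statements sources are in the current directory and that pg_config lives at the path shown (the install path is hypothetical, and this is untested, so treat it as a sketch):

$ cd pg_stat_statements        # the 9.1 contrib sources
$ make USE_PGXS=1 PG_CONFIG=/usr/pgsql-9.2/bin/pg_config
$ sudo make USE_PGXS=1 PG_CONFIG=/usr/pgsql-9.2/bin/pg_config install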