I had to come up with a way of passing the pg_hexedit frontend utility the relevant pg_attribute metadata to make this work. This metadata describes the "shape" of individual tuples in a relation (backend code uses a closely related structure called a "tuple descriptor"). My approach works seamlessly in simple cases, and it can still be used when running the pg_hexedit command line tool manually.
pg_attribute system catalog table with column annotations/tags
This new capability could be applied to optimizing the data layout of a table that is expected to eventually have a massive number of rows. Carefully choosing the order and type of each column can reduce the total on-disk footprint of a table by an appreciable amount, especially when the final table ends up with several 1-byte columns that get packed together.
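To give a rough sense of the padding arithmetic involved, here is a minimal Python sketch. It is not how pg_hexedit or the backend computes anything; it simply assumes fixed-width, non-NULL columns, an 8-byte maximum alignment (typical of 64-bit platforms), and the usual length and alignment of a few built-in types. The `TYPE_INFO` table and the `data_area_size` helper are hypothetical names made up for this illustration, and the tuple header is ignored entirely.

```python
# Illustrative only: (length, alignment) in bytes for a few fixed-width
# built-in types, as found on typical 64-bit platforms.
TYPE_INFO = {
    "boolean": (1, 1),
    "smallint": (2, 2),
    "integer": (4, 4),
    "bigint": (8, 8),
    "timestamptz": (8, 8),
}
MAXALIGN = 8  # assumed maximum alignment


def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def data_area_size(column_types):
    """Bytes used by one tuple's column data, including alignment padding.

    Assumes every column is fixed-width and non-NULL, and that the data
    area starts on a MAXALIGN boundary. The tuple header is not counted.
    """
    offset = 0
    for typname in column_types:
        typlen, typalign = TYPE_INFO[typname]
        offset = align_up(offset, typalign)  # padding before the column
        offset += typlen
    return align_up(offset, MAXALIGN)  # the tuple as a whole is padded too


# A column order that interleaves narrow and wide columns...
naive = ["boolean", "bigint", "boolean", "integer",
         "boolean", "timestamptz", "smallint"]
# ...versus the same columns ordered from widest alignment to narrowest.
packed = sorted(naive, key=lambda t: TYPE_INFO[t][1], reverse=True)

print(data_area_size(naive))   # 48
print(data_area_size(packed))  # 32
```

Under these assumptions the same seven columns take 48 bytes of column data per row in the interleaved order, but only 32 bytes once the three 1-byte columns are packed together at the end.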
I am aware of several PostgreSQL users who found it worthwhile to have a highly optimized tuple layout, going so far as to use their own custom datatypes. Alignment-aware micro-optimization of a Postgres client application's schema won't help much in most cases, but it can help noticeably with things like fact tables, or tables that contain machine-generated event data. Developing a sense of proportion around storage overhead should now be easier, and more intuitive.