For a long time, I've written all my automation scripts in bourne shell and bash with occasional forays into perl. I really like bash because you're always guaranteed to have bash installed on a modern linux system and so don't need to install perl and 15 modules to get the job done.
That said, I've recently begun using python for more automation. As an exercise, I've rewritten my dbbackup.sh script for python 2.7 trying to use only the standard library.
The script pg_dumps all the databases in a PostgreSQL cluster in the custom format allowing you to specify the compression level. (See this blog post regarding pg_dump compression level performance.) It also dumps the globals to a bzip2'd SQL file and then removes files in the dump destination directory that are older than the --keep value (default 7 days). Finally, if it had any errors during the run, it sends the error log file to the specified email address. It also picks up libpq environment variables, but will override those with anything specified on the command line.
You can find it here: https://github.com/pgexperts/pgbackup
Frosty PostgreSQL
Wednesday, May 23, 2012
Tuesday, May 15, 2012
Slony Upstart Script for Ubuntu 10.04 LTS
Since Ubuntu is moving towards using upstart instead of SysV init scripts, I thought I'd write a quick upstart script for starting slony compiled and installed in /usr/local/slony. Upstart 1.4+ supports setuid, but the version of upstart included with Ubuntu 10.04LTS is only 0.6.5 and even on Ubuntu 11.10, it's only 1.3, so instead I use start-stop-daemon's --chuid switch. The script should be installed as /etc/init/slony.conf. Here's the script:
---cut here---
# slony - slon daemon
#
# slon is the slony replication daemon
description "slony slon daemon"
start on net-device-up
stop on runlevel [06]
# can't use setuid until upstart 1.4
#setuid slony
# which is why we use start-stop-daemon with the --chuid switch
respawn
script
exec start-stop-daemon --start --chuid slony --exec /usr/local/slony/bin/slon -- -f /etc/slon.conf >> /var/log/slony/slony.log 2>&1
end script
---cut here---
This will have the slon daemon run as the slony user and log to /var/log/slony/slony.log. You'll also need a valid slony config file in /etc/slon.conf.
You can also find the script in the pgexperts upstart-scripts git repo:
https://github.com/pgexperts/upstart-scripts
---cut here---
# slony - slon daemon
#
# slon is the slony replication daemon
description "slony slon daemon"
start on net-device-up
stop on runlevel [06]
# can't use setuid until upstart 1.4
#setuid slony
# which is why we use start-stop-daemon with the --chuid switch
respawn
script
exec start-stop-daemon --start --chuid slony --exec /usr/local/slony/bin/slon -- -f /etc/slon.conf >> /var/log/slony/slony.log 2>&1
end script
---cut here---
This will have the slon daemon run as the slony user and log to /var/log/slony/slony.log. You'll also need a valid slony config file in /etc/slon.conf.
You can also find the script in the pgexperts upstart-scripts git repo:
https://github.com/pgexperts/upstart-scripts
Monday, April 2, 2012
Using the row_number() window function to reorder images
One of my customers uses the rails acts_as_list module for maintaining order of images on forms.
Through some application issues they ended up with images that had identical order for the same provider and image_category. So, it ended up looking like this:
select provider_id, image_category_id, position, id from images where image_category_id=1234 order by position;
Note that there are several rows with position=4 above. This was a prevalent issue throughout the images table and not just with this example provider.
So, what they wanted was to reorder the images such that the equal positions were kept together and the overall order stayed the same. So, for example, it should look like this (sorry for the small font):
Through some application issues they ended up with images that had identical order for the same provider and image_category. So, it ended up looking like this:
select provider_id, image_category_id, position, id from images where image_category_id=1234 order by position;
provider_id | image_category_id | position | id
------------+-------------------+----------+--------
9999 | 1234 | 3 | 484900
9999 | 1234 | 4 | 484899
9999 | 1234 | 4 | 484897
9999 | 1234 | 4 | 484896
9999 | 1234 | 4 | 484894
9999 | 1234 | 4 | 484893
9999 | 1234 | 4 | 484892
9999 | 1234 | 4 | 484890
9999 | 1234 | 5 | 484889
9999 | 1234 | 6 | 484887
9999 | 1234 | 7 | 484886
9999 | 1234 | 8 | 479818
9999 | 1234 | 9 | 479817
9999 | 1234 | 10 | 479815
Note that there are several rows with position=4 above. This was a prevalent issue throughout the images table and not just with this example provider.
So, what they wanted was to reorder the images such that the equal positions were kept together and the overall order stayed the same. So, for example, it should look like this (sorry for the small font):
provider_id |image_category_id | position | id
-------------+------------------+----------+-------
9999 | 1234 | 3 | 484900 ==> Position 1
9999 | 1234 | 4 | 484899 ==> Position 2
9999 | 1234 | 4 | 484897 ==> Position 3
9999 | 1234 | 4 | 484896 ==> Position 4
9999 | 1234 | 4 | 484894 ==> Position 5
9999 | 1234 | 4 | 484893 ==> Position 6
9999 | 1234 | 4 | 484892 ==> Position 7
9999 | 1234 | 4 | 484890 ==> Position 8
9999 | 1234 | 5 | 484889 ==> Position 9
9999 | 1234 | 6 | 484887 ==> Position 10
9999 | 1234 | 7 | 484886 ==> Position 11
9999 | 1234 | 8 | 479818 ==> Position 12
9999 | 1234 | 9 | 479817 ==> Position 13
9999 | 1234 | 10 | 479815 ==> Position 14
So, I thought about it for a bit, then it occurred to me: why not use the row_number() window function. So I wrote the following migration SQL script:
BEGIN;
CREATE temp TABLE new_image_positions
ON COMMIT DROP
AS
SELECT id
, row_number AS position
FROM (
SELECT provider_type
, provider_id
, image_category_id
, position
, id
, row_number()
OVER (
partition BY provider_id
, image_category_id
ORDER BY position
)
FROM images
WHERE provider_type = 'Project'AND position IS NOT NULL
) x ;
UPDATE images
SET position = n.position
FROM new_image_positions n
WHERE images.id = n.id;
COMMIT;And, voila, all the images are now properly ordered:
provider_id | image_category_id | position | id
-------------+-------------------+----------+--------
9999 | 1234 | 1 | 484900
9999 | 1234 | 2 | 484899
9999 | 1234 | 3 | 484897
9999 | 1234 | 4 | 484896
9999 | 1234 | 5 | 484894
9999 | 1234 | 6 | 484893
9999 | 1234 | 7 | 484892
9999 | 1234 | 8 | 484890
9999 | 1234 | 9 | 484889
9999 | 1234 | 10 | 484887
9999 | 1234 | 11 | 484886
9999 | 1234 | 12 | 479818
9999 | 1234 | 13 | 479817
9999 | 1234 | 14 | 479815
<SNIP>
9999 | 1234 | 46 | 403052
9999 | 1234 | 47 | 403051
9999 | 1234 | 48 | 403050
9999 | 1234 | 49 | 403049
9999 | 1235 | 1 | 484916
9999 | 1235 | 2 | 484915
9999 | 1235 | 3 | 484912
9999 | 1235 | 4 | 484910
9999 | 1236 | 1 | 493000
9999 | 1236 | 2 | 492999
Friday, December 23, 2011
pg_dump compression settings
After doing the base backup benchmarks, I thought it would be interesting to benchmark pg_dump run locally and remotely using all the different compression settings. This will allow me to compare the dump times and the space savings as well as see how much dumping remotely slows the process. Nobody is storing their dumps on the same server, right?
For the record, all these dumps used pg_dump's custom dump format (-Fc), the PGDATA directory is 93GB on disk and resides on a 4 disk RAID10 of 7200 RPM SATA drives.
CPU on the test server: Intel(R) Xeon(R) CPU X3330 @ 2.66GHz.
CPU on the test remote client: Intel(R) Core(TM) i7 CPU 960 @ 3.20GHz.
Network is gigabit.
The total size reported is from du -sh output.
Both client and test server are running postgresql-9.1.2.
For comparison, our best base backup time from the base backup blog post was: 15m52.221s
The results:
So, a few interesting observations:
For the record, all these dumps used pg_dump's custom dump format (-Fc), the PGDATA directory is 93GB on disk and resides on a 4 disk RAID10 of 7200 RPM SATA drives.
CPU on the test server: Intel(R) Xeon(R) CPU X3330 @ 2.66GHz.
CPU on the test remote client: Intel(R) Core(TM) i7 CPU 960 @ 3.20GHz.
Network is gigabit.
The total size reported is from du -sh output.
Both client and test server are running postgresql-9.1.2.
For comparison, our best base backup time from the base backup blog post was: 15m52.221s
The results:
Local pg_dump: ------------------- compression: 0 time: 19m7.455s total size: 77G ------------------- compression: 1 time: 21m53.128s total size: 11G ------------------- compression: 2 time: 22m27.507s total size: 11G ------------------- compression: 3 time: 24m18.966s total size: 9.8G ------------------- compression: 4 time: 30m10.815s total size: 9.2G ------------------- compression: 5 time: 34m26.119s total size: 8.3G ------------------- compression: 6 time: 41m35.340s total size: 8.0G ------------------- compression: 7 time: 49m4.484s total size: 7.9G ------------------- compression: 8 time: 91m28.689s total size: 7.8G ------------------- compression: 9 time: 103m24.883s total size: 7.8G |
Remote pg_dump: ------------------- compression: 0 time: 20m1.363s total size: 77G ------------------- compression: 1 time: 22m9.205s total size: 11G ------------------- compression: 2 time: 22m19.158s total size: 11G ------------------- compression: 3 time: 23m7.426s total size: 9.8G ------------------- compression: 4 time: 26m10.383s total size: 9.2G ------------------- compression: 5 time: 28m57.431s total size: 8.3G ------------------- compression: 6 time: 33m23.939s total size: 8.0G ------------------- compression: 7 time: 38m11.321s total size: 7.9G ------------------- compression: 8 time: 62m27.385s total size: 7.8G ------------------- compression: 9 time: 72m5.123s total size: 7.8G |
So, a few interesting observations:
- Base backups are indeed faster, but that's no surprise since they have less overhead.
- Taking the backup locally is only slightly faster than remotely for the smaller compression levels
- At compression level 3 and above the faster CPU on the test client becomes a huge benefit
- Looks like the default compression is level 6. This might not be what you want if you're concerned about minimizing the backup time and have the disk space to spare.
Labels:
benchmark,
pg_dump,
postgresql
Tuesday, December 6, 2011
PostgreSQL Base Backup Benchmark Results
For a while now we've sort of anecdotally thought that compression should help with base backups and also that the rsync server is faster than rsync over ssh, so while nothing was happening on our test server, I thought I'd benchmark a few different methods.
Looks like rsync using an rsync server with no compression was fastest, but not by much. Pg_basebackup is right there with rsync using any of the three RC4 ciphers all within a few seconds of one another. Looks like blowfish-cbc comes in next followed by aes128-cbc.
Most interesting is that adding compression quadruples the sync time! That's probably not the case over a slow link, but over gigabit it's definitely the case.
Tests were done with a 93GB PGDATA, gigabit network, linux software 4 disk RAID10 data dir on the first test server copied to linux software 2 disk RAID0 on the second server.
Following are the results:
Following are the results:
Without compression:
time rsync test0::data /data/tmp/test-data/ -avP
real 15m52.221s
user 3m10.257s
sys 4m11.460s
time /usr/pgsql-9.1/bin/pg_basebackup -D /data/tmp/test-data -c fast -h test0 -U postgres
real 16m13.440s
user 0m17.314s
sys 2m18.217s
time rsync -e 'ssh -c arcfour' postgres@test0:/data/9.1/data/ /data/tmp/test-data/ -avP
real 16m13.242s
user 8m32.674s
sys 5m28.312s
time rsync -e 'ssh -c arcfour256' postgres@test0:/data/9.1/data/ /data/tmp/test-data/ -avP
real 16m13.656s
user 8m48.356s
sys 6m7.343s
time rsync -e 'ssh -c arcfour128' postgres@test0:/data/9.1/data/ /data/tmp/test-data/ -avP
real 16m3.947s
user 8m44.627s
sys 6m7.499s
time rsync -e 'ssh -c blowfish-cbc' postgres@test0:/data/9.1/data/ /data/tmp/test-data/ -avP
real 26m22.516s
user 19m16.038s
sys 8m45.046s
time rsync -e 'ssh -c aes128-cbc' postgres@test0:/data/9.1/data/ /data/tmp/test-data/ -avP
real 29m30.785s
user 27m31.919s
sys 6m5.154s
time rsync -e 'ssh' postgres@test0:/data/9.1/data/ /data/tmp/test-data/ -avP
real 30m50.301s
user 24m10.951s
sys 9m13.090s
time rsync -e 'ssh -c aes256-ctr' postgres@test0:/data/9.1/data/ /data/tmp/test-data/ -avP
real 38m23.789s
user 31m36.011s
sys 8m39.210s
Our Best Performers With Compression:
time rsync -e 'ssh -c arcfour' postgres@test0:/data/9.1/data/ /data/tmp/test-data/ -avPz
real 62m31.933s
user 14m48.752s
sys 2m42.412s
time rsync test0::data /data/tmp/test-data/ -avPz
real 62m53.699s
user 6m42.846s
sys 1m44.357s
Update:
Since Rod, Joshua and Rodrigo's comments suggested pigz and netcat below, I thought I'd give them both a quick try to see if they produced a winner. Conveniently, gnu-tar has a --use-compress-program option, so you can easily use pigz in this manner.
So, tar over ssh with the arcfour256 cipher using pigz for compression:
time ssh -c arcfour256 test0 "cd /data/9.1/data ; tar --use-compress-program=pigz -cf - ." | (cd /data/tmp/test-data ; tar --use-compress-program=pigz -xvf -)
real 18m11.901s
user 7m3.587s
sys 3m53.487s
On test0: tar --use-compress-program=pigz -cf - . | nc -l 7000
On test1: time nc test0 7000 | tar --use-compress-program=pigz -xvf -
real 17m46.805s
user 6m17.562s
sys 4m4.563s
So, using tar with pigz over ssh and over netcat are much faster compression methods, they still are a little slower than no compression. This got me wondering if perhaps the 93GB of PGDATA doesn't compress all that well:
14G /data/tmp/data-test.tgz
So, that's 6:1 compression. Not bad! Interestingly, it took gnu-tar + pigz about 17m to create the test tgz file. That's about identical to the time it took to ship it over the network and uncompress/untar it on the other end.
That made me try tar with no compression piped over netcat:
On test0: tar -cf - . | nc -l 7000
On test1: time nc test0 7000 | tar -xvf -
real 15m52.313s
user 0m10.033s
sys 3m17.630s
So, tar with no compression over netcat clocked in almost identical to using rsync's native server.
Anyone have any other tools for me to test?
Looks like rsync using an rsync server with no compression was fastest, but not by much. Pg_basebackup is right there with rsync using any of the three RC4 ciphers all within a few seconds of one another. Looks like blowfish-cbc comes in next followed by aes128-cbc.
Most interesting is that adding compression quadruples the sync time! That's probably not the case over a slow link, but over gigabit it's definitely the case.
Tests were done with a 93GB PGDATA, gigabit network, linux software 4 disk RAID10 data dir on the first test server copied to linux software 2 disk RAID0 on the second server.
Following are the results:
Following are the results:
Without compression:
time rsync test0::data /data/tmp/test-data/ -avP
real 15m52.221s
user 3m10.257s
sys 4m11.460s
time /usr/pgsql-9.1/bin/pg_basebackup -D /data/tmp/test-data -c fast -h test0 -U postgres
real 16m13.440s
user 0m17.314s
sys 2m18.217s
time rsync -e 'ssh -c arcfour' postgres@test0:/data/9.1/data/ /data/tmp/test-data/ -avP
real 16m13.242s
user 8m32.674s
sys 5m28.312s
time rsync -e 'ssh -c arcfour256' postgres@test0:/data/9.1/data/ /data/tmp/test-data/ -avP
real 16m13.656s
user 8m48.356s
sys 6m7.343s
time rsync -e 'ssh -c arcfour128' postgres@test0:/data/9.1/data/ /data/tmp/test-data/ -avP
real 16m3.947s
user 8m44.627s
sys 6m7.499s
time rsync -e 'ssh -c blowfish-cbc' postgres@test0:/data/9.1/data/ /data/tmp/test-data/ -avP
real 26m22.516s
user 19m16.038s
sys 8m45.046s
time rsync -e 'ssh -c aes128-cbc' postgres@test0:/data/9.1/data/ /data/tmp/test-data/ -avP
real 29m30.785s
user 27m31.919s
sys 6m5.154s
time rsync -e 'ssh' postgres@test0:/data/9.1/data/ /data/tmp/test-data/ -avP
real 30m50.301s
user 24m10.951s
sys 9m13.090s
time rsync -e 'ssh -c aes256-ctr' postgres@test0:/data/9.1/data/ /data/tmp/test-data/ -avP
real 38m23.789s
user 31m36.011s
sys 8m39.210s
Our Best Performers With Compression:
time rsync -e 'ssh -c arcfour' postgres@test0:/data/9.1/data/ /data/tmp/test-data/ -avPz
real 62m31.933s
user 14m48.752s
sys 2m42.412s
time rsync test0::data /data/tmp/test-data/ -avPz
real 62m53.699s
user 6m42.846s
sys 1m44.357s
Update:
Since Rod, Joshua and Rodrigo's comments suggested pigz and netcat below, I thought I'd give them both a quick try to see if they produced a winner. Conveniently, gnu-tar has a --use-compress-program option, so you can easily use pigz in this manner.
So, tar over ssh with the arcfour256 cipher using pigz for compression:
time ssh -c arcfour256 test0 "cd /data/9.1/data ; tar --use-compress-program=pigz -cf - ." | (cd /data/tmp/test-data ; tar --use-compress-program=pigz -xvf -)
real 18m11.901s
user 7m3.587s
sys 3m53.487s
On test0: tar --use-compress-program=pigz -cf - . | nc -l 7000
On test1: time nc test0 7000 | tar --use-compress-program=pigz -xvf -
real 17m46.805s
user 6m17.562s
sys 4m4.563s
So, using tar with pigz over ssh and over netcat are much faster compression methods, they still are a little slower than no compression. This got me wondering if perhaps the 93GB of PGDATA doesn't compress all that well:
14G /data/tmp/data-test.tgz
So, that's 6:1 compression. Not bad! Interestingly, it took gnu-tar + pigz about 17m to create the test tgz file. That's about identical to the time it took to ship it over the network and uncompress/untar it on the other end.
That made me try tar with no compression piped over netcat:
On test0: tar -cf - . | nc -l 7000
On test1: time nc test0 7000 | tar -xvf -
real 15m52.313s
user 0m10.033s
sys 3m17.630s
So, tar with no compression over netcat clocked in almost identical to using rsync's native server.
Anyone have any other tools for me to test?
Labels:
base backup,
benchmark,
cipher,
pitr,
postgresql,
ssh
Subscribe to:
Posts (Atom)