Optimizing Postgres for Sal
29 Aug 2018
Over time, you may notice your Sal install getting slower and slower - this will happen faster the more devices you have checking in. You may even see ridiculous amounts of disk space being used - maybe even 1 GB per hour. This can all be solved by tweaking some simple maintenance settings on your Postgres server.
Before we crack on with how to stop this from happening, it will be useful to know how Postgres handles deleted data.
Take the following table (this is a representation of the facts table in Sal):
When a device checks into Sal, Sal could ask the database what facts are stored for the machine, iterate over each one, and work out which ones have values that need updating, which ones are missing, and which ones need to be removed. Instead, Sal instructs the database to delete all of the facts for that device, and then to save the new ones. What could potentially be 1000 operations becomes two, which is much faster.
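The delete-then-save pattern described above can be sketched in a few lines. This is only an illustration, not Sal's actual code: it uses an in-memory SQLite database as a stand-in for Postgres, and the table layout, the column names and the replace_facts helper are all assumptions made for the example.

```python
import sqlite3


def replace_facts(conn, machine_id, new_facts):
    """Swap a machine's facts wholesale: one DELETE, then one bulk INSERT.

    Hypothetical helper; the table and column names are assumptions.
    """
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute("DELETE FROM facts WHERE machine_id = ?", (machine_id,))
        conn.executemany(
            "INSERT INTO facts (machine_id, name, value) VALUES (?, ?, ?)",
            [(machine_id, name, value) for name, value in new_facts.items()],
        )


# In-memory SQLite stands in for the real Postgres database here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (machine_id INTEGER, name TEXT, value TEXT)")

# First check-in, then a second check-in that changes one fact and adds another.
replace_facts(conn, 1, {"os_version": "10.13.6", "memory": "8 GB"})
replace_facts(conn, 1, {"os_version": "10.14", "memory": "8 GB", "uptime": "3 days"})

rows = conn.execute(
    "SELECT name, value FROM facts WHERE machine_id = 1 ORDER BY name"
).fetchall()
```

No comparison logic is needed: whatever the device reported last is what ends up in the table. The cost is that every check-in deletes every existing row for that device, even the rows whose values did not change.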
You would expect Postgres to delete the rows out of the database at this point. Unfortunately that isn’t what happens. What actually happens is Postgres marks the row as able to be deleted, leaving behind what is known as a dead tuple. There are various good reasons for this outlined in the documentation which I won’t go into here, but when an application like Sal is updating and deleting data constantly, the disk usage can skyrocket.
As time goes on, these dead tuples will mount up. This is where the database’s maintenance tasks come in. They are supposed to come along and vacuum the tables, removing these dead tuples.
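If you want to see how many dead tuples have piled up, Postgres tracks this in its pg_stat_user_tables statistics view. The sketch below is one way to read it from Python. It assumes a psycopg2-style DB-API cursor (the %s placeholder is that driver's convention), and the worst_tables helper is a hypothetical name made up for this example.

```python
# Reads the standard pg_stat_user_tables statistics view.
DEAD_TUPLE_QUERY = """
SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT %s
"""


def worst_tables(cursor, limit=10):
    """Return (table, live, dead, dead_ratio, last_autovacuum) per table.

    Hypothetical helper; `cursor` is assumed to be a psycopg2-style cursor.
    """
    cursor.execute(DEAD_TUPLE_QUERY, (limit,))
    report = []
    for relname, live, dead, last_autovacuum in cursor.fetchall():
        total = live + dead
        # Share of the table's tuples that are dead; 0.0 for an empty table.
        ratio = dead / total if total else 0.0
        report.append((relname, live, dead, ratio, last_autovacuum))
    return report
```

A table with a high dead ratio and an old (or empty) last_autovacuum value is a sign that maintenance is not keeping up with the rate of deletes.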
So what can we do?
Unfortunately, the default maintenance settings are basically useless. I am not going to go in depth about why I chose the following settings - I learned a lot from this post and adjusted their recommendations to meet our needs. My Postgres server is Amazon’s RDS, so the settings are entered in the Parameter Group for the database. If you are running a bare metal install, you will be editing the Postgres configuration. I have added a few notes about why we chose the value we did next to each setting. Our general goal was to have maintenance performed more frequently, so that each run has less work to do and takes less time, and to give the maintenance worker as many resources as possible so it completes as quickly as possible.
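To give a feel for why the defaults struggle on a big table: according to the Postgres documentation, autovacuum only vacuums a table once its dead tuples exceed autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * (number of rows), and the stock values are 50 and 0.2. The sketch below just does that arithmetic. The 10 million row table and the 0.01 scale factor are illustrative assumptions, not the values we settled on.

```python
def autovacuum_trigger_point(row_count, threshold=50, scale_factor=0.2):
    """Dead tuples needed before autovacuum will vacuum a table.

    Defaults mirror the stock Postgres values for
    autovacuum_vacuum_threshold and autovacuum_vacuum_scale_factor.
    """
    return threshold + scale_factor * row_count


# Hypothetical table size, chosen only for illustration.
rows_in_table = 10_000_000

# With the stock settings, roughly 2 million dead tuples must accumulate first.
default_trigger = autovacuum_trigger_point(rows_in_table)

# With an illustrative lower scale factor, the trigger drops to roughly 100 thousand.
tuned_trigger = autovacuum_trigger_point(rows_in_table, scale_factor=0.01)
```

Because the trigger scales with table size, the bigger the table gets, the longer Postgres waits before cleaning it, and the more work each run has to do when it finally starts.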