HA PostgreSQL Node Config
Configurations
The PostgreSQL node in Chef Automate HA provides configuration options you can use to customize its behavior and meet specific requirements. This guide documents all settings you can patch.
The detailed document about how these individual properties affect the system is at Official PostgreSQL docs
Patch the below configuration to PostgreSQL nodes. Please add the values you want to patch to a config.toml file and run the chef-automate config patch config.toml --pg from the bastion.
Logging
log_level = "ERROR"
log_line_prefix = "%t [%p]: [%l-1] user=%u,db=%d,client=%h %r (%x:%e)"
logging_collector = "on"
In the above snippet:
log_levelcontrols which message levels the server log records. Valid values are DEBUG5, DEBUG4, DEBUG3, DEBUG2, DEBUG1, INFO, NOTICE, WARNING, ERROR, LOG, FATAL, and PANIC. The default value is WARNING.log_line_prefixis a printf-style string output at the beginning of each log line.logging_collectorenables the logging collector, which is a background process that captures log messages sent to stderr and redirects them into log files
Checkpoints
checkpoint_timeout = "5min"
max_wal_size = "1GB"
min_wal_size = "80MB"
In the above snippet:
checkpoint_timeoutsets the longest time between automatic WAL checkpoints. The valid range is between 30 seconds and one day. The default is five minutes (5min). Increasing this parameter can increase the amount of time needed for crash recovery.max_wal_sizesets the largest size that WAL can grow to during automatic checkpoints. The default is 1 GB. Increasing this parameter can increase the amount of time needed for crash recovery. Set this parameter in the PostgreSQL.conf file or on the server command line.min_wal_sizehelps reserve enough WAL space to handle spikes in WAL usage, for example, when running large batch jobs. If you specify this value without units, PostgreSQL treats it as megabytes. The default is 80 MB.
Write-ahead log keep size
wal_keep_size = 1600
wal_keep_size specifies the smallest size of past log file segments kept in the pg_wal directory if a standby server needs to fetch them for streaming replication. If wal_keep_size is zero (the default), the system doesn’t keep extra segments for standby purposes. As a result, the number of old WAL segments available to standby servers depends on the location of the previous checkpoint and the status of WAL archiving.
Lock Management
max_locks_per_transaction = 64
The shared lock table tracks locks on max_locks_per_transaction * (max_connections + max_prepared_transactions) objects (for example, tables). This means the system can lock at most that number of distinct objects at a time. This parameter controls the average number of object locks allocated for each transaction. Individual transactions can lock more objects as long as all transaction locks fit in the lock table. This doesn’t limit the number of rows that a transaction can lock. The default is 64.
When running a standby server, you must set this parameter to the same or higher value than on the primary server. Otherwise, the standby server won’t allow queries.
Max connections
max_connections = 350
The highest number of concurrent connections to the database server. The default for Chef Automate is 350 connections.
When running a standby server, set this parameter to the same or higher value than on the primary server. Otherwise, the standby server won’t allow queries.
Note: Modifications to max_connections require a reboot of the leader, and the cluster elects a new leader during that reboot. The configuration reflects the updated max_connections value after the reboot, as described in the PostgreSQL documentation.
Pg Dump
[pg_dump]
enable = true
path = "/mnt/automate_backups/postgresql/pg_dump"
This section configures pg_dump, a PostgreSQL utility for performing database backups. It enables pg_dump and specifies the path where pg_dump stores backups.
Replication
[replication]
name = 'replication'
password = 'replication'
# note: lag_health_threshold is in bytes - default to 300KB
# this is just greater than 1 WAL segment
lag_health_threshold = 307200
# largest lag time in seconds since log was last replayed before replica is eligible for a restart
max_replay_lag_before_restart_s = 180
max_wal_senders = 10
max_replication_slots = 5
wal_sender_timeout = 60
wal_receiver_timeout = 60
wal_compression = "off"
This section configures replication settings:
name: replication namepassword: replication password.lag_health_threshold: Sets the lag health threshold to 307200 bytes (300 KB), the highest allowed replication lag.max_replay_lag_before_restart_s: Custom setting; largest lag time in seconds since log was last replayed before replica is eligible for a restart.max_wal_senders: Limits the number of standbys that can connect for replication (default: 10).max_replication_slots: Sets the number of allowed replication slots (default: 5).wal_sender_timeout: Primary waits 60 seconds for standby response before disconnecting.wal_receiver_timeout: Standby waits 60 seconds for data from primary before timing out.wal_compression: Controls compression of WAL data; “off” disables it, “on” enables it.
Transport security settings
[ssl]
enable = true
issuer_cert = "----Enter Root CA----"
ssl_cert = "----Enter Public Key----"
ssl_key = "----Enter Private Key----"
tls_ciphers = "ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256"
This section configures SSL/TLS settings. It turns on SSL and specifies the root CA (issuer) certificate, the public key certificate, the private key, and the allowed TLS ciphers.
User
[superuser]
name = "admin"
password = "admin"
This section specifies the username and password for the superuser (administrator) account.
Write-ahead log archive
[wal_archive]
enable = false
path = "/mnt/automate_backups/postgresql/archive"
This section configures WAL archiving. It specifies whether WAL archiving is on (false in this case) and the path where the system stores archived WAL files.
Full config for PostgreSQL node
checkpoint_timeout = "5min"
host = "0.0.0.0"
log_level = "ERROR"
log_line_prefix = "%t [%p]: [%l-1] user=%u,db=%d,client=%h %r (%x:%e)"
logging_collector = "on"
max_connections = 350
max_locks_per_transaction = 64
max_wal_size = "1GB"
min_wal_size = "80MB"
port = 5432
print_db_statistics = true
wal_keep_size = 1600
[pg_dump]
enable = true
path = "/mnt/automate_backups/postgresql/pg_dump"
[replication]
lag_health_threshold = 20480
max_replay_lag_before_restart_s = 180
name = "replication"
password = "replication"
[ssl]
enable = true
issuer_cert = "----Enter Root CA----"
ssl_cert = "----Enter Public Key----"
ssl_key = "----Enter Private Key----"
tls_ciphers = "ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256"
[superuser]
name = "admin"
password = "admin"
[wal_archive]
enable = false
path = "/mnt/automate_backups/postgresql/archive"
Example
To increase the log level to DEBUG1:
- Create a log.toml file with the below contents on bastion:
log_level = "DEBUG"
- Run the patch command
chef-automate config patch log.toml --pgto apply the patch.
Centralized logs
Take a tour of the main page to know about Centralized logs.