/
Analyzing PostgreSQL RDS Workload with pgbadger

eGov ERP DevOps

Analyzing PostgreSQL RDS Workload with pgbadger

1. Enabling Required Monitoring:

Below are few of parameters which can be good starting point. Start with small amount of logging to avoid overhead on RDS.

log_checkpoints = on
log_connections = on
log_disconnections = on
log_lock_waits = on
log_temp_files = 1
log_autovacuum_min_duration = 0
log_error_verbosity = default
log_statement = all
lc_messages='C'
log_min_duration_statement=100m (Modify this based on need)
rds.force_autovacuum_logging_level = 'log'

Note: Each of these parameters come with its own overhead since information gets logged every time any of these events occurs.

These parameters can be changed directly in parameters group using AWS console for given RDS instance or one can use AWS CLI to change these parameters.

aws rds modify-db-parameter-group \

  • -db-parameter-group-name $parameter_group \

  • -parameters "ParameterName=log_checkpoints,ParameterValue=on,ApplyMethod=immediate" \

"ParameterName=log_connections,ParameterValue=on,ApplyMethod=immediate" \
"ParameterName=log_disconnections,ParameterValue=on,ApplyMethod=immediate" \
"ParameterName=log_lock_waits,ParameterValue=on,ApplyMethod=immediate" \
"ParameterName=log_temp_files,ParameterValue=0,ApplyMethod=immediate" \
"ParameterName=log_autovacuum_min_duration,ParameterValue=0,ApplyMethod=immediate" \
"ParameterName=log_error_verbosity,ParameterValue=default,ApplyMethod=immediate" \
"ParameterName=log_statement,ParameterValue=all,ApplyMethod=immediate" \
"ParameterName=lc_messages,ParameterValue='C',ApplyMethod=immediate" \
"ParameterName=log_min_duration_statement,ParameterValue=100ms,ApplyMethod=immediate"\"ParameterName= rds.force_autovacuum_logging_level,ParameterValue=’log’,ApplyMethod=immediate"\

RDS instance restart/reboot may require depends on which parameters is being modified.

Validate PostgreSQL Logs Getting Generated:

If there is no direct AWS console access to AWS RDS then using AWS CLI, you can validate logs using below command.

Note: Make sure AWS CLI is configured prior to executing below command.

$ aws rds describe-db-log-files --db-instance-identifier <<RDS_INSTANCE>> | jq -r '.DescribeDBLogFiles[] | .LogFileName'


2. Extract Logs From AWS RDS to Local EC2/UNIX Server:

Now it is time to extract these logs to local desk or EC2 server.

Using AWS CLI, these logs can be extracted to local desk/EC2/UNIX system. If you are looking for a longer duration of retention, these logs can be sent to an S3 bucket and pulled from S3 to EC2 via AWS CLI.

$ for filename in $( aws rds describe-db-log-files --db-instance-identifier <<RDS_INSTANCE>> | jq -r '.DescribeDBLogFiles[] | .LogFileName' )
do
aws rds download-db-log-file-portion --db-instance-identifier <<RDS_INSTANCE>> --output text --no-paginate --log-file $filename  >> /data01/$filename
done


3. Execute pgbadger Against PostgreSQL Logs:

pgbadger installation steps can be referenced from

GitHub - darold/pgbadger: A fast PostgreSQL Log Analyzer

Since all required logs are now available in EC2 instance, pgbadger can be executed against these logs. pgbadger has various options as argument and can be use based on need.

$ pgbadger --help

Usage: pgbadger [options] logfile [...]

PostgreSQL log analyzer with fully detailed reports and graphs.

Arguments:

logfile can be a single log file, a list of files, or a shell command
returning a list of files. If you want to pass log content from stdin
use - as filename. Note that input from stdin will not work with csvlog.

In this example, pgbadger has been executed for one hour of timeline. 23:45 to 00:45 for a given date. You can also take pgbadger for the entire day and analyze your workload.

$ pgbadger -f stderr -p '%t:%r:%u@%d:[%p]:' -b " yyyy-mm-dd 23:45:00" -e " yyyy-mm-dd 00:45:00" /data01/postgres/pgbadger_logs/error/postgresql.log* -o pgbadger.html
[========================>] Parsed 2198308923 bytes of 2198308923 (100.00%), queries: 1411842, events: 65
LOG: Ok, generating html report...

Once the pgbadger report is generated. You will be able to understand your database load with more details in terms of type of SQLs, most used queries, workload type, peak hours, etc.


4. Sample pgbadger Report/Graphs:

 

DevOps as a Culture