Installing pgbadger on Linux for PostgreSQL log analysis
As DBAs, we often need to track PostgreSQL database performance by analyzing workloads and identifying bottlenecks. This is where pgbadger comes in – an open-source log analyzer for PostgreSQL. It parses PostgreSQL log files and gives you a nice overview of the PostgreSQL cluster, including various query metrics such as the number of query executions, their types and durations, session details, lock statistics, warnings, errors and other event patterns. All of these details are presented in a nice HTML report that you can open in your web browser.
Installation:
Prerequisite:
pgbadger is written in pure Perl and uses a JavaScript library (flotr2) to draw graphs. Hence, you need to ensure that a modern Perl distribution is available on your system. The charts are rendered by your web browser, so nothing additional is required.
Use the below command to install Perl if it is not already installed on your system.
$ yum install -y perl perl-devel
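Once installed, you can quickly confirm that Perl is available and check its version:
$ perl -v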
Either of the below two methods can be followed to install pgbadger.
Method 1:
The first step is to download the latest pgbadger installation package. Official releases are published on the GitHub Releases page of pgBadger. At the time of writing, the latest available version is 11.4.
Let’s run the below command on the Linux command line to download pgbadger version 11.4.
$ wget https://github.com/darold/pgbadger/archive/v11.4.tar.gz
Alternatively, you can clone the pgbadger GitHub repo if you have Git installed on your system.
$ git clone https://github.com/darold/pgbadger.git
Once the download is complete, you have to extract the archive from the tarball.
$ tar xzvf v11.4.tar.gz
[user@server pgbadger_demo]$ tar xzvf v11.4.tar.gz
pgbadger-11.4/
pgbadger-11.4/.editorconfig
pgbadger-11.4/.gitignore
pgbadger-11.4/CONTRIBUTING.md
pgbadger-11.4/ChangeLog
pgbadger-11.4/HACKING.md
pgbadger-11.4/LICENSE
pgbadger-11.4/MANIFEST
pgbadger-11.4/META.yml
pgbadger-11.4/Makefile.PL
pgbadger-11.4/README
pgbadger-11.4/README.md
pgbadger-11.4/doc/
pgbadger-11.4/doc/pgBadger.pod
pgbadger-11.4/pgbadger
pgbadger-11.4/resources/
pgbadger-11.4/resources/.gitignore
pgbadger-11.4/resources/LICENSE
pgbadger-11.4/resources/README
pgbadger-11.4/resources/bean.js
pgbadger-11.4/resources/bootstrap.css
pgbadger-11.4/resources/bootstrap.js
pgbadger-11.4/resources/font/
pgbadger-11.4/resources/font/FontAwesome.otf
pgbadger-11.4/resources/font/fontawesome-webfont.eot
pgbadger-11.4/resources/fontawesome.css
pgbadger-11.4/resources/jqplot.barRenderer.js
pgbadger-11.4/resources/jqplot.canvasAxisTickRenderer.js
pgbadger-11.4/resources/jqplot.canvasTextRenderer.js
pgbadger-11.4/resources/jqplot.categoryAxisRenderer.js
pgbadger-11.4/resources/jqplot.cursor.js
pgbadger-11.4/resources/jqplot.dateAxisRenderer.js
pgbadger-11.4/resources/jqplot.highlighter.js
pgbadger-11.4/resources/jqplot.pieRenderer.js
pgbadger-11.4/resources/jqplot.pointLabels.js
pgbadger-11.4/resources/jquery.jqplot.css
pgbadger-11.4/resources/jquery.jqplot.js
pgbadger-11.4/resources/jquery.js
pgbadger-11.4/resources/patch-jquery.jqplot.js
pgbadger-11.4/resources/pgbadger.css
pgbadger-11.4/resources/pgbadger.js
pgbadger-11.4/resources/pgbadger_slide.js
pgbadger-11.4/resources/underscore.js
pgbadger-11.4/t/
pgbadger-11.4/t/01_lint.t
pgbadger-11.4/t/02_basics.t
pgbadger-11.4/t/03_consistency.t
pgbadger-11.4/t/exp/
pgbadger-11.4/t/exp/stmt_type.out
pgbadger-11.4/t/fixtures/
pgbadger-11.4/t/fixtures/cloudsql.log.gz
pgbadger-11.4/t/fixtures/light.postgres.log.bz2
pgbadger-11.4/t/fixtures/logplex.gz
pgbadger-11.4/t/fixtures/pg-syslog.1.bz2
pgbadger-11.4/t/fixtures/pgbouncer.log.gz
pgbadger-11.4/t/fixtures/rds.log.bz2
pgbadger-11.4/t/fixtures/stmt_type.log
pgbadger-11.4/tools/
pgbadger-11.4/tools/README.pgbadger_tools
pgbadger-11.4/tools/README.updt_embedded_rsc
pgbadger-11.4/tools/pgbadger_tools
pgbadger-11.4/tools/updt_embedded_rsc.pl
[user@server pgbadger_demo]$
Now go to the extracted directory (or the pgbadger directory, if you cloned the repo).
$ cd pgbadger-11.4
Generate the Makefile.
$ perl Makefile.PL
[user@server pgbadger-11.4]$ perl Makefile.PL
Checking if your kit is complete...
Looks good
Writing Makefile for pgBadger
[user@server pgbadger-11.4]$
Now install pgbadger using the below command.
$ make && sudo make install
[user@server pgbadger-11.4]$ make && sudo make install
which: no pod2markdown in (/usr/local/rvm/gems/ruby-2.4.1/bin:/usr/local/rvm/gems/ruby-2.4.1@global/bin:/usr/local/rvm/rubies/ruby-2.4.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/usr/local/rvm/bin:/home/cloud_user/.local/bin:/home/user/bin)
cp pgbadger blib/script/pgbadger
/usr/bin/perl -MExtUtils::MY -e 'MY->fixin(shift)' -- blib/script/pgbadger
echo "=head1 SYNOPSIS" > doc/synopsis.pod
./pgbadger --help >> doc/synopsis.pod
echo "=head1 DESCRIPTION" >> doc/synopsis.pod
sed -i.bak 's/ +$//g' doc/synopsis.pod
rm doc/synopsis.pod.bak
sed -i.bak '/^=head1 SYNOPSIS/,/^=head1 DESCRIPTION/d' doc/pgBadger.pod
sed -i.bak '4r doc/synopsis.pod' doc/pgBadger.pod
rm doc/pgBadger.pod.bak
Manifying blib/man1/pgbadger.1p
rm doc/synopsis.pod
[sudo] password for user:
which: no pod2markdown in (/sbin:/bin:/usr/sbin:/usr/bin)
echo "=head1 SYNOPSIS" > doc/synopsis.pod
./pgbadger --help >> doc/synopsis.pod
echo "=head1 DESCRIPTION" >> doc/synopsis.pod
sed -i.bak 's/ +$//g' doc/synopsis.pod
rm doc/synopsis.pod.bak
sed -i.bak '/^=head1 SYNOPSIS/,/^=head1 DESCRIPTION/d' doc/pgBadger.pod
sed -i.bak '4r doc/synopsis.pod' doc/pgBadger.pod
rm doc/pgBadger.pod.bak
Manifying blib/man1/pgbadger.1p
Installing /usr/local/share/man/man1/pgbadger.1p
Installing /usr/local/bin/pgbadger
Appending installation info to /usr/lib64/perl5/perllocal.pod
rm doc/synopsis.pod
[user@server pgbadger-11.4]$
Finally, verify the installation by running the below command, which will display the installed pgbadger version 🎉
$ pgbadger -V
[user@server pgbadger-11.4]$ pgbadger -V
pgBadger version 11.4
[user@server pgbadger-11.4]$
Method 2:
The pgbadger package is also available in the PostgreSQL yum repository. If you have already installed the repository configuration package from the official PostgreSQL repository, then pgbadger can be installed by running the below command.
$ sudo yum install pgbadger
[user@server ~]$ sudo yum install pgbadger
Loaded plugins: amazon-id, search-disabled-repos
Resolving Dependencies
--> Running transaction check
---> Package pgbadger.noarch 0:11.4-1.rhel7 will be installed
--> Finished Dependency Resolution
Dependencies Resolved
==============================================================================================================================================================================================================================================
Package Arch Version Repository Size
==============================================================================================================================================================================================================================================
Installing:
pgbadger noarch 11.4-1.rhel7 pgdg-common 346 k
Transaction Summary
==============================================================================================================================================================================================================================================
Install 1 Package
Total download size: 346 k
Installed size: 1.5 M
Is this ok [y/d/N]: y
Downloading packages:
warning: /var/cache/yum/x86_64/7Server/pgdg-common/packages/pgbadger-11.4-1.rhel7.noarch.rpm: Header V4 DSA/SHA1 Signature, key ID 442df0f8: NOKEY
Public key for pgbadger-11.4-1.rhel7.noarch.rpm is not installed
pgbadger-11.4-1.rhel7.noarch.rpm | 346 kB 00:00:01
Retrieving key from file:///etc/pki/rpm-gpg/RPM-GPG-KEY-PGDG
Importing GPG key 0x442DF0F8:
Userid : "PostgreSQL RPM Building Project <pgsqlrpms-hackers@pgfoundry.org>"
Fingerprint: 68c9 e2b9 1a37 d136 fe74 d176 1f16 d2e1 442d f0f8
Package : pgdg-redhat-repo-42.0-14.noarch (@/pgdg-redhat-repo-latest.noarch)
From : /etc/pki/rpm-gpg/RPM-GPG-KEY-PGDG
Is this ok [y/N]: y
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Installing : pgbadger-11.4-1.rhel7.noarch 1/1
Verifying : pgbadger-11.4-1.rhel7.noarch 1/1
Installed:
pgbadger.noarch 0:11.4-1.rhel7
Complete!
[user@server ~]$
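If the PGDG repository configuration package is not installed yet, you can add it first. As a sketch for RHEL/CentOS 7 (the transcript above references pgdg-redhat-repo-latest; adjust the URL for your distribution and OS version):
$ sudo yum install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm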
PostgreSQL Configuration:
Since pgbadger reads the PostgreSQL logs to pick up information, it is essential to make some changes in the postgresql.conf file so that the necessary information is available in the PostgreSQL log file for pgbadger to read.
The first and foremost thing is to enable query logging. Change the log_min_duration_statement parameter value accordingly in the postgresql.conf file. The value 0 means every statement will be logged, which may not be ideal for a very busy server; adjust the value based on your requirements. For example, if you want to log queries running for more than 2 seconds, set the parameter value (in milliseconds) to 2000.
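For example, to apply that 2-second threshold without editing the file by hand, you can set it with ALTER SYSTEM instead (a minimal sketch, assuming superuser access through psql):
$ psql -U postgres -c "ALTER SYSTEM SET log_min_duration_statement = 2000;"
$ psql -U postgres -c "SELECT pg_reload_conf();"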
** Do not enable log_min_duration_statement, log_duration and log_statement all together; this will result in wrong counter values and excessive log size.
** Do not enable log_statement, as its log format will not be parsed by pgBadger. If you have log_statement set to 'all', nothing will be logged through the log_min_duration_statement directive and pgBadger will not be able to parse the log.
** If you don’t want all the details about queries, set log_min_duration_statement to -1 to disable it and enable log_duration in your postgresql.conf file, as shown below. It will then only report the duration and number of queries.
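In postgresql.conf, that duration-only alternative would look like this:
log_min_duration_statement = -1
log_duration = on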
That said, log_min_duration_statement should always be preferred (recommended):
log_min_duration_statement = 0
The next important parameter to look at is log_line_prefix, whose value should be set based on the value of another parameter, log_destination. The log_destination value tells PostgreSQL to generate logs in the specified format. By default, PostgreSQL generates logs in standard error (stderr) format. In this format, each log message is prefixed with the information specified by the log_line_prefix parameter.
With the stderr log format, log_line_prefix must contain at least a time escape sequence (%t, %m or %n) and a process-related escape sequence (%p or %c), as shown below.
log_line_prefix = '%t [%p]: '
It also lets you add the user, database name, application name and client IP address, as shown below.
log_line_prefix = '%t [%p]: user=%u,db=%d,app=%a,client=%h '
Another thing to remember is that your log messages should be in English, with or without locale support. Either of the below settings ensures that (use one):
lc_messages='en_US.UTF-8'
lc_messages='C'
The usual postgresql.conf configuration for pgBadger looks like this:
log_min_duration_statement = 0
log_line_prefix = '%t [%p]: user=%u,db=%d,app=%a,client=%h '
log_checkpoints = on
log_connections = on
log_disconnections = on
log_lock_waits = on
log_temp_files = 0
log_autovacuum_min_duration = 0
log_error_verbosity = default
lc_messages='en_US.UTF-8'
lc_messages='C'
Restart PostgreSQL to apply the above changes:
systemctl restart postgresql
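Strictly speaking, the logging parameters listed above are all reloadable, so a configuration reload (without a full restart) is usually enough to apply them:
$ psql -U postgres -c "SELECT pg_reload_conf();"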
Generating the pgbadger report:
Now that everything is in place, let’s proceed with generating a pgbadger report.
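If you are not sure where your server writes its log files, you can ask PostgreSQL directly before pointing pgbadger at them (assuming psql access):
$ psql -U postgres -c "SHOW data_directory;"
$ psql -U postgres -c "SHOW log_directory;"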
The simplest way to analyze a particular log and generate the pgbadger report is to pass the log file path to pgbadger; it will write the report to your present working directory.
$ pgbadger <log file>
For example:
[user@server ~]$ sudo pgbadger /var/lib/pgsql/12/data/log/postgresql-Fri.log
[========================>] Parsed 1521 bytes of 1521 (100.00%), queries: 0, events: 1
LOG: Ok, generating html report...
[user@server ~]$
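By default, the report is written as out.html in the current directory. If you are working on a headless server, one quick way to view it in a browser is Python's built-in web server (a sketch, assuming Python 3 is available on the host):
$ cd /path/to/report && python3 -m http.server 8080
Then browse to http://<server-ip>:8080/out.html from your workstation.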
You can also analyze all your log files at once by using the wildcard * when specifying log names. You can also specify the output directory and file name with the appropriate switches, as shown below.
[user@server]$ sudo pgbadger /var/lib/pgsql/12/data/log/postgresql-* -O /var/www/pgbadger -o pgbadger.html
[========================>] Parsed 15914 bytes of 15914 (100.00%), queries: 0, events: 20
LOG: Ok, generating html report...
[user@server]$
The pgbadger report can be generated incrementally at a certain interval, and you can specify the retention period of the report files as well. This can be accomplished by setting up a cron job. For example, the below cron job will generate the pgbadger report daily at 01:00 AM with the retention set to one week.
$ crontab -e
0 1 * * * /usr/bin/pgbadger -R 1 -I -q /var/lib/pgsql/12/data/log/postgresql-* -O /var/www/pgbadger/
You can use the help command as shown below to get the complete list of pgbadger options and their usage.
$ pgbadger --help
[user@server ~]$ pgbadger --help
Usage: pgbadger [options] logfile [...]
PostgreSQL log analyzer with fully detailed reports and graphs.
Arguments:
logfile can be a single log file, a list of files, or a shell command
returning a list of files. If you want to pass log content from stdin
use - as filename. Note that input from stdin will not work with csvlog.
Options:
-a | --average minutes : number of minutes to build the average graphs of
queries and connections. Default 5 minutes.
-A | --histo-average min: number of minutes to build the histogram graphs
of queries. Default 60 minutes.
-b | --begin datetime : start date/time for the data to be parsed in log
(either a timestamp or a time)
-c | --dbclient host : only report on entries for the given client host.
-C | --nocomment : remove comments like /* ... */ from queries.
-d | --dbname database : only report on entries for the given database.
-D | --dns-resolv : client ip addresses are replaced by their DNS name.
Be warned that this can really slow down pgBadger.
-e | --end datetime : end date/time for the data to be parsed in log
(either a timestamp or a time)
-E | --explode : explode the main report by generating one report
per database. Global information not related to a
database are added to the postgres database report.
-f | --format logtype : possible values: syslog, syslog2, stderr, jsonlog,
cvs, pgbouncer, logplex, rds and redshift. Use this
option when pgBadger is not able to detect the log
format.
-G | --nograph : disable graphs on HTML output. Enabled by default.
-h | --help : show this message and exit.
-H | --html-outdir path: path to directory where HTML report must be written
in incremental mode, binary files stay on directory
defined with -O, --outdir option.
-i | --ident name : programname used as syslog ident. Default: postgres
-I | --incremental : use incremental mode, reports will be generated by
days in a separate directory, --outdir must be set.
-j | --jobs number : number of jobs to run at same time. Run as single
by default or when working with csvlog.
-J | --Jobs number : number of log file to parse in parallel. Process one
file at a time by default or when csvlog is used.
-l | --last-parsed file: allow incremental log parsing by registering the
last datetime and line parsed. Useful if you want
to watch errors since last run or if you want one
report per day with a log rotated each week.
-L | --logfile-list file:file containing a list of log file to parse.
-m | --maxlength size : maximum length of a query, it will be restricted to
the given size. Default truncate size is 100000.
-M | --no-multiline : do not collect multiline statement to avoid garbage
especially on errors that generate a huge report.
-n | --nohighlight : disable SQL code highlighting.
-N | --appname name : only report on entries for given application name
-o | --outfile filename: define the filename for the output. Default depends
on the output format: out.html, out.txt, out.bin,
out.json or out.tsung. This option can be used
multiple time to output several format. To use json
output the Perl module JSON::XS must be installed,
To dump output to stdout use - as filename.
-O | --outdir path : directory where out file must be saved.
-p | --prefix string : the value of your custom log_line_prefix as
defined in your postgresql.conf. Only use it if you
aren't using one of the standard prefixes specified
in the pgBadger documentation, such as if your
prefix includes additional variables like client ip
or application name. See examples below.
-P | --no-prettify : disable SQL queries prettify formatter.
-q | --quiet : don't print anything to stdout, not even a progress
bar.
-Q | --query-numbering : add numbering of queries to the output when using
options --dump-all-queries or --normalized-only.
-r | --remote-host ip : set the host where to execute the cat command on
remote logfile to parse locally the file.
-R | --retention N : number of weeks to keep in incremental mode. Default
to 0, disabled. Used to set the number of weeks to
keep in output directory. Older weeks and days
directory are automatically removed.
-s | --sample number : number of query samples to store. Default: 3.
-S | --select-only : only report SELECT queries.
-t | --top number : number of queries to store/display. Default: 20.
-T | --title string : change title of the HTML page report.
-u | --dbuser username : only report on entries for the given user.
-U | --exclude-user username : exclude entries for the specified user from
report. Can be used multiple time.
-v | --verbose : enable verbose or debug mode. Disabled by default.
-V | --version : show pgBadger version and exit.
-w | --watch-mode : only report errors just like logwatch could do.
-W | --wide-char : encode html output of queries into UTF8 to avoid
Perl message "Wide character in print".
-x | --extension : output format. Values: text, html, bin, json or
tsung. Default: html
-X | --extra-files : in incremental mode allow pgBadger to write CSS and
JS files in the output directory as separate files.
-z | --zcat exec_path : set the full path to the zcat program. Use it if
zcat or bzcat or unzip is not in your path.
-Z | --timezone +/-XX : Set the number of hours from GMT of the timezone.
Use this to adjust date/time in JavaScript graphs.
--pie-limit num : pie data lower than num% will show a sum instead.
--exclude-query regex : any query matching the given regex will be excluded
from the report. For example: "^(VACUUM|COMMIT)"
You can use this option multiple times.
--exclude-file filename: path of the file which contains all the regex to
use to exclude queries from the report. One regex
per line.
--include-query regex : any query that does not match the given regex will
be excluded from the report. You can use this
option multiple times. For example: "(tbl1|tbl2)".
--include-file filename: path of the file which contains all the regex of
the queries to include from the report. One regex
per line.
--disable-error : do not generate error report.
--disable-hourly : do not generate hourly report.
--disable-type : do not generate report of queries by type, database
or user.
--disable-query : do not generate query reports (slowest, most
frequent, queries by users, by database, ...).
--disable-session : do not generate session report.
--disable-connection : do not generate connection report.
--disable-lock : do not generate lock report.
--disable-temporary : do not generate temporary report.
--disable-checkpoint : do not generate checkpoint/restartpoint report.
--disable-autovacuum : do not generate autovacuum report.
--charset : used to set the HTML charset to be used.
Default: utf-8.
--csv-separator : used to set the CSV field separator, default: ,
--exclude-time regex : any timestamp matching the given regex will be
excluded from the report. Example: "2013-04-12 .*"
You can use this option multiple times.
--include-time regex : only timestamps matching the given regex will be
included in the report. Example: "2013-04-12 .*"
You can use this option multiple times.
--exclude-db name : exclude entries for the specified database from
report. Example: "pg_dump". Can be used multiple
time.
--exclude-appname name : exclude entries for the specified application name
from report. Example: "pg_dump". Can be used
multiple time.
--exclude-line regex : pgBadger will start to exclude any log entry that
will match the given regex. Can be used multiple
time.
--exclude-client name : exclude log entries for the specified client ip.
Can be used multiple time.
--anonymize : obscure all literals in queries, useful to hide
confidential data.
--noreport : prevent pgBadger to create reports in incremental
mode.
--log-duration : force pgBadger to associate log entries generated
by both log_duration = on and log_statement = 'all'
--enable-checksum : used to add a md5 sum under each query report.
--journalctl command : command to use to replace PostgreSQL logfile by
a call to journalctl. Basically it might be:
journalctl -u postgresql-9.5
--pid-dir path : set the path where the pid file must be stored.
Default /tmp
--pid-file file : set the name of the pid file to manage concurrent
execution of pgBadger. Default: pgbadger.pid
--rebuild : used to rebuild all html reports in incremental
output directories where there's binary data files.
--pgbouncer-only : only show PgBouncer related menu in the header.
--start-monday : in incremental mode, calendar's weeks start on
sunday. Use this option to start on monday.
--normalized-only : only dump all normalized query to out.txt
--log-timezone +/-XX : Set the number of hours from GMT of the timezone
that must be used to adjust date/time read from
log file before beeing parsed. Using this option
make more difficult log search with a date/time.
--prettify-json : use it if you want json output to be prettified.
--month-report YYYY-MM : create a cumulative HTML report over the specified
month. Requires incremental output directories and
the presence of all necessary binary data files
--noexplain : do not process lines generated by auto_explain.
--command CMD : command to execute to retrieve log entries on
stdin. pgBadger will open a pipe to the command
and parse log entries generated by the command.
--no-week : inform pgbadger to not build weekly reports in
incremental mode. Useful if it takes too much time.
--explain-url URL : use it to override the url of the graphical explain
tool. Default: http://explain.depesz.com/?is_public=0&is_anon=0&plan=
--tempdir DIR : set directory where temporary files will be written
Default: File::Spec->tmpdir() || '/tmp'
--no-process-info : disable changing process title to help identify
pgbadger process, some system do not support it.
--dump-all-queries : dump all queries found in the log file replacing
bind parameters are included in the queries at
their respective placeholders position.
pgBadger is able to parse a remote log file using a passwordless ssh connection.
Use the -r or --remote-host to set the host ip address or hostname. There's also
some additional options to fully control the ssh connection.
--ssh-program ssh path to the ssh program to use. Default: ssh.
--ssh-port port ssh port to use for the connection. Default: 22.
--ssh-user username connection login name. Default to running user.
--ssh-identity file path to the identity file to use.
--ssh-timeout second timeout to ssh connection failure. Default 10 secs.
--ssh-option options list of -o options to use for the ssh connection.
Options always used:
-o ConnectTimeout=$ssh_timeout
-o PreferredAuthentications=hostbased,publickey
Log file to parse can also be specified using an URI, supported protocol are
http[s] and [s]ftp. The curl command will be used to download the file and the
file will be parsed during download. The ssh protocol is also supported and will
use the ssh command like with the remote host use. See examples bellow.
Examples:
pgbadger /var/log/postgresql.log
pgbadger /var/log/postgres.log.2.gz /var/log/postgres.log.1.gz /var/log/postgres.log
pgbadger /var/log/postgresql/postgresql-2012-05-*
pgbadger --exclude-query="^(COPY|COMMIT)" /var/log/postgresql.log
pgbadger -b "2012-06-25 10:56:11" -e "2012-06-25 10:59:11" /var/log/postgresql.log
cat /var/log/postgres.log | pgbadger -
# Log prefix with stderr log output
pgbadger --prefix '%t [%p]: user=%u,db=%d,client=%h' /pglog/postgresql-2012-08-21*
pgbadger --prefix '%m %u@%d %p %r %a : ' /pglog/postgresql.log
# Log line prefix with syslog log output
pgbadger --prefix 'user=%u,db=%d,client=%h,appname=%a' /pglog/postgresql-2012-08-21*
# Use my 8 CPUs to parse my 10GB file faster, much faster
pgbadger -j 8 /pglog/postgresql-10.1-main.log
Use URI notation for remote log file:
pgbadger http://172.12.110.1//var/log/postgresql/postgresql-10.1-main.log
pgbadger ftp://username@172.12.110.14/postgresql-10.1-main.log
pgbadger ssh://username@172.12.110.14:2222//var/log/postgresql/postgresql-10.1-main.log*
You can use together a local PostgreSQL log and a remote pgbouncer log file to parse:
pgbadger /var/log/postgresql/postgresql-10.1-main.log ssh://username@172.12.110.14/pgbouncer.log
Generate Tsung sessions XML file with select queries only:
pgbadger -S -o sessions.tsung --prefix '%t [%p]: user=%u,db=%d ' /pglog/postgresql-10.1.log
Reporting errors every week by cron job:
30 23 * * 1 /usr/bin/pgbadger -q -w /var/log/postgresql.log -o /var/reports/pg_errors.html
Generate report every week using incremental behavior:
0 4 * * 1 /usr/bin/pgbadger -q `find /var/log/ -mtime -7 -name "postgresql.log*"` -o /var/reports/pg_errors-`date +\%F`.html -l /var/reports/pgbadger_incremental_file.dat
This supposes that your log file and HTML report are also rotated every week.
Or better, use the auto-generated incremental reports:
0 4 * * * /usr/bin/pgbadger -I -q /var/log/postgresql/postgresql.log.1 -O /var/www/pg_reports/
will generate a report per day and per week.
In incremental mode, you can also specify the number of week to keep in the
reports:
/usr/bin/pgbadger --retention 2 -I -q /var/log/postgresql/postgresql.log.1 -O /var/www/pg_reports/
If you have a pg_dump at 23:00 and 13:00 each day during half an hour, you can
use pgBadger as follow to exclude these period from the report:
pgbadger --exclude-time "2013-09-.* (23|13):.*" postgresql.log
This will help avoid having COPY statements, as generated by pg_dump, on top of
the list of slowest queries. You can also use --exclude-appname "pg_dump" to
solve this problem in a simpler way.
You can also parse journalctl output just as if it was a log file:
pgbadger --journalctl 'journalctl -u postgresql-9.5'
or worst, call it from a remote host:
pgbadger -r 192.168.1.159 --journalctl 'journalctl -u postgresql-9.5'
you don't need to specify any log file at command line, but if you have other
PostgreSQL log file to parse, you can add them as usual.
To rebuild all incremental html reports after, proceed as follow:
rm /path/to/reports/*.js
rm /path/to/reports/*.css
pgbadger -X -I -O /path/to/reports/ --rebuild
it will also update all resource files (JS and CSS). Use -E or --explode
if the reports were built using this option.
pgBadger also support Heroku PostgreSQL logs using logplex format:
heroku logs -p postgres | pgbadger -f logplex -o heroku.html -
this will stream Heroku PostgreSQL log to pgbadger through stdin.
pgBadger can auto detect RDS and cloudwatch PostgreSQL logs using
rds format:
pgbadger -f rds -o rds_out.html rds.log
CloudSQL Postgresql logs it's fairly normal PostgreSQL log but encapsulated in
JSON format. It is auto detected too by pgBagder but in case you need to force
the log format, use `jsonlog`
pgbadger -f jsonlog -o cloudsql_out.html cloudsql.log
This is the same than with the jsonlog extension, the json format is different
but pgBadger can parse both format.
To create a cumulative report over a month use command:
pgbadger --month-report 2919-05 /path/to/incremantal/reports/
this will add a link to the month name into the calendar view in
incremental reports to look at report for month 2019 May.
Use -E or --explode if the reports were built using this option.
[user@server ~]$
Finally, I would say it’s one of the best freely available tools for PostgreSQL log analysis and for identifying performance issues. The shareable, feature-rich HTML reports and graphs make it the go-to tool for capturing and sharing database performance reports with stakeholders 📊
Credits: 🙇‍♂️
The main author and official maintainer of the pgBadger project is Gilles Darold. You can find the official GitHub repo at https://github.com/darold/pgbadger and the website at http://pgbadger.darold.net/ 🙏
Written by Arindam Ghosh
Database reliability engineer 👨‍💻 with expertise in designing, implementing, and maintaining highly available and scalable database systems 💻 Enjoys automating various database tasks ⚙️ and implementing database DevOps 🚀 🙏