postgresql 日志分析工具pgbadger

标签： postgresql 日志分析 | 发表时间：2014-10-27 18:28 | 作者：xmarker

出处：http://xmarker.blog.163.com

虽然有zabbix等工具监控着pg数据库，但还是有很多信息不能监控到，所以想通过定期分析pg的日志来掌握数据库的运行状态。pgbadger是一款开源的日志分析工具，采用perl编写，项目地址为： https://github.com/dalibo/pgbadger ，下面就使用简要记录下。

1.下载并安装

直接git clone或者下载zip包并上传到pg数据库服务器，然后解压：

[root@pgdb1 soft]# unzip pgbadger.zip

[root@pgdb1 soft]# cd pgbadger-master

安装：

[root@pgdb1 pgbadger-master]# perl Makefile.PL
Checking if your kit is complete...
Looks good
Writing Makefile for pgBadger
[root@pgdb1 pgbadger-master]# make && make install

由于我要分析的pg格式是csv格式的日志文件，因此需要安装csv_xs模块：

安装Text::CSV_XS模块

yum install perl-CPAN

perl -MCPAN -e shell

install Text::CSV_XS

2.选项说明：

[root@pgdb1 soft]# pgbadger --help

Options:

-a | --average minutes : number of minutes to build the average graphs of

queries and connections. Default 5 minutes.

-A | --histo-avg minutes: number of minutes to build the histogram graphs

of queries. Default 60 minutes.

-b | --begin datetime : start date/time for the data to be parsed in log.

-B | --bar-graph : use bar graph instead of line by default.

-c | --dbclient host : only report on entries for the given client host.

-C | --nocomment : remove comments like /* ... */ from queries.

-d | --dbname database : only report on entries for the given database.

-D | --dns-resolv : client ip adresses are replaced by their DNS name.

Be warned that this can really slow down pgBadger.

-e | --end datetime : end date/time for the data to be parsed in log.

-f | --format logtype : possible values: syslog,stderr,csv. Default: stderr.

-G | --nograph : disable graphs on HTML output. Enabled by default.

-h | --help : show this message and exit.

-i | --ident name : programname used as syslog ident. Default: postgres

-I | --incremental : use incremental mode, reports will be generated by

days in a separate directory, --outdir must be set.

-j | --jobs number : number of jobs to run at same time. Default is 1,

run as single process.

-J | --Jobs number : number of log file to parse in parallel. Default

is 1, run as single process.

-l | --last-parsed file: allow incremental log parsing by registering the

last datetime and line parsed. Useful if you want

to watch errors since last run or if you want one

report per day with a log rotated each week.

-m | --maxlength size : maximum length of a query, it will be restricted to

the given size. Default: no truncate

-M | --no-multiline : do not collect multiline statement to avoid garbage

especially on errors that generate a huge report.

-n | --nohighlight : disable SQL code highlighting.

-N | --appname name : only report on entries for given application name

-o | --outfile filename: define the filename for the output. Default depends

on the output format: out.html, out.txt, out.bin,

out.json or out.tsung.

With module JSON::XS installed, you can output file

in JSON format either.

To dump output to stdout use - as filename.

-O | --outdir path : directory where out file must be saved.

-p | --prefix string : the value of your custom log_line_prefix as

defined in your postgresql.conf. Only use it if you

aren't using one of the standard prefixes specified

in the pgBadger documentation, such as if your prefix

includes additional variables like client ip or

application name. See examples below.

-P | --no-prettify : disable SQL queries prettify formatter.

-q | --quiet : don't print anything to stdout, not even a progress bar.

-r | --remote-host ip : set the host where to execute the cat command on remote

logfile to parse localy the file.

-R | --retention N : number of week to keep in incremental mode. Default 0,

disabled. Used to set the number of weel to keep in

output directory. Older week end day directory are

automatically removed.

-s | --sample number : number of query samples to store/display. Default: 3

-S | --select-only : only report SELECT queries.

-t | --top number : number of queries to store/display. Default: 20

-T | --title string : change title of the HTML page report.

-u | --dbuser username : only report on entries for the given user.

-U | --exclude-user username : exclude entries for the specified user from report.

-v | --verbose : enable verbose or debug mode. Disabled by default.

-V | --version : show pgBadger version and exit.

-w | --watch-mode : only report errors just like logwatch could do.

-x | --extension : output format. Values: text, html, bin, json or

tsung. Default: html

-X | --extra-files : in incremetal mode allow pgbadger to write CSS and JS

files in the output directory as separate files.

-z | --zcat exec_path : set the full path to the zcat program. Use it if

zcat or bzcat or unzip is not in your path.

--pie-limit num : pie data lower than num% will show a sum instead.

--exclude-query regex : any query matching the given regex will be excluded

from the report. For example: "^(VACUUM|COMMIT)"

You can use this option multiple times.

--exclude-file filename: path of the file which contains all the regex to use

to exclude queries from the report. One regex per line.

--include-query regex : any query that does not match the given regex will be

excluded from the report. For example: "(table_1|table_2)"

You can use this option multiple times.

--include-file filename: path of the file which contains all the regex of the

queries to include from the report. One regex per line.

--disable-error : do not generate error report.

--disable-hourly : do not generate hourly report.

--disable-type : do not generate report of queries by type, database...

--disable-query : do not generate query reports (slowest, most

frequent, queries by users, by database, ...).

--disable-session : do not generate session report.

--disable-connection : do not generate connection report.

--disable-lock : do not generate lock report.

--disable-temporary : do not generate temporary report.

--disable-checkpoint : do not generate checkpoint/restartpoint report.

--disable-autovacuum : do not generate autovacuum report.

--charset : used to set the HTML charset to be used. Default: utf-8.

--csv-separator : used to set the CSV field separator, default: ,

--exclude-time regex : any timestamp matching the given regex will be

excluded from the report. Example: "2013-04-12 .*"

You can use this option multiple times.

--exclude-appname name : exclude entries for the specified application name

from report. Example: "pg_dump".

--exclude-line regex : pgbadger will start to exclude any log entry that

will match the given regex. Can be used multiple time.

--anonymize : obscure all literals in queries to hide confidential data.

pgbadger日志分析支持单文件、多文件、压缩文件、增量文件等分析

2.1 分析个单日志文件：

postgres:/var/pg_log@pgdb1>pgbadger -f csv postgresql-2014-10-27_060717.csv

[======================> ] Parsed 8331273 bytes of 8826435 (94.39%), queries: 0, events: 1

LOG: Ok, generating html report...

执行问后会在当前目录生成一个out.html 文件，可以加-o 选项命名输出文件。

把此out.html拉到本地用浏览器打开即可查看到精美的画面，其中信息有增删改查统计、事件、vacuum、临时文件、session、connection、checkpoint等信息。

2.2 分析一个时间段的日志：

postgres:/var/pg_log@pgdb1>pgbadger -b "2014-10-26 10:00:00" -e "2014-10-27 10:00:00" /var/pg_log/* -o /tmp/tmp_pgbadger.html

[======================> ] Parsed 59206 bytes of 62700 (94.43%), queries: 0, events: 0

[=======================> ] Parsed 94 bytes of 96 (97.92%), queries: 0, events: 0

[======================> ] Parsed 29695217 bytes of 31457483 (94.40%), queries: 0, events: 0

[=======================> ] Parsed 94 bytes of 96 (97.92%), queries: 0, events: 0

...

[======================> ] Parsed 29693436 bytes of 31457418 (94.39%), queries: 2250, events: 3

[======================> ] Parsed 29685718 bytes of 31457312 (94.37%), queries: 4884, events: 23

[======================> ] Parsed 8645318 bytes of 9156095 (94.42%), queries: 4884, events: 24

LOG: Ok, generating html report...

Wide character in print at /usr/local/bin/pgbadger line 7007.

Wide character in print at /usr/local/bin/pgbadger line 7996.

可以看到通过/var/pg_log/*的方式指定日志的话，其实pgbadger要扫描所有的文件来分析这个时间段的日志信息。

2.3 定期增量分析日志

由于日志分析是一个日常性的运维工作，因此建议定期增量分析，定期就是使用crontab等工具定期执行日志分析命令，增量是指利用pgbadger的增量分析功能，分析过的日志是不会再分析的。

要进行增量分析，必须要先建立一个目录，让分析过程中产生的结果文件保存到此目录，并生成指示下次分析开始位置的文件。

日常分析：

[root@pgdb1 ~]# mkdir /tmp/pgbadger_log/
[root@pgdb1 ~]# chown -R postgres:postgres /tmp/pgbadger_log/

定时任务：

0 4 * * * /usr/local/bin/pgbadger --retention 4 -I -q /var/pg_log/* -O /tmp/pgbadger_log/

通过以上命令即可把日志输出到 /tmp/pgbadger_log/目录，并保持4周的循环，过期的分析结果将被删掉（注意：生产环境最好别放到/tmp目录下，否则重启机器后结果文件可能会丢失）。

下面手工先创建执行下试试：

postgres:/var/pg_log@pgdb1>/usr/local/bin/pgbadger --retention 4 -I -q /var/pg_log/* -O /tmp/pgbadger_log/

Wide character in print at /usr/local/bin/pgbadger line 7996.

Wide character in print at /usr/local/bin/pgbadger line 7007.

....

速度比较慢，因为有/var/pg_log/下的日志有20来个，接近500M了

执行完后会在/tmp/pgbadger_log/目录下生成如下文件：

[root@pgdb1 pgbadger_log]# ls
2014 index.html LAST_PARSED
[root@pgdb1 pgbadger_log]# tree 2014/
2014/
├── 10
│   ├── 18
│   │   ├── 2014-10-18-875.bin
│   │   ├── 2014-10-18-876.bin
│   │   ├── 2014-10-18-877.bin
│   │   └── index.html

...
│   ├── 26
│   │   ├── 2014-10-26-875.bin
│   │   ├── 2014-10-26-876.bin
│   │   ├── 2014-10-26-877.bin
│   │   └── index.html
│   └── 27
│       ├── 2014-10-27-875.bin
│       ├── 2014-10-27-876.bin
│       └── index.html
├── week-42
│   └── index.html
├── week-43
│   └── index.html
└── week-44
    └── index.html

14 directories, 44 files

-I为增量模式，可以看到，按日期生成了一堆的bin二进制文件，并用index索引每天的结果，如果装了apache可以访问服务器上的这些文件的，由于我的是数据库服务器，没有安装apache，所以要把整个目录拖到本地才能看，打开顶层的index.html就可以选择看哪周、哪天的日志。

再次执行增量分析命令就快很多了：

postgres:/var/pg_log@pgdb1> /usr/local/bin/pgbadger --retention 4 -I -q /var/pg_log/* -O /tmp/pgbadger_log/

Wide character in print at /usr/local/bin/pgbadger line 7007.

Wide character in print at /usr/local/bin/pgbadger line 7167.

2.4 仅分析并输出错误信息

postgres:/var/pg_log@pgdb1> pgbadger -q -w /var/pg_log/postgresql-2014-10-27_060717.csv -o /tmp/pg_errors.html

-w选项只过滤错误相关的信息，这样输出文件很小，便于之查看错误信息。

2.5 并行模式分析

postgres:/tmp/pgbadger_log/2014@pgdb1> /usr/local/bin/pgbadger -j 4 --retention 4 -I -q /var/pg_log/* -O /tmp/pgbadger_log/

-j选项为并行模式，其中还有-J模式，前者为并行N个进程分析一个文件，后者为并行N个进程同时分析N个文件。两者实测速度差不多。

还可以指定其他选项：

-S选项，只分析查询语句

-t选项，列出指定数目的top sql

-u和-U选项，仅列出指定用户或去除指定用户的查询

-x选项，指定输出格式，如果不想用浏览器查看的可以选择txt等格式

--exclude-query选项和--include-query，可以指定正则表达式过滤想要分析或去除的sql

有了这些数据，可以像pt-query-digist工具分析mysql慢查询日志一样分析postgresql的数据库运行状态了。

postgresql 日志分析工具pgbadger

相关 [postgresql 日志分析] 推荐：