彻底挖掘mysqlbinlog数据内容

标签： mysqlbinlog 数据 | 发表时间：2013-11-21 23:05 | 作者：GuaiShuiJue

出处：http://blog.csdn.net

平时工作需要查询mysqlbinlog里面哪个库，哪个表在什么时间更新了什么数据，而通过mysqlbinlog转义过来的文件里面数据太乱，很多注释，而且也没有表结构相关字段，都用@1，@2等类似的方式显示，就想到通过python开发一个这样的功能。

先说下脚本的原理：

先通过mysqlbinlog转义二进制日志binlog文件，保存为一份临时文件（执行完后删除）
脚本连接数据库，查出对应的表结构和字段，然后替换掉对应的@1，@2等字段
过滤掉大部分注释的语句

然后...就没有然后了

__author__ = 'chunyang.wu'
# -*- coding: utf-8 -*-
#!/usr/bin/env python
import MySQLdb
import re
import os
import sys

mysql_config = {
    'host':'192.168.1.197',\
    'user':'root',\
    'passwd':'123456',\
    'port':3306,\
    'db':'mydb'
}

class Deal_binlog:
    def __init_db(self):
        self.mysql_db = MySQLdb.connect(host=self.host,user=self.user,passwd=self.passwd,port=self.port,db=self.db)
        self.mysql_cur=self.mysql_db.cursor()

    def __init__(self):
        self.host = mysql_config['host']
        self.user = mysql_config['user']
        self.passwd = mysql_config['passwd']
        self.port = mysql_config['port']
        self.db = mysql_config['db']
        self.sql_file = sys.argv[2]
        self.input_binlog_file = sys.argv[1]
        self.tmp_binlog_file = 'tmp_binlog_file'
        self.field = []
        self.tb_name = ''
        self.where = []
        self.update = []
        self.delete = []
        self.patt = re.compile("/* .* */")
        self.__init_db()

    def _release_db(self):
        self.mysql_cur.close()
        self.mysql_db.close()

    def _write_data(self,data):
        print data
        data = str(data)+"\n"
        f = open(self.data_file,'a+')
        f.write(data)
        f.close()

    def _get_table_name(self,line):
        try:
            if line.find('Table_map:')!=-1:
                l = line.index('Table_map')
#                print  line[l::].split(' ')
                self.tb_name = line[l::].split(' ')[1].replace('`','')
#                return table
        except Exception,ex:
            print ex

    def _get_table_structure(self,tb):
        desc_sql = 'desc %s' %tb
#        print desc_sql
        self.field = []
        self.mysql_cur.execute(desc_sql)
        res = self.mysql_cur.fetchall()
        for j in res:
            self.field.append(j[0])

    def _do(self):
        '''先把mysql二进制的binlog解析成可识别文件，在从里面提取需要的数据'''
        if os.path.exists(self.sql_file):
            os.remove(self.sql_file)
        os.popen('mysqlbinlog -v -v --base64-output=DECODE-ROWS %s>%s' %(self.input_binlog_file,self.tmp_binlog_file))
        with open(self.tmp_binlog_file,"r") as infile:
            for line in infile:
                if line.rstrip('\n')=='BEGIN':
                    line = line.replace('BEGIN','')
                elif line.find('Table_map:')!=-1:
                    self._get_table_name(line)
                    self._get_table_structure(self.tb_name)
                elif line.find('###   @')!=-1:
#                    print line.replace('###   @','')
                    i = line.replace('###   @','').split('=')[0]
#                    print line,i
                    line = line.replace('###   @'+str(i),self.field[int(i)-1])
                    if(int(i)== len(self.field)):
                        line = self.patt.sub(' ',line)
                    else:
                        line = self.patt.sub(',',line)
                elif line.find('###')!=-1:
                    line = line.replace('###','')
                else:
                    line = ''
                if line.rstrip('\n')!= '':
                    print line.rstrip('\n')
                    f = open(self.sql_file,'a+')
                    f.write(line)
                    f.close()
        os.remove(self.tmp_binlog_file)

def usage():
    print "python deal_binlog.py mysql_binlog_file out_put_sql_file"
    print "  tag: "
    print "  1. change mysql_config dict"
    print "  2. need MySQLdb"
    print "  3. need your mysql server desc table privileges"

def main():
    p = Deal_binlog()
    p._do()
    p._release_db()

if __name__=="__main__":
    if len(sys.argv) ==3:
        main()
    else:
        usage()

脚本的执行方式：python deal_binlog.py mysql_binlog_file out_put_sql_file
操作系统需要安装MySQLdb模块，且需要有权限连接到对应的数据库，脚本的执行可以不在mysql服务器上面（copy一份binlog文件即可）
执行时需要把该脚本和复制的binlog放在同一个目录下，下面贴上一份解析后的结果图：

可以看到哪个时间点，数据库更新了什么数据，我的目的这就达到了（update、delete语句不是标准sql语法，这里如果以后有需要在改了）

---------------------------------------------------------------------------------------------------------------------------------------------------------

URL: http://blog.csdn.net/selectdb/article/details/16861063

作者：GuaiShuiJue 发表于2013-11-21 15:05:02 原文链接

阅读：73 评论：0 查看评论

彻底挖掘mysqlbinlog数据内容

相关 [mysqlbinlog 数据] 推荐：

彻底挖掘mysqlbinlog数据内容

使用MySQLBinlog按时间查询二进制日志时易疏忽的地方

数据仓库

数据抽取

数据库sharding

数据脱敏

数据分析之如何用数据？

excel数据导入mysql数据库

数据批量导入Oracle数据库

数据分析之如何用数据？

相关文章

订阅