High availability with DRBD + XFS + Heartbeat + MySQL
I. Introduction
DRBD is a block device that can be used in high-availability (HA) setups. It works like network RAID-1: when you write data to the local filesystem, the data is also sent to another host on the network and recorded there in identical form. Data on the local host (primary node) and the remote host (secondary node) is kept synchronized in real time, so if the local system fails, the remote host still holds an identical copy of the data and can take over.
Heartbeat is one of the packages publicly available from the Linux-HA project web site. It provides the basic functions every HA system needs, such as starting and stopping resources, monitoring the availability of systems in the cluster, and transferring ownership of a shared IP address between cluster nodes.
OS: CentOS 5.8 64-bit
drbd83
xfsprogs / kmod-xfs
Heartbeat 2.1.3
Percona-Server-5.5.22-rel25.2
Resource allocation:
192.168.0.45 mysql1
192.168.0.43 mysql2
192.168.0.46 vip
DRBD data directory: /data
MySQL runs as the replication master, with slaves attached behind it, to provide high availability.
II. Installing XFS support
The main features of XFS include:
Data integrity
XFS is a journaling filesystem, so an unexpected crash no longer leaves files on disk in a corrupt state: no matter how many files and how much data the filesystem holds, it can replay its journal and restore consistency very quickly.
Performance
XFS uses optimized algorithms, so journaling has very little impact on overall file operations. Lookups and space allocation are very fast, and XFS delivers consistently quick response times. In my own tests of XFS, JFS, Ext3 and ReiserFS, XFS's performance stood out.
Scalability
XFS is a fully 64-bit filesystem and can address millions of terabytes of storage. It handles both very large and very small files well, and supports huge numbers of directory entries. The maximum file size is 2^63 bytes = 9 x 10^18 bytes (9 exabytes), and the maximum filesystem size is 18 exabytes.
XFS uses B+ tree structures throughout, which guarantee fast lookups and fast space allocation. It sustains high-speed operation, and its performance is not limited by the number of files or subdirectories in a directory.
Bandwidth
XFS can store data at close to raw-device I/O speed. In single-filesystem tests its throughput has reached 7 GB/s, and 4 GB/s for reads and writes of a single file.
XFS is not supported by a default CentOS 5 install; the packages have to be installed by hand.
rpm -qa |grep xfs
xorg-x11-xfs-1.0.2-5.el5_6.1
(xorg-x11-xfs is the X font server, not the XFS filesystem, so the real packages still need installing:)
yum install xfsprogs kmod-xfs
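Once the packages are in place, whether the kernel actually supports XFS can be checked against /proc/filesystems rather than trusting lsmod alone. A minimal sketch, assuming a hypothetical helper name has_fs_support (on a live system you would pass /proc/filesystems as the second argument):

```shell
#!/bin/sh
# has_fs_support FSNAME FILE - succeed (exit 0) when FSNAME appears as the
# last field of a line in FILE, which has the /proc/filesystems layout.
has_fs_support() {
    fs="$1"; file="$2"
    awk -v fs="$fs" '$NF == fs { found = 1 } END { exit !found }' "$file"
}
```

For example, `has_fs_support xfs /proc/filesystems && echo ok` after `modprobe xfs`.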
Load the module
- modprobe xfs //load the xfs filesystem module; if this fails, reboot the server
- lsmod |grep xfs //verify the xfs module is loaded
df -h -P -T
- Filesystem Type Size Used Avail Use% Mounted on
- /dev/mapper/VolGroup00-LogVol02 ext3 97G 3.7G 89G 5% /
- /dev/mapper/VolGroup00-LogVol01 ext3 9.7G 151M 9.1G 2% /tmp
- /dev/mapper/VolGroup00-LogVol04 ext3 243G 470M 230G 1% /opt
- /dev/mapper/VolGroup00-LogVol03 ext3 39G 436M 37G 2% /var
- /dev/sda1 ext3 99M 13M 81M 14% /boot
- tmpfs tmpfs 7.9G 0 7.9G 0% /dev/shm
vgdisplay
- --- Volume group ---
- VG Name VolGroup00
- System ID
- Format lvm2
- Metadata Areas 1
- Metadata Sequence No 6
- VG Access read/write
- VG Status resizable
- MAX LV 0
- Cur LV 5
- Open LV 5
- Max PV 0
- Cur PV 1
- Act PV 1
- VG Size 1.09 TB
- PE Size 32.00 MB
- Total PE 35692
- Alloc PE / Size 13056 / 408.00 GB
- Free PE / Size 22636 / 707.38 GB
- VG UUID 7nAuuH-cpyx-u2Jr-bNNJ-fGPe-TUKg-T4KmmU
Carve out a 300 GB data partition:
lvcreate -L 300G -n DRBD VolGroup00
- Logical volume "DRBD" created
- /dev/VolGroup00/DRBD
III. Installing and configuring DRBD
1. Set the hostnames
vi /etc/sysconfig/network
Change HOSTNAME to node1.
Edit the hosts file:
vi /etc/hosts
Add:
192.168.0.45 node1
192.168.0.43 node2
Make the node1 hostname take effect immediately:
hostname node1
Set up node2 the same way.
2. Install DRBD and Heartbeat
yum install drbd83 kmod-drbd83
- Installing:
- drbd83 x86_64 8.3.15-2.el5.centos extras 243 k
- kmod-drbd83 x86_64 8.3.15-3.el5.centos extras
- yum install -y heartbeat heartbeat-ldirectord heartbeat-pils heartbeat-stonith
- Installing:
- heartbeat x86_64 2.1.3-3.el5.centos extras 1.8 M
- heartbeat-ldirectord x86_64 2.1.3-3.el5.centos extras 199 k
- heartbeat-pils x86_64 2.1.3-3.el5.centos extras 220 k
- heartbeat-stonith x86_64 2.1.3-3.el5.centos extras 348 k
3. Configure DRBD
Edit global_common.conf (in /etc/drbd.d) on the master, then scp it to the same directory on the slave.
vi /etc/drbd.d/global_common.conf
- global {
- usage-count yes;
- # minor-count dialog-refresh disable-ip-verification
- }
- common {
- protocol C;
- handlers {
- # These are EXAMPLE handlers only.
- # They may have severe implications,
- # like hard resetting the node under certain circumstances.
- # Be careful when chosing your poison.
- pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
- pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
- local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
- # fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
- # split-brain "/usr/lib/drbd/notify-split-brain.sh root";
- # out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
- # before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";
- # after-resync-target /usr/lib/drbd/unsnapshot-resync-target-lvm.sh;
- }
- startup {
- # wfc-timeout degr-wfc-timeout outdated-wfc-timeout wait-after-sb
- wfc-timeout 15;
- degr-wfc-timeout 15;
- outdated-wfc-timeout 15;
- }
- disk {
- # on-io-error fencing use-bmbv no-disk-barrier no-disk-flushes
- # no-disk-drain no-md-flushes max-bio-bvecs
- on-io-error detach;
- fencing resource-and-stonith;
- }
- net {
- # sndbuf-size rcvbuf-size timeout connect-int ping-int ping-timeout max-buffers
- # max-epoch-size ko-count allow-two-primaries cram-hmac-alg shared-secret
- # after-sb-0pri after-sb-1pri after-sb-2pri data-integrity-alg no-tcp-cork
- timeout 60;
- connect-int 15;
- ping-int 15;
- ping-timeout 50;
- max-buffers 8192;
- ko-count 100;
- cram-hmac-alg sha1;
- shared-secret "mypassword123";
- }
- syncer {
- # rate after al-extents use-rle cpu-mask verify-alg csums-alg
- rate 100M;
- al-extents 512;
- csums-alg sha1;
- }
- }
4. Configure the replicated resource
vi /etc/drbd.d/share.res
- resource share {
- meta-disk internal;
- device /dev/drbd0; #the device name must end in a digit, which becomes the minor number (see minor-count in the global section); otherwise DRBD reports an error. device names the DRBD block device exposed to applications.
- disk /dev/VolGroup00/DRBD;
- on node1{ #NOTE: host names in the DRBD config are case-sensitive!
- #on Locrosse{
- address 192.168.0.45:9876;
- }
- on node2 {
- #on Regal {
- address 192.168.0.43:9876;
- }
- }
5. Copy the files to the slave
scp -P 22 /etc/drbd.d/share.res /etc/drbd.d/global_common.conf [email protected]:.
6. Create the DRBD metadata on both machines
drbdadm create-md share, where share is the resource name matching share.res above; every later use of share refers to the same resource.
drbdadm create-md share
- no resources defined!
Add the config files to drbd.conf:
vi /etc/drbd.conf
- include "drbd.d/global_common.conf";
- include "drbd.d/*.res";
drbdadm create-md share
- md_offset 322122543104
- al_offset 322122510336
- bm_offset 322112679936
- Found xfs filesystem
- 314572800 kB data area apparently used
- 314563164 kB left usable by current configuration
- Device size would be truncated, which
- would corrupt data and result in
- 'access beyond end of device' errors.
- You need to either
- * use external meta data (recommended)
- * shrink that filesystem first
- * zero out the device (destroy the filesystem)
- Operation refused.
- Command 'drbdmeta 0 v08 /dev/VolGroup00/DRBD internal create-md' terminated with exit code 40
- drbdadm create-md share: exited with code 40
Fix: zero out the start of the device to destroy the old filesystem signature:
dd if=/dev/zero bs=1M count=1 of=/dev/VolGroup00/DRBD
- 1+0 records in
- 1+0 records out
- 1048576 bytes (1.0 MB) copied, 0.00221 seconds, 474 MB/s
drbdadm create-md share
- Writing meta data...
- initializing activity log
- NOT initialized bitmap
- New drbd meta data block successfully created.
- success
7. Add an iptables rule
- iptables -A INPUT -p tcp -m tcp -s 192.168.0.0/24 --dport 9876 -j ACCEPT
8. Start DRBD on both machines
service drbd start (or /etc/init.d/drbd start)
9. Check DRBD status
Use service drbd status (or /etc/init.d/drbd status); watch sync progress with cat /proc/drbd or drbd-overview. The initial sync is slow, and DRBD is only ready for normal use once it completes.
If DRBD is not started on the peer, or the iptables port is not open, the status looks like this:
service drbd status
- drbd driver loaded OK; device status:
- version: 8.3.15 (api:88/proto:86-97)
- GIT-hash: 0ce4d235fc02b5c53c1c52c53433d11a694eab8c build by [email protected], 2013-03-27 16:01:26
- m:res cs ro ds p mounted fstype
- 0:share WFConnection Secondary/Unknown Inconsistent/DUnknown C
Once the peers connect successfully:
/etc/init.d/drbd status
- drbd driver loaded OK; device status:
- version: 8.3.15 (api:88/proto:86-97)
- GIT-hash: 0ce4d235fc02b5c53c1c52c53433d11a694eab8c build by [email protected], 2013-03-27 16:01:26
- m:res cs ro ds p mounted fstype
- 0:share Connected Secondary/Secondary Inconsistent/Inconsistent C
10. Promote the primary node
Make node1 the primary. The first time, because both sides are Inconsistent, you must use drbdsetup /dev/drbd0 primary -o (or drbdadm -- --overwrite-data-of-peer primary share); a plain drbdadm primary share will be refused. Then check the DRBD status again.
drbdsetup /dev/drbd0 primary -o
/etc/init.d/drbd status
- drbd driver loaded OK; device status:
- version: 8.3.15 (api:88/proto:86-97)
- GIT-hash: 0ce4d235fc02b5c53c1c52c53433d11a694eab8c build by [email protected], 2013-03-27 16:01:26
- m:res cs ro ds p mounted fstype
- ... sync'ed: 1.9% (301468/307188)M
- 0:share SyncSource Primary/Secondary UpToDate/Inconsistent C
11. Format as XFS
On node1, create an XFS filesystem on the DRBD device (/dev/drbd0) with the label drbd: mkfs.xfs -L drbd /dev/drbd0. If the command is missing, install it with yum install xfsprogs.
mkfs.xfs -L drbd /dev/drbd0
- meta-data=/dev/drbd0 isize=256 agcount=16, agsize=4915049 blks
- = sectsz=512 attr=0
- data = bsize=4096 blocks=78640784, imaxpct=25
- = sunit=0 swidth=0 blks, unwritten=1
- naming =version 2 bsize=4096
- log =internal log bsize=4096 blocks=32768, version=1
- = sectsz=512 sunit=0 blks, lazy-count=0
- realtime =none extsz=4096 blocks=0, rtextents=0
Mount the DRBD device locally (this only works on the primary node). First create the mount point:
mkdir /data
then
mount /dev/drbd0 /data
The DRBD-backed partition is now ready for use.
12. Test a manual primary/secondary switchover
On the master, create a test file in /data (touch /data/test), unmount the DRBD device (umount /dev/drbd0), then demote the primary to secondary (drbdadm secondary share):
touch /data/test
umount /dev/drbd0
drbdadm secondary share
On the slave, promote it to primary (drbdadm primary share), create the mount point (mkdir /data), mount /dev/drbd0 on /data, then list /data: if the test file is there, DRBD is working correctly.
drbdadm primary share
mkdir /data
mount /dev/drbd0 /data
ll /data
/etc/init.d/drbd status
- drbd driver loaded OK; device status:
- version: 8.3.15 (api:88/proto:86-97)
- GIT-hash: 0ce4d235fc02b5c53c1c52c53433d11a694eab8c build by [email protected], 2013-03-27 16:01:26
- m:res cs ro ds p mounted fstype
- 0:share Connected Primary/Secondary UpToDate/UpToDate C /data xfs
Switchover summary
On the master, unmount first:
umount /dev/drbd0
then demote the primary to secondary:
drbdadm secondary share
On the slave:
drbdadm primary share
mount /dev/drbd0 /data
The slave is now the primary.
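The ordering above is the whole trick: unmount before demoting, promote before mounting. A dry-run sketch that only prints the commands for each side (failover_cmds is a hypothetical helper; its output could be piped to sh on the right host to actually execute):

```shell
#!/bin/sh
# failover_cmds RES DEV MNT ROLE - print, in order, the commands needed to
# demote the current primary or promote the standby (ROLE: demote|promote).
failover_cmds() {
    res="$1"; dev="$2"; mnt="$3"; role="$4"
    case "$role" in
        demote)                      # run on the old primary
            printf 'umount %s\n' "$dev"
            printf 'drbdadm secondary %s\n' "$res"
            ;;
        promote)                     # run on the new primary
            printf 'drbdadm primary %s\n' "$res"
            printf 'mount %s %s\n' "$dev" "$mnt"
            ;;
    esac
}
```

For this setup: `failover_cmds share /dev/drbd0 /data demote` on the master, then the promote variant on the slave.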
IV. Installing and configuring Heartbeat
====================
1. Building by hand (a failed attempt)
1.1. Install libnet
http://sourceforge.net/projects/libnet-dev/
wget "http://downloads.sourceforge.net/project/libnet-dev/libnet-1.2-rc2.tar.gz?r=http%3A%2F%2Fsourceforge.net%2Fprojects%2Flibnet-dev%2F&ts=1368605826&use_mirror=jaist"
tar zxvf libnet-1.2-rc2.tar.gz
cd libnet
./configure
make && make install
1.2. Unlock the account files
chattr -i /etc/passwd /etc/shadow /etc/group /etc/gshadow /etc/services
1.3. Add the cluster user and group
groupadd haclient
useradd -g haclient hacluster
1.4. Verify the result
tail /etc/passwd
hacluster:x:498:496::/var/lib/heartbeat/cores/hacluster:/bin/bash
tail /etc/shadow
hacluster:!!:15840:0:99999:7:::
tail /etc/gshadow
haclient:!::
tail /etc/group
haclient:x:496:
1.5. Install heartbeat
cd ..
wget http://www.ultramonkey.org/download/heartbeat/2.1.3/heartbeat-2.1.3.tar.gz
tar -zxvf heartbeat-2.1.3.tar.gz
cd heartbeat-2.1.3
./ConfigureMe configure
make && make install
The build failed here and could not proceed:
/usr/local/lib/libltdl.a: could not read symbols:
2. Installing Heartbeat with yum
yum install libtool-ltdl-devel
yum install heartbeat
Three files need to be configured:
ha.cf - monitoring configuration
haresources - resource management
authkeys - heartbeat link authentication
3. Synchronize the clocks on both nodes
ntpdate -d cn.pool.ntp.org
4. Add an iptables rule
iptables -A INPUT -p udp -m udp --dport 694 -s 192.168.0.0/24 -j ACCEPT
5. Configure ha.cf
IP assignment:
192.168.0.45 node1
192.168.0.43 node2
192.168.0.46 vip
vi /etc/ha.d/ha.cf
- debugfile /var/log/ha-debug #enable debug logging
- keepalive 2 #send a heartbeat every 2 seconds
- deadtime 10 #declare the peer dead after 10 seconds without a heartbeat
- warntime 6 #issue a warning after 6 seconds (best kept between 2 and 10)
- initdead 120 #at startup, wait 120 seconds before starting any resources
- udpport 694 #communicate over UDP port 694
- ucast eth0 192.168.0.43 #unicast heartbeat (each node lists the other node's IP)
- node node1 #the primary (must match uname -n, not a domain name)
- node node2 #the standby (must match uname -n, not a domain name)
- auto_failback on #fail back automatically once the primary recovers (consider leaving this off)
- respawn hacluster /usr/lib64/heartbeat/ipfail #restart ipfail if it dies; the path is /usr/lib64 on 64-bit systems, /usr/lib on 32-bit
6. Configure authkeys
vi /etc/ha.d/authkeys
Write:
- auth 1
- 1 crc
chmod 600 /etc/ha.d/authkeys
7. Configure haresources
mkdir /usr/lib/heartbeat
vi /etc/ha.d/haresources
Write:
- node1 IPaddr::192.168.0.46/24/eth0 drbddisk::share Filesystem::/dev/drbd0::/data::xfs mysqld
node1: the master's hostname
IPaddr::192.168.0.46/24/eth0: configures the virtual (public-facing) IP
drbddisk::share: manages the DRBD resource share
Filesystem::/dev/drbd0::/data::xfs: performs the mount and umount
mysqld: the service started afterwards; multiple resources may be listed, separated by spaces
node2's configuration is essentially the same, except the ucast address in ha.cf becomes 192.168.0.45 instead of 192.168.0.43.
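A haresources line is simply the preferred node followed by space-separated resource specs; Heartbeat starts them left to right and stops them right to left, which is why Filesystem comes after drbddisk and mysqld comes last. A small sketch that splits such a line (haresources_fields is an illustrative name, not part of Heartbeat):

```shell
#!/bin/sh
# haresources_fields LINE - print the node and each resource spec of a
# haresources entry on its own line.
haresources_fields() {
    set -- $1                 # rely on word splitting of the unquoted arg
    printf 'node=%s\n' "$1"; shift
    for r in "$@"; do printf 'resource=%s\n' "$r"; done
}
```

For example, feeding it the line above yields node=node1 followed by one resource= line per spec.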
8. Configure the MySQL service
Copy mysql.server out of the MySQL build tree:
cd <mysql build directory>
cp support-files/mysql.server /etc/init.d/mysqld
chown root:root /etc/init.d/mysqld
chmod 700 /etc/init.d/mysqld
Point the MySQL data directory and binary log at the DRBD resource:
vi /opt/mysql/my.cnf
- socket = /opt/mysql/mysql.sock
- pid-file = /opt/mysql/var/mysql.pid
- log = /opt/mysql/var/mysql.log
- datadir = /data/mysql/var/
- log-bin=/data/mysql/var/mysql-bin
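Because datadir and the binlog live on the DRBD mount, mysqld must never start while /data is unmounted; haresources already orders Filesystem before mysqld, but an extra guard in an init wrapper is cheap insurance. A sketch of the path check (path_under is a hypothetical helper, not part of mysql.server):

```shell
#!/bin/sh
# path_under DIR PARENT - succeed when DIR lies at or under PARENT,
# e.g. before starting mysqld: path_under "$datadir" /data || exit 1
path_under() {
    case "$1/" in
        "$2"/*) return 0 ;;
        *)      return 1 ;;
    esac
}
```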
9. Test automatic DRBD failover
Start heartbeat on node1 first, then on node2. Once node2 is fully up, node1 will in sequence configure the virtual IP, start DRBD and take the primary role, and mount /dev/drbd0 on /data. The start command is:
service heartbeat start
Now ip a shows an extra address, 192.168.0.46: that is the virtual IP. cat /proc/drbd shows the Primary/Secondary roles, and df -h shows /dev/drbd0 mounted on /data.
To test automatic failover, stop the heartbeat service on node1 (or unplug its network) and check node2's state a few seconds later.
Then restore node1's heartbeat service (or network connection) and check the state again.
Failover test
service heartbeat standby
This command cleanly moves the resources listed in haresources from node1 to node2. cat /proc/drbd on both machines shows the roles have swapped, and the MySQL service has started on the standby. The whole switchover can be traced in the heartbeat log on either machine:
cat /var/log/ha-log
Results
After heartbeat stop (or standby) on node1, the resources move to the slave; when heartbeat is started on node1 again, they automatically move back.
When shutting heartbeat down, therefore, stop the slave first; otherwise the resources fail over to it.
Error 1:
/usr/lib/heartbeat/ipfail is not executable
On 64-bit systems the path is /usr/lib64.
Error 2:
ERROR: Bad permissions on keyfile [/etc/ha.d/authkeys], 600 recommended.
chmod 600 /etc/ha.d/authkeys
=================
A quick ext3 vs XFS write test
Four 300 GB 15k SAS disks in RAID 0
Stripe element size: 256K http://www.wmarow.com/strcalc/
Read policy: Adaptive
Write policy: write back
Battery (BBU) enabled
Managed with LVM
# time dd if=/dev/zero of=/opt/c1gfile bs=1024 count=20000000
20000000+0 records in
20000000+0 records out
20480000000 bytes (20 GB) copied, 95.6601 seconds, 214 MB/s
real 1m35.950s
user 0m9.389s
sys 1m19.322s
# time dd if=/dev/zero of=/data/c1gfile bs=1024 count=20000000
20000000+0 records in
20000000+0 records out
20480000000 bytes (20 GB) copied, 45.6391 seconds, 449 MB/s
real 0m45.652s
user 0m7.663s
sys 0m37.922s
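For reference, the throughput figures dd prints can be recomputed from the byte count and elapsed time (dd's "MB" here is 10^6 bytes). A one-liner sketch (dd_mbps is just an illustrative name):

```shell
#!/bin/sh
# dd_mbps BYTES SECONDS - print throughput in MB/s (MB = 10^6 bytes),
# rounded like dd's own summary line.
dd_mbps() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.0f\n", b / s / 1000000 }'
}
```

For example, `dd_mbps 20480000000 95.6601` reproduces the 214 MB/s ext3 figure above.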
References:
http://blog.csdn.net/rzhzhz/article/details/7107115
http://www.51osos.com/a/MySQL/jiagouyouhua/jiagouyouhua/2010/1122/Heartbeat.html
http://bbs.linuxtone.org/thread-7413-1-1.html
http://www.centos.bz/2012/03/achieve-drbd-high-availability-with-heartbeat/
http://www.doc88.com/p-714757400456.html
http://icarusli.iteye.com/blog/1751435
http://www.mysqlops.com/2012/03/10/dba%e7%9a%84%e4%ba%b2%e4%bb%ac%e5%ba%94%e8%af%a5%e7%9f%a5%e9%81%93%e7%9a%84raid%e5%8d%a1%e7%9f%a5%e8%af%86.html
http://www.mysqlsystems.com/2010/11/drbd-heartbeat-make-mysql-ha.html
http://www.wenzizone.cn/?p=234