Implementing HTTP High Availability with the Linux HA Cluster Software heartbeat
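Linux offers many kinds of high-availability cluster software, such as the common heartbeat, corosync, RHCS, and keepalived. These tools provide the high-availability guarantee that production environments depend on. This article briefly shows how to build a simple HTTP high-availability cluster with heartbeat v2. Before building the cluster, you need at least two hosts, and three basic preparation steps must be completed.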
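#debugfile /var/log/ha-debug    # debug log file (disabled here)
logfile /var/log/ha-log    # location of the log file
#logfacility local0    # syslog facility; do not enable this when logfile is in use
keepalive 2    # interval between heartbeats, in seconds
#deadtime 30    # how long a node may stay silent before it is considered offline
#warntime 10    # how long a node may stay silent before a warning is logged
#initdead 120    # how long to wait after startup for the other node to appear before giving up on it
#udpport 694    # port used for heartbeat traffic
#baud 19200    # baud rate for serial links
bcast eth0    # send heartbeats by broadcast (used here; on a LAN with many machines this wastes resources)
#mcast eth0 225.0.0.1 694 1 0    # send heartbeats by multicast
#ucast eth0 192.168.1.2    # send heartbeats by unicast
#auto_failback on    # whether resources move back to the primary node after it recovers; on means they do
#stonith baytech /etc/ha.d/conf/stonith.baytech    # STONITH definition: how to fence a node that is offline
#node ken3    # cluster node name; each node needs its own node entry, and the value must match uname -n
node test1.qiguo.com
node test2.qiguo.com
#ping 10.10.10.254    # address to ping
ping 192.168.1.1    # gateway address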
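Because every node entry has to match the output of uname -n on its machine, a quick check before starting heartbeat can catch a mismatch early. The following is a minimal Python sketch and not part of heartbeat: parse_ha_cf and local_node_listed are hypothetical helper names, and the parser only assumes the simple "directive value # comment" layout used in the configuration above.

```python
import os


def parse_ha_cf(text):
    """Collect active (uncommented) directives from ha.cf-style text.

    Returns a dict mapping each directive name to a list of its values,
    because directives such as node and ping can appear more than once.
    """
    directives = {}
    for raw in text.splitlines():
        # Drop everything after '#': full-line and trailing comments alike.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split(None, 1)
        name = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        directives.setdefault(name, []).append(value)
    return directives


def local_node_listed(directives, hostname=None):
    """Return True if hostname appears in one of the node directives."""
    if hostname is None:
        hostname = os.uname().nodename  # the value printed by `uname -n`
    nodes = []
    for value in directives.get("node", []):
        # Split on whitespace in case one line carries several names.
        nodes.extend(value.split())
    return hostname in nodes


# Sample input (an excerpt in the same layout as the configuration above).
SAMPLE = """\
logfile /var/log/ha-log   # log file
keepalive 2
#deadtime 30
bcast eth0
node test1.qiguo.com
node test2.qiguo.com
ping 192.168.1.1
"""

if __name__ == "__main__":
    conf = parse_ha_cf(SAMPLE)
    print(conf["node"])
    print(local_node_listed(conf, "test1.qiguo.com"))
```

Calling local_node_listed(conf) without a hostname checks the machine the script runs on, which is the case that matters on each cluster node.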
heartbeat[4825]: 2014/05/11_23:54:35 info: Version 2 support: false
heartbeat[4825]: 2014/05/11_23:54:35 WARN: Logging daemon is disabled --enabling logging daemon is recommended
heartbeat[4825]: 2014/05/11_23:54:35 info: **************************
heartbeat[4825]: 2014/05/11_23:54:35 info: Configuration validated. Starting heartbeat 2.1.4
heartbeat[4826]: 2014/05/11_23:54:35 info: heartbeat: version 2.1.4
heartbeat[4826]: 2014/05/11_23:54:35 info: Heartbeat generation: 1399811242
heartbeat[4826]: 2014/05/11_23:54:35 info: glib: UDP Broadcast heartbeat started on port 694 (694) interface eth0
heartbeat[4826]: 2014/05/11_23:54:35 info: glib: UDP Broadcast heartbeat closed on port 694 interface eth0 - Status: 1
heartbeat[4826]: 2014/05/11_23:54:35 info: glib: ping heartbeat started.
heartbeat[4826]: 2014/05/11_23:54:35 info: G_main_add_TriggerHandler: Added signal manual handler
heartbeat[4826]: 2014/05/11_23:54:35 info: G_main_add_TriggerHandler: Added signal manual handler
heartbeat[4826]: 2014/05/11_23:54:35 info: G_main_add_SignalHandler: Added signal handler for signal 17
heartbeat[4826]: 2014/05/11_23:54:35 info: Local status now set to: 'up'
heartbeat[4826]: 2014/05/11_23:54:36 info: Link test1.qiguo.com:eth0 up.
heartbeat[4826]: 2014/05/11_23:54:36 info: Link 192.168.1.1:192.168.1.1 up.
heartbeat[4826]: 2014/05/11_23:54:36 info: Status update for node 192.168.1.1: status ping
heartbeat[4826]: 2014/05/11_23:54:41 info: Link test2.qiguo.com:eth0 up.
heartbeat[4826]: 2014/05/11_23:54:41 info: Status update for node test2.qiguo.com: status up
harc[4835]: 2014/05/11_23:54:41 info: Running /etc/ha.d/rc.d/status status
heartbeat[4826]: 2014/05/11_23:54:42 info: Comm_now_up(): updating status to active
heartbeat[4826]: 2014/05/11_23:54:42 info: Local status now set to: 'active'
heartbeat[4826]: 2014/05/11_23:54:42 info: Status update for node test2.qiguo.com: status active
harc[4853]: 2014/05/11_23:54:42 info: Running /etc/ha.d/rc.d/status status
heartbeat[4826]: 2014/05/11_23:54:53 info: remote resource transition completed.
heartbeat[4826]: 2014/05/11_23:54:53 info: remote resource transition completed.
heartbeat[4826]: 2014/05/11_23:54:53 info: Initial resource acquisition complete (T_RESOURCES(us))
IPaddr[4907]: 2014/05/11_23:54:53 INFO: Resource is stopped
heartbeat[4871]: 2014/05/11_23:54:53 info: Local Resource acquisition completed.
harc[4957]: 2014/05/11_23:54:53 info: Running /etc/ha.d/rc.d/ip-request-resp ip-request-resp
ip-request-resp[4957]: 2014/05/11_23:54:53 received ip-request-resp IPaddr::192.168.1.210/24/eth0 OK yes
ResourceManager[4976]: 2014/05/11_23:54:53 info: Acquiring resource group: test1.qiguo.com IPaddr::192.168.1.210/24/eth0 httpd
IPaddr[5002]: 2014/05/11_23:54:53 INFO: Resource is stopped
ResourceManager[4976]: 2014/05/11_23:54:53 info: Running /etc/ha.d/resource.d/IPaddr 192.168.1.210/24/eth0 start
IPaddr[5097]: 2014/05/11_23:54:53 INFO: Using calculated netmask for 192.168.1.210: 255.255.255.0
IPaddr[5097]: 2014/05/11_23:54:53 INFO: eval ifconfig eth0:0 192.168.1.210 netmask 255.255.255.0 broadcast 192.168.1.255
IPaddr[5068]: 2014/05/11_23:54:53 INFO: Success
ResourceManager[4976]: 2014/05/11_23:54:53 info: Running /etc/init.d/httpd start

The log shows that the highly available HTTP cluster is up: test1 has brought up the virtual IP 192.168.1.210 on eth0:0 and started httpd. Now manually run shutdown -h now on test1 and watch how the log on test2 changes. (You can also trigger the switchover with the hb_standby script that ships with heartbeat; by default it is in the /usr/lib/heartbeat directory.)
heartbeat[11796]: 2014/05/11_20:56:46 info: Received shutdown notice from 'test1.qiguo.com'.
heartbeat[11796]: 2014/05/11_20:56:46 info: Resources being acquired from test1.qiguo.com.
heartbeat[11862]: 2014/05/11_20:56:46 info: acquire local HA resources (standby).
heartbeat[11863]: 2014/05/11_20:56:46 info: No local resources [/usr/share/heartbeat/ResourceManager listkeys test2.qiguo.com] to acquire.
heartbeat[11862]: 2014/05/11_20:56:46 info: local HA resource acquisition completed (standby).
heartbeat[11796]: 2014/05/11_20:56:46 info: Standby resource acquisition done [all].
harc[11888]: 2014/05/11_20:56:46 info: Running /etc/ha.d/rc.d/status status
mach_down[11903]: 2014/05/11_20:56:46 info: Taking over resource group IPaddr::192.168.1.210/24/eth0
ResourceManager[11928]: 2014/05/11_20:56:46 info: Acquiring resource group: test1.qiguo.com IPaddr::192.168.1.210/24/eth0 httpd
IPaddr[11954]: 2014/05/11_20:56:46 INFO: Resource is stopped
ResourceManager[11928]: 2014/05/11_20:56:46 info: Running /etc/ha.d/resource.d/IPaddr 192.168.1.210/24/eth0 start
IPaddr[12049]: 2014/05/11_20:56:46 INFO: Using calculated netmask for 192.168.1.210: 255.255.255.0
IPaddr[12049]: 2014/05/11_20:56:46 INFO: eval ifconfig eth0:0 192.168.1.210 netmask 255.255.255.0 broadcast 192.168.1.255
IPaddr[12020]: 2014/05/11_20:56:46 INFO: Success
ResourceManager[11928]: 2014/05/11_20:56:46 info: Running /etc/init.d/httpd start
mach_down[11903]: 2014/05/11_20:56:46 info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired
mach_down[11903]: 2014/05/11_20:56:46 info: mach_down takeover complete for node test1.qiguo.com.
heartbeat[11796]: 2014/05/11_20:56:46 info: mach_down takeover complete.
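To confirm a takeover without reading the whole log, the resource actions can be pulled out of it. The sketch below is an illustration in Python under stated assumptions: resource_actions is a hypothetical helper, and the regular expression is inferred only from the "ResourceManager ... Running <script> ..." lines shown above, so other heartbeat versions may log in a different shape.

```python
import re

# Matches lines such as:
# ResourceManager[11928]: 2014/05/11_20:56:46 info: Running /etc/init.d/httpd start
RUNNING = re.compile(
    r"^ResourceManager\[\d+\]:\s+(?P<ts>\S+)\s+info:\s+Running\s+(?P<cmd>.+?)\s*$"
)


def resource_actions(log_text):
    """Return (timestamp, script, action) for each resource script that was run."""
    actions = []
    for line in log_text.splitlines():
        match = RUNNING.match(line.strip())
        if not match:
            continue
        parts = match.group("cmd").split()
        # First token is the script path, last token is the action (e.g. start).
        actions.append((match.group("ts"), parts[0], parts[-1]))
    return actions


# Sample input: four lines copied from the takeover log above.
SAMPLE_LOG = """\
mach_down[11903]: 2014/05/11_20:56:46 info: Taking over resource group IPaddr::192.168.1.210/24/eth0
ResourceManager[11928]: 2014/05/11_20:56:46 info: Running /etc/ha.d/resource.d/IPaddr 192.168.1.210/24/eth0 start
IPaddr[12020]: 2014/05/11_20:56:46 INFO: Success
ResourceManager[11928]: 2014/05/11_20:56:46 info: Running /etc/init.d/httpd start
"""

if __name__ == "__main__":
    for ts, script, action in resource_actions(SAMPLE_LOG):
        print(ts, script, action)
```

Seeing both the IPaddr script and /etc/init.d/httpd run with start on test2 indicates that the virtual IP and the web server were taken over after test1 shut down.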