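Marathon service discovery and load balancing with marathon-lb
1- Introduction
The following shortcomings of Mesos-DNS are quoted from the official documentation; they are also the reasons for choosing marathon-lb for service discovery and load balancing.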
DNS does not identify service ports, unless you use an SRV query; most apps are not able to use SRV records “out of the box.”
DNS does not have fast failover.
DNS records have a TTL (time to live) and Mesos-DNS uses polling to create the DNS records; this can result in stale records.
DNS records do not provide any service health data.
Some applications and libraries do not correctly handle multiple A records; in some cases the query might be cached and not correctly reloaded as required.
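Marathon-lb is based on HAProxy and provides proxying and load balancing for TCP- and HTTP-based applications, along with features such as SSL support, HTTP compression, health checks and Lua scripting. Marathon-lb subscribes to Marathon's event bus, updates the HAProxy configuration in real time, and reloads HAProxy.
|| Marathon App settings
2- App settings
The app must be created in bridge mode with three ports [containerPort | hostPort | servicePort]; the group information in its labels [Labels] is what lets marathon-lb locate it.
The application inside the container listens on port 80. Mapping it to hostPort 0 means the host port is assigned randomly. marathon-lb collects these ports and exposes the service on the servicePort of the proxy node. This approach requires the network mode to be Bridge.
The label is the key to letting marathon-lb discover the app; here it is set to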
HAPROXY_GROUP=external
so the marathon_lb.py script must later be run with the --group external argument.
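A minimal sketch of an app definition matching the settings above, built as a Python 3 dict and serialized to JSON. The app id, image name and resource sizes are assumptions for illustration; servicePort 10000 and the three instances are taken from the values used later in this post, and the field layout assumes the Docker bridge format of Marathon versions from that period.

```python
import json

def build_app(app_id, image, service_port, group="external"):
    """Return a Marathon app definition that marathon-lb can discover."""
    return {
        "id": app_id,
        "cpus": 0.1,      # assumed resource sizes
        "mem": 64,
        "instances": 3,
        "container": {
            "type": "DOCKER",
            "docker": {
                "image": image,
                "network": "BRIDGE",   # required for port mappings
                "portMappings": [
                    {
                        "containerPort": 80,          # port inside the container
                        "hostPort": 0,                # 0 = random host port
                        "servicePort": service_port,  # port exposed by the proxy
                        "protocol": "tcp",
                    }
                ],
            },
        },
        # marathon-lb only picks up apps whose group matches --group
        "labels": {"HAPROXY_GROUP": group},
    }

app = build_app("/nginx", "nginx", 10000)
print(json.dumps(app, indent=2))
```

The printed JSON can be saved to a file and submitted to Marathon in whatever way the cluster is normally managed.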
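|| marathon-lb troubleshooting notes
This section records the problems hit during the first attempts and how each was solved. For the concise deployment procedure, see the next section, marathon-lb deployment.
1- Install the base environment: Lua & HAProxy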
# cp -r /opt/marathon-lb-master /marathon-lb
# cd /marathon-lb
# sh build-haproxy.sh
Install the required package, then run the build again:
# yum install readline-devel
# sh build-haproxy.sh
2- Run marathon-lb
Our environment still uses Python 2, so change python3 to python in every .py script in the directory.
# vim /marathon-lb/marathon_lb.py
#!/usr/bin/env python
# chmod +x marathon_lb.py
Run the marathon-lb script, passing the Marathon endpoint and the app's group:
# ./marathon_lb.py --marathon http://10.63.240.131:8080 --group external
This runs into an SSL certificate problem, so create a certificate first.
3- Create the PEM certificate file
# openssl genrsa -out /etc/ssl/mesosphere.com.key 1024
# openssl req -new -key /etc/ssl/mesosphere.com.key -out /etc/ssl/mesosphere.com.csr
# openssl x509 -req -days 3650 -in /etc/ssl/mesosphere.com.csr \
> -signkey /etc/ssl/mesosphere.com.key -out /etc/ssl/mesosphere.com.crt
# cat /etc/ssl/mesosphere.com.crt \
> /etc/ssl/mesosphere.com.key | tee /etc/ssl/mesosphere.com.pem
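4- Service reload problem
# ./marathon_lb.py --marathon http://10.63.240.131:8080 --group external
Take a look at the marathon_lb.py script.
The build-haproxy.sh script used earlier installs HAProxy from a tarball and does not set up any service lifecycle management script. For convenience, try installing with yum instead; the HAProxy version in the current repositories is 1.5.
5- Install with YUM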
# yum install haproxy
# systemctl restart haproxy.service
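Use the script to detect the apps on Marathon and update /etc/haproxy/haproxy.cfg:
# ./marathon_lb.py --marathon http://10.63.240.131:8080 --group external
Check the status of haproxy.service.
Keywords such as server-state-file, server-state-base, lua-load and load-server-state-from-file are not recognized.
The reason is that these are new features supported only from HAProxy 1.6 onward. The haproxy.cfg generated by the marathon-lb script uses the 1.6+ format, but the current CentOS repositories only go up to 1.5. See the following link: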
http://blog.haproxy.com/2015/10/14/whats-new-in-haproxy-1-6/
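Before pointing an older HAProxy at a generated file, it can help to check whether the file uses any of the directives listed above. The following Python 3 sketch does only that; the keyword list is exactly the four names from the error output, and the sample configuration (including its paths) is made up for illustration rather than copied from a real marathon-lb run.

```python
# Directives named in the error output above; HAProxy 1.5 rejects them.
HAPROXY_16_KEYWORDS = (
    "server-state-file",
    "server-state-base",
    "lua-load",
    "load-server-state-from-file",
)

def find_16_only_keywords(cfg_text):
    """Return the 1.6-only directives used in cfg_text, in first-seen order."""
    found = []
    for raw in cfg_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line:
            continue
        keyword = line.split()[0]
        if keyword in HAPROXY_16_KEYWORDS and keyword not in found:
            found.append(keyword)
    return found

# Hypothetical sample; the paths are placeholders.
sample = """
global
    server-state-file global
    server-state-base /var/state/haproxy/
    lua-load /path/to/script.lua
defaults
    load-server-state-from-file global
    mode http
"""
print(find_16_only_keywords(sample))
```

An empty result means none of these four directives appear; it does not prove the file is otherwise valid for 1.5.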
Find a third-party repository that provides 1.6:
https://copr.fedorainfracloud.org/coprs/nibbler/haproxy16/
# vim /etc/yum.repos.d/haproxy.repo
[nibbler-haproxy16]
name=Copr repo for haproxy16 owned by nibbler
baseurl=https://copr-be.cloud.fedoraproject.org/results/nibbler/haproxy16/epel-7-$basearch/
skip_if_unavailable=True
gpgcheck=1
gpgkey=https://copr-be.cloud.fedoraproject.org/results/nibbler/haproxy16/pubkey.gpg
enabled=1
enabled_metadata=1
# yum update haproxy
# systemctl restart haproxy.service
# systemctl status haproxy.service -l
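Now only the lua-load parameter is unrecognized: the tarball build was compiled with Lua support, but the RPM package was not.
# /usr/sbin/haproxy -vv
So give up on the RPM approach and go back to the tarball install.
|| marathon-lb deployment
0- Name resolution
The node acting as the load balancer must be able to resolve the hostnames of every node in the Marathon cluster.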
# vim /etc/hosts
10.63.240.131 ziyan.l-06-rbd-01
10.63.240.132 ziyan.l-07-rbd-02
10.63.240.133 ziyan.l-08-rbd-03
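As a quick way to confirm this prerequisite, the Python 3 sketch below reports which cluster hostnames fail to resolve. The function name and the stand-in resolver are hypothetical; the demo uses a fake lookup table so that it runs anywhere, and it deliberately leaves one host out to show what a failure looks like.

```python
import socket

def unresolved_hosts(hostnames, resolve=socket.gethostbyname):
    """Return the hostnames that the resolver cannot turn into an address."""
    missing = []
    for name in hostnames:
        try:
            resolve(name)
        except OSError:          # socket.gaierror is a subclass of OSError
            missing.append(name)
    return missing

cluster = ["ziyan.l-06-rbd-01", "ziyan.l-07-rbd-02", "ziyan.l-08-rbd-03"]

# Stand-in resolver for the demo: pretend the third entry is missing.
known = {
    "ziyan.l-06-rbd-01": "10.63.240.131",
    "ziyan.l-07-rbd-02": "10.63.240.132",
}

def fake_resolve(name):
    if name not in known:
        raise OSError("unknown host: " + name)
    return known[name]

print(unresolved_hosts(cluster, resolve=fake_resolve))
```

On the load balancer node itself, calling unresolved_hosts(cluster) without the resolve argument uses the system resolver, which reads /etc/hosts on a typical setup.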
1- Install HAProxy from the tarball
# yum install readline-devel
# cp -r /opt/marathon-lb-master /marathon-lb
# cd /marathon-lb
# sh build-haproxy.sh
2- Service management
# useradd -r haproxy
# cp /usr/local/sbin/haproxy /usr/sbin/
# cp /usr/src/haproxy-1.6.4/examples/haproxy.init /etc/init.d/haproxy
# chmod 755 /etc/init.d/haproxy
# mkdir /etc/haproxy
3- HAProxy test run
Write a simple configuration file to test it.
# vim /etc/haproxy/haproxy.cfg
global
    log /dev/log local0
    log /dev/log local1 notice
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s
    user haproxy
    group haproxy
    daemon
defaults
    log global
    mode http
    option httplog
    option dontlognull
    timeout connect 5000
    timeout client 50000
    timeout server 50000
frontend http_front
    bind *:80
    stats uri /haproxy?stats
    default_backend http_back
backend http_back
    balance roundrobin
    server test1 10.63.240.131:31687 check
# systemctl status haproxy.service -l
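This reports the error line 26: [: =: unary operator expected. Edit the init script and put quotes around the variable on line 26, so that an empty variable no longer causes a syntax error.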
# vim /etc/init.d/haproxy
[ "${NETWORKING}" = "no" ] && exit 0
There is also a cannot bind UNIX socket [/run/haproxy/admin.sock] error; create the directory:
# mkdir /run/haproxy
# systemctl restart haproxy.service
HAProxy finally starts.
4- SSL certificate
# openssl genrsa -out /etc/ssl/mesosphere.com.key 1024
# openssl req -new -key /etc/ssl/mesosphere.com.key -out /etc/ssl/mesosphere.com.csr
# openssl x509 -req -days 3650 -in /etc/ssl/mesosphere.com.csr \
> -signkey /etc/ssl/mesosphere.com.key -out /etc/ssl/mesosphere.com.crt
# cat /etc/ssl/mesosphere.com.crt \
> /etc/ssl/mesosphere.com.key | tee /etc/ssl/mesosphere.com.pem
5- Run marathon-lb
# vim /marathon-lb/marathon_lb.py
#!/usr/bin/env python
# chmod +x marathon_lb.py
The reload command in this script is /etc/init.d/haproxy reload, but in testing it occasionally left HAProxy unable to come back up, so changing reload to restart is the safer choice.
# vim /marathon-lb/marathon_lb.py
...
elif os.path.isfile('/etc/init.d/haproxy'):
    logger.debug("we seem to be running on a sysvinit based system")
    #reloadCommand = ['/etc/init.d/haproxy', 'reload']
    reloadCommand = ['/etc/init.d/haproxy', 'restart']
...
# ./marathon_lb.py --marathon http://10.63.240.131:8080 --group external
Check /etc/haproxy/haproxy.cfg
The three instances are now load balanced. Next, scale the app to 8 instances and run the script again:
# ./marathon_lb.py --marathon http://10.63.240.131:8080 --group external
# vim /etc/haproxy/haproxy.cfg
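When inspecting the regenerated file, the thing to look for is one server line per running instance in the app's backend. The Python sketch below illustrates that shape only. It is not marathon-lb's actual template: the backend name, the server naming scheme and the second and third host:port pairs are hypothetical, and the layout simply mirrors the hand-written test configuration from step 3.

```python
def render_backend(name, tasks, balance="roundrobin"):
    """Render an HAProxy backend stanza with one server line per task."""
    lines = ["backend {}".format(name), "    balance {}".format(balance)]
    for host, port in sorted(tasks):
        # Derive a unique server name from the task's host and port.
        server_name = "{}_{}".format(host.replace(".", "_"), port)
        lines.append("    server {} {}:{} check".format(server_name, host, port))
    return "\n".join(lines)

# Three instances; only the first host:port pair appears elsewhere in this post.
three = [
    ("10.63.240.131", 31687),
    ("10.63.240.132", 31002),
    ("10.63.240.133", 31544),
]
print(render_backend("http_back", three))
```

After scaling to 8 instances, the equivalent real backend should list 8 server lines instead of 3.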
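Open port 10000 on this machine in a browser to see the service; the configuration works.
http://10.63.240.128:10000
Port 9090 serves the HAProxy statistics:
http://10.63.240.128:9090/haproxy?stats
View the generated HAProxy configuration: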
http://10.63.240.128:9090/_haproxy_getconfig
|| Containerization
1- Add name resolution
# vim /etc/hosts
192.168.0.223 ziyan.l-06-rbd-01
192.168.0.224 ziyan.l-07-rbd-02
192.168.0.222 ziyan.l-08-rbd-03
2- Pull the image
# docker pull mesosphere/marathon-lb
3- Use SSE mode
In SSE mode, marathon-lb connects to Marathon's event endpoint and is notified whenever an app's state changes.
# docker run -e PORTS=9090 --net=host mesosphere/marathon-lb sse \
> --marathon http://10.63.240.131:8080 --group external
The HAProxy configuration file is now updated in real time as the app's instances change.
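To give a feel for what arrives over such an event stream, here is a small Python 3 sketch of a Server-Sent Events parser. It is not marathon-lb's own code; the function name is hypothetical, and the sample events are trimmed-down, made-up payloads that only assume Marathon sends JSON data with an event type such as status_update_event.

```python
import json

def parse_sse(lines):
    """Yield (event_type, data) tuples from an iterable of SSE text lines."""
    event, data = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:                       # a blank line ends one event
            if data:
                yield event or "message", json.loads("\n".join(data))
            event, data = None, []
        elif line.startswith(":"):         # SSE comment, e.g. a keep-alive
            continue
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].lstrip())
    # An event without a terminating blank line is discarded.

# Made-up sample stream for illustration.
stream = [
    "event: status_update_event\n",
    'data: {"appId": "/nginx", "taskStatus": "TASK_RUNNING", "ports": [31687]}\n',
    "\n",
    ": keep-alive\n",
    "event: status_update_event\n",
    'data: {"appId": "/nginx", "taskStatus": "TASK_KILLED", "ports": [31687]}\n',
    "\n",
]
events = list(parse_sse(stream))
for kind, payload in events:
    print(kind, payload["appId"], payload["taskStatus"])
```

Each parsed event is the kind of trigger on which the load balancer would regenerate its configuration, instead of polling Marathon on a timer.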
4- Run the marathon-lb container on the Marathon cluster as well (requires marathonctl to be deployed)
# vim /opt/marathonctl/apps/lb.json
{
  "id": "marathon-lb",
  "args": [
    "sse",
    "--marathon", "http://10.63.240.131:8080",
    "--group", "external"
  ],
  "cpus": 1,
  "mem": 512,
  "instances": 1,
  "env": {
    "PORTS": "9090"
  },
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "mesosphere/marathon-lb",
      "network": "HOST"
    }
  }
}
# usemara app create /opt/marathonctl/apps/lb.json
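The usemara command is specific to this environment. As an alternative, the same definition can be submitted through Marathon's REST API with a POST to /v2/apps. The Python 3 sketch below only builds the request and does not send it; the helper name is hypothetical, and it assumes the API is reachable at the same URL that is passed to --marathon.

```python
import json
import urllib.request

def build_create_request(marathon_url, app):
    """Build (but do not send) a POST /v2/apps request for an app definition."""
    body = json.dumps(app).encode("utf-8")
    return urllib.request.Request(
        marathon_url.rstrip("/") + "/v2/apps",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Same content as lb.json above.
lb_app = {
    "id": "marathon-lb",
    "args": ["sse", "--marathon", "http://10.63.240.131:8080", "--group", "external"],
    "cpus": 1,
    "mem": 512,
    "instances": 1,
    "env": {"PORTS": "9090"},
    "container": {
        "type": "DOCKER",
        "docker": {"image": "mesosphere/marathon-lb", "network": "HOST"},
    },
}

req = build_create_request("http://10.63.240.131:8080", lb_app)
print(req.get_method(), req.full_url)
# To actually submit it: urllib.request.urlopen(req)
```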
http://blog.sina.com.cn/s/blog_6f2d2e310102wisi.html