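Implementing Heartbeat + DRBD

Introduction:

Continuing from the previous steps: with drbd deployed, this article combines drbd with heartbeat to make the drbd service highly available, so that the device is mounted automatically on the primary node and fails over automatically when a fault occurs.

Following the earlier deployment, only the heartbeat resources need to change, that is, the contents of the /etc/ha.d/haresources file.

1. Preparation

Note: before configuring high availability for drbd, make sure the drbd service is running and that both ends are in the Secondary state, as shown below: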

[root@heartbeat01 ~]# cat /proc/drbd

version: 8.3.16 (api:88/proto:86-97)

GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2014-11-24 14:51:37

 0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----

    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

Therefore, drbd must be started, and set to start automatically at boot, on both drbd nodes.

/etc/init.d/drbd start
chkconfig drbd on
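
The Secondary/Secondary precondition can also be checked mechanically before heartbeat is started. The following is a minimal sketch rather than part of the original setup: the function names `drbd_roles` and `both_secondary` are hypothetical, and the sample line is copied from the /proc/drbd output above.

```shell
#!/bin/sh
# Print the role pair (e.g. "Secondary/Secondary") of every DRBD device
# found in /proc/drbd-style text read from standard input.
drbd_roles() {
    sed -n 's|.* ro:\([A-Za-z]*/[A-Za-z]*\) .*|\1|p'
}

# Succeed only if at least one device is listed and every device
# reports Secondary/Secondary.
both_secondary() {
    roles=$(drbd_roles)
    [ -n "$roles" ] || return 1
    ! printf '%s\n' "$roles" | grep -qv '^Secondary/Secondary$'
}

# Sample status line taken from the output shown earlier.
sample=' 0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----'

if printf '%s\n' "$sample" | both_secondary; then
    echo "ready for heartbeat"
else
    echo "not ready"
fi
```

On a real node the same check would read the live status instead of the sample, for example `both_secondary < /proc/drbd`.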

Once the above is done, edit the haresources file so that its resource line reads as follows:

[root@heartbeat01 ~]# tail -1 /etc/ha.d/haresources 

heartbeat01.contoso.com  IPaddr::172.16.49.100/24/eth1 drbddisk::test Filesystem::/dev/drbd0::/data::ext4

# heartbeat01 is shown here as the example; the configuration on heartbeat02 is identical
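
To make the structure of that line easier to see, here is a small sketch that splits it into the preferred node and the per-resource start commands. The function name `expand_haresources` is hypothetical; the `/etc/ha.d/resource.d/<script> <args> start` form follows the commands that appear in the heartbeat log later in this article.

```shell
#!/bin/sh
# Expand one haresources line (read from standard input) into the start
# commands it implies, in left-to-right order.
expand_haresources() {
    read -r node resources || [ -n "$node" ] || return 1
    echo "preferred node: $node"
    for res in $resources; do
        # "script::arg1::arg2" becomes "script arg1 arg2"
        cmd=$(printf '%s\n' "$res" | sed 's/::/ /g')
        echo "/etc/ha.d/resource.d/$cmd start"
    done
}

# The resource line used in this article.
line='heartbeat01.contoso.com  IPaddr::172.16.49.100/24/eth1 drbddisk::test Filesystem::/dev/drbd0::/data::ext4'

printf '%s\n' "$line" | expand_haresources
```

Read this way, the line says: bring up the VIP on eth1, promote the drbd resource named test to Primary, then mount /dev/drbd0 on /data as ext4.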

2. Start heartbeat

Then start the heartbeat service on both nodes at the same time:

/etc/init.d/heartbeat start

3. Check the services on both nodes

1) Status on node 1 (heartbeat01):

[root@heartbeat01 ~]# ip a |grep 49.100

    inet 172.16.49.100/24 brd 172.16.49.255 scope global secondary eth1

As shown, node 1 (heartbeat01) has acquired the VIP.
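
The `grep 49.100` filter is convenient interactively, but the pattern is unanchored and `.` matches any character. For scripted checks, a stricter comparison may be preferable. This is only a sketch; the function name `has_vip` is hypothetical and the sample line is the one printed above.

```shell
#!/bin/sh
# Succeed if the exact address $1 appears as an "inet" entry in
# "ip a"-style text read from standard input.
has_vip() {
    awk -v vip="$1" '
        $1 == "inet" { split($2, a, "/"); if (a[1] == vip) found = 1 }
        END { exit !found }
    '
}

# Sample line taken from the output shown earlier.
sample='    inet 172.16.49.100/24 brd 172.16.49.255 scope global secondary eth1'

if printf '%s\n' "$sample" | has_vip 172.16.49.100; then
    echo "VIP present"
else
    echo "VIP absent"
fi
```

On a node this would be used as `ip a | has_vip 172.16.49.100`.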

[root@heartbeat01 ~]# cat /proc/drbd

version: 8.3.16 (api:88/proto:86-97)

GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2014-11-24 14:51:37

 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----

    ns:4 nr:0 dw:4 dr:709 al:1 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

In addition, heartbeat01 is the Primary node in drbd.

[root@heartbeat01 ~]# mount 

/dev/mapper/VolGroup-lv_root on / type ext4 (rw)

proc on /proc type proc (rw)

sysfs on /sys type sysfs (rw)

devpts on /dev/pts type devpts (rw,gid=5,mode=620)

tmpfs on /dev/shm type tmpfs (rw)

/dev/sda1 on /boot type ext4 (rw)

none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)

/dev/drbd0 on /data type ext4 (rw)

heartbeat01 has automatically mounted /dev/drbd0 on /data.
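
Rather than scanning the full `mount` listing by eye, the same fact can be tested directly. A minimal sketch, assuming the classic `DEV on DIR type FS (opts)` output format shown above; the function name `is_mounted_on` is hypothetical.

```shell
#!/bin/sh
# Succeed if device $1 is mounted on directory $2 according to
# mount(8)-style lines read from standard input.
is_mounted_on() {
    awk -v dev="$1" -v dir="$2" '
        $1 == dev && $2 == "on" && $3 == dir { found = 1 }
        END { exit !found }
    '
}

# Two lines taken from the mount output shown earlier.
sample='/dev/sda1 on /boot type ext4 (rw)
/dev/drbd0 on /data type ext4 (rw)'

if printf '%s\n' "$sample" | is_mounted_on /dev/drbd0 /data; then
    echo "mounted"
else
    echo "not mounted"
fi
```

On a node this would be used as `mount | is_mounted_on /dev/drbd0 /data`.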

[root@heartbeat01 ~]# ls /data

10.txt  1.txt   29.txt  38.txt  47.txt  56.txt  65.txt  74.txt  83.txt  92.txt

11.txt  20.txt  2.txt   39.txt  48.txt  57.txt  66.txt  75.txt  84.txt  93.txt

12.txt  21.txt  30.txt  3.txt   49.txt  58.txt  67.txt  76.txt  85.txt  94.txt

13.txt  22.txt  31.txt  40.txt  4.txt   59.txt  68.txt  77.txt  86.txt  95.txt

14.txt  23.txt  32.txt  41.txt  50.txt  5.txt   69.txt  78.txt  87.txt  96.txt

15.txt  24.txt  33.txt  42.txt  51.txt  60.txt  6.txt   79.txt  88.txt  97.txt

16.txt  25.txt  34.txt  43.txt  52.txt  61.txt  70.txt  7.txt   89.txt  98.txt

17.txt  26.txt  35.txt  44.txt  53.txt  62.txt  71.txt  80.txt  8.txt   99.txt

18.txt  27.txt  36.txt  45.txt  54.txt  63.txt  72.txt  81.txt  90.txt  9.txt

19.txt  28.txt  37.txt  46.txt  55.txt  64.txt  73.txt  82.txt  91.txt  lost+found

The files previously synchronized by drbd are all present as well.

2) Status on node 2 (heartbeat02):

[root@heartbeat02 ~]# ip a |grep 49.100

Node 2 does not hold the VIP.

[root@heartbeat02 ~]# cat /proc/drbd

version: 8.3.16 (api:88/proto:86-97)

GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2014-11-24 14:51:37

 0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----

    ns:0 nr:4 dw:4 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

Node 2 (heartbeat02) is in the Secondary state in drbd.

[root@heartbeat02 ~]# mount -n

/dev/mapper/VolGroup-lv_root on / type ext4 (rw)

proc on /proc type proc (rw)

sysfs on /sys type sysfs (rw)

devpts on /dev/pts type devpts (rw,gid=5,mode=620)

tmpfs on /dev/shm type tmpfs (rw)

/dev/sda1 on /boot type ext4 (rw)

none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)

Likewise, heartbeat02 has not mounted /dev/drbd0.

[root@heartbeat02 ~]# ll /data

total 0

As expected, /data is empty there.

4. Simulate a failover

Next, stop the heartbeat service on heartbeat01 and check whether the drbd device is automatically mounted on heartbeat02.

[root@heartbeat01 ~]# /etc/init.d/heartbeat stop

Stopping High-Availability services: Done.

1) Status on node 1 (heartbeat01) after the stop; the VIP is gone, drbd has dropped to Secondary, and /data is empty:

[root@heartbeat01 ~]# ip a|grep 49.100

[root@heartbeat01 ~]# cat /proc/drbd

version: 8.3.16 (api:88/proto:86-97)

GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2014-11-24 14:51:37

 0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----

    ns:16 nr:4 dw:20 dr:1418 al:1 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

[root@heartbeat01 ~]# ll /data

total 0

2) Status on node 2 (heartbeat02); it now holds the VIP, is Primary in drbd, and has the files under /data:

[root@heartbeat02 ~]# ip a |grep 49.100

    inet 172.16.49.100/24 brd 172.16.49.255 scope global secondary eth1

[root@heartbeat02 ~]# cat /proc/drbd

version: 8.3.16 (api:88/proto:86-97)

GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2014-11-24 14:51:37

 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----

    ns:4 nr:16 dw:20 dr:705 al:1 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

[root@heartbeat02 ~]# ls /data

10.txt  1.txt   29.txt  38.txt  47.txt  56.txt  65.txt  74.txt  83.txt  92.txt

11.txt  20.txt  2.txt   39.txt  48.txt  57.txt  66.txt  75.txt  84.txt  93.txt

12.txt  21.txt  30.txt  3.txt   49.txt  58.txt  67.txt  76.txt  85.txt  94.txt

13.txt  22.txt  31.txt  40.txt  4.txt   59.txt  68.txt  77.txt  86.txt  95.txt

14.txt  23.txt  32.txt  41.txt  50.txt  5.txt   69.txt  78.txt  87.txt  96.txt

15.txt  24.txt  33.txt  42.txt  51.txt  60.txt  6.txt   79.txt  88.txt  97.txt

16.txt  25.txt  34.txt  43.txt  52.txt  61.txt  70.txt  7.txt   89.txt  98.txt

17.txt  26.txt  35.txt  44.txt  53.txt  62.txt  71.txt  80.txt  8.txt   99.txt

18.txt  27.txt  36.txt  45.txt  54.txt  63.txt  72.txt  81.txt  90.txt  9.txt

19.txt  28.txt  37.txt  46.txt  55.txt  64.txt  73.txt  82.txt  91.txt  lost+found
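
When a failover test like this is scripted, the checks on the surviving node usually need to be retried until the takeover has finished. The helper below is a generic polling sketch, not something from the original article; the name `wait_until` is hypothetical.

```shell
#!/bin/sh
# Run the given command once per second until it succeeds.
# Fail if it still has not succeeded after $1 seconds.
wait_until() {
    timeout=$1
    shift
    elapsed=0
    until "$@"; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Demonstration with a condition that is already satisfied.
if wait_until 3 true; then
    echo "condition met"
else
    echo "timed out"
fi
```

On heartbeat02, one possible use is `wait_until 30 grep -q '^/dev/drbd0 /data ' /proc/mounts`, which waits up to 30 seconds for the mount to appear; the 30-second limit is an arbitrary choice.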

3) Check the log on heartbeat02:

Sep 26 00:32:04 heartbeat02.contoso.com heartbeat: [4084]: info: Received shutdown notice from 'heartbeat01.contoso.com'.

Sep 26 00:32:04 heartbeat02.contoso.com heartbeat: [4084]: info: Resources being acquired from heartbeat01.contoso.com.

Sep 26 00:32:04 heartbeat02.contoso.com heartbeat: [4150]: info: acquire local HA resources (standby).

Sep 26 00:32:04 heartbeat02.contoso.com heartbeat: [4150]: info: local HA resource acquisition completed (standby).

Sep 26 00:32:04 heartbeat02.contoso.com heartbeat: [4084]: info: Standby resource acquisition done [all].

Sep 26 00:32:04 heartbeat02.contoso.com heartbeat: [4151]: info: No local resources [/usr/share/heartbeat/ResourceManager listkeys heartbeat02.contoso.com] to acquire.

harc(default)[4176]: 2016/09/26_00:32:04 info: Running /etc/ha.d//rc.d/status status

mach_down(default)[4193]: 2016/09/26_00:32:04 info: Taking over resource group IPaddr::172.16.49.100/24/eth1

ResourceManager(default)[4220]: 2016/09/26_00:32:04 info: Acquiring resource group: heartbeat01.contoso.com IPaddr::172.16.49.100/24/eth1 drbddisk::test Filesystem::/dev/drbd0::/data::ext4

/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_172.16.49.100)[4248]: 2016/09/26_00:32:04 INFO:  Resource is stopped

ResourceManager(default)[4220]: 2016/09/26_00:32:04 info: Running /etc/ha.d/resource.d/IPaddr 172.16.49.100/24/eth1 start

IPaddr(IPaddr_172.16.49.100)[4373]: 2016/09/26_00:32:04 INFO: Adding inet address 172.16.49.100/24 with broadcast address 172.16.49.255 to device eth1

IPaddr(IPaddr_172.16.49.100)[4373]: 2016/09/26_00:32:04 INFO: Bringing device eth1 up

IPaddr(IPaddr_172.16.49.100)[4373]: 2016/09/26_00:32:04 INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-172.16.49.100 eth1 172.16.49.100 auto not_used not_used

/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_172.16.49.100)[4347]: 2016/09/26_00:32:04 INFO:  Success

ResourceManager(default)[4220]: 2016/09/26_00:32:04 info: Running /etc/ha.d/resource.d/drbddisk test start

/usr/lib/ocf/resource.d//heartbeat/Filesystem(Filesystem_/dev/drbd0)[4505]: 2016/09/26_00:32:04 INFO:  Resource is stopped

ResourceManager(default)[4220]: 2016/09/26_00:32:04 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 /data ext4 start

Filesystem(Filesystem_/dev/drbd0)[4595]: 2016/09/26_00:32:04 INFO: Running start for /dev/drbd0 on /data

/usr/lib/ocf/resource.d//heartbeat/Filesystem(Filesystem_/dev/drbd0)[4587]: 2016/09/26_00:32:04 INFO:  Success

mach_down(default)[4193]: 2016/09/26_00:32:04 info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired

mach_down(default)[4193]: 2016/09/26_00:32:04 info: mach_down takeover complete for node heartbeat01.contoso.com.

Sep 26 00:32:04 heartbeat02.contoso.com heartbeat: [4084]: info: mach_down takeover complete.

Sep 26 00:32:36 heartbeat02.contoso.com heartbeat: [4084]: WARN: node heartbeat01.contoso.com: is dead

Sep 26 00:32:36 heartbeat02.contoso.com heartbeat: [4084]: info: Dead node heartbeat01.contoso.com gave up resources.

Sep 26 00:32:36 heartbeat02.contoso.com heartbeat: [4084]: info: Link heartbeat01.contoso.com:eth1 dead.

Sep 26 00:32:36 heartbeat02.contoso.com ipfail: [4110]: info: Status update: Node heartbeat01.contoso.com now has status dead

Sep 26 00:32:38 heartbeat02.contoso.com ipfail: [4110]: info: NS: We are dead. :<

Sep 26 00:32:38 heartbeat02.contoso.com ipfail: [4110]: info: Link Status update: Link heartbeat01.contoso.com/eth1 now has status dead

Sep 26 00:32:39 heartbeat02.contoso.com ipfail: [4110]: info: We are dead. :<

Sep 26 00:32:39 heartbeat02.contoso.com ipfail: [4110]: info: Asking other side for ping node count.
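
The log is verbose, so it can help to reduce it to the resource scripts that were actually started and the order in which they ran. A minimal sketch, with the hypothetical function name `takeover_steps`; the sample lines are copied from the log above.

```shell
#!/bin/sh
# From heartbeat log text on standard input, print each resource script
# that ResourceManager started, with its arguments, in log order.
takeover_steps() {
    sed -n 's|.*info: Running /etc/ha.d/resource.d/\([^ ]*\) \(.*\) start$|\1 \2|p'
}

# Sample lines taken from the heartbeat02 log shown earlier.
sample='harc(default)[4176]: 2016/09/26_00:32:04 info: Running /etc/ha.d//rc.d/status status
ResourceManager(default)[4220]: 2016/09/26_00:32:04 info: Running /etc/ha.d/resource.d/IPaddr 172.16.49.100/24/eth1 start
ResourceManager(default)[4220]: 2016/09/26_00:32:04 info: Running /etc/ha.d/resource.d/drbddisk test start
ResourceManager(default)[4220]: 2016/09/26_00:32:04 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 /data ext4 start'

printf '%s\n' "$sample" | takeover_steps
```

The result matches the left-to-right order of the haresources line: the VIP first, then the drbd promotion, then the filesystem mount.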



This article is reposted from jerry1111111's 51CTO blog. Original link: http://blog.51cto.com/jerry12356/1856566. To repost it, please contact the original author.
