IV 12 MySQL+drbd+heartbeat

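One master with multiple slaves is the most common DB architecture. It is simple to deploy and easy to maintain. Read/write splitting can be done through a proxy or in the application, and several slaves behind LVS or haproxy load-balance the read traffic, which removes the single point of failure for reads. The single master, however, is still a single point of failure: if it goes down, writes stop. The simplest remedy is manual intervention backed by monitoring: when the master fails, an administrator promotes the semi-synchronous slave to master and repoints the other slaves to it. That works, but it is not suitable where availability requirements are high.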

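Note:

Under normal conditions MySQL-M-active handles writes, MySQL-M-inactive is not visible to clients, and the MySQL slaves handle reads (the slaves can also be load-balanced). The slaves replicate from the master using MySQL's own replication mechanism, connecting through the VIP. Read/write splitting on the web servers is done in the application itself, or with open-source software such as mysql proxy or amoeba.

Note: dual-master hot-standby mode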

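1. Install and configure heartbeat

Environment:

VIP: 10.96.20.8

master: eth0 (10.96.20.113), eth1 (172.16.1.113, no gateway or DNS configured), hostname test-master

backup: eth0 (10.96.20.114), eth1 (172.16.1.114, no gateway or DNS configured), hostname test-backup

Each node has two NICs and two disks.

Note: eth0 carries the management IP; eth1 carries the heartbeat link and the drbd replication traffic. If heartbeat and data traffic share one NIC in production, rate-limit the data traffic so that bandwidth is left for the heartbeat.

Note: keep the VM labels in vmware and the tab labels in Xshell consistent. In a production environment every host should have a record in /etc/hosts, which simplifies distribution, management, and maintenance.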
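The /etc/hosts requirement can be checked before going further. The helper below is a minimal sketch; the name hosts_has_entry is made up for illustration, and the optional second argument exists only so the check can be pointed at a test file.

```shell
# hosts_has_entry NAME [FILE]
# Succeeds if FILE (default /etc/hosts) has a non-comment line that lists
# NAME as a hostname or alias. Hypothetical helper, not part of heartbeat.
hosts_has_entry() {
    name=$1
    file=${2:-/etc/hosts}
    awk -v n="$name" '
        { sub(/#.*/, "") }                      # drop comments
        NF >= 2 { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit found ? 0 : 1 }
    ' "$file"
}

# Usage on either node:
#   for h in test-master test-backup; do
#       hosts_has_entry "$h" || echo "missing /etc/hosts record for $h"
#   done
```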

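test-master (on each node, set the hostname in /etc/sysconfig/network so that it matches the output of uname -n exactly, and set up /etc/hosts, mutual ssh trust between the two nodes, time synchronization, iptables, and selinux):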

[root@test-master ~]# cat /etc/redhat-release

Red Hat Enterprise Linux Server release 6.5(Santiago)

[root@test-master ~]# uname -rm

2.6.32-431.el6.x86_64 x86_64

[root@test-master ~]# uname -n

test-master

[root@test-master ~]# ifconfig | grep eth0 -A 1

eth0 Link encap:Ethernet HWaddr00:0C:29:1F:B6:AC

inet addr:10.96.20.113 Bcast:10.96.20.255 Mask:255.255.255.0

[root@test-master ~]# ifconfig | grep eth1 -A 1

eth1 Link encap:Ethernet HWaddr00:0C:29:1F:B6:B6

inet addr:172.16.1.113 Bcast:172.16.1.255 Mask:255.255.255.0

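[root@test-master ~]# route add -host 172.16.1.114 dev eth1 # (adds a host route so that heartbeat traffic leaves through the chosen NIC; append this line to /etc/rc.local, or configure a static route instead: #vim /etc/sysconfig/network-scripts/route-eth1 and add 172.16.1.114/24 via 172.16.1.113)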

[root@test-master ~]# ssh-keygen -t rsa -f ./.ssh/id_rsa -P ''

Generating public/private rsa key pair.

Your identification has been saved in./.ssh/id_rsa.

Your public key has been saved in./.ssh/id_rsa.pub.

The key fingerprint is:

29:c3:a3:68:81:43:59:2f:0a:ad:8a:54:56:b0:1e:12root@test-master

The key's randomart image is:

+--[ RSA 2048]----+

| E o.. |

| .+ + |

|.+.* . |

|oo* o. . |

|+o.. = S |

|+. o . + |

|o o . |

| . |

| |

+-----------------+

[root@test-master ~]# ssh-copy-id -i ./.ssh/id_rsa root@test-backup

The authenticity of host 'test-backup(10.96.20.114)' can't be established.

RSA key fingerprint is63:f5:2e:dc:96:64:54:72:8e:14:7e:ec:ef:b8:a1:0c.

Are you sure you want to continueconnecting (yes/no)? yes

Warning: Permanently added 'test-backup'(RSA) to the list of known hosts.

root@test-backup's password:

Now try logging into the machine, with"ssh 'root@test-backup'", and check in:

.ssh/authorized_keys

to make sure we haven't added extra keysthat you weren't expecting.

[root@test-master ~]# crontab -l

*/5 * * * * /usr/sbin/ntpdate time.windows.com &> /dev/null

[root@test-master ~]# service crond restart

Stopping crond: [ OK ]

Starting crond: [ OK ]

[root@test-master ~]# wget http://mirrors.ustc.edu.cn/fedora/epel/6/x86_64/epel-release-6-8.noarch.rpm

[root@test-master ~]# rpm -ivh epel-release-6-8.noarch.rpm

warning: epel-release-6-8.noarch.rpm:Header V3 RSA/SHA256 Signature, key ID 0608b895: NOKEY

Preparing... ########################################### [100%]

1:epel-release ########################################### [100%]

[root@test-master ~]# yum search heartbeat

……

heartbeat-devel.i686 : Heartbeatdevelopment package

heartbeat-devel.x86_64 : Heartbeatdevelopment package

heartbeat-libs.i686 : Heartbeat libraries

heartbeat-libs.x86_64 : Heartbeat libraries

heartbeat.x86_64 : Messaging and membershipsubsystem for High-Availability Linux

[root@test-master ~]# yum -y install heartbeat

[root@test-master ~]# chkconfig heartbeat off

[root@test-master ~]# chkconfig --list heartbeat

heartbeat 0:off 1:off 2:off 3:off 4:off 5:off 6:off

test-backup:

[root@test-backup ~]# uname -n

test-backup

[root@test-backup ~]# ifconfig | grep eth0 -A 1

eth0 Link encap:Ethernet HWaddr00:0C:29:15:E6:BB

inet addr:10.96.20.114 Bcast:10.96.20.255 Mask:255.255.255.0

[root@test-backup ~]# ifconfig | grep eth1 -A 1

eth1 Link encap:Ethernet HWaddr00:0C:29:15:E6:C5

inet addr:172.16.1.114 Bcast:172.16.1.255 Mask:255.255.255.0

[root@test-backup ~]# route add -host 172.16.1.113 dev eth1

[root@test-backup ~]# ssh-keygen -t rsa -f ./.ssh/id_rsa -P ''

Generating public/private rsa key pair.

Your identification has been saved in./.ssh/id_rsa.

Your public key has been saved in./.ssh/id_rsa.pub.

The key fingerprint is:

08:ea:6a:44:7f:1a:c9:bf:ff:01:d5:32:e5:39:1b:b8root@test-backup

The key's randomart image is:

+--[ RSA 2048]----+

| . |

| = . |

| . = * |

| . . . .. + + |

|. + . ..SE . |

| o = . . |

|. . = . |

| o . . . |

|o .o... |

+-----------------+

[root@test-backup ~]# ssh-copy-id -i ./.ssh/id_rsa root@test-master

The authenticity of host 'test-master(10.96.20.113)' can't be established.

RSA key fingerprint is63:f5:2e:dc:96:64:54:72:8e:14:7e:ec:ef:b8:a1:0c.

Are you sure you want to continueconnecting (yes/no)? yes

Warning: Permanently added 'test-master'(RSA) to the list of known hosts.

root@test-master's password:

Now try logging into the machine, with"ssh 'root@test-master'", and check in:

.ssh/authorized_keys

to make sure we haven't added extra keysthat you weren't expecting.

[root@test-backup ~]# crontab -l

*/5 * * * * /usr/sbin/ntpdate time.windows.com &> /dev/null

[root@test-backup ~]# service crond restart

Stopping crond: [ OK ]

Starting crond: [ OK ]

[root@test-backup ~]# wget http://mirrors.ustc.edu.cn/fedora/epel/6/x86_64/epel-release-6-8.noarch.rpm

[root@test-backup ~]# rpm -ivh epel-release-6-8.noarch.rpm

[root@test-backup ~]# yum -y install heartbeat

[root@test-backup ~]# chkconfig heartbeat off

[root@test-backup ~]# chkconfig --list heartbeat

heartbeat 0:off 1:off 2:off 3:off 4:off 5:off 6:off

test-master:

[root@test-master ~]# cp /usr/share/doc/heartbeat-3.0.4/{ha.cf,authkeys,haresources} /etc/ha.d/

[root@test-master ~]# cd /etc/ha.d

[root@test-master ha.d]# ls

authkeys ha.cf harc haresources rc.d README.config resource.d shellfuncs

[root@test-master ha.d]# vim authkeys # (generate a random string with #dd if=/dev/random count=1 bs=512 | md5sum and put it after sha1)

auth 1

1 sha1 912d6402295ac8d47109e56b177073b9

[root@test-master ha.d]# chmod 600 authkeys # (this file must have mode 600, otherwise the service reports an error on startup)

[root@test-master ha.d]# ll !$

ll authkeys

-rw-------. 1 root root 692 Aug 7 21:51 authkeys
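The authkeys steps can be scripted so that both nodes get an identical file. This is a sketch under a few assumptions: the function names are hypothetical, and /dev/urandom is used instead of /dev/random so that the command never blocks waiting for entropy.

```shell
# make_authkeys KEY
# Prints an authkeys file body that selects method 1 (sha1) with KEY as the
# shared secret. Hypothetical helper; the two-line format is the one shown above.
make_authkeys() {
    printf 'auth 1\n1 sha1 %s\n' "$1"
}

# random_key
# Prints 32 hex characters derived from 512 random bytes.
# /dev/urandom is an assumption made here to avoid blocking.
random_key() {
    dd if=/dev/urandom bs=512 count=1 2>/dev/null | md5sum | awk '{ print $1 }'
}

# Usage (as root on one node, then copy the file to the peer):
#   make_authkeys "$(random_key)" > /etc/ha.d/authkeys
#   chmod 600 /etc/ha.d/authkeys
```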

[root@test-master ha.d]# vim ha.cf

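debugfile /var/log/ha-debug # (debug log)

logfile /var/log/ha-log

logfacility local1 # (configure rsyslog to receive these logs on local1)

keepalive 2 # (heartbeat interval: one heartbeat every 2s)

deadtime 30 # (if the backup node hears no heartbeat from the primary node for 30s, it takes over the peer's resources)

warntime 10 # (if the backup node hears no heartbeat from the primary node for 10s, it writes a warning to the log; no failover happens at this point)

initdead 120 # (after heartbeat first starts, wait 120s before starting the primary node's resources, which gives the peer's heartbeat service time to come up; this value should be at least twice deadtime)

udpport 694

#bcast eth0 # (send heartbeats as Ethernet broadcasts on eth0; to send them over two physical networks, write bcast eth0 eth1)

mcast eth0 225.0.0.11 694 1 0 # (multicast parameters; the multicast address must be unique within the LAN because several heartbeat clusters may coexist; it must be a class D address (224.0.0.0 to 239.255.255.255); the format is mcast dev mcast_group port ttl loop)

auto_failback on # (fail back to the primary node once it recovers)

node test-master # (primary node hostname, as printed by uname -n)

node test-backup # (backup node hostname)

crm no # (whether to enable the CRM)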
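The relationship between the timing parameters is easy to get wrong when the values are tuned later. The sketch below checks an ha.cf-style file against two rules: initdead at least twice deadtime (stated in the comments above) and keepalive < warntime < deadtime (my reading of those comments, not something heartbeat is known to enforce). The function name is hypothetical, and values are assumed to be plain integers in seconds.

```shell
# check_ha_timing FILE
# Reads keepalive, warntime, deadtime and initdead from an ha.cf-style file
# and succeeds only if keepalive < warntime < deadtime and
# initdead >= 2 * deadtime. Assumes plain integer values in seconds.
check_ha_timing() {
    awk '
        { sub(/#.*/, "") }                  # ignore comments
        $1 == "keepalive" { k = $2 }
        $1 == "warntime"  { w = $2 }
        $1 == "deadtime"  { d = $2 }
        $1 == "initdead"  { i = $2 }
        END {
            ok = (k + 0 < w + 0) && (w + 0 < d + 0) && (i + 0 >= 2 * d)
            exit ok ? 0 : 1
        }
    ' "$1"
}

# Usage:
#   check_ha_timing /etc/ha.d/ha.cf || echo "review the timing values in ha.cf"
```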

[root@test-master ha.d]# vim haresources

test-master IPaddr::10.96.20.8/24/eth0 # (equivalent to running #/etc/ha.d/resource.d/IPaddr 10.96.20.8/24/eth0 stop|start; IPaddr is a script under /etc/ha.d/resource.d/)

[root@test-master ha.d]# scp authkeys ha.cf haresources root@test-backup:/etc/ha.d/

authkeys 100% 692 0.7KB/s 00:00

ha.cf 100% 10KB 10.3KB/s 00:00

haresources 100% 5944 5.8KB/s 00:00

[root@test-master ha.d]# service heartbeat start

Starting High-Availability services:INFO: Resource is stopped

Done.

[root@test-master ha.d]# ssh test-backup 'service heartbeat start'

Starting High-Availability services:2016/08/07_22:39:00 INFO: Resource isstopped

Done.

[root@test-master ha.d]# ps aux | grep heartbeat

root 63089 0.0 3.1 50124 7164 ? SLs 22:38 0:00 heartbeat: mastercontrol process

root 63093 0.0 3.1 50076 7116 ? SL 22:38 0:00 heartbeat: FIFOreader

root 63094 0.0 3.1 50072 7112 ? SL 22:38 0:00 heartbeat: write:mcast eth0

root 63095 0.0 3.1 50072 7112 ? SL 22:38 0:00 heartbeat: read:mcast eth0

root 63136 0.0 0.3 103264 836 pts/0 S+ 22:39 0:00 grep heartbeat

[root@test-master ha.d]# ssh test-backup 'ps aux | grep heartbeat'

root 3050 0.0 3.1 50124 7164 ? SLs 22:39 0:00 heartbeat: master control process

root 3054 0.0 3.1 50076 7116 ? SL 22:39 0:00 heartbeat: FIFOreader

root 3055 0.0 3.1 50072 7112 ? SL 22:39 0:00 heartbeat: write:mcast eth0

root 3056 0.0 3.1 50072 7112 ? SL 22:39 0:00 heartbeat: read:mcast eth0

root 3094 0.0 0.5 106104 1368 ? Ss 22:39 0:00 bash -c ps aux | grep heartbeat

root 3108 0.0 0.3 103264 832 ? S 22:39 0:00 grep heartbeat

[root@test-master ha.d]# netstat -tnulp |grep heartbeat

udp 0 0 225.0.0.11:694 0.0.0.0:* 63094/heartbeat:wr

udp 0 0 0.0.0.0:50268 0.0.0.0:* 63094/heartbeat:wr

[root@test-master ha.d]# ssh test-backup 'netstat -tnulp | grep heartbeat'

udp 0 0 0.0.0.0:58019 0.0.0.0:* 3055/heartbeat:wri

udp 0 0 225.0.0.11:694 0.0.0.0:* 3055/heartbeat: wri

[root@test-master ha.d]# ip addr | grep 10.96.20

inet 10.96.20.113/24 brd 10.96.20.255scope global eth0

inet 10.96.20.8/24 brd 10.96.20.255 scopeglobal secondary eth0

[root@test-master ha.d]# ssh test-backup 'ip addr | grep 10.96.20'

inet 10.96.20.114/24 brd 10.96.20.255 scope global eth0

[root@test-master ha.d]# service heartbeat stop

Stopping High-Availability services: Done.

[root@test-master ha.d]# ip addr | grep 10.96.20

inet 10.96.20.113/24 brd 10.96.20.255 scope global eth0

[root@test-master ha.d]# ssh test-backup 'ip addr | grep 10.96.20'

inet 10.96.20.114/24 brd 10.96.20.255scope global eth0

inet 10.96.20.8/24 brd 10.96.20.255 scopeglobal secondary eth0

[root@test-master ha.d]# service heartbeat start

Starting High-Availability services:INFO: Resource is stopped

Done.

[root@test-master ha.d]# ip addr | grep 10.96.20

inet 10.96.20.113/24 brd 10.96.20.255 scope global eth0

inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0

[root@test-master ha.d]# ssh test-backup 'ip addr | grep 10.96.20'

inet 10.96.20.114/24 brd 10.96.20.255 scope global eth0

[root@test-master ~]# service heartbeat stop

Stopping High-Availability services: Done.

[root@test-master ~]# ssh test-backup 'service heartbeat stop'

Stopping High-Availability services: Done.
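The failover test above boils down to asking which node currently holds the VIP. A small filter over ip addr output makes that check scriptable. This is a sketch; the function name holds_vip is hypothetical.

```shell
# holds_vip VIP
# Reads `ip addr` output on stdin and succeeds if VIP is configured as an
# IPv4 address on any interface. The trailing "/" keeps 10.96.20.8 from
# matching 10.96.20.80 and similar.
holds_vip() {
    grep -qF "inet $1/"
}

# Usage:
#   ip addr | holds_vip 10.96.20.8 && echo "this node is active"
#   ssh test-backup 'ip addr' | holds_vip 10.96.20.8 && echo "backup is active"
```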

2. Install and configure drbd

test-master:

[root@test-master ~]# fdisk -l

……

Disk /dev/sdb: 2147 MB, 2147483648 bytes

255 heads, 63 sectors/track, 261 cylinders

Units = cylinders of 16065 * 512 = 8225280bytes

Sector size (logical/physical): 512 bytes /512 bytes

I/O size (minimum/optimal): 512 bytes / 512bytes

Disk identifier: 0x00000000

[root@test-master ~]# parted /dev/sdb # (parted supports disks larger than 2T; split the new disk into two partitions, one for data and one for the drbd meta data)

GNU Parted 2.1

Using /dev/sdb

Welcome to GNU Parted! Type 'help' to viewa list of commands.

(parted) h

align-check TYPE N check partition N for TYPE(min|opt) alignment

check NUMBER do a simple check on the file system

cp[FROM-DEVICE] FROM-NUMBER TO-NUMBER copy file system to another partition

help [COMMAND] print general help, or helpon COMMAND

mklabel,mktable LABEL-TYPE create a new disklabel (partitiontable)

mkfs NUMBER FS-TYPE make a FS-TYPE file system on partition NUMBER

mkpart PART-TYPE [FS-TYPE] START END make a partition

mkpartfs PART-TYPE FS-TYPE START END make a partition with a file system

move NUMBER START END move partition NUMBER

name NUMBER NAME name partition NUMBER as NAME

print [devices|free|list,all|NUMBER] display the partition table, availabledevices, free space, all found partitions, or a

particular partition

quit exit program

rescue START END rescue a lost partition near START and END

resize NUMBER START END resize partition NUMBER and its file system

rmNUMBER delete partition NUMBER

select DEVICE choose the device to edit

setNUMBER FLAG STATE change the FLAG on partition NUMBER

toggle [NUMBER [FLAG]] toggle the state of FLAG on partition NUMBER

unit UNIT set the default unit to UNIT

version display the version number and copyright information of GNU Parted

(parted) mklabel gpt

(parted) mkpart primary 0 1024

Warning: The resulting partition is not properlyaligned for best performance.

Ignore/Cancel?Ignore

(parted) mkpart primary 1025 2147

Warning: The resulting partition is notproperly aligned for best performance.

Ignore/Cancel? Ignore

(parted) p

Model: VMware, VMware Virtual S (scsi)

Disk /dev/sdb: 2147MB

Sector size (logical/physical): 512B/512B

Partition Table: gpt

Number Start End Size File system Name Flags

1 17.4kB 1024MB 1024MB primary

2 1025MB 2147MB 1122MB primary

[root@test-master ~]# wget http://www.elrepo.org/elrepo-release-6-6.el6.elrepo.noarch.rpm

[root@test-master ~]# rpm -ivh elrepo-release-6-6.el6.elrepo.noarch.rpm

warning:elrepo-release-6-6.el6.elrepo.noarch.rpm: Header V4 DSA/SHA1 Signature, key IDbaadae52: NOKEY

Preparing... ########################################### [100%]

1:elrepo-release ########################################### [100%]

[root@test-master ~]# yum -y install drbd kmod-drbd84

[root@test-master ~]# modprobe drbd

FATAL: Module drbd not found.

[root@test-master ~]# yum -y install kernel* # (reboot the system after updating the kernel)

[root@test-master ~]# uname -r

2.6.32-642.3.1.el6.x86_64

[root@test-master ~]# depmod

[root@test-master ~]# lsmod | grep drbd

drbd 372759 0

libcrc32c 1246 1 drbd

[root@test-master ~]# ll /usr/src/kernels/

total 12

drwxr-xr-x. 22 root root 4096 Mar 31 06:462.6.32-431.el6.x86_64

drwxr-xr-x. 22 root root 4096 Aug 8 03:40 2.6.32-642.3.1.el6.x86_64

drwxr-xr-x. 22 root root 4096 Aug 8 03:40 2.6.32-642.3.1.el6.x86_64.debug

[root@test-master ~]# echo "modprobe drbd > /dev/null 2>&1" > /etc/sysconfig/modules/drbd.modules

[root@test-master ~]# cat !$

cat /etc/sysconfig/modules/drbd.modules

modprobe drbd > /dev/null 2>&1
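One detail worth checking: on RHEL/CentOS 6, as far as I know, rc.sysinit only runs the files in /etc/sysconfig/modules/ that are executable, so the file created above may also need chmod 755 before it takes effect at boot. The sketch below writes the file and sets the mode in one step. The function name and the added #!/bin/sh line are my own additions, not part of the original steps.

```shell
# install_modules_file DIR
# Writes DIR/drbd.modules with the modprobe line used above and marks it
# executable. DIR would be /etc/sysconfig/modules on the real system.
# The shebang line is an addition made here, not part of the original file.
install_modules_file() {
    target=$1/drbd.modules
    printf '%s\n' '#!/bin/sh' 'modprobe drbd > /dev/null 2>&1' > "$target" &&
    chmod 755 "$target"
}

# Usage (as root):
#   install_modules_file /etc/sysconfig/modules
```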

test-backup:

[root@test-backup ~]# parted /dev/sdb

(parted) mklabel gpt

(parted) mkpart primary 0 4096

Warning: The resulting partition is notproperly aligned for best performance.

Ignore/Cancel? Ignore

(parted) mkpart primary 4097 5368

(parted) p

Model: VMware, VMware Virtual S (scsi)

Disk /dev/sdb: 5369MB

Sector size (logical/physical): 512B/512B

Partition Table: gpt

Number Start End Size File system Name Flags

1 17.4kB 4096MB 4096MB primary

2 4097MB 5368MB 1271MB primary

[root@test-backup ~]# wget http://www.elrepo.org/elrepo-release-6-6.el6.elrepo.noarch.rpm

[root@test-backup ~]# rpm -ivh elrepo-release-6-6.el6.elrepo.noarch.rpm

[root@test-backup ~]# ll /etc/yum.repos.d/

total 20

-rw-r--r--. 1 root root 1856 Jul 19 00:28CentOS6-Base-163.repo

-rw-r--r--. 1 root root 2150 Feb 9 2014elrepo.repo

-rw-r--r--. 1 root root 957 Nov 4 2012 epel.repo

-rw-r--r--. 1 root root 1056 Nov 4 2012epel-testing.repo

-rw-r--r--. 1 root root 529 Mar 30 23:00 rhel-source.repo.bak

[root@test-backup ~]# yum -y install drbd kmod-drbd84

[root@test-backup ~]# yum -y install kernel*

[root@test-backup ~]# depmod

[root@test-backup ~]# lsmod | grep drbd

drbd 372759 0

libcrc32c 1246 1 drbd

[root@test-backup ~]# chkconfig drbd off

[root@test-backup ~]# chkconfig --list drbd

drbd 0:off 1:off 2:off 3:off 4:off 5:off 6:off

[root@test-backup ~]# echo "modprobe drbd > /dev/null 2>&1" > /etc/sysconfig/modules/drbd.modules

[root@test-backup ~]# cat !$

cat /etc/sysconfig/modules/drbd.modules

modprobe drbd > /dev/null 2>&1

test-master:

[root@test-master ~]# vim /etc/drbd.d/global_common.conf

[root@test-master ~]# egrep -v "#|^$" /etc/drbd.d/global_common.conf

global {

usage-count no;

}

common {

handlers{

}

startup{

}

options{

}

disk{

on-io-error detach;

}

net{

}

syncer{

rate 50M;

verify-alg crc32c;

}

}

[root@test-master ~]# vim /etc/drbd.d/data.res

resource data {

protocol C;

on test-master {

device /dev/drbd0;

disk /dev/sdb1;

address 172.16.1.113:7788;

meta-disk /dev/sdb2[0];

}

on test-backup {

device /dev/drbd0;

disk /dev/sdb1;

address 172.16.1.114:7788;

meta-disk /dev/sdb2[0];

}

}

[root@test-master ~]# cd /etc/drbd.d

[root@test-master drbd.d]# scp global_common.conf data.res root@test-backup:/etc/drbd.d/

global_common.conf 100% 2144 2.1KB/s 00:00

data.res 100% 251 0.3KB/s 00:00

[root@test-master drbd.d]# drbdadm --help

USAGE: drbdadm COMMAND [OPTION...]{all|RESOURCE...}

GENERAL OPTIONS:

--stacked, -S

--dry-run, -d

--verbose, -v

--config-file=...,-c ...

--config-to-test=..., -t ...

--drbdsetup=..., -s ...

--drbdmeta=..., -m ...

--drbd-proxy-ctl=..., -p ...

--sh-varname=..., -n ...

--peer=..., -P ...

--version, -V

--setup-option=..., -W ...

--help, -h

COMMANDS:

attach disk-options

detach connect

net-options disconnect

up resource-options

down primary

secondary invalidate

invalidate-remote outdate

resize verify

pause-sync resume-sync

adjust adjust-with-progress

wait-connect wait-con-int

role cstate

dstate dump

dump-xml create-md

show-gi get-gi

dump-md wipe-md

apply-al hidden-commands

[root@test-master drbd.d]# drbdadm create-md data

initializing activity log

NOT initializing bitmap

Writing meta data...

New drbd meta data block successfullycreated.

[root@test-master drbd.d]# ssh test-backup 'drbdadm create-md data'

NOT initializing bitmap

initializing activity log

Writing meta data...

New drbd meta data block successfullycreated.

[root@test-master drbd.d]# drbdadm up data

[root@test-master drbd.d]# ssh test-backup 'drbdadm up data'

[root@test-master drbd.d]# cat /proc/drbd

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49build by mockbuild@Build64R6, 2016-01-12 13:27:11

0:cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----

ns:0 nr:0 dw:0 dr:0 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:999984

[root@test-master drbd.d]# ssh test-backup 'cat /proc/drbd'

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash:3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6,2016-01-12 13:27:11

0:cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----

ns:0 nr:0 dw:0 dr:0 al:16 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:999984

[root@test-master drbd.d]# drbdadm -- --overwrite-data-of-peer primary data # (run on the primary only)

[root@test-master drbd.d]# cat /proc/drbd

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash:3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6,2016-01-12 13:27:11

0:cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----

ns:339968 nr:0 dw:0 dr:340647 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:foos:660016

[=====>..............]sync'ed: 34.3% (660016/999984)K

finish:0:00:15 speed: 42,496 (42,496) K/sec

[root@test-master drbd.d]# cat /proc/drbd

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash:3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6,2016-01-12 13:27:11

0:cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----

ns:630784 nr:0 dw:0 dr:631463 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:foos:369200

[===========>........]sync'ed: 63.3% (369200/999984)K

finish:0:00:09 speed: 39,424 (39,424) K/sec

[root@test-master drbd.d]# cat /proc/drbd

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash:3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6,2016-01-12 13:27:11

0:cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----

ns:942080 nr:0 dw:0 dr:942759 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:foos:57904

[=================>..]sync'ed: 94.3% (57904/999984)K

finish:0:00:01 speed: 39,196 (39,252) K/sec

[root@test-master drbd.d]# cat /proc/drbd

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash:3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6,2016-01-12 13:27:11

0:cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----

ns:999983 nr:0 dw:0 dr:1000662 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

[root@test-master drbd.d]# ssh test-backup 'cat /proc/drbd'

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash:3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-1213:27:11

0:cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----

ns:0 nr:999983 dw:999983 dr:0 al:16 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
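Instead of rerunning cat /proc/drbd by hand until the sync finishes, the connection state (cs), roles (ro) and disk states (ds) can be pulled out of the status line. The helper below is a sketch with a hypothetical name; it assumes a single resource on minor 0, as in this setup, and accepts the status line with or without a space after "0:".

```shell
# drbd_field KEY
# Reads /proc/drbd text on stdin and prints the value of KEY (cs, ro or ds)
# from the status line of minor 0, e.g. "Connected" or "Primary/Secondary".
drbd_field() {
    awk -v key="$1" '
        $1 ~ /^0:/ {
            sub(/^0:/, "", $1)               # "0:cs:Connected" -> "cs:Connected"
            for (i = 1; i <= NF; i++) {
                n = index($i, ":")
                if (n && substr($i, 1, n - 1) == key) print substr($i, n + 1)
            }
        }
    '
}

# Usage: block until the initial sync has finished.
#   until [ "$(drbd_field ds < /proc/drbd)" = "UpToDate/UpToDate" ]; do
#       sleep 5
#   done
```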

[root@test-master drbd.d]# mkdir /drbd

[root@test-master drbd.d]# ssh test-backup 'mkdir /drbd'

[root@test-master drbd.d]# mkfs.ext4 -b 4096 /dev/drbd0 # (run on the primary only; do not format the meta partition)

Writing superblocks and filesystemaccounting information: done

[root@test-master drbd.d]# tune2fs -c -1 /dev/drbd0

tune2fs 1.41.12 (17-May-2010)

Setting maximal mount count to -1

[root@test-master drbd.d]# mount /dev/drbd0 /drbd

[root@test-master drbd.d]# cd /drbd

[root@test-master drbd]# for i in `seq 1 10`; do touch test$i; done

[root@test-master drbd]# ls

lost+found test1 test10 test2 test3 test4 test5 test6 test7 test8 test9

[root@test-master drbd]# cd

[root@test-master ~]# umount /dev/drbd0

[root@test-master ~]# drbdadm secondary data

[root@test-master ~]# cat /proc/drbd

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49build by mockbuild@Build64R6, 2016-01-12 13:27:11

0:cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----

ns:1032538 nr:0 dw:32554 dr:1001751 al:19 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1wo:f oos:0

test-backup:

[root@test-backup ~]# cat /proc/drbd

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash:3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6,2016-01-12 13:27:11

0:cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----

ns:0 nr:1032538 dw:1032538 dr:0 al:16 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:foos:0

[root@test-backup ~]# drbdadm primary data

[root@test-backup ~]# cat /proc/drbd

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash:3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6,2016-01-12 13:27:11

0:cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----

ns:0 nr:1032538 dw:1032538 dr:679 al:16 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1wo:f oos:0

[root@test-backup ~]# mount /dev/drbd0 /drbd

[root@test-backup ~]# ls /drbd

lost+found test1 test10 test2 test3 test4 test5 test6 test7 test8 test9

3. Test heartbeat + drbd

[root@test-master ~]# ssh test-backup 'umount /drbd'

[root@test-master ~]# ssh test-backup 'drbdadm secondary data'

[root@test-master ~]# service drbd stop

Stopping all DRBD resources: .

[root@test-master ~]# ssh test-backup 'service drbd stop'

Stopping all DRBD resources: .

[root@test-master ~]# service heartbeat status

heartbeat is stopped. No process

[root@test-master ~]# ssh test-backup 'service heartbeat status'

heartbeat is stopped. No process

[root@test-master ~]# ll /etc/ha.d/resource.d/{Filesystem,drbddisk}

-rwxr-xr-x. 1 root root 3162 Jan 12 2016 /etc/ha.d/resource.d/drbddisk

-rwxr-xr-x. 1 root root 1903 Dec 2 2013/etc/ha.d/resource.d/Filesystem

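[root@test-master ~]# vim /etc/ha.d/haresources # (each entry on this line is equivalent to running a script with arguments, for example #/etc/ha.d/resource.d/IPaddr 10.96.20.8/24/eth0 start|stop, #/etc/ha.d/resource.d/drbddisk data start|stop, #/etc/ha.d/resource.d/Filesystem /dev/drbd0 /drbd ext4 start|stop; heartbeat controls the resources in the order they are listed, so when heartbeat misbehaves, check the logs and run these commands one at a time to troubleshoot)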

test-master IPaddr::10.96.20.8/24/eth0 drbddisk::data Filesystem::/dev/drbd0::/drbd::ext4

[root@test-master ~]# scp /etc/ha.d/haresources root@test-backup:/etc/ha.d/

haresources 100% 5996 5.9KB/s 00:00
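To troubleshoot by running the resource scripts by hand, it helps to see a haresources line expanded into the individual commands. The sketch below does only the text transformation and runs nothing. The function name is hypothetical, and it simplifies in two ways: it assumes every script lives in /etc/ha.d/resource.d, and it assumes that stopping happens in reverse order, which matches my understanding of heartbeat but should be verified against the logs.

```shell
# expand_haresources ACTION
# Reads one haresources line on stdin ("node res1::arg::arg res2 ...") and
# prints the equivalent resource-script commands for ACTION (start or stop).
# Simplification: all scripts are assumed to be in /etc/ha.d/resource.d,
# and "stop" lists the resources in reverse order.
expand_haresources() {
    awk -v action="$1" '
        {
            n = 0
            for (i = 2; i <= NF; i++) {          # field 1 is the node name
                cmd = $i
                gsub(/::/, " ", cmd)             # script::a::b -> script a b
                lines[++n] = "/etc/ha.d/resource.d/" cmd " " action
            }
            if (action == "stop")
                for (i = n; i >= 1; i--) print lines[i]
            else
                for (i = 1; i <= n; i++) print lines[i]
        }
    '
}

# Usage:
#   grep '^test-master' /etc/ha.d/haresources | expand_haresources start
```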

[root@test-master ~]# service drbd start # (run on the primary node)

Starting DRBD resources: [

create res: data

prepare disk: data

adjust disk: data

adjust net: data

]

..........

***************************************************************

DRBD's startup script waits for the peernode(s) to appear.

- Ifthis node was already a degraded cluster before the

reboot,the timeout is 0 seconds. [degr-wfc-timeout]

- Ifthe peer was available before the reboot, the timeout

is0 seconds. [wfc-timeout]

(These values are for resource 'data'; 0 sec -> wait forever)

Toabort waiting enter 'yes' [ 23]:

[root@test-backup ~]# service drbd start # (run on the backup node)

Starting DRBD resources: [

create res: data

prepare disk: data

adjust disk: data

adjust net: data

]

[root@test-master ~]# drbdadm role data

Secondary/Secondary

[root@test-master ~]# ssh test-backup 'drbdadm role data'

Secondary/Secondary

[root@test-master ~]# drbdadm -- --overwrite-data-of-peer primary data

[root@test-master ~]# drbdadm role data

Primary/Secondary

[root@test-master ~]# service heartbeat start

Starting High-Availability services:INFO: Resource is stopped

Done.

[root@test-master ~]# ssh test-backup 'service heartbeat start'

Starting High-Availability services:2016/08/09_03:08:11 INFO: Resource isstopped

Done.

[root@test-master ~]# ip addr | grep 10.96.20

inet 10.96.20.113/24 brd 10.96.20.255 scope global eth0

inet 10.96.20.8/24 brd 10.96.20.255 scopeglobal secondary eth0

[root@test-master ~]# drbdadm role data

Primary/Secondary

[root@test-master ~]# df -h

Filesystem Size Used Avail Use% Mounted on

/dev/sda2 18G 6.3G 11G 38% /

tmpfs 112M 0 112M 0% /dev/shm

/dev/sda1 283M 83M 185M 31% /boot

/dev/sr0 3.6G 3.6G 0 100% /mnt/cdrom

/dev/drbd0 946M 1.3M 896M 1% /drbd

[root@test-master ~]# ls /drbd

lost+found test1 test10 test2 test3 test4 test5 test6 test7 test8 test9

[root@test-master ~]# service heartbeat stop

Stopping High-Availability services: Done.

[root@test-master ~]# ssh test-backup 'ip addr | grep 10.96.20'

inet 10.96.20.114/24 brd 10.96.20.255 scope global eth0

inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0

[root@test-master ~]# ssh test-backup 'df -h'

Filesystem Size Used Avail Use% Mounted on

/dev/sda2 18G 3.9G 13G 24% /

tmpfs 112M 0 112M 0% /dev/shm

/dev/sda1 283M 83M 185M 31% /boot

/dev/sr0 3.6G 3.6G 0 100% /mnt/cdrom

/dev/drbd0 946M 1.3M 896M 1% /drbd

[root@test-master ~]# ssh test-backup 'ls /drbd'

lost+found

test1

test10

test2

test3

test4

test5

test6

test7

test8

test9

[root@test-master ~]# drbdadm role data

Secondary/Primary

[root@test-master ~]# service heartbeat start # (after the primary node recovers, first make sure drbd is back in a healthy state, then start the heartbeat service)

Starting High-Availability services:INFO: Resource is stopped

Done.

[root@test-master ~]# drbdadm role data

Primary/Secondary

[root@test-master ~]# ip addr | grep 10.96.20

inet 10.96.20.113/24 brd 10.96.20.255 scope global eth0

inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0

[root@test-master ~]# df -h

Filesystem Size Used Avail Use% Mounted on

/dev/sda2 18G 6.3G 11G 38% /

tmpfs 112M 0 112M 0% /dev/shm

/dev/sda1 283M 83M 185M 31% /boot

/dev/sr0 3.6G 3.6G 0 100% /mnt/cdrom

/dev/drbd0 946M 1.3M 896M 1% /drbd

[root@test-master ~]# ls /drbd

lost+found test1 test10 test2 test3 test4 test5 test6 test7 test8 test9
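The advice to get drbd healthy before restarting heartbeat on a recovered node can be turned into a guard. The sketch below treats "healthy" as Connected with both sides UpToDate, which is my interpretation of that advice; the function name is hypothetical and a single drbd resource is assumed.

```shell
# drbd_ready
# Reads /proc/drbd text on stdin; succeeds only if the resource is
# Connected and both sides report UpToDate. Assumes a single resource.
drbd_ready() {
    status=$(cat)
    case $status in
        *cs:Connected*ds:UpToDate/UpToDate*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on the recovered node:
#   drbd_ready < /proc/drbd && service heartbeat start
```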

4. Install and configure MySQL on the two masters and the slave

MySQL-master-active:

[root@test-master ~]# drbdadm role data

Primary/Secondary

[root@test-master ~]# groupadd -g 3306 mysql

[root@test-master ~]# useradd -u 3306 -g 3306 -s /sbin/nologin -M mysql

[root@test-master ~]# id mysql

uid=3306(mysql) gid=3306(mysql)groups=3306(mysql)

[root@test-master ~]# mkdir /drbd/data # (the two masters keep the DB data directory under the drbd mount point; drbd replicates only the MySQL data, while the program files live under /usr/local/)

[root@test-master ~]# chown -R mysql.mysql /drbd/data

[root@test-master ~]# rz # (upload the mysql binary tarball)

[root@test-master ~]# tar xf mysql-5.5.45-linux2.6-x86_64.tar.gz -C /usr/local

[root@test-master ~]# cd /usr/local

[root@test-master local]# ln -sv mysql-5.5.45-linux2.6-x86_64/ mysql

`mysql' ->`mysql-5.5.45-linux2.6-x86_64/'

[root@test-master local]# cd mysql

[root@test-master mysql]# chown -R root.mysql ./

[root@test-master mysql]# scripts/mysql_install_db --user=mysql --datadir=/drbd/data # (initialize only on the master that is currently serving clients, i.e. the drbd primary)

Installing MySQL system tables...

160810 19:46:23 [Note] ./bin/mysqld (mysqld5.5.45) starting as process 3908 ...

…….

[root@test-master mysql]# cp support-files/my-large.cnf /etc/my.cnf

[root@test-master mysql]# vim /etc/my.cnf # (add the following settings)

[mysqld]

datadir = /drbd/data

log-bin=mysql-bin

log-bin-index=mysql-bin.index

server-id=1

sync_binlog=1

innodb_file_per_table = 1

binlog_format=mixed

[root@test-master mysql]# egrep -v "#|^$" /etc/my.cnf

[client]

port =3306

socket =/tmp/mysql.sock

[mysqld]

port =3306

socket =/tmp/mysql.sock

skip-external-locking

key_buffer_size = 256M

max_allowed_packet = 1M

table_open_cache = 256

sort_buffer_size = 1M

read_buffer_size = 1M

read_rnd_buffer_size = 4M

myisam_sort_buffer_size = 64M

thread_cache_size = 8

query_cache_size= 16M

thread_concurrency = 8

datadir = /drbd/data

log-bin=mysql-bin

log-bin-index=mysql-bin.index

server-id=1

sync_binlog=1

innodb_file_per_table = 1

binlog_format=mixed

[mysqldump]

quick

max_allowed_packet = 16M

[mysql]

no-auto-rehash

[myisamchk]

key_buffer_size = 128M

sort_buffer_size = 128M

read_buffer = 2M

write_buffer = 2M

[mysqlhotcopy]

interactive-timeout

[root@test-master mysql]# scp /etc/my.cnf root@test-backup:/etc/

my.cnf 100% 4787 4.7KB/s 00:00

[root@test-master mysql]# cp support-files/mysql.server /etc/init.d/mysqld

[root@test-master mysql]# chkconfig --add mysqld

[root@test-master mysql]# chkconfig mysqld off

[root@test-master mysql]# chkconfig --list mysqld

mysqld 0:off 1:off 2:off 3:off 4:off 5:off 6:off

[root@test-master mysql]# service mysqld start

Starting MySQL..... [ OK ]

[root@test-master mysql]# /usr/local/mysql/bin/mysql

……

mysql> GRANT ALL ON *.* TO 'root'@'%' IDENTIFIED BY 'redhat';

Query OK, 0 rows affected (0.28 sec)

mysql> GRANT REPLICATION SLAVE ON *.* TO 'repluser'@'%' IDENTIFIED BY 'repluser';

Query OK, 0 rows affected (0.17 sec)

mysql> FLUSH PRIVILEGES;

Query OK, 0 rows affected (0.04 sec)

mysql> select User,Password,Host from mysql.user;

mysql> select User,Host,Password from mysql.user;

+----------+-------------+-------------------------------------------+

| User | Host | Password |

+----------+-------------+-------------------------------------------+

| root | localhost | |

| root | test-master | |

| root | 127.0.0.1 | |

| root | ::1 | |

| | localhost | |

| | test-master | |

| root | % |*84BB5DF4823DA319BBF86C99624479A198E6EEE9 |

| repluser | % |*89A63F9688240669B54B5C2649EEFB795850597E |

+----------+-------------+-------------------------------------------+

8 rows in set (0.23 sec)

mysql> create database webgame;

Query OK, 1 row affected (0.10 sec)

mysql> show databases;

+--------------------+

| Database |

+--------------------+

| information_schema |

| mysql |

| performance_schema |

| test |

| webgame |

+--------------------+

5 rows in set (0.04 sec)

mysql> \q

Bye

[root@test-master mysql]# ip addr | grep 10.96.20

inet 10.96.20.113/24 brd 10.96.20.255 scope global eth0

inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0

[root@test-master mysql]# df -h | grep drbd0

/dev/drbd0 946M 31M 866M 4% /drbd

[root@test-master ~]# vim /etc/ha.d/haresources

test-master IPaddr::10.96.20.8/24/eth0 drbddisk::data Filesystem::/dev/drbd0::/drbd::ext4 mysqld

[root@test-master ~]# scp /etc/ha.d/haresources root@test-backup:/etc/ha.d/

MySQL-master-inactive:

[root@test-backup ~]# drbdadm role data

Secondary/Primary

[root@test-backup ~]# groupadd -g 3306 mysql

[root@test-backup ~]# useradd -u 3306 -g 3306 -s /sbin/nologin -M mysql

[root@test-backup ~]# id mysql

uid=3306(mysql) gid=3306(mysql)groups=3306(mysql)

[root@test-backup ~]# rz

[root@test-backup ~]# tar xf mysql-5.5.45-linux2.6-x86_64.tar.gz -C /usr/local

[root@test-backup ~]# cd /usr/local

[root@test-backup local]# ln -sv mysql-5.5.45-linux2.6-x86_64/ mysql

`mysql' ->`mysql-5.5.45-linux2.6-x86_64/'

[root@test-backup local]# cd mysql

[root@test-backup mysql]# chown -R root.mysql ./

[root@test-backup mysql]# vim /etc/my.cnf # (this file was copied over from the active master; confirm that it contains the following settings)

[mysqld]

datadir = /drbd/data

log-bin=mysql-bin

log-bin-index=mysql-bin.index

server-id=1

sync_binlog=1

innodb_file_per_table = 1

binlog_format=mixed

[root@test-backup mysql]# cp support-files/mysql.server /etc/init.d/mysqld

[root@test-backup mysql]# chkconfig --add mysqld

[root@test-backup mysql]# chkconfig mysqld off

[root@test-backup mysql]# chkconfig --list mysqld

mysqld 0:off 1:off 2:off 3:off 4:off 5:off 6:off

mysql-slave:

[root@localhost ~]# mkdir /mydata/data -pv

mkdir: created directory `/mydata'

mkdir: created directory `/mydata/data'

[root@localhost ~]# groupadd -g 3306 mysql

[root@localhost ~]# useradd -u 3306 -g 3306 -s /sbin/nologin -M mysql

[root@localhost ~]# id mysql

uid=3306(mysql) gid=3306(mysql) groups=3306(mysql)

[root@localhost ~]# rz

[root@localhost ~]# tar xf mysql-5.5.45-linux2.6-x86_64.tar.gz -C /usr/local

[root@localhost ~]# cd /usr/local

[root@localhost local]# ln -sv mysql-5.5.45-linux2.6-x86_64/ mysql

`mysql' -> `mysql-5.5.45-linux2.6-x86_64/'

[root@localhost local]# cd mysql

[root@localhost mysql]# chown -R root.mysql ./

[root@localhost mysql]# chown -R mysql.mysql /mydata/data

[root@localhost mysql]# cp support-files/my-large.cnf /etc/my.cnf

cp: overwrite `/etc/my.cnf'? y

[root@localhost mysql]# cp support-files/mysql.server /etc/init.d/mysqld

[root@localhost mysql]# chkconfig --add mysqld

[root@localhost mysql]# chkconfig --list mysqld

mysqld 0:off 1:off 2:on 3:on 4:on 5:on 6:off

[root@localhost mysql]# vim /etc/my.cnf

[mysqld]

datadir=/mydata/data

innodb_file_per_table=1

relay-log=relay-log

relay-log-index=relay-log.index

server-id=11

read_only=1

skip_slave_start=1

[root@localhost mysql]# egrep -v "#|^$" /etc/my.cnf

[client]

port =3306

socket =/tmp/mysql.sock

[mysqld]

port =3306

socket =/tmp/mysql.sock

skip-external-locking

key_buffer_size = 256M

max_allowed_packet = 1M

table_open_cache = 256

sort_buffer_size = 1M

read_buffer_size = 1M

read_rnd_buffer_size = 4M

myisam_sort_buffer_size = 64M

thread_cache_size = 8

query_cache_size= 16M

thread_concurrency = 8

datadir=/mydata/data

innodb_file_per_table=1

relay-log=relay-log

relay-log-index=relay-log.index

server-id=11

read_only=1

skip_slave_start=1

[mysqldump]

quick

max_allowed_packet = 16M

[mysql]

no-auto-rehash

[myisamchk]

key_buffer_size = 128M

sort_buffer_size = 128M

read_buffer = 2M

write_buffer = 2M

[mysqlhotcopy]

interactive-timeout

[root@localhost mysql]# scripts/mysql_install_db --user=mysql --datadir=/mydata/data

Installing MySQL system tables...

160810 22:18:18 [Warning] 'THREAD_CONCURRENCY' is deprecated and will be removed in a future release.

160810 22:18:18 [Note] ./bin/mysqld (mysqld 5.5.45) starting as process 46873 ...

Filling help tables...

160810 22:18:19 [Warning] 'THREAD_CONCURRENCY' is deprecated and will be removed in a future release.

160810 22:18:19 [Note] ./bin/mysqld (mysqld 5.5.45) starting as process 46880 ...

……

[root@localhost mysql]# service mysqld start

Starting MySQL.. [ OK ]

[root@localhost ~]# mysql

mysql> CHANGE MASTER TO MASTER_USER='repluser',MASTER_PASSWORD='repluser',MASTER_HOST='10.96.20.8',MASTER_LOG_FILE='mysql-bin.000003',MASTER_LOG_POS=330;

Query OK, 0 rows affected (0.04 sec)

mysql> start slave;

Query OK, 0 rows affected (0.00 sec)

mysql> show slave status\G

……

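Testing is done in two steps:

First, test that the two master nodes work correctly: get drbd configured and its service started, leave heartbeat stopped, and start mysqld manually. Create a new database on master-active, then stop mysqld there and demote the active node's drbd to secondary. Promote the inactive node's drbd to primary, start mysqld on it, and check on master-inactive that the new database is present;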
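The manual switchover in this first step can be sketched as two small functions. The resource name `data`, the device `/dev/drbd0` and the mount point `/drbd` come from this article; the unmount and mount steps are not spelled out in the text, but DRBD will not demote a device that is still mounted, so they are included here. `DRY_RUN`, `run`, `demote_node` and `promote_node` are illustrative names, and the script only prints the commands unless `DRY_RUN=0` is set.

```bash
#!/bin/bash
# Sketch of the manual DRBD switchover test, with heartbeat stopped.
# DRY_RUN defaults to 1, so the functions only print what they would run;
# set DRY_RUN=0 and run as root to execute the commands for real.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# On the currently active master: stop MySQL, release the mount, give up the primary role.
demote_node() {
    run service mysqld stop &&
    run umount /drbd &&
    run drbdadm secondary data
}

# On the inactive master, afterwards: take the primary role, mount, start MySQL.
promote_node() {
    run drbdadm primary data &&
    run mount /dev/drbd0 /drbd &&
    run service mysqld start
}
```

Run `demote_node` on the active node first and `promote_node` on the other node second; promoting before the peer has been demoted is refused by DRBD in the single-primary setup used here.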

Then test whether master-slave replication continues after the masters switch; as shown below, it does

[root@test-backup ~]# tail -f /var/log/ha-log # (simulate a failure of the active node and watch the takeover on the inactive node)

Aug 10 22:40:38 test-backup heartbeat:[7738]: info: Local status now set to: 'up'

Aug 10 22:40:39 test-backup heartbeat:[7738]: info: Link test-master:eth0 up.

Aug 10 22:40:39 test-backup heartbeat:[7738]: info: Status update for node test-master: status active

Aug 10 22:40:39 test-backup heartbeat:[7738]: info: Comm_now_up(): updating status to active

Aug 10 22:40:39 test-backup heartbeat:[7738]: info: Local status now set to: 'active'

harc(default)[7747]: 2016/08/10_22:40:39 info: Running /etc/ha.d//rc.d/status status

Aug 10 22:40:50 test-backup heartbeat:[7738]: info: local resource transition completed.

Aug 10 22:40:50 test-backup heartbeat:[7738]: info: Initial resource acquisition complete (T_RESOURCES(us))

Aug 10 22:40:50 test-backup heartbeat:[7766]: info: No local resources [/usr/share/heartbeat/ResourceManager listkeys test-backup] to acquire.

Aug 10 22:40:50 test-backup heartbeat:[7738]: info: remote resource transition completed.

Aug 10 23:10:16 test-backup heartbeat:[7738]: info: Received shutdown notice from 'test-master'.

Aug 10 23:10:16 test-backup heartbeat:[7738]: info: Resources being acquired from test-master.

Aug 10 23:10:16 test-backup heartbeat:[7879]: info: acquire local HA resources (standby).

Aug 10 23:10:16 test-backup heartbeat:[7879]: info: local HA resource acquisition completed (standby).

Aug 10 23:10:16 test-backup heartbeat:[7738]: info: Standby resource acquisition done [all].

Aug 10 23:10:16 test-backup heartbeat:[7880]: info: No local resources [/usr/share/heartbeat/ResourceManager listkeys test-backup] to acquire.

harc(default)[7905]: 2016/08/10_23:10:16 info: Running /etc/ha.d//rc.d/status status

mach_down(default)[7922]: 2016/08/10_23:10:16 info: Taking over resource group IPaddr::10.96.20.8/24/eth0

ResourceManager(default)[7949]: 2016/08/10_23:10:16 info: Acquiring resource group: test-master IPaddr::10.96.20.8/24/eth0 drbddisk::data Filesystem::/dev/drbd0::/drbd::ext4 mysqld

/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_10.96.20.8)[7977]: 2016/08/10_23:10:16 INFO: Resource is stopped

ResourceManager(default)[7949]: 2016/08/10_23:10:16 info: Running /etc/ha.d/resource.d/IPaddr 10.96.20.8/24/eth0 start

IPaddr(IPaddr_10.96.20.8)[8102]: 2016/08/10_23:10:16 INFO: Adding inet address 10.96.20.8/24 with broadcast address 10.96.20.255 to device eth0

IPaddr(IPaddr_10.96.20.8)[8102]: 2016/08/10_23:10:16 INFO: Bringing device eth0 up

IPaddr(IPaddr_10.96.20.8)[8102]: 2016/08/10_23:10:16 INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-10.96.20.8 eth0 10.96.20.8 auto not_used not_used

/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_10.96.20.8)[8076]: 2016/08/10_23:10:16 INFO: Success

ResourceManager(default)[7949]: 2016/08/10_23:10:16 info: Running /etc/ha.d/resource.d/drbddisk data start

/usr/lib/ocf/resource.d//heartbeat/Filesystem(Filesystem_/dev/drbd0)[8231]: 2016/08/10_23:10:17 INFO: Resource is stopped

ResourceManager(default)[7949]: 2016/08/10_23:10:17 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 /drbd ext4 start

Filesystem(Filesystem_/dev/drbd0)[8314]: 2016/08/10_23:10:17 INFO: Running start for /dev/drbd0 on /drbd

/usr/lib/ocf/resource.d//heartbeat/Filesystem(Filesystem_/dev/drbd0)[8306]: 2016/08/10_23:10:17 INFO: Success

ResourceManager(default)[7949]: 2016/08/10_23:10:18 info: Running /etc/init.d/mysqld start

mach_down(default)[7922]: 2016/08/10_23:10:31 info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired

Aug 10 23:10:32 test-backup heartbeat:[7738]: info: mach_down takeover complete.

mach_down(default)[7922]: 2016/08/10_23:10:33 info: mach_down takeover complete for node test-master.
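In the log above, the shutdown notice arrives at 23:10:16 and heartbeat reports the takeover complete at 23:10:32, about 16 seconds during which the VIP and MySQL are unavailable. A rough sketch for pulling that figure out of an ha-log file; the function names are illustrative, and it assumes both events fall on the same day and that the heartbeat lines begin with a syslog-style `Mon DD HH:MM:SS` timestamp:

```bash
#!/bin/bash
# Sketch: measure how long a heartbeat takeover took, from two ha-log lines.

# to_seconds HH:MM:SS -> seconds since midnight (10# stops "08" being read as octal).
to_seconds() {
    local h m s
    IFS=: read -r h m s <<< "$1"
    echo $(( 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

# takeover_seconds LOGFILE: seconds between the peer's shutdown notice and
# heartbeat's own "mach_down takeover complete" message.
takeover_seconds() {
    local start end
    start=$(grep -m1 'Received shutdown notice' "$1" | awk '{print $3}')
    end=$(grep -m1 'heartbeat:.*mach_down takeover complete' "$1" | awk '{print $3}')
    [ -n "$start" ] && [ -n "$end" ] || return 1
    echo $(( $(to_seconds "$end") - $(to_seconds "$start") ))
}
```

For example, `takeover_seconds /var/log/ha-log` on the backup node. A crash of the active node (rather than a clean shutdown) would not log a shutdown notice, so this sketch only covers the test performed here.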

[root@test-backup ~]# ip addr

……

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000

link/ether 00:0c:29:15:e6:bb brd ff:ff:ff:ff:ff:ff

inet 10.96.20.114/24 brd 10.96.20.255 scope global eth0

inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0

inet6 fe80::20c:29ff:fe15:e6bb/64 scope link

valid_lft forever preferred_lft forever

……

[root@test-backup ~]# df -h

Filesystem Size Used Avail Use% Mounted on

/dev/sda2 18G 4.7G 12G 29% /

tmpfs 112M 0 112M 0% /dev/shm

/dev/sda1 283M 83M 185M 31% /boot

/dev/sr0 3.6G 3.6G 0 100% /mnt/cdrom

/dev/drbd0 946M 31M 866M 4% /drbd

[root@test-backup ~]# service mysqld status

MySQL running (8772) [ OK ]
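The three checks just performed by hand (VIP present, /dev/drbd0 mounted on /drbd, mysqld running) can be bundled into one script. This is a sketch under the addresses and paths used in this article; the function names are made up for illustration, and the two parsing helpers read command output on stdin so they can be tried against saved output as well.

```bash
#!/bin/bash
# Sketch: post-takeover checks for the node that should now be active.
VIP="10.96.20.8"        # VIP from this article
MOUNT_POINT="/drbd"     # DRBD mount point from this article

# has_vip: reads `ip addr` output on stdin; succeeds if the VIP is configured.
has_vip() {
    grep -qF "inet ${VIP}/"
}

# drbd_mounted: reads `df -P` output on stdin; succeeds if /dev/drbd0 is mounted on MOUNT_POINT.
drbd_mounted() {
    awk -v mp="$MOUNT_POINT" '$1 == "/dev/drbd0" && $NF == mp { found = 1 } END { exit !found }'
}

# check_active_node: run all three checks on the local machine.
check_active_node() {
    ip addr | has_vip || { echo "VIP $VIP missing"; return 1; }
    df -P | drbd_mounted || { echo "$MOUNT_POINT not mounted"; return 1; }
    service mysqld status > /dev/null || { echo "mysqld not running"; return 1; }
    echo "node looks active"
}
```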

[root@localhost ~]# mysql # (on the slave, check that master-slave replication is working)

Welcome to the MySQL monitor. Commands end with ; or \g.

……

mysql> show slave status\G

*************************** 1. row ***************************

Slave_IO_State: Waiting for master to send event

Master_Host: 10.96.20.8

Master_User: repluser

Master_Port: 3306

Connect_Retry: 60

Master_Log_File: mysql-bin.000005

Read_Master_Log_Pos: 198

Relay_Log_File: relay-log.000004

Relay_Log_Pos: 344

Relay_Master_Log_File: mysql-bin.000005

Slave_IO_Running: Yes

Slave_SQL_Running: Yes

Replicate_Do_DB:

Replicate_Ignore_DB:

Replicate_Do_Table:

……

mysql> show databases;

+--------------------+

| Database |

+--------------------+

| information_schema |

| mysql |

| performance_schema |

| test |

| webgame1 |

| webgame2 |

| webgame3 |

+--------------------+

7 rows in set (0.00 sec)
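The slave-side check above comes down to two fields: `Slave_IO_Running` and `Slave_SQL_Running` must both be `Yes`. A small sketch that reads `SHOW SLAVE STATUS\G` output and reports that condition; the helper names are illustrative:

```bash
#!/bin/bash
# Sketch: decide whether replication is healthy from `SHOW SLAVE STATUS\G` output.

# slave_field NAME: reads the \G output on stdin and prints the value of field NAME.
slave_field() {
    sed -n "s/^[[:space:]]*$1:[[:space:]]*//p" | head -n 1
}

# replication_ok: reads the \G output on stdin; succeeds only if both
# replication threads are running.
replication_ok() {
    local status io sql
    status=$(cat)
    io=$(slave_field Slave_IO_Running <<< "$status")
    sql=$(slave_field Slave_SQL_Running <<< "$status")
    [ "$io" = "Yes" ] && [ "$sql" = "Yes" ]
}
```

On the slave this could be used as `mysql -e 'SHOW SLAVE STATUS\G' | replication_ok && echo "replication OK"`. The same `slave_field` helper can read `Seconds_Behind_Master` if replication lag also needs watching.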

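Common architectures for MySQL master-slave replication:

1. One master, one slave

Note: the HA software (keepalived or heartbeat) only needs to handle VIP failover;

This scheme is simple to deploy and easy to maintain;

When the master fails, traffic can switch to the slave automatically;

Both reads and writes depend on the master, so it carries a heavy load and is prone to lock contention and deadlocks;

The slave can also serve reads, but that has to be implemented in application code;

2. One master, multiple slaves

Note: the HA software (keepalived or heartbeat) only needs to handle VIP failover;

When the master fails, traffic can switch to slave1 automatically, but slave2 may then be unable to resynchronize with slave1 automatically; the semi-sync mechanism is the way to address this;

Supports rw splitting: the master handles writes and the slaves handle reads, but this has to be implemented in application code;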
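As a rough illustration of what application-side rw splitting involves, the sketch below picks a target host from the first keyword of a statement. `MASTER_HOST` is the VIP from this article; `SLAVE_HOST` is a hypothetical address, and `target_host` is an illustrative name rather than part of any tool mentioned here.

```bash
#!/bin/bash
# Sketch: application-side read/write splitting.
MASTER_HOST="10.96.20.8"     # VIP used in this article
SLAVE_HOST="10.96.20.120"    # assumption: hypothetical slave address, not from the article

# target_host SQL: print the host a statement should be sent to.
# Plain SELECT/SHOW statements go to the slave; everything else goes to the master.
# Limitation: reads that take locks (SELECT ... FOR UPDATE) or that must see
# data written a moment ago should still be sent to the master.
target_host() {
    local first
    read -r first _ <<< "$1"
    first=$(printf '%s' "$first" | tr '[:lower:]' '[:upper:]')
    case "$first" in
        SELECT|SHOW) echo "$SLAVE_HOST" ;;
        *)           echo "$MASTER_HOST" ;;
    esac
}
```

A caller would then run something like `mysql -h "$(target_host "$sql")" -e "$sql"`. Proxies such as mysql-proxy or amoeba make the same decision on behalf of the application.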

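3. Dual master

Note: HA software: keepalived+LVS, or MMM;

Once the two masters replicate to each other, they can be load balanced, and the loss of either master does not affect the service;

Dual master has a serious drawback: it increases the likelihood of data inconsistency;

Dual master brings little performance gain; it is a complex architecture without much benefit and is not recommended;

4. Dual master, multiple slaves:

Note: HA software: MMM or keepalived;

If one master goes down, the service is unaffected;

Writing to both masters is possible but increases the likelihood of data inconsistency;

Write to only one master at any given time;

5. Cascading replication

Note: the HA software (keepalived or heartbeat) only needs to handle VIP failover;

When master fails, traffic switches to master2, which keeps replicating to slave{1,2};

slave{1,2} support rw splitting, but this has to be implemented in application code;

The slaves replicate in a cascade, so there may be lag, and if master2 fails, replication to the slaves stops;

6. Dual master on drbd

Note: while passive-server is the standby node, it is not visible to clients

7.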

-------------------------------------------------------------------------

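Note: the HA software heartbeat handles VIP failover and also manages drbd and the mysqld service;

If master fails, the service switches to backup automatically, and slave{1,2} can still replicate from backup;

slave{1,2} support rw splitting, but this has to be implemented in application code;

This scheme also supports the semi-sync mechanism;

backup is accessible only after it has been promoted to master; under normal conditions only one of master and backup serves clients;

8. HA based on SAN storage, commonly used with Oracle and SQL Server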

-----------------------------------------------------------------------

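Note: HA software: Red Hat Cluster Suite;

The service depends on SAN storage;

Backup is accessible only after Master has failed and Backup has taken over successfully;

slave{1,2} support rw splitting;

9.

Note: flexible deployment and high resource utilization;

The two masters handle writes and the slaves handle reads;

The service depends on DNS, and long-lived connections are poorly supported;

A master failure affects the slaves;

10.

Note: available software: mysql-proxy, amoeba;

rw splitting is transparent to the front-end application, with health checks on the back end;

The open-source options are currently not stable;

A custom DB proxy has to be developed;

11. HA for distributed database clusters

Note: DAL, data access layer;

12.

Note: HA based on Galera;

Galera is a cluster system that implements multi-master, synchronous replication on top of MySQL InnoDB. Features: true multi-master; read and write to any node; synchronous replication; no slave lag or integrity issues; no master-slave failover, no VIP; multi-threaded slave; automatic node provisioning;

13. The official MySQL Cluster HA solution

Note: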

Criteria for choosing a MySQL HA architecture:

Solution	Availability	Data safety	Write performance
MySQL replication	98%--99.9+%	No	Fair
master-master with MMM manager	99%	No	Fair
heartbeat/SAN	99.5%--99.9%	Yes	Excellent
heartbeat/drbd	99.9%	Yes	Good
NDB cluster	99.999%	Yes	Excellent

Note: NDB cluster (very high; specific NDB knowledge, strong MySQL skills and strong sysadmin skills)
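To make the availability column easier to compare, each percentage can be converted into the downtime it allows per year. A small sketch, assuming a 365-day year (525600 minutes); `downtime_minutes` is an illustrative name:

```bash
#!/bin/bash
# Sketch: convert an availability percentage into allowed downtime per year.
# Assumes a 365-day year, i.e. 525600 minutes.
downtime_minutes() {
    awk -v p="$1" 'BEGIN { printf "%.1f\n", (100 - p) / 100 * 525600 }'
}

downtime_minutes 99.9     # heartbeat/drbd row: 525.6 minutes, roughly 8.8 hours
downtime_minutes 99.999   # NDB cluster row: 5.3 minutes
```

Seen this way, the 99% figure for master-master with MMM corresponds to more than three and a half days of downtime per year, against under nine hours for heartbeat/drbd.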

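Current problems with MySQL:

Single-node performance (QPS for reads and writes, response time, data volume; IOPS is the bottleneck for both read and write operations);

Master-slave data consistency (asynchronous replication, semi-sync replication, ordering plus completeness);

Automated scale-out (data migration; scaling out by a defined rule (hash modulo, range, date, combinations, horizontal and vertical splitting); capacity estimation and early warning (per-table capacity estimates (business assessment); buffer pool capacity and hit rate; disk capacity); automated full plus incremental scale-out (promote a slave to be the new master; automatic or manual; notify the proxy layer when scale-out completes so that it is transparent to the front end));

Master as a single point of failure (master-standby strategy (the standby only replicates data and serves no online queries); data backfill (pull binlog files from the master to fill in missing data); failover (when the master goes down, switch to a new master while keeping data as consistent as possible (depending on business characteristics); notify the proxy layer to switch to the new master transparently));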
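Of the split rules named above, hash modulo is the simplest to illustrate. The sketch below maps a numeric id, or an arbitrary string key, to one of N shards; the function names are illustrative, and `cksum` is used only as a conveniently available hash, not because the article prescribes it.

```bash
#!/bin/bash
# Sketch: hash-modulo shard selection.

# shard_for_id ID SHARD_COUNT: numeric keys can be taken modulo directly.
shard_for_id() {
    echo $(( $1 % $2 ))
}

# shard_for_key KEY SHARD_COUNT: hash a string key with cksum (a CRC), then take the modulo.
shard_for_key() {
    local crc
    crc=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
    echo $(( crc % $2 ))
}
```

The weakness of plain modulo is that changing the shard count remaps most keys, which is one reason the text pairs scale-out with data migration.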

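Distributed database:

1. Product positioning (preserve database features as far as possible while increasing data scale; low-latency online access; support data operations with a certain degree of relational complexity);

2. Design principles (implement the mysql client protocol; data placement logic is transparent to applications; automatic discovery / manual decision / automatic handling; support single-node transactions);

3. Design targets (stored data on the scale of hundreds of billions; response time under 10 ms; fully transparent to upper-layer applications);

Distributed database proxy layer (implements the mysql client protocol; rw splitting; LB, such as weighted round-robin across slaves; merging of query results; data splitting rules; concurrency control; SQL whitelist management; single-node transaction support (amoeba does not support transactions); server-side model);

Monitoring (liveness; master-slave replication lag; capacity (tables, disks); traffic (requests); hit rate (buffer pool); collection and reporting of key data);

Web monitoring and alerting (an interface friendly to ops staff and DBAs; can trigger cluster management operations (manual scale-out, switching to a new master); alerts on abnormal monitoring data (email, SMS, with the channel depending on severity));

Metadata service (stores data splitting rules (configuration center); election service; implements the fast paxos protocol; atomic broadcast protocol for data; data notification service; lock service; application location service);

Failover service (when the master goes down, promote the standby or a slave to be the new master (check whether ssh is reachable, fetch binlogs to backfill data), keeping data as consistent as possible; strategy for selecting the new master; once the new master is chosen, notify the front-end proxy layer);

Data migration service (scale out based on monitoring data and threshold metrics; full plus incremental; automatic cleanup of redundant data; automatic or manual migration)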
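The weighted round-robin load balancing mentioned for the proxy layer can be sketched in a few lines. Host names and weights below are hypothetical, and this is the simplest expansion-based variant, not the algorithm of any particular proxy:

```bash
#!/bin/bash
# Sketch: weighted round-robin over read slaves.
# Entries are "host:weight"; the hosts and weights are hypothetical.
SLAVES=("slave1:3" "slave2:1")

# build_ring: expand the entries so that each host appears weight times.
build_ring() {
    local entry host weight i
    for entry in "${SLAVES[@]}"; do
        host=${entry%%:*}
        weight=${entry##*:}
        for ((i = 0; i < weight; i++)); do echo "$host"; done
    done
}

# pick_slave N: host that serves the N-th read request (N counts from 0).
pick_slave() {
    local -a ring
    ring=($(build_ring))
    echo "${ring[$(( $1 % ${#ring[@]} ))]}"
}
```

With the weights above, requests 0 to 2 go to slave1 and request 3 goes to slave2 before the cycle repeats. Production balancers usually interleave the hosts more smoothly instead of sending consecutive requests to the same one.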

This article is reposted from chaijowin's 51CTO blog. Original link: http://blog.51cto.com/jowin/1837146. For permission to reprint, please contact the original author.
