Openfiler cluster deployment process

node1:
hostname: filer1
eth0: 10.10.10.101
eth1: 10.10.100.101
600 MB meta partition
8 GB+ data partition

node2:
hostname: filer2
eth0: 10.10.10.102
eth1: 10.10.100.102
600 MB meta partition
8 GB+ data partition

virtual IP: 10.10.10.100 (do not configure it on any NIC; it will be configured later in corosync)

NIC information

[root@filer1 ~]# ip addr
1: lo: <LOOPBACK,UP,10000> mtu 16436 qdisc noqueue
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
link/ether 1c:87:2c:47:db:1f brd ff:ff:ff:ff:ff:ff
3: eth0: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc mq qlen 1000
link/ether 00:1b:21:89:43:6c brd ff:ff:ff:ff:ff:ff
inet 10.10.10.101/24 brd 10.10.10.255 scope global eth0
inet6 fe80::21b:21ff:fe89:436c/64 scope link
valid_lft forever preferred_lft forever
4: eth1: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc mq qlen 1000
link/ether 00:1b:21:89:43:6d brd ff:ff:ff:ff:ff:ff
inet 10.10.100.101/24 brd 10.10.100.255 scope global eth1
inet6 fe80::21b:21ff:fe89:436d/64 scope link
valid_lft forever preferred_lft forever
[root@filer2 ~]# ip addr
1: lo: <LOOPBACK,UP,10000> mtu 16436 qdisc noqueue
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
link/ether 10:c3:7b:93:2b:21 brd ff:ff:ff:ff:ff:ff
3: eth0: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc mq qlen 1000
link/ether 00:1b:21:89:42:40 brd ff:ff:ff:ff:ff:ff
inet 10.10.10.102/24 brd 10.10.10.255 scope global eth0
inet6 fe80::21b:21ff:fe89:4240/64 scope link
valid_lft forever preferred_lft forever
4: eth1: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc mq qlen 1000
link/ether 00:1b:21:89:42:41 brd ff:ff:ff:ff:ff:ff
inet 10.10.100.102/24 brd 10.10.100.255 scope global eth1
inet6 fe80::21b:21ff:fe89:4241/64 scope link
valid_lft forever preferred_lft forever

Modify the hosts files

[root@filer1 ~]# vi /etc/hosts                                                                                  
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 filer1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
10.10.10.102 filer2
[root@filer2 ~]# vi /etc/hosts                                                                                  
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 filer2 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
10.10.10.101 filer1
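A quick sanity check (my addition, not part of the original) that hostname resolution now works between the two nodes:

[root@filer1 ~]# ping -c 2 filer2
[root@filer2 ~]# ping -c 2 filer1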

Set up SSH mutual trust

[root@filer1 ~]# ssh-keygen -t dsa

Generating public/private dsa key pair.
Enter file in which to save the key (/root/.ssh/id_dsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_dsa.
Your public key has been saved in /root/.ssh/id_dsa.pub.
The key fingerprint is:
99:47:bf:53:30:4f:ff:7c:b4:5e:1f:52:02:7b:fe:36 root@filer1
The key's randomart image is:
+--[ DSA 1024]----+
| |
| |
| ..o . |
| + .o= . |
| S ...oo.o|
| . ooooo|
| oo o=|
| .+E=|
| .oo|
+-----------------+

[root@filer1 ~]# ssh-copy-id -i /root/.ssh/id_dsa.pub filer2

The authenticity of host 'filer2 (10.10.10.102)' can't be established.
RSA key fingerprint is f8:7a:34:27:6c:5f:bc:cc:53:1f:49:1d:0d:88:4a:ec.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'filer2,10.10.10.102' (RSA) to the list of known hosts.
root@filer2's password:
Now try logging into the machine, with "ssh 'filer2'", and check in:

.ssh/authorized_keys

to make sure we haven't added extra keys that you weren't expecting.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
[root@filer2 ~]# ssh-keygen -t dsa

Generating public/private dsa key pair.
Enter file in which to save the key (/root/.ssh/id_dsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_dsa.
Your public key has been saved in /root/.ssh/id_dsa.pub.
The key fingerprint is:
27:77:a9:ed:1c:d8:2e:9d:23:1e:b1:27:98:75:62:66 root@filer2
The key's randomart image is:
+--[ DSA 1024]----+
| |
| |
| |
| . |
| S E + |
| @ @ |
| o *.=. |
| oB+. |
| ..o+. |
+-----------------+

[root@filer2 ~]# ssh-copy-id -i /root/.ssh/id_dsa.pub filer1

The authenticity of host 'filer1 (10.10.10.101)' can't be established.
RSA key fingerprint is 2d:f6:80:73:6e:e5:42:03:26:f6:c5:23:57:2f:42:fb.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'filer1,10.10.10.101' (RSA) to the list of known hosts.
root@filer1's password:
Now try logging into the machine, with "ssh 'filer1'", and check in:

.ssh/authorized_keys

to make sure we haven't added extra keys that you weren't expecting.
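As a quick sanity check (my addition, not shown in the original), confirm that passwordless login now works in both directions; each command should print the peer's hostname without asking for a password:

[root@filer1 ~]# ssh filer2 hostname
[root@filer2 ~]# ssh filer1 hostname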

Prepare the partitions

[root@filer1 ~]# fdisk -l

Disk /dev/sda: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x1c041c04

Device Boot Start End Blocks Id System

Disk /dev/sdb: 250.1 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders, total 488397168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x9f02b0bc

Device Boot Start End Blocks Id System
/dev/sdb1 * 63 1028159 514048+ 83 Linux
/dev/sdb2 1028160 17800019 8385930 83 Linux
/dev/sdb3 17800020 21992984 2096482+ 82 Linux swap / Solaris

[root@filer1 ~]# fdisk /dev/sda

Command (m for help): p

Disk /dev/sda: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x1c041c04

Device Boot Start End Blocks Id System

Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4, default 1):
Using default value 1
First sector (2048-976773167, default 2048):
Using default value 2048
Last sector, +sectors or +size{K,M,G} (2048-976773167, default 976773167): +600M

Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4, default 2):
Using default value 2
First sector (1230848-976773167, default 1230848):
Using default value 1230848
Last sector, +sectors or +size{K,M,G} (1230848-976773167, default 976773167): +2048.
Unsupported suffix: '.'
Supported: 10^N: KB (KiloByte), MB (MegaByte), GB (GigaByte)
           2^N: K (KibiByte), M (MebiByte), G (GibiByte)
Last sector, +sectors or +size{K,M,G} (1230848-976773167, default 976773167): +8G

Command (m for help): p

Disk /dev/sda: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x1c041c04

Device Boot Start End Blocks Id System
/dev/sda1 2048 1230847 614400 83 Linux
/dev/sda2 1230848 18008063 8388608 83 Linux

Command (m for help): t
Partition number (1-4): 2
Hex code (type L to list codes): 8e
Changed system type of partition 2 to 8e (Linux LVM)

Command (m for help): p

Disk /dev/sda: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x1c041c04

Device Boot Start End Blocks Id System
/dev/sda1 2048 1230847 614400 83 Linux
/dev/sda2 1230848 18008063 8388608 8e Linux LVM

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.
[root@filer1 ~]# partprobe
[root@filer2 ~]# fdisk -l

Disk /dev/sda: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x1c041c04

Device Boot Start End Blocks Id System

Disk /dev/sdb: 160.0 GB, 160040803840 bytes
255 heads, 63 sectors/track, 19457 cylinders, total 312579695 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x1c041c04

Device Boot Start End Blocks Id System
/dev/sdb1 * 63 1028159 514048+ 83 Linux
/dev/sdb2 1028160 17800019 8385930 83 Linux
/dev/sdb3 17800020 21992984 2096482+ 82 Linux swap / Solaris

[root@filer2 ~]# fdisk /dev/sda

The device presents a logical sector size that is smaller than
the physical sector size. Aligning to a physical sector (or optimal
I/O) size boundary is recommended, or performance may be impacted.

Command (m for help): p

Disk /dev/sda: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x1c041c04

Device Boot Start End Blocks Id System

Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4, default 1):
Using default value 1
First sector (2048-976773167, default 2048):
Using default value 2048
Last sector, +sectors or +size{K,M,G} (2048-976773167, default 976773167): +600M

Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4, default 2):
Using default value 2
First sector (1230848-976773167, default 1230848):
Using default value 1230848
Last sector, +sectors or +size{K,M,G} (1230848-976773167, default 976773167): +8G

Command (m for help): t
Partition number (1-4): 2
Hex code (type L to list codes): 8e
Changed system type of partition 2 to 8e (Linux LVM)

Command (m for help): p

Disk /dev/sda: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x1c041c04

Device Boot Start End Blocks Id System
/dev/sda1 2048 1230847 614400 83 Linux
/dev/sda2 1230848 18008063 8388608 8e Linux LVM

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.
[root@filer2 ~]# partprobe

Configure DRBD

Edit the drbd.conf configuration file
[root@filer1 ~]# vi /etc/drbd.conf
# You can find an example in /usr/share/doc/drbd.../drbd.conf.example

include "drbd.d/global_common.conf";
#include "drbd.d/*.res";
resource meta {
    on filer1 {
        device /dev/drbd0;
        disk /dev/sda1;
        address 10.10.100.101:7788;
        meta-disk internal;
    }
    on filer2 {
        device /dev/drbd0;
        disk /dev/sda1;
        address 10.10.100.102:7788;
        meta-disk internal;
    }
}
resource data1 {
    on filer1 {
        device /dev/drbd1;
        disk /dev/sda2;
        address 10.10.100.101:7789;
        meta-disk internal;
    }
    on filer2 {
        device /dev/drbd1;
        disk /dev/sda2;
        address 10.10.100.102:7789;
        meta-disk internal;
    }
}

Copy it so the two nodes are identical:

[root@filer1 etc]# scp /etc/drbd.conf filer2:/etc/drbd.conf
Edit the global_common.conf configuration file (adjust the rate attribute to your actual bandwidth)
[root@filer1 ~]# vi /etc/drbd.d/global_common.conf 
global {
usage-count yes;
# minor-count dialog-refresh disable-ip-verification
}

common {
protocol C; # Use protocol C: a write returns only after it has completed on all nodes.

handlers {
pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
# fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
# split-brain "/usr/lib/drbd/notify-split-brain.sh root";
# out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
# before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";
# after-resync-target /usr/lib/drbd/unsnapshot-resync-target-lvm.sh;
}

startup {
# wfc-timeout degr-wfc-timeout outdated-wfc-timeout wait-after-sb
}

disk {
# on-io-error fencing use-bmbv no-disk-barrier no-disk-flushes
# no-disk-drain no-md-flushes max-bio-bvecs
}

net {
# sndbuf-size rcvbuf-size timeout connect-int ping-int ping-timeout max-buffers
# max-epoch-size ko-count allow-two-primaries cram-hmac-alg shared-secret
# after-sb-0pri after-sb-1pri after-sb-2pri data-integrity-alg no-tcp-cork
}

syncer {
rate 100M; # Adjust to the bandwidth of the replication NIC; mine is gigabit, so 100M.
# rate after al-extents use-rle cpu-mask verify-alg csums-alg
}
}

Copy it so the two nodes are identical.

[root@filer1 etc]# scp /etc/drbd.d/global_common.conf filer2:/etc/drbd.d/global_common.conf

Create the metadata

(If it reports that existing data on the partition must be cleared, run # dd if=/dev/zero of=/dev/sda1, replacing sda1 with the corresponding partition.)

[root@filer1 etc]# drbdadm create-md meta
  --==  Thank you for participating in the global usage survey  ==--
The server's response is:

md_offset 629141504
al_offset 629108736
bm_offset 629088256

Found some data

==> This might destroy existing data! <==

Do you want to proceed?
[need to type 'yes' to confirm] yes

Writing meta data...
initializing activity log
NOT initialized bitmap
New drbd meta data block successfully created.
success

It is best to run the create-md command twice.

[root@filer2 ~]# drbdadm create-md meta
md_offset 629141504
al_offset 629108736
bm_offset 629088256

Found some data

==> This might destroy existing data! <==

Do you want to proceed?
[need to type 'yes' to confirm] yes

You want me to create a v08 style flexible-size internal meta data block.
There appears to be a v08 flexible-size internal meta data block
already in place on /dev/sda1 at byte offset 629141504
Do you really want to overwrite the existing v08 meta-data?
[need to type 'yes' to confirm] yes

Writing meta data...
initializing activity log
NOT initialized bitmap
New drbd meta data block successfully created.

Creating the metadata for the data1 resource works the same way:

[root@filer1 etc]# drbdadm create-md data1
md_offset 8589930496
al_offset 8589897728
bm_offset 8589635584

Found some data

==> This might destroy existing data! <==

Do you want to proceed?
[need to type 'yes' to confirm] yes

Writing meta data...
initializing activity log
NOT initialized bitmap
New drbd meta data block successfully created.
success
[root@filer1 etc]# drbdadm create-md data1
md_offset 8589930496
al_offset 8589897728
bm_offset 8589635584

Found some data

==> This might destroy existing data! <==

Do you want to proceed?
[need to type 'yes' to confirm] yes

You want me to create a v08 style flexible-size internal meta data block.
There appears to be a v08 flexible-size internal meta data block
already in place on /dev/sda2 at byte offset 8589930496
Do you really want to overwrite the existing v08 meta-data?
[need to type 'yes' to confirm] yes

Writing meta data...
initializing activity log
NOT initialized bitmap
New drbd meta data block successfully created.

Note: this must be executed on both nodes.

(Again, if it reports that existing data on the partition must be cleared, run # dd if=/dev/zero of=/dev/sda1, where sda1 is the corresponding partition.)
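Zeroing an entire partition with dd can take a long time on large partitions. A commonly used shortcut (my suggestion, not from the original) is to wipe only the beginning of the partition, which is usually enough to clear old filesystem or metadata signatures:

# wipe only the first 128 MB; adjust the device to the partition in question
[root@filer1 ~]# dd if=/dev/zero of=/dev/sda1 bs=1M count=128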

Start the DRBD service

[root@filer1 ~]# service drbd start
[root@filer2 ~]# service drbd start

Three ways to check DRBD status: cat /proc/drbd, drbd-overview, and service drbd status.

[root@filer1 etc]# more /proc/drbd
version: 8.3.10 (api:88/proto:86-96)
GIT-hash: 5c0b0469666682443d4785d90a2c603378f9017b build by phil@fat-tyre, 2011-01-28 12:17:35
0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:614344
1: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:8388316
[root@filer2 ~]# drbd-overview
0:meta Connected Secondary/Secondary Inconsistent/Inconsistent C r-----
1:data1 Connected Secondary/Secondary Inconsistent/Inconsistent C r-----

At this point both nodes are Secondary and no data is being synchronized; a primary node must be designated:

[root@filer1 ~]# drbdsetup /dev/drbd0 primary -o
[root@filer1 ~]# drbdsetup /dev/drbd1 primary -o
[root@filer1 ~]# more /proc/drbd                                   
version: 8.3.10 (api:88/proto:86-96)
GIT-hash: 5c0b0469666682443d4785d90a2c603378f9017b build by phil@fat-tyre, 2011-01-28 12:17:35
0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
ns:11520 nr:0 dw:0 dr:12192 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:602824
[>....................] sync'ed: 2.0% (602824/614344)K
finish: 0:07:47 speed: 1,280 (1,280) K/sec
1: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
ns:6272 nr:0 dw:0 dr:6944 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:8382044
[>....................] sync'ed: 0.1% (8184/8188)M
finish: 1:27:18 speed: 1,568 (1,568) K/sec

[root@filer1 ~]# drbd-overview
0:meta SyncSource Primary/Secondary UpToDate/Inconsistent C r-----
[>....................] sync'ed: 4.0% (590536/614344)K
1:data1 SyncSource Primary/Secondary UpToDate/Inconsistent C r-----
[>....................] sync'ed: 0.3% (8172/8188)M

[root@filer1 ~]# service drbd status
drbd driver loaded OK; device status:
version: 8.3.10 (api:88/proto:86-96)
GIT-hash: 5c0b0469666682443d4785d90a2c603378f9017b build by phil@fat-tyre, 2011-01-28 12:17:35
m:res cs ro ds p mounted fstype
... sync'ed: 0.5% (8156/8188)M
... sync'ed: 6.7% (576200/614344)K
0:meta SyncSource Primary/Secondary UpToDate/Inconsistent C
1:data1 SyncSource Primary/Secondary UpToDate/Inconsistent C

The output above shows how the three status-checking methods differ.

Check the status on node filer2; the two nodes have automatically started synchronizing data.

[root@filer2 ~]# drbd-overview
0:meta SyncTarget Secondary/Primary Inconsistent/UpToDate C r-----
[=========>..........] sync'ed: 52.0% (298312/614344)K
1:data1 SyncTarget Secondary/Primary Inconsistent/UpToDate C r-----
[>....................] sync'ed: 3.8% (7888/8188)M

Synchronization complete

[root@filer1 ~]# drbd-overview
0:meta Connected Secondary/Primary UpToDate/UpToDate C r-----
1:data1 Connected Secondary/Primary UpToDate/UpToDate C r-----

Configure the metadata partition

The filesystem can only be mounted on the Primary node, so the DRBD device can only be formatted after a primary has been set. This can now be done on the primary node filer1; after formatting, mount it on /meta. The process is as follows:

Format the meta partition

[root@filer1 ~]# mkfs.ext3 /dev/drbd0
mke2fs 1.41.14 (22-Dec-2010)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
38400 inodes, 153586 blocks
7679 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=159383552
5 block groups
32768 blocks per group, 32768 fragments per group
7680 inodes per group
Superblock backups stored on blocks:
32768, 98304

Writing inode tables: done
Creating journal (4096 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 39 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.

Stop the openfiler service in preparation for migrating the openfiler system files

[root@filer1 ~]# service openfiler stop
Stopping openfiler: [ OK ]

Migrate the openfiler files to the meta partition (run line by line)

mkdir /meta
mount /dev/drbd0 /meta
mv /opt/openfiler/ /opt/openfiler.local
mkdir -p /meta/opt
cp -a /opt/openfiler.local /meta/opt/openfiler
ln -s /meta/opt/openfiler /opt/openfiler
rm -rf /meta/opt/openfiler/sbin/openfiler
ln -s /usr/sbin/httpd /meta/opt/openfiler/sbin/openfiler
rm -rf /meta/opt/openfiler/etc/rsync.xml
ln -s /opt/openfiler.local/etc/rsync.xml /meta/opt/openfiler/etc/
mkdir -p /meta/etc/httpd/conf.d

Migrate the Samba/NFS/iSCSI/ProFTPD configuration files to the meta partition (run line by line)

root@filer1 ~# service nfslock stop
root@filer1 ~# umount -a -t rpc-pipefs
root@filer1 ~# mkdir /meta/etc (it may report that the directory already exists; if so, skip this step)
root@filer1 ~# mv /etc/samba/ /meta/etc/
root@filer1 ~# ln -s /meta/etc/samba/ /etc/samba
root@filer1 ~# mkdir -p /meta/var/spool
root@filer1 ~# mv /var/spool/samba/ /meta/var/spool/
root@filer1 ~# ln -s /meta/var/spool/samba/ /var/spool/samba
root@filer1 ~# mkdir -p /meta/var/lib
root@filer1 ~# mv /var/lib/nfs/ /meta/var/lib/
root@filer1 ~# ln -s /meta/var/lib/nfs/ /var/lib/nfs
root@filer1 ~# mv /etc/exports /meta/etc/
root@filer1 ~# ln -s /meta/etc/exports /etc/exports
root@filer1 ~# mv /etc/ietd.conf /meta/etc/
root@filer1 ~# ln -s /meta/etc/ietd.conf /etc/ietd.conf
root@filer1 ~# mv /etc/initiators.allow /meta/etc/
root@filer1 ~# ln -s /meta/etc/initiators.allow /etc/initiators.allow
root@filer1 ~# mv /etc/initiators.deny /meta/etc/
root@filer1 ~# ln -s /meta/etc/initiators.deny /etc/initiators.deny
root@filer1 ~# mv /etc/proftpd /meta/etc/
root@filer1 ~# ln -s /meta/etc/proftpd/ /etc/proftpd
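A quick way (my addition) to confirm that the migrated configuration files now resolve through /meta is to list the symlinks; each entry should point into /meta:

root@filer1 ~# ls -ld /etc/samba /etc/exports /etc/ietd.conf /etc/initiators.allow /etc/initiators.deny /etc/proftpd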

Configure openfiler's HTTPD modules

root@filer1 ~# rm -rf /opt/openfiler/etc/httpd/modules
root@filer1 ~# ln -s /usr/lib64/httpd/modules /opt/openfiler/etc/httpd/modules

Now try starting the openfiler service

root@filer1 ~# service openfiler start

Configure the other node

root@filer2 ~# service openfiler stop
root@filer2 ~# mkdir /meta
root@filer2 ~# mv /opt/openfiler/ /opt/openfiler.local
root@filer2 ~# ln -s /meta/opt/openfiler /opt/openfiler
root@filer2 ~# service nfslock stop
root@filer2 ~# umount -a -t rpc-pipefs
root@filer2 ~# rm -rf /etc/samba/
root@filer2 ~# ln -s /meta/etc/samba/ /etc/samba
root@filer2 ~# rm -rf /var/spool/samba/
root@filer2 ~# ln -s /meta/var/spool/samba/ /var/spool/samba
root@filer2 ~# rm -rf /var/lib/nfs/
root@filer2 ~# ln -s /meta/var/lib/nfs/ /var/lib/nfs
root@filer2 ~# rm -rf /etc/exports
root@filer2 ~# ln -s /meta/etc/exports /etc/exports
root@filer2 ~# rm -rf /etc/ietd.conf
root@filer2 ~# ln -s /meta/etc/ietd.conf /etc/ietd.conf
root@filer2 ~# rm -rf /etc/initiators.allow
root@filer2 ~# ln -s /meta/etc/initiators.allow /etc/initiators.allow
root@filer2 ~# rm -rf /etc/initiators.deny
root@filer2 ~# ln -s /meta/etc/initiators.deny /etc/initiators.deny
root@filer2 ~# rm -rf /etc/proftpd
root@filer2 ~# ln -s /meta/etc/proftpd/ /etc/proftpd

Prepare the data storage partition

Edit the LVM filter rules

[root@filer1 ~]# vi /etc/lvm/lvm.conf

Find the line filter = [ "a/.*/" ] and replace it with filter = [ "a|drbd[0-9]|", "r|.*|" ]
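For reference, a minimal sketch of how the devices section of lvm.conf should read after the edit (the surrounding comments and other settings in your file will differ):

devices {
    # accept only DRBD devices; reject all other block devices
    filter = [ "a|drbd[0-9]|", "r|.*|" ]
}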

Copy it so the two nodes are identical:

[root@filer1 ~]# scp /etc/lvm/lvm.conf filer2:/etc/lvm/lvm.conf

Create the volumes to be used (this can only be done on the primary node)

[root@filer1 ~]# pvcreate /dev/drbd1
[root@filer1 ~]# vgcreate data /dev/drbd1
[root@filer1 ~]# lvcreate -L 300M -n filer data
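A quick check (my addition) that the physical volume, volume group, and logical volume were created on the DRBD device:

[root@filer1 ~]# pvs
[root@filer1 ~]# vgs
[root@filer1 ~]# lvs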

Begin the corosync configuration

Generate the corosync authkey
[root@filer2 ~]# corosync-keygen
Corosync Cluster Engine Authentication key generator.
Gathering 1024 bits for key from /dev/random.
Press keys on your keyboard to generate entropy.
Press keys on your keyboard to generate entropy (bits = 128).

This step requires typing random characters on a real (physical) console to generate entropy.
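If the node is headless and corosync-keygen stalls while waiting for entropy, a workaround that is often used (my suggestion, not from the original) is to generate disk activity from a second session until the key has been written:

# run in another session while corosync-keygen is waiting
[root@filer2 ~]# find / -type f -exec cat {} \; > /dev/null 2>&1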

Copy the authkey to the other node

[root@filer2 ~]# scp /etc/corosync/authkey filer1:/etc/corosync/authkey 

Change the permissions of the authkey file on the other node

[root@filer1 /]# chmod 400 /etc/corosync/authkey

Create a pcmk service file

[root@filer1 /]# vi /etc/corosync/service.d/pcmk
service {
    # Load the Pacemaker Cluster Resource Manager
    name: pacemaker
    ver: 0
}

Copy it to the other node

[root@filer1 /]# scp /etc/corosync/service.d/pcmk filer2:/etc/corosync/service.d/pcmk

Create the corosync.conf file

[root@filer1 /]# vi /etc/corosync/corosync.conf
# Please read the corosync.conf.5 manual page
compatibility: whitetank
totem {
    version: 2
    secauth: off
    threads: 0
    interface {
        ringnumber: 0
        bindnetaddr: 10.10.100.0 # set this to the network of your data-sync interface
        mcastaddr: 226.94.1.1
        mcastport: 5405
        ttl: 1
    }
}
logging {
    fileline: off
    to_stderr: no
    to_logfile: yes
    to_syslog: yes
    logfile: /var/log/cluster/corosync.log
    debug: off
    timestamp: on
    logger_subsys {
        subsys: AMF
        debug: off
    }
}
amf {
    mode: disabled
}

Copy it to the other node

[root@filer1 /]# scp /etc/corosync/corosync.conf filer2:/etc/corosync/corosync.conf
Preparation before starting the corosync service for the first time
[root@filer1 ~]# chkconfig --level 2345 openfiler off
[root@filer1 ~]# chkconfig --level 2345 nfslock off
[root@filer1 ~]# chkconfig --level 2345 corosync on

Perform the three operations above on both nodes.

(Also, after a reboot there is no need to start the DRBD service manually; corosync will take over managing it.)
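Since corosync/pacemaker will manage DRBD from now on, it also makes sense (my addition; this step is not shown in the original) to disable the drbd init script at boot on both nodes:

[root@filer1 ~]# chkconfig --level 2345 drbd off
[root@filer2 ~]# chkconfig --level 2345 drbd off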

Check the corosync status
[root@filer1 ~]# ps auxf
root      1452  0.0  0.1 534468  4024 ?        Ssl  14:28   0:00 corosync
root 1458 0.0 0.0 66084 2744 ? S 14:28 0:00 \_ /usr/lib64/heartbeat/stonithd
106 1459 0.0 0.1 67744 4944 ? S 14:28 0:00 \_ /usr/lib64/heartbeat/cib
root 1460 0.0 0.0 70840 2224 ? S 14:28 0:00 \_ /usr/lib64/heartbeat/lrmd
106 1461 0.0 0.0 66448 3048 ? S 14:28 0:00 \_ /usr/lib64/heartbeat/attrd
106 1462 0.0 0.0 68940 2660 ? S 14:28 0:00 \_ /usr/lib64/heartbeat/pengine
106 1463 0.0 0.0 76776 3388 ? S 14:28 0:00 \_ /usr/lib64/heartbeat/crmd

Check the Corosync cluster status

[root@filer1 ~]# crm_mon --one-shot -V
crm_mon[1722]: 2021/08/16_14:36:24 ERROR: unpack_resources: Resource start-up disabled since no STONITH resources have been defined
crm_mon[1722]: 2021/08/16_14:36:24 ERROR: unpack_resources: Either configure some or disable STONITH with the stonith-enabled option
crm_mon[1722]: 2021/08/16_14:36:24 ERROR: unpack_resources: NOTE: Clusters with shared data need STONITH to ensure data integrity
============
Last updated: Mon Aug 16 14:36:24 2021
Stack: openais
Current DC: filer2 - partition with quorum
Version: 1.1.2-c6b59218ee949eebff30e837ff6f3824ed0ab86b
2 Nodes configured, 2 expected votes
0 Resources configured.
============

Online: [ filer1 filer2 ]

Begin configuring the Corosync resources

# Enter configuration mode
root@filer1 ~# crm configure
# Disable STONITH. Corosync/Pacemaker enables STONITH by default; without a STONITH device,
# verify will report errors and commit will fail if resources are configured directly.
# Usage of property: property [$id=<set_id>] <option>=<value>
crm(live)configure# property stonith-enabled=false
# Check that the configured properties are correct
crm(live)configure# verify
# Ignore quorum so that resources are still provided when only one node is up
crm(live)configure# property no-quorum-policy=ignore
# Define resource stickiness
crm(live)configure# rsc_defaults resource-stickiness=100

# Start defining resources
# Define the virtual IP resource: ClusterIP, type ocf:heartbeat:IPaddr2, with ip/netmask parameters
crm(live)configure# primitive ClusterIP ocf:heartbeat:IPaddr2 params ip=10.10.10.100 cidr_netmask=24 op monitor interval=30s
# Define a filesystem resource so the meta partition is mounted automatically after a node switch
crm(live)configure# primitive MetaFS ocf:heartbeat:Filesystem params device="/dev/drbd0" directory="/meta" fstype="ext3"
# Define the LVM resource
crm(live)configure# primitive lvmdata ocf:heartbeat:LVM params volgrpname="data"
# Define the DRBD resources
crm(live)configure# primitive drbd_meta ocf:linbit:drbd params drbd_resource="meta" op monitor interval="15s"
crm(live)configure# primitive drbd_data ocf:linbit:drbd params drbd_resource="data1" op monitor interval="15s"

# Define the openfiler and related service resources (type lsb)
crm(live)configure# primitive openfiler lsb:openfiler
crm(live)configure# primitive iscsi lsb:iscsi-target
crm(live)configure# primitive samba lsb:smb
crm(live)configure# primitive nfs lsb:nfs
crm(live)configure# primitive nfslock lsb:nfslock
# Define resource groups
crm(live)configure# group g_drbd drbd_meta drbd_data
crm(live)configure# group g_services MetaFS lvmdata openfiler ClusterIP iscsi samba nfs nfslock
# Define the master/slave resource
crm(live)configure# ms ms_g_drbd g_drbd meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
# Colocation constraint: the g_services group must run on the node where ms_g_drbd is Master
crm(live)configure# colocation c_g_services_on_g_drbd inf: g_services ms_g_drbd:Master
# Order constraint: g_services starts only after ms_g_drbd has been promoted
crm(live)configure# order o_g_servicesafter_g_drbd inf: ms_g_drbd:promote g_services:start
# Commit the changes
crm(live)configure# commit
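Before or after committing, the full configuration can be reviewed from within the crm shell (my addition, a standard crmsh command):

crm(live)configure# show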

The cluster is now active

[root@filer1 ~]# crm_mon


Attempting connection to the cluster...
============
Last updated: Tue Aug 17 07:55:40 2021
Stack: openais
Current DC: filer2 - partition with quorum
Version: 1.1.2-c6b59218ee949eebff30e837ff6f3824ed0ab86b
2 Nodes configured, 2 expected votes
2 Resources configured.
============

Online: [ filer1 filer2 ]

Resource Group: g_services
MetaFS (ocf::heartbeat:Filesystem): Started filer2
lvmdata (ocf::heartbeat:LVM): Started filer2
openfiler (lsb:openfiler): Started filer2
ClusterIP (ocf::heartbeat:IPaddr2): Started filer2
iscsi (lsb:iscsi-target): Started filer2
samba (lsb:smb): Started filer2
nfs (lsb:nfs): Started filer2
nfslock (lsb:nfslock): Started filer2
Master/Slave Set: ms_g_drbd
Masters: [ filer2 ]
Slaves: [ filer1 ]

Connection was reset.

According to other tutorials, everything would be perfectly finished at this point. In actual testing, however, the following problems were found:

  1. When the primary node is rebooted with reboot, the lvmdata resource gets stuck and the standby node cannot take over.

    Temporary workaround: powering off the primary node directly allows a normal failover.

  2. After a node switch, the new primary node does not mount the LVs automatically.

    Temporary workaround:

    Move the /mnt directory on the primary node into /meta, then create a symlink back to the original location.

    [root@filer1 ~]# mv /mnt /meta/
    [root@filer1 ~]# ln -s /meta/mnt /mnt

    On the standby node, delete /mnt directly and create the same symlink.

    [root@filer2 /]# ln -s /meta/mnt /mnt

    Modify the corosync configuration

    Define a new resource (mount_lv1: the resource name; /dev/data/ftp: an LV created in openfiler; /mnt/data/ftp: the corresponding mount point; ext3: must match the filesystem used when the LV was formatted). Note: the LV must first be created in the openfiler web console, and only then configured with the parameters below.

    crm(live)configure# primitive mount_lv1 ocf:heartbeat:Filesystem params device=/dev/data/ftp directory=/mnt/data/ftp fstype=ext3 

    Edit the g_services resource group and add the newly defined resource. Note that resources in a group are started from left to right, so pay attention to where you insert it.

    crm(live)configure# edit g_services

    group g_services MetaFS lvmdata openfiler ClusterIP mount_lv1 iscsi samba nfs nfslock

    Finally, commit the changes

    crm(live)configure# commit
  3. The LV "filer" created earlier does not appear under Shares. Edit the openfiler configuration file volumes.xml:

    [root@filer1 ~]# vi /meta/opt/openfiler/etc/volumes.xml

    <?xml version="1.0" ?>
    <volumes>
    <volume id="ftp" name="FTP" mountpoint="/mnt/data/ftp/" vg="data" fstype="ext3" />
    <volume id="filer" name="Filer" mountpoint="/mnt/data/filer" vg="data" fstype="ext3" />
    </volumes>

    After refreshing the web console, filer is displayed correctly, but this LV cannot be managed. Delete it under Volumes; an LV with the same or a different name can then be created again.

  4. New VGs cannot be created from the openfiler console; LVs can only be created inside the VG defined as a resource (data), which limits capacity expansion.

    Temporary workaround: first edit the DRBD configuration to create a new data2 resource, then define the corresponding corosync resources and add them to the g_drbd resource group.

    First, disable corosync at boot on all nodes.

    [root@filer1 ~]# chkconfig --level 2345 corosync off
    [root@filer2 ~]# chkconfig --level 2345 corosync off

    After the reboot none of the cluster services are running, including DRBD. Prepare a suitable LVM partition (in this example: /dev/sdc1).

    Edit /etc/drbd.conf to add the new resource, and copy it to the other node.

    resource data2 {	# the resource name must not be the same as any other resource
        on filer1 {
            device /dev/drbd2;	# note the block device name
            disk /dev/sdc1;
            address 10.10.100.101:7782;	# the port must not be the same as any other resource
            meta-disk internal;
        }
        on filer2 {
            device /dev/drbd2;	# note the block device name
            disk /dev/sdc1;
            address 10.10.100.102:7782;	# the port must not be the same as any other resource
            meta-disk internal;
        }
    }

    Create the new metadata (if it reports that existing data must be cleared, run: dd if=/dev/zero of=/dev/sdc1):

    [root@filer1 ~]# drbdadm create-md data2  # running it twice is recommended
    [root@filer2 ~]# drbdadm create-md data2  # run on the other node as well

    Start the DRBD service on both nodes

    [root@filer1 ~]# service drbd start
    [root@filer2 ~]# service drbd start

    Use drbd-overview to see which node is currently primary, then on that primary node promote the new drbd2 device:

    #  drbdsetup /dev/drbd2 primary -o  # after this step, DRBD automatically starts synchronizing

    After synchronization completes, create the volumes on drbd2

    root@filer1 ~# pvcreate /dev/drbd2
    # In principle this should be added to the existing VG "data", but in my VM tests,
    # after adding the new PV, the VG "data" could no longer be managed,
    # so create a new VG "data2" instead.
    root@filer1 ~# vgcreate data2 /dev/drbd2
    # To make sure it is usable, manually create an LV first, then delete it from the web console.
    root@filer1 ~# lvcreate -L 400M -n filer2 data2
    root@filer1 ~# mkfs.ext3 /dev/data2/filer2

    Enable corosync at boot on all nodes, then reboot all nodes.

    [root@filer1 ~]# chkconfig --level 2345 corosync on
    [root@filer2 ~]# chkconfig --level 2345 corosync on

    Modify the corosync configuration (on any node)

    # Enter configuration mode
    [root@filer1 ~]# crm configure
    # Define the VG (LVM) resource
    crm(live)configure# primitive lvmdata2 ocf:heartbeat:LVM params volgrpname="data2"
    # Define the DRBD resource
    crm(live)configure# primitive drbd_data2 ocf:linbit:drbd params drbd_resource="data2" op monitor interval="15s"
    # Edit the g_drbd resource group and add the drbd_data2 resource
    crm(live)configure# edit g_drbd
    group g_drbd drbd_meta drbd_data drbd_data2
    # Edit the g_services resource group and add the lvmdata2 resource; mind its position
    crm(live)configure# edit g_services
    group g_services MetaFS lvmdata lvmdata2 openfiler ClusterIP mount_lv1 iscsi samba nfs nfslock
    # Commit
    crm(live)configure# commit

    Check the cluster status now; startup should have completed.

    [root@filer1 ~]# crm_mon

Corosync constraints

Resource constraint definitions

If you want several resources to run on the same node, either put them in a group or define colocation constraints.

Resource constraints specify on which cluster nodes resources run, in what order resources are loaded, and which other resources a particular resource depends on.

Pacemaker provides three types of resource constraints:

1) Resource Location: defines on which nodes a resource may, may not, or preferably should run;
2) Resource Colocation: defines which cluster resources may or may not run together on the same node;
3) Resource Order: defines the order in which cluster resources are started on a node.

Constraint definition examples (colocation and order constraints)

crm(live)configure# colocation drbd_with_ms_mfsdrbd inf: drbdfs ms_mfsdrbd:Master	[Note: the filesystem (mount) resource follows the DRBD Master resource]
crm(live)configure# order drbd_after_ms_mfsdrbd mandatory: ms_mfsdrbd:promote drbdfs:start	[Note: drbdfs can be started only on the node where the DRBD Master is running]
crm(live)configure# colocation mfsserver_with_drbdfs inf: mfsserver drbdfs	[Note: the mfs service follows the mount resource]
crm(live)configure# order mfsserver_after_drbdfs mandatory: drbdfs:start mfsserver:start	[Note: the mfs service can start only after drbdfs has started]
crm(live)configure# colocation vip_with_mfsserver inf: vip mfsserver	[Note: the vip follows the mfs service]
crm(live)configure# order vip_before_mfsserver mandatory: vip mfsserver	[Note: the mfs service can start only after the vip has started]
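The examples above cover colocation and order constraints. For completeness, a location constraint (my own sketch, reusing the resource group and node names from this deployment) would look like this, giving g_services a preference score of 50 for running on filer1:

crm(live)configure# location l_prefer_filer1 g_services 50: filer1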
