
centos 8 - pacemaker + iscsi + gfs2 shared storage redundancy (HA)

sysman 2021. 1. 5. 23:32

 

################################################################

###############        SERVER 2 SETUP       #############################

 

 

 

[root@server2 ~]# lsscsi

 

[root@server2 ~]# find /sys -name scan

 

/sys/module/scsi_mod/parameters/scan

[root@server2 ~]# echo "- - -" >> /sys/devices/pci0000:00/0000:00:10.0/host0/scsi_host/host0/scan
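The three dashes written to the scan file are wildcards for channel, target and LUN, which tells the kernel to rescan that SCSI host for new devices. A small sketch (assuming the usual sysfs layout) that rescans every SCSI host instead of one hard-coded path:

for f in /sys/class/scsi_host/host*/scan; do
    echo "- - -" > "$f"    # wildcard channel/target/lun rescan on each host
done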

[root@server2 ~]# lsscsi

[0:0:1:0]    disk    VMware,  VMware Virtual S 1.0   /dev/sdb

[root@server2 ~]#

[root@server2 ~]#

[root@server2 ~]# fdisk /dev/sdb

 

Welcome to fdisk (util-linux 2.32.1).

Changes will remain in memory only, until you decide to write them.

Be careful before using the write command.

 

Device does not contain a recognized partition table.

Created a new DOS disklabel with disk identifier 0xaf2c7031.

 

Command (m for help): n

Partition type

   p   primary (0 primary, 0 extended, 4 free)

   e   extended (container for logical partitions)

Select (default p): p

Partition number (1-4, default 1):

First sector (2048-2097151, default 2048):

Last sector, +sectors or +size{K,M,G,T,P} (2048-2097151, default 2097151):

 

Created a new partition 1 of type 'Linux' and of size 1023 MiB.

 

Command (m for help): w

The partition table has been altered.

Calling ioctl() to re-read partition table.

Syncing disks.

 

 

[root@server2 ~]# dnf -y install lvm2 targetcli

 

 

[root@server2 ~]#

[root@server2 ~]# lsblk

NAME        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT

sda           8:0    0  100G  0 disk

├─sda1        8:1    0    1G  0 part /boot

└─sda2        8:2    0   99G  0 part

  ├─cl-root 253:0    0 91.9G  0 lvm  /

  ├─cl-swap 253:1    0  2.1G  0 lvm  [SWAP]

  └─cl-home 253:2    0    5G  0 lvm  /home

sdb           8:16   0    1G  0 disk

└─sdb1        8:17   0 1023M  0 part

[root@server2 ~]#

[root@server2 ~]#

[root@server2 ~]#

[root@server2 ~]# targetcli

Warning: Could not load preferences file /root/.targetcli/prefs.bin.

targetcli shell version 2.1.53

Copyright 2011-2013 by Datera, Inc and others.

For help on commands, type 'help'.

 

/> cd /backstores/block

/backstores/block> create block1 /dev/sdb1

Created block storage object block1 using /dev/sdb1.

 

/backstores/block> cd ../..

/> cd iscsi

/iscsi> create iqn.2021-01.com.example.server2:disk1

Created target iqn.2021-01.com.example.server2:disk1.

Created TPG 1.

Global pref auto_add_default_portal=true

Created default portal listening on all IPs (0.0.0.0), port 3260.

 

/iscsi> cd iqn.2021-01.com.example.server2:disk1/tpg1/acls

 

/iscsi/iqn.20...sk1/tpg1/acls> create iqn.2021-01.com.example.server3:server3

Created Node ACL for iqn.2021-01.com.example.server3:server3

 

/iscsi/iqn.20...sk1/tpg1/acls> cd ..

/iscsi/iqn.20...r2:disk1/tpg1> cd luns

/iscsi/iqn.20...sk1/tpg1/luns> create /backstores/block/block1

Created LUN 0.

Created LUN 0->0 mapping in node ACL iqn.2021-01.com.example.server3:server3

 

/iscsi/iqn.20...sk1/tpg1/luns> cd ..

/iscsi/iqn.20...r2:disk1/tpg1> cd portals/

/iscsi/iqn.20.../tpg1/portals> delete 0.0.0.0 3260

Deleted network portal 0.0.0.0:3260

/iscsi/iqn.20.../tpg1/portals> create 192.168.10.220 3260

Using default IP port 3260

Created network portal 192.168.10.220:3260.

/iscsi/iqn.20.../tpg1/portals> create 192.168.10.230 3260

Using default IP port 3260

Created network portal 192.168.10.230:3260.

 

/iscsi/iqn.20.../tpg1/portals>

/iscsi/iqn.20.../tpg1/portals> ls

o- portals ...................................................................................... [Portals: 2]

  o- 192.168.10.220:3260 ................................................................................ [OK]

  o- 192.168.10.230:3260 ................................................................................ [OK]

 

/iscsi/iqn.20.../tpg1/portals> exit

Global pref auto_save_on_exit=true

Configuration saved to /etc/target/saveconfig.json

[root@server2 ~]# targetcli ls

o- / ................................................................................................... [...]

  o- backstores ........................................................................................ [...]

  | o- block ............................................................................ [Storage Objects: 1]

  | | o- block1 ................................................. [/dev/sdb1 (1023.0MiB) write-thru activated]

  | |   o- alua ............................................................................. [ALUA Groups: 1]

  | |     o- default_tg_pt_gp ................................................. [ALUA state: Active/optimized]

  | o- fileio ........................................................................... [Storage Objects: 0]

  | o- pscsi ............................................................................ [Storage Objects: 0]

  | o- ramdisk .......................................................................... [Storage Objects: 0]

  o- iscsi ...................................................................................... [Targets: 1]

  | o- iqn.2021-01.com.example.server2:disk1 ....................................................... [TPGs: 1]

  |   o- tpg1 ......................................................................... [no-gen-acls, no-auth]

  |     o- acls .................................................................................... [ACLs: 1]

  |     | o- iqn.2021-01.com.example.server3:server3 ........................................ [Mapped LUNs: 1]

  |     |   o- mapped_lun0 .......................................................... [lun0 block/block1 (rw)]

  |     o- luns .................................................................................... [LUNs: 1]

  |     | o- lun0 .............................................. [block/block1 (/dev/sdb1) (default_tg_pt_gp)]

  |     o- portals .............................................................................. [Portals: 2]

  |       o- 192.168.10.220:3260 ........................................................................ [OK]

  |       o- 192.168.10.230:3260 ........................................................................ [OK]

  o- loopback ................................................................................... [Targets: 0]
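The same target can also be built non-interactively, since targetcli accepts a path and command as arguments; a rough sketch of the steps above (one thing to keep in mind: a portal IP must be an address configured on the target host itself, or the portal cannot bind):

targetcli /backstores/block create block1 /dev/sdb1
targetcli /iscsi create iqn.2021-01.com.example.server2:disk1
targetcli /iscsi/iqn.2021-01.com.example.server2:disk1/tpg1/acls create iqn.2021-01.com.example.server3:server3
targetcli /iscsi/iqn.2021-01.com.example.server2:disk1/tpg1/luns create /backstores/block/block1
targetcli saveconfig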

 

 

[root@server2 ~]# firewall-cmd --permanent --add-port=3260/tcp

[root@server2 ~]# firewall-cmd --reload

[root@server2 ~]# firewall-cmd --list-all
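Before moving on to the initiators, it can help to confirm that the target is actually listening on TCP 3260; a quick check:

ss -tln | grep 3260    # a LISTEN entry should appear for each portal IP (the socket belongs to the kernel target, so no process name is shown)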

 

 


[root@server2 ~]# targetcli ls

o- / ................................................................................................... [...]

  o- backstores ........................................................................................ [...]

  | o- block ............................................................................ [Storage Objects: 1]

  | | o- block1 ................................................. [/dev/sdb1 (1023.0MiB) write-thru activated]

  | |   o- alua ............................................................................. [ALUA Groups: 1]

  | |     o- default_tg_pt_gp ................................................. [ALUA state: Active/optimized]

  | o- fileio ........................................................................... [Storage Objects: 0]

  | o- pscsi ............................................................................ [Storage Objects: 0]

  | o- ramdisk .......................................................................... [Storage Objects: 0]

  o- iscsi ...................................................................................... [Targets: 1]

  | o- iqn.2021-01.com.example.server2:disk1 ....................................................... [TPGs: 1]

  |   o- tpg1 ......................................................................... [no-gen-acls, no-auth]

  |     o- acls .................................................................................... [ACLs: 1]

  |     | o- iqn.2021-01.com.example.server3:server3 ........................................ [Mapped LUNs: 1]

  |     |   o- mapped_lun0 .......................................................... [lun0 block/block1 (rw)]

  |     o- luns .................................................................................... [LUNs: 1]

  |     | o- lun0 .............................................. [block/block1 (/dev/sdb1) (default_tg_pt_gp)]

  |     o- portals .............................................................................. [Portals: 2]

  |       o- 192.168.10.220:3260 ........................................................................ [OK]

  |       o- 192.168.10.230:3260 ........................................................................ [OK]

  o- loopback ................................................................................... [Targets: 0]

[root@server2 ~]#

[root@server2 ~]# targetcli

/> saveconfig

Last 10 configs saved in /etc/target/backup/.

Configuration saved to /etc/target/saveconfig.json

/> exit

Global pref auto_save_on_exit=true

Configuration saved to /etc/target/saveconfig.json

[root@server2 ~]# reboot

 

 

 

 

 

###############################################################

###################     SERVER 3 SETUP     ###########################

 

CentOS 8 does not ship a dlm package, so rebuild dlm from the source RPM

[root@server3 ~]# yum install yum-utils

[root@server3 ~]# yumdownloader --source dlm

[root@server3 ~]# yum -y install rpm-build libxml2-devel systemd-devel glibc-kernheaders gcc make

[root@server3 ~]# dnf config-manager --set-enabled ha

[root@server3 ~]# dnf -y install pacemaker fence-agents-all gfs2-utils pcp-zeroconf corosynclib-devel pacemaker-libs-devel

[root@server3 ~]# rpmbuild --rebuild dlm-4.0.9-3.el8.src.rpm

[root@server3 ~]# rpm -ivh /root/rpmbuild/RPMS/x86_64/dlm-4.0.9-3.el8.x86_64.rpm --nodeps

[root@server3 ~]# dnf repolist

 

 

Set the hacluster account password

[root@server3 ~]# passwd hacluster

Changing password for user hacluster.

New password:

BAD PASSWORD: The password contains the user name in some form

Retype new password:

passwd: all authentication tokens updated successfully.

 

Install pcs

[root@server3 ~]# dnf -y install pcs

[root@server3 ~]# systemctl start pcsd

[root@server3 ~]# systemctl enable pcsd

 

Open the firewall

[root@server3 ~]# firewall-cmd --permanent --zone=public --add-port=2224/tcp

[root@server3 ~]# firewall-cmd --permanent --zone=public --add-port=5405/udp

[root@server3 ~]# firewall-cmd --permanent --zone=public --add-service=high-availability

[root@server3 ~]# firewall-cmd --reload

 

Edit /etc/hosts

[root@server3 ~]# vi /etc/hosts

192.168.10.210 server2.example.com

192.168.10.220 server3.example.com

192.168.10.230 server4.example.com

 

Authenticate the hosts

[root@server3 ~]# pcs host auth server3.example.com server4.example.com

Username: hacluster

Password:

server4.example.com: Authorized

server3.example.com: Authorized

 

Set up the cluster

[root@server3 ~]# pcs cluster setup cluster server3.example.com server4.example.com

 

Sync the configuration

[root@server3 ~]# pcs cluster sync

server3.example.com: Succeeded

server4.example.com: Succeeded

 

Start the cluster

[root@server3 ~]# pcs cluster start --all

server3.example.com: Starting Cluster...

server4.example.com: Starting Cluster...

 

Check the corosync link status

[root@server3 ~]# corosync-cfgtool -s

Printing link status.

Local node ID 1

LINK ID 0

        addr    = 192.168.10.220

        status:

                nodeid  1:      link enabled:1  link connected:1

                nodeid  2:      link enabled:1  link connected:1

 

Check cluster membership

[root@server3 ~]# corosync-cmapctl | grep -i members

runtime.members.1.config_version (u64) = 0

runtime.members.1.ip (str) = r(0) ip(192.168.10.220)

runtime.members.1.join_count (u32) = 1

runtime.members.1.status (str) = joined

runtime.members.2.config_version (u64) = 0

runtime.members.2.ip (str) = r(0) ip(192.168.10.230)

runtime.members.2.join_count (u32) = 1

runtime.members.2.status (str) = joined

 

[root@server3 ~]# pcs status corosync

 

 

[root@server3 ~]# crm_verify -L -V

(unpack_resources)      error: Resource start-up disabled since no STONITH resources have been defined

(unpack_resources)      error: Either configure some or disable STONITH with the stonith-enabled option

(unpack_resources)      error: NOTE: Clusters with shared data need STONITH to ensure data integrity

 

[root@server3 ~]# pcs property set stonith-enabled=false

[root@server3 ~]# crm_verify -L -V

[root@server3 ~]#

 

 

 

[root@server3 ~]# pcs status

Cluster name: cluster

Cluster Summary:

  * Stack: corosync

  * Current DC: server4.example.com (version 2.0.4-6.el8-2deceaa3ae) - partition with quorum

  * Last updated: Tue Jan  5 19:29:49 2021

  * Last change:  Tue Jan  5 19:29:25 2021 by root via cibadmin on server3.example.com

  * 2 nodes configured

  * 0 resource instances configured

 

Node List:

  * Online: [ server3.example.com server4.example.com ]

 

Full List of Resources:

  * No resources

 

Daemon Status:

  corosync: active/disabled

  pacemaker: active/disabled

  pcsd: active/enabled

 

[root@server3 ~]# systemctl enable pacemaker

Created symlink /etc/systemd/system/multi-user.target.wants/pacemaker.service → /usr/lib/systemd/system/pacemaker.service.

[root@server3 ~]# systemctl enable corosync

Created symlink /etc/systemd/system/multi-user.target.wants/corosync.service → /usr/lib/systemd/system/corosync.service.

[root@server3 ~]#
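Enabling corosync and pacemaker unit by unit works, but pcs can enable both services on every node at once; a one-line equivalent:

pcs cluster enable --all    # enable corosync and pacemaker at boot on all cluster nodes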

 

[root@server3 ~]# dnf -y install lvm2-lockd gfs2-utils

 

[root@server3 ~]# pcs property set no-quorum-policy=freeze

[root@server3 ~]# pcs property show

 

[root@server3 ~]# pcs resource create dlm --group locking ocf:pacemaker:controld op monitor interval=30s on-fail=ignore

[root@server3 ~]# pcs resource clone locking interleave=true

[root@server3 ~]# pcs resource create lvmlockd --group locking ocf:heartbeat:lvmlockd op monitor interval=30s on-fail=ignore

 

[root@server3 ~]# pcs status --full   // if you see failures like the ones below

...

Failed Resource Actions:

  * dlm_start_0 on server3.example.com 'not configured' (6):

....

 

Enable and start dlm on both server3 and server4

[root@server3 ~]# systemctl status dlm

[root@server3 ~]# systemctl start dlm

[root@server3 ~]# systemctl enable dlm

 

If there are still failed resource actions, reboot (a reboot resolved it here).
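Before falling back to a reboot, it may be enough to clear the recorded failure once dlm is actually running, so pacemaker re-probes and retries the resource; a sketch:

pcs resource cleanup dlm    # forget the dlm failure history and re-probe
pcs status --full           # the locking clone should now start on both nodes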

 

On both server3 and server4:

[root@server3 ~]# vi /etc/lvm/lvm.conf

locking_type = 1

use_lvmlockd = 1

[root@server3 ~]# systemctl start lvmlockd

[root@server3 ~]# systemctl status lvmlockd
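To confirm the lvm.conf change really took effect, lvmconfig (part of lvm2) prints the merged configuration value; a quick check:

lvmconfig global/use_lvmlockd    # should report use_lvmlockd=1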

 

 

 

[root@server3 ~]# pcs status

Cluster name: cluster

Cluster Summary:

  * Stack: corosync

  * Current DC: server4.example.com (version 2.0.4-6.el8-2deceaa3ae) - partition with quorum

  * Last updated: Tue Jan  5 20:38:47 2021

  * Last change:  Tue Jan  5 20:38:43 2021 by root via cibadmin on server3.example.com

  * 2 nodes configured

  * 4 resource instances configured

 

Node List:

  * Online: [ server3.example.com server4.example.com ]

 

Full List of Resources:

  * Clone Set: locking-clone [locking]:

    * Started: [ server3.example.com server4.example.com ]

 

Daemon Status:

  corosync: active/enabled

  pacemaker: active/enabled

  pcsd: active/enabled

 

 

[root@server3 ~]# pcs property set stonith-enabled=false

 

 

 

-------------------------------------------------------

iSCSI setup

 

[root@server3 ~]# dnf -y install iscsi-initiator-utils

[root@server3 ~]# vi /etc/iscsi/initiatorname.iscsi

InitiatorName=iqn.2021-01.com.example.server3:server3

InitiatorName=iqn.2021-01.com.example.server4:server4
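Note: /etc/iscsi/initiatorname.iscsi normally contains a single InitiatorName per host, and the target only accepts IQNs that have a matching ACL (the targetcli session on server2 above only created an ACL for the server3 IQN). A cleaner layout would be one unique IQN per node and one ACL per IQN on the target, roughly:

# server3: /etc/iscsi/initiatorname.iscsi
InitiatorName=iqn.2021-01.com.example.server3:server3

# server4: /etc/iscsi/initiatorname.iscsi
InitiatorName=iqn.2021-01.com.example.server4:server4

# on the target (server2), add an ACL for the server4 IQN as well:
targetcli /iscsi/iqn.2021-01.com.example.server2:disk1/tpg1/acls create iqn.2021-01.com.example.server4:server4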

 

[root@server3 ~]# systemctl restart iscsid

[root@server3 ~]# systemctl enable iscsid

 

[root@server3 ~]# iscsiadm -m discovery -t st -p 192.168.10.210

[root@server3 ~]# iscsiadm -m node -T iqn.2021-01.com.example.server2:disk1 -p 192.168.10.210:3260 -l

[root@server3 ~]# lsblk
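To verify the session and make sure the LUN is reattached after a reboot, something like the following can be used (node.startup is typically already automatic after a successful login):

iscsiadm -m session -P 1    # show the active session and the attached disk
iscsiadm -m node -T iqn.2021-01.com.example.server2:disk1 -p 192.168.10.210:3260 \
    --op update -n node.startup -v automatic    # log in automatically at boot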

 

If you hit an error like the one below...

[root@server3 ~]# pvcreate /dev/sdb1

  WARNING: lvmlockd process is not running.

  Global lock failed: check that lvmlockd is running.

 

[root@server3 ~]# fdisk /dev/sdb   // delete the partition with fdisk and repartition

d -> n -> p -> Enter, Enter, Enter -> w

 

[root@server3 ~]# lvmlockd

[root@server3 ~]# pvcreate /dev/sdb

WARNING: dos signature detected on /dev/sdb at offset 510. Wipe it? [y/n]: y

  Wiping dos signature on /dev/sdb.

  Physical volume "/dev/sdb" successfully created.

 

[root@server3 ~]# vgcreate --shared svg /dev/sdb

  Volume group "svg" successfully created

  VG svg starting dlm lockspace

  Starting locking.  Waiting until locks are ready...

 

 

Apply on server4 only (server3 already started the dlm lockspace when the VG was created)

[root@server4 ~]# vgchange --lock-start svg

  VG svg starting dlm lockspace

  Starting locking.  Waiting until locks are ready...

 

 

 

 

[root@server3 ~]# lvcreate --activate sy -l 100%FREE -n lv0 svg

  Logical volume "lv0" created.

 

[root@server3 ~]# mkfs.gfs2 -j2 -p lock_dlm -t cluster:gfs2-server2 /dev/svg/lv0

/dev/svg/lv0 is a symbolic link to /dev/dm-3

This will destroy any data on /dev/dm-3

Are you sure you want to proceed? [y/n] y

Discarding device contents (may take a while on large devices): Done

Adding journals: Done

Building resource groups: Done

Creating quota file: Done

Writing superblock and syncing: Done

Device:                    /dev/svg/lv0

Block size:                4096

Device size:               0.99 GB (260096 blocks)

Filesystem size:           0.99 GB (260092 blocks)

Journals:                  2

Journal size:              8MB

Resource groups:           6

Locking protocol:          "lock_dlm"

Lock table:                "cluster:gfs2-server2"

UUID:                      361b1f8b-8bc7-42b6-9a47-26d2fee0cfe0
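-j2 creates one journal per node that will mount the filesystem. If a third node is ever added to the cluster, journals can be added later while the filesystem is mounted (gfs2_jadd comes with gfs2-utils); a sketch:

gfs2_jadd -j 1 /gfs2    # add one extra journal to the mounted GFS2 filesystem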

 

[root@server3 ~]# lsblk -f

NAME        FSTYPE      LABEL                UUID                                   MOUNTPOINT

sda

├─sda1      ext4                             9a7d9b4c-7026-46ac-863d-07330fd8ceac   /boot

└─sda2      LVM2_member                      9rt6BW-jskp-jqK9-OasB-m3dE-zcHd-hbazk1

  ├─cl-root xfs                              87a1ed4a-93a0-4642-b5d9-39900b923a11   /

  ├─cl-swap swap                             9f8a7bc3-8cc1-4988-84dd-1fb8deb8422b   [SWAP]

  └─cl-home xfs                              35b8c783-732f-4e72-8af1-06c4f0c64f48   /home

sdb         LVM2_member                      GhX2jx-MdKd-7Yvq-DFwZ-Di4t-ErXh-qN6ucj

└─svg-lv0   gfs2        cluster:gfs2-server2 361b1f8b-8bc7-42b6-9a47-26d2fee0cfe0

 

 

 

[root@server3 ~]# pcs status

Cluster name: cluster

Cluster Summary:

  * Stack: corosync

  * Current DC: server4.example.com (version 2.0.4-6.el8-2deceaa3ae) - partition with quorum

  * Last updated: Tue Jan  5 21:42:13 2021

  * Last change:  Tue Jan  5 20:38:43 2021 by root via cibadmin on server3.example.com

  * 2 nodes configured

  * 4 resource instances configured

 

Node List:

  * Online: [ server3.example.com server4.example.com ]

 

Full List of Resources:

  * Clone Set: locking-clone [locking]:

    * Started: [ server3.example.com server4.example.com ]

 

Failed Resource Actions:

  * lvmlockd_monitor_30000 on server4.example.com 'not running' (7): call=29, status='complete', exitreason='', last-rc-change='2021-01-05 21:13:49 +09:00', queued=0ms, exec=0ms

  * lvmlockd_monitor_30000 on server3.example.com 'not running' (7): call=14, status='complete', exitreason='', last-rc-change='2021-01-05 21:14:19 +09:00', queued=0ms, exec=0ms

 

Daemon Status:

  corosync: active/enabled

  pacemaker: active/enabled

  pcsd: active/enabled

[root@server3 ~]#

 

 

[root@server3 ~]# pcs resource create sharedlv1 --group shared_vg1 ocf:heartbeat:LVM-activate lvname=lv0 vgname=svg activation_mode=shared vg_access_mode=lvmlockd

[root@server3 ~]# pcs resource clone shared_vg1 interleave=true

[root@server3 ~]# pcs constraint order start locking-clone then shared_vg1-clone

Adding locking-clone shared_vg1-clone (kind: Mandatory) (Options: first-action=start then-action=start)

 

Colocate with the locking clone on the same host

[root@server3 ~]# pcs constraint colocation add shared_vg1-clone with locking-clone
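At this point the ordering and colocation rules can be double-checked before the filesystem resource is added:

pcs constraint    # lists the order and colocation constraints just created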

 

Create the mount point

[root@server3 ~]# mkdir /gfs2

 

Mount it as a cluster resource (instead of an fstab entry)

[root@server3 ~]# pcs resource create sharedfs1 --group shared_vg1 ocf:heartbeat:Filesystem device="/dev/svg/lv0" directory="/gfs2" fstype="gfs2" options=noatime op monitor interval=10s on-fail=ignore

 

Verify

[root@server3 ~]# pcs status

Cluster name: cluster

Cluster Summary:

  * Stack: corosync

  * Current DC: server4.example.com (version 2.0.4-6.el8-2deceaa3ae) - partition with quorum

  * Last updated: Tue Jan  5 21:50:05 2021

  * Last change:  Tue Jan  5 21:49:55 2021 by root via cibadmin on server3.example.com

  * 2 nodes configured

  * 8 resource instances configured

 

Node List:

  * Online: [ server3.example.com server4.example.com ]

 

Full List of Resources:

  * Clone Set: locking-clone [locking]:

    * Started: [ server3.example.com server4.example.com ]

  * Clone Set: shared_vg1-clone [shared_vg1]:

    * Started: [ server3.example.com server4.example.com ]

 

Failed Resource Actions:

  * lvmlockd_monitor_30000 on server4.example.com 'not running' (7): call=29, status='complete', exitreason='', last-rc-change='2021-01-05 21:48:25 +09:00', queued=0ms, exec=0ms

  * lvmlockd_monitor_30000 on server3.example.com 'not running' (7): call=14, status='complete', exitreason='', last-rc-change='2021-01-05 21:47:54 +09:00', queued=0ms, exec=0ms

 

Daemon Status:

  corosync: active/enabled

  pacemaker: active/enabled

  pcsd: active/enabled

 

 

[root@server3 ~]# df -Th

Filesystem          Type      Size  Used Avail Use% Mounted on

devtmpfs            devtmpfs  443M     0  443M   0% /dev

tmpfs               tmpfs     471M   48M  424M  10% /dev/shm

tmpfs               tmpfs     471M  7.0M  464M   2% /run

tmpfs               tmpfs     471M     0  471M   0% /sys/fs/cgroup

/dev/mapper/cl-root xfs        92G  6.0G   86G   7% /

/dev/mapper/cl-home xfs       5.0G   69M  5.0G   2% /home

/dev/sda1           ext4      976M  236M  674M  26% /boot

tmpfs               tmpfs      95M     0   95M   0% /run/user/0

/dev/mapper/svg-lv0 gfs2     1016M   19M  998M   2% /gfs2

 

 

The symptom below

[root@server3 ~]# pcs status

.....

Failed Resource Actions:

  * lvmlockd_monitor_30000 on server4.example.com 'not running' (7): call=29, status='complete', exitreason='', last-rc-change='2021-01-05 21:48:25 +09:00', queued=0ms, exec=0ms

  * lvmlockd_monitor_30000 on server3.example.com 'not running' (7): call=14, status='complete', exitreason='', last-rc-change='2021-01-05 21:47:54 +09:00', queued=0ms, exec=0ms

 

.....

 

Fix // lvmlockd fails to start because of the lvm.conf settings (check both server3 and server4)

[root@server3 ~]# vi /etc/lvm/lvm.conf   // enable in the config

locking_type = 1

use_lvmlockd = 1

[root@server3 ~]# systemctl start lvmlockd   // start the daemon

 

[root@server3 ~]# pcs resource cleanup    // clear the failure history

Cleaned up all resources on all nodes

Waiting for 2 replies from the controller.. OK

 

Verify

[root@server3 ~]# pcs status

Cluster name: cluster

Cluster Summary:

  * Stack: corosync

  * Current DC: server4.example.com (version 2.0.4-6.el8-2deceaa3ae) - partition with quorum

  * Last updated: Tue Jan  5 21:55:46 2021

  * Last change:  Tue Jan  5 21:55:34 2021 by hacluster via crmd on server4.example.com

  * 2 nodes configured

  * 8 resource instances configured

 

Node List:

  * Online: [ server3.example.com server4.example.com ]

 

Full List of Resources:

  * Clone Set: locking-clone [locking]:

    * Started: [ server3.example.com server4.example.com ]

  * Clone Set: shared_vg1-clone [shared_vg1]:

    * Started: [ server3.example.com server4.example.com ]

 

Daemon Status:

  corosync: active/enabled

  pacemaker: active/enabled

  pcsd: active/enabled

 

 

 

##############################################################

################     SERVER 4 SETUP     #############################

 

[root@server4 ~]# yum install yum-utils

[root@server4 ~]# yumdownloader --source dlm

[root@server4 ~]# yum -y install rpm-build libxml2-devel systemd-devel glibc-kernheaders gcc make

[root@server4 ~]# dnf config-manager --set-enabled ha

[root@server4 ~]#  dnf -y install pacemaker fence-agents-all gfs2-utils pcp-zeroconf corosynclib-devel pacemaker-libs-devel

[root@server4 ~]# rpmbuild --rebuild dlm-4.0.9-3.el8.src.rpm

[root@server4 ~]# rpm -ivh /root/rpmbuild/RPMS/x86_64/dlm-4.0.9-3.el8.x86_64.rpm --nodeps

 

[root@server4 ~]# dnf repolist

repo id                                                 repo name

appstream                                               CentOS Linux 8 - AppStream

baseos                                                  CentOS Linux 8 - BaseOS

epel                                                    Extra Packages for Enterprise Linux 8 - x86_64

epel-modular                                            Extra Packages for Enterprise Linux Modular 8 - x86_64

extras                                                  CentOS Linux 8 - Extras

ha                                                      CentOS Linux 8 - HighAvailability

powertools                                              CentOS Linux 8 - PowerTools

 

[root@server4 ~]# passwd hacluster

Changing password for user hacluster.

New password:

BAD PASSWORD: The password contains the user name in some form

Retype new password:

passwd: all authentication tokens updated successfully.

 

[root@server4 ~]# dnf -y install pcs

 

[root@server4 ~]# systemctl start pcsd

[root@server4 ~]# systemctl enable pcsd

 

[root@server4 ~]# firewall-cmd --permanent --zone=public --add-port=2224/tcp

[root@server4 ~]# firewall-cmd --permanent --zone=public --add-port=5405/udp

[root@server4 ~]# firewall-cmd --permanent --zone=public --add-service=high-availability

[root@server4 ~]# firewall-cmd --reload

 

 

 

[root@server4 ~]# vi /etc/hosts

192.168.10.210 server2.example.com

192.168.10.220 server3.example.com

192.168.10.230 server4.example.com

 

 

 

[root@server4 ~]# pcs status corosync

 

Membership information

----------------------

    Nodeid      Votes Name

         1          1 server3.example.com

         2          1 server4.example.com (local)

 

[root@server4 ~]# systemctl enable pacemaker

[root@server4 ~]# systemctl enable corosync

 

 

[root@server4 ~]# dnf -y install lvm2-lockd gfs2-utils

 

[root@server4 ~]# systemctl enable dlm

[root@server4 ~]# systemctl start dlm

 

[root@server4 ~]# vi /etc/lvm/lvm.conf

locking_type = 1

use_lvmlockd = 1

 

[root@server4 ~]# systemctl enable lvmlockd

[root@server4 ~]# systemctl start lvmlockd

 

 

---------------------------------------

iSCSI setup

 

[root@server4 ~]# dnf -y install iscsi-initiator-utils

[root@server4 ~]# vi /etc/iscsi/initiatorname.iscsi

InitiatorName=iqn.2021-01.com.example.server3:server3

InitiatorName=iqn.2021-01.com.example.server4:server4

 

[root@server4 ~]# systemctl restart iscsid

[root@server4 ~]# systemctl enable iscsid

 

[root@server4 ~]# iscsiadm -m discovery -t st -p 192.168.10.210

[root@server4 ~]# iscsiadm -m node -T iqn.2021-01.com.example.server2:disk1 -p 192.168.10.210:3260 -l

 

 

Run vgchange --lock-start on server4 only (see the server3 section above)

 

[root@server4 ~]# mkdir /gfs2

 

[root@server4 ~]# systemctl start lvmlockd

[root@server4 ~]# systemctl status lvmlockd

lvmlockd.service - LVM lock daemon

   Loaded: loaded (/usr/lib/systemd/system/lvmlockd.service; enabled; vendor preset: disabled)

   Active: active (running) since Tue 2021-01-05 21:53:52 KST; 16s ago

     Docs: man:lvmlockd(8)

 Main PID: 48248 (lvmlockd)

    Tasks: 3 (limit: 5665)

   Memory: 2.5M

   CGroup: /system.slice/lvmlockd.service

           └─48248 /usr/sbin/lvmlockd --foreground

 

Jan 05 21:53:52 server4.example.com systemd[1]: Starting LVM lock daemon...

Jan 05 21:53:52 server4.example.com lvmlockd[48248]: [D] creating /run/lvm/lvmlockd.socket

Jan 05 21:53:52 server4.example.com lvmlockd[48248]: Socket /run/lvm/lvmlockd.socket already in use

Jan 05 21:53:52 server4.example.com lvmlockd[48248]: 1609851232 lvmlockd started

Jan 05 21:53:52 server4.example.com systemd[1]: Started LVM lock daemon.

[root@server4 ~]#

 

[root@server4 ~]# pcs status

Cluster name: cluster

Cluster Summary:

  * Stack: corosync

  * Current DC: server4.example.com (version 2.0.4-6.el8-2deceaa3ae) - partition with quorum

  * Last updated: Tue Jan  5 21:56:06 2021

  * Last change:  Tue Jan  5 21:55:34 2021 by hacluster via crmd on server4.example.com

  * 2 nodes configured

  * 8 resource instances configured

 

Node List:

  * Online: [ server3.example.com server4.example.com ]

 

Full List of Resources:

  * Clone Set: locking-clone [locking]:

    * Started: [ server3.example.com server4.example.com ]

  * Clone Set: shared_vg1-clone [shared_vg1]:

    * Started: [ server3.example.com server4.example.com ]

 

Daemon Status:

  corosync: active/enabled

  pacemaker: active/enabled

  pcsd: active/enabled

[root@server4 ~]#

 

 

 

[root@server4 ~]# df -Th

Filesystem          Type      Size  Used Avail Use% Mounted on

devtmpfs            devtmpfs  443M     0  443M   0% /dev

tmpfs               tmpfs     471M   63M  409M  14% /dev/shm

tmpfs               tmpfs     471M  7.0M  464M   2% /run

tmpfs               tmpfs     471M     0  471M   0% /sys/fs/cgroup

/dev/mapper/cl-root xfs        92G  6.1G   86G   7% /

/dev/mapper/cl-home xfs       5.0G   69M  5.0G   2% /home

/dev/sda1           ext4      976M  236M  674M  26% /boot

tmpfs               tmpfs      95M     0   95M   0% /run/user/0

/dev/mapper/svg-lv0 gfs2     1016M   19M  998M   2% /gfs2

 

 

 

###############################################

###############    TEST     ######################

##############################################

Testing

 

Test: create files, then reboot server4

[root@server4 ~]# touch /gfs2/file{1..5}

[root@server4 ~]# ls -l /gfs2/

total 40

-rw-r--r--. 1 root root 0 Jan  5 22:25 file1

-rw-r--r--. 1 root root 0 Jan  5 22:25 file2

-rw-r--r--. 1 root root 0 Jan  5 22:25 file3

-rw-r--r--. 1 root root 0 Jan  5 22:25 file4

-rw-r--r--. 1 root root 0 Jan  5 22:25 file5

[root@server4 ~]# pcs status

Cluster name: cluster

Cluster Summary:

  * Stack: corosync

  * Current DC: server4.example.com (version 2.0.4-6.el8-2deceaa3ae) - partition with quorum

  * Last updated: Tue Jan  5 22:25:52 2021

  * Last change:  Tue Jan  5 21:55:34 2021 by hacluster via crmd on server4.example.com

  * 2 nodes configured

  * 8 resource instances configured

 

Node List:

  * Online: [ server3.example.com server4.example.com ]

 

Full List of Resources:

  * Clone Set: locking-clone [locking]:

    * Started: [ server3.example.com server4.example.com ]

  * Clone Set: shared_vg1-clone [shared_vg1]:

    * Started: [ server3.example.com server4.example.com ]

 

Daemon Status:

  corosync: active/enabled

  pacemaker: active/enabled

  pcsd: active/enabled

 

Reboot here

[root@server4 ~]# reboot

 

 

 

 

Check on server3 (the directory remains usable here without interruption)

[root@server3 ~]# pcs status

Cluster name: cluster

Cluster Summary:

  * Stack: corosync

  * Current DC: server4.example.com (version 2.0.4-6.el8-2deceaa3ae) - partition with quorum

  * Last updated: Tue Jan  5 22:26:24 2021

  * Last change:  Tue Jan  5 21:55:34 2021 by hacluster via crmd on server4.example.com

  * 2 nodes configured

  * 8 resource instances configured

 

Node List:

  * Online: [ server3.example.com server4.example.com ]

 

Full List of Resources:

  * Clone Set: locking-clone [locking]:

    * Resource Group: locking:0:

      * dlm     (ocf::pacemaker:controld):       Stopping server4.example.com

      * lvmlockd        (ocf::heartbeat:lvmlockd):       Stopped

    * Started: [ server3.example.com ]

  * Clone Set: shared_vg1-clone [shared_vg1]:

    * Started: [ server3.example.com ]

    * Stopped: [ server4.example.com ]

 

Daemon Status:

  corosync: active/enabled

  pacemaker: active/enabled

  pcsd: active/enabled

[root@server3 ~]# df -Th

Filesystem          Type      Size  Used Avail Use% Mounted on

devtmpfs            devtmpfs  443M     0  443M   0% /dev

tmpfs               tmpfs     471M   48M  424M  10% /dev/shm

tmpfs               tmpfs     471M  7.0M  464M   2% /run

tmpfs               tmpfs     471M     0  471M   0% /sys/fs/cgroup

/dev/mapper/cl-root xfs        92G  6.0G   86G   7% /

/dev/mapper/cl-home xfs       5.0G   69M  5.0G   2% /home

/dev/sda1           ext4      976M  236M  674M  26% /boot

tmpfs               tmpfs      95M     0   95M   0% /run/user/0

/dev/mapper/svg-lv0 gfs2     1016M   19M  998M   2% /gfs2

[root@server3 ~]# ls -l /gfs2

total 40

-rw-r--r--. 1 root root 0 Jan  5 22:25 file1

-rw-r--r--. 1 root root 0 Jan  5 22:25 file2

-rw-r--r--. 1 root root 0 Jan  5 22:25 file3

-rw-r--r--. 1 root root 0 Jan  5 22:25 file4

-rw-r--r--. 1 root root 0 Jan  5 22:25 file5

[root@server3 ~]#

[root@server3 ~]#

[root@server3 ~]#

[root@server3 ~]# touch /gfs2/file{5..10}

[root@server3 ~]#

[root@server3 ~]#

[root@server3 ~]# ls -l /gfs2/

total 80

-rw-r--r--. 1 root root 0 Jan  5 22:25 file1

-rw-r--r--. 1 root root 0 Jan  5 22:27 file10

-rw-r--r--. 1 root root 0 Jan  5 22:25 file2

-rw-r--r--. 1 root root 0 Jan  5 22:25 file3

-rw-r--r--. 1 root root 0 Jan  5 22:25 file4

-rw-r--r--. 1 root root 0 Jan  5 22:27 file5

-rw-r--r--. 1 root root 0 Jan  5 22:27 file6

-rw-r--r--. 1 root root 0 Jan  5 22:27 file7

-rw-r--r--. 1 root root 0 Jan  5 22:27 file8

-rw-r--r--. 1 root root 0 Jan  5 22:27 file9

[root@server3 ~]#

 

 

 

########################################################

################    RECOVERY         #########################

#######################################################

 

Recovery

 

 

From server3, check the server4 failure (since server4 was rebooted)

[root@server3 ~]# pcs status

Cluster name: cluster

Cluster Summary:

  * Stack: corosync

  * Current DC: server4.example.com (version 2.0.4-6.el8-2deceaa3ae) - partition with quorum

  * Last updated: Tue Jan  5 22:29:15 2021

  * Last change:  Tue Jan  5 21:55:34 2021 by hacluster via crmd on server4.example.com

  * 2 nodes configured

  * 8 resource instances configured (1 BLOCKED from further action due to failure)

 

Node List:

  * Online: [ server3.example.com server4.example.com ]

 

Full List of Resources:

  * Clone Set: locking-clone [locking]:

    * Resource Group: locking:0:

      * dlm     (ocf::pacemaker:controld):       FAILED server4.example.com (blocked)

      * lvmlockd        (ocf::heartbeat:lvmlockd):       Stopped

    * Started: [ server3.example.com ]

  * Clone Set: shared_vg1-clone [shared_vg1]:

    * Started: [ server3.example.com ]

    * Stopped: [ server4.example.com ]

 

Failed Resource Actions:

  * dlm_stop_0 on server4.example.com 'error' (1): call=68, status='Timed Out', exitreason='', last-rc-change='2021-01-05 22:26:22 +09:00', queued=0ms, exec=100003ms

 

Daemon Status:

  corosync: active/enabled

  pacemaker: active/enabled

  pcsd: active/enabled

 

[root@server3 ~]# pcs status --full

Cluster name: cluster

Cluster Summary:

  * Stack: corosync

  * Current DC: server4.example.com (2) (version 2.0.4-6.el8-2deceaa3ae) - partition with quorum

  * Last updated: Tue Jan  5 22:30:34 2021

  * Last change:  Tue Jan  5 21:55:34 2021 by hacluster via crmd on server4.example.com

  * 2 nodes configured

  * 8 resource instances configured (1 BLOCKED from further action due to failure)

 

Node List:

  * Online: [ server3.example.com (1) server4.example.com (2) ]

 

Full List of Resources:

  * Clone Set: locking-clone [locking]:

    * Resource Group: locking:0:

      * dlm     (ocf::pacemaker:controld):       FAILED server4.example.com (blocked)

      * lvmlockd        (ocf::heartbeat:lvmlockd):       Stopped

    * Resource Group: locking:1:

      * dlm     (ocf::pacemaker:controld):       Started server3.example.com

      * lvmlockd        (ocf::heartbeat:lvmlockd):       Started server3.example.com

  * Clone Set: shared_vg1-clone [shared_vg1]:

    * Resource Group: shared_vg1:0:

      * sharedlv1       (ocf::heartbeat:LVM-activate):   Started server3.example.com

      * sharedfs1       (ocf::heartbeat:Filesystem):     Started server3.example.com

    * Resource Group: shared_vg1:1:

      * sharedlv1       (ocf::heartbeat:LVM-activate):   Stopped

      * sharedfs1       (ocf::heartbeat:Filesystem):     Stopped

 

Migration Summary:

  * Node: server4.example.com (2):

    * dlm: migration-threshold=1000000 fail-count=1000000 last-failure='Tue Jan  5 22:28:02 2021'

 

Failed Resource Actions:

  * dlm_stop_0 on server4.example.com 'error' (1): call=68, status='Timed Out', exitreason='', last-rc-change='2021-01-05 22:26:22 +09:00', queued=0ms, exec=100003ms

 

Tickets:

 

PCSD Status:

  server3.example.com: Online

  server4.example.com: Offline

 

Daemon Status:

  corosync: active/enabled

  pacemaker: active/enabled

  pcsd: active/enabled

 

 

 

 

 

 

 

 

 

Put server4 into maintenance mode (without this, server4 does not come back up: boot hangs at the spinner, SSH is unreachable, and no console login prompt appears).

As soon as the node is put into maintenance mode, server4 finishes rebooting, SSH becomes reachable, and the console login prompt appears.

[root@server3 ~]# pcs node maintenance server4.example.com

[root@server3 ~]#

[root@server3 ~]#

[root@server3 ~]# pcs status

Cluster name: cluster

Cluster Summary:

  * Stack: corosync

  * Current DC: server3.example.com (version 2.0.4-6.el8-2deceaa3ae) - partition with quorum

  * Last updated: Tue Jan  5 22:32:08 2021

  * Last change:  Tue Jan  5 22:32:01 2021 by root via cibadmin on server3.example.com

  * 2 nodes configured

  * 8 resource instances configured

 

Node List:

  * Node server4.example.com: OFFLINE (maintenance)

  * Online: [ server3.example.com ]

 

Full List of Resources:

  * Clone Set: locking-clone [locking]:

    * Started: [ server3.example.com ]

    * Stopped: [ server4.example.com ]

  * Clone Set: shared_vg1-clone [shared_vg1]:

    * Started: [ server3.example.com ]

    * Stopped: [ server4.example.com ]

 

Failed Fencing Actions:

  * reboot of server4.example.com failed: delegate=, client=stonith-api.60589, origin=server3.example.com, last-failed='2021-01-05 22:32:07 +09:00'

 

Daemon Status:

  corosync: active/enabled

  pacemaker: active/enabled

  pcsd: active/enabled

[root@server3 ~]#
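The 'Failed Fencing Actions' entry above is a side effect of running with stonith-enabled=false: with shared GFS2 storage the cluster wants to fence a node whose dlm cannot be stopped cleanly, and without working fencing a failed stop simply leaves the resource blocked, which is why server4 hung until it was put into maintenance mode. For production use, configure a real fence device instead of disabling STONITH; the installed agents and their parameters can be inspected like this (sketch only, the right agent depends on the platform, e.g. fence_vmware_soap for VMware guests like these):

pcs stonith list                         # fence agents available (from fence-agents-all)
pcs stonith describe fence_vmware_soap   # parameters required by a particular agent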

 

 

 

#################################

Once login is available, recover server4 ##########

################################

 

 

[root@server4 ~]# pcs status

Error: error running crm_mon, is pacemaker running?

  Could not connect to the CIB: Transport endpoint is not connected

  crm_mon: Error: cluster is not available on this node

[root@server4 ~]# systemctl status pcsd

pcsd.service - PCS GUI and remote configuration interface

   Loaded: loaded (/usr/lib/systemd/system/pcsd.service; enabled; vendor preset: disabled)

   Active: inactive (dead)

     Docs: man:pcsd(8)

           man:pcs(8)

 

[root@server4 ~]# systemctl start pcsd

[root@server4 ~]# systemctl status pcsd

pcsd.service - PCS GUI and remote configuration interface

   Loaded: loaded (/usr/lib/systemd/system/pcsd.service; enabled; vendor preset: disabled)

   Active: active (running) since Tue 2021-01-05 22:35:04 KST; 5s ago

     Docs: man:pcsd(8)

           man:pcs(8)

 Main PID: 3928 (pcsd)

    Tasks: 1 (limit: 5665)

   Memory: 38.3M

   CGroup: /system.slice/pcsd.service

           └─3928 /usr/libexec/platform-python -Es /usr/sbin/pcsd

 

[root@server4 ~]# pcs cluster sync

server3.example.com: Succeeded

server4.example.com: Succeeded

 

[root@server4 ~]# pcs cluster start server4.example.com

server4.example.com: Starting Cluster...

 

 

[root@server4 ~]# corosync-cfgtool -s

Printing link status.

Local node ID 2

LINK ID 0

        addr    = 192.168.10.230

        status:

                nodeid  1:      link enabled:1  link connected:1

                nodeid  2:      link enabled:1  link connected:1

 

[root@server4 ~]# systemctl status pacemaker

pacemaker.service - Pacemaker High Availability Cluster Manager

   Loaded: loaded (/usr/lib/systemd/system/pacemaker.service; enabled; vendor preset: disabled)

   Active: active (running) since Tue 2021-01-05 22:36:35 KST; 27s ago

     Docs: man:pacemakerd

           https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html-single/Pacemaker_Explained/index.html

 Main PID: 4571 (pacemakerd)

    Tasks: 7

   Memory: 40.9M

   CGroup: /system.slice/pacemaker.service

           ├─4571 /usr/sbin/pacemakerd -f

           ├─4572 /usr/libexec/pacemaker/pacemaker-based

           ├─4573 /usr/libexec/pacemaker/pacemaker-fenced

           ├─4574 /usr/libexec/pacemaker/pacemaker-execd

           ├─4575 /usr/libexec/pacemaker/pacemaker-attrd

           ├─4576 /usr/libexec/pacemaker/pacemaker-schedulerd

           └─4577 /usr/libexec/pacemaker/pacemaker-controld

Jan 05 22:36:36 server4.example.com pacemaker-controld[4577]:  notice: Node server3.example.com state is now member

Jan 05 22:36:36 server4.example.com pacemaker-controld[4577]:  notice: Node server4.example.com state is now member

Jan 05 22:36:36 server4.example.com pacemaker-controld[4577]:  notice: Pacemaker controller successfully started and accepting connections

Jan 05 22:36:36 server4.example.com pacemaker-controld[4577]:  notice: State transition S_STARTING -> S_PENDING

Jan 05 22:36:37 server4.example.com pacemaker-controld[4577]:  notice: Fencer successfully connected

Jan 05 22:36:37 server4.example.com pacemaker-controld[4577]:  notice: State transition S_PENDING -> S_NOT_DC

Jan 05 22:36:39 server4.example.com pacemaker-controld[4577]:  notice: Result of probe operation for dlm on server4.example.com: not running

Jan 05 22:36:39 server4.example.com pacemaker-controld[4577]:  notice: Result of probe operation for lvmlockd on server4.example.com: ok

Jan 05 22:36:39 server4.example.com pacemaker-controld[4577]:  notice: Result of probe operation for sharedlv1 on server4.example.com: not running

Jan 05 22:36:39 server4.example.com pacemaker-controld[4577]:  notice: Result of probe operation for sharedfs1 on server4.example.com: not running

 

 

 

 

 

[root@server4 ~]# pcs status

Cluster name: cluster

Cluster Summary:

  * Stack: corosync

  * Current DC: server3.example.com (version 2.0.4-6.el8-2deceaa3ae) - partition with quorum

  * Last updated: Tue Jan  5 22:37:12 2021

  * Last change:  Tue Jan  5 22:32:01 2021 by root via cibadmin on server3.example.com

  * 2 nodes configured

  * 8 resource instances configured

 

Node List:

  * Node server4.example.com: maintenance

  * Online: [ server3.example.com ]

 

Full List of Resources:

  * Clone Set: locking-clone [locking]:

    * Resource Group: locking:0:

      * dlm     (ocf::pacemaker:controld):       Stopped

      * lvmlockd        (ocf::heartbeat:lvmlockd):       Started server4.example.com (unmanaged)

    * Started: [ server3.example.com ]

  * Clone Set: shared_vg1-clone [shared_vg1]:

    * Started: [ server3.example.com ]

    * Stopped: [ server4.example.com ]

 

Failed Fencing Actions:

  * reboot of server4.example.com failed: delegate=, client=stonith-api.63365, origin=server3.example.com, last-failed='2021-01-05 22:37:12 +09:00'

 

Daemon Status:

  corosync: active/enabled

  pacemaker: active/enabled

  pcsd: active/enabled

 

 

 

 

 

 

 

[root@server4 ~]# systemctl start dlm

[root@server4 ~]# systemctl status dlm

dlm.service - dlm control daemon

   Loaded: loaded (/usr/lib/systemd/system/dlm.service; enabled; vendor preset: disabled)

   Active: active (running) since Tue 2021-01-05 22:38:14 KST; 5s ago

  Process: 5340 ExecStartPre=/sbin/modprobe dlm (code=exited, status=0/SUCCESS)

 Main PID: 5341 (dlm_controld)

    Tasks: 3 (limit: 5665)

   Memory: 4.9M

   CGroup: /system.slice/dlm.service

           ├─5341 /usr/sbin/dlm_controld --foreground

           └─5343 /usr/sbin/dlm_controld --foreground

 

Jan 05 22:38:14 server4.example.com systemd[1]: Starting dlm control daemon...

Jan 05 22:38:14 server4.example.com dlm_controld[5341]: 334 dlm_controld 4.0.9 started

Jan 05 22:38:14 server4.example.com systemd[1]: Started dlm control daemon.

 

 

Verify

[root@server4 ~]# systemctl status dlm

[root@server4 ~]# systemctl status pacemaker

[root@server4 ~]# systemctl status corosync

[root@server4 ~]# systemctl status lvmlockd

 

 

 

 

 

Cleaning the stonith history removes the failed fencing entries

[root@server4 ~]# pcs stonith history cleanup

cleaning up fencing-history for node *

[root@server4 ~]# pcs status

Cluster name: cluster

Cluster Summary:

  * Stack: corosync

  * Current DC: server3.example.com (version 2.0.4-6.el8-2deceaa3ae) - partition with quorum

  * Last updated: Tue Jan  5 22:41:14 2021

  * Last change:  Tue Jan  5 22:32:01 2021 by root via cibadmin on server3.example.com

  * 2 nodes configured

  * 8 resource instances configured

 

Node List:

  * Node server4.example.com: maintenance

  * Online: [ server3.example.com ]

 

Full List of Resources:

  * Clone Set: locking-clone [locking]:

    * Resource Group: locking:0:

      * dlm     (ocf::pacemaker:controld):       Stopped

      * lvmlockd        (ocf::heartbeat:lvmlockd):       Started server4.example.com (unmanaged)

    * Started: [ server3.example.com ]

  * Clone Set: shared_vg1-clone [shared_vg1]:

    * Started: [ server3.example.com ]

    * Stopped: [ server4.example.com ]

 

Daemon Status:

  corosync: active/enabled

  pacemaker: active/enabled

  pcsd: active/enabled

 

 

[root@server4 ~]# pcs resource cleanup

Cleaned up all resources on all nodes

 

 

 

 

 

 

 

As soon as server4 leaves maintenance mode, the shared volume is reattached

[root@server4 ~]# pcs node unmaintenance server4.example.com

[root@server4 ~]# pcs status

Cluster name: cluster

Cluster Summary:

  * Stack: corosync

  * Current DC: server3.example.com (version 2.0.4-6.el8-2deceaa3ae) - partition with quorum

  * Last updated: Tue Jan  5 22:43:23 2021

  * Last change:  Tue Jan  5 22:43:19 2021 by root via cibadmin on server4.example.com

  * 2 nodes configured

  * 8 resource instances configured

 

Node List:

  * Online: [ server3.example.com server4.example.com ]

 

Full List of Resources:

  * Clone Set: locking-clone [locking]:

    * Started: [ server3.example.com server4.example.com ]

  * Clone Set: shared_vg1-clone [shared_vg1]:

    * Resource Group: shared_vg1:0:

      * sharedlv1       (ocf::heartbeat:LVM-activate):   Starting server4.example.com

      * sharedfs1       (ocf::heartbeat:Filesystem):     Stopped

    * Started: [ server3.example.com ]

 

Daemon Status:

  corosync: active/enabled

  pacemaker: active/enabled

  pcsd: active/enabled

[root@server4 ~]#

[root@server4 ~]# pcs status

Cluster name: cluster

Cluster Summary:

  * Stack: corosync

  * Current DC: server3.example.com (version 2.0.4-6.el8-2deceaa3ae) - partition with quorum

  * Last updated: Tue Jan  5 22:43:40 2021

  * Last change:  Tue Jan  5 22:43:19 2021 by root via cibadmin on server4.example.com

  * 2 nodes configured

  * 8 resource instances configured

 

Node List:

  * Online: [ server3.example.com server4.example.com ]

 

Full List of Resources:

  * Clone Set: locking-clone [locking]:

    * Started: [ server3.example.com server4.example.com ]

  * Clone Set: shared_vg1-clone [shared_vg1]:

    * Started: [ server3.example.com server4.example.com ]

 

Daemon Status:

  corosync: active/enabled

  pacemaker: active/enabled

  pcsd: active/enabled

[root@server4 ~]#

[root@server4 ~]# df -Th

Filesystem          Type      Size  Used Avail Use% Mounted on

devtmpfs            devtmpfs  443M     0  443M   0% /dev

tmpfs               tmpfs     471M   51M  421M  11% /dev/shm

tmpfs               tmpfs     471M  7.0M  464M   2% /run

tmpfs               tmpfs     471M     0  471M   0% /sys/fs/cgroup

/dev/mapper/cl-root xfs        92G  6.0G   86G   7% /

/dev/mapper/cl-home xfs       5.0G   69M  5.0G   2% /home

/dev/sda1           ext4      976M  236M  674M  26% /boot

tmpfs               tmpfs      95M     0   95M   0% /run/user/0

/dev/mapper/svg-lv0 gfs2     1016M   19M  998M   2% /gfs2

[root@server4 ~]# ls -l /gfs2

total 80

-rw-r--r--. 1 root root 0 Jan  5 22:25 file1

-rw-r--r--. 1 root root 0 Jan  5 22:27 file10

-rw-r--r--. 1 root root 0 Jan  5 22:25 file2

-rw-r--r--. 1 root root 0 Jan  5 22:25 file3

-rw-r--r--. 1 root root 0 Jan  5 22:25 file4

-rw-r--r--. 1 root root 0 Jan  5 22:27 file5

-rw-r--r--. 1 root root 0 Jan  5 22:27 file6

-rw-r--r--. 1 root root 0 Jan  5 22:27 file7

-rw-r--r--. 1 root root 0 Jan  5 22:27 file8

-rw-r--r--. 1 root root 0 Jan  5 22:27 file9

[root@server4 ~]#

 

 

References

www.slideshare.net/ienvyou/rhel7centos7-pacemakerhav10 - RHEL7/CentOS7 Pacemaker-based HA system configuration v1.0 (오픈소스컨설팅)

www.headdesk.me/Redhat_Cluster_on_EL7/8#Build_dlm_package_on_CentOS_8 - Redhat Cluster on EL7/8 (building the dlm package on CentOS 8)

manpages.ubuntu.com/manpages/bionic/man8/lvmlockd.8.html - lvmlockd(8), LVM locking daemon

access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/configuring_and_managing_high_availability_clusters/assembly_configuring-gfs2-in-a-cluster-configuring-and-managing-high-availability-clusters#proc_configuring-gfs2-in-a-cluster.adoc-configuring-gfs2-cluster - Chapter 7. GFS2 file systems in a cluster (RHEL 8)

www.golinuxcloud.com/setup-high-availability-cluster-centos-8/ - 10 easy steps to setup High Availability Cluster CentOS 8 (GoLinuxCloud)

 

 

 

 
