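Disks are the most failure-prone consumable in any system, so replacing a hard disk is a routine operation on every storage platform.
How do you add or remove a disk on Exadata? I ran an experiment and recorded the whole procedure in detail for reference.
Exadata introduces three concepts in its disk layer: the LUN, the celldisk, and the griddisk. A brief introduction follows, as shown in the figure below:
When a physical disk is added to a cell, a LUN is created for it automatically. A celldisk is created on that LUN, one celldisk holds one or more griddisks, and the griddisks are what the ASM instance consumes. What follows is the complete record of cleanly removing one disk from a running Exadata machine and then adding it back.
(Note: every command at the "CellCLI>" prompt is run on a cell, and every command at the "SQL>" prompt is run on a compute node, i.e. a database node.)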
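The hierarchy described above can be written down as a small data model. This is only an illustrative sketch: the class names and the removal_order helper are hypothetical, and the values are copied from the CellCLI listings in this walkthrough.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class GridDisk:
    name: str
    size_gb: float
    asm_disk_name: Optional[str] = None  # empty until ASM has added the disk


@dataclass
class CellDisk:
    name: str
    lun: str
    grid_disks: List[GridDisk] = field(default_factory=list)


@dataclass
class Lun:
    name: str
    physical_drive: str
    cell_disk: Optional[CellDisk] = None


# Values taken from the listings in this post.
cd = CellDisk("CD_03_dmorlcel08", lun="0_3", grid_disks=[
    GridDisk("DATA_CD_03_dmorlcel08", 423.0, "DATA_CD_03_DMORLCEL08"),
    GridDisk("RECO_CD_03_dmorlcel08", 105.6875, "RECO_CD_03_DMORLCEL08"),
    GridDisk("DBFS_DG_CD_03_dmorlcel08", 29.125, "DBFS_DG_CD_03_DMORLCEL08"),
])
lun = Lun("0_3", physical_drive="28:3", cell_disk=cd)


def removal_order(lun: Lun) -> List[str]:
    """List what has to be dropped, top layer first."""
    steps: List[str] = []
    if lun.cell_disk is None:
        return steps
    for gd in lun.cell_disk.grid_disks:
        if gd.asm_disk_name:
            steps.append(f"ASM disk {gd.asm_disk_name}")
    steps += [f"griddisk {gd.name}" for gd in lun.cell_disk.grid_disks]
    steps.append(f"celldisk {lun.cell_disk.name}")
    return steps


for step in removal_order(lun):
    print(step)
```

The order it prints (ASM disks, then griddisks, then the celldisk) is the same order the steps below follow.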
1. First identify the LUN and the celldisk name of the disk to be removed. The following command shows both:
CellCLI> LIST CELLDISK WHERE name LIKE 'CD_03.*' detail
name: CD_03_dmorlcel08 <<<<<<<<< celldisk name
comment:
creationTime: 2012-04-20T04:37:54-04:00
deviceName: /dev/sdd
devicePartition: /dev/sdd
diskType: HardDisk
errorCount: 0
freeSpace: 0
id: 772cc8b1-ca81-468e-8b65-16e2724ef4da
interleaving: none
lun: 0_3 <<<<<<<<< located on LUN 0_3
raidLevel: 0
size: 557.859375G
status: normal
2. Query the griddisk information. It gives the ASM disk name and the diskgroup name for each griddisk.
The same command also shows that this celldisk carries three griddisks.
CellCLI> LIST GRIDDISK WHERE cellDisk LIKE 'CD_03.*' detail
name: DATA_CD_03_dmorlcel08
asmDiskGroupName: DATA
asmDiskName: DATA_CD_03_DMORLCEL08 <<<<<<<<<<<<<<<<<<<<<<<<<ASM disk name
availableTo:
cellDisk: CD_03_dmorlcel08
comment:
creationTime: 2012-04-20T04:39:48-04:00
diskType: HardDisk
errorCount: 0
id: 524f59b3-32af-4926-bc40-bf094d05a71d
offset: 32M
size: 423G
status: active
name: DBFS_DG_CD_03_dmorlcel08
asmDiskGroupName: DBFS_DG
asmDiskName: DBFS_DG_CD_03_DMORLCEL08 <<<<<<<<<<<<<<<<<<<<<<<<<ASM disk name
availableTo:
cellDisk: CD_03_dmorlcel08
comment:
creationTime: 2012-04-20T04:57:22-04:00
diskType: HardDisk
errorCount: 0
id: 001ef1df-5d41-4fab-b1ad-5a8983a63b8c
offset: 528.734375G
size: 29.125G
status: active
name: RECO_CD_03_dmorlcel08
asmDiskGroupName: RECO <<<<<<<<<<<<<<<<<<<<<<<<<ASM diskgroup name
asmDiskName: RECO_CD_03_DMORLCEL08 <<<<<<<<<<<<<<<<<<<<<<<<<ASM disk name
availableTo:
cellDisk: CD_03_dmorlcel08
comment:
creationTime: 2012-04-20T04:56:28-04:00
diskType: HardDisk
errorCount: 0
id: 544c54c0-2969-4ad3-bea0-f3e2b4f1bf31
offset: 423.046875G
size: 105.6875G
status: active
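If you want to collect the griddisk-to-ASM mapping programmatically rather than by reading the listing, a minimal sketch could look like this. The parse_detail function is hypothetical, and it assumes the "attribute: value" layout shown above with each object starting at its name line; real CellCLI output may be indented differently.

```python
def parse_detail(text):
    """Split CellCLI 'detail' output into one dict per object."""
    records = []
    for raw in text.splitlines():
        # Drop the '<<<' annotations that this post adds to some lines.
        line = raw.split("<<<")[0].strip()
        if ":" not in line:
            continue
        key, _, value = line.partition(":")  # split on the first colon only
        key, value = key.strip(), value.strip()
        if key == "name":
            records.append({})  # a 'name' line starts a new object
        if records:
            records[-1][key] = value
    return records


SAMPLE = """\
name: DATA_CD_03_dmorlcel08
asmDiskGroupName: DATA
asmDiskName: DATA_CD_03_DMORLCEL08 <<<<<<<<< ASM disk name
cellDisk: CD_03_dmorlcel08
creationTime: 2012-04-20T04:39:48-04:00
name: DBFS_DG_CD_03_dmorlcel08
asmDiskGroupName: DBFS_DG
asmDiskName: DBFS_DG_CD_03_DMORLCEL08
cellDisk: CD_03_dmorlcel08
"""

for rec in parse_detail(SAMPLE):
    print(rec["name"], "->", rec["asmDiskGroupName"], rec["asmDiskName"])
```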
3. Query the disk information from the ASM instance
$more /etc/oratab
#Backup file is /u01/app/oracle/product/11.2.0.3/db_home1/srvm/admin/oratab.bak.dmorldb05 line added by Agent
+ASM1:/u01/app/11.2.0.3/grid:N # line added by Agent
$ export ORACLE_SID=+ASM1
$ export ORACLE_HOME=/u01/app/11.2.0.3/grid
$ sqlplus / as sysasm
SQL> set line 400
SQL> select GROUP_NUMBER,DISK_NUMBER,LABEL,NAME,MOUNT_STATUS,state,FAILGROUP,MODE_STATUS,PATH from v$asm_disk where name='RECO_CD_03_DMORLCEL08'; <<<<<<< substitute your own disk name
GROUP_NUMBER DISK_NUMBER LABEL NAME PATH
------------ ----------- ------------------------------- ------------------------------
4 10 RECO_CD_03_DMORLCEL08 RECO_CD_03_DMORLCEL08 o/192.168.10.16/RECO_CD_03_dmorlcel08
Query the corresponding diskgroups
SQL> select GROUP_NUMBER,NAME from v$asm_diskgroup; <<<<<determine the diskgroup number
GROUP_NUMBER NAME
------------ ------------------------------
1 DATA
2 DBFS_DG
4 RECO
SQL> select NAME,ALLOCATION_UNIT_SIZE,STATE,TOTAL_MB,OFFLINE_DISKS,FREE_MB,REQUIRED_MIRROR_FREE_MB from v$asm_diskgroup;
4. Now start the removal. To drop a celldisk, begin at the top layer and drop the disks from ASM first.
SQL> alter diskgroup RECO drop disk RECO_CD_03_DMORLCEL08 REBALANCE power 8;
SQL> alter diskgroup DBFS_DG drop disk DBFS_DG_CD_03_DMORLCEL08 REBALANCE power 8;
SQL> alter diskgroup DATA drop disk DATA_CD_03_DMORLCEL08 REBALANCE power 8;
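5. While the drop is running, use the following query to watch the rebalance progress. Make sure ASM has finished dropping the disks before moving on to the next step.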
SQL> set line 200
SQL> select * from v$asm_operation;
GROUP_NUMBER OPERA STAT POWER ACTUAL SOFAR EST_WORK EST_RATE EST_MINUTES ERROR_CODE
------------ ----- ---- ---------- ---------- ---------- ---------- ----------
1 REBAL RUN 8 8 58593 207611 7586 19
SQL> /
GROUP_NUMBER OPERA STAT POWER ACTUAL SOFAR EST_WORK EST_RATE EST_MINUTES ERROR_CODE
------------ ----- ---- ---------- ---------- ---------- ---------- ----------
1 REBAL RUN 8 8 58856 207656 7643 19
SQL> /
no rows selected <<<<<<<<<<<< the rebalance has completed
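The numbers in the two v$asm_operation samples above can be checked by hand. The sketch below assumes that the remaining time is roughly (EST_WORK - SOFAR) / EST_RATE rounded down; that formula is my own reading of the columns, not something the post or Oracle states here, but it reproduces the 19 minutes shown in both samples.

```python
def remaining_minutes(sofar, est_work, est_rate):
    """Estimate minutes left from the rebalance counters (assumed formula)."""
    if est_rate <= 0:
        raise ValueError("est_rate must be positive")
    return max(est_work - sofar, 0) // est_rate


def percent_done(sofar, est_work):
    """Share of the estimated work that is already finished."""
    return 100.0 * sofar / est_work


# SOFAR, EST_WORK, EST_RATE from the two samples in the listing above.
samples = [(58593, 207611, 7586), (58856, 207656, 7643)]
for sofar, est_work, est_rate in samples:
    print(remaining_minutes(sofar, est_work, est_rate),
          round(percent_done(sofar, est_work), 1))
```

Both samples are a little over a quarter of the way through, which is why the estimate has not moved yet.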
6. After DBFS_DG_CD_03_DMORLCEL08 has been dropped from ASM, querying the griddisk shows that asmDiskName and asmDiskGroupName are now empty.
CellCLI> list GRIDDISK DBFS_DG_CD_03_DMORLCEL08 detail
name: DBFS_DG_CD_03_dmorlcel08
asmDiskGroupName:
asmDiskName:
availableTo:
cellDisk: CD_03_dmorlcel08
comment:
creationTime: 2012-04-20T04:57:22-04:00
diskType: HardDisk
errorCount: 0
id: 001ef1df-5d41-4fab-b1ad-5a8983a63b8c
offset: 528.734375G
size: 29.125G
status: active
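The check made by eye in this step (both ASM attributes are empty before the griddisk is dropped) can be expressed as a small guard. This is a sketch only: safe_to_drop is a hypothetical helper, and the dictionaries are hand-written from the listings in this post.

```python
def safe_to_drop(griddisk):
    """True when the griddisk no longer reports an ASM disk or diskgroup."""
    return not griddisk.get("asmDiskName") and not griddisk.get("asmDiskGroupName")


# Before the ASM drop (step 2 listing) and after it (listing above).
before = {"name": "DBFS_DG_CD_03_dmorlcel08",
          "asmDiskGroupName": "DBFS_DG",
          "asmDiskName": "DBFS_DG_CD_03_DMORLCEL08"}
after = {"name": "DBFS_DG_CD_03_dmorlcel08",
         "asmDiskGroupName": "",
         "asmDiskName": ""}

print(safe_to_drop(before))
print(safe_to_drop(after))
```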
7. Drop the griddisk and query again. Griddisk DBFS_DG_CD_03_DMORLCEL08 can no longer be found.
CellCLI> DROP GRIDDISK DBFS_DG_CD_03_DMORLCEL08;
GridDisk DBFS_DG_CD_03_dmorlcel08 successfully dropped
CellCLI> list GRIDDISK DBFS_DG_CD_03_DMORLCEL08 detail
CELL-02007: Grid disk does not exist: DBFS_DG_CD_03_DMORLCEL08
CellCLI> LIST gridDISK WHERE cellDisk LIKE '.*CD_03.*' detail
name: DATA_CD_03_dmorlcel08
asmDiskGroupName: DATA
asmDiskName: DATA_CD_03_DMORLCEL08
availableTo:
cellDisk: CD_03_dmorlcel08
comment:
creationTime: 2012-04-20T04:39:48-04:00
diskType: HardDisk
errorCount: 0
id: 524f59b3-32af-4926-bc40-bf094d05a71d
offset: 32M
size: 423G
status: active
name: RECO_CD_03_dmorlcel08
asmDiskGroupName:
asmDiskName:
availableTo:
cellDisk: CD_03_dmorlcel08
comment:
creationTime: 2012-04-20T04:56:28-04:00
diskType: HardDisk
errorCount: 0
id: 544c54c0-2969-4ad3-bea0-f3e2b4f1bf31
offset: 423.046875G
size: 105.6875G
status: active
8. Next, drop RECO_CD_03_dmorlcel08 and query again
CellCLI> DROP GRIDDISK RECO_CD_03_dmorlcel08;
GridDisk RECO_CD_03_dmorlcel08 successfully dropped
CellCLI> LIST gridDISK WHERE cellDisk LIKE '.*CD_03.*' detail
name: DATA_CD_03_dmorlcel08
asmDiskGroupName: DATA
asmDiskName: DATA_CD_03_DMORLCEL08
availableTo:
cellDisk: CD_03_dmorlcel08
comment:
creationTime: 2012-04-20T04:39:48-04:00
diskType: HardDisk
errorCount: 0
id: 524f59b3-32af-4926-bc40-bf094d05a71d
offset: 32M
size: 423G
status: active
9. Next, drop DATA_CD_03_DMORLCEL08 and query again
CellCLI> DROP GRIDDISK DATA_CD_03_DMORLCEL08
GridDisk DATA_CD_03_dmorlcel08 successfully dropped
CellCLI> LIST gridDISK WHERE cellDisk LIKE '.*CD_03.*' detail
CellCLI> LIST CELLDISK WHERE name LIKE 'CD_03.*' detail
name: CD_03_dmorlcel08
comment:
creationTime: 2012-04-20T04:37:54-04:00
deviceName: /dev/sdd
devicePartition: /dev/sdd
diskType: HardDisk
errorCount: 0
freeSpace: 557.8125G
freeSpaceMap: offset=32M,size=19.96875G
offset=20.015625G,size=537.84375G
id: 772cc8b1-ca81-468e-8b65-16e2724ef4da
interleaving: none
lun: 0_3
raidLevel: 0
size: 557.859375G
status: normal
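The freeSpace value in this listing can be cross-checked against the three griddisk sizes from step 2. The sketch below does that arithmetic; parse_size_gb is a hypothetical helper that only understands the M/G/T suffixes seen in this post.

```python
def parse_size_gb(text):
    """Convert a CellCLI size such as '423G' or '32M' to gigabytes."""
    units = {"M": 1.0 / 1024, "G": 1.0, "T": 1024.0}
    return float(text[:-1]) * units[text[-1]]


griddisk_sizes = ["423G", "29.125G", "105.6875G"]  # DATA, DBFS_DG, RECO
celldisk_size = "557.859375G"
free_space_after_drop = "557.8125G"

total = sum(parse_size_gb(s) for s in griddisk_sizes)
overhead_mb = (parse_size_gb(celldisk_size) - total) * 1024

print(total == parse_size_gb(free_space_after_drop))
print(overhead_mb)
```

The three sizes add up to exactly the reported freeSpace, and the difference from the celldisk size is 48M that is not available to griddisks.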
10. Drop the celldisk, then query the LUN information
CellCLI> drop celldisk CD_03_dmorlcel08
CellDisk CD_03_dmorlcel08 successfully dropped
CellCLI> LIST CELLDISK WHERE name LIKE 'CD_03.*' detail
CellCLI> LIST LUN 0_3 DETAIL
name: 0_3
cellDisk:
deviceName: /dev/sdd
diskType: HardDisk
id: 0_3
isSystemLun: FALSE
lunAutoCreate: FALSE
lunSize: 557.861328125G
lunUID: 0_3
physicalDrives: 28:3
raidLevel: 0
lunWriteCacheMode: "WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU"
status: normal
For comparison, here is the information for a healthy LUN
CellCLI> LIST LUN 0_4 DETAIL
name: 0_4
cellDisk: CD_04_dmorlcel08 <<<<<<<
deviceName: /dev/sde
diskType: HardDisk
id: 0_4
isSystemLun: FALSE
lunAutoCreate: FALSE
lunSize: 557.861328125G
lunUID: 0_4
physicalDrives: 28:4
raidLevel: 0
lunWriteCacheMode: "WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU"
status: normal
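11. With the removal finished, we start adding the disk back. In real troubleshooting you may find that, after a disk has been replaced, Exadata did not create the celldisk and griddisks automatically, so they have to be created by hand.
Before creating them manually, query the physicaldisk status and confirm that physicalSerial matches the disk you replaced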
CellCLI> list physicaldisk where luns=0_3 detail
name: 28:3
deviceId: 24
diskType: HardDisk
enclosureDeviceId: 28
errMediaCount: 0
errOtherCount: 0
foreignState: false
luns: 0_3
makeModel: "SEAGATE ST360057SSUN600G"
physicalFirmware: 0A25
physicalInsertTime: 2012-01-20T15:57:20-05:00
physicalInterface: sas
physicalSerial: E12EQY
physicalSize: 558.9109999993816G
slotNumber: 3
status: normal
12. Create the celldisk. Using the information gathered above, make sure the celldisk is matched to the correct LUN. If it is not, you will hit the error below:
CellCLI> create celldisk CD_03_dmorlcel08 lun=0_4
CELL-04527: Cannot complete the creation of cell disk CD_03_dmorlcel08. Received error: CELL-04522: The LUN 0_4 has a valid celldisk.
Cell disks are not created: CD_03_dmorlcel08
Successful creation
CellCLI> create celldisk CD_03_dmorlcel08 lun=0_3
CellDisk CD_03_dmorlcel08 successfully created
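The CELL-04522 error above comes from pointing at a LUN that already carries a celldisk. A pre-check along the following lines would catch that before the command is run. It is a sketch with hypothetical helper names, fed with the cellDisk attribute of the two LIST LUN listings shown earlier.

```python
def free_luns(luns):
    """Names of LUNs whose cellDisk attribute is empty."""
    return [lun["name"] for lun in luns if not lun.get("cellDisk")]


def can_create_on(luns, lun_name):
    """True when the named LUN exists and has no celldisk yet."""
    for lun in luns:
        if lun["name"] == lun_name:
            return not lun.get("cellDisk")
    raise KeyError(f"unknown LUN {lun_name}")


# cellDisk values as reported by LIST LUN 0_3 DETAIL and LIST LUN 0_4 DETAIL.
luns = [
    {"name": "0_3", "cellDisk": ""},
    {"name": "0_4", "cellDisk": "CD_04_dmorlcel08"},
]

print(free_luns(luns))
print(can_create_on(luns, "0_4"))
```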
14. First query a healthy disk to confirm the settings needed next, such as name, size, and offset
CellCLI> list griddisk where celldisk=CD_04_dmorlcel08 attributes name,size,offset
DATA_CD_04_dmorlcel08 423G 32M
DBFS_DG_CD_04_dmorlcel08 29.125G 528.734375G
RECO_CD_04_dmorlcel08 105.6875G 423.046875G
15. Create the griddisks
CellCLI> create griddisk DATA_CD_03_dmorlcel08 celldisk=CD_03_dmorlcel08,size=423G
CellCLI> create griddisk DBFS_DG_CD_03_dmorlcel08 celldisk=CD_03_dmorlcel08,size=29.125G
CellCLI> create griddisk RECO_CD_03_dmorlcel08 celldisk=CD_03_dmorlcel08,size=105.6875G
Below is the record of the commands as executed, followed by the query results
CellCLI> create celldisk CD_03_dmorlcel08 lun=0_3
CellDisk CD_03_dmorlcel08 successfully created
CellCLI> create griddisk DATA_CD_03_dmorlcel08 celldisk=CD_03_dmorlcel08,size=423G
GridDisk DATA_CD_03_dmorlcel08 successfully created
CellCLI> create griddisk DBFS_DG_CD_03_dmorlcel08 celldisk=CD_03_dmorlcel08,size=29.125G
GridDisk DBFS_DG_CD_03_dmorlcel08 successfully created
CellCLI> create griddisk RECO_CD_03_dmorlcel08 celldisk=CD_03_dmorlcel08,size=105.6875G
GridDisk RECO_CD_03_dmorlcel08 successfully created
The celldisk information can now be queried
CellCLI> LIST CELLDISK WHERE name LIKE 'CD_03.*' detail
name: CD_03_dmorlcel08
comment:
creationTime: 2013-02-07T02:13:17-05:00
deviceName: /dev/sdd
devicePartition: /dev/sdd
diskType: HardDisk
errorCount: 0
freeSpace: 0
id: 2f3e651e-6fb6-47e7-a227-692645519ce4
interleaving: none
lun: 0_3
raidLevel: 0
size: 557.859375G
status: normal
The griddisk information can now be queried
CellCLI> LIST gridDISK WHERE cellDisk LIKE '.*CD_03.*' detail
name: DATA_CD_03_dmorlcel08
availableTo:
cellDisk: CD_03_dmorlcel08
comment:
creationTime: 2013-02-07T02:14:17-05:00
diskType: HardDisk
errorCount: 0
id: 6f9cb08e-fd8f-4063-b902-439387237f5d
offset: 32M
size: 423G
status: active
name: DBFS_DG_CD_03_dmorlcel08
availableTo:
cellDisk: CD_03_dmorlcel08
comment:
creationTime: 2013-02-07T02:14:26-05:00
diskType: HardDisk
errorCount: 0
id: 055710ce-216a-4b10-a215-b38763ea65df
offset: 423.046875G
size: 29.125G
status: active
name: RECO_CD_03_dmorlcel08
availableTo:
cellDisk: CD_03_dmorlcel08
comment:
creationTime: 2013-02-07T02:14:33-05:00
diskType: HardDisk
errorCount: 0
id: fdf9a255-0a0d-4152-a557-de80b112ddff
offset: 452.171875G
size: 105.6875G
status: active
CellCLI> list griddisk where celldisk=CD_03_dmorlcel08 attributes name,size,offset
DATA_CD_03_dmorlcel08 423G 32M
DBFS_DG_CD_03_dmorlcel08 29.125G 423.046875G
RECO_CD_03_dmorlcel08 105.6875G 452.171875G
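Comparing this listing with the one for CD_04 in step 14 shows that the sizes match but two offsets do not. The sketch below makes that comparison explicit and derives the creation order that would follow the reference offsets. The helper names are hypothetical, and the idea that griddisks created without an offset are placed in creation order is only inferred from the two listings in this post.

```python
def to_mb(text):
    """Convert a CellCLI size or offset such as '32M' or '423G' to MB."""
    factor = {"M": 1.0, "G": 1024.0, "T": 1024.0 * 1024.0}[text[-1]]
    return float(text[:-1]) * factor


def parse_listing(text):
    """Return {prefix: (size, offset)}, keyed by the part before '_CD_'."""
    layout = {}
    for line in text.strip().splitlines():
        name, size, offset = line.split()
        layout[name.split("_CD_")[0]] = (size, offset)
    return layout


def offset_mismatches(reference, rebuilt):
    """Prefixes whose offset differs between the two celldisks."""
    return sorted(k for k in reference
                  if k not in rebuilt
                  or to_mb(reference[k][1]) != to_mb(rebuilt[k][1]))


def creation_order(reference):
    """Prefixes sorted by ascending offset on the reference celldisk."""
    return sorted(reference, key=lambda k: to_mb(reference[k][1]))


CD_04 = """
DATA_CD_04_dmorlcel08 423G 32M
DBFS_DG_CD_04_dmorlcel08 29.125G 528.734375G
RECO_CD_04_dmorlcel08 105.6875G 423.046875G
"""
CD_03 = """
DATA_CD_03_dmorlcel08 423G 32M
DBFS_DG_CD_03_dmorlcel08 29.125G 423.046875G
RECO_CD_03_dmorlcel08 105.6875G 452.171875G
"""

reference, rebuilt = parse_listing(CD_04), parse_listing(CD_03)
print(offset_mismatches(reference, rebuilt))
print(creation_order(reference))
```

On the reference disk RECO sits before DBFS_DG, whereas here DBFS_DG was created first and therefore landed right after DATA. If you want the rebuilt celldisk to mirror its neighbours, creating the griddisks in the reference offset order is one way to try it.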
16. If a disk was not dropped from ASM earlier, it has to be dropped again with force, for example as below. In our test the disks had already been dropped cleanly.
SQL> alter diskgroup data drop disk DATA_CD_07_dm03cel13 force;
SQL> select GROUP_NUMBER,DISK_NUMBER,LABEL,NAME,MOUNT_STATUS,state,FAILGROUP,MODE_STATUS,PATH from v$asm_disk where name like '%CD_03_DMORLCEL08';
17. Add the disks
SQL> ALTER DISKGROUP DATA ADD DISK 'o/192.168.10.16/DATA_CD_03_dmorlcel08' rebalance power 10;
SQL> ALTER DISKGROUP DBFS_DG ADD DISK 'o/192.168.10.16/DBFS_DG_CD_03_dmorlcel08' rebalance power 10;
SQL> ALTER DISKGROUP RECO ADD DISK 'o/192.168.10.16/RECO_CD_03_dmorlcel08' rebalance power 10;
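The three statements follow one pattern: the disk path is o/<cell IP>/<griddisk name>, and in this system the diskgroup name is the griddisk prefix before "_CD_". A sketch that builds them is shown below. add_disk_statement is a hypothetical helper, and the prefix rule is a naming convention observed in this post, not something ASM enforces.

```python
def add_disk_statement(cell_ip, griddisk, power=10):
    """Build an ADD DISK statement for one Exadata griddisk."""
    diskgroup = griddisk.split("_CD_")[0]  # e.g. DBFS_DG_CD_03_... -> DBFS_DG
    return (f"ALTER DISKGROUP {diskgroup} ADD DISK "
            f"'o/{cell_ip}/{griddisk}' rebalance power {power};")


griddisks = [
    "DATA_CD_03_dmorlcel08",
    "DBFS_DG_CD_03_dmorlcel08",
    "RECO_CD_03_dmorlcel08",
]
for gd in griddisks:
    print(add_disk_statement("192.168.10.16", gd))
```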
Query result from ASM after two of the disks have been added
GROUP_NUMBER DISK_NUMBER LABEL NAME MOUNT_S STATE FAILGROUP MODE_ST PATH
------------ ----------- ------------------------------- ------------------------------ ------- -------- ------------------------------ ------- ------------------
1 3 DATA_CD_03_DMORLCEL08 DATA_CD_03_DMORLCEL08 CACHED NORMAL DMORLCEL08 ONLINE o/192.168.10.16/DATA_CD_03_dmorlcel08
2 6 DBFS_DG_CD_03_DMORLCEL08 DBFS_DG_CD_03_DMORLCEL08 CACHED NORMAL DMORLCEL08 ONLINE o/192.168.10.16/DBFS_DG_CD_03_dmorlcel08
18. While the disks are being added, use this query to check progress
SQL> select * from v$asm_operation;
19. Finally, the state once everything has completed
SQL> select GROUP_NUMBER,DISK_NUMBER,LABEL,NAME,MOUNT_STATUS,state,FAILGROUP,MODE_STATUS,PATH from v$asm_disk where name like '%CD_03_DMORLCEL08';
GROUP_NUMBER DISK_NUMBER LABEL NAME MOUNT_S STATE FAILGROUP MODE_ST PATH
------------ ----------- ------------------------------- ------------------------------ ------- -------- ------------------------------ -------
1 3 DATA_CD_03_DMORLCEL08 DATA_CD_03_DMORLCEL08 CACHED NORMAL DMORLCEL08 ONLINE o/192.168.10.16/DATA_CD_03_dmorlcel08
4 6 RECO_CD_03_DMORLCEL08 RECO_CD_03_DMORLCEL08 CACHED NORMAL DMORLCEL08 ONLINE o/192.168.10.16/RECO_CD_03_dmorlcel08
2 6 DBFS_DG_CD_03_DMORLCEL08 DBFS_DG_CD_03_DMORLCEL08 CACHED NORMAL DMORLCEL08 ONLINE o/192.168.10.16/DBFS_DG_CD_03_dmorlcel08
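The three rows above are what a healthy result looks like: CACHED, NORMAL, and ONLINE for every disk. A small sketch for checking that automatically is shown below. The unhealthy function is hypothetical and assumes the nine whitespace-separated columns of the query used in this post.

```python
def unhealthy(rows):
    """Return names of disks that are not CACHED / NORMAL / ONLINE."""
    bad = []
    for line in rows.strip().splitlines():
        fields = line.split()
        # Columns: group, disk number, label, name, mount status,
        # state, failgroup, mode status, path.
        name, mount, state, mode = fields[3], fields[4], fields[5], fields[7]
        if (mount, state, mode) != ("CACHED", "NORMAL", "ONLINE"):
            bad.append(name)
    return bad


ROWS = """
1 3 DATA_CD_03_DMORLCEL08 DATA_CD_03_DMORLCEL08 CACHED NORMAL DMORLCEL08 ONLINE o/192.168.10.16/DATA_CD_03_dmorlcel08
4 6 RECO_CD_03_DMORLCEL08 RECO_CD_03_DMORLCEL08 CACHED NORMAL DMORLCEL08 ONLINE o/192.168.10.16/RECO_CD_03_dmorlcel08
2 6 DBFS_DG_CD_03_DMORLCEL08 DBFS_DG_CD_03_DMORLCEL08 CACHED NORMAL DMORLCEL08 ONLINE o/192.168.10.16/DBFS_DG_CD_03_dmorlcel08
"""

print(unhealthy(ROWS))
```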
Querying the griddisks shows that asmDiskGroupName and asmDiskName are populated again.
CellCLI> list griddisk where celldisk=CD_03_dmorlcel08 detail
name: DATA_CD_03_dmorlcel08
asmDiskGroupName: DATA
asmDiskName: DATA_CD_03_DMORLCEL08
availableTo:
cellDisk: CD_03_dmorlcel08
comment:
creationTime: 2013-02-07T02:14:17-05:00
diskType: HardDisk
errorCount: 0
id: 6f9cb08e-fd8f-4063-b902-439387237f5d
offset: 32M
size: 423G
status: active
name: DBFS_DG_CD_03_dmorlcel08
asmDiskGroupName: DBFS_DG
asmDiskName: DBFS_DG_CD_03_DMORLCEL08
availableTo:
cellDisk: CD_03_dmorlcel08
comment:
creationTime: 2013-02-07T02:14:26-05:00
diskType: HardDisk
errorCount: 0
id: 055710ce-216a-4b10-a215-b38763ea65df
offset: 423.046875G
size: 29.125G
status: active
name: RECO_CD_03_dmorlcel08
asmDiskGroupName: RECO
asmDiskName: RECO_CD_03_DMORLCEL08
availableTo:
cellDisk: CD_03_dmorlcel08
comment:
creationTime: 2013-02-07T02:14:33-05:00
diskType: HardDisk
errorCount: 0
id: fdf9a255-0a0d-4152-a557-de80b112ddff
offset: 452.171875G
size: 105.6875G
status: active
That completes the experiment. You can use it as a reference when adding disks.