Exadata Storage Configuration

March 25th, 2012 | Written by John Clarke

 

The purpose of this post is to outline how storage works on Exadata.  We’ll look at the host-based storage on the compute nodes and the storage servers, then at the cell storage layers, mapping these to ASM storage and, finally, to database storage. 
 
Environment Description
 
The demonstrations in this document will be done using Centroid’s X2-2 Quarter rack.  
 
Compute Node Storage
 
Let’s start with a “df -k” listing:
 
[root@cm01dbm01 ~]# df -k
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/mapper/VGExaDb-LVDbSys1
                      30963708  22221844   7169000  76% /
/dev/sda1               126427     48728     71275  41% /boot
/dev/mapper/VGExaDb-LVDbOra1
                     103212320  57668260  40301180  59% /u01
tmpfs                 84132864     76492  84056372   1% /dev/shm
172.16.1.200:/exadump
                     14465060256 3669637248 10795423008  26% /dump
[root@cm01dbm01 ~]#
 
Other than an NFS mount I’ve got on this machine, we can see a 30GB root file-system, a small /boot file-system, and a 100GB /u01 mount point.  Now let’s look at the fdisk output:
 
[root@cm01dbm01 ~]# fdisk -l
 
Disk /dev/sda: 598.8 GB, 598879502336 bytes
255 heads, 63 sectors/track, 72809 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
 
   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          16      128488+  83  Linux
/dev/sda2              17       72809   584709772+  8e  Linux LVM
[root@cm01dbm01 ~]#
 
We can see a 600GB drive partitioned into /dev/sda1 and /dev/sda2.  We know from the df listing that /dev/sda1 is mounted at /boot, so the / and /u01 file-systems must be built on logical volumes.  Before continuing, note that the Sun Fire servers the compute nodes run on use an LSI MegaRAID controller, so we can use MegaCli64 to show the physical hardware:
 
[root@cm01dbm01 ~]# /opt/MegaRAID/MegaCli/MegaCli64 -ShowSummary -aALL
                                     
System
        OS Name (IP Address)       : Not Recognized
        OS Version                 : Not Recognized
        Driver Version             : Not Recognized
        CLI Version                : 8.00.23
 
Hardware
        Controller
                 ProductName       : LSI MegaRAID SAS 9261-8i(Bus 0, Dev 0)
                 SAS Address       : 500605b002f054d0
                 FW Package Version: 12.12.0-0048
                 Status            : Optimal
        BBU
                 BBU Type          : Unknown
                 Status            : Healthy
        Enclosure
                 Product Id        : SGPIO           
                 Type              : SGPIO
                 Status            : OK
 
        PD 
                Connector          : Port 0 - 3<Internal>: Slot 3 
                Vendor Id          : SEAGATE 
                Product Id         : ST930003SSUN300G
                State              : Global HotSpare
                Disk Type          : SAS,Hard Disk Device
                Capacity           : 278.875 GB
                Power State        : Spun down
 
                Connector          : Port 0 - 3<Internal>: Slot 2 
                Vendor Id          : SEAGATE 
                Product Id         : ST930003SSUN300G
                State              : Online
                Disk Type          : SAS,Hard Disk Device
                Capacity           : 278.875 GB
                Power State        : Active
 
                Connector          : Port 0 - 3<Internal>: Slot 1 
                Vendor Id          : SEAGATE 
                Product Id         : ST930003SSUN300G
                State              : Online
                Disk Type          : SAS,Hard Disk Device
                Capacity           : 278.875 GB
                Power State        : Active
 
                Connector          : Port 0 - 3<Internal>: Slot 0 
                Vendor Id          : SEAGATE 
                Product Id         : ST930003SSUN300G
                State              : Online
                Disk Type          : SAS,Hard Disk Device
                Capacity           : 278.875 GB
                Power State        : Active
 
Storage
 
       Virtual Drives
                Virtual drive      : Target Id 0 ,VD name DBSYS
                Size               : 557.75 GB
                State              : Optimal
                RAID Level         : 5 
 
 
Exit Code: 0x00
[root@cm01dbm01 ~]#
 
Based on this, we have four 300GB drives: one global hot spare in slot 3 and three active drives in slots 0, 1, and 2.  The RAID 5 virtual drive (DBSYS, 557.75GB) created on the internal RAID controller matches up in size with the fdisk listing.
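A quick sanity check of that claim with shell arithmetic (a sketch, not output captured from the system):
 
echo "scale=3; 278.875 * (3 - 1)" | bc       # RAID 5 across 3 disks keeps (n-1) disks' capacity: 557.750, the DBSYS virtual drive size
echo "557.75 * 1024 * 1024 * 1024" | bc      # = 598879502336 bytes, the figure fdisk reports as 598.8 GB
 
If we do a pvdisplay, we see this: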
 
[root@cm01dbm01 ~]# pvdisplay
  --- Physical volume ---
  PV Name               /dev/sda2
  VG Name               VGExaDb
  PV Size               557.62 GB / not usable 1.64 MB
  Allocatable           yes 
  PE Size (KByte)       4096
  Total PE              142751
  Free PE               103327
  Allocated PE          39424
  PV UUID               xKSxo7-k8Hb-HM52-iGoD-tMKC-Vhxl-OQuNFG
 
Note that the PV size is essentially the virtual drive size from the MegaCli64 output, less the small /boot partition on /dev/sda1.   There’s a single VG created on /dev/sda2 called VGExaDb:
 
[root@cm01dbm01 ~]# vgdisplay
  --- Volume group ---
  VG Name               VGExaDb
  System ID             
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  4
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                3
  Open LV               3
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               557.62 GB
  PE Size               4.00 MB
  Total PE              142751
  Alloc PE / Size       39424 / 154.00 GB
  Free  PE / Size       103327 / 403.62 GB
  VG UUID               eOfArN-08zd-1oD4-C4iu-RJbh-2Pxb-yhWmSW
 
As you can see, there is about 400GB of free space in the volume group.  An lvdisplay shows the swap volume (LVDbSwap1), LVDbSys1, and LVDbOra1 (mounted at “/” and “/u01”, respectively):
 
[root@cm01dbm01 ~]# lvdisplay
  --- Logical volume ---
  LV Name                /dev/VGExaDb/LVDbSys1
  VG Name                VGExaDb
  LV UUID                wsj1Dc-MXvd-6haj-vCb0-I8dY-dlt9-18kCwu
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                30.00 GB
  Current LE             7680
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:0
   
  --- Logical volume ---
  LV Name                /dev/VGExaDb/LVDbSwap1
  VG Name                VGExaDb
  LV UUID                iH64Ie-LJSq-hchp-h1sg-OPww-pTx5-jQpj6T
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                24.00 GB
  Current LE             6144
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:1
   
  --- Logical volume ---
  LV Name                /dev/VGExaDb/LVDbOra1
  VG Name                VGExaDb
  LV UUID                CnRtDt-h6T3-iMFO-EZl6-0OHP-D6de-xZms6O
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                100.00 GB
  Current LE             25600
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:2
 
 
These logical volumes are mapped to /dev/mapper devices like so:
 
[root@cm01dbm01 ~]#  ls -ltar /dev/VGExaDb/LVDb*
lrwxrwxrwx 1 root root 28 Feb 20 21:59 /dev/VGExaDb/LVDbSys1 -> /dev/mapper/VGExaDb-LVDbSys1
lrwxrwxrwx 1 root root 29 Feb 20 21:59 /dev/VGExaDb/LVDbSwap1 -> /dev/mapper/VGExaDb-LVDbSwap1
lrwxrwxrwx 1 root root 28 Feb 20 21:59 /dev/VGExaDb/LVDbOra1 -> /dev/mapper/VGExaDb-LVDbOra1
[root@cm01dbm01 ~]#
 
So in short:
 
- On the compute nodes, file-systems are built on logical volumes
- Logical volumes are built on a volume group that sits on the virtual drive presented by the LSI MegaRAID controller (a compact cross-check follows below)
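For a more compact cross-check of the same chain, the standard lvm2 reporting commands summarize the physical volume, volume group, and logical volumes in a line or two each (a sketch; the column selection is illustrative):
 
pvs                                        # physical volumes (here, /dev/sda2)
vgs VGExaDb                                # volume group size and free space
lvs -o lv_name,lv_size,devices VGExaDb     # logical volumes and their backing devices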
 
Cell Server Storage
 
Each Exadata storage server has 12 SAS disks.  Every disk in a given storage server is the same type: either High Performance (600GB 15K RPM) or High Capacity (2TB or 3TB).  The first two disk drives in each storage cell contain mirrored copies of the Exadata storage server “system area”.  This system area holds the storage server software, the storage cell operating system, the metrics and alert repository, and so forth.  The storage servers use the same LSI MegaRAID controller as the compute nodes, and if you run lsscsi you’ll see both the physical disks and the PCI flash disks:
 
[root@cm01cel01 ~]# lsscsi -v
[0:2:0:0]    disk    LSI      MR9261-8i        2.12  /dev/sda
  dir: /sys/bus/scsi/devices/0:2:0:0  [/sys/devices/pci0000:00/0000:00:05.0/0000:13:00.0/host0/target0:2:0/0:2:0:0]
[0:2:1:0]    disk    LSI      MR9261-8i        2.12  /dev/sdb
  dir: /sys/bus/scsi/devices/0:2:1:0  [/sys/devices/pci0000:00/0000:00:05.0/0000:13:00.0/host0/target0:2:1/0:2:1:0]
[0:2:2:0]    disk    LSI      MR9261-8i        2.12  /dev/sdc
  dir: /sys/bus/scsi/devices/0:2:2:0  [/sys/devices/pci0000:00/0000:00:05.0/0000:13:00.0/host0/target0:2:2/0:2:2:0]
[0:2:3:0]    disk    LSI      MR9261-8i        2.12  /dev/sdd
  dir: /sys/bus/scsi/devices/0:2:3:0  [/sys/devices/pci0000:00/0000:00:05.0/0000:13:00.0/host0/target0:2:3/0:2:3:0]
[0:2:4:0]    disk    LSI      MR9261-8i        2.12  /dev/sde
  dir: /sys/bus/scsi/devices/0:2:4:0  [/sys/devices/pci0000:00/0000:00:05.0/0000:13:00.0/host0/target0:2:4/0:2:4:0]
[0:2:5:0]    disk    LSI      MR9261-8i        2.12  /dev/sdf
  dir: /sys/bus/scsi/devices/0:2:5:0  [/sys/devices/pci0000:00/0000:00:05.0/0000:13:00.0/host0/target0:2:5/0:2:5:0]
[0:2:6:0]    disk    LSI      MR9261-8i        2.12  /dev/sdg
  dir: /sys/bus/scsi/devices/0:2:6:0  [/sys/devices/pci0000:00/0000:00:05.0/0000:13:00.0/host0/target0:2:6/0:2:6:0]
[0:2:7:0]    disk    LSI      MR9261-8i        2.12  /dev/sdh
  dir: /sys/bus/scsi/devices/0:2:7:0  [/sys/devices/pci0000:00/0000:00:05.0/0000:13:00.0/host0/target0:2:7/0:2:7:0]
[0:2:8:0]    disk    LSI      MR9261-8i        2.12  /dev/sdi
  dir: /sys/bus/scsi/devices/0:2:8:0  [/sys/devices/pci0000:00/0000:00:05.0/0000:13:00.0/host0/target0:2:8/0:2:8:0]
[0:2:9:0]    disk    LSI      MR9261-8i        2.12  /dev/sdj
  dir: /sys/bus/scsi/devices/0:2:9:0  [/sys/devices/pci0000:00/0000:00:05.0/0000:13:00.0/host0/target0:2:9/0:2:9:0]
[0:2:10:0]   disk    LSI      MR9261-8i        2.12  /dev/sdk
  dir: /sys/bus/scsi/devices/0:2:10:0  [/sys/devices/pci0000:00/0000:00:05.0/0000:13:00.0/host0/target0:2:10/0:2:10:0]
[0:2:11:0]   disk    LSI      MR9261-8i        2.12  /dev/sdl
  dir: /sys/bus/scsi/devices/0:2:11:0  [/sys/devices/pci0000:00/0000:00:05.0/0000:13:00.0/host0/target0:2:11/0:2:11:0]
[1:0:0:0]    disk    Unigen   PSA4000          1100  /dev/sdm
  dir: /sys/bus/scsi/devices/1:0:0:0  [/sys/devices/pci0000:00/0000:00:1a.7/usb1/1-1/1-1:1.0/host1/target1:0:0/1:0:0:0]
[8:0:0:0]    disk    ATA      MARVELL SD88SA02 D20Y  /dev/sdn
  dir: /sys/bus/scsi/devices/8:0:0:0  [/sys/devices/pci0000:00/0000:00:07.0/0000:19:00.0/0000:1a:02.0/0000:1b:00.0/host8/port-8:0/end_device-8:0/target8:0:0/8:0:0:0]
[8:0:1:0]    disk    ATA      MARVELL SD88SA02 D20Y  /dev/sdo
  dir: /sys/bus/scsi/devices/8:0:1:0  [/sys/devices/pci0000:00/0000:00:07.0/0000:19:00.0/0000:1a:02.0/0000:1b:00.0/host8/port-8:1/end_device-8:1/target8:0:1/8:0:1:0]
[8:0:2:0]    disk    ATA      MARVELL SD88SA02 D20Y  /dev/sdp
  dir: /sys/bus/scsi/devices/8:0:2:0  [/sys/devices/pci0000:00/0000:00:07.0/0000:19:00.0/0000:1a:02.0/0000:1b:00.0/host8/port-8:2/end_device-8:2/target8:0:2/8:0:2:0]
[8:0:3:0]    disk    ATA      MARVELL SD88SA02 D20Y  /dev/sdq
  dir: /sys/bus/scsi/devices/8:0:3:0  [/sys/devices/pci0000:00/0000:00:07.0/0000:19:00.0/0000:1a:02.0/0000:1b:00.0/host8/port-8:3/end_device-8:3/target8:0:3/8:0:3:0]
[9:0:0:0]    disk    ATA      MARVELL SD88SA02 D20Y  /dev/sdr
  dir: /sys/bus/scsi/devices/9:0:0:0  [/sys/devices/pci0000:00/0000:00:07.0/0000:19:00.0/0000:1a:04.0/0000:21:00.0/host9/port-9:1/end_device-9:1/target9:0:0/9:0:0:0]
[9:0:1:0]    disk    ATA      MARVELL SD88SA02 D20Y  /dev/sds
  dir: /sys/bus/scsi/devices/9:0:1:0  [/sys/devices/pci0000:00/0000:00:07.0/0000:19:00.0/0000:1a:04.0/0000:21:00.0/host9/port-9:0/end_device-9:0/target9:0:1/9:0:1:0]
[9:0:2:0]    disk    ATA      MARVELL SD88SA02 D20Y  /dev/sdt
  dir: /sys/bus/scsi/devices/9:0:2:0  [/sys/devices/pci0000:00/0000:00:07.0/0000:19:00.0/0000:1a:04.0/0000:21:00.0/host9/port-9:2/end_device-9:2/target9:0:2/9:0:2:0]
[9:0:3:0]    disk    ATA      MARVELL SD88SA02 D20Y  /dev/sdu
  dir: /sys/bus/scsi/devices/9:0:3:0  [/sys/devices/pci0000:00/0000:00:07.0/0000:19:00.0/0000:1a:04.0/0000:21:00.0/host9/port-9:3/end_device-9:3/target9:0:3/9:0:3:0]
[10:0:0:0]   disk    ATA      MARVELL SD88SA02 D20Y  /dev/sdv
  dir: /sys/bus/scsi/devices/10:0:0:0  [/sys/devices/pci0000:00/0000:00:09.0/0000:27:00.0/0000:28:02.0/0000:29:00.0/host10/port-10:1/end_device-10:1/target10:0:0/10:0:0:0]
[10:0:1:0]   disk    ATA      MARVELL SD88SA02 D20Y  /dev/sdw
  dir: /sys/bus/scsi/devices/10:0:1:0  [/sys/devices/pci0000:00/0000:00:09.0/0000:27:00.0/0000:28:02.0/0000:29:00.0/host10/port-10:0/end_device-10:0/target10:0:1/10:0:1:0]
[10:0:2:0]   disk    ATA      MARVELL SD88SA02 D20Y  /dev/sdx
  dir: /sys/bus/scsi/devices/10:0:2:0  [/sys/devices/pci0000:00/0000:00:09.0/0000:27:00.0/0000:28:02.0/0000:29:00.0/host10/port-10:2/end_device-10:2/target10:0:2/10:0:2:0]
[10:0:3:0]   disk    ATA      MARVELL SD88SA02 D20Y  /dev/sdy
  dir: /sys/bus/scsi/devices/10:0:3:0  [/sys/devices/pci0000:00/0000:00:09.0/0000:27:00.0/0000:28:02.0/0000:29:00.0/host10/port-10:3/end_device-10:3/target10:0:3/10:0:3:0]
[11:0:0:0]   disk    ATA      MARVELL SD88SA02 D20Y  /dev/sdz
  dir: /sys/bus/scsi/devices/11:0:0:0  [/sys/devices/pci0000:00/0000:00:09.0/0000:27:00.0/0000:28:04.0/0000:2f:00.0/host11/port-11:1/end_device-11:1/target11:0:0/11:0:0:0]
[11:0:1:0]   disk    ATA      MARVELL SD88SA02 D20Y  /dev/sdaa
  dir: /sys/bus/scsi/devices/11:0:1:0  [/sys/devices/pci0000:00/0000:00:09.0/0000:27:00.0/0000:28:04.0/0000:2f:00.0/host11/port-11:0/end_device-11:0/target11:0:1/11:0:1:0]
[11:0:2:0]   disk    ATA      MARVELL SD88SA02 D20Y  /dev/sdab
  dir: /sys/bus/scsi/devices/11:0:2:0  [/sys/devices/pci0000:00/0000:00:09.0/0000:27:00.0/0000:28:04.0/0000:2f:00.0/host11/port-11:2/end_device-11:2/target11:0:2/11:0:2:0]
[11:0:3:0]   disk    ATA      MARVELL SD88SA02 D20Y  /dev/sdac
  dir: /sys/bus/scsi/devices/11:0:3:0  [/sys/devices/pci0000:00/0000:00:09.0/0000:27:00.0/0000:28:04.0/0000:2f:00.0/host11/port-11:3/end_device-11:3/target11:0:3/11:0:3:0]
[root@cm01cel01 ~]#
 
In the above listing, we can tell:
 
- The “MARVELL” devices are ATA-attached PCI flash devices; we’ll cover these shortly
- The “MR9261-8i” LSI devices represent our 12 physical SAS disks.  Since they’re controlled via the LSI MegaRAID controller, we can use MegaCli64 to show more information:
 
[root@cm01cel01 ~]# /opt/MegaRAID/MegaCli/MegaCli64 -ShowSummary -aALL
                                     
System
        OS Name (IP Address)       : Not Recognized
        OS Version                 : Not Recognized
        Driver Version             : Not Recognized
        CLI Version                : 8.00.23
 
Hardware
        Controller
                 ProductName       : LSI MegaRAID SAS 9261-8i(Bus 0, Dev 0)
                 SAS Address       : 500605b002f4aac0
                 FW Package Version: 12.12.0-0048
                 Status            : Optimal
        BBU
                 BBU Type          : Unknown
                 Status            : Healthy
        Enclosure
                 Product Id        : HYDE12          
                 Type              : SES
                 Status            : OK
 
                 Product Id        : SGPIO           
                 Type              : SGPIO
                 Status            : OK
 
        PD 
                Connector          : Port 0 - 3<Internal><Encl Pos 0 >: Slot 11 
                Vendor Id          : SEAGATE 
                Product Id         : ST360057SSUN600G
                State              : Online
                Disk Type          : SAS,Hard Disk Device
                Capacity           : 557.861 GB
                Power State        : Active
 
                Connector          : Port 0 - 3<Internal><Encl Pos 0 >: Slot 10 
                Vendor Id          : SEAGATE 
                Product Id         : ST360057SSUN600G
                State              : Online
                Disk Type          : SAS,Hard Disk Device
                Capacity           : 557.861 GB
                Power State        : Active
 
                Connector          : Port 0 - 3<Internal><Encl Pos 0 >: Slot 9 
                Vendor Id          : SEAGATE 
                Product Id         : ST360057SSUN600G
                State              : Online
                Disk Type          : SAS,Hard Disk Device
                Capacity           : 557.861 GB
                Power State        : Active
 
                Connector          : Port 0 - 3<Internal><Encl Pos 0 >: Slot 8 
                Vendor Id          : SEAGATE 
                Product Id         : ST360057SSUN600G
                State              : Online
                Disk Type          : SAS,Hard Disk Device
                Capacity           : 557.861 GB
                Power State        : Active
 
                Connector          : Port 0 - 3<Internal><Encl Pos 0 >: Slot 7 
                Vendor Id          : SEAGATE 
                Product Id         : ST360057SSUN600G
                State              : Online
                Disk Type          : SAS,Hard Disk Device
                Capacity           : 557.861 GB
                Power State        : Active
 
                Connector          : Port 0 - 3<Internal><Encl Pos 0 >: Slot 6 
                Vendor Id          : SEAGATE 
                Product Id         : ST360057SSUN600G
                State              : Online
                Disk Type          : SAS,Hard Disk Device
                Capacity           : 557.861 GB
                Power State        : Active
 
                Connector          : Port 0 - 3<Internal><Encl Pos 0 >: Slot 4 
                Vendor Id          : SEAGATE 
                Product Id         : ST360057SSUN600G
                State              : Online
                Disk Type          : SAS,Hard Disk Device
                Capacity           : 557.861 GB
                Power State        : Active
 
                Connector          : Port 0 - 3<Internal><Encl Pos 0 >: Slot 3 
                Vendor Id          : SEAGATE 
                Product Id         : ST360057SSUN600G
                State              : Online
                Disk Type          : SAS,Hard Disk Device
                Capacity           : 557.861 GB
                Power State        : Active
 
                Connector          : Port 0 - 3<Internal><Encl Pos 0 >: Slot 2 
                Vendor Id          : SEAGATE 
                Product Id         : ST360057SSUN600G
                State              : Online
                Disk Type          : SAS,Hard Disk Device
                Capacity           : 557.861 GB
                Power State        : Active
 
                Connector          : Port 0 - 3<Internal><Encl Pos 0 >: Slot 1 
                Vendor Id          : SEAGATE 
                Product Id         : ST360057SSUN600G
                State              : Online
                Disk Type          : SAS,Hard Disk Device
                Capacity           : 557.861 GB
                Power State        : Active
 
                Connector          : Port 0 - 3<Internal><Encl Pos 0 >: Slot 0 
                Vendor Id          : SEAGATE 
                Product Id         : ST360057SSUN600G
                State              : Online
                Disk Type          : SAS,Hard Disk Device
                Capacity           : 557.861 GB
                Power State        : Active
 
                Connector          : Port 0 - 3<Internal><Encl Pos 0 >: Slot 5 
                Vendor Id          : SEAGATE 
                Product Id         : ST360057SSUN600G
                State              : Online
                Disk Type          : SAS,Hard Disk Device
                Capacity           : 557.861 GB
                Power State        : Active
 
Storage
 
       Virtual Drives
                Virtual drive      : Target Id 0 ,VD name 
                Size               : 557.861 GB
                State              : Optimal
                RAID Level         : 0 
 
                Virtual drive      : Target Id 1 ,VD name 
                Size               : 557.861 GB
                State              : Optimal
                RAID Level         : 0 
 
                Virtual drive      : Target Id 2 ,VD name 
                Size               : 557.861 GB
                State              : Optimal
                RAID Level         : 0 
 
                Virtual drive      : Target Id 3 ,VD name 
                Size               : 557.861 GB
                State              : Optimal
                RAID Level         : 0 
 
                Virtual drive      : Target Id 4 ,VD name 
                Size               : 557.861 GB
                State              : Optimal
                RAID Level         : 0 
 
                Virtual drive      : Target Id 6 ,VD name 
                Size               : 557.861 GB
                State              : Optimal
                RAID Level         : 0 
 
                Virtual drive      : Target Id 7 ,VD name 
                Size               : 557.861 GB
                State              : Optimal
                RAID Level         : 0 
 
                Virtual drive      : Target Id 8 ,VD name 
                Size               : 557.861 GB
                State              : Optimal
                RAID Level         : 0 
 
                Virtual drive      : Target Id 9 ,VD name 
                Size               : 557.861 GB
                State              : Optimal
                RAID Level         : 0 
 
                Virtual drive      : Target Id 10 ,VD name 
                Size               : 557.861 GB
                State              : Optimal
                RAID Level         : 0 
 
                Virtual drive      : Target Id 11 ,VD name 
                Size               : 557.861 GB
                State              : Optimal
                RAID Level         : 0 
 
                Virtual drive      : Target Id 5 ,VD name 
                Size               : 557.861 GB
                State              : Optimal
                RAID Level         : 0 
 
 
Exit Code: 0x00
[root@cm01cel01 ~]#
 
You’ll notice above that we’ve got twelve Seagate ST360057SSUN600G 557.861GB High Performance disks in this storage server.  Using cellcli, we can confirm this and note the corresponding sizes:
 
CellCLI> list physicaldisk attributes name,diskType,physicalSize
20:0     HardDisk 558.9109999993816G
20:1     HardDisk 558.9109999993816G
20:2     HardDisk 558.9109999993816G
20:3     HardDisk 558.9109999993816G
20:4     HardDisk 558.9109999993816G
20:5     HardDisk 558.9109999993816G
20:6     HardDisk 558.9109999993816G
20:7     HardDisk 558.9109999993816G
20:8     HardDisk 558.9109999993816G
20:9     HardDisk 558.9109999993816G
20:10     HardDisk 558.9109999993816G
20:11     HardDisk 558.9109999993816G
FLASH_1_0 FlashDisk 22.8880615234375G
FLASH_1_1 FlashDisk 22.8880615234375G
FLASH_1_2 FlashDisk 22.8880615234375G
FLASH_1_3 FlashDisk 22.8880615234375G
FLASH_2_0 FlashDisk 22.8880615234375G
FLASH_2_1 FlashDisk 22.8880615234375G
FLASH_2_2 FlashDisk 22.8880615234375G
FLASH_2_3 FlashDisk 22.8880615234375G
FLASH_4_0 FlashDisk 22.8880615234375G
FLASH_4_1 FlashDisk 22.8880615234375G
FLASH_4_2 FlashDisk 22.8880615234375G
FLASH_4_3 FlashDisk 22.8880615234375G
FLASH_5_0 FlashDisk 22.8880615234375G
FLASH_5_1 FlashDisk 22.8880615234375G
FLASH_5_2 FlashDisk 22.8880615234375G
FLASH_5_3 FlashDisk 22.8880615234375G
 
 
Cell Server OS Storage
 
We know from documentation that the operating system on the Exadata storage servers resides on the first two SAS disks on the cell.  Let’s do a “df -h” from the host:
 
[root@cm01cel01 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/md6              9.9G  3.6G  5.9G  38% /
tmpfs                  12G     0   12G   0% /dev/shm
/dev/md8              2.0G  647M  1.3G  34% /opt/oracle
/dev/md4              116M   60M   50M  55% /boot
/dev/md11             2.3G  130M  2.1G   6% /var/log/oracle
[root@cm01cel01 ~]# 
 
Based on the /dev/md* device names, we know we’ve got software RAID in play for these devices, and that this RAID was created using mdadm.  Let’s query the mdadm configuration for /dev/md6, /dev/md8, /dev/md5, and /dev/md11:
 
[root@cm01cel01 ~]# mdadm -Q -D /dev/md6
/dev/md6:
        Version : 0.90
  Creation Time : Mon Feb 21 13:06:27 2011
     Raid Level : raid1
     Array Size : 10482304 (10.00 GiB 10.73 GB)
  Used Dev Size : 10482304 (10.00 GiB 10.73 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 6
    Persistence : Superblock is persistent
 
    Update Time : Sun Mar 25 20:50:28 2012
          State : active
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0
 
           UUID : 2ea655b5:89c5cafc:b8bacc8c:27078485
         Events : 0.49
 
    Number   Major   Minor   RaidDevice State
       0       8        6        0      active sync   /dev/sda6
       1       8       22        1      active sync   /dev/sdb6
[root@cm01cel01 ~]# mdadm -Q -D /dev/md8
/dev/md8:
        Version : 0.90
  Creation Time : Mon Feb 21 13:06:29 2011
     Raid Level : raid1
     Array Size : 2096384 (2047.59 MiB 2146.70 MB)
  Used Dev Size : 2096384 (2047.59 MiB 2146.70 MB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 8
    Persistence : Superblock is persistent
 
    Update Time : Sun Mar 25 20:50:16 2012
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0
 
           UUID : 4c4b589f:a2e42e48:8847db6b:832284bd
         Events : 0.78
 
    Number   Major   Minor   RaidDevice State
       0       8        8        0      active sync   /dev/sda8
       1       8       24        1      active sync   /dev/sdb8
[root@cm01cel01 ~]# mdadm -Q -D /dev/md5
/dev/md5:
        Version : 0.90
  Creation Time : Mon Feb 21 13:06:20 2011
     Raid Level : raid1
     Array Size : 10482304 (10.00 GiB 10.73 GB)
  Used Dev Size : 10482304 (10.00 GiB 10.73 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 5
    Persistence : Superblock is persistent
 
    Update Time : Sun Mar 25 04:27:05 2012
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0
 
           UUID : bf701820:0c124b92:9c9bfc74:7d418b3f
         Events : 0.36
 
    Number   Major   Minor   RaidDevice State
       0       8        5        0      active sync   /dev/sda5
       1       8       21        1      active sync   /dev/sdb5
[root@cm01cel01 ~]# mdadm -Q -D /dev/md11
/dev/md11:
        Version : 0.90
  Creation Time : Mon Feb 21 13:06:29 2011
     Raid Level : raid1
     Array Size : 2433728 (2.32 GiB 2.49 GB)
  Used Dev Size : 2433728 (2.32 GiB 2.49 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 11
    Persistence : Superblock is persistent
 
    Update Time : Sun Mar 25 20:50:32 2012
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0
 
           UUID : 9d76d724:5a2e31a1:fa34e9e7:a875f020
         Events : 0.82
 
    Number   Major   Minor   RaidDevice State
       0       8       11        0      active sync   /dev/sda11
       1       8       27        1      active sync   /dev/sdb11
[root@cm01cel01 ~]#
 
From the above output, we can see that the /dev/sda and /dev/sdb physical devices are software-mirrored via mdadm.  If we do a “fdisk -l”, we see the following:
 
[root@cm01cel01 ~]# fdisk -l
 
Disk /dev/sda: 598.9 GB, 598999040000 bytes
255 heads, 63 sectors/track, 72824 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
 
   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          15      120456   fd  Linux raid autodetect
/dev/sda2              16          16        8032+  83  Linux
/dev/sda3              17       69039   554427247+  83  Linux
/dev/sda4           69040       72824    30403012+   f  W95 Ext'd (LBA)
/dev/sda5           69040       70344    10482381   fd  Linux raid autodetect
/dev/sda6           70345       71649    10482381   fd  Linux raid autodetect
/dev/sda7           71650       71910     2096451   fd  Linux raid autodetect
/dev/sda8           71911       72171     2096451   fd  Linux raid autodetect
/dev/sda9           72172       72432     2096451   fd  Linux raid autodetect
/dev/sda10          72433       72521      714861   fd  Linux raid autodetect
/dev/sda11          72522       72824     2433816   fd  Linux raid autodetect
 
Disk /dev/sdb: 598.9 GB, 598999040000 bytes
255 heads, 63 sectors/track, 72824 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
 
   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1          15      120456   fd  Linux raid autodetect
/dev/sdb2              16          16        8032+  83  Linux
/dev/sdb3              17       69039   554427247+  83  Linux
/dev/sdb4           69040       72824    30403012+   f  W95 Ext'd (LBA)
/dev/sdb5           69040       70344    10482381   fd  Linux raid autodetect
/dev/sdb6           70345       71649    10482381   fd  Linux raid autodetect
/dev/sdb7           71650       71910     2096451   fd  Linux raid autodetect
/dev/sdb8           71911       72171     2096451   fd  Linux raid autodetect
/dev/sdb9           72172       72432     2096451   fd  Linux raid autodetect
/dev/sdb10          72433       72521      714861   fd  Linux raid autodetect
/dev/sdb11          72522       72824     2433816   fd  Linux raid autodetect
 
Disk /dev/sdc: 598.9 GB, 598999040000 bytes
255 heads, 63 sectors/track, 72824 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
 
Disk /dev/sdc doesn't contain a valid partition table
 
Disk /dev/sdd: 598.9 GB, 598999040000 bytes
255 heads, 63 sectors/track, 72824 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
 
Disk /dev/sdd doesn't contain a valid partition table
 
Disk /dev/sde: 598.9 GB, 598999040000 bytes
255 heads, 63 sectors/track, 72824 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
 
Disk /dev/sde doesn't contain a valid partition table
 
Disk /dev/sdf: 598.9 GB, 598999040000 bytes
255 heads, 63 sectors/track, 72824 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
 
Disk /dev/sdf doesn't contain a valid partition table
 
Disk /dev/sdg: 598.9 GB, 598999040000 bytes
255 heads, 63 sectors/track, 72824 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
 
Disk /dev/sdg doesn't contain a valid partition table
 
Disk /dev/sdh: 598.9 GB, 598999040000 bytes
255 heads, 63 sectors/track, 72824 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
 
Disk /dev/sdh doesn't contain a valid partition table
 
Disk /dev/sdi: 598.9 GB, 598999040000 bytes
255 heads, 63 sectors/track, 72824 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
 
Disk /dev/sdi doesn't contain a valid partition table
 
Disk /dev/sdj: 598.9 GB, 598999040000 bytes
255 heads, 63 sectors/track, 72824 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
 
Disk /dev/sdj doesn't contain a valid partition table
 
Disk /dev/sdk: 598.9 GB, 598999040000 bytes
255 heads, 63 sectors/track, 72824 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
 
Disk /dev/sdk doesn't contain a valid partition table
 
Disk /dev/sdl: 598.9 GB, 598999040000 bytes
255 heads, 63 sectors/track, 72824 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
 
Disk /dev/sdl doesn't contain a valid partition table
 
Disk /dev/sdm: 4009 MB, 4009754624 bytes
126 heads, 22 sectors/track, 2825 cylinders
Units = cylinders of 2772 * 512 = 1419264 bytes
 
   Device Boot      Start         End      Blocks   Id  System
/dev/sdm1               1        2824     3914053   83  Linux
 
Disk /dev/md1: 731 MB, 731906048 bytes
2 heads, 4 sectors/track, 178688 cylinders
Units = cylinders of 8 * 512 = 4096 bytes
 
Disk /dev/md1 doesn't contain a valid partition table
 
Disk /dev/md11: 2492 MB, 2492137472 bytes
2 heads, 4 sectors/track, 608432 cylinders
Units = cylinders of 8 * 512 = 4096 bytes
 
Disk /dev/md11 doesn't contain a valid partition table
 
Disk /dev/md2: 2146 MB, 2146697216 bytes
2 heads, 4 sectors/track, 524096 cylinders
Units = cylinders of 8 * 512 = 4096 bytes
 
Disk /dev/md2 doesn't contain a valid partition table
 
Disk /dev/md8: 2146 MB, 2146697216 bytes
2 heads, 4 sectors/track, 524096 cylinders
Units = cylinders of 8 * 512 = 4096 bytes
 
Disk /dev/md8 doesn't contain a valid partition table
 
Disk /dev/md7: 2146 MB, 2146697216 bytes
2 heads, 4 sectors/track, 524096 cylinders
Units = cylinders of 8 * 512 = 4096 bytes
 
Disk /dev/md7 doesn't contain a valid partition table
 
Disk /dev/md6: 10.7 GB, 10733879296 bytes
2 heads, 4 sectors/track, 2620576 cylinders
Units = cylinders of 8 * 512 = 4096 bytes
 
Disk /dev/md6 doesn't contain a valid partition table
 
Disk /dev/md5: 10.7 GB, 10733879296 bytes
2 heads, 4 sectors/track, 2620576 cylinders
Units = cylinders of 8 * 512 = 4096 bytes
 
Disk /dev/md5 doesn't contain a valid partition table
 
Disk /dev/md4: 123 MB, 123273216 bytes
2 heads, 4 sectors/track, 30096 cylinders
Units = cylinders of 8 * 512 = 4096 bytes
 
Disk /dev/md4 doesn't contain a valid partition table
 
Disk /dev/sdn: 24.5 GB, 24575868928 bytes
255 heads, 63 sectors/track, 2987 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
 
Disk /dev/sdn doesn't contain a valid partition table
 
Disk /dev/sdo: 24.5 GB, 24575868928 bytes
255 heads, 63 sectors/track, 2987 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
 
Disk /dev/sdo doesn't contain a valid partition table
 
Disk /dev/sdp: 24.5 GB, 24575868928 bytes
255 heads, 63 sectors/track, 2987 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
 
Disk /dev/sdp doesn't contain a valid partition table
 
Disk /dev/sdq: 24.5 GB, 24575868928 bytes
255 heads, 63 sectors/track, 2987 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
 
Disk /dev/sdq doesn't contain a valid partition table
 
Disk /dev/sdr: 24.5 GB, 24575868928 bytes
255 heads, 63 sectors/track, 2987 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
 
Disk /dev/sdr doesn't contain a valid partition table
 
Disk /dev/sds: 24.5 GB, 24575868928 bytes
255 heads, 63 sectors/track, 2987 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
 
Disk /dev/sds doesn't contain a valid partition table
 
Disk /dev/sdt: 24.5 GB, 24575868928 bytes
255 heads, 63 sectors/track, 2987 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
 
Disk /dev/sdt doesn't contain a valid partition table
 
Disk /dev/sdu: 24.5 GB, 24575868928 bytes
255 heads, 63 sectors/track, 2987 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
 
Disk /dev/sdu doesn't contain a valid partition table
 
Disk /dev/sdv: 24.5 GB, 24575868928 bytes
255 heads, 63 sectors/track, 2987 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
 
Disk /dev/sdv doesn't contain a valid partition table
 
Disk /dev/sdw: 24.5 GB, 24575868928 bytes
255 heads, 63 sectors/track, 2987 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
 
Disk /dev/sdw doesn't contain a valid partition table
 
Disk /dev/sdx: 24.5 GB, 24575868928 bytes
255 heads, 63 sectors/track, 2987 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
 
Disk /dev/sdx doesn't contain a valid partition table
 
Disk /dev/sdy: 24.5 GB, 24575868928 bytes
255 heads, 63 sectors/track, 2987 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
 
Disk /dev/sdy doesn't contain a valid partition table
 
Disk /dev/sdz: 24.5 GB, 24575868928 bytes
255 heads, 63 sectors/track, 2987 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
 
Disk /dev/sdz doesn't contain a valid partition table
 
Disk /dev/sdaa: 24.5 GB, 24575868928 bytes
255 heads, 63 sectors/track, 2987 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
 
Disk /dev/sdaa doesn't contain a valid partition table
 
Disk /dev/sdab: 24.5 GB, 24575868928 bytes
255 heads, 63 sectors/track, 2987 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
 
Disk /dev/sdab doesn't contain a valid partition table
 
Disk /dev/sdac: 24.5 GB, 24575868928 bytes
255 heads, 63 sectors/track, 2987 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
 
Disk /dev/sdac doesn't contain a valid partition table
[root@cm01cel01 ~]#
 
This is telling us the following:
 
- The /dev/md* devices mounted earlier (/dev/md4, /dev/md6, /dev/md8, /dev/md11) are mdadm software-RAID mirrors; the ones we queried are built on matching partition pairs from /dev/sda and /dev/sdb, and together they hold the OS storage (a quick cross-check sketch follows this list)
- /dev/sdc, /dev/sdd, /dev/sde, /dev/sdf, /dev/sdg, /dev/sdh, /dev/sdi, /dev/sdj, /dev/sdk, and /dev/sdl don’t contain valid partition tables because they’re wholly reserved for database storage
- /dev/sda3 and /dev/sdb3, the large remaining partitions on the first two disks, carry no file-system and are used for database storage on those disks
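As a quick cross-check of the mdadm layout (a sketch, not output captured from this system), the kernel’s md status file and mdadm’s scan mode summarize every array and its member partitions at once:
 
cat /proc/mdstat           # all md arrays, their RAID level, and their member partitions
mdadm --detail --scan      # the same summary from mdadm itself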
 
LUNs
 
The hierarchy of non-flash database storage in an Exadata storage server runs from physical disk to LUN, from LUN to cell disk, and from cell disk to grid disk, which is what ASM ultimately sees.
 
A LUN is created on each physical disk and maps to the entire drive.  On the first two drives, the LUN is flagged as a system LUN because those drives also carry the system area; on the remaining drives it is not.  Exadata cell disks are created on LUNs.  Let’s look at a cellcli output:
 
CellCLI> list lun attributes name, deviceName, isSystemLun, physicalDrives, lunSize where disktype=harddisk
0_0 /dev/sda TRUE 20:0 557.861328125G
0_1 /dev/sdb TRUE 20:1 557.861328125G
0_2 /dev/sdc FALSE 20:2 557.861328125G
0_3 /dev/sdd FALSE 20:3 557.861328125G
0_4 /dev/sde FALSE 20:4 557.861328125G
0_5 /dev/sdf FALSE 20:5 557.861328125G
0_6 /dev/sdg FALSE 20:6 557.861328125G
0_7 /dev/sdh FALSE 20:7 557.861328125G
0_8 /dev/sdi FALSE 20:8 557.861328125G
0_9 /dev/sdj FALSE 20:9 557.861328125G
0_10 /dev/sdk FALSE 20:10 557.861328125G
0_11 /dev/sdl FALSE 20:11 557.861328125G
 
CellCLI>
 
From the above, we can see that the first two LUNs (on drives 20:0 and 20:1) are system LUNs, and the remaining ten are not.
 
Cell Disks
 
Cell disks are created on LUNs and are the storage abstraction on which grid disks are created.
 
From cellcli, our celldisks look like this:
 
CellCLI> list celldisk attributes name,deviceName,devicePartition,lun,size where disktype=harddisk
CD_00_cm01cel01 /dev/sda /dev/sda3 0_0 528.734375G
CD_01_cm01cel01 /dev/sdb /dev/sdb3 0_1 528.734375G
CD_02_cm01cel01 /dev/sdc /dev/sdc 0_2 557.859375G
CD_03_cm01cel01 /dev/sdd /dev/sdd 0_3 557.859375G
CD_04_cm01cel01 /dev/sde /dev/sde 0_4 557.859375G
CD_05_cm01cel01 /dev/sdf /dev/sdf 0_5 557.859375G
CD_06_cm01cel01 /dev/sdg /dev/sdg 0_6 557.859375G
CD_07_cm01cel01 /dev/sdh /dev/sdh 0_7 557.859375G
CD_08_cm01cel01 /dev/sdi /dev/sdi 0_8 557.859375G
CD_09_cm01cel01 /dev/sdj /dev/sdj 0_9 557.859375G
CD_10_cm01cel01 /dev/sdk /dev/sdk 0_10 557.859375G
CD_11_cm01cel01 /dev/sdl /dev/sdl 0_11 557.859375G
 
CellCLI>
 
A couple of things to note about the above:
 
- The cell disk size on the first two drives is about 29GB smaller than on the remaining ten drives; this is because the system area resides on the first two drives (the exact difference is computed in the sketch after this list)
- The device partition on the first two cell disks is /dev/sda3 and /dev/sdb3, which is what we expected from the fdisk output in the Cell Server OS Storage section
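The size difference can be read straight off the cell disk listing above with a bit of shell arithmetic (a sketch):
 
echo "557.859375 - 528.734375" | bc      # 29.125 GB consumed by the system area on the first two disks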
 
Grid Disks
 
Grid disks are created on cell disks and represent the storage available to ASM; in other words, when you create ASM disk groups, the devices you use are grid disks.  From cellcli, we can see the following grid disks:
 
CellCLI> list griddisk
DATA_CD_00_cm01cel01   active
DATA_CD_01_cm01cel01   active
DATA_CD_02_cm01cel01   active
DATA_CD_03_cm01cel01   active
DATA_CD_04_cm01cel01   active
DATA_CD_05_cm01cel01   active
DATA_CD_06_cm01cel01   active
DATA_CD_07_cm01cel01   active
DATA_CD_08_cm01cel01   active
DATA_CD_09_cm01cel01   active
DATA_CD_10_cm01cel01   active
DATA_CD_11_cm01cel01   active
DBFS_DG_CD_02_cm01cel01 active
DBFS_DG_CD_03_cm01cel01 active
DBFS_DG_CD_04_cm01cel01 active
DBFS_DG_CD_05_cm01cel01 active
DBFS_DG_CD_06_cm01cel01 active
DBFS_DG_CD_07_cm01cel01 active
DBFS_DG_CD_08_cm01cel01 active
DBFS_DG_CD_09_cm01cel01 active
DBFS_DG_CD_10_cm01cel01 active
DBFS_DG_CD_11_cm01cel01 active
RECO_CD_00_cm01cel01   active
RECO_CD_01_cm01cel01   active
RECO_CD_02_cm01cel01   active
RECO_CD_03_cm01cel01   active
RECO_CD_04_cm01cel01   active
RECO_CD_05_cm01cel01   active
RECO_CD_06_cm01cel01   active
RECO_CD_07_cm01cel01   active
RECO_CD_08_cm01cel01   active
RECO_CD_09_cm01cel01   active
RECO_CD_10_cm01cel01   active
RECO_CD_11_cm01cel01   active
 
CellCLI>
 
In this configuration, we have:
 
- Three different types of grid disks, prefixed DATA, RECO, and DBFS_DG
- The names follow the convention <PREFIX>_<CELL DISK NAME> (and the cell disk name itself ends with the cell server name), but grid disk names can be whatever you’d like (a creation sketch follows this list)
- There is a DATA and a RECO grid disk on every cell disk, and a DBFS_DG grid disk on all but the first two, whose space is partly taken by the system area.  Spreading each type across as many cell disks as possible isn’t a requirement, but it’s probably what you want: when creating ASM disk groups you’d typically wildcard the disk string, and ideally you want the storage spread across every physical disk in every cell.
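For reference, grid disks with this sort of layout are normally created in bulk from cellcli with a prefix, letting the software derive the <PREFIX>_<CELL DISK NAME> names automatically.  A sketch, not the exact commands used on this rack (the sizes are illustrative, and the cell disks here were created with interleaving, which changes the resulting offsets):
 
cellcli -e "create griddisk all harddisk prefix=DATA, size=423G"
cellcli -e "create griddisk all harddisk prefix=RECO, size=105G"
# in this configuration, the DBFS_DG grid disks occupy the space left over on cell disks 02-11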
 
Let’s take a look at a couple of DATA_CD% disks, DATA_CD_00_cm01cel01 and DATA_CD_10_cm01cel01:  
 
CellCLI> list griddisk attributes name,asmDiskGroupName,celldisk,offset,size where name=DATA_CD_10_cm01cel01
DATA_CD_10_cm01cel01 DATA_CM01 CD_10_cm01cel01 32M 423G
 
CellCLI> list griddisk attributes name,asmDiskGroupName,celldisk,offset,size where name=DATA_CD_00_cm01cel01
DATA_CD_00_cm01cel01 DATA_CM01 CD_00_cm01cel01 32M 423G
 
CellCLI>
 
This shows:
 
- A uniform grid disk size of 423G, as specified during grid disk creation.  If you create grid disks one at a time instead of in bulk, you have the flexibility to use different sizes on different cell disks, but that’s probably a bad idea because it upsets the balance of extents across physical disks.
- A byte offset of 32M, which essentially means that the grid disk (and its extents) starts 32MB in from the outer edge of the physical drive.
 
Let’s look at all of our grid disks:
 
CellCLI> list griddisk attributes name,asmDiskGroupName,celldisk,offset,size                     
DATA_CD_00_cm01cel01   DATA_CM01 CD_00_cm01cel01 32M         423G
DATA_CD_01_cm01cel01   DATA_CM01 CD_01_cm01cel01 32M         423G
DATA_CD_02_cm01cel01   DATA_CM01 CD_02_cm01cel01 32M         423G
DATA_CD_03_cm01cel01   DATA_CM01 CD_03_cm01cel01 32M         423G
DATA_CD_04_cm01cel01   DATA_CM01 CD_04_cm01cel01 32M         423G
DATA_CD_05_cm01cel01   DATA_CM01 CD_05_cm01cel01 32M         423G
DATA_CD_06_cm01cel01   DATA_CM01 CD_06_cm01cel01 32M         423G
DATA_CD_07_cm01cel01   DATA_CM01 CD_07_cm01cel01 32M         423G
DATA_CD_08_cm01cel01   DATA_CM01 CD_08_cm01cel01 32M         423G
DATA_CD_09_cm01cel01   DATA_CM01 CD_09_cm01cel01 32M         423G
DATA_CD_10_cm01cel01   DATA_CM01 CD_10_cm01cel01 32M         423G
DATA_CD_11_cm01cel01   DATA_CM01 CD_11_cm01cel01 32M         423G
DBFS_DG_CD_02_cm01cel01 DBFS_DG   CD_02_cm01cel01 264.046875G 29.125G
DBFS_DG_CD_03_cm01cel01 DBFS_DG   CD_03_cm01cel01 264.046875G 29.125G
DBFS_DG_CD_04_cm01cel01 DBFS_DG   CD_04_cm01cel01 264.046875G 29.125G
DBFS_DG_CD_05_cm01cel01 DBFS_DG   CD_05_cm01cel01 264.046875G 29.125G
DBFS_DG_CD_06_cm01cel01 DBFS_DG   CD_06_cm01cel01 264.046875G 29.125G
DBFS_DG_CD_07_cm01cel01 DBFS_DG   CD_07_cm01cel01 264.046875G 29.125G
DBFS_DG_CD_08_cm01cel01 DBFS_DG   CD_08_cm01cel01 264.046875G 29.125G
DBFS_DG_CD_09_cm01cel01 DBFS_DG   CD_09_cm01cel01 264.046875G 29.125G
DBFS_DG_CD_10_cm01cel01 DBFS_DG   CD_10_cm01cel01 264.046875G 29.125G
DBFS_DG_CD_11_cm01cel01 DBFS_DG   CD_11_cm01cel01 264.046875G 29.125G
RECO_CD_00_cm01cel01   RECO_CM01 CD_00_cm01cel01 211.546875G 105G
RECO_CD_01_cm01cel01   RECO_CM01 CD_01_cm01cel01 211.546875G 105G
RECO_CD_02_cm01cel01   RECO_CM01 CD_02_cm01cel01 211.546875G 105G
RECO_CD_03_cm01cel01   RECO_CM01 CD_03_cm01cel01 211.546875G 105G
RECO_CD_04_cm01cel01   RECO_CM01 CD_04_cm01cel01 211.546875G 105G
RECO_CD_05_cm01cel01   RECO_CM01 CD_05_cm01cel01 211.546875G 105G
RECO_CD_06_cm01cel01   RECO_CM01 CD_06_cm01cel01 211.546875G 105G
RECO_CD_07_cm01cel01   RECO_CM01 CD_07_cm01cel01 211.546875G 105G
RECO_CD_08_cm01cel01   RECO_CM01 CD_08_cm01cel01 211.546875G 105G
RECO_CD_09_cm01cel01   RECO_CM01 CD_09_cm01cel01 211.546875G 105G
RECO_CD_10_cm01cel01   RECO_CM01 CD_10_cm01cel01 211.546875G 105G
RECO_CD_11_cm01cel01   RECO_CM01 CD_11_cm01cel01 211.546875G 105G
 
CellCLI>
 
We can see from the above that although the DATA% grid disks are 423G in size, the byte offset for the RECO grid disks is about 211G and for the DBFS_DG grid disks about 264G.  If the grid disks were laid out end to end, RECO would not start until an offset of at least 423G, so these smaller offsets tell us our grid disks are built on cell disks defined with interleaving.  We can confirm this by checking our cell disk configuration:
 
CellCLI> list celldisk attributes name, interleaving
CD_00_cm01cel01 normal_redundancy
CD_01_cm01cel01 normal_redundancy
CD_02_cm01cel01 normal_redundancy
CD_03_cm01cel01 normal_redundancy
CD_04_cm01cel01 normal_redundancy
CD_05_cm01cel01 normal_redundancy
CD_06_cm01cel01 normal_redundancy
CD_07_cm01cel01 normal_redundancy
CD_08_cm01cel01 normal_redundancy
CD_09_cm01cel01 normal_redundancy
CD_10_cm01cel01 normal_redundancy
CD_11_cm01cel01 normal_redundancy
 
Flash Disks
 
Before moving on to ASM storage, let’s talk about the PCI flash cards in each storage cell.  There are four 96GB PCI flash cards in each server, for a total of 384GB of PCI flash per cell.   We can see that each flash card is split into four 22.88GB regions (FDOMs):
 
CellCLI> list physicaldisk attributes name,physicalsize,slotnumber where disktype=FlashDisk
FLASH_1_0 22.8880615234375G "PCI Slot: 1; FDOM: 0"
FLASH_1_1 22.8880615234375G "PCI Slot: 1; FDOM: 1"
FLASH_1_2 22.8880615234375G "PCI Slot: 1; FDOM: 2"
FLASH_1_3 22.8880615234375G "PCI Slot: 1; FDOM: 3"
FLASH_2_0 22.8880615234375G "PCI Slot: 2; FDOM: 0"
FLASH_2_1 22.8880615234375G "PCI Slot: 2; FDOM: 1"
FLASH_2_2 22.8880615234375G "PCI Slot: 2; FDOM: 2"
FLASH_2_3 22.8880615234375G "PCI Slot: 2; FDOM: 3"
FLASH_4_0 22.8880615234375G "PCI Slot: 4; FDOM: 0"
FLASH_4_1 22.8880615234375G "PCI Slot: 4; FDOM: 1"
FLASH_4_2 22.8880615234375G "PCI Slot: 4; FDOM: 2"
FLASH_4_3 22.8880615234375G "PCI Slot: 4; FDOM: 3"
FLASH_5_0 22.8880615234375G "PCI Slot: 5; FDOM: 0"
FLASH_5_1 22.8880615234375G "PCI Slot: 5; FDOM: 1"
FLASH_5_2 22.8880615234375G "PCI Slot: 5; FDOM: 2"
FLASH_5_3 22.8880615234375G "PCI Slot: 5; FDOM: 3"
 
CellCLI>
 
We can also determine whether this is configured for Smart Flash Cache:
 
CellCLI> list flashcache detail
name:               cm01cel01_FLASHCACHE
cellDisk:           FD_07_cm01cel01,FD_12_cm01cel01,FD_09_cm01cel01,FD_04_cm01cel01,FD_02_cm01cel01,FD_01_cm01cel01,FD_13_cm01cel01,FD_14_cm01cel01,FD_08_cm01cel01,FD_00_cm01cel01,FD_06_cm01cel01,FD_03_cm01cel01,FD_10_cm01cel01,FD_15_cm01cel01,FD_05_cm01cel01,FD_11_cm01cel01
creationTime:       2012-02-20T23:09:15-05:00
degradedCelldisks:
effectiveCacheSize: 364.75G
id:                 08e69f5d-48ca-4c5e-b614-25989a33b269
size:               364.75G
status:             normal
 
CellCLI>
 
In the above output we can see that every flash disk is allocated to Smart Flash Cache, for a total size of 364.75G.
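For completeness, a flash cache that spans all of the flash-based cell disks like this is typically built with a single cellcli command; an optional size clause holds back part of the flash for flash-based grid disks instead.  A sketch (the size value is hypothetical):
 
cellcli -e "create flashcache all"               # use every flash cell disk for Smart Flash Cache
cellcli -e "create flashcache all size=300G"     # alternative: cap the cache and leave flash for grid disks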
 
ASM Storage
As mentioned previously, ASM disk groups are built on storage cell grid disks.  In this environment, we’ve created a disk group called DATA_CM01 of type normal redundancy using an InfiniBand-aware disk string wildcard, ‘o/*/DATA*’.  Here’s what the wildcard means (a sketch of the full CREATE DISKGROUP statement follows the breakdown below):
 
- “o” means to look for devices over the InfiniBand network.
- The first wildcard, “o/*”, means to build a disk group on devices across all storage server InfiniBand IP addresses.  From the compute node, Oracle determines these by examining cellip.ora.  See below:
 
[grid@cm01dbm01 ~]$ locate cellip.ora
/etc/oracle/cell/network-config/cellip.ora
/opt/oracle.SupportTools/onecommand/tmp/cellip.ora
[grid@cm01dbm01 ~]$ cat /etc/oracle/cell/network-config/cellip.ora
cell="192.168.10.3"
cell="192.168.10.4"
cell="192.168.10.5"
[grid@cm01dbm01 ~]$
 
- “DATA*” indicates that the disk group should be built on every grid disk whose name starts with “DATA”
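Putting that together, the disk group creation itself is a single SQL statement run from a compute node against those grid disks.  A minimal sketch, run as the grid user with the ASM environment set; the attribute values are illustrative and may differ from what was actually used on this rack:
 
sqlplus -s / as sysasm <<'EOF'
CREATE DISKGROUP DATA_CM01 NORMAL REDUNDANCY
  DISK 'o/*/DATA*'
  ATTRIBUTE 'compatible.asm'          = '11.2.0.0.0',
            'compatible.rdbms'        = '11.2.0.0.0',
            'cell.smart_scan_capable' = 'TRUE',
            'au_size'                 = '4M';
EOF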
 
Let’s look more closely at what this ASM disk group looks like:
 
SQL> select a.name,b.path,b.state,b.failgroup
  2  from v$asm_diskgroup a, v$asm_disk b
  3  where a.group_number=b.group_number
  4  and a.name like '%DATA%'
  5  order by 4,1
  6  /
DATA_CM01       o/192.168.10.3/DATA_CD_01_cm01cel01  NORMAL   CM01CEL01
DATA_CM01       o/192.168.10.3/DATA_CD_04_cm01cel01  NORMAL   CM01CEL01
DATA_CM01       o/192.168.10.3/DATA_CD_10_cm01cel01  NORMAL   CM01CEL01
DATA_CM01       o/192.168.10.3/DATA_CD_02_cm01cel01  NORMAL   CM01CEL01
DATA_CM01       o/192.168.10.3/DATA_CD_06_cm01cel01  NORMAL   CM01CEL01
DATA_CM01       o/192.168.10.3/DATA_CD_05_cm01cel01  NORMAL   CM01CEL01
DATA_CM01       o/192.168.10.3/DATA_CD_07_cm01cel01  NORMAL   CM01CEL01
DATA_CM01       o/192.168.10.3/DATA_CD_08_cm01cel01  NORMAL   CM01CEL01
DATA_CM01       o/192.168.10.3/DATA_CD_00_cm01cel01  NORMAL   CM01CEL01
DATA_CM01       o/192.168.10.3/DATA_CD_11_cm01cel01  NORMAL   CM01CEL01
DATA_CM01       o/192.168.10.3/DATA_CD_03_cm01cel01  NORMAL   CM01CEL01
DATA_CM01       o/192.168.10.3/DATA_CD_09_cm01cel01  NORMAL   CM01CEL01
DATA_CM01       o/192.168.10.4/DATA_CD_08_cm01cel02  NORMAL   CM01CEL02
DATA_CM01       o/192.168.10.4/DATA_CD_07_cm01cel02  NORMAL   CM01CEL02
DATA_CM01       o/192.168.10.4/DATA_CD_02_cm01cel02  NORMAL   CM01CEL02
DATA_CM01       o/192.168.10.4/DATA_CD_06_cm01cel02  NORMAL   CM01CEL02
DATA_CM01       o/192.168.10.4/DATA_CD_09_cm01cel02  NORMAL   CM01CEL02
DATA_CM01       o/192.168.10.4/DATA_CD_05_cm01cel02  NORMAL   CM01CEL02
DATA_CM01       o/192.168.10.4/DATA_CD_11_cm01cel02  NORMAL   CM01CEL02
DATA_CM01       o/192.168.10.4/DATA_CD_10_cm01cel02  NORMAL   CM01CEL02
DATA_CM01       o/192.168.10.4/DATA_CD_04_cm01cel02  NORMAL   CM01CEL02
DATA_CM01       o/192.168.10.4/DATA_CD_03_cm01cel02  NORMAL   CM01CEL02
DATA_CM01       o/192.168.10.4/DATA_CD_00_cm01cel02  NORMAL   CM01CEL02
DATA_CM01       o/192.168.10.4/DATA_CD_01_cm01cel02  NORMAL   CM01CEL02
DATA_CM01       o/192.168.10.5/DATA_CD_02_cm01cel03  NORMAL   CM01CEL03
DATA_CM01       o/192.168.10.5/DATA_CD_01_cm01cel03  NORMAL   CM01CEL03
DATA_CM01       o/192.168.10.5/DATA_CD_06_cm01cel03  NORMAL   CM01CEL03
DATA_CM01       o/192.168.10.5/DATA_CD_10_cm01cel03  NORMAL   CM01CEL03
DATA_CM01       o/192.168.10.5/DATA_CD_05_cm01cel03  NORMAL   CM01CEL03
DATA_CM01       o/192.168.10.5/DATA_CD_09_cm01cel03  NORMAL   CM01CEL03
DATA_CM01       o/192.168.10.5/DATA_CD_08_cm01cel03  NORMAL   CM01CEL03
DATA_CM01       o/192.168.10.5/DATA_CD_11_cm01cel03  NORMAL   CM01CEL03
DATA_CM01       o/192.168.10.5/DATA_CD_04_cm01cel03  NORMAL   CM01CEL03
DATA_CM01       o/192.168.10.5/DATA_CD_07_cm01cel03  NORMAL   CM01CEL03
DATA_CM01       o/192.168.10.5/DATA_CD_00_cm01cel03  NORMAL   CM01CEL03
DATA_CM01       o/192.168.10.5/DATA_CD_03_cm01cel03  NORMAL   CM01CEL03
 
36 rows selected.
 
As we can see, we’ve got 36 disks in this DATA_CM01 disk group, one for each DATA grid disk on each of the 3 storage servers (recall we’re on a quarter rack, which has 3 storage cells).  The DBFS_DG and RECO_CM01 ASM disk groups would look very similar.
 
When we created this ASM disk group, we specified normal redundancy.  With Exadata, external redundancy is not an option; you need to use either normal or high redundancy.  With normal redundancy, each extent is mirrored by ASM to a partner disk on one other cell; with high redundancy, it’s mirrored to two other cells.  Specifically, extents are mirrored to partner disks in different failure groups, and on Exadata each storage cell is its own failure group.  Let’s take a look at these relationships, focusing on DATA_CM01:
 
SQL> select group_number,name from v$asm_diskgroup;
 
GROUP_NUMBER NAME
------------ ------------------------------
  1 DATA_CM01
  2 DBFS_DG
  3 RECO_CM01
 
SQL>
 
  1  SELECT count(disk_number)
  2  FROM v$asm_disk
  3* WHERE group_number = 1
SQL> /
 
COUNT(DISK_NUMBER)
------------------
36
 
Now we’ll see how many partners the disks have:
 
  1  SELECT disk "Disk", count(number_kfdpartner) "Number of partners"
  2  FROM x$kfdpartner
  3  WHERE grp=1
  4  GROUP BY disk
  5* ORDER BY 1
SQL> /
 
      Disk Number of partners
---------- ------------------
         0                  8
         1                  8
         2                  8
         3                  8
         4                  8
         5                  8
         6                  8
         7                  8
         8                  8
         9                  8
<< output truncated >>
 
We’ve got 8 partners for each disk.  Now let’s see where they actually reside:
 
SQL> SELECT d.group_number "Group#", d.disk_number "Disk#", p.number_kfdpartner "Partner disk#"
  2  FROM x$kfdpartner p, v$asm_disk d
  3  WHERE p.disk=d.disk_number and p.grp=d.group_number
  4  ORDER BY 1, 2, 3;
 
    Group#      Disk# Partner disk#
---------- ---------- -------------
         1          0            13
         1          0            16
         1          0            17
         1          0            23
         1          0            24
         1          0            29
         1          0            30
         1          0            34
         1          1             5
         1          1            12
         1          1            18
         1          1            20
         1          1            22
         1          1            23
         1          1            28
         1          1            34
         1          2            12
         1          2            17
         1          2            18
         1          2            20
<< output truncated >>
 
As we can see, the partner disks span multiple cells.
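To see that explicitly, the partner disk numbers can be joined back to v$asm_disk to show each partner’s failure group, i.e. its cell.  A sketch, run as the grid user with the ASM environment set (not output captured from this system):
 
sqlplus -s / as sysasm <<'EOF'
-- For disk group 1 (DATA_CM01): each disk, its cell (failgroup), and its partners' cells
SELECT d.disk_number disk#, d.failgroup,
       p.number_kfdpartner partner#, pd.failgroup partner_failgroup
FROM   x$kfdpartner p, v$asm_disk d, v$asm_disk pd
WHERE  p.grp = d.group_number  AND p.disk = d.disk_number
AND    p.grp = pd.group_number AND p.number_kfdpartner = pd.disk_number
AND    p.grp = 1
ORDER  BY 1, 3;
EOF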
 
Grid Infrastructure Storage
 
On the compute nodes we’re running Oracle 11gR2 Grid Infrastructure and Oracle RAC.  You don’t have to use RAC with Exadata, but most companies do.  With Grid Infrastructure, each compute node accesses the cluster registry (OCR) and the mirrored voting disks.  Where do these physically reside on the Exadata X2-2?
 
Let’s take a look:
 
[grid@cm01dbm01 ~]$ cd $ORACLE_HOME/bin
[grid@cm01dbm01 bin]$ ./ocrcheck
Status of Oracle Cluster Registry is as follows :
Version                  :          3
Total space (kbytes)     :     262120
Used space (kbytes)      :       3420
Available space (kbytes) :     258700
ID                       : 1833511320
Device/File Name         :   +DBFS_DG
                                    Device/File integrity check succeeded
 
                                    Device/File not configured
 
                                    Device/File not configured
 
                                    Device/File not configured
 
                                    Device/File not configured
 
Cluster registry integrity check succeeded
 
Logical corruption check bypassed due to non-privileged user
 
[grid@cm01dbm01 bin]$
 
[grid@cm01dbm01 bin]$ asmcmd
ASMCMD> ls
DATA_CM01/
DBFS_DG/
RECO_CM01/
ASMCMD> cd DBFS_DG
ASMCMD> ls
cm01-cluster/
ASMCMD> cd cm01-cluster
ASMCMD> ls
OCRFILE/
ASMCMD> cd OCRFILE
ASMCMD> ls
REGISTRY.255.753579427
ASMCMD> ls -l
Type     Redund  Striped  Time             Sys  Name
OCRFILE  MIRROR  COARSE   MAR 25 22:00:00  Y    REGISTRY.255.753579427
ASMCMD>
 
[grid@cm01dbm01 bin]$ ./crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   948f35d3d9c44f94bfe7bb831758104a (o/192.168.10.4/DBFS_DG_CD_06_cm01cel02) [DBFS_DG]
 2. ONLINE   61fb620328a24f87bf8c4a0ac0275cd1 (o/192.168.10.5/DBFS_DG_CD_05_cm01cel03) [DBFS_DG]
 3. ONLINE   60ab0b9e7dfe4f0abfb16b4344f5ede6 (o/192.168.10.3/DBFS_DG_CD_05_cm01cel01) [DBFS_DG]
Located 3 voting disk(s).
[grid@cm01dbm01 bin]$
 
From the above, it looks like:
 
- The OCR is stored in the DBFS_DG ASM disk group
- Three copies of the voting disk are also stored in DBFS_DG, one on a grid disk in each of the three storage cells
 
Database Storage
 
This is the easy part: Oracle uses ASM for database file storage on Exadata.  You are allowed to store files on NFS file-systems, but it’s generally discouraged because Exadata software features won’t be available for I/O against those files.  Let’s take a look at a sample database’s files:
 
  1  select name from v$datafile
  2  union
  3  select name from v$tempfile
  4  union
  5* select member from v$logfile
SQL> set echo on
SQL> /
+DATA_CM01/dwprd/datafile/dw_data.559.777990713
+DATA_CM01/dwprd/datafile/dw_indx.563.777990715
+DATA_CM01/dwprd/datafile/dwdim_data.558.777990713
+DATA_CM01/dwprd/datafile/dwdim_indx.560.777990715
+DATA_CM01/dwprd/datafile/dwdiss_data.534.777990711
+DATA_CM01/dwprd/datafile/dwdiss_indx.561.777990715
+DATA_CM01/dwprd/datafile/dwfact_data.557.777990713
+DATA_CM01/dwprd/datafile/dwfact_indx.562.777990715
+DATA_CM01/dwprd/datafile/dwlibrary_data.556.777990713
+DATA_CM01/dwprd/datafile/dwportal_data.541.777990713
+DATA_CM01/dwprd/datafile/dwstage_data.564.777990715
+DATA_CM01/dwprd/datafile/dwstore_data.540.777990713
+DATA_CM01/dwprd/datafile/dwstore_indx.565.777990715
+DATA_CM01/dwprd/datafile/dwsum_data.531.777990709
+DATA_CM01/dwprd/datafile/inf.530.777990709
+DATA_CM01/dwprd/datafile/infolog_data.533.777990711
+DATA_CM01/dwprd/datafile/sysaux.507.774050315
+DATA_CM01/dwprd/datafile/system.505.774050303
+DATA_CM01/dwprd/datafile/undotbs1.506.774050327
+DATA_CM01/dwprd/datafile/undotbs2.448.774050349
+DATA_CM01/dwprd/datafile/usagedim_data.539.777990713
+DATA_CM01/dwprd/datafile/usagedim_indx.566.777990717
+DATA_CM01/dwprd/datafile/usagefact_data.532.777990709
+DATA_CM01/dwprd/datafile/usagefact_indx.567.777990717
+DATA_CM01/dwprd/datafile/usagereport_data.538.777990711
+DATA_CM01/dwprd/datafile/usagereport_indx.568.777990717
+DATA_CM01/dwprd/datafile/usagestage_data.537.777990711
+DATA_CM01/dwprd/datafile/usagestage_indx.569.777990717
+DATA_CM01/dwprd/datafile/usagestore_data.536.777990711
+DATA_CM01/dwprd/datafile/usagestore_indx.570.777990717
+DATA_CM01/dwprd/datafile/usagesum_data.535.777990711
+DATA_CM01/dwprd/datafile/usagesum_indx.571.777990717
+DATA_CM01/dwprd/datafile/users.499.774050361
+DATA_CM01/dwprd/redo01.log
+DATA_CM01/dwprd/redo02.log
+DATA_CM01/dwprd/redo03.log
+DATA_CM01/dwprd/redo04.log
+DATA_CM01/dwprd/redo05.log
+DATA_CM01/dwprd/redo06.log
+DATA_CM01/dwprd/redo07.log
+DATA_CM01/dwprd/redo08.log
+RECO_CM01/dwprd/tempfile/temp.270.778028067
+RECO_CM01/dwprd/tempfile/temp.272.778027033
+RECO_CM01/dwprd/tempfile/temp.438.778027027
+RECO_CM01/dwprd/tempfile/temp.463.774052319