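Metrics are recorded observations of run-time properties and internal instrumentation values for the storage cell and its components, such as cell disks, grid disks, and flash cache. To see which metrics are defined for a given object type, use the list metricdefinition command: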
CellCLI> list metricdefinition where objectType='CELLDISK' <detail>
CellCLI> list metricdefinition where objectType='GRIDDISK' <detail>
The “detail” clause at the end of the listing will show additional details about the metrics available. There is a wide range of metric definitions for each object type, so let’s start by focusing on Grid Disk metrics and see what’s available to monitor:
CellCLI> list metricdefinition where objectType='GRIDDISK' detail;
name: GD_IO_BY_R_LG
description: "Number of megabytes read in large blocks from a grid disk"
metricType: Cumulative
objectType: GRIDDISK
unit: MB
name: GD_IO_BY_R_LG_SEC
description: "Number of megabytes read in large blocks per second from a grid disk"
metricType: Rate
objectType: GRIDDISK
unit: MB/sec
name: GD_IO_BY_R_SM
description: "Number of megabytes read in small blocks from a grid disk"
metricType: Cumulative
objectType: GRIDDISK
unit: MB
name: GD_IO_BY_R_SM_SEC
description: "Number of megabytes read in small blocks per second from a grid disk"
metricType: Rate
objectType: GRIDDISK
unit: MB/sec
name: GD_IO_BY_W_LG
description: "Number of megabytes written in large blocks to a grid disk"
metricType: Cumulative
objectType: GRIDDISK
unit: MB
name: GD_IO_BY_W_LG_SEC
description: "Number of megabytes written in large blocks per second to a grid disk"
metricType: Rate
objectType: GRIDDISK
unit: MB/sec
name: GD_IO_BY_W_SM
description: "Number of megabytes written in small blocks to a grid disk"
metricType: Cumulative
objectType: GRIDDISK
unit: MB
name: GD_IO_BY_W_SM_SEC
description: "Number of megabytes written in small blocks per second to a grid disk"
metricType: Rate
objectType: GRIDDISK
unit: MB/sec
name: GD_IO_ERRS
description: "Number of IO errors on a grid disk"
metricType: Cumulative
objectType: GRIDDISK
unit: Number
name: GD_IO_ERRS_MIN
description: "Number of IO errors on a grid disk per minute"
metricType: Rate
objectType: GRIDDISK
unit: /min
name: GD_IO_RQ_R_LG
description: "Number of requests to read large blocks from a grid disk"
metricType: Cumulative
objectType: GRIDDISK
unit: "IO requests"
name: GD_IO_RQ_R_LG_SEC
description: "Number of requests to read large blocks per second from a grid disk"
metricType: Rate
objectType: GRIDDISK
unit: IO/sec
name: GD_IO_RQ_R_SM
description: "Number of requests to read small blocks from a grid disk"
metricType: Cumulative
objectType: GRIDDISK
unit: "IO requests"
name: GD_IO_RQ_R_SM_SEC
description: "Number of requests to read small blocks per second from a grid disk"
metricType: Rate
objectType: GRIDDISK
unit: IO/sec
name: GD_IO_RQ_W_LG
description: "Number of requests to write large blocks to a grid disk"
metricType: Cumulative
objectType: GRIDDISK
unit: "IO requests"
name: GD_IO_RQ_W_LG_SEC
description: "Number of requests to write large blocks per second to a grid disk"
metricType: Rate
objectType: GRIDDISK
unit: IO/sec
name: GD_IO_RQ_W_SM
description: "Number of requests to write small blocks to a grid disk"
metricType: Cumulative
objectType: GRIDDISK
unit: "IO requests"
name: GD_IO_RQ_W_SM_SEC
description: "Number of requests to write small blocks per second to a grid disk"
metricType: Rate
objectType: GRIDDISK
unit: IO/sec
CellCLI>
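If you want to work with these definitions in a script, the attribute-per-line layout of the detail listing is straightforward to parse. The following is a minimal Python sketch, assuming the output has been captured as text in the same "attribute: value" form shown above; the function names and the SAMPLE text are illustrative and not part of CellCLI.

```python
# Sketch: turn "list metricdefinition ... detail" output into dictionaries.
# Assumes each metric starts with a "name:" line, as in the listing above.

SAMPLE = """\
name: GD_IO_BY_R_LG
description: "Number of megabytes read in large blocks from a grid disk"
metricType: Cumulative
objectType: GRIDDISK
unit: MB
name: GD_IO_BY_R_LG_SEC
description: "Number of megabytes read in large blocks per second from a grid disk"
metricType: Rate
objectType: GRIDDISK
unit: MB/sec
"""


def parse_metric_definitions(text):
    """Return one dict per metric, keyed by attribute name."""
    records = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, CellCLI prompts, and anything without an attribute.
        if not line or line.startswith("CellCLI>") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        key = key.strip()
        value = value.strip().strip('"')
        if key == "name":
            current = {}
            records.append(current)
        if current is not None:
            current[key] = value
    return records


def names_by_type(records):
    """Group metric names by their metricType attribute."""
    grouped = {}
    for rec in records:
        grouped.setdefault(rec["metricType"], []).append(rec["name"])
    return grouped


if __name__ == "__main__":
    for metric_type, names in names_by_type(parse_metric_definitions(SAMPLE)).items():
        print(metric_type, names)
```

Grouping by metricType makes it easy to see at a glance which metrics are running totals and which are rates.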
As the metricType values in the listing above show, these metrics are either cumulative totals or rates; your monitoring needs should dictate which type you display. Rather than go through every monitoring case, the table below lists some common monitoring scenarios that you may wish to report on. The table mostly uses point-in-time and per-second metrics, but in most cases you can get the cumulative value by dropping the “_SEC” suffix from the metric name in the list command.
Monitoring Requirement | objectType | CellCLI Command |
Cell CPU Utilization | Cell | list metriccurrent where name='CL_CPUT'; |
Cell Memory Utilization | Cell | list metriccurrent where name='CL_MEMUT'; |
Cell Temperature | Cell | list metriccurrent where name='CL_TEMP'; |
Total IO packets received/second | Cell | list metriccurrent where name='N_NIC_RCV_SEC'; |
Total IO packets transmitted/second | Cell | list metriccurrent where name='N_NIC_TRANS_SEC'; |
MB read/written in large blocks/sec | Cell Disk | list metriccurrent where name='CD_IO_BY_R_LG_SEC'; list metriccurrent where name='CD_IO_BY_W_LG_SEC'; |
MB read/written in small blocks/sec | Cell Disk | list metriccurrent where name='CD_IO_BY_R_SM_SEC'; list metriccurrent where name='CD_IO_BY_W_SM_SEC'; |
Avg IO load of cell disk | Cell Disk | list metriccurrent where name='CD_IO_LOAD'; |
Number of large read/write requests/second to cell disk | Cell Disk | list metriccurrent where name='CD_IO_RQ_R_LG_SEC'; list metriccurrent where name='CD_IO_RQ_W_LG_SEC'; |
Number of small read/write requests/second to cell disk | Cell Disk | list metriccurrent where name='CD_IO_RQ_R_SM_SEC'; list metriccurrent where name='CD_IO_RQ_W_SM_SEC'; |
Avg latency of large read/write to cell disk | Cell Disk | list metriccurrent where name='CD_IO_TM_R_LG_RQ'; list metriccurrent where name='CD_IO_TM_W_LG_RQ'; |
Avg latency of small read/write to cell disk | Cell Disk | list metriccurrent where name='CD_IO_TM_R_SM_RQ'; list metriccurrent where name='CD_IO_TM_W_SM_RQ'; |
MB read/written in large blocks/sec | Grid Disk | list metriccurrent where name='GD_IO_BY_R_LG_SEC'; list metriccurrent where name='GD_IO_BY_W_LG_SEC'; |
MB read/written in small blocks/sec | Grid Disk | list metriccurrent where name='GD_IO_BY_R_SM_SEC'; list metriccurrent where name='GD_IO_BY_W_SM_SEC'; |
Number of large read/write requests/second to grid disk | Grid Disk | list metriccurrent where name='GD_IO_RQ_R_LG_SEC'; list metriccurrent where name='GD_IO_RQ_W_LG_SEC'; |
Number of small read/write requests/second to grid disk | Grid Disk | list metriccurrent where name='GD_IO_RQ_R_SM_SEC'; list metriccurrent where name='GD_IO_RQ_W_SM_SEC'; |
Number of MB/sec pushed out of FlashCache due to being 80% full | FlashCache | list metriccurrent where name='FC_BYKEEP_OVERWR_SEC'; |
Number of MB used for ‘keep’ objects in FlashCache | FlashCache | list metriccurrent where name='FC_BYKEEP_USED'; |
Number of MB used in FlashCache | FlashCache | list metriccurrent where name='FC_BY_USED'; |
Number of MB/sec read/written from FlashCache | FlashCache | list metriccurrent where name='FC_IO_BY_R_SEC'; list metriccurrent where name='FC_IO_BY_W_SEC'; |
Number of reads/sec satisfied from FlashCache | FlashCache | list metriccurrent where name='FC_IO_RQ_R_SEC'; |
Number of IO requests/second that resulted in FlashCache being populated | FlashCache | list metriccurrent where name='FC_IO_RQ_W_SEC'; |
MB/sec received from host | Host Interconnect | list metriccurrent where name='N_MB_RECEIVED_SEC'; |
MB/sec sent to host | Host Interconnect | list metriccurrent where name='N_MB_SENT_SEC'; |
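The naming convention noted before the table (drop the "_SEC" suffix to get the cumulative metric) can be expressed as a small helper. This is only a sketch in Python: the function names are hypothetical, and the convention holds in most cases rather than all, so confirm the resulting name with list metricdefinition before relying on it.

```python
# Sketch: derive the cumulative metric name from a per-second metric name
# and build the matching CellCLI command text. Helper names are illustrative.

RATE_SUFFIX = "_SEC"


def cumulative_name(rate_metric):
    """Strip the _SEC suffix; names without it are returned unchanged."""
    if rate_metric.endswith(RATE_SUFFIX):
        return rate_metric[: -len(RATE_SUFFIX)]
    return rate_metric


def list_command(metric):
    """Build the CellCLI command text for a single metric name."""
    return f"list metriccurrent where name='{metric}'"


if __name__ == "__main__":
    for rate in ("GD_IO_RQ_R_LG_SEC", "CD_IO_BY_W_SM_SEC"):
        print(list_command(rate))
        print(list_command(cumulative_name(rate)))
```

Metrics that have no "_SEC" suffix, such as CL_CPUT, pass through unchanged because they have no separate cumulative form implied by this convention.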
Examples
Below, output is truncated in each example to save space.
Cell server CPU utilization:
[[email protected] cellmon]$ dcli -g ../cell_group cellcli -e \
list metriccurrent where name='CL_CPUT';
cm01cel01: CL_CPUT cm01cel01 0.2 %
cm01cel02: CL_CPUT cm01cel02 0.2 %
cm01cel03: CL_CPUT cm01cel03 0.7 %
[[email protected] cellmon]$
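Output in this shape (cell name, metric name, object name, value, unit) lends itself to simple scripted checks. Below is a minimal Python sketch that parses the dcli lines shown above and flags cells whose value exceeds a threshold. The Reading type, the function names, and the threshold value are assumptions made for illustration; they are not part of dcli or CellCLI.

```python
# Sketch: parse "cell: METRIC object value unit" lines from dcli output.
from typing import NamedTuple


class Reading(NamedTuple):
    cell: str
    metric: str
    obj: str
    value: float
    unit: str


SAMPLE = """\
cm01cel01: CL_CPUT cm01cel01 0.2 %
cm01cel02: CL_CPUT cm01cel02 0.2 %
cm01cel03: CL_CPUT cm01cel03 0.7 %
"""


def parse_dcli_metrics(text):
    """Return a Reading for every line that looks like a metric row."""
    readings = []
    for line in text.splitlines():
        cell, sep, rest = line.partition(":")
        fields = rest.split()
        if not sep or len(fields) < 3:
            continue  # shell prompts and command echoes have no metric fields
        try:
            number = float(fields[2].replace(",", ""))
        except ValueError:
            continue
        unit = " ".join(fields[3:])
        readings.append(Reading(cell.strip(), fields[0], fields[1], number, unit))
    return readings


def over_threshold(readings, threshold):
    """Return the readings whose value is strictly above the threshold."""
    return [r for r in readings if r.value > threshold]


if __name__ == "__main__":
    # 0.5 is an arbitrary illustrative threshold, not a recommended value.
    for r in over_threshold(parse_dcli_metrics(SAMPLE), 0.5):
        print(f"{r.cell}: {r.metric} = {r.value} {r.unit}")
```

Stripping commas before converting the value lets the same parser handle cumulative counters, which CellCLI prints with thousands separators.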
Average IO load of cell disks:
[[email protected] cellmon]$ dcli -g ../cell_group cellcli -e \
list metriccurrent where name='CD_IO_LOAD'
cm01cel01: CD_IO_LOAD CD_00_cm01cel01 1
cm01cel01: CD_IO_LOAD CD_01_cm01cel01 1
cm01cel01: CD_IO_LOAD CD_02_cm01cel01 1
cm01cel01: CD_IO_LOAD CD_03_cm01cel01 1
cm01cel01: CD_IO_LOAD CD_04_cm01cel01 1
cm01cel01: CD_IO_LOAD CD_05_cm01cel01 1
cm01cel01: CD_IO_LOAD CD_06_cm01cel01 0
cm01cel01: CD_IO_LOAD CD_07_cm01cel01 1
cm01cel01: CD_IO_LOAD CD_08_cm01cel01 1
cm01cel01: CD_IO_LOAD CD_09_cm01cel01 1
cm01cel01: CD_IO_LOAD CD_10_cm01cel01 1
cm01cel01: CD_IO_LOAD CD_11_cm01cel01 1
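With one row per cell disk, it is often more useful to summarize per cell. The Python sketch below averages the CD_IO_LOAD values for each cell and lists disks reporting zero load. It takes readings that have already been split into (cell, disk, value) tuples, here hard-coded from the cm01cel01 output above; the function names are illustrative.

```python
# Sketch: summarize CD_IO_LOAD per cell from (cell, disk, load) tuples.
from collections import defaultdict
from statistics import mean

# Values copied from the cm01cel01 rows shown above (CD_00 through CD_11).
LOADS = [1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1]
READINGS = [
    ("cm01cel01", f"CD_{i:02d}_cm01cel01", float(load))
    for i, load in enumerate(LOADS)
]


def average_load_by_cell(readings):
    """Return the mean load across all cell disks of each cell."""
    by_cell = defaultdict(list)
    for cell, _disk, load in readings:
        by_cell[cell].append(load)
    return {cell: mean(loads) for cell, loads in by_cell.items()}


def idle_disks(readings):
    """Return the names of cell disks that report a load of zero."""
    return [disk for _cell, disk, load in readings if load == 0]


if __name__ == "__main__":
    print(average_load_by_cell(READINGS))
    print(idle_disks(READINGS))
```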
Average latency of large reads from cell disks:
[[email protected] cellmon]$ dcli -g ../cell_group cellcli -e \
list metriccurrent where name='CD_IO_TM_R_LG_RQ'
cm01cel01: CD_IO_TM_R_LG_RQ CD_00_cm01cel01 0.0 us/request
cm01cel01: CD_IO_TM_R_LG_RQ CD_01_cm01cel01 0.0 us/request
cm01cel01: CD_IO_TM_R_LG_RQ CD_02_cm01cel01 0.0 us/request
cm01cel01: CD_IO_TM_R_LG_RQ CD_03_cm01cel01 0.0 us/request
cm01cel01: CD_IO_TM_R_LG_RQ CD_04_cm01cel01 0.0 us/request
cm01cel01: CD_IO_TM_R_LG_RQ CD_05_cm01cel01 0.0 us/request
cm01cel01: CD_IO_TM_R_LG_RQ CD_06_cm01cel01 0.0 us/request
cm01cel01: CD_IO_TM_R_LG_RQ CD_07_cm01cel01 0.0 us/request
cm01cel01: CD_IO_TM_R_LG_RQ CD_08_cm01cel01 0.0 us/request
cm01cel01: CD_IO_TM_R_LG_RQ CD_09_cm01cel01 0.0 us/request
Number of large read requests/second to Grid Disks:
[[email protected] cellmon]$ dcli -g ../cell_group cellcli -e \
list metriccurrent where name='GD_IO_RQ_R_LG_SEC'
cm01cel01: GD_IO_RQ_R_LG_SEC DATA_CD_00_cm01cel01 0.0 IO/sec
cm01cel01: GD_IO_RQ_R_LG_SEC DATA_CD_01_cm01cel01 0.0 IO/sec
cm01cel01: GD_IO_RQ_R_LG_SEC DATA_CD_02_cm01cel01 0.0 IO/sec
cm01cel01: GD_IO_RQ_R_LG_SEC DATA_CD_03_cm01cel01 0.0 IO/sec
cm01cel01: GD_IO_RQ_R_LG_SEC DATA_CD_04_cm01cel01 0.0 IO/sec
cm01cel01: GD_IO_RQ_R_LG_SEC DATA_CD_05_cm01cel01 0.0 IO/sec
cm01cel01: GD_IO_RQ_R_LG_SEC DATA_CD_06_cm01cel01 0.0 IO/sec
cm01cel01: GD_IO_RQ_R_LG_SEC DATA_CD_07_cm01cel01 0.0 IO/sec
cm01cel01: GD_IO_RQ_R_LG_SEC DATA_CD_08_cm01cel01 0.0 IO/sec
cm01cel01: GD_IO_RQ_R_LG_SEC DATA_CD_09_cm01cel01 0.0 IO/sec
cm01cel01: GD_IO_RQ_R_LG_SEC DATA_CD_10_cm01cel01 0.0 IO/sec
Cumulative number of large read requests to Grid Disks:
[[email protected] cellmon]$ dcli -g ../cell_group cellcli -e \
list metriccurrent where name='GD_IO_RQ_R_LG'
cm01cel01: GD_IO_RQ_R_LG DATA_CD_00_cm01cel01 58,973 IO requests
cm01cel01: GD_IO_RQ_R_LG DATA_CD_01_cm01cel01 58,293 IO requests
cm01cel01: GD_IO_RQ_R_LG DATA_CD_02_cm01cel01 58,120 IO requests
cm01cel01: GD_IO_RQ_R_LG DATA_CD_03_cm01cel01 58,243 IO requests
cm01cel01: GD_IO_RQ_R_LG DATA_CD_04_cm01cel01 58,844 IO requests
cm01cel01: GD_IO_RQ_R_LG DATA_CD_05_cm01cel01 58,973 IO requests
cm01cel01: GD_IO_RQ_R_LG DATA_CD_06_cm01cel01 58,491 IO requests
cm01cel01: GD_IO_RQ_R_LG DATA_CD_07_cm01cel01 58,326 IO requests
cm01cel01: GD_IO_RQ_R_LG DATA_CD_08_cm01cel01 58,405 IO requests
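Cumulative counters become more informative when you compare two samples taken some time apart. The following Python sketch derives a per-second rate from two snapshots of GD_IO_RQ_R_LG. The first snapshot uses values from the output above; the second snapshot and the 60-second interval are invented purely for illustration. The sketch also assumes that a counter lower than its earlier value indicates a reset and should be skipped.

```python
# Sketch: compute requests/second from two cumulative metric snapshots.
import re

# cell: METRIC object value [unit]; the value may contain thousands separators.
ROW = re.compile(r"^(\S+):\s+(\S+)\s+(\S+)\s+(\d[\d,]*(?:\.\d+)?)\s*(.*)$")

EARLIER = """\
cm01cel01: GD_IO_RQ_R_LG DATA_CD_00_cm01cel01 58,973 IO requests
cm01cel01: GD_IO_RQ_R_LG DATA_CD_01_cm01cel01 58,293 IO requests
"""

# Hypothetical later sample, not taken from a real system.
LATER = """\
cm01cel01: GD_IO_RQ_R_LG DATA_CD_00_cm01cel01 59,573 IO requests
cm01cel01: GD_IO_RQ_R_LG DATA_CD_01_cm01cel01 58,593 IO requests
"""


def cumulative_counts(text):
    """Map each object name to its cumulative value."""
    counts = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            counts[match.group(3)] = float(match.group(4).replace(",", ""))
    return counts


def rates_per_second(earlier, later, interval_seconds):
    """Return (later - earlier) / interval for objects present in both samples."""
    rates = {}
    for obj, end in later.items():
        start = earlier.get(obj)
        if start is None or end < start:
            continue  # new object, or counter assumed to have been reset
        rates[obj] = (end - start) / interval_seconds
    return rates


if __name__ == "__main__":
    result = rates_per_second(cumulative_counts(EARLIER), cumulative_counts(LATER), 60)
    for obj, rate in result.items():
        print(f"{obj}: {rate:.1f} IO/sec")
```

This is useful when you want an average rate over an interval of your own choosing rather than the value CellCLI reports for the corresponding "_SEC" metric.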
Summary
Monitoring Exadata with metrics provides insight into the performance and availability of the storage cell components.
© Centroid, Inc. All rights reserved.