Exadata Smart Scan Processing

The purpose of this post is to demonstrate Smart Scan processing on Centroid’s Exadata X2-2 Quarter Rack. Smart Scan processing is one of the components of Exadata Cell Offload Processing whose goal is to offload processing to storage cells instead of performing block IO requests on the database tier. The goal of Smart Scan processing is to do all the “heavy lifting”, IO-wise, in the storage cell and only send back data to the compute nodes that is actually needed and requested. This reduces interconnect traffic between the storage tier (storage cells) and database tier, reduces the clutter in the database buffer cache, improves query performance, and improves concurrency for multiple full-scan types of queries happening at once. With Smart Scan processing, row filtering, column filtering, some join processing, and other functions are performed in the Exadata cells. Results are returned to the foreground session’s PGA instead of the shared buffer cache.
With Smart Scan processing:
  1. User issues a SQL statement.
  2. Optimizer determines whether SQL statement is offloadable based on predicate(s).
  3. An iDB command is constructed and sent to Exadata cells.
  4. SQL is processed in Exadata cells.
  5. Filtered result set is returned from disks (or FastCache or Storage Index region) to cell server.
  6. A consolidated result set is built from all Exadata cells.
  7. Rows are returned to compute nodes.

 

Controlling Smart Scan Behavior

By default, Smart Scan processing on an Exadata database machine is enabled with the initialization parameter cell_offload_processing set to TRUE. You can turn the feature off by setting cell_offload_processing=FALSE at either the system or session level, or by using an optimizer hint /*+ opt_param(‘cell offload processing’ ‘false’) */. See the table below:

Command/Operation Impact
SQL> alter system set cell_offload_processing=TRUE; Turns on cell offload processing and enables Smart Scan system-wide
SQL> alter system set cell_offload_processing=FALSE; Turns off cell offload processing and enables Smart Scan for current session
SQL> select /*+ opt_param (‘cell offload processing’ ‘true’) */ <cols> from <table> where <pred>; Turns on cell offload processing and enables Smart Scan for current SQL statement
SQL> select /*+ opt_param (‘cell offload processing’ ‘false’) */ <cols> from <table> where <pred>; Turns off cell offload processing and enables Smart Scan for current SQL statement

Performance Impact of Smart Scan Processing

In this section we’ll do a couple of examples and compare them. First, we’ll build a copy of DBA_OBJECTS called MYOBJ and populate it with about 153 million rows.

SYSTEM @ dbm2> create table myobj

2 tablespace users

3 as select * from dba_objects;

 

Table created.

 

Elapsed: 00:00:00.30

SYSTEM @ dbm2> select count(*) from myobj;

18692

 

1 row selected.

 

Elapsed: 00:00:00.04

SYSTEM @ dbm2> declare

2 begin

3 for i in 1..13 loop

4 insert into myobj

5 select * from myobj;

6 end loop;

7 end;

8 /

 

PL/SQL procedure successfully completed.

 

Elapsed: 00:05:14.67

SYSTEM @ dbm2> commit;

SYSTEM @ dbm2> select count(*) from myobj;

 

153124864

 

1 row selected.

 

Elapsed: 00:02:27.78

SYSTEM @ dbm2>

To see how large this table is:

SYSTEM @ dbm2> col sbytes format 999,999,999.99

SYSTEM @ dbm2> col sblocks format 999,999,999,999

SYSTEM @ dbm2> select sum(bytes)/1024/1024/1024 sbytes,

2 sum(blocks) sblocks

3 from user_extents

4 where segment_name=’MYOBJ’;

15.49 2,030,080

 

1 row selected.

 

Elapsed: 00:00:00.07

SYSTEM @ dbm2>

 

So we’ve got a 150+ million row table, 15+Gb in size. Let’s do some tests. The first test will be without Smart Scan processing and the second test will be with Smart Scan processing.

 

 

 

In this first example we retrieved our answer from the 153 million row table in a little over 11 seconds. With the cell_offload_processing hint set to false, there were zero eligible bytes to return via Smart Scan so the query retrieved all 16Gb+ bytes from the cells to the database tier nodes. Note that the “cell physical IO interconnect bytes” statistic will show bytes returned from the Exadata cells regardless of whether they’re smart scanned or not.

With Smart Scan processing enabled, we retrieved the same result in less than 4 seconds. With cell_offload_processing set to TRUE in the SQL hint, Oracle calculated that the number of bytes eligible for predicate offload, as indicated by “cell physical IO bytes eligible for predicate offload”, matched the number of bytes returned from the cell in the non-offload-able previous query. We also can see how many bytes were actually returned from the cells as well as those which were returned via Smart Scan processing, as well as a percentage of how many blocks were saved from being shipped across the interconnect and into the database/compute node.

 

Statistic Description Test Result Commentary
“cell physical IO bytes eligible for predicate offloading” Shows how many bytes were eligible for Smart Scan processing. This statistic will be incremented and populated if the query is cell offload-able. Matched “cell physical IO interconnect bytes” from non-offload-able test
“cell physical IO interconnect bytes” Shows actual bytes returned from cell This will always show the number of bytes returned from cell – note that it’s 88% less than non-offload-able queries.
“cell physical IO interconnect bytes returned by smart scan” Shows how many bytes returned over interconnect as a result of Smart Scan processing In this test, the number if slightly lower than “cell physical IO interconnect bytes” because a few non-Smart Scan bytes were returned to satisfy the fetching of rows and so forth. This delta would be higher the more rows and columns are returned.

In the next test, we’ll try things without the parallel query hint:

 

 

This test showed longer run-times but proportionally the same types of results of the previous test.

Next, we’ll do a “select count(1)” and see if there is any impact:

 

 

 

As we can see, we get the same general type of output. Now let’s put in a filter predicate and see if we can learn anything. We’ll also throw in some storage index checking scripting as well.

 

 

 

We see a pretty dramatic savings in IO interconnect bytes, but still no improvement from a storage index. Let’s re-do the query and do an equijoin with object_name=’BATCH_GROUP’ and see if there’s any difference.

 

 

The output above shows no storage index savings, based on similar response times as well as lack of statistics for “cell physical IO saved by storage index”. The reason for this is more than likely because MYOBJ is not well-ordered by the predicate column, OBJECT_NAME. We’ll dive into storage index test cases in a future post.

Summary

Exadata Smart Scan processing is a means to improve query response time and reduce the amount of data passed from the storage cell servers to the database tier nodes. Examples in this post show the realized savings.