Exadata Smart Scan and Index Access

With Exadata Smart Scan processing, Exadata is capable of offloading a large portion of a query’s workload to the storage cells. Many claims exists out there that recommend “migrate to Exadata and drop all your indexes”, but let’s see how Smart Scan processing works with various types of index access through a series of tests. For these tests, we’ll use the SOE.ORDERS table, a large partitioned table in the SwingBench Orders schema. In addition, I’ll add a column to the table called “STATUS2” with a “Y” for most of the rows and “N” for the remaining small number of rows.

 

 

 

 

Here’s what we can determine from this:

  • The first query does an index range scan on a primary key unique index and is not eligible for smart scan processing.
  • The second query says it does an “INDEX STORAGE FAST FULL SCAN” on the ORDER_PK index, but it’s evident that this was not Smart Scan offloaded due to only the prescience of “cell physical IO interconnect bytes” and no offload statistics from V$MYSTAT.

On the next test, we’ll scan column with a reverse index built on it:

 

 

As we can see, no cell offload took place for this either. Let’s full-scan the table and see if it makes a difference:

 

 

Clearly, changing to a full-scan made quite a difference. In addition to running in 2.72 seconds, it saved 99%of IO, only returning 7200Mb from the storage cells. This test makes quite an argument for disabling or dropping indexes (or hinting) to take advantage of cell offload processes. If we re-visit the index test on the ORDER_ID columns demonstrated previously, we see the same type of result. This tells us that the “+0” clause in the predicate doesn’t prevent using the index, it just forces a full scan and makes it not eligible for cell offload:

 

 

Now let’s try the index on ORDER_STATUS:

 

 

This shows that the query was not offloaded, but it did finish in .09 seconds using the index, which is pretty fast. What happens if we force a full scan on the query via a hint? The theory of course, based on previous testing, is that the time should approach the 2+ second time range:

 

 

This confirms the theory and points out an interesting point – Smart Scan processing in some cases will definitely NOT outpace normal selective index access. Let’s do the same test with ORDER_STATUS is null, which retrieves a large number of rows:

 

 

With this we see that cell offload was in play, Smart Scan worked, it retrieved the rows in slightly under 2 seconds, and it also full-scanned the table as expected based on the predicate and null values in the column. Note that the MB returned from Smart Scan processing was higher than in previous tests in this section – this is a function of the query being run and the query time was only slightly higher.

 

Our last test will be on the bitmap index in column STATUS2:

 

 

The output is as predicted, but let’s try a bitmap index range scan. Prior to doing this, I’ll update 10 million rows in the table and set STATUS2=’Y’:

 

 

Above you can see a different execution plan – “BITMAP IDNEX STORAGE FAST FULL SCAN”. Also, the “cell physical IO interconnect bytes” is about double what it was in the previous scan, indicating more data was passed back compared to the “BITMAP INDEX SINGLE VALUE” test previously. And while no “cell physical IO interconnect bytes returned by smart scan” rows were returned, we can probably assume that the query did in fact perform cell offloading. We would think that “cell index scan” statistics would be non-zero, but for some reason they are not. This is something we’ll need to follow-up on.

Summary

Cell offload does not take place when indexes exist on column in the predicate. Depending on the query and it’s load nature, it may or may not be advisable to index or rely on cell offload and full-scanning – mileage will vary!