Archive for the ‘DBA Scripts’ Category

Large SGA on 32-bit Redhat Linux

Many DBAs are familiar with the SGA size limitations on 32-bit platforms.  This post shows how to allocate a 3 GB buffer cache on 32-bit RHAS 3.

  • Step 1: Mount /dev/shm as type ramfs.  Edit /etc/fstab and add an entry like this:

none                    /dev/shm                ramfs defaults,size=4G 0 0

  • Step 2: Run “mount -a” to mount /dev/shm.  I use ramfs instead of tmpfs because it doesn’t use swap; tmpfs does.  With ramfs, memory allocation will also grow dynamically, whereas with tmpfs it will not.
  • Step 3: As root, assuming your database is owned by the Linux account oracle, whose primary group is “dba”, do this:

# chown oracle:dba /dev/shm

  • Step 4: Add the following to /etc/security/limits.conf to increase the maximum memory lock limits.  Make sure the “oracle” user’s environment is sourced again (log out and back in) so these settings take effect.

oracle            soft    memlock         3145728
oracle            hard    memlock         3145728

  • Step 5: Log in as oracle and run “ulimit -a”; validate that memlock is set to the above values.
  • Step 6: Edit /etc/sysctl.conf and add/change the following.  When complete, run “sysctl -p” to load the changes into the Linux kernel.  The “vm.hugetlb_pool” setting below is based on the output of a script provided here (http://download-uk.oracle.com/docs/cd/B28359_01/server.111/b32009/appi_vlm.htm).  Run that script AFTER Oracle is started at the completion of this document to get a realistic value for vm.hugetlb_pool, and adjust accordingly.

kernel.sem = 1000 32000 100 150
kernel.shmmax = 4294967295
kernel.shmall = 4194304
net.ipv4.ip_local_port_range = 1024 65000
vm.pagecache = 10 20 30
kernel.shmmni=4096
vm.hugetlb_pool=4096

  • Step 7: Unset the db_cache_size, db_xk_cache_size, sga_target, sga_max_size, and memory_target init.ora parameters and manually set shared_pool_size to an appropriate value.  You can use “show sga” to determine this.
  • Step 8: Set use_indirect_data_buffers=true.
  • Step 9: Set db_block_buffers such that the product of db_block_buffers and db_block_size equals 3 GB.
  • Step 10: In oracle’s .profile/.bash_profile, set DISABLE_MAP_LOCK=1.  This is required to avoid unnecessarily long connect times for databases that are connected to frequently:

export DISABLE_MAP_LOCK=1

  • Step 11: Stop Oracle, source the environment, ensure the O/S limits are correct (ulimit -a), ensure /dev/shm is owned by oracle (ls -al /dev/shm), ensure DISABLE_MAP_LOCK=1, and then start Oracle.
  • Step 12: Add the following to /etc/rc2.d/S99local:

mount /dev/shm

chown oracle:dba /dev/shm

Test.  Enjoy the benefits of a large cache.  Test across reboots to ensure /dev/shm is mounted correctly.
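
The arithmetic behind Steps 4 and 9 can be sketched in a few lines of Python.  This is only an illustration, not part of the original procedure: the function names are hypothetical, the block sizes are assumed example values (use your database's actual db_block_size), and memlock is expressed in kilobytes, the unit limits.conf uses.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def block_buffers_for(cache_bytes, block_size):
    """Return the largest db_block_buffers count that fits in cache_bytes.

    Hypothetical helper: cache_bytes / block_size, rounded down so the
    product never exceeds the target.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return cache_bytes // block_size


def memlock_kb(cache_bytes):
    """Express a byte count in kilobytes, the unit limits.conf uses for memlock."""
    return cache_bytes // 1024


if __name__ == "__main__":
    target = 3 * GIB  # the 3 GB buffer cache from this post
    for block_size in (2048, 4096, 8192, 16384):  # assumed example block sizes
        print(block_size, block_buffers_for(target, block_size))
    print("memlock (KB):", memlock_kb(target))
```

With an 8192-byte block size this gives db_block_buffers = 393216, and the memlock figure works out to 3145728, the value used in Step 4.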

Written by John Clarke

June 10th, 2010 at 3:03 pm

Using DBMS_SQLDIAG

As a DBA, have you ever had an issue you suspected may be a bug and have been asked to generate “test data” for Oracle development?  We ran into this at a client recently and found a cool 11g utility (at least I *think* it’s new :)) to get Oracle Support the data they need to reproduce the issue in-house.

The situation had to do with optimizer_features_enable being set to 11.1.0.7 in a database recently upgraded from 9.2.0.5 in an Oracle eBusiness Suite environment (11.5.10.2) that utilized Oracle Reports heavily.  Our problem was specific to a custom Report that had optimizer hints embedded in the main query.  With optimizer_features_enable set to 11.1.0.7, the report retrieved no rows.  With it set to 9.2.0.x or 10.2.0.x, it returned the proper number of rows.  Based on this data discrepancy, we felt it safe to set optimizer_features_enable < 11.1.0.7 across the board, as we didn’t know the scope of the issue.

During our work on an SR with Oracle, they asked us to send the query, all the versions and optimizer settings, as well as an export of all the tables involved in the query with exported optimizer statistics.  Our problem was two-fold.  First, the query was complex and had many embedded views, so we didn’t really want to spend time deconstructing it to get a comprehensive list of tables to export.  Second, the underlying tables were very large (hundreds of millions of rows for a few of them) and we didn’t have the disk space or, quite frankly, the time to export all the tables in their entirety.

Enter DBMS_SQLDIAG …

Using DBMS_SQLDIAG and 11g Data Pump features, we were able to generate a complete test case that exported a subset of the rows from all the impacted tables, quickly and with minimal disk space requirements.  Here’s what we did:

  1. Grabbed the offending query from a TKPROF’d trace file
  2. Used this syntax to generate a test case:

declare
  tc_out clob;  -- OUT parameter populated by the call
begin
  dbms_sqldiag.export_sql_testcase(
    directory       => '<directory>',
    sql_text        => '<SQL Text>',
    testcase        => tc_out,
    exportdata      => TRUE,
    samplingpercent => 1);
end;
/

In the above example, note the following:

  • <directory> is a valid directory - check DBA_DIRECTORIES
  • <SQL Text> is the SQL statement from the TKPROF output
  • exportdata=>TRUE tells DBMS_SQLDIAG to export the data from the base tables
  • samplingpercent=>1 tells Data Pump to use a 1% sampling size.  This was important to limit the number of rows

After executing, a number of XML files, log files, and Data Pump export dumps are generated in <directory> and are available to upload to the SR!

One additional step we took was to export table statistics for all the tables involved in the query.  For this, we looked in the log file for the list of exported tables and used DBMS_STATS.EXPORT_TABLE_STATS to export the segment statistics.

The obvious benefit here is that it enabled us to continue working on the SR, but some other possible applications of DBMS_SQLDIAG could be for internal testing purposes, testing functionality/performance across versions of Oracle without a complicated upgrade, regression testing, and so forth.
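
One practical wrinkle when filling in <SQL Text> is that any single quote inside the statement has to be doubled before it can sit inside a PL/SQL string literal.  A minimal Python sketch of assembling the anonymous block that way follows.  The helper name, the TESTCASE_DIR directory, and the sample query are all hypothetical, and only the parameters already shown above are used.

```python
def build_testcase_block(directory, sql_text, sampling_percent=1):
    """Return PL/SQL text that calls dbms_sqldiag.export_sql_testcase.

    Hypothetical helper: it only assembles text; running it is left to SQL*Plus.
    """
    def quote(value):
        # Standard SQL literal escaping: double every embedded single quote.
        return "'" + value.replace("'", "''") + "'"

    return "\n".join([
        "declare",
        "  tc_out clob;",
        "begin",
        "  dbms_sqldiag.export_sql_testcase(",
        "    directory       => " + quote(directory) + ",",
        "    sql_text        => " + quote(sql_text) + ",",
        "    testcase        => tc_out,",
        "    exportdata      => TRUE,",
        "    samplingpercent => " + str(int(sampling_percent)) + ");",
        "end;",
        "/",
    ])


if __name__ == "__main__":
    # Assumed example values, not taken from the original SR.
    print(build_testcase_block("TESTCASE_DIR",
                               "select * from emp where ename = 'KING'"))
```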

Written by John Clarke

March 3rd, 2010 at 12:14 am

Connecting an Oracle Database with a DB2 Database

On a recent project we had a business and technology reason to connect Oracle EBS running 11g to a DB2 database.  After a few weeks of researching best practices and approaches, we decided on the following implementation steps.

  1. Download a DB2 driver, such as one from DataTek, to any storage directory on your application server.
  2. Create a directory called “YM” under your custom application top, for example $CUSTOM_TOP/java/YM.  (This assumes you have already created a $CUSTOM_TOP and that it has a directory called “java”.)
  3. Copy all files from the storage directory to $CUSTOM_TOP/java/YM.
  4. Add 3 entries to s_adovar_classpath and s_adovar_afclasspath to point to the files in $CUSTOM_TOP/java/YM.
  5. Make sure you run AutoConfig on your updated environment.
  6. Bounce the application and database tiers.
  7. Perform any select, insert, update, and/or delete from the Oracle EBS application via a Java program to the DB2 database.  The Java program will reference the DB2 connection from a JDBC connect string such as: “jdbc:oracle:db2://servername.com:port;databasename=xxxxxx; User=xxxx; Password=xxxx”

Written by Jim Brull

January 30th, 2010 at 5:42 pm

What you can do with ASH: Top Resource Consuming SQL

The following SQL shows the top resource-consuming SQL statements in your instance:

-- One row per SQL_ID; each ASH sample counts once toward CPU, WAIT or IO
select ash.sql_id,
     sum(decode(ash.session_state,'ON CPU',1,0))     "CPU",
     -- WAIT = all waiting samples minus the User I/O ones
     sum(decode(ash.session_state,'WAITING',1,0))    -
     sum(decode(ash.session_state,'WAITING', decode(en.wait_class, 'User I/O',1,0),0))    "WAIT" ,
     sum(decode(ash.session_state,'WAITING', decode(en.wait_class, 'User I/O',1,0),0))    "IO" ,
     sum(decode(ash.session_state,'ON CPU',1,1))     "TOTAL"
from v$active_session_history ash,
       v$event_name en
where ash.sql_id is not NULL
  and en.event#=ash.event#
group by ash.sql_id
order by sum(decode(ash.session_state,'ON CPU',1,1))   desc
/
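
To make the bucketing explicit: each row in v$active_session_history is one sample of an active session, taken roughly once a second, and the query above counts every sample as CPU, IO, or WAIT.  The Python sketch below mirrors that decode() logic on caller-supplied samples.  The function names and the tuple layout are hypothetical.

```python
from collections import defaultdict


def classify(session_state, wait_class):
    """Map one ASH sample to the bucket used by the query: CPU, IO or WAIT."""
    if session_state == "ON CPU":
        return "CPU"
    if session_state == "WAITING":
        return "IO" if wait_class == "User I/O" else "WAIT"
    raise ValueError("unexpected session_state: %r" % (session_state,))


def top_sql(samples):
    """Tally samples per sql_id and return rows sorted by TOTAL, highest first.

    samples is an iterable of (sql_id, session_state, wait_class) tuples.
    """
    totals = defaultdict(lambda: {"CPU": 0, "WAIT": 0, "IO": 0, "TOTAL": 0})
    for sql_id, session_state, wait_class in samples:
        if sql_id is None:  # same filter as "SQL_ID is not NULL"
            continue
        row = totals[sql_id]
        row[classify(session_state, wait_class)] += 1
        row["TOTAL"] += 1
    return sorted(totals.items(), key=lambda item: item[1]["TOTAL"], reverse=True)
```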

Written by John Clarke

April 10th, 2009 at 4:24 pm

“Oracle is slow, can you see if anything is going on?”

As DBAs, we’re faced with this question all the time.  In order to quickly supply an accurate answer to this question, an experienced Oracle DBA needs to have a few tools in his belt - and I’m not talking about any special software or monitoring solutions, I’m talking simply SQL*Plus scripts and access to a database account with access to the V$ views.

Here’s what I do when someone asks me this question:

Step #1:

Take a look at V$SESSION_WAIT.  This will show you details about sessions currently and actively waiting on named Oracle wait events.  More often than not, if things are “slow”, one or more sessions are waiting on an instrumented Oracle wait event.  The query I use to do this (works on 9i-11g) is below:

select     A.sid,
decode(A.event,'null event','CPU Exec',A.event) WaitEvent,
decode(A.event,'slave wait','N/A',
'PX Deq: Execution Msg','N/A',
'PX Deq Credit: send blk','N/A',
'latch free','N/A',
'enqueue',
chr(bitand(A.p1,-16777216)/16777215)||chr(bitand(A.p1,16711680)/65535),
'file open','-1',to_char(A.p1)) p1,
decode(A.event,'enqueue',decode(mod(A.p1,16),'6','ROW-LOCK','4','ITL','3',
'FK?','OTHER'),
'file open',
-1,
A.p2) p2,
decode(A.event,'latch free','N/A','enqueue',null,'PX qref latch','-1',
'buffer busy waits',to_char(A.p3), A.p3) p3,
decode(A.state,'WAITING','WTG',
'WAITED UNKNOWN TIME','UNK',
'WAITED SHORT TIME','WST',
'WAITED KNOWN TIME','WKT') wait_type,
decode(A.state,'WAITING',A.seconds_in_wait,
'WAITED UNKNOWN TIME',-999,
'WAITED SHORT TIME',A.wait_time,
'WAITED KNOWN TIME',A.WAIT_TIME) wt,
round((last_call_et/60),2) lc,
substr(nvl(b.module,b.program),1,15) pgm
from    v$session_wait A,
v$session B
where  A.event not in ('Queue Monitor Slave Wait','wait for unread message on broadcast channel','Queue Monitor Wait','jobq slave wait','queue messages','SQL*Net message to client','Null event','rdbms ipc message','i/o slave wait','io done')
and A.event <> 'pipe get'
and A.event not like '%akeup%'
and A.event not like 'Streams AQ%'
and A.state in ('WAITING','WAITED KNOWN TIME')
and A.sid=B.sid
and B.status='ACTIVE'
order by 1
/

Sample output is below (you’ll have to set column headings and other SQL*Plus formatting options, but you get the point):

   Sid Wait Event                     P1         P2         P3         Typ     Time       last call What
------ ------------------------------ ---------- ---------- ---------- --- -------- --------------- ---------------
  4518 gc buffer busy                 24         38019      65537      WTG        0             .00
  4519 gc buffer busy                 24         38019      65537      WTG        0             .00 XXVG_INV_PICKLI
  4680 gc buffer busy                 24         38019      65537      WTG        0             .00
  4830 gc buffer busy                 24         38019      65537      WTG        0             .00 FNDRSSUB
  4886 smon timer                     300        0          0          WTG       29        18447.47 oracle@usplsvpe
  4887 control file parallel write    2          4          2          WTG        0        18447.47 oracle@usplsvpe
  4893 gcs remote message             24         0          0          WTG        0        18447.47 oracle@usplsvpe
  4895 gcs remote message             24         0          0          WTG        0        18447.47 oracle@usplsvpe
  4896 ges remote message             64         0          0          WTG      152        18447.47 oracle@usplsvpe
  4899 DIAG idle wait                 1          1          200        WTG  1106848        18447.47 oracle@usplsvpe
  4900 pmon timer                     300        0          0          WTG      680        18447.47 oracle@usplsvpe

In this output, you’ll see a handful of sessions waiting on “gc buffer busy” wait events.  At this point, it’s time for the Oracle DBA to study up on what the wait events mean; in this case, sessions are waiting on RAC-related global buffer busy waits, which means that blocks are being used and are pinned in another instance’s cache.  I won’t go into a description of what all the wait events mean here; you can look them up at any of the following URLs:
http://download.oracle.com/docs/cd/B19306_01/server.102/b14237/waitevents.htm#REFRN101
http://download.oracle.com/docs/cd/B19306_01/server.102/b14237/waitevents003.htm#BGGIBDJI
http://download.oracle.com/docs/cd/B19306_01/server.102/b14211/instance_tune.htm#i22670
http://www.adp-gmbh.ch/ora/tuning/event.html
http://metalink.oracle.com/metalink/plsql/ml2_documents.showDocument?p_database_id=NOT&p_id=34405.1
http://metalink.oracle.com/metalink/plsql/ml2_documents.showDocument?p_database_id=NOT&p_id=62172.1

At this point,  you know who’s waiting on what and you can use the output to look for anomalies for the current environment.  A couple of things to note:

  • There are a handful of common wait events in any “busy” Oracle environment; specifically, “db file sequential read”, “db file scattered read”, latch-related, enqueue-related (locks), etc.  You should be familiar with what types of waits are “normal” for a given system.
  • You should become familiar with the relative quantity of each type of wait for each system at various times during the day.  For example, at client A, with a new implementation, low transaction volume, not many users, you may never see more than a handful of I/O-related waits at any given time.  At this client, if you see several dozen sessions waiting on the same type or class of wait event, it’s probably a cause for concern.  At a different client, it may be typical to see 20 or 30 I/O-related waits at any given time.  Bottom line is this - you need to have familiarity with the system you’re monitoring.
  • Any DBA worth his salt should study Oracle’s wait interface and become familiar with what each of the major wait events means.
  • You can use Centroid’s “CCEO Infra Wait Interface.ppt” document as a quick reference on the wait interface
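
The enqueue branch of the Step #1 query packs a fair amount into two expressions: the top two bytes of P1 hold the two-character lock name, and the low bits hold the requested mode.  A minimal Python sketch of the same decoding is below.  The function name is hypothetical, the labels are the ones used in the query's decode(), and the sample P1 value is an assumed example rather than output from the system above.

```python
# Labels copied from the decode() in the Step #1 query.
MODE_LABELS = {6: "ROW-LOCK", 4: "ITL", 3: "FK?"}


def decode_enqueue_p1(p1):
    """Split an enqueue wait's P1 into (lock name, mode, label)."""
    name = chr((p1 >> 24) & 0xFF) + chr((p1 >> 16) & 0xFF)  # top two bytes
    mode = p1 % 16  # same as mod(A.p1,16) in the query
    return name, mode, MODE_LABELS.get(mode, "OTHER")


if __name__ == "__main__":
    # Assumed example: 0x54 is 'T', 0x58 is 'X', and the mode is 6.
    print(decode_enqueue_p1(0x54580006))
```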

Step #2:

Grab the SQL for the sessions that show up repeatedly and frequently in the output from the above query.  Note the SID (Session Identifier) and use it as input to the following script:

select
t.sql_fulltext ,
t.buffer_gets, t.disk_reads,t.executions
from v$session s,
v$sql t
where s.sql_address =t.address and s.sql_hash_value =t.hash_value
and s.sid = &&1
/

Format the output of this if you plan on running an execution plan on it.

If you want additional detail about the session(s) from V$SESSION_WAIT, you can query V$SESSION.

Step #3:

If the SQL statements extracted from the previous step are waiting on I/O-related or contention-related waits, you should grab an execution plan/explain plan by taking the formatted SQL and plugging it into the script below:

set lines 120
explain plan for
<< insert SQL here, without a trailing semicolon >>
/
select * from table(dbms_xplan.display(null, null,'all'));

Step #4:

If the slowness is related to, for example, locks (enqueue waits), find out who the lock holder(s) is by querying V$LOCK or DBA_WAITERS and make a judgement call as to whether to kill the session(s) holding the lock, communicate with the end-user, etc.
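
When several sessions are queued behind one another, the session worth looking at first is the one that holds a lock but is not itself waiting.  Assuming you have pulled (WAITING_SESSION, HOLDING_SESSION) pairs from DBA_WAITERS, a small Python sketch for finding those root blockers could look like this.  The function name and the SIDs are hypothetical.

```python
def root_blockers(wait_pairs):
    """Return SIDs that block others but are not waiting on anyone.

    wait_pairs is an iterable of (waiting_sid, holding_sid) tuples.
    """
    pairs = list(wait_pairs)
    waiters = {waiting for waiting, _ in pairs}
    holders = {holding for _, holding in pairs}
    return sorted(holders - waiters)


if __name__ == "__main__":
    # Made-up example: 101 blocks 202, which in turn blocks 303 and 304.
    pairs = [(202, 101), (303, 202), (304, 202)]
    print(root_blockers(pairs))
```

In the made-up example only session 101 is reported, since 202 is both a holder and a waiter.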

Step #5:

Fix it.  This could be a quick-fix (resolving a lock), or more likely will take some time to assess.  If your cause of slowness is I/O-related waits, for example, you need to determine whether the SQL is optimized, whether indexes will help, whether concurrency patterns are abnormal (i.e., are there 50 simultaneous executions of a batch job that should only be running serially?), etc.  SQL optimization is a science in itself that requires knowledge of the underlying data structures and data volumes, as well as an understanding of Oracle’s optimizer.

Step #6:

What if V$SESSION_WAIT doesn’t tell you anything meaningful?  This is when you should consult the ASH (Active Session History) views to get a time breakdown:

select
decode(nvl(to_char(s.sid),-1),-1,'DISCONNECTED','CONNECTED')
"STATUS",
topsession.sid             "SID",
topsession.program                  "PROGRAM",
max(topsession.CPU)              "CPU",
max(topsession.WAIT)       "WAITING",
max(topsession.IO)                  "IO",
max(topsession.TOTAL)           "TOTAL"
from (
select * from (
select
ash.session_id sid,
ash.session_serial# serial#,
ash.user_id user_id,
ash.program,
sum(decode(ash.session_state,'ON CPU',1,0))     "CPU",
sum(decode(ash.session_state,'WAITING',1,0))    -
sum(decode(ash.session_state,'WAITING',
decode(wait_class,'User I/O',1, 0 ), 0))    "WAIT" ,
sum(decode(ash.session_state,'WAITING',
decode(wait_class,'User I/O',1, 0 ), 0))    "IO" ,
sum(decode(session_state,'ON CPU',1,1))     "TOTAL"
from v$active_session_history ash
group by session_id,user_id,session_serial#,program
order by sum(decode(session_state,'ON CPU',1,1)) desc
) where rownum < 10
)    topsession,
v$session s,
all_users u
where
u.user_id =topsession.user_id and
/* outer join to v$session because the session might be disconnected */
topsession.sid         = s.sid         (+) and
topsession.serial# = s.serial#   (+)
group by  topsession.sid, topsession.serial#,
topsession.user_id, topsession.program, s.username,
s.sid,s.paddr,u.username
order by max(topsession.TOTAL) desc
/

The output may look like this:

STATUS         Sid PROGRAM                           CPU    WAITING    IO  TOTAL
------------ ----- ------------------------- ----------- ---------- ----- ------
DISCONNECTED  4518                                 11220          9    83  11312
CONNECTED     4584                                  9620         25    50   9695
DISCONNECTED  4683 das@usplsvpba002.verigy.n        5598        258   735   6591
                   et (TNS V1-V3)

CONNECTED     4888 oracle@usplsvped002.verig         483        956     0   1439
                   y.net (LGWR)

CONNECTED     4897 oracle@usplsvped002.verig           7       1119     0   1126
                   y.net (LMON)

DISCONNECTED  4614                                   158         75   552    785
DISCONNECTED  4698 sqlservr.exe                       52         20   695    767
DISCONNECTED  4491 sqlservr.exe                      102         35   496    633
DISCONNECTED  4698 sqlservr.exe                       14          9   578    601

You can use the methods in Steps 2-3 above to get details about the sessions above.
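
Because each ASH sample represents roughly one second of activity, the CPU, WAITING and IO columns can be read as a time breakdown.  The Python sketch below turns one row of counts into percentages.  The function name is hypothetical, and the figures in the example call are taken from the first row of the sample output.

```python
def breakdown(cpu, waiting, io):
    """Return each bucket as a percentage of the session's total samples."""
    total = cpu + waiting + io
    if total == 0:
        return {"CPU": 0.0, "WAITING": 0.0, "IO": 0.0}
    return {
        "CPU": round(100.0 * cpu / total, 1),
        "WAITING": round(100.0 * waiting / total, 1),
        "IO": round(100.0 * io / total, 1),
    }


if __name__ == "__main__":
    # SID 4518 from the sample output: CPU 11220, WAITING 9, IO 83.
    print(breakdown(11220, 9, 83))
```

For that session nearly all of the sampled time is CPU, which suggests starting with the SQL it was running rather than with I/O or locking.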

Step #7:

If nothing stands out at this point, consult system logs and Oracle alert logs, as well as O/S performance tools (sar, top, glance, etc.).

Written by John Clarke

April 10th, 2009 at 4:23 pm