Archive for the ‘DBA Scripts’ Category
Large SGA on 32-bit Redhat Linux
Many DBAs are familiar with the SGA size limitations on 32-bit platforms. This post shows how to allocate a 3 GB buffer cache on 32-bit RHAS 3.
- Step 1: Mount /dev/shm as type ramfs. Edit /etc/fstab and add an entry like this:
none /dev/shm ramfs defaults,size=4G 0 0
- Step 2: Run “mount -a” to mount /dev/shm. I use ramfs instead of tmpfs because ramfs pages are never swapped out, whereas tmpfs pages can be. A ramfs mount also keeps growing as files are written to it, whereas tmpfs stops at its configured size limit.
- Step 3: As root, assuming your database is owned by the Linux account oracle, whose primary group is “dba”, do this:
# chown oracle:dba /dev/shm
- Step 4: Add the following to /etc/security/limits.conf to raise the maximum locked-memory (memlock) limits. The values are in kilobytes, so 3145728 corresponds to 3 GB. Make sure the “oracle” user logs in again and sources its environment so that the new limits take effect.
oracle soft memlock 3145728
oracle hard memlock 3145728
- Step 5: Log in as oracle and run “ulimit -a”; verify that the max locked memory value matches the setting above.
- Step 6: Edit /etc/sysctl.conf and add/change the following. When complete, run “sysctl -p” to load the changes into the running kernel. The “vm.hugetlb_pool” value below is based on the output of a script provided here (http://download-uk.oracle.com/docs/cd/B28359_01/server.111/b32009/appi_vlm.htm). Run that script AFTER Oracle has been started at the end of this procedure to get a realistic value for vm.hugetlb_pool, and adjust accordingly.
kernel.sem = 1000 32000 100 150
kernel.shmmax = 4294967295
kernel.shmall = 4194304
net.ipv4.ip_local_port_range = 1024 65000
vm.pagecache = 10 20 30
kernel.shmmni=4096
vm.hugetlb_pool=4096
- Step 7: Unset the db_cache_size, db_xk_cache_size, sga_target, sga_max_size, and memory_target init.ora parameters, and manually set shared_pool_size to an appropriate value. You can use “show sga” to check the current sizes.
- Step 8: Set use_indirect_data_buffers=true
- Step 9: Set db_block_buffers such that the product of db_block_buffers and db_block_size = 3G
- Step 10: In oracle’s .profile/.bash_profile, set DISABLE_MAP_LOCK=1. This is required to avoid unnecessarily long connect times on databases that receive frequent connections.
export DISABLE_MAP_LOCK=1
- Step 11: Stop oracle, source environment, ensure O/S limits are correct (ulimit -a), ensure /dev/shm is owned by oracle (ls -al /dev/shm), ensure DISABLE_MAP_LOCK=1, and then start Oracle
- Step 12: Add the following to /etc/rc2.d/S99local:
mount /dev/shm
chown oracle:dba /dev/shm
Test. Enjoy the benefits of a large cache. Test across reboots to ensure /dev/shm is mounted correctly.
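For Steps 8 and 9, here is a minimal sketch of the arithmetic and the parameter changes. It assumes an 8 KB db_block_size and an instance started from an spfile. Neither is stated above, so adjust both to your environment (with a plain init.ora, edit the file instead of using ALTER SYSTEM).

```sql
-- Assumption: db_block_size is 8192 bytes. Substitute your own block size.
-- 3 GB = 3 * 1024 * 1024 * 1024 bytes.
select (3 * 1024 * 1024 * 1024) / 8192 as db_block_buffers
from   dual;
-- DB_BLOCK_BUFFERS
-- ----------------
--           393216

-- Assumption: the instance uses an spfile.
-- Both parameters are static, so they take effect at the next restart.
alter system set use_indirect_data_buffers = true scope = spfile;
alter system set db_block_buffers = 393216 scope = spfile;
```

With a different block size, only the divisor changes; for example, a 4 KB block size would double the number of buffers.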
Using DBMS_SQLDIAG
As a DBA, have you ever had an issue you suspected may be a bug and have been asked to generate “test data” for Oracle development? We ran into this at a client recently and found a cool 11g utility (at least I *think* it’s new :)) to get Oracle Support the data they need to reproduce the issue in-house.
The situation had to do with optimizer_features_enable being set to 11.1.0.7 in a database recently upgraded from 9.2.0.5 in an Oracle eBusiness Suite environment (11.5.10.2) that utilized Oracle Reports heavily. Our problem was specific to a custom Report that had optimizer hints embedded in the main query. With optimizer_features_enable set to 11.1.0.7, the report retrieved no rows. With it set to 9.2.0.x or 10.2.0.x, it returned the proper number of rows. Based on this data discrepancy, we felt it safe to set optimizer_features_enable < 11.1.0.7 across the board, as we didn’t know the scope of the issue.
During our work on an SR with Oracle, they asked us to send the query, all the versions and optimizer settings, as well as an export of all the tables involved in the query with exported optimizer statistics. Our problem was twofold. First, the query was complex and had many embedded views, so we didn’t really want to spend time deconstructing it to get a comprehensive list of tables to export. Second, the underlying tables were very large (hundreds of millions of rows for a few of them) and we didn’t have the disk space or, quite frankly, the time to export all the tables in their entirety.
Enter DBMS_SQLDIAG …
Using DBMS_SQLDIAG and 11g Data Pump features, we were able to quickly generate a complete test case that exported a subset of the rows from all the impacted tables, with minimal disk space requirements. Here’s what we did:
- Grabbed the offending query from a TKPROF’d trace file
- Used this syntax to generate a test case:
declare
  tc_out clob;  -- receives the resulting test case
begin
  dbms_sqldiag.export_sql_testcase(
    directory       => '<directory>',
    sql_text        => '<SQL Text>',
    testcase        => tc_out,
    exportdata      => TRUE,
    samplingpercent => 1);
end;
/
In the above example, note the following:
- <directory> is a valid directory - check DBA_DIRECTORIES
- <SQL Text> is the SQL statement from the TKPROF output
- exportdata=>TRUE tells DBMS_SQLDIAG to export the data from the base tables
- samplingpercent=>1 tells Data Pump to use a 1% sampling size. This was important to limit the number of rows
After executing, a number of XML files, log files, and Data Pump dump files are written to <directory> and are available to upload to the SR!
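If you do not already have a suitable directory object for the test case, a sketch like the following checks for one and creates it. The directory name SR_TESTCASE, the file system path, and the grantee APPS are hypothetical placeholders, not values from our environment.

```sql
-- List the directory objects that already exist.
select directory_name, directory_path
from   dba_directories
order  by directory_name;

-- Hypothetical name and path: the path must already exist on the database
-- server and be writable by the oracle O/S user.
create or replace directory sr_testcase as '/u01/app/oracle/sr_testcase';

-- Hypothetical grantee: the account that will run DBMS_SQLDIAG.
grant read, write on directory sr_testcase to apps;
```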
One additional step we took was to export table statistics for all the tables involved in the query. For this, we looked in the log file for the list of tables exported and used DBMS_STATS.EXPORT_TABLE_STATS to export their statistics.
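A rough sketch of that statistics export is below. The schema APPS, the statistics table SR_STATS, and the table name XX_SOME_TABLE are hypothetical stand-ins; this is not the exact script we ran.

```sql
begin
  -- Hypothetical names throughout: schema APPS, stat table SR_STATS,
  -- and table XX_SOME_TABLE stand in for your own objects.
  -- One-time step: create a holding table for the exported statistics.
  dbms_stats.create_stat_table(ownname => 'APPS', stattab => 'SR_STATS');

  -- Repeat this call for each table listed in the test case log file.
  dbms_stats.export_table_stats(ownname => 'APPS',
                                tabname => 'XX_SOME_TABLE',
                                stattab => 'SR_STATS',
                                cascade => TRUE);  -- include column and index stats
end;
/
```

The holding table is an ordinary table, so it can be exported with Data Pump and uploaded alongside the other files.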
The obvious benefit here is that it enabled us to continue working on the SR, but some other possible applications of DBMS_SQLDIAG could be for internal testing purposes, testing functionality/performance across versions of Oracle without a complicated upgrade, regression testing, and so forth.
Connecting an Oracle Database with a DB2 Database
On a recent project we had a business and technology reason to connect Oracle EBS running 11g to a DB2 database. After a few weeks of researching best practices and approaches, we decided on the following implementation steps.
- Download a DB2 driver, such as the one from DataTek, to any storage directory on your application server.
- Create a directory called “YM” under your custom application top, for example $CUSTOM_TOP/java/YM. (This assumes you have already created $CUSTOM_TOP and that it has a directory called “java”.)
- Copy all files from the storage directory to $CUSTOM_TOP/java/YM
- Add 3 entries to s_adovar_classpath and s_adovar_afclasspath to point to the files in $CUSTOM_TOP/java/YM.
- Make sure you run AutoConfig on your updated environment.
- Bounce the application and database tier.
- Perform any select, insert, update, and/or delete from the Oracle EBS application against the DB2 database via a Java program. The Java program will reference the DB2 connection through a JDBC connect string such as: “jdbc:oracle:db2://servername.com:port;databasename=xxxxxx; User=xxxx; Password=xxxx”
What you can do with ASH: Top Resource Consuming SQL
The following SQL shows you the top resource-consuming pieces of SQL in your instance:
select ash.sql_id,
       sum(decode(ash.session_state,'ON CPU',1,0)) "CPU",
       sum(decode(ash.session_state,'WAITING',1,0)) -
       sum(decode(ash.session_state,'WAITING', decode(en.wait_class,'User I/O',1,0),0)) "WAIT",
       sum(decode(ash.session_state,'WAITING', decode(en.wait_class,'User I/O',1,0),0)) "IO",
       sum(decode(ash.session_state,'ON CPU',1,1)) "TOTAL"
from v$active_session_history ash,
     v$event_name en
where ash.sql_id is not null
and en.event# = ash.event#
group by ash.sql_id
order by sum(decode(ash.session_state,'ON CPU',1,1)) desc
/
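The query above covers everything still held in the ASH buffer. When someone reports a problem that is happening right now, a narrower window is often more useful. The following is a simplified sketch that counts ASH samples per SQL_ID over the last 15 minutes; the 15-minute window is an arbitrary choice of mine, and the CPU/WAIT/IO breakdown is left out to keep it short.

```sql
-- Simplified variant: total ASH samples per SQL_ID, recent activity only.
select ash.sql_id,
       count(*) as samples
from   v$active_session_history ash
where  ash.sql_id is not null
and    ash.sample_time > sysdate - 15/1440   -- 15 minutes; adjust as needed
group  by ash.sql_id
order  by count(*) desc
/
```

ASH samples active sessions roughly once per second, so the sample count is a reasonable approximation of the seconds of database time each statement consumed in the window.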
“Oracle is slow, can you see if anything is going on?”
As DBAs, we’re faced with this question all the time. To supply an accurate answer quickly, an experienced Oracle DBA needs to have a few tools in his belt. I’m not talking about any special software or monitoring solutions; I’m talking simply about SQL*Plus scripts and a database account with access to the V$ views.
Here’s what I do when someone asks me this question:
Step #1:
Take a look at V$SESSION_WAIT. This will show you details about sessions currently and actively waiting on named Oracle wait events. More often than not, if things are “slow”, one or more sessions are waiting on an instrumented Oracle wait event. The query I use to do this (works on 9i-11g) is below:
select A.sid,
decode(A.event,'null event','CPU Exec',A.event) WaitEvent,
decode(A.event,'slave wait','N/A',
'PX Deq: Execution Msg','N/A',
'PX Deq Credit: send blk','N/A',
'latch free','N/A',
'enqueue',
chr(bitand(A.p1,-16777216)/16777215)||chr(bitand(A.p1,16711680)/65535),
'file open','-1',to_char(A.p1)) p1,
decode(A.event,'enqueue',decode(mod(A.p1,16),'6','ROW-LOCK','4','ITL','3',
'FK?','OTHER'),
'file open',
-1,
A.p2) p2,
decode(A.event,'latch free','N/A','enqueue',null,'PX qref latch','-1',
'buffer busy waits',to_char(A.p3), A.p3) p3,
decode(A.state,'WAITING','WTG',
'WAITED UNKNOWN TIME','UNK',
'WAITED SHORT TIME','WST',
'WAITED KNOWN TIME','WKT') wait_type,
decode(A.state,'WAITING',A.seconds_in_wait,
'WAITED UNKNOWN TIME',-999,
'WAITED SHORT TIME',A.wait_time,
'WAITED KNOWN TIME',A.WAIT_TIME) wt,
round((last_call_et/60),2) lc,
substr(nvl(b.module,b.program),1,15) pgm
from v$session_wait A,
v$session B
where A.event not in ('Queue Monitor Slave Wait','wait for unread message on broadcast channel','Queue Monitor Wait','jobq slave wait','queue messages','SQL*Net message to client','Null event','rdbms ipc message','i/o slave wait','io done')
and A.event <> 'pipe get'
and A.event not like '%akeup%'
and A.event not like 'Streams AQ%'
and A.state in ('WAITING','WAITED KNOWN TIME')
and A.sid=B.sid
and B.status='ACTIVE'
order by 1
/
Sample output is below (you’ll have to set column headings and other SQL*Plus formatting options, but you get the point):
   Sid Wait Event                             P1         P2         P3 Typ     Time last call What
------ ------------------------------ ---------- ---------- ---------- --- -------- --------- ---------------
  4518 gc buffer busy                         24      38019      65537 WTG        0       .00
  4519 gc buffer busy                         24      38019      65537 WTG        0       .00 XXVG_INV_PICKLI
  4680 gc buffer busy                         24      38019      65537 WTG        0       .00
  4830 gc buffer busy                         24      38019      65537 WTG        0       .00 FNDRSSUB
  4886 smon timer                            300          0          0 WTG       29  18447.47 oracle@usplsvpe
  4887 control file parallel write             2          4          2 WTG        0  18447.47 oracle@usplsvpe
  4893 gcs remote message                     24          0          0 WTG        0  18447.47 oracle@usplsvpe
  4895 gcs remote message                     24          0          0 WTG        0  18447.47 oracle@usplsvpe
  4896 ges remote message                     64          0          0 WTG      152  18447.47 oracle@usplsvpe
  4899 DIAG idle wait                          1          1        200 WTG  1106848  18447.47 oracle@usplsvpe
  4900 pmon timer                            300          0          0 WTG      680  18447.47 oracle@usplsvpe
In this output, you’ll see a handful of sessions waiting on “gc buffer busy” wait events. At this point, it’s time for the Oracle DBA to study up on what the wait events mean. In this case, the sessions are waiting on RAC-related global cache buffer busy waits, which means the blocks they need are in use and pinned in another instance’s cache. I won’t go into a description of what all the wait events mean here; you can look them up at any of the following URLs:
http://download.oracle.com/docs/cd/B19306_01/server.102/b14237/waitevents.htm#REFRN101
http://download.oracle.com/docs/cd/B19306_01/server.102/b14237/waitevents003.htm#BGGIBDJI
http://download.oracle.com/docs/cd/B19306_01/server.102/b14211/instance_tune.htm#i22670
http://www.adp-gmbh.ch/ora/tuning/event.html
http://metalink.oracle.com/metalink/plsql/ml2_documents.showDocument?p_database_id=NOT&p_id=34405.1
http://metalink.oracle.com/metalink/plsql/ml2_documents.showDocument?p_database_id=NOT&p_id=62172.1
At this point, you know who’s waiting on what and you can use the output to look for anomalies for the current environment. A couple of things to note:
- There are a handful of common wait events in any “busy” Oracle environment; specifically, “db file sequential read”, “db file scattered read”, latch-related, enqueue-related (locks), etc. You should be familiar with what types of waits are “normal” for a given system.
- You should become familiar with the relative quantity of each type of wait for each system at various times during the day. For example, at client A, with a new implementation, low transaction volume, not many users, you may never see more than a handful of I/O-related waits at any given time. At this client, if you see several dozen sessions waiting on the same type or class of wait event, it’s probably a cause for concern. At a different client, it may be typical to see 20 or 30 I/O-related waits at any given time. Bottom line is this - you need to have familiarity with the system you’re monitoring.
- Any DBA worth his salt should study Oracle’s wait interface and become familiar with what each of the major wait events means.
- You can use Centroid’s “CCEO Infra Wait Interface.ppt” document as a quick reference on the wait interface
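One way to build that familiarity is to count the current waiters per event at different times of day and compare the numbers. A minimal sketch, using only V$SESSION_WAIT columns that exist in 9i through 11g (it does not filter out idle events, so expect background waits such as timers at the top on a quiet system):

```sql
-- Number of sessions currently waiting, grouped by wait event.
select event,
       count(*) sessions_waiting
from   v$session_wait
where  state = 'WAITING'
group  by event
order  by count(*) desc
/
```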
Step #2:
Grab the SQL for the sessions that show up repeatedly and frequently in the output from the above query. Note the SID (Session Identifier) and use it as input to the following script:
select
t.sql_fulltext ,
t.buffer_gets, t.disk_reads,t.executions
from v$session s,
v$sql t
where s.sql_address =t.address and s.sql_hash_value =t.hash_value
and s.sid = &&1
/
Format the output of this if you plan on generating an execution plan for it.
If you want additional detail about the session(s) from V$SESSION_WAIT, you can query V$SESSION.
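For example, a lookup along these lines returns the basics for one SID. The column list is just a typical selection rather than anything prescribed, and &&1 is the same SID substitution variable used in the script above.

```sql
-- Who is this session, where did it come from, and how long has it been busy?
select sid,
       serial#,
       username,
       status,
       osuser,
       machine,
       program,
       module,
       logon_time,
       round(last_call_et/60, 2) minutes_since_last_call
from   v$session
where  sid = &&1
/
```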
Step #3:
If the SQL statements extracted from the previous step are waiting on I/O-related or contention-related waits, you should grab an execution plan/explain plan by taking the formatted SQL and plugging into the below script:
set lines 120
explain plan for
<< insert SQL here >>
select * from table(dbms_xplan.display(null, null,'all'));
Step #4:
If the slowness is related to, for example, locks (enqueue waits), find out who the lock holders are by querying V$LOCK or DBA_WAITERS and make a judgment call as to whether to kill the session(s) holding the lock, communicate with the end-user, etc.
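A minimal sketch of the V$LOCK approach is below: it pairs each session that is requesting a lock with the session that holds the same resource and is flagged as blocking. V$LOCK only covers the local instance, so on RAC you would need an equivalent query against GV$LOCK; treat this as a starting point rather than a complete lock report.

```sql
-- Pair blocking sessions with the sessions waiting on the same resource.
select holder.sid     holder_sid,
       waiter.sid     waiter_sid,
       holder.type    lock_type,
       holder.id1,
       holder.id2,
       holder.lmode   held_mode,
       waiter.request requested_mode,
       waiter.ctime   seconds_waiting
from   v$lock holder,
       v$lock waiter
where  holder.block   > 0           -- holder is blocking someone
and    waiter.request > 0           -- waiter is asking for a lock it does not have
and    holder.type    = waiter.type
and    holder.id1     = waiter.id1
and    holder.id2     = waiter.id2
order  by waiter.ctime desc
/
```

The holder SID can then be fed into the scripts from Step 2 to see what that session is running before deciding whether to kill it.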
Step #5:
Fix it. This could be a quick fix (resolving a lock), or more likely will take some time to assess. If your cause of slowness is I/O-related waits, for example, you need to determine whether the SQL is optimized, whether indexes will help, whether concurrency patterns are abnormal (e.g., are there 50 simultaneous executions of a batch job that should only be running serially?), etc. SQL optimization is a science in itself that requires knowledge of the underlying data structures and data volumes, as well as an understanding of Oracle’s optimizer.
Step #6:
What if V$SESSION_WAIT doesn’t tell you anything meaningful? This is when you should consult ASH (Active Session History) views to give you time breakdown details:
select
decode(nvl(to_char(s.sid),-1),-1,'DISCONNECTED','CONNECTED')
"STATUS",
topsession.sid "SID",
topsession.program "PROGRAM",
max(topsession.CPU) "CPU",
max(topsession.WAIT) "WAITING",
max(topsession.IO) "IO",
max(topsession.TOTAL) "TOTAL"
from (
select * from (
select
ash.session_id sid,
ash.session_serial# serial#,
ash.user_id user_id,
ash.program,
sum(decode(ash.session_state,'ON CPU',1,0)) "CPU",
sum(decode(ash.session_state,'WAITING',1,0)) -
sum(decode(ash.session_state,'WAITING',
decode(wait_class,'User I/O',1, 0 ), 0)) "WAIT" ,
sum(decode(ash.session_state,'WAITING',
decode(wait_class,'User I/O',1, 0 ), 0)) "IO" ,
sum(decode(session_state,'ON CPU',1,1)) "TOTAL"
from v$active_session_history ash
group by session_id,user_id,session_serial#,program
order by sum(decode(session_state,'ON CPU',1,1)) desc
) where rownum < 10
) topsession,
v$session s,
all_users u
where
u.user_id =topsession.user_id and
/* outer join to v$session because the session might be disconnected */
topsession.sid = s.sid (+) and
topsession.serial# = s.serial# (+)
group by topsession.sid, topsession.serial#,
topsession.user_id, topsession.program, s.username,
s.sid,s.paddr,u.username
order by max(topsession.TOTAL) desc
/
The output may look like this:
STATUS         Sid PROGRAM                           CPU    WAITING    IO  TOTAL
------------ ----- ------------------------- ----------- ---------- ----- ------
DISCONNECTED  4518                                 11220          9    83  11312
CONNECTED     4584                                  9620         25    50   9695
DISCONNECTED  4683 das@usplsvpba002.verigy.n        5598        258   735   6591
                   et (TNS V1-V3)
CONNECTED     4888 oracle@usplsvped002.verig         483        956     0   1439
                   y.net (LGWR)
CONNECTED     4897 oracle@usplsvped002.verig           7       1119     0   1126
                   y.net (LMON)
DISCONNECTED  4614                                   158         75   552    785
DISCONNECTED  4698 sqlservr.exe                       52         20   695    767
DISCONNECTED  4491 sqlservr.exe                      102         35   496    633
DISCONNECTED  4698 sqlservr.exe                       14          9   578    601
You can use the methods in Steps 2-3 above to get details about the sessions above.
Step #7:
If nothing stands out at this point, consult system logs and Oracle alert logs, as well as O/S performance tools (sar, top, glance, etc.).











