Row cache wait in Oracle RAC
Hi,
Our application is running on Oracle RAC. At certain times of the day, the application responds very slowly. At these times, we observe that row cache waits are very high. We have even tried altering the sys.AUDSES$ sequence, changing its cache size from the default of 20 to 10000, but this did not help.
Can anyone suggest a solution for this problem? And why does this problem occur?
Hi,
It looks like your problem is related to the fact that you do not cache sequences (this is a well-known RAC tuning topic).
Oracle introduced sequences (wrong name, definitely) to generate unique numbers, not to actually support a time sequence of events, or to preserve an order or to have ascending sequences of numbers with no gaps.
Ordering a sequence of events is a serialization process that should not be implemented by a sequence.
Now, if you do not cache sequences, in RAC the lock (enqueue) on the sequence (which is required when you ask for the next set of values) is a global resource on which inter-instance contention occurs.
Furthermore, in case the application has a high volume of inserts, having nocache sequences leads to inter instance index block contention.
Oracle says that the default cache value of 20 for sequences is inappropriate in most RAC implementations, and caches of 1000 values or more are common. You need to test to find your ideal value.
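As a rough sketch of how to find candidate sequences and raise their cache size: the threshold of 100, the schema APP_OWNER and the sequence ORDER_ID_SEQ below are hypothetical placeholders, not names from this thread. NOORDER is only safe if the application does not depend on sequence values being issued in order across instances.

```sql
-- List non-SYS sequences whose cache is small (or NOCACHE, where cache_size = 0).
-- The threshold of 100 is an arbitrary starting point, not an Oracle recommendation.
SELECT sequence_owner, sequence_name, cache_size, order_flag
FROM   dba_sequences
WHERE  cache_size < 100
AND    sequence_owner NOT IN ('SYS', 'SYSTEM')
ORDER  BY sequence_owner, sequence_name;

-- Raise the cache for one sequence (hypothetical owner and name).
ALTER SEQUENCE app_owner.order_id_seq CACHE 1000 NOORDER;
```

A larger cache means larger gaps in the numbers after an instance restart, so check the application's assumptions before changing it.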
Now it is up to you to decide between:
- keep things as they are and have a non scalable RAC installation
- find a way to cache sequences without harming the application assumptions.
Hope it helps,
Regards,
Corrado
Similar Messages
-
Enable Cache Fusion in Oracle RAC
Hi gurus,
I cannot find on Google how to enable and test the Cache Fusion feature in Oracle RAC. Could you help me please?
Best.
I don't know. I cannot find a parameter which enables or disables CF.
As Aman has already stated, the feature is already present:
http://download.oracle.com/docs/cd/B28359_01/server.111/b28318/consist.htm#CNCPT1317
I even cannot find information on how to test CF to ensure that it really works?!
http://download.oracle.com/docs/cd/B28359_01/rac.111/b28254/monitor.htm#RACAD981
@Aman
Cache Fusion is the technology which makes the 10g RAC, 10g RAC. And 11g Database as well :)
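One hedged way to see Cache Fusion at work, assuming a 10g or 11g RAC database where these statistic names exist, is to watch the global cache block transfer counters on each instance. Counters that grow while sessions on different nodes touch the same blocks indicate that blocks are being shipped over the interconnect.

```sql
-- Global cache (Cache Fusion) block transfer counters per instance.
-- Run it twice, a few minutes apart, and compare the values.
SELECT inst_id, name, value
FROM   gv$sysstat
WHERE  name IN ('gc cr blocks received',
                'gc current blocks received',
                'gc cr blocks served',
                'gc current blocks served')
ORDER  BY inst_id, name;
```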
Edited by: Amy De Caj on Jul 19, 2009 4:26 AM
Edited by: Amy De Caj on Jul 19, 2009 4:28 AM -
Performance issues; waited too long for a row cache enqueue lock!
hi Experts,
OS: Oracle Solaris on SPARC (64-bit)
DB version:
SQL> select * from V$VERSION;
BANNER
Oracle Database 11g Release 11.2.0.1.0 - 64bit Production
PL/SQL Release 11.2.0.1.0 - Production
CORE 11.2.0.1.0 Production
TNS for Solaris: Version 11.2.0.1.0 - Production
NLSRTL Version 11.2.0.1.0 - Production
SQL>
We saw 100% CPU usage and high database load, so I checked the instance and found many blocking sessions and more than 71 sessions running the same select:
select tablespace_name as tbsname from (select tablespace_name,sum(bytes)/1024/1024 free_mb,0 total_mb,0 max_mb from dba_free_space group by tablespace_name union select tablespace_name, 0 current_mb,sum(bytes)/1024/1024 total_mb, sum(decode(maxbytes, 0, bytes, maxbytes))/1024/1024 max_mb from dba_data_files group by tablespace_name) group by tablespace_name having round((sum(total_mb)-sum(free_mb))/sum(max_mb)*100) > 95
Blocking sessions are running queries like this:
SELECT * from MYTABLE WHERE MYCOL=:1 FOR UPDATE;
These select queries come from a cron job that runs every 10 minutes to check the tablespaces, so I first killed (kill -9 pid) those select statements, and CPU usage dropped to 13%. The blocking sessions were still there; I did not kill them while waiting for confirmation from the app team. After a few hours, during which CPU usage never dropped below 13%, I saw many errors:
WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! pid=...System State dumped to trace file .....trc
After that, we decided to restart the DB to release the locks!
I would like to understand why, during the load, we were not able to run those select statements, and why the scheduled statspack snapshot reports and the automatic database statistics were not able to finish... Why did 5 FOR UPDATE statements lock the whole DB?
user12035575 wrote:
SELECT FOR UPDATE will only lock the table row until the transaction is completed.
"WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK" happens when it needs to acquire a lock on the data dictionary. Did you check the trace file associated with the statement?
The trace file is too long; which information do I need to focus on? -
Error: WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! pid=26
Hi everyone,
Today I hit a problem: the application cannot connect to the database because the database hangs (I also cannot connect to the database with sqlplus). I checked the alert log and found only one error:
WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! pid=26
This is not the first time this error has appeared; it happens about once a month. I have to restart the server so that the database releases all memory, but I do not think this is a real solution!
Could you give a recommendation for this?
Regards.
The Row Cache is actually the Data Dictionary Cache. It is where definitions from the data dictionary (tablespaces, objects, users etc) are loaded into memory.
There would be an associated trace file written with the occurrence of this warning.
See Oracle Support Note 278316.1 for more information
Hemant K Chitale -
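While the instance is still reachable, a minimal sketch for finding which dictionary cache the waiters are queued on could look like this. It assumes 10g or later (for the BLOCKING_SESSION and SECONDS_IN_WAIT columns) and relies on P1 of the "row cache lock" wait event being the cache id. The bind :cache_id is a placeholder for a P1 value returned by the first query.

```sql
-- Sessions currently waiting on a row cache (dictionary cache) lock.
SELECT sid, serial#, username,
       p1 AS cache_id, p2 AS mode_held, p3 AS mode_requested,
       seconds_in_wait, blocking_session
FROM   v$session
WHERE  event = 'row cache lock'
ORDER  BY seconds_in_wait DESC;

-- Translate a cache id from the first query into a dictionary cache name.
SELECT cache#, type, parameter
FROM   v$rowcache
WHERE  cache# = :cache_id;
```

The cache name (for example dc_sequences or dc_tablespaces) narrows down which kind of dictionary object is being fought over, which is usually a better starting point than reading the whole system state dump.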
Latch: row cache objects
Hello everyone,
Note: Apologies for the bad formatting; I tried, but it seems I forgot how to use it
BANNER
Oracle Database 11g Release 11.2.0.2.0 - 64bit Production
I've seen high "*latch: row cache objects*" in SP/ASH report for ~14 hours back, when the users were unable to connect to the database. There were,
WARNING: inbound connection timed out (ORA-3136)
Time: 30-APR-2012 02:24:36
Tracing not turned on.
Tns error struct:
errors all over the alert log for the duration of 6 minutes of the problem.
I've put in bold the few records that led me to conclude that the problem was with "dc_users".
Can anybody tell me how/where I should proceed?
SP report:
Instance Efficiency Indicators
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 100.00 Redo NoWait %: 100.00
Buffer Hit %: 99.84 Optimal W/A Exec %: 100.00
Library Hit %: 97.43 Soft Parse %: 87.86
Execute to Parse %: 22.54 Latch Hit %: 99.95
Parse CPU to Parse Elapsd %: 0.30 % Non-Parse CPU: 87.83
Shared Pool Statistics Begin End
Memory Usage %: 45.09 46.98
% SQL with executions>1: 11.49 13.15
% Memory for SQL w/exec>1: 72.96 21.33
Top 5 Timed Events                                      Avg %Total
~~~~~~~~~~~~~~~~~~                                     wait   Call
Event                             Waits    Time (s)    (ms)   Time
latch: row cache objects          6,655     634,260   95306   97.0
log file sync                   289,923       6,469      22    1.0
CPU time                                      5,039             .8
db file sequential read         310,084       2,840       9     .4
log file parallel write         451,706       1,144       3     .2
ASH Report
Analysis Begin Time: 30-Apr-12 02:24:00
Analysis End Time: 30-Apr-12 02:30:00
Elapsed Time: 6.0 (mins)
Begin Data Source: DBA_HIST_ACTIVE_SESS_HISTORY
in AWR snapshot 12185
End Data Source: DBA_HIST_ACTIVE_SESS_HISTORY
in AWR snapshot 12185
Sample Count: 1,385
Average Active Sessions: 38.47
Avg. Active Session per CPU: 1.60
Report Target: None specified
Top User Events DB/Inst: NIKU/niku (Apr 30 02:24 to 02:30)
Avg Active
Event Event Class % Event Sessions
latch: row cache objects Concurrency 75.45 29.03
CPU + Wait for CPU CPU 9.75 3.75
log file sync Commit 3.83 1.47
db file sequential read User I/O 3.61 1.39
Top Event P1/P2/P3 Values DB/Inst: NIKU/niku (Apr 30 02:24 to 02:30)
Event % Event P1 Value, P2 Value, P3 Value % Activity
Parameter 1 Parameter 2 Parameter 3
latch: row cache objects 75.60 "42287858200","279","0" 75.60
address number tries
1* select addr, latch#, child#, name, misses, gets from v$latch_children where name like '%row%cache%objec%' order by gets , misses
niku> /
ADDR LATCH# CHILD# NAME MISSES GETS
0000000A16FF21C8 279 26 row cache objects 0 0
0000000A16FF14C8 279 2 row cache objects 0 0
00000009D88D7ED8 279 3 row cache objects 0 0
0000000A16FF1B48 279 14 row cache objects 0 0
00000009D88D8558 279 15 row cache objects 0 0
0000000A16FF1CE8 279 17 row cache objects 0 0
0000000A26265A28 279 19 row cache objects 0 0
0000000A16FF1E88 279 20 row cache objects 0 0
00000009D88D8898 279 21 row cache objects 0 0
0000000A26265BC8 279 22 row cache objects 0 0
0000000A16FF2028 279 23 row cache objects 0 0
00000009D88D8A38 279 24 row cache objects 0 0
0000000A26265D68 279 25 row cache objects 0 0
00000009D88D8BD8 279 27 row cache objects 0 0
0000000A26265F08 279 28 row cache objects 0 0
00000009D88D8D78 279 30 row cache objects 0 0
0000000A262660A8 279 31 row cache objects 0 0
0000000A16FF2508 279 32 row cache objects 0 0
0000000A16FF26A8 279 35 row cache objects 0 0
00000009D88D90B8 279 36 row cache objects 0 0
0000000A262663E8 279 37 row cache objects 0 0
0000000A262668C8 279 46 row cache objects 0 0
0000000A26266A68 279 49 row cache objects 0 0
0000000A16FF2368 279 29 row cache objects 0 11
0000000A16FF2848 279 38 row cache objects 0 116
0000000A16FF29E8 279 41 row cache objects 0 200
00000009D88D93F8 279 42 row cache objects 0 318
00000009D88D9258 279 39 row cache objects 0 1010
0000000A16FF2EC8 279 50 row cache objects 0 1406
00000009D88D9598 279 45 row cache objects 0 1472
0000000A26266588 279 40 row cache objects 0 1705
0000000A26266728 279 43 row cache objects 0 7383
0000000A16FF2B88 279 44 row cache objects 0 32346
00000009D88D98D8 279 51 row cache objects 19 63948
0000000A26265888 279 16 row cache objects 0 88045
0000000A26266248 279 34 row cache objects 0 141176
00000009D88D9738 279 48 row cache objects 0 326672
0000000A16FF19A8 279 11 row cache objects 867 1770385
00000009D88D8078 279 6 row cache objects 9 1979542
0000000A16FF2D28 279 47 row cache objects 2 3435018
00000009D88D86F8 279 18 row cache objects 2557 14956121
0000000A26265068 279 1 row cache objects 224 24335868
0000000A262653A8 279 7 row cache objects 29760 133991553
00000009D88D8F18 279 33 row cache objects 60612 677263122
00000009D88D83B8 279 12 row cache objects 23981 739014460
0000000A26265208 279 4 row cache objects 19973399 852043775
0000000A26265548 279 10 row cache objects 280137 856097342
00000009D88D8218 279 9 row cache objects 715879777 1219000976
0000000A262656E8 279 13 row cache objects 3856073 2397402780
0000000A16FF1668 279 5 row cache objects 12763217 2920278217
*0000000A16FF1808 279 8 row cache objects 67329804 4145389092*
51 rows selected.
niku> list
1 select addr, latch#, child#, name, misses, gets from v$latch_children where name like '%row%cache%objec%' order by gets , misses
niku> select distinct s.kqrstcln latch#,r.cache#,r.parameter name,r.type,r.subordinate#
from v$rowcache r,x$kqrst s
where r.cache#=s.kqrstcid
order by 1,4,5;
LATCH# CACHE# NAME TYPE SUBORDINATE#
1 3 dc_rollback_segments PARENT
2 1 dc_free_extents PARENT
3 4 dc_used_extents PARENT
4 2 dc_segments PARENT
5 0 dc_tablespaces PARENT
6 5 dc_tablespace_quotas PARENT
7 6 dc_files PARENT
*8 10 dc_users PARENT*
*8 7 dc_users SUBORDINATE 0*
*8 7 dc_users SUBORDINATE 1*
*8 7 dc_users SUBORDINATE 2*
9 8 dc_objects PARENT
9 8 dc_object_grants SUBORDINATE 0
10 17 dc_global_oids PARENT
11 12 dc_constraints PARENT
12 13 dc_sequences PARENT
13 16 dc_histogram_defs PARENT
13 16 dc_histogram_data SUBORDINATE 0
13 16 dc_histogram_data SUBORDINATE 1
14 54 dc_sql_prs_errors PARENT
15 32 kqlsubheap_object PARENT
16 19 dc_table_scns PARENT
16 19 dc_partition_scns SUBORDINATE 0
17 18 dc_outlines PARENT
18 14 dc_profiles PARENT
19 47 realm cache PARENT
19 47 realm auth SUBORDINATE 0
20 48 Command rule cache PARENT
21 49 Realm Object cache PARENT
21 49 Realm Subordinate Cache SUBORDINATE 0
22 46 Rule Set Cache PARENT
23 34 extensible security user and rol PARENT
24 35 extensible security principal pa PARENT
25 37 extensible security UID to princ PARENT
26 36 extensible security principal na PARENT
27 33 extensible security principal ne PARENT
28 38 XS security class privilege PARENT
29 39 extensible security midtier cach PARENT
30 43 AV row cache 1 PARENT
31 44 AV row cache 2 PARENT
32 45 AV row cache 3 PARENT
33 15 global database name PARENT
34 20 rule_info PARENT
35 21 rule_or_piece PARENT
35 21 rule_fast_operators SUBORDINATE 0
36 23 dc_qmc_ldap_cache_entries PARENT
37 52 qmc_app_cache_entries PARENT
38 53 qmc_app_cache_entries PARENT
39 27 qmtmrcin_cache_entries PARENT
40 28 qmtmrctn_cache_entries PARENT
41 29 qmtmrcip_cache_entries PARENT
42 30 qmtmrctp_cache_entries PARENT
43 31 qmtmrciq_cache_entries PARENT
44 26 qmtmrctq_cache_entries PARENT
45 9 qmrc_cache_entries PARENT
46 50 qmemod_cache_entries PARENT
47 24 outstanding_alerts PARENT
48 22 dc_awr_control PARENT
49 25 SMO rowcache PARENT
50 40 sch_lj_objs PARENT
51 41 sch_lj_oids PARENT
61 rows selected.
niku> select parameter, gets from v$rowcache order by gets desc;
PARAMETER GETS
dc_users 2802019571
dc_tablespaces 2405092307
dc_objects 1815427326
jjk wrote:
I've already been thru the link that you've mentioned and unfortunately couldn't make much use of it.
I didn't think it was really likely to be relevant, but there was always a long shot that it might have given you a clue.
Considering the "dc_users" had maximum gets, I thought (rather as per internet) that it might be the point of contention. However I did observe high misses on child# 9 which is "dc_objects".
It's often the case that the misses are more important than the gets when you see lots of gets and misses on a few latches/caches - the bit that might have been most instructive was the dictionary cache bit from the AWR showing gets, misses, scans, scanmisses etc. It might have told us a little about what was going in and out of the dictionary cache and let us guess why.
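To make the gets-versus-misses point concrete, a small sketch that ranks the child latches by miss percentage rather than by gets:

```sql
-- Rank "row cache objects" child latches by how often a get had to retry.
SELECT child#, gets, misses, sleeps,
       ROUND(100 * misses / NULLIF(gets, 0), 2) AS miss_pct
FROM   v$latch_children
WHERE  name = 'row cache objects'
ORDER  BY miss_pct DESC NULLS LAST;
```

Applied to the figures posted above, child# 9 (dc_objects) shows 715,879,777 misses on 1,219,000,976 gets, roughly 59%, while child# 8 (dc_users) shows 67,329,804 misses on 4,145,389,092 gets, under 2%, even though it has the most gets.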
In alert log:
Sun Apr 29 02:20:00 2012
29-APR-2012 02:20:00 -- xxxxxxx package - REGRANT_READONLY Begin re-grant read only roles
Sun Apr 29 02:24:34 2012
29-APR-2012 02:24:34 -- xxxxxxx package - REGRANT_READONLY End re-grant read only roles
Sun Apr 29 02:30:00 2012
29-APR-2012 02:30:00 -- xxxxxxx package - REGRANT_READWRITE Begin re-grant read write roles
Sun Apr 29 02:32:02 2012
29-APR-2012 02:32:02 -- xxxxxxx package - REGRANT_READWRITE End re-grant read write roles
Is this code that "regrants" roles to users who already have them? That's what it sounds like, and that sounds like something that would have an impact on various parts of the dictionary cache, especially dc_users, and possibly dc_objects.
CPU per Elap per Old
Executions Rows Processed Rows per Exec Exec (s) Exec (s) Hash Value
161,198 1,244 0.0 0.00 0.00 978935325
select /*+ rule */ c.name, u.name from con$ c, cdef$ cd, user$ u
where c.con# = cd.con# and cd.enabled = :1 and c.owner# = u.us
er#
159,955 159,952 1.0 0.00 0.00 2458412332
select o.name, u.name from obj$ o, user$ u where o.obj# = :1 an
d o.owner# = u.user#
159,932 6 0.0 0.00 0.00 2636710067
insert into objauth$(option$,grantor#,obj#,privilege#,grantee#,c
ol#,sequence#) values(decode(:1,0,null,:1),:2,:3,:4,:5,decode(:6
,0,null,:6),object_grant.nextval)
147,168 147,168 1.0 0.00 0.00 3468666020
select text from view$ where rowid=:1
124,635 124,635 1.0 0.00 0.00 564166580
select count(*) from ( select u.
name from registry$ r, us
er$ u where r.status in (1,3,5)
and r.namespace = 'SERVER'
The first one looks like a response to a constraint being breached.
The third one looks like something that might happen when you grant a privilege on an object to a user - and maybe the first one happens if the user has already got it and the insert raises a "duplicate key" error. The fourth one commonly happens when you have to re-optimize a query containing a view - and when you execute DDL (such as changing privileges on an object) you invalidate SQL and have to re-optimize it eventually. I can't remember where I've seen the second one appearing.
If you have a process that tries to do a lot of grants on objects to users and roles in a very short time, it's quite likely to create havoc in the dictionary cache - check what that package was up to and why it runs.
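If the package really does re-issue grants that already exist (an assumption; the thread does not show its code), one possible mitigation is to grant only what is missing. The sketch below is illustrative only: the schema APP_OWNER, the role APP_READONLY and the restriction to SELECT on tables are hypothetical.

```sql
-- Grant SELECT only on tables the role cannot already read,
-- instead of re-granting everything on every run.
BEGIN
  FOR r IN (
    SELECT t.owner, t.table_name
    FROM   dba_tables t
    WHERE  t.owner = 'APP_OWNER'                  -- hypothetical schema
    AND    NOT EXISTS (
             SELECT 1
             FROM   dba_tab_privs p
             WHERE  p.grantee    = 'APP_READONLY' -- hypothetical role
             AND    p.owner      = t.owner
             AND    p.table_name = t.table_name
             AND    p.privilege  = 'SELECT')
  ) LOOP
    EXECUTE IMMEDIATE 'GRANT SELECT ON "' || r.owner || '"."' ||
                      r.table_name || '" TO APP_READONLY';
  END LOOP;
END;
/
```

Fewer grant statements mean fewer recursive dictionary inserts and fewer cursor invalidations, which is where the dictionary cache pressure would come from in this scenario.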
What is the missing information ?
When I looked at some of your posting, the output didn't match the query, some of the later columns had gone missing - this might have been my browser rather than your input though.
Regards
Jonathan Lewis -
"latch: row cache objects" and high "VERSION_COUNT"
Hello,
we are faced with a situation where the database spends most of its time waiting for latches in the shared pool (as seen in the AWR report).
All statements issued by the application use bind variables, but what we can see in V$SQL is that, even so, some of them have a relatively high version_count (> 300) and many invalidations (100 - 200), even though the tables involved are very small (some not more than 3 or 4 rows).
Here is some (hopefully enough) information about the environment
Version: Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production (on RedHat EL 5)
Parameters:
cursor_bind_capture_destination memory+disk
cursor_sharing EXACT
cursor_space_for_time FALSE
filesystemio_options none
hi_shared_memory_address 0
memory_max_target 12288M
memory_target 12288M
object_cache_optimal_size 102400
open_cursors 300
optimizer_capture_sql_plan_baselines FALSE
optimizer_dynamic_sampling 2
optimizer_features_enable 11.2.0.2
optimizer_index_caching 0
optimizer_index_cost_adj 100
optimizer_mode ALL_ROWS
optimizer_secure_view_merging TRUE
optimizer_use_invisible_indexes FALSE
optimizer_use_pending_statistics FALSE
optimizer_use_sql_plan_baselines TRUE
plsql_optimize_level 2
session_cached_cursors 50
shared_memory_address 0
The shared pool size (according to AWR) is 4,832M
The buffer cache is 3,008M
Now, my question: is a version_count of > 300 a problem (we have about 10-15 of those with a total of ~7000 statements in v$sqlarea). Those are also the statements listed in the AWR report at the top in the section "SQL ordered by Version Count" and "SQL ordered by Sharable Memory"
Is it possible that those statements are causing the latch contention in the shared pool?
I went through https://blogs.oracle.com/optimizer/entry/why_are_there_more_cursors_in_11g_for_my_query_containing_bind_variables_1
The tables involved are fairly small and all the execution plans for each cursor are identical.
I can understand some of the invalidations that happen, because we have 7 schemas that have identical tables, but from my understanding that shouldn't cause such a high invalidation number. Or am I mistaken?
I'm not that experienced with Oracle tuning at that level, so I would appreciate any pointer on how I can find out where exactly the latch problem occurs
After flushing the shared pool, the problem seems to go away for a while. But apparently that is only fighting symptoms, not fixing the root cause of the problem.
Some of the statements in question:
SELECT * FROM QRTZ_SIMPLE_TRIGGERS WHERE TRIGGER_NAME = :1 AND TRIGGER_GROUP = :2
UPDATE QRTZ_TRIGGERS SET TRIGGER_STATE = :1 WHERE TRIGGER_NAME = :2 AND TRIGGER_GROUP = :3 AND TRIGGER_STATE = :4
UPDATE QRTZ_TRIGGERS SET TRIGGER_STATE = :1 WHERE JOB_NAME = :2 AND JOB_GROUP = :3 AND TRIGGER_STATE = :4
SELECT TRIGGER_STATE FROM QRTZ_TRIGGERS WHERE TRIGGER_NAME = :1 AND TRIGGER_GROUP = :2
UPDATE QRTZ_SIMPLE_TRIGGERS SET REPEAT_COUNT = :1, REPEAT_INTERVAL = :2, TIMES_TRIGGERED = :3 WHERE TRIGGER_NAME = :4 AND TRIGGER_GROUP = :5
DELETE FROM QRTZ_TRIGGER_LISTENERS WHERE TRIGGER_NAME = :1 AND TRIGGER_GROUP = :2
So all of them are using bind variables.
I have seen that the columns used in the where clause all have histograms available. Would removing them reduce the number of invalidations?
Unfortunately I did not save the information from v$sql_shared_cursor before the shared pool was flushed, but most of the invalidations occurred in the ROLL_INVALID_MISMATCH column, if that is of any help. There are some invalidations reported for AUTH_CHECK_MISMATCH and TRANSLATION_MISMATCH, but to my understanding they are caused by executing the statement for different schemas, if I'm not mistaken.
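To capture that information before the next flush, a sketch along these lines summarises the mismatch reasons per statement. The threshold of 50 child cursors is an arbitrary assumption, and only the three columns named above are counted; v$sql_shared_cursor has many more.

```sql
-- Count child cursors per statement and how many carry each mismatch flag.
SELECT sql_id,
       COUNT(*) AS child_cursors,
       SUM(CASE WHEN roll_invalid_mismatch = 'Y' THEN 1 ELSE 0 END) AS roll_invalid,
       SUM(CASE WHEN auth_check_mismatch   = 'Y' THEN 1 ELSE 0 END) AS auth_check,
       SUM(CASE WHEN translation_mismatch  = 'Y' THEN 1 ELSE 0 END) AS translation
FROM   v$sql_shared_cursor
GROUP  BY sql_id
HAVING COUNT(*) > 50            -- hypothetical threshold
ORDER  BY child_cursors DESC;
```

Saving the output of this query on a schedule (for example into a small history table) would make it possible to compare the reasons before and after the problem appears.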
Looking at v$latch_missed, most of the waits for parent = 'row cache objects' are for "kqrpre: find obj" and "kqreqd: reget"
> In the AWR report, what does the Dictionary Cache Stats section say?
Here they are:
Dictionary Cache Stats
Cache                 Get Requests  Pct Miss  Scan Reqs  Mod Reqs  Final Usage
dc_awr_control                  65      0.00          0         2            1
dc_constraints                 729     33.33          0       729            1
dc_global_oids                  60     23.33          0         0           31
dc_histogram_data            7,397     10.53          0         0        2,514
dc_histogram_defs           21,797      9.83          0         0        5,239
dc_object_grants                 4     25.00          0         0           12
dc_objects                  27,683      2.29          0       223        2,581
dc_profiles                  1,842      0.00          0         0            1
dc_rollback_segments         1,634      0.00          0         0           39
dc_segments                  7,335      6.94          0       360        1,679
dc_sequences                   139      5.76          0       139           19
dc_table_scns                   53    100.00          0         0            0
dc_tablespace_quotas         1,956      0.10          0         0            4
dc_tablespaces              17,488      0.00          0         0           11
dc_users                    58,013      0.03          0         0          164
global database name         4,261      0.00          0         0            1
outstanding_alerts              54      0.00          0         0            9
sch_lj_oids                      4      0.00          0         0            2
Library Cache Activity
Namespace Get Requests Pct Miss Pin Requests Pct Miss Reloads Invalidations
ACCOUNT_STATUS 3,664 0.03 0 0 0
BODY 560 2.14 2,343 0.60 0 0
CLUSTER 52 0.00 52 0.00 0 0
DBLINK 3,668 0.00 0 0 0
EDITION 1,857 0.00 3,697 0.00 0 0
INDEX 99 19.19 99 19.19 0 0
OBJECT ID 68 100.00 0 0 0
SCHEMA 2,646 0.00 0 0 0
SQL AREA 32,996 2.26 1,142,497 0.21 189 226
SQL AREA BUILD 848 62.15 0 0 0
SQL AREA STATS 860 82.09 860 82.09 0 0
TABLE/PROCEDURE 17,713 2.62 26,112 4.88 61 0
TRIGGER 1,704 2.00 6,737 0.52 1 0 -
ROW CACHE ENQUEUE LOCK/library cache load lock leads to database hang
We faced a database hang on a 3-node 11i ERP 9i RAC database.
We saw library cache load lock timed-out events reported in the alert log.
Then came a few ORA-600 errors and later a ROW CACHE ENQUEUE LOCK timed-out event. Eventually the database hung and we had to bounce the services.
We created support SR 7845542.992 for RCA.
Support says to increase the shared pool size to avoid shared pool fragmentation and reloads, and additionally to upgrade to a 10g database.
I am not convinced that adding shared pool space or upgrading to 10g would solve this; furthermore, even 10g has such issues reported.
I saw a couple of bugs mentioning that such an issue can happen due to a deadlock between sessions holding latches.
Kindly let me know your view on the issue.
If required I can attach statspack for more information.
Many thanks, I was keen to have your update.
There are 8 CPUs on each node. Reloads were very high during that time period, but normally reloads are not high.
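To check between Statspack snapshots whether reloads and shared pool free memory are moving in the direction support suggests, a minimal sketch against the cluster-wide views (the values are cumulative since instance startup, so compare two runs):

```sql
-- Library cache reloads and invalidations per namespace on every instance.
SELECT inst_id, namespace, gets, pins, reloads, invalidations
FROM   gv$librarycache
ORDER  BY reloads DESC;

-- Free memory remaining in the shared pool on every instance.
SELECT inst_id, pool, name, ROUND(bytes / 1024 / 1024) AS mb
FROM   gv$sgastat
WHERE  pool = 'shared pool'
AND    name = 'free memory';
```

Reloads that climb sharply while free memory stays low would support the shared pool sizing argument. Reloads that climb while plenty of memory is free would point elsewhere, such as invalidations.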
Statspack details for 3 nodes
STATSPACK report for
DB Name DB Id Instance Inst Num Release Cluster Host
PROD 21184234 PROD1 1 9.2.0.8.0 YES npi-or-db-p-11.npi.corp
Snap Id Snap Time Sessions Curs/Sess Comment
Begin Snap: 149817 30-Oct-09 13:00:09 574 #########
End Snap: 149837 30-Oct-09 14:00:17 602 #########
Elapsed: 60.13 (mins)
Cache Sizes (end)
~~~~~~~~~~~~~~~~~
Buffer Cache: 8,192M Std Block Size: 8K
Shared Pool Size: 1,024M Log Buffer: 10,240K
Load Profile
~~~~~~~~~~~~ Per Second Per Transaction
Redo size: 122,414.93 11,449.13
Logical reads: 69,550.76 6,504.89
Block changes: 928.41 86.83
Physical reads: 196.24 18.35
Physical writes: 28.65 2.68
User calls: 343.97 32.17
Parses: 558.61 52.25
Hard parses: 43.48 4.07
Sorts: 467.24 43.70
Logons: 0.63 0.06
Executes: 2,046.99 191.45
Transactions: 10.69
% Blocks changed per Read: 1.33 Recursive Call %: 97.59
Rollback per transaction %: 5.07 Rows per Sort: 15.85
Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 100.00 Redo NoWait %: 100.00
Buffer Hit %: 99.72 In-memory Sort %: 100.00
Library Hit %: 96.79 Soft Parse %: 92.22
Execute to Parse %: 72.71 Latch Hit %: 99.77
Parse CPU to Parse Elapsd %: 60.10 % Non-Parse CPU: 78.07
-> s - second
-> cs - centisecond - 100th of a second
-> ms - millisecond - 1000th of a second
-> us - microsecond - 1000000th of a second
-> ordered by wait time desc, waits desc (idle events last)
Avg
Total Wait wait Waits
Event Waits Timeouts Time (s) (ms) /txn
db file sequential read 249,234 0 1,537 6 6.5
db file scattered read 61,776 0 769 12 1.6
row cache lock 780,098 10 566 1 20.2
library cache lock 697,849 157 432 1 18.1
latch free 127,926 4,715 387 3 3.3
global cache cr request 370,770 3,091 309 1 9.6
PL/SQL lock timer 59 58 112 1903 0.0
wait for scn from all nodes 303,572 18 103 0 7.9
library cache pin 26,231 2 100 4 0.7
global cache null to x 17,717 716 92 5 0.5
buffer busy waits 5,388 18 74 14 0.1
db file parallel read 5,245 0 69 13 0.1
log file sync 20,407 29 66 3 0.5
enqueue 52,200 70 60 1 1.4
buffer busy global CR 4,845 33 55 11 0.1
CGS wait for IPC msg 412,512 407,106 50 0 10.7
ksxr poll remote instances 1,279,565 483,046 48 0 33.2
log file parallel write 160,040 0 42 0 4.1
library cache load lock 1,491 2 29 20 0.0
global cache open x 19,507 344 28 1 0.5
buffer busy global cache 957 0 22 23 0.0
global cache s to x 16,516 180 20 1 0.4
db file parallel write 11,120 0 12 1 0.3
log file sequential read 618 0 11 18 0.0
DFS lock handle 23,768 0 10 0 0.6
control file sequential read 8,563 0 4 0 0.2
KJC: Wait for msg sends to c 1,549 57 4 3 0.0
lock escalate retry 76 76 4 52 0.0
SQL*Net break/reset to clien 12,546 0 3 0 0.3
SQL*Net more data to client 85,773 0 3 0 2.2
control file parallel write 1,265 0 2 1 0.0
global cache null to s 648 23 1 2 0.0
global cache busy 200 0 1 5 0.0
global cache open s 1,493 28 1 1 0.0
log file switch completion 12 0 1 61 0.0
PX Deq Credit: send blkd 161 70 1 4 0.0
kksfbc child completion 119 118 1 5 0.0
PX Deq: reap credit 5,948 5,456 0 0 0.2
PX Deq: Execute Reply 83 29 0 3 0.0
process startup 8 0 0 25 0.0
LGWR wait for redo copy 992 12 0 0 0.0
IPC send completion sync 450 450 0 0 0.0
PX Deq: Parse Reply 100 28 0 1 0.0
undo segment extension 10,380 10,372 0 0 0.3
PX Deq: Join ACK 146 65 0 1 0.0
buffer deadlock 222 221 0 0 0.0
async disk IO 1,179 0 0 0 0.0
wait list latch free 2 0 0 16 0.0
PX Deq: Msg Fragment 112 28 0 0 0.0
Library Cache Activity for DB: PROD Instance: PROD1 Snaps: 149817 -149837
->"Pct Misses" should be very low
Get Pct Pin Pct Invali-
Namespace Requests Miss Requests Miss Reloads dations
BODY 116,007 1.1 133,347 19.9 24,338 0
CLUSTER 4,224 0.6 5,131 1.0 0 0
INDEX 15,048 24.1 13,798 26.4 2 0
JAVA DATA 82 0.0 692 39.6 136 0
JAVA RESOURCE 66 39.4 206 25.2 12 0
PIPE 1,140 0.5 1,160 0.5 0 0
SQL AREA 1,197,908 12.6 13,517,660 1.5 111,833 73
TABLE/PROCEDURE 3,847,439 0.8 4,230,265 7.9 142,200 0
TRIGGER 8,444 2.4 8,657 18.5 1,274 0
GES Lock GES Pin GES Pin GES Inval GES Invali-
Namespace Requests Requests Releases Requests dations
BODY 1 1,234 1,258 985 0
CLUSTER 3,222 25 25 25 0
INDEX 13,792 3,641 3,631 3,629 0
JAVA DATA 0 0 0 0 0
JAVA RESOURCE 0 26 25 0 0
PIPE 0 0 0 0 0
SQL AREA 0 0 0 0 0
TABLE/PROCEDURE 857,137 13,130 13,264 10,762 0
TRIGGER 0 200 202 200 0
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
STATSPACK report for
DB Name DB Id Instance Inst Num Release Cluster Host
PROD 21184234 PROD2 2 9.2.0.8.0 YES npi-or-db-p-12.npi.corp
Snap Id Snap Time Sessions Curs/Sess Comment
Begin Snap: 149847 30-Oct-09 14:00:05 493 #########
End Snap: 149857 30-Oct-09 15:00:02 432 #########
Elapsed: 59.95 (mins)
Cache Sizes (end)
~~~~~~~~~~~~~~~~~
Buffer Cache: 8,192M Std Block Size: 8K
Shared Pool Size: 1,024M Log Buffer: 10,240K
Load Profile
~~~~~~~~~~~~ Per Second Per Transaction
Redo size: 71,853.44 32,058.65
Logical reads: 273,904.84 122,207.36
Block changes: 889.13 396.70
Physical reads: 40.40 18.03
Physical writes: 20.97 9.35
User calls: 153.74 68.60
Parses: 66.19 29.53
Hard parses: 2.66 1.19
Sorts: 25.70 11.47
Logons: 0.16 0.07
Executes: 726.41 324.10
Transactions: 2.24
% Blocks changed per Read: 0.32 Recursive Call %: 92.41
Rollback per transaction %: 4.84 Rows per Sort: 193.55
Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 100.00 Redo NoWait %: 99.99
Buffer Hit %: 99.99 In-memory Sort %: 100.00
Library Hit %: 99.35 Soft Parse %: 95.97
Execute to Parse %: 90.89 Latch Hit %: 99.99
Parse CPU to Parse Elapsd %: 36.55 % Non-Parse CPU: 98.28
Wait Events for DB: PROD Instance: PROD2 Snaps: 149847 -149857
-> s - second
-> cs - centisecond - 100th of a second
-> ms - millisecond - 1000th of a second
-> us - microsecond - 1000000th of a second
-> ordered by wait time desc, waits desc (idle events last)
Avg
Total Wait wait Waits
Event Waits Timeouts Time (s) (ms) /txn
enqueue 65,823 33,667 90,459 1374 8.2
row cache lock 38,996 560 1,795 46 4.8
PX Deq Credit: send blkd 522 499 1,223 2344 0.1
PX Deq: Parse Reply 466 416 987 2117 0.1
db file sequential read 50,130 0 421 8 6.2
library cache lock 78,842 172 210 3 9.8
db file scattered read 6,904 0 152 22 0.9
global cache cr request 84,801 575 113 1 10.5
latch free 8,096 736 65 8 1.0
log file sync 5,676 27 41 7 0.7
wait for scn from all nodes 18,891 10 24 1 2.3
CGS wait for IPC msg 394,678 392,142 21 0 49.0
library cache pin 1,339 0 17 13 0.2
global cache null to x 2,145 48 16 8 0.3
global cache s to x 3,242 32 16 5 0.4
buffer busy waits 366 10 15 40 0.0
ksxr poll remote instances 70,990 31,295 14 0 8.8
db file parallel read 359 0 11 31 0.0
global cache open x 2,708 55 10 4 0.3
async disk IO 3,474 0 8 2 0.4
global cache open s 3,470 10 6 2 0.4
log file parallel write 13,076 0 5 0 1.6
global cache busy 58 40 5 90 0.0
PL/SQL lock timer 1 1 5 4877 0.0
DFS lock handle 3,362 0 5 1 0.4
log file sequential read 412 0 4 10 0.1
db file parallel write 2,774 0 3 1 0.3
library cache load lock 59 0 3 58 0.0
buffer busy global CR 722 0 3 4 0.1
control file sequential read 6,398 0 3 0 0.8
SQL*Net break/reset to clien 16,078 0 2 0 2.0
name-service call wait 26 0 2 67 0.0
control file parallel write 1,248 0 2 1 0.2
process startup 24 0 1 49 0.0
KJC: Wait for msg sends to c 3,491 4 1 0 0.4
SQL*Net more data to client 23,724 0 1 0 2.9
buffer busy global cache 23 0 0 19 0.0
global cache null to s 114 0 0 4 0.0
PX Deq: reap credit 5,646 5,509 0 0 0.7
log file switch completion 4 0 0 58 0.0
lock escalate retry 54 54 0 1 0.0
IPC send completion sync 119 118 0 0 0.0
direct path read 2,820 0 0 0 0.3
direct path read (lob) 3,632 0 0 0 0.5
PX Deq: Join ACK 88 37 0 0 0.0
direct path write 2,470 0 0 0 0.3
kksfbc child completion 6 6 0 6 0.0
buffer deadlock 3 3 0 11 0.0
global cache quiesce wait 4 4 0 8 0.0
Library Cache Activity for DB: PROD Instance: PROD2 Snaps: 149847 -149857
->"Pct Misses" should be very low
Get Pct Pin Pct Invali-
Namespace Requests Miss Requests Miss Reloads dations
BODY 27,353 0.5 28,091 6.5 1,643 0
CLUSTER 203 1.0 269 1.5 0 0
INDEX 526 9.9 271 19.9 0 0
JAVA DATA 18 0.0 120 6.7 4 0
JAVA RESOURCE 20 45.0 56 26.8 3 0
JAVA SOURCE 1 100.0 1 100.0 0 0
PIPE 999 0.4 1,043 0.4 0 0
SQL AREA 131,793 7.6 3,406,577 0.4 7,012 0
TABLE/PROCEDURE 926,987 0.2 1,907,993 1.0 8,845 0
TRIGGER 1,519 0.1 1,532 4.9 69 0
GES Lock GES Pin GES Pin GES Inval GES Invali-
Namespace Requests Requests Releases Requests dations
BODY 1 129 277 117 0
CLUSTER 168 2 2 2 0
INDEX 271 52 56 52 0
JAVA DATA 0 0 0 0 0
JAVA RESOURCE 0 9 6 0 0
JAVA SOURCE 0 1 1 1 0
PIPE 0 0 0 0 0
SQL AREA 0 0 0 0 0
TABLE/PROCEDURE 89,523 764 868 460 0
TRIGGER 0 2 14 2 0
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
DB Name DB Id Instance Inst Num Release Cluster Host
PROD 21184234 PROD3 3 9.2.0.8.0 YES npi-or-db-p-13.npi.corp
Snap Id Snap Time Sessions Curs/Sess Comment
Begin Snap: 149808 30-Oct-09 14:00:00 31 #########
End Snap: 149809 30-Oct-09 15:00:02 34 11,831.4
Elapsed: 60.03 (mins)
Cache Sizes (end)
~~~~~~~~~~~~~~~~~
Buffer Cache: 8,192M Std Block Size: 8K
Shared Pool Size: 1,024M Log Buffer: 10,240K
Load Profile
~~~~~~~~~~~~ Per Second Per Transaction
Redo size: 1,518.14 36,700.35
Logical reads: 1,333.43 32,235.02
Block changes: 5.09 123.01
Physical reads: 54.31 1,312.88
Physical writes: 3.91 94.44
User calls: 1.46 35.40
Parses: 2.24 54.21
Hard parses: 0.04 0.93
Sorts: 0.84 20.28
Logons: 0.06 1.45
Executes: 3.11 75.23
Transactions: 0.04
% Blocks changed per Read: 0.38 Recursive Call %: 94.31
Rollback per transaction %: 45.64 Rows per Sort: 215.97
Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 99.99 Redo NoWait %: 100.00
Buffer Hit %: 96.21 In-memory Sort %: 100.00
Library Hit %: 99.07 Soft Parse %: 98.29
Execute to Parse %: 27.94 Latch Hit %: 99.98
Parse CPU to Parse Elapsd %: 69.88 % Non-Parse CPU: 97.92
Wait Events for DB: PROD Instance: PROD3 Snaps: 149808 -149809
-> s - second
-> cs - centisecond - 100th of a second
-> ms - millisecond - 1000th of a second
-> us - microsecond - 1000000th of a second
-> ordered by wait time desc, waits desc (idle events last)
Event | Waits | Timeouts | Total Wait Time (s) | Avg wait (ms) | Waits/txn
enqueue 19,510 7,472 15,509 795 130.9
PX Deq: Parse Reply 1,152 1,071 2,577 2237 7.7
row cache lock 2,202 518 1,579 717 14.8
db file scattered read 31,556 0 354 11 211.8
db file sequential read 17,272 0 67 4 115.9
db file parallel read 1,722 0 34 20 11.6
global cache cr request 53,754 91 32 1 360.8
wait for scn from all nodes 1,897 13 10 5 12.7
CGS wait for IPC msg 403,358 401,478 10 0 2,707.1
DFS lock handle 4,753 0 8 2 31.9
direct path read 1,248 0 6 5 8.4
PX Deq: Execute Reply 110 38 6 51 0.7
global cache open s 160 10 5 31 1.1
control file sequential read 6,442 0 3 0 43.2
name-service call wait 26 0 2 78 0.2
latch free 129 109 2 13 0.9
KJC: Wait for msg sends to c 153 24 1 9 1.0
control file parallel write 1,245 0 1 1 8.4
buffer busy waits 199 0 1 6 1.3
process startup 20 0 1 44 0.1
global cache null to x 74 2 1 9 0.5
global cache null to s 19 0 1 29 0.1
global cache open x 268 1 1 2 1.8
library cache lock 1,150 0 0 0 7.7
PX Deq: Join ACK 129 48 0 3 0.9
log file parallel write 1,157 0 0 0 7.8
async disk IO 219 0 0 1 1.5
direct path write 1,024 0 0 0 6.9
ksxr poll remote instances 6,740 4,595 0 0 45.2
PX Deq: reap credit 6,580 6,511 0 0 44.2
buffer busy global CR 73 0 0 2 0.5
log file sequential read 11 0 0 10 0.1
log file sync 100 0 0 1 0.7
global cache s to x 282 2 0 0 1.9
db file parallel write 95 0 0 1 0.6
library cache pin 142 0 0 0 1.0
SQL*Net break/reset to clien 28 0 0 1 0.2
IPC send completion sync 81 81 0 0 0.5
PX Deq: Signal ACK 32 14 0 1 0.2
PX Deq Credit: send blkd 3 1 0 7 0.0
SQL*Net more data to client 841 0 0 0 5.6
PX Deq: Msg Fragment 37 17 0 0 0.2
log file single write 4 0 0 1 0.0
db file single write 1 0 0 1 0.0
SQL*Net message from client 4,213 0 13,673 3246 28.3
gcs remote message 214,784 75,745 7,016 33 1,441.5
wakeup time manager 233 233 6,812 29237 1.6
PX Idle Wait 2,338 2,294 5,686 2432 15.7
PX Deq: Execution Msg 2,151 1,979 4,796 2229 14.4
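As a quick sanity check on the wait events listed above, the average wait column can be reproduced from two of the others: total wait time in seconds, times 1000, divided by the number of waits. The sketch below is only illustrative arithmetic (the class and method names are made up for this example and are not part of Statspack). Since the report rounds the total wait time to whole seconds, small deviations are possible for events with very short waits.

```java
// Illustrative helper, not part of Statspack: recomputes the report's
// average wait from the "Waits" and "Total Wait Time (s)" columns.
final class AvgWaitCheck {

    /** Average wait in ms = total wait time (s) * 1000 / number of waits, rounded. */
    static long avgWaitMs(long waits, double totalWaitSeconds) {
        return Math.round(totalWaitSeconds * 1000.0 / waits);
    }

    public static void main(String[] args) {
        // Figures copied from the PROD3 wait events listed above.
        System.out.println("enqueue:             " + avgWaitMs(19_510, 15_509) + " ms");
        System.out.println("PX Deq: Parse Reply: " + avgWaitMs(1_152, 2_577) + " ms");
        System.out.println("row cache lock:      " + avgWaitMs(2_202, 1_579) + " ms");
    }
}
```

The three results (795, 2237 and 717 ms) agree with the values printed in the report, which confirms how the flattened columns should be read.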
Library Cache Activity for DB: PROD Instance: PROD3 Snaps: 149808 -149809
->"Pct Misses" should be very low
Namespace       Get Requests Pct Miss Pin Requests Pct Miss Reloads Invalidations
BODY                   1,290      0.0        1,290      0.0       0             0
CLUSTER                   18      0.0            8      0.0       0             0
SQL AREA               4,893      2.0       36,371      0.5       2             0
TABLE/PROCEDURE        1,555      3.9        3,834      4.9      71             0
TRIGGER                  286      0.0          286      0.0       0             0
Namespace       GES Lock Requests GES Pin Requests GES Pin Releases GES Inval Requests GES Invalidations
BODY                            1                0                0                  0                 0
CLUSTER                         4                0                0                  0                 0
SQL AREA                        0                0                0                  0                 0
TABLE/PROCEDURE               863              224               42                 42                 0
TRIGGER                         0                0                0                  0                 0
------------------------------------------------------------- -
Oracle RAC scalability (does it scale linearly up to 20-30 nodes?)
We are evaluating Oracle RAC as a data storage solution for the following performance requirement.
Our application generates resources that we want to store in a database and make searchable.
Each resource has two parts: data and meta.
Data contains textual or binary content such as txt/html/doc/excel/pdf/image files.
Meta contains 30-40 different properties describing the data.
Average resource size is 10K.
Required insertion speed for such resources (data + meta): 2 Gbps (30K resources/second).
We also want indexing on both data and meta.
We used a single Oracle database and created a resource table with 40 columns for the meta properties and one BLOB column for the data.
The performance achieved is 100 Mbps insertion speed (on a normal machine).
To get to 2 Gbps we are thinking of using Oracle RAC to scale up the insertion speed. Maybe 20 nodes will be required.
My question is: does Oracle RAC provide close to linear scalability up to 20-30 nodes or not?
The key requirement is to achieve an insertion speed of up to 2 Gbps.
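The sizing in this question can be checked with a little arithmetic. The sketch below is a back-of-the-envelope estimate only, and it rests on assumptions that are not in the post: 1 Gbps is taken as 1,000,000,000 bits per second, 10K as 10,240 bytes, and throughput is assumed to scale perfectly linearly with node count (which is exactly what the question is asking about). The class and method names are hypothetical.

```java
// Back-of-the-envelope sizing for the insert rate quoted in the post.
// Assumptions (not from the post): decimal Gbps, 10K = 10,240 bytes,
// and ideal linear scaling across nodes.
final class InsertRateEstimate {

    /** Resources per second that fit into the given bandwidth. */
    static double resourcesPerSecond(double bitsPerSecond, int resourceBytes) {
        return bitsPerSecond / 8.0 / resourceBytes;
    }

    /** Nodes needed if every node sustains perNodeBps and scaling is perfectly linear. */
    static int nodesNeeded(double targetBps, double perNodeBps) {
        return (int) Math.ceil(targetBps / perNodeBps);
    }

    public static void main(String[] args) {
        double target = 2e9;     // 2 Gbps target from the post
        double perNode = 100e6;  // 100 Mbps measured on a single database
        System.out.printf("resources/s at 2 Gbps: %.0f%n",
                resourcesPerSecond(target, 10 * 1024));
        System.out.println("nodes (ideal linear scaling): " + nodesNeeded(target, perNode));
    }
}
```

Under these assumptions 2 Gbps carries roughly 24,400 resources of 10K each per second, somewhat below the 30K resources/second quoted, and 20 nodes is the lower bound that only holds if nothing is lost to inter-node coordination.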
The high availability of Oracle RAC can be an added advantage for us, but the key concern here is scalability, not fault tolerance.
> Now we are not using Oracle partitioning because it is slow when we define a domain index (even when the index is local, is not synced in real time and index maintenance is off), but it maintains a pending queue, which slows down the insertion process.
Hmm.. I'm using partitioning extensively for mass parallel inserts and it is a lot cleaner than individual tables (requiring dynamic SQL), and I have not seen any performance issues.
Can you elaborate on what issues you have seen?
> Our main and key concern is to achieve an insertion speed of 2Gbps initially, and it should be scalable up to 10Gbps.
May I ask what data you are collecting? This volume sounds a bit extreme - is it something like collecting/sniffing UDP/TCP packets?
> 1. If we have, say, I/O and network bandwidth available, will Oracle RAC be capable of consuming this available I/O and network bandwidth by adding more nodes?
Yes. Remember that each cluster node has its own set of local platform resources - including a pipe to the shared storage (such as an I/O fibre channel).
What will cause an impact? Anything that will impact a single insert process will also impact that process across a cluster. E.g. two processes attempting to insert a row with the same PK - only one can succeed and thus one will be blocked by the other. Or bitmap indexing, where a lock on a bitmap "slot" locks the index data for a number of rows - any of which may currently be in the middle of an update by another process. Etc.
How is this resolved in a cluster? As the processes are not local to the same platform, IPC cannot be used. Thus it means the Interconnect has to be used. This will be slower than IPC.
> 2. We also heard from HP that the Interconnect between the nodes will be a bottleneck for us, but my question is: if we look at the above scenario, where there is only insertion and searching and no updating in the system, will the RAC Interconnect become a bottleneck just for heartbeat messaging?
The Interconnect need not be a bottleneck. Besides, the Interconnect is a fundamental cog in the share-everything cluster machine. It is not a "Bad Thing". So I question what HP is saying - for me to accept such advice, it needs to be backed up with hard technical facts.
If Interconnect is such an issue according to HP, just what do they recommend you use to scale your system with? Let me guess - some very expensive and very complex HP product? A superdome perhaps?
> 3. Indexing time is much higher even when we create the index for one hour of data in one shot; can we dedicate a few nodes in RAC just for indexing?
That is what I'm doing - running up to 100+ PQ processes to do the index builds and rebuilds.
> 4. When I use a Create table as select command to transfer the same amount of data from one table to another table, it takes only 30 seconds, and when I use direct path uploading it takes 3 minutes,
Make sure that your performance comparisons are valid. What do you mean by "direct path uploading"?
Remember that CTAS is disk-to-disk I/O via the SGA buffer cache. There is no "client side" involved. Nothing external. Not even a PGA buffer area. No pushing data from a client process via IPC to an Oracle server process.
If your benchmark includes pushing data from a client, even from a PL/SQL process, that will be slower than a CTAS - always.
When dealing with such large volumes, the "traditional RDBMS" approach needs to be carefully considered. Every single constraint, every single index, every single trigger results in a tiny overhead that becomes a very large overhead given the data volumes.
Data management also plays a crucial role. Unless you can manage the data, you cannot effectively insert such huge volumes, process those volumes and query those volumes.
I see RAC and partitioning and PL/SQL server side processing as crucial ingredients to make this work. -
Hi All
I am installing Oracle RAC 10g 10.2.0.1 on HP-UX B.11.31 U ia64 but cannot complete the installation.
My hosts file:
#Public IPs
10.144.1.111 spgdb01
10.144.1.112 spgdb02
#Private IPs
10.144.2.2 spgdb01p
10.144.2.3 spgdb02p
#Virtual IPs
10.144.1.113 spgdb01v
10.144.1.114 spgdb02v
The installation with runInstaller runs without error; the copy and link phases are OK. When I run root.sh, it cannot complete, as shown below:
Checking to see if Oracle CRS stack is already configured
Checking to see if any 9i GSD is up
Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
WARNING: directory '/oracle/product/10.2.0' is not owned by root
WARNING: directory '/oracle/product' is not owned by root
WARNING: directory '/oracle' is not owned by root
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node <nodenumber>: <nodename> <private interconnect name> <hostname>
node 0: spgdb01 spgdb01p spgdb01
node 1: spgdb02 spgdb02p spgdb02
Creating OCR keys for user 'root', privgrp 'sys'..
Operation successful.
Now formatting voting device: /ora/crs/votedisk01
waitpid(-1, 0x7fffdf50, WUNTRACED) .................................................................................................... [sleeping]
Now formatting voting device: /oracle/oradata1/crs/votedisk02
Now formatting voting device: /oracle/oradata2/crs/votedisk03
Format of 3 voting devices complete.
Startup will be queued to init within 30 seconds.
====================
I have waited for 10 minutes but it has still not completed.
Additionally, in the log from runInstaller I got:
Preparing to launch Oracle Universal Installer from /tmp/OraInstall2011-04-28_12-13-31AM. Please wait ...
-bash-4.2$ Oracle Universal Installer, Version 10.2.0.1.0 Production
Copyright (C) 1999, 2005, Oracle. All rights reserved.
Private Interconnect : null
Private Interconnect : null
Private Interconnect : null
Private Interconnect : null
So, please help me fix this issue.
Thank you
I had this problem and resolved it by transferring the file to the installation server with the correct FTP data type (binary).
On page 54 of the install guide (..Server\Oracle_Business_Intelligence\doc\doc\bi.1013\b31765.pdf) that comes with the installation files, there is an instruction to make sure that any FTP activity is done in binary mode.
This may not have occurred with the license.xml file if you use a tool which offers the "feature" of automatic data type recognition.
Hope this helps. -
Oracle rac templates 11g R2 buildcluster.sh error
Hi All,
I am facing the error below while creating Oracle RAC templates. Kindly let us know how to resolve it.
===error=========================
Oracle RAC 11gR2 OneCommand (v1.2) for Oracle VM - (c) 2010-2011 Oracle Corporation
Cksum: [1170221909 255000 racovm.sh] at Sun Jan 5 04:15:14 EST 2014
Kernel: 2.6.18-194.0.0.0.3.el5xen (i686) [1 processor(s)] 1700 MB
2014-01-05 04:15:14:[printparams:Time :racnode1] Completed successfully in 4 seconds (0h:00m:04s)
2014-01-05 04:15:14:[setsshora:Start:racnode1] SSH Setup for the Oracle user(s)...
INFO (node:racnode1): Running as oracle: /u01/racovm/ssh/setssh-Linux.sh -s -x -c NO -h nodelist -p *** (setup on 2 node(s): racnode1 racnode2)
ERROR: Failed to create temporary file /tmp/setssh-cretmpQY3958 on localhost, can not proceed
Exiting...
ERROR (node:racnode1): Failed to configure passwordless SSH for the oracle user
2014-01-05 04:15:17:[setsshora:Time :racnode1] Completed with errors in 3 seconds (0h:00m:03s), status: 1
2014-01-05 04:15:17:[buildcluster:Time :racnode1] Completed with errors in 58 seconds (0h:00m:58s), status: 1
thanks,
Mike.
Try this. It worked for me.
Please keep in mind that you need to wait until each step finishes successfully before moving on to the next one.
For Steps 1 and 2, you can skip any node(s) on which you haven't executed root.sh yet.
Step 1: As root, run "$GRID_HOME/crs/install/rootcrs.pl -verbose -deconfig -force" on all nodes, except the last one.
Step 2: As root, run "$GRID_HOME/crs/install/rootcrs.pl -verbose -deconfig -force -lastnode" on the last node. This command will also zero out the OCR and VD disks.
Step 3: As root, run $GRID_HOME/root.sh on all nodes, one by one.
-
ASM instances on 2 node Oracle RAC 10g r2 on Red Hat 4 u1
Hi all
I'm experiencing a problem configuring diskgroups under +ASM instances on a two-node Oracle RAC.
I followed the official guide and also official documents from the Metalink site, but I'm stuck on the visibility of the ASM disks.
I created fake disks on NFS (NetApp certified storage), binding them to block devices with the usual trick "losetup /dev/loopX /nfs/disk1",
ran "oracleasm createdisk DISKX /dev/loopX" on one node and
"oracleasm scandisks" on the other one.
With "oracleasm listdisks" I can see the disks at OS level on both nodes, but when I try to create and mount a diskgroup in the ASM instances, all is well on the instance on which I create the diskgroup, while the other one doesn't see the disks at all, and the diskgroup mount fails with:
ERROR: no PST quorum in group 1: required 2, found 0
Tue Sep 20 16:22:32 2005
NOTE: cache dismounting group 1/0x6F88595E (DG1)
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup DG1 was not mounted
Any help would be appreciated.
Thanks a lot.
Antonello
I'm having this same problem. Did you ever find a solution?
-
Error in ONS logs while implementing FCF on Oracle RAC from a Java program
I have a Java program on a client machine that reads its settings from a property file. While making the connection to the ONS port on the Oracle RAC server to implement FCF, the program throws the following error:
java.sql.SQLException: Io exception: The Network Adapter could not establish the connection
When I checked the ONS logs for that node, they showed the following:
Connection 5,199.xxx.xxxxxx,8200 header RCV failed (Connection reset by peer) coFlags=1002a
These log entries are generated only when the Java program tries to connect; otherwise the daemon starts without any errors.
But sometimes it connects and gives the desired output.
Please advise, and do let me know in case you need more information.
The Java program on the client machine is as follows:
// Oracle Support Services
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Enumeration;
import java.util.Properties;
import java.util.ResourceBundle;
import oracle.jdbc.pool.OracleConnectionCacheManager;
import oracle.jdbc.pool.OracleDataSource;

public class FCFConnectionCacheExample {
    private OracleDataSource ods = null;
    private OracleConnectionCacheManager occm = null;
    private Properties cacheProperties = null;

    public FCFConnectionCacheExample() throws SQLException {
        // create a cache manager
        occm = OracleConnectionCacheManager.getConnectionCacheManagerInstance();
        Properties props = loadProperties("fcfcache");
        cacheProperties = new Properties();
        cacheProperties.setProperty("InitialLimit", (String) props.get("InitialLimit"));
        cacheProperties.setProperty("MinLimit", (String) props.get("MinLimit"));
        cacheProperties.setProperty("MaxLimit", (String) props.get("MaxLimit"));
        // data source with connection caching and Fast Connection Failover enabled
        ods = new OracleDataSource();
        ods.setUser((String) props.get("username"));
        ods.setPassword((String) props.get("password"));
        ods.setConnectionCachingEnabled(true);
        ods.setFastConnectionFailoverEnabled(true);
        ods.setConnectionCacheName("MyCache");
        ods.setONSConfiguration((String) props.get("onsconfig"));
        ods.setURL((String) props.get("url"));
        occm.createCache("MyCache", ods, cacheProperties);
    }

    // copy every key of the named resource bundle (fcfcache.properties) into a Properties object
    private Properties loadProperties(String file) {
        Properties prop = new Properties();
        ResourceBundle bundle = ResourceBundle.getBundle(file);
        Enumeration<String> enumlist = bundle.getKeys();
        while (enumlist.hasMoreElements()) {
            String key = enumlist.nextElement();
            prop.put(key, bundle.getObject(key));
        }
        return prop;
    }

    public void run() throws Exception {
        Connection conn = null;
        Statement stmt = null;
        ResultSet rset = null;
        String sQuery =
            "select sys_context('userenv', 'instance_name'), " +
            "sys_context('userenv', 'server_host'), " +
            "sys_context('userenv', 'service_name') " +
            "from dual";
        try {
            conn = ods.getConnection();
            stmt = conn.createStatement();
            rset = stmt.executeQuery(sQuery);
            rset.next();
            System.out.println("-----------");
            System.out.println("Instance -> " + rset.getString(1));
            System.out.println("Host -> " + rset.getString(2));
            System.out.println("Service -> " + rset.getString(3));
            System.out.println("NumberOfAvailableConnections: " +
                occm.getNumberOfAvailableConnections("MyCache"));
            System.out.println("NumberOfActiveConnections: " +
                occm.getNumberOfActiveConnections("MyCache"));
            System.out.println("-----------");
        } catch (SQLException sqle) {
            // walk the chain of SQLExceptions and print the causes of each one
            while (sqle != null) {
                System.out.println("SQL State: " + sqle.getSQLState());
                System.out.println("Vendor Specific code: " +
                    sqle.getErrorCode());
                Throwable te = sqle.getCause();
                while (te != null) {
                    System.out.print("Throwable: " + te);
                    te = te.getCause();
                }
                sqle.printStackTrace();
                sqle = sqle.getNextException();
            }
        } finally {
            try {
                // null checks: getConnection() may fail before these are assigned
                if (rset != null) rset.close();
                if (stmt != null) stmt.close();
                if (conn != null) conn.close();
            } catch (SQLException sqle2) {
                System.out.println("Error during close");
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(">> PROGRAM using JDBC thin driver no oracle client required");
        System.out.println(">> ojdbc14.jar and ons.jar must be in the CLASSPATH");
        System.out.println(">> Press CNTRL C to exit running program\n");
        try {
            FCFConnectionCacheExample test = new FCFConnectionCacheExample();
            while (true) {
                test.run();
                Thread.sleep(10000);
            }
        } catch (InterruptedException e) {
            System.out.println("PROGRAM Ended by user");
        } catch (Exception ex) {
            System.out.println("Error Occurred in MAIN");
            ex.printStackTrace();
        }
    }
}
I have intentionally deleted some of the information, as it is confidential.
The property file is as follows:
# properties required for test
username=test
password=test
InitialLimit=10
MinLimit=10
MaxLimit=20
onsconfig=nodes=RAC-node1:port,RAC-node2:port
url=jdbc:oracle:thin:@(DESCRIPTION= \
(LOAD_BALANCE=yes) \
(ADDRESS=(PROTOCOL=TCP)(HOST=RAC-node1)(PORT=1521)) \
(ADDRESS=(PROTOCOL=TCP)(HOST=RAC-node1)(PORT=1521)) \
(CONNECT_DATA=(service_name=RAC_SERVICE)))
Hi;
Please check the notes below:
Link Errors While Installing CRS & RAC Database software [ID 438747.1]
Codeword File $TIMEBOMB_CWD,/opt/aCC/newconfig/aCC.cwd Missing Or Empty [ID 552893.1]
Regards
Helios
-
Recommendations - Oracle RAC 10g on Solaris 10 Containers Logical/Local..
Dear Oracle Experts et al,
I have a couple of questions about an Oracle 10g RAC implementation on Solaris and seek your advice. We are attempting to implement Oracle 10g RAC on Solaris OS and the SPARC platform.
1. We are wondering if Oracle 10g RAC could be implemented on Solaris Local/Logical Containers. I was assuming that Oracle always links itself with OS binaries and libraries during software installation and hence needs an OS image/root disk to go on. However, with containers, I assume we have a single Solaris installation and configuration, which is then shared with the containers configured in it. In such situations, how does the Oracle installation proceed? Do I need to look at a scenario where the global container/zone has the Oracle install and this image is shared across the zones/containers? If so, which OS filesystems will need to be shared with these zones/containers?
Additionally, even if this approach is supported, is it a recommended approach? I am unsure about the stability and functionality of Oracle in such cases and am not able to completely conceptualize it. However, I assume there could be certain items which need to be appropriately taken care of. It will help if you could share observations from your experience.
2. The idea of RAC we are looking at is to have multiple Oracle installations on top of a native clustering solution, say Veritas Cluster/Sun Cluster. Do we still need Oracle's cluster solution, Clusterware (ORACRS), on top of this to achieve Oracle clustering? Will I be able to install Oracle as a standalone installation on top of a native clustering solution such as Veritas Cluster/Sun Cluster?
Our requirement is to have the above-mentioned multiple Oracle installations spread across two (2) separate hardware platforms, say Node A and Node B, and to configure our cluster solution to behave as active-passive across Node A and Node B. In other words, I will configure a clustering solution like VRTS/SunCluster in active-passive mode, then have 3 Oracle installations on Node A and another 3 on Node B. I will configure one database for each of these Oracle software installations (with the idea of not having Clusterware between the clustering solution VRTS/SunCluster and the Oracle installation, if that works). I will thus run 3 databases on each of these nodes. If any downtime happens on one of the nodes, say Node A, I will fail all Oracle databases and software over to the other available node, Node B in this case, using the native clustering solution, and I want the databases to behave as they did earlier on Node A. I am not sure, though, whether I will be able to bring the databases up on Node B when the OS-level resources are failed over.
We want to use Oracle 10g RAC Release 2 EE on the latest, or one before the latest, release of Solaris 10.
Please share your thoughts.
Regards!
Sarat
Sarat Chandra C wrote:
> Dear Oracle Experts et al,
> I have a couple of questions about an Oracle 10g RAC implementation on Solaris and seek your advice. We are attempting to implement Oracle 10g RAC on Solaris OS and the SPARC platform.
> 1. We are wondering if Oracle 10g RAC could be implemented on Solaris Local/Logical Containers?
My understanding is that RAC in a Zone (Container) is not supported by Oracle, and will not work anyway. Regardless of installation, RAC needs to do cluster level stuff about the cluster configuration, changing network addresses dynamically, and sending guaranteed messages over the cluster interconnect. None of this stuff can be done in a Local Zone in Solaris, because Local Zones have fewer permissions than the Global Zone. This is part of the design of Solaris Zones, and nothing to do with how Oracle RAC itself works on them.
This is all down to the security model of Zones, and Local Zones lack the ability to do certain things, to stop them reconfiguring themselves and impacting other Zones. Hence RAC cannot do dynamic cluster reconfiguration in a Local Zone, such as changing virtual network addresses when a node fails.
My understanding is that RAC just cannot work in a Local Zone. This was certainly true 5 years ago (mid 2005), and was a result of the inherent design and implementation of Zones in Solaris. Things may have changed, so check the Solaris documentation, and check if Oracle RAC is supported in Local Zones. However, as I said, this limitation was inherent in the design of Zones, so I do not see how Sun could possibly have changed it so that RAC would work in a Local Zone.
To me, your only option is the Global Zone. Which pretty much destroys the argument for having Zones on a Solaris system, unless you can host other non-Oracle application on the other Zones.
> 2. The idea of RAC we are looking at is to have multiple Oracle installations on top of a native clustering solution, say Veritas Cluster/Sun Cluster. Do we still need Oracle's cluster solution, Clusterware (ORACRS), on top of this to achieve Oracle clustering? Will I be able to install Oracle as a standalone installation on top of a native clustering solution such as Veritas Cluster/Sun Cluster?
I am not sure the term 'native' is correct. All 'Cluster' software is low level, and has components that run within the operating system. Whether this is Sun Cluster, Veritas Cluster Server, or Oracle Clusterware. They are all as 'native' to Solaris as each other. They all perform the same function for Oracle RAC around Cluster management - which nodes are members of the cluster, heartbeats between nodes, reliable fast message delivery, etc.
You only need one piece of Cluster software. So pick one and use it. If you use the Sun or Veritas cluster products, then you do not need the Oracle Clusterware software. But I would use it, because it is free (included with RAC), is from Oracle themselves and so guaranteed to work, is fully supported, and is one less third party product to deal with. Having an all Oracle software stack makes things simpler and more reliable, as far as I am concerned. You can be sure that Oracle will have fully tested RAC on their own Clusterware, and be able to replicate any issues in their own support environments.
Officially the Sun and Veritas products will work and are supported. But when you get a problem with your Cluster environment, who are you going to call? You really want to avoid "finger pointing" when you have a problem, with each vendor blaming the cause of the problem on another vendor. Using an all Oracle stack is simpler, and ensures Oracle will "own" all your support problems.
Also future upgrades between versions will be simpler, as Oracle will release all their software together, and have tested it together. When using third party Cluster software, you have to wait for all vendors to release new versions of their own software, and then wait again while it is tested against all the different third party software that runs on it. I have heard of customers stuck on old versions of certain cluster products, who cannot upgrade because there are no compatible combinations in the support matrices between the cluster product and Oracle database versions.
> I will configure Clustering Solution like VRTS/SunCluster in Active-Passive, then have 3 Oracle installations on Node A, another 3 on Node B.
As I said before, these 3 Oracle installations will actually all be in the same Global Zone, because RAC will not go into Local Zones.
John
-
Oracle RAC nodes reboot when the preferred controller fails
When we disconnect both fiber cables from preferred Controller A, or pull the Controller A card out of the disk array (IBM DS 4300), both servers reboot after 90 seconds.
During this time the complete RAC network is out of service for approximately 5 minutes. After the reboot both servers come back up with both instances, without any manual intervention.
It's a critical issue for us because we are losing high availability. Let us know how we can resolve it.
Details of the network:
1. Software - Oracle 10g Release 2
2. OS - Red Hat Linux 3 (kernel version 2.4.21-27.ELsmp)
3. Shared storage - IBM DS 4300
4. Multipathing driver - RDAC (rdac-LINUX-09.00 A5.13)
5. Nodes - IBM 346
6. Database on ASM
7. The preferred controller for ASM, OCR and the voting disk is A.
8. The hangcheck timer value is 210 seconds.
9. Both servers have 2 HBA ports. One HBA port is connected to Controller A and the second HBA port is connected to Controller B of the SAN disk array.
As per my understanding:
The voting disk resides on the disk array and Controller A is the preferred owner of the voting disk LUN. When I disconnect both fiber cables from preferred Controller A, the Clusterware software on both nodes tries to contact the voting disk. When the nodes are unable to contact the voting disk within a specific time period, they reboot.
I ran the controller failure test with the Oracle RAC software and also without Oracle. Without Oracle it works fine; the reason is that in that case the disk array waits approximately 300 seconds before changing the preferred controller from A to B.
But with Oracle, the Clusterware software reboots both nodes before the controller can shift from A to B.
So, to conclude: someone with a good understanding of Oracle Clusterware on Linux and of the IBM RDAC multipath driver can help me.
When we install Oracle RAC on Linux, it is required to configure the hangcheck timer.
Oracle recommends 180 seconds.
It means that if one of the nodes is hanging, the second node will wait for 180 seconds; if it is not able to resolve the situation within 180 seconds, it will reboot the hung node.
I think the hangcheck timer configuration is required only on Linux.
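The timings in this post can be lined up with a little arithmetic. As far as I know, hangcheck-timer is a Linux kernel module that resets its own node once a hang lasts longer than hangcheck_tick + hangcheck_margin seconds, so the corrected settings in this thread (30 and 180) give the 210 seconds listed above. The sketch below only compares the numbers quoted in the post. The class and method names are made up, and the conclusion drawn from it is a reading of those numbers, not a confirmed diagnosis.

```java
// Illustrative arithmetic only; all timings are the ones quoted in the post.
final class HangcheckWindow {

    /** Hang length (seconds) beyond which hangcheck-timer resets the node: tick + margin. */
    static int resetThresholdSeconds(int tickSeconds, int marginSeconds) {
        return tickSeconds + marginSeconds;
    }

    /** True if an event of the given duration finishes before the timeout expires. */
    static boolean completesBefore(int durationSeconds, int timeoutSeconds) {
        return durationSeconds < timeoutSeconds;
    }

    public static void main(String[] args) {
        int threshold = resetThresholdSeconds(30, 180); // hangcheck_tick=30, hangcheck_margin=180
        int controllerFailover = 300;                   // approx. time for the array to switch controllers
        int observedReboot = 90;                        // nodes rebooted about 90 s after the cable pull
        System.out.println("hangcheck threshold: " + threshold + " s");
        System.out.println("failover fits inside hangcheck window: "
                + completesBefore(controllerFailover, threshold));
        System.out.println("reboot happened before hangcheck could fire: "
                + completesBefore(observedReboot, threshold));
    }
}
```

Read this way, the 300-second controller switch is longer than the 210-second hangcheck window, and the reboot at about 90 seconds comes well before that window expires, which suggests a shorter timeout elsewhere in the stack (for example Clusterware's own voting disk check) is what triggers it. That last point is an assumption to verify, not something the post establishes.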
Configuration File
cat >> /etc/rc.d/rc.local << EOF
modprobe hangcheck-timer hangcheck_tick=15 hangcheck_margin=60
EOF
Sorry, the hangcheck timer setting is:
Configuration File
cat >> /etc/rc.d/rc.local << EOF
modprobe hangcheck-timer hangcheck_tick=30 hangcheck_margin=180
EOF
-
RCA for Oracle RAC Performance Issue
Hi DBAs,
I have set up a 2-node Oracle RAC 10.2.0.3 on Linux 4.5 (64 bit) with 16 GB memory and 4 dual-core CPUs each. The database is serving a web application, but unfortunately the system is on its knees. The performance is terrible. The storage is an EMC SAN, but ASM is not implemented for fear of degrading performance further or complicating the system.
I am seeking expert advice from the gurus on this forum to formulate an action plan for a root cause analysis of the system and database. Please advise what tools I can use to gather information about the root cause. The AWR report is not very helpful. The system statistics from top, vmstat and iostat only show high resource usage, but it is difficult to find the reason. OEM has been configured and very frequently reports all kinds of high wait events.
How can I effectively find network bottlenecks (with the netstat command, which I need help to understand)?
How can I read the system I/O statistics (iostat) so that they give me useful information? I don't understand what the baseline or optimal values should be for comparing the I/O activity.
I am seeking help and advice to diagnose the issue. I also want to present this issue as a case study.
Thanks
-Samar-
First of all, RAC is mainly suited for OLTP applications.
Secondly, if your application is unscalable (it doesn't use bind variables and no SQL statements have been tuned and/or it has been ported from Sukkelserver 200<whatever>) running it against RAC will make things worse.
Thirdly: RAC uses a chatty Interconnect. If you didn't configure the Interconnect properly, and/or are using slow network cards (1 Gb is mandatory), and/or you are not using a 9k MTU on your 1 Gb NIC, this again will make things worse.
You can't install RAC 'out of the box'. It won't perform! PERIOD.
Fourthly: you might suffer from your 'application' connecting and disconnecting for every individual SQL statement and/or commit every individual INSERT or UPDATE.
You need to address this.
Using ADDM and/or AWR is compulsory for analysing the problem, and so is having read Cary Millsap's book on optimizing Oracle performance.
You won't get anywhere without AWR, and OS statistics alone will not provide any clue.
Because, paraphrasing William Jefferson Clinton, former president of the US of A:
It's the application, stupid.
99 out of 100 cases. Trust me. All developers I know currently are 100 percent clueless.
That said, if you can't be bothered to post the top 5 AWR events, and you aren't up to using AWR reports, maybe you should hire a consultant who can.
Regards,
Sybrand Bakker
Senior Oracle DBA