CPU time from a multi processor
Hi
I need to get the CPU time from a multi-processor machine.
The top command will not do the job for me, and I need to use the command in automation, testing the CPU time for 2 hours or more. I'm thinking about redirecting the CPU % to a file, and at the end I will compute the average of the numbers in that file.
If I could also see the CPU time for the two processors separately that would be great, but more important is to collect the global CPU status.
I don't have a command line that I can use; I could use some help.
Thanks Shay
mpstat in fact works on my Opteron 270 dual-processor dual-core machine running Solaris 10. For instance, 'mpstat 3 5' displays 5 reports, each 3 seconds apart, showing the status of each CPU:
% mpstat 3 5
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
0 1 0 1 383 245 73 0 5 0 0 87 1 0 0 99
1 1 0 1 33 4 51 0 4 0 0 57 0 0 0 99
2 0 0 1 38 0 72 0 2 0 0 42 0 0 0 100
3 1 0 1 53 26 49 0 1 1 0 47 0 0 0 100
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
0 0 0 0 393 252 76 0 8 0 0 94 1 0 0 99
1 0 0 0 48 4 82 0 7 1 0 71 0 0 0 100
2 3 0 0 39 0 76 0 4 1 0 51 0 0 0 100
3 0 0 0 44 25 35 0 3 2 0 49 0 0 0 100
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
0 0 0 0 382 250 64 0 5 0 0 111 0 0 0 100
1 0 0 0 29 4 43 0 4 0 0 56 1 0 0 99
2 0 0 1 48 1 93 0 3 0 0 39 0 0 0 100
3 0 0 0 69 29 78 0 1 1 0 65 0 0 0 100
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
0 0 0 1 386 250 72 0 4 0 0 111 0 0 0 99
1 0 0 0 28 3 42 0 3 0 0 55 1 0 0 99
2 0 0 0 42 0 81 0 1 0 0 43 0 0 0 100
3 0 0 0 67 29 74 0 0 1 0 63 0 0 0 100
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
0 0 0 0 404 252 98 0 5 0 0 100 1 0 0 99
1 0 0 0 45 9 68 0 5 1 0 53 0 0 0 100
2 0 0 0 34 0 64 0 4 0 0 50 0 0 0 100
3 0 0 0 58 27 60 0 2 2 0 73 0 0 0 100
(The first report summarizes all activity since boot.)
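To answer the "collect and average" part of the original question: redirect mpstat output to a file for the whole measurement window, then average the idle column afterwards. A minimal sketch follows; the field number matches the mpstat header shown above, the real collection line is left as a comment, and a couple of faked samples stand in for the log so the averaging step can be demonstrated:

```shell
# Real collection step (one sample every 5 s for 2 hours = 1440 samples):
#   mpstat 5 1440 > cpu.log

# Faked mpstat-style samples, for illustration only:
cat > cpu.log <<'EOF'
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
0 1 0 1 383 245 73 0 5 0 0 87 1 0 0 99
1 1 0 1 33 4 51 0 4 0 0 57 0 0 0 99
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
0 0 0 0 393 252 76 0 8 0 0 94 1 0 0 99
1 0 0 0 48 4 82 0 7 1 0 71 0 0 0 100
EOF

# Global average utilisation = 100 - mean(idl); idl is field 16,
# and header lines (first field "CPU") are skipped.
awk '$1 ~ /^[0-9]+$/ { idle += $16; n++ }
     END { printf "average CPU busy: %.2f%%\n", 100 - idle / n }' cpu.log
```

The same idea works with sar -u output; only the field number changes.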
Similar Messages
-
V$osstat and V$SYS_TIME_MODEL - how to get CPU time from instance
Hi there !
I have a function osstat, which takes stats from the OS using v$osstat (credit for the procedure goes to a person whose name, I regret to say, I can't remember). But since we have 9 databases on the same server (and we don't have access to the server OS itself - outsourcing stinks), we often would like to know more about CPU, waits etc. And one of the procedures we use is the osstat.
I have tried to combine it with V$SYS_TIME_MODEL in order to see how much of the OS CPU time comes from the instance I am on at the moment, but I'm not able to figure out how to do it exactly.
This is my code:
DROP TYPE OSSTAT_RECORD;
CREATE OR REPLACE TYPE osstat_record IS OBJECT (
date_time_from TIMESTAMP,
date_time_to TIMESTAMP,
idle_time NUMBER,
user_time NUMBER,
sys_time NUMBER,
iowait_time NUMBER,
nice_time NUMBER,
instance_cpu_time NUMBER
);
DROP TYPE OSSTAT_TABLE;
CREATE OR REPLACE TYPE osstat_table AS TABLE OF osstat_record;
CREATE OR REPLACE FUNCTION osstat(p_interval IN NUMBER default 5, p_count IN NUMBER default 2, p_dec in number default 0)
RETURN osstat_table
PIPELINED
IS
l_t1 osstat_record;
l_t2 osstat_record;
l_out osstat_record;
l_num_cpus NUMBER;
l_total NUMBER;
l_instance NUMBER;
BEGIN
l_t1 := osstat_record(NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL);
l_t2 := osstat_record(NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL);
SELECT value
INTO l_num_cpus
FROM v$osstat
WHERE stat_name = 'NUM_CPUS';
FOR i IN 1..p_count+1
LOOP
SELECT systimestamp, sum(decode(stat_name,'IDLE_TIME', value, NULL)) as idle_time,
sum(decode(stat_name,'USER_TIME', value, NULL)) as user_time,
sum(decode(stat_name,'SYS_TIME', value, NULL)) as sys_time,
sum(decode(stat_name,'IOWAIT_TIME', value, NULL)) as iowait_time,
sum(decode(stat_name,'NICE_TIME', value, NULL)) as nice_time
INTO l_t2.date_time_to, l_t2.idle_time, l_t2.user_time, l_t2.sys_time, l_t2.iowait_time, l_t2.nice_time
FROM v$osstat
WHERE stat_name in ('IDLE_TIME','USER_TIME','SYS_TIME','IOWAIT_TIME','NICE_TIME');
select value/100000
into l_t2.instance_cpu_time
from V$SYS_TIME_MODEL
where stat_name = 'DB time';
l_out := osstat_record(l_t1.date_time_from, systimestamp,
(l_t2.idle_time-l_t1.idle_time)/l_num_cpus/p_interval,
(l_t2.user_time-l_t1.user_time)/l_num_cpus/p_interval,
(l_t2.sys_time-l_t1.sys_time)/l_num_cpus/p_interval,
(l_t2.iowait_time-l_t1.iowait_time)/l_num_cpus/p_interval,
(l_t2.nice_time-l_t1.nice_time)/l_num_cpus/p_interval,
((l_t2.instance_cpu_time-l_t1.instance_cpu_time)/100)); --- >> Should I divide by no of cpus here as well??? Or ???
l_total := l_out.idle_time+l_out.user_time+l_out.sys_time+l_out.iowait_time+nvl(l_out.nice_time,0);
if l_out.user_time > 0 then
l_instance := (l_out.instance_cpu_time*100)/l_total; -- >> instance in percent of the total CPU time
else
l_instance := 0;
end if;
if i > 1 then
PIPE ROW(osstat_record(l_t1.date_time_to, systimestamp,
trunc((l_out.idle_time/l_total*100),p_dec),
trunc((l_out.user_time/l_total*100),p_dec),
trunc((l_out.sys_time/l_total*100),p_dec),
trunc((l_out.iowait_time/l_total*100),p_dec),
trunc((l_out.nice_time/l_total*100),p_dec),
trunc(l_instance,p_dec)));
end if;
l_t1 := l_t2;
sys.dbms_lock.sleep(p_interval);
END LOOP;
RETURN;
END;
/
I get, i.e., a USER CPU time of 15% for a given interval of 5 mins - and a CPU time for the instance of 50 - and the others are 5% and 1%.
My brain has stopped working now .... I'm stuck
Mettemettemusens wrote:
Hi there !
Duplicate thread:
Re: v$osstat and V$SYS_TIME_MODEL question
Regards,
Randolf
Oracle related stuff blog:
http://oracle-randolf.blogspot.com/
SQLTools++ for Oracle (Open source Oracle GUI for Windows):
http://www.sqltools-plusplus.org:7676/
http://sourceforge.net/projects/sqlt-pp/ -
Apple mobile device service uses too much cpu time
Apple Mobile Device Service uses too much CPU time - from 88% to 99% of the CPU.
The best way to fix this is to completely remove all aspects of iTunes, then reinstall. Step-by-step directions are in this article. Good luck
http://support.apple.com/kb/HT1925?viewlocale=en_US -
How to estimate how much CPU time is parse related, DB-wide, on 9i
Hi,
I'm trying to estimate how much CPU time is parse related, and then compare that value after e.g. a cursor cache setting change. So far I've found:
parse time cpu
parse time elapsed
in v$sysstat .
Are there good indicators for my needs ?
Basically, when I get the CPU time from statspack and the parse time cpu (somehow), I can derive a percentage usage.
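The percentage itself is just the two counters divided. A sketch of the arithmetic is below; the values are made-up placeholders, and 'CPU used by this session' is the usual v$sysstat total-CPU counter assumed for the denominator:

```shell
# Hypothetical deltas between two snapshots, in centiseconds:
parse_cpu=1250   # 'parse time cpu'
total_cpu=9800   # 'CPU used by this session'

awk -v p=$parse_cpu -v t=$total_cpu \
    'BEGIN { printf "parse CPU: %.1f%% of total CPU\n", 100 * p / t }'
```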
btw how is 'CPU time' in statspack calculated ?
I'm running AIX DB 9.2.0.8 EE , can take historical data from statspack .
Regards.
Greg -
Why is FF sucking up so much CPU time & why can I not perform a search on FAQs?
Why is FF sucking up so much CPU time &
why can I not perform a search on FAQs?
FF is sucking up way too much CPU time - from 30 to 98% for long periods of time - on a Sony Vaio laptop running a P4 CPU w/XP Prof w/SP3 & 2GB RAM.
Why can I not perform a search on FAQs, so that I might not have to repeat a question that may have been asked previously?
Searching 24xx pages is not my idea of how to spend precious time to find the answer I am looking for.
Safe mode or turning off the plugins does not work with this bug. If they are still installed, Firefox will open plugin-container; the plugins need to be uninstalled or disabled at the OS level.
"''clearing Cookies will start an instance of plugin-container for almost all of the plugins installed on the system, regardless of enable/disable state for each one.''" -
Upgrading from single processor to multi processor
I am looking for information on upgrading a 7.3 Oracle from a
single processor to a multi-processor. I will be putting
another processor in my server and would like to know how to get
Oracle to use this second processor. I know that I need to
upgrade Windows NT to multiprocessor first, but what do I need to
do after that with Oracle?
I'm not certain why that would be the wrong boards. I guess HP has the wrong boards posted on their product's support site at the link below.
http://h20000.www2.hp.com/bizsupport/TechSupport/Home.jsp?lang=en&cc=us&prodTypeId=12454&prodSeriesI...
When I clicked on Support Forum, that is where it took me.
-
Uncorrelated GC STW pauses in ParNew + sys cpu time spikes from jvm calls
Hi, JVM experts.
GC is not an easy subject to grasp in all its subtleties, so I'm looking for some advice on where to dig further in my situation.
The overall picture:
1) Linux smp (2.6.5) x86_64 host with 8G memory, 2x2Ghz xeon (HT) CPUs
Java HotSpot(TM) 64-Bit Server VM (build 1.6.0_02-b05, mixed mode)
2) Heap: -Xms4500M -Xmx4500M -XX:MaxNewSize=128m -XX:NewSize=128m -XX:MaxPermSize=128m -XX:PermSize=128m
3) GC: -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:MaxTenuringThreshold=0 -XX:SurvivorRatio=128 -XX:CMSInitiatingOccupancyFraction=60 -XX:+UseCMSInitiatingOccupancyOnly -XX:+CMSParallelRemarkEnabled -XX:+DisableExplicitGC -XX:+PrintGCApplicationStoppedTime -XX:+PrintClassHistogram
4) Application is running under consistent traffic for two weeks.
5) I monitor the system with "sar"
6) Once or twice a week I observe the kernel-space cpu usage spike to ~90% for a duration <10 seconds.
7) I detect the cpu spike and dump a snapshot of the system situation. It's the JVM which consumes the syscpu time. JVM is sent SIGQUIT at the same instant.
There are two points, I'm rather vague about.
1) All cpu load situations have the following similar pattern in GC log file:
195098.019: [GC 195098.019: [ParNew
Desired survivor size 491520 bytes, new threshold 0 (max 0)
: 129152K->0K(130112K), 0.0754030 secs] 1549718K->1448253K(4607040K), 0.0755550 secs]
Total time for which application threads were stopped: 0.0758180 seconds
Application time: 0.5316910 seconds
Total time for which application threads were stopped: 0.0003550 seconds
Application time: 0.0695990 seconds
Total time for which application threads were stopped: 0.0001620 seconds
Application time: 0.1657730 seconds
Total time for which application threads were stopped: 0.0001780 seconds
Application time: 0.0000350 seconds
Total time for which application threads were stopped: 0.0000440 seconds
Application time: 0.2738210 seconds
Total time for which application threads were stopped: 0.0003530 seconds
Application time: 0.3108570 seconds
195099.448: [GC 195099.448: [ParNew
Desired survivor size 491520 bytes, new threshold 0 (max 0)
: 129151K->0K(130112K), 0.0712390 secs] 1577405K->1476947K(4607040K), 0.0713980 secs]
Total time for which application threads were stopped: 0.0716750 seconds
Application time: 0.0364560 seconds
Total time for which application threads were stopped: 8.2666520 seconds
Application time: 0.0000700 seconds
Total time for which application threads were stopped: 0.0055730 seconds
Application time: 0.0069140 seconds
Total time for which application threads were stopped: 0.0017350 seconds
Application time: 0.0011930 seconds
Total time for which application threads were stopped: 0.0064760 seconds
Application time: 0.0000720 seconds
Total time for which application threads were stopped: 0.0001120 seconds
Application time: 0.0001010 seconds
Total time for which application threads were stopped: 0.0000650 seconds
Application time: 0.0001570 seconds
195107.840: [Full GC 195107.840: [CMS: 1476947K->1092516K(4476928K), 7.6193100 secs] 1488921K->1092516K(4607040K), [CMS Perm : 46862K->46641K(131072K)], 7.6194800 secs]
num #instances #bytes class name
1: 1962568 306032400 [Ljava.util.HashMap$Entry;
2: 1962446 125596544 java.util.HashMap
The "delay" of the SIGQUIT/histogram dump can be ~1-2 seconds from the CPU-overload detection. I'm curious about the "suspiciously spurious" STW pause preceding the CPU spike. What could be the possible reasons?
2) Per-thread analysis of the JVM threads pointed me to one thread which sometimes monopolized ~60% of the CPU time in kernel space. This is the native POSIX thread which "calls back into" the JVM on certain events. Could it somehow cause the aforementioned STW?
My attempts to get a gdb stack dump at the CPU-load event failed (gdb can't resolve the stack in some of the JVM threads and enters an infinite loop).
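When hunting such outliers in a two-week log, a one-line filter over the -XX:+PrintGCApplicationStoppedTime output saves a lot of scrolling. A sketch follows, with the log lines faked from the excerpt above:

```shell
# Faked log lines in the -XX:+PrintGCApplicationStoppedTime format:
cat > gc.log <<'EOF'
Total time for which application threads were stopped: 0.0716750 seconds
Total time for which application threads were stopped: 8.2666520 seconds
Total time for which application threads were stopped: 0.0055730 seconds
EOF

# Print every safepoint pause longer than 1 second
# (the pause length is the next-to-last field on the line).
awk '/application threads were stopped/ && $(NF-1) + 0 > 1.0' gc.log
```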
Any advice is greatly appreciated
regards,
- andrey
Just to close my "ticket" with some findings along the way, hopefully useful for somebody else.
Since the problem manifested itself only on the Linux box (I have tried two different enterprise production 2.6 kernels), I couldn't use dtrace, so my choice was limited to SystemTap and /proc/profile.
The pattern when cpu usage was spiking was the following:
15 get_stack 0.1042
31 unhandled_signal 0.6458
34 stub_rt_sigreturn 0.2297
49 copy_siginfo_to_user 0.1021
70 find_vma 0.6250
94 retint_signal 0.7705
100 do_sigaltstack 0.2604
138 is_prefetch 0.3920
211 __up_read 1.1989
252 system_call 1.9535
292 __down_read 1.9730
552 save_i387 2.8750
1501 do_page_fault 0.9381
2266 do_signal 1.3488
2427 get_signal_to_deliver 1.9700
2452 force_sig_info 11.7885
3715 sys_rt_sigreturn 5.1597
14260 total 0.0057
It was always an extensive number of signals being generated. SystemTap lacks the dtrace Java probes (as well as any user-level probes, for that matter), so I could only tap the kernel syscalls:
force_sig_info= 10209 – 10230
0xc043681b : force_sig_info+0x1/0x86
0xc04230d3 : force_sig_info_fault+0x24/0x28
0xc0404670 : sys_rt_sigreturn+0x0/0xff
0xc061e3d8 : kprobe_exceptions_notify+0x164/0x386
0xc061f043 : notifier_call_chain+0x2a/0x47
0xc061f07e : atomic_notifier_call_chain+0x17/0x1a
0xc061f011 : do_page_fault+0x5e7/0x5ef
0xc0400000 : startup_32+0x0/0xb4
sys_rt_sigreturn= 10209 – 10235
0xc0404671 : sys_rt_sigreturn+0x1/0xff
0xc040518a : syscall_call+0x7/0xb
0xc0400000 : startup_32+0x0/0xb4
sys_rt_sigreturn() is tricky (it is designed to return to the kernel from the user-space signal handler), but the force_sig_info() is the result of the do_page_fault(). Swap was disabled, and I assumed the page faults were minor ones. Looking at the comments in the OpenJDK JVM sources I could see that there are cases of signal handling for non-mapped regions (like dynamic stack growth), so I suspected it was something very JVM-specific. I could not correlate it precisely to GC events.
The end of the story is that I didn't find out exactly what the real trigger of this event was, but JVM 1.6.0_05 does not produce such CPU spikes.
- a. -
Too much exclusive CPU time counted at swapcontext function
Hi,
I'm using the Sun Studio Express March 2009 build, specifically the Performance Analyzer, and I have observed some hard-to-explain CPU times measured at the swapcontext function of the libc library.
Here is my machine spec.
Two-way Intel E5320 processors with 16GB memory
SUSE Linux Enterprise Server 10 SP1 (x86_64)
My program consisted of 8 threads (pthread), and around a hundred of user contexts (coroutines or fibers) run on every single thread. I'm using makecontext/swapcontext for creating/scheduling user contexts on threads.
I'm using both Sun Studio Performance Analyzer and Intel VTune Performance Analyzer.
My problem is that the Performance Analyzer reported about 20% of total CPU time as the exclusive CPU time of the swapcontext function, while I couldn't find nearly as many HW event samples related to swapcontext using VTune.
To narrow down the problem scope, I made a simple test program, and I reproduced the problem. I attached the test-program generator, written as a bash script, at the end of this message.
I generated the test program with the following command:
% bash code_gen.sh 8 128 100000 1000
Then you get the test program, which consists of 8 threads with 128 user contexts on each thread, and 100000 context switches in each user context.
In my system, Sun Studio Performance Analyzer reported 328 seconds of exclusive CPU time at swapcontext out of 463 seconds total CPU time (CPU Time of <Total>). Briefly, swapcontext consumed about 70% of total CPU time.
However, according to VTune sampling, both libc-2.4.so and vmlinux-2.6.16.46-0.12-smp consume only 8% of total clockticks.
That's too large a gap between Sun Studio and VTune.
Have you seen this kind of problem? Do you know why this mismatch happens?
Or how can I estimate actual swapcontext cost?
Thank you for reading my post, and I'm looking forward to some hints about my problem.
Colin
---- code_gen.sh ----
#!/bin/bash

usage()
{
    echo "code_gen.sh <num_threads> <num_task> <num_loop> <func_body_size>"
}

test()
{
    local num_threads=$1
    local num_tasks=$2
    local func_loop=$3
    local func_body_size=$4
    local file_name="mytest_${num_threads}_${num_tasks}_${func_loop}_${func_body_size}"
    main_func_gen $num_tasks $func_loop $func_body_size > $file_name.c
    gcc -O2 $file_name.c -o $file_name -lpthread
}

sub_func_gen()
{
    local func_id=$1
    local func_loop=$2
    local func_body_size=$3
    local num_tasks=$4
    cat <<!
static void f$1(int threadId)
{
    volatile int c = 0;
    int i = 0;
    for(i = 0; i < $func_loop; ++i)
    {
!
    for i in `seq 1 $func_body_size`; do
        echo "        c+=1;";
    done
    cat <<!
        swapcontext(&ctx[threadId][$func_id], &ctx[threadId][($func_id+1)%$num_tasks]);
    }
}
!
}

main_func_gen()
{
    num_tasks=$1
    func_loop=$2
    func_body_size=$3
    cat <<!
#include <stdio.h>
#include <ucontext.h>
#include <stdlib.h>
#include <sys/time.h>
#include <time.h>
#include <pthread.h>
static ucontext_t **ctx;
static ucontext_t *mctx;
static pthread_t *pThreads;
!
    for i in `seq 0 $((num_tasks-1))`; do
        sub_func_gen $i $func_loop $func_body_size $num_tasks
    done
    cat <<!
int
threadMain (int* pThreadId)
{
    char** st = NULL;
    int i = 0;
    int* ret = NULL;
    int threadId = *pThreadId;
    printf("$num_tasks tasks on %d thread\n", threadId);
    st = (char**)malloc(sizeof(char*)*$num_tasks);
    ctx[threadId] = (ucontext_t*)malloc(sizeof(ucontext_t)*$num_tasks);
    ret = (int*)malloc(sizeof(int)*$num_tasks);
!
    for i in `seq 0 $((num_tasks-1))`; do
        cat <<!
    st[$i] = (char*)malloc(sizeof(char)*8192);
    getcontext(&ctx[threadId][$i]);
    ctx[threadId][$i].uc_stack.ss_sp = st[$i];
    ctx[threadId][$i].uc_stack.ss_size = 8192;
    ctx[threadId][$i].uc_link = &mctx[threadId];
    makecontext(&ctx[threadId][$i], f$i, 1, threadId);
!
    done
    cat <<!
    //printf("start\n");
    swapcontext(&mctx[threadId], &ctx[threadId][0]);
    return 0;
}

int
main(int argc, char* argv[])
{
    int num_threads = $num_threads;
    int rc;
    pthread_attr_t attr;
    void *status;
    struct timeval begin, end;
    int *threadId;
    int i;
    printf("%d threads\n", num_threads);
    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
    pThreads = (pthread_t*)malloc(sizeof(pthread_t)*num_threads);
    mctx = (ucontext_t*)malloc(sizeof(ucontext_t)*num_threads);
    ctx = (ucontext_t**)malloc(sizeof(ucontext_t*)*num_threads);
    threadId = (int*)malloc(sizeof(int)*num_threads);
    // begin time measurement
    gettimeofday(&begin, NULL);
    for(i = 0; i < num_threads; ++i)
    {
        threadId[i] = i;
        rc = pthread_create(&pThreads[i], &attr, threadMain, (void*)&threadId[i]);
        if(rc)
        {
            printf("ERROR; return code from pthread_create is %d\n", rc);
            exit(-1);
        }
    }
    pthread_attr_destroy(&attr);
    for(i = 0; i < num_threads; ++i)
    {
        rc = pthread_join(pThreads[i], &status);
        if(rc)
        {
            printf("ERROR; return code from pthread_join is %d\n", rc);
            exit(-1);
        }
    }
    // end time measurement
    gettimeofday(&end, NULL);
    printf("finished. Elapsed time=%dms\n", ((end.tv_sec - begin.tv_sec)*1000000+(end.tv_usec - begin.tv_usec))/1000);
    pthread_exit(NULL);
}
!
}

if [[ $# -ne 4 ]]; then
    usage
    exit 0
fi

test $1 $2 $3 $4

Hi Nik,
Oh! I didn't know that. I'm sorry for the confusion.
best regards,
Colin -
IPC on multi-processor environment
I have written a C program which uses the IPC resources "message queue" and "shared memory" on a single-processor Sun SPARC workstation (with Solaris 2.5.1).
I would like to know whether my program can function correctly if it runs on a multi-processor SUN platform (e.g. Sun HPC model 450 with more than 1 CPU installed).
Does anyone have any experience on this?
Best Regards,
Annie
Hi.
The IPC features will work fine in an MP environment. IPC is designed so that processes which are normally protected from each other can exchange data. You just have to make sure that access to your message queue, and especially to shared memory, is synchronised between the multiple processes by using semaphores (see semget(2)).
You could also look at using just one process and multithreading it.
The multithreaded programming guide covers the issues.
Have you experienced a particular problem?
Regards,
Ralph
SUN DTS -
Multi-processor Multi-Threaded deadlock
Hi all-
I've posted this over at jGuru.com and haven't come up with an effective solution but I wanted to try here before I reported this as a bug. This deals with kicking off many threads at once (such as might happen when requests are coming in over a socket).
I'm looking for a way to find out when all of my threads have finished processing. I have several different implementations that work on a single processor machine:
inst is an array of 1000+ threads. The type of inst is a class that extends Thread (and thus implements the Runnable interface).
for (i = 0; i < Count; i++)
    inst[i].start();
for (i = 0; i < Count; i++)
    inst[i].join();
However, this never terminates on a multiprocessor machine. I've also tried an isAlive loop instead of join, which waits until all threads have terminated before it goes on. Again, this works on the single processor but not on the multi-processor machine. Additionally, someone suggested a solution with a third "counter" class that is synchronized and decremented in each thread as processing finishes, after which a notifyAll is called. The main thread waits after it starts all the threads, wakes up to check if the count is zero, and if it's not it goes back to sleep. Again, this works on the single processor but not on the multi-processor.
The thread itself ends up doing a JNI call which in turn calls another DLL which I just finished making thread-safe (this has been tested via C programs). The procedures are mathematically intensive, but calculation time is around half a second on a P3 800MHz.
Any help on this is greatly appreciated. My next step will likely be tearing down the application to exclude the calculating part, just to test the JVM behavior. Is there a spec for the JVM behavior on a multi-processor machine? I haven't seen one, and it's hard to know what to expect from join() (joining across two processors seems like a weird concept), getCurrentThread() (since 2+ threads are running at the same time), etc.
My next step will likely be tearing down the application to exclude the calculating part just to test the JVM behavior.
Sounds like a really good idea.
Is there a spec for the JVM behavior on a multi-processor machine?
The behaviour of threads is defined in the specs.
You might want to check the bug database also. There are bug fixes for various versions of java. -
Multi Processor rendering & editing optimization
Hello,
Got some great info here from my last post! So thanx
So my question is... is it better to use Xeon or multi-processor board setups than single-CPU chips? My main aim is rendering and workflow turnaround, since in the next couple of months I might land a nice spot in the top five post companies in my sector... god willing.
Therefore I plan to build a completely new system, either focused on multi-CPU or on buying multiple CrossFire OpenCL rendering GPU cards. I can't find any documentation or recommendations for improved rendering speeds or realtime FX with a multi-CPU board or the now-added multi-stream or CUDA GPU (SLI or CrossFire) rendering setups.
If anybody could explain any myths or facts as to how Premiere Pro takes advantage of CPUs or GPUs, it would be a big help. I'm currently using a Quadro 2000 and an i7-990x with 24GB of memory.
thnx
Is your Quadro connected to a 10-bit monitor?
If not, you would be better served by a faster GTX card
Best Video Card http://forums.adobe.com/thread/1238382
Also, view the results of the CS5 Benchmark http://ppbm5.com/ to see what is fast -
Sockd process uses a lot of CPU time
Hi,
I'm running the Sun Java System Web Proxy Server version 4.02 on a SunFire V210 dual processor box running Solaris 10 with the default socks5.conf for testing.
Just browsing a few web pages in Firefox or IE using the socks proxy boosts CPU usage from the sockd process to a staggering 50% and stay there for several minutes.
Comparing with the old NEC reference Socks5 daemon the Sun version is really performing badly.
The system is pretty standard though I have tuned /etc/system and the tcp stack using recommendations in the proxy administration manual. All Solaris 10 patches are installed.
prstat output:
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
24519 proxy 6664K 4384K cpu1 0 10 0:11:18 50% sockd/43
Any ideas what's wrong, or do I have to stick with the old NEC reference daemon?
Regards
Kasper Løvschall
Using:
truss -dlfo truss.out ./sockd-wdog
Where things begin to happen:
5259/1: 11.2858 lwp_park(0xFFBFF290, 0) Err#62 ETIME
5259/1: 12.2958 lwp_park(0xFFBFF290, 0) Err#62 ETIME
5259/1: 13.3058 lwp_park(0xFFBFF290, 0) Err#62 ETIME
5259/1: 14.3158 lwp_park(0xFFBFF290, 0) Err#62 ETIME
5259/2: 15.1858 pollsys(0xFEAEFDC8, 1, 0xFEAEFD58, 0x00000000) = 0
5259/1: 15.3258 lwp_park(0xFFBFF290, 0) Err#62 ETIME
5259/2: pollsys(0xFEAEFDC8, 1, 0xFEAEFD58, 0x00000000) (sleeping...)
5259/1: lwp_park(0xFFBFF290, 0) (sleeping...)
5259/1: 16.3359 lwp_park(0xFFBFF290, 0) Err#62 ETIME
5259/1: lwp_park(0xFFBFF290, 0) (sleeping...)
5259/1: 17.3459 lwp_park(0xFFBFF290, 0) Err#62 ETIME
5259/2: 18.2382 pollsys(0xFEAEFDC8, 1, 0xFEAEFD58, 0x00000000) = 1
5259/2: 18.2385 accept(3, 0xFEAEFEC8, 0xFEAEFE64, SOV_DEFAULT) = 5
5259/2: 18.2386 lwp_unpark(43) = 0
5259/43: 18.2386 lwp_park(0x00000000, 0) = 0
5259/2: 18.2388 accept(3, 0xFEAEFEC8, 0xFEAEFE64, SOV_DEFAULT) Err#11 EAGAIN
5259/43: 18.2389 getsockname(5, 0xFE5AFEA8, 0xFE5AFDA4, SOV_DEFAULT) = 0
5259/43: 18.2391 getpeername(5, 0xFE5AFE38, 0xFE5AFDA4, SOV_DEFAULT) = 0
5259/43: 18.2391 read(5, 0x00063B7E, 1) Err#11 EAGAIN
5259/43: 18.2407 pollsys(0xFE5AFD08, 1, 0xFE5AFCA0, 0x00000000) = 1
5259/43: 18.2408 read(5, "04", 1) = 1
5259/43: 18.2409 read(5, "01\0 P BF9 ] c", 7) = 7
5259/43: 18.2411 read(5, " k l\0", 255) = 3
5259/43: 18.2412 write(4, " [ 0 1 / M a r / 2 0 0 6".., 88) = 88
5259/43: 18.2413 so_socket(PF_INET, SOCK_STREAM, IPPROTO_IP, "", SOV_DEFAULT) = 6
5259/43: 18.2414 fcntl(6, F_GETFL) = 2
5259/43: 18.2415 fcntl(6, F_SETFL, FWRITE|FNONBLOCK) = 0
5259/43: 18.2416 bind(6, 0xFE5AF980, 16, SOV_SOCKBSD) = 0
5259/43: 18.2418 connect(6, 0xFE5AF910, 16, SOV_DEFAULT) Err#150 EINPROGRESS
5259/43: 18.2970 pollsys(0xFE5AF798, 1, 0xFE5AF730, 0x00000000) = 1
5259/43: 18.2971 getsockopt(6, SOL_SOCKET, SO_ERROR, 0xFE5AF6D0, 0xFE5AF6D4, SOV_DEFAULT) = 0
5259/43: 18.2972 getsockname(6, 0xFE5AF980, 0xFE5AF8A4, SOV_DEFAULT) = 0
5259/43: 18.2973 write(5, "\0 Z918782E1 511", 8) = 8
5259/43: 18.2974 lwp_unpark(3) = 0
5259/3: 18.2974 lwp_park(0x00000000, 0) = 0
5259/3: 18.2977 brk(0x00064850) = 0
5259/3: 18.2977 brk(0x00078850) = 0
5259/3: 18.2981 pollsys(0xFEACF458, 50, 0xFEACF3E8, 0x00000000) = 0
5259/3: 18.2982 pollsys(0xFEACF458, 50, 0xFEACF3E8, 0x00000000) = 0
5259/3: 18.2983 pollsys(0xFEACF458, 50, 0xFEACF3E8, 0x00000000) = 0
Then loads of pollsys(0xFEACF458, 50, 0xFEACF3E8, 0x00000000) = 0 until I kill the daemon - it seems like they are killing the server?
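On Solaris, truss -c gives per-syscall counts directly; failing that, a rough tally over -d output like the above can be done by hand. A sketch follows (it only counts the timestamped call lines, and the sample lines are faked from the excerpt above):

```shell
# Faked truss -d style lines:
cat > truss.out <<'EOF'
5259/3: 18.2981 pollsys(0xFEACF458, 50, 0xFEACF3E8, 0x00000000) = 0
5259/3: 18.2982 pollsys(0xFEACF458, 50, 0xFEACF3E8, 0x00000000) = 0
5259/43: 18.2408 read(5, "04", 1) = 1
EOF

# Tally calls per syscall name (3rd field, up to the opening parenthesis),
# most frequent first.
awk '{ split($3, a, "("); n[a[1]]++ }
     END { for (s in n) print n[s], s }' truss.out | sort -rn
```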
Thanks,
Kasper -
I must say that the only feature that would ever make me upgrade again from CS4 (or even upgrade my MacPro) would be full multi-processor support.
Unfortunately, Adobe has been introducing new features that turn out to be imperfect and NEVER get updated or fixed any more.
Hoping for CS5 to come through!
You're right. What I meant was FULL multi-processor support.
Like saving and opening files, especially when saving large, multi-layered 16 bit files. Just tested it again - a file that is 2.91GB when opened in photoshop (which any file I work on typically is) takes 42 seconds to save as an 8 bit file and (after deleting layers to make it the same 2.91GB when open in photoshop in 16 bit format) takes 5min and 20 seconds to save in 16 bit while the activity monitor shows only one processor being utilized.
I would like to work more in 16 bit but it's not practical now as it takes too long for just the saving part.
At least it should be non-modal so that you could work on another image while others are being saved and/or opened.
There are also a few filters that do not use multiple processors - like the lens blur filter, which works amazingly well but makes me take a coffee break every time I use (especially when used with a large blur amount). -
Just a quick question for ye, no detail really required at this time.
Is it possible to run a multi-threaded Java application in a multi-processor environment, where the processors share the work between them?
I suppose what I am basically asking is: does the JVM have the capability to run multiple threads across multiple processors?
If the answer to this question is yes, how difficult is it to set up, and are there any special requirements or extensions required?
Cheers
In most current JVMs, the facility of the OS to spread execution across the system's CPUs is provided to the Java programmer quite transparently. Avoid 'green threads' and you are good as gold.
On Solaris 7 and earlier, it gets tricky to convince the two-level kernel threading mechanism to really use all CPU's (or even more than one). The Sun JVM's for Solaris provide 'unsupported' but often suggested command line options to use 'bound threads' to solve this problem and Java 1.3.0 and 1.3.1 do a much better job. I've run WLS on 10-way 4500 systems and was able to consume about 70% of the 10 CPU's with productive work - principal limitation was our ability to apply load to the system.
In Solaris 8, there is an alternate thread library that a process can choose to use (just put it in the front of the LD_LIBRARY_PATH). This causes that process to use a one-level (or flat) Solaris thread model which can improve performance (and will definitely increase processor utilization) on larger SMP systems.
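For reference, selecting that alternate one-level thread library on Solaris 8 looks something like the fragment below (the /usr/lib/lwp path is what Sun's Solaris 8 documentation describes; verify it on your release before relying on it):

```shell
# Put the alternate (one-level) thread library first on the search path,
# then start the JVM as usual:
LD_LIBRARY_PATH=/usr/lib/lwp:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH
java -version
```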
Chuck -
Java multi-thread Applet and Multi-processor
Hello,
I have a Java applet which works fine on a mono-processor machine and not fine at all on multi-processor machines.
It uses multiple threads (for reading on a socket) and sometimes requests are lost.
Is that a matter of programming, or a JVM bug?
This happens on Linux and Windows NT bi-processors.
We are using Java 1.3.0_02
Thanks for your help
[email protected]
I have already had a look at javax.swing.SwingUtilities.invokeLater(), but (I think) it doesn't work for my need.
A graphic applet creates 2 threads: one for reading and one for writing on a socket.
Requests sent to the server are memorized in a vector, and once the answer is received, the request is removed from the vector and treated. Access to the list is protected by synchronize.
Everything works fine on several platforms (Linux, Windows 98/2000/NT, Solaris). The only problem is on multi-processor: requests are lost!