Blackfin Performance Profiling

Hello,
I'm acquiring 32-bit samples of which only 18 bits are used. As the acquisition rate is around 4.5 MHz and an acquisition lasts a few seconds, I must compress my data. Therefore I wrote a simple VI that takes an array of four 32-bit values (of which only 4*18 bits are useful) and converts it into an array of nine U8 values. The conversion takes 0.4 us on my computer (a quad-core Xeon) but more than 60 us on the Blackfin, which makes no sense. I'm using the BF548 evaluation kit running at 600 MHz, where 60 us corresponds to almost 36,000 clock cycles...
I'm timing multiple calls to the conversion VI with Tick Count (ms) to get my execution time.
Is there any known issue with the Tick Count (ms) function? Is the evaluation kit expected to run slower for any reason? I've attached the LabVIEW project to this message.
Regards,
Patrick Lessnick 
Attachments:
bitstuffing.zip ‏33 KB

Patrick,
From my discussions with R&D, this is not a bug. It's a unique challenge to streamline performance with LabVIEW for Blackfin, since there are many additional caveats we don't usually have to worry about in LabVIEW for Windows.
If performance is the chief concern, here are my recommendations:
Use a shift register.
Don't use a subVI. The code will be larger on the diagram, but it will be faster and smaller on the chip.
Don't use Build Array in the low-level part. Instead, initialize an array outside the loop and index into it, replacing elements in place.
Turn on optimization. For example, turning on Disable parallel execution should have a significant effect.
If all else fails and our generated code just isn't fast enough (which I doubt), there is still the Inline C Node that you can use for something like this (see the rough sketch below).
Given these recommendations, please let me know if you have any additional questions or concerns.
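For what it's worth, here is a rough C sketch of the kind of routine the Inline C Node could run: it packs four 18-bit samples (carried in the low bits of four 32-bit words) into nine bytes. The least-significant-bit-first layout is my assumption rather than something taken from the attached VI, so adjust the masking and bit order to match your format.

#include <stdint.h>

/* Pack four 18-bit samples (low bits of four 32-bit words) into nine bytes,
   least significant bits first. 4 * 18 = 72 bits = exactly 9 bytes. */
void pack4x18(const uint32_t in[4], uint8_t out[9])
{
    uint64_t acc = 0;   /* bit accumulator (never holds more than 24 bits here) */
    int bits = 0;       /* number of valid bits currently in acc */
    int o = 0;          /* next output byte */

    for (int i = 0; i < 4; i++) {
        acc |= (uint64_t)(in[i] & 0x3FFFF) << bits;   /* append 18 bits */
        bits += 18;
        while (bits >= 8) {                           /* flush whole bytes */
            out[o++] = (uint8_t)(acc & 0xFF);
            acc >>= 8;
            bits -= 8;
        }
    }
}

Note that it writes into a fixed, preallocated nine-byte output and never grows an array, which is exactly what the recommendations above are aiming for.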
-Mark
LabVIEW R&D

Similar Messages

  • Using Flex Performance Profiler for Profiling Flex with Java Applications

    Hi, I am planning to use the Flex Profiler to profile my application.
    I have developed a sample application using Flex MXML, some ActionScript classes for events, Cairngorm, Java, Oracle as the database, and BlazeDS.
    Can I use the Flex Performance Profiler to profile my application? I am asking because I read the line below on the Adobe site, and my application includes Java methods:
    "You can use the profiler to profile ActionScript 3.0 applications." Can anybody please tell me what this means, and whether I can use the Flex Performance Profiler? Please advise.

    Thanks, Karl, for the prompt response.
    I am calling a Java method from my ActionScript function, i.e. getting data from a Java method into the ActionScript function.
    So my question is: will the Flex Profiler be applicable in this case, since it internally calls Java methods?

  • Performance Profile broken in Flex 4?

    I'm trying to profile my application, but I can't get the Performance Profile view to show anything. When profiling began I did tell it to profile performance. Are there any other steps to make it work, or is it broken?

    Hi,
    If you have selected only performance profiling, you need to capture performance profile data by clicking the capture performance profile data button.
    If you reset the performance data before capturing the performance profile data, you may not see performance data for any of the methods.
    Thanks,
    Kishan

  • FB4 Performance profiling - what's up with [tincan]?

    Hello folks,
    I just profiled my application in FB4, and when viewing the recorded performance profile there is an entry named "[tincan]" which claims around 70 to 90% of all time.
    Reading blogs etc., I found that [tincan] is supposedly the time spent rendering video?! This doesn't make sense, of course, and my conclusion at this point is that profiling in FB4 is useless.
    Any help? Thanks!

    Hey folks,
    any thoughts on this one?

  • Free performance-profile tool ?

    Hi Folks,
    Might anyone know of a good, free performance profiling tool?
    I am looking for something like Candle, Introscope or Mercury 'Deep Diagnostics'.
    I've just seen a list on http://www.javaperformancetuning.com/resources.shtml
    Has anyone used these?
    thanks!
    JM

    "Marmelstein" <[email protected]> wrote in message news:40bd9e54$1@mktnews1...
    I should add that I'm more interested in application-level stuff than the JVM.
    The kind of thing I'd really like to see is, for example, that lots of time is being spent in a particular EJB 'find' method.
    I'd get a commercial one. Usually such tools pay off after the first use.
    Regards,
    Slava Imeshev

  • Performance Profiler with Sub-Panel linked VI's

    I've got a Main VI which contains a tab control, each page holding a subpanel linked to a subVI. When using the Performance and Memory Profiler, my Main.vi is the only one that shows up, so the timing information isn't very useful. If I get a snapshot while the profiler is running, it shows the linked subVIs as being in memory, but doesn't show any timing information on them (all 0.0's for all VIs, including Main).
    Am I doing something incorrectly?
    Attachments:
    Profiler.jpg ‏50 KB
    Snapshot.jpg ‏58 KB

    Looks like you are doing everything correctly.  The problem is somewhere between how the SubPanel deals with VIs in memory, and where the Profiler gets its data.  You should be able to see stats for your main VI once you stop it, but not the subVIs you are loading into the panel.  Sorry if this isn't what you were looking for, but at least we can say that it is not you doing anything wrong.  I would head on over to the Product Suggestion Center and submit this (these are taken seriously!).  It's certainly a valid use case that may not have been considered yet.
    Cheers, 
    Brian A.
    National Instruments
    Applications Engineer

  • Filter hud performance profiling results

    I was really curious why it took so long for the filter hud to open and I think I may have at least a partial answer. I used the awesome Sampler and fs_usage tools that come with the developer kit. I started profiling and pushed the filter button. Here's what I found:
    1) A bunch of database access happens in the main thread. When you push the filter hud button you lose control of the application (SBOD) until the query is complete. I'm guessing it's asking the database which keywords exist in the currently selected project so it can make filter buttons for them. Why does a query of 600 images take 10 seconds...
    2) Whenever you push the filter hud button thousands of little seeks and reads happen on the hard disk. I'm assuming this is the bottleneck.
    Is this interesting to anyone? It's wandering off the standard subject matter a bit. However, performance is abysmal and knowing is half the battle!
    Dual 1.8 G5   Mac OS X (10.4.3)   1GB RAM, Sony Artisan Monitor, Sony HC-1 HD Camera

    I've dissected the SQLite database that Aperture uses. My current best guess is that either Aperture is doing something wrong with SQLite or it's doing extra queries that I don't understand. I recreated the database query that gives you a list of keywords for all images in the working set. It only took between 1/4 and 1/2 second to execute. Opening the filter hud in Aperture with the same working set of images takes over 3 seconds.
    I've posted more detailed information on my site. It includes an overview of the database structure.
    http://www.mungosmash.com/archives/2005/12/theapertureda.php
    This is good news to me. It means the Aperture guys only have themselves to blame and should be able to speed it up considerably since the db is fast.
    Dual 1.8 G5   Mac OS X (10.4.3)   1GB RAM, Sony Artisan Monitor, Sony HC-1 HD Camera
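    (For anyone who wants to repeat that standalone measurement, here is a minimal C sketch that times a query against a SQLite file. The table and column names are hypothetical placeholders, not the real Aperture schema; see the linked write-up for the actual structure.)

    #include <stdio.h>
    #include <sys/time.h>
    #include <sqlite3.h>

    int main(int argc, char **argv)
    {
        sqlite3 *db;
        sqlite3_stmt *stmt;
        struct timeval t0, t1;
        /* Hypothetical query; substitute the real schema here. */
        const char *sql = "SELECT DISTINCT name FROM keywords";

        if (argc < 2 || sqlite3_open(argv[1], &db) != SQLITE_OK)
            return 1;
        if (sqlite3_prepare_v2(db, sql, -1, &stmt, NULL) != SQLITE_OK)
            return 1;

        gettimeofday(&t0, NULL);
        while (sqlite3_step(stmt) == SQLITE_ROW)
            ;                               /* just walk the result set */
        gettimeofday(&t1, NULL);

        printf("query took %ld us\n",
               (long)((t1.tv_sec - t0.tv_sec) * 1000000 + (t1.tv_usec - t0.tv_usec)));

        sqlite3_finalize(stmt);
        sqlite3_close(db);
        return 0;
    }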

  • "Call Library Function" absent from performance profiling

    Hello,
    I'm trying to optimize my VI execution time by using the "Profile Performance and Memory" window.
    The VI takes 25 s to run; however, the profiler reports something like 25 ms, if I understand correctly.
    I know the 25 sec includes all other processes on the CPU. However I highly suspect a lot of time is spent in 3rd party DLL functions I use, which read and write files, among other things.
    The problem is that the "Call Library Function" nodes do not appear in the profiler window at all! My questions:
    1) Why don't Call Library Function nodes appear?
    2) Is there some way to inspect the time spent in the DLL functions?
    Note: There is a related (unanswered) post from 2009 here: http://forums.ni.com/t5/LabVIEW/Profile-performance-of-a-VI-using-DLL/m-p/888833#M401525
    Thanks
    Itay.

    I recommend taking the simple approach - put a millisecond timer function before and after the call to the DLL, and subtract. I suspect that CLFNs do not appear in the profiler because LabVIEW hands off execution to the DLL and can't monitor the internals of what the DLL is doing. LabVIEW has no way to know how much memory the DLL allocates nor how much processor time the DLL uses.
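    For anyone who wants to see the same before-and-after idea in code, below is a generic C sketch (assuming Windows, with Sleep standing in for the DLL call you actually want to time); it is an illustration of the technique, not LabVIEW-generated code.

    #include <stdio.h>
    #include <windows.h>

    int main(void)
    {
        LARGE_INTEGER freq, t0, t1;
        QueryPerformanceFrequency(&freq);   /* counter ticks per second */

        QueryPerformanceCounter(&t0);
        Sleep(100);                         /* stand-in for the DLL call being timed */
        QueryPerformanceCounter(&t1);

        printf("call took %.3f ms\n",
               (double)(t1.QuadPart - t0.QuadPart) * 1000.0 / (double)freq.QuadPart);
        return 0;
    }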

  • Stranger performance profile

    I suppose this is pretty vague, but I will ask anyway... I have a couple of very large databases (6G or so apiece) on 64-bit linux using version 4.5 that I access and delete from using the java API in a single long transaction with multiple cursors in various loops-- it's sort of a graph search which is part of a bigger computation.
    I'm finding that operations are many orders of magnitude slower in the midst of the full run than what I expect, and to verify that it's not just because I have a huge database or an inefficient algorithm, I've taken out the most problematic part and run it on its own, where it runs hundreds or thousands of times faster. Cursor.getSearchBoth slows down from .08ms in the standalone case to 10ms; Cursor.delete, once it's already positioned, slows from .02ms to almost 5ms; and perhaps most bizarrely, Cursor.close slows from too small to measure to 6ms! I'm not even looking at commit time or anything right now, just the performance of these cursors; again, there's just one transaction here.
    I don't see any evidence that the machine is thrashing, there's tons of free memory in the jvm, the cache misses per operation are extremely similar between the real app and the test app, and almost the whole database would fit in the cache I've set (though changing the cache size has negligible effect on all this). The key/data pairs are all quite small; 8 byte keys and ~40 byte values; I'm using the default page size. I guess I'm wondering how I can start to pin this down. These numbers really seem crazy, especially Cursor.close... what could I possibly be doing to make that run so slow?? If it helps, here is how I set up the environment, transactions, and cursors in both programs:
    // Environment configuration
    EnvironmentConfig config = new EnvironmentConfig();
    config.setRunFatalRecovery(true);
    config.setErrorStream(System.err);
    config.setErrorPrefix("BerkeleyDB> ");
    config.setCacheSize(40 * 1024 * 1024 * 1024L);
    config.setLogRegionSize(512 * 1024);
    config.setLogBufferSize(4 * 512 * 1024);
    config.setMaxLocks(1000000);
    config.setMaxLockObjects(1000000);
    config.setMaxLockers(1000000);
    config.setLockDetectMode(LockDetectMode.YOUNGEST);
    File baseDirectory = new File(dir).getAbsoluteFile();
    setDirectories(baseDirectory, config);
    config.setAllowCreate(true);
    config.setInitializeLocking(true);
    config.setInitializeLogging(true);
    config.setInitializeCache(true);
    config.setTransactional(true);
    config.setPrivate(true);
    Environment dbEnv = new Environment(baseDirectory, config);
    // Database configuration
    DatabaseConfig dbConfig = new DatabaseConfig();
    dbConfig.setSortedDuplicates(true);
    dbConfig.setAllowCreate(true);
    dbConfig.setReadUncommitted(true);
    dbConfig.setMode(0664);
    dbConfig.setType(DatabaseType.BTREE);
    dbConfig.setTransactional(true);
    Database db = dbEnv.openDatabase(null, name, null, dbConfig);
    // Transaction configuration
    TransactionConfig transactionConfig = new TransactionConfig();
    transactionConfig.setReadUncommitted(true);
    Transaction txn = dbEnv.beginTransaction(null, transactionConfig);

    Ugh.. OK, the best way to find a bug that's been driving you crazy all week is to publicly whine about it on a forum. I had a cursor in a tight loop that I wasn't closing. I found this literally 5 minutes after posting... but hey, whatever it takes! People, take note! Don't mess with them cursors!

  • Performance profiler for Forms6i?

    Hello,
    Does anyone know of a Forms profiler which can instrument code and point at slow parts of the Form? An example of the problem we've encountered is that a Form takes 15 seconds to process the Submit button but there is only 5 seconds of database time in the corresponding SQL trace. We're trying to track down the "missing" 10 seconds.
    With C or Java I can simply instrument the code and the tool will tell me which calls are consuming the most time. I'm not sure whether a similar tool exists for Forms. (Please don't tell me that the only way to do this is to put our own timers into the code!)
    Thanks,
    :-Phil

    check this out:
    http://otn.oracle.com/products/forms/pdf/perf_collect.pdf

  • Application performance profiling

    We have developed an e-commerce application for which WL is the app server. We want to test the server-side performance of the application. Which tool would you suggest we adopt?
    Thanks
    Manohar Joshi
    BFL Software Limited
    PRODUCTS BU
    Karnataka State
    e-mail : [email protected]

    Hi,
    Please let me know whether we can use LoadRunner for the following kind of plan; if not, which tool would you propose we use?
    We are planning to deploy our EJB components on WebLogic Web Application Server (not Enterprise).
    We will have a set of JARs (Java ARchives) to deploy in the EJB container provided by WebLogic. The beans can be deployed on multiple servers (clustered environment). The deployment will include enabling transactions, security, and message components.
    Thanks
    Manohar Joshi
    Michael Girdley wrote:
    Hi Robert,
    Is creating a servlet that makes an RMI call to the EJBs a possibility for you? Then you could use your standard HTTP load generation tool.
    The difficulty with benchmarking RMI clients against WLS is that the WLS client libraries have an optimization whereby they multiplex all client requests to the servers over one socket. So, if you have ten client threads in one JVM, they'll all share one socket. While this won't affect results much, it means that to be a realistic benchmark you need to have many, many JVMs.
    One of our customers did just that.
    Hope this helps!
    Michael
    Michael Girdley
    Sr. Product Manager
    WebLogic Server
    BEA Systems
    ph. 415.364.4556
    [email protected]
    Robert L. Doerr <[email protected]> wrote in message
    news:[email protected]...
    How were you able to use LoadRunner in your testing? We wanted to stress test the WebLogic application servers, but LoadRunner was unable to record the RMI/EJB protocols that the client was using to communicate with the app server. It has been a sore point here and has caused us a lot of grief trying to test the application. The best thing we have been able to come up with so far is to make a WinRunner script of the application and use that to start 10+ instances of the application on each client machine to generate some load on the system. We are still looking for a good way to test for the final projected load of 1500+ users. Sun used to sell a tool called JavaLoad but I have heard that they discontinued it. I'm still looking for ideas to generate a valid test.
    Regards,
    Robert
    Tamilselvan Ramasamy wrote:
    LoadRunner is good.
    Manohar Joshi wrote in message <[email protected]>...
    Any success stories of performance testing through Load Runner (Mercury
    Interactive's tool)
    Thanks

  • Nvidia 260gtx: always high performance profile

    I have an Nvidia 260GTX running the nvidia-340xx legacy drivers on KDE4. Looking at nvidia-settings, the PowerMizer entry shows "auto - adaptive" and nvidia-settings reports 0% GPU usage, yet the clocks never leave the high performance level (2).
    The power mode is set to adaptive if queried via console also:
    $ nvidia-settings -q | grep PowerMizer
    Attribute 'GPUPowerMizerMode' (Takemikazuchi:0[gpu:0]): 0.
    Valid values for 'GPUPowerMizerMode' are: 0, 1 and 2.
    'GPUPowerMizerMode' can use the following target types: GPU.
    Attribute 'GPUPowerMizerDefaultMode' (Takemikazuchi:0[gpu:0]): 0.
    'GPUPowerMizerDefaultMode' is an integer attribute.
    'GPUPowerMizerDefaultMode' is a read-only attribute.
    'GPUPowerMizerDefaultMode' can use the following target types: GPU.
    I have tried disabling all effects in KWin, as well as switching to xrandr (which should use CPU instead). I have not used any xorg.conf files.
    I have another PC with an 8600M GT (see sig), which switches to level 0 almost immediately. About the only differences I see are that it runs KDE5 and has only one monitor, while my 260GTX machine has two.
    Any idea how to make it use levels 0-1?

    Ok, this is weird. Apparently the clocks are stuck at their maximum on Windows as well. Could it be some kind of BIOS modification MSI did in their version of the 260GTX card? It's the TwinFrozr version of it.
    edit: double-checked, the nvidia control panel is set to "adaptive" on Windows as well.
    Last edited by Soukyuu (2015-03-01 19:28:16)

  • JVM performance profiling tool

    Hello,
    I wonder if there are any tools available for profiling a Java program running inside Oracle8i: a tool that measures in which methods execution time is spent (and how much), how many instances of different objects have been created, heap size, etc. I've been using OptimizeIt for standalone applications, but I don't know what to use for JServer inside 8i.
    Help !!!
    /Patrik

    Have you looked at JDeveloper? It has a set of execution profile and memory profile tools. It also supports debugging in the database for both PL/SQL and Java stored procedures.

  • HP HDX 16-1010EA: Battery charging and performance problems

    Hi,
    Since my HP warranty has expired, this place is my only hope for some problems I'm having, so please help, guys.
    1. I own a personal copy of Windows 7 Pro x64 and installed it on a different partition from the Vista x64 that came with my HDX 16. After I updated the drivers, especially the chipset drivers, running on battery with the balanced profile is like HELL; it's like loading Windows 7 on an old IBM 8086 machine, and performance drops radically. It's obvious that the problem is at the driver level, but I've updated the chipset drivers three times so far and nothing has changed. I can't raise this with Intel since I have no idea what to tell them, so I'm hoping someone has had similar issues and found a solution.
    2. Lately, my battery has started to charge very slowly and to run out pretty quickly. Since I cannot use the balanced profile (explained above), I have to use the high-performance one. I know this can reduce the battery's life cycle dramatically, but 30-40 minutes is too little even for the high-performance profile. And I am a very careful notebook owner and treat it like a baby: I charge it fully, unplug it immediately, and don't charge it again until it reaches 10%. I also give it an air-spray cleaning every two months, so this notebook should run like new.
    3. Now, this one is not a problem, just pure curiosity. I know the HDX 18 has two hard drive ports and comes with two hard drives. After Intel upgraded from the Matrix Storage app to Rapid Storage Technology, I noticed an extra interface (logically speaking) in its output. To clarify: the HDX 16 has an interface for the hard drive, another for the optical drive, and an external interface (eSATA). If you look closely at the picture linked below, you'll see another port, a mysterious one:
    http://i46.tinypic.com/x5cw28.jpg
    An internal empty port with a SATA icon on it: is there any chance that the HDX 16 has a spare interface for connecting another hard drive?
    Regards,
    Pluto

    Anyone?

  • How do I improve performance while doing pull, push and delete from Azure Storage Queue

           
    Hi,
    I am working on a distributed application that uses Azure Storage Queue for message queuing. The queue will be used by multiple clients around the clock, so it is expected to be heavily loaded most of the time. The business case is typical: pull a message from the queue, process it, then delete it from the queue. The module also sends a notification back to the user indicating that processing is complete. The functions/modules work fine in that they meet the logical requirements; it is a pretty typical queue scenario.
    Now, the problem statement. Since the queue is expected to be heavily loaded most of the time, I am trying to speed up the overall message lifetime. The faster I can clear messages, the better the overall experience for everyone, system and users.
    To improve performance I did multiple cycles of performance profiling and then improved the identified "hot" path/function.
    It came down to the point where the Azure Queue pull and delete calls are the two most time-consuming operations. I improved the pull by batch-pulling 32 messages at a time (the maximum message count I can pull from an Azure queue at once, at the time of writing this question), which reduced processing time by a big margin. All good up to this point as well.
    I am processing these messages in parallel to improve overall performance.
    Pseudocode:
    // AzureQueue class encapsulates calls to Azure Storage Queue.
    // Assume nothing fancy inside: plain calls to the queue for pull/push/delete.
    var batchMessages = AzureQueue.Pull(32);
    Parallel.ForEach(batchMessages, bMessage =>
    {
        try
        {
            DoSomething(bMessage);   // DoSomething does some background processing
        }
        catch
        {
            // Log exception
        }
        AzureQueue.Delete(bMessage);
    });
    With this change, profiling results show that up to 90% of the time is taken by the Azure message delete calls alone. Since it is good practice to delete a message as soon as processing is done, I remove it right after "DoSomething" finishes.
    What I need now are suggestions on how to further improve the performance of this function when 90% of the time is eaten up by the Azure Queue delete call itself. Is there a better, faster way to perform the delete (bulk delete, etc.)?
    With the implementation described here, I get close to 25 messages/sec. Right now the Azure queue delete calls are choking application performance, so is there any hope of pushing it further?
    Does it also make a difference to performance which queue delete overload I am calling? As of now the queue has overloaded methods for deleting a message: one accepts a message object and another accepts a message identifier and pop receipt. I am using the latter here, with message identifier and pop receipt, to delete the message from the queue.
    Let me know if you need any additional information or clarification of the question.
    Inputs/suggestions are welcome.
    Many thanks.

    The first thing that came to mind was to run the delete in parallel with the work in DoSomething. If DoSomething fails, add the message back into the queue. This won't work for every application, and work that was near the head of the queue could be pushed back to the tail, so you'd have to think about how that may affect your workload.
    Or, queue a threadpool delete after the work succeeds: fire and forget. However, if you're processing at 25/sec and 90% of the time sits in the delete, you'd quickly accumulate delete calls for the threadpool until you could never catch up. At 70-80% duty cycle this may work, but the closer you get to always being busy, the more dangerous this becomes.
    I wonder if calling the delete REST API yourself might offer any improvement. If you find the delete sets up a TCP connection each time, this may be all you need. Try to keep the connection open, or see if the REST API can delete more at a time than the SDK API can.
    Or, if you have the funds, just have more VM instances doing the work in parallel, so the first machine handles 25/sec, the second another 25/sec, and you just live with the slow delete. If that's still not good enough, add more instances.
    Darin R.
