Strange FPGA FIFO DMA behaviour

Hi all,
let me warn you that 1) this post is long, and 2) I'm aware that the problem I'm having has many workarounds (some I know, some I don't, some I think I know...). However, I would like to understand exactly why it's happening, both to be able to find the most efficient solution and for personal "culture"...
So, I have a very simple VI running on an FPGA: at each cycle it reads 8 ADC channels and builds an array of 9 elements (time + 8 channels). I'll refer to one of these 9-element arrays as a "data vector". Because I have to pass this data vector to the host VI, I set up a DMA FIFO into which I write the elements in order: time, ch0, ch1 ... ch7, time, ch0, ch1... etc. (I can't write arrays directly into the DMA FIFO).
The host VI opens a reference to the FPGA VI, invokes the Start method and then enters a timed loop. On the first iteration it reads 0 elements, which gives it the number of elements remaining in the FIFO. Using a feedback node it passes this number (N) to the next iteration, in which it reads K = 9*floor(N/9) elements (i.e., a whole number of data vectors) and reshapes them into a (K/9)x9 array. Finally, it formats the array and writes it to a text file. Then the cycle repeats, using the number of remaining elements obtained from the previous iteration.
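In textual form, the loop logic is roughly this (a Python sketch only, since the real code is of course a LabVIEW diagram; the fifo_read stand-in and the dummy data are invented for illustration):

FRAME = 9  # time + 8 ADC channels per data vector

fifo = list(range(100))  # dummy stand-in for the host-side DMA buffer

def fifo_read(n):
    # Stand-in for the DMA FIFO Read method: returns the elements read and
    # how many elements are still waiting in the buffer.
    global fifo
    data, fifo = fifo[:n], fifo[n:]
    return data, len(fifo)

# First iteration: read 0 elements just to learn how many are available.
_, remaining = fifo_read(0)

for _ in range(3):  # the timed loop
    k = FRAME * (remaining // FRAME)          # a whole number of data vectors
    data, remaining = fifo_read(k)
    vectors = [data[i:i + FRAME] for i in range(0, len(data), FRAME)]
    # 'vectors' is the (k/9) x 9 array that gets formatted and written to file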
Now, the problem is that in the host VI the user can set both the sampling frequency of the FPGA and the timed-loop frequency. While this lets them find their preferred combination of sampling frequency and file-writing interval, the values may be such that the FIFO saturates (i.e. the timed loop that empties the FIFO runs too slowly compared to the FPGA acquisition that fills it), and this has two main consequences. One is loss of data, which I don't care about in this case as long as I notify the user that data is being lost. The other is that, even when I set the parameters back to a combination that keeps the buffer from filling up, I've lost track of the "first element" of my data vector.
Now, I thought I had an explanation for that: when the buffer saturates, the FIFO Write in the FPGA code will keep timing out until the buffer is (at least partially) freed, which can happen at any point in its cycle. Suppose it found the FIFO full when it was trying to write ch3, and then resumed (because it found free space in the FIFO) when trying to write ch0: this would produce an incorrect sequence of data (time, ch0, ch1, ch2, ch0, ch1, ch2, ch3...) that would compromise any subsequent reading.
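To make that suspected failure mode concrete, here is a toy simulation of the slicing (Python, with made-up labels; this is just my hypothesis acted out, not real data):

FRAME = 9

# Ideal stream: t0, ch0.0 .. ch7.0, t1, ch0.1 .. ch7.1, ...
frames = [[f"t{i}"] + [f"ch{c}.{i}" for c in range(8)] for i in range(5)]
stream = [x for f in frames for x in f]

# Hypothetical fault: while writing frame 1 the FIFO filled up at ch3, and
# writes only resumed at ch0 of frame 2, so ch3.1..ch7.1 and t2 were lost.
corrupted = stream[:FRAME + 4] + stream[2 * FRAME + 1:]

# The host still slices blindly every 9 elements:
for i in range(0, len(corrupted) - FRAME + 1, FRAME):
    print(corrupted[i:i + FRAME])
# Everything downstream of the gap is shifted, so later slices no longer
# start on a time stamp -- the "lost the first element" symptom above.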
However, I'm observing another behaviour in another routine that makes me doubt this explanation. (I told you the post was long!)
So, actually (I lied to you before...) the FPGA code writes to two identical FIFOs, not just one; I'll call the second one FIFO2. This is because I want another routine to run and visualize the data independently of the one that writes it to file. This "visualization routine" opens a reference to the FPGA code, does not invoke the Start method, and enters a loop that is almost identical to the one in the host VI, except that it sends the reshaped array to some graphs and the like instead of writing it to file. Because this routine is meant to be run whenever the user needs to watch the data, it often happens that it is started with a big delay with respect to the FPGA code, so FIFO2 is already full. As expected, the reconstructed array is then scrambled, because the first element is no longer what it is expected to be. However, stopping and restarting this routine within an interval shorter than the time needed to fill the FIFO2 buffer (which is only possible if the sampling frequency is relatively slow, since stopping and restarting a routine takes at least a couple of seconds) makes it work. This doesn't match what I would expect from my previous explanation, because stopping this routine doesn't stop the FPGA code (I don't want it to) and doesn't reset the FIFO2 buffer, so if there were a shift in the order of the elements from a previous timeout, it should be maintained...
Understanding why this is happening is interesting to me not only because it demonstrates that I don't really understand how this FIFO thing works, but also because it would probably suggest a simple solution to implement in both routines when I detect a write timeout in the FPGA code.
Does all this make sense to you?
Thanks
Giacomo

I had some time today and I thought this was an interesting question, so based on your description I went ahead and created a LabVIEW project template of how I would implement this application. I used one DMA FIFO in a producer loop on the host, then use queues to send the data to two separate consumer loops. Check out the attachment and let me know what you think.
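In rough textual form, the pattern looks like this (a Python sketch only; the attachment itself is a LabVIEW project, and the names and the fake data source below are made up):

import queue
import threading
import time

FRAME = 9  # time + 8 channels, as in the original post

def read_dma_fifo(n_frames=4):
    # Stand-in for the single host-side DMA FIFO read in the producer loop.
    t = time.time()
    return [v for i in range(n_frames) for v in [t + i] + [0.0] * (FRAME - 1)]

def producer(queues, stop):
    while not stop.is_set():
        data = read_dma_fifo()
        for q in queues:              # fan the same data out to every consumer
            q.put(data)
        time.sleep(0.1)

def consumer(q, name, stop):
    while not stop.is_set():
        try:
            data = q.get(timeout=0.5)
        except queue.Empty:
            continue
        vectors = [data[i:i + FRAME] for i in range(0, len(data), FRAME)]
        print(name, "got", len(vectors), "data vectors")  # write to file / update graphs here

stop = threading.Event()
file_q, display_q = queue.Queue(), queue.Queue()
workers = [threading.Thread(target=producer, args=([file_q, display_q], stop)),
           threading.Thread(target=consumer, args=(file_q, "file writer", stop)),
           threading.Thread(target=consumer, args=(display_q, "display", stop))]
for w in workers:
    w.start()
time.sleep(1)
stop.set()
for w in workers:
    w.join()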
FPGA VI
Host VI
-Hunter
Attachments:
FIFO Example Program.zip ‏127 KB

Similar Messages

  • Passing data from RT host to FPGA through DMA FIFO

    Hello,
    I am trying to write some data from an RT host to an FPGA target using a DMA FIFO, then process this data, and then read it back from the FPGA target to the RT host through another DMA FIFO. I am working on an NI PXIe-1062Q chassis, with an NI PXIe-8130 embedded RT controller and an NI PXIe-7965R FPGA target.
    The problem I am facing is that I want to send three different arrays, two of the same size and the third one of a different size, and I need the smaller one to be sent to the FPGA first. I tried using a flat sequence with two frames in the FPGA VI. In the first frame I read and write the first array in a while loop that is finite (i.e., has a finite number of iterations). The second frame contains the process of reading and writing the other two arrays (of the same size) in a while loop that can be finite or infinite (according to a control). The problem is that this didn't work. The two arrays are displayed on the front panel of the RT host VI and are working fine; however, the array that should have been read in the first frame doesn't show up on the front panel of the RT host VI. This doesn't make sense, because if it were not passed from the host to the FPGA and vice versa, then the second frame shouldn't have executed. Note that I am wiring -1 to the timeout to block the while loop iterations until the passing of each element is complete, so the first while loop has 3 iterations only. Could someone help me understand why this happens and how to solve this problem?
    I am attaching a picture of both the host and the fpga vi.
    Thank you.
    Solved!
    Go to Solution.
    Attachments:
    RT host vi.png ‏102 KB
    FPGA vi.png ‏28 KB

    No need to initialize the arrays with values that you will immediately overwrite.  Here's what I believe to be equivalent code:
    The array outputs should be wired directly to the FPGA FIFO writes.  Do not use local variables when you can wire directly.
    If you know that you want to transfer the Temp Data Array first, why not make your code do that?  Eliminate the sequence structure, and put the functions in the order in which you want them to execute.  Use the FPGA reference and error wires to enforce that order.  You might consider writing the Temp Data Array, reading it back, then writing the Real and Imag A arrays, to see if that gets you the results you expect.  Run the code in simulation (in the project, right-click on the FPGA target and execute on the host with simulated IO) so that you can use execution highlighting and probes to see what is happening.  Wire the error wires through and see if you get an error anywhere.  Make sure you're not missing something simple like looking at the wrong starting array index.

  • Can't make RT use generic FPGA FIFO

    Hi all,
    On a cRIO, I'm making a TCP server in RT that feeds data to a DMA FIFO for FPGA consumption.  We are setting up multiple servers where each feeds a different DMA.  (No, I don't want to merge and multiplex to single DMA).
    I want a single RT program that takes a FIFO reference so that I can reuse the same RT code for different FIFOs.  I can define a FPGA FIFO reference control,
    but this only supports writing a single element at a time (or two on our target if you select peer-to-peer streaming for the FIFO).  I have tuned the RT program for performance, and looping over thousands of elements per second in RT is NOT the solution.  I want to use the "other" FIFO write which supports arrays - the one I can reference from an FPGA reference as follows:
    However, using the FPGA reference I believe requires a particular named FIFO - therefore tying the RT code to a particular FIFO.
    Any ideas how to do this?
    Steve

    All,
    Thanks for the advice.  I'm trying to create a basic test using the Advance Sessions Sources and am getting a FIFO name error (-61206) from the Get Single Resource Session node.  The messy diagram below is showing the test VI on the right and a client VI on the left.  The idea is to pass an FPGA reference and the FIFO name "FpgaDataIn" and the test VI grabs the FIFO by name and writes to the FIFO.  
    One issue to overcome is my beginner skills: the passed FPGA reference needs to be compatible with the FPGA reference control - which defeats the purpose of supporting generic FPGA references. There must be a way to specify a control that accepts a generic FPGA reference (a dynamic FPGA cast? which I've never used).
    Putting that issue aside for a moment, even when I define an FPGA reference control with a signature compatible with the passed FPGA reference, I still get the -61206 error. I noticed issues related to upper/lower case and tried everything lower case (I THINK - I'm not positive I did it completely - it's tedious with the FPGA reference signatures).
    I read that the Get Single Resource Session needs a constant reference with a single "FIFO" FIFO, and I have that.
    Any tips?
    Thanks!
    Steve

  • What happens if I don't read the data from a DMA FIFO?

    Hello,
        I would like to know what happens when I don't read the data out of a DMA FIFO. When the DMA FIFO is full, is the new data simply queued by pushing the oldest data out?
    Thanks

    Hello,
    I imagine you are using a CompactRIO or an FPGA board.
    What you need to know is that a DMA FIFO is made up of two parts: one on the FPGA itself (of the size specified when the FIFO is created) and a much larger one on the host (Windows or RT), whose size can be changed using a method node from the FPGA Interface palette.
    In any case, when you write to the FIFO, these two buffers fill up. Once both are full, the write function will return a timeout error, and the element you tried to write will simply be lost.
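    One way to picture it (a rough Python model of the write side, not LabVIEW code; the capacity and timeout values are arbitrary):

    import queue

    # One bounded queue standing in for the combined FPGA + host buffers.
    fifo = queue.Queue(maxsize=8)

    lost = 0
    for sample in range(20):
        try:
            # Like the FPGA-side FIFO Write with a finite timeout: if no space
            # frees up in time, the call reports a timeout...
            fifo.put(sample, timeout=0.01)
        except queue.Full:
            lost += 1  # ...and that sample is simply lost.

    # Nobody ever reads, so 8 samples sit in the buffer and 12 are lost.
    print("buffered:", fifo.qsize(), "lost:", lost)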
    Regards,
    Olivier L. | Certified LabVIEW Developer

  • Data transfer from RT to FPGA using DMA FIFO

    Hello all,
    My question is "How do you stream data from RT target to FPGA target using DMA FIFOs?"
    I would like to control some indicators (or controls) in the FPGA VI using controls in the RT VI via a DMA FIFO.
    I have used four controls in my RT VI, but I get only one indicator out on my FPGA VI. (I would actually like to drive some controls on the FPGA target using controls on the RT target.)
    Is this possible?
    Can anyone help me with this?
    I have attached my VIs.
    Attachments:
    fpgatest.vi ‏28 KB
    rt_test.vi ‏73 KB

    Based purely on your example, I see two options:
    1. Do as RavensFan suggests and use Boolean Array To Number to send a single number down to the FPGA.  Your FPGA can break the number up easily enough to update the indicators (see the sketch below).
    2. Just write directly to the indicators.  I do not see a need for the DMA.  Again, based purely on your example.
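    For illustration, here is option 1 written out in plain Python (the real implementation would be a LabVIEW diagram; the four Boolean controls are assumed from your example):

    def pack(bools):
        # RT side: Boolean Array To Number -- bit i of the result is bools[i].
        n = 0
        for i, b in enumerate(bools):
            if b:
                n |= 1 << i
        return n

    def unpack(n, count):
        # FPGA side: break the number back into individual indicator bits.
        return [bool(n >> i & 1) for i in range(count)]

    controls = [True, False, True, True]        # the four RT controls
    word = pack(controls)                       # single number sent down the DMA FIFO
    assert unpack(word, 4) == controls          # FPGA recovers the four indicators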
    There are only two ways to tell somebody thanks: Kudos and Marked Solutions
    Unofficial Forum Rules and Guidelines

  • Error -63150 when accessing FPGA FIFO in host VI

    I have a problem accessing a target-to-host FIFO in the host VI (32767 elements, U16) using the FIFO invoke node:
    FIFO.stop works without error
    FIFO.start and FIFO.read result in the following error message:
    Error -63150 occurred at Invoke Method: FIFO.Read in host.vi
    Possible reason(s): NI-RIO:  (Hex 0xFFFF0952) An unspecified hardware failure has occurred. The operation could not be completed.
    I already tried the following:
    removed all other PCI/PCIe cards from the PC (except the graphics card in PCIe slot 1): didn't help.
    I put the connector card for the PXIe chassis (which was originally in PCIe slot 2) into slots 3 and 4, respectively (see below for motherboard details): didn't help.
    I put the connector card into another (older and much too slow) PC (WinXP). There, everything is running fine.
    Hardware/software Details:
    target: PXIe-7962R (with mounted FlexRIO adapter NI 5751; PCI bus 9, device 0, function 0). The device is displayed as working properly both in the Windows Device Manager and in Measurement & Automation Explorer.
    chassis: PXIe-1073 (FPGA card is in slot 2, a PXI-6229 card is in hybrid slot 4)
    LabVIEW 2010 SP1 with FPGA module 10.0.01
    NI-RIO 3.6
    operating system: Windows 7 Enterprise (32bit, Service Pack 1)
    processor: Core i7 960
    motherboard: ASROCK X58 Deluxe3: http://www.asrock.com/mb/overview.de.asp?Model=X58%20Deluxe3&cat=Specifications
    What could I do to fix this problem? Any hint highly appreciated.
    Solved!
    Go to Solution.

    Current status (still not solved):
    reinstalling Windows 7 (32bit Enterprise Edition) and LV2010 SP1 with FPGA module and drivers didn't help. Error is still there.
    @cheggers: the error is produced when FIFO.read() is executed the first time; it's not only after stop/start.
    trying the chassis/cards and test VI on a similar PC: there it works perfectly. This other PC differs from mine only in the following aspects:
    motherboard: ASUS P6T DELUXE
    processor: core i7 950 instead of 960
    RAM: 12GB (DDR3 1333MHz Memory) instead of 6GB (DDR3 1333MHz Memory)
    OS: Windows 7 64bit Ultimate instead of 32bit Enterprise
    graphics card: NVIDIA GeForce GFX 460 instead of Quadro NVS 295
    FPGA module: 10.0.0 instead of 10.0.1
    It really seems to be related to my PC's hardware. Motherboard, processor, memory, graphics card???

  • Strange measurement input fields behaviour

    Hello,
    I have a Czech version of Illustrator CC on Windows 7 Pro.
    Some measurement input fields are not working as expected, for example:
    When creating a new document:
    Decimal marks in document dimensions have disappeared (instead of 210.00 mm it shows 210 00 mm). The mark also disappears after I add it. Another strange thing: when I click in the input box, do nothing, and click out, the value changes to 5779 55 mm.
    (Something similar is happening in my InDesign: I can't change an object's dimension expressed with a number with a decimal mark to another number with a decimal mark - only to an integer. I'm not working in Web mode.)
    Is this behaviour controlled by some settings I can't find or is this a bug?
    Every note leading to a solution much appreciated!

    maara,
    The decimal point is basically governed by your Regional/Language settings, in the Control Panel I believe.
    There may be some issues between . and , for decimal point (and reversely for thousands).
    But, especially if the behaviour has not always been there, you may try the list below.
    In any case you may try a chat or a support call, here or here,
    Creative Cloud support (all Creative Cloud customer service issues, chat open between 5AM and 7PM PST/PDT on workdays)
    http://helpx.adobe.com/x-productkb/global/service-ccm.html
    Adobe Support (phone),
    http://helpx.adobe.com/adobe-connect/adobe-connect-phone-numbers.html
    The following is a general list of things you may try when the issue is not in a specific file (you may have tried/done some of them already); 1) and 2) are the easy ones for temporary strangenesses, 3) and 4) are specifically aimed at possibly corrupt preferences, 5) is a list in itself, and 6) is the last resort.
    If possible/applicable, you should save current artwork first, of course.
    1) Close down Illy and open again;
    2) Restart the computer (you may do that up to 3 times);
    3) Close down Illy and press Ctrl+Alt+Shift/Cmd+Option+Shift during startup (easy but irreversible);
    4) Move the folder (follow the link with that name) with Illy closed (more tedious but also more thorough and reversible);
    5) Look through and try out the relevant among the Other options (follow the link with that name, Item 7) is a list of usual suspects among other applications that may disturb and confuse Illy, Item 15) applies to CC, CS6, and maybe CS5);
    Even more seriously, you may:
    6) Uninstall, run the Cleaner Tool (if you have CS3/CS4/CS5/CS6/CC), and reinstall.
    http://www.adobe.com/support/contact/cscleanertool.html

  • Anyone understand this strange LR3 side panel behaviour?

    In the last couple of days my laptop has developed a piece of strange behaviour related to the side panels.  I'm pretty sure it didn't start when I installed the LR3.5 update, but this may have been a contributing factor.
    Until recently I always worked in "solo" mode.  A couple of days ago, LR's behaviour changed so that opening or closing any of the side panels would cause the relevant side (left or right) to become totally unresponsive.  The rest of the program continued to work fine, and the equivalent side panel in other modules continued to work properly until a panel was opened or closed.
    In an attempt to understand what was going on, I turned "solo" mode off.  This led to some very interesting behaviour indeed:
    If I used the triangle to open or close a side panel, everything worked exactly as expected.
    If I clicked on the dark grey bar to open or close a side panel, it became stuck in one of the two states.
    When stuck in "open" state, clicking in the grey area simply caused the panel to close and then re-open, very rapidly.
    When stuck in "closed" state, something similar happened in that the panel opened and then re-closed, very rapidly.
    When in "stuck" state, clicking on the triangle had no effect on the displayed panel but it seemed to toggle the internal state between opened and closed.
    Once toggled, clicking on the grey bar opened the panel where it used to close it, and vice versa.
    When collapsed, the panel displays as pale grey rather than the more usual dark grey.  See below for an illustration.
    I have a desktop machine as well which doesn't show the behaviour.  I've tried reinstalling LR3.5, to no avail.  My guess is that we're dealing with inconsistent panel state information in the "preferences" file, but I don't know for sure.
    If anyone knows what is happening, or (more importantly) how to stop it happening, I'd be enormously grateful for some help. 
    Many thanks in advance,
    Ian Wilson
    Cambridge UK

    Further investigation yielded yet more interesting behaviour - and, it turns out, a pointer to the problem.
    I tried renaming the old "preferences" file, causing LR to create a new one.  The problem was unchanged, apart from losing my registration information... 
    I tried creating a new, virgin catalogue.  This behaved normally, so I thought I was onto something!
    I tried importing a folder of pictures into the new catalogue.  The old behaviour returned.
    I created a new virgin catalogue, which reverted to good behaviour.  I then imported one picture into the catalogue.  Behaviour BAD.
    I then tried removing the single picture from the catalogue and restarted LR.  Behaviour GOOD!
    Conclusion:  Lightroom works just fine for me providing I never put any pictures into my catalogue!! 
    This got me thinking that, perhaps, the issue was related to the rendering of images.  I tried installing the latest nVidia drivers for my laptop, but this had no effect whatsoever.  Given the issue clearly related to how images are rendered, however, I wondered whether the problem could relate to colour profiles.  BINGO!!  I'd recently upgraded my "i1xtreme" to X-Rite's new "i1publish".  The profiles created with the latest software are ".icm" rather than ".icc", although this shouldn't matter (and doesn't matter to PS and Bridge).  It seems to matter to LR, however, as reverting to my earlier ".icc" profiles made everything work again just fine. 
    There is obviously a bug somewhere, but it's not clear whether I should be chasing Adobe or X-Rite.  I'd welcome any comments or advice...
    Many thanks for the suggestions,
    Ian.

  • Strange Custom Template Sections Behaviour

    I've created some custom template sections so I can insert my own various page types for a document I'm producing. The strange thing is that Pages will insert an extra blank page (similar to the master but not exactly the same; it's missing a text box) between sections when I add a section, but only some of the time!
    It will not do it when I initially add the section.
    It only adds this 2nd page in between when I add additional sections, be they the same type of section or another type of section.
    I cannot delete this 2nd page without deleting the 1st page of the section as well.
    When I try to capture just the 1st page to make another version of the custom template it still exhibits the above behaviours, inserting this "buffer" 2nd page in between sections.
    All the other custom sections I have created do not exhibit this behaviour.
    I've turned on invisibles and can't see anything that makes this section special or different.
    Any guesses as to why it inserts this extra page in between sections to make a 2 page section instead of staying as a 1 page section?
    Message was edited by: Nathan Muirhead

    Nathan
    You have some text being forced over to the next page, either because you hammered away at the spacebar/return key/tabs, or have some object wrap pushing returns over, or both:
    Menu > View > Show Invisibles / Show Layout
    Once you have cleaned up your pages recapture them as sections and resave your template over the old one.
    Peter

  • 1D Boolean Array to 1D Integer Array conversion for FPGA FIFO

    Hello, I am using a PXI-7813R card. I would like to pass some data between the target (FPGA) VI and the host VI using the FIFO. I have a FIFO set up for 1023 "32 bit integer" samples. I have a Boolean array of 32000 samples, which would be the same as 1000 32-bit integers, that I acquired using the PXI-7813R card. I would like to convert the 1D Boolean array to a 1D "32 bit integer" array. This seems like a more difficult problem than I first thought, as the LabVIEW functions are reduced when targeting an FPGA device. I have attached a JPG of how I would like to do it. I am getting an "Arrays must be fixed size in current target" error for the output from the Array Subset function. I know this is because one of the inputs is not exactly a constant, i.e. the index input for Array Subset, but regardless of the index, I will only ever be taking 32 bits from the Boolean array at any time to convert to a 32-bit integer to then place in the FIFO. Any suggestions of how I may get around this problem would be gratefully received. Regards, Michael.
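    Just to spell out the conversion I'm after (a Python sketch of the arithmetic only; on the FPGA target this would of course have to be a fixed-size loop):

    def bools_to_u32s(bits):
        # Pack a 1D Boolean array (length a multiple of 32) into 32-bit words,
        # bit 0 of each word coming from the first Boolean of its group.
        assert len(bits) % 32 == 0
        words = []
        for w in range(0, len(bits), 32):
            value = 0
            for i in range(32):
                if bits[w + i]:
                    value |= 1 << i
            words.append(value)
        return words

    samples = [True, False] * 16000          # 32000 Booleans, as in the post
    words = bools_to_u32s(samples)           # 1000 x U32, one FIFO element each
    print(len(words), hex(words[0]))         # 1000 0x55555555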
    Message Edited by Michael_Limerick on 02-08-2008 04:54 AM
    Attachments:
    fifo_out1.JPG ‏52 KB

    Hi Daniel,
    Thanks for your reply.
    I had a look at the thread that you suggested and I'm not sure it would solve the problem I was having; the option box was checked by default. I think my issue has to do with the limitations of the different LV functions when targeting an FPGA device.
    I have decided to take another route anyway; it seems that trying to compile a large array (even a 1D Boolean array) for an FPGA target both takes a long time and uses a lot of FPGA resources.
    Thanks again for your reply,
    Regards,
    Michael.

  • Strange 6i's Compiler Behaviour

    Hi all,
    I am experiencing a strange behaviour in Form Builder (patch 5). Sometimes when I fix/change something and recompile (using the Run button or File - Administration - Compile File), another part of the application (usually a report) gets broken.
    For example, before I change anything, it can run the report just fine (using the run_product built-in). Then I change something (NOT the run_product), like the name of a button, and recompile. The next time I try to run the report, it sometimes displays an error message or displays a blank report.
    To fix it, I have to open the trigger that contains the run_product built-in to see the PL/SQL code, press the compile button and then recompile the whole form again.
    Anybody else experiencing the same problem?
    Regards,
    Ari

    Hi,
    I got the same problem, but not with the run_report built-in - in the key-next-item trigger.
    Sometimes when I navigate through my items, my application closes. I do the same thing that you do: I recompile the trigger, and the problem appears in another item.
    We haven't found a solution yet.
    bye Roxane

  • High Speed Streaming with Multiple FPGA FIFOs and TDMS Advanced Asynchronous (Data Ref)

    I am using an FPGA with an adapter card (7962 with 5751) for data acquisition and signal processing. I have adapted the FlexRIO example "High Throughput Streaming," which works very well for transferring data from the FPGA via a single FIFO. This example uses the TDMS Advanced Asynchronous Write (Data Ref). The "High Throughput Streaming" example is similar to "Streaming External Data to a TDMS File (Windows)" but includes more code to prep the FIFO buffer size and TDMS size.
    My question is how can I adapt this code to incorporate multiple FIFOs that write data to different channels in the TDMS file? Can I use multiple instances of TDMS Advanced Asynchronous Write (Data Ref) in a single VI, one for each FIFO Acquire Read Region? If so, how do I ensure that the correct data is written to the correct channel in the TDMS file?

    Thank you DeppSu for your explanation, I will look into that.
    But first, I want to be sure that the general FPGA and Host designs are correct, which for the moment I am not sure of. So I have included my code.
    I tried the Host VI several times, and it seems that it works sometimes and sometimes not, as if there were some communication problems between the FPGA and the host on the "read acquire region" method, which is not executed. I managed to make it work randomly before, but not now. Maybe it is because of the reset that I added?
    If someone could check my code and help me, I would really appreciate it, since nobody in my workplace has the expertise to do so :-) If you see some obvious mistake, please share it with me; I also added some comment boxes in the code with questions.
    Delphine
    Attachments:
    thoughput.zip ‏1261 KB

  • Sudden and strange change in GC behaviour

    I see very strange GC behaviour with our application running at one of our customers, but I fail to reproduce it in the lab. Most of the time the application is running just fine, and suddenly GC may start doing absolutely strange things.
    Firstly, it starts running mostly major collections (normally one major collection in a few hours under the same user load)
    81703.7: [GC 81703.7: [DefNew: 737664K->737664K(760704K), 0.0000446 secs]81703.7: [Tenured: 2173068K->588301K(2350848K), 13.0162284 secs] 2910732K->588301K(3111552K), 13.0167844 secs]
    81730.7: [GC 81730.7: [DefNew: 737664K->319K(760704K), 0.0544614 secs] 2205198K->1467853K(3111552K), 0.0548729 secs]
    81746.8: [GC 81746.8: [DefNew: 735966K->735966K(760704K), 0.0000519 secs]81746.8: [Tenured: 2347147K->580785K(2350848K), 9.9294041 secs] 3083113K->580785K(3111552K), 9.9300118 secs]
    The size of used memory suggests that objects get allocated directly to the Tenured Generation. Secondly, collections become much more frequent (from a 150-200 sec gap between collections down to 10-20 sec). Altogether this produces an astonishing rate of memory usage (acquired memory per second) of 50-100 MB/sec (normally it is about 2 MB/sec). On the other hand, major collections take much less time, from 40 sec down to 6-8 sec, which is good.
    I do not see anything happening on the server that could explain such a major jump in memory usage! Also, the users do not see any significant and sudden change in response time. But over time it definitely affects performance (mostly because of frequent major collections) and the server needs to restart to get back to normal.
    JVM version 1.4.1._02 on Solaris 8, parameters:
    -Xms3000000 -Xmx3000000 -XX:NewRatio=3
    I cannot reproduce it in a test environment, so I cannot try a different JVM version, and I need to know why it happens before changing any JVM parameters (the current set looks a bit conservative but it works fine most of the time).

    As you noticed, objects are getting allocated directly in the Tenured Generation. (As background for those of our readers new to reading GC logs: the occupancy of the Tenured (Old) generation following the completion of the GC at 81730.7 is (1467853 - 319)K = 1467534 K. Now look at the occupancy of the Tenured generation at the start of the GC at 81746.8; it is 2347147 K. That means that between these two collections, 879613 K (or about 870 MB) was directly allocated into the old generation.)
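    (Spelling that arithmetic out, for example in Python, with the numbers taken straight from the log in the original post:)

    # Old-generation occupancy, in K, taken from the GC log above.
    old_after_81730 = 1467853 - 319   # heap occupancy minus young-gen occupancy after the 81730.7 scavenge
    old_before_81746 = 2347147        # Tenured occupancy at the start of the 81746.8 collection

    direct_alloc = old_before_81746 - old_after_81730
    print(direct_alloc, "K allocated directly into the old generation")  # 879613 K, i.e. about 870 MB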
    More about that direct allocation in the old generation later. But a consequence of the fact that the old generation is so full is that the remaining free space is insufficient to allow the worst-case promotion from the young generation to succeed. As a result a scavenge is not attempted, but rather a full compacting GC is done instead.
    Two things now.
    First, if you use a more recent version of the JVM (1.5.0_06 I believe) and set -XX:+HandlePromotionFailure, then you need not be hobbled by the pessimistic worst-case promotion causing a full collection to happen, when in fact a scavenge might indeed have succeeded and sufficed (see http://java.sun.com/docs/hotspot/gc5.0/gc_tuning_5.html#0.0.0.0.%20Young%20Generation%20Guarantee%7Coutline or Google "Sun JVM Young Generation Guarantee").
    Second, and coming back to the direct allocation issue, this may be caused by one of two possible factors:
    . It could be that your program is trying to allocate very large objects in Eden that would not fit there and require allocating directly in the much larger old generation. Think about the kinds of data structures you have in your program. Do you have largish hashmaps (which typically grow by doubling)? That could be one reason for a very large allocation request. Use +PrintClassHistogram to understand the kinds of data structures extant in your heap and their sizes; once again, see the document I mentioned above, Google "HotSpot JVM PrintClassHistogram", or check out Joe Mocker's HotSpot options list for how to use it: http://blogs.sun.com/roller/resources/watt/jvm-options-list.html
    . It could also be that your application is making heavy use of serialization (for example), or through some other means (perhaps directly through JNI) executing JNI critical sections which may be locking out GC for an extended period of time. In older JVMs this could translate into extremely poor mutator (application) performance until the critical section is exited. The reason is that GC is locked out in the interim, and once Eden fills up, allocations go slow-path to the old generation.
    The latter has been addressed in later JVMs; please try 1.5.0_06, for example, and let us know your experience.
    Finally, it would seem as though your application needs a large Java heap and you want to avoid excessive GC costs/pauses. Consider running with the mostly concurrent collector, especially if you have some concurrent processing power to spare on your deployment platform. Please refer to: http://java.sun.com/docs/hotspot/gc5.0/gc_tuning_5.html#0.0.0.%20The%20Concurrent%20Low%20Pause%20Collector%7Coutline

  • Strange 10gR1 Opatch Logging Behaviour

    Hi,
    just to see if someone else has experienced this:
    10gR1 opatch needs the ORACLE_HOME/.patch_storage directory for logging opatch lsinventory command messages (that is, the LsInventory.pm perl module).
    The point is that the module doesn't create that directory, so the following error message will show up:
    Problems with creating the log file:
    Couldn't create file for logging. Error is: A file or directory in the path name does not exist.
    So until you use the opatch apply command (to apply a patch to the ORACLE_HOME), or until you manually create that directory in your ORACLE_HOME, you will be unable to log opatch messages.
    That's all,
    Greetings-

    Hi!
    I have recently encountered the same behaviour when running Forms from the Builder through my local OC4J container.
    The error was not the same every time.
    I would start debugging and the problem would disappear.
    I found that if I do a "Compile All" on the form every time I run, I can avoid these problems. I wasted many hours on this. Not only in Forms but in Reports as well: every time I have to compile all the RDFs before running them from the web.
    I was running Forms Builder 9i (9.0.2) on Windows XP and all the forms were built in this version. As I had not used this Developer version before, I assumed this was a 9i bug.
    I have used the Developer 10g version and faced no such problems, but Tim is facing this in 10g, so it looks like the problem is somewhere else...
    Frank, this indeed sounds strange.
    reg

  • MacBookPro 15" 1.1: Strange GPU temp sensor behaviour

    Hello to all,
    Since I left my trusty MBP alone for a week two months ago, some really strange behaviour has occurred.
    I left my MBP in standby as usual. When I came back from my trip, the battery was completely discharged and I had to connect the power adapter. The MBP awoke from suspend-to-disk (grey screen, progress bar) and, a few seconds after being ready for work, the fans started to rev up. Five minutes later the fans were running at full speed.
    Puzzled, I did a restart. Same behaviour, high RPM right after boot logo, full 6000rpm after a few minutes.
    When I took a look at iStatMenus I saw that the GPU Heatsink Sensor Temperature was around 35 degrees Celsius (around 70 Fahrenheit) higher than the GPU Temp sensor itself. It was the highest readout by all sensors, which makes no sense at all.
    I tried: PRAM reset, SMC reset, PMU reset. Apple Hardware Diagnostic CD by me and a German ASP, and by a Genius Bar expert. No errors shown.
    But over the two months the difference between the values sank to 18-20deg Celsius, and I found a tool to let only the CPU sensor control the fan, so I was happy again.
    Yesterday I came back from another trip, and guess what: battery empty, woke up from standby..... The difference is now around 36 deg Celsius, and the MBP shuts down automatically even on the smallest YouTube clip, when the GPU Heatsink Temp exceeds around 86 deg Celsius.
    I am clueless, everybody around me too. So maybe one of you has an idea?
    Regards, Matthias

    Not sure about sensors adapting to reality by themselves. This sounds like very strange behavior. I wonder if there could be something like a loose connection somewhere.
    Did the computer misbehave in front of the genius? If it didn't, it might be worth while seeing if you can determine the exact circumstances under which it will act up so you can make it act up in front of the genius.
    For example, my iBook used to freeze, but only when warmed up. I went to the genius bar early so I could give it time to warm up so that it would be sure to malfunction in front of them. This is especially important to do if you are still under warranty. It's important to establish that a problem first occurred under warranty, especially if your warranty is close to running out.
    Good luck!
