Xenballoond - memory ballooning - memory overcommit

Hi,
I am using the Oracle VM Server 2.2.1 (late October 2010).
When I try to find the xenballoond process, I cannot find it:
ps -ef | grep xenballoond
When I try to find the xenballoond file on disk, I cannot find it:
find / -name '*xenballoond*' -print 2>/dev/null
When I try to check whether the xenballoond service exists, it does not exist:
service xenballoond status
However, the following does return information:
# cat /proc/xen/balloon
Current allocation: 114688 kB
Requested target: 114688 kB
Low-mem balloon: 24576 kB
High-mem balloon: 0 kB
Driver pages: 136 kB
Xen hard limit: ??? kB
According to the information at the following URL:
The Underground Oracle VM Manual
Chapter 4: Oracle VM Server Sizing, Installation and Updates
http://itnewscast.com/chapter-4-oracle-vm-server-sizing-installation-and-updates
the following is mentioned:
Oracle VM 2.2 ships with Xen 3.4.0, which supports the experimental Xenballoond memory overcommit feature.
Is ballooning working in Oracle VM Server 2.2.1 (October 2010)?
Is it being done through some other process and not the (not-found) xenballoond?
Thanks,
AIM

AIM,
Greetings. Oracle VM 2.2/Xen 3.4.0 ships with the experimental Xenballoond memory overcommit feature, but it is not enabled by default and is not supported by Oracle.
You can enable and test the experimental Xenballoond memory overcommit feature by following http://xenbits.xensource.com/xen-3.3-testing.hg?file/536475e491cc/tools/xenballoon/xenballoond.README
Respectfully,
Roddy
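(As a side note: even without xenballoond running, the balloon driver itself can be driven by hand from dom0 with the xm toolstack. This is only a rough sketch; the domain name "myguest" and the 96 MB target are placeholders.)
# From dom0: list domains and their current memory allocations
xm list

# Ask the balloon driver in guest "myguest" to shrink that domain to 96 MB
xm mem-set myguest 96

# Inside the guest, watch /proc/xen/balloon track the new target
watch -n1 cat /proc/xen/balloon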

Similar Messages

  • Out of Memory (OOM) - Kernel Freezing

    Linux BLOZUP 3.17.6-1-ARCH #1 SMP PREEMPT Sun Dec 7 23:43:32 UTC 2014 x86_64 GNU/Linux
    240GB SSD, encrypted with dm-crypt
    4GB RAM
    No swap
    XFCE4 Desktop
    Every once in a while it appears that my kernel freezes. Mouse movement grinds to a halt, but occasionally partially responds after a 5-50-500s delay. Keypresses are just as slow. When I am able to switch processes/see conky, it's out of memory. I have no swap, and would like to keep it that way as it's an SSD-only system. When it does freeze, my BIOS network and disk LEDs are solid for some reason.
    I recently swapped the mechanical disk out for an SSD and encrypted it, and I don't recall it happening before that (but am not 100% sure). It's happened about 4 times total now, in the last 3 months. It also might be related to Chromium, as it has been open every time it happens. It's usually after I leave the computer alone for a few hours and come back: the CPU fan is at 100% and it's unresponsive. I've been able to kill off processes manually, but it takes forever to get to that point, so I usually do a hard reboot.
    Normally my computer uses about 40% of its RAM. This is with 4-6 tabs open in Chromium, Geany and/or Eclipse, and a couple of terminals, plus all the background services.
    I'd like to set up my system so that if this happens again I'll have some logs of what's taking up so much CPU/memory. Any suggestions? Something like getting conky or top to write out to a file or the system log every 10 minutes or so, rotating as necessary?
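    One way to capture those snapshots is a simple logging loop; this is only a sketch (the log path, interval, and number of processes shown are arbitrary choices), run in the background or wrapped in a cron/systemd unit:
        #!/bin/bash
        # Append a memory/CPU snapshot to a log every 10 minutes.
        LOG=/var/log/mem-snapshots.log     # arbitrary path
        while true; do
            {
                date
                free -m
                # header plus the ten biggest processes by resident memory
                ps -eo pid,comm,%cpu,%mem,rss --sort=-rss | head -n 11
                echo "----"
            } >> "$LOG"
            sleep 600
        done
    logrotate can then handle the rotation you mention.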

    Vain wrote:
    finale wrote: If you run without swap the kernel will refuse to "overcommit" memory.
    Could you elaborate on this? Can you give some example scenarios?
    (Afaik, Linux refuses requests that exceed the available memory. So, if you’ve got 4GB RAM + 0 byte swap, a malloc() of 5GB will fail. But if you’ve got 4GB RAM + 2GB swap, then that malloc() will succeed. You could call this “overcommit” since the program requested more memory than you’ve got RAM. Okay. A request of 10GB will still fail, though. On the other hand, nothing stops you from running two processes, each requesting 3GB of RAM without actually using them on a machine with 4GB RAM + 0 byte swap. That’s what I know as “memory overcommit”. Anyway, this might be nitpicking. I’m interested in this topic and I’d love to hear more.)
    I'm not very knowledgeable about this subject, but some time ago I had the same problem and had to create a swap file. I suspect you know more about it than I do.
    My "explanation" was actually wrong -- by default the kernel will overcommit. Like you said, if programs actually use enough of the memory it will start the OOM killer. (Sorry for the confusion. I'm not usually that careless, but my excuse is that I was interrupted while writing the post .)
    However, based on my own experience and from looking around the internet, it seems common for the system to lock up at this point if there is no swap.

  • Mapping set heap sizes to used memory

    Hi all,
    I've got a question about the parameters used to control your java process' heap sizes: "-Xms128m -Xmx256m" etc.
    Let's say I set my min and max to 2Gb, just for a simplistic example.
    If I then look at the linux server my process is running on, I may see a top screen like so:
    PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
    10647 javaprog 20   0 2180m 1.9g  18m S  1.3  3.7   1:57.02 java
    What I'm trying to understand is what relationship - if any - there is between these arguments and the figures I see within top. One thing in particular that I'm interested in is the fact that I occasionally see a RES (or more commonly a VIRT) size higher than the maximum that I have provided to Java. Naively I would assume that therefore there isn't a relationship between the two... but I wouldn't mind someone clarifying this for me.
    Any resources on the matter would be appreciated, and I apologise if this question is outside the realms of this particular subforum.
    Dave.

    Peter Lawrey wrote: It will always reserve this much virtual memory, plus overhead. In terms of resident memory, even the minimum is not guaranteed to be used. The minimum specifies at what point it will make little effort to recycle memory, i.e. it grows to the minimum size freely, but a "Hello World" program still won't use the minimum size.
    user5287726 wrote: No, Linux does not reserve virtual memory. Just Google "Linux memory overcommit". Out-of-the-box, every Linux distro I'm aware of will just keep returning virtual memory to processes until things fall apart and the kernel starts killing processes that are using lots of memory - like your database server, web server, or application-critical JVMs. You know - the very processes you built and deployed the machine to run. Just Google "Linux OOM killer".
    Peter Lawrey wrote: That's not the behaviour I see. When I start a process which busy waits but doesn't create any objects, the virtual memory size used is based on the -mx option, not how much is used. Given virtual memory is largely free, why would an OS only give virtual memory on an as-needed basis?
    Busy looping process which does nothing.
    In each case the resident size is 16m
    option       virtual size
    -mx100m      368m = 100m + 268m
    -mx250m      517m = 250m + 267m
    -mx500m      769m = 500m + 269m
    -mx1g        1294m = 1024m + 270m
    -mx2g        2321m = 2048m + 273m
    To me it appears that the maximum size you ask for is immediately added to the virtual memory size, even if it's not used (plus an overhead), i.e. the resident size is only 16m.
    Yes, it's only using 16m. And its virtual size may very well be what you see. But that doesn't mean the OS actually has enough RAM + swap to hold what it tells all running processes they can have.
    How much RAM + swap does your machine have? Say it's 4 GB. You can probably run 10 or 20 JVMs simultaneously with the "-mx2g" option. Imagine what happens, though, if they actually try and use that memory - that the OS said they could have, but which doesn't all exist.
    What happens?
    The OOM killer fires up and starts killing processes. Which ones? Gee, it's a "standard election procedure". Which on a server that's actually doing something tend to be the processes actually doing something, like your DBMS or web server or JVM. Or maybe it's your backups that get whacked because they're "newly started" and got promised access to memory that doesn't exist.
    Memory overcommit on a server with availability and reliability requirements more stringent than risible is indefensible.
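    The figures in the quoted table can be reproduced with standard tools; a sketch, assuming any long-running Java class (the class name below is a placeholder):
        # Start an idle JVM with a 256 MB heap cap (SomeIdleMain is a placeholder)
        java -Xms256m -Xmx256m SomeIdleMain &
        PID=$!

        # VSZ is the virtual size and RSS the resident size, both in kB
        ps -o pid,vsz,rss,comm -p "$PID"

        # or watch them change over time
        watch -n5 "ps -o pid,vsz,rss,comm -p $PID"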

  • WLC 5508 running 7.4.110.0 unable to tftp upload config from controller

    Hi,
    Two WLC 5508 running identical code versions. One is the 50-license Primary, the second is HA. Identical config on both. The HA WLC can upload its config to the TFTP or FTP server, but the Primary cannot. The operation fails from both the CLI and the GUI and for different protocols, i.e. TFTP and FTP.
    #### Primary Controller
    (Cisco Controller) >show sysinfo
    Manufacturer's Name.............................. Cisco Systems Inc.
    Product Name..................................... Cisco Controller
    Product Version.................................. 7.4.110.0
    Bootloader Version............................... 1.0.20
    Field Recovery Image Version..................... 7.6.95.16
    Firmware Version................................. FPGA 1.7, Env 1.8, USB console 2.2
    Build Type....................................... DATA + WPS
    System Name...................................... PRODWC7309
    System Location..................................
    System Contact...................................
    System ObjectID.................................. 1.3.6.1.4.1.9.1.1069
    Redundancy Mode.................................. Disabled
    IP Address....................................... 10.1.30.210
    Last Reset....................................... Power on reset
    System Up Time................................... 18 days 18 hrs 51 mins 35 secs
    System Timezone Location......................... (GMT+10:00) Sydney, Melbourne, Canberra
    System Stats Realtime Interval................... 5
    System Stats Normal Interval..................... 180
    Configured Country............................... AU - Australia
    Operating Environment............................ Commercial (0 to 40 C)
    --More-- or (q)uit
    Internal Temp Alarm Limits....................... 0 to 65 C
    Internal Temperature............................. +34 C
    External Temperature............................. +17 C
    Fan Status....................................... OK
    State of 802.11b Network......................... Enabled
    State of 802.11a Network......................... Enabled
    Number of WLANs.................................. 8
    Number of Active Clients......................... 138
    Memory Current Usage............................. Unknown
    Memory Average Usage............................. Unknown
    CPU Current Usage................................ Unknown
    CPU Average Usage................................ Unknown
    Burned-in MAC Address............................ 3C:08:F6:CA:52:20
    Power Supply 1................................... Present, OK
    Power Supply 2................................... Present, OK
    Maximum number of APs supported.................. 50
    (Cisco Controller) >debug transfer trace enable
    (Cisco Controller) >transfer upload start
    Mode............................................. TFTP
    TFTP Server IP................................... 10.1.22.2
    TFTP Path........................................ /
    TFTP Filename.................................... PRODWC7309-tmp.cfg
    Data Type........................................ Config File
    Encryption....................................... Disabled
    *** WARNING: Config File Encryption Disabled ***
    Are you sure you want to start? (y/N) Y
    *TransferTask: Jun 02 10:41:15.183: Memory overcommit policy changed from 0 to 1
    *TransferTask: Jun 02 10:41:15.183: RESULT_STRING: TFTP Config transfer starting.
    TFTP Config transfer starting.
    *TransferTask: Jun 02 10:41:15.183: RESULT_CODE:1
    *TransferTask: Jun 02 10:41:24.309: Locking tftp semaphore, pHost=10.1.22.2 pFilename=/PRODWC7309-tmp.cfg
    *TransferTask: Jun 02 10:41:24.393: Semaphore locked, now unlocking, pHost=10.1.22.2 pFilename=/PRODWC7309-tmp.cfg
    *TransferTask: Jun 02 10:41:24.393: Semaphore successfully unlocked, pHost=10.1.22.2 pFilename=/PRODWC7309-tmp.cfg
    *TransferTask: Jun 02 10:41:24.394: tftp rc=-1, pHost=10.1.22.2 pFilename=/PRODWC7309-tmp.cfg
    pLocalFilename=/mnt/application/xml/clis/clifile
    *TransferTask: Jun 02 10:41:24.394: RESULT_STRING: % Error: Config file transfer failed - Unknown error - refer to log
    *TransferTask: Jun 02 10:41:24.394: RESULT_CODE:12
    *TransferTask: Jun 02 10:41:24.394: Memory overcommit policy restored from 1 to 0
    % Error: Config file transfer failed - Unknown error - refer to log
    (Cisco Controller) >show logging
    *TransferTask: Jun 02 10:41:24.393: #UPDATE-3-FILE_OPEN_FAIL: updcode.c:4579 Failed to open file /mnt/application/xml/clis/clifile.
    *sshpmReceiveTask: Jun 02 10:41:24.315: #OSAPI-3-MUTEX_FREE_INFO: osapi_sem.c:1087 Sema 0x2b32def8 time=142 ulk=1621944 lk=1621802 Locker(sshpmReceiveTask sshpmrecv.c:1662 pc=0x10b07938) unLocker(sshpmReceiveTask sshpmReceiveTaskEntry:1647 pc=0x10b07938)
    -Traceback: 0x10af9500 0x1072517c 0x10b07938 0x12020250 0x12080bfc
    *TransferTask: Jun 02 10:39:01.789: #UPDATE-3-FILE_OPEN_FAIL: updcode.c:4579 Failed to open file /mnt/application/xml/clis/clifile.
    *sshpmReceiveTask: Jun 02 10:39:01.713: #OSAPI-3-MUTEX_FREE_INFO: osapi_sem.c:1087 Sema 0x2b32def8 time=5598 ulk=1621801 lk=1616203 Locker(sshpmReceiveTask sshpmrecv.c:1662 pc=0x10b07938) unLocker(sshpmReceiveTask sshpmReceiveTaskEntry:1647 pc=0x10b07938)
    -Traceback: 0x10af9500 0x1072517c 0x10b07938 0x12020250 0x12080bfc
    #### HA Controller
    (Cisco Controller) >show sysinfo
    Manufacturer's Name.............................. Cisco Systems Inc.
    Product Name..................................... Cisco Controller
    Product Version.................................. 7.4.110.0
    Bootloader Version............................... 1.0.20
    Field Recovery Image Version..................... 7.6.95.16
    Firmware Version................................. FPGA 1.7, Env 1.8, USB console 2.2
    Build Type....................................... DATA + WPS
    System Name...................................... PRODWC7310
    System Location..................................
    System Contact...................................
    System ObjectID.................................. 1.3.6.1.4.1.9.1.1069
    Redundancy Mode.................................. Disabled
    IP Address....................................... 10.1.31.210
    Last Reset....................................... Software reset
    System Up Time................................... 18 days 19 hrs 1 mins 27 secs
    System Timezone Location......................... (GMT+10:00) Sydney, Melbourne, Canberra
    System Stats Realtime Interval................... 5
    System Stats Normal Interval..................... 180
    Configured Country............................... AU - Australia
    Operating Environment............................ Commercial (0 to 40 C)
    --More-- or (q)uit
    Internal Temp Alarm Limits....................... 0 to 65 C
    Internal Temperature............................. +34 C
    External Temperature............................. +17 C
    Fan Status....................................... OK
    State of 802.11b Network......................... Enabled
    State of 802.11a Network......................... Enabled
    Number of WLANs.................................. 4
    Number of Active Clients......................... 0
    Memory Current Usage............................. Unknown
    Memory Average Usage............................. Unknown
    CPU Current Usage................................ Unknown
    CPU Average Usage................................ Unknown
    Burned-in MAC Address............................ 3C:08:F6:CA:53:C0
    Power Supply 1................................... Present, OK
    Power Supply 2................................... Present, OK
    Maximum number of APs supported.................. 500
    (Cisco Controller) >debug transfer trace enable
    (Cisco Controller) >transfer upload start
    Mode............................................. FTP
    FTP Server IP.................................... 10.1.22.2
    FTP Server Port.................................. 21
    FTP Path......................................... /
    FTP Filename..................................... 10_1_31_210_140602_1050.cfg
    FTP Username..................................... ftpuser
    FTP Password..................................... *********
    Data Type........................................ Config File
    Encryption....................................... Disabled
    *** WARNING: Config File Encryption Disabled ***
    Are you sure you want to start? (y/N) y
    *TransferTask: Jun 02 10:51:31.278: Memory overcommit policy changed from 0 to 1
    *TransferTask: Jun 02 10:51:31.278: RESULT_STRING: FTP Config transfer starting.
    FTP Config transfer starting.
    *TransferTask: Jun 02 10:51:31.278: RESULT_CODE:1
    *TransferTask: Jun 02 10:52:05.468: ftp operation returns 0
    *TransferTask: Jun 02 10:52:05.477: RESULT_STRING: File transfer operation completed successfully.
    *TransferTask: Jun 02 10:52:05.477: RESULT_CODE:11
    File transfer operation completed successfully.
    Not upgrading to 7.4.121.0 because of bug CSCuo63103. Have not restarted the controller yet.
    Anyone else had this issue? Is there a workaround?
    Thanks,
    Rick.

    Thanks Stephen. In my deployments of the 7.4.110.0 version I have not seen this issue, so maybe a controller reboot will fix it (we do have HA to minimize the impact). I will keep the thread updated with findings, and may request the special release 7.4.121.0 from TAC if we are still not happy with 7.4.110.0.
    Rick.

  • Installing wildcard certificate in a WLC (ver 7.0.240 and 7.5.102)

    Is it possible to install a wildcard certificate for web auth in those versions?
    Is there any difference between these two versions?
    Do both of these versions support wildcard certificates?
    Here is the log file resulting from installing the wildcard certificate on the WLC with v7.0.240.
    *TransferTask: Nov 28 11:20:51.117: Memory overcommit policy changed from 0 to 1
    *TransferTask: Nov 28 11:20:51.319: Delete ramdisk for ap bunble
    *TransferTask: Nov 28 11:20:51.432: RESULT_STRING: TFTP Webauth cert transfer starting.
    *TransferTask: Nov 28 11:20:51.432: RESULT_CODE:1
    *TransferTask: Nov 28 11:20:55.434: Locking tftp semaphore, pHost=10.16.50.63 pFilename=/wild2013_priv.pem
    *TransferTask: Nov 28 11:20:55.516: Semaphore locked, now unlocking, pHost=10.16.50.63 pFilename=/wild2013_priv.pem
    *TransferTask: Nov 28 11:20:55.516: Semaphore successfully unlocked, pHost=10.16.50.63 pFilename=/wild2013_priv.pem
    *TransferTask: Nov 28 11:20:55.517: TFTP: Binding to local=0.0.0.0 remote=10.16.50.63
    *TransferTask: Nov 28 11:20:55.588: TFP End: 1666 bytes transferred (0 retransmitted packets)
    *TransferTask: Nov 28 11:20:55.589: tftp rc=0, pHost=10.16.50.63 pFilename=/wild2013_priv.pem
         pLocalFilename=cert.p12
    *TransferTask: Nov 28 11:20:55.589: RESULT_STRING: TFTP receive complete... Installing Certificate.
    *TransferTask: Nov 28 11:20:55.589: RESULT_CODE:13
    *TransferTask: Nov 28 11:20:59.590: Adding cert (5 bytes) with certificate key password.
    *TransferTask: Nov 28 11:20:59.590: RESULT_STRING: Error installing certificate.
    *TransferTask: Nov 28 11:20:59.591: RESULT_CODE:12
    *TransferTask: Nov 28 11:20:59.591: ummounting: <umount /mnt/download/ >/dev/null 2>&1>  cwd  = /mnt/application
    *TransferTask: Nov 28 11:20:59.624: finished umounting
    *TransferTask: Nov 28 11:20:59.903: Create ramdisk for ap bunble
    *TransferTask: Nov 28 11:20:59.904: start to create c1240 primary image
    *TransferTask: Nov 28 11:21:01.322: start to create c1240 backup image
    *TransferTask: Nov 28 11:21:02.750: Success to create the c1240 image
    *TransferTask: Nov 28 11:21:02.933: Memory overcommit policy restored from 1 to 0
    (Cisco Controller) >
    Would I have the same results on a WLC with v7.5.102?
    Thank you.

    Hi Pdero,
    Please check out these docs:
    https://supportforums.cisco.com/thread/2052662
    http://netboyers.wordpress.com/2012/03/06/wildcard-certs-for-wlc/
    https://supportforums.cisco.com/thread/2067781
    https://supportforums.cisco.com/thread/2024363
    https://supportforums.cisco.com/community/netpro/wireless-mobility/security-network-management/blog/2011/11/26/generate-csr-for-third-party-cert-and-download-unchained-cert-on-wireless-lan-controller-wlc
    Regards
    Don't forget to rate helpful posts.
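    For reference, the chained-certificate preparation those documents describe boils down to a few openssl steps. This is only a rough sketch: the file names and the password are placeholders, and the wildcard certificate must still meet the controller's format requirements.
        # Concatenate the device/wildcard cert, intermediate and root into one file
        cat wildcard-cert.pem intermediate.pem root.pem > all-certs.pem

        # Bundle the chain with the private key into PKCS#12, then convert back
        # to PEM for download to the controller
        openssl pkcs12 -export -in all-certs.pem -inkey private-key.pem \
            -out all-certs.p12 -clcerts -passin pass:check123 -passout pass:check123
        openssl pkcs12 -in all-certs.p12 -out final-cert.pem \
            -passin pass:check123 -passout pass:check123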

  • Error when installing webauth certificate virtual wireless LAN controller

    Hi there
    I am having issues installing a web auth certificate for our virtual wireless LAN controller.
    I am issuing a certificate from our own PKI in the following format:
    device cert for WLC > Intermediate > our root cert.
    I have followed the discussion here:
    https://supportforums.cisco.com/discussion/10890871/generating-csr-wlc-5508
    and the document here:
    http://www.cisco.com/c/en/us/support/docs/wireless/4400-series-wireless-lan-controllers/109597-csr-chained-certificates-wlc-00.html#support
    However, I am still getting the following errors:
    *sshpmLscTask: Jun 30 17:18:26.443: sshpmLscTask: LSC Task received a message 4 
    *TransferTask: Jun 30 17:18:28.785: Memory overcommit policy changed from 0 to 1
    *TransferTask: Jun 30 17:18:28.785: RESULT_STRING: FTP Webauth cert transfer starting.
    *TransferTask: Jun 30 17:18:28.785: RESULT_CODE:1
    FTP Webauth cert transfer starting.
    *TransferTask: Jun 30 17:18:33.154: ftp operation returns 0
    *TransferTask: Jun 30 17:18:33.154: RESULT_STRING: FTP receive complete... Installing Certificate.
    FTP receive complete... Installing Certificate.
    *TransferTask: Jun 30 17:18:33.154: RESULT_CODE:13
    *TransferTask: Jun 30 17:18:37.159: Adding cert (8217 bytes) with certificate key password.
    *TransferTask: Jun 30 17:18:37.169: sshpmCheckWebauthCert: Verification return code: 1
    *TransferTask: Jun 30 17:18:37.169: Verification result text: ok
    *TransferTask: Jun 30 17:18:37.171: sshpmAddWebauthCert: Extracting private key from webauth cert and using bundled pkcs12 password.
    *TransferTask: Jun 30 17:18:37.361: sshpmDecodePrivateKey: calling ssh_skb_decode()...
    *TransferTask: Jun 30 17:18:37.493: sshpmDecodePrivateKey: SshPrivateKeyPtr after skb_decode: 0x2aaaacb51628
    *TransferTask: Jun 30 17:18:37.493: sshpmAddWebauthCert: got private key; extracting certificate...
    *TransferTask: Jun 30 17:18:37.494: sshpmAddWebauthCert: extracted binary cert; doing x509 decode
    *TransferTask: Jun 30 17:18:37.494: sshpmAddWebauthCert: doing x509 decode for 1594 byte certificate...
    *TransferTask: Jun 30 17:18:37.494: sshpmAddWebauthCert: failed to validate certificate...
    *TransferTask: Jun 30 17:18:37.494: RESULT_STRING: Error installing certificate.
    *TransferTask: Jun 30 17:18:37.495: RESULT_CODE:12
    *TransferTask: Jun 30 17:18:37.495: Memory overcommit policy restored from 1 to 0
    Error installing certificate.
    Any help is much appreciated

    Similar issue:
    https://supportforums.cisco.com/discussion/11043836/wism-42112-and-web-auth-certificate
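    Since the controller only reports "failed to validate certificate", it can help to check the chain locally before uploading; a sketch with placeholder file names:
        # Check that the device cert verifies against the intermediate + root
        cat intermediate.pem root.pem > ca-chain.pem
        openssl verify -CAfile ca-chain.pem device-cert.pem

        # Inspect what the controller will try to decode
        openssl x509 -in device-cert.pem -noout -subject -issuer -dates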

  • Pdflush hanging disks when disk activity is high

    Hi everyone! my "problem" is that when im copying or moving larga amounts of data between my partitions/disks my the system response is right, but the apps who show or access those disks hangs for a few seconds including the copy/moving itself, theres a lot of pdflush process that's what conky and htop show me when those little hangs appear... dont know if it has something to do, but the disk is a sata drive inside a usb case formatted as ext3... copying from that disk from the same disk or any other inside mi box (1 80gb IDE drive) cause those pdflush process to appear...
    1. is this normal?
    2. what is pdflush?
    3. what does it do exactly?
    thanks!!!
    EDIT: forgot to mention... CPU activity during those mini hangs is low, 5%-10% or less CPU usage, on an Athlon XP 2600+ CPU... and the pdflush processes are using 0% CPU... there are 5 or more of those pdflush processes...
    Last edited by leo2501 (2007-08-27 10:44:24)

    Auto SOLVED for anybody interested...
    http://www.westnet.com/~gsmith/content/ … dflush.htm
    The Linux Page Cache and pdflush:
    Theory of Operation and Tuning for Write-Heavy Loads
    As you write out data ultimately intended for disk, Linux caches this information in an area of memory called the page cache. You can find out basic info about the page cache using tools like free, vmstat or top. See http://gentoo-wiki.com/FAQ_Linux_Memory_Management to learn how to interpret top's memory information, or atop to get an improved version.
    Full information about the page cache only shows up by looking at /proc/meminfo. Here is a sample from a system with 4GB of RAM:
    MemTotal:      3950112 kB
    MemFree:        622560 kB
    Buffers:         78048 kB
    Cached:        2901484 kB
    SwapCached:          0 kB
    Active:        3108012 kB
    Inactive:        55296 kB
    HighTotal:           0 kB
    HighFree:            0 kB
    LowTotal:      3950112 kB
    LowFree:        622560 kB
    SwapTotal:     4198272 kB
    SwapFree:      4198244 kB
    Dirty:             416 kB
    Writeback:           0 kB
    Mapped:         999852 kB
    Slab:            57104 kB
    Committed_AS:  3340368 kB
    PageTables:       6672 kB
    VmallocTotal: 536870911 kB
    VmallocUsed:     35300 kB
    VmallocChunk: 536835611 kB
    HugePages_Total:     0
    HugePages_Free:      0
    Hugepagesize:     2048 kB
    The size of the page cache itself is the "Cached" figure here, in this example it's 2.9GB. As pages are written, the size of the "Dirty" section will increase. Once writes to disk have begun, you'll see the "Writeback" figure go up until the write is finished. It can be very hard to actually catch the Writeback value going high, as its value is very transient and only increases during the brief period when I/O is queued but not yet written.
    Linux usually writes data out of the page cache using a process called pdflush. At any moment, between 2 and 8 pdflush threads are running on the system. You can monitor how many are active by looking at /proc/sys/vm/nr_pdflush_threads. Whenever all existing pdflush threads are busy for at least one second, an additional pdflush daemon is spawned. The new ones try to write back data to device queues that are not congested, aiming to have each device that's active get its own thread flushing data to that device. Each time a second has passed without any pdflush activity, one of the threads is removed. There are tunables for adjusting the minimum and maximum number of pdflush processes, but it's very rare they need to be adjusted.
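    A quick way to see these figures move on the 2.6-era kernels described here is to watch /proc/meminfo and the pdflush thread count while a large copy runs, for example:
        # Watch the write-back figures and the pdflush thread count update live
        watch -n1 'grep -E "^(Dirty|Writeback|Cached):" /proc/meminfo; cat /proc/sys/vm/nr_pdflush_threads'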
    pdflush tunables
    Exactly what each pdflush thread does is controlled by a series of parameters in /proc/sys/vm:
    /proc/sys/vm/dirty_writeback_centisecs (default 500): In hundredths of a second, this is how often pdflush wakes up to write data to disk. The default wakes up the two (or more) active threads twice a second.
    There can be undocumented behavior that thwarts attempts to decrease dirty_writeback_centisecs in an attempt to make pdflush more aggressive. For example, in early 2.6 kernels, the Linux mm/page-writeback.c code includes logic that's described as "if a writeback event takes longer than a dirty_writeback_centisecs interval, then leave a one-second gap". In general, this "congestion" logic in the kernel is documented only by the kernel source itself, and how it operates can vary considerably depending on which kernel you are running. Because of all this, it's unlikely you'll gain much benefit from lowering the writeback time; the thread spawning code assures that they will automatically run themselves as often as is practical to try and meet the other requirements.
    The first thing pdflush works on is writing pages that have been dirty for longer than it deems acceptable. This is controlled by:
    /proc/sys/vm/dirty_expire_centisecs (default 3000): In hundredths of a second, how long data can be in the page cache before it's considered expired and must be written at the next opportunity. Note that this default is very long: a full 30 seconds. That means that under normal circumstances, unless you write enough to trigger the other pdflush method, Linux won't actually commit anything you write until 30 seconds later.
    The second thing pdflush will work on is writing pages if memory is low. This is controlled by:
    /proc/sys/vm/dirty_background_ratio (default 10): Maximum percentage of active memory that can be filled with dirty pages before pdflush begins to write them.
    Note that some kernel versions may internally put a lower bound on this value at 5%.
    Most of the documentation you'll find about this parameter suggests it's in terms of total memory, but a look at the source code shows this isn't true. In terms of the meminfo output, the code actually looks at
    MemFree + Cached - Mapped
    So on the system above, where this figure gives 2.5GB, with the default of 10% the system actually begins writing when the total for Dirty pages is slightly less than 250MB--not the 400MB you'd expect based on the total memory figure.
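    That threshold can be approximated directly from /proc/meminfo; a sketch of the arithmetic just described (the formula applies to the older 2.6-era kernels this article covers):
        # (MemFree + Cached - Mapped) * dirty_background_ratio / 100, reported in MB
        awk '/^MemFree:/  {free=$2}
             /^Cached:/   {cached=$2}
             /^Mapped:/   {mapped=$2}
             END {
                 ratio = 10                      # current dirty_background_ratio
                 avail = free + cached - mapped  # kB
                 printf "background writeback starts near %d MB of dirty data\n", avail * ratio / 100 / 1024
             }' /proc/meminfo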
    Summary: when does pdflush write?
    In the default configuration, then, data written to disk will sit in memory until either a) they're more than 30 seconds old, or b) the dirty pages have consumed more than 10% of the active, working memory. If you are writing heavily, once you reach the dirty_background_ratio driven figure worth of dirty memory, you may find that all your writes are driven by that limit. It's fairly easy to get in a situation where pages are always being written out by that mechanism well before they are considered expired by the dirty_expire_centiseconds mechanism.
    Other than laptop_mode, which changes several parameters to optimize for keeping the hard drive spinning as infrequently as possible (see http://www.samwel.tk/laptop_mode/ for more information) those are all the important kernel tunables that control the pdflush threads.
    Process page writes
    There is another parameter involved though that can spill over into management of user processes:
    /proc/sys/vm/dirty_ratio (default 40): Maximum percentage of total memory that can be filled with dirty pages before processes are forced to write dirty buffers themselves during their time slice instead of being allowed to do more writes.
    Note that all processes are blocked for writes when this happens, not just the one that filled the write buffers. This can cause what is perceived as an unfair behavior where one "write-hog" process can block all I/O on the system. The classic way to trigger this behavior is to execute a script that does "dd if=/dev/zero of=hog" and watch what happens. See Kernel Korner: I/O Schedulers for examples showing this behavior.
    Tuning Recommendations for write-heavy operations
    The usual issue that people who are writing heavily encounter is that Linux buffers too much information at once, in its attempt to improve efficiency. This is particularly troublesome for operations that require synchronizing the filesystem using system calls like fsync. If there is a lot of data in the buffer cache when this call is made, the system can freeze for quite some time to process the sync.
    Another common issue is that because so much must be written before any physical writes start, the I/O appears more bursty than would seem optimal. You'll have long periods where no physical writes happen at all, as the large page cache is filled, followed by writes at the highest speed the device can achieve once one of the pdflush triggers is tripped.
    dirty_background_ratio: Primary tunable to adjust, probably downward. If your goal is to reduce the amount of data Linux keeps cached in memory, so that it writes it more consistently to the disk rather than in a batch, lowering dirty_background_ratio is the most effective way to do that. It is more likely the default is too large in situations where the system has large amounts of memory and/or slow physical I/O.
    dirty_ratio: Secondary tunable to adjust only for some workloads. Applications that can cope with their writes being blocked altogether might benefit from substantially lowering this value. See "Warnings" below before adjusting.
    dirty_expire_centisecs: Test lowering, but not to extremely low levels. Reducing how long pages sit dirty in memory can be accomplished here, but this will considerably slow average I/O speed because of how much less efficient this is. This is particularly true on systems with slow physical I/O to disk. Because of the way the dirty page writing mechanism works, trying to lower this value to be very quick (less than a few seconds) is unlikely to work well. Constantly trying to write dirty pages out will just trigger the I/O congestion code more frequently.
    dirty_writeback_centisecs: Leave alone. The timing of pdflush threads set by this parameter is so complicated by rules in the kernel code for things like write congestion that adjusting this tunable is unlikely to cause any real effect. It's generally advisable to keep it at the default so that this internal timing tuning matches the frequency at which pdflush runs.
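    Applied with sysctl, the recommendations above might look like the following; the values are only illustrative and should be tested against the actual workload and kernel version:
        # Illustrative values only -- tune per workload and kernel version
        sysctl -w vm.dirty_background_ratio=5
        sysctl -w vm.dirty_ratio=20
        sysctl -w vm.dirty_expire_centisecs=1500

        # To persist across reboots, put the same keys in /etc/sysctl.conf:
        #   vm.dirty_background_ratio = 5
        #   vm.dirty_ratio = 20
        #   vm.dirty_expire_centisecs = 1500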
    Swapping
    By default, Linux will aggressively swap processes out of physical memory onto disk in order to keep the disk cache as large as possible. This means that pages that haven't been used recently will be pushed into swap long before the system even comes close to running out of memory, which is an unexpected behavior compared to some operating systems. The /proc/sys/vm/swappiness parameter controls how aggressive Linux is in this area.
    As good a description as you'll find of the numeric details of this setting is in section 4.15 of http://people.redhat.com/nhorman/papers/rhel4_vm.pdf It's based on a combination of how much of memory is mapped (that total is in /proc/meminfo) as well as how difficult it has been for the virtual memory manager to find pages to use.
    A value of 0 will avoid ever swapping out just for caching space. Using 100 will always favor making the disk cache bigger. Most distributions set this value to be 60, tuned toward moderately aggressive swapping to increase disk cache.
    The optimal setting here is very dependent on workload. In general, high values maximize throughput: how much work your system gets done during a unit of time. Low values favor latency: getting a quick response time from applications. Some desktop users so favor low latency that they set swappiness to 0, so that user applications are never swapped to disk (as can happen when the system is executing background tasks while the user is away). That's perfectly reasonable if the amount of memory in the system exceeds the usual working set for the applications used. Servers that are very active and usually throughput bound could justify setting it to 100. On the flip side, a desktop system that is so limited in memory that every active byte helps might also prefer a setting of 100.
    Since the size of the disk cache directly determines things like how much dirty data Linux will allow in memory, adjusting swappiness can greatly influence that behavior even though it's not directly tied to that.
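    Checking and adjusting swappiness follows the same pattern; the value below is only an example of a latency-leaning desktop setting:
        # Current setting (most distributions ship 60)
        cat /proc/sys/vm/swappiness

        # Example: lean toward latency on a desktop
        sysctl -w vm.swappiness=10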
    Warnings
    -There is a currently outstanding Linux kernel bug that is rare and difficult to trigger even intentionally on most kernel versions. However, it is easier to encounter when reducing dirty_ratio setting below its default. An introduction to the issue starts at http://lkml.org/lkml/2006/12/28/171 and comments about it not being specific to the current kernel release are at http://lkml.org/lkml/2006/12/28/131
    -The standard Linux memory allocation behavior uses an "overcommit" setting that allows processes to allocate more memory than is actually available were they to all ask for their pages at once. This is aimed at increasing the amount of memory available for the page cache, but can be dangerous for some types of applications. See http://www.linuxinsight.com/proc_sys_vm … emory.html for a note on the settings you can adjust. An example of an application that can have issues when overcommit is turned on is PostgreSQL; see "Linux Memory Overcommit" at http://www.postgresql.org/docs/current/ … urces.html for their warnings on this subject.
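    The overcommit behavior mentioned in that warning is controlled by two sysctls; a quick sketch of inspecting and tightening them (mode 2 should be tested carefully, along the lines the linked PostgreSQL notes discuss):
        # 0 = heuristic overcommit (default), 1 = always overcommit,
        # 2 = strict accounting against swap + overcommit_ratio% of RAM
        cat /proc/sys/vm/overcommit_memory
        cat /proc/sys/vm/overcommit_ratio

        # Example: switch to strict accounting (test carefully first)
        sysctl -w vm.overcommit_memory=2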
    References: page cache
    Neil Horman, "Understanding Virtual Memory in Red Hat Enterprise Linux 4" http://people.redhat.com/nhorman/papers/rhel4_vm.pdf
    Daniel P. Bovet and Marco Cesati, "Understanding the Linux Kernel, 3rd edition", chapter 15 "The Page Cache". Available on the web at http://www.linux-security.cn/ebooks/ulk3-html/
    Robert Love, "Linux Kernel Development, 2nd edition", chapter 15 "The Page Cache and Page Writeback"
    "Runtime Memory Management", http://tree.celinuxforum.org/CelfPubWik … easurement
    "Red Hat Enterprise Linux-Specific [Memory] Information", http://www.redhat.com/docs/manuals/ente … lspec.html
    "Tuning Swapiness", http://kerneltrap.org/node/3000
    "FAQ Linux Memory Management", http://gentoo-wiki.com/FAQ_Linux_Memory_Management
    From the Linux kernel tree:
        * Documentation/filesystems/proc.txt (the meminfo documentation there originally from http://lwn.net/Articles/28345/)
        * Documentation/sysctl/vm.txt
        * Mm/page-writeback.c
    References: I/O scheduling
    While not directly addressed here, the I/O scheduling algorithms in Linux actually handle the writes themselves, and some knowledge or tuning of them may be synergistic with adjusting the parameters here. Adjusting the scheduler only makes sense in the context where you've already configured the page cache flushing correctly for your workload.
    D. John Shakshober, "Choosing an I/O Scheduler for Red Hat Enterprise Linux 4 and the 2.6 Kernel" http://www.redhat.com/magazine/008jun05 … chedulers/
    Robert Love, "Kernel Korner: I/O Schedulers", http://www.linuxjournal.com/article/6931
    Seelam, Romero, and Teller, "Enhancements to Linux I/O Scheduling", http://linux.inet.hr/files/ols2005/seelam-reprint.pdf
    Heger, D., Pratt, S., "Workload Dependent Performance Evaluation of the Linux 2.6 I/O Schedulers", http://linux.inet.hr/files/ols2004/pratt-reprint.pdf
    Upcoming Linux work in progress
    -There is a patch in testing from SuSE that adds a parameter called dirty_ratio_centisecs to the kernel tuning which fine-tunes the write-throttling behavior. See "Patch: per-task predictive write throttling" at http://lwn.net/Articles/152277/ and Andrea Arcangeli's article (which has a useful commentary on the existing write throttling code) at http://www.lugroma.org/contenuti/eventi … rnel26.pdf
    -SuSE also has suggested a patch at http://lwn.net/Articles/216853/ that allows setting the dirty_ratio settings below the current useful range, aimed at systems with very large memory capacity. The commentary on this patch also has some helpful comments on improving dirty buffer writing, although it is fairly specific to ext3 filesystems.
    -The stock 2.6.22 Linux kernel has substantially reduced the default values for the dirty memory parameters. dirty_background_ratio defaulted to 10, now it defaults to 5. vm_dirty_ratio defaulted to 40, now it's 10
    -A recent lively discussion on the Linux kernel mailing list discusses some of the limitations of the fsync mechanism when using ext3.
    Copyright 2007 Gregory Smith. Last update 8/08/2007.

  • Memory ballooning creating sites

    I am creating 100 sites using PowerShell, something like this:
    Start-SPAssignment -Global
    $web = New-SPWeb $url -Name $Name -UseParentTopNav -Description $description -ea stop
    $web.ApplyWebTemplate($Id["SiteTemplate_1"])
    Stop-SPAssignment -Global
    I am seeing memory ballooning. I am running this on a test system at the moment, but I can't actually create 100 sites without getting system out of memory errors. The PowerShell process itself is using 6 GB by this point. I tried using advanced assignment like this:
    $spO = Start-SPAssignment
    $web = $spO | New-SPWeb $url -Name $Name -UseParentTopNav -Description $description -ea stop
    $web.ApplyWebTemplate($Id["SiteTemplate_1"])
    Stop-SPAssignment $spO
    It made little to no difference. If I don't use SPAssignment and dispose of the SPWeb object explicitly, I don't see the problem. So my questions are: does SPAssignment actually work as intended? Am I using it wrong? My samples are simplified from what I am actually doing, and to be honest, wrapping code blocks in SPAssignment -Global is easy but pointless if it doesn't work correctly...
    Thanks
    Darrell

    Just to be clear, this is PowerShell I am talking about...
    This TechNet article, http://technet.microsoft.com/en-us/library/ff607664(v=office.15).aspx, disagrees with you.
    Quoted here:
    The Start-SPAssignment cmdlet properly disposes of objects used with variable assignments.
    Large amounts of memory are often required when SPWeb, SPSite, or SPSiteAdministration objects are used. So the use of these objects, or lists of these objects, in Windows PowerShell scripts
    requires proper memory management. By default, all Get commands dispose of these objects immediately after the pipeline finishes, but by using SPAssignment, you can assign the list of objects to a variable and
    dispose of the objects after they are no longer needed. You can also ensure that the objects remain as long as you need them, even throughout multiple iterations of commands.
    Cheers
    Darrell

  • Linux memory ballooning problem

    Hi all, my customer has an MS Hyper-V 2012 (not R2) cluster for virtualization purposes. Having stumbled on dynamic memory allocation, I gave it a try, reading around about supported configurations etc. As far as I understand, for a 2012 environment the only concern is to set identical startup and maximum RAM values.
    Once the VM is booted, after a while the RAM amount starts to decrease; meanwhile, in the terminal, some "out of memory: kill process" messages appear and the amount returns to the original maximum value.
    Any idea or experience with that scenario? The RHEL release is 6.4, with the integration services embedded.
    Thanks in advance.

    . . . the udev one is really interesting.
    But use the rule by Nikolay Pushkarev (RHEL 6.5 / CentOS 6.5 and RHEL 6.4 / CentOS 6.4 include support for the Hyper-V drivers):
    ==
    Following rule works slightly faster for me (assuming that the memory0 bank is always in the online state):
    SUBSYSTEM=="memory", ACTION=="add", DEVPATH=="/devices/system/memory/memory[1-9]*", RUN+="/bin/cp /sys$devpath/../memory0/state /sys$devpath/state"
    And one more thing, the udev solution doesn't work on a 32-bit kernel architecture, only on 64-bit. Is this by design or yet another bug?
    ==
    P.S.
    See also the topic "Dynamic Memory on Linux VM".
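    For completeness, what that udev rule automates can also be done by hand inside the guest after a balloon-up, roughly like this (run as root; paths are the standard sysfs memory-hotplug interface):
        # Bring any hot-added memory banks online manually
        for state in /sys/devices/system/memory/memory*/state; do
            grep -q offline "$state" && echo online > "$state"
        done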

  • Extremely high memory usage after upgrading to Firefox 12

    After I upgraded to Firefox 12, I began frequently experiencing Firefox memory usage ballooning extremely high (2-3GB after a few minutes of light browsing). Sometimes it will drop back down to a more reasonable level (a few hundred MB), sometimes it hangs (presumably while trying to garbage collect everything), and sometimes it crashes. Usually the crashing thread cannot be determined, but when it can be, it is in the garbage collection code ( [https://crash-stats.mozilla.com/report/list?signature=js%3A%3Agc%3A%3AMarkChildren%28JSTracer*%2C+js%3A%3Atypes%3A%3ATypeObject*%29] ).
    I was able to capture an about:memory report when Firefox had gotten to about 1.5 GB and have attached an image.
    A couple of things I've tried. I have lots of tabs open (though the Don't load tabs until selected option is enabled), so I copied my profile, kept all my extensions enabled, but closed all my tabs. I then left a page open to http://news.google.com/ and it ran fine for several days, whereas my original profile crashes multiple times a day.
    I also tried disabling most of my extensions, leaving the following extensions that I refuse to browse without:
    Adblock Plus
    BetterPrivacy
    NoScript
    PasswordMaker
    Perspectives
    Priv3
    However, the problem still happened in that case.
    Don't know if any of this helps or not. I'm looking forward to trying Firefox 13 when it comes out.

    Hello, thanks for reporting back with detailed information.
    From a brief look at your extensions I don't recognize any known (to me at least) memory-leaking ones. In the last weeks there were also reports about the Java plugin causing high memory consumption in combination with Firefox 12 - in case you have it installed (in Firefox > Add-ons > Plugins), try disabling it for a few days and test how Firefox behaves with many tabs.
    Although probably not related to the memory problems, you could also update your graphics driver to get better results with hardware acceleration in Firefox - this is the latest driver by Intel for your model and OS:
    [http://downloadcenter.intel.com/Detail_Desc.aspx?agr=Y&DwnldID=21135&lang=eng&OSVersion=Windows%207%20%2864-bit%29*&DownloadType=Drivers]

  • Massive memory hemorrhage; heap size to go from about 64mb, to 1.3gb usage

    **[SOLVED]**
    Note: I posted this on stackoverflow as well, but a solution was not found.
    Here's the problem:
    [1] http://i.stack.imgur.com/sqqtS.png
    As you can see, the memory usage balloons out of control! I've had to add arguments to the JVM to increase the heap size just to avoid out-of-memory errors while I figure out what's going on. Not good!
    ##Basic Application Summary (for context)
    This application is (eventually) going to be used for basic on screen CV and template matching type things for automation purposes. I want to achieve as high of a frame rate as possible for watching the screen, and handle all of the processing via a series of separate consumer threads.
    I quickly found out that the stock Robot class is really terrible speed wise, so I opened up the source, took out all of the duplicated effort and wasted overhead, and rebuilt it as my own class called FastRobot.
    ##The Class' Code:
        // Imports needed to compile (RobotPeer and ComponentFactory are internal JDK APIs):
        import java.awt.*;
        import java.awt.image.*;
        import java.awt.peer.RobotPeer;
        import sun.awt.ComponentFactory;

        public class FastRobot {
             private Rectangle screenRect;
             private GraphicsDevice screen;
             private final Toolkit toolkit;
             private final Robot elRoboto;
             private final RobotPeer peer;
             private final Point gdloc;
             private final DirectColorModel screenCapCM;
             private final int[] bandmasks;

             public FastRobot() throws HeadlessException, AWTException {
                  this.screenRect = new Rectangle(Toolkit.getDefaultToolkit().getScreenSize());
                  this.screen = GraphicsEnvironment.getLocalGraphicsEnvironment().getDefaultScreenDevice();
                  toolkit = Toolkit.getDefaultToolkit();
                  elRoboto = new Robot();
                  peer = ((ComponentFactory)toolkit).createRobot(elRoboto, screen);
                  gdloc = screen.getDefaultConfiguration().getBounds().getLocation();
                  this.screenRect.translate(gdloc.x, gdloc.y);
                  screenCapCM = new DirectColorModel(24,
                            /* red mask */    0x00FF0000,
                            /* green mask */  0x0000FF00,
                            /* blue mask */   0x000000FF);
                  bandmasks = new int[3];
                  bandmasks[0] = screenCapCM.getRedMask();
                  bandmasks[1] = screenCapCM.getGreenMask();
                  bandmasks[2] = screenCapCM.getBlueMask();
                  Toolkit.getDefaultToolkit().sync();
             }

             public void autoResetGraphicsEnv() {
                  this.screenRect = new Rectangle(Toolkit.getDefaultToolkit().getScreenSize());
                  this.screen = GraphicsEnvironment.getLocalGraphicsEnvironment().getDefaultScreenDevice();
             }

             public void manuallySetGraphicsEnv(Rectangle screenRect, GraphicsDevice screen) {
                  this.screenRect = screenRect;
                  this.screen = screen;
             }

             public BufferedImage createBufferedScreenCapture(int pixels[]) throws HeadlessException, AWTException {
                  DataBufferInt buffer;
                  WritableRaster raster;
                  pixels = peer.getRGBPixels(screenRect);
                  buffer = new DataBufferInt(pixels, pixels.length);
                  raster = Raster.createPackedRaster(buffer, screenRect.width, screenRect.height, screenRect.width, bandmasks, null);
                  return new BufferedImage(screenCapCM, raster, false, null);
             }

             public int[] createArrayScreenCapture() throws HeadlessException, AWTException {
                  return peer.getRGBPixels(screenRect);
             }

             public WritableRaster createRasterScreenCapture(int pixels[]) throws HeadlessException, AWTException {
                  DataBufferInt buffer;
                  WritableRaster raster;
                  pixels = peer.getRGBPixels(screenRect);
                  buffer = new DataBufferInt(pixels, pixels.length);
                  raster = Raster.createPackedRaster(buffer, screenRect.width, screenRect.height, screenRect.width, bandmasks, null);
                  // SunWritableRaster.makeTrackable(buffer);
                  return raster;
             }
        }
    In essence, all I've changed from the original is moving many of the allocations out of the function bodies and making them attributes of the class, so they're not re-created every time. Doing this actually had a significant effect on frame rate. Even on my severely under-powered laptop, it went from ~4 fps with the stock Robot class to ~30 fps with my FastRobot class.
    ##First Test:
    When I started getting out-of-memory errors in my main program, I set up this very simple test to keep an eye on the FastRobot. Note: this is the code which produced the heap profile above.
        import java.awt.AWTException;

        public class TestFBot {
             public static void main(String[] args) {
                  try {
                       FastRobot fbot = new FastRobot();
                       double startTime = System.currentTimeMillis();
                       for (int i=0; i < 1000; i++)
                            fbot.createArrayScreenCapture();
                       System.out.println("Time taken: " + (System.currentTimeMillis() - startTime)/1000.);
                  } catch (AWTException e) {
                       e.printStackTrace();
                  }
             }
        }
    ##Examined:
    It doesn't do this every time, which is really strange (and frustrating!). In fact, it rarely does it at all with the above code. However, the memory issue becomes easily reproducible if I have multiple for loops back to back.
    #Test 2
        import java.awt.AWTException;

        public class TestFBot {
             public static void main(String[] args) {
                  try {
                       FastRobot fbot = new FastRobot();
                       double startTime = System.currentTimeMillis();
                       for (int i=0; i < 1000; i++)
                            fbot.createArrayScreenCapture();
                       System.out.println("Time taken: " + (System.currentTimeMillis() - startTime)/1000.);
                       startTime = System.currentTimeMillis();
                       for (int i=0; i < 500; i++)
                            fbot.createArrayScreenCapture();
                       System.out.println("Time taken: " + (System.currentTimeMillis() - startTime)/1000.);
                       startTime = System.currentTimeMillis();
                       for (int i=0; i < 200; i++)
                            fbot.createArrayScreenCapture();
                       System.out.println("Time taken: " + (System.currentTimeMillis() - startTime)/1000.);
                       startTime = System.currentTimeMillis();
                       for (int i=0; i < 1500; i++)
                            fbot.createArrayScreenCapture();
                       System.out.println("Time taken: " + (System.currentTimeMillis() - startTime)/1000.);
                  } catch (AWTException e) {
                       e.printStackTrace();
                  }
             }
        }
    ##Examined
    The out of control heap is now reproducible I'd say about 80% of the time. I've looked all though the profiler, and the thing of most note (I think) is that the garbage collector seemingly stops right as the fourth and final loop begins.
    The output form the above code gave the following times:
    Time taken: 24.282 //Loop1
    Time taken: 11.294 //Loop2
    Time taken: 7.1 //Loop3
    Time taken: 70.739 //Loop4
    Now, if you sum the first three loops, it adds up to 42.676, which suspiciously corresponds to the exact time that the garbage collector stops, and the memory spikes.
    [2] http://i.stack.imgur.com/fSTOs.png
    Now, this is my first rodeo with profiling, not to mention the first time I've ever even thought about garbage collection -- it was always something that just kind of worked magically in the background -- so, I'm unsure what, if anything, I've found out.
    ##Additional Profile Information
    [3] http://i.stack.imgur.com/ENocy.png
    Augusto suggested looking at the memory profile. There are 1500+ `int[]` that are listed as "unreachable, but not yet collected." These are surely the `int[]` arrays that `peer.getRGBPixels()` creates, but for some reason they're not being destroyed. This additional info, unfortunately, only adds to my confusion, as I'm not sure why the GC wouldn't be collecting them.
    ##Profile using small heap argument -Xmx256m:
    At irreputable's and Hot Licks' suggestion I set the max heap size to something significantly smaller. While this does prevent it from making the 1 GB jump in memory usage, it still doesn't explain why the program is ballooning to its max heap size upon entering the 4th iteration.
    [4] http://i.stack.imgur.com/bR3NP.png
    As you can see, the exact issue still exists, it's just been made smaller. ;) The issue with this solution is that the program, for some reason, is still eating through all of the memory it can -- there is also a marked change in fps performance between the first iterations, which consume very little memory, and the final iteration, which consumes as much memory as it can.
    The question remains why is it ballooning at all?
    ##Results after hitting "Force Garbage Collection" button:
    At jtahlborn's suggestion, I hit the Force Garbage Collection button. It worked beautifully. It goes from 1 GB of memory usage down to the baseline of 60 MB or so.
    [5] http://i.stack.imgur.com/x4282.png
    So, this seems to be the cure. The question now is, how do I programmatically force the GC to do this?
    ##Results after adding local Peer to function's scope:
    At David Waters' suggestion, I modified the `createArrayCapture()` function so that it holds a local `Peer` object.
    Unfortunately no change in the memory usage pattern.
    [6] http://i.stack.imgur.com/Ky5vb.png
    Still gets huge on the 3rd or 4th iteration.
    #Memory Pool Analysis:
    ###ScreenShots from the different memory pools
    ##All pools:
    [7] http://i.stack.imgur.com/nXXeo.png
    ##Eden Pool:
    [8] http://i.stack.imgur.com/R4ZHG.png
    ##Old Gen:
    [9] http://i.stack.imgur.com/gmfe2.png
    Just about all of the memory usage seems to fall in this pool.
    Note: PS Survivor Space had (apparently) 0 usage
    ##I'm left with several questions:
    (a) does the Garbage Profiler graph mean what I think it means? Or am I confusing correlation with causation? As I said, I'm in an unknown area with these issues.
    (b) If it is the garbage collector... what do I do about it..? Why is it stopping altogether, and then running at a reduced rate for the remainder of the program?
    (c) How do I fix this?
    Does anyone have any idea what's going on here?

    SO came through.
    Turns out this issue was directly related to the garbage collector. The default collector, for whatever reason, would fall behind on its collections at points, so memory would balloon out of control; once that memory had been allocated, it became the new baseline the GC operated at.
    Manually setting the GC to ConcurrentMarkSweep solved this issue completely. After numerous tests, I have been unable to reproduce the memory issue. The garbage collector does an excellent job of keeping on top of these minor collections.
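    For anyone who hits the same thing: the collector is selected with JVM launch flags (on the HotSpot JVMs of that era, -XX:+UseConcMarkSweepGC enables the concurrent mark-sweep collector), and you can confirm at runtime which collectors are actually active with a small check like the sketch below (the class name is just an example, not part of the original program):

        import java.lang.management.GarbageCollectorMXBean;
        import java.lang.management.ManagementFactory;

        public class GcInfo {
            public static void main(String[] args) {
                // Prints the collectors the JVM was started with, e.g.
                // "ParNew" and "ConcurrentMarkSweep" when CMS is enabled.
                for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
                    System.out.println(gc.getName() + " - collections so far: " + gc.getCollectionCount());
                }
            }
        }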

  • Memory increases, GC does not collect.

    I am developing an AIR application using FlexBuilder (FlashBuilder) 4.0.1 (Standard) in an Eclipse plugin environment, on the Windows 7 64-bit platform.
    My application is intended to run full-screen, on a touchscreen device, as an application for small children (pre-k).
    The general operation is to present a series of scenarios, one by one, from a collection of separately developed swf files. (each swf file is a scenario).  The child responds to the swf file, the swf file sends an event to the player, which performs an unloadAndStop() on the Loader object that loaded the swf.  Then the next swf is loaded.
    The problem we are seeing is that each swf file adds about 15-25 MB to the Workingset Private allocation of the AIR application (or the ADL.EXE runtime).  Very little of this RAM is returned.  After about 140 of these scenarios, the AIR runtime consumes about 1.6 GB of RAM.  This, in itself, is not the problem - we could buy more RAM.
    The problem is that the Loader crashes here, and takes the whole AIR runtime with it.
    We also have a few "special purpose" swf files which are basically just wrappers around an s:VideoPlayer control that plays a bindable .flv video file; on the complete event, the swf fires the event that tells the AIR player to unloadAndStop it.
    Since the video player took no manual intervention to test, I built a set of these to simulate a high load, and found that:
    The .flv files are opened, but they are never closed. Since the s:VideoPlayer control is an MXML construct, there is no explicit way I can see to tell it to close the dang file when it's done.  (it's not documented, anyway).
    At exactly 32 video-swf items, the videos will continue to play, but there is no more audio.
    At exactly 73 video-swf items, (seemingly regardless of the SIZE of the .flv file I select), the 73rd item crashes the Loader (and AIR player).
    I supply unloadAndStop() with the (true) parameter.  I follow it with a System.gc().
    I explicitly remove ALL of my listeners. (in the AIR player application - I assume no control over items within the swf items that I am playing, since we assume we eventually may end up playing 3rd-party developed items; so we rely only on the return of a completion event, and a data structure).
    I explicitly close() ALL of my audio streams. (at the AIR player level - not in the swfs.)
    I explicitly stop() ALL of my timers.
    My loader is instantiated in an element called "myLoaderContainer" - and after I receive my completion event, I do a myLoaderContainer.removeAllElements();
    I have tried both the System.gc() call and the "unsupported"  LocalConnection().connect('foo') call.  both of these have zero effect. (in other words, I think there is probably no garbage to collect).
    Still, I find it strange that there is no documented hint anywhere about how often garbage collection will do its thing. Having graphed flash.system.System.privateMemory usage against all of my functions, narrowed the growth down to my loader, and seen that this allocation is never reclaimed, even on loader.unloadAndStop(true) -- I wonder exactly what the (true) parameter in that method is for?
    I don't know what else I am supposed to do to force my Loader to *actually* unloadAndStop().
    One would think that unloadAndStop() would not mean: "cease all functioning, but continue to occupy memory until the parent application chokes on its own heap."

    I recompiled the memory-leaking program using AIR 3.0 and the leak remains. However, my previous description of the problem was incorrect. It's not a ByteArray I'm using; it's just a normal array of 9 floats. This array is being updated in an event, on an indefinite basis (from hardware data provided by a separate process running as a socket server). The updating array is a class-scope array. On each update the index restarts at zero (i.e. it's not a ballooning array). However, each item of that updating array is then being redispatched in the context of another socket event (request events by another process), using socket.writeDouble() - also on an indefinite basis.
    So while the array is being written in the context of one event, it's simultaneously being read in the context of another uncorrelated event.
    My only theory is that the speed at which the array is being written/read (since it's only nine floats) is causing some overflow in the number of temporary arrays that might be generated to accommodate such abuse, and that maybe such temporaries are becoming lost to the GC - that perhaps the GC can't keep up with the abuse. But that's just me speculating about what theories I could test for workarounds.
    In native code I'd use various thread controls (mutexes etc.) to control reads/writes on the async socket events, but here I'm somewhat at a loss as to how to otherwise control the data flow. Indeed, I ended up rewriting the program in native code (C++) to get around the memory leak.
    Carl

  • SWFLoader Memory - At my wits end! Ugh!

    I am pulling my hair out trying to make sense of why SWFLoader controls consume so much memory?
    I'm trying to load SWF's that are pages of a document.  Each page is about 100k in size.  In this example, I'm loading 100 pages (the same page actually).
    I fully understand that 100 * 200k = 20MB of memory.  Let's assume the SWFLoader has a *ton* of overhead - 1MB of memory per instance - that math still means the app shouldn't consume more than 120MB of memory - but in this instance it is consuming up to 1GB of memory!!
    I can't / don't understand what is going on or why it would consume this much memory.
    When you first load the example, all is well (sort of at somewhere around 80MB of memory).  However, if you click and drag the scroll thumb up and down slowly, you'll see your memory balloon to well over 1GB of memory - and continue to balloon until crash.
    Note that I'm not attaching *any* event handlers or even attempting to unload anything(!)  This is as plain vanilla as it gets!
    <?xml version="1.0" encoding="utf-8"?>
    <mx:Application xmlns:mx="http://www.adobe.com/2006/mxml" layout="absolute">
         <mx:Script>
              <![CDATA[
                   import mx.controls.SWFLoader;
                   private function loadDocs() : void {
                        var sl : SWFLoader;
                        for(var i : uint = 1 ; i < 100; i++){
                             sl = holder.addChild( new SWFLoader() ) as SWFLoader;
                              sl.load('e514508fa074478cbef52d2662925246_1.swf');
                         }
                    }
               ]]>
         </mx:Script>
         <mx:VBox height="100%" width="100%">
              <mx:HBox width="100%" height="100">
                   <mx:Button click="loadDocs()" label="load docs"/>
              </mx:HBox>
              <mx:VBox id="holder" height="{this.height - 130}" width="{this.width - 30}"/>
         </mx:VBox>
    </mx:Application>
    Attached is the test document I'm loading.. so you can try this for yourself.
    I just can't figure out what is going on or how to approach this any differently.
    I know it is Friday and we've all checked out at this point mentally, but if anyone can please help me solve this, I'll PayPal you a $100 bounty instantly to buy you and yours a few rounds on me over the weekend (and save my sanity)!
    Thanks for any ideas you have in advance
    -Nate
    [email protected]

    My apologies - after exhausting everything else, I started to test the client's generated swf and apparently it is their swf file specifically that is causing the problem.
    So, turns out I'm not taking crazy pills after all.

  • "Memory full." error while exporting report to PDF format - CR2008SP2

    Hello,
    I am developing a C#.NET application that uses the CR2008 SP2 .NET libraries. This application performs some database updates and uses CR2008 SP2 to run 7 different reports and export the results to PDF files. This application is a console application that only uses Crystal to export the rendered reports - it does not use previews or printing.
    The specific pattern of this application is as follows:
    - perform some database updates
    - render report #1 and export it as a PDF file
    - perform some database updates
    - render report #2 and export it as a PDF file
    ... the pattern repeats through report #7 ...
    This application works fine as long as I run only one instance of it at a time. The problem occurs when I try to run multiple instances of this application at the same time. When I run multiple instances at the same time (against totally different databases), each instance will start up happily and begin running through the process described above. After a few moments one or more instances will fail during the report export. The specific error is as follows:
    "Memory full. Failed to export the report. Not enough memory for operation."
    The error always comes from the call to:
    CrystalDecisions.CrystalReports.Engine.ReportDocument.ExportToDisk(Format, FileName)
    I have watched the memory consumption of these instances of my application while they are running. They never seem to exceed approximately 52MB each. At this time Task Manager reports over 1GB of physical memory free. These numbers lead me to believe that memory is not actually "full".
    Here are some specifics about the environment:
    Dell Vostro 1720 / P8600 / 4GB
    Windows 7 Ultimate x64
    SQL Server 2008 SP1 x64
    Crystal Reports 2008 SP2
    Specifics about the C# application:
    IDE: Visual Studio 2008 SP1
    Type: Console Application
    Target platform: x86
    .NET Framework: 3.5 SP1
    Crystal References:
    - CrystalDecisions.CrystalReports.Engine (v12.0.2000.0)
    - CrystalDecisions.Shared (v12.0.2000.0)
    Specifics about the report templates:
    The report templates are RPT files saved from CR2008 SP2. They are relatively simple. A few of them (maybe 3) contain a single subreport.
    Specifics about the database:
    Each database is approximately 1GB in size. The database contains many tables but the reports only access a handful of them. Each table that the reports access has maybe a few hundred rows tops. Most have less than 100. Likewise, when the reports perform their selections, the resulting rowset is anywhere from about 20 records to a few hundred records tops.
    A few items to note:
    - Multiple instances of my application need to be able to run on a single machine at the same time. It is a specific design requirement.
    - My application works fine as long as I run only one instance of it at a time.
    - My application works fine when I run multiple instances if I comment out the Crystal part of it.
    - My application works fine when I run multiple instances if I change the export format from PDF to HTML32 or HTML40 (have not tried any others).
    - The machine has over 1GB of physical memory free when this error occurs.
    - I have taken steps to ensure that I am properly disposing of my CrystalDecisions.CrystalReports.Engine.ReportDocument instance just after each export is complete. I have tried the "using" keyword, as well as explicitly setting the instance to null and calling the .NET Framework garbage collector. This did not seem to help.
    Any assistance with this issue would be greatly appreciated.
    Steve

    Jonathan & Ludek,
    Thanks for the timely response and good suggestions, guys!
    I have performed a few more tests today in order to help answer some of your questions.
    Below are my responses. Some of them are answers to questions and some of them are observations based on tests I have performed today.
    - The database system is SQL Server 2008 SP1 Developer Edition x64.
    - The C# application connects to the database using the native .NET SQL client (System.Data.Sql).
    - During runtime I connect the report to the database by looping through the ReportDocument.Database.Tables collection. For each Table found I create a CrystalDecisions.Shared.TableLogOnInfo instance, fill in the table name, server name, database name, user ID, and password, and use Table.ApplyLogOnInfo() to apply it. I do the same thing for the subreports collection if there are any. Like I said, about half of the reports contain a single subreport.
    - The report templates are currently set to use the "SQLOLEDB" Provider. The Database Type is set to "OLE DB (ADO)". I am not entirely sure what you mean by trying a different database driver. I assume you mean to change the provider? If so, given the fact that we are using SQL Server 2008, what would I change it to?
    - If I do not perform the database updates during application runtime (instead, perform them ahead of time and then allow the application to only render the reports) then I still have the same problem. Nothing changes.
    - I installed Fix Pack 2.5, rebooted, and tried again. Same problem.
    - I do not believe that I am using XML transforms. How would I know?
    - Regarding fragmented memory: it is hard for me to believe, given the amount of memory free on the machine once all processes are up and running (1GB+) and the size of the processes (~50MB), that a contiguous block could not be found. Maybe there is some Crystal process that is attempting to balloon out of control, but as far as I can tell my application is not getting very big. I will not rule this out, of course, because I don't know for sure.
    - Regarding killing the process once a report is rendered: the problem with this is that the application needs to perform these 7 groups of updates and render these 7 reports in a specific sequence, all as part of one complete "pass". Also, this application needs to be able to handle a scenario where one or more instances of the application are started on a single machine at roughly the same time (pointed at different databases). The application actually does work fine in this regard (staying well within the capabilities of my 4GB laptop even when 8 of them are running) except that I always get the same error when I choose PDF as my export format.
    - I ran a test today where I changed the export format to HTML40 and started 8 instances of the application at the same time, each working from a different copy of the database. The instances ran a little slow but did complete in good time with no errors. Please note that changing the export format to PDF will cause the application to fail with the "memory full" error if even only 2 instances are running at the same time. The only way I can successfully complete a run with the export format set to PDF is to run only 1 instance at a time. I can run it over and over all day long and it will not fail, as long as only 1 instance is running at a given time. As of yesterday I had only proven that the application would complete when using the HTML40 export format with 2 or 4 instances running, but today I doubled it again (to 8) and still no failure while using HTML40. This is possibly the most interesting fact about this problem, and is another reason why I am skeptical that memory fragmentation is the culprit.
    We would like to solve this problem because we would like to continue to target the PDF format. It is the standard report export format for our applications. If we cannot solve this problem we may call a meeting and decide whether to proceed with the HTML40 export format, or possibly dump Crystal from this project altogether.
    Thanks again for your interest in our problem and your timely and helpful responses. I'm really not sure where to go next ... maybe change the database provider in the templates? I will need some specific advice in this area because I'm not sure what to do. Any more ideas you guys have will be greatly appreciated on this end!
    Steve

  • Out of memory error?  Are we serious?

    When using AIM on Mac OS 9, if I talk enough (literally, it's related to how much I IM), I'll eventually get an "Out of memory, please close some open apps" message or something like that.
    I have 1GB of RAM! And from what I see in About This Mac, I have like 891MB free! Is this why people hated Mac OS 9's memory management? It just seems "Stupid" to say the least...

    Hi, Transgenic -
    In OS 9, most programs have a static memory allocation - it can not be changed on the fly in response to need. The OS itself and SimpleText are a couple of exceptions - their memory allocations are dynamic, capable of being increased in response to need (SimpleText only a small amount, though).
    Other programs can not grab more memory than the amount which has been allocated for them, and they must grab that at the beginning when they first load. The solution then is to increase the memory allocation, allowing them to grab more when they are started up. The item to increase is the Preferred allocation amount -
    Article #TA21666 - Assigning More Memory to an Application
    Note that the program can not be running when doing the Get Info, and that the Get Info must be done on the icon of the program itself, not on an alias to it or a folder.
    You may need to experiment with increasing the Preferred amount until you get enough allocated to handle your needs. Doubling the original Preferred amount is often a good starting point.
    Don't be shy about raising it a lot if that seems to be needed. For example, Diablo II - in order to run it with Virtual Memory disabled (which is what I prefer), I've added 410,000 to its original 77,500 allocation.
    The memory usage bars shown in About This Computer show relative usage. One trick to monitor accurately the memory usage of a program while it is running is to open About This Computer, turn Balloons on ("Show Balloons" in the Help menu), then hover the cursor over the memory bar for a program in About This Computer (don't click) - the balloon that appears will display the memory usage in exact numbers.
    Is this why people hated Mac OS 9's memory management?
    I've never hated it; it is what it is. Like most things that get in my way, I learn to work with or around it.
    It just seems "Stupid" to say the least..
    I don't think "Stupid" is an appropriate description. It is unreasonable to expect an OS version that was first coded nearly 10 years ago (the initial release of OS 9), and one whose roots are 25 years old (OS 1 in 1984), to be as proficient in some things as more modern OS's, such as OSX.
    It would be like finding fault with a 1960's automobile for not having air conditioning or airbags or disk brakes.
