Bad NTFS on ZFS performance

hi,
i am using Solaris Express 2010.11 as ZFS/raidz-based storage for a RHEL6 KVM virtualization host.
disk performance of the Linux guests is OK (about 50 MB/s), but for Windows/NTFS guests, read AND write performance is extremely bad, about 5 MB/s!
the disk performance is always bad, independent of the access method (iSCSI raw LUN, disk image on ext4 on an iSCSI LUN, disk image on an NFS export)...
the NTFS block size is the default (4k); the ZFS volblocksize/recordsize are also the defaults (8k and 128k)
i already tried changing the blocksize/recordsize of the ZFS volumes/filesystems, disabling caching on the KVM host, and enabling jumbo frames (MTU 9000)... nothing helped...
any suggestions?
thanks for any help!!
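
For anyone retrying the blocksize experiments above: a zvol's volblocksize can only be set at creation time, so the LUN has to be recreated to test a 4k block that matches the NTFS cluster size. A sketch, with hypothetical pool/volume names:
# create a 100G zvol whose block size matches the NTFS 4k cluster size
zfs create -V 100G -o volblocksize=4k tank/win-lun
# confirm the property took effect
zfs get volblocksize tank/win-lun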

I'm struggling with exactly the same problem, but on ESXi 4.1.
It seems that ZFS inflates the I/O. When you check disk activity you can see that the underlying ZFS layer thrashes the disks, while the activity within NTFS is modest.
I just cannot figure out how to cope with it.
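
One way to confirm the inflation described above is to compare what the guest thinks it is writing with what the pool actually does; a sketch, assuming the pool is named tank:
# per-vdev throughput, sampled every 5 seconds, while the guest runs a copy
zpool iostat -v tank 5
# per-device service times and bandwidth on the Solaris side
iostat -xnz 5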

Similar Messages

  • X4500 Best ZFS Performance

    Hello all.
    My company has recently purchased an X4500 and I am relatively new to ZFS. I am looking for the best-performing configuration for the storage pool that I can get while still staying above 17TB in RAID-Z.
    If anyone has any tips or best practices for the size of disk sets or their arrangement, your help would be appreciated.
    Thanks.

    This is what I did for 16TB with 2 spares (465GB disks). Not sure if it is best performance, but it is the best that I could figure out.
    zpool create storage \
    raidz2 c5t1d0 c{4,7,6,1,0}t4d0 c{4,7,6,1,0}t0d0 \
    raidz2 c{5,4,7,6,1,0}t5d0 c{4,7,6,1,0}t1d0 \
    raidz2 c{5,4,7,6,1,0}t6d0 c{4,7,6,1,0}t2d0 \
    raidz2 c{5,4,7,6,1,0}t7d0 c{4,7,6,1,0}t3d0 \
    spare c5t{2,3}d0
    If anyone sees any issues with this, please let me know.
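
    The c{4,7,6,1,0}t4d0 arguments above rely on shell brace expansion (bash/ksh93), which expands to one device per controller before zpool ever sees them. A quick way to preview and verify (sketch):
    # preview what zpool will actually receive (in a shell with brace expansion)
    echo c{4,7,6,1,0}t4d0
    # -> c4t4d0 c7t4d0 c6t4d0 c1t4d0 c0t4d0
    # after creation, confirm the vdev layout
    zpool status storage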

  • Poor ZFS Performance on StorEdge 1000

    Hello !
    I've got a little (or maybe bigger) performance problem with one of my Solaris 10 (05/09) ZFS installations.
    First of all, i know that the hardware is old and probably not recommended, but i expected a little more from it.
    HW: SUN Enterprise 420R 2GB RAM
    2 x FUJITSU Product: MAG3182L SUN18G Revision: 1111 internal
    SUN StorEdge 1000: 4x FUJITSU Product: MAN3184M SUN18G Revision: 1502, external, SCSI-connected
    i made two zfs pools, the root and the external, like the following:
    rootpool:
    NAME            STATE   READ WRITE CKSUM
    rpool           ONLINE     0     0     0
      mirror        ONLINE     0     0     0
        c0t0d0s0    ONLINE     0     0     0
        c0t1d0s0    ONLINE     0     0     0
    external pool:
    NAME            STATE   READ WRITE CKSUM
    foo             ONLINE     0     0     0
      mirror        ONLINE     0     0     0
        c1t8d0      ONLINE     0     0     0
        c2t0d0      ONLINE     0     0     0
      mirror        ONLINE     0     0     0
        c1t9d0      ONLINE     0     0     0
        c2t1d0      ONLINE     0     0     0
    I started to install application servers, uploaded several files, etc. I noticed that all the write operations done on the external
    storage were extremely slow. I did several tests and research on the net but haven't found any satisfying answer yet. i did a test with
    Ben Rockwood's dtrace script spa_sync.d. While running the D script i did some copies of a 150MB file on the root and on the external pool,
    which gave me the following output / throughput figures:
    rootpool:
    root-->time cp test.dbf test2.dbf
    real 0m13.382s
    user 0m0.004s
    sys 0m1.766s
    2009 Aug 6 14:58:21: 109 MB; 3842 ms of spa_sync; avg sz: 114 KB; throughput 28 MB/s
    2009 Aug 6 14:58:25: 0 MB; 0 ms of spa_sync; avg sz: 0 KB; throughput 0 MB/s
    2009 Aug 6 14:58:25: 2 MB; 853 ms of spa_sync; avg sz: 16 KB; throughput 2 MB/s
    2009 Aug 6 14:58:27: 109 MB; 3700 ms of spa_sync; avg sz: 115 KB; throughput 29 MB/s
    2009 Aug 6 14:58:30: 2 MB; 373 ms of spa_sync; avg sz: 28 KB; throughput 7 MB/s
    2009 Aug 6 14:58:31: 0 MB; 0 ms of spa_sync; avg sz: 0 KB; throughput 0 MB/s
    2009 Aug 6 14:58:40: 1 MB; 1197 ms of spa_sync; avg sz: 8 KB; throughput 1 MB/s
    2009 Aug 6 14:58:53: 104 MB; 3642 ms of spa_sync; avg sz: 112 KB; throughput 28 MB/s
    2009 Aug 6 14:58:56: 0 MB; 0 ms of spa_sync; avg sz: 0 KB; throughput 0 MB/s
    2009 Aug 6 14:58:56: 2 MB; 611 ms of spa_sync; avg sz: 17 KB; throughput 3 MB/s
    2009 Aug 6 14:58:59: 108 MB; 3801 ms of spa_sync; avg sz: 112 KB; throughput 28 MB/s
    2009 Aug 6 14:59:03: 4 MB; 579 ms of spa_sync; avg sz: 30 KB; throughput 7 MB/s
    external pool:
    test-->time cp test.dbf test2.dbf
    real 2m0.267s
    user 0m0.004s
    sys 0m2.763s
    2009 Aug 6 15:00:26: 17 MB; 5612 ms of spa_sync; avg sz: 74 KB; throughput 3 MB/s
    2009 Aug 6 15:00:33: 1 MB; 394 ms of spa_sync; avg sz: 14 KB; throughput 4 MB/s
    2009 Aug 6 15:00:35: 19 MB; 5782 ms of spa_sync; avg sz: 75 KB; throughput 3 MB/s
    2009 Aug 6 15:00:41: 19 MB; 6416 ms of spa_sync; avg sz: 68 KB; throughput 2 MB/s
    2009 Aug 6 15:00:47: 16 MB; 5241 ms of spa_sync; avg sz: 62 KB; throughput 3 MB/s
    2009 Aug 6 15:00:54: 16 MB; 4525 ms of spa_sync; avg sz: 69 KB; throughput 3 MB/s
    2009 Aug 6 15:00:59: 13 MB; 4868 ms of spa_sync; avg sz: 56 KB; throughput 2 MB/s
    2009 Aug 6 15:01:04: 15 MB; 4825 ms of spa_sync; avg sz: 58 KB; throughput 3 MB/s
    2009 Aug 6 15:01:10: 15 MB; 5038 ms of spa_sync; avg sz: 66 KB; throughput 2 MB/s
    2009 Aug 6 15:01:17: 15 MB; 4841 ms of spa_sync; avg sz: 60 KB; throughput 3 MB/s
    2009 Aug 6 15:01:22: 17 MB; 4572 ms of spa_sync; avg sz: 52 KB; throughput 3 MB/s
    2009 Aug 6 15:01:29: 16 MB; 4809 ms of spa_sync; avg sz: 49 KB; throughput 3 MB/s
    2009 Aug 6 15:01:35: 16 MB; 5166 ms of spa_sync; avg sz: 59 KB; throughput 3 MB/s
    2009 Aug 6 15:01:42: 18 MB; 5189 ms of spa_sync; avg sz: 66 KB; throughput 3 MB/s
    2009 Aug 6 15:01:48: 14 MB; 4270 ms of spa_sync; avg sz: 59 KB; throughput 3 MB/s
    2009 Aug 6 15:01:53: 16 MB; 4536 ms of spa_sync; avg sz: 70 KB; throughput 3 MB/s
    2009 Aug 6 15:01:58: 17 MB; 4855 ms of spa_sync; avg sz: 63 KB; throughput 3 MB/s
    2009 Aug 6 15:02:03: 2 MB; 387 ms of spa_sync; avg sz: 18 KB; throughput 5 MB/s
    2009 Aug 6 15:02:05: 19 MB; 5491 ms of spa_sync; avg sz: 66 KB; throughput 3 MB/s
    2009 Aug 6 15:02:12: 17 MB; 4787 ms of spa_sync; avg sz: 65 KB; throughput 3 MB/s
    2009 Aug 6 15:02:17: 14 MB; 4168 ms of spa_sync; avg sz: 58 KB; throughput 3 MB/s
    Is this explainable / normal? Well, i'm aware of the old config, but the internal and external disks are quite similar, so if both pools
    had the same performance i'd be convinced that the old HW is the source of the low performance. But with a difference of about 1 1/2 minutes
    i find this rather odd, tbh. So, anyone have any clues / tips?
    bg martin

    Hi Don,
    thx for the reply, and indeed i have several errors on 1-2 disks:
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 0 0 0 0 c1t8d0
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 0 0 0 0 c1t9d0
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 2 219 191 412 c2t0d0
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 0 35 19 54 c2t1d0
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 0 0 0 0 c0t0d0
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 0 0 0 0 c0t1d0
    0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 8 3 0 11 c0t6d0
    so i'll replace those and run the performance tests again.
    bg martin
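
    The counters above look like the trailing error columns of iostat output; a per-device error summary is another way to confirm which disks to pull (a sketch, standard Solaris flag):
    # list error counters and device identity for every disk
    iostat -En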

  • OWB 10.2.0.4 really bad set-based delete performance

    Hi, we recently upgraded to OWB 10.2.0.4, one of the reasons being the ability to do set-based deletes instead of row-based. However, upon testing this, we're seeing maps whose deletes ran in 30-40 seconds in row-based mode now taking literally 1.5 to 2 HOURS to run.
    I expected the SQL from the set based to take the form of:
    delete from my_table
    where (col_a, col_b, col_c) in (select a, b, c from ....)
    but instead the format is different:
    delete from my_table
    where exists (select 1 from ....)
    I don't quite understand what the SQL is trying to accomplish - and truthfully, it performs horribly compared to the hand-written version (explain plan shows estimated cost of 14,000 for my query, and over 5 million for the OWB query).
    Has anyone else seen this - and is there a solution? Part of me wants to say I'm doing something wrong, but the other part says "sure, but it works fine in row-based mode(target only)" - exact same map.
    Any ideas?
    Thanks!
    Scott

    Hi everyone, well, I've figured out what is causing the problem and how to fix it... but I still don't understand why it causes the problem.
    Here's a high-level overview of the ETL: we find deleted records by selecting the business key columns from our existing dimension table and doing a MINUS against the matching columns from the source table. If any records come out of this, it means the record was deleted on the source, and we go ahead and do a matching delete on the dimension table.
    Here's where the odd thing happens though: there's a column called "source system name" that is part of the dimension business key. This column does NOT exist on the source system - it's just a hard-coded constant (put in just in case we ever add an additional system in the future).
    Basically, if we do the minus logic on all the columns EXCEPT for this one, and then connect a constant with this hard-coded value to the delete operator - the delete takes FOREVER... On the other hand, if we actually put this field into the minus operator, by simply repointing the existing constant there instead of directly at the delete table... the deletes magically start taking 30 seconds instead of 10 minutes to run.
    No idea (at all) why this makes a difference, but it seems to - and it's a night-and-day difference.
    Hopefully this can help someone else out who runs into the same issue.
    Thanks!
    Scott

  • SNMP V3 : Bad Credential Check "Error performing SNMP operation"

    Dears,
    I've tried to configure SNMP v3 on some devices using LMS 4, as required by a customer, and all appeared fine.
    Except that I could not use the template in NetConfig, as my customer desired to use AES rather than DES and there is no dropdown box for the priv encryption algorithm.
    After a week the customer tells me all his SNMP v3 devices are now failing in the credential check report.
    The message is "Error performing SNMP operation".
    I've tested this in my lab using LMS 4 and I can confirm that there is no problem speaking SNMP v3 to the devices.
    D:\CSCOpx\objects\jt\bin>snmpwalk.exe -v3 -l authNoPriv -u oss -a SHA -A OSS_LAB123   10.170.46.250
    Cannot find module (UCD-SNMP-MIB): At line 4 in (none)
    Cannot find module (UCD-DEMO-MIB): At line 4 in (none)
    Cannot find module (NET-SNMP-AGENT-MIB): At line 4 in (none)
    Cannot find module (UCD-DLMOD-MIB): At line 4 in (none)
    RFC1213-MIB::sysDescr.0 = STRING: "Cisco Internetwork Operating System Software
    IOS (tm) C2950 Software (C2950-I6Q4L2-M), Version 12.1(22)EA6, RELEASE SOFTWARE (fc1)
    Copyright (c) 1986-2005 by cisco Systems, Inc.
    Compiled Fri 21-Oct-05 01:59 by yenanh"
    RFC1213-MIB::sysObjectID.0 = OID: SNMPv2-SMI-v1::enterprises.9.1.427
    DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (354935628) 41 days, 1:55:56.28
    RFC1213-MIB::sysContact.0 = ""
    RFC1213-MIB::sysName.0 = STRING: "OSS-SW1"
    RFC1213-MIB::sysLocation.0 = ""
    RFC1213-MIB::sysServices.0 = INTEGER: 2
    But LMS still claims that there is an issue.
    Is anyone else seeing this? Did anyone find a solution?
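
    For reference, an authPriv walk with AES can rule the device-side config in or out before blaming LMS; the user, passphrases, and OID below are placeholders to adapt:
    snmpwalk -v3 -l authPriv -u oss -a SHA -A <authpass> -x AES -X <privpass> 10.170.46.250 sysDescr.0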

    I just noticed a thread where Joe says this is expected behavior.
    It would be nice if the report could produce a sensible message such as "Failed", "RO OK" or "RO & RW OK".
    It shouldn't be that hard to fix this.
    Cheers,
    Michel

  • I've been holding off updating my mac to OS Yosemite due to all the bad reviews of the performance. Is it worth it or should I wait?

    I've been holding off updating my MacBook Air to OS Yosemite due to all the bad reviews. Is it worth downloading or should I wait for a later OS?

    Well as any television network executive will tell you, for every complaint letter or email they get, there are likely a thousand people who feel the same but don't bother to comment.  That's why it must be considered sample data, and only a little over a week's worth at that.  That makes it even more remarkable since it reflects the latest bug fix release, and things should be getting better.  I recently called Apple Support for a problem I was having with Yosemite on my 27" Cinema Display, and the hold time was 20 minutes when it's usually just a couple.  Apple Support was excellent as usual once I got someone, but the long wait time tells me something is keeping them very busy these days.  People don't call for support when all is well.
    Now don't get me wrong - I'm a huge Apple fan and have been for many years.  I've bought into the entire ecosystem with 5 Macs, 3 iPhones, 3 iPads, 3 Apple TVs, etc.  I care passionately about what happens with Apple and I'm looking forward to a long continued relationship with them and a bright future.  But I have to honestly say that I have been concerned with the User Interface direction they have taken with iOS, and now OSX, over the past two years.  I'm a business owner and can easily spend 12 hours a day in front of my Mac display getting real work done - not just watching videos, checking Facebook, or playing games.  I have perfect vision, but I'm getting severe eye strain from Yosemite.  That's not just about disliking how it looks - it's a functional usability issue.  Add to that the lack of quality control in this release (as evidenced by the widely reported wi-fi issues), and I'm genuinely and sincerely concerned about what is going on.  Yes, I freely admit that I'm nostalgic for how beautiful (and internally efficient) Snow Leopard was, with its colored application sidebar icons and "lickable" traffic light icons, but I could also use it, and every release of OSX up to Mavericks, for hours on end without eye fatigue.
    And make no mistake, even with the misgivings I have over the directions taken in iOS and OSX of late, Apple is still by far better than any of the alternatives, and I will stand by them as they work their way through these growing pains, and hopefully listen to their loyal users as we express our sincere heart-felt concerns.

  • ZFS and StorageTek 6140 performance

    We have a Sun StorageTek 6140 Disk array and currently two Solaris 10 x86 hosts connected to it via Fibre channel through a Qlogic 5602 FC Switch.
    One system is our production E-mail system (Running Sun Messaging) the other is a backup server.
    The backup server is running CAM software and periodically issues a snapshot to be done on the 6140. I have noticed that copying or tarring up files on either the production volume or the snapshot volume has very poor performance.
    Basically between 2-4MB/s
    We have patched the kernel 5.10 Generic_127128-11 i86pc i386 i86pc and tried various settings in /etc/system
    set zfs:zfs_prefetch_disable=1
    set zfs:zfs_nocacheflush=1
    But still the performance is not improving. The array seems to function properly (that is, if I use "dd" then the array performs quickly, so I have to believe it has something to do with ZFS).
    Has anyone else had issues with ZFS performance on a 6140 array or similar? What kind of speeds are you seeing with actual file system usage?
    I should also add that if I used a UFS-formatted filesystem on the array, I saw cp/tar speeds around 10-12 MB/s.
    thanks,
    -Tomas
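
    For completeness, the /etc/system settings quoted above only take effect after a reboot, and a simple sequential write through ZFS gives a number to compare against the raw dd result; the file path here is hypothetical:
    # apply the /etc/system tunables
    reboot
    # sequential write through ZFS (1 GB), then read it back
    dd if=/dev/zero of=/pool/fs/testfile bs=1024k count=1024
    dd if=/pool/fs/testfile of=/dev/null bs=1024k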

    Hello Nik,
    Fortunately I generated a supportdata package before upgrading, and the CAM version is 6.7.0.12. In addition to your reply I found an article at http://www.tune-it.ru/web/bender/blogs/-/blogs/восстановление-томов-на-массивах-6000-и-2000-серии (for non-Russian speakers: the article provides a similar solution using the /opt/SUNWsefms/bin/service utility). However, the author made a note about the offset=<blocks> field: he multiplies the value from the profile by 512. I had some volumes stored on a single vdisk, and I'm not sure now, because in both your and the author's service-utility templates it was clearly marked that the argument's value is in blocks, and the value stored in the profile is also in blocks (not in bytes; a piece of my profile reads "+...GB (598923542528 bytes) Offset (blocks): ...+"). Is he right to multiply the value from the profile by 512?
    Second question: does the service utility provide functionality to change the WWNs on volumes and the Storage Array Identifier (SAID) of the whole device? I found that the previous license files are not accepted because of a different Feature Enable Identifier (I think it is calculated from the changed Storage Array Identifier, am I right?). The reason I want to change the WWNs and mappings (the mappings I will correct from the BUI) on the volumes recreated per the profile is that I want to avoid problems from vxvm possibly misrecognizing them on the server side (target numbering changes) and having to re-correct/re-import vxvm disks and disk-group ownership.

  • How to delete files from external ntfs hard disk [Solved]

    Hi guys
    first, sorry for my bad English.
    I have an external hard disk ( WD 500GB ) with ntfs file system and i have installed ntfs-3g package.
    3 days ago, when i wanted to delete some files, i got a problem with it;
    look at the output:
    [jahangir@Arch New Metal]$ sudo rm *
    [sudo] password for jahangir:
    rm: cannot remove '02 - Korn - Love and Meth.mp3': No such file or directory
    rm: cannot remove '30Seconds To Mars': No such file or directory
    rm: cannot remove '30Seconds To Mars 1': Is a directory
    rm: cannot remove 'Avantasia': No such file or directory
    rm: cannot remove 'Avantasia 1': Is a directory
    rm: cannot remove 'Avantasia 2': Is a directory
    rm: cannot remove 'Behemoth': No such file or directory
    rm: cannot remove 'Behemoth 1': Is a directory
    rm: cannot remove 'Hanging Garden - At Every Door - 2013': No such file or directory
    rm: cannot remove 'Hanging Garden - At Every Door - 2014': No such file or directory
    rm: cannot remove 'Rosetta': No such file or directory
    rm: cannot remove 'Rosetta 1': No such file or directory
    rm: cannot remove 'Sepultura': No such file or directory
    rm: cannot remove 'Sepultura 1': No such file or directory
    rm: cannot remove 'Slipknot': No such file or directory
    rm: cannot remove 'Slipknot 1': No such file or directory
    rm: cannot remove 'Tokio Hotel': No such file or directory
    rm: cannot remove 'Tokio Hotel 1': No such file or directory
    rm: cannot remove 'T\303\275r': No such file or directory
    rm: cannot remove 'neww': No such file or directory
    [jahangir@Arch New Metal]$
    Who can help me?
    I also wanted to delete the .Trash-1000 directory from the main directory of my hard disk, and i was confronted with this error:
    [jahangir@Arch My Passport]$ sudo rm .Trash-1000
    [sudo] password for jahangir:
    rm: cannot remove '.Trash-1000': No such file or directory
    [jahangir@Arch My Passport]$
    In the event that it is there.
    also, in the main directory of my hard disk i have 1 mp3 file that i can't view in the file manager, but it is displayed in Windows and by the ls command in the terminal:
    [jahangir@Arch My Passport]$ ls
    ls: cannot access 01 - Lost.mp3: No such file or directory
    ls: cannot access 02 - Surrendered To The Decadence.mp3: No such file or directory
    01 - Lost.mp3 In The Name Of God Videos ZzZ - IMAN winold
    02 - Surrendered To The Decadence.mp3 New Metal World of Warcraft Cataclysm 4.3.4 enGB navid wow wrath
    [jahangir@Arch My Passport]$
    what is this file, and how can i delete .Trash-1000, these files, and the contents of the "New Metal" directory?
    Last edited by jiros (2013-12-23 20:57:05)

    I believe you used NTFS for a reason. As far as I know, Windows isn't friendly with filesystems other than FAT or NTFS, so once you format your external hard drive to ext4, Windows won't talk to it at all unless you install some additional driver or software.
    You have several possibilities:
    1) You could use FAT32. It's kind of a dumb filesystem; Linux, Mac and Windows can all read and write to it, but there are some limitations, like no file permissions and the 4GB file size limit.
    2) You could make multiple partitions on your external hard drive, one with NTFS (for Windows) and the other with some fs that is supported natively on Linux and Mac; I believe the only option would be HFS+. I'm not an expert, maybe somebody will correct me. Anyway, if you aren't going to connect your disk to a Mac, then ext4 would be a good choice. But this approach with two different partitions is kind of dumb, because usually you need the same data available on every platform.
    3) If I were you, I would continue using NTFS or FAT32. It's not ideal, but it's the price you have to pay for dealing with Windows systems.
    4) If there is any other, smarter solution, I believe somebody will add it in the responses below.
    Anyway, it's weird that your problem persists. There has to be something wrong with your filesystem, otherwise ls wouldn't show you question marks in its output. Did you run chkdsk via the GUI? It has to say either that there wasn't any error with your fs, or that there was; we live (unfortunately) in a binary computer world. You can also run that command from the Windows shell, and if you run it in a proper configuration it will tell you whether your fs is bad or not and perform the needed repairs.
    And how to format the disk to ext4?
    Back up your data, run as root fdisk /dev/yourexthdd (e.g. fdisk /dev/sdd), delete all partitions, create new ones, and once you are done, write the changes. fdisk is pretty easy to use; don't be afraid of it. Then you have to create a filesystem on each partition you created with fdisk, so if you created only one, run mkfs.ext4 /dev/yourexthddwithpartnumber (e.g. mkfs.ext4 /dev/sdd1). There are nice articles about doing these things on the Arch Wiki (https://wiki.archlinux.org/index.php/File_Systems); don't be afraid to read them.
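
    Given the "No such file or directory" errors above, the NTFS volume itself is likely inconsistent; a hedged repair path, with a hypothetical device name, is to let Windows check it or to use ntfsfix from the ntfs-3g package:
    # from Windows, in an elevated command prompt (X: being the external drive):
    #   chkdsk /f X:
    # from Linux: clears the dirty flag and repairs some common inconsistencies
    sudo ntfsfix /dev/sdb1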

  • Instability and Poor Performance with 11 11/11 and 11.1

    I've upgraded an OpenSolaris install to Solaris 11.1 over time, and ever since I hit Solaris 11 11/11 and Solaris 11.1 my system has been unstable and slow (especially ZFS, and GDM, which I had to disable in 11/11.1 because it was using too much CPU). Whenever I shut down in Solaris 11/11.1 it causes a kernel panic. I run this command to shut down:
    /usr/sbin/shutdown -y -g 60 -i 5
    and it causes this (then the system auto-restarts -- it never completes the shutdown):
    TIME UUID SUNW-MSG-ID
    Jan 28 2013 23:19:14.682124000 54fbe302-2309-6f14-8d7f-c81e9c3369b7 SUNOS-8000-KL
    TIME CLASS ENA
    Jan 28 23:18:29.9322 ireport.os.sunos.panic.dump_pending_on_device 0x0000000000000000
    nvlist version: 0
    version = 0x0
    class = list.suspect
    uuid = 54fbe302-2309-6f14-8d7f-c81e9c3369b7
    code = SUNOS-8000-KL
    diag-time = 1359433153 925385
    de = fmd:///module/software-diagnosis
    fault-list-sz = 0x1
    __case_state = 0x1
    topo-uuid = 78f32799-20fb-446f-b758-f24f4197b812
    fault-list = (array of embedded nvlists)
    (start fault-list[0])
    nvlist version: 0
    version = 0x0
    class = defect.sunos.kernel.panic
    certainty = 0x64
    asru = sw:///:path=/var/crash/opensolaris/.54fbe302-2309-6f14-8d7f-c81e9c3369b7
    resource = sw:///:path=/var/crash/opensolaris/.54fbe302-2309-6f14-8d7f-c81e9c3369b7
    savecore-succcess = 0
    os-instance-uuid = 54fbe302-2309-6f14-8d7f-c81e9c3369b7
    panicstr = deadman: timed out after 120 seconds of clock inactivity
    panicstack = fffffffffb9fcc56 () | genunix:cyclic_expire+ac () | genunix:cyclic_fire+76 () | unix:cbe_fire+65 () | unix:av_dispatch_autovect+74 () | unix:dispatch_hilevel+1f () | unix:switch_sp_and_call+13 () | unix:do_interrupt+f2 () | unix:cmnint+ba () | unix:mach_cpu_pause+21 () | unix:cpu_pause+7f () | unix:thread_start+8 () |
    crashtime = 1359432916
    panic-time = January 28, 2013 11:15:16 PM EST EST
    (end fault-list[0])
    fault-status = 0x1
    severity = Major
    __ttl = 0x1
    __tod = 0x51074dc2 0x28a862e0
    Additionally, I've seen a huge slowdown in ZFS performance (I kept the old boot environments for the previous versions so I went back and pulled these #s using dd after I upgraded to Solaris 11.1):
    WRITE:
    OpenSolaris SNV134     211 MB/s
    Solaris 11 Express     194 MB/s
    OpenIndiana 151a7     215 MB/s
    Solaris 11 11/11     182 MB/s
    Solaris 11.1          150 MB/s
    READ:
    OpenSolaris SNV134     470 MB/s
    Solaris 11 Express     499 MB/s
    OpenIndiana 151a7     417 MB/s
    Solaris 11 11/11     177 MB/s
    Solaris 11.1          276 MB/s
    Lastly, there have been a couple of times where just running tests (like dd or bonnie++) on my zfs pool would cause a kernel panic:
    TIME UUID SUNW-MSG-ID
    Jan 26 2013 18:40:21.947381000 5a9c2174-51bd-6af5-cda3-ceb12d0591bb SUNOS-8000-KL
    TIME CLASS ENA
    Jan 26 18:39:33.6420 ireport.os.sunos.panic.dump_pending_on_device 0x0000000000000000
    nvlist version: 0
    version = 0x0
    class = list.suspect
    uuid = 5a9c2174-51bd-6af5-cda3-ceb12d0591bb
    code = SUNOS-8000-KL
    diag-time = 1359243621 817586
    de = fmd:///module/software-diagnosis
    fault-list-sz = 0x1
    __case_state = 0x1
    topo-uuid = 08cec1f5-1959-c812-85e3-fa1bb969b7a3
    fault-list = (array of embedded nvlists)
    (start fault-list[0])
    nvlist version: 0
    version = 0x0
    class = defect.sunos.kernel.panic
    certainty = 0x64
    asru = sw:///:path=/var/crash/opensolaris/.5a9c2174-51bd-6af5-cda3-ceb12d0591bb
    resource = sw:///:path=/var/crash/opensolaris/.5a9c2174-51bd-6af5-cda3-ceb12d0591bb
    savecore-succcess = 0
    os-instance-uuid = 5a9c2174-51bd-6af5-cda3-ceb12d0591bb
    panicstr = BAD TRAP: type=e (#pf Page fault) rp=fffffffc801bba00 addr=28 occurred in module "zfs" due to a NULL pointer dereference
    panicstack = unix:die+105 () | unix:trap+153e () | unix:cmntrap+e6 () | zfs:arc_hash_remove+28 () | zfs:arc_evict_from_ghost+c0 () | zfs:arc_adjust_ghost+4e () | zfs:arc_adjust+51 () | zfs:arc_reclaim_thread+1aa () | unix:thread_start+8 () |
    crashtime = 1359232124
    panic-time = January 26, 2013 03:28:44 PM EST EST
    (end fault-list[0])
    fault-status = 0x1
    severity = Major
    __ttl = 0x1
    __tod = 0x51046965 0x3877e308
    What could be causing all these issues -- why are the OpenSolaris and Solaris 11 Express installs faster/more stable? Is it a hardware incompatibility issue? How can I determine the root cause and fix it?
    Thanks.
    Edited by: RavenShadow on Feb 10, 2013 10:17 AM

    Alan, if the cause is a 5400 RPM drive, I'm not sure why I see much better ZFS performance when I boot into my older BEs for OpenSolaris / Solaris 11 Express / OpenIndiana. After I upgraded to 11.1 I noticed the slowness, went back to my old Boot Environments, and generated the dd read/write speeds I put in the top post, so it's not like any of the hardware changed during the hour I was benchmarking between BEs, nor did the amount of space used in my zpool change (it is mostly empty, and I deleted everything that I wrote with dd after each test).
    echo ::memstat | mdb -k
    Page Summary       Pages     MB  %Tot
    Kernel            184943    722   18%
    ZFS File Data      99400    388    9%
    Anon               30595    119    3%
    Exec and libs       1434      5    0%
    Page cache          6011     23    1%
    Free (cachelist)   10061     39    1%
    Free (freelist)   715746   2795   68%
    Total            1048190   4094
    RAM usage doesn't seem that bad.
    Edited by: RavenShadow on Feb 13, 2013 3:46 AM
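
    For anyone reproducing the BE-to-BE comparison, a dd pass like the one described might look like this (dataset path hypothetical; the test file is removed afterwards, as in the original runs):
    # sequential write, then read back, of an 8 GB file
    dd if=/dev/zero of=/tank/bench bs=1024k count=8192
    dd if=/tank/bench of=/dev/null bs=1024k
    rm /tank/bench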

  • Hard drive array losing access - suspect controller - zfs

    i am having a problem with one of my arrays; this is a zfs filesystem. It consists of 1x500GB, 2x750GB, and 1x2TB drives in a linear array; the pool is named 'pool'. I have to mention here that i don't have enough hard drives for a raidz (raid5) setup yet, so there is no actual redundancy: zfs can't auto-repair from a copy because there is none, so all the auto-repair features can be thrown out the door in this equation. that means i believe it's possible for the filesystem to be easily corrupted by the controller, which is what i suspect in this specific case. Please keep that in mind while reading the following.
    I just upgraded my binaries: i removed zfs-fuse and installed archzfs. did i remove it completely? not sure. i wasn't able to get my array back up and running until i fiddled with the sata cables, moved around the sata connectors, and tinkered with the bios drive detect. after i got it running, i copied some files off of it over samba, thinking it might not last long. the copy was successful, but problems began surfacing again shortly after. so now i suspect i have a bad controller on my gigabyte board. I recently found someone else who had this issue, so i'm thinking it's not the hard drive.
    I did some smartmontools tests last night and found that all drives show good on a short test; they all passed. today i'm not having so much luck getting access. there are hangs on reboot, and the drive light stays on. when i try to run zfs and zpool commands the system reports it is hanging. i have been getting what appear to be HD errors as well; i'll have to type them in here manually since there's no copy and paste from the console to the machine i'm posting from, and the errors aren't showing up via ssh or i would copy them from the terminal i currently have open:
    ata7: SRST failed (errno=-16)
    reset failed, giving up,
    end_request: I/O error, dev sdc, sector 637543760
    end_request: I/O error, dev sdc, sector 637543833
    sd 6:0:0:0: got wrong page
    sd 6:0:0:0: asking for cache data failed
    sd 6:0:0:0: assuming drive cache: write through
    INFO: task txg_sync:348 blocked for more than 120 seconds
    and so forth. when i boot i see this each time, which makes me feel that the HD is going bad; however i still want to believe it's the controller.
    Note, it seems only those two sectors show up. is it possible that the controller shot out those two sectors with bad data? {Note, i had a windows system installed on this motherboard previously, and after a few months of running it lost a couple of raid arrays of data as well.}
    failed command: WRITE DMA EXT
    ... more stuff here...
    ata7.00 error DRDY ERR
    ICRC ABRT
    blah blah blah.
    so now i can give you some info from the diagnosis i'm doing on it, copied from a shell terminal. Note that the following metadata errors JUST appeared after i was trying to delete some files; copying didn't cause this, so it appears either something is currently degrading, or it just inevitably happened from a bad controller
    [root@falcon wolfdogg]# zpool status -v
    pool: pool
    state: ONLINE
    status: One or more devices are faulted in response to IO failures.
    action: Make sure the affected devices are connected, then run 'zpool clear'.
    see: http://zfsonlinux.org/msg/ZFS-8000-HC
    scan: resilvered 33K in 0h0m with 0 errors on Sun Jul 21 03:52:53 2013
    config:
    NAME                                          STATE   READ WRITE CKSUM
    pool                                          ONLINE     0    26     0
      ata-ST2000DM001-9YN164_W1E07E0G             ONLINE     6    41     0
      ata-ST3750640AS_5QD03NB9                    ONLINE     0     0     0
      ata-ST3750640AS_3QD0AD6E                    ONLINE     0     0     0
      ata-WDC_WD5000AADS-00S9B0_WD-WCAV93917591   ONLINE     0     0     0
    errors: Permanent errors have been detected in the following files:
    <metadata>:<0x0>
    <metadata>:<0x1>
    <metadata>:<0x14>
    <metadata>:<0x15>
    <metadata>:<0x16d>
    <metadata>:<0x171>
    <metadata>:<0x277>
    <metadata>:<0x179>
    if one of the devices is faulted, then why are all 4 of them stating ONLINE???
    [root@falcon dev]# smartctl -a /dev/sdc
    smartctl 6.1 2013-03-16 r3800 [x86_64-linux-3.9.9-1-ARCH] (local build)
    Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
    === START OF INFORMATION SECTION ===
    Vendor: /6:0:0:0
    Product:
    User Capacity: 600,332,565,813,390,450 bytes [600 PB]
    Logical block size: 774843950 bytes
    scsiModePageOffset: response length too short, resp_len=47 offset=50 bd_len=46
    scsiModePageOffset: response length too short, resp_len=47 offset=50 bd_len=46
    >> Terminate command early due to bad response to IEC mode page
    A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.
    my drive list
    [root@falcon wolfdogg]# ls -lah /dev/disk/by-id/
    total 0
    drwxr-xr-x 2 root root 280 Jul 21 03:52 .
    drwxr-xr-x 4 root root 80 Jul 21 03:52 ..
    lrwxrwxrwx 1 root root 9 Jul 21 03:52 ata-_NEC_DVD_RW_ND-2510A -> ../../sr0
    lrwxrwxrwx 1 root root 9 Jul 21 03:52 ata-ST2000DM001-9YN164_W1E07E0G -> ../../sdc
    lrwxrwxrwx 1 root root 9 Jul 21 03:52 ata-ST3250823AS_5ND0MS6K -> ../../sdb
    lrwxrwxrwx 1 root root 10 Jul 21 03:52 ata-ST3250823AS_5ND0MS6K-part1 -> ../../sdb1
    lrwxrwxrwx 1 root root 10 Jul 21 03:52 ata-ST3250823AS_5ND0MS6K-part2 -> ../../sdb2
    lrwxrwxrwx 1 root root 10 Jul 21 03:52 ata-ST3250823AS_5ND0MS6K-part3 -> ../../sdb3
    lrwxrwxrwx 1 root root 10 Jul 21 03:52 ata-ST3250823AS_5ND0MS6K-part4 -> ../../sdb4
    lrwxrwxrwx 1 root root 9 Jul 21 03:52 ata-ST3750640AS_3QD0AD6E -> ../../sde
    lrwxrwxrwx 1 root root 9 Jul 21 03:52 ata-ST3750640AS_5QD03NB9 -> ../../sdd
    lrwxrwxrwx 1 root root 9 Jul 21 03:52 ata-WDC_WD5000AADS-00S9B0_WD-WCAV93917591 -> ../../sda
    lrwxrwxrwx 1 root root 9 Jul 21 03:52 wwn-0x5000c50045406de0 -> ../../sdc
    lrwxrwxrwx 1 root root 9 Jul 21 03:52 wwn-0x50014ee1ad3cc907 -> ../../sda
    and this one i dont get
    [root@falcon dev]# zfs list
    no datasets available
    i remember creating a dataset last year; why is it reporting none, but still working?
    is anybody seeing any patterns here? i'm prepared to destroy the pool and recreate it just to see if it's bad data. But what i'm thinking now, since the problem appears to only be happening on the 2TB drive, is that either the controller just can't handle it, or the drive is bad. So, to rule out the controller there might be hope. I have a scsi card (pci to sata) connected that one of the drives in the array is attached to, since i only have 4 sata slots on the mobo; i keep the 500GB connected there and have not yet tried the 2TB on it. So if i connect this 2TB drive to the scsi card i should see the problems disappear, unless the drive has already been corrupted.
    Does anyone on the arch forums know what's going on here? did i mess up by not completely removing zfs-fuse, is my HD going bad, is my controller bad, or did ZFS just get misconfigured?
    Last edited by wolfdogg (2013-07-21 19:38:51)

    ok, something interesting happened when i connected it (the badly behaving 2TB drive) to the scsi pci card. first of all, no errors on boot... then take a look at this: some clues, some remnants of the older zfs-fuse setup, and a working pool.
    [root@falcon wolfdogg]# zfs list
    NAME USED AVAIL REFER MOUNTPOINT
    pool 2.95T 636G 23K /pool
    pool/backup 2.95T 636G 3.49G /backup
    pool/backup/falcon 27.0G 636G 27.0G /backup/falcon
    pool/backup/redtail 2.92T 636G 2.92T /backup/redtail
    [root@falcon wolfdogg]# zpool status
    pool: pool
    state: ONLINE
    status: The pool is formatted using a legacy on-disk format. The pool can
    still be used, but some features are unavailable.
    action: Upgrade the pool using 'zpool upgrade'. Once this is done, the
    pool will no longer be accessible on software that does not support
    feature flags.
    scan: resilvered 33K in 0h0m with 0 errors on Sun Jul 21 04:52:52 2013
    config:
    NAME                                          STATE   READ WRITE CKSUM
    pool                                          ONLINE     0     0     0
      ata-ST2000DM001-9YN164_W1E07E0G             ONLINE     0     0     0
      ata-ST3750640AS_5QD03NB9                    ONLINE     0     0     0
      ata-ST3750640AS_3QD0AD6E                    ONLINE     0     0     0
      ata-WDC_WD5000AADS-00S9B0_WD-WCAV93917591   ONLINE     0     0     0
    errors: No known data errors
    am i looking at a bios update needed here so the controller can talk to the 2TB properly?
    Last edited by wolfdogg (2013-07-21 19:50:18)
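
    Since the earlier status output itself suggests 'zpool clear', the usual follow-up once the drive is on the new controller is to clear the counters and scrub, then recheck; the pool name is taken from the output above:
    zpool clear pool
    zpool scrub pool
    # watch scrub progress and see whether the error counts stay at zero
    zpool status -v pool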

  • Performance issue with brand new intel iMac extreme

    I am at a loss to explain a problem I've been having and I thought I might put it out to you guys.
    In September I purchased a macbook Pro (2.4 ghz, 4 GB RAM) to use in video editing with Final Cut Pro, and for the most part I've been thrilled. I use 1TB LaCie external drives connected via FW800, and perform Multiclip editing with 4-5 video streams at a time and only on occasion have dropped frames during the editing process.
    In December I determined that I needed to have an additional system, and thought a 2.8Ghz Intel iMac extreme would be an excellent choice, since for the same price I could get a little more power in the processor, more hard drive space and a bigger screen to work on. When we picked up the new system in the store (The Grove Apple Store in LA), we had them upgrade the memory to 4GB.
    Since day one we have had performance issues, including problems playing streaming and DVD video, severe delays mounting and unmounting drives (firewire and USB) and application images, and freezing while doing even simple tasks like printing or checking email. These problems occur even while there are no external drives are connected. I have none of these issues with the Macbook Pro, which has virtually an identical set of programs installed, and both running the same version of Leopard.
    I already took the original iMac back to the store, and they exchanged it, but did not have 4GB sets of RAM in stock so they took the RAM from the original machine and put it in the new one. They said if I continued to have problems then it was most likely the RAM and I should come back when they got more in stock. I DID have the same problems with the new machine, and took it back to the Apple Store and they swapped the memory. It seemed to improve the issue, but now I'm seeing the same severe performance issues again.
    All tech support can do is tell me to do a PRAM reset, which seems to improve things very temporarily (but that may be my imagination), or have me restart, which at least makes printing documents work again.
    What I'm wondering is if it is likely that the RAM is the issue and I just got another bad batch, or if the iMac has some weird glitch that isn't present in the macbook Pro...?? Or could I have possibly gotten 2 bad systems in a row? It's extremely frustrating, and I KNOW it shouldn't be this way! It's so bad I get better performance out of my single-core G5 tower! How do I get a good working system that operates like it should? Am I better off getting another Macbook Pro? I'd rather not for several reasons...
    I have xbench on both the MBP and the iMac and can provide test numbers if they'll help, as well as any other info.
    Thank you so much for reading my novella of a post and also for any insight you have!
    Best,
    Travis

    Hi!
    I got the same problem with my MacBook when it was still new in May 2006. It was supposed to be one of the faster laptops around, but it was soooo slow it drove me nuts. I can only advise having a look at whether something is hogging your RAM, and running some tests using these programs on your machine:
    Xbench:
    http://www.macupdate.com/info.php/id/10081
    MenuMeters:
    http://www.macupdate.com/info.php/id/10451
    If they show any unusual results you might have your problem...
    As to my problem with the MacBook: I did a complete re-install (writing the hard disk over with zeroes) and suddenly everything was just fine. (But be sure to back up all your files before that; I learned this one the hard way.) I know it is just a standard answer, but it worked out for me this time...
    Hope this helps in some ways.
    Cheers,
    Rene

  • User exit / substitution / BADI for changing baseline date

    Dear Experts,
    I have a requirement to change the baseline date of the residual document created in F-28 / F-32 to the baseline date of the original document being partially cleared.
    I have explored the option of substitution, but it doesn't work, as field ZBLDT is not available for substitution there.
    Please let me know of any BADI / exit which can perform this change of baseline date.
    thanks in advance

    Hi Milind,
    Following are the user-exits for F-32 :
    RFAVIS01            Customer Exit for Changing Payment Advice Segment Text
    RFEPOS00            Line item display: Checking of selection conditions
    RFKORIEX            Automatic correspondence
    SAPLF051            Workflow for FI (pre-capture, release for payment)
    F050S001            FIDCMT, FIDCC1, FIDCC2: Edit user-defined IDoc segment
    F050S002            FIDCC1: Change IDoc/do not send
    F050S003            FIDCC2: Change IDoc/do not send
    F050S004            FIDCMT, FIDCC1, FIDCC2: Change outbound IDoc/do not send
    F050S005            FIDCMT, FIDCC1, FIDCC2 Inbound IDoc: Change FI document
    F050S006            FI Outgoing IDoc: Reset Clearing in FI Document
    F050S007            FIDCCH Outbound: Influence on IDoc for Document Change
    F180A001            Balance Sheet Adjustment
    FARC0002            Additional Checks for Archiving MM Vendor Master Data
    I hope this will help you.
    Regards,
    Nitin.

  • What is bad # for 'Space Allocated/Used" and 'ITL Waits' in V$SEGMENT_STATS

    I ran a query against the V$SEGMENT_STATISTICS view today and got some possibly disturbing numbers. Can someone let me know if they are bad, or if I am just reading too much into them?
    The DB has been up since 1/10/2011, so they represent the stats since then. DB size is 3TB.
    OBJECT_NAME     OBJECT_TYPE     STATISTIC_NAME     VALUE
    XXPK0EMIANCE     INDEX     space allocated     27,246,198,784
    ITEMINTANCE     TABLE     space allocated     22,228,762,624
    LITEMINSTANCE     TABLE     space used     19,497,901,889
    XXPK0TEMINSTANCE     INDEX     space used     17,431,957,592
    TTINGCORE     TABLE     space allocated     8,724,152,320
    XXPK0IANCE     INDEX     space allocated     6,912,212,992
    SKISTANCE     TABLE     space allocated     4,697,620,480
    IIXCNSTANCE     TABLE     space allocated     4,697,620,480
    on the XXPK0EMIANCE index the initial extent is 64k
    XXPK0MINSTANCE     INDEX     ITL waits     1,123
    XXIEKILSTANCE     INDEX     ITL waits     467
    XXPKLINSTANCE     INDEX     ITL waits     463
    XXPKCE     INDEX     ITL waits     338
    XXIE3ENT     INDEX     ITL waits     237
    If these are bad, do they impact performance? My understanding is that, being wait states, things stop until they are resolved. Is that true?
    Also, these looked high; are they?
    LATION_PK     INDEX     logical reads     242,212,503,104
    XXAK1STSCORE     INDEX     logical reads     117,542,351,984
    XXPK0TSTANCE     INDEX     logical reads     113,532,240,160
    TCORE     TABLE     db block changes 1,913,902,176
    SDENT     TABLE     physical reads     72,161,312
    XXPK0PDUCT     INDEX     segment scans     35,268,027
    ESTSORE     TABLE     buffer busy waits     2,604,947
    XXPK0SUCORE     INDEX     buffer busy waits     119,007
    XXPK0INSTANCE     INDEX     row lock waits     63,810
    XXPK0EMINSTANCE     INDEX row lock waits     58,129
    XXPK0NSTANCE     INDEX     row lock waits     57,776
    XXIE2DDSTANCE     INDEX     row lock waits     54,788
    XXPK0DDDSTSCORE     INDEX row lock waits     49,167
    Am i just reading too much into this? I am not a DBA; our DBA is too busy doing data changes and such to spend time looking at this stuff. I was tasked with trying to find out why our DB is so slow.

    Statistics on waits and reads are cumulative since the last database instance startup --- which was more than 4 months ago.
    So :
    XXPK0MINSTANCE INDEX ITL waits 1,123
    1,123 waits in 4+ months isn't bad.
    Reading such statistics without reference to the duration is utterly meaningless.
    Those 1,123 waits could have been 10 waits a day @1 every 2 hours.
    OR those 1,123 waits could have occurred between 01:00 and 01:30 on 03-May-2011.
    We have no way of knowing which is the case.
    Hemant K Chitale

  • Archive log generation and performance issue

    Hi there,
    I am facing a problem with archive log generation which is significantly degrading database performance.
    Please go through the details below:
    there are some long-running transactions (DML) performed in our database, and during this time massive numbers of archive log files are generated.
    here the problem comes: this operation runs for about 1 hr, and during this time database performance is very slow; even users logging in to the application take a long time.
    There is enough undo tablespace and undo retention configured for the database.
    I am not getting why it has such a bad impact on database performance.
    ----- What could be the reason for this performance degradation? -----
    ----- Is there any way to find which "user session" and "transaction" are generating too many archive log files? -----
    Your quick response will be highly appreciated.

    To resolve your problem with performance degradation, the first thing to do is to collect more information during the degradation.
    You can do that by running AWR or statspack reports during the specified time (as said in the post before), or by checking views like v$session_wait or v$system_event. Then search the report for where you are losing your time, or find the expensive queries.
    Run AWR or statspack reports and post information about wait events, and then you will probably get more precise help. You can also post information about your Oracle version, host, optimizer parameters, or similar relevant information.
    The more information you provide, the better help you'll get.
    btw
    Do you receive "checkpoint not complete" in alert log during excessive redo generation?
    You can also check whether the application can reduce redo generation using 'nologging'. If you have a transaction that deletes a whole table, you could use truncate instead.
    Regards,
    Marko Sutic
    Edited by: msutic on Mar 1, 2009 12:11 AM
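
    A quick way to check for the "checkpoint not complete" messages mentioned above is to grep the alert log; the path below is an assumption and depends on your diagnostic_dest and SID:
    grep -i "checkpoint not complete" /u01/app/oracle/diag/rdbms/orcl/orcl/trace/alert_orcl.log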

  • The warnings has impacts on the performance

    My project's JDK was upgraded from 1.4 to 5.0. As the programs are still written to the 1.4 standard, lots of warnings appear, such as for variable definitions...
    It makes my project's performance very bad. With 1.4, my project could complete processing in 15 minutes, but now it needs over 20 minutes with JDK 5.0. When I use @SuppressWarnings to ignore those warnings, the performance improves to 14 minutes.
    Why do the warnings have a bad influence on performance? Does the JDK write a log if there are warnings???

    Fredo_xia wrote:
    My project's JDK was upgraded from 1.4 to 5.0. As the programs are still written to the 1.4 standard, lots of warnings appear, such as for variable definitions...
    It makes my project's performance very bad. With 1.4, my project could complete processing in 15 minutes, but now it needs over 20 minutes with JDK 5.0. When I use @SuppressWarnings to ignore those warnings, the performance improves to 14 minutes.
    Why do the warnings have a bad influence on performance? Does the JDK write a log if there are warnings???
    Well, as stated above, that is only compile time, not execution time. I would, however, say to fix those warnings, not just suppress them. But that's your choice.
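
    To see for yourself that the warnings cost compile time rather than run time, time the two phases separately; the class name below is hypothetical:
    # warnings are printed by javac here, and only here
    time javac -Xlint:all MyJob.java
    # execution is unaffected by how many warnings the compile produced
    time java MyJob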

    Hi all, We need to assign number range to object O and S according to Personal sub area, and it is going to be internal assignment only. Example *POSITION NUMBER RANGES *   PERMANENT TEMPORARY CASUALS PADI 10100001 - 10150000 10150001 - 10175000 1017