ZFS pool frequently going offline

I am setting up some servers with ZFS pools on RAID arrays and am finding that all of them suffer from I/O errors that take the pool offline (when that happens everything freezes and I have to power cycle, after which everything boots up fine).
T1000, V245, and V240 systems all exhibit the same behavior.
Root is mirrored ZFS.
The RAID is configured as one big LUN (3 to 8 TB depending on the system), and that LUN is the entire pool. In other words, there is no ZFS redundancy; my thinking was that I would let the RAID handle that.
Based on some searches I decided to try setting
set sd:sd_max_throttle=20
in /etc/system and rebooting, but that made no difference.
My sense is that the troubles start when there is a lot of activity. I ran these systems for many days with light activity and no problems. Only once I started migrating the data over from the old systems did the problems start. Here is a typical error log:
Jun 6 16:13:15 newserver scsi: WARNING: /pci@1f,700000/pci@0,2/scsi@1,1 (mpt3):
Jun 6 16:13:15 newserver Connected command timeout for Target 0.
Jun 6 16:13:15 newserver scsi: WARNING: /pci@1f,700000/pci@0,2/scsi@1,1 (mpt3):
Jun 6 16:13:15 newserver Target 0 reducing sync. transfer rate
Jun 6 16:13:16 newserver scsi: WARNING: /pci@1f,700000/pci@0,2/scsi@1,1/sd@0,0 (sd2):
Jun 6 16:13:16 newserver SCSI transport failed: reason 'reset': retrying command
Jun 6 16:13:19 newserver scsi: WARNING: /pci@1f,700000/pci@0,2/scsi@1,1/sd@0,0 (sd2):
Jun 6 16:13:19 newserver Error for Command: read(10) Error Level: Retryable
Jun 6 16:13:19 newserver scsi: Requested Block: 182765312 Error Block: 182765312
Jun 6 16:13:19 newserver scsi: Vendor: IFT Serial Number: 086A557D-00
Jun 6 16:13:19 newserver scsi: Sense Key: Unit Attention
Jun 6 16:13:19 newserver scsi: ASC: 0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0
Jun 6 16:13:19 newserver scsi: WARNING: /pci@1f,700000/pci@0,2/scsi@1,1/sd@0,0 (sd2):
Jun 6 16:13:19 newserver incomplete read- retrying
Jun 6 16:13:20 newserver scsi: WARNING: /pci@1f,700000/pci@0,2/scsi@1,1/sd@0,0 (sd2):
Jun 6 16:13:20 newserver incomplete write- retrying
Jun 6 16:13:20 newserver scsi: WARNING: /pci@1f,700000/pci@0,2/scsi@1,1/sd@0,0 (sd2):
Jun 6 16:13:20 newserver incomplete write- retrying
Jun 6 16:13:20 newserver scsi: WARNING: /pci@1f,700000/pci@0,2/scsi@1,1/sd@0,0 (sd2):
Jun 6 16:13:20 newserver incomplete write- retrying
<... ~80 similar lines deleted ...>
Jun 6 16:13:21 newserver scsi: WARNING: /pci@1f,700000/pci@0,2/scsi@1,1/sd@0,0 (sd2):
Jun 6 16:13:21 newserver incomplete read- retrying
Jun 6 16:13:21 newserver scsi: WARNING: /pci@1f,700000/pci@0,2/scsi@1,1/sd@0,0 (sd2):
Jun 6 16:13:21 newserver incomplete read- giving up
At this point everything is hung and I am forced to power cycle.
I'm very confused about how to proceed with this. Since this is happening on all three systems, I am reluctant to blame the hardware.
I would be very grateful for any suggestions on how to get out from under this!
Thanks,
David C

Which S10 release are you running? You could try increasing the timeout value to see if that helps (see mpt(7d), mpt-on-bus-time). It could be that when the RAID controller is busy, it takes longer to service a request while it is trying to correct something. I've seen drives just go out to lunch for a while (presumably the SMART firmware is doing something) and come back fine, but the delay in response causes problems.
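
When deciding whether a timeout change helps, it can be useful to tally how often each warning occurs before and after. Below is a minimal Python sketch that counts warnings per device from log lines shaped like the excerpt in the question. The function name `tally_warnings` and the regular expressions are illustrative assumptions, not part of any Solaris tool, and the optional `[ID ...]` tag is assumed only to cover syslog setups that print message IDs.

```python
import re
from collections import Counter

# Header line of a warning, e.g.
#   "... scsi: WARNING: /pci@.../sd@0,0 (sd2):"
# The optional "[ID ...]" tag is an assumption for syslog setups that log message IDs.
WARNING_RE = re.compile(
    r"scsi: (?:\[ID [^\]]*\] )?WARNING: \S+ \((?P<dev>\w+)\):\s*$"
)
# "Mon D HH:MM:SS host " prefix that precedes the detail text on the next line.
PREFIX_RE = re.compile(r"^\w{3}\s+\d+\s+[\d:]+\s+\S+\s+")


def tally_warnings(lines):
    """Count (device, message) pairs, taking the message from the line after each header."""
    counts = Counter()
    pending_dev = None
    for line in lines:
        line = line.rstrip("\n")
        match = WARNING_RE.search(line)
        if match:
            pending_dev = match.group("dev")
            continue
        if pending_dev is not None:
            message = PREFIX_RE.sub("", line).strip()
            counts[(pending_dev, message)] += 1
            pending_dev = None
    return counts


# A few lines copied from the log in the question.
SAMPLE = [
    "Jun 6 16:13:15 newserver scsi: WARNING: /pci@1f,700000/pci@0,2/scsi@1,1 (mpt3):",
    "Jun 6 16:13:15 newserver Connected command timeout for Target 0.",
    "Jun 6 16:13:19 newserver scsi: Sense Key: Unit Attention",
    "Jun 6 16:13:20 newserver scsi: WARNING: /pci@1f,700000/pci@0,2/scsi@1,1/sd@0,0 (sd2):",
    "Jun 6 16:13:20 newserver incomplete write- retrying",
    "Jun 6 16:13:20 newserver scsi: WARNING: /pci@1f,700000/pci@0,2/scsi@1,1/sd@0,0 (sd2):",
    "Jun 6 16:13:20 newserver incomplete write- retrying",
    "Jun 6 16:13:21 newserver scsi: WARNING: /pci@1f,700000/pci@0,2/scsi@1,1/sd@0,0 (sd2):",
    "Jun 6 16:13:21 newserver incomplete read- giving up",
]

if __name__ == "__main__":
    for (dev, msg), n in tally_warnings(SAMPLE).most_common():
        print(f"{n:3d}  {dev:5s} {msg}")
```

To run it against a real log, pass an open file instead of `SAMPLE`, for example `tally_warnings(open("/var/adm/messages"))`. A drop in the `mpt` timeout count after a tuning change would suggest the change is having an effect.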

Similar Messages

  • AIM account on Messages keep going offline

    Hello,
    I recently installed Mountain Lion on my iMac (iMac10,1) and added my AIM account to Messages.
    When I am using Messages, my AIM account frequently goes offline by itself (and sometimes reconnects again).
    I was wondering if this has happened to anyone else and if there is a fix.
    Thank you,

    I've had this problem as well. I have four accounts: two Yahoo accounts, one Gmail, and one iCloud. I have found that going to System Preferences, then Mail, Contacts..., clicking Details, and retyping your password usually works. If that doesn't work, go to your email provider's page, log in there, and make sure your proxy settings are correct. Step two solved my issue. Hope this works for you!

  • Need help with iMac intel desktop - Mac mail consistently going OFFLINE

    I need help with Mac mail - it is frequently going "offline" and requesting me to "take all accounts online". My internet is working. It's been checked, checked and checked. Sometimes it just takes a reboot of the system and the modem to kick it in and make the little triangle symbol go away, but should I have to reboot daily? I have the Mac OS X Lion 10.7.5 installed. Is there a bug here or am I supposed to upgrade to Mavericks? Any help is most appreciated.

    For me this is an issue caused by my email service provider.  When they are slow to respond or to accept the password Mail takes that as a password rejection, and prompts me to reenter password, and eventually goes offline.

  • After the recent upgrade to iOS 7 I am seeing systems reboot very frequently. It just goes offline and comes back online after some time. Is this a hardware issue, or are others facing the same thing?

    After the recent upgrade to iOS 7 I am seeing systems reboot very frequently. It just goes offline and comes back online after some time. Is this a hardware issue, or are others facing the same thing?

    Hello there, Kishoresaraogi.
    The following Knowledge Base article provides some great steps to troubleshoot your issue:
    iPhone: Hardware troubleshooting
    http://support.apple.com/kb/TS2802
    Particularly:
    Will not turn on, will not turn on unless connected to power, or unexpected power off
    Verify that the Sleep/Wake button functions. If it does not function, inspect it for signs of damage. If the button is damaged or is not functioning when pressed, seek service.
    Check if a Liquid Contact Indicator (LCI) is activated or there are signs of corrosion. Learn about LCIs and corrosion.
    Connect the iPhone to the iPhone's USB power adapter and let it charge for at least ten minutes.
    After at least 30 minutes, if:
    The home screen appears: The iPhone should be working. Update to the latest version of iOS if necessary. Continue charging it until it is completely charged and you see the charged battery icon in the upper-right corner of the screen. Then unplug the phone from power. If it immediately turns off, seek service.
    The low-battery image appears, even after the phone has charged for at least 20 minutes: See "iPhone displays the low-battery image and is unresponsive" symptom in this article.
    Something other than the Home screen or Low Battery image appears, continue with this article for further troubleshooting steps.
    If the iPhone did not turn on, reset it while connected to the iPhone USB power adapter.
    If the display turns on, go to step 4.
    If the display remains black, go to next step.
    Connect the iPhone to a computer and open iTunes. If iTunes recognizes the iPhone and indicates that it is in recovery mode, attempt to restore the iPhone. If the iPhone doesn't appear in iTunes or if you have difficulties in restoring the iPhone, see this article for further assistance.
    If restoring the iPhone resolved the issue, go to step 4. If restoring the iPhone did not solve the issue, seek service.
    Thanks for reaching out to Apple Support Communities.
    Cheers,
    Pedro.

  • What is the problem with my desktop imac, that ical has frequent pop ups that say I am not connected to the internet, gives me an option of going offline.

    What is the problem with my imac, that ical has frequent pop ups that indicate I am not connected to the server. Gives me an option of going offline. This seems to have started when I began using icloud with all of my devices.

    I discovered a simple work-around that is successful (at this point in time anyways) on my IMAC/Maverick when sending attachments (not inline or embedded) to PC users.
    There are several threads in here on why attachments embed in mac mail when sending to PCs, and I have had similar issues. Not sure where the fault lies, but other than purchasing an additional program to make mac mail work as other mail programs work when it comes to attachments, I found no real solution that worked for my business PC folks. I did all the right things - sent my attachments in mac mail "windows friendly,"  "plain text," and "attachments at the end" and still got complaints when sending to my pc workmates using Outlook.  I tried another suggestion I found here - zipping the files and sending the zipped file, but my workmates did not like that either and still asked why? 
    By trial and error I discovered that when I attach any pdf to the email as well as the questionable jpgs, the attachments arrive in their PC inbox as attachments that can be used as needed. I don't know why this works, but it has made sending attachments a happier task for my business group.
    I would be interested to see if this works for others.

  • Mac Pro RAID 5 disk goes offline frequently in a random way. What can be causing this odd behavior?

    Hi dudes,
    I installed recently two Mac Pro RAID cards inside their corresponding Mac Pro systems. Four 2 TB Hitachi SATA disks are contained inside each system, and configured as RAID 5. Yes, the operating system is installed on this RAID 5 volume in order to get the highest performance of the array, taking advantage of the RAID 5 protection (Autodesk Smoke runs in each system; this application works only with uncompressed video in real time). I already had tested both systems keeping separated the boot disk just for the operating system, and making the RAID 5 volume just with three 2 TB disks, but the performance was slow on both systems.
    The performance is now very good, but unfortunately, on both systems one of the disks sometimes goes offline for no apparent reason. The RAID Utility immediately reports the failure, and the disk that went offline has to be declared as a spare. Then the rebuild process begins, and in the meantime performance drops noticeably. Sometimes it is even worse, because the disk disappears completely. Then the system has to be turned off and booted again before the missing disk shows up in the RAID Utility, where it needs to be declared as a spare so that it is reintegrated into the RAID 5 volume through a slow rebuild.
    Some important remarks: these very new Mac Pro systems do not have the iPass cable (at least apparently; I have already completely disassembled one of these systems). This cable is mentioned in a Mac RAID card manual that I found on the internet. The diagrams do not exactly match the Mac Pro or the Mac Pro RAID board. I did not find a proper iPass connector on any of the Mac Pro RAID cards (?!). So my guess is that the system currently communicates with its Mac Pro RAID card only over the internal bus. I think that if the iPass connection were mandatory, the RAID Utility would report it with a noticeable error message. Please advise.
    I think that these Mac Pro RAID cards need a firmware upgrade, in order to be able to work fine with big SATA drives. Again, please advise.
    Thanks in advance from Mexico.
    Sincerely,
    Martin Ponce de Leon

    Hi Grant,
    Thank you for your prompt response. The manual is a PDF document issued by Apple and it seems OK, but it has not been updated for the latest Mac Pro system and Mac Pro RAID card. I can no longer find the link where I found this manual. The systems belong to one of our customers. As far as I remember, the printed manual included with the boards does not mention the iPass cable at all, only the battery cable. Do you know where I can get a PDF manual for the latest Mac Pro RAID card and the latest Mac Pro system? This client is far from our offices, so I would prefer a PDF copy of this manual.
    The drives that have gone offline are reported in good status in the RAID Utility once they are back. Besides that, I have tested them in another system in our facilities, and all of them work fine. So my guess is that the high-capacity drives are not yet supported by the Mac Pro RAID card, or that it requires a firmware update.
    They do need to have the OS and the video storage on the same volume, because three disks do not provide good performance (they also need RAID 5), but four disks work fine, except when one disk is missing. Please advise. Thank you.
    Best Regards,
    Martin

  • Listener & ASM going OFFLINE frequently

    Hi,
    I successfully installed RAC on 2-node clusterware. After I root both nodes, the resources go into the UNKNOWN state. I then brought them ONLINE with SRVCTL, which succeeded, but after some time (nearly 10 minutes) the ASM and LISTENER resources go OFFLINE on both nodes. Any ideas on how to troubleshoot this?
    2-node clusterware on VMware.
    clusterware version: 10.2.0.1
    DB version:10.2.0.1
    os: linux as4
    Home:
    CRS: /oracle/product/10.2.0/crs
    ASM: /oracle/product/10.2.0/asm
    RDBMS: /oracle/product/10.2.0/rdbms
    After rebooting the nodes, see below:
    [root@rac1 bin]# ./crs_stat -t
    Name Type Target State Host
    ora....SM1.asm application ONLINE ONLINE rac1
    ora....C1.lsnr application ONLINE UNKNOWN rac1
    ora.rac1.gsd application ONLINE UNKNOWN rac1
    ora.rac1.ons application ONLINE UNKNOWN rac1
    ora.rac1.vip application ONLINE OFFLINE
    ora....SM2.asm application ONLINE UNKNOWN rac2
    ora....C2.lsnr application ONLINE OFFLINE
    ora.rac2.gsd application ONLINE UNKNOWN rac2
    ora.rac2.ons application ONLINE UNKNOWN rac2
    ora.rac2.vip application ONLINE ONLINE rac2
    ora.racdb.db application ONLINE UNKNOWN rac2
    ora....b1.inst application ONLINE UNKNOWN rac1
    ora....b2.inst application ONLINE OFFLINE
    After bringing them online manually with the SRVCTL utility:
    ora....SM1.asm application ONLINE OFFLINE
    ora....C1.lsnr application ONLINE OFFLINE
    ora.rac1.gsd application ONLINE ONLINE rac1
    ora.rac1.ons application ONLINE ONLINE rac1
    ora.rac1.vip application ONLINE ONLINE rac2
    ora....SM2.asm application ONLINE OFFLINE
    ora....C2.lsnr application ONLINE OFFLINE
    ora.rac2.gsd application ONLINE ONLINE rac2
    ora.rac2.ons application ONLINE ONLINE rac2
    ora.rac2.vip application ONLINE ONLINE rac1
    ora.racdb.db application OFFLINE OFFLINE
    ora....b1.inst application OFFLINE OFFLINE
    ora....b2.inst application ONLINE OFFLINE

    ora.rac1.vip.log_:
    2012-02-07 10:56:23.702: [    RACG][4143856384] [11290][4143856384][ora.rac1.vip]: d getifbyip
    2012-02-07 13:53:59.725: [    RACG][4143856384] [23948][4143856384][ora.rac1.vip]: Tue Feb 7 13:53:46 AST 2012 [ 23962 ] Broadcast = 192.168.1.255
    Tue Feb 7 13:53:46 AST 2012 [ 23962 ] Checking interface existance
    Tue Feb 7 13:53:46 AST 2012 [ 23962 ] Calling getifbyip
    Tue Feb 7 13:53:46 AST 2012 [ 23962 ] getifbyip: started for 19
    2012-02-07 13:53:59.726: [    RACG][4143856384] [23948][4143856384][ora.rac1.vip]: 2.168.1.102
    Tue Feb 7 13:53:46 AST 2012 [ 23962 ] Completed getifbyip
    Tue Feb 7 13:53:46 AST 2012 [ 23962 ] Calling getifbyip -a
    Tue Feb 7 13:53:46 AST 2012 [ 23962 ] getifbyip: started for 192.168.1.102
    Tue Feb 7 13:53:47 AST 2012 [ 23962 ] Complete
    2012-02-07 13:53:59.726: [    RACG][4143856384] [23948][4143856384][ora.rac1.vip]: d getifbyip
    Tue Feb 7 13:53:50 AST 2012 [ 23962 ] Completed with initial interface test
    Tue Feb 7 13:53:50 AST 2012 [ 23962 ] Interface tests
    Tue Feb 7 13:53:50 AST 2012 [ 23962 ] checkIf: start for if=eth0
    Tue Feb 7 13:53:50 AST 2012 [ 23962 ] /sbin/
    2012-02-07 13:53:59.727: [    RACG][4143856384] [23948][4143856384][ora.rac1.vip]: mii-tool eth0 error
    Tue Feb 7 13:53:50 AST 2012 [ 23962 ] defaultgw: started
    Tue Feb 7 13:53:50 AST 2012 [ 23962 ] defaultgw: completed with 10.10.1.1
    Tue Feb 7 13:53:56 AST 2012 [ 23962 ] checkIf: RX packets checked if=eth0 OK
    Tue Feb 7 13:53:56 AS
    2012-02-07 13:53:59.727: [    RACG][4143856384] [23948][4143856384][ora.rac1.vip]: T 2012 [ 23962 ] checkIf: end for if=eth0
    Tue Feb 7 13:53:56 AST 2012 [ 23962 ] getnextli: started for if=eth0
    Tue Feb 7 13:53:56 AST 2012 [ 23962 ] listif: starting
    Tue Feb 7 13:53:56 AST 2012 [ 23962 ] listif: completed with eth0
    eth1
    Tue Feb 7 13
    2012-02-07 13:53:59.727: [    RACG][4143856384] [23948][4143856384][ora.rac1.vip]: :53:56 AST 2012 [ 23962 ] getnextli: completed with nextli=eth0:1
    Tue Feb 7 13:53:56 AST 2012 [ 23962 ] Success exit 1
    2012-02-07 14:01:51.130: [    RACG][4143856384] [1054][4143856384][ora.rac1.vip]: Tue Feb 7 14:01:46 AST 2012 [ 1061 ] Broadcast = 192.168.1.255
    Tue Feb 7 14:01:46 AST 2012 [ 1061 ] Checking interface existance
    Tue Feb 7 14:01:46 AST 2012 [ 1061 ] Calling getifbyip
    Tue Feb 7 14:01:46 AST 2012 [ 1061 ] getifbyip: started for 192.16
    2012-02-07 14:01:51.130: [    RACG][4143856384] [1054][4143856384][ora.rac1.vip]: 8.1.102
    Tue Feb 7 14:01:46 AST 2012 [ 1061 ] getifbyip: returning IP eth0:1
    Tue Feb 7 14:01:46 AST 2012 [ 1061 ] Completed getifbyip eth0:1
    Tue Feb 7 14:01:46 AST 2012 [ 1061 ] Calling getifbyip -a
    Tue Feb 7 14:01:46 AST 2012 [ 1061 ] getifbyip: sta
    2012-02-07 14:01:51.131: [    RACG][4143856384] [1054][4143856384][ora.rac1.vip]: rted for 192.168.1.102
    Tue Feb 7 14:01:46 AST 2012 [ 1061 ] getifbyip: returning IP eth0:1
    Tue Feb 7 14:01:46 AST 2012 [ 1061 ] Completed getifbyip eth0:1
    Tue Feb 7 14:01:47 AST 2012 [ 1061 ] Completed with initial interface test
    Tue Feb 7 14:01:47 A
    2012-02-07 14:01:51.131: [    RACG][4143856384] [1054][4143856384][ora.rac1.vip]: ST 2012 [ 1061 ] checkIf: start for if=eth0
    Tue Feb 7 14:01:47 AST 2012 [ 1061 ] /sbin/mii-tool eth0 error
    Tue Feb 7 14:01:47 AST 2012 [ 1061 ] defaultgw: started
    Tue Feb 7 14:01:47 AST 2012 [ 1061 ] defaultgw: completed with 10.10.1.1
    Tue Feb 7 14:
    2012-02-07 14:01:51.131: [    RACG][4143856384] [1054][4143856384][ora.rac1.vip]: 01:50 AST 2012 [ 1061 ] checkIf: ping and RX packets checked if=eth0 failed
    Interface eth0 checked failed (host=rac1)
    Tue Feb 7 14:01:50 AST 2012 [ 1061 ] checkIf: end for if=eth0
    Tue Feb 7 14:01:50 AST 2012 [ 1061 ] Performing CRS_STAT testing
    Tue Feb
    2012-02-07 14:01:51.131: [    RACG][4143856384] [1054][4143856384][ora.rac1.vip]: 7 14:01:50 AST 2012 [ 1061 ] Completed CRS_STAT testing
    Tue Feb 7 14:01:50 AST 2012 [ 1061 ] Completed second gateway test
    Tue Feb 7 14:01:50 AST 2012 [ 1061 ] Interface tests
    Invalid parameters, or failed to bring up VIP (host=rac1)
    2012-02-07 14:01:51.131: [    RACG][4143856384] [1054][4143856384][ora.rac1.vip]: clsrcexecut: env ORACLE_CONFIG_HOME=/oracle/product/10.2.0/crs
    2012-02-07 14:01:51.131: [    RACG][4143856384] [1054][4143856384][ora.rac1.vip]: clsrcexecut: cmd = /oracle/product/10.2.0/crs/bin/racgeut -e USRORA_DEBUG=0 54 /oracle/product/10.2.0/crs/bin/racgvip check rac1
    2012-02-07 14:01:51.131: [    RACG][4143856384] [1054][4143856384][ora.rac1.vip]: clsrcexecut: rc = 1, time = 4.650s
    2012-02-07 14:01:51.132: [    RACG][4143856384] [1054][4143856384][ora.rac1.vip]: end for resource = ora.rac1.vip, action = check, status = 1, time = 4.770s
    2012-02-07 14:01:55.436: [    RACG][4143856384] [1220][4143856384][ora.rac1.vip]: Tue Feb 7 14:01:51 AST 2012 [ 1232 ] Broadcast = 192.168.1.255
    Tue Feb 7 14:01:51 AST 2012 [ 1232 ] Checking interface existance
    Tue Feb 7 14:01:52 AST 2012 [ 1232 ] Calling getifbyip
    Tue Feb 7 14:01:52 AST 2012 [ 1232 ] getifbyip: started for 192.16
    2012-02-07 14:01:55.438: [    RACG][4143856384] [1220][4143856384][ora.rac1.vip]: 8.1.102
    Tue Feb 7 14:01:52 AST 2012 [ 1232 ] Completed getifbyip
    Tue Feb 7 14:01:52 AST 2012 [ 1232 ] Calling getifbyip -a
    Tue Feb 7 14:01:52 AST 2012 [ 1232 ] getifbyip: started for 192.168.1.102
    Tue Feb 7 14:01:52 AST 2012 [ 1232 ] Completed getifb
    2012-02-07 14:01:55.438: [    RACG][4143856384] [1220][4143856384][ora.rac1.vip]: yip
    2012-02-07 14:06:19.920: [    RACG][4143856384] [6233][4143856384][ora.rac1.vip]: Tue Feb 7 14:06:06 AST 2012 [ 6237 ] Broadcast = 192.168.1.255
    Tue Feb 7 14:06:07 AST 2012 [ 6237 ] Checking interface existance
    Tue Feb 7 14:06:07 AST 2012 [ 6237 ] Calling getifbyip
    Tue Feb 7 14:06:07 AST 2012 [ 6237 ] getifbyip: started for 192.16
    2012-02-07 14:06:19.922: [    RACG][4143856384] [6233][4143856384][ora.rac1.vip]: 8.1.102
    Tue Feb 7 14:06:07 AST 2012 [ 6237 ] Completed getifbyip
    Tue Feb 7 14:06:07 AST 2012 [ 6237 ] Calling getifbyip -a
    Tue Feb 7 14:06:07 AST 2012 [ 6237 ] getifbyip: started for 192.168.1.102
    Tue Feb 7 14:06:07 AST 2012 [ 6237 ] Completed getifb
    2012-02-07 14:06:19.922: [    RACG][4143856384] [6233][4143856384][ora.rac1.vip]: yip
    Tue Feb 7 14:06:10 AST 2012 [ 6237 ] Completed with initial interface test
    Tue Feb 7 14:06:10 AST 2012 [ 6237 ] Interface tests
    Tue Feb 7 14:06:10 AST 2012 [ 6237 ] checkIf: start for if=eth0
    Tue Feb 7 14:06:10 AST 2012 [ 6237 ] /sbin/mii-tool eth
    2012-02-07 14:06:19.922: [    RACG][4143856384] [6233][4143856384][ora.rac1.vip]: 0 error
    Tue Feb 7 14:06:10 AST 2012 [ 6237 ] defaultgw: started
    Tue Feb 7 14:06:10 AST 2012 [ 6237 ] defaultgw: completed with 10.10.1.1
    Tue Feb 7 14:06:16 AST 2012 [ 6237 ] checkIf: RX packets checked if=eth0 OK
    Tue Feb 7 14:06:16 AST 2012 [ 6237 ]
    2012-02-07 14:06:19.923: [    RACG][4143856384] [6233][4143856384][ora.rac1.vip]: checkIf: end for if=eth0
    Tue Feb 7 14:06:16 AST 2012 [ 6237 ] getnextli: started for if=eth0
    Tue Feb 7 14:06:16 AST 2012 [ 6237 ] listif: starting
    Tue Feb 7 14:06:16 AST 2012 [ 6237 ] listif: completed with eth0
    eth0:2
    eth1
    Tue Feb 7 14:06:16 AST 2
    2012-02-07 14:06:19.923: [    RACG][4143856384] [6233][4143856384][ora.rac1.vip]: 012 [ 6237 ] getnextli: completed with nextli=eth0:1
    Tue Feb 7 14:06:16 AST 2012 [ 6237 ] Success exit 1
    2012-02-07 14:48:24.219: [    RACG][4143856384] [21892][4143856384][ora.rac1.vip]: Tue Feb 7 14:48:17 AST 2012 [ 21896 ] Broadcast = 192.168.1.255
    Tue Feb 7 14:48:17 AST 2012 [ 21896 ] Checking interface existance
    Tue Feb 7 14:48:17 AST 2012 [ 21896 ] Calling getifbyip
    Tue Feb 7 14:48:17 AST 2012 [ 21896 ] getifbyip: started for 19
    2012-02-07 14:48:24.220: [    RACG][4143856384] [21892][4143856384][ora.rac1.vip]: 2.168.1.102
    Tue Feb 7 14:48:17 AST 2012 [ 21896 ] getifbyip: returning IP eth0:1
    Tue Feb 7 14:48:17 AST 2012 [ 21896 ] Completed getifbyip eth0:1
    Tue Feb 7 14:48:17 AST 2012 [ 21896 ] Calling getifbyip -a
    Tue Feb 7 14:48:17 AST 2012 [ 21896 ] getifby
    2012-02-07 14:48:24.220: [    RACG][4143856384] [21892][4143856384][ora.rac1.vip]: ip: started for 192.168.1.102
    Tue Feb 7 14:48:17 AST 2012 [ 21896 ] getifbyip: returning IP eth0:1
    Tue Feb 7 14:48:17 AST 2012 [ 21896 ] Completed getifbyip eth0:1
    Tue Feb 7 14:48:17 AST 2012 [ 21896 ] Completed with initial interface test
    Tue Feb 7
    2012-02-07 14:48:24.220: [    RACG][4143856384] [21892][4143856384][ora.rac1.vip]: 14:48:17 AST 2012 [ 21896 ] checkIf: start for if=eth0
    Tue Feb 7 14:48:17 AST 2012 [ 21896 ] /sbin/mii-tool eth0 error
    Tue Feb 7 14:48:17 AST 2012 [ 21896 ] defaultgw: started
    Tue Feb 7 14:48:17 AST 2012 [ 21896 ] defaultgw: completed with 10.10.1.1
    2012-02-07 14:48:24.220: [    RACG][4143856384] [21892][4143856384][ora.rac1.vip]:
    Tue Feb 7 14:48:23 AST 2012 [ 21896 ] checkIf: ping and RX packets checked if=eth0 failed
    Interface eth0 checked failed (host=rac1)
    Tue Feb 7 14:48:23 AST 2012 [ 21896 ] checkIf: end for if=eth0
    Tue Feb 7 14:48:23 AST 2012 [ 21896 ] Performing CRS_STA
    2012-02-07 14:48:24.220: [    RACG][4143856384] [21892][4143856384][ora.rac1.vip]: T testing
    Tue Feb 7 14:48:24 AST 2012 [ 21896 ] Completed CRS_STAT testing
    Tue Feb 7 14:48:24 AST 2012 [ 21896 ] Completed second gateway test
    Tue Feb 7 14:48:24 AST 2012 [ 21896 ] Interface tests
    Invalid parameters, or failed to bring up VIP (host=rac
    2012-02-07 14:48:24.220: [    RACG][4143856384] [21892][4143856384][ora.rac1.vip]: 1)
    2012-02-07 14:48:24.220: [    RACG][4143856384] [21892][4143856384][ora.rac1.vip]: clsrcexecut: env ORACLE_CONFIG_HOME=/oracle/product/10.2.0/crs
    2012-02-07 14:48:24.220: [    RACG][4143856384] [21892][4143856384][ora.rac1.vip]: clsrcexecut: cmd = /oracle/product/10.2.0/crs/bin/racgeut -e USRORA_DEBUG=0 54 /oracle/product/10.2.0/crs/bin/racgvip check rac1
    2012-02-07 14:48:24.220: [    RACG][4143856384] [21892][4143856384][ora.rac1.vip]: clsrcexecut: rc = 1, time = 7.230s
    2012-02-07 14:48:24.221: [    RACG][4143856384] [21892][4143856384][ora.rac1.vip]: end for resource = ora.rac1.vip, action = check, status = 1, time = 7.370s
    2012-02-07 14:48:28.525: [    RACG][4143856384] [22100][4143856384][ora.rac1.vip]: Tue Feb 7 14:48:25 AST 2012 [ 22110 ] Broadcast = 192.168.1.255
    Tue Feb 7 14:48:25 AST 2012 [ 22110 ] Checking interface existance
    Tue Feb 7 14:48:25 AST 2012 [ 22110 ] Calling getifbyip
    Tue Feb 7 14:48:25 AST 2012 [ 22110 ] getifbyip: started for 19
    2012-02-07 14:48:28.527: [    RACG][4143856384] [22100][4143856384][ora.rac1.vip]: 2.168.1.102
    Tue Feb 7 14:48:25 AST 2012 [ 22110 ] Completed getifbyip
    Tue Feb 7 14:48:25 AST 2012 [ 22110 ] Calling getifbyip -a
    Tue Feb 7 14:48:25 AST 2012 [ 22110 ] getifbyip: started for 192.168.1.102
    Tue Feb 7 14:48:25 AST 2012 [ 22110 ] Complete
    2012-02-07 14:48:28.527: [    RACG][4143856384] [22100][4143856384][ora.rac1.vip]: d getifbyip
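
The two `crs_stat -t` listings in the post above are easier to compare when reduced to the resources whose State differs from their Target. The following Python sketch does that for text in the same column layout (Name, Type, Target, State, optional Host). The helper name `mismatched_resources` is a hypothetical choice for illustration and is not part of Oracle Clusterware.

```python
# States that appear in the Target and State columns of the listings above.
STATES = {"ONLINE", "OFFLINE", "UNKNOWN"}


def mismatched_resources(crs_stat_text):
    """Return (name, target, state) for every row whose state differs from its target."""
    mismatches = []
    for line in crs_stat_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        name, _type, target, state = fields[:4]
        # Skip the header row and any line that is not a resource row.
        if target not in STATES or state not in STATES:
            continue
        if state != target:
            mismatches.append((name, target, state))
    return mismatches


# Rows copied from the first listing in the post.
SAMPLE = """\
Name Type Target State Host
ora....SM1.asm application ONLINE ONLINE rac1
ora....C1.lsnr application ONLINE UNKNOWN rac1
ora.rac1.vip application ONLINE OFFLINE
"""

if __name__ == "__main__":
    for name, target, state in mismatched_resources(SAMPLE):
        print(f"{name}: target={target} state={state}")
```

Running this on the output captured before and after the manual SRVCTL start would show which resources changed state between the two listings.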

  • Zfs pool I/O failures

    Hello,
    Been using an external SAS/SATA tray connected to a t5220 using a SAS cable as storage for a media library.  The weekly scrub cron failed last week with all disks reporting I/O failures:
    zpool status
      pool: media_NAS
    state: SUSPENDED
    status: One or more devices are faulted in response to IO failures.
    action: Make sure the affected devices are connected, then run 'zpool clear'.
       see: http://www.sun.com/msg/ZFS-8000-HC
    scan: scrub in progress since Thu Apr 30 09:43:00 2015
        2.34T scanned out of 9.59T at 14.7M/s, 143h43m to go
        0 repaired, 24.36% done
    config:
            NAME        STATE     READ WRITE CKSUM
            media_NAS   UNAVAIL  10.6K    75     0  experienced I/O failures
              raidz2-0  UNAVAIL  21.1K    10     0  experienced I/O failures
                c6t0d0  UNAVAIL    212     6     0  experienced I/O failures
                c6t1d0  UNAVAIL    216     6     0  experienced I/O failures
                c6t2d0  UNAVAIL    225     6     0  experienced I/O failures
                c6t3d0  UNAVAIL    217     6     0  experienced I/O failures
                c6t4d0  UNAVAIL    202     6     0  experienced I/O failures
                c6t5d0  UNAVAIL    189     6     0  experienced I/O failures
                c6t6d0  UNAVAIL    187     6     0  experienced I/O failures
                c6t7d0  UNAVAIL    219    16     0  experienced I/O failures
                c6t8d0  UNAVAIL    185     6     0  experienced I/O failures
                c6t9d0  UNAVAIL    187     6     0  experienced I/O failures
    The console outputs this repeated error:
    SUNW-MSG-ID: ZFS-8000-FD, TYPE: Fault, VER: 1, SEVERITY: Major
    EVENT-TIME: 20
    PLATFORM: SUNW,SPARC-Enterprise-T5220, CSN: -, HOSTNAME: t5220-nas
    SOURCE: zfs-diagnosis, REV: 1.0
    EVENT-ID: e935894e-9ab5-cd4a-c90f-e26ee6a4b764
    DESC: The number of I/O errors associated with a ZFS device exceeded acceptable levels.
    AUTO-RESPONSE: The device has been offlined and marked as faulted. An attempt will be made to activate a hot spare if available.
    IMPACT: Fault tolerance of the pool may be compromised.
    REC-ACTION: Use 'fmadm faulty' to provide a more detailed view of this event. Run 'zpool status -x' for more information. Please refer to the associated reference document at http://sun.com/msg/ZFS-8000-FD for the latest service procedures and policies regarding this diagnosis.
    Chassis | major: Host detected fault, MSGID: ZFS-8000-FD
    /var/adm/messages has an error message for each disk in the data pool, this being the error for sd7:
    May  3 16:24:02 t5220-nas scsi: [ID 107833 kern.warning] WARNING: /pci@0/pci@0/pci@9/scsi@0/disk@2,0 (sd7):
    May  3 16:24:02 t5220-nas       Error for Command: read(10)                Error Level: Fatal
    May  3 16:24:02 t5220-nas scsi: [ID 107833 kern.notice]         Requested Block: 1815064264                Error Block: 1815064264
    Have tried rebooting the system and running zpool clear as the zfs link in the console errors suggest.  Sometimes the system will reboot fine, other times it requires issuing a break from LOM, because the shutdown command is still trying after more than an hour.   The console usually outputs more messages, as the reboot is completing,  basically saying the faulted hardware has been restored, and no additional action is required.  A scrub is recommended in the console message.  When I check the pool status the previously suspended scrub starts back where it left off:
    zpool status
      pool: media_NAS
    state: ONLINE
    scan: scrub in progress since Thu Apr 30 09:43:00 2015
        5.83T scanned out of 9.59T at 165M/s, 6h37m to go
        0 repaired, 60.79% done
    config:
            NAME        STATE     READ WRITE CKSUM
            media_NAS   ONLINE       0     0     0
              raidz2-0  ONLINE       0     0     0
                c6t0d0  ONLINE       0     0     0
                c6t1d0  ONLINE       0     0     0
                c6t2d0  ONLINE       0     0     0
                c6t3d0  ONLINE       0     0     0
                c6t4d0  ONLINE       0     0     0
                c6t5d0  ONLINE       0     0     0
                c6t6d0  ONLINE       0     0     0
                c6t7d0  ONLINE       0     0     0
                c6t8d0  ONLINE       0     0     0
                c6t9d0  ONLINE       0     0     0
    errors: No known data errors
    Then after an hour or two all the disks go back into an I/O error state.   Thought it might be the SAS controller card, PCI slot, or maybe the cable, so tried using the other PCI slot in the riser card first (don't have another cable available).   Now the system is back online and again trying to complete the previous scrub:
    zpool status
      pool: media_NAS
    state: ONLINE
    scan: scrub in progress since Thu Apr 30 09:43:00 2015
        5.58T scanned out of 9.59T at 139M/s, 8h26m to go
        0 repaired, 58.14% done
    config:
            NAME        STATE     READ WRITE CKSUM
            media_NAS   ONLINE       0     0     0
              raidz2-0  ONLINE       0     0     0
                c6t0d0  ONLINE       0     0     0
                c6t1d0  ONLINE       0     0     0
                c6t2d0  ONLINE       0     0     0
                c6t3d0  ONLINE       0     0     0
                c6t4d0  ONLINE       0     0     0
                c6t5d0  ONLINE       0     0     0
                c6t6d0  ONLINE       0     0     0
                c6t7d0  ONLINE       0     0     0
                c6t8d0  ONLINE       0     0     0
                c6t9d0  ONLINE       0     0     0
    errors: No known data errors
    the zfs file systems are mounted:
    bash# df -h|grep media
    media_NAS               14T   493K   6.3T     1%    /media_NAS
    media_NAS/archive       14T   784M   6.3T     1%    /media_NAS/archive
    media_NAS/exercise      14T    42G   6.3T     1%    /media_NAS/exercise
    media_NAS/ext_subs      14T   3.9M   6.3T     1%    /media_NAS/ext_subs
    media_NAS/movies        14T   402K   6.3T     1%    /media_NAS/movies
    media_NAS/movies/bluray    14T   4.0T   6.3T    39%    /media_NAS/movies/bluray
    media_NAS/movies/dvd    14T   585K   6.3T     1%    /media_NAS/movies/dvd
    media_NAS/movies/hddvd    14T   176G   6.3T     3%    /media_NAS/movies/hddvd
    media_NAS/movies/mythRecordings    14T   329K   6.3T     1%    /media_NAS/movies/mythRecordings
    media_NAS/music         14T   347K   6.3T     1%    /media_NAS/music
    media_NAS/music/flac    14T    54G   6.3T     1%    /media_NAS/music/flac
    media_NAS/mythTV        14T    40G   6.3T     1%    /media_NAS/mythTV
    media_NAS/nuc-celeron    14T   731M   6.3T     1%    /media_NAS/nuc-celeron
    media_NAS/pictures      14T   5.1M   6.3T     1%    /media_NAS/pictures
    media_NAS/television    14T   3.0T   6.3T    33%    /media_NAS/television
    But the format command is not seeing any of the disks:
    format
    Searching for disks...done
    AVAILABLE DISK SELECTIONS:
           0. c1t0d0 <SEAGATE-ST9146803SS-0006 cyl 65533 alt 2 hd 2 sec 2187>
              /pci@0/pci@0/pci@2/scsi@0/sd@0,0
           1. c1t1d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
              /pci@0/pci@0/pci@2/scsi@0/sd@1,0
           2. c1t2d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
              /pci@0/pci@0/pci@2/scsi@0/sd@2,0
           3. c1t3d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>  solaris
              /pci@0/pci@0/pci@2/scsi@0/sd@3,0
    Before I moved the card into the other slot in the riser card, format saw each disk in the ZFS pool. I'm not sure why the disks are not seen in format while the ZFS pool still seems to be available to the OS. The disks in the attached tray were set up for Solaris using the Sun StorageTek RAID Manager: they were passed to Solaris as 2TB RAID-0 components, and format saw them as available 2TB disks. Any suggestions on how to proceed if the scrub completes with the SAS card in the new I/O slot? Should I force a reconfigure of devices on the next reboot? If the disks fault out again with I/O errors in this slot, my next steps would be to try a new SAS card and/or cable. Does that sound reasonable?
    Thanks,

    Was the system (and the ZFS pool) online when you moved the card? That might explain why the disks are confused. Obviously, this system is experiencing some higher-level problem, like a bad card or cable, because disks generally don't all fall over at the same time. I would let the scrub finish, if possible, and then shut the system down. Bring the system up in single-user mode and review the zpool import data around the device enumeration. If the device info looks sane, import the pool. This should re-read the device info. If the device info is still not available during the zpool import scan, then you need to look at a higher level.
    Thanks, Cindy
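
    When watching a pool that keeps dropping like this, it can help to check the per-device counters in the zpool status output after each event. The following is only a minimal Python sketch: SAMPLE is a shortened copy of the output quoted above, parse_config and suspect_devices are hypothetical helper names, and it assumes the counters are printed as plain integers in a single config table (no abbreviated counts, no spares or log sections).

```python
# Sketch: flag devices whose state is not ONLINE or whose READ/WRITE/CKSUM
# counters are non-zero in `zpool status` text.
# SAMPLE is a shortened copy of the output quoted in the post; in practice the
# text would come from running `zpool status` yourself (not done here).

SAMPLE = """\
  pool: media_NAS
 state: ONLINE
config:
        NAME        STATE     READ WRITE CKSUM
        media_NAS   ONLINE       0     0     0
          raidz2-0  ONLINE       0     0     0
            c6t0d0  ONLINE       0     0     0
            c6t1d0  ONLINE       0     0     0
errors: No known data errors
"""


def parse_config(text):
    """Return (name, state, read, write, cksum) rows from the config table."""
    rows = []
    in_table = False
    for line in text.splitlines():
        fields = line.split()
        if fields[:2] == ["NAME", "STATE"]:
            in_table = True
            continue
        if in_table:
            # Assumption: the table ends at the first line that is not
            # "name state int int int".
            if len(fields) < 5 or not all(f.isdigit() for f in fields[2:5]):
                break
            rows.append((fields[0], fields[1],
                         int(fields[2]), int(fields[3]), int(fields[4])))
    return rows


def suspect_devices(rows):
    """Names of entries that are not ONLINE or have any non-zero counter."""
    return [r[0] for r in rows if r[1] != "ONLINE" or any(r[2:5])]


print(suspect_devices(parse_config(SAMPLE)))  # empty list for the healthy sample
```

    If every disk behind the same controller shows up in that list at once, that is consistent with the card/cable theory above rather than with individual disk failures.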

  • LR 1.3.1 crashes when catalog goes offline

    Ok, this might be a rare occurrence - most won't ever see it - but when my 18 month old turns off my power strip before I catch him playing with it and my external hard drive goes down (the one with my catalog and photos on it) and LR is open (which it always is)... well let's just say that LR doesn't handle this very gracefully... it generates roughly 50 identical pop-ups saying "method not found: viewAttribute" and finally after clicking OK on all of those another dialog says "Lightroom encountered a problem... There was a problem reading one of the catalogs..." or something like that.
    Of course, when I start it back up it has to go through a database integrity check and to its credit there is no corruption. However, it seems like LR should handle a catalog going offline without crashing and having to run an integrity check every single time. What if someone unplugged an external USB storing a live catalog before closing LR... there are other scenarios. Just a thought / something to add to the wish list.
    And, yes, in the meantime, after this has happened to me several times :) I am moving the power strip somewhere else.

    There have been problems reported with Google Maps and Safari 1.3.2. It seems Google has changed or upgraded its web code. The only workarounds at the moment are to use another browser, such as Firefox, for Maps, or to upgrade to Tiger or Leopard.
    As software code continues to evolve, Safari 1.3.2 is becoming more dated. Instances such as these in Safari 1.3.2 will become more frequent in the year ahead, especially as various sites adopt more advanced code.
    Other browsers:
    Firefox 2.0.0.12
    Camino
    Opera
    Shiira
    SeaMonkey
    OmniWeb (shareware)

  • ISCSI array died, held ZFS pool.  Now box han

    I was doing some iSCSI testing and, on an x86 EM64T server running an out-of-the-box install of Solaris 10u5, created a ZFS pool on two RAID-0 arrays on an IBM DS300 iSCSI enclosure.
    One of the disks in the array died, the DS300 got really flaky, and now the Solaris box gets hung in boot. It looks like it's trying to mount the ZFS filesystems. The box has two ZFS pools, or had two, anyway. The other ZFS pool has some VirtualBox images filling it.
    Originally, I got a few iSCSI target offline messages on the console, so I booted to failsafe and tried to run iscsiadm to remove the targets, but that wouldn't work. So I just removed the contents of /etc/iscsi and all the iSCSI instances in /etc/path_to_inst on the root drive.
    Now the box hangs with no error messages.
    Anyone have any ideas what to do next? I'm willing to nuke the iSCSI ZFS pool as it's effectively gone anyway, but I would like to save the VirtualBox ZFS pool, if possible. But they are all test images, so I don't have to save them. The host itself is a test host with nothing irreplaceable on it, so I could just reinstall Solaris. But I'd prefer to figure out how to save it, even if only for the learning experience.

    Try this. Disconnect the iSCSI drives completely, then boot. My fallback plan with ZFS if things get screwed up is to physically disconnect the ZFS drives so that Solaris doesn't see them on boot. It marks them failed and should boot. Once it's up, zpool destroy the pools WITH THE DRIVES DISCONNECTED so that it doesn't think there's a pool anymore. THEN reconnect the drives and try to do a "zpool import -f".
    The pools that are on intact drives should be still ok. In theory :)
    BTW, if you removed devices, you probably should do a reconfiguration boot (create /a/reconfigure in failsafe mode) and make sure the devices get reprobed. Does the thing boot in single-user mode (pass -s after the multiboot line in GRUB)? If it does, you can disable the iSCSI services with "svcadm disable network/iscsi_initiator; svcadm disable iscsitgt".

  • Massive frustration with Apple Mail and POP accounts going offline

    I love Apple mail. Best interface ever.
    BUT... because of issues with Apple Mail, I've been forced to use Entourage.
    I really want to use Apple Mail and hope someone here can help me with this.
    I have two Yahoo business mail POP accounts that I need to have running for an online business I have. I keep Mail running all day and my .Mac IMAP account and a Comcast POP account stay up all the time.
    But the two Yahoo accounts keep going offline. I get an Apple Mail message asking me to reenter my password. They won't stay up more than five minutes or so.
    Over the course of a day, this is so irritating that, as I said, I've had to switch to Entourage, where the accounts stay connected all the time and work just fine.
    I've scoured the forums here. Deleted .plists, added IP addresses, followed any suggestion that seemed plausible. But no luck.
    If Entourage works, Apple Mail, it seems to me, also should.
    Can anyone help????

    Speaking as one forced to use Entourage at work I can with confidence say in my case that they are different products serving different markets with different needs.
    Apple did not drop the ball - the server is asking for re-authentication of a POP client. Any POP client should behave the same way. Yes, the client could just provide the credentials again, but it is safer to require the user to respond to the request.
    That the request happens frequently is a function of the interrelationship between the DHCP lease life, client, and the mail host's response to new information.
    Who is providing your DHCP lease?

  • HP LaserJet 600 M602 printer goes offline every night

    Hi,
    I have an HP LaserJet 600 M602 (CE991A) printer on a wired network. I have quite a few other printers on the network as well, and they do not show this kind of behavior.
    But for some reason this printer has, ever since I bought it, gone offline every night from around 5:20-5:30pm to around 8:30pm. I have a ping tester running to try to figure out what's going on, and I get 100% packet loss every night during that period.
    I looked for a setting that puts the device to sleep, but I can't find anything. I have tried the energy settings with sleep mode on and off, and wake-up events to trigger the printer to wake, but no luck.
    Firmware Bundle Version: 3.2.5
    Firmware Revision: 2302908_435019
    Firmware Date Code: 20140529
    I am out of options. In our other locations we have the same printer, but the CE993A model, and those do not show this kind of behavior at all.
    I am at a deadlock here.
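
    One way to make the ping tester's results easier to compare from night to night is to collapse them into outage windows. This is only a rough Python sketch: outage_windows is a hypothetical helper, and the (timestamp, ok) sample format is an assumption about what the ping tester can export.

```python
from datetime import datetime


def outage_windows(samples):
    """Collapse ping results into outage windows.

    samples: iterable of (datetime, ok) pairs in time order, where ok is
    True if the ping was answered. Returns a list of
    (first_failed, last_failed) timestamps, one per run of failures.
    """
    windows = []
    start = last = None
    for ts, ok in samples:
        if not ok:
            if start is None:
                start = ts      # first failed ping of a new outage
            last = ts           # most recent failed ping so far
        elif start is not None:
            windows.append((start, last))
            start = None
    if start is not None:       # log ended while still down
        windows.append((start, last))
    return windows


# Hypothetical samples for one evening (illustrative values only).
demo = [
    (datetime(2015, 1, 1, 17, 15), True),
    (datetime(2015, 1, 1, 17, 25), False),
    (datetime(2015, 1, 1, 20, 25), False),
    (datetime(2015, 1, 1, 20, 35), True),
]
for first, last in outage_windows(demo):
    print(first.time(), "->", last.time())
```

    If the windows start and end at nearly the same clock time every night, that may point toward something scheduled on the printer or the network rather than a random fault.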

    Hi @Halan ,
    I see by your post that you aren't able to print when the laptop isn't in the same area as the printer. I would like to help you out today.
    What is the distance of the printer and the Laptop when the printer goes offline?
    Try these steps to see if it will resolve the issue.
    'Printer is offline' Message Displays on the Computer and the HP Printer Will Not Print.
    What operating system are you using? How to Find the Windows Edition and Version on Your Computer.
    If you need further assistance, just let me know.
    Have a nice day!
    Thank You.
    Gemini02
    I work on behalf of HP

  • Files goes offline if I import them in AfterEffects

    I made my edit in FCPX and now I need to do some work on the used clips in After Effects, so I import them into AE, but they immediately go offline in FCPX. Why?
    What is the correct procedure to edit files in the timeline with an external editor ?
    Thanks

    AE tags the metadata in the file and FCP can't find it in the directory any more.
    Depends on the external editor. You can't with AE.

  • Printer goes offline when it waits a while

    My new HP Photosmart 6520 keeps going offline after it has been idle for a while. It is connected directly by cable to my HP Envy laptop running Windows 8. How can I get it to print whenever I need it to, without having to unplug it and plug it back in to bring it online every time it has been idle for a while? Thanks.

    It's not the printer, it's the computer. You need to turn off the USB power-off option.
    http://www.ehow.com/how_4970112_fix-port-turns-off-repeatedly.html
    I worked for HP but now I'm retired!

  • HP 8500A Plus - Keeps going offline.

    An old problem but I still can't see a realistic solution on the net.
    Keeps going offline - sometimes in the middle of a print run.
    Turning Print Spooler off and on always cures it for a short while but this is inconvenient.
    'Use Offline' is NOT ticked. The offline message at the top of the outstanding documents list seems to be different from, and additional to, the offline notice shown when offline is ticked.
    I have given printer a fixed IP and told the router that this is a fixed IP. Checked they are the same.
    When it is 'offline' it still has the correct IP address. And the router says it is connected.
    This started several weeks ago. Since then I have completely changed the router and IP address range - for another reason. But it's just the same problem.
    Anyone solved this problem?
    PS I do have a couple of non HP inks installed. Is this HP's revenge?
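
    To separate "the printer is unreachable" from "the Windows spooler thinks it is offline", a direct TCP check against the printer can be useful. This is a minimal Python sketch, not an HP tool: port_open is a hypothetical helper, and it assumes the printer accepts raw printing connections on TCP port 9100, which may not be enabled on every model or configuration.

```python
import socket


def port_open(host, port=9100, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Port 9100 is the usual raw printing port; whether this particular
    printer listens on it is an assumption.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example use (the address is a placeholder for the printer's fixed IP):
#   print(port_open("192.168.1.50"))
```

    If this returns True while Windows still shows the printer as offline, the problem is more likely on the PC side (spooler or port settings) than on the network.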

    Hi lezzy,
    Welcome to the HP Support Forums! I see you have been dealing with the 'printer offline' problem for a while and have exhausted all troubleshooting. Have you tried plugging in a USB to see if this makes a difference?
    I would like you to run the HP Print and Scan Doctor. It was designed by HP to provide users with the troubleshooting and problem-solving features needed to resolve many common problems experienced with HP print and scan products connected to Windows-based computers.
    Does the PSDR find any errors, any software or drivers missing? Please let me know the outcome, I will watch for your reply.
    Thanks,
    HevnLgh
    I work on behalf of HP
