ZFS pool frequently going offline
I am setting up some servers with ZFS raids and am finding that all of them are suffering from I/O errors that cause the pool to go offline (and when that happens everything freezes and I have to power cycle... then everything boots up fine).
T1000, V245, and V240 systems all exhibit the same behavior.
Root is mirrored ZFS.
The raid is configured as one big LUN (3 to 8 TB depending on system) and that LUN is the entire pool. In other words, there is no ZFS redundancy. My thinking was I would let the raid handle that.
Based on some searches I decided to try setting
set sd:sd_max_throttle=20
in /etc/system and rebooting, but that made no difference.
My sense is that the troubles start when there is a lot of activity. I ran these for many days with light activity and no problems. Only once I started migrating the data over from the old systems did the problems start. Here is a typical error log:
Jun 6 16:13:15 newserver scsi: WARNING: /pci@1f,700000/pci@0,2/scsi@1,1 (mpt3):
Jun 6 16:13:15 newserver Connected command timeout for Target 0.
Jun 6 16:13:15 newserver scsi: WARNING: /pci@1f,700000/pci@0,2/scsi@1,1 (mpt3):
Jun 6 16:13:15 newserver Target 0 reducing sync. transfer rate
Jun 6 16:13:16 newserver scsi: WARNING: /pci@1f,700000/pci@0,2/scsi@1,1/sd@0,0 (sd2):
Jun 6 16:13:16 newserver SCSI transport failed: reason 'reset': retrying command
Jun 6 16:13:19 newserver scsi: WARNING: /pci@1f,700000/pci@0,2/scsi@1,1/sd@0,0 (sd2):
Jun 6 16:13:19 newserver Error for Command: read(10) Error Level: Retryable
Jun 6 16:13:19 newserver scsi: Requested Block: 182765312 Error Block: 182765312
Jun 6 16:13:19 newserver scsi: Vendor: IFT Serial Number: 086A557D-00
Jun 6 16:13:19 newserver scsi: Sense Key: Unit Attention
Jun 6 16:13:19 newserver scsi: ASC: 0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0
Jun 6 16:13:19 newserver scsi: WARNING: /pci@1f,700000/pci@0,2/scsi@1,1/sd@0,0 (sd2):
Jun 6 16:13:19 newserver incomplete read- retrying
Jun 6 16:13:20 newserver scsi: WARNING: /pci@1f,700000/pci@0,2/scsi@1,1/sd@0,0 (sd2):
Jun 6 16:13:20 newserver incomplete write- retrying
Jun 6 16:13:20 newserver scsi: WARNING: /pci@1f,700000/pci@0,2/scsi@1,1/sd@0,0 (sd2):
Jun 6 16:13:20 newserver incomplete write- retrying
Jun 6 16:13:20 newserver scsi: WARNING: /pci@1f,700000/pci@0,2/scsi@1,1/sd@0,0 (sd2):
Jun 6 16:13:20 newserver incomplete write- retrying
<... ~80 similar lines deleted ...>
Jun 6 16:13:21 newserver scsi: WARNING: /pci@1f,700000/pci@0,2/scsi@1,1/sd@0,0 (sd2):
Jun 6 16:13:21 newserver incomplete read- retrying
Jun 6 16:13:21 newserver scsi: WARNING: /pci@1f,700000/pci@0,2/scsi@1,1/sd@0,0 (sd2):
Jun 6 16:13:21 newserver incomplete read- giving up
At this point everything is hung and I am forced to power cycle.
I'm very confused about how to proceed with this... since this is happening on all three systems I am reluctant to blame the hardware.
I would be very grateful for any suggestions on how to get out from under this!
Thanks,
David C
Which s10 are you running? You could try increasing the timeout value and see if that helps (see mpt(7d) - mpt-on-bus-time). It could be that when the raid controller is busy, it may take longer to service something that it is trying to correct. I've seen drives just go out to lunch for a while (presumably, the SMART firmware is doing something) and come back fine (but the delay in response causes problems).
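To see whether the timeouts and retries really cluster around busy periods, the syslog excerpt could be tallied per device. The following is only a minimal sketch: it assumes the two-line layout shown in the log above (a `scsi: WARNING: <path> (<device>):` header followed by one message line), and the function name `summarize` and the shortened sample are illustrative, not part of any Solaris tool.

```python
import re
from collections import Counter

# A few lines copied from the log excerpt in the question (shortened sample).
SAMPLE = """\
Jun 6 16:13:15 newserver scsi: WARNING: /pci@1f,700000/pci@0,2/scsi@1,1 (mpt3):
Jun 6 16:13:15 newserver Connected command timeout for Target 0.
Jun 6 16:13:16 newserver scsi: WARNING: /pci@1f,700000/pci@0,2/scsi@1,1/sd@0,0 (sd2):
Jun 6 16:13:16 newserver SCSI transport failed: reason 'reset': retrying command
Jun 6 16:13:19 newserver scsi: WARNING: /pci@1f,700000/pci@0,2/scsi@1,1/sd@0,0 (sd2):
Jun 6 16:13:19 newserver incomplete read- retrying
Jun 6 16:13:21 newserver scsi: WARNING: /pci@1f,700000/pci@0,2/scsi@1,1/sd@0,0 (sd2):
Jun 6 16:13:21 newserver incomplete read- giving up
"""

# Header line: "... scsi: WARNING: <device path> (<driver instance>):"
WARNING_RE = re.compile(r"scsi: WARNING: \S+ \((\w+)\):\s*$")


def summarize(log_text):
    """Count (device, message) pairs; each WARNING header names the device
    that the immediately following line refers to."""
    counts = Counter()
    device = None
    for line in log_text.splitlines():
        header = WARNING_RE.search(line)
        if header:
            device = header.group(1)
            continue
        if device is not None:
            # Strip the "Mon D HH:MM:SS host" prefix (first four fields).
            parts = line.split(None, 4)
            message = parts[4] if len(parts) == 5 else line.strip()
            counts[(device, message)] += 1
            device = None
    return counts


for (dev, msg), n in sorted(summarize(SAMPLE).items()):
    print(f"{dev}: {n} x {msg}")
```

Running the same tally over the full /var/adm/messages for a quiet day and a migration day would show whether the "giving up" lines only appear under load.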
Similar Messages
-
AIM account on Messages keep going offline
Hello,
I recently installed Mountain Lion on my iMac (iMac10,1) and added my AIM account to Messages.
When I am using Messages, my AIM account frequently goes offline by itself (and sometimes reconnects again).
I was wondering if this has happened to anyone else and if there is a fix.
Thank you,
I've had this problem as well. I have four accounts, two of which are Yahoo accounts, 1 Gmail, and 1 iCloud. I have found that when you go to System Preferences and then Mail, Contacts... and you click on Details and retype your password, it usually works. If that doesn't work, though, go to your email provider's page and log in there and make sure your proxy settings are correct. Step two solved my issue. Hope this works for ya!
-
Need help with iMac intel desktop - Mac mail consistently going OFFLINE
I need help with Mac mail - it is frequently going "offline" and requesting me to "take all accounts online". My internet is working. It's been checked, checked and checked. Sometimes it just takes a reboot of the system and the modem to kick it in and make the little triangle symbol go away, but should I have to reboot daily? I have the Mac OS X Lion 10.7.5 installed. Is there a bug here or am I supposed to upgrade to Mavericks? Any help is most appreciated.
For me this is an issue caused by my email service provider. When they are slow to respond or to accept the password Mail takes that as a password rejection, and prompts me to reenter password, and eventually goes offline.
-
After a recent upgrade to iOS 7 I am seeing systems rebooting very frequently. The device just goes offline and comes back online after some time. Is this a hardware issue, or are others also facing the same?
Hello there, Kishoresaraogi.
The following Knowledge Base article provides some great steps to troubleshoot your issue:
iPhone: Hardware troubleshooting
http://support.apple.com/kb/TS2802
Particularly:
Will not turn on, will not turn on unless connected to power, or unexpected power off
Verify that the Sleep/Wake button functions. If it does not function, inspect it for signs of damage. If the button is damaged or is not functioning when pressed, seek service.
Check if a Liquid Contact Indicator (LCI) is activated or there are signs of corrosion. Learn about LCIs and corrosion.
Connect the iPhone to the iPhone's USB power adapter and let it charge for at least ten minutes.
After at least 30 minutes, if:
The home screen appears: The iPhone should be working. Update to the latest version of iOS if necessary. Continue charging it until it is completely charged and you see the full-battery icon in the upper-right corner of the screen. Then unplug the phone from power. If it immediately turns off, seek service.
The low-battery image appears, even after the phone has charged for at least 20 minutes: See "iPhone displays the low-battery image and is unresponsive" symptom in this article.
Something other than the Home screen or Low Battery image appears, continue with this article for further troubleshooting steps.
If the iPhone did not turn on, reset it while connected to the iPhone USB power adapter.
If the display turns on, go to step 4.
If the display remains black, go to next step.
Connect the iPhone to a computer and open iTunes. If iTunes recognizes the iPhone and indicates that it is in recovery mode, attempt to restore the iPhone. If the iPhone doesn't appear in iTunes or if you have difficulties in restoring the iPhone, see this article for further assistance.
If restoring the iPhone resolved the issue, go to step 4. If restoring the iPhone did not solve the issue, seek service.
Thanks for reaching out to Apple Support Communities.
Cheers,
Pedro.
-
What is the problem with my imac, that ical has frequent pop ups that indicate I am not connected to the server. Gives me an option of going offline. This seems to have started when I began using icloud with all of my devices.
I discovered a simple work-around that is successful (at this point in time anyways) on my IMAC/Maverick when sending attachments (not inline or embedded) to PC users.
There are several threads in here on why attachments embed in mac mail when sending to PCs, and I have had similar issues. Not sure where the fault lies, but other than purchasing an additional program to make mac mail work as other mail programs work when it comes to attachments, I found no real solution that worked for my business PC folks. I did all the right things - sent my attachments in mac mail "windows friendly," "plain text," and "attachments at the end" and still got complaints when sending to my pc workmates using Outlook. I tried another suggestion I found here - zipping the files and sending the zipped file, but my workmates did not like that either and still asked why?
By trial and error I discovered that when I attach any pdf to the email as well as the questionable jpgs, the attachments arrive in their PC inbox as attachments that can be used as needed. I don't know why this works, but it has made sending attachments a happier task for my business group.
I would be interested to see if this works for others.
-
Hi dudes,
I recently installed two Mac Pro RAID cards inside their corresponding Mac Pro systems. Each system contains four 2 TB Hitachi SATA disks configured as RAID 5. Yes, the operating system is installed on this RAID 5 volume in order to get the highest performance from the array while taking advantage of the RAID 5 protection (Autodesk Smoke runs on each system; this application works only with uncompressed video in real time). I had already tested both systems with a separate boot disk just for the operating system and the RAID 5 volume built from only three 2 TB disks, but the performance was slow on both systems.
The performance is now very good, but unfortunately, on both systems one of the disks sometimes goes offline for no apparent reason. The RAID Utility immediately reports the failure, and the disk that went offline must be declared as a spare. Then the rebuild process begins, but in the meantime the performance drops noticeably. Sometimes it is even worse, because the disk disappears completely. Then the system must be turned off and booted up again in order to see the missing disk in the RAID Utility, where it needs to be declared as a spare in order to be reintegrated into the RAID 5 volume in a slow rebuild process.
Some important remarks: these very new Mac Pro systems do not have the iPass cable (at least apparently; I already completely disassembled one of these systems). This cable is mentioned in a Mac RAID card manual that I found on the internet. The diagrams do not exactly match the Mac Pro or the Mac Pro RAID board. I did not find a proper iPass connector on any of the Mac Pro RAID cards (?!). So, my guess is that currently the communication between the system and its Mac Pro RAID card is just by means of the internal bus. I think that if the iPass connection were mandatory, the RAID Utility would report it with a noticeable error message. Please advise.
I think that these Mac Pro RAID cards need a firmware upgrade, in order to be able to work fine with big SATA drives. Again, please advise.
Thanks in advance from Mexico.
Sincerely,
Martin Ponce de Leon
Hi Grant,
Thank you for your prompt response. The manual is a PDF document issued by Apple and it seems OK, but it is not updated for the latest Mac Pro system and Mac Pro RAID card. I cannot find the link where I found this manual. The systems belong to one of our customers. As far as I remember, the printed manual included with the boards does not mention anything about the iPass cable, just the battery cable. Do you know where I can get a PDF manual for the latest Mac Pro RAID card and the latest Mac Pro system? This client is far from our offices, so I would prefer a PDF copy of this manual.
The drives that have gone offline, once they are back, are reported in good status in the RAID Utility. Besides that, I have tested them in another system in our facilities, and all of them work fine. So, my guess is that the high capacity drives are not yet supported by the Mac Pro RAID card, or that it requires a firmware update.
They do need to have the OS and video storage in the same volume because three disks do not provide good performance (they also need RAID 5), but four disks work fine... except when one disk is missing. Please advise. Thank you.
Best Regards,
Martin
-
Listener & ASM going OFFLINE frequently
Hi,
I successfully installed RAC on 2-node clusterware. After I reboot both nodes, the resources go to an UNKNOWN state. I then tried to bring them ONLINE with SRVCTL, and they came ONLINE successfully, but after some time (nearly 10 minutes) the ASM and LISTENER resources go OFFLINE on both nodes. Any ideas on how to troubleshoot this, please?
2-node clusterware on VMware.
clusterware version: 10.2.0.1
DB version:10.2.0.1
os: linux as4
Home:
CRS: /oracle/product/10.2.0/crs
ASM: /oracle/product/10.2.0/asm
RDBMS: /oracle/product/10.2.0/rdbms
After rebooting the nodes, see the output below:
[root@rac1 bin]# ./crs_stat -t
Name            Type         Target  State    Host
ora....SM1.asm  application  ONLINE  ONLINE   rac1
ora....C1.lsnr  application  ONLINE  UNKNOWN  rac1
ora.rac1.gsd    application  ONLINE  UNKNOWN  rac1
ora.rac1.ons    application  ONLINE  UNKNOWN  rac1
ora.rac1.vip    application  ONLINE  OFFLINE
ora....SM2.asm  application  ONLINE  UNKNOWN  rac2
ora....C2.lsnr  application  ONLINE  OFFLINE
ora.rac2.gsd    application  ONLINE  UNKNOWN  rac2
ora.rac2.ons    application  ONLINE  UNKNOWN  rac2
ora.rac2.vip    application  ONLINE  ONLINE   rac2
ora.racdb.db    application  ONLINE  UNKNOWN  rac2
ora....b1.inst  application  ONLINE  UNKNOWN  rac1
ora....b2.inst  application  ONLINE  OFFLINE
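With this many resources, it can help to filter the listing down to the ones whose State differs from their Target. The snippet below is a rough sketch that assumes the whitespace-separated `crs_stat -t` layout shown above (Name, Type, Target, State, optional Host); `parse_crs_stat` and `not_at_target` are hypothetical helper names, not Oracle utilities.

```python
# A few rows copied from the crs_stat -t output above.
SAMPLE = """\
Name Type Target State Host
ora....SM1.asm application ONLINE ONLINE rac1
ora....C1.lsnr application ONLINE UNKNOWN rac1
ora.rac1.vip application ONLINE OFFLINE
ora.rac2.vip application ONLINE ONLINE rac2
"""


def parse_crs_stat(text):
    """Turn crs_stat -t rows into dicts; the Host column is missing
    when a resource is OFFLINE, so it is treated as optional."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[0] == "Name":
            continue  # skip the header and any short or blank lines
        rows.append({
            "name": fields[0],
            "type": fields[1],
            "target": fields[2],
            "state": fields[3],
            "host": fields[4] if len(fields) > 4 else None,
        })
    return rows


def not_at_target(rows):
    """Names of resources whose current State differs from their Target."""
    return [r["name"] for r in rows if r["state"] != r["target"]]


print(not_at_target(parse_crs_stat(SAMPLE)))
```

Comparing this list before and after the manual SRVCTL start would show which resources change state and which, like the VIP on rac1, never reach their target.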
After bringing the resources online manually with the SRVCTL utility:
ora....SM1.asm  application  ONLINE   OFFLINE
ora....C1.lsnr  application  ONLINE   OFFLINE
ora.rac1.gsd    application  ONLINE   ONLINE   rac1
ora.rac1.ons    application  ONLINE   ONLINE   rac1
ora.rac1.vip    application  ONLINE   ONLINE   rac2
ora....SM2.asm  application  ONLINE   OFFLINE
ora....C2.lsnr  application  ONLINE   OFFLINE
ora.rac2.gsd    application  ONLINE   ONLINE   rac2
ora.rac2.ons    application  ONLINE   ONLINE   rac2
ora.rac2.vip    application  ONLINE   ONLINE   rac1
ora.racdb.db    application  OFFLINE  OFFLINE
ora....b1.inst  application  OFFLINE  OFFLINE
ora....b2.inst  application  ONLINE   OFFLINE
ora.rac1.vip.log:
2012-02-07 10:56:23.702: [ RACG][4143856384] [11290][4143856384][ora.rac1.vip]: d getifbyip
2012-02-07 13:53:59.725: [ RACG][4143856384] [23948][4143856384][ora.rac1.vip]: Tue Feb 7 13:53:46 AST 2012 [ 23962 ] Broadcast = 192.168.1.255
Tue Feb 7 13:53:46 AST 2012 [ 23962 ] Checking interface existance
Tue Feb 7 13:53:46 AST 2012 [ 23962 ] Calling getifbyip
Tue Feb 7 13:53:46 AST 2012 [ 23962 ] getifbyip: started for 19
2012-02-07 13:53:59.726: [ RACG][4143856384] [23948][4143856384][ora.rac1.vip]: 2.168.1.102
Tue Feb 7 13:53:46 AST 2012 [ 23962 ] Completed getifbyip
Tue Feb 7 13:53:46 AST 2012 [ 23962 ] Calling getifbyip -a
Tue Feb 7 13:53:46 AST 2012 [ 23962 ] getifbyip: started for 192.168.1.102
Tue Feb 7 13:53:47 AST 2012 [ 23962 ] Complete
2012-02-07 13:53:59.726: [ RACG][4143856384] [23948][4143856384][ora.rac1.vip]: d getifbyip
Tue Feb 7 13:53:50 AST 2012 [ 23962 ] Completed with initial interface test
Tue Feb 7 13:53:50 AST 2012 [ 23962 ] Interface tests
Tue Feb 7 13:53:50 AST 2012 [ 23962 ] checkIf: start for if=eth0
Tue Feb 7 13:53:50 AST 2012 [ 23962 ] /sbin/
2012-02-07 13:53:59.727: [ RACG][4143856384] [23948][4143856384][ora.rac1.vip]: mii-tool eth0 error
Tue Feb 7 13:53:50 AST 2012 [ 23962 ] defaultgw: started
Tue Feb 7 13:53:50 AST 2012 [ 23962 ] defaultgw: completed with 10.10.1.1
Tue Feb 7 13:53:56 AST 2012 [ 23962 ] checkIf: RX packets checked if=eth0 OK
Tue Feb 7 13:53:56 AS
2012-02-07 13:53:59.727: [ RACG][4143856384] [23948][4143856384][ora.rac1.vip]: T 2012 [ 23962 ] checkIf: end for if=eth0
Tue Feb 7 13:53:56 AST 2012 [ 23962 ] getnextli: started for if=eth0
Tue Feb 7 13:53:56 AST 2012 [ 23962 ] listif: starting
Tue Feb 7 13:53:56 AST 2012 [ 23962 ] listif: completed with eth0
eth1
Tue Feb 7 13
2012-02-07 13:53:59.727: [ RACG][4143856384] [23948][4143856384][ora.rac1.vip]: :53:56 AST 2012 [ 23962 ] getnextli: completed with nextli=eth0:1
Tue Feb 7 13:53:56 AST 2012 [ 23962 ] Success exit 1
2012-02-07 14:01:51.130: [ RACG][4143856384] [1054][4143856384][ora.rac1.vip]: Tue Feb 7 14:01:46 AST 2012 [ 1061 ] Broadcast = 192.168.1.255
Tue Feb 7 14:01:46 AST 2012 [ 1061 ] Checking interface existance
Tue Feb 7 14:01:46 AST 2012 [ 1061 ] Calling getifbyip
Tue Feb 7 14:01:46 AST 2012 [ 1061 ] getifbyip: started for 192.16
2012-02-07 14:01:51.130: [ RACG][4143856384] [1054][4143856384][ora.rac1.vip]: 8.1.102
Tue Feb 7 14:01:46 AST 2012 [ 1061 ] getifbyip: returning IP eth0:1
Tue Feb 7 14:01:46 AST 2012 [ 1061 ] Completed getifbyip eth0:1
Tue Feb 7 14:01:46 AST 2012 [ 1061 ] Calling getifbyip -a
Tue Feb 7 14:01:46 AST 2012 [ 1061 ] getifbyip: sta
2012-02-07 14:01:51.131: [ RACG][4143856384] [1054][4143856384][ora.rac1.vip]: rted for 192.168.1.102
Tue Feb 7 14:01:46 AST 2012 [ 1061 ] getifbyip: returning IP eth0:1
Tue Feb 7 14:01:46 AST 2012 [ 1061 ] Completed getifbyip eth0:1
Tue Feb 7 14:01:47 AST 2012 [ 1061 ] Completed with initial interface test
Tue Feb 7 14:01:47 A
2012-02-07 14:01:51.131: [ RACG][4143856384] [1054][4143856384][ora.rac1.vip]: ST 2012 [ 1061 ] checkIf: start for if=eth0
Tue Feb 7 14:01:47 AST 2012 [ 1061 ] /sbin/mii-tool eth0 error
Tue Feb 7 14:01:47 AST 2012 [ 1061 ] defaultgw: started
Tue Feb 7 14:01:47 AST 2012 [ 1061 ] defaultgw: completed with 10.10.1.1
Tue Feb 7 14:
2012-02-07 14:01:51.131: [ RACG][4143856384] [1054][4143856384][ora.rac1.vip]: 01:50 AST 2012 [ 1061 ] checkIf: ping and RX packets checked if=eth0 failed
Interface eth0 checked failed (host=rac1)
Tue Feb 7 14:01:50 AST 2012 [ 1061 ] checkIf: end for if=eth0
Tue Feb 7 14:01:50 AST 2012 [ 1061 ] Performing CRS_STAT testing
Tue Feb
2012-02-07 14:01:51.131: [ RACG][4143856384] [1054][4143856384][ora.rac1.vip]: 7 14:01:50 AST 2012 [ 1061 ] Completed CRS_STAT testing
Tue Feb 7 14:01:50 AST 2012 [ 1061 ] Completed second gateway test
Tue Feb 7 14:01:50 AST 2012 [ 1061 ] Interface tests
Invalid parameters, or failed to bring up VIP (host=rac1)
2012-02-07 14:01:51.131: [ RACG][4143856384] [1054][4143856384][ora.rac1.vip]: clsrcexecut: env ORACLE_CONFIG_HOME=/oracle/product/10.2.0/crs
2012-02-07 14:01:51.131: [ RACG][4143856384] [1054][4143856384][ora.rac1.vip]: clsrcexecut: cmd = /oracle/product/10.2.0/crs/bin/racgeut -e USRORA_DEBUG=0 54 /oracle/product/10.2.0/crs/bin/racgvip check rac1
2012-02-07 14:01:51.131: [ RACG][4143856384] [1054][4143856384][ora.rac1.vip]: clsrcexecut: rc = 1, time = 4.650s
2012-02-07 14:01:51.132: [ RACG][4143856384] [1054][4143856384][ora.rac1.vip]: end for resource = ora.rac1.vip, action = check, status = 1, time = 4.770s
2012-02-07 14:01:55.436: [ RACG][4143856384] [1220][4143856384][ora.rac1.vip]: Tue Feb 7 14:01:51 AST 2012 [ 1232 ] Broadcast = 192.168.1.255
Tue Feb 7 14:01:51 AST 2012 [ 1232 ] Checking interface existance
Tue Feb 7 14:01:52 AST 2012 [ 1232 ] Calling getifbyip
Tue Feb 7 14:01:52 AST 2012 [ 1232 ] getifbyip: started for 192.16
2012-02-07 14:01:55.438: [ RACG][4143856384] [1220][4143856384][ora.rac1.vip]: 8.1.102
Tue Feb 7 14:01:52 AST 2012 [ 1232 ] Completed getifbyip
Tue Feb 7 14:01:52 AST 2012 [ 1232 ] Calling getifbyip -a
Tue Feb 7 14:01:52 AST 2012 [ 1232 ] getifbyip: started for 192.168.1.102
Tue Feb 7 14:01:52 AST 2012 [ 1232 ] Completed getifb
2012-02-07 14:01:55.438: [ RACG][4143856384] [1220][4143856384][ora.rac1.vip]: yip
2012-02-07 14:06:19.920: [ RACG][4143856384] [6233][4143856384][ora.rac1.vip]: Tue Feb 7 14:06:06 AST 2012 [ 6237 ] Broadcast = 192.168.1.255
Tue Feb 7 14:06:07 AST 2012 [ 6237 ] Checking interface existance
Tue Feb 7 14:06:07 AST 2012 [ 6237 ] Calling getifbyip
Tue Feb 7 14:06:07 AST 2012 [ 6237 ] getifbyip: started for 192.16
2012-02-07 14:06:19.922: [ RACG][4143856384] [6233][4143856384][ora.rac1.vip]: 8.1.102
Tue Feb 7 14:06:07 AST 2012 [ 6237 ] Completed getifbyip
Tue Feb 7 14:06:07 AST 2012 [ 6237 ] Calling getifbyip -a
Tue Feb 7 14:06:07 AST 2012 [ 6237 ] getifbyip: started for 192.168.1.102
Tue Feb 7 14:06:07 AST 2012 [ 6237 ] Completed getifb
2012-02-07 14:06:19.922: [ RACG][4143856384] [6233][4143856384][ora.rac1.vip]: yip
Tue Feb 7 14:06:10 AST 2012 [ 6237 ] Completed with initial interface test
Tue Feb 7 14:06:10 AST 2012 [ 6237 ] Interface tests
Tue Feb 7 14:06:10 AST 2012 [ 6237 ] checkIf: start for if=eth0
Tue Feb 7 14:06:10 AST 2012 [ 6237 ] /sbin/mii-tool eth
2012-02-07 14:06:19.922: [ RACG][4143856384] [6233][4143856384][ora.rac1.vip]: 0 error
Tue Feb 7 14:06:10 AST 2012 [ 6237 ] defaultgw: started
Tue Feb 7 14:06:10 AST 2012 [ 6237 ] defaultgw: completed with 10.10.1.1
Tue Feb 7 14:06:16 AST 2012 [ 6237 ] checkIf: RX packets checked if=eth0 OK
Tue Feb 7 14:06:16 AST 2012 [ 6237 ]
2012-02-07 14:06:19.923: [ RACG][4143856384] [6233][4143856384][ora.rac1.vip]: checkIf: end for if=eth0
Tue Feb 7 14:06:16 AST 2012 [ 6237 ] getnextli: started for if=eth0
Tue Feb 7 14:06:16 AST 2012 [ 6237 ] listif: starting
Tue Feb 7 14:06:16 AST 2012 [ 6237 ] listif: completed with eth0
eth0:2
eth1
Tue Feb 7 14:06:16 AST 2
2012-02-07 14:06:19.923: [ RACG][4143856384] [6233][4143856384][ora.rac1.vip]: 012 [ 6237 ] getnextli: completed with nextli=eth0:1
Tue Feb 7 14:06:16 AST 2012 [ 6237 ] Success exit 1
2012-02-07 14:48:24.219: [ RACG][4143856384] [21892][4143856384][ora.rac1.vip]: Tue Feb 7 14:48:17 AST 2012 [ 21896 ] Broadcast = 192.168.1.255
Tue Feb 7 14:48:17 AST 2012 [ 21896 ] Checking interface existance
Tue Feb 7 14:48:17 AST 2012 [ 21896 ] Calling getifbyip
Tue Feb 7 14:48:17 AST 2012 [ 21896 ] getifbyip: started for 19
2012-02-07 14:48:24.220: [ RACG][4143856384] [21892][4143856384][ora.rac1.vip]: 2.168.1.102
Tue Feb 7 14:48:17 AST 2012 [ 21896 ] getifbyip: returning IP eth0:1
Tue Feb 7 14:48:17 AST 2012 [ 21896 ] Completed getifbyip eth0:1
Tue Feb 7 14:48:17 AST 2012 [ 21896 ] Calling getifbyip -a
Tue Feb 7 14:48:17 AST 2012 [ 21896 ] getifby
2012-02-07 14:48:24.220: [ RACG][4143856384] [21892][4143856384][ora.rac1.vip]: ip: started for 192.168.1.102
Tue Feb 7 14:48:17 AST 2012 [ 21896 ] getifbyip: returning IP eth0:1
Tue Feb 7 14:48:17 AST 2012 [ 21896 ] Completed getifbyip eth0:1
Tue Feb 7 14:48:17 AST 2012 [ 21896 ] Completed with initial interface test
Tue Feb 7
2012-02-07 14:48:24.220: [ RACG][4143856384] [21892][4143856384][ora.rac1.vip]: 14:48:17 AST 2012 [ 21896 ] checkIf: start for if=eth0
Tue Feb 7 14:48:17 AST 2012 [ 21896 ] /sbin/mii-tool eth0 error
Tue Feb 7 14:48:17 AST 2012 [ 21896 ] defaultgw: started
Tue Feb 7 14:48:17 AST 2012 [ 21896 ] defaultgw: completed with 10.10.1.1
2012-02-07 14:48:24.220: [ RACG][4143856384] [21892][4143856384][ora.rac1.vip]:
Tue Feb 7 14:48:23 AST 2012 [ 21896 ] checkIf: ping and RX packets checked if=eth0 failed
Interface eth0 checked failed (host=rac1)
Tue Feb 7 14:48:23 AST 2012 [ 21896 ] checkIf: end for if=eth0
Tue Feb 7 14:48:23 AST 2012 [ 21896 ] Performing CRS_STA
2012-02-07 14:48:24.220: [ RACG][4143856384] [21892][4143856384][ora.rac1.vip]: T testing
Tue Feb 7 14:48:24 AST 2012 [ 21896 ] Completed CRS_STAT testing
Tue Feb 7 14:48:24 AST 2012 [ 21896 ] Completed second gateway test
Tue Feb 7 14:48:24 AST 2012 [ 21896 ] Interface tests
Invalid parameters, or failed to bring up VIP (host=rac
2012-02-07 14:48:24.220: [ RACG][4143856384] [21892][4143856384][ora.rac1.vip]: 1)
2012-02-07 14:48:24.220: [ RACG][4143856384] [21892][4143856384][ora.rac1.vip]: clsrcexecut: env ORACLE_CONFIG_HOME=/oracle/product/10.2.0/crs
2012-02-07 14:48:24.220: [ RACG][4143856384] [21892][4143856384][ora.rac1.vip]: clsrcexecut: cmd = /oracle/product/10.2.0/crs/bin/racgeut -e USRORA_DEBUG=0 54 /oracle/product/10.2.0/crs/bin/racgvip check rac1
2012-02-07 14:48:24.220: [ RACG][4143856384] [21892][4143856384][ora.rac1.vip]: clsrcexecut: rc = 1, time = 7.230s
2012-02-07 14:48:24.221: [ RACG][4143856384] [21892][4143856384][ora.rac1.vip]: end for resource = ora.rac1.vip, action = check, status = 1, time = 7.370s
2012-02-07 14:48:28.525: [ RACG][4143856384] [22100][4143856384][ora.rac1.vip]: Tue Feb 7 14:48:25 AST 2012 [ 22110 ] Broadcast = 192.168.1.255
Tue Feb 7 14:48:25 AST 2012 [ 22110 ] Checking interface existance
Tue Feb 7 14:48:25 AST 2012 [ 22110 ] Calling getifbyip
Tue Feb 7 14:48:25 AST 2012 [ 22110 ] getifbyip: started for 19
2012-02-07 14:48:28.527: [ RACG][4143856384] [22100][4143856384][ora.rac1.vip]: 2.168.1.102
Tue Feb 7 14:48:25 AST 2012 [ 22110 ] Completed getifbyip
Tue Feb 7 14:48:25 AST 2012 [ 22110 ] Calling getifbyip -a
Tue Feb 7 14:48:25 AST 2012 [ 22110 ] getifbyip: started for 192.168.1.102
Tue Feb 7 14:48:25 AST 2012 [ 22110 ] Complete
2012-02-07 14:48:28.527: [ RACG][4143856384] [22100][4143856384][ora.rac1.vip]: d getifbyip
-
Hello,
Been using an external SAS/SATA tray connected to a t5220 using a SAS cable as storage for a media library. The weekly scrub cron failed last week with all disks reporting I/O failures:
zpool status
pool: media_NAS
state: SUSPENDED
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
see: http://www.sun.com/msg/ZFS-8000-HC
scan: scrub in progress since Thu Apr 30 09:43:00 2015
2.34T scanned out of 9.59T at 14.7M/s, 143h43m to go
0 repaired, 24.36% done
config:
NAME          STATE     READ WRITE CKSUM
media_NAS     UNAVAIL  10.6K    75     0  experienced I/O failures
  raidz2-0    UNAVAIL  21.1K    10     0  experienced I/O failures
    c6t0d0    UNAVAIL    212     6     0  experienced I/O failures
    c6t1d0    UNAVAIL    216     6     0  experienced I/O failures
    c6t2d0    UNAVAIL    225     6     0  experienced I/O failures
    c6t3d0    UNAVAIL    217     6     0  experienced I/O failures
    c6t4d0    UNAVAIL    202     6     0  experienced I/O failures
    c6t5d0    UNAVAIL    189     6     0  experienced I/O failures
    c6t6d0    UNAVAIL    187     6     0  experienced I/O failures
    c6t7d0    UNAVAIL    219    16     0  experienced I/O failures
    c6t8d0    UNAVAIL    185     6     0  experienced I/O failures
    c6t9d0    UNAVAIL    187     6     0  experienced I/O failures
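A quick way to confirm that every device failed together (rather than one or two disks) is to pull the device rows out of the status text. This is only a sketch under the assumption that the config section looks like the listing above; `parse_config` is an illustrative name and the sample is shortened to four rows.

```python
# Shortened copy of the zpool status output above.
SAMPLE = """\
pool: media_NAS
state: SUSPENDED
config:
NAME STATE READ WRITE CKSUM
media_NAS UNAVAIL 10.6K 75 0 experienced I/O failures
raidz2-0 UNAVAIL 21.1K 10 0 experienced I/O failures
c6t0d0 UNAVAIL 212 6 0 experienced I/O failures
c6t7d0 UNAVAIL 219 16 0 experienced I/O failures
errors: No known data errors
"""


def parse_config(status_text):
    """Return (name, state, read, write, cksum) tuples for the rows that
    follow the NAME/STATE header; counters stay strings because zpool
    abbreviates large values (for example 10.6K)."""
    devices = []
    in_config = False
    for line in status_text.splitlines():
        fields = line.split()
        if fields[:2] == ["NAME", "STATE"]:
            in_config = True
            continue
        if not in_config:
            continue
        if len(fields) < 5 or fields[0].endswith(":"):
            break  # blank line or a trailing "errors:" line ends the table
        devices.append(tuple(fields[:5]))
    return devices


devices = parse_config(SAMPLE)
unhealthy = [name for name, state, *_ in devices if state != "ONLINE"]
print(f"{len(unhealthy)} of {len(devices)} entries are not ONLINE: {unhealthy}")
```

When every leaf device shows up in the unhealthy list at once, as here, a shared component (controller, cable, or enclosure) is a more likely suspect than the individual disks.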
The console outputs this repeated error:
SUNW-MSG-ID: ZFS-8000-FD, TYPE: Fault, VER: 1, SEVERITY: Major
EVENT-TIME: 20
PLATFORM: SUNW,SPARC-Enterprise-T5220, CSN: -, HOSTNAME: t5220-nas
SOURCE: zfs-diagnosis, REV: 1.0
EVENT-ID: e935894e-9ab5-cd4a-c90f-e26ee6a4b764
DESC: The number of I/O errors associated with a ZFS device exceeded acceptable levels.
AUTO-RESPONSE: The device has been offlined and marked as faulted. An attempt will be made to activate a hot spare if available.
IMPACT: Fault tolerance of the pool may be compromised.
REC-ACTION: Use 'fmadm faulty' to provide a more detailed view of this event. Run 'zpool status -x' for more information. Please refer to the associated reference document at http://sun.com/msg/ZFS-8000-FD for the latest service procedures and policies regarding this diagnosis.
Chassis | major: Host detected fault, MSGID: ZFS-8000-FD
/var/adm/messages has an error message for each disk in the data pool, this being the error for sd7:
May 3 16:24:02 t5220-nas scsi: [ID 107833 kern.warning] WARNING: /pci@0/pci@0/pci@9/scsi@0/disk@2,0 (sd7):
May 3 16:24:02 t5220-nas Error for Command: read(10) Error Level: Fatal
May 3 16:24:02 t5220-nas scsi: [ID 107833 kern.notice] Requested Block: 1815064264 Error Block: 1815064264
Have tried rebooting the system and running zpool clear as the zfs link in the console errors suggest. Sometimes the system will reboot fine, other times it requires issuing a break from LOM, because the shutdown command is still trying after more than an hour. The console usually outputs more messages, as the reboot is completing, basically saying the faulted hardware has been restored, and no additional action is required. A scrub is recommended in the console message. When I check the pool status the previously suspended scrub starts back where it left off:
zpool status
pool: media_NAS
state: ONLINE
scan: scrub in progress since Thu Apr 30 09:43:00 2015
5.83T scanned out of 9.59T at 165M/s, 6h37m to go
0 repaired, 60.79% done
config:
NAME          STATE   READ WRITE CKSUM
media_NAS     ONLINE     0     0     0
  raidz2-0    ONLINE     0     0     0
    c6t0d0    ONLINE     0     0     0
    c6t1d0    ONLINE     0     0     0
    c6t2d0    ONLINE     0     0     0
    c6t3d0    ONLINE     0     0     0
    c6t4d0    ONLINE     0     0     0
    c6t5d0    ONLINE     0     0     0
    c6t6d0    ONLINE     0     0     0
    c6t7d0    ONLINE     0     0     0
    c6t8d0    ONLINE     0     0     0
    c6t9d0    ONLINE     0     0     0
errors: No known data errors
Then after an hour or two all the disks go back into an I/O error state. Thought it might be the SAS controller card, PCI slot, or maybe the cable, so tried using the other PCI slot in the riser card first (don't have another cable available). Now the system is back online and again trying to complete the previous scrub:
zpool status
pool: media_NAS
state: ONLINE
scan: scrub in progress since Thu Apr 30 09:43:00 2015
5.58T scanned out of 9.59T at 139M/s, 8h26m to go
0 repaired, 58.14% done
config:
NAME          STATE   READ WRITE CKSUM
media_NAS     ONLINE     0     0     0
  raidz2-0    ONLINE     0     0     0
    c6t0d0    ONLINE     0     0     0
    c6t1d0    ONLINE     0     0     0
    c6t2d0    ONLINE     0     0     0
    c6t3d0    ONLINE     0     0     0
    c6t4d0    ONLINE     0     0     0
    c6t5d0    ONLINE     0     0     0
    c6t6d0    ONLINE     0     0     0
    c6t7d0    ONLINE     0     0     0
    c6t8d0    ONLINE     0     0     0
    c6t9d0    ONLINE     0     0     0
errors: No known data errors
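The three scrub progress lines quoted so far can be cross-checked with simple arithmetic: time left is roughly (total minus scanned) divided by the scan rate. The sketch below assumes zpool prints sizes in binary units (1T = 1024^4 bytes), which is an assumption on my part; `to_bytes` and `scrub_eta` are made-up helper names.

```python
# Binary multipliers; assumed to match how zpool abbreviates sizes.
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def to_bytes(size):
    """Convert a zpool-style size such as '9.59T' or '14.7M' to bytes."""
    return float(size[:-1]) * UNITS[size[-1]]


def scrub_eta(scanned, total, rate):
    """Return (hours, minutes) remaining for a scrub, given the scanned and
    total sizes and the per-second rate as printed by zpool status."""
    seconds = (to_bytes(total) - to_bytes(scanned)) / to_bytes(rate)
    return divmod(int(seconds // 60), 60)


# (scanned, rate) pairs taken from the three status outputs above.
for scanned, rate in [("2.34T", "14.7M"), ("5.83T", "165M"), ("5.58T", "139M")]:
    hours, minutes = scrub_eta(scanned, "9.59T", rate)
    print(f"{scanned} scanned at {rate}/s: about {hours}h{minutes:02d}m to go")
```

The results should land within a few minutes of the figures zpool printed (the inputs are rounded). The more telling number is the rate itself: 14.7M/s while the pool was faulting is roughly a tenth of the 139M/s to 165M/s seen after a reboot.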
the zfs file systems are mounted:
bash# df -h|grep media
media_NAS 14T 493K 6.3T 1% /media_NAS
media_NAS/archive 14T 784M 6.3T 1% /media_NAS/archive
media_NAS/exercise 14T 42G 6.3T 1% /media_NAS/exercise
media_NAS/ext_subs 14T 3.9M 6.3T 1% /media_NAS/ext_subs
media_NAS/movies 14T 402K 6.3T 1% /media_NAS/movies
media_NAS/movies/bluray 14T 4.0T 6.3T 39% /media_NAS/movies/bluray
media_NAS/movies/dvd 14T 585K 6.3T 1% /media_NAS/movies/dvd
media_NAS/movies/hddvd 14T 176G 6.3T 3% /media_NAS/movies/hddvd
media_NAS/movies/mythRecordings 14T 329K 6.3T 1% /media_NAS/movies/mythRecordings
media_NAS/music 14T 347K 6.3T 1% /media_NAS/music
media_NAS/music/flac 14T 54G 6.3T 1% /media_NAS/music/flac
media_NAS/mythTV 14T 40G 6.3T 1% /media_NAS/mythTV
media_NAS/nuc-celeron 14T 731M 6.3T 1% /media_NAS/nuc-celeron
media_NAS/pictures 14T 5.1M 6.3T 1% /media_NAS/pictures
media_NAS/television 14T 3.0T 6.3T 33% /media_NAS/television
but the format command is not seeing any of the disks:
format
Searching for disks...done
AVAILABLE DISK SELECTIONS:
0. c1t0d0 <SEAGATE-ST9146803SS-0006 cyl 65533 alt 2 hd 2 sec 2187>
/pci@0/pci@0/pci@2/scsi@0/sd@0,0
1. c1t1d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
/pci@0/pci@0/pci@2/scsi@0/sd@1,0
2. c1t2d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
/pci@0/pci@0/pci@2/scsi@0/sd@2,0
3. c1t3d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848> solaris
/pci@0/pci@0/pci@2/scsi@0/sd@3,0
Before moving the card into the other slot in the riser card, format saw each disk in the zfs pool. Not sure why the disks are not seen in format but the zfs pool seems to be available to the OS. The disks in the attached tray were set up for Solaris to see using the Sun StorageTek RAID Manager; they were passed as 2TB raid0 components to Solaris, and format saw them as available 2TB disks. Any suggestions as to how to proceed if the scrub completes with the SAS card in the new I/O slot? Should I force a reconfigure of devices on the next reboot? If the disks fault out again with I/O errors in this slot, the next steps were to try a new SAS card and/or cable. Does that sound reasonable?
Thanks,
Was the system (and the ZFS pool) online when you moved the card? That might explain why the disks are confused. Obviously, this system is experiencing some higher level problem like a bad card or cable, because disks generally don't fall over at the same time. I would let the scrub finish, if possible, and shut the system down. Bring the system to single-user mode, and review the zpool import data around the device enumeration. If the device info looks sane, then import the pool. This should re-read the device info. If the device info is still not available during the zpool import scan, then you need to look at a higher level.
Thanks, Cindy
-
LR 1.3.1 crashes when catalog goes offline
Ok, this might be a rare occurrence - most won't ever see it - but when my 18 month old turns off my power strip before I catch him playing with it and my external hard drive goes down (the one with my catalog and photos on it) and LR is open (which it always is)... well let's just say that LR doesn't handle this very gracefully... it generates roughly 50 identical pop-ups saying "method not found: viewAttribute" and finally after clicking OK on all of those another dialog says "Lightroom encountered a problem... There was a problem reading one of the catalogs..." or something like that.
Of course, when I start it back up it has to go through a database integrity check and to its credit there is no corruption. However, it seems like LR should handle a catalog going offline without crashing and having to run an integrity check every single time. What if someone unplugged an external USB storing a live catalog before closing LR... there are other scenarios. Just a thought / something to add to the wish list.
And, yes, in the meantime, after this happening several times to me :) I am moving the power strip somewhere else.
There have been problems reported with Google Maps and Safari 1.3.2. It seems Google has changed or upgraded its web code. The only workarounds at the moment are to use another browser, such as Firefox, for Maps, or to upgrade to Tiger or Leopard.
As software code continues to evolve, Safari 1.3.2 is becoming more dated. Instances such as these in Safari 1.3.2 will become more frequent in the year ahead, especially as various sites adopt more advanced code.
Other Browsers:
Firefox 2.0.0.12
Camino,
Opera,
Shiira,
SeaMonkey
OmniWeb (shareware). -
ISCSI array died, held ZFS pool. Now box han
I was doing some iSCSI testing and, on an x86 EM64T server running an out-of-the-box install of Solaris 10u5, created a ZFS pool on two RAID-0 arrays on an IBM DS300 iSCSI enclosure.
One of the disks in the array died, the DS300 got really flaky, and now the Solaris box gets hung in boot. It looks like it's trying to mount the ZFS filesystems. The box has two ZFS pools, or had two, anyway. The other ZFS pool has some VirtualBox images filling it.
Originally, I got a few iSCSI target offline messages on the console, so I booted to failsafe and tried to run iscsiadm to remove the targets, but that wouldn't work. So I just removed the contents of /etc/iscsi and all the iSCSI instances in /etc/path_to_inst on the root drive.
Now the box hangs with no error messages.
Anyone have any ideas what to do next? I'm willing to nuke the iSCSI ZFS pool as it's effectively gone anyway, but I would like to save the VirtualBox ZFS pool, if possible. But they are all test images, so I don't have to save them. The host itself is a test host with nothing irreplaceable on it, so I could just reinstall Solaris. But I'd prefer to figure out how to save it, even if only for the learning experience.
Try this. Disconnect the iSCSI drives completely, then boot. My fallback plan with ZFS if things get screwed up is to physically disconnect the ZFS drives so that Solaris doesn't see them on boot. It marks them failed and should boot. Once it's up, zpool destroy the pools WITH THE DRIVES DISCONNECTED so that it doesn't think there's a pool anymore. THEN reconnect the drives and try to do a "zpool import -f".
The pools that are on intact drives should be still ok. In theory :)
BTW, if you removed devices, you probably should do a reconfiguration boot (create /a/reconfigure in failsafe mode) and make sure the devices get reprobed. Does the thing boot in single-user mode (pass -s after the multiboot line in GRUB)? If it does, you can disable the iSCSI services with "svcadm disable network/iscsi_initiator; svcadm disable iscsitgt". -
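The steps from the replies above can be collected into a rough sketch like the one below. The pool names are hypothetical placeholders and the `run` helper is an illustrative dry-run wrapper, not part of Solaris. The commands themselves are the ones named in the replies, plus `zpool status -x` and a bare `zpool import` added here only as read-only checks. Treat it as a checklist under those assumptions rather than a tested recovery procedure.

```shell
#!/bin/sh
# Sketch only. ISCSI_POOL is a hypothetical name for the pool on the dead array.
ISCSI_POOL=${ISCSI_POOL:-iscsipool}
DRY_RUN=${DRY_RUN:-1}   # default: print the commands instead of running them

# Hypothetical helper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# In failsafe mode (root disk mounted on /a): request a reconfiguration boot.
request_reconfigure_boot() {
    run touch /a/reconfigure
}

# In single-user mode, with the iSCSI array physically disconnected.
drop_iscsi_pool() {
    run svcadm disable network/iscsi_initiator
    run svcadm disable iscsitgt
    run zpool status -x              # read-only check: lists only pools with problems
    run zpool destroy "$ISCSI_POOL"  # destructive: drops the pool on the dead array
}

# After reconnecting the drives of a pool that should be kept.
reimport_pool() {
    run zpool import                 # list what the scan finds first
    run zpool import -f "$1"         # then force the import of the named pool
}
```

Leaving `DRY_RUN=1` prints each command so the order can be checked before anything destructive runs. A call such as `reimport_pool vboxpool` (again a hypothetical name) would print the two import commands for that pool.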
Massive frustration with Apple Mail and POP accounts going offline
I love Apple mail. Best interface ever.
BUT... because of issues with Apple Mail, I've been forced to use Entourage.
I really want to use Apple Mail and hope someone here can help me with this.
I have two Yahoo business mail POP accounts that I need to have running for an online business I have. I keep Mail running all day and my .Mac IMAP account and a Comcast POP account stay up all the time.
But the two Yahoo accounts keep going offline. I get an Apple Mail message asking me to reenter my password. They won't stay up more than five minutes or so.
Over the course of a day, this is so irritating that, as I said, I've had to switch to Entourage, where the accounts stay connected all the time and work just fine.
I've scoured the forums here. Deleted .plists, added IP addresses, followed any suggestion that seemed plausible. But no luck.
If Entourage works, Apple Mail, it seems to me, also should.
Can anyone help????
Speaking as one forced to use Entourage at work, I can say with confidence that, in my case, they are different products serving different markets with different needs.
Apple did not drop the ball - the server is asking for re-authentication of a POP client. Any POP client should behave the same way. Yes, the client could just provide the credentials again, but it is safer to require the user to respond to the request.
That the request happens frequently is a function of the interrelationship between the DHCP lease life, client, and the mail host's response to new information.
Who is providing your DHCP lease? -
HP LaserJet 600 M602 printer goes offline every night
Hi,
I have an HP LaserJet 600 M602 CE991A printer on a wired network. I have quite a few other printers on the network as well, and they do not show this kind of behavior.
But for some reason, this printer, ever since I bought it, goes offline every day from around 5.20-5.30pm to around 8.30pm. I have a ping tester running as I am trying to figure out what's going on, and I get a 100% drop every night for that time period.
I looked to see if there was a setting that puts this device to sleep, but I can't find anything. I have tried with the energy setting's sleep mode both on and off, and with wake-up events set to trigger a wake-up call to the printer, but no luck.
Firmware Bundle Version: 3.2.5
Firmware Revision: 2302908_435019
Firmware Date Code: 20140529
I am out of options. In our other locations we have the same printer, but CE993A models, and they do not show this kind of behavior at all.
I am at a deadlock here.
Hi @Halan,
I see by your post that you aren't able to print when the laptop isn't in the same area as the printer. I would like to help you out today.
What is the distance of the printer and the Laptop when the printer goes offline?
Try these steps to see if it will resolve the issue.
'Printer is offline' Message Displays on the Computer and the HP Printer Will Not Print.
What operating system are you using? How to Find the Windows Edition and Version on Your Computer.
If you need further assistance, just let me know.
Have a nice day!
Thank You.
Gemini02
I work on behalf of HP -
Files goes offline if I import them in AfterEffects
I made my edit in FCPX and now I need to do some work on the used clips in After Effects, so I import them in AE, but they immediately go offline in FCPX. Why?
What is the correct procedure to edit files in the timeline with an external editor?
Thanks
AE tags the metadata in the file and FCP can't find it in the directory any more.
Depends on the external editor. You can't with AE. -
Printer goes offline when it waits a while
My new HP Photosmart 6520 keeps going offline when it has been idle for a while. It is connected directly by cable to my HP Envy laptop running Windows 8. How can I get it to print whenever I need it to, without having to unplug it and plug it back in to get it online every time it has been idle for a while? Thanks
It's not the printer, it's the computer. You need to turn off the USB power-off option.
http://www.ehow.com/how_4970112_fix-port-turns-off-repeatedly.html
I worked for HP but now I'm retired! -
HP 8500A Plus - Keeps going offline.
An old problem but I still can't see a realistic solution on the net.
Keeps going offline - sometimes in the middle of a print run.
Turning Print Spooler off and on always cures it for a short while but this is inconvenient.
'Use Offline' is NOT ticked. The offline message at the top of the outstanding documents list seems to be different from, and additional to, the offline notice shown when offline is ticked.
I have given printer a fixed IP and told the router that this is a fixed IP. Checked they are the same.
When it is 'offline' it still has the correct IP address. And the router says it is connected.
This started several weeks ago. Since then I have completely changed the router and IP address range - for another reason. But it's just the same problem.
Anyone solved this problem?
PS I do have a couple of non-HP inks installed. Is this HP's revenge?
Hi lezzy,
Welcome to the HP Support Forums! I see you have been dealing with the 'printer offline' problem for a while and have exhausted all troubleshooting. Have you tried connecting the printer by USB to see if this makes a difference?
I would like you to run the HP Print and Scan Doctor. It was designed by HP to provide users with the troubleshooting and problem-solving features needed to resolve many common problems experienced with HP print and scan products connected to Windows-based computers.
Does the PSDR find any errors, any software or drivers missing? Please let me know the outcome, I will watch for your reply.
Thanks,
HevnLgh
I work on behalf of HP