[Solved] Can't Import ZFS Pool as /dev/disk/by-id

I have a 4-disk raidz1 pool "data" made up of 3TB disks.  Each disk is partitioned so that partition 1 is a 2GB swap partition and partition 2 is the rest of the drive.  The zpool was built out of /dev/disk/by-id paths pointing to the second partition of each disk.
# lsblk -i
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 2.7T 0 disk
|-sda1 8:1 0 2G 0 part
`-sda2 8:2 0 2.7T 0 part
sdb 8:16 0 2.7T 0 disk
|-sdb1 8:17 0 2G 0 part
`-sdb2 8:18 0 2.7T 0 part
sdc 8:32 0 2.7T 0 disk
|-sdc1 8:33 0 2G 0 part
`-sdc2 8:34 0 2.7T 0 part
sdd 8:48 0 2.7T 0 disk
|-sdd1 8:49 0 2G 0 part
`-sdd2 8:50 0 2.7T 0 part
sde 8:64 1 14.9G 0 disk
|-sde1 8:65 1 100M 0 part /boot
`-sde2 8:66 1 3G 0 part /
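For anyone recreating this layout, it can be built roughly as follows. This is a sketch rather than the exact commands I used; the sgdisk type codes (8200 for Linux swap, bf01 for the ZFS partition) are an assumption, and the partitioning would be repeated on sdb through sdd:
# partition one disk: 2G of swap, then the rest for ZFS
sgdisk -n 1:0:+2G -t 1:8200 -n 2:0:0 -t 2:bf01 /dev/sda
mkswap /dev/sda1
# build the pool from the by-id partition paths
zpool create data raidz1 \
    /dev/disk/by-id/ata-ST3000DM001-1CH166_Z1F28ZJX-part2 \
    /dev/disk/by-id/ata-ST3000DM001-1CH166_Z1F0XAXV-part2 \
    /dev/disk/by-id/ata-ST3000DM001-1CH166_Z1F108YC-part2 \
    /dev/disk/by-id/ata-ST3000DM001-1CH166_Z1F12FJZ-part2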
I had a strange disk failure where the controller on one of the drives flaked out and kept my zpool from coming online after a reboot; I had to zpool export data / zpool import data to get the pool back together.  The failure itself is now fixed, but my drives are now identified by their device names:
[root@osiris disk]# zpool status
pool: data
state: ONLINE
scan: resilvered 36K in 0h0m with 0 errors on Wed Aug 13 22:37:19 2014
config:
NAME STATE READ WRITE CKSUM
data ONLINE 0 0 0
raidz1-0 ONLINE 0 0 0
sda2 ONLINE 0 0 0
sdb2 ONLINE 0 0 0
sdc2 ONLINE 0 0 0
sdd2 ONLINE 0 0 0
errors: No known data errors
If I try to import by-id without a pool name, I get this (it's trying to import the whole disks, not the partitions):
[root@osiris disk]# zpool import -d /dev/disk/by-id/
pool: data
id: 16401462993758165592
state: FAULTED
status: One or more devices contains corrupted data.
action: The pool cannot be imported due to damaged devices or data.
see: http://zfsonlinux.org/msg/ZFS-8000-5E
config:
data FAULTED corrupted data
raidz1-0 ONLINE
ata-ST3000DM001-1CH166_Z1F28ZJX UNAVAIL corrupted data
ata-ST3000DM001-1CH166_Z1F0XAXV UNAVAIL corrupted data
ata-ST3000DM001-1CH166_Z1F108YC UNAVAIL corrupted data
ata-ST3000DM001-1CH166_Z1F12FJZ UNAVAIL corrupted data
[root@osiris disk]# zpool status
no pools available
... and the import doesn't succeed.
If I put the pool name at the end, I get:
[root@osiris disk]# zpool import -d /dev/disk/by-id/ data
cannot import 'data': one or more devices is currently unavailable
Yet if I do the same thing with the /dev/disk/by-partuuid paths, it works fine (other than the fact that I don't want partuuids), presumably because that directory has no entries for entire disks.
[root@osiris disk]# zpool import -d /dev/disk/by-partuuid/ data
[root@osiris disk]# zpool status
pool: data
state: ONLINE
scan: resilvered 36K in 0h0m with 0 errors on Wed Aug 13 22:37:19 2014
config:
NAME STATE READ WRITE CKSUM
data ONLINE 0 0 0
raidz1-0 ONLINE 0 0 0
d8bd1ef5-fab9-4d47-8d30-a031de9cd368 ONLINE 0 0 0
fbe63a02-0976-42ed-8ecb-10f1506625f6 ONLINE 0 0 0
3d1c9279-0708-475d-aa0c-545c98408117 ONLINE 0 0 0
a2d9067c-85b9-45ea-8a23-350123211140 ONLINE 0 0 0
errors: No known data errors
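That hunch is easy to check: by-id carries a symlink for each whole disk (the ata-* and wwn-* links without a -part suffix) in addition to one per partition, while by-partuuid only ever lists partitions:
# whole-disk and -part links side by side
ls -l /dev/disk/by-id/ | grep ST3000DM001
# partition entries only
ls -l /dev/disk/by-partuuid/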
As another approach, I tried to offline and replace sda2 with /dev/disk/by-id/ata-ST3000DM001-1CH166_Z1F28ZJX-part2, but that doesn't work either:
[root@osiris disk]# zpool offline data sda2
[root@osiris disk]# zpool status
pool: data
state: DEGRADED
status: One or more devices has been taken offline by the administrator.
Sufficient replicas exist for the pool to continue functioning in a
degraded state.
action: Online the device using 'zpool online' or replace the device with
'zpool replace'.
scan: resilvered 36K in 0h0m with 0 errors on Wed Aug 13 22:37:19 2014
config:
NAME STATE READ WRITE CKSUM
data DEGRADED 0 0 0
raidz1-0 DEGRADED 0 0 0
sda2 OFFLINE 0 0 0
sdb2 ONLINE 0 0 0
sdc2 ONLINE 0 0 0
sdd2 ONLINE 0 0 0
errors: No known data errors
[root@osiris disk]# zpool replace data sda2 /dev/disk/by-id/ata-ST3000DM001-1CH166_Z1F28ZJX-part2
invalid vdev specification
use '-f' to override the following errors:
/dev/disk/by-id/ata-ST3000DM001-1CH166_Z1F28ZJX-part2 is part of active pool 'data'
[root@osiris disk]# zpool replace -f data sda2 /dev/disk/by-id/ata-ST3000DM001-1CH166_Z1F28ZJX-part2
invalid vdev specification
the following errors must be manually repaired:
/dev/disk/by-id/ata-ST3000DM001-1CH166_Z1F28ZJX-part2 is part of active pool 'data'
I would appreciate any suggestions or workarounds for fixing this.
As I was typing this up, I stumbled upon a solution: delete the symlinks in /dev/disk/by-id that point to entire devices (the ata-* and wwn-* links without a -part suffix).  I was then able to do a zpool import -d /dev/disk/by-id data and it pulled in the second partitions (condensed into a sketch after the status output below).  The fix persisted after a reboot, and the deleted symlinks were automatically regenerated when the system came back up:
[root@osiris server]# zpool status
pool: data
state: ONLINE
scan: resilvered 36K in 0h0m with 0 errors on Wed Aug 13 23:06:46 2014
config:
NAME STATE READ WRITE CKSUM
data ONLINE 0 0 0
raidz1-0 ONLINE 0 0 0
ata-ST3000DM001-1CH166_Z1F28ZJX-part2 ONLINE 0 0 0
ata-ST3000DM001-1CH166_Z1F0XAXV-part2 ONLINE 0 0 0
ata-ST3000DM001-1CH166_Z1F108YC-part2 ONLINE 0 0 0
ata-ST3000DM001-1CH166_Z1F12FJZ-part2 ONLINE 0 0 0
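Condensed, the workaround looks like this. A sketch only: it drops the whole-disk links for everything in the directory (including the boot stick's, which is harmless since udev recreates them), and the important part is deleting only links without a -part suffix:
# free the pool, then drop the whole-disk symlinks
zpool export data
cd /dev/disk/by-id
ls ata-* wwn-* | grep -v -- -part | xargs rm
# re-import; only partition links are left to scan
zpool import -d /dev/disk/by-id data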
It would appear to be an issue specifically with importing non-whole-disk vdevs by-id.  Although this turned into rambling rather than a question, hopefully it helps someone having trouble re-importing a zpool by /dev/disk/by-id.
Matt

This just saved my morning. Thank you!
I was using Ubuntu 14.04, and after an upgrade to kernel 3.13.0-43-generic it somehow broke... Anyhow, the zpool now survives restarts again and I don't have to import it with partuuids every time.

Similar Messages

  • [Solved] Can't boot; udev not creating /dev/disk/

    After doing a system upgrade I can no longer boot.  I get the message that /dev/disk/by-uuid/{device id} cannot be found.  (There is no /dev/disk/ path at all.)  I tried changing the grub boot path to /dev/sdaX, but the /dev/sda* nodes are also not created.
    I have a Fedora rescue CD lying around, and if I boot from that I can see and mount the boot device (/dev/disk/by-uuid/ is created), so I don't think it's a hardware issue.
    Last edited by rhodie (2012-03-20 18:26:53)

    Booting from a NetInstaller CD worked.
    A couple of notes in case others have the same issue.
    I had to turn on networking before I could run pacman.  I turned on networking by issuing these commands (not sure of the proper order for them or if they're all necessary, but it worked):
    /etc/rc.d/network start
    ip link set eth0 up
    dhcpcd
    I then ran through the steps listed at https://wiki.archlinux.org/index.php/Pa … nger_boot., but I got errors on "mkinitcpio -p linux", so I added "pacman -S linux", which seemed to run mkinitcpio as part of the install; I ran mkinitcpio again afterwards just to be sure and got no errors.
    Last edited by rhodie (2012-03-20 16:09:20)

  • Can't import photos from camera -"Insufficient disk space on volume"?

    Please excuse my ignorance if there is a fix for this that I haven't found here.... I reinstalled iLife '11 yesterday because iPhoto would not recognize the camera. Now it recognizes the camera but it won't import the photos. I've tried all the suggestions, trashed the cache, the preference file, restarted, created a new library and I still can't import the photos from the camera to iPhoto because it says "insufficient disk space on volume." I have plenty of space on my hard drive. I can import through Image Capture but why go the extra step when importing to iPhoto was so very easy before? Can anyone help with this?

    This has been reported - not sure if there ever was a solution or not - search the forums to find the threads
    you can try renewing the iPhoto preference file to see if that helps -
    A good general step for strange issues is to renew the iPhoto preference file: quit iPhoto, go to "your user name" ==> Library ==> Preferences ==> com.apple.iPhoto.plist, and trash it. Launch iPhoto, which creates a fresh default preference file; reset any personal preferences you had changed, and if you have moved the iPhoto library, repoint to it. This may help.
    This does not affect your photos or any database information (keywords, faces, places, ratings, etc) in any way - they are stored in the iPhoto library - the iPhoto preference file simply controls how iPhoto works - which is why renewing it is a good first step.
    LN 

  • Can't get ZFS Pool to validate in HAStoragePlus

    Hello.
    We rebuilt our cluster with Solaris 10 U6 with Sun Cluster 3.2 U1.
    When I was running U5 we never had this issue, but with U6 I can't get the zpool resource to validate properly in the resource group.
    I am running the following commands:
    zpool create -f tank raidz2 c2t0d0 c2t1d0 c2t2d0 c2t3d0 c3t0d0 c3t1d0 c3t2d0 c3t3d0 spare c2t4d0
    zfs set mountpoint=/share tank
    These commands build my zpool, zpool status comes back good.
    I then run
    clresource create -g tank_rg -t SUNW.HAStoragePlus -p Zpools=tank hastorage_rs
    I get the following output:
    clresource: mbfilestor1 - : no error
    clresource: (C189917) VALIDATE on resource storage_rs, resource group tank_rg, exited with non-zero exit status.
    clresource: (C720144) Validation of resource storage_rs in resource group tank_rg on node mbfilestor1 failed.
    clresource: (C891200) Dec 2 10:27:00 mbfilestor1 SC[SUNW.HAStoragePlus:6,tank_rg,storage_rs,hastorageplus_validate]: : no error
    Dec 2 10:27:00 mbfilestor1 Cluster.RGM.rgmd: VALIDATE failed on resource <storage_rs>, resource group <tank_rg>, time used: 0% of timeout <1800, seconds>
    Failed to create resource "storage_rs".
    My resource group and logical host all work with no problems, and when I ran this command on the older version of Solaris it worked fine. Is this a problem with the newer version of Solaris only?
    I thought maybe downloading the most up-to-date patches would fix this, but it didn't.
    I did notice this in my messages:
    Dec 2 10:26:58 mbfilestor1 Cluster.RGM.rgmd: [ID 224900 daemon.notice] launching method <hastorageplus_validate> for resource <storage_rs>, resource group <tank_rg>, node <mbfilestor1>, timeout <1800> seconds
    Dec 2 10:26:58 mbfilestor1 Cluster.RGM.rgmd: [ID 616562 daemon.notice] 9 fe_rpc_command: cmd_type(enum):<1>:cmd=</usr/cluster/lib/rgm/rt/hastorageplus/hastorageplus_validate>:tag=<tank_rg.storage_rs.2>: Calling security_clnt_connect(..., host=<mbfilestor1>, sec_type {0:WEAK, 1:STRONG, 2:DES} =<1>, ...)
    Dec 2 10:27:00 mbfilestor1 SC[SUNW.HAStoragePlus:6,tank_rg,storage_rs,hastorageplus_validate]: [ID 471757 daemon.error] : no error
    Dec 2 10:27:00 mbfilestor1 Cluster.RGM.rgmd: [ID 699104 daemon.error] VALIDATE failed on resource <storage_rs>, resource group <tank_rg>, time used: 0% of timeout <1800, seconds>
    Any ideas, or should I put in a bug fix request with Sun?

    Hi,
    Thanks, I ended up just going back to Solaris 10 U5. The system was too critical to leave down, and I got tired of messing with it, so I went back. Everything is working like it should. I may try a Live Upgrade on the server and see what happens; maybe the pools and cluster resources will be fine.
    Edited by: mbunixadm on Dec 15, 2008 9:09 AM

  • [solved] Can't format (or I destroyed) a disk

    I have a USB hard disk that had three NTFS partitions on it. The drive was working fine before I started messing with it. I would turn it on, it would auto-mount, and I could read/write to/from it.
    I'm trying to format the disk, but it's not cooperating. Initially I attempted to use gparted (sudo gparted /dev/sdg), but it hung on, "scanning all devices." I read that gparted sometimes hangs on NTFS partitions. Since I didn't want them anyway, I deleted them all using fdisk:
    sudo fdisk /dev/sdg
    Then did d, 1, d, 2, d, 3, w. fdisk said, "Partition table has been altered!" or whatever it says when it's doing its thing. Except it sat there for about 10 minutes and never finished/exited. I couldn't get it to exit (ctrl-c, esc, q...) and had to close the terminal.
    After that, sudo fdisk -l /dev/sdg showed no partitions. So it seemed like fdisk had been successful despite hanging. However, gparted still couldn't get past scanning, and (oddly?) sudo fdisk -l /dev/sdg1 still showed an ntfs partition taking up the entire disk. That was strange to me because there were three NTFS partitions on the disk previously. Shouldn't sdg1 be a partition, and not have partitions? Particularly when sdg showed no partitions at all? As far as I know, there were no extended partitions on here.
    Anyway, I ran sudo fdisk /dev/sdg1 and deleted the NTFS partition. Again, fdisk hung after "Altering table!" and I had to close the terminal. But again the changes did seem to take effect. Now if I fdisk -l on either sdg or sdg1, it prints an empty partition table. Gparted still can't read the disk. I can create partitions using fdisk on either sdg or sdg1, but the drive won't mount. (No errors; I just turn it off and back on, and nothing happens.) Every time I write a change to the disk, fdisk hangs and I need to close the terminal.
    I feel like the problem is the partition table(?) on sdg1. Shouldn't the result of fdisk -l on sdg1 be, "No partition table found?" Is there anything I can do to just, like, reset the whole disk so I can get it open in gparted? Or did I kill it when I exited fdisk?
    Last edited by yawns (2010-06-25 21:33:50)

    /dev/sdg1 is the first partition of the device; that it still exists at all means the kernel wasn't informed that the partition table changed.  This is likely the step at which fdisk hung.  You can try to trigger the re-read manually with the partprobe command.  If that hangs, I'm not sure what else to do except unplugging and replugging the disk, and failing that, rebooting.  Also, check dmesg for anything interesting.
    EDIT: Also, weird stuff will happen if any of those partitions were mounted before you started messing with the partition table, so make sure to unmount everything relating to that disk.
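    In case it helps, the table re-read can be attempted a couple of ways (a sketch, using /dev/sdg from above):
    # ask the kernel to re-read the partition table without rebooting
    partprobe /dev/sdg
    # or equivalently:
    blockdev --rereadpt /dev/sdg
    # then look for errors from the device
    dmesg | tail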
    Last edited by tavianator (2010-06-24 23:24:25)

  • I can't import filmfiles from the hard disk into iMovie?

    Hello,
    I want to move some video files from my PC (Windows) to my new iMac. I put the files on an external USB device and tried to import them into iMovie, but iMovie does not import the files. Earlier I imported some files from my Sony video camera and that was no problem. Can anyone help me with this?
    I'm new to Apple and iMovie!
    Greetings
    Bram

    Welcome to the Apple Discussions.
    Several people have mentioned this and it seems to be a change with 10.6
    If you're using Safari then right click on an image and choose 'Add Image to iPhoto Library'. Otherwise drag the image to the desktop first.
    Regards
    TD

  • VDI 3 patch 2 on Solaris10U7 can't import Desktop - failed zfs commands

    Hello,
    I've installed and tested VDI 3 patch 2 successfully. Everything worked as expected. All components are installed on Solaris 10 U7 with the latest patch cluster, on two hosts: one serves the VDI service + SunRay + VirtualBox 2.0.10, and the other works as the storage host.
    Later I decided to clean up all pools and the storage provider, deleting everything except the settings for the user directories.
    Now I wanted to reactivate the desktop provider with the same options (VirtualBox host and storage host on different machines).
    Everything seems to work correctly, but when I try to import a desktop it fails after a few seconds.
    cacao.0 says:
    INFO: thr#38 Starting Job[513]: IMPORT for [pool]
    18.09.2009 16:18:29 com.sun.vda.service.vbox.ZfsStorage resizeVolume
    SCHWERWIEGEND: thr#38 Command "zfs set volsize=18253610496B tank/07cf8f40-2d27-416d-87dc-7613b7c699ad" failed on host con05!
    Error code: 1
    18.09.2009 16:18:29 com.sun.vda.service.vbox.Storage copyFileToVolume
    SCHWERWIEGEND: thr#38 null
    com.sun.vda.service.api.ServiceException: Resizing of volume 07cf8f40-2d27-416d-87dc-7613b7c699ad failed on storage server con05.
    at com.sun.vda.service.vbox.ZfsStorage.resizeVolume(Unknown Source)
    at com.sun.vda.service.vbox.Storage.copyFileToVolume(Unknown Source)
    at com.sun.vda.service.vbox.VBDesktopProvider.importImage(Unknown Source)
    at com.sun.vda.service.vbox.VBDesktopProvider.newDesktopInstance(Unknown Source)
    at com.sun.vda.service.core.jobs.ImportJob.runJob(Unknown Source)
    at com.sun.vda.service.core.jobs.Job.run(Unknown Source)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:417)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:269)
    at java.util.concurrent.FutureTask.run(FutureTask.java:123)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:651)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:676)
    at java.lang.Thread.run(Thread.java:595)
    and on the storage side, the zfs history (the first command was mine; the rest belong to the vda service):
    2009-09-18.11:37:11 zpool create tank c0t0d0s1
    2009-09-18.16:16:50 zfs create -s -V 1048576B tank/test0.6129920980714989
    2009-09-18.16:16:53 zfs set shareiscsi=off tank/test0.6129920980714989
    2009-09-18.16:16:54 zfs destroy tank/test0.6129920980714989
    2009-09-18.16:16:57 zfs create -s -V 1048576B tank/test0.6574280129256773
    2009-09-18.16:16:59 zfs set shareiscsi=off tank/test0.6574280129256773
    2009-09-18.16:17:01 zfs destroy tank/test0.6574280129256773
    2009-09-18.16:18:24 zfs create -s -V 7516192768B tank/07cf8f40-2d27-416d-87dc-7613b7c699ad
    2009-09-18.16:18:27 zfs set shareiscsi=on tank/07cf8f40-2d27-416d-87dc-7613b7c699ad
    2009-09-18.16:18:28 zfs set shareiscsi=off tank/07cf8f40-2d27-416d-87dc-7613b7c699ad
    2009-09-18.16:18:31 zfs set shareiscsi=off tank/07cf8f40-2d27-416d-87dc-7613b7c699ad
    2009-09-18.16:18:32 zfs destroy -R tank/07cf8f40-2d27-416d-87dc-7613b7c699ad
    What is the vda service doing at this point?
    I hope someone can point me in the right direction.
    Thanks in advance,
    Daniel

    Hey all -
    Being that I came across this and was in a hurry, I hacked up a very hacky hack to hack around the problem.
    #! /usr/bin/ksh
    # Wrapper around the real zfs binary: when the known-bad volsize shows
    # up in the arguments, substitute the next multiple of the block size.
    if echo "$*" | grep 53687090688B >/dev/null 2>&1
    then
            logger -p kern.crit "Saw Dumb Stuff - employing workaround"
            zfs.binary `echo $* | sed 's/53687090688/53687091200/'`
            EXITCODE=$?
    else
            zfs.binary "$@"
            EXITCODE=$?
    fi
    exit $EXITCODE
    I determined the size it was using with the dtrace toolkit execsnoop script, then calculated the next block size up from there.
    I copied /usr/sbin/zfs to /usr/sbin/zfs.binary and replaced /usr/sbin/zfs with my script ONLY WHILST THE IMPORT OPERATION WAS HAPPENING. Clearly, there are a lot of operations the script does bad things to...
    It's putrid, and has absolutely no error checking and it should not be used under any circumstances. :)
    You can see what it might take to work around the problem though...
    Looks like it's a classic off by one error. In my case, the calculated value they try to resize to is 8192 bytes off (one * blocksize) so it fails.
    YMMV, but I'm importing as we speak, so it is worth a bash.
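    For anyone adapting the hack: the substitute value just has to be the requested size rounded up to a multiple of the zvol block size (8K by default), which can be computed rather than hard-coded. A sketch:
    # round the failing byte count up to the next 8K boundary
    awk 'BEGIN { bs = 8192; v = 53687090688; printf "%d\n", int((v + bs - 1) / bs) * bs }'
    # prints 53687091200 -- the value in the sed above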
    NOTE: don't just copy / paste the script if you don't know what you are doing. Just don't. (eg: It'll always return success!)
    Cheers,
    Nathan.
    Edited by: vdidude on Sep 30, 2009 7:39 PM
    Edited by: vdidude on Sep 30, 2009 8:47 PM

  • Can't import ISO Files in VM Manager because of empty Server Pool Drop Down

    Hello,
    I have a problem: I can't import ISO files into my resources because the Server Pool Name drop-down in the import mask is empty. In the Virtual Machine Templates mask the Server Pool Name is available, and I already have one VM running from an Oracle template.
    The server is 2.2.1 and the VM Manager is 2.2.0. The server is a Pentium 4 3.2 GHz with 4 GB of RAM. High Availability mode is enabled for this server pool.
    Does anyone have an idea what the problem could be?
    Any help will be appreciated.
    Thanks in advance,
    Ben

    I am no expert in this area myself, but I believe the issue may be with the shared storage changes that are needed for high availability.
    The following reference may help: http://download.oracle.com/docs/cd/E15458_01/doc.22/e15444/ha.htm
    I may of course be misleading you completely.

  • Solaris 10 upgrade and zfs pool import

    Hello folks,
    I am currently running "Solaris 10 5/08 s10x_u5wos_10 X86" on a Sun Thumper box where two drives are a mirrored UFS boot volume and the rest are used in ZFS pools. I would like to upgrade the system to "10/08 s10x_u6wos_07b X86" to be able to use ZFS for the boot volume. I've seen documentation that describes how to break the mirror, create a new BE, and so on. This system is only being used as an iSCSI target for Windows systems, so there is really nothing on the box that I need other than my ZFS pools. Could I simply pop the DVD in, perform a clean install, and select my current UFS drives as the install location, basically telling Solaris to wipe them clean and create an rpool out of them? Once the installation is complete, would I be able to import my existing ZFS pools?
    Thank you very much

    Sure. As long as you don't write over any of the disks in your ZFS pool you should be fine.
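    A minimal sequence for that, assuming the data pools are left untouched during the reinstall (the pool name here is a placeholder):
    # before the reinstall, cleanly detach the data pool (optional but tidy)
    zpool export tank
    # after the fresh install, discover the old pools and pull them back in
    zpool import # lists importable pools found on attached disks
    zpool import tank # add -f if the pool was not exported cleanly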
    Darren

  • ZFS - Can't make raidz pool available. Please Help

    Hi All,
    Several months ago I created a raidz pool on a 6-disk external Sun array. It was working fine until the other day, when I lost a drive. I took out the old drive and put in the new one, and am unable to bring the pool back up. It won't let me issue a zpool replace, or an online, or anything. Here is hopefully all the info you need to see what's going on (if you need more, let me know):
    Piece of dmesg from after the reboot.
    Dec 19 14:17:14 stzehlsun fmd: [ID 441519 daemon.error] SUNW-MSG-ID: ZFS-8000-CS, TYPE: Fault, VER: 1, SEVERITY: Major
    Dec 19 14:17:14 stzehlsun EVENT-TIME: Tue Dec 19 14:17:14 EST 2006
    Dec 19 14:17:14 stzehlsun PLATFORM: SUNW,Ultra-2, CSN: -, HOSTNAME: stzehlsun
    Dec 19 14:17:14 stzehlsun SOURCE: zfs-diagnosis, REV: 1.0
    Dec 19 14:17:14 stzehlsun EVENT-ID: 644874cf-084d-413d-88c6-c195db617041
    Dec 19 14:17:14 stzehlsun DESC: A ZFS pool failed to open. Refer to http://sun.com/msg/ZFS-8000-CS for more information.
    Dec 19 14:17:14 stzehlsun AUTO-RESPONSE: No automated response will occur.
    Dec 19 14:17:14 stzehlsun IMPACT: The pool data is unavailable
    Dec 19 14:17:14 stzehlsun REC-ACTION: Run 'zpool status -x' and either attach the missing device or
    Dec 19 14:17:14 stzehlsun restore from backup.
    # zpool status
    pool: array
    state: FAULTED
    status: One or more devices could not be opened. There are insufficient
    replicas for the pool to continue functioning.
    action: Attach the missing device and online it using 'zpool online'.
    see: http://www.sun.com/msg/ZFS-8000-D3
    scrub: none requested
    config:
    NAME STATE READ WRITE CKSUM
    array UNAVAIL 0 0 0 insufficient replicas
    c0t9d0 ONLINE 0 0 0
    c0t10d0 ONLINE 0 0 0
    c0t11d0 ONLINE 0 0 0
    c0t12d0 ONLINE 0 0 0
    c0t13d0 UNAVAIL 0 0 0 cannot open
    c0t14d0 ONLINE 0 0 0
    # zpool online array c0t13d0
    cannot open 'array': pool is currently unavailable
    run 'zpool status array' for detailed information
    # zpool replace array c0t13d0
    cannot open 'array': pool is currently unavailable
    run 'zpool status array' for detailed information
    As you can see, I've replaced c0t13d0 with the new drive; format sees it just fine, and it appears to be up and running. What do I need to do to get this new drive into the raidz pool and get my pool back online? I just don't see what I'm missing here. Thanks!
    Steve

    Sadly, I never received an answer on this forum, So I opened a ticket with Sun, and they got right back to me. For anyone following this thread, I'll pass along what they told me.
    Basically, I THOUGHT I had created a raidz pool; apparently I did not, and had only created a RAID0 pool. So with the one disk gone there was no parity to rebuild the array, and it remained faulted with no way to fix it; the only solution was to destroy the pool and start again. I really thought I had created a raidz, but now that I have created a real raidz pool, I can see the difference in the zpool status output.
    Before: (MUST have been RAID0)
    NAME STATE READ WRITE CKSUM
    array UNAVAIL 0 0 0 insufficient replicas
    c0t9d0 ONLINE 0 0 0
    c0t10d0 ONLINE 0 0 0
    c0t11d0 ONLINE 0 0 0
    c0t12d0 ONLINE 0 0 0
    c0t13d0 UNAVAIL 0 0 0 cannot open
    c0t14d0 ONLINE 0 0 0
    After creating a REAL raidz pool:
    NAME STATE READ WRITE CKSUM
    array ONLINE 0 0 0
    raidz ONLINE 0 0 0
    c0t9d0 ONLINE 0 0 0
    c0t10d0 ONLINE 0 0 0
    c0t11d0 ONLINE 0 0 0
    c0t12d0 ONLINE 0 0 0
    c0t13d0 ONLINE 0 0 0
    c0t14d0 ONLINE 0 0 0
    Note the added raidz line.
    I asked the tech support guy if it was possible that I HAD created a raidz but, due to the disk loss and reboots, a bug was showing it as RAID0; he said there are no reported cases of such an incident and he really didn't think so. So I guess I just messed up when I created it in the first place, and since I didn't know what a raidz pool would look like, I had no way of knowing I hadn't created one. (Yes, I know I could have added up the disk space and realized no disk was being used for parity, but I didn't.)
    So the moral here is to make sure you actually created what you thought you created; then it will do what you expect.
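    For the record, the difference at creation time is a single keyword, which is easy to miss (a sketch using the device names above):
    # stripe (RAID0): no redundancy; losing any one disk loses the pool
    zpool create array c0t9d0 c0t10d0 c0t11d0 c0t12d0 c0t13d0 c0t14d0
    # raidz: single parity; the pool survives the loss of one disk
    zpool create array raidz c0t9d0 c0t10d0 c0t11d0 c0t12d0 c0t13d0 c0t14d0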

  • [SOLVED] ERROR: Can't find root device /dev/disk/by-uuid/...

    Yesterday I upgraded my Lenovo X100e
    # pacman -Syu
    and rebooted.  Upon reboot, I received the error
    ERROR: Cannot find root device '/dev/disk/by-uuid/[...]'
    just after kernel decompression. I got dropped into the recovery shell
    and could not boot.
    The problem persisted despite using different root device names (e.g. /dev/sda3, the actual root device).  After reading the instructions at https://wiki.archlinux.org/index.php/Chroot I used a core installation image (on a usb stick) to boot the machine, then chrooted into my installation.  I ran mkinitcpio and found that the udev hook was missing, i.e. not in /lib/initcpio/hooks.  Nor was there a file called 'udev' in /lib/initcpio/install.
    I copied these files from a friend's installation and then re-ran mkinitcpio:
    # mkinitcpio -p linux
    I was able to reboot successfully after that.
    The weird thing: I don't know how the udev hook script was deleted from /lib/initcpio/hooks.
    If someone else runs into this problem: try to run mkinitcpio (e.g. by using the chroot), and check for this problem.  I think the problem was that the root device could not be found because the udev hook had not run, and therefore /dev was unpopulated.

    Between udev-181-2 and udev-181-5, the hooks moved from /lib/initcpio to /usr/lib/initcpio. But mkinitcpio -L should still list them.
    I have a similar problem since the last update. The kernel doesn't seem to load my raid driver anymore. Upon boot it throws some cryptic udev messages at me and then crashes. Haven't found out what that is about yet.

  • SFTP chroot from non-global zone to zfs pool

    Hi,
    I am unable to create an SFTP chroot inside a zone to a shared folder on the global zone.
    Inside the global zone:
    I have created a zfs pool (rpool/data) and then mounted it to /data.
    I then created some shared folders: /data/sftp/ipl/import and /data/sftp/ipl/export
    I then created a non-global zone and added a file system that loops back to /data.
    Inside the zone:
    I then did the usual stuff to create a chrooted sftp user, similar to: http://nixinfra.blogspot.com.au/2012/12/openssh-chroot-sftp-setup-in-linux.html
    I modified the /etc/ssh/sshd_config file and hard-wired the ChrootDirectory to /data/sftp/ipl.
    When I attempt to sftp into the zone, an error message is displayed in the zone -> fatal: bad ownership or modes for chroot directory /data/
    Multiple web sites warn that folder ownership and access privileges are important. However, issuing chown -R root:iplgroup /data made no difference. Perhaps it is something to do with the fact that the folders were created in the global zone?
    If I create a simple shared folder inside the zone it works, e.g. /data3/ftp/ipl......ChrootDirectory => /data3/ftp/ipl
    If I use the users home directory it works. eg /export/home/sftpuser......ChrootDirectory => %h
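    From what I've read, sshd refuses to chroot unless every component of the ChrootDirectory path is owned by root and writable only by root, which would explain the "bad ownership or modes" error. Something along these lines may be needed (a sketch using the paths above):
    # every path component up to the chroot must be root-owned, mode 755
    chown root:root /data /data/sftp /data/sftp/ipl
    chmod 755 /data /data/sftp /data/sftp/ipl
    # the transfer directories themselves can then be group-writable
    chown root:iplgroup /data/sftp/ipl/import /data/sftp/ipl/export
    chmod 775 /data/sftp/ipl/import /data/sftp/ipl/export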
    FYI: the reason for having a ZFS shared folder is to allow separate SFTP and FTP zones with a common/shared data repository for FTP and SFTP exchanges with remote systems. E.g., one remote client pushes data to the FTP server; a second remote client pulls the data via SFTP. Having separate zones increases security?
    Any help would be appreciated to solve this issue.
    Regards John

    sanjaykumarfromsymantec wrote:
    Hi,
    I want to do IPC between zones (communication between processes running in two different zones). What different techniques can be used? I am not interested in TCP/IP (AF_INET) sockets.
    Zones are designed to prevent most visibility between non-global zones and other zones, so network communication (like you might use between two physical machines) is the most common method.
    You could mount a global zone filesystem into multiple non-global zones (via lofs) and have your programs push data there. But you'll probably have to poll for updates. I'm not certain that's easier or better than network communication.
    Darren

  • Large number of Transport errors on ZFS pool

    This is sort of a continuation of the thread:
    Issues with HBA and ZFS
    But since it is a separate question, I thought I'd start a new thread.
    Because of a bug in 11.1, I had to downgrade to 10_U11. I'm using an LSI 9207-8i HBA (SAS2308 chipset). I have no errors on my pools, but I consistently see errors when trying to read from the disks; they are always Retryable or Reset. All in all the system functions, but as I started testing I noticed a lot of errors in iostat.
    bash-3.2# iostat -exmn
    extended device statistics ---- errors ---
    r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b s/w h/w trn tot device
    0.1 0.2 1.0 28.9 0.0 0.0 0.0 41.8 0 1 0 0 1489 1489 c0t5000C500599DDBB3d0
    0.0 0.7 0.2 75.0 0.0 0.0 21.2 63.4 1 1 0 1 679 680 c0t5000C500420F6833d0
    0.0 0.7 0.3 74.6 0.0 0.0 20.9 69.8 1 1 0 0 895 895 c0t5000C500420CDFD3d0
    0.0 0.6 0.4 75.5 0.0 0.0 26.7 73.7 1 1 0 1 998 999 c0t5000C500420FB3E3d0
    0.0 0.6 0.4 75.3 0.0 0.0 18.3 68.7 0 1 0 1 877 878 c0t5000C500420F5C43d0
    0.0 0.0 0.2 0.7 0.0 0.0 0.0 2.1 0 0 0 0 0 0 c0t5000C500420CE623d0
    0.0 0.6 0.3 76.0 0.0 0.0 20.7 67.8 0 1 0 0 638 638 c0t5000C500420CD537d0
    0.0 0.6 0.2 74.9 0.0 0.0 24.6 72.6 1 1 0 0 638 638 c0t5000C5004210A687d0
    0.0 0.6 0.3 76.2 0.0 0.0 20.0 78.4 1 1 0 1 858 859 c0t5000C5004210A4C7d0
    0.0 0.6 0.2 74.3 0.0 0.0 22.8 69.1 0 1 0 0 648 648 c0t5000C500420C5E27d0
    0.6 43.8 21.3 96.8 0.0 0.0 0.1 0.6 0 1 0 14 144 158 c0t5000C500420CDED7d0
    0.0 0.6 0.3 75.7 0.0 0.0 23.0 67.6 1 1 0 2 890 892 c0t5000C500420C5E1Bd0
    0.0 0.6 0.3 73.9 0.0 0.0 28.6 66.5 1 1 0 0 841 841 c0t5000C500420C602Bd0
    0.0 0.6 0.3 73.6 0.0 0.0 25.5 65.7 0 1 0 0 678 678 c0t5000C500420D013Bd0
    0.0 0.6 0.3 76.5 0.0 0.0 23.5 74.9 1 1 0 0 651 651 c0t5000C500420C50DBd0
    0.0 0.6 0.7 70.1 0.0 0.1 22.9 82.9 1 1 0 2 1153 1155 c0t5000C500420F5DCBd0
    0.0 0.6 0.4 75.3 0.0 0.0 19.2 58.8 0 1 0 1 682 683 c0t5000C500420CE86Bd0
    0.0 0.0 0.2 0.7 0.0 0.0 0.0 1.9 0 0 0 0 0 0 c0t5000C500420F3EDBd0
    0.1 0.2 1.0 26.5 0.0 0.0 0.0 41.9 0 1 0 0 1511 1511 c0t5000C500599E027Fd0
    2.2 0.3 133.9 28.2 0.0 0.0 0.0 4.4 0 1 0 17 1342 1359 c0t5000C500599DD9DFd0
    0.1 0.3 1.1 29.2 0.0 0.0 0.2 34.1 0 1 0 2 1498 1500 c0t5000C500599DD97Fd0
    0.0 0.6 0.3 75.6 0.0 0.0 22.6 71.4 0 1 0 0 677 677 c0t5000C500420C51BFd0
    0.0 0.6 0.3 74.8 0.0 0.1 28.6 83.8 1 1 0 0 876 876 c0t5000C5004210A64Fd0
    0.6 43.8 18.4 96.9 0.0 0.0 0.1 0.6 0 1 0 5 154 159 c0t5000C500420CE4AFd0
    Mar 12 2013 17:03:34.645205745 ereport.fs.zfs.io
    nvlist version: 0
         class = ereport.fs.zfs.io
         ena = 0x114ff5c491a00c01
         detector = (embedded nvlist)
         nvlist version: 0
              version = 0x0
              scheme = zfs
              pool = 0x53f64e2baa9805c9
              vdev = 0x125ce3ac57ffb535
         (end detector)
         pool = SATA_Pool
         pool_guid = 0x53f64e2baa9805c9
         pool_context = 0
         pool_failmode = wait
         vdev_guid = 0x125ce3ac57ffb535
         vdev_type = disk
         vdev_path = /dev/dsk/c0t5000C500599DD97Fd0s0
         vdev_devid = id1,sd@n5000c500599dd97f/a
         parent_guid = 0xcf0109972ceae52c
         parent_type = mirror
         zio_err = 5
         zio_offset = 0x1d500000
         zio_size = 0xf1000
         zio_objset = 0x12
         zio_object = 0x0
         zio_level = -2
         zio_blkid = 0x452
         __ttl = 0x1
         __tod = 0x513fa636 0x26750ef1
    I know these drives are not bad, and I have confirmed they are all running the latest firmware and the correct sector size, 512 (ashift 9). I am thinking it is some sort of compatibility issue with this new HBA, but I have no way of verifying. Anyone have any suggestions?
    Edited by: 991704 on Mar 12, 2013 12:45 PM

    There must be something small I am missing. We have another system configured nearly the same (same server and HBA, different drives) and it works. I've gone through the recommended storage practices guide. The only item I have not been able to verify is:
    "Confirm that your controller honors cache flush commands so that you know your data is safely written, which is important before changing the pool's devices or splitting a mirrored storage pool. This is generally not a problem on Oracle/Sun hardware, but it is good practice to confirm that your hardware's cache flushing setting is enabled."
    How can I confirm this? As far as I know these HBAs are simply HBAs. No battery backup. No on-board memory. The 9207 doesn't even offer RAID.
    Edited by: 991704 on Mar 15, 2013 12:33 PM

  • iSCSI array died, held ZFS pool. Now box hangs

    I was doing some iSCSI testing and, on an x86 EM64T server running an out-of-the-box install of Solaris 10u5, created a ZFS pool on two RAID-0 arrays on an IBM DS300 iSCSI enclosure.
    One of the disks in the array died, the DS300 got really flaky, and now the Solaris box gets hung in boot. It looks like it's trying to mount the ZFS filesystems. The box has two ZFS pools, or had two, anyway. The other ZFS pool has some VirtualBox images filling it.
    Originally, I got a few iSCSI target offline messages on the console, so I booted to failsafe and tried to run iscsiadm to remove the targets, but that wouldn't work. So I just removed the contents of /etc/iscsi and all the iSCSI instances in /etc/path_to_inst on the root drive.
    Now the box hangs with no error messages.
    Anyone have any ideas what to do next? I'm willing to nuke the iSCSI ZFS pool as it's effectively gone anyway, but I would like to save the VirtualBox ZFS pool, if possible. But they are all test images, so I don't have to save them. The host itself is a test host with nothing irreplaceable on it, so I could just reinstall Solaris. But I'd prefer to figure out how to save it, even if only for the learning experience.

    Try this: disconnect the iSCSI drives completely, then boot. My fallback plan with ZFS, if things get screwed up, is to physically disconnect the ZFS drives so that Solaris doesn't see them on boot. It marks them failed and should boot. Once it's up, zpool destroy the pools WITH THE DRIVES DISCONNECTED so that it doesn't think there's a pool anymore. THEN reconnect the drives and try a "zpool import -f".
    The pools that are on intact drives should still be ok. In theory :)
    BTW, if you removed devices, you probably should do a reconfiguration boot (create a /a/reconfigure in failsafe mode) and make sure the devices gets reprobed. Does the thing boot in single user ( pass -s after the multiboot line in grub )? If it does, you can disable the iscsi svcs with "svcadm disable network/iscsi_initiator; svcadm disable iscsitgt".
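    Condensed, the suggested sequence looks something like this (a sketch; the pool names are placeholders for the dead iSCSI pool and the VirtualBox pool):
    # with the dead iSCSI LUNs disconnected, drop the broken pool's state
    zpool destroy iscsipool
    # reconnect the surviving drives, then pull the good pool back in
    zpool import -f vboxpool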

  • [Solved]Can't get past "Starting network" during installation from USB

    I'm trying to do a fresh install of Archlinux on my laptop, following the Beginners Guide. However, I never get to the shell prompt because the installation gets stuck at "Starting network [Busy]".
    These are steps I followed:
    - I downloaded the latest Arch ISO.
    - I wrote it to a usb stick ("Kingston Data Traveler") with "dd if=archlinux.iso of=/dev/sdb"
    - I put the usb stick into my laptop (HP Pavilion dm1-4139sd), and booted it. I chose "x86_64". Everything seems to go fine, the only warning I get is: "microcode: failed to load file amd-ucode/microcode_amd.bin".
    - I get the message "INIT: Entering runlevel: 3" and "Starting Syslog-NG [Done]", after that its endlessly (at least 20 minutes) stuck on "Starting network [Busy]".
    I tried the following variations on the above:
    - Wrote the ISO to a different USB stick, same problem.
    - Tried it with a different Arch version (from August), same problem.
    - Turned kvm on and off, same problem.
    - Pressed TAB at the "x86_64" line and added "acpi=off" to the end of it. I then get:
        :: Triggering uevents....
        [ 19.697932 ] usb 1-1: device not accepting address 2, error -110
        :: Mounting '/dev/disk/by-label/ARCH_201209' to '/run/archiso/bootmnt'
        Waiting 30 seconds for device /dev/disk/by-label/ARCH_201209
        [ 35.312165 ] usb 1-1: device not accepting address 3, error -110
        [ 45.821422 ] usb 1-1: device not accepting address 4, error -110
        [ 56.330681 ] usb 1-1: device not accepting address 5, error -110
        [ 56.330681 ] hub 1-0:1.0: unable to enumerate USB devices on port 1
        ERROR: '/dev/disk/by-label/ARCH_201209' device did not show up after 30 seconds....
        Falling back to interactive prompt
        You can try to fix the problem manually, log out when you are finished
        sh: can't access tty; job control turned off
        [rootfs /]#
    I'm not sure what to do in this prompt. The network isn't working, and none of the normal commands are available.
    - I reset the DHCP leases on my modem, same problem.
    - I turned off/on the hardware WLAN, same problem.
    - I tried it with or without the LAN cable plugged in, same problem.
    - I tried it with or without the battery plugged in, and with or without the power cable plugged in. Same problem.
    I also performed the Arch memory test, and the bios hard disk test. They both report no errors.
    I did an Arch installation a month ago on this laptop, and back then it didn't have this problem. I still have that Arch installed from back then and it boots/works without problems (I get Starting NetworkManager [Done]). I know I technically don't need to install Arch again, as I still have the old one, but I wanted to clean out the changes I made to some settings files, and if it turns out this is some kind of hardware problem I'd like to return the laptop to the store.
    Please help! I'm not really that familiar with Linux in general, so it might have been a very basic mistake from me as well. I have no idea what to do next, except for formatting my harddisk (which I don't think should influence the installer?), or trying to install a different Linux distro.
    Edit: The solution that solved this problem for me in the end was resetting the BIOS to its defaults.
    Last edited by dbakker (2012-09-15 11:28:14)

    DSpider wrote:Did you check the MD5 checksum? If you downloaded it using a BitTorrent client, then you don't need to (its hash is checked during downloading).
    Yes, I downloaded it using BitTorrent and the MD5 matches.
    DSpider wrote:Try a different stick?
    Yes, I tried it with 2 different USB sticks. They are both from the same brand ("Kingston DataTraveler"), so I might try different brands as well, but I don't think it is the problem.
    DSpider wrote:Or at least a different method from that wiki page. For example, "dd"-ing it should work just fine.
    The methods for writing the ISO that I tried so far:
    The base method:
    dd if=archlinux.iso of=/dev/sdb
    (from an Arch install)
    The method without overwriting your USB drive (created a FAT32 drive with a VFAT filesystem, from an Arch install)
    Using Imagewriter (from Windows)
    Each method gives the same result: the laptop gets stuck at "Starting network [Busy]" when booted from one of the USB sticks. My modem indicates that it "sees" the laptop, but that the laptop has no established connection. When the laptop is stuck at the "Starting network [Busy]", all the activity on the USB stick stops (the light stops flashing).
    The thing I find most strange is that the OS currently installed on the laptop has no problems booting or accessing the internet (both wired and wireless) at all.
