NSS Pool Deactivating - take 2

I'm having a similar situation to the other thread by this name.
Our server had a failed RAID drive, which we replaced. Since then we have had issues with some data files. When we tried to back them up or read them, the NSS pool would dismount and we'd have to restart the server. I ran "nss /poolrebuild /purge" yesterday on all the pools and the problem persists.
We're running NW65 and have N65NSS8C installed.
What's our next step? I'm not quite ready to throw in the towel, back up what we can, and rebuild the server. I gotta believe there's SOMEthing else we can do.

Originally Posted by zeffan
I'm having a similar situation to the other thread by this name.
Our server had a failed RAID drive, which we replaced. Since then we have had issues with some data files. When we tried to back them up or read them, the NSS pool would dismount and we'd have to restart the server. I ran "nss /poolrebuild /purge" yesterday on all the pools and the problem persists.
We're running NW65 and have N65NSS8C installed.
What's our next step? I'm not quite ready to throw in the towel, back up what we can, and rebuild the server. I gotta believe there's SOMEthing else we can do.
What make of server / array? What RAID level? Software or hardware RAID? If you replace a RAID drive and you lose data, or even have an interruption of service, you didn't have RAID. In many instances, if there is an array rebuild failure, the array will tell you this and then indicate that the bad data (mirror mismatches) will still exist until it is overwritten. HP does this. The problem is that NSS cannot know which blocks are bad or not, and if it reads a bad block, the read fails and NSS thinks the disk channel has failed (which, in effect, it has).
Ultimately, if the RAID failed to do its job, every block is suspect. Restore from the last good backup.
More details are required to fully answer your question. But assuming you had an array failure that caused a loss of data, you should recreate the array and restore.
-- Bob
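
A rough sketch of the full check-then-repair sequence on the NetWare 6.5 NSS console, in case it helps to compare a read-only verify log against what the rebuild reports (POOL1 is a placeholder pool name; the /purge variant already run above additionally discards unrecoverable objects):

Code:
nss /pooldeactivate=POOL1
nss /poolverify=POOL1
nss /poolrebuild=POOL1
nss /poolactivate=POOL1
mount all

If a freshly rebuilt pool keeps producing new corruption on the next verify, that points back at the disk channel / RAID layer rather than at NSS itself, which is consistent with Bob's advice to treat every block as suspect.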

Similar Messages

  • OES11 SP2 - Linux File System and NSS Pools & Volumes

    Planning to install our first OES11 SP2 server into an existing tree - the
    idea is to run this new OES11 server virtualized on VMware ESXi 5.5.
    The existing tree has two physical NW6.5SP8 servers running eDirectory
    Version 8.7.3.10b (NDS Version 10554.64). One of the existing Netware
    servers is used for DHCP/DNS, File Sharing from three NSS volumes and
    Groupwise 7.0.4 whilst the second server is used for FTP services and
    eDirectory redundancy. Ultimately the plan is to have two virtualized OES11
    SP2 servers, one for file sharing and the other for making the move from
    GW7 to GW2012. And we're planning to stick with NSS for file sharing on the
    new OES11 SP2 server.
    I've come across a couple of posts for earlier versions of OES which
    recommended not to put the Linux Native OS File System and NSS storage
    pools/volumes on the same hard drive. Apparently the advice was a result of
    needing to allow EVMS to manage the drive, which could be problematic.
    I've read the OES11 documentation which says that "The Enterprise Volume
    Management System (EVMS) has been deprecated in SLES 11, and is also
    deprecated in OES 11. Novell Linux Volume Manager (NLVM) replaces EVMS for
    managing NetWare partitions under Novell Storage Services (NSS) pools."
    So I'm wondering whether there is still a compelling reason to keep the
    Linux native file system and NSS pools/volumes on separate hard drives, or
    whether they can now safely co-exist on the same drive without causing
    headaches or gotchas in the future?
    Neil

    Hi Willem,
    Many thanks for the further reply.
    So we can just use the VMWare setup to "split" the one physical drive into
    two virtual drives (one for the OS and the second for the pools).
    And I've seen posts in other forums about the need for a decent battery
    backed cache module for the P410i controller so I'll make sure we get one
    (probably a 512MB module + battery).
    Can I ask what is the advantage of configuring each VM's virtual disk to run
    on its own virtual SCSI adapter (by setting disk1 to scsi 0:0, disk2 to
    scsi 1:0, and so on)?
    Cheers,
    Neil
    >>> On 9/5/14 at 12:56, in message
    <[email protected]>,
    magic31<[email protected]> wrote:
    > Hi Neil,
    >
    > xyzl;2318555 Wrote:
    >> The new installation will run on a Proliant ML350 G6 with P410i
    >> controller so we can use the raid capability to create two different
    >> logical drives as suggested.
    >
    > As you will be using ESXi 5.5 as host OS, it's not needed to split the
    > host server storage into two logical drives... unless that's what you
    > want in perspective for "general performance" or redundancy reasons. It
    > also depends on the options that P410i controller has.
    >
    > On a side note, I'm not too familiar with the P410i controller... do make
    > sure you have a decent battery backed cache module installed, as that will
    > greatly help with the disk performance bit.
    > If the controller can handle it, go for raid 10 or raid 50. That might be
    > too big a space penalty but will help with disk performance.
    >
    > Once you have your VMware server up and running, you can configure the
    > two VMs with two or more drives attached each (one for the OS, the
    > second or others for your pools).
    > I usually create a virtual disk per pool+volume set (e.g. DATAPOOL &
    > DATAVOLUME on one vm virtual disk, USERPOOL & USER volume on another
    > vm virtual disk).
    > With VMware you can then also configure each VM's virtual disk to run
    > on its own virtual SCSI adapter (by setting disk1 to scsi 0:0, disk2
    > to scsi 1:0, and so on).
    >
    >
    > xyzl;2318555 Wrote:
    >> Do you have any suggestions for the disk space that should be reserved
    >> or used for the Linux Native OS File System (/boot, /swap and LVM)?
    >>
    >
    > Here's one thread that might be of interest (there are more throughout
    > the SLES/OES forums):
    > https://forums.novell.com/showthread...rtitioning-%28moving-from-NW%29
    >
    > I still contentedly follow the method I chose back in 2008, just with
    > a little bigger sizing, which now is:
    >
    > On a virtual disk sized 39GB:
    >
    > primary partition 1: 500 MB /boot, fs type ext2
    > primary partition 2: 12GB / (root), fs type ext3
    > primary partition 3: 3 GB swap, type swap
    >
    > primary partition 4: LVM VG-SYSTEM (LVM partition type 8E), takes up
    > the rest of the disk *
    > LVM volume (lv_var): 12 GB /var, fs type ext3
    > LVM volume (lv_usr-install): 7GB /usr/install, fs type ext3
    > * there's still a little space left in the LVM VG, in case /var needs
    > to be enlarged quickly
    >
    > One thing that's different here vs what I used to do: I replaced the
    > /tmp mountpoint with /usr/install.
    >
    > In /usr/install, I place all relevant install files/ISOs and
    > installation specifics (text files) for the server in question. Keeps
    > it all in one neat place imo.
    >
    > Cheers,
    > Willem
    > --
    > Knowledge Partner (voluntary sysop)
    > ---
    > If you find a post helpful and are logged into the web interface,
    > please show your appreciation and click on the star below it.
    >
    Thanks!
    ------------------------------------------------------------------------
    magic31's Profile: https://forums.novell.com/member.php?userid=2303
    View this thread: https://forums.novell.com/showthread.php?t=476852
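
    As an illustration of the per-adapter layout discussed above, this is roughly how it ends up looking in the VM's .vmx file (disk file names here are placeholders; normally you would set this through the vSphere client rather than by editing the file). The usual reasoning is that each virtual SCSI adapter gets its own command queue, so heavy I/O on the NSS data disk does not queue up behind the OS disk:

    Code:
    scsi0.present = "TRUE"
    scsi0.virtualDev = "lsilogic"
    scsi0:0.present = "TRUE"
    scsi0:0.fileName = "oes11-sys.vmdk"
    scsi1.present = "TRUE"
    scsi1.virtualDev = "lsilogic"
    scsi1:0.present = "TRUE"
    scsi1:0.fileName = "oes11-datapool.vmdk"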

  • Recovering NSS pool after server crash

    The server crashed the other day - a SLES 11 SP1 XEN host system with 3 VMs: 2 OES servers (SLES 10 SP3) with NSS pools/partitions and 1 pure SLES 11 server. After the crash (due either to a power reset or a RAID controller failure) I had to run fsck to clear up multiple, multiple error conditions on the SLES partitions in order for the servers to reboot. Once booted up, eDirectory and of course NSS would not load. eDir has been recovered and is now running, but one of these servers (the one with most all of the data on NSS) is unstable and still has some boot problems. Would the following be a viable option at this point:
    1. Take this server out of the tree (if possible)
    2. Install a fresh, new OES server on a logical drive
    3. Join the existing tree
    4. Attach the NSS partition to this server
    5. Bring up NSS
    Any reason why this would not work to get to a stable server that can mount the existing NSS pool/partition? Thanks.
    Don

    That will work, assuming the new server does not have a volume/pool by the same name already.
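
    For steps 4 and 5, an annotated sketch of what that looks like on the new OES node once it can see the shared device (pool and volume names are placeholders):

    Code:
    # from a root shell on the new OES server
    nsscon
    # at the NSS console:
    nss /pools                    # the existing pool should be listed, deactive
    nss /poolactivate=DATAPOOL
    nss /volumes
    exit
    # back in the shell, make the volume available over NCP:
    ncpcon mount DATA

    If memory serves, the iManager storage plug-in (or NSSMU) also has an option to update/re-create the eDirectory objects for the pool and volume, which covers the directory side of attaching the existing storage to the new server.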

  • Large NSS Pool

    I have an NSS pool spanning 4 SAN volumes of 2TB apiece, for a total pool of 8TB. No problems, until I need more space--but that's another issue. I wanted to see if anyone has recommendations for NSS settings for such a big pool. I am using the defaults. The reason I ask is that we are implementing a new backup solution (Commvault), and our servers that have smaller pools, not spanning partitions, get much better throughput---identical hardware, HBA, RAM, etc.---just not 8TB of SAN volume. More of a ONE to ONE ratio---POOL to VOLUME to SAN VOLUME to Partition.
    Could it be that the pool spanning and its size are causing the slowness?

    Yes, the size and spanning would make this a little slower. If it were me, I would never do it like this. Just think if something happened to that pool and you had to rebuild it. A rule of thumb for a rebuild is 45 minutes to an hour for every 100GB of used disk space. As that pool starts to fill up, it will take days to repair it. I would not want the downtime to do that. Even if you didn't repair it with a rebuild and instead decided to delete it, recreate it, and then restore the data, that too would take a long time. To get back to the question: yes, it would slow it down.
    jgray
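
    To put that rule of thumb into numbers for this pool (assuming, say, 6TB of the 8TB is actually in use; adjust for the real figure):

    Code:
    6 TB used  ~= 6000 GB
    6000 GB / 100 GB          = 60 units
    60 units x 45-60 min each = 45-60 hours, i.e. roughly 2 to 2.5 days for a single rebuild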

  • Migrate NSS pools to new device?

    Hi everyone:
    This may sound strange, but it is what it is: we're trying to migrate a series of cluster-enabled pools from one SAN to another SAN (the fiber connections have already been made and tested... so that's not part of the equation at this time).
    The problem is, we've been asked to do this with ZERO disruption to the users. And we're talking about 30 TB of data here, encompassing a cornucopia of volumes and pools servicing a large spectrum of users across a rather massive eDirectory structure (many login scripts to modify, among other things).
    So assuming I can activate pools and mount volumes to a server from both SANs (which I can), is there any possible way to force the NSS pool to activate on another device if the device has been mirrored from one SAN to the other?
    Right now what happens is that the secondary LUN (which has a copy of the data on the primary LUN) shows up as a new device, and there doesn't seem to be any way to simply mount the pool that's been mirrored to it.
    Barring that, is there any way to expand the pool to encompass both devices, but then shrink it back down so that it only encompasses the new device?
    Any help would be greatly appreciated.
    Thanks!

    Originally Posted by jgray
    To put it in a nutshell: you mirror the two SANs together. Once the mirror is done, you break the mirror and the data is now on the new SAN. If the SANs are from the same vendor, then they should have a tool to transfer the data from one SAN to the other.
    More on the mirroring process, in maybe a casaba shell:
    Mirrors for Moving Partitions - CoolSolutionsWiki
    /dps
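
    A rough outline of the mirror-based move on the NetWare console, assuming both SANs' LUNs are visible to the node that holds the pool (the NSSMU details are in the CoolSolutions article linked above and are worth following exactly):

    Code:
    scan for new devices     # make sure the LUN on the new SAN is visible
    mirror status            # list partitions and their current mirror state
    # in NSSMU > Partitions: add the new SAN's LUN as a mirror member of the
    # pool's partition, then wait for the mirror group to finish syncing
    mirror status            # re-check until the group reports fully synchronized
    # only then remove the old SAN's partition from the mirror group in NSSMU;
    # the pool is now carried entirely by the new SAN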

  • NSS pool errors during every NSS background check

    We are running a NW 6.5 SP6 server with NSS pools SYS and POOL1.
    POOL1 contains two volumes: VOL1 and MAIL. Both can grow to max. pool size.
    During every NSS background check we receive the messages below.
    I checked the Novell support site, but can't find anything useful on this.
    Anyone got an idea what they mean and how to get rid of them?
    Kind regards,
    Mario van Rijt
    systems administrator
    System console
    4-07-2008 16:59:10 : COMN-3.25-178 [nmID=A0020]
    NSS-2.70-5009: Pool MFP02/POOL1 had an error (20012(beastTree.c[510])) at block 23579655(file block -23579655)(ZID 1).
    4-07-2008 16:59:10 : COMN-3.25-180 [nmID=A0022]
    NSS-2.70-5008: Volume MFP02/VOL1 had an error (20012(beastTree.c[510])) at block 23579655(file block -23579655)(ZID 1).
    Logger screen
    4 Jul 2008 16:59:10 NSS<COMN>-3.25-xxxx: comnPool.c[2629]
    Pool POOL1: System data error 20012(beastTree.c[510]). Block 23579655(file block -23579655)(ZID 1)
    4 Jul 2008 16:59:10 NSS<COMN>-3.25-xxxx: comnVol.c[8852]
    Volume VOL1: System data error 20012(beastTree.c[510]). Block 23579655(file block -23579655)(ZID 1)

    Originally Posted by Mario vanRijt
    (20012(beastTree.c[510])) at block 23579655(file block -23579655)(ZID 1).
    Logger screen
    4 Jul 2008 16:59:10 NSS<COMN>-3.25-xxxx: comnPool.c[2629]
    Pool POOL1: System data error 20012(beastTree.c[510]). Block 23579655(file block -23579655)(ZID 1)
    This one seems to be
    "20012 zERR_MEDIA_CORRUPTED: The media is corrupted."
    This may indicate that it is time to do a pool rebuild, although I don't know enough about what line 510 of beastTree.c is looking for to say much about what the actual problem is.
    /dps
    4 Jul 2008 16:59:10 NSS<COMN>-3.25-xxxx: comnVol.c[8852]
    Volume VOL1: System data error 20012(beastTree.c[510]). Block 23579655(file block -23579655)(ZID 1)

  • Lost NSS Pool After Patching

    Just ran the latest patches on my OES11SP1 server and one of the NSS pools has disappeared. The lost pool was built on top of a Linux RAID5 array and worked with no problems until last night. After patching and restarting the server, the pool was no longer listed via nssmu and the RAID is no longer listed under devices - a check of the Linux array showed that it is up and stable. I tried to rebuild the pool via nsscon, but it said that it does not exist - any ideas on how I can get the pool up and running again?

    Originally Posted by magic31
    I'm assuming this is an OES install, installed directly on the hardware (no virtualisation layer)?
    You mention the raid is no longer listed... but are the disk devices themselves listed? If not in nssmu, are you seeing them listed in the output of "fdisk -l"?
    Along that line... What kind of disk controller(s) do you have in the server? Have you installed any separate drivers for controllers (not included in the SLES install) that might have not loaded with a kernel update?
    If you have the option to open an SR alongside getting pointers here, that would be wise imo.
    -Willem
    This was an install of OES11SP1 directly to the hardware.
    The disks themselves show up in nssmu, and mdadm sees the array with no errors.
    The disk controller is on the motherboard (Adaptec chipset) but did not need any special drivers or config during the install.
    What I have seen in the docs is that with OES11SP1, it is frowned upon to run NSS on a Linux software RAID array like I was using. I tried to mount the array using the mount command and a file system type of "nsspool" but got an error; nsscon says to validate and rebuild the pool, but then comes back and won't let me put it in maintenance mode. I am wondering if I need to delete the old Pool object from NDS, but that won't address the lack of NSS seeing the device...
    -L
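
    A short checklist that may help narrow down whether this is an md problem or just NSS no longer seeing the device after the kernel update (assuming the array is /dev/md0; adjust as needed):

    Code:
    cat /proc/mdstat          # is the array assembled and clean?
    mdadm --detail /dev/md0   # state, member disks, UUID
    fdisk -l                  # are the member disks and /dev/md0 both listed?
    nssmu                     # Devices screen: does NSS see /dev/md0 at all?
    nsscon                    # then, at the NSS console:
    nss /pools                # is the pool known but deactive, or missing entirely?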

  • Netware NSS-Pools under VMWare ESX

    Just for my planning: when placing the NetWare server on one
    ESX datastore, is it possible/advisable to place an
    additional NetWare NSS pool for this server on a second
    ESX datastore in the same ESX host (ESX 4.1)? How does
    NetWare recognize the additional disk space?
    Sincerely
    Karl

    Thanks a lot, Massimo.
    Sincerely
    Karl
    >>> Massimo Rosen<[email protected]> 26.01.2012 12:08
    >>>
    >On 26.01.2012 11:55, [email protected] wrote:
    >> Just for my planning: When placing the Netware-Server on
    >one
    >> ESX-Datastore is it possible/advisable to place an
    >> additional Netware-NSS--Pool for this server on a second
    >> ESX-Datastore in the same ESX-Host (ESX 4.1)?
    >
    >Of course. And of course, that works with *every* OS. Because the OS
    >has *absolutely no idea*. All it sees is two drives.
    >
    >
    >> How does
    >> Netware recognize the additional disk-space?
    >
    >Exactly in the same way as Netware recognizes more than one physical
    >harddrive. It's just a drive to it.
    >
    >CU,
    >--
    >Massimo Rosen
    >Novell Knowledge Partner
    >No emails please!
    >http://www.cfc-it.de
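
    In practice, after adding the second virtual disk (backed by the other ESX datastore) to the VM, the NetWare side is just the usual new-disk routine (a sketch; pool and volume are then created interactively):

    Code:
    scan for new devices     # NetWare console: detect the newly added virtual disk
    list devices             # it shows up as just another device
    # then create the additional pool and volume on that device in NSSMU,
    # exactly as you would on a physical disk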

  • NSS Pools/Volumes Deactivate - VMware

    Greetings,
    I've been trying to virtualize two of our NetWare 6.5 servers. My thought
    was to install two new virtual servers and migrate the data from the
    physical server to the virtual server. However, I've been running into
    problems with the NSS volumes on the virtual servers deactivating
    themselves ... not exactly the sort of thing you really want happening to
    a production box.
    When the pools deactivate, the errors displayed are similar to the
    following:
    10-05-2007 3:36:54am: COMN-3.25-1092 [nmID=A0025]
    NSS-3.00-5001: Pool SERVERNAME/IFOLDER is being deactivated.
    An I/O error (20204(zio.c[2260])) at block 23602279(file block 77251)(ZID 3) has compromised pool integrity.
    10-05-2007 3:36:56am: SERVER-5.70.1534 [nmID=B0013]
    Device "[V358-A2-D2:0] VMware Virtual disk f/w:1.0" deactivated by driver due to device failure.
    FWIW, the VMware hosts are running ESX 3.0.2 and are connected to a NetApp
    3050C filer via fiber channel. There are about 60 other Linux and Windows
    virtual machines in the mix, all connected the same way that are not
    exhibiting these sorts of issues.
    I contacted VMware support regarding the issue and was told to use
    VMDK-based disks rather than mapped raw LUNs. When that server
    experienced the same issue the tech suggested I use locally attached
    storage (!?). Other than that, the only other suggestion he gave to me
    was to wait for a new LSIMPTNW.HAM to correct the problem I was
    experiencing.
    Obviously, many of you are running NetWare 6.5 on VMware...what's the key
    for success?
    Thanks,
    Rob

    Rob Walters,
    >
    > I've been trying to virtualize two of our NetWare 6.5 servers. My
    > thought was to install two new virtual servers and migrate the data from
    > the physical server to the virtual server. However, I've been running
    > into problems with the NSS volumes on the virtual servers deactivating
    > themselves ... not exactly the sort of thing you really want happening
    > to a production box.
    >
    > When the pools deactivate, the errors displayed are similar to the
    > following:
    >
    > 10-05-2007 3:36:54am: COMN-3.25-1092 [nmID=A0025]
    > NSS-3.00-5001: Pool SERVERNAME/IFOLDER is being deactivated.
    > An I/O error (20204(zio.c[2260])) at block 23602279(file block
    > 77251)(ZID
    > 3) has compromised pool integrity.
    >
    > 10-05-2007 3:36:56am: SERVER-5.70.1534 [nmID=B0013]
    > Device "[V358-A2-D2:0] VMware Virtual disk f/w:1.0" deactivated by
    > driver
    > due to device failure.
    >
    > FWIW, the VMware hosts are running ESX 3.0.2 and are connected to a
    > NetApp 3050C filer via fiber channel. There are about 60 other Linux
    > and Windows virtual machines in the mix, all connected the same way that
    > are not exhibiting these sorts of issues.
    >
    > I contacted VMware support regarding the issue and was told to use
    > VMDK-based disks rather than mapped raw LUNs. When that server
    > experienced the same issue the tech suggested I use locally attached
    > storage (!?). Other than that, the only other suggestion he gave to me
    > was to wait for a new LSIMPTNW.HAM to correct the problem I was
    > experiencing.
    >
    Rob, I'm having the same problem on one of my NW6.5 servers that's on
    VMWare. Last time, I think it showed the same error message you report.
    It happened again just this morning, but this time it says:
    10-11-2007 8:50:44 am: COMN-3.25-1196 [nmID=A0026]
    NSS-3.00-5002: Pool MEDDIR/VOL2 is being deactivated.
    An error (20012(nameTree.c[45])) at block 20227(file block -20227)
    (ZID6) has compromised volume integrity.
    Several other NW6.5 servers on the same VMware host have never given me
    any problem. Too weird. If you find out what the culprit is, please
    let me know.
    Thanks,
    Doug

  • Searching for a word or phrase in a file on NSS Pool/Volume

    I have encountered a strange issue: if I search for a word or phrase in a file at any level of the volume, from the root down to subfolders, I get different results from 2 or more PCs.
    Workstations are all Win XP SP2, Novell Client 4.91 SP4 for Windows
    Server NW 6.5 SP6 with all post SP's installed.
    I have run an nss /PoolRebuild=POOLNAME and an NSS /VisibilityRebuild=VOLNAME
    There is only one volume on this pool.
    And I still encounter the same results.
    As one example:
    On my admin workstation, when I search the top-level folder of Closed Client folders for the word chronology, I get 2 files returned, named chronfile1 and chronfile2. When I search from any other workstation (I have tried 4 different workstations at this point) at the same folder level, for the same word chronology, I get multiple files with the word in the file, but I do not see files chronfile1 or chronfile2. Likewise, the file names returned from the search results on the other workstations are not returned in the search list run on the admin (mine) workstation. And yes, I am logged in as admin on the other workstations when running the search.
    So far, this issue is only appearing on 1 volume.
    Thoughts?

    Originally Posted by ataubman
    If you log in to the second PC as Admin, does the symptom move? IOW, does it follow the user or the workstation? If the user, you may have some odd file system rights set, although the specific results you report are puzzling.
    I'm assuming you are using the Windows desktop search tool?
    I ran an NSS /PoolRebuild last night, which fixed an issue with the workstations not being able to see the 2 files in question.
    The previous night I had just performed an NSS /PoolVerify.
    As for the other anomaly, the other workstations seeing more files than the admin workstation (I feel so stupid; after 25 years you'd think I'd be able to spot the obvious): the reason the other workstations were returning files that the admin workstation was not is that the admin workstation doesn't have WordPerfect installed.
    Somebody, give me a Gibbs head slap, now!

  • Clustering CIFS for *existing* NSS pools: Howto?

    Hi.
    I found several documents stating this one:
    "Once the CIFS protocol is enabled for the shared pool the load and unload script
    files will get automatically modified to handle the CIFS service."
    e.g. http://tinyurl.com/OES-2SP2CIFS and the docu file_cifs_lx.pdf
    Currently CIFS is installed and running on both nodes of this cluster.
    I can access the CIFS shares; login of users through the CifsProxyUsers is OK
    when addressing the cluster node that currently has the NSS volumes mounted.
    What I cannot figure out is how to create an appropriate virtual server object, and
    what lines need to be added to the corresponding cluster resource of the NSS
    volumes I need to share using CIFS.
    Could someone point me towards more detailed documentation, or provide some example load/unload scripts from his/her environment?
    load / unload scripts from his/her environment?
    TIA, regards, Rudi.

    Hi.
    I still fail to "cluster" the CIFS functionality.
    BTW: accessing \\node02-w\ works fine; this is shared, accessible through *ANY* IP
    currently active on that node. Probably that's the issue, because the ports are
    occupied, even if I add another, new secondary IP? But if rcnovell-cifs is not
    running, the load script line doesn't work either??
    Did I understand correctly,
    - that rcnovell-cifs has to *run* on all designated CIFS cluster nodes?
    - that the line ...
    novcifs --add --vserver=.cn=NW06.o=MAIN.t=TREE. --ip-addr=10.27.1.51
    ... will add the resource to the virtual server NW06, so that
    it will service it using CIFS?
    Any further suggestions appreciated, regards, Rudi.
    More details: this is what /var/log/cifs/cifs.log presents
    with rcnovell-cifs running, then executing the line above (each line shows up 2x, why?):
    CRITICAL: BROWSER: Bind is failed for virual server ip: System Error No= 98
    CRITICAL: NBNS: Bind is failed for virual server ip =10.27.1.51: System Error No = 98
    CRITICAL: NBNS: Socket creation and binding for virtual server failed
    nmap for 10.27.1.51
    137/udp open|filtered netbios-ns
    138/udp open|filtered netbios-dgm
    139/tcp open netbios-ssn
    adding another secondary ipaddress first doesn't make it work either:
    .. /opt/novell/ncs/lib/ncsfuncs
    add_secondary_ipaddress 10.27.99.51
    novcifs --remove --vserver=.cn=NW06.o=MAIN.t=TREE. --ip-addr=10.27.99.51
    cifs.log:
    CRITICAL: ENTRY: Not Able To Conntect To Server 10.27.99.51
    CRITICAL: ENTRY: Not Able To Conntect To Server 10.27.99.51
    ERROR: CLI: CIFS: Server fdn .cn=NW06.o=Main.t=Tree. and server_name has already
    been added
    for that additional sec.IP ports look the same:
    nmap: 10.27.99.51
    137/udp open|filtered netbios-ns
    138/udp open|filtered netbios-dgm
    139/tcp open netbios-ssn
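
    For comparison, a cluster resource's load and unload scripts with CIFS added normally look roughly like the sketch below (pool name, volume name/ID and NCP server name are placeholders; the virtual server FDN and IP are the ones from this thread, and the exact novcifs options should be checked against file_cifs_lx.pdf). Note that the novcifs lines only attach/detach the virtual server, which fits with rcnovell-cifs having to be running on every node that can host the resource:

    Code:
    #!/bin/bash
    # load script (sketch)
    . /opt/novell/ncs/lib/ncsfuncs
    exit_on_error nss /poolact=NW06POOL
    exit_on_error ncpcon mount NW06VOL=254
    exit_on_error add_secondary_ipaddress 10.27.1.51
    exit_on_error ncpcon bind --ncpservername=NW06 --ipaddress=10.27.1.51
    exit_on_error novcifs --add --vserver=.cn=NW06.o=MAIN.t=TREE. --ip-addr=10.27.1.51
    exit 0

    #!/bin/bash
    # unload script (sketch)
    . /opt/novell/ncs/lib/ncsfuncs
    ignore_error novcifs --remove --vserver=.cn=NW06.o=MAIN.t=TREE. --ip-addr=10.27.1.51
    ignore_error ncpcon unbind --ncpservername=NW06 --ipaddress=10.27.1.51
    ignore_error del_secondary_ipaddress 10.27.1.51
    ignore_error nss /pooldeact=NW06POOL
    exit 0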

  • NSS pool rebuild

    Hi,
    I have a corrupted pool that needs repair (vol = 520GB).
    I started it through nss /poolrebuild -> chose the pool as it appeared on the list.
    It appears not to be doing anything. I restarted the server 20 hours later and started all over --> same screen and exact same values displayed, i.e. no data movement etc.
    It seems to me the tool has been stopped by some sort of log that blocks it as it writes to the C:\ drive. Could it have to do with the .RLF file?
    Could anyone shed some light on this?

    Originally Posted by mrosen
    Hi.
    Any different result when you directly call nss /poolrebuild=poolname?
    Anything in the logger screen or console then?
    And yes, an issue with C:\ could play a role here, is it full possibly?
    Last but not least, you could try to bring up the server without autoexec
    (server -na from dos), and try then, from the physical console. That way
    you can be reasonably sure no other software may be interfering.
    On 24.11.2013 12:26, rdy1 wrote:
    >
    > Hi,
    > I have a corrupted pool that need to repair, (vol = 520GB)
    > i started it through nss /poolrebuild ->choose the pool appear on list
    > it appears not doing anything, I have restarted the server after 20hours
    > later, started all over --> same screen and exact value display i.e no
    > data moment etc
    > it seem to me the tool had been stopped due to somesort of log that
    > block it as it write to c:\ drive, could it be to do with .RLF file ?
    > could anyone shed somelight on this?
    >
    >
    Massimo Rosen
    Novell Knowledge Partner
    No emails please!
    http://www.cfc-it.de
    Thanks Massimo,
    I had tried pretty much the same as you suggested; the nss /poolrebuild is not even running!!!
    The corruption to the pool was so severe.
    However, there is a way to mount that volume back even though the pool is badly corrupted.
    We managed to get our data back, thank god for that.

  • Change ownership of NSS pool/vols

    I have two servers connected to an iSCSI SAN - both servers are NW 6.5 SP8. I need to retire server 1 and replace it with server 2, and I would like to permanently mount server 1's NSS volumes on server 2 and change the ownership of the eDir objects to reflect server 2. How do I go about doing this?
    Thanks,
    bp

    On 27.05.2011 15:06, BPainter wrote:
    >
    > Yes, one of the volumes does contain user directories. I assume I would
    > need to manually change the paths?
    Well, there are two options:
    1. You can manually (e.g. via LDAP) modify the existing volume object to
    point to the new server. However, this isn't perfect, as some activities
    tend to undo this change.
    2. You change the home directory setting of all your users after moving
    the volume. This of course can be done manually, but there are also
    tools that can help you with that. Search for "homes" on coolsolutions
    for instance.
    But generally, yes, moving volumes between servers can be done quite easily.
    CU,
    Massimo Rosen
    Novell Product Support Forum Sysop
    No emails please!
    http://www.cfc-it.de
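
    A sketch of option 1 via ldapmodify, with everything here (DNs, container layout, even the attribute name) treated as an assumption to verify against your own tree before changing anything; the volume object's Host Server attribute (LDAP hostServer) is what points it at the owning NCP server:

    Code:
    # contents of repoint-volume.ldif (hypothetical DNs; verify the attribute
    # name against your tree in iMonitor first):
    dn: cn=SERVER1_DATA,o=ORG
    changetype: modify
    replace: hostServer
    hostServer: cn=SERVER2,o=ORG

    # then apply it, and (per option 2) repoint each user's home directory
    # attribute (ndsHomeDirectory) at the volume as seen from server 2:
    ldapmodify -x -H ldaps://server2.example.com -D cn=admin,o=ORG -W -f repoint-volume.ldif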

  • Problems with NSS partition mirroring and pool errors.

    OS: NetWare 6.5SP8 latest patches applied
    I have an NSS pool that is made up of two 2TB LUNS on my SAN array.
    I've been trying to move the pool from the LUNs on my SAN array to a pair of 2TB VMWare disks by using the partition mirror method.
    All of the LUNS reside on the same array, but the two original LUNs are attached to an HBA in passthrough mode, so the VMWare infrastructure does not see those LUNS.
    One partition mirrored without issue. The second one showed as synchronized yet remained at 99%.
    A forum search indicated that a pool rebuild was in order.
    I ran a pool verify and rebuild and tried mirroring the "problem child" partition again with the same result.
    Before running the rebuild a second time, I removed both mirrored pairs.
    I ran a pool verify and rebuild and tried mirroring the "problem child" partition again with the same result.
    Last night, I repeated that same process and finally got 100% synchronization.
    In the second and third rebuild processes, I got an error message stating that data would be lost if I continued the rebuild. Since I have a backup, I continued the rebuild.
    About 6 hours after that synchronization finished, I started receiving block error messages for the pool:
    NSS-3.00-5002: Pool xxxx/DATA is being deactivated.
    An error (20012(nameTree.c[45])) at block 536784421(file block -536784421)(ZID 6) has compromised volume integrity.
    The pool deactivated itself.
    I reset the server and when it mounted that volume, I continued to receive the errors.
    I deactivated the pool, put it in maintenance mode and started another pool rebuild, this time with the /purge option.
    I did not remove the mirrored pair before the rebuild started, so I have the one partition that gave me no problems un-mirrored and the partition that did give me problems is mirrored.
    I don't know if that will have any effect on the rebuild process or not.
    My questions are:
    1) Will the fact that the pool is only half-mirrored be an issue for the rebuild?
    2) Is there any other option in the rebuild process that I should have added?
    3) Is the fact that I'm mirroring to a VMFS disk an issue? I could create a new pool/volume on those VMWare disks and use the server consolidation utility to copy the data between volumes, but my desire to avoid that was the reason to use the mirror process in the first place.

    The difference between a sledge hammer and a Q-Tip is that you do a lot less damage with a Q-Tip.
    Everything has been messed with to the point where it would be impossible to give any advice other than: make a new disk, make a new pool, make a new volume, and do your restore. Then, when all the data is restored without error, move the OLD DATA: volume out of the way and put in the NEW DATA: volume.
    Originally Posted by gathagan
    My questions are:
    1) Will the fact that the pool is only half-mirrored be an issue for the rebuild?
    2) Is there any other option in the rebuild process that I should have added?
    3) Is the fact that I'm mirroring to a VMFS disk an issue? I could create a new pool/volume on those VMWare disks and use the server consolidation utility to copy the data between volumes, but my desire to avoid that was the reason to use the mirror process in the first place.
    1. God only knows. It should not.
    2. No. I think it's an issue of doing too much, too quickly. It is unlikely a pool rebuild would "fix" a mirroring problem. I would not have gone ahead with a pool rebuild with mirroring broken.
    3. A disk is a disk. Perhaps there is some underlying issue with the VMDK. How is it provisioned? Is it out of space? Buggy edge cases of having an exactly 2TB VMDK? Who knows.
    The mirroring happens at a layer below the pool level, so I have a hard time understanding how a rebuild would help mirroring unless the pool is really bent to begin with.
    Again, the state of the current DATA: volume would be in question after all that fiddling, even if you can get it to work would you really trust it? I would not. I would recreate DATA: from backup on whatever new partition you want and chalk it up to experience. You can minimize the pain of that by restoring to a differently named volume and then renaming the volumes after you confirm proper operation.
    -- Bob

  • Server umounts NSS volumes for no obvious reason

    OES2 SP1, 2-node cluster
    I had a server unmount an NSS volume with the message below in the logs.
    Any ideas what could have caused this?
    SAN_SERVER is the cluster pool.
    USERAREA is an NSS volume in the pool.
    Code:
    May 21 10:21:21 vishnu ncs-resourced: resourceMonitor: SAN_SERVER.monitor failed 256
    May 21 10:21:21 vishnu ncs-resourced: resourceMonitor: SAN_SERVER 1 failures in last 600 seconds
    May 21 10:21:25 vishnu adminus daemon: umounting volume USERAREA lazy=1
    May 21 10:21:25 vishnu adminus daemon: Failed to delete the directory /media/nss/USERAREA. Error=39(Directory not empty)
    May 21 10:21:25 vishnu adminus daemon: Entries for Volume USERAREA from pool SAN are not removed during pool deactivation
    May 21 10:21:25 vishnu kernel: Waiting for 2 Inuse beasts to unlink
    ... repeated multiple times
    May 21 10:21:36 vishnu kernel: fastReadCache - Buffer's should always have an inode
    May 21 10:21:36 vishnu kernel: isCached - Buffer's should always have an inode
    May 21 10:21:36 vishnu kernel: isCached - Buffer's should always have an inode
    ... repeated multiple times

    This looks as if you have a monitor script that is looking at your
    resource - it spots a failure and then starts an unload/load routine.
    I think that your cluster monitor script has been copied from another
    resource and has not been edited to reflect the change.
    T
    On Tue, 22 May 2012 09:46:01 GMT, vimalkumar v
    <[email protected]> wrote:
    >
    >*OES2 SP1, 2-node cluster
    >*
    >I had a server unmount an NSS volume with the message below in the
    >logs.
    >Any ideas what could have caused this?
    >
    >SAN_SERVER is the cluster pool.
    >USERAREA is an NSS volume in the pool.
    >
    >
    >Code:
    >--------------------
    > May 21 10:21:21 vishnu ncs-resourced: resourceMonitor: SAN_SERVER.monitor failed 256
    > May 21 10:21:21 vishnu ncs-resourced: resourceMonitor: SAN_SERVER 1 failures in last 600 seconds
    > May 21 10:21:25 vishnu adminus daemon: umounting volume USERAREA lazy=1
    > May 21 10:21:25 vishnu adminus daemon: Failed to delete the directory /media/nss/USERAREA. Error=39(Directory not empty)
    > May 21 10:21:25 vishnu adminus daemon: Entries for Volume USERAREA from pool SAN are not removed during pool deactivation
    > May 21 10:21:25 vishnu kernel: Waiting for 2 Inuse beasts to unlink
    > ... repeated multiple times
    > May 21 10:21:36 vishnu kernel: fastReadCache - Buffer's should always have an inode
    > May 21 10:21:36 vishnu kernel: isCached - Buffer's should always have an inode
    > May 21 10:21:36 vishnu kernel: isCached - Buffer's should always have an inode
    > ... repeated multiple times
    >--------------------
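
    For comparison, a pool resource's monitor script normally contains only checks that match that specific resource, along the lines of this sketch (pool name SAN and volume USERAREA are taken from the log above; the IP is a placeholder). If any of these lines were copied from a different resource, the monitor keeps returning failures like the 256 above and NCS starts unloading the resource:

    Code:
    #!/bin/bash
    . /opt/novell/ncs/lib/ncsfuncs
    exit_on_error status_fs /dev/pool/SAN /opt/novell/nss/mnt/.pools/SAN nsspool
    exit_on_error status_secondary_ipaddress 10.0.0.50
    exit_on_error ncpcon volume USERAREA
    exit 0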
