Infiniband on Solaris 10?
Has anyone succeeded in enabling IPoIB on Solaris 10 when using the X1233A?
Thanks for your answer.
It works now. But I had to make the change twice, because after the first change the network strangely stopped working. The machine couldn't ping any other machine :-!
After adding the entries to the hosts file, removing them, and adding them again, it works!
Many thanks again,
Luz
Similar Messages
-
Infiniband/MT25208/10u7 throughput
Heya all,
Working on setting up an HPC cluster of sorts. Running into some funky issues with throughput of Infiniband on Solaris. I'm unable to get any sort of reasonable speed out of the cards; throughput seems limited to around 60 MB/s. Tested using FTP and also the netio package I stumbled across here
Currently using a pair of t5220's for testing, each with a Mellanox MHGA28-XTC card. The cards have the latest firmware installed, and according to Mellanox, use the MT25208 chipset. They show up under prtconf as 'pciex15b3,6282'. The t5220s are running Solaris 10u7. The Infiniband fabric is running over a Voltaire 9024 switch.
When I installed the cards, I noticed they were not being recognized. The Infiniband Update 3 solved this, but I noticed it's apparently meant for 10u6? Not sure if that's causing problems or not. Has anyone else used IU3 on 10u7?
Are there TCP settings I need to tweak/etc.? I'm thinking that might be it, but I'm having a hard time finding documentation about Infiniband on Solaris. :)
I've also tested the cards in a pair of x86 boxes running SuSE Enterprise Linux 11; there, transfers were up in the range of 300MB/s and were being bottlenecked by disk speeds. So I'm fairly certain the cards, cables and switch are fine.
Dear Sir,
Was the problem solved?
Please share anything you know, because I am a newbie Solaris user.
Perhaps I have the same problem as you (though my hardware environment is totally different).
I tested the Voltaire400Ex, 410Ex, and 600Ex.
Throughput is low on all of them.
I tested in this environment:
# Dell R610
# HCA : Voltaire 410Ex-D (Mellanox MT25204)
# Switch : Voltaire 9024D
I now have only the 410Ex, so that is the one I am using.
(1) Solaris 10 5/09 s10x_u7wos_08 X86 (ib_updates_3d_s10u7)
#1. netperf (v2.4.5): 5485.80 (10^6 bits/sec)
#2. netio (v1.26): 713309 KByte/s
(2) CentOS 5.3 (2.6.18-128.el5) (OFED-1.4.1)
#1. netperf (v2.4.5): 10819.86 (10^6 bits/sec)
#2. netio (v1.26): 1220885 KByte/s
On Solaris, I was able to improve the performance a little by tuning.
However, it is still slow.
Can I improve it further?
I think the Solaris device driver is the problem. -
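To compare the netperf and netio figures quoted in this post on one scale, a small conversion sketch may help. It is only an illustration: the function names are made up here, and it assumes that netio's "KByte" means 1024 bytes, which the post does not state.

```python
# Convert the benchmark figures quoted in the post to decimal MB/s.
# Assumption (not from the post): netio's "KByte/s" means 1024 bytes per second.

def mbit_to_mbyte_per_s(mbit_per_s: float) -> float:
    """Convert 10^6 bits/s to 10^6 bytes/s."""
    return mbit_per_s / 8.0


def kbyte_to_mbyte_per_s(kbyte_per_s: float, kbyte: int = 1024) -> float:
    """Convert KByte/s to 10^6 bytes/s; the KByte size is configurable."""
    return kbyte_per_s * kbyte / 1e6


# Figures copied from the post.
results = {
    "Solaris 10 5/09": {"netperf_mbit": 5485.80, "netio_kbyte": 713309},
    "CentOS 5.3": {"netperf_mbit": 10819.86, "netio_kbyte": 1220885},
}

for name, r in results.items():
    netperf = mbit_to_mbyte_per_s(r["netperf_mbit"])
    netio = kbyte_to_mbyte_per_s(r["netio_kbyte"])
    print(f"{name}: netperf {netperf:.0f} MB/s, netio {netio:.0f} MB/s")

# Ratio of Solaris to CentOS netperf throughput on the same hardware.
ratio = results["Solaris 10 5/09"]["netperf_mbit"] / results["CentOS 5.3"]["netperf_mbit"]
print(f"Solaris/CentOS netperf ratio: {ratio:.2f}")
```

On these numbers the Solaris host reaches roughly half the netperf throughput of the CentOS host, which matches the complaint in the post.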
Solaris 10 x86 infiniband support for connected mode?
Anyone know when solaris 10 x86 is going to support connected mode infiniband?
Large MTU size is what I'm looking for, 65520 as opposed to 2044.
Running....
Solaris 10 10/08 s10x_u6wos_07b X86
X4600 with dual port pcie infiniband card.
thanks.
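As a rough illustration of why the larger MTU matters, the sketch below counts how many packets are needed to move 1 GiB at each of the two MTU values mentioned above. It is a simplification: it ignores IPoIB, IP and TCP header overhead, and the helper name is invented for this example.

```python
# Count MTU-sized packets needed for a payload, ignoring all header overhead.
# The two MTU values (2044 and 65520) are the ones quoted in the post.

def packets_needed(payload_bytes: int, mtu: int) -> int:
    """Ceiling division: number of packets of at most `mtu` bytes."""
    return -(-payload_bytes // mtu)


ONE_GIB = 1 << 30

for mtu in (2044, 65520):
    print(f"MTU {mtu}: {packets_needed(ONE_GIB, mtu)} packets per GiB")
```

Fewer packets per byte generally means less per-packet processing on the host, which is the usual motivation for wanting the larger MTU.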
Edited by: njk3rd on Jul 5, 2010 11:06 AM
Hi, I'm Chris from Argentina.
Same hardware, A7n8x-e Deluxe, and same problem. Solaris 10 is unable to detect my SATA hard drive.
I do have a PATA hard drive, so I'll try to install on that. But I want my SATA drive to work.
I'll keep searching for a solution for a while before falling back to a PATA HD. -
InfiniBand Host Channel Adaptors on M-series and T-series on Solaris 10?
Hi,
I saw the specs and implementations of Sun InfiniBand HCAs (PCIe) on Sun X-series servers running Solaris 10, but haven't seen these InfiniBand HCAs on M-series or T-series. Is it possible to use these InfiniBand HCAs on M- or T-series servers? Has anyone used Sun or any other InfiniBand HCAs (PCIe) on Sun M-series or T-series running Solaris 10?
Thanks,
Ray
Hi,
The README for DAQmx 8.0 for Linux has some information on pthread:
Linking with the pthread Library
In C or C++, use care if your NI-DAQmx application links to the Linux pthread
library. It is recommended to link your application with the gcc -pthread flag
instead of linking directly with, for instance, -lpthread. Incorrect linking
can lead to segmentation faults when the NI-DAQmx libraries load. If, after
replacing -lpthread, you still get a segmentation fault when loading NI-DAQmx,
you must explicitly link the dl library (-ldl) as the first library in your
list.
Also, one of my colleagues suggested that you should be using fork as soon as possible before calling any NI API or
else the child process could potentially die/segfault.
Please let me know if this helps.
Thanks,
Salvador Santolucito -
Need help on Solaris 10 with Infiniband card
I have Solaris 10 Update 10 running, with a Mellanox Infiniband card. All I can see is that it's being detected as:
Code:
#/usr/X11/bin/scanpci | grep Mellanox
Mellanox Technologies MT26428 VPI PCIe 2.0 5GT/s – IT QDR / 10GigE]
#ls -R /devices | grep 15b3
Pci15b3, 22@0
Pci15b3,22@0:devctl
/devices/pci@0,0/pci8086,3410@9/pci15b3,22@0:
#pkginfo | grep -i infini
System SUNWeoib Solaris Ethernet over InfiBand
System SUNWib Sun InfiniBand Framework
System SUNWibsdp Sun InfiniBand layered Sockets Direct Protocol
System SUnWibsdpib
System SUnWibsdpu
System SUNWiopoib
System WUNWrpcib
#modinfo | grep infi
92 ffffffffefb0a000 b4d0 - 1 ibdm ( InfiniBand Device Manager)
250 fffffffffff08e2000 e9c8 170 I ibd (InfiniBand GLDv3 Driver 1.38)
Please let me know if anything else needs to be configured to make this card work.
I was following this doc http://docs.oracle.com/cd/E19241-01/821-2144-10/p21.html#scrolltoc but I wonder whether it really needs the Sun Firmware Tool, since it does verify the Mellanox driver successfully.
Please guide me on how to configure the IB interface.
Edited by: 912582 on 06-Feb-2012 07:25
Edited by: 912582 on 06-Feb-2012 07:26
Hi,
The notes below will help you:
TECH: Calculating Oracle's SEMAPHORE Requirements (Doc ID 15654.1)
Semaphores and Shared Memory - An Overview (Doc ID 153961.1) - It has detailed explanations on parameters and calculation methods.
For more information, the following are the kernel parameters and their possible values:
Name                   Default  Min      Max         Reference
shmsys:shminfo_shmmax  1048576  1048576  4294967295  Maximum shm segment size in bytes
shmsys:shminfo_shmmin  1        1        -           Minimum shm segment size in bytes
shmsys:shminfo_shmmni  100      100      -           Number of shm identifiers to pre-allocate
shmsys:shminfo_shmseg  6        6        -           Maximum number of shm segments per process
semsys:seminfo_semmap  10       10       -           Number of entries in semaphore map
semsys:seminfo_semmni  10       10       65535       Number of semaphore identifiers
semsys:seminfo_semmns  60       -        -           Number of semaphores in system
semsys:seminfo_semmnu  30       -        -           Number of undo structures in system
semsys:seminfo_semmsl  25       -        -           Maximum number of semaphores per ID
semsys:seminfo_semopm  10       -        -           Maximum number of operations per semop call
semsys:seminfo_semume  10       -        -           Maximum number of undo entries per process
semsys:seminfo_semusz  96       -        -           Size in bytes of undo structure, derived from semume
semsys:seminfo_semvmx  32767    -        -           Semaphore maximum value
semsys:seminfo_semaem  16384    -        -           Adjust on exit maximum value
msgsys:msgmap          100      100      -           # of entries in msg map
msgsys:msgmax          2048     2048     -           max message size
msgsys:msgmnb          4096     4096     -           max # bytes on queue
msgsys:msgmni          50       50       -           # of message queue identifiers
msgsys:msgssz          8        8        -           msg segment size (should be word size multiple)
msgsys:msgtql          40       40       -           # of system message header
msgsys:msgseg          1024     1024     32767       # of msg segments
thanks,
Krishna -
10g R2 RAC on Solaris 10 with EMC Storage
We are in the process of setting up 3/4 node RAC with the following components:
Oracle 10g R2 RAC
Oracle Clusterware / Sun Cluster / Veritas Cluster
Sun Solaris 10
EMC storage
ASM/Cluster FS
I would appreciate it if someone could throw some light on:
* ) Is Veritas Cluster / Sun Cluster a must-have component, or can I use Oracle Clusterware? What are the advantages and disadvantages of using Oracle Clusterware compared with Veritas Cluster or Sun Cluster?
* ) Is a cluster filesystem a compulsory component, or can I use ASM instead of a cluster filesystem?
* ) If I don't use a cluster filesystem, where do I put the CRS repository and voting disk?
* ) What is the best option for ORACLE_HOME: a shared Oracle home, or a separate ORACLE_HOME on each node?
* ) Are there any known risks involved in using ASM? How is the I/O performance with ASM on EMC with Solaris? Are there any best practices?
* ) Is GigE okay for the interconnect, or do I need to go for Infiniband?
* ) Are there any notes on best practices for the above components?
*) Do I need to consider a failover option for the NICs (interconnect and public)? If yes, how do I do that?
*) Are there any other risks I need to consider?
Thanks
G
Hi,
I see a lot of good input. I have done a few RAC installs on Sun/Solaris/EMC.
Here are a few things to consider.
* ) Is Veritas Cluster / Sun Cluster a must-have component, or can I use Oracle Clusterware? What are the advantages and disadvantages of using Oracle Clusterware compared with Veritas Cluster or Sun Cluster?
Just stay with Oracle Clusterware. If there are any issues, you only have to deal with one vendor and there will be no finger-pointing. In any case, Oracle Clusterware is needed even if you install Veritas/Sun.
* ) Is a cluster filesystem a compulsory component, or can I use ASM instead of a cluster filesystem?
For the database you can use ASM. The only time I have considered a cluster filesystem is when external tables were in use.
When you use ASM, you need to partition the disk with a 1 MB offset or start at cylinder 1.
* ) If I don't use a cluster filesystem, where do I put the CRS repository and voting disk?
OCR and Voting Disk go on raw devices.
* ) What is the best option for ORACLE_HOME: a shared Oracle home, or a separate ORACLE_HOME on each node?
Install ORACLE_HOME, ASM_HOME and CRS_HOME locally on each server.
* ) Are there any known risks involved in using ASM? How is the I/O performance with ASM on EMC with Solaris? Are there any best practices?
http://www.oracle.com/technology/products/database/asm/pdf/asm-on-emc-5_3.pdf
We have always installed 2 HBAs and used PowerPath.
* ) Is GigE okay for the interconnect, or do I need to go for Infiniband?
For the majority of cases GigE is sufficient.
* ) Are there any notes on best practices for the above components?
Have redundancy at each level.
*) Do I need to consider a failover option for the NICs (interconnect and public)? If yes, how do I do that?
You can use IPMP. Use large send/receive buffers. Enable Jumbo Frames.
We had to apply some patches.
5128575 - RAC install of 10.2.0.2 does not update libknlopt.a on all nodes
4769197 - WHILE ONE NODE OF RAC IS DOWN, CONNECTIONS FROM CLIENT HANG
patch 5749953
Thanks
G -
After successful installation and reboot.
I got the message "Not booting as part of a cluster"
My steps:
1) Install Solaris 10_x86 1/06 on IBM xSeries 345
2) Add a line for the new host sol2 to hosts and ipnodes
172.20.128.72 sol1
3) Install the latest Patch Cluster for 04/12/06
4) Load SunCluster installation java_es_05Q4_cluster-ga-solaris-x86.zip
5) run ./java_es_05Q4_cluster-ga-solaris-x86/Solaris_x86/installer
1. Automatically update with version on installer disk (recommended)
2. Configure Later - Manually configure following installation
6) scinstall
1) Install a cluster or cluster node
2) Install just this machine as the first node of a new cluster
2) Custom
What is the name of the cluster you want to establish? mycls
Node name (Control-D to finish): sol1
Do you need to use DES authentication (yes/no) [no]? no
Is it okay to accept the default network address (yes/no) [yes]? yes
Is it okay to accept the default netmask (yes/no) [yes]? yes
Does this two-node cluster use transport junctions (yes/no) [yes]? yes
What is the name of the first junction in the cluster [switch1]? switch1
What is the name of the second junction in the cluster [switch2]? switch2
1) e1000g1
2) Other
Option: 1
What is the name of the second cluster transport adapter? rtls0
Will this be a dedicated cluster transport adapter (yes/no) [yes]? yes
The default is to use /globaldevices.
Is it okay to use this default (yes/no) [yes]? yes
Do you want to disable automatic quorum device selection (yes/no) [no]? no
scinstall -ik \
-C mycls \
-F \
-T node=sol2,node=sol1,authtype=sys \
-A trtype=dlpi,name=e1000g1 -A trtype=dlpi,name=rtls0 \
-B type=switch,name=switch1 -B type=switch,name=switch2 \
-m endpoint=:e1000g1,endpoint=switch1 \
-m endpoint=:rtls0,endpoint=switch2 \
-P task=quorum,state=INIT
Are these the options you want to use (yes/no) [yes]? yes
Checking device to use for global devices file system ... done
Initializing cluster name to "mycls" ... done
Initializing authentication options ... done
Initializing configuration for adapter "e1000g1" ... done
Initializing configuration for adapter "rtls0" ... done
Initializing configuration for junction "switch1" ... done
Initializing configuration for junction "switch2" ... done
Initializing configuration for cable ... done
Initializing configuration for cable ... done
Setting the node ID for "sol2" ... done (id=1)
Checking for global devices global file system ... done
Updating vfstab ... done
Verifying that NTP is configured ... done
Initializing NTP configuration ... done
Updating nsswitch.conf ...
done
Adding clusternode entries to /etc/inet/hosts ... done
Configuring IP Multipathing groups in "/etc/hostname.<adapter>" files
IP Multipathing already configured in "/etc/hostname.e1000g0".
Verifying that power management is NOT configured ... done
Ensure network routing is disabled ... done
Log file - /var/cluster/logs/install/scinstall.log.18914
7)And I got messages on system console after reboot
SunOS Release 5.10 Version Generic_118855-02 32-bit
Copyright 1983-2005 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
Configuring devices.
Hostname: sol2
devfsadm: minor_init failed for module
/usr/lib/devfsadm/linkmod/SUNW_scmd_link.so
Loading smf(5) service descriptions: 27/27
Not booting as part of a cluster
Note: path_to_inst was not be updated. Please 'boot -r' as needed to update.
checking ufs filesystems
/dev/rdsk/c1t0d0s6: is logging.
sol2 console login: Apr 12 14:09:30 sol2 sendmail[503]: My unqualified host name (sol2) unknown; sleeping for retry
Apr 12 14:09:30 sol2 sendmail[502]: My unqualified host name (sol2) unknown; sleeping for retry
Apr 12 14:09:33 sol2 xntpd[560]: couldn't resolve `clusternode1-priv', giving up on it
What is wrong?
What should my next step be?
Thanks in advance,
Sergey
Thank you for the answer.
I have another server, an IBM xSeries 346.
After a successful install I ran scinstall.
* 1) Install a cluster or cluster node
2) Install just this machine as the first node of a new cluster
2) Custom
Do you want scinstall to install patches for you (yes/no) [yes]? no
What is the name of the cluster you want to establish? mycls
This is the complete list of nodes:
sol1
sol2
Is it correct (yes/no) [yes]? yes
Do you need to use DES authentication (yes/no) [no]? no
Is it okay to accept the default network address (yes/no) [yes]? yes
Is it okay to accept the default netmask (yes/no) [yes]? yes
Does this two-node cluster use transport junctions (yes/no) [yes]? no
What is the name of the first cluster transport adapter (help)? rtls0
Is "rtls0" an Ethernet adapter (yes/no) [no]? yes
Is "rtls0" an Infiniband adapter (yes/no) [no]?
What is the name of the second cluster transport adapter (help)? rtls1
Is it okay to use this default (yes/no) [yes]? yes
Do you want to disable automatic quorum device selection (yes/no) [no]? yes
Do you want scinstall to reboot for you (yes/no) [yes]? yes
>>> Confirmation <<<
Your responses indicate the following options to scinstall:
scinstall -ik \
-C mycls \
-F \
-T node=sol1,node=sol2,authtype=sys \
-A trtype=dlpi,name=rtls0 -A trtype=dlpi,name=rtls1 \
-B type=direct
Are these the options you want to use (yes/no) [yes]? yes
Do you want to continue with the install (yes/no) [yes]?
Checking device to use for global devices file system ... done
Initializing cluster name to "mycls" ... done
Initializing authentication options ... done
Initializing configuration for adapter "rtls0" ... done
Initializing configuration for adapter "rtls1" ... done
Setting the node ID for "sol1" ... done (id=1)
Checking for global devices global file system ... done
Updating vfstab ... done
Verifying that NTP is configured ... done
Initializing NTP configuration ... done
Updating nsswitch.conf ...
done
Adding clusternode entries to /etc/inet/hosts ... done
Configuring IP Multipathing groups in "/etc/hostname.<adapter>" files
Updating "/etc/hostname.elxl0".
Verifying that power management is NOT configured ... done
Unconfiguring power management ... done
/etc/power.conf has been renamed to /etc/power.conf.041706132745
Power management is incompatible with the HA goals of the cluster.
Please do not attempt to re-configure power management.
Ensure network routing is disabled ... done
Network routing has been disabled on this node by creating /etc/notrouter.
Having a cluster node act as a router is not supported by Sun Cluster.
Please do not re-enable network routing.
Log file - /var/cluster/logs/install/scinstall.log.20862
Rebooting ...
Apr 17 13:27:45 sol1 reboot: rebooted by root
updating /platform/i86pc/boot_archive...this may take a minute
SunOS Release 5.10 Version Generic_118844-26 64-bit
Copyright 1983-2005 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
NOTICE: Can't open /etc/cluster/nodeid
NOTICE: BOOTING IN NON CLUSTER MODE
Configuring devices.
Hostname: sol1
devfsadm: minor_init failed for module /usr/lib/devfsadm/linkmod/SUNW_scmd_link.so
Loading smf(5) service descriptions: 24/24
/usr/cluster/bin/scdidadm: Could not load DID instance list.
Cannot open /etc/cluster/ccr/did_instances.
Not booting as part of a cluster
/usr/cluster/bin/scdidadm: Could not load DID instance list.
Cannot open /etc/cluster/ccr/did_instances.
Note: path_to_inst was not be updated. Please 'boot -r' as needed to update.
checking ufs filesystems
/dev/rdsk/c0t0d0s6: is logging.
sol1 console login:
Then I did as you advised.
# echo "etc/cluster/nodeid" >> /boot/solaris/filelist.ramdisk
# bootadm update-archive
updating /platform/i86pc/boot_archive...this may take a minute
# init 6
SunOS Release 5.10 Version Generic_118844-26 64-bit
Copyright 1983-2005 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
Hostname: sol1
/usr/cluster/bin/scdidadm: Could not load DID instance list.
Cannot open /etc/cluster/ccr/did_instances.
Booting as part of a cluster
NOTICE: CMM: Node sol1 (nodeid = 1) with votecount = 1 added.
NOTICE: CMM: Node sol1: attempting to join cluster.
NOTICE: CMM: Cluster has reached quorum.
NOTICE: CMM: Node sol1 (nodeid = 1) is up; new incarnation number = 1145266639.
NOTICE: CMM: Cluster members: sol1.
NOTICE: CMM: node reconfiguration #1 completed.
NOTICE: CMM: Node sol1: joined cluster.
WARNING: clcomm: per node IP config clprivnet0:-1 (349): 172.16.193.1 failed with 19
WARNING: clcomm: per node IP config clprivnet0:-1 (349): 172.16.193.1 failed with 19
cladm: CLCLUSTER_ENABLE: No such device
UNRECOVERABLE ERROR: Sun Cluster boot: Could not initialize cluster framework
Apr 17 13:37:21 in.mpathd[108]: missed sending 1 probes cur_time 99763 snxt_time 100636 snxt_basetime 99879
Please reboot in non cluster mode(boot -x) and Repair
syncing file systems... done
WARNING: CMM: Node being shut down.
Press any key to reboot.
Where is the problem?
What should I check?
Thanks in advance for your answer.
Sergey -
Here I share with you a question I received from Gustavo Wolfmann, a professor from Universidad Nacional de Cordoba, Argentina. If you can help, that would be great.
I am installing InfiniBand interfaces in our Sun Fire X2200 servers. The interfaces are from Mellanox, model MHGS18-XTC.
The OS installed is Solaris 10 (update 7, or update 5 on other servers).
From Sun's pages I downloaded the InfiniBand drivers for this kind of interface, version 3, the latest available. The documentation says that it supports the Mellanox MT25204 chip, the one on our interfaces, with the arbel driver.
The problem is that, after installing this driver and rebooting the server, the kernel does not work properly with the hardware, at least in part: dmesg shows that the interface is recognized by the kernel, but the device /dev/ibd0 is not created as described in the tutorial, even though the chip driver is loaded.
After searching the internet, I found some people saying that it must run on the tavor driver in "compatibility mode". But I can't find anywhere how to set this mode on the driver.
I consulted the Mellanox people, and they said that they don't support Solaris. I asked you (Sun) whether Sun could fix the problem, and was told that Sun only supports certified hardware, which obviously does not include mine.
So, finally, I gave up on Solaris and reinstalled Linux on the servers. Once it was installed, the InfiniBand hardware was recognized and works properly. The problem is that I do HPC research on linear algebra algorithms and libraries. Configuring the stack (OS, InfiniBand driver, math library and compiler) was horrific. It is done, but I am not sure it was done in the best way.
These configuration problems were not present in the old installation of the cluster, which used Ethernet with Solaris, the sunperf library, the Sun Studio compiler and the Sun Cluster tools, and as you know, I was doing the same research with that configuration.
So I am disappointed by the absence of a Solaris driver from Sun for our hardware. If it had been available, I would not have spent a month of my work reinstalling Linux on the servers and, more importantly, I would feel sure that it works fine.
So, my last question is whether anybody in the entire corporation can assist me with my problem. If I can fix it, I will gladly spend the time reinstalling Solaris on the cluster.
thanks
Gustavo Wolfmann
Lab. Computación
Fac. Cs. Ex.Fis y Nat.
Univ.Nac.Córdoba
There are two versions of the driver available on the Internet and on the Creative site. The newer one, I believe, is just an upgrade version, not the full installation, but I could be mistaken. I have a copy of the original installation CD that came with the PD1110 (not PD1100), so if you want to PM me I can set up an image somewhere for you to grab. Be advised, however, that this is the software I have been unable to uninstall, so you may experience problems.
-noz -
Solaris 11.1 NFS RDMA symlinks not working?
Dear list,
We are encountering strange issues with symlinks in NFS-over-RDMA exported (ZFS backed) filesystems. We are using a Solaris 11.1 storage server to provide NFS services to a number of CentOS 6.4 clients. They are connected through Infiniband, and using NFS-RDMA mostly works, except for the fact that symlinks are only readable/usable on the client on which the symlink was created. All other clients cannot access or follow the link.
Switching from RDMA to TCP "solves" the problem.
I found someone with the same issue on openindiana, but so far no answers on the list :
http://openindiana.org/pipermail/openindiana-discuss/2012-April/007638.html
Has anyone on the list encountered this as well (and is there a fix?).
With kind regards,
Jeroen Roodhart
University of Amsterdam
Edited by: Tuxwielder on Mar 25, 2013 6:25 AM
Thanks for mentioning my web site. :-)
Boot the Live Media and then use the Device Manager to see if it can find a driver for your hardware.
If it cannot, which appears to be the case, then you may have to install VirtualBox or some competitor first and then use that to run Solaris.
alan -
Infiniband bad outgoing Throughput on 10Gbit HCA
Hi @ all,
I have a problem. I would like to use Solaris with ZFS to provide storage for a glusterfs server with nfs.
My test environment:
Node 1: CentOS 6.2 with OFED 1.5.4.1
Node 2: OI 151a4 with native IB (before that, Solaris 11; both Solaris and OpenIndiana give the same result).
If I run a test with iperf:
From CentOS to OI throughput around 4.90Gbit/s
[root@dev-cos62 ~]# iperf -c 1.1.1.2
Client connecting to 1.1.1.2, TCP port 5001
TCP window size: 193 KByte (default)
[ 3] local 1.1.1.1 port 36173 connected with 1.1.1.2 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 5.66 GBytes 4.86 Gbits/sec
From OI to CentOS throughput is only around 900Mbit/s
root@dev-oi:~# iperf -c 1.1.1.1
Client connecting to 1.1.1.1, TCP port 5001
TCP window size: 256 KByte (default)
[ 3] local 1.1.1.2 port 35841 connected with 1.1.1.1 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 1.13 GBytes 968 Mbits/sec
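As a sanity check on the two iperf summaries above, the bandwidth can be recomputed from the Transfer and Interval columns. This is a sketch under one assumption: that this iperf version reports byte quantities in powers of 1024 and bit rates in powers of 1000. That assumption is consistent with both summary lines but is not stated in the post. The function and pattern names are illustrative only.

```python
import re

# Matches an iperf summary line such as:
# "[ 3] 0.0-10.0 sec 5.66 GBytes 4.86 Gbits/sec"
SUMMARY = re.compile(
    r"(?P<start>[\d.]+)-\s*(?P<end>[\d.]+)\s+sec\s+"
    r"(?P<xfer>[\d.]+)\s+(?P<xunit>[KMG])Bytes\s+"
    r"(?P<bw>[\d.]+)\s+(?P<bunit>[KMG])bits/sec"
)

# Assumption: bytes use binary prefixes, bit rates use decimal prefixes.
BYTE_UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}
BIT_UNITS = {"K": 1e3, "M": 1e6, "G": 1e9}


def check_summary(line: str):
    """Return (computed_bits_per_s, reported_bits_per_s), or None if no match."""
    m = SUMMARY.search(line)
    if m is None:
        return None
    seconds = float(m["end"]) - float(m["start"])
    transferred_bits = float(m["xfer"]) * BYTE_UNITS[m["xunit"]] * 8
    reported = float(m["bw"]) * BIT_UNITS[m["bunit"]]
    return transferred_bits / seconds, reported


for line in (
    "[ 3] 0.0-10.0 sec 5.66 GBytes 4.86 Gbits/sec",   # CentOS -> OI
    "[ 3] 0.0-10.0 sec 1.13 GBytes 968 Mbits/sec",    # OI -> CentOS
):
    computed, reported = check_summary(line)
    print(f"computed {computed / 1e9:.2f} Gbit/s, reported {reported / 1e9:.2f} Gbit/s")
```

Both directions agree with the reported figures to within rounding, so the asymmetry (roughly 5x) is in the measurements themselves and not a reporting artifact.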
My IB Hardware is a new Mellanox InfiniScale switch and some older 10Gbit Mellanox HCAs. (MTLP23108)
A second dd test with a ramdisk shared over NFS tells me that the maximum outgoing ("write") InfiniBand throughput on Solaris/OI is 1Gbit.
Does anybody have any idea?
Thanks and many greetings from Germany
Andreas
Edited by: 942419 on 23.06.2012 17:47
Do you see this problem with S11 FCS too?
In any case, you could try increasing the ndd /dev/tcp setting
'tcp_naglim_def' from 4K to 64K and see if it helps.
# ndd -set /dev/tcp tcp_naglim_def 65535
-
Is there a way to use Java over infiniband?
It appears to be only on Solaris with Java 7.
However, I believe Infiniband supports TCP, so you can use that approach on any version of Java. (But I believe it's not as efficient.) -
Infiniband Mellanox MTU size change
Hello to all.
I am new to Solaris and know nothing about its administration tools.
We want to test InfiniBand and need to change the MTU of the adapter (Mellanox) to 4K (which is the default configuration for our subnet manager).
Which command should be used to achieve this?
Next we need to configure an SRP target. Is there a specific howto?
Thanks in advance for your kind reply
Thank you for your reply, Darren.
As per my test, ifconfig will work for IPoIB (when creating a derived interface to run IP over InfiniBand).
At the moment the driver is installed, the link comes up, the fabric negotiates a 2048 MTU, and no interface is available to configure for IPoIB since I did not create one.
ifconfig lists only the gigabit Ethernet NIC, since no ib0 card is defined (only the ib device in /proc).
I am not going to run IP over InfiniBand, at least not for storage access, since it imposes a huge performance penalty.
I think configuring the device MTU (not the virtual Ethernet-over-InfiniBand MTU) must be done at the driver level, using something related to the low-level adapter:
cfgadm_ib(1M) - InfiniBand hardware-specific commands for cfgadm
datadm(1M) - maintain DAT static registry file
ifconfig(1M) - configure network interface parameters
libdat(3LIB) - direct access transport library
ib(4) - InfiniBand device driver configuration files
ibmf(7) - InfiniBand Management Transport Framework
daplt(7D) - Tavor uDAPL service driver
ib(7D) - InfiniBand Bus Nexus Driver
ibcm(7D) - InfiniBand Communication Manager
ibd(7D) - Infiniband IPoIB device driver
ibdm(7D) - Solaris InfiniBand device manager
tavor(7D) - InfiniBand (IB) Tavor driver
I guess it will be cfgadm, but I do not know the syntax and parameters.
In VMware, using the Mellanox tools, you issue esxcfg-module -s "port_type_default=1 set_4k_mtu=1" mlx4_en, where mlx4_en is the kernel module.
How can this be achieved on Solaris?
As for the SRP question, if Solaris supports COMSTAR it should support SRP target mode, right?
Thanks for sharing your thoughts. -
SRP Target setup over infiniband network
Hi All,
I'm very new to Solaris and have come over to get out-of-the-box support for SRPT. I've been trying to set up an SRP target to use over an InfiniBand network. I'm using an HP BL460c G1 with an IB 4xDDR Mezz card, which is connected to an unmanaged IB switch inside a C7000 blade server.
I can't see the IB HBA, as shown below:
root@Blade03:~# prtconf
System Configuration: Oracle Corporation i86pc
Memory size: 8190 Megabytes
System Peripherals (Software Nodes):
i86pc
scsi_vhci, instance #0
pci, instance #0
pci103c,31fd, instance #0
pci8086,25e2, instance #0
pci8086,3500, instance #8
pci8086,3510, instance #10
pci1166,103, instance #12
pci103c,703b, instance #1
pci8086,3514, instance #11
pci8086,350c, instance #9
pci8086,25e3, instance #1
pci1166,103, instance #13
pci1166,104, instance #1
pci103c,3211, instance #0
sd, instance #0
sd, instance #1
pci8086,25e4 (driver not attached)
pci8086,25e5 (driver not attached)
pci8086,25e6 (driver not attached)
pci8086,25e7 (driver not attached)
pci103c,31fd (driver not attached)
pci103c,31fd, instance #1
pci103c,31fd (driver not attached)
pci103c,31fd (driver not attached)
pci103c,31fd (driver not attached)
pci103c,31fd (driver not attached)
pci103c,31fd (driver not attached)
pci8086,2690, instance #6
pci1166,103, instance #7
pci103c,703b, instance #0
pci103c,31fe, instance #0
pci103c,31fe, instance #1
pci103c,31fe, instance #2
pci103c,31fe, instance #3
pci103c,31fe, instance #0
pci8086,244e, instance #0
display, instance #0
pci103c,3305 (driver not attached)
pci103c,3305 (driver not attached)
pci103c,3305, instance #4
device, instance #0
keyboard, instance #0
mouse, instance #1 (driver not attached)
hub, instance #0
pci103c,3305 (driver not attached)
isa, instance #0
i8042, instance #0
keyboard, instance #0
mouse, instance #0
asy, instance #0
asy, instance #1
pit_beep, instance #0
fw, instance #0
cpu (driver not attached)
cpu (driver not attached)
cpu (driver not attached)
cpu (driver not attached)
sb, instance #1
used-resources (driver not attached)
iscsi, instance #0
fcoe, instance #0
options, instance #0
pseudo, instance #0
agpgart, instance #0
root@Blade03:~# cfgadm -a
Ap_Id Type Receptacle Occupant Condition
c7 scsi-bus connected configured unknown
c7::dsk/c7t0d0 disk connected configured unknown
c7::dsk/c7t1d0 disk connected configured unknown
usb2/1 unknown empty unconfigured ok
usb2/2 unknown empty unconfigured ok
usb3/1 unknown empty unconfigured ok
usb3/2 unknown empty unconfigured ok
usb4/1 unknown empty unconfigured ok
usb4/2 unknown empty unconfigured ok
usb5/1 unknown empty unconfigured ok
usb5/2 unknown empty unconfigured ok
usb6/1 unknown empty unconfigured ok
usb6/2 unknown empty unconfigured ok
usb6/3 unknown empty unconfigured ok
usb6/4 unknown empty unconfigured ok
usb6/5 unknown empty unconfigured ok
usb6/6 unknown empty unconfigured ok
usb6/7 unknown empty unconfigured ok
usb6/8 unknown empty unconfigured ok
usb7/1 usb-device connected configured ok
usb7/2 usb-hub connected configured ok
usb7/2.1 unknown empty unconfigured ok
usb7/2.2 unknown empty unconfigured ok
usb7/2.3 unknown empty unconfigured ok
usb7/2.4 unknown empty unconfigured ok
usb7/2.5 unknown empty unconfigured ok
usb7/2.6 unknown empty unconfigured ok
usb7/2.7 unknown empty unconfigured ok
I've installed and enabled
root@Blade03:~# svcadm enable ibsrp/target
root@Blade03:~# svcadm enable stmf
root@Blade03:~# svcadm enable -r ibsrp/target
Any help I could get on this would be great
Ok, I think I have this all fixed...
I found this in the boot log, /var/adm/messages:
Oct 25 01:02:22 Blade03 hermon: [ID 271130 kern.warning] WARNING: hermon0: Device Error: HCA firmware not at minimum version
Oct 25 01:02:22 Blade03 hermon: [ID 549348 kern.notice] Unsupported Hermon FW version: expected: 0002.0006.0000, actual: 0002.0002.0000
which pointed me to a firmware issue
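For anyone scanning /var/adm/messages for the same symptom, the following sketch pulls the expected and actual firmware versions out of a line like the hermon notice above and compares them numerically. It is only an illustration: the regular expression is written against the one log line quoted in this post, and the function names are invented here, not part of any Solaris tool.

```python
import re

# Pattern based on the single hermon log line quoted in the post.
FW_LINE = re.compile(r"expected:\s*([\d.]+),\s*actual:\s*([\d.]+)")


def parse_version(text: str) -> tuple:
    """Turn '0002.0006.0000' into (2, 6, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.strip(".").split("."))


def firmware_too_old(log_line: str):
    """Return True if actual < expected, False otherwise, None if no match."""
    m = FW_LINE.search(log_line)
    if m is None:
        return None
    expected, actual = (parse_version(g) for g in m.groups())
    return actual < expected


line = ("Oct 25 01:02:22 Blade03 hermon: [ID 549348 kern.notice] "
        "Unsupported Hermon FW version: expected: 0002.0006.0000, "
        "actual: 0002.0002.0000")
print(firmware_too_old(line))
```

Comparing tuples of integers avoids the pitfalls of comparing the dotted strings directly.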
root@Blade03:~# fwflash -l
Class [IB]
GUID: System Image - 001b78ffff34704b
Node Image - 001b78ffff347048
Port 1 - 001b78ffff347049
Port 2 - 001b78ffff34704a
Mac 1 - ffffffffffffffff
Mac 2 - ffffffffffffffff
Firmware revision : 2.2.000
Product : 448262-001 A0
No additional hardware info available for this device
Saved the current firmware just in case
root@Blade03:~# fwflash -y -r ib-firmware-2.2.000.bin -d /devices/pci@0,0/pci8086,25f9@6/pci103c,1718@0:devctl
root@Blade03:~# ls
ib-firmware-2.2.000.bin
I then downloaded the latest firmware from the vendor (HP) and uploaded the new firmware. I was a little worried doing this.
root@Blade03:~# fwflash -f ./fw-25408-2_8_0000-448262-B21-clp-171.bin -d /devices/pci@0,0/pci8086,25f9@6/pci103c,1718@0:devctl
fwflash: hermon: No PSID match found
fwflash: hermon: Unable to verify firmware is appropriate for the hardware
fwflash: Do you want to continue? (Y/N): y
About to update firmware on /devices/pci@0,0/pci8086,25f9@6/pci103c,1718@0:devctl
with file ./fw-25408-2_8_0000-448262-B21-clp-171.bin.
Do you want to continue? (Y/N): y
Unable to completely verify that this firmware image (./fw-25408-2_8_0000-448262-B21-clp-171.bin) is compatible with your HCA /devices/pci@0,0/pci8086,25f9@6/pci103c,1718@0:devctl
Do you really want to continue? (Y/N): y
. . . . . . . . . . . . . . . . +
fwflash: New firmware will be activated after you reboot
After the reboot all looks well
root@Blade03:~# fwflash -l
Device[100] /devices/pci@0,0/pci8086,25f9@6/pci103c,1718@0:devctl
Driver hermon
Class [IB]
GUID: System Image - 001b78ffff34704b
Node Image - 001b78ffff347048
Port 1 - 001b78ffff347049
Port 2 - 001b78ffff34704a
Mac 1 - 00001b7800347049
Mac 2 - 00001b780034704a
Firmware revision : 2.8.000
Product : 448262-001 A0
No additional hardware info available for this device
Now in /var/adm/messages I see the following:
Oct 25 01:45:46 Blade03 hermon: [ID 904943 kern.info] vpd 448262-001
Oct 25 01:45:50 Blade03 pcplusmp: [ID 805372 kern.info] pcplusmp: pciex15b3,634a (hermon) instance 0 irq 0x1b vector 0x60 ioapic 0xff intin 0xff is bound to cpu 0
Oct 25 01:45:50 Blade03 pcplusmp: [ID 805372 kern.info] pcplusmp: pciex15b3,634a (hermon) instance 0 irq 0x1c vector 0x61 ioapic 0xff intin 0xff is bound to cpu 1
Oct 25 01:45:50 Blade03 pcplusmp: [ID 805372 kern.info] pcplusmp: pciex15b3,634a (hermon) instance 0 irq 0x1b vector 0x60 ioapic 0xff intin 0xff is bound to cpu 2
Oct 25 01:45:50 Blade03 hermon: [ID 753898 kern.info] NOTICE: Hermon is operational
Oct 25 01:45:50 Blade03 pcieb: [ID 586369 kern.info] PCIE-device: pci103c,1718@0, hermon0
Oct 25 01:45:50 Blade03 npe: [ID 236367 kern.info] PCI Express-device: pci103c,1718@0, hermon0
Oct 25 01:45:50 Blade03 genunix: [ID 936769 kern.info] hermon0 is /pci@0,0/pci8086,25f9@6/pci103c,1718@0
Oct 25 01:45:50 Blade03 hermon: [ID 749262 kern.info] hermon0: FW ver: 0002.0008.0000, HW rev: 160
*Oct 25 01:45:50 Blade03 hermon: [ID 969628 kern.info] hermon0: MT25408 ConnectX Mellanox Technologies (0x001b78ffff347048)*
*Oct 25 01:45:50 Blade03 hermon: [ID 330282 kern.info] Hermon attach complete*
Oct 25 01:46:46 Blade03 genunix: [ID 408114 kern.info] /ib/rpcib@0 (rpcib0) online
Oct 25 01:46:46 Blade03 genunix: [ID 483743 kern.info] /ib/rpcib@0 (rpcib0) multipath status: degraded: path 2 hermon0/rpcib@rpcib,0 is online
Oct 25 01:46:46 Blade03 genunix: [ID 408114 kern.info] /ib/eibnx@0 (eibnx0) online
Oct 25 01:46:46 Blade03 genunix: [ID 483743 kern.info] /ib/eibnx@0 (eibnx0) multipath status: degraded: path 3 hermon0/eibnx@eibnx,0 is online
Oct 25 01:46:46 Blade03 genunix: [ID 408114 kern.info] /ib/rdsib@0 (rdsib0) online
Oct 25 01:46:46 Blade03 genunix: [ID 483743 kern.info] /ib/rdsib@0 (rdsib0) multipath status: degraded: path 4 hermon0/rdsib@rdsib,0 is online
Oct 25 01:46:46 Blade03 ib: [ID 842868 kern.info] IB device: daplt@0, daplt0
Oct 25 01:46:46 Blade03 genunix: [ID 936769 kern.info] daplt0 is /ib/daplt@0
Oct 25 01:46:46 Blade03 genunix: [ID 408114 kern.info] /ib/daplt@0 (daplt0) online
Oct 25 01:46:46 Blade03 genunix: [ID 483743 kern.info] /ib/daplt@0 (daplt0) multipath status: degraded: path 5 hermon0/daplt@daplt,0 is online
Oct 25 01:46:46 Blade03 genunix: [ID 408114 kern.info] /ib/rdsv3@0 (rdsv30) online
Oct 25 01:46:46 Blade03 genunix: [ID 483743 kern.info] /ib/rdsv3@0 (rdsv30) multipath status: degraded: path 6 hermon0/rdsv3@rdsv3,0 is online
Oct 25 01:46:46 Blade03 ib: [ID 842868 kern.info] IB device: sol_uverbs@0, sol_uverbs0
Oct 25 01:46:46 Blade03 genunix: [ID 936769 kern.info] sol_uverbs0 is /ib/sol_uverbs@0
Oct 25 01:46:46 Blade03 genunix: [ID 408114 kern.info] /ib/sol_uverbs@0 (sol_uverbs0) online
Oct 25 01:46:46 Blade03 genunix: [ID 483743 kern.info] /ib/sol_uverbs@0 (sol_uverbs0) multipath status: degraded: path 7 hermon0/sol_uverbs@sol_uverbs,0 is online
Oct 25 01:46:46 Blade03 ib: [ID 842868 kern.info] IB device: sol_umad@0, sol_umad0
Oct 25 01:46:46 Blade03 genunix: [ID 936769 kern.info] sol_umad0 is /ib/sol_umad@0
Oct 25 01:46:46 Blade03 genunix: [ID 408114 kern.info] /ib/sol_umad@0 (sol_umad0) online
Oct 25 01:46:46 Blade03 genunix: [ID 483743 kern.info] /ib/sol_umad@0 (sol_umad0) multipath status: degraded: path 8 hermon0/sol_umad@sol_umad,0 is online
Oct 25 01:46:46 Blade03 genunix: [ID 408114 kern.info] /ib/sdpib@0 (sdpib0) online
Oct 25 01:46:46 Blade03 genunix: [ID 483743 kern.info] /ib/sdpib@0 (sdpib0) multipath status: degraded: path 9 hermon0/sdpib@sdpib, is online
Oct 25 01:46:46 Blade03 genunix: [ID 227219 kern.info] This Solaris instance has UUID 41bed47a-ebb3-44bf-b1ad-cfe670b645e0
Oct 25 01:46:46 Blade03 ib: [ID 842868 kern.info] IB device: iser@0, iser0
Oct 25 01:46:46 Blade03 genunix: [ID 936769 kern.info] iser0 is /ib/iser@0
Oct 25 01:46:46 Blade03 genunix: [ID 408114 kern.info] /ib/iser@0 (iser0) online
Oct 25 01:46:46 Blade03 genunix: [ID 483743 kern.info] /ib/iser@0 (iser0) multipath status: degraded: path 10 hermon0/iser@iser,0 is online
Now I can see the physical IB-PORT as per below:
root@Blade03:~# cfgadm -a
Ap_Id Type Receptacle Occupant Condition
c7 scsi-bus connected configured unknown
c7::dsk/c7t0d0 disk connected configured unknown
c7::dsk/c7t1d0 disk connected configured unknown
ib IB-Fabric connected configured ok
ib::1B78FFFF34704A,0,ipib IB-PORT connected configured ok
ib::1B78FFFF347049,0,ipib IB-PORT connected configured ok
ib::daplt,0 IB-PSEUDO connected configured ok
ib::eibnx,0 IB-PSEUDO connected configured ok
ib::iser,0 IB-PSEUDO connected configured ok
ib::rdsib,0 IB-PSEUDO connected configured ok
ib::rdsv3,0 IB-PSEUDO connected configured ok
ib::rpcib,0 IB-PSEUDO connected configured ok
ib::sdpib,0 IB-PSEUDO connected configured ok
ib::sol_umad,0 IB-PSEUDO connected configured ok
ib::sol_uverbs,0 IB-PSEUDO connected configured ok
ib::srpt,0 IB-PSEUDO connected configured ok
and prtconf now also looks good:
root@Blade03:~# prtconf
System Configuration: Oracle Corporation i86pc
Memory size: 8190 Megabytes
System Peripherals (Software Nodes):
i86pc
ib, instance #0
srpt, instance #0
rpcib, instance #0
eibnx, instance #0
rdsib, instance #0
daplt, instance #0
rdsv3, instance #0
sol_uverbs, instance #0
sol_umad, instance #0
sdpib, instance #0
iser, instance #0
output snipped
pci8086,25f9, instance #2
pci103c,1718, instance #0
ibport, instance #0
ibport, instance #1
output snipped -
Logical interface in solaris 10
Hi there,
I need to configure a logical interface on a Solaris 10 3/05 server. After reading the Solaris 10 IP services manual, I am not quite sure what to do. All the examples and explanations use the new addif subcommand of ifconfig. It was not clear from the documentation whether logical interfaces set via addif will persist across boots.
Can one still configure a logical interface in Solaris 10 in the more traditional way, as in Solaris 8? On a Solaris 8 server I would do the following.
Let's assume I want to configure, on a Solaris 8 server, a logical interface named hme0:1 with IP address 192.168.20.28 and netmask 255.255.255.0 for hostname host001:
# cat > /etc/hostname.hme0:1
host001
^D
# echo "192.168.20.28 host001" >> /etc/inet/hosts
# echo "192.168.20.0 255.255.255.0" >> /etc/inet/netmasks
# reboot -- -r
Can one still do that on a Solaris 10 3/05 server?
Hi there,
I need to configure logical interface in a solaris 10
3/05 server. After reading the Solaris 10 IP services
manual, I am not quite sure what to do. All the
examples and explanation are about using the new
subcommand addif of ifconfig. It was not clear in the
documentation if the setting logical interfaces via
addif will persist across boot.
No. No 'ifconfig' command is persistent.
Can one still configure logical interface in Solaris
10 in a more traditional way like in Solaris 8? In an
Solaris 8 server I will do the following.
Let's assume I want to configure in a solaris 8
server a logical interface named hme0:1 with IP
address 192.168.20.28 with netmask 255.255.255.0 for
hostname host001
# cat /etc/hostname.hme0:1
host001
^D
# echo "192.168.20.28 host001" >> /etc/inet/hosts
# echo "192.168.20.0 255.255.255.0" >>
/etc/inet/netmasks
# reboot -- -r
Can one still do that in solaris 10 3/05 server?
Absolutely.
You don't need to reboot (you can run ifconfig for this boot and let the files do the work next time), and the -r doesn't do anything with interfaces (especially virtual interfaces) anyway.
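To make the file-based part concrete, here is a rough sketch that writes the three entries from the question. The function name write_logical_if and the ROOT prefix are invented for illustration (ROOT lets you try it in a scratch directory first); this is not a Solaris tool.

```shell
# Hypothetical helper: write the persistent entries for a logical interface.
# ROOT is an assumed prefix for dry runs; leave it unset on a live system.
write_logical_if() {
    # $1 interface (e.g. hme0:1), $2 hostname, $3 IP address,
    # $4 network number, $5 netmask
    root=${ROOT:-}
    mkdir -p "$root/etc/inet" || return 1
    echo "$2" > "$root/etc/hostname.$1"        # read at boot to plumb the interface
    echo "$3 $2" >> "$root/etc/inet/hosts"     # hostname to address mapping
    echo "$4 $5" >> "$root/etc/inet/netmasks"  # netmask for the network
}

# For the running system only (not persistent), something along these lines:
#   ifconfig hme0 addif 192.168.20.28 netmask 255.255.255.0 up
```

For the example in the question, the call would be write_logical_if hme0:1 host001 192.168.20.28 192.168.20.0 255.255.255.0. Check that the hosts and netmasks entries are not already present before appending.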
Darren -
Installation problem on Solaris
I am trying to install Sun ONE 7.0 on Solaris 8. The install is failing with this error:
ERROR - library load failed with following error: Can't load library: /opt/SUNWappserver7/lib/libinstallCore.so
INFO - End core server uninstallation
Does anyone know what causes this?
cheers
It looks like the Solaris package installation failed and the installer reverted to the uninstallation sequence. For the low-level pkgadd log, please check the /var/sadm/install/logs/Sun_ONE_Application_Server_install.B<timestamp> file (the timestamp is the date and time of your installation attempt in mmddHHMM format).
Look for any errors in this file. The most likely cause is that installation of the Java Help (SUNWjhrt) package failed because there was no existing package-based J2SE installation on the system. If that's the case, the workaround is either to preinstall a package-based J2SE or to select the option to install the bundled J2SE that comes with the application server.
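A quick way to scan that log might look like the sketch below. The function name scan_install_logs is made up, and the B* glob simply stands in for the B<timestamp> suffix, whose actual value depends on when the install was attempted.

```shell
# Hypothetical helper: show lines mentioning "error" in the installer logs.
scan_install_logs() {
    # $1: log directory (assumed default: /var/sadm/install/logs)
    dir=${1:-/var/sadm/install/logs}
    # -i: ignore case; -n: prefix each match with its line number.
    grep -in 'error' "$dir"/Sun_ONE_Application_Server_install.B*
}
```

If SUNWjhrt shows up in the matching lines, that would be consistent with the Java Help package explanation above.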