Recently I had an issue with an Oracle Quad 10Gb/Dual 40Gb Ethernet Adapter: the operating system showed only half of its connected links as up (online) and the other half as down. In fact, the links that Solaris reported as down were physically connected and showed a steady green light on both the switch side and the adapter side.
root@hostname:~# dladm show-phys
LINK    MEDIA       STATE   SPEED   DUPLEX    DEVICE
net0    Ethernet    down    0       unknown   i40e0
net1    Ethernet    down    0       unknown   i40e1
net2    Ethernet    up      10000   full      i40e2
net3    Ethernet    down    0       unknown   i40e3
net4    Ethernet    down    0       unknown   i40e4
net5    Ethernet    down    0       unknown   i40e5
net6    Ethernet    up      10000   full      i40e6
net7    Ethernet    down    0       unknown   i40e7
net8    Ethernet    up      1000    full      vnet0
net9    Ethernet    up      1000    full      vnet1
net10   Infiniband  up      32000   unknown   ibp0
net11   Infiniband  up      32000   unknown   ibp1
net14   Infiniband  up      32000   unknown   ibp2
net15   Infiniband  up      32000   unknown   ibp3
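When only some of the cabled ports are affected, it can help to check a known list of links rather than reading the whole table by eye. The following is a minimal sketch, not part of the original procedure: links_not_up is a hypothetical helper name, and it assumes the column order shown in the listing above (LINK first, STATE third).

```shell
#!/bin/sh
# Hypothetical helper (not a Solaris command): reads `dladm show-phys`
# text on stdin and prints every link named in the arguments whose
# STATE column is anything other than "up".
# Assumption: LINK is column 1 and STATE is column 3, as in the listing.
# Links that do not appear in the input at all are not reported.
links_not_up() {
    awk -v want="$*" '
        BEGIN {
            # Turn the space-separated argument list into a lookup table.
            n = split(want, w, " ")
            for (i = 1; i <= n; i++) expected[w[i]] = 1
        }
        # Skip the header line, then report expected links that are not up.
        NR > 1 && ($1 in expected) && $3 != "up" { print $1 }
    '
}

# Intended use on the host (commented out so the block has no side effects):
# dladm show-phys | links_not_up net0 net2 net4 net6
```

Fed the listing above with net0, net2, net4 and net6 as arguments, it prints net0 and net4, which are the two cabled links that Solaris reported as down.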
Here we have an LDom with two CMIOUs. Each CMIOU provides four i40e devices, because each Ethernet adapter was configured to split into four links (4x10Gb=40Gb). net0, net2, net4 and net6 were physically connected, so all of them should have been up, but net0 and net4 were down. Support recommended downgrading the firmware of those adapters, power cycling the server, and then upgrading the firmware again (followed by another power cycle).
Every firmware operation requires a power cycle of the PDom. Take this into account when planning your work.
The sequence of steps was as follows:
1. Shut down any non-primary domains first (use ldm list to identify them).
# init 0
2. Save the LDom configuration if needed, and set the SP to boot into the factory-default configuration:
# ldm add-config backup
# ldm set-config factory-default
# ldm list-config
3. Shut down the primary domain.
# init 0
4. In ILOM, power off the host (or the platform, depending on your hardware) by using stop on the HOST (or SYS) target.
5. Configure booting from an ISO over the network (in ILOM):
-> set /Servers/PDomains/PDomain_0/SP/services/kvms/host_storage_device/ mode=disabled
-> set /Servers/PDomains/PDomain_0/SP/services/kvms/host_storage_device/remote/ server_URI=nfs://10.xx.xx.xxx:/private/boot.iso
-> set /Servers/PDomains/PDomain_0/SP/services/kvms/host_storage_device/ mode=remote
-> set /SP/cli timeout=0
-> show /Servers/PDomains/PDomain_0/SP/services/kvms/host_storage_device/
Set these parameters carefully: do not insert extra spaces, dashes, semicolons, etc. Also note that you need an NFS v3 server reachable over the network to host the ISO.
Another option is to configure the ISO file to boot from via the BUI (not covered in this post).
6. Start the HOST target
7. At the OpenBoot prompt, check that the rcdrom device is available, then boot from it:
{0} ok devalias
fallback-miniroot /pci@304/pci@1/usb@0/storage@2/disk@0
rcdrom /pci@304/pci@1/usb@0/storage@2/disk@0
virtual-console /virtual-devices/console@1
name aliases
{0} ok boot rcdrom
Boot device: /pci@304/pci@1/usb@0/storage@2/disk@0 File and args:
You might see new fault events, such as the host being unable to use the interconnect to the SP. This is explainable given how the host was booted; just be aware of them.
8. Log in to the system. I used the Oracle VTS image (MOS patch 26091982). The username and password are jack/jack. The root password is solaris.
9. Create a network link for connecting to NFS to download the firmware (if it is not already in the ISO). Use the network and link from a working LDom.
# ipadm create-addr -T static -a local=10.xx.xx.xxx/24 net1
10. Mount the NFS share:
# mount -F nfs -o rw,bg,hard,nointr,rsize=1048576,wsize=1048576,vers=3,proto=tcp,forcedirectio,nocto 10.xx.xx.xxx:/export/test-share /mnt/tst
11. Perform whatever other operations you need (downgrade/upgrade firmware, reconfigure equipment, etc.).
12. Restore the LDom configuration via ILOM and disable booting from the remote ISO, for example:
-> set /Servers/PDomains/PDomain_0/host/bootmode config=backup
-> set /Servers/PDomains/PDomain_0/sp/services/kvms/host_storage_device/ mode=miniroot
13. Shut down the OS and stop the HOST target. Wait until the status_detail of the HOST target shows 'Host is off', then start the HOST target.
14. The operating system should boot. Check the status of the links:
# dladm show-phys
LINK    MEDIA       STATE   SPEED   DUPLEX    DEVICE
net0    Ethernet    up      10000   full      i40e0
net1    Ethernet    down    0       unknown   i40e1
net2    Ethernet    up      10000   full      i40e2
net3    Ethernet    down    0       unknown   i40e3
net4    Ethernet    up      10000   full      i40e4
net5    Ethernet    down    0       unknown   i40e5
net6    Ethernet    up      10000   full      i40e6
net7    Ethernet    down    0       unknown   i40e7
net8    Ethernet    up      1000    full      vnet0
net9    Ethernet    up      1000    full      vnet1
net10   Infiniband  up      32000   unknown   ibp0
net11   Infiniband  up      32000   unknown   ibp1
net14   Infiniband  up      32000   unknown   ibp2
net15   Infiniband  up      32000   unknown   ibp3
Looks good :) Repeat these steps to upgrade the firmware (if needed).
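If you save the dladm show-phys output before and after the firmware work, the two listings can be compared mechanically instead of by eye. This is only a sketch under assumptions: link_state_changes is a hypothetical helper name, the file names are placeholders, and it relies on LINK being the first column and STATE the third, as in the listings in this post.

```shell
#!/bin/sh
# Hypothetical helper (not a Solaris command): compares two saved
# `dladm show-phys` listings and prints each link whose STATE differs.
# $1 = listing saved before the firmware work, $2 = listing saved after.
# Assumption: LINK is column 1 and STATE is column 3 in both files.
link_state_changes() {
    awk '
        FNR == 1 { next }                      # skip the header line of each file
        NR == FNR { before[$1] = $3; next }    # first file: remember link -> state
        ($1 in before) && before[$1] != $3 {   # second file: report changed states
            print $1 ": " before[$1] " -> " $3
        }
    ' "$1" "$2"
}

# Intended use (file names are placeholders, commented out on purpose):
# dladm show-phys > /var/tmp/links.before    # before the firmware work
# dladm show-phys > /var/tmp/links.after     # after the final power cycle
# link_state_changes /var/tmp/links.before /var/tmp/links.after
```

For the two listings in this post, it reports net0 and net4 going from down to up and nothing else.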
That's it :) Good Luck !