
Problems That Might Occur With XCP 4040/XCP 3130 and Workarounds


The following table lists the problems that might occur with XCP 4040/XCP 3130 and workarounds for them.
Table 3-6  Problems That Might Occur With XCP 4040/XCP 3130 and Workarounds
RTI No. RTIF2-170508-001
Model SPARC M12-1, SPARC M12-2, SPARC M12-2S
Description If you reboot the XSCF by using the flashupdate(8) or rebootxscf(8) command while a physical partition (PPAR) is being powered on, the POST may stop in a state where the diagnosis is completed (Initialization Complete).
Workaround There is no effective workaround.
[How to restore]
Execute the reset por command, or power off the PPAR by using the poweroff -f command and then power it on again.
RTI No. RTIF2-170508-002
Model SPARC M12-2S
Description For the system connected to a crossbar box (XBBOX), suppose that you are powering on or off the physical partition (PPAR) not assigned to the maintenance-target FRU, and you execute the diagxbu(8) or testsb(8) command. Then, the diagnosis of a system board (PSB) may fail during the PSB power-off, and the following message may be output.
[Warning:010]
An internal error has occurred.
Workaround There is no effective workaround.
Execute the showboards(8) command to check that the [Pwr] field of the relevant PSB is set to "n".
If the field is set to "y", execute the showboards(8) command every few minutes to check that the field changes to "n".
RTI No. RTIF2-170508-003
Model SPARC M12-2S
Description Suppose that the setpparparam command sets the OpenBoot PROM environment variables and then the poweron -a command starts multiple physical partitions (PPARs) simultaneously. Then, the following error message is output to the OS console: "Error storing configuration variable. LDC is not up Configuration variable setting will not persist after a reset or power cycle." The OpenBoot PROM environment variables set by the setpparparam command may not be applied.
Also, as a result of the OpenBoot PROM environment variables not being applied, Oracle Solaris may not be able to start.
Workaround There is no effective workaround.
[How to restore]
Temporarily power off the physical partition (PPAR) indicated by the output error message. Then, execute the setpparparam(8) command to set the OpenBoot PROM environment variables, and power on the PPAR again.
RTI No. RTIF2-170508-004
Model SPARC M12-1, SPARC M12-2, SPARC M12-2S
Description After a firmware update, when the XCP version is checked with the version(8) command or the XSCF Web interface, the displayed XCP version may not be the same as the updated XCP version. In the following example, firmware was updated from XCP 3021 to XCP 3030. The XCP version of "XCP0 (Reserve):" for BB#00 has not been updated.

XSCF> version -c xcp -v
XBBOX#80-XSCF#0 (Master)
XCP0 (Reserve): 3030
XSCF : 03.03.0000
XCP1 (Current): 3030
XSCF : 03.03.0000
XBBOX#81-XSCF#0 (Standby)
XCP0 (Current): 3030
XSCF : 03.03.0000
XCP1 (Reserve): 3030
XSCF : 03.03.0000
BB#00-XSCF#0
XCP0 (Reserve): 3021
CMU : 03.03.0000
POST : 1.43.0
OpenBoot PROM : 4.34.0+1.22.0
Hypervisor : 0.27.8
XSCF : 03.02.0001
XCP1 (Current): 3030
CMU : 03.03.000
POST : 1.43.0
OpenBoot PROM : 4.34.0+1.22.0
Hypervisor : 0.27.8
XSCF : 03.03.0000
Workaround There is no effective workaround.
[How to restore]
For the SPARC M12-1, the SPARC M12-2, or the 1BB configuration with the SPARC M12-2S, execute the rebootxscf command to reboot the XSCF.
For other configurations, execute the rebootxscf -b BB-ID command, specifying the BB-ID of the crossbar box (XBBOX) or SPARC M12-2S (BB) whose XCP version was not updated, to reboot the XSCF of that chassis.
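The version check described above can be sketched as a small script. This is only an illustration in Python, assuming that the output of version -c xcp -v has been captured as text in the layout shown in the example. The function name find_stale_banks and the shortened sample are hypothetical and are not part of the XSCF firmware.

```python
import re

def find_stale_banks(version_output: str, expected: str) -> list[tuple[str, str, str]]:
    """Return (chassis, bank, version) for each XCP bank whose version differs from `expected`."""
    stale = []
    chassis = None
    for raw in version_output.splitlines():
        line = raw.strip()
        # Chassis header lines look like "BB#00-XSCF#0" or "XBBOX#80-XSCF#0 (Master)".
        if re.match(r"^(XBBOX|BB)#\d+-XSCF#\d+", line):
            chassis = line.split("-XSCF")[0]
            continue
        # Bank lines look like "XCP0 (Reserve): 3021".
        m = re.match(r"^(XCP\d) \((Reserve|Current)\):\s*(\d+)$", line)
        if m and chassis and m.group(3) != expected:
            stale.append((chassis, f"{m.group(1)} ({m.group(2)})", m.group(3)))
    return stale

# Shortened sample based on the example output (component lines omitted).
SAMPLE = """\
XBBOX#80-XSCF#0 (Master)
XCP0 (Reserve): 3030
XCP1 (Current): 3030
BB#00-XSCF#0
XCP0 (Reserve): 3021
XCP1 (Current): 3030
"""

print(find_stale_banks(SAMPLE, "3030"))  # [('BB#00', 'XCP0 (Reserve)', '3021')]
```

A chassis reported by this check is a candidate for the XSCF reboot described in [How to restore].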
RTI No. RTIF2-170508-005
Model SPARC M12-2S
Description If "process down," a panic, or a watchdog timeout occurs on the master XSCF during maintenance using the addfru(8) or replacefru(8) command, XSCF master/standby switching may occur.
In this case, the addfru(8) or replacefru(8) command is interrupted.
Workaround There is no effective workaround.
[How to restore]
The master XSCF cannot be restored to its original state because the switchscf(8) command was suppressed during maintenance work.
If the maintenance work was being done for a power supply unit (PSU), a fan, a slave chassis, or the XSCF of a slave chassis, start the maintenance work over, from the new master XSCF.
If the maintenance work was being done for the standby chassis or the XSCF of the standby chassis, power off the physical partition (PPAR), and perform cold maintenance.
RTI No. RTIF2-170508-007
Model SPARC M12-2S
Description In a SPARC M12-2S connected to a crossbar box (XBBOX), the following symptoms may occur if an XSCF failure occurs on some part of the chassis when the physical partition (PPAR) is in the powered-on state:
- Symptom 1
When the poweroff(8) command is executed, the PPAR is powered off, but the command does not respond for about 20 minutes.

- Symptom 2
When the PPAR is powered on, the following error occurs during power-on processing: "XB-XB interface fatal error." The power-on process is repeated and does not end normally.
Workaround If an XSCF failure has occurred, replace the XSCF unit (XSCFU) before performing a PPAR power operation.
[How to restore]
- Case of symptom 1
After about 20 minutes, the poweroff(8) command ends normally, and the PPAR is powered off.

- Case of symptom 2
Execute the poweroff -f command to forcibly power off the PPAR.
RTI No. RTIF2-170224-001
Model SPARC M12-2S
Description Suppose that you use the setpcl(8) command to change the LSB number of the SPARC M12 connected to a PCI expansion unit and start Oracle Solaris in the logical domain configuration. Then, you will be unable to display the configuration information for the PCI expansion unit, even by executing the showhardconf(8) command.
Workaround Use the setdomainconfig(8) command to set the logical domain configuration to the factory-default, and power on the physical partition (PPAR).
Then, configure the logical domain again.
RTI No. RTIF2-170224-002
Model SPARC M12-1, SPARC M12-2, SPARC M12-2S
Description Suppose that the setpowercapping(8) command is executed to set "Enable" for the power capping function and "powerlimit_p(percentage)" for the upper limit of power consumption. If the window time for exceeding the upper limit of power consumption is set to "none" and the input power is turned on or the physical partition (PPAR) is powered off, then "The limit of power has been exceeded" is registered in the event log.
Workaround There is no effective workaround.
Ignore this event log.
RTI No. RTIF2-170224-003
Model SPARC M12-1, SPARC M12-2, SPARC M12-2S
Description The FRU registered in the error log is displayed as "PPAR#30" if a configuration error of the system board (PSB) is detected while the testsb(8) or diagxbu(8) command is being executed.
Workaround There is no effective workaround.
Maintain the applicable PSB of the SPARC M12.
RTI No. RTIF2-170224-004
Model SPARC M12-2S
Description The switching of an XSCF may fail if the XSCF is switched by executing the switchscf(8) command while a physical partition (PPAR) is being powered on.
Workaround Do not switch an XSCF by using the switchscf(8) command while a PPAR is being powered on.
RTI No. RTIF2-170224-005
Model SPARC M12-2S
Description Powering on a physical partition (PPAR) in a system that satisfies all the following conditions may power on other PPARs too.
- Remote power management is enabled with the setremotepwrmgmt(8) command.

- A node is created whose SubNodeID is not set in a management item of remote power management.

- Multiple PPARs are configured.
Workaround If the system has multiple PPARs, create a management file for remote power management by specifying a PPAR-ID as a SubNodeID, and then register the remote power management settings with setremotepwrmgmt -c config.
RTI No. RTIF2-170224-006
Model SPARC M12-2S
Description If XSCF switching or an XSCF reboot occurred while the physical partition (PPAR) was being powered off, it may be impossible to turn off the power.
Workaround There is no effective workaround.
While powering off the PPAR, do not use the switchscf(8) command for the switching or the rebootxscf(8) command for the XSCF reboot.
[How to restore]
Turn off the input power, and then turn it on again. (AC OFF/ON)
RTI No. RTIF2-170224-007
Model SPARC M12-2S
Description If a hardware failure occurs in a 4BB or greater configuration, automatic cluster switching may fail.
If 16 or more guest nodes are incorporated into a single cluster, the following warning message may be output to the console of the control domain.
SA SA_xscf????.so to test host ??? failed
Workaround If automatic cluster switching fails, follow the procedure in the manual of the cluster software to perform switching manually.
RTI No. RTIF2-170224-008
Model SPARC M12-2S
Description If XSCF switching occurs while multiple physical partitions (PPARs) are being powered on at the same time, it may take longer than usual to power them on.
Workaround There is no effective workaround.
Do not switch an XSCF by using the switchscf(8) command while PPARs are being powered on.
RTI No. RTIF2-170224-009
Model SPARC M12-2S
Description If the "SCF process down detected" error occurs on the standby XSCF while the XCP firmware is being updated, the "SCF panic detected" error may occur on the master XSCF.
Workaround There is no effective workaround.
After the reboot of every XSCF has completed, execute the flashupdate command with the -f option specified to update the XCP firmware again.
RTI No. RTIF2-170224-010
Model SPARC M12-2S
Description If a low-voltage problem in the XSCF unit occurs on the master XSCF, automatic master/standby switching may not occur.
Workaround There is no effective workaround.
If the master XSCF does not respond, execute the switchscf command with the -f option specified from the standby XSCF to forcibly switch the master/standby XSCF.
[Example]
XSCF> switchscf -t Master -f
The XSCF unit switch between the Master and Standby states. Continue? [y|n]:y
After the master/standby switching, replace the non-responsive XSCF unit.
RTI No. RTIF2-170224-011
Model SPARC M12-2S
Description After the master XSCF switchover has completed, any of the following events may occur.
[Event 1]
Active replacement of the XSCF unit (XSCFU) in the SPARC M12-2S with the replacefru command fails with [Warning:051] displayed. This problem does not occur in active replacement of the XSCFU in a crossbar box.
[Event 2]
After you execute the rebootxscf -a command to reboot all XSCFs, hardware errors may not be detectable.
Workaround [Event 1]
After the master XSCF switchover has completed, wait about 20 minutes and then execute the replacefru command. Or, power off the PPAR requiring maintenance, and then execute the replacefru command.
[Event 2]
After the master XSCF switchover has completed, wait about 20 minutes and then execute the rebootxscf command.
[How to restore]
Reboot all the XSCFs by executing the rebootxscf -a command.
RTI No. RTIF2-170224-012
Model SPARC M12-2S
Description Suppose that you turn off/on (AC OFF/ON) the input power to the system when the execution results of the showstatus command show a component marked as a failure. After the power is turned on again, the failure mark of a chassis other than the master chassis may be cleared in the execution results of the showstatus command.
This problem occurs when an error occurs in a chassis other than the master chassis in a system with a building block configuration, and the first four bytes of the first field in [Code:] of the error log end in "0", as shown below.
[Example]
Date: Xxx XX HH:MM:SS XXX YYYY
Code: *******0-******************-************************
Workaround There is no effective workaround.
RTI No. RTIF2-170224-013
Model SPARC M12-2S
Description Suppose that you make a reservation for the operating physical partition (PPAR), consisting of one system board (PSB), to be disconnected at the next PPAR restart time. After that, if this reservation is canceled, the following command message is output: "An internal error has occurred."
This problem also occurs in the operating physical partition (PPAR), consisting of one system board (PSB), when the same PSB is specified to be configured.
[Example]
A reservation for disconnecting PSB#01-0, which is assigned to PPAR#01, is canceled.
XSCF> deleteboard -y -c reserve 01-0
PSB#01-0 will be unassigned from PPAR after the PPAR restarts.
Continue?[y|n] :y
XSCF> showboards -av
PSB R PPAR-ID(LSB) Assignment Pwr Conn Conf Test Fault
---- - ------------ ----------- ---- ---- ---- ------- --------
00-0 00(00) Assigned y y y Passed Normal
01-0 * 01(00) Assigned y y y Passed Normal
XSCF> addboard -y -c configure -p 1 01-0
PSB#01-0 will be configured into PPAR-ID 1. Continue?[y|n] :y
An internal error has occurred. Please contact your system administrator.
XSCF> showboards -av
PSB R PPAR-ID(LSB) Assignment Pwr Conn Conf Test Fault
---- - ------------ ----------- ---- ---- ---- ------- --------
00-0 00(00) Assigned y y y Passed Normal
01-0 01(00) Assigned y y y Passed Normal
Workaround Confirm the PSB to be specified when executing the addboard -c configure command. Also, ignore this error message since it has no effect on system operation.
RTI No. RTIF2-170224-014
Model SPARC M12-1, SPARC M12-2, SPARC M12-2S
Description If the console command is executed while a POST diagnosis is in progress on a PPAR consisting of one system board (PSB), the console screen may not be displayed.
Workaround For a building block configuration, restart PPAR power-on, switch the master/standby XSCF, or reboot the master XSCF.
For the SPARC M12-1, the SPARC M12-2, or the 1BB configuration with the SPARC M12-2S, power off the PPAR and then power it on again.
RTI No. RTIF2-170224-015
Model SPARC M12-1, SPARC M12-2, SPARC M12-2S
Description If data is transmitted via SSH by the snapshot(8) -t command, this transmission may be delayed by about 10 to 30 minutes in comparison with transfer using USB devices and XSCF Web.
Workaround There is no effective workaround.
Even if transfer is delayed, there is no problem with the collected data.
RTI No. RTIF2-170224-016
Model SPARC M12-2S
Description While in the process of adding or removing a system board (PSB) using the addboard(8) or deleteboard(8) command of the DR feature, if another physical partition is restarted due to hardware failure or the poweroff(8)/poweron(8)/reset(8) command is executed on it, the executed addboard(8) or deleteboard(8) command may detect timeout and terminate abnormally.
Workaround Do not execute the poweroff(8)/poweron(8)/reset(8) command while the addboard(8) or deleteboard(8) command is being executed. There is no effective workaround if any hardware failure occurs while executing DR.
[How to restore]
Check the status of the system board (PSB) using the showboards(8) command. Execute the addboard(8) or deleteboard(8) command after that.
RTI No. RTIF2-170224-017
Model SPARC M12-2S
Description While executing the poweroff(8)/poweron(8)/reset(8) command on a physical partition, if the addboard(8) or the deleteboard(8) command of the DR feature is executed on another physical partition to add or remove a system board (PSB), the addboard(8) or deleteboard(8) command may detect timeout and terminate abnormally.
Workaround Do not execute the addboard(8) or deleteboard(8) command while the poweroff(8)/poweron(8)/reset(8) command is being executed elsewhere. There is no effective workaround if DR is executed while also executing power supply operations on another physical partition.
[How to restore]
Perform the following procedure.
1. Execute the showboards(8) command.

2. Check the Pwr/Conn/Conf/Test status of the system board (PSB) to confirm the end of power operations as follows:

- Power-on/Reset completed:
The Pwr/Conn/Conf/Test status is "y y y Passed" respectively.

- Power-off completed:
The Pwr/Conn/Conf status is "n n n" respectively.

3. Re-execute the addboard(8) or the deleteboard(8) command.
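The status check in step 2 can be sketched as follows. This is a minimal Python illustration that assumes a single data row of showboards output is available as text and that the last five columns are Pwr, Conn, Conf, Test, and Fault, as in the examples in this table. The function name power_operation_state and the "not complete" label are hypothetical.

```python
def power_operation_state(row: str) -> str:
    """Classify one showboards data row by its Pwr/Conn/Conf/Test columns."""
    tokens = row.split()
    # Read the columns from the right so that the optional "*" in the R column is ignored.
    pwr, conn, conf, test, _fault = tokens[-5:]
    if (pwr, conn, conf) == ("y", "y", "y") and test == "Passed":
        return "power-on/reset completed"
    if (pwr, conn, conf) == ("n", "n", "n"):
        return "power-off completed"
    # Assumption: any other combination means the power operation has not finished.
    return "not complete"

print(power_operation_state("01-0 * 01(00) Assigned y y y Passed Normal"))
print(power_operation_state("01-0 00(01) Assigned n n n Passed Faulted"))
```

Only when the row reports one of the two completed states would the addboard(8) or deleteboard(8) command be re-executed, as in step 3.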
RTI No. RTIF2-170224-018
Model SPARC M12-2S
Description When the replacefru(8) or addfru(8) command is executed for the addition of a SPARC M12-2S or crossbar box, the following message is output and the addition may fail.

For replacefru(8):
[Warning:036]
Failed to find BB#x.
The BB-ID setting and/or the cable connections of the BB#x will be wrong.
Please confirm the BB-ID setting and the cable connections.
Do you want to try to replace BB#x again?
[r:replace|c:cancel] :

For addfru(8):
[Warning:036]
Failed to find BB#x.
The BB-ID setting and/or the cable connections of the BB#x will be wrong.
Please confirm the BB-ID setting and the cable connections.
Do you want to try to add BB#x again?
[a:add|c:cancel] :
Workaround After you execute the replacefru(8) or addfru(8) command and the following maintenance menu message appears, turn on the input power to the SPARC M12-2S or crossbar box being added. Then, wait 20 minutes before performing the next operation (step 4 for replacefru(8), or step 2 for addfru(8)).

For replacefru(8):
Please execute the following steps:
1) Remove (Delete) the BB#x from a system.
2) Turn off the breaker of the BB#x.
3) After the exchanged device is connected with the system, turn on the breaker of the BB#x.
4) Please select[f:finish] :

For addfru(8):
Please execute the following steps:
1) After the added device is connected with the system, please turn on the breaker of the BB#x.
2) Please select[f:finish] :

[How to restore]
For replacefru(8):
Enter "r" in response to the "[r:replace|c:cancel] :" message, and re-execute the replacefru(8) command.
For addfru(8):
Enter "a" in response to the "[a:add|c:cancel] :" message, and re-execute the addfru(8) command.
RTI No. RTIF2-170224-019
Model SPARC M12-1, SPARC M12-2, SPARC M12-2S
Description While the XSCF is starting after the power is turned on, a watchdog timeout may occur and the XSCF may be rebooted. After this reboot is completed, the configuration information of the components mounted on the system cannot be checked with the showhardconf(8) command.
Moreover, error logs regarding the following configurations may be registered.
Msg: Indispensable parts are not installed (PSU).
Msg: Indispensable parts are not installed (FAN).
Msg: Indispensable parts are not installed (OPNL).
Msg: PSU shortage
Msg: FAN shortage
Workaround There is no effective workaround.
[How to restore]
Re-execute power off and on.
RTI No. RTIF2-170224-020
Model SPARC M12-1, SPARC M12-2, SPARC M12-2S
Description In the OID information of scfComponentStatusEvent in the definition file of XSCF extended MIB, the path information of the suspected components may be denoted as "unspecified" in the trap notification.
This symptom occurs when the FaultEventCode information of the OID is any of the following:
05018113
05018123
05018133
05018211
05018221
05018231
Workaround There is no effective workaround. Execute the showlogs error command to confirm the suspected location.
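A trap receiver could flag the affected notifications with a simple lookup. The following Python sketch assumes that the FaultEventCode value has already been extracted from the trap as a string. The function name path_may_be_unspecified is hypothetical.

```python
# FaultEventCode values listed in the description above.
AFFECTED_FAULT_EVENT_CODES = frozenset({
    "05018113", "05018123", "05018133",
    "05018211", "05018221", "05018231",
})

def path_may_be_unspecified(fault_event_code: str) -> bool:
    """Return True if the suspected-component path in the trap may read "unspecified"."""
    return fault_event_code.strip() in AFFECTED_FAULT_EVENT_CODES

print(path_may_be_unspecified("05018113"))  # True
```

When the check returns True, the suspected location would be confirmed with the showlogs error command, as the workaround states.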
RTI No. RTIF2-170224-021
Model SPARC M12-1, SPARC M12-2, SPARC M12-2S
Description At the XSCF reboot time, the error message "snmpd[XXXXX] svrSP: error doAction ACTION_CONTROL_LED" regarding SNMP may appear on the XSCF serial terminal.
Workaround There is no effective workaround.
Ignore this message.
RTI No. RTIF2-170224-022
Model SPARC M12-2S
Description At the time of replacement or cold replacement of the XSCFU or cold addition of the SPARC M12-2S, if the following conditions are met, the maintenance or addition may fail and "XCP firmware version synchronization failed" may be registered in the event log.
- Multiple XSCFUs or SPARC M12 units are cold replaced or cold added at one time.

- The XCP version of a replacement component does not match that of the master XSCF.
Workaround For cold replacement or cold addition of two or more XSCFUs or SPARC M12 units, execute the replacefru(8) or addfru(8) command, and perform the operations one by one.
[How to restore]
Execute any of the following procedures.
- Procedure 1

1. Turn off the input power to the system and then turn it on again (AC OFF/ON).

2. Execute the flashupdate(8) command, specifying the XCP version.
XSCF> flashupdate -c update -m xcp -s xxxx -f
xxxx is the XCP version of the master XSCF.

- Procedure 2

Execute the replacefru(8) command to perform a pseudo replacement of the XSCFU or SPARC M12-2S that failed to be cold replaced.
RTI No. RTIF2-170224-023
Model SPARC M12-1, SPARC M12-2, SPARC M12-2S
Description If you log in to the XSCF with an XSCF user account whose privileges are managed by an LDAP server specified with the setldap(8) command, command execution in the XSCF shell or operations on XSCF Web may take a while.
Workaround There is no effective workaround for an LDAP server specified with the setldap(8) command.
Specify the LDAP server with the setldapssl(8) command instead.
RTI No. RTIF2-170224-024
Model SPARC M12-1, SPARC M12-2, SPARC M12-2S
Description With SNMPv3, if the trap host name registered with the setsnmp(8) command contains a colon (:), the showsnmp(8) command does not display it correctly.
[Example]
If the trap host name is "test:example.com", the host name is shown as "test" and the port number is shown as "0".
Workaround With SNMPv3, do not use the setsnmp(8) command to register a trap host name that contains a colon (:).
If such a trap host name has already been registered, use the following command to remove the trap host name:

setsnmp remv3traphost -u 'username' -p 'port_number' trap_host_name

In such a case, be sure to specify a port number. If the port number is not specified when removing a trap host name that includes a colon (:), the "Entry does not exist" message is displayed and the trap host name is not removed. The port number specified at removal must be the one specified at registration, not the incorrect one displayed by the showsnmp(8) command.
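The two rules above (avoid colons when registering, and always give the registered port when removing) can be sketched as helper functions. This Python illustration only builds the command text in the form shown above and does not execute anything on the XSCF. The function names, the user name jsmith, and port 162 are hypothetical example values.

```python
def has_colon(trap_host_name: str) -> bool:
    """Return True if the name would trigger the display problem described above."""
    return ":" in trap_host_name

def removal_command(username: str, registered_port: int, trap_host_name: str) -> str:
    """Build the removal command text; the port given at registration is mandatory."""
    if not 1 <= registered_port <= 65535:
        raise ValueError("a valid port number (the one used at registration) is required")
    return (
        f"setsnmp remv3traphost -u '{username}' "
        f"-p '{registered_port}' {trap_host_name}"
    )

host = "test:example.com"
if has_colon(host):
    # Do not rely on the port "0" reported by showsnmp(8) for such an entry.
    print(removal_command("jsmith", 162, host))
```

Making the port a required argument mirrors the note that removal fails with "Entry does not exist" when the port is omitted.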
RTI No. RTIF2-170224-025
Model SPARC M12-2S
Description Suppose that a SPARC M12-2S (system board: PSB) has been degraded due to a failure in a physical partition (PPAR) consisting of several SPARC M12-2S units. After that, if the PPAR is not powered off before the setpciboxdio(8) command is executed to disable/enable the direct I/O function of the PCI card mounted in the PCI expansion unit connected to the degraded chassis, the following message is output and the command fails.
This operation cannot be done because the PPAR including a PSB of the target BB is powered on.

This symptom occurs when the state of the PSB is like the following, which can be derived from the executed showhardconf(8) command or showboards(8) command.
[Example] PSB#01-0 (BB#01) has been degraded.
XSCF> showhardconf
...
* BB#01 Status:Deconfigured;
...
XSCF> showboards -a
PSB R PPAR-ID(LSB) Assignment Pwr Conn Conf Test Fault
---- - ------------ ----------- ---- ---- ---- ------- --------
01-0 00(01) Assigned n n n Passed Faulted
Workaround Use the replacefru(8) command to perform maintenance on the chassis where the degradation occurred. Then, make settings.
RTI No. RTIF2-170224-026
Model SPARC M12-1, SPARC M12-2, SPARC M12-2S
Description If the poweroff(8) command is executed and the master chassis XSCF is rebooted before the prompt is returned, subsequent power-on/power-off operations cannot be executed.
Workaround There is no effective workaround.
If this phenomenon occurs, turn off the input power to all chassis and then turn it on again.
RTI No. RTIF2-170224-027
Model SPARC M12-2S
Description With the system input power turned off, if the CPU memory unit lower (CMUL) is replaced or the SPARC M12-2S is added without using the maintenance menu, the following error log may be registered when automatic synchronization of XCP firmware versions is enabled.
Alarm: :SCF:Gaps between XBBOX-ID
Or
Information: :SCF:Gaps between BB-ID
Workaround There is no effective workaround.
Ignore this error log entry.
RTI No. RTIF2-170224-028
Model SPARC M12-2S
Description After the input power is turned on with the XSCF DUAL control cable disconnected or faulty, data between the master and standby XSCFs is not synchronized even if the XSCF DUAL control cable is restored.
System operation can continue. However, after master/standby XSCF switching, normal system operation is not guaranteed. This is because information in the old master XSCF is not reflected in the new master XSCF.
You can check, with the following error logs, whether the XSCF DUAL control cable is disconnected or faulty:
- The XSCF DUAL control cable is disconnected:
Msg: BB control cable detected unexpected

- The XSCF DUAL control cable is faulty:
Msg: Cannot communicate with the other XSCF
Workaround Before turning on the input power, confirm that the XSCF DUAL control cable is correctly inserted.
Also, use the showlogs error command to confirm that the error logs shown in [Description] are not registered.
[How to restore]
If the XSCF DUAL control cable is disconnected, make sure that it is properly connected. Then, execute the rebootxscf -a command to reboot all XSCFs.
If the XSCF DUAL control cable is faulty, replace the cable.
RTI No. RTIF2-170224-029
Model SPARC M12-2S
Description If the input power to the standby or slave chassis is turned off, a "Board control error (MBC link error)" error log may be registered.
Workaround There is no effective workaround.
Ignore this error log entry.
RTI No. RTIF2-170224-032
Model SPARC M12-1, SPARC M12-2, SPARC M12-2S
Description When the OS panics, a large volume of panic messages may be sent to the XSCF. In this case, the XSCF cannot handle the large volume of panic messages. As a result, the codd process fails and OS-panic error logs are registered in large quantities as shown below.
[Example] OS panic and process failure error logs
XSCF> showlogs error -v
Date: Dec 20 14:44:26 JST 2013
Code: 40000000-00ffff0000ff0000ff-01b900060000000000000000
Status: Warning Occurred: Dec 20 14:44:26.513 JST 2013
FRU: /UNSPECIFIED
Msg: XSCF command: System status change (OS panic) (PPARID#00, path: 00)
Diagnostic Code:
00000000 00000000 0000
00000000 00000000 0000
00000000 00000000 0000
00000000 00000000 00000000 00000000
00000000 00000000 0000
Date: Dec 20 15:00:01 JST 2013
Code: 20000000-00fcff00b0000000ff-010400010000000000000000
Status: Notice Occurred: Dec 20 14:59:56.838 JST 2013
FRU: /FIRMWARE,/XBBOX#81/XSCFU
Msg: SCF process down detected
Diagnostic Code:
00000000 00000000 0000
51000000 00000000 0000
00000000 00000000 0000
636f6464 2e323537 382e627a 32000000
00000000 00000000 0000
You can check codd by confirming that the first four bytes on the fourth line of the [Diagnostic Code:] have the value 636f6464.
Workaround There is no effective workaround.
[How to restore]
The system is restored when the XSCF reboots as a result of the codd process failure.
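The codd check on the [Diagnostic Code:] field can be sketched as follows. This Python illustration assumes that the five lines of the field have been copied from the showlogs error -v output as hexadecimal text, as in the example above. The function names are hypothetical.

```python
def diag_line_text(line: str) -> str:
    """Decode one Diagnostic Code line from hex to ASCII, dropping trailing NUL bytes."""
    raw = bytes.fromhex(line.replace(" ", ""))
    return raw.rstrip(b"\x00").decode("ascii", errors="replace")

def is_codd_failure(diag_lines: list[str]) -> bool:
    """Return True if the first four bytes of the fourth line are 636f6464 ("codd")."""
    return diag_lines[3].split()[0].lower() == "636f6464"

# Diagnostic Code lines from the "SCF process down detected" entry in the example.
DIAG = [
    "00000000 00000000 0000",
    "51000000 00000000 0000",
    "00000000 00000000 0000",
    "636f6464 2e323537 382e627a 32000000",
    "00000000 00000000 0000",
]

print(is_codd_failure(DIAG))    # True
print(diag_line_text(DIAG[3]))  # codd.2578.bz2
```

The value 636f6464 is simply the ASCII encoding of the string "codd", which is why it identifies the failed process.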
RTI No. RTIF2-170224-033
Model SPARC M12-1, SPARC M12-2, SPARC M12-2S
Description Suppose that a cluster system is built to meet the following conditions: it consists of multiple SPARC M12/M10 system chassis, each of which contains 10 or more guest domains (10 or more cluster nodes) running in one physical partition (PPAR). Moreover, PRIMECLUSTER software is installed on each of these guest domains. Alternatively, the cluster system consists of multiple PPARs inside the SPARC M12/M10 system chassis. Then, if the poweroff -f command is executed on one PPAR to forcibly power off that PPAR, the XSCF may slow down, panic, and then reboot.
Workaround Confirm that fewer than 10 cluster nodes are configured per PPAR in the SPARC M12/M10 system.
[How to restore]
After an XSCF panic reboot, the poweroff command continues being processed, so the system can be used as is.
RTI No. RTIF2-170224-034
Model SPARC M12-1, SPARC M12-2, SPARC M12-2S
Description With operations performed in the following order, the error message "An internal error has occurred. Please contact your system administrator." appears when the prtfru(8) command is executed. The command abnormally ends.
1. Turn on the input power and execute the rebootxscf(8) or switchscf(8) command to start or reboot the XSCF.

2. Execute the snapshot(8) command.

3. Execute the prtfru(8) command.
Workaround After the XSCF is started or rebooted, execute the prtfru(8) command before executing the snapshot(8) command.
[How to restore]
Reboot all the XSCFs by executing the rebootxscf(8) command.
RTI No. RTIF2-170224-036
Model SPARC M12-1, SPARC M12-2, SPARC M12-2S
Description If the guest domain has been kept active for a long time, powering off and then powering on a physical partition (PPAR) may cause the guest domain time to shift.
This phenomenon occurs under the following conditions.
- A guest domain is configured (*1), and

- a long period of time passes after the ldm add-spconfig command is executed from Oracle VM Server for SPARC (*2), and

- a physical partition power is turned on or reset.

*1 Time deviation does not occur on the control domain.

*2 Time deviation comes to about 20 seconds per month.
Workaround Immediately before powering off or resetting a physical partition, execute the ldm add-spconfig command from Oracle VM Server for SPARC to store the latest guest domain configuration information in the XSCF.
[How to restore]
If the guest domain time shifts, boot Oracle Solaris in single user mode, and then synchronize the time.
[Example] Setting of 18:30:00 on June 27, 2014
# date 0627183014.00
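The argument in the example follows the pattern month, day, hour, minute, two-digit year, then a period and the seconds. As a small illustration, the following Python sketch builds that string from a timestamp. The function name date_argument is hypothetical.

```python
from datetime import datetime

def date_argument(t: datetime) -> str:
    """Format a timestamp as mmddHHMMyy.SS, the layout used in the example above."""
    return t.strftime("%m%d%H%M%y.%S")

# 18:30:00 on June 27, 2014, as in the example.
print(date_argument(datetime(2014, 6, 27, 18, 30, 0)))  # 0627183014.00
```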
RTI No. RTIF2-170224-037
Model SPARC M12-1, SPARC M12-2, SPARC M12-2S
Description Suppose that you execute any of the following on a system with a PCI expansion unit connected when the power to the physical partition is on. In such a case, an SNMP Trap related to the addition of the PCI expansion unit or the link card is sent by mistake.
- Reboot the XSCF

- Switch the master/standby XSCF

- Change the SNMP agent from the disabled state to the enabled state

- Set the SNMP agent management information when the SNMP agent is enabled

In such a case, the following SNMP Traps are sent.
- PCI expansion unit addition

scfPciBoxEvent
scfTrapEventType=add(10)
- Link card addition

scfComponentEvent
scfTrapEventType=add(10)
Similarly, the following SNMP Trap on the PCIe card addition is sent by mistake in a system with a PCIe card connected.
scfComponentEvent
scfTrapEventType=add(10)
Workaround There is no effective workaround.
This incorrect SNMP Trap sending does not affect the behavior of the PCI expansion unit or the PCIe card.
RTI No. RTIF2-170224-038
Model SPARC M12-2S
Description For a successful firmware update, "Event: SCF:XCP update has been completed" is registered in the log at the XCP firmware update time. However, the firmware may not actually have been updated on some SPARC M12 units or crossbar boxes.
Workaround There is no effective workaround. If any of the following conditions is true, update the XCP firmware again.
- Condition 1:

"Updating XCP:XSCF updated (BBID=x, bank=y)" is not logged twice for each chassis, between the logging of "SCF:XCP update is started (XCP version=xxxx:last version=yyyy)" and "SCF:XCP update has been completed (XCP version=xxxx:last version=yyyy)".
- Condition 2:

A log indicating an error in a connected chassis is registered between the logging of "SCF:XCP update is started (XCP version=xxxx:last version=yyyy)" and "SCF:XCP update has been completed (XCP version=xxxx:last version=yyyy)".
[Example 1]
XSCF> showlogs monitor -r
Alarm: /XBBOX#81/XSCFU:SCF:XSCF hang-up is detected
[Example 2]
XSCF> showlogs monitor -r
Notice: /FIRMWARE,/BB#0/CMUL:SCF:SCF panic detected
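Condition 1 can be checked mechanically once the relevant log lines have been collected. The following Python sketch assumes that the log is available as a list of text lines containing the messages quoted above and that the BB-IDs of all connected chassis are known. The function name chassis_needing_update and the sample lines are hypothetical. Condition 2 (error entries logged during the update) is not covered by this sketch.

```python
import re
from collections import Counter

def chassis_needing_update(log_lines: list[str], expected_bbids: list[int]) -> list[int]:
    """Return BB-IDs without exactly two "XSCF updated" entries in the last update window."""
    counts: Counter = Counter()
    in_update = False
    for line in log_lines:
        if "XCP update is started" in line:
            in_update = True
            counts.clear()  # only the most recent update is evaluated
        elif "XCP update has been completed" in line:
            in_update = False
        elif in_update:
            m = re.search(r"XSCF updated \(BBID=(\d+), bank=\d+\)", line)
            if m:
                counts[int(m.group(1))] += 1
    return sorted(b for b in expected_bbids if counts[b] != 2)

# Illustrative lines only: BB-ID 1 is logged once instead of twice.
LOG = [
    "SCF:XCP update is started (XCP version=3030:last version=3021)",
    "Updating XCP:XSCF updated (BBID=0, bank=0)",
    "Updating XCP:XSCF updated (BBID=0, bank=1)",
    "Updating XCP:XSCF updated (BBID=1, bank=0)",
    "SCF:XCP update has been completed (XCP version=3030:last version=3021)",
]

print(chassis_needing_update(LOG, [0, 1]))  # [1]
```

A non-empty result suggests that Condition 1 is true and that the XCP firmware should be updated again.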
RTI No. RTIF2-170224-039
Model SPARC M12-1, SPARC M12-2, SPARC M12-2S
Description If available CPU resources in a physical partition (PPAR) have not been assigned when a CPU Activation Interim Permit expires, powering on the PPAR causes the PPAR to be reset repeatedly instead of the power-on processing being suppressed.
At this time, the following event log is registered repeatedly.
SCF:PPAR-ID x: Reset
SCF:SP-Config falling back to factory-default (PPARID 0 factor:0x1010000)
SCF:PPAR-ID x: Reset released
Workaround After a CPU Activation Interim Permit expires, execute the setinterimpermit disable command to disable the CPU Activation Interim Permit.
To power on a PPAR, assign available CPU core resources in the PPAR.
[How to restore]
Perform the following procedure.
1. Execute the poweroff -f command to forcibly power off the PPAR that is being reset repeatedly.

2. Execute the poweroff command (without -f) to power off all PPARs other than that described above.

3. Turn off/on (AC OFF/ON) the input power to every SPARC M12.

4. Execute the setinterimpermit disable command to disable the CPU Activation Interim Permit.
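The repeated event log sequence described above can be counted in saved log text to confirm that a PPAR is caught in this reset loop. The following is a minimal Python sketch under the assumption that each event occupies one line and ends with the message text shown in [Description]. The function name is hypothetical, and how many cycles count as "repeatedly" is left to the reader.

```python
import re

RESET = re.compile(r"SCF:PPAR-ID (\d+): Reset$")
FALLBACK = "SCF:SP-Config falling back to factory-default"
RELEASED = re.compile(r"SCF:PPAR-ID (\d+): Reset released$")


def count_reset_cycles(log_lines):
    """Count Reset -> factory-default fallback -> Reset released cycles per PPAR-ID."""
    cycles = {}
    pending = None        # PPAR-ID whose "Reset" line was seen most recently
    saw_fallback = False
    for raw in log_lines:
        line = raw.strip()
        match = RESET.search(line)
        if match:
            pending, saw_fallback = int(match.group(1)), False
            continue
        if pending is not None and FALLBACK in line:
            saw_fallback = True
            continue
        match = RELEASED.search(line)
        if match and saw_fallback and pending == int(match.group(1)):
            cycles[pending] = cycles.get(pending, 0) + 1
            pending, saw_fallback = None, False
    return cycles
```

A PPAR-ID with a count of two or more is a candidate for the restore procedure above.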
RTI No. RTIF2-170224-040
Model SPARC M12-2S
Description If the master XSCF hangs during replacement of the XSCF unit (XSCFU) in the SPARC M12, the following error may be wrongly detected when the input power to the system is turned off/on (AC OFF/ON) to restore the master XSCF.

Date: Jan 16 01:39:17 JST 2017
Code: 40002000-0075210000ff0000ff-019112200000000000000000
Status: Warning Occurred: Jan 16 01:39:13.403 JST 2017
FRU: /BB#0/CMUU
Msg: Insufficient PUMP rotation speed
Workaround There is no effective workaround.
Turn off/on (AC OFF/ON) the input power to the system again.
RTI No. RTIF2-170224-041
Model SPARC M12-2S
Description If you execute a command listed in [Command list] during XSCF master/standby switching, the following problems occur.
- The XSCF that executes the command stops due to "BOARD ERROR."

- You cannot power on the physical partition containing the PSB (BB) of the stopped XSCF described above.

You can confirm that XSCF master/standby switching is completed by executing the showhardconf command to check for "Normal" under [Status] of XBBOX or BB.
[Command list]
restoreconfig(8)
rebootxscf(8)
flashupdate(8)
setdate(8)
sethsmode(8)
Workaround Do not execute any of the commands listed in [Command list] in [Description] during XSCF master/standby switching.
[How to restore]
Recover the system by performing the following procedure.
1. Shut down Oracle Solaris on all logical domains.

2. Execute the poweroff -f command to forcibly power off all PPARs.

3. Turn off (AC OFF) the input power to every SPARC M12.

4. Turn on (AC ON) the input power to every SPARC M12.
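Operation scripts that issue XSCF commands can apply the workaround with a simple guard. The following is a minimal Python sketch. The function name and the `switching_in_progress` flag are hypothetical, and determining that flag (for example, from the showhardconf check described above) is left to the caller.

```python
# Commands from [Command list] that must not run during master/standby switching.
BLOCKED_DURING_SWITCHING = frozenset(
    {"restoreconfig", "rebootxscf", "flashupdate", "setdate", "sethsmode"}
)


def is_safe_during_switching(command_line, switching_in_progress):
    """Return False if the command is on the list and switching is in progress."""
    tokens = command_line.split()
    if not tokens:
        return True
    return not (switching_in_progress and tokens[0] in BLOCKED_DURING_SWITCHING)
```

Only the command name (the first word) is compared, so any options on the command line do not affect the result.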
RTI No. RTIF2-170224-042
Model SPARC M12-2S
Description Within 30 minutes after using the replacefru command to replace an XSCF unit (XSCFU), if you execute the replacefru command again for an XSCFU inside the same physical partition (PPAR), the replacement attempt fails.
At this time, the "Warning:055" error appears.
[Example] Error message of the replacefru command
[Warning:055]
BB#7/XSCFU cannot be Replacement.
Because the PPAR is a possibility that the control domain is stopped
for CoD resource violation.
Workaround When replacing multiple XSCFUs inside the same PPAR, wait 30 minutes before replacing the next one.
[How to restore]
After the "Warning:055" error appears, wait 30 minutes, and then replace an XSCFU again.
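The 30-minute interval is simple to track in a maintenance script. The following is a minimal Python sketch that assumes the interval is measured from the time the previous replacefru operation finished, which this document does not state explicitly. The function names are hypothetical.

```python
from datetime import datetime, timedelta

WAIT = timedelta(minutes=30)  # interval given in the workaround


def earliest_next_replacement(previous_replacefru_time):
    """Return the earliest time the next XSCFU replacement should start."""
    return previous_replacefru_time + WAIT


def remaining_wait(previous_replacefru_time, now):
    """Return how much longer to wait, or zero if the interval has passed."""
    remaining = earliest_next_replacement(previous_replacefru_time) - now
    return max(remaining, timedelta(0))
```

The same calculation applies after a "Warning:055" error, with the time of the error as the starting point.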
RTI No. RTIF2-170224-044
Model SPARC M12-2S
Description If a panic, process down, or similar event occurs on the master XSCF during XSCF unit (XSCFU) replacement using the replacefru command, the master XSCF is rebooted or switched. At this point, the XSCFU replacement work has not been completed. Executing a command listed in [Command list] in this state causes either of the following:
- The command fails with an error.

- After master/standby XSCF switching, the master XSCF does not reflect the command setting information.

[Command list]
addboard(8)
addfru(8)
addpowerschedule(8)
clearremotepwrmgmt(8)
deleteboard(8)
deletepowerschedule(8)
diagxbu(8)
flashupdate(8)
initbb(8)
ioxadm(8)
poweroff(8)
poweron(8)
rebootxscf(8)
reset(8)
restoreconfig(8)
setcod(8)
setdate(8)
setpowerschedule(8)
setpparmode(8)
setremotepwrmgmt(8)
setupfru(8)
testsb(8)
setinterimpermit(8)
sethsmode(8)
Workaround After the XSCF is rebooted, execute the replacefru command to complete the XSCFU replacement work.
[How to restore]
Recover the system by performing the following procedure.
1. Shut down Oracle Solaris on all logical domains.

2. Execute the poweroff -f command to forcibly power off all PPARs.

3. Turn off (AC OFF) the input power to every SPARC M12.

4. Replace the XSCFU with a FRU.

5. Turn on (AC ON) the input power to every SPARC M12.
RTI No. RTIF2-170224-045
Model SPARC M12-2S
Description Suppose that active replacement using the replacefru command is in progress for the XSCF unit (XSCFU). If a failure occurs during this time in the PPAR containing the SPARC M12 (PSB) where this XSCFU is mounted, the PPAR is restarted. The PSB with the XSCFU being replaced is then left powered on, and only this PSB is disconnected from the PPAR.
After the PPAR is restarted, you can confirm the occurrence of this phenomenon by executing the showboards(8) command. Check for the display of "y" under Pwr, "n" under Conn, and "n" under Conf.
[Example] PSB#03-0 is powered on but disconnected from the PPAR configuration, which is an abnormal state

XSCF> showboards -av
PSB  R PPAR-ID(LSB) Assignment  Pwr  Conn Conf Test    Fault
---- - ------------ ----------- ---- ---- ---- ------- --------
00-0   00(00)       Assigned    y    y    y    Passed  Normal
01-0   00(01)       Assigned    y    y    y    Passed  Normal
02-0   00(02)       Assigned    y    y    y    Passed  Normal
03-0   00(03)       Assigned    y    n    n    Passed  Normal
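The check for "y" under Pwr, "n" under Conn, and "n" under Conf can be automated on captured showboards output. The following is a minimal Python sketch. It assumes that the Test and Fault columns each contain a single word, as in the example above, so that Pwr, Conn, and Conf are always the fifth-, fourth-, and third-from-last fields. The function name is hypothetical.

```python
import re


def find_orphaned_psbs(showboards_output):
    """Return PSB numbers shown with Pwr=y, Conn=n, Conf=n."""
    orphaned = []
    for line in showboards_output.splitlines():
        tokens = line.split()
        # Data rows start with a PSB number such as "03-0"; skip headers and rules.
        if len(tokens) < 6 or not re.fullmatch(r"\d{2}-\d", tokens[0]):
            continue
        # Count from the right so that an empty R column does not shift the fields.
        pwr, conn, conf = tokens[-5:-2]
        if (pwr, conn, conf) == ("y", "n", "n"):
            orphaned.append(tokens[0])
    return orphaned
```

For the example output above, the function returns only PSB 03-0.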
Workaround There is no effective workaround.
[How to restore]
Recover the system by performing the following procedure.
1. Shut down Oracle Solaris on all logical domains.

2. Execute the poweroff -f command to forcibly power off all PPARs.

3. Turn off (AC OFF) the input power to every SPARC M12.

4. Replace the XSCFU that was undergoing active replacement.

5. Turn on (AC ON) the input power to every SPARC M12.
RTI No. RTIF2-170224-046
Model SPARC M12-2S
Description Suppose that you execute the deleteboard command to disconnect the system board (PSB) from the operating physical partition (PPAR). If a failure occurs in the PSB while it is being disconnected, the deleteboard command ends normally without a Fatal reboot of the operating PPAR.
In fact, the PSB was not disconnected, so if system operation continues as is, an error may occur in the operating PPAR.
You can confirm the occurrence of this phenomenon by checking the event log: after "BB-ID n: Reset" appears, look for an error followed by "Reset retry."
[Example] The "Abnormal reaction of CPU" error has occurred in the PSB being disconnected.
XSCF> showlogs monitor
Dec 9 16:36:38 M12-2 Event: SCF:PPAR-ID 0: PSB#03-0 deleteboard started
Dec 9 16:36:38 M12-2 Event: SCF:PPAR-ID 0: PSB#03-0 is disconnected (deleteboard)
Dec 9 16:36:41 M12-2 Event: SCF:PPARID 0 GID 00000001 state change (Solaris suspended)
Dec 9 16:36:41 M12-2 Event: SCF:PPARID 0 GID 00000002 state change (Solaris suspended)
Dec 9 16:36:42 M12-2 Event: SCF:PPARID 0 GID 00000003 state change (Solaris suspended)
Dec 9 16:36:43 M12-2 Event: SCF:PPARID 0 GID 00000001 state change (Solaris running)
Dec 9 16:36:44 M12-2 Event: SCF:PPARID 0 GID 00000002 state change (Solaris running)
Dec 9 16:36:44 M12-2 Event: SCF:PPARID 0 GID 00000003 state change (Solaris running)
Dec 9 16:37:16 M12-2 Event: SCF:BB-ID 3: Reset
Dec 9 16:38:40 M12-2 Warning: /BB#3/CMUU:SCF:Abnormal reaction of CPU (compare)
Dec 9 16:38:47 M12-2 Warning: /BB#3/CMUL:SCF:Abnormal reaction of CPU (compare)
Dec 9 16:38:48 M12-2 Event: SCF:Reset retry
Dec 9 16:39:57 M12-2 Event: SCF:PPAR-ID 0: PSB#03-0 deleteboard completed
Workaround There is no effective workaround.
After resetting the PPAR with the reset -p x por command or after powering off the PPAR with the poweroff command, power on the PPAR with the poweron command to restore the system.
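The log pattern described in [Description] can be searched for in saved showlogs monitor output. The following is a minimal Python sketch. It treats lines marked "Warning:" or "Alarm:" as the error occurrence, which is an assumption based on the example above, and the function name is hypothetical.

```python
import re

BB_RESET = re.compile(r"SCF:BB-ID (\d+): Reset")
# Assumption: these severity labels mark the "error occurrence" in the log.
ERROR_LEVEL = re.compile(r"\b(Warning|Alarm):")


def deleteboard_failure_suspected(log_lines):
    """Return True if an error and "Reset retry" both follow a "BB-ID n: Reset" line."""
    seen_reset = saw_error = saw_retry = False
    for line in log_lines:
        if BB_RESET.search(line):
            # Start looking again from the most recent BB reset.
            seen_reset, saw_error, saw_retry = True, False, False
        elif seen_reset:
            saw_error = saw_error or bool(ERROR_LEVEL.search(line))
            saw_retry = saw_retry or "SCF:Reset retry" in line
            if saw_error and saw_retry:
                return True
    return False
```

A True result means that the PPAR should be restored as described above, even though the deleteboard command reported normal completion.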
RTI No. RTIF2-170224-047
Model SPARC M12-2S
Description If BB#01 is the master XSCF, the slave chassis may not be recognized when the input power is turned on (AC ON) under any of the following conditions:
- The XSCF BB control cable between BB#00 and the slave XSCF is disconnected.

- The XSCF BB control cable between BB#00 and the slave XSCF is faulty.
Workaround Turn off (AC OFF) the input power to the system, and either confirm that the XSCF BB control cable is connected or replace this cable. Then, turn on (AC ON) the input power to the system.
RTI No. RTIF2-170224-049
Model SPARC M12-2S
Description Suppose that a physical partition (PPAR) consisting of multiple SPARC M12-2S (BB) units is operating. If one of the BBs loses power during this time, the PPAR has to be reset in order for the PPAR to operate continuously with the other BBs that still have power.
If power to that BB is restored while the PPAR is being reset, the PPAR reset may be interrupted and the PPAR powered off.
At this time, the following error log is registered.
Date: Oct 03 13:19:55 JST 2016
Code: 40000000-00fcff0000ff0000ff-0192ffff0000000000000000
Status: Warning Occurred: Oct 03 13:19:50.293 JST 2016
FRU: /FIRMWARE
Msg: LSI control error (SP internal)
Workaround There is no effective workaround.
[How to restore]
Execute the poweron command to power on the PPAR.
RTI No. RTIF2-170224-050
Model SPARC M12-2S
Description Suppose that XSCF unit (XSCFU) replacement using the replacefru(8) command has failed. If you leave the failed replacement as is and try to replace another XSCFU, a later retry of the previously failed XSCFU replacement also fails.
Workaround If XSCFU replacement using the replacefru(8) command fails, retry the replacement of the same XSCFU until it succeeds.
Until then, do not replace any other XSCFU.