IBM Power Systems Scale-out LC Server Firmware

Applies to:  LC921(9006-12P) and LC922(9006-22P)

This document provides information about the installation of Licensed Machine or Licensed Internal Code, which is sometimes referred to generically as microcode or firmware.

 

Contents

1.0 Systems Affected

1.1 Minimum ipmitool Code Level

1.2 Minimum Browser levels for BMC Web GUI

1.3 Fix level Information on IBM OpenPOWER Components and Operating systems

1.4 Enabling Trusted Boots in RHEL7.5-ALT

2.0 Important Information

3.0 Firmware Information

3.1 Firmware Information and Description 

4.0 Operating System Information

4.1 Linux Operating System

4.2 How to Determine the Level of a Linux Operating System

4.3 How to Determine if the opal-prd (Processor Recovery Diagnostics) package is installed

5.0 How to Determine The Currently Installed Firmware Level

6.0 Downloading the Firmware Package

7.0 Installing the Firmware

7.1 IBM Power Systems Firmware maintenance

7.2 Updating the System Firmware with the pUpdate utility

7.3 Supporting Diagnostics

7.4 Installing ipmitool on Ubuntu

7.5 Updating the System Firmware using the BMC Web GUI

7.6 System I/O Firmware

7.6.1 Embedded 12Gb SAS Controller, Microsemi Adaptec SmartIOC 2000 16i

8.0 System Management and Virtualization

8.1 BMC Service Processor IPMI and Web GUI Access

8.2 OpenPOWER Abstraction Layer (OPAL)

8.3 Intelligent Platform Management Interface (IPMI)

8.4 Petitboot bootloader

9.0 Quick Start Guide for Installing Linux on the LC servers

10.0 Change History

 

1.0 Systems Affected

This package provides firmware for the Power Systems Scale-out LC921 (9006-12P)  and LC922 (9006-22P).

 

The firmware level in this package is OP920.21 (PNOR V2.14-20190807, BMC V2.07).

 

The package contains the following images:

BMC Firmware:                  SMT_P9_207.bin

PNOR Firmware:      P9DSU20190807_IBM_sign.pnor

pUpdate  version 2.20 Utility for powerpc:  pUpdate_ppc

 

 

 

Details on the package binaries are included in section 3.1.

 

Note 1: If  the system has BMC 1.xx and PNOR 1.xx levels, the following update steps must be used to update to the new 2.xx  or later version levels.   The update must be done with the system powered off.

 

1. Power off the system

2. Update BMC to BMC 2.04 or later using pUpdate 2.20.   Wait for the automatic BMC reset to complete.

3. You must reset the nginx configuration for Redfish (REST API) support:

a) ipmitool -I lanplus -H bmc_ip_address -U ipmi_userid  -P ipmi_password raw 0x30 0x70 0xb7

b) ipmitool -I lanplus -H bmc_ip_address  -U ipmi_userid   -P ipmi_password mc reset cold

4. Update PNOR  to PNOR V2.10 or later using pUpdate 2.20

5. Power on the system
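
The wait in step 2 and the two ipmitool commands in step 3 can be scripted roughly as follows.  This is a minimal sketch, not an IBM-supplied procedure: the function names, the IPMITOOL override, and the retry defaults are illustrative assumptions, and bmc_ip_address, ipmi_userid, and ipmi_password are the placeholders used in the steps above.  The pUpdate invocations in steps 2 and 4 are left out because their options are not described in this section.

```shell
# Placeholders from the steps above; export real values before use.
BMC_HOST="${BMC_HOST:-bmc_ip_address}"
IPMI_USER="${IPMI_USER:-ipmi_userid}"
IPMI_PASS="${IPMI_PASS:-ipmi_password}"
# Program used to reach the BMC. Setting IPMITOOL=echo gives a dry run
# (a convenience of this sketch, not an ipmitool feature).
IPMITOOL="${IPMITOOL:-ipmitool}"

# bmc ARGS...: run one ipmitool command against the BMC over lanplus.
bmc() {
    "$IPMITOOL" -I lanplus -H "$BMC_HOST" -U "$IPMI_USER" -P "$IPMI_PASS" "$@"
}

# Step 3: issue the raw reset command (3a), then cold-reset the BMC (3b).
reset_bmc_web_config() {
    bmc raw 0x30 0x70 0xb7 && bmc mc reset cold
}

# wait_for_bmc [ATTEMPTS] [DELAY_SECONDS]: poll "mc info" until the BMC
# answers again. The defaults (30 tries, 10 seconds apart) are assumptions.
wait_for_bmc() {
    attempts="${1:-30}"
    delay="${2:-10}"
    while [ "$attempts" -gt 0 ]; do
        bmc mc info >/dev/null 2>&1 && return 0
        attempts=$((attempts - 1))
        sleep "$delay"
    done
    return 1
}
```

A typical sequence after the BMC update would be wait_for_bmc, then reset_bmc_web_config, then wait_for_bmc again before starting the PNOR update.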

 

 

Note 2:  If BMC and PNOR need to be downgraded from 2.xx or later level down to any 1.xx version of the firmware, the system must be powered off before the update and an AC cycle is needed after the update.  If you downgrade a BMC or a PNOR to any 1.xx version, you must downgrade both the BMC and PNOR  firmware so that each is at  the same primary version level such as 1.xx.

 

1.1 Minimum ipmitool Code Level

This section specifies the minimum ipmitool code level required by the system firmware to perform firmware installations and manage the system.  OpenPOWER requires ipmitool level v1.8.15 or later to execute correctly on the LC server firmware.

 

Verify the ipmitool level on your Linux workstation using the following command:

 

bash-4.1$ ipmitool -V

ipmitool version 1.8.15
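
The comparison against the minimum level can also be automated.  The following is a minimal sketch under stated assumptions: the helper names are hypothetical, the "ipmitool version X.Y.Z" output format is taken from the example above, and the version comparison relies on the GNU coreutils "sort -V" option.

```shell
# Minimum ipmitool level stated in this section.
MIN_IPMITOOL_VERSION="1.8.15"

# version_ge A B: succeeds when version A is greater than or equal to B.
# Uses GNU "sort -V" (version sort): the smaller version sorts first.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# ipmitool_version_from: read "ipmitool -V" output on standard input and
# print only the version number (third word of the matching line).
ipmitool_version_from() {
    awk '/^ipmitool version/ { print $3; exit }'
}
```

For example: installed="$(ipmitool -V | ipmitool_version_from)" followed by version_ge "$installed" "$MIN_IPMITOOL_VERSION" || echo "ipmitool is too old".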

 

If you need to update or add ipmitool to your Linux workstation, you can compile ipmitool (current level 1.8.15) for Linux from SourceForge as follows:

 

1.1.1  Download ipmitool tar from http://sourceforge.net/projects/ipmitool/  to  your Linux system

1.1.2  Extract tarball on Linux system

1.1.3  cd to top-level directory

1.1.4 ./configure

1.1.5  make

1.1.6  ipmitool will be under src/ipmitool        

 

 

 

1.2 Minimum Browser levels for BMC Web GUI

The BMC Web GUI is a  web-based application that works within a browser.   Supported browser levels are shown below with Chrome being the preferred browser:

  1.  

1.3 Fix level Information on IBM OpenPOWER Components and Operating systems

For specific fix level information on key components of IBM Power Systems servers and Linux operating systems, please refer to the documentation in the IBM Knowledge Center.

Here are the links for the LC921 and LC922 servers:

 

9006-12P:        http://www.ibm.com/support/knowledgecenter/POWER9/p9hdx/9006_12p_landing.htm

9006-22P:        http://www.ibm.com/support/knowledgecenter/POWER9/p9hdx/9006_22p_landing.htm

1.4  Enabling Trusted Boots in RHEL7.5-ALT

On Red Hat Enterprise Linux (RHEL) for PPC, RHEL-Alt 7.5, the Trusted Platform Module (TPM) device driver is not loaded automatically at boot time.  Without this driver, the TPM device will not be accessible.

 

This affects any user-space application needing to access the TPM, as well as kernel security functions, such as the Integrity Measurement Architecture subsystem (IMA) in the Linux kernel.  Without the TPM driver loaded, IMA will be unable to record trusted measurements to the TPM.

 

To load the driver manually, as root:

 

# modprobe tpm_i2c_nuvoton

 

To load the driver automatically at boot time:

 

# echo "tpm_i2c_nuvoton" > /etc/modules-load.d/tpm.conf
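
After either step, the result can be checked from the shell.  This is a minimal sketch: the helper names are hypothetical, and it assumes the standard /proc/modules listing and that the TPM appears as /dev/tpm0, neither of which is stated in this document.

```shell
# tpm_module_loaded [MODULES_FILE]: succeeds when tpm_i2c_nuvoton is listed
# among the loaded kernel modules. /proc/modules starts each line with the
# module name followed by a space.
tpm_module_loaded() {
    modules_file="${1:-/proc/modules}"
    grep -q '^tpm_i2c_nuvoton ' "$modules_file"
}

# tpm_device_present [DEVICE]: succeeds when the TPM device node exists.
# /dev/tpm0 is an assumed default for the first TPM device.
tpm_device_present() {
    [ -e "${1:-/dev/tpm0}" ]
}
```

For example: tpm_module_loaded && tpm_device_present && echo "TPM driver loaded".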

 

The TPM device driver will be integrated as a built-in kernel module in a future release of RHEL-Alt 7.  Once this is done, it will be loaded automatically and this procedure will no longer be necessary.

2.0 Important Information


Downgrading firmware from any given release level to an earlier release level is not recommended. 

If you feel that it is necessary to downgrade the firmware on your system to an earlier release level, please contact your next level of support.

Concurrent Firmware Updates not available for LC servers.

Concurrent system firmware update is not supported on these LC servers.

3.0 Firmware Information

Use the following examples as a reference to determine whether your installation will be concurrent or disruptive.

For the LC server systems, the installation of system firmware is always disruptive.

 

3.1 Firmware Information and Description

The xxx.pnor  file updates the primary side of the PNOR.  The yyy.bin updates the primary side of the BMC only.  The golden sides are unchanged.

 

Filename                            Size (bytes)    Checksum
P9DSU20190807_IBM_prod_sign.pnor    67108992        ed569d65b880ee98832ec4104469037f
SMT_P9_207.bin                      33554432        efc7e870bf486e173eb615f969dc5de0
pUpdate_ppc                         133824          00afbdb0690fa576019331bfab93e743

 

Note: The checksum can be found by running the Linux/Unix/AIX md5sum command against the file (all 32 characters of the checksum are listed), for example: md5sum pUpdate_ppc
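
The three checksums can be compared in one pass before flashing.  The following is a minimal sketch with hypothetical helper names; the file names and checksum values are the ones listed in this section, and the directory argument is an assumption about where the package was unpacked.

```shell
# verify_md5 FILE EXPECTED: succeeds when the MD5 checksum of FILE equals
# EXPECTED. A missing file yields an empty checksum and therefore fails.
verify_md5() {
    actual="$(md5sum "$1" | awk '{ print $1 }')"
    [ "$actual" = "$2" ]
}

# verify_package [DIR]: check the three package files listed in this section.
# DIR defaults to the current directory.
verify_package() {
    dir="${1:-.}"
    verify_md5 "$dir/P9DSU20190807_IBM_prod_sign.pnor" ed569d65b880ee98832ec4104469037f &&
    verify_md5 "$dir/SMT_P9_207.bin" efc7e870bf486e173eb615f969dc5de0 &&
    verify_md5 "$dir/pUpdate_ppc" 00afbdb0690fa576019331bfab93e743
}
```

For example: verify_package /tmp/firmware || echo "checksum mismatch, download the package again".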

 

After a successful update to the new firmware level, the PNOR components and BMC should be at the following levels.  The ipmitool "fru" command can be used to display FRU ID 47 and the ipmitool "mc info" command can be used to display the BMC level.

 

Note:  FRU information for the PNOR level does not show the updated levels via the fru command until the system has been booted once at the updated level.

 

PNOR firmware levels from FRU ID 47 inventory list for driver:  

 

Product Name          : OpenPOWER Firmware

Product Version         : open-power-SUPERMICRO-P9DSU-V2.14-20190807-prod

Product Extra            : op-build-a1f8650

Product Extra            : buildroot-2018.11.3-12-g222837a

Product Extra            : skiboot-v6.0.20

Product Extra            : hostboot-8591ded-p4f715ce

Product Extra            : occ-8fa3854

Product Extra            : linux-4.19.57-openpower1-p48ee860

Product Extra            : petitboot-v1.7.5-p11ed908

Product Extra            : machine-xml-734a35e

Product Extra            : hostboot-binaries-hw072719a.op920

Product Extra            : capp-ucode-p9-dd2-v4

Product Extra            : sbe-b6ee17b

Product Extra            : hcode-hw072719a.op920

 

BMC Level:                        

   

Display BMC firmware level using the "ipmitool mc info | grep Firmware" command:

 

Firmware Revision         : 2.07

 

 

OP920.00
For Impact, Severity and other firmware definitions, please refer to the 'Glossary of firmware terms' URL below:
http://www14.software.ibm.com/webapp/set2/sas/f/power5cm/home.html#termdefs

OP920.21 / V2.14-20190807 / BMC V2.07

 

10/07/19

 

Impact: Data     Severity: HIPER

 

New features and functions

 

The BMC web gui was enhanced so that the LDAP port can be set regardless of the SSL settings.

 

Support was added for the BMC web gui to be able to enable and disable IPMI over LAN.

 

Support was added for  Redfish 1.6.1 (2018.3) and new Redfish APIs:

1) /redfish/v1/CertificateService

2) /redfish/v1/Managers/BMC/Oem/Supermicro/IPAccessControl

3) /redfish/v1/Managers/BMC/Oem/Supermicro/BackupRestoreService

4) Removed post and delete method on Firmware Inventory and use HttpPushUri instead for file upload.

5) Revised patch array for ManagerNetworkProtocol.

6) Added Active Directory and LDAP.

 

Security was enhanced for stunnel by allowing SSL Medium Strength and Anonymous Cipher Suites to be disabled.  To disable them, the stunnel configuration must be reset once using the following ipmitool commands:

1)  ipmitool ... raw 0x30 0x70 0xB9

2)  ipmitool ... mc reset cold

 

Added support for Active Directory to allow the BMC to make connections to LDAP\AD servers.

 

A change was made to enable OS software checkstops by default.  This prevents hangs of multiple hours in failed reboots if the CPUs become stuck at the start of a kdump.

 

System firmware changes that affect all systems

 

A change was made to fix an intermittent processor anomaly that may result in issues such as operating system or hypervisor termination, application segmentation fault, hang, or undetected data corruption.  The only issues observed to date have been operating system or hypervisor terminations.

 

A problem was fixed for Logical Volume Manager (LVM) virtual RAID drives not being found, which could prevent a system boot.  This problem is a regression introduced at OP920.20 for PNOR version V2.12 when the "mdadm" command was moved from /usr/sbin to /sbin on the system.

 

A problem was fixed for failed FRUs associated with checkstops in hostboot not being guarded.  This is an intermittent timing problem related to error log entries that have been created but not written or flushed yet at the time of the guard processing.

 

A problem was fixed for an IPL failure caused by an IPMI timeout.  This is a rare problem and the reIPL after the failure recovers from the problem.  The eSEL for the IPMI timeout may be ignored.

 

A problem was fixed for an OS boot that fails because the BMC is going through a reboot itself.  The OS boot can fail when it needs to use BMC services when accessing the flash memory.  This can happen if the BMC is not ready to receive commands.  With the fix, the boot waits for the BMC to become ready instead of failing immediately on the errant flash access.

 

A problem was fixed for a fast reboot of the OS failing if VFs (Virtual Functions) were enabled and disabled prior to the reboot.

 

A problem was fixed for bad flashes caused by data size of memory to flash not being block aligned.  This error can intermittently cause partial data to be written to the flash.  

 

A problem was fixed for a hang in the OS reboot caused by a TOD failure.

 

A problem was fixed for a session timeout when clicking the "System Event Log" web page.

 

A problem was fixed for Redfish errors in  /redfish/v1/UpdateService.

 

A problem was fixed for CGI aborts when uploading configurations using HTTP.

 

A problem was fixed for SSL certificate checks that were incorrectly failing on the check of the private key.

 

A security problem was fixed for the BMC Ethernet Network Interface Card (NIC) device driver.  The Ethernet packet frames were not being padded with null bytes, which can allow remote attackers to obtain information from previous packets or kernel memory by using malformed packets.  This fix protects against the Common Vulnerabilities and Exposures issue number CVE-2003-0001.

 

A problem was fixed for not being able to access the BMC web gui using HTTPS and IPv6.  The fix requires a user step to enable it.  After installing the fix, reset the Lighttpd configuration on the BMC.  This reset can be done using ipmitool with the following two commands:

1)  ipmitool ... raw 0x30 0x70 0xB7

2)  ipmitool ... mc reset cold

 

A problem was fixed in the BMC web gui for incorrect VLAN ID ranges in the network configuration.

 

A problem was fixed for SSH security vulnerabilities that were found running the Qualys tool.

 

A problem was fixed for a failure that can occur when setting time with hwclock.  This failure is triggered by a small  time drift that can occur if NTP is active.

 

A problem was fixed in the BMC web gui for incorrect wording on the AD and LDAP pages.

 

A security problem was fixed for a password being stored in clear text on the BMC.

 

OP920.20 / V2.12-20190404 / BMC 2.06

04/18/19

 

Impact: Data      Severity:  HIPER

 

New features and functions

 

Added support in Redfish for configuring RADIUS (Remote Authentication Dial In User Service), a network protocol for remote user authentication and accounting.  It is implemented under redfish/v1/Managers/1/RADIUS.  Method supported: Get/Patch.  [PATCH]: "RadiusEnabled", "RadiusServerIP", "RadiusPortNumber", "RadiusSecret".

 

Added support in Redfish for configuring Syslog.

 

Added ipmi raw command 0x3a 0x30 to be able to set the Meltdown/Spectre risk level to 0, 1, or 2.  The default is risk level 0 to provide full mitigation but slowest performance.  Here are the risk levels:

Risk Level 0 = "Speculative execution controls to mitigate user-to-kernel and user-to-user side-channel attacks"

Risk Level 1 = " Speculative execution controls to mitigate user-to-kernel side-channel attacks"

Risk Level 2 = "Speculative execution fully enabled"

More information on these levels can be found at https://www.ibm.com/support/knowledgecenter/en/POWER9/p9hby/p9hby_speculative_execution_control.htm?pos=2.

After the risk level setting is changed, the host needs to be powered off and back on again to be running at the new risk level.

The Spectre/Meltdown fixes were included in the OP920.00 GA release and have the Common Vulnerabilities and Exposures issue numbers CVE-2017-5715, CVE-2017-5753 and CVE-2017-5754.

 

 

Added support in the SNMP client to allow connections to V2 and V3 servers to be running at the same time.

 

Added support for Active Directory to allow the BMC to make connections to LDAP\AD servers.

 

System firmware changes that affect all systems

 

HIPER/Pervasive:  A problem was fixed where, under certain conditions, a Power Management Reset (PM Reset) event may result in undetected data corruption.  PM Resets occur under various scenarios such as power management mode changes between Dynamic Performance and Maximum Performance, power management controller recovery procedures, or system boot.

 

A problem was fixed for a rare Nest Memory Management Unit (NMMU) hang calling out processor hardware incorrectly, masking the real cause of the problem which was an NPU failure.   The incorrect error messages take this form on the system:

3      | FQPSPPU0093G  | 2018-10-01 01:25:40  | Yes          | Warning   | CPU 1 has exceeded a correctable error threshold

4      | FQPSPPU0093G  | 2018-10-01 03:20:55  | Yes          | Warning   | CPU 0 has exceeded a correctable error threshold

5      | FQPSPAA0008M  | 2018-10-01 04:35:40  | Yes          | Critical  | Hostboot procedure callout

 

A problem was fixed for an incorrect Redfish access privilege to make it consistent with the BMC gui webpage.

 

A problem was fixed for Redfish namespace PhysicalContext.v1_3_0 not being found in schema PhysicalContext_v1.xml.

 

A problem was fixed for Redfish crashing if an attempt is made to convert a string to an integer when the string does not represent a number.

 

A problem was fixed for the host console losing data.

 

A problem was fixed for slow SOL console response for long-running commands.

OP920.10 / V2.10-20190208 / BMC 2.04

02/14/19

 

Impact: Data        Severity:  HIPER

 

New features and functions

 

Redfish support was extended to version 1.6.0 and the FanMode API was added.

 

Support for the UART3 and UART4 was disabled in the Linux kernel on the BMC.

 

The BMC GUI was enhanced to show both the PNOR version and the build date.

 

Support was added for a new BMC gui page to control the power capping of the system.

 

Support was added to the BMC for new PNOR version partition that has a 4k signed header.

 

 

System firmware changes that affect all systems

 

HIPER/Non-Pervasive:  Fixes included to address potential scenarios that could result in undetected data corruption.

 

A security problem was fixed to prevent host programs from being able to corrupt the BMC using the internal bridges between the host and BMC.  The Common Vulnerabilities and Exposures issue number is CVE-2019-6260.

 

A security problem was fixed to prevent a buffer overflow when loading the boot image that could cause firmware corruption.  The firmware mitigation adds additional checking of the initial boot firmware image's load size and terminates the boot if the size is too big.  The Common Vulnerabilities and Exposures issue number is CVE-2018-1992.

 

A problem was fixed for an intermittent IPL failure with BC131705 and  BC8A1703 logged with a processor core called out.  This is a rare error and does not have a real hardware fault, so the processor core can be unguarded and used again on the next IPL.

 

A problem was fixed on the BMC for an incorrect "LanDrvinit fails to initial" message.  This is not a true error and the message can be ignored.

 

A problem was fixed for an intermittent opal-prd crash that can happen on the host OS.  The fault signature is:  "opal-prd[2864]: unhandled signal 11 at 0000000000029320 nip 0000000102012830 lr 0000000102016890 code 1"

 

A problem was fixed for a PCI Host Bridge (PHB) configuration write error that caused the incorrect PCIe device to be frozen.  The fault will be attributed to the last device to have a memory-mapped I/O operation (MMIO).  With this fix, the freeze action for PHB configuration write errors is disabled in order to not impact functional hardware.

 

A problem was fixed for diagnostic code trying to read sensor values for PCI Host Bridge (PHB) entries that are unused, which causes debug output to have incorrect values for the unused entries.  With the fix, only the used entries are processed by the diagnostic code.

 

A problem was fixed for an IPL loop/hang with a fatal MCE exception log caused by a probe of a failed PCI Host Bridge (PHB) that had been guarded.  This is an infrequent error because it requires a PHB to have previously failed.  The exception log has the following format:

Fatal MCE at 000000003006ecd4   .probe_phb4+0x570

 CFAR : 00000000300b98a0

 <snip>

Aborting!

CPU 0018 Backtrace:

S: 0000000031cc37e0 R: 000000003001a51c   ._abort+0x4c

S: 0000000031cc3860 R: 0000000030028170   .exception_entry+0x180

S: 0000000031cc3a40 R: 0000000000001f10 *

S: 0000000031cc3c20 R: 000000003006ecb0   .probe_phb4+0x54c

S: 0000000031cc3e30 R: 0000000030014ca4   .main_cpu_entry+0x5b0

S: 0000000031cc3f00 R: 0000000030002700   boot_entry+0x1b8

 

A problem was fixed for memory Over-Temperature (OT) throttling not occurring when a DIMM reaches the throttle temperature.  Although the frequency to the memory DIMMs is not reduced, the fan speeds do increase to provide more cooling for the DIMMs.

 

A problem was fixed for error logs occurring on the IPL following a DIMM error recovery.  These logs, related to failed memory scrubbing, have the following "Signature Description":  "mba(n0p15c1) () ERROR: command complete analysis failed".  These error logs do not indicate a hardware problem and may be ignored.

 

A problem was fixed for system termination for a re-IPL with power on.  The system can be recovered by powering off and then IPLing.  This problem occurs infrequently and can be avoided by powering off the system between IPLs.

 

A problem was fixed for certain system boot failures not propagating to the BMC before the boot firmware shuts down.  Some details of the error log may still appear in the console output trace, but the details will not be available with the BMC queries.  This problem is timing dependent and intermittently possible depending on the timing of the shutdown path.  However, immediate shutdowns exacerbate the problem and increase the chance it can occur.

 

A problem was fixed for a BMC web gui freeze condition when an error event occurs on the backplane.

 

A problem was fixed for random "????" characters displayed on the SOL Console during the skiboot boot.

 

A problem was fixed for Redfish-Service-Validator detected errors in the following two Redfish APIs:   "/redfish/v1/Managers/BMC/LogServices/Log/Entries/[ID]"  and "/redfish/v1/Chassis/Planar/Assembly".

 

A problem was fixed for an IPv4 address change not persisting after a BMC reboot.  This error can occur if the IP address change reduces the number of characters in the last octet.  In the case where this was observed, the IP address was changed from 50.6.36.100 to 50.6.36.1, but after the BMC reboot the address had reverted to 50.6.36.100, with the two trailing zeros restored.

 

A problem was fixed for a skiboot hang that could occur rarely for an I2C request if the I2C bus is in error or locked by the On-Chip Controller (OCC).

 

A problem was fixed for "Unexpected TCE size" error messages when Linux tried the default P9 PHB4 page size and used the unsupported 2M and 1G page sizes.  The TCE page size property is now set correctly with 4K/64K/16M and 256M supported.

 

Support was added to recognize a port parameter in the URL path for the Preboot eXecution Environment (PXE) in the ethernet adapters.  Without the fix, there could be PXE discovery failures if a port was specified in the URL for the PXE.  

 

A problem was fixed for an intermittent rare processor core lock failure that is not a real hardware problem.  The erroneous failure looks like this in the logs:

LOCK ERROR: Releasing lock we don't hold depth @0x30493d20 (state: 0x0000000000000001)

      [13836.000173140,0] Aborting!

      CPU 0000 Backtrace:

       S: 0000000031c03930 R: 000000003001d840   ._abort+0x60

       S: 0000000031c039c0 R: 000000003001a0c4   .lock_error+0x64

       S: 0000000031c03a50 R: 0000000030019c70   .unlock+0x54

       S: 0000000031c03af0 R: 000000003001a040   .drop_my_locks+0xf4

 

 

OP920.02 / V1.16-20180531/ BMC 1.27

10/15/2018

 

Impact: Availability     Severity: SPE

 

System firmware changes that affect all systems

 

A problem was fixed for a false over-temperature reading on the AOC SAS/SATA adapter due to an incorrect low 65C maximum temperature threshold.  With the fix, the maximum operating temperature for the device is set to 100C as the upper threshold.

 

OP920.01 / V1.16-20180531/ BMC 1.23

06/13/2018

Impact: Availability     Severity: SPE

 

 

System firmware changes that affect all systems

 

A problem was fixed for a failure in DDR4 RCD (Register Clock Driver) memory initialization that causes half of the DIMM memory to be unusable after an IPL.  This is an intermittent problem where the memory can sometimes be recovered by doing another IPL.  The error is not a hardware problem with the DIMM but an error in the initialization sequence needed to get the DIMM ready for normal operations.

 

A problem was fixed for a failure to IPL if there are DIMM failures that require DIMM chips to be guarded.  During the memory reconfiguration, the system would reboot itself before the reconfiguration could complete.  The memory failure would persist into the next IPL attempt, with the result that the system might not be able to IPL with bad memory DIMMs.  This problem is more prevalent when there are errors in the larger DIMM modules, such as the 128 GB chips, as it takes longer to reconfigure the ranks of the larger memory chips.  There is a possibility that an IPL watchdog timeout and system reboot can occur if a rank of memory takes over two minutes to reconfigure to guard out the bad memory.

 

A problem was fixed for a failure to isolate to an errant FRU for a system checkstop.  This is an intermittent error related to the On-Chip Controller (OCC) not waiting long enough to collect the failure information for a checkstop that occurs on a busy system.  When this error happens, it prevents checkstop diagnosis procedures from identifying the cause of the checkstop fault.  For this error,  no active error bits are found  and the checkstop analysis failure error log is mapped to a SEL which directs the customer to contact support as shown below:

 

1 | 05/17/2018 | 20:49:27 | System Firmware Progress Boot Progress | Motherboard initialization () | Asserted

2 | 05/17/2018 | 20:50:13 | System Firmware Progress Boot Progress | System boot initiated () | Asserted

3 | 05/18/2018 | 09:27:07 | OEM record c0 | 040020 | ceff6fffffff ==> Checkstop Signal , check other serviceable SELs and resolve them

4 | 05/18/2018 | 09:27:44 | System Firmware Progress Boot Progress | Motherboard initialization () | Asserted

5 | 05/18/2018 | 09:27:56 | OEM record df | 040020 | 12046faa0000

6 | 05/18/2018 | 09:28:05 | OEM record de | 000000 | 100000000005 ==> Procedure callout, decodes to contact next level of suppoort of assistance

7 | 05/18/2018 | 09:28:31 | System Firmware Progress Boot Progress | System boot initiated () | Asserted

The following steps can be used to identify the signature for this problem.  Look for the SEL indicating checkstop signal as shown below:

3 | 05/18/2018 | 09:27:07 | OEM record c0 | 040020 | ceff6fffffff ==> Checkstop Signal , check other serviceable SELs and resolve them

If found, look for an "OEM record de" SEL that is logged a few minutes after the checkstop signal SEL, decoding into "contact next level of support for assistance":

5 | 05/18/2018 | 09:27:56 | OEM record df | 040020 | 12046faa0000

6 | 05/18/2018 | 09:28:05 | OEM record de | 000000 | 100000000005 ==> Procedure callout, decodes to contact next level of suppoort of assistance

If found, collect plc.pl output and inspect the decoded eSEL (PEL) associated with the checkstop analysis. If the signature description in the PRD log reads "No active error bits found", the checkstop analysis failure is confirmed.
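
The first two screening steps can be approximated with a small filter over a saved SEL listing.  This is a rough sketch only: the function name is hypothetical, it assumes the listing is plain text in the decoded format shown above, and a match still has to be confirmed against the PRD log as described.

```shell
# has_checkstop_analysis_signature: read a decoded SEL listing on standard
# input. Succeeds when an "OEM record c0" checkstop signal entry is followed
# later by an "OEM record de" procedure callout entry.
has_checkstop_analysis_signature() {
    awk '
        /OEM record c0/ && /Checkstop Signal/ { seen_checkstop = 1; next }
        seen_checkstop && /OEM record de/ && /Procedure callout/ { found = 1 }
        END { exit(found ? 0 : 1) }
    '
}
```

For example, with the listing saved in a file named sel.txt (an assumed name): has_checkstop_analysis_signature < sel.txt && echo "collect plc.pl output and check the PRD signature".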

| Reference Code : BC70E550 |

| Hex Words 2 - 5 : 000000E0 00000B00 00000000 00200000 |

| Hex Words 6 - 9 : 000B0004 00000103 BC4ADD02 00000000 |

| |

| ModuleId : 0x0B |

| Reason Code : 0xE550 |

| Code Location : 0x0103 |

| |

| PRD SRC Type : PRD Detected Hardware Indication |

| PRD SRC Class : Software likely caused a hardware error condition,|

| : smaller possibility of a hardware cause. |

| |

| PRD Signature : 0x000B0004 0xBC4ADD02 |

| Signature Description : mcs(n0p1c0) No active error bits found |

 

 

 

OP920.00 / V1.16-20180518

05/25/2018

Impact:   New     Severity: New

 

New features and functions for MTMs  9006-12P and 9006-22P:

 

GA Level

 

 

Support for trusted boot and secure boot with TPM2.0 Nuvoton NPCT650ABAWX through an I2C controller.

 

Dedicated 1 GB IPMI port

 

Integrated MicroSemi PM8069 SAS/SATA 16-port Internal Storage Controller PCIe3.0 x8 with RAID 0, 1, 5, and 10 support (no write cache)

 

Integrated Intel XL710 Quad Port 10GBase-T PCIe3.0 x8 UIO built-in LAN (one shared management port)

 

Supermicro BMC support for Redfish:

Get System/Chassis inventory info

Manage user accounts and privileges

BMC configuration (AD, LDAP, SNMP, SMTP, RADIUS, Fanmode, Mousemode, NTP, Snooping etc.)

BIOS configuration

Boot order change

RAID configuration (For 3108)

Storage Management

Get NIC MAC info (NIC asset info)

BMC/BIOS Firmware updates

Get thermal/power/sensor info

Get system memory info

Get hostname

Launch iKVM/HTML5 using redfish

Update SSL certificate and key

Perform computer system reset

BMC reset

BMC configuration reset

Get health event log/Advanced system event log

Acknowledge warning/critical severity events

Virtual Media ISO image mounting

System firmware changes that affect all systems

HIPER/Pervasive:  A firmware change was made to address a rare case where a memory correctable error on POWER9 servers may result in an undetected corruption of data.

 

 

4.0 Operating System Information

OS levels supported by the LC921 and LC922 servers:

 - Red Hat Enterprise Linux (RHEL) 7.5 little endian (LE) (POWER9), or later

 - Ubuntu Server 18.04 LTS

 - Virtualization Engine

Feature (#EC16) -OpenPOWER non-virtualized configuration (also known as bare-metal install)

 

IBM Power LC921 and LC922 servers support Linux, which provides a UNIX-like implementation across many computer architectures.  Linux supports almost all of the Power System I/O and the configurator verifies support on order.  For more information about the software that is available on IBM Power Systems, see the Linux on IBM Power Systems website:

        http://www.ibm.com/systems/power/software/linux/index.html

4.1 Linux Operating System

 

The Linux operating system is an open source, cross-platform OS.  It is supported on every Power Systems server IBM sells.  Linux on Power Systems is the only Linux infrastructure that offers both scale-out and scale-up choices.  One supported version of Linux on the IBM Power LC921 and LC922 servers is Ubuntu Server 18.04 LTS for IBM POWER9.  For more information about Ubuntu Server for POWER9, see the following website:

https://wiki.ubuntu.com/ppc64el

 

Another supported version of Linux on the LC921 and LC922  servers is Red Hat Enterprise Linux 7.5 LE.  For additional questions about the availability of this release and supported Power servers, consult the Red Hat Hardware Catalog at https://access.redhat.com/products/red-hat-enterprise-linux/#addl-arch

 

For more information about Linux on Power, see the Linux on Power developer center at https://developer.ibm.com/linuxonpower/

 

For information about the features and external devices that are supported by Linux, see this website:

http://www.ibm.com/systems/power/software/linux/index.html

 

 

4.2 How to Determine the Level of a Linux Operating System

 

Use one of the following commands at the Linux command prompt to determine the current Linux level:

 

 

The output string from the command will provide the Linux version level.
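
As one possible sketch, the distribution level can be read from the os-release file and the kernel level from uname.  Both choices are assumptions about the installed distribution rather than commands named by this document, and the helper names are hypothetical.

```shell
# os_level [RELEASE_FILE]: print the PRETTY_NAME field of an os-release
# file, with any surrounding double quotes removed.
os_level() {
    release_file="${1:-/etc/os-release}"
    sed -n 's/^PRETTY_NAME="\{0,1\}\([^"]*\)"\{0,1\}$/\1/p' "$release_file"
}

# kernel_level: print the running kernel release.
kernel_level() {
    uname -r
}
```

Running os_level and kernel_level on the host prints one line each, which together identify the Linux level.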

 

4.3 How to Determine if the opal-prd (Processor Recovery Diagnostics) package is installed

The opal-prd package on the Linux system collects the OPAL Processor Recovery Diagnostics messages to log file /var/log/syslog.  It is recommended that this package be installed if it is not already present as it will help with maintaining the system processors by alerting the users to processor maintenance when needed.

 

On Red Hat Linux, run the command "rpm -qa | grep -i opal-prd".  If the opal-prd rpm is found and displayed in the command output, the package is installed on your system.  This package provides a daemon to load and run the OpenPOWER firmware's Processor Recovery Diagnostics binary, which is responsible for run-time maintenance of Power hardware.  If the package is not installed on your system, the following command can be run on Red Hat to install it:

        sudo yum install opal-prd

 

5.0 How to Determine The Currently Installed Firmware Level

The system firmware is a combination of BMC and PNOR firmware levels.

Use the ipmitool "fru" command or the BMC Web GUI FRU option to look at product details of FRU 47.

 

ipmitool -I lanplus -H <bmc host IP address> -P ADMIN -U ADMIN fru print 47

 

Use the ipmitool "mc info" command or the BMC Web GUI System tab to look at the BMC level.

 

      ipmitool -I lanplus -H <bmc host IP address> -P ADMIN -U ADMIN mc info|grep "Firmware"
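When the BMC level is needed in a script, the "Firmware Revision" line of the "mc info" output can be reduced to the bare version string. The following is a minimal sketch that assumes the usual "name : value" layout of ipmitool output; the function name bmc_fw_revision is illustrative only.

```shell
#!/bin/sh
# Read `ipmitool ... mc info` output on stdin and print only the value of
# the "Firmware Revision" line, with surrounding whitespace removed.
bmc_fw_revision() {
    awk -F':' '/^Firmware Revision/ {
        gsub(/^[ \t]+|[ \t]+$/, "", $2)   # trim the value field
        print $2
        exit
    }'
}

# Intended use (replace the placeholder with the BMC address):
#   ipmitool -I lanplus -H <bmc host IP address> -U ADMIN -P ADMIN mc info | bmc_fw_revision
```

The "Aux Firmware Rev Info" line that "mc info" may also print is ignored because the pattern is anchored to the start of the line.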

 

6.0 Downloading the Firmware Package

Follow the instructions on Fix Central. You must read and agree to the license agreement to obtain the firmware packages.

 

7.0 Installing the Firmware

Note 1: If  the system has BMC 1.xx and PNOR 1.xx levels, the following update steps must be used to update to the new 2.xx  or later version levels.   The update must be done with the system powered off.

 

1. Power off the system

2. Update BMC to BMC 2.04 or later using pUpdate 2.20.   Wait for the automatic BMC reset to complete.

3. You must reset the nginx configuration for Redfish (REST API) support:

a) ipmitool -I lanplus -H bmc_ip_address -U ipmi_userid  -P ipmi_password raw 0x30 0x70 0xb7

b) ipmitool -I lanplus -H bmc_ip_address  -U ipmi_userid   -P ipmi_password mc reset cold

4. Update PNOR  to PNOR V2.10 or later using pUpdate 2.20

5. Power on the system
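The order of the steps above can be sketched as a shell function run from a companion Linux system. This is an illustration under stated assumptions, not a supported procedure: it assumes the out-of-band "-i lan" form of pUpdate described in section 7.2, it assumes that pUpdate is on the PATH, and the names run, pause_for_bmc, and update_1xx_to_2xx are hypothetical. By default the sketch only prints the commands (DRY_RUN=1).

```shell
#!/bin/sh
# Sketch of the Note 1 update order (BMC/PNOR 1.xx to 2.xx).
# DRY_RUN=1 (the default) only prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"
PUPDATE="${PUPDATE:-pUpdate}"   # assumption: pUpdate is on the PATH

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# The BMC resets after its own update and after "mc reset cold". This sketch
# does not detect the end of the reset; it asks the operator to confirm.
pause_for_bmc() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '  (wait for the BMC reset to complete)\n'
    else
        printf 'Press Enter when the BMC reset has completed: '
        read -r reply
    fi
}

update_1xx_to_2xx() {
    bmc_ip=$1; user=$2; pass=$3; bmc_image=$4; pnor_image=$5

    # 1. Power off the system.
    run ipmitool -I lanplus -H "$bmc_ip" -U "$user" -P "$pass" power off || return 1
    # 2. Update the BMC, then wait for the automatic BMC reset.
    run "$PUPDATE" -f "$bmc_image" -i lan -h "$bmc_ip" -u "$user" -p "$pass" -r y || return 1
    pause_for_bmc
    # 3. Reset the nginx configuration for Redfish, then cold-reset the BMC.
    run ipmitool -I lanplus -H "$bmc_ip" -U "$user" -P "$pass" raw 0x30 0x70 0xb7 || return 1
    run ipmitool -I lanplus -H "$bmc_ip" -U "$user" -P "$pass" mc reset cold || return 1
    pause_for_bmc
    # 4. Update the PNOR.
    run "$PUPDATE" -pnor "$pnor_image" -i lan -h "$bmc_ip" -u "$user" -p "$pass" || return 1
    # 5. Power on the system.
    run ipmitool -I lanplus -H "$bmc_ip" -U "$user" -P "$pass" power on
}
```

For example, "update_1xx_to_2xx bmc_ip_address ipmi_userid ipmi_password SMT_P9_207.bin P9DSU20190807_IBM_sign.pnor" prints the six commands in order so that they can be reviewed before anything is changed on the system.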

 

 

Note 2:  If the BMC and PNOR need to be downgraded from a 2.xx or later level to any 1.xx version of the firmware, the system must be powered off before the update and an AC power cycle is needed after the update.  If you downgrade either the BMC or the PNOR to any 1.xx version, you must downgrade both the BMC and PNOR firmware so that each is at the same primary version level (1.xx).

7.1  IBM Power Systems Firmware maintenance

The updating and upgrading of system firmware depends on several factors, such as the firmware level that is currently installed and which operating system is running on the system.

These scenarios and the associated installation instructions are comprehensively outlined in the firmware section of Fix Central, found at the following website:

http://www.ibm.com/support/fixcentral/

 

Any hardware failures should be resolved before proceeding with the firmware updates to help ensure that the system will not be running degraded after the updates.

7.2 Updating the System Firmware with the pUpdate utility

The pUpdate utility is provided with the firmware update files from IBM Fix Central.  It can perform in-band updates (from the host OS), in-band update recovery, and out-of-band updates, selected with the "-i usb", "-i bt", and "-i lan" parameters, respectively, on the command invocation. The code update is done in two steps:  1) update the BMC firmware, and 2) update the CEC PNOR for the hostboot and OPAL components.  It is recommended that the BMC be updated first unless otherwise specified in the firmware install instructions.

 

Before using the pUpdate command on the host,  make sure that the ipmi driver is loaded in the kernel and the ipmi service is started.

 

Note: For updates that use the "usb" or "bt" pUpdate option, you must use the root user ID and password to log in to the host operating system. After you log in to the host operating system, ensure that the IPMI service is activated.

# chkconfig ipmi on

# service ipmi start

 

For more information about activating the IPMI service, see the OpenIPMI Driver: https://www.ibm.com/support/knowledgecenter/POWER8/p8eih/p8eih_ipmi_open_driver.htm

 

For in-band update, use the following "-i usb" invocation of pUpdate:

 

BMC update:  "pUpdate -f bmc.bin -i usb", where bmc.bin is the name and location of the BMC image file.

 

PNOR update:  "pUpdate -pnor pnor.bin -i usb", where pnor.bin is the name and location of the PNOR image file.

 

If the in-band update fails on the BMC, use the recovery option with the Block Transfer (bt) invocation of pUpdate:

 

BMC update:  "pUpdate -f bmc.bin -i bt", where bmc.bin is the name and location of the BMC image file.

 

PNOR update:  "pUpdate -pnor pnor.bin -i bt", where pnor.bin is the name and location of the PNOR image file.

 

For more information on BMC recovery steps, refer to the following link in the IBM Knowledge Center:

https://www.ibm.com/support/knowledgecenter/POWER8/p8eis/p8eis_console_problem.htm

 

If the host is not booted, a network connection can be made to the BMC and an out-of-band update done with the following LAN invocation from a Linux companion system:

 

BMC update:  "pUpdate -f bmc.bin -i lan -h xx.xx.xx.xx -u ADMIN -p ADMIN -r y", where bmc.bin is the name and location of the BMC image file and xx.xx.xx.xx is the IP address of the BMC.

 

PNOR update:  "pUpdate -pnor pnor.bin -i lan -h xx.xx.xx.xx -u ADMIN -p ADMIN", where pnor.bin is the name and location of the PNOR image file and xx.xx.xx.xx is the IP address of the BMC.
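The invocations in this section differ only in the image flag ("-f" for the BMC, "-pnor" for the PNOR) and in the interface parameters. As a sketch, the following hypothetical helper, pupdate_cmd, prints the matching command line for a given target and interface so that it can be checked before it is run. It uses only the pUpdate options shown in this section and does not call pUpdate itself.

```shell
#!/bin/sh
# Print the pUpdate command line for a target (bmc|pnor), an image file, and
# an interface (usb|bt|lan). For lan, the BMC IP address, user ID, and
# password must follow. Nothing is executed; the command is only printed.
pupdate_cmd() {
    target=$1; image=$2; iface=$3

    case $target in
        bmc)  flag=-f ;;
        pnor) flag=-pnor ;;
        *)    echo "unknown target: $target" >&2; return 1 ;;
    esac

    case $iface in
        usb|bt)
            printf 'pUpdate %s %s -i %s\n' "$flag" "$image" "$iface"
            ;;
        lan)
            if [ $# -lt 6 ]; then
                echo "lan needs: <bmc ip> <user> <password>" >&2
                return 1
            fi
            cmd="pUpdate $flag $image -i lan -h $4 -u $5 -p $6"
            # The BMC LAN example in this section also passes "-r y".
            if [ "$target" = bmc ]; then
                cmd="$cmd -r y"
            fi
            printf '%s\n' "$cmd"
            ;;
        *)
            echo "unknown interface: $iface" >&2
            return 1
            ;;
    esac
}
```

For example, "pupdate_cmd pnor pnor.bin bt" prints the recovery invocation for the PNOR image.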

 

For more details on how to use the pUpdate utility, refer to the following link:

https://www.ibm.com/support/knowledgecenter/POWER9/p9eit/p9eit_update_firmware_pupdate.htm

7.3 Supporting Diagnostics

You can use diagnostic utilities to diagnose adapter problems.

 

For more details on how to use the diagnostic utilities,  refer to the following link:

https://www.ibm.com/support/knowledgecenter/POWER9/p9eit/p9eit_diags_kickoff.htm

7.4 Installing ipmitool on Ubuntu

OpenPOWER requires SourceForge ipmitool level v1.8.15 to execute correctly on the P9 firmware.

 

7.5  Updating the System Firmware using the BMC Web GUI

Another method to update the system firmware is by using the baseboard management controller (BMC).

The system firmware is a combination of the BMC firmware and the PNOR firmware. To update the system firmware, update both the BMC firmware and the PNOR firmware by using the BMC.

Note: System firmware update from the BMC Web GUI is only supported on Google Chrome and Mozilla Firefox browsers.

 

Complete the following steps to update the BMC firmware:

1. Log in to the BMC by entering the user name and password. Then, press Enter.

2. From the Maintenance list on the BMC dashboard, select BMC Update.

3. In the BMC Update window, select Enter Update Mode. Click OK.

4. In the BMC Upload window, choose the .bin file from your local system folder and click Upload Firmware. Wait for the file to be uploaded. Then, click OK.

5. The existing and new versions of the BMC firmware are displayed.  Ensure that the Preserve Configuration check box is selected and the Preserve SDR check box is not selected. Click Start Upgrade.

Note: You cannot perform other activities by using the BMC interface until the firmware update is complete.

6. The progress of the firmware update is displayed. After the BMC update is complete, the system is restarted.

7. After the restart of the system is complete, verify the firmware revision level in the System menu of the BMC dashboard.

 

Complete the following steps to update the PNOR firmware:

1. Log in to the BMC by entering the user name and password. Then, press Enter.

2. From the Maintenance list on the dashboard, select PNOR Update.

3. In the PNOR Upload window, choose the .pnor file from your local system folder and click Upload PNOR. Wait for the file to be uploaded. Then, click OK.

4. The existing and new dates of the PNOR firmware are displayed. Click Start Upgrade.

Note: You cannot perform other activities by using the BMC interface until the PNOR update is complete.

5. The progress of the PNOR update is displayed. After the PNOR update is completed, the system must be restarted to finish installation of the new PNOR firmware.

 

For more information on updating the firmware using the BMC, refer to the following link:

https://www.ibm.com/support/knowledgecenter/POWER9/p9eit/p9eit_update_firmware_bmc.htm

7.6  System I/O Firmware

System I/O devices have firmware that can be updated.  

Please see the IBM Knowledge Center for the 9006-12P and 9006-22P for applicable I/O firmware update information.

9006-12P:

http://www.ibm.com/support/knowledgecenter/POWER9/p9hdx/9006_12p_landing.htm

9006-22P:

http://www.ibm.com/support/knowledgecenter/POWER9/p9hdx/9006_22p_landing.htm

 

 

7.6.1 Embedded 12Gb SAS Controller, Microsemi Adaptec SmartIOC 2000 16i

To check the firmware version on this device, use the Microsemi Adaptec CLI utility arcconf.  The command “arcconf getversion” shows information for all Microsemi Adaptec devices present on the machine.  If no other Microsemi Adaptec slot cards are installed, the embedded controller has the controller ID of “1”.

 

Part # = PM8069

Description: Embedded 12Gb SAS Controller, Microsemi Adaptec SmartIOC 2000 16i

Minimum FW level: 4.02[0] (0)

8.0 System Management and Virtualization

The service processor, or baseboard management controller (BMC), provides a hypervisor and operating system-independent layer that uses the robust error detection and self-healing functions that are built into the POWER processor and memory buffer modules. OpenPOWER Abstraction Layer (OPAL) is the system firmware in the stack of POWER processor-based Linux-only servers.

 

8.1  BMC Service Processor IPMI and Web GUI Access

The service processor, or baseboard management controller (BMC), is the primary control for autonomous sensor monitoring and event logging features on the LC server.

The BMC supports the Intelligent Platform Management Interface (IPMI) for system monitoring and management.  The BMC monitors the operation of the firmware during the boot process and also monitors the OPAL hypervisor for termination.  The firmware code update is supported through the BMC by using IPMI and the BMC Web GUI.  The GUI console is accessed using a web browser with a "http:" connection to port.  See section 1.2 for the supported browsers that can be used with BMC Web GUI.

 

8.2 OpenPOWER Abstraction Layer (OPAL)

The OpenPOWER Abstraction Layer (OPAL) provides hardware abstraction and runtime services to the running host operating system.   For these LC servers, only OPAL bare-metal installs can be used.

 

Find out more about OPAL skiboot here:

https://github.com/open-power/skiboot

 

8.3 Intelligent Platform Management Interface (IPMI)

The Intelligent Platform Management Interface (IPMI) is an open standard for monitoring, logging, recovery, inventory, and control of hardware that is implemented independent of the main CPU, BIOS, and OS.  These LC servers provide one 10M/100M baseT IPMI port.

The ipmitool is a utility for managing and configuring devices that support IPMI. It provides a simple command-line interface to the service processor.  You can install ipmitool on your workstation or on another server (preferably on the same network as the installed server), either from the Linux distribution packages or from sourceforge.net.

 

For installing ipmitool from sourceforge, please see section 1.1 "Minimum ipmitool Code Level".

 

Several good references are available for more information about ipmitool commands:

 

  1. The man page

  2. The built-in command line help provides a list of IPMItool commands:
    # ipmitool help

  3. You can also get help for many specific IPMItool commands by adding the word help after the command:
    # ipmitool channel help

  4. For a list of common ipmitool commands and help on each, you may use the following link:
    www.ibm.com/support/knowledgecenter/linuxonibm/liabp/liabpcommonipmi.htm

     

 

To connect to your host system with IPMI, you need to know the IP address of the server and have a valid password. To power on the server with the ipmitool, follow these steps:

1. Open a terminal program.

2. Power on your server with the ipmitool:

ipmitool -I lanplus -H bmc_ip_address -U ipmi_userid -P ipmi_password power on

3. Activate your IPMI console:

ipmitool -I lanplus -H bmc_ip_address  -U ipmi_userid -P ipmi_password sol activate

 

8.4 Petitboot bootloader

Petitboot is a kexec-based bootloader that IBM POWER9 systems use for bare-metal installs on these LC servers.

After the POWER9 system powers on, the petitboot bootloader scans local boot devices and network interfaces and returns a list of the boot options that are available to the system. If you are using a static IP address or if you did not provide boot arguments in your network boot server, you must provide the details to petitboot.  You can configure petitboot to find your boot options with the following instructions:

https://www.ibm.com/support/knowledgecenter/linuxonibm/liabp/liabppetitbootadvanced.htm

 

You can edit petitboot configuration options, change the amount of time before Petitboot automatically boots, etc. with these instructions:

https://www.ibm.com/support/knowledgecenter/linuxonibm/liabp/liabppetitbootconfig.htm

 

After you select to boot the ISO media for the Linux distribution of your choice, the installer wizard for that Linux distribution walks you through the steps to set up disk options, your root password, time zones, and so on.

You can read more about the petitboot bootloader program here:

https://www.kernel.org/pub/linux/kernel/people/geoff/petitboot/petitboot.html

 

 

9.0 Quick Start Guide for Installing Linux on the LC servers

This guide helps you install Linux on a Power Systems server.

Overview

Use the information found in http://www.ibm.com/support/knowledgecenter/linuxonibm/liabw/liabwkickoff.htm  to install Linux on a non-virtualized (bare metal) IBM Power LC server.  Note that the linked page offers PowerKVM as a choice, but PowerKVM is not a supported OS for these LC servers.

 

 

 

10.0 Change History

 

Date          Description
10/07/2019    OP920.21 release
04/18/2019    OP920.20 release
02/14/2019    OP920.10 release (Added firmware install notes)
10/15/2018    OP920.02 release (reverted pUpdate tool to prior version, 218)
06/13/2018    OP920.01 release
05/25/2018    OP920.00 release for Power 9 LC921 (9006-12P) and LC922 (9006-22P)