Power8 System Firmware

Applies to:   8247-21L; 8247-22L; 8247-42L; 8284-22A; 8286-41A and 8286-42A.

This document provides information about the installation of Licensed Machine or Licensed Internal Code, which is sometimes referred to generically as microcode or firmware.


1.0 Systems Affected

This package provides firmware for Power System S812L (8247-21L), Power System S822L (8247-22L), Power System S824L (8247-42L), Power System S822 (8284-22A), Power System S814 (8286-41A) and Power System S824 (8286-42A) servers only.

The firmware level in this package is:

  • SV810_108 / FW810.21

1.1 Minimum HMC Code Level

This section describes the "Minimum HMC Code Level" required by the System Firmware to complete the firmware installation process. Before starting the System Firmware update, the HMC level must be equal to or higher than the "Minimum HMC Code Level". If the HMC managing the server targeted for the System Firmware update is running a code level lower than the "Minimum HMC Code Level", the firmware update will not proceed.

The Minimum HMC Code levels for this firmware are:

        HMC V8 R8.1.0 Service Pack 1 (PTF MH01420) with Security fix (PTF MH01474) and OPENSSL POODLE Security fix (PTF MH01481), or higher.

        -OR-

        HMC V8 R8.2.0 (PTF MH01453) with Mandatory fix (PTF MH01454) and OPENSSL POODLE Security fix (PTF MH01486), or higher.

NOTE: For the firmware installation to proceed, the HMC must be updated to one of the above minimum levels before this server firmware level is installed.

For information concerning HMC releases and the latest PTFs,  go to the following URL to access Fix Central:
http://www-933.ibm.com/support/fixcentral/

For specific fix level information on key components of IBM Power Systems running the AIX, IBM i and Linux operating systems, we suggest using the Fix Level Recommendation Tool (FLRT):
http://www14.software.ibm.com/webapp/set2/flrt/home

NOTES:
        - You must be logged in as hscroot for the firmware installation to complete correctly.
        - Systems Director Management Console (SDMC) does not support this System Firmware level.

2.0 Important Information

Note: The installation of this Service Pack (SV810_108 / FW810.21) is concurrent if your system is at firmware level SV810_081 / FW810.10, SV810_087 / FW810.11 or SV810_101 / FW810.20.  Otherwise the installation will be disruptive.

Recently, several enhancements were released to improve the reliability and function of new and existing adapters used on Power8 systems. To ensure the highest level of availability and performance, it is important that the following System Firmware, IO, AIX & VIOS maintenance is performed.  For efficiency, IBM recommends that all applicable System Firmware, IO, AIX & VIOS maintenance is consolidated and performed during the same session to reduce the number of scheduled maintenance windows.

System F/W: SV810_081 / FW810.10 (or higher)
- For systems in PowerVM mode, a problem was fixed for unresponsive PCIe adapters after a partition power off or a partition reboot.

I/O:
- Device: PCIe2 4-Port (10GbE SFP+ & 1GbE RJ45) Adapter
   Feature Codes: EN0S EN0T EN0U EN0V
   Version: 30090140 (or higher)
   An enhancement added to support Network Installation on 1GB speed switch ports.

- Device: PCIe2 2-Port 10GbE Base-T Adapter
   Feature Codes: EN0W EN0X
   Version: 20110140 (or higher)
   Fixes a Network Installation issue seen with 1GB speed switch port setting.

AIX/VIOS:
- VIOS 2233/61 TL09 SP3: IV63449
- AIX 71 TL03 SP03: IV63680

For Power8 systems using NIC adapter Feature Codes (FC) EN0U, EN0V, EN0S, EN0T, EL3Z, EN0W, EN0X which translate to:
PCIe2 4-Port Adapter (10GbE SFP+)
PCIe2 4-Port Adapter (1GbE RJ45)
PCIe2 2-Port 10GbE Base-T Adapter

These APARs correct a problem that occurs when promiscuous mode is not set again after the adapter is reset (for example, when the adapter becomes the backup in SEA fail over mode or encounters a transmit error). This would cause the adapter to transmit packets but not receive packets.

Downgrading firmware from any given release level to an earlier release level is not recommended.

If you feel that it is necessary to downgrade the firmware on your system to an earlier release level, please contact your next level of support.

IPv6 Support and Limitations

IPv6 (Internet Protocol version 6) is supported in the System Management Services (SMS) in this level of system firmware. There are several limitations that should be considered.

When configuring a network interface card (NIC) for remote IPL, only the most recently configured protocol (IPv4 or IPv6) is retained. For example, if the network interface card was previously configured with IPv4 information and is now being configured with IPv6 information, the IPv4 configuration information is discarded.

A single network interface card may only be chosen once for the boot device list. In other words, the interface cannot be configured for the IPv6 protocol and for the IPv4 protocol at the same time.

Concurrent Firmware Updates

Concurrent system firmware update is supported on HMC Managed Systems only.

Memory Considerations for Firmware Upgrades

Firmware Release Level upgrades and Service Pack updates may consume additional system memory.
Server firmware requires memory to support the logical partitions on the server. The amount of memory required by the server firmware varies according to several factors.
Factors influencing server firmware memory requirements include the following:
Generally, you can estimate the amount of memory required by server firmware to be approximately 8% of the system installed memory. The actual amount required will generally be less than 8%. However, there are some server models that require an absolute minimum amount of memory for server firmware, regardless of the previously mentioned considerations.

Additional information can be found at:
http://www-01.ibm.com/support/knowledgecenter/8286-42A/p8hat/p8hat_lparmemory.htm
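
As a rough illustration of the 8% guideline above, the following sketch computes an upper-bound estimate of the memory that server firmware may reserve. The function name and the sample memory sizes are hypothetical; the 8% figure is the only value taken from this document, and the actual amount required will generally be lower and depends on the system configuration.

```python
def estimate_firmware_memory_gb(installed_gb, fraction=0.08):
    """Return a rough upper-bound estimate (in GB) of server firmware memory.

    installed_gb: total installed system memory in GB.
    fraction:     guideline fraction from this document (approximately 8%).
    """
    if installed_gb < 0:
        raise ValueError("installed memory cannot be negative")
    return installed_gb * fraction


if __name__ == "__main__":
    # Sample sizes are illustrative only, not supported configurations.
    for installed in (64, 256, 1024):
        estimate = estimate_firmware_memory_gb(installed)
        print(f"{installed:>5} GB installed -> up to about {estimate:.1f} GB for firmware")
```

Some server models require an absolute minimum amount of firmware memory, so small configurations may need more than this estimate suggests.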


3.0 Firmware Information

Use the following examples as a reference to determine whether your installation will be concurrent or disruptive.

For systems that are not managed by an HMC, the installation of system firmware is always disruptive.

Note: The concurrent levels of system firmware may, on occasion, contain fixes that are known as Deferred and/or Partition-Deferred. Deferred fixes can be installed concurrently, but will not be activated until the next IPL. Partition-Deferred fixes can be installed concurrently, but will not be activated until a partition reactivate is performed. Deferred and/or Partition-Deferred fixes, if any, will be identified in the "Firmware Update Descriptions" table of this document. For these types of fixes (Deferred and/or Partition-Deferred) within a service pack, only the fixes in the service pack which cannot be concurrently activated are deferred.

Note: The file names and service pack levels used in the following examples are for clarification only, and are not necessarily levels that have been, or will be released.

System firmware file naming convention:

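01SVxxx_yyy_zzz

  • xxx is the release level
  • yyy is the service pack level
  • zzz is the last disruptive service pack level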

NOTE: Values of service pack and last disruptive service pack level (yyy and zzz) are only unique within a release level (xxx). For example, 01SV810_040_040 and 01SV820_040_045 are different service packs.

An installation is disruptive if:

  • The release levels (xxx) are different.
            Example: Currently installed release is 01SV810_040_040, new release is 01SV820_050_050.

  • The service pack level (yyy) and the last disruptive service pack level (zzz) are the same.
            Example: SV810_040_040 is disruptive, no matter what level of SV810 is currently installed on the system.

  • The service pack level (yyy) currently installed on the system is lower than the last disruptive service pack level (zzz) of the service pack to be installed.
            Example: Currently installed service pack is SV810_040_040 and new service pack is SV810_050_045.

An installation is concurrent if:

  • The release level (xxx) is the same, and
  • The service pack level (yyy) currently installed on the system is the same or higher than the last disruptive service pack level (zzz) of the service pack to be installed.
            Example: Currently installed service pack is SV810_040_040, new service pack is SV810_071_040.
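
The concurrent and disruptive rules above can be sketched as a small check. This is an illustrative sketch, not an IBM tool: the function names and the regular expression are assumptions, and the rule that a level whose service pack level equals its last disruptive service pack level is always disruptive is inferred from the second disruptive example. It does not replace the determination made during the actual update.

```python
import re
from typing import NamedTuple


class FirmwareLevel(NamedTuple):
    release: str          # xxx part together with its prefix, for example "SV810"
    service_pack: int     # yyy
    last_disruptive: int  # zzz


# Accepts names such as "01SV810_108_081", "SV810_108_081" or "01SV810_108_081.rpm".
_LEVEL_RE = re.compile(r"^(?:01)?([A-Z]{2}\d{3})_(\d{3})_(\d{3})(?:\.rpm)?$")


def parse_level(name):
    """Split a firmware level name into release, yyy and zzz."""
    match = _LEVEL_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized firmware level name: {name!r}")
    return FirmwareLevel(match.group(1), int(match.group(2)), int(match.group(3)))


def is_concurrent(installed, new, hmc_managed=True):
    """Return True if installing `new` over `installed` would be concurrent."""
    if not hmc_managed:
        return False  # systems not managed by an HMC are always disruptive
    current, target = parse_level(installed), parse_level(new)
    if current.release != target.release:
        return False  # different release levels
    if target.service_pack == target.last_disruptive:
        return False  # inferred rule: yyy == zzz is always disruptive
    return current.service_pack >= target.last_disruptive


if __name__ == "__main__":
    pairs = [
        ("01SV810_040_040", "01SV820_050_050"),
        ("SV810_040_040", "SV810_050_045"),
        ("SV810_040_040", "SV810_071_040"),
    ]
    for installed, new in pairs:
        kind = "concurrent" if is_concurrent(installed, new) else "disruptive"
        print(f"{installed} -> {new}: {kind}")
```

The three pairs in the usage section are the examples from this document; the first two evaluate as disruptive and the third as concurrent.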

3.1 Firmware Information and Description

 
Filename               Size        Checksum
01SV810_108_081.rpm    90813536    15139

Note: The checksum can be found by running the AIX sum command against the rpm file (only the first 5 digits are listed).
For example: sum 01SV810_108_081.rpm
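
Where the AIX sum command is not at hand, a 16-bit checksum can be approximated with the sketch below. It implements the classic BSD byte-by-byte rotating checksum. That the AIX sum command uses this algorithm by default is an assumption that should be verified against the AIX documentation, and the function names are hypothetical. Treat the output of the sum command itself as authoritative.

```python
def bsd_sum16(chunks):
    """Compute the classic BSD 16-bit rotating checksum over byte chunks.

    Assumption: this matches the default output of the AIX sum command.
    """
    checksum = 0
    for chunk in chunks:
        for byte in chunk:
            # Rotate the 16-bit checksum right by one bit, then add the byte.
            checksum = (checksum >> 1) | ((checksum & 1) << 15)
            checksum = (checksum + byte) & 0xFFFF
    return checksum


def file_chunks(path, size=1 << 20):
    """Yield the contents of a file in blocks of at most `size` bytes."""
    with open(path, "rb") as handle:
        while True:
            block = handle.read(size)
            if not block:
                break
            yield block


if __name__ == "__main__":
    import sys

    # Usage: python bsd_sum16.py 01SV810_108_081.rpm
    for name in sys.argv[1:]:
        print(f"{bsd_sum16(file_chunks(name)):05d} {name}")
```

If the printed value differs from the checksum listed above, confirm the result with the sum command on AIX before concluding that the file is damaged.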

SV810

For Impact, Severity, and other firmware definitions, refer to the 'Glossary of firmware terms' at the following URL:
http://www14.software.ibm.com/webapp/set2/sas/f/power5cm/home.html#termdefs

The complete Firmware Fix History for this Release Level can be reviewed at the following URL:
http://download.boulder.ibm.com/ibmdl/pub/software/server/firmware/SV-Firmware-Hist.html

SV810_108_081 / FW810.21

01/09/15
Impact: Security         Severity:  SPE

System firmware changes that affect all systems

  • A problem was fixed to prevent the Advanced System Management Interface (ASMI) "System Service Aids/Factory Configuration" panel option from restoring to factory configuration for FSP or ALL if one boot side of the service processor is marked invalid.  The following informational message is issued:  "The request cannot be performed because a firmware boot side is marked invalid.  This state may have been caused by a previous firmware update failure."
  • A problem was fixed for firmware updates from USB to allow the code update progress to be seen with the addition of progress code C100B100.  This progress code means that the firmware update is busy unpacking the firmware image file and that the USB key should not be removed until the operation is completed.
  • A security problem was fixed in OpenSSL for padding-oracle attacks known as Padding Oracle On Downgraded Legacy Encryption (POODLE).  This attack allows a man-in-the-middle attacker to obtain a plain text version of the encrypted session data. The Common Vulnerabilities and Exposures issue number is CVE-2014-3566.  The service processor POODLE fix is implemented by disabling SSL protocol SSLv3 and requiring TLSv1.2 protocol on all secured connections.  The Hardware Management Console (HMC) also requires a POODLE fix for APAR MB03867(FIX FOR CVE-2014-3566 FOR HMC V8 R8.1.0 SP1 with PTF MH01481).  This HMC minimum requirement is enforced by the firmware update process for this defect.
  • A security problem was fixed in OpenSSL for memory leaks that allowed remote attackers to cause a denial of service (out of memory on the service processor). The Common Vulnerabilities and Exposures issue numbers are CVE-2014-3513 and CVE-2014-3567.
  • A problem was fixed for two light-emitting diodes (LEDs) turning on incorrectly on the operator panel after a system power off.  These LEDs are the blue LED (Identify) and the amber LED (enclosure fault indicator LED with the exclamation point symbol ("!")).

System firmware changes that affect certain systems

  • On systems with partitions using shared processors, a problem was fixed that could result in latency or timeout issues with IO devices.

SV810_101_081 / FW810.20

10/24/14
Impact: Availability    Severity: HIPER

New features and functions

  • Support for the IBM Power System S824L (8247-42L).
  • Support for NEBS-3 48VDC 750 W power supply with CCIN 51D8 and F/C #EB3H on the S822 (8284-22A) and the S822L (8247-22L).
  • Support for 128 GB CDIMM DDR3 DRAM with F/C #EM8E on the IBM Power System S824 (8286-42A).  These need to be ordered in pairs and each DIMM within a DIMM pair must be of the same capacity.
  • Support for the Nvidia Compute Intensive Accelerator (PCIe attached GPU) with F/C #EC47.  This feature is only supported on the IBM Power System S824L (8247-42L).  It is a PCIe 3 X16/Long/Full High/Double wide adapter with the PCIe connection in the left slot.
  • Support was added to enable fast sleep on OPAL systems, allowing for significant power savings.
  • Support for an Intelligent Platform Management Interface (IPMI) enhancement to provide a host Linux boot device path on OPAL systems.
  • Enhancement to the service processor dump for easier problem debugging by collecting full kcore dumps as a gzipped file instead of truncating the large kcore files.
  • Enhancement made to the Advanced System Management Interface (ASMI) "System Service Aids/Factory Configuration" menu to clear all firmware NVRAM for PowerVM and OPAL, regardless of the current firmware selection.  Previously, only the NVRAM for the current firmware type was cleared.
  • Support for additional PCIe adapters, which had previously been supported on Power7+ and earlier servers, to help with server migration:
        Ethernet 1 Gb LAN: 2-port UTP/TX (#5767, #5281), 2-port SX (#5768, #5274), and 4-port UTP/TX (#5717, #5271)
        Ethernet and FCoE: 2-port 10 Gb (#5708, #5270)
        SAS:  3-port 6 Gb/1.8 GB cache (#5913, #ESA3)

System firmware changes that affect all systems

  • A problem was fixed in the error handling of memory channel failures with SRC B181E540 to prevent false processor errors with SRC B113E504 during the next IPL after the memory fault.
  • A problem was fixed for L4 cache errors being assigned an incorrect subsystem of "Memory Controller" in the SRC B121E504 error log instead of "Memory Fru".    L4 cache resides on the DIMM and is not a memory controller.
  • A problem was fixed in the Advanced System Management Interface (ASMI)  "Performance Setup/Logical Memory Block Size" menu that prevented the user from selecting valid Logical Memory Block (LMB) sizes because they were greyed out.
  • A problem was fixed to capture missing trace data for the hardware compression accelerator (NX) checkstop failures to allow for easier debug of the failures.
  • A problem was fixed to add call outs for the operations panel FRU for SRCs B1504804 and B1504805 for operation panel failures.  The FRU call out had been missing in the error log.
  • A problem was fixed that caused the system to hang in the IPL state during a system dump with SRC B182901E shown in the error log.  The hang occurred when system dump detected a prior system dump already in place.  The second system dump would normally be bypassed to allow the IPL to complete.
  • A problem was fixed for the service processor error log handling that caused SRC B150BAC5 errors when converting an error log entry from an object into a flattened array of bytes.
  • A problem was fixed for truncated fan part numbers in the FRU call outs of SRC 110076111 so that 4U systems (8286-41A, 8286-42A, 8247-42L) have FRU 00FV629 for the 80 mm fan and the 2U systems (8284-22A, 8247-21L, 8247-22L)  have FRU 00FV726 for the 60 mm fan.  FRU 00FV62 and FRU 00FV72 were being incorrectly reported, showing the right-most character of the part number truncated.
  • A problem was fixed in the fault isolation of FRUs for errors in the Time Of Day (TOD) oscillator topologies and the processors to reduce the number of incorrect call outs.  When a problem is detected in a connection between the processor and TOD oscillator,  the oscillator is now called out with high priority and processor with low priority but neither is guarded to prevent unnecessary loss of system resources.
  • A problem was fixed with the DIMM pairing rules to ensure that only the one DIMM that is the paired mate of a failing or missing DIMM is guarded.  An error in the pairing rules was causing additional DIMMs to be called out and guarded in the case of a single DIMM failure.
  • A problem was fixed so that when a L2/L3 cache repair cannot be performed because there is no repair available, the error log written is a Predictive Error instead of a hidden Recoverable Error.  This improves the customer awareness that the processor cache is becoming degraded.

System firmware changes that affect certain systems

  • HIPER/Pervasive:  On systems using PowerVM firmware, a performance problem was fixed that may affect shared processor partitions where there is a mixture of dedicated and shared processor partitions with virtual IO connections, such as virtual ethernet or Virtual IO Server (VIOS) hosting, between them.  In high availability cluster environments this problem may result in a split brain scenario.
  • On systems using OPAL firmware, a performance problem was fixed where the On-Chip Controller (OCC) failed to establish a session to OPAL, resulting in all the system processors being set to minimum (safe mode) frequencies.
  • On systems using PowerVM firmware, a problem was fixed for systems in networks using the Juniper 1GbE and 10GbE switches (F/Cs #1108, #1145, and #1151) to prevent network ping errors and boot from network (bootp) failures.  The Address Resolution Protocol (ARP) table information on the Juniper aggregated switches is not shared between the switches, which causes problems for address resolution in certain network configurations.  Therefore, the CEC network stack code has been enhanced to send three gratuitous ARPs (ARP replies sent without a request received) before each ping and bootp request to ensure that all the network switches have the latest network information for the system.
  • On systems using OPAL firmware,  a problem was fixed for the 10/1Gb Ethernet adapter (F/C #EL3Z) where it failed by rebooting into the wrong endian mode.
  • On systems using PowerVM firmware, a problem was fixed for a false error message displayed on the management console during firmware code updates that include Concurrent Core Initialization (CCI) for the processors.  All processor cores are correctly initialized but the management console displays this message:   "An open serviceable event related to system firmware was found.  The firmware update process will not be interrupted.  Please address any open serviceable events on the system(s) ...  HSCF0223".
  • On systems using PowerVM firmware, a problem was fixed so that a system dump with Advanced System Management Interface (ASMI) server firmware content of "maximum" or "HCA IO" will not cause the system to fail with a SRC B700F103.  There is no Infiniband (IB) Host Channel Adapter (HCA) on an IBM Power8 system so this caused an unexpected problem in the hypervisor dump data collection for IB adapters.
  • On systems using PowerVM firmware,  a problem was fixed for network boot/install using a null pointer when network adapter buffers are depleted and failing the boot with a SRC BA210003 - "Partition firmware detected a data storage error".
  • On the IBM Power System S824 (8286-42A)  with IBM i partitions, a problem was fixed to block a non-applicable  IBM i console warning message "CPF9E17 - Usage limit exceeded - operator action required".  IBM i software license key 5722-SS1 feature 5052, the user entitlement key for the number of users who are authorized to use the operating system, is not required for the 8286-42A system.  This system has the Software Tier P20 licensing, which does not have user based licensing and includes the 5250 features.
  • On systems using OPAL firmware,  a problem was fixed when switching into the PowerVM mode to prevent the management console from going into recovery mode.
  • On systems using PowerVM firmware, a problem was fixed for a hypervisor time-keeping services topology failover that caused errors to be wrongly attributed to the new time-of-day topology, resulting in processor FRUs being guarded falsely.
  • On systems with a PCIe dual-x4 SAS adapter (F/C #5901, #5278, or #EL10), a problem was fixed for the system fans running too fast and loud.  This PCIe adapter was incorrectly assigned a hot PCIe rating and this caused the system fans to go to high speed for the required extra cooling.
    This fix is not applicable to the IBM Power System S824L (8247-42L).
  • On systems using OPAL firmware,  a problem was fixed for CAPP (Coherent Attached Processor Proxy) system checkstops that should have been recoverable errors.
  • On systems using OPAL firmware,  a problem was fixed for the CEC memory controllers to increase the operation time-out value to be able to handle long-running Coherent Accelerator Processor Interface (CAPI) and Peripheral Component Interconnect Express (PCIe) operations.
  • On systems using OPAL firmware, a problem was fixed in the Advanced System Management Interface (ASMI) "Real Time progress indicator" to not delete the first character of the second line of the display.
  • On systems using PowerVM firmware, a problem was fixed to allow booting off an iSCSI device.  For the failure, the partition firmware error logs had SRC BA012010 "Opening the TCP node failed." and SRC BA010013 "The information in the error log entry for this SRC provides network trace data."  The open firmware standard output trace showed SRC BA012014  "The TCP re-transmission count of 8 was exceeded. This indicates a large number of lost packets between this client and the boot or installation server" followed by SRC BA012010.
  • On systems using PowerVM firmware, a problem was fixed for partition firmware stack corruption that would cause spurious output to the console for failed ping or network boot operations.  When a stack imbalance is encountered, text is displayed on the console indicating a stack depth error along with a number of values and the text string "CUTILS" similar, in format, to the following:
                6 1 2 2 0 da15b007 22901dc
                CUTILS: bad exit depth? SCHEDULER call-c-wrapper exit: depth=7 , _indepth=4 , _#inparms=0
  • On systems using PowerVM firmware, a problem was fixed so that the thermal and power management tunable parameters for the On-Chip Controller (OCC) in the Advanced System Management Interface (ASMI) "System Configuration/Power Management/Tuning Parameters" are not set back to the defaults when the CEC is powered off.
  • On systems using PowerVM firmware, a problem was fixed in checkstop error recovery to force a re-IPL instead of a system termination for checkstops that occur during memory-preserving IPLs.  This allows the system to recover from the IPL error without any operator intervention needed.

SV810_087_081 / FW810.11

09/26/14
Impact: Data            Severity:  HIPER

System firmware changes that affect certain systems

  • HIPER/Pervasive:  A problem was fixed in PowerVM where the effect of the problem is non-deterministic but may include undetected corruption of data.  This problem can occur if VIOS (Virtual I/O Server) version 2.2.3.x or later is installed and either one of the following statements is true:

    (A) A storage adapter (including Fibre Channel) is assigned to a VIOS and shared between multiple partitions (one of which must be an IBM i partition, others can be AIX, Linux or IBM i partitions), and at least one of the other partitions is performing LPM (Live Partition Mobility) or an immediate or abnormal shutdown operation.

    -or-

    (B) A Shared Ethernet Adapter (SEA) with fail over enabled is configured on the VIOS.

SV810_081_081 / FW810.10

09/08/14
Impact: Availability    Severity: SPE

New features and functions

  • Extended the availability of the IBM Power System S812L (8247-21L) that was enabled in the 810.00 release.
  • Expansion of maximum number of SAS drives on Power System S814 (8286-41A) from 8 (SSD, disk, or combination thereof) to 10 drives.
  • Support for SAS EXP24S expansion drawer (#5887, #EL1S) attached using a PCIe slot.
  • Support for large M64 based BARs for systems in the OPAL environment.
  • Fan speed settings were enhanced for the case of systems with fan failure to set the speed based on system thermal conditions instead of forcing all remaining fans to an overdrive speed setting.
  • Support for a PCIe Gen3 FPGA x 16 slot adapter that acts as a co-processor for the POWER8 processor chip for gzip compressions and decompressions.  Feature codes #EJ12 and #EJ13 are electronically identical with the same CCIN of 59AB.  #EJ12 has full high tail stock and is supported by 8286-41A and 8286-42A.  #EJ13 has a low profile tail stock and is supported by 8284-22A.  OS levels supported are AIX 6.1 and AIX 7.1 or later.  IBM i and Linux are not supported.
  • Support for use of system and partition templates on the management console.
  • Support for Coherent Accelerator Processor Interface (CAPI) for the PCIe Gen 3 FPGA on OPAL.  Operating system supported is Linux.
  • Support was added to allow concurrent initialization of the processor cores.  This expands the range of concurrent firmware updates to accommodate core initialization changes and also allows for dynamic repairs of processor and cache memory.
  • Support was added for cache memory L2/L3 column repair to allow concurrent repair of memory and propagation of memory errors for better fault isolation of memory components.
  • The system operator panel was enhanced to show the firmware mode of the system during the IPL of either PowerVM or OPAL for panel function 1.
  • The service processor Processor Runtime Diagnostics (PRD) was enhanced to collect debug data for failures in host boot initialization for the Self-Boot Engine (SBE).
  • Support was added to the Advanced System Management Interface (ASMI) USB menu to allow a system dump to be collected to USB with the power on to the system.  This allows the dump to be collected with the system memory state intact.
  • Support for enhanced 10 Gb ethernet adapters that were previously announced for Power8 for AIX NIM (Network Install Management) or Linux Network Install capability.  The enhanced adapters are the following:
        PCIe2 4-port(10Gb+1GbE) SR+RJ45 Adapter (#EN0S, #EN0T)
        PCIe2 4-port(10Gb+1GbE) SFP+Copper+RJ45 Adapter (#EN0U, #EN0V)
        The level of adapter microcode required is level 20100130 or later.

        PCIe2 LP 2-port 10/1GbE BaseT RJ45 Adapter (#EN0W, #EN0X, #EL3Z)
        The level of adapter microcode required is level 30080130 or later.
  • Support for a new 4-port Ethernet Adapter with two 10 Gb and two 1 Gb ports (#EN0M, #EN0N with CCIN 2CC0). The adapter offers NIC and FCoE over its 10 Gb ports and NIC over the 1 Gb ports and is SR-IOV capable.  The 10 Gb ports are LR (long range) fiber optic, supporting distances up to 10 km.  Except for the transceivers and cabling of the 10 Gb ports, this adapter is functionally identical to the 4-port adapter (#EN0H, #EN0J, #EL38) SR optical and (#EN0K, #EN0L, #EL3C) active copper twinax.
  • Support for a new PCIe 2-port Async adapter (#EN27, #EN28) that serves the same function as the  predecessor PCIe 2-port Async adapter (#5289, #5290) on the Power7+ and earlier servers.    This adapter provides connection for 2 asynchronous EIA-232 devices. Ports are programmable to support EIA-232 protocols, at a line speed of 128K bps. Two RJ45 connections are located on the rear of the adapter. To attach to devices using a 9-pin (DB9) connection, use an RJ45-to-DB9 converter. For convenience, one converter is included with this feature. One converter for each connector needing a DB9 connector is needed.
  • Support for additional PCIe adapters, which had previously been supported on Power7+ and earlier servers, to help with server migration:
        Ethernet 10 Gb LAN: 1-port optical SR (#5769, #5275)
        Ethernet and FCoE: 4-port 10 Gb/1 Gb Copper (#EN0K, #EN0L, #EL3C)
        Ethernet RoCE: 2-port 10 Gb copper (#EC27, #EC28, #EL27)
        Fibre Channel: 2-port 4 Gb (#5774, #5276, #EL09)
        SAS: 2-port 3 Gb 380 MB cache (#5805)
  • Support was added for a new Advanced System Management Interface (ASMI) menu to allow the user to choose between an IPMI or a serial console when in OPAL mode.

System firmware changes that affect all systems

  • A problem was fixed in the service processor that caused the SRC B1504804 to be logged as many as 30 times over five minutes for an operations panel voltage regulator error.  The error logging has been reduced to one SRC for this error.
  • A problem was fixed to prevent an intermittent system hang, lasting until IPL time-out, after a processor core checkstop.  This secondary failure after a core checkstop had a low probability of occurring.
  • A problem was fixed to maintain time-of-day (TOD) clock redundancy for the hypervisor time-keeping services in the case of a TOD error and fail-over to the backup clock topology.  There was a failure in the TOD fail-over process to correctly assign the new backup TOD topology, causing loss of redundancy for the next TOD error.
  • A problem was fixed for the service processor reset/reload process to eliminate an extra dump and SRC B1818601 caused by an internal core dump during the reset/reload.
  • A problem was fixed for a processor error with an incorrect call out of a memory card with SRC B124E504 to eliminate the memory card FRU call out.  The processor error call out of SRC B170E540 was correct.
  • A problem was fixed in the Advanced System Management Interface (ASMI) menus to restore factory settings so that the default for the Hypervisor mode (PowerVM or OPAL) was restored to the factory setting using "System Service Aids/Factory Configuration/Service Processor Reset/All Reset".
  • A problem was fixed in how the processor clock speed was reported to the hypervisor, causing the partitions to show a clock speed that was about 200 MHz faster than the actual processor clock speed.
  • A problem was fixed for DRAM repair for the case where two DRAM modules are having failures at the same rank such that spares are used to repair each DRAM error.  Without the fix, the second DRAM is not repaired and could eventually be called out and guarded with a UE SRC.
  • A problem was fixed for system hardware dump collection to collect all the hardware registers by stopping all functional clocks before starting the collection.
  • A problem was fixed for repairing spare memory DRAM so that repair solutions for failed spares persist across IPLs of the system by getting the repair solutions written to the Vital Product Data (VPD) of the DRAM.
  • A problem was fixed in the Advanced System Management Interface (ASMI) menus to change the name of the "Hypervisor Configuration" menu to "Firmware Configuration" to more accurately describe the menu function of being able to change firmware between the PowerVM and OPAL modes.
  • A problem was fixed in the Advanced System Management Interface (ASMI) menus to move the IPMI password reset operation from the "Firmware Configuration" menu to the "Login Profile/Change password" menu.  This change was made to put all the password change operations together under one menu.
  • A problem was fixed in the Advanced System Management Interface (ASMI) menu for "Resource Dump" to give the message "This feature is not supported for OPAL environments" when the system is in OPAL mode.  Previously, ASMI incorrectly stated that the "Resource Dump" function was not supported on the machine type.
  • A problem was fixed in the service processor to add missing call outs for the memory buffer and memory controller FRUs when there is a time-out error on the power bus with PE SRC logged of B170E540.
  • A problem was fixed in memory diagnostics and fault isolation that deconfigured more memory than necessary for memory errors.
  • A problem was fixed that caused the Utility COD display of historical usage data to be truncated on the management console.
  • A problem was fixed to eliminate service processor dumps after AC power cycles of the CEC.
  • A problem was fixed to add a missing hardware call out for service processor FSI bus errors logged with SRC BC8A0A11.  This causes the failing hardware to be deconfigured and guarded for the next IPL of the system.
  • A problem was fixed so that if an IPL failure occurs that causes the system to power off,  error SRCs will be logged instead of the system hanging for ten minutes and not logging any SRCs.
  • A problem was fixed in the system dump data collection for missing memory data to collect memory data after hardware de-configuration checkstop errors.
  • A problem was fixed for in-band code update to prevent loss of a processor support interface (PSI) link that is in a backup role.
  • A problem was fixed in system dump collection for a system hang after a checkstop.  The system failed to go to terminate state and reboot.
  • A problem was fixed in system dump collection to return full dump data when a secondary error occurs during dump data collection for the checkstop primary error.
  • A problem was fixed in the Advanced System Management Interface (ASMI) menu "System Configuration/Hardware Deconfiguration/Memory Deconfiguration" to be able to manually configure and deconfigure DIMMs.
  • A problem was fixed for system terminations that could occur as a result of PCIe adapters using a Level Signaled Interrupt (LSI) before the hypervisor interrupt handler was ready.  This could occur during PCIe adapter recovery for an error with SRCs logged of B7006970 and B700B971.  The PCIe adapters are now held in reset until initialization sequences are completed to ensure all interrupt handlers are ready for PCIe adapter interrupts.
  • A problem was fixed for a management console firmware update "Remove and Activate" operation that fails to activate the OCC (On-Chip Controller for thermal and power management) new code level with SRCs logged of B18B2616 and B1812601.  An IPL is needed to activate the OCC code level to complete the firmware update.
  • A problem was fixed for IPL failures caused by Host Boot PNOR memory corruption.  If an IPL Terminate Immediate (TI) from Host Boot has an SRC without a specific reason code, a corruption check on the Host Boot memory partitions is run and the Host Boot partitions are corrected to recover them.
  • A problem was fixed for the power usage regulation of memory to keep memory power usage below its specified limits.  Insufficient memory throttling was allowing the memory to consume power past its set limits, leaving the system exposed to power faults or unexpected power throttling in other areas of the system.
  • A problem was fixed to guard cores on hang errors.  A processor core was not being guarded on hang errors where a core timed-out waiting for an instruction to complete.
  • A problem was fixed to allow memory diagnostics during a re-IPL of the CEC, ensuring that problem memory will be guarded or recovered and preventing possible error log flooding with memory errors.
  • A problem was fixed for system dump process memory corruption that could cause the wrong dump type to be created for a system failure, resulting in a system dump with the wrong content.
  • A problem was fixed for a service processor reset/reload causing an FSP dump with a Firmware Database (fwdb) core dump captured within it.
  • A problem was fixed for a processor core forward progress parity error so that the core could be guarded without causing a system checkstop.
  • A problem was fixed in the run time diagnostics of DIMMs to read the raw card type correctly, preventing failures in the memory repair.
  • A problem was fixed to prevent an intermittent hostboot IPL deadlock/hang in the deferred work queue with progress code CC009543 and termination with SRC B1813450.
  • A problem was fixed in memory diagnostics to be able to handle multiple DIMM failures without a time-out failure, reducing the amount of memory that needs to be guarded for the errors.
  • A problem was fixed in DIMM initialization to prevent intermittent B181BA08 DIMM failures in host boot during IPL.
  • A problem was fixed to call home guarded FRUs on each IPL.  Only the initial failure of the hardware was being reported to the error log.
  • A problem was fixed for the incorrect fan FRU call outs of SRC 110076111 so that 4U systems (8286-41A, 8286-42A) have FRU 00FV629 for the 80 mm fan and the 2U systems (8284-22A, 8247-21L, 8247-22L)  have FRU 00FV726 for the 60 mm fan.
  • A problem was fixed for a memory write error becoming a system checkstop instead of being handled by the memory error handling and recovery processes.
  • A problem was fixed for the error processing of processor core checkstops at runtime to not ignore the guard on the failed core on the next IPL of the system, thus preventing additional failures with the next IPL during host boot.
  • A problem was fixed for error recovery for a failed processor that has all cores guarded such that host boot is able to re-IPL using the working processor.   In certain situations, the re-IPL on the good processor was failing with SRC B113E504 with PRD signature PB_CENT_CRESP_ADDR_ERROR.
  • A problem was fixed for run-time guarding of a processor core that had resulted in a system checkstop when the core guard attempt failed.  The processor with the non-guarded broken core caused the On-Chip Controller (OCC) to have a power measurement time-out to the processor with SRC B1102A00 that resulted in the system termination.
  • A problem was fixed to prevent incorrect logging of SRC 11007221 whenever the operator panel is missing (or broken).  This SRC indicates ambient temperature of the system is too high and a performance throttle may occur to lower the temperature, causing performance loss.  A missing operator panel should not cause lower performance of the system.
  • A problem was fixed for undefined hardware states in the system that caused an early IPL failure with SRC B1101314 when configuring the Self Boot Engine (SBE) for hostboot.
  • A problem was fixed for the Operator panel where the Enclosure Fault LED was swapped with the Attention/Check Log LED.
  • A problem was fixed for memory diagnostics to guard all unusable memory due to a channel failure.  This prevents the hypervisor from trying to start partitions with memory associated with the bad channel and having the partition crash.
  • A problem was fixed to ensure all memory is scrubbed for correctable errors to prevent run-time memory failures and possible checkstops.  If memory scrubbing actions found the preceding memory rank had persistent ECC errors, the next rank of memory was sometimes skipped.
  • A problem was fixed in the Hostboot Self Boot Engine (SBE) to re-IPL without guarding the processor on an SBE step that has infrequent failures that are recoverable with a retry.

System firmware changes that affect certain systems

  • A problem was fixed for processor local bus errors during an IPL to call out the master and slave bus components with a BC14090F SRC to identify all the possible failing components.  For the problem, only the bus slave components were being called out on bus error leaving open the possibility that the faulty component might not be guarded or repaired.
  • On systems that have a boot disk located on a SAN, a problem was fixed where the SAN boot disk would not be found on the default boot list and then the boot disk would have to be selected from SMS menus.  This problem would normally be seen for new partitions that had tape drives configured before the SAN boot disk.
  • On systems in IPv6 networks, a problem was fixed for DHCP where a duplicate address detection (DAD) message to the DHCP-client on the service processor could fail, resulting in duplicate IP addresses being configured on the network.
  • On systems that have Active Memory Sharing (AMS) partitions, a problem was fixed for Dynamic Logical Partitioning (DLPAR) for a memory remove, leaving a logical memory block (LMB) in an unusable state until partition reboot.
  • On systems in IPv6 networks, a problem was fixed for a network boot/install failing with SRC B2004158 and IP address resolution failing using neighbor solicitation to the partition firmware client.
  • On systems in Dynamic Power Saver (DPS) mode, a problem was fixed so SRC B1812A61 is not logged when power throttling is needed for a workload over the power capacity.  In DPS mode, a system power usage adjustment is not an error condition.
  • On systems in OPAL mode,  a problem was fixed for OPAL network boots to add retries to DHCP to prevent network boot time-out errors caused by network lags and slow downs.
  • On systems in OPAL mode, a problem was fixed in the fault isolation procedures to not call out hardware FRUs for software failures to reduce loss of hardware on errors.
  • On systems in PowerVM mode, a problem was fixed in Live Partition Mobility (LPM) for systems at or near the new 32K maximum for virtual devices, where insufficient space existed to store the device attributes of the migrated system, causing RMC failures and incorrect MTMS values for the migrated partition.
  • On systems in PowerVM mode,  a problem was fixed for I/O adapters so that BA400002 errors were changed to informational for memory boundary adjustments made to the size of DMA map-in requests.  These size adjustments were marked as UE previously for a condition that is normal.
  • On Power8 2U systems, a problem was fixed for the C5 PCIe slot failing.  This PCIe configuration was not supported on the 8284-22A, 8247-21L, and 8247-22L systems.
  • On Power8 2U systems, a problem was fixed  in the fan speed management to lower the maximum RPMs of the fans and reduce the noise level of the system.  This problem affects the 8284-22A, 8247-21L, and 8247-22L systems.
  • On systems in PowerVM mode using dedicated processors, a problem with concurrent firmware update was fixed to prevent a quiesce of the hypervisor process that can result in a system hang.
  • On systems in PowerVM mode, a problem was fixed for unresponsive PCIe adapters after a partition power off or a partition reboot.
  • On systems with 64GB DIMM memory (F/C #EM8D), a problem was fixed to allow 64GB DIMM memory error-correcting code (ECC) repairs instead of logging a predictive error with no repair to the memory.

SV810_061_054 / FW810.02

07/29/14
Impact: Data            Severity:  HIPER

System firmware changes that affect all systems

  • HIPER/Pervasive: A problem was fixed in PowerVM where the usage of P8 transactional memory and vector facilities could result in undetected corruption of data if the system is running in Power8 native mode. OS levels that support Power8 native mode are RHEL 7 and AIX 7.1 TL3 SP3 and later.

System firmware changes that affect certain systems

  • HIPER/Pervasive: A problem was fixed with Live Partition Mobility (LPM) on PowerVM when migrating a partition between two Power8 systems that are running in Power8 native mode. This problem could result in unpredictable behavior when the partition resumes execution on the target system, including potential undetected corruption of data, a system crash, or a partition crash. OS levels that support Power8 native mode are RHEL 7 and AIX 7.1 TL3 SP3 and later.
  • A problem was fixed for an IBM i D-mode IPL failure with SRC B2003110 when the alternative load source could not be found.  If a system encounters this issue prior to installing the fix, the Service Pack can be applied via the management console or using a USB flash drive with the system powered off.

SV810_058_054 / FW810.01

06/23/14
Impact: Security         Severity:  HIPER

System firmware changes that affect all systems

  • HIPER/Pervasive:  A security problem was fixed in the OpenSSL (Secure Socket Layer) protocol that allowed clients and servers, via a specially crafted handshake packet, to use weak keying material for communication.  A man-in-the-middle attacker could use this flaw to decrypt and modify traffic between the management console and the service processor.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-0224.
  • HIPER/Pervasive:  A security problem was fixed in OpenSSL for a buffer overflow in the Datagram Transport Layer Security (DTLS) when handling invalid DTLS packet fragments.  This could be used to execute arbitrary code on the service processor.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-0195.
  • HIPER/Pervasive:  Multiple security problems were fixed in the way that OpenSSL handled read and write buffers when the SSL_MODE_RELEASE_BUFFERS mode was enabled to prevent denial of service.  These could cause the service processor to reset or unexpectedly drop connections to the management console when processing certain SSL commands.  The Common Vulnerabilities and Exposures issue numbers for these problems are CVE-2010-5298 and CVE-2014-0198.
  • HIPER/Pervasive:  A security problem was fixed in OpenSSL to prevent a denial of service when handling certain Datagram Transport Layer Security (DTLS) ServerHello requests. A specially crafted DTLS handshake packet could cause the service processor to reset.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-0221.
  • HIPER/Pervasive:  A security problem was fixed in OpenSSL to prevent a denial of service by using an exploit of a null pointer de-reference during anonymous Elliptic Curve Diffie Hellman (ECDH) key exchange.  A specially crafted handshake packet could cause the service processor to reset.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-3470.
  • A problem was fixed for hardware dumps on the service processor so that valid dump data could be collected from multiple processor checkstops.  Previously, the hardware data from multiple processor checkstops would only be correct for the first processor.
  • A problem was fixed for platform dumps so that certain operations would work after the platform dump completed.  Operations such as firmware updates or reset/reloads of the service processor after a platform dump would cause the service processor to become inaccessible.

SV810_054_054 / FW810.00

06/10/14
Impact:  New      Severity:  New

New Features and Functions

  • GA Level

    NOTE:
  • POWER8 firmware addresses the security problem in the OpenSSL Transport Layer Security (TLS) and Datagram Transport Layer Security (DTLS) to not allow Heartbeat Extension packets to trigger a buffer over-read to steal private keys for the encrypted sessions on the service processor.  The Common Vulnerabilities and Exposures issue number is CVE-2014-0160 and it is also known as the heartbleed vulnerability. 
  • POWER8 (and later) servers include an “update access key” that is checked when system firmware updates are applied to the system.  The initial update access keys include an expiration date which is tied to the product warranty. System firmware updates will not be processed if the calendar date has passed the update access key’s expiration date, until the key is replaced.  As these update access keys expire, they need to be replaced using either the Hardware Management Console (HMC) or the Advanced System Management Interface (ASMI) on the service processor.  Update access keys can be obtained via the key management website: http://www.ibm.com/servers/eserver/ess/index.wss .
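
The expiration rule described above can be illustrated with a minimal Python sketch. This is not IBM code: the function name update_allowed and its parameters are hypothetical, and treating the expiration date itself as still valid is an assumption based on the wording "has passed".

```python
from datetime import date


def update_allowed(key_expiration: date, today: date) -> bool:
    """Return True if a firmware update would be processed under the stated rule.

    Assumption: the update is blocked only once the calendar date is
    strictly later than the key's expiration date.
    """
    return today <= key_expiration
```

For example, a key expiring on a given day would still pass this check on that day and fail on the following day, until the key is replaced.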

4.0 How to Determine The Currently Installed Firmware Level

For HMC managed systems:  From the HMC, select Updates in the navigation (left-hand) pane, then view the current levels of the desired server(s).

For a standalone system running IBM i without an HMC: From a command line, issue DSPFMWSTS.

For a standalone system running IBM AIX without an HMC: From a command line, issue lsmcode.

Alternatively, use the Advanced System Management Interface (ASMI) Welcome pane. The current server firmware appears in the top right corner. Example: SV810_yyy.


5.0 Downloading the Firmware Package

Follow the instructions on Fix Central. You must read and agree to the license agreement to obtain the firmware packages.

Note: If your HMC is not internet-connected, you will need to download the new firmware level to a USB flash memory device or FTP server.


6.0 Installing the Firmware

The method used to install new firmware will depend on the release level of firmware which is currently installed on your server. The release level can be determined by the prefix of the new firmware's filename.

Example: SVxxx_yyy_zzz

Where xxx = release level
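
As an illustration only, the naming pattern above can be split into its fields with a short Python sketch. The names FirmwareLevel, parse_level and same_release are hypothetical and are not part of any IBM tool. The two-letter prefix and three-digit field widths are assumptions taken from the level names listed in this document (for example SV810_061_054). The yyy and zzz fields are carried through without interpretation.

```python
import re
from typing import NamedTuple


class FirmwareLevel(NamedTuple):
    prefix: str   # two-letter prefix, e.g. "SV"
    release: str  # xxx, the release level
    yyy: str      # second field, not interpreted here
    zzz: str      # third field, not interpreted here


# Assumed shape: two letters, then three 3-digit fields separated by "_".
_LEVEL_RE = re.compile(r"^([A-Z]{2})(\d{3})_(\d{3})_(\d{3})$")


def parse_level(name: str) -> FirmwareLevel:
    """Split a level name such as 'SV810_061_054' into its fields."""
    match = _LEVEL_RE.match(name.strip())
    if match is None:
        raise ValueError(f"unrecognized firmware level name: {name!r}")
    return FirmwareLevel(*match.groups())


def same_release(installed: str, new: str) -> bool:
    """Return True when both level names carry the same release level (xxx)."""
    return parse_level(installed).release == parse_level(new).release
```

Comparing the installed level with the new level in this way shows whether the two belong to the same release, which is the distinction the installation instructions depend on.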

HMC Managed Systems:

Instructions for installing firmware updates and upgrades on systems managed by an HMC can be found at:
http://www-01.ibm.com/support/knowledgecenter/8286-42A/p8ha1/updupdates.htm

Systems not Managed by an HMC:

Power Systems:

Instructions for installing firmware on systems that are not managed by an HMC can be found at:
http://www-01.ibm.com/support/knowledgecenter/8286-42A/p8ha5/fix_serv_firm_kick.htm

IBM i Systems:

Refer to "IBM i Support: Recommended Fixes":
http://www-912.ibm.com/s_dir/slkbase.nsf/recommendedfixes


7.0 Firmware History

The complete Firmware Fix History for this Release Level can be reviewed at the following URL:
http://download.boulder.ibm.com/ibmdl/pub/software/server/firmware/SV-Firmware-Hist.html