Power8 System Firmware

Applies to: 8247-21L; 8247-22L; 8247-42L; 8284-22A; 8286-41A; 8286-42A and 8408-E8E.

This document provides information about the installation of Licensed Machine or Licensed Internal Code, which is sometimes referred to generically as microcode or firmware.

1.0 Systems Affected
1.1 Minimum HMC Code Level
1.2 AIX iFix Required
1.3 IBM i Minimum Levels
2.0 Important Information
2.1 IPv6 Support and Limitations
2.2 Concurrent Firmware Updates
2.3 DPSS Updates
2.4 Memory Considerations for Firmware Upgrades
3.0 Firmware Information
3.1 Firmware Information and Description Table
4.0 How to Determine Currently Installed Firmware Level
5.0 Downloading the Firmware Package
6.0 Installing the Firmware
7.0 Firmware History
8.0 Change History Revised (11/27/17)

1.0 Systems Affected

This package provides firmware for Power System S812L (8247-21L), Power System S822L (8247-22L), Power System S824L (8247-42L), Power System S822 (8284-22A), Power System S814 (8286-41A), Power System S824 (8286-42A) and Power System E850 (8408-E8E) servers only.

The firmware level in this package is:

SV840_168 / FW840.50

1.1 Minimum HMC Code Level

This section describes the "Minimum HMC Code Level" required by the System Firmware to complete the firmware installation process. The HMC level must be equal to or higher than the "Minimum HMC Code Level" before the system firmware update is started. If the HMC managing the server targeted for the System Firmware update is running a code level lower than the "Minimum HMC Code Level", the firmware update will not proceed.

The Minimum HMC Code level for this firmware is: HMC V8 R8.4.0 (PTF MH01559) with Mandatory ifix (PTF MH01560).

Although the Minimum HMC Code level for this firmware is listed above, HMC V8 R8.5.0 Service Pack 2 (PTF MH01657) with ifix (PTF MH01702) or higher is recommended.

For information concerning HMC releases and the latest PTFs, go to the following URL to access Fix Central:
http://www-933.ibm.com/support/fixcentral/
For specific fix level information on key components of IBM Power Systems running the AIX, IBM i and Linux operating systems, we suggest using the Fix Level Recommendation Tool (FLRT):
http://www14.software.ibm.com/webapp/set2/flrt/home

NOTES:
- You must be logged in as hscroot in order for the firmware installation to complete correctly.
- Systems Director Management Console (SDMC) does not support this System Firmware level.

1.2 AIX iFix Required

NOTE: This section does not pertain to the S812L (8247-21L), S822L (8247-22L), or S824L (8247-42L) models.

For IBM Power System servers with the PCIe 2-port Async EIA-232 Adapter installed on AIX partitions, an AIX fix resolving the async port interrupt handling (APAR IV77596) must be installed before updating to the SV840_056 (FW840.00) or later level of firmware. The ports on the adapter (feature code EN27/EN28, CCIN 57D4) may become unusable after that firmware level is installed because of an issue with how interrupts are handled. When this issue occurs, many JAS_RTS entries are written to the error log.

Prior to this APAR shipping in a future Service Pack, AIX intends to publish ifixes for the latest Service Packs on all active Technology Levels on our ftp server, in ftp://aix.software.ibm.com/aix/ifixes/iv77596/ on or before Oct 13, 2015. If you need an ifix other than the ones on this server, contact IBM support to request one for your specific situation.

This procedure is intended to be performed by the customer. If you have questions or concerns about the procedure, contact IBM Support:
US Support: 1.800.IBM.SERV
WW Support (select your country): http://www.ibm.com/planetwide/

1.3 IBM i Minimum Levels

For IBM i customers who have systems with machine type model 8286-41A or 8286-42A, this firmware update has prerequisites for partitions that run the IBM i operating system and own physical I/O.

For IBM i 7.1, the following minimum code levels are prerequisites:
IBM i 7.1 TR PTF Group SF99707 Level 9 + Cumulative PTF Package C4283710 + HIPER PTF Group

For IBM i 7.2, the following minimum code levels are prerequisites:
IBM i 7.2 TR PTF Group SF99717 Level 1 + Cumulative PTF Package C4276720 + HIPER PTF Group

For IBM i 7.3,
- All IBM i 7.3 code levels are compatible with this firmware update.

Note 1: These code levels are not a requirement for IBM i partitions that are a client of VIOS.
Note 2: These IBM i code levels are listed as prerequisites for the feature code EMX0 expansion drawer. If this firmware release has already been applied, the above IBM i code level should be applied on IBM i partitions in order to maintain system stability.

2.0 Important Information

Downgrading firmware from any given release level to an earlier release level is not recommended.

If you feel that it is necessary to downgrade the firmware on your system to an earlier release level, please contact your next level of support.

2.1 IPv6 Support and Limitations

IPv6 (Internet Protocol version 6) is supported in the System Management Services (SMS) in this level of system firmware. There are several limitations that should be considered.

When configuring a network interface card (NIC) for remote IPL, only the most recently configured protocol (IPv4 or IPv6) is retained. For example, if the network interface card was previously configured with IPv4 information and is now being configured with IPv6 information, the IPv4 configuration information is discarded.

A single network interface card may only be chosen once for the boot device list. In other words, the interface cannot be configured for the IPv6 protocol and for the IPv4 protocol at the same time.

2.2 Concurrent Firmware Updates

Concurrent system firmware update is supported on HMC Managed Systems only.

The concurrent firmware update will cause the system fan speeds to accelerate to maximum RPMs with loud noise emissions. This increased fan level and loud sound level will persist for several minutes while the service processor is reset and the new firmware level is activated. Thereafter, the fan speeds will gradually adjust back to normal operating speed and sound levels.

2.3 DPSS Updates

Power8 servers use a programmable power controller called a DPSS (Digital Power Subsystem Sweep), which is located in each system node. The DPSS is used to control P8 fan speeds and to check the voltage levels of the power supplies for proper level and operation in the system node. The DPSS image is persistent and is only reloaded if there is a system firmware update that contains a DPSS change. If there is a DPSS change and the system firmware update is concurrent, the DPSS update is delayed until the next IPL of the CEC, which adds 18 to 20 minutes to that IPL. If there is a DPSS change and the firmware update is disruptive, the DPSS update occurs when the service processor is resetting to the service processor stand-by state, and adds 18 to 20 minutes to this transition. During the DPSS update, the HMC or op-panel displays DPSS update progress codes in the range C100C300 through C100C3FF; on the HMC, these codes may be overwritten. If there is a DPSS change in a system firmware service pack, the change will be designated as deferred in the service pack README. DPSS changes will be described, along with a reminder of the additional 18 to 20 minutes, in the Firmware Information and Description section of the README.

The DPSS download progress codes are documented in the IBM Knowledge Center:
https://www.ibm.com/support/knowledgecenter/POWER8/p8eai/C1xx_info.htm

2.4 Memory Considerations for Firmware Upgrades

Firmware Release Level upgrades and Service Pack updates may consume additional system memory.
Server firmware requires memory to support the logical partitions on the server. The amount of memory required by the server firmware varies according to several factors.
Factors influencing server firmware memory requirements include the following:

- Number of logical partitions
- Partition environments of the logical partitions
- Number of physical and virtual I/O devices used by the logical partitions
- Maximum memory values given to the logical partitions

Generally, you can estimate the amount of memory required by server firmware to be approximately 8% of the system installed memory. The actual amount required will generally be less than 8%. However, there are some server models that require an absolute minimum amount of memory for server firmware, regardless of the previously mentioned considerations.

Additional information can be found at:
http://www-01.ibm.com/support/knowledgecenter/8286-42A/p8hat/p8hat_lparmemory.htm
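The 8% rule of thumb above can be sketched as simple arithmetic. This is only an illustrative upper-bound estimate; the function name and the example memory size are hypothetical, and the sketch ignores the model-specific absolute minimums mentioned above.

```python
def estimate_firmware_memory_gb(installed_memory_gb: float,
                                fraction: float = 0.08) -> float:
    """Rough estimate of memory consumed by server firmware.

    Uses the approximate 8% figure from this README. The actual amount
    is generally lower, and some server models require an absolute
    minimum that this sketch does not model.
    """
    if installed_memory_gb < 0:
        raise ValueError("installed memory cannot be negative")
    return installed_memory_gb * fraction


if __name__ == "__main__":
    # Hypothetical system with 256 GB of installed memory.
    print(f"{estimate_firmware_memory_gb(256):.2f} GB")
```

Treat the result as a planning figure only; the factors listed above determine the real requirement.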

3.0 Firmware Information

Use the following examples as a reference to determine whether your installation will be concurrent or disruptive.

For systems that are not managed by an HMC, the installation of system firmware is always disruptive.

Note: The concurrent levels of system firmware may, on occasion, contain fixes that are known as Deferred and/or Partition-Deferred. Deferred fixes can be installed concurrently, but will not be activated until the next IPL. Partition-Deferred fixes can be installed concurrently, but will not be activated until a partition reactivate is performed. Deferred and/or Partition-Deferred fixes, if any, will be identified in the "Firmware Update Descriptions" table of this document. For these types of fixes (Deferred and/or Partition-Deferred) within a service pack, only the fixes in the service pack which cannot be concurrently activated are deferred.

Note: The file names and service pack levels used in the following examples are for clarification only, and are not necessarily levels that have been, or will be released.

System firmware file naming convention:

01SVxxx_yyy_zzz

xxx is the release level
yyy is the service pack level
zzz is the last disruptive service pack level

NOTE: Values of service pack and last disruptive service pack level (yyy and zzz) are only unique within a release level (xxx). For example, 01SV830_040_040 and 01SV840_040_045 are different service packs.

An installation is disruptive if:

The release levels (xxx) are different.

Example: Currently installed release is 01SV840_040_040, new release is 01SV850_050_050.

The service pack level (yyy) and the last disruptive service pack level (zzz) are the same.

Example: SV830_040_040 is disruptive, no matter what level of SV830 is currently installed on the system.

The service pack level (yyy) currently installed on the system is lower than the last disruptive service pack level (zzz) of the service pack to be installed.

Example: Currently installed service pack is SV830_040_040 and new service pack is SV830_050_045.

An installation is concurrent if:

The release level (xxx) is the same, and
The service pack level (yyy) currently installed on the system is the same or higher than the last disruptive service pack level (zzz) of the service pack to be installed.

Example: Currently installed service pack is SV830_040_040, new service pack is SV830_071_040.
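The rules above can be expressed as a short check. The following is a minimal sketch, not an IBM tool: the function and class names are hypothetical, it assumes an HMC-managed system (installations on systems without an HMC are always disruptive), and it applies the disruptive conditions first, in the order they are listed above.

```python
import re
from typing import NamedTuple


class FirmwareLevel(NamedTuple):
    release: int          # xxx
    service_pack: int     # yyy
    last_disruptive: int  # zzz


# Accepts names with or without the leading "01", e.g. 01SV840_168_056.
_LEVEL_RE = re.compile(r"^(?:01)?SV(\d{3})_(\d{3})_(\d{3})$")


def parse_level(name: str) -> FirmwareLevel:
    """Split a level name into release, service pack, last disruptive."""
    match = _LEVEL_RE.match(name.strip())
    if match is None:
        raise ValueError(f"unrecognized firmware level name: {name!r}")
    return FirmwareLevel(*(int(group) for group in match.groups()))


def is_disruptive(installed: str, new: str) -> bool:
    """Return True if installing `new` over `installed` is disruptive."""
    cur = parse_level(installed)
    nxt = parse_level(new)
    # Different release levels (xxx) are always disruptive.
    if cur.release != nxt.release:
        return True
    # A service pack whose yyy equals its zzz is itself disruptive.
    if nxt.service_pack == nxt.last_disruptive:
        return True
    # Installed yyy is lower than the new pack's last disruptive level.
    return cur.service_pack < nxt.last_disruptive


if __name__ == "__main__":
    print(is_disruptive("SV830_040_040", "SV830_050_045"))
    print(is_disruptive("SV830_040_040", "SV830_071_040"))
```

The four examples given in this section make convenient test inputs: the first three should be reported as disruptive and the last as concurrent.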

3.1 Firmware Information and Description


*Filename*	*Size*	*Checksum*
01SV840_168_056.rpm	93704687	46153

Note: The Checksum can be found by running the AIX sum command against the rpm file (only the first 5 digits are listed).
For example: sum 01SV840_168_056.rpm
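Where the AIX sum command is not available, the size and checksum can be checked with a short script. The sketch below is an assumption-laden illustration: it assumes that the default AIX sum algorithm is the BSD-style 16-bit rotating byte-by-byte checksum, and the function names are hypothetical. If the result differs from the table, verify with the sum command on AIX before concluding that the file is corrupted.

```python
import os
from typing import Iterable


def bsd_sum(chunks: Iterable[bytes]) -> int:
    """16-bit rotating checksum (BSD-style, byte by byte).

    Assumption: this matches the default algorithm of the AIX sum command.
    """
    checksum = 0
    for chunk in chunks:
        for byte in chunk:
            # Rotate right by one bit within 16 bits, then add the byte.
            checksum = (checksum >> 1) | ((checksum & 1) << 15)
            checksum = (checksum + byte) & 0xFFFF
    return checksum


def read_chunks(path: str, size: int = 1 << 20) -> Iterable[bytes]:
    """Yield the file contents in fixed-size chunks."""
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(size)
            if not chunk:
                return
            yield chunk


def package_matches(path: str, expected_size: int, expected_sum: int) -> bool:
    """Compare file size and checksum with the values from the table."""
    if os.path.getsize(path) != expected_size:
        return False
    return bsd_sum(read_chunks(path)) == expected_sum


if __name__ == "__main__":
    # Values taken from the table above.
    print(package_matches("01SV840_168_056.rpm", 93704687, 46153))
```

The pure-Python loop is slow for a file of this size; it is meant to show the check, not to replace the sum command.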

SV840

For Impact, Severity and other Firmware definitions, please refer to the 'Glossary of firmware terms' at the following URL:
http://www14.software.ibm.com/webapp/set2/sas/f/power5cm/home.html#termdefs
The complete Firmware Fix History for this Release Level can be reviewed at the following URL:
http://download.boulder.ibm.com/ibmdl/pub/software/server/firmware/SV-Firmware-Hist.html

SV840_168_056 / FW840.50
04/21/2017
Impact: Availability    Severity: SPE

New features and functions

- Support for the Advanced System Management Interface (ASMI) was changed to allow the special characters "I", "O", and "Q" to be entered for the serial number of the I/O Enclosure under the Configure I/O Enclosure option. These characters are only rarely found in an IBM serial number, so typing them is normally an error. However, ASMI no longer blocks the special character entry, so the exception case is supported. Without the enhancement, typing one of the special characters causes the message "Invalid serial number" to be displayed.
- On systems using PowerVM firmware, support was added to allow the IBM i OS on the Power System S822 (8284-22A) without the need for a VET code.
- On systems using PowerVM firmware, support was added for the Universally Unique IDentifier (UUID) property for each partition. The UUID provides each partition with an identifier that is persisted by the platform across partition reboots, reconfigurations, OS reinstalls, partition migration, and hibernation.

System firmware changes that affect all systems

- A problem was fixed for disabling, from the Advanced System Management Interface (ASMI), the periodic notification for a call home error log SRC B150F138 for Memory Buffer resources (membuf).
- A problem was fixed for incorrect callouts of the Power Management Controller (PMC) hardware with SRC B1112AC4 and SRC B1112AB2 logged. These extra callouts occur when the On-Chip Controller (OCC) has placed the system in the safe state for a prior failure that is the real problem that needs to be resolved.
- A problem was fixed for device time-outs during an IPL, logged with SRC B18138B4. This error is intermittent and no action is needed for the error log. The service processor hardware server now allots more time for the device transactions so that they can complete without a time-out error.
- A problem was fixed for the OS not being able to detect the USB connected Uninterruptible Power Supply (UPS) that has feature code #ECCF. An informational SRC B1814616 is logged from the service processor and the IBM i OS logs a CPI0961 (Uninterruptible power supply no longer attached). The error occurs infrequently because it depends on system timing and system configuration. If a system is having the error, it might have it on every IPL. The circumvention is to reseat the USB cable connector for the USB connected UPS.
- A problem was fixed for the Advanced System Management Interface (ASMI) "System Service Aids => Error/Event Logs" panel not showing the "Clear" and "Show" log options and also having a truncated error log when there are a large number of error logs on the system.
- A problem was fixed for the failover to the backup PNOR on a Hostboot Self Boot Engine (SBE) failure. Without the fix, the failed SBE causes loss of processors and memory with B15050AD logged. With the fix, the SBE is able to access the backup PNOR and IPL successfully by deconfiguring the failing PNOR and calling it out as a failed FRU.
- A problem was fixed for System Vital Product Data (SVPD) FRUs being guarded but not having a corresponding error log entry. This is a failure to commit the error log entry that has occurred only rarely.
- A problem was fixed for the system VPD showing 4 extra PCIe slots that are not actually available to the system. When running an IBM i partition, the IBM i Hardware Service Manager shows twelve PCIe adapter slots instead of the actual eight that can be used (P1-C2, P1-C3, P1-C4, and P1-C5 are the extra slots displayed). This problem only pertains to the IBM Power System S814 (8286-41A).
- A problem was fixed to allow changing the IPMI channel authentication capabilities from the OS. The command "ipmitool channel authcap 1 4" was causing an IPMI core dump every time it was run.
- A problem was fixed for a system going into safe mode with SRC B1502616 logged as informational without a call home notification. Notification is needed because the system is running with reduced performance. If there are unrecoverable error logs, any of them are marked with reduced performance, and the system has not been rebooted, then the system is probably running in safe mode with reduced performance. With the fix, the SRC B1502616 is an Unrecoverable Error (UE).
- A problem was fixed for the service processor boot watch-dog timer expiring too soon during DRAM initialization in the reset/reload, causing the service processor to go unresponsive. On systems with a single service processor, the SRC B1817212 was displayed on the control panel. For systems with redundant service processors, the failing service processor was deconfigured. To recover the failed service processor, the system will need to be powered off with AC power removed during a regularly scheduled system service action. This problem is intermittent and very infrequent, as most of the reset/reloads of the service processor will work correctly to restore the service processor to a normal operating state.
- A problem was fixed for host-initiated resets of the service processor causing the system to terminate. A prior fix for this problem did not work correctly because some of the host-initiated resets were being translated to unknown reset types that caused the system to terminate. With this new correction for failed host-initiated resets, the service processor will still be unresponsive but the system and partitions will continue to run. On systems with a single service processor, the SRC B1817212 will be displayed on the control panel. For systems with redundant service processors, the failing service processor will be deconfigured. To recover the failed service processor, the system will need to be powered off with AC power removed during a regularly scheduled system service action. This problem is intermittent and very infrequent, as most of the host-initiated resets of the service processor will work correctly to restore the service processor to a normal operating state.
- A problem was fixed for incorrect error messages from the Advanced System Management Interface (ASMI) functions when the system is powered on but in the "Incomplete State". For this condition, ASMI was assuming the system was powered off because it could not communicate with the PowerVM hypervisor. With the fix, the ASMI error messages will indicate that ASMI functions have failed because of the bad hypervisor connection instead of falsely stating that the system is powered off.
- A problem was fixed for system termination and outage caused by a corrupted system reset type. For cases where the system reset type cannot be identified, the service processor will now do a reset/reload to keep the system running. This is a rare problem that occurs during an error/recovery situation that involves a reset of the service processor.
- A problem was fixed for systems losing performance and going into Safe mode (a power mode with reduced processor frequencies intended to protect the system from over-heating and excessive power consumption) with B1xx2AC3/B1xx2AC4 SRCs logged. This happened because of an On-Chip Controller (OCC) internal queue overflow. The problem has only been observed for systems running heavy workloads with maximum memory configurations (where every DIMM slot is populated - size of DIMM does not matter), but this may not be required to encounter the problem. Recovery from Safe mode back to normal performance can be done with a re-IPL of the system, or concurrently by following the steps at this link for a soft reset of the service processor: https://www.ibm.com/support/knowledgecenter/POWER8/p8hby/p8hby_softreset.htm. Checking that Safe mode is not active on the system requires a dynamic celogin password from IBM Support to use the service processor command line:
  1) Log into ASMI as celogin with the dynamic celogin password generated by IBM Support
  2) Select System Service Aids
  3) Select Service Processor Command Line
  4) Enter "tmgtclient --query_mode_and_function" from the command line
  The first line of the output, "currSysPwrMode", should say "NOMINAL". This means the system is in normal mode and Safe mode is not active.

System firmware changes that affect certain systems

- On systems using PowerVM firmware, a problem was fixed for cable card (PCIe3 Optical Cable Adapter for the PCIe3 Expansion Drawer) capable PCI slots that fail during the IPL. Hypervisor I/O Bus Interface UE B7006A84 is reported for each cable card capable PCI slot that doesn't contain a cable card. PCI slots containing a cable card will not report an error but will not be functional. The problem can be resolved by doing a "power off/power on" re-IPL of the system. The trigger for the failure is that the I2C devices used to detect the cable cards are not coming out of the power on reset process in the correct state due to a race condition. The affected optical cable adapters have feature codes #EJ05, #EJ07, and #EJ08 with CCINs 2B1C, 6B52, and 2CE2, respectively.
- On systems using PowerVM firmware, a problem was fixed for a blank SRC in the LPA dump for user-initiated non-disruptive adjunct dumps. The SRC is needed for problem determination and dump analysis.
- On systems using PowerVM firmware, a problem was fixed with SR-IOV adapter error recovery where the adapter is left in a failed state in nested error cases for some adapter errors. The probability of this occurring is very low since the problem trigger is multiple low-level adapter failures. With the fix, the adapter is recovered and returned to an operational state.
- On systems using PowerVM firmware with PCIe adapters in Single Root I/O Virtualization (SR-IOV) shared mode, a problem was fixed for the hypervisor SR-IOV adjunct partition failing during the IPL with SRCs B200F011 and B2009014 logged. The SR-IOV adjunct partition successfully recovers after it reboots and the system is operational.
- For the IBM Power System E850 (8408-E8E) system, a problem was fixed for incorrect values in the Idle Power Saver (IPS) mode call home data. The call home "max" value is reported as a much lower number than what the On-Chip Controllers (OCC) read for the IPS. This problem only affects 4-socket systems, as it is caused by an integer overflow of the summation of the IPS value from all OCCs in the system.
- On systems using PowerVM firmware, a problem was fixed for PCIe Host Bridge (PHB) outages and PCIe adapter failures in the PCIe I/O expansion drawer caused by error thresholds being exceeded for the LEM bit [21] errors in the FIR accumulator. These are typically minor and expected errors in the PHB that occur during adapter updates and do not warrant a reset of the PHB and the resulting PCIe adapter failures. Therefore, the LEM[21] error threshold limit has been increased and the LEM fatal error has been changed to a Predictive Error to avoid the outages for this condition.
- On systems using PowerVM firmware with a large memory configuration (greater than 8 TB), a problem was fixed for an SR-IOV adjunct failure during the IPL, causing loss of SR-IOV function. The large system memory space causes an overflow in the space calculations for SR-IOV adapters in PCIe slots with Enlarged IO Capacity enabled. The problem can be avoided by reducing the number of PCIe slots with Enlarged IO Capacity enabled so it does not include adapters in SR-IOV shared mode. Another circumvention option is to move the SR-IOV adapters to SR-IOV capable PCIe slots where Enlarged IO Capacity is not enabled. Reducing system physical memory to below 8 TB will also work as a circumvention.
- On systems using PowerVM firmware, a problem was fixed for Live Partition Mobility (LPM) migrations from FW860.10 or FW860.11 to older levels of firmware. Subsequent DLPAR of Virtual Adapters will fail with HMC error message HSCL294C, which contains text similar to the following: "0931-007 You have specified an invalid drc_name." This issue affects partitions installed with AIX 7.2 TL 1 and later. Partitions installed with VIOS, IBM i, or earlier levels of AIX are not affected by this issue.
- On a system using PowerVM firmware running a Linux OS, a problem was fixed for support for Coherent Accelerator Processor Interface (CAPI) adapters. The CAPI related RTAS h-calls for the CAPI devices could not be made by the Linux OS, impacting the CAPI adapter functionality and usability. This problem involves the following adapters: the PCIe3 LP CAPI Accelerator Adapter with F/C #EJ16 that is used on the S812L (8247-21L) and S822L (8247-22L) models; the PCIe3 CAPI FlashSystem Accelerator Adapter with F/C #EJ17 that is used on the S814 (8286-41A) and S824 (8286-42A) models; and the PCIe3 CAPI FlashSystem Accelerator Adapter with F/C #EJ18 that is used on the S822 (8284-22A), E870 (9119-MME), and E880 (9119-MHE) models. This problem does not pertain to PowerVM AIX partitions using CAPI adapters.
- On a system using OPAL firmware, a problem was fixed for excessive "Poller recursion detected" error messages during the skiboot that could require a power off to recover from the error.
- On a system using OPAL firmware, a problem was fixed for an unnecessary error message when a reset occurs on an empty PCIe Host Bridge (PHB), that is, one with no PCIe adapters attached. The extra error message occurs any time the PHBs in the system go through error recovery.
- On a system using OPAL firmware, a problem was fixed to fence off an errant PCIe Host Bridge (PHB) during a complete reset to allow the kernel to retry the operation. This helps the system recovery process by guarding out the bad hardware to prevent a fatal error loop.
- On a system using PowerVM firmware, a problem was fixed for corruption of the partition data in the service processor NVRAM during a power off that causes the managed system to go into the HMC "Recovery" error state. A circumvention for the error is to restore partition data from the HMC. If using Novalink to manage the partition, a recovery can be done from the Novalink backup. The error is very infrequent but more likely to occur on an immediate power off of the system. If a delayed power off is used instead, the hypervisor can complete all pending operations before shutting down cleanly.
- On systems using PowerVM firmware, a problem was fixed for a group of shared processor partitions being able to exceed the designated capacity placed on a shared processor pool. This error can be triggered by using the DLPAR move function for the shared processor partitions, if the pool has already reached its maximum specified capacity. To prevent this problem from occurring when making DLPAR changes while the pool is at the maximum capacity, do not use the DLPAR move operation but instead break it into two steps: DLPAR remove followed by DLPAR add. This gives enough time for the DLPAR remove to be fully completed prior to starting the DLPAR add request.
- On systems using PowerVM firmware, a problem was fixed for NVRAM corruption and an HMC recovery state when using Simplified Remote Restart partitions. The failing systems will have at least one Remote Restart partition and on the failed IPL there will be a B70005301 SRC with word 7 being 0X00000002.
- On systems using PowerVM firmware with an IBM i partition, a problem was fixed for incorrect maximum performance reports based on the wrong number of "maximum" processors for the system. Certain performance reports that can be generated on IBM i systems contain not only the existing machine information, but also "what-if" information, such as "how would this system perform if it had all the processors possible installed in this system". This "what-if" report was in error because the maximum number of processors possible was too high for the system.
- On systems using PowerVM firmware, a problem was fixed for NVRAM corruption that can occur when deleting a partition that owns a CAPI adapter, if that CAPI adapter is not assigned to another partition before the system is powered off. On a subsequent IPL, the system will come up in recovery mode if there is NVRAM corruption. To recover, the partitions must be restored from the HMC. The frequency of this error is expected to be rare. The CAPI adapters have the following feature codes: #EC3E, #EC3F, #EC3L, #EC3M, #EC3T, #EC3U, #EJ16, #EJ17, #EJ18, #EJ1A, and #EJ1B.
- On systems using PowerVM firmware, a problem was fixed to improve PCIe3 I/O expansion drawer (#EMX0) link stability. The settings for the continuous time linear equalizers (CTLE) were updated for all the PCIe adapters for the PCIe links to the expansion drawer. The CEC must be re-IPLed for the fix to activate.
- On systems using PowerVM firmware, the following problems were fixed for SR-IOV adapters:
  1) Insufficient resources reported for an SR-IOV logical port configured with promiscuous mode enabled and a Port VLAN ID (PVID) when creating a new interface on the SR-IOV adapters.
  2) Spontaneous dumps and reboot of the adjunct partition for SR-IOV adapters.
  3) Adapter enters a firmware loop when a single bit ECC error is detected. System firmware detects this condition as an adapter command time-out. System firmware will reset and restart the adapter to recover the adapter functionality. This condition will be reported as a temporary adapter hardware failure.
  4) vNIC interfaces not being deleted correctly, causing SRC B400FF01 to be logged and Data Storage Interrupt (DSI) errors with failure on boot of the LPAR.
  This set of fixes updates adapter firmware to 10.2.252.1926 for the following Feature Codes: EN15, EN16, EN17, EN18, EN0H, EN0J, EN0M, EN0N, EN0K, EN0L, EL38, EL3C, EL56, and EL57.
  The SR-IOV adapter firmware level update for the shared-mode adapters happens under user control to prevent unexpected temporary outages on the adapters. A system reboot will update all SR-IOV shared-mode adapters with the new firmware level. In addition, when an adapter is first set to SR-IOV shared mode, the adapter firmware is updated to the latest level available with the system firmware (and it is also updated automatically during maintenance operations, such as when the adapter is stopped or replaced). Lastly, selective manual updates of the SR-IOV adapters can be performed using the Hardware Management Console (HMC). To selectively update the adapter firmware, follow the steps given at the IBM Knowledge Center for using the HMC to make the updates: https://www.ibm.com/support/knowledgecenter/HW4M4/p8efd/p8efd_updating_sriov_firmware.htm.
  Note: Adapters that are capable of running in SR-IOV mode, but are currently running in dedicated mode and assigned to a partition, can be updated concurrently either by the OS that owns the adapter or by the managing HMC (if the OS is AIX or VIOS and RMC is running).
- On systems using PowerVM firmware, a problem was fixed for partition boot failures and run time DLPAR failures when adding I/O that log BA210000, BA210003, and/or BA210005 errors. The fix also applies to run time failures configuring an I/O adapter following an EEH recovery that log BA188001 events. The problem can impact IBM i partitions running in any processor mode or AIX/Linux partitions running in P7 (or older) processor compatibility modes. The problem is most likely to occur when the system is configured in the Manufacturing Default Configuration (MDC) mode. The trigger for the problem is a race condition between the hypervisor and the physical operations panel with a very rare frequency of occurrence.
- On systems with maximum memory configurations (where every DIMM slot is populated - size of DIMM does not matter), a problem was fixed for systems losing performance and going into Safe mode (a power mode with reduced processor frequencies intended to protect the system from over-heating and excessive power consumption) with B1xx2AC3/B1xx2AC4 SRCs logged. This happened because of On-Chip Controller (OCC) time-out errors when collecting Analog Power Subsystem Sweep (APSS) data, used by the OCC to tune the processor frequency. This problem occurs more frequently on systems that are running heavy workloads. Recovery from Safe mode back to normal performance can be done with a re-IPL of the system, or concurrently by following the steps at this link for a soft reset of the service processor: https://www.ibm.com/support/knowledgecenter/POWER8/p8hby/p8hby_softreset.htm. Checking that Safe mode is not active on the system requires a dynamic celogin password from IBM Support to use the service processor command line:
  1) Log into ASMI as celogin with the dynamic celogin password generated by IBM Support
  2) Select System Service Aids
  3) Select Service Processor Command Line
  4) Enter "tmgtclient --query_mode_and_function" from the command line
  The first line of the output, "currSysPwrMode", should say "NOMINAL". This means the system is in normal mode and Safe mode is not active.
SV840_147_056 / FW840.40
10/26/16

Impact: Availability
Severity: SPE

New features and functions

Support was added to protect the service processor from booting on a level of firmware that is below the minimum MIF level. If this is detected, a SRC B18130A0 is logged. A disruptive firmware update would then need to be done to the minimum firmware level or higher. This new support has no effect on the system being updated with the service pack but has been put in place to provide an enhanced firmware level for the IBM field stock service processors.

Support for the Advanced System Management Interface (ASMI) was changed to not create VPD deconfiguration records and call home alerts for hardware FRUs that have one VPD chip of a redundant pair broken or inaccessible. The backup VPD chip for the FRU allows continued use of the hardware resource. The notification of the need for service for the FRU VPD is not provided until both of the redundant VPD chips have failed for a FRU.

System firmware changes that affect all systems

A problem was fixed for excessive, repeating error logs with SRC B150B901 for a failed FSI link to a DIMM that had insufficient hardware callouts for easy diagnosis of the failure. With the fix, the B150B901 is limited to one occurrence but a new error log is provided with the hardware callouts. Without the fix, if you see repeating B150B901 predictive logs, there will also be repeated informational error logs with SRC B1504800. These B1504800 logs would have the hardware involved and could be used to point to the failing DIMM.

A problem was fixed for unneeded throttling of processors if a power supply fails. The error log SRCs of B1812A05 and B1812A33 are reported when the processors are throttled.
The affected systems have four power supplies and the loss of one power supply would not normally cause power use to go over the power capacity limit, but it happened because the number of power supplies was internally set as two instead of the four actually in the system. This problem only affects the IBM Power System S824 (8286-42A) and the S824L (8247-42L) models. Without the fix, the problem with processor throttling can be circumvented by replacing the power supply that has failed.

A problem was fixed for PCIe slot errors caused by improper PCIe device training. PCIe links do not train properly and PCIe cards may show up as unknown in I/O list system properties. Error log SRC BA180020 may be seen, or informational events B7006976 (for PHB slot) or B7006977 (for a switch slot). The applied fix does not recover failed PCIe devices but does prevent those failures on the next power on IPL. If any PCIe devices are in the failed state, they can be recovered using the HMC to power cycle the affected PCIe slot. This problem only affects the IBM Power System E850 (8408-E8E) model.

A problem was fixed for a backplane short causing smoke in the case. The power on sequence was changed to apply power from one power supply at a time and then check for excessive current use that could be caused by a backplane short. If excessive current is detected, the system is powered off with a SRC logged to call out the failing hardware. If a short has occurred, the backplane must still be replaced but damage to other components will be prevented. The problem is triggered by a physical move of the system. This problem only affects the IBM Power System E850 (8408-E8E) model.

A problem was fixed for the Advanced System Management Interface "Network Services/Network Configuration" "Reset Network Configuration" button that was not resetting the static routes to the default factory setting.
The manufacturing default is to have no static routes defined so the fix clears any static routes that had been added. A circumvention to the problem is to use the ASMI "Network Services/Network Configuration/Static Route Configuration" "Delete" button before resetting the network configuration.

A problem was fixed for the HMC Exchange FRU procedure for DVD drive with MTM 7226-1U3 and feature codes 5757/5762/5763 where it did not verify the DVD drive was plugged in at the end of the exchange procedure. Without the fix, the user must manually verify that the DVD drive is plugged in.

A problem was fixed for the Advanced System Management Interface (ASMI) incorrectly showing the Anchor card as guarded whenever any redundant VPD chip is guarded.

A problem was fixed for the health monitoring of the NVRAM and DRAM in the service processor that had been disabled. The monitoring has been re-established and early warnings of service processor memory failure are logged with one of the following Predictive Error SRCs: B151F107, B151F109, B151F10A, or B151F10D.

A problem was fixed for infrequent VPD cache read failures during an IPL causing an unnecessary guarding of DIMMs with SRC B123A80F logged. With the fix, VPD cache read failures cause a temporary deconfiguration of the associated DIMM but the DIMM is recovered on the next IPL.

A problem was fixed for a processor hang where the error recovery was not guarding the failing processor. The failure causes a SRC B111E540 to be logged with Signature Description of " ex(n0p3c1) (COREFIR[55]) NEST_HANG_DETECT: External Hang detected". With the fix, the failing processor FRU is called out and guarded so that the error does not re-occur when the system is re-IPLed.

A problem was fixed for a DDR4 memory training step during hostboot that incorrectly failed DIMMs on the timing margins for the HOLD limit. The DIMMs may be recovered by manually unguarding the failed DIMM hardware.
This affects the 128GB DDR4 memory DIMM with feature code #EM8S for the E850 (8408-E8E) system.

A problem was fixed for a failed IPL with SRC UE BC8A090F that does not have a hardware callout or a guard of the failing hardware. The system may be recovered by guarding out the processor associated with the error and re-IPLing the system. With the fix, the bad processor core is guarded and the system is able to IPL.

A problem was fixed for the Operations Panel showing swapped physical port assignments for logical eth0 and eth1 for the service processor when panel function 30 is used. For eth0, port "T5" is displayed instead of port "T4". For eth1, port "T4" is displayed instead of "T5". This problem does not affect the IP addresses assigned in the Advanced System Management Interface (ASMI) for the eth0 and eth1 ports which are correctly assigned. This problem only pertains to the IBM Power System E850 (8408-E8E) model.

A problem was fixed for On-Chip Controller (OCC) errors that had excessive callouts for processor FRUs. Many of the OCC errors are recoverable and do not require that the processor be called out and guarded. With the fix, the processors will only be called out for OCC errors if there are three or more OCC failures during a time period of a week.

A problem was fixed for an Operations Panel Function 04 (Lamp test) during an IPL causing the IPL to fail. With the fix, the lamp test request is rejected during the IPL until the hypervisor is available. The lamp test can be requested without problems anytime after the system is powered on to hypervisor ready or an OS is running in a partition.

A problem was fixed for a false thermal alarm in the active optical cables (AOC) for the PCIe3 expansion drawer with SRCs B7006AA6 and B7006AA7 being logged every 24 hours. The AOC cables have feature codes of #ECC6 through #ECC9, depending on the length of the cable.
The SRCs should be ignored as they call for the replacement of the cable, cable card, or the expansion drawer module. With the fix, the false AOC thermal alarms are no longer reported.

A problem was fixed for the On-Chip Controller (OCC) incorrectly calling out processors with SRC B1112A16 for L4 Cache DIMM failures with SRC B124E504. This false error logging can occur if the DIMM slot that is failing is adjacent to two unoccupied DIMM slots.

System firmware changes that affect certain systems

On systems using PowerVM firmware, a problem was fixed for network issues, causing critical situations for customers, when an SR-IOV logical port or vNIC is configured with a non-zero Port VLAN ID (PVID). This fix updates adapter firmware to 10.2.252.1922, for the following Feature Codes: EN15, EN16, EN17, EN18, EN0H, EN0J, EL38, EN0M, EN0N, EN0K, EN0L, and EL3C. The SR-IOV adapter firmware level update for the shared-mode adapters happens under user control to prevent unexpected temporary outages on the adapters. A system reboot will update all SR-IOV shared-mode adapters with the new firmware level. In addition, when an adapter is first set to SR-IOV shared mode, the adapter firmware is updated to the latest level available with the system firmware (and it is also updated automatically during maintenance operations, such as when the adapter is stopped or replaced). And lastly, selective manual updates of the SR-IOV adapters can be performed using the Hardware Management Console (HMC). To selectively update the adapter firmware, follow the steps given at the IBM Knowledge Center for using HMC to make the updates: https://www.ibm.com/support/knowledgecenter/HW4M4/p8efd/p8efd_updating_sriov_firmware.htm.

Note: Adapters that are capable of running in SR-IOV mode, but are currently running in dedicated mode and assigned to a partition, can be updated concurrently either by the OS that owns the adapter or the managing HMC (if OS is AIX or VIOS and RMC is running).
A problem was fixed for systems using the OPAL firmware for repeated B181460B error logs in the Linux OS message log. These are informational error logs related to a restart of a process in the service processor and can be ignored. The restart of the process has been cleaned up to prevent the error message from being logged.

On systems using the PowerVM hypervisor firmware and NovaLink, a problem was fixed for a NovaLink installation error where the hypervisor was unable to get the maximum logical memory buffer (LMB) size from the service processor. The maximum supported LMB size should be 0xFFFFFFFF but in some cases it was initialized to a value that was less than the amount of configured memory, causing the service processor read failure with error code 0X00000134.

On systems using PowerVM firmware, a problem was fixed for an AIX or Linux partition failing with a SRC B2008105 LP 00005 on a re-IPL after a dump (firmware assisted or error generated dump) following a Live Partition Mobility (LPM) migration operation. The problem does not occur if the migrated partition completes a normal IPL after the migration.

On systems using PowerVM firmware, a problem was fixed to prevent NovaLink managed or co-managed systems from blocking SR-IOV configurations. When configuring or deconfiguring SR-IOV, it is highly likely that the NovaLink VMC virtual device will interfere with SR-IOV virtual devices. Without the fix, SR-IOV is ignoring the NovaLink VMC device and trying to use the same virtual slot.

On systems using PowerVM firmware, a problem was fixed for intermittent long delays in the NX co-processor for asynchronous requests such as NX 842 compressions. This problem was observed for AIX DB2 when it was doing hardware-accelerated compressions of data but could occur on any asynchronous request to the NX co-processor.
On systems using the PowerVM firmware, a fix was made to provide an option to change the ordering of PCIe Host Bridge (PHB) devices on Power 8 systems to match the discovery order on Power 7 systems.

On systems using PowerVM firmware that have an attached HMC, a problem was fixed for a Live Partition Mobility migration that resulted in the source managed system going to the Hardware Management Console (HMC) Incomplete state after the migration to the target system was completed. This problem is very rare and has only been detected once. The problem trigger is that the source partition does not halt execution after the migration to the target system. The HMC went to the Incomplete state for the source managed system when it failed to delete the source partition because the partition would not stop running. When this problem occurred, the customer network was running very slowly and this may have contributed to the failure. The recovery action is to re-IPL the source system but that will need to be done without the assistance of the HMC. Shut down each partition that has an OS running on the source system from within that OS. Then from the Advanced System Management Interface (ASMI), power off the managed system. Alternatively, the system power button may also be used to do the power off. If the HMC Incomplete state persists after the power off, the managed system should be rebuilt from the HMC. For more information on HMC recovery steps, refer to this IBM Knowledge Center link: https://www.ibm.com/support/knowledgecenter/en/POWER7/p7eav/aremanagedsystemstate_incomplete.htm

On systems using PowerVM firmware, a problem was fixed for a latency time of about 2 seconds being added to a target Live Partition Mobility (LPM) migration system when there is a latency time check failure. With the fix, in the case of a latency time check failure, a much smaller default latency is used instead of two seconds.
This error would not be noticed if the customer system is using an NTP time server to maintain the time.

On systems using PowerVM firmware that have an attached HMC, a problem was fixed for a Live Partition Mobility migration that resulted in a system hang when an EEH error occurred simultaneously with a request for a page migration operation. On the HMC, it shows an incomplete state for the managed system with reference code A181D000. The recovery action is to re-IPL the source system but that will need to be done without the assistance of the HMC. From the Advanced System Management Interface (ASMI), power off the managed system. Alternatively, the system power button may also be used to do the power off. If the HMC Incomplete state persists after the power off, the managed system should be rebuilt from the HMC. For more information on HMC recovery steps, refer to this IBM Knowledge Center link: https://www.ibm.com/support/knowledgecenter/en/POWER7/p7eav/aremanagedsystemstate_incomplete.htm

On systems using PowerVM firmware, a problem was fixed for a system dump post-dump IPL that resulted in adjunct partition errors of SRC BA54504D, B7005191, and BA220020 when they could not be created due to false space constraints. These adjunct partition failures will prevent normal operations of the hypervisor such as creating new partitions, so a power off and power on of the system is needed to recover it. If the customer system is experiencing this error (only some systems will be impacted), it is expected to occur for each system dump post-dump IPL until the fix is applied.

On systems using PowerVM firmware, a problem was fixed for a shared processor pool partition showing an incorrect zero "Available Pool Processor" (APP) value after a concurrent firmware update. The zero APP value means that no idle cycles are present in the shared processor pool but in this case it stays zero even when idle cycles are available.
This value can be displayed using the AIX "lparstat" command. If this problem is encountered, the partitions in the affected shared processor pool can be dynamically moved to a different shared processor pool. Before the dynamic move, the "uncapped" partitions should be changed to "capped" to avoid a system hang. The old affected pool would continue to have the APP error until the system is re-IPLed.

On systems using PowerVM firmware, a rare problem was fixed for a system hang that can occur when dynamically moving "uncapped" partitions to a different shared processor pool. To prevent a system hang, the "uncapped" partitions should be changed to "capped" before doing the move.

On systems using PowerVM firmware, a problem was fixed for a DLPAR add of the USB 3.0 adapter (#EC45 and #EC46) to an AIX partition where the adapter could not be configured with the AIX "cfgmgr" command that fails with EEH errors and an outstanding illegal DMA transaction. The trigger for the problem is the DLPAR add operation of the USB 3.0 adapter that has a USB External Dock (#EU04) and RDX Removable Disk Drives attached, or a USB 3.0 adapter that has a flash drive attached. The PCI slot can be powered off and on to recover the USB 3.0 adapter.

On systems using PowerVM firmware, a problem was fixed for a missing OF trace buffer in the resource dump. This happens any time a resource dump is requested. The missing FFDC data may require that problems be recreated before they can be debugged.

On systems using PowerVM firmware, a problem was fixed for a Live Partition Mobility (LPM) error where the target partition migration fails with an HSCLB98C error. Frequency of this error can be moderate with source partitions that have a vNIC resource but extremely low if the source partition does not have a vNIC resource. The failure originates at the VIOS VF level, so recovery from this error may need a re-IPL of the system to regain full use of the vNIC resources.
SV840_139_056 / FW840.30
09/28/16

Impact: Availability
Severity: SPE

New features and functions

Support for the CAPI NVMe (Non-Volatile Memory express) Flash Accelerator Adapter with feature code #EJ1K. This feature provides a PCIe Gen3 adapter with an FPGA and 1.92 TB of low write latency, nonvolatile flash memory. The adapter physically is a half length x8 adapter, but requires a x16 PCIe CAPI-capable Gen3 slot in the system unit. The system connects to the FPGA using the CAPI interface. The FPGA connects to the flash memory using NVMe, which is a high performance software interface to read/write this flash memory. Use of the #EJ1K adapter requires one #EC2A CAPI activation feature per system. This CAPI Flash Accelerator Adapter does not run under PowerKVM but is a bare-metal install only for the following minimum Little Endian (LE) Linux distribution level: Ubuntu 16.04.1. This feature only pertains to the IBM Power System S812L (8247-21L), S822L (8247-22L) and S824L (8247-42L) models.

Support for 6 core processor with FC #8A2225 and CCIN 54E1 extended for use in the Power System S822L (8247-22L). Support was already in place for this processor since FW810.20 for the S822 (8284-22A).

The certificate store on the service processor has been upgraded to include the changes contained in version 2.6 of the CA certificate list published by the Mozilla Foundation at the mozilla.org website as part of the Network Security Services (NSS) version 3.21.

System firmware changes that affect all systems

A problem was fixed for PCI Host Bridge (PHB) "link down" Endpoint Recoverable errors that became fatal exceptions when not handled by the CAPI adapters. With the fix, the recoverable errors are now detected by the CAPI adapters to allow for run-time link recovery.

A problem was fixed for CAPI adapter errors that caused the system processors to be called out and guarded instead of the CAPI adapter unit.
The errors that cause this problem are the rare fatal adapter errors, so the problem should be infrequently seen. With the fix, the failing CAPI adapter is guarded after the checkstop instead of the system processor.

A problem was fixed for host-initiated resets of the service processor that can cause the service processor to terminate. In this state, the service processor will be unresponsive but the system and partitions will continue to run. On systems with a single service processor, the SRC B1817212 will be displayed on the control panel. For systems with redundant service processors, the failing service processor will be deconfigured. To recover the failed service processor, the system will need to be powered off with AC power removed during a regularly scheduled system service action. The problem is intermittent and very infrequent as most of the host-initiated resets of the service processor will work correctly to restore the service processor to a normal operating state.

System firmware changes that affect certain systems

On systems using the PowerVM firmware, a fix was made to provide an option to change the ordering of PCIe Host Bridge (PHB) devices on Power 8 systems to match the discovery order on Power 7 systems.
SV840_132_056 / FW840.24
08/31/16

Impact: Availability
Severity: HIPER

System firmware changes that affect certain systems

HIPER/Non-Pervasive: For a system using PowerVM firmware at a FW840 level and having an AIX partition or VIOS partition at specific back levels, a problem was fixed for PCI adapters not getting configured in the OS. DVD boots hang with status code 518 when attempts are made to boot off the AIX or VIOS DVD image. NIM installs hang with status code 608. If the firmware is updated to 840_104 through 840_118 for a SAS booted system, the subsequent reboot will hang with status code 554.

The failing AIX and VIOS levels for the IBM Power System S822 (8284-22A), S814 (8286-41A), and S824 (8286-42A) are as follows:

AIX:
AIX 7100-01-10
AIX 7100-02-05 - AIX 7100-02-07
AIX 6100-07-10
AIX 6100-08-05 - AIX 6100-08-07

VIOS:
VIOS 2.2.1.9
VIOS 2.2.2.5 - VIOS 2.2.2.70

The failing AIX and VIOS levels for the IBM Power System E850 (8408-E8E) are as follows:

AIX:
AIX 7100-02-07
AIX 6100-08-07

VIOS:
VIOS 2.2.2.70

Without the fix, the problem may be circumvented by upgrading the AIX to 7100-03-03 or 6100-09-03 and the VIOS to 2.2.3.4. Depending on the adapter not getting configured, the error may result in Defined devices, EEH errors, and/or failure to boot the partition (if the failing adapter is the boot device). These errors may also be seen for a rebooted partition after a LPM migration to FW840. With the fix applied, the error state for some of the adapters in the running OS may persist and it will be necessary to reboot the OS to recover from those errors. The problem corrected with this Service Pack does not pertain to the IBM Power System S812L (8247-21L), S822L (8247-22L), or S824L (8247-42L) models.
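The failing levels listed for FW840.24 can be checked against an installed level with a small lookup. This is a sketch under stated assumptions: entries written as "A - B" are read as inclusive ranges compared field by field as integers, and the group keys "S822/S814/S824" and "E850" plus all function names are hypothetical labels chosen for this example.

```python
from typing import Dict, List, Optional, Tuple

Level = Tuple[int, ...]


def parse_level(text: str) -> Level:
    """Turn '7100-02-05' or '2.2.2.70' into a tuple of integers."""
    return tuple(int(part) for part in text.replace("-", ".").split("."))


def level_range(low: str, high: Optional[str] = None) -> Tuple[Level, Level]:
    """Build an inclusive (low, high) range; a single level is its own range."""
    return parse_level(low), parse_level(high or low)


# Levels copied from the FW840.24 entry; inclusive ranges are an assumption.
AFFECTED: Dict[str, Dict[str, List[Tuple[Level, Level]]]] = {
    "S822/S814/S824": {
        "AIX": [
            level_range("7100-01-10"),
            level_range("7100-02-05", "7100-02-07"),
            level_range("6100-07-10"),
            level_range("6100-08-05", "6100-08-07"),
        ],
        "VIOS": [level_range("2.2.1.9"), level_range("2.2.2.5", "2.2.2.70")],
    },
    "E850": {
        "AIX": [level_range("7100-02-07"), level_range("6100-08-07")],
        "VIOS": [level_range("2.2.2.70")],
    },
}


def is_affected(model_group: str, os_name: str, level: str) -> bool:
    """Return True if the OS level falls inside a listed failing range."""
    current = parse_level(level)
    return any(lo <= current <= hi for lo, hi in AFFECTED[model_group][os_name])


if __name__ == "__main__":
    print(is_affected("S822/S814/S824", "AIX", "7100-02-06"))
    print(is_affected("E850", "VIOS", "2.2.3.4"))
```

A level that is not listed is reported as unaffected, which matches the circumvention levels named in the entry (AIX 7100-03-03 or 6100-09-03, VIOS 2.2.3.4).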
SV840_118_056 / FW840.23
07/28/16

Impact: Data
Severity: HIPER

System firmware changes that affect certain systems

HIPER/Non-Pervasive: DEFERRED: On systems with DDR4 memory installed, a problem was fixed for the handling of data errors in the L4 cache. If a data error occurs in the L4 cache of the memory buffer on an affected system and it is pushed out to mainline memory, the data error will not be correctly handled. A data error originating in the L4 cache may result in incorrect data being stored into memory. The DDR4 DRAM has feature code (FC) EM8S for a 128GB 1600 MHz CDIMM. IBM strongly recommends that the customer should plan an outage to install the firmware fix immediately. Fix activation requires a subsequent platform IPL following the installation of the firmware fix to eliminate any exposure to this issue. This problem only exists on the 8408-E8E systems with the DDR4 DRAM memory feature.
SV840_113_056 / FW840.22
07/06/16

Impact: Availability
Severity: ATT

New features and functions

Support was added to Live Partition Mobility to allow migrations between partitions at firmware level FW760 and FW840.22 or later. Previously, migration operations were not allowed between FW760 and FW840 partitions.

Support for the CoD on the 226W 4.323 GHz eight core processor (CCIN 54E5, F/C EPXF) for the EasyScale offering of the S822 (8284-22A). This includes Processor Capacity on Demand (CoD) with Elastic (On/Off) Processor CoD and Trial Processor CoD. Previously, the CoD support for the EasyScale S822 was only available when using the ten core 3.42 GHz processor (CCIN 54E8, F/C EPXD).

System firmware changes that affect certain systems

On systems using PowerVM firmware, a problem was fixed for a sequence of two or more Live Partition Mobility migrations that caused a partition to crash with a SRC BA330000 logged (Memory allocation error in partition firmware). The sequence of LPM migrations that can trigger the partition crash is as follows:

The original source partition level can be any FW760.xx, FW763.xx, FW770.xx, FW773.xx, FW780.xx, or FW783.xx P7 level or any FW810.xx, FW820.xx, FW830.xx, or FW840.xx P8 level. It is migrated first to a system running one of the following levels:
1) FW730.70 or later 730 firmware or
2) FW740.60 or later 740 firmware

And then a second migration is needed to a system running one of the following levels:
1) FW760.00 - FW760.20 or
2) FW770.00 - FW770.10

The twice-migrated system partition is now susceptible to the BA330000 partition crash during normal operations until the partition is rebooted. If an additional LPM migration is done to any firmware level, the thrice-migrated partition is also susceptible to the partition crash until it is rebooted. With the fix applied, the susceptible partitions may still log multiple BA330000 errors but there will be no partition crash.
A reboot of the partition will stop the logging of the BA330000 SRC.
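The migration sequence described for FW840.22 can be expressed as a simple check over the firmware levels a partition has run on since its last reboot. This is an illustrative sketch, not an IBM tool: the function names are hypothetical, levels are assumed to be written as "FWnnn.nn", and the rule is read as origin, then first hop, then second hop in consecutive order.

```python
import re
from typing import List, Tuple

# Release families named in the entry as possible original source levels.
ORIGIN_RELEASES = {760, 763, 770, 773, 780, 783, 810, 820, 830, 840}


def parse_fw(level: str) -> Tuple[int, int]:
    """Split 'FW760.20' into (760, 20). The string format is an assumption."""
    match = re.fullmatch(r"FW(\d{3})\.(\d{2})", level)
    if match is None:
        raise ValueError(f"unrecognized firmware level: {level!r}")
    return int(match.group(1)), int(match.group(2))


def is_first_hop(release: int, service_pack: int) -> bool:
    """FW730.70 or later 730 firmware, or FW740.60 or later 740 firmware."""
    return (release == 730 and service_pack >= 70) or (
        release == 740 and service_pack >= 60
    )


def is_second_hop(release: int, service_pack: int) -> bool:
    """FW760.00 through FW760.20, or FW770.00 through FW770.10."""
    return (release == 760 and service_pack <= 20) or (
        release == 770 and service_pack <= 10
    )


def susceptible_to_ba330000(history: List[str]) -> bool:
    """history[0] is the original level; later items are migration targets.

    Hops after the second migration do not clear the exposure, because the
    entry states the partition stays susceptible until it is rebooted.
    """
    if len(history) < 3:
        return False
    origin, first, second = (parse_fw(item) for item in history[:3])
    return (
        origin[0] in ORIGIN_RELEASES
        and is_first_hop(*first)
        and is_second_hop(*second)
    )


if __name__ == "__main__":
    print(susceptible_to_ba330000(["FW840.10", "FW730.80", "FW760.10"]))
```

The history list is expected to be reset whenever the partition is rebooted, since a reboot ends the exposure.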
SV840_104_056 / FW840.20
05/31/16

Impact: Availability
Severity: SPE

New features and functions

Support for a 128GB DDR4 memory DIMM for the E850 (8408-E8E) model. Memory feature code #EM8S provides the 128GB CDIMM (1600 MHz, 8GBIT DDR4). Note that DDR4 and DDR3 DIMMs cannot be mixed in the system. Also, the minimum firmware level needed for DDR4 usage is FW840.23 due to a fix needed for a data integrity problem.

Support was added for the Stevens6+ option of the internal tray loading DVD-ROM drive with F/C #EU13. This is an 8X/24X(max) Slimline SATA DVD-ROM Drive. The Stevens6+ option is a FRU hardware replacement for the Stevens3+. MTM 7226-1U3 (Oliver) FC 5757/5762/5763 attaches to IBM Power Systems and lists Stevens6+ as optional for Stevens3+. If the Stevens6+ DVD drive is installed on the system without the required firmware support, the boot of an AIX partition will fail when the DVD is used as the load source. Also, an IBM i partition cannot consistently boot from the DVD drive using D-mode IPL. A SRC C2004130 may be logged for the load source not found error.

Support for the IBM PCIe3 12GB cache RAID plus SAS dual 4-port 6Gb x8 adapter with feature code #EJ14 and CCIN 57B1. This adapter is very similar to the #EJ0L SAS adapter, but it uses a second chip in the card to provide more IOPS capacity (significant performance improvement) and can attach more SSDs. This adapter uses integrated flash memory to provide protection of the write cache, without need for batteries, in case of power failure.

Support for PowerVM vNIC extended to Linux OS Ubuntu 16.04 LE with up to ten vNIC client adapters for each partition. PowerVM vNIC combines many of the best features of SR-IOV and PowerVM SEA to provide a network solution with options for advanced functions such as Live Partition Mobility along with better performance and I/O efficiency when compared to PowerVM SEA.
In addition, PowerVM vNIC provides users with bandwidth control (QoS) capability by leveraging SR-IOV logical ports as the physical interface to the network.

PowerVM CoD was enhanced to eliminate the yearly Utility CoD renewal on systems using Utility CoD. The Utility CoD usage is already monitored to make sure systems are running within the prescribed threshold limit of unreported usage, so a yearly customer renewal is not needed to manage the Utility CoD processor usage.

Support was added to the DHCP client on the service processor for non-random backoff mode needed for Data Center Manageability Interface (DCMI) V1.5 compliance. By default, the DHCP client does random backoff delays for retries during DHCP discovery. For DCMI V1.5, non-random backoff delays were introduced as an option. Disabling the random back-off mode is not required for normal operations, but if wanted, the system administrator can override the default and disable the random back-off mode by sending the "SET DCMI Configuration Parameters" for the random back-off property of the Discovery Configuration parameter. A value of "0" for the bit means "Disabled". Or, the DHCP configuration file can be modified to add "random-backoff off", causing the non-random mode for the retry delays to be used during DHCP discovery.

Support was added for enhanced diagnostics for PowerVM Simplified Remote Restart (SRR) partitions. This service pack level is recommended when using SRR partitions. You can learn more about SRR partitions at the IBM Knowledge Center: http://www.ibm.com/support/knowledgecenter/HW4P4/p8hat/p8hat_createremotereslpar.htm.

Support was added for auto-correction in the Advanced System Management Interface (ASMI) for the "Feature Code/Sequence Number" field of the "System Configuration/Program Vital Product Data/System Enclosures" menu selection.
Lower case letters are invalid in the "Feature Code/Sequence Number" field so these are now changed to upper case letters to help form a valid entry. For example, if "78c9-001" was entered, it would be changed to "78C9-001".

Support was added for HTTP Strict Transport Security (HSTS) compliance for the Advanced System Management Interface (ASMI) web connection. Even without this feature, any attempt to access ASMI with the HTTP protocol was rejected because the service processor firewall blocks port 80 (HTTP). But enabling HSTS for ASMI prevents HSTS security warnings for the service processor during network scans by security scanner programs such as IBM AppScan.

System firmware changes that affect all systems

DEFERRED: A problem was fixed in the dynamic RAM (DRAM) initialization to update the VREF on the DIMMs to the optimal settings and to add an additional margin check test to improve the reliability of the DRAM by screening out more marginal DIMMs before they can result in a run-time memory fault.

A problem was fixed for a degraded PCI link causing a processor core to be guarded if a non-cacheable unit (NCU) store time-out occurred with SRC B113E540 and PRD signature "(NCUFIR[9]) STORE_TIMEOUT: Store timed out on PB". With the fix, the processor core is not guarded because of the NCU error. If this problem occurs and a core is deconfigured, clear the guard record and re-IPL to regain the processor core. The solution for degraded PCI links is different from the fix for this problem, but a re-IPL of the CEC or a reset of the PCI adapters could help to recover the PCI links from their degraded mode.

A problem was fixed for an incorrect reduction in FRU callouts for Processor Run-time Diagnostic (PRD) errors after a reference oscillator clock (OSCC) error has been logged. Hardware resources are not called out and guarded as expected. Some of the missing PRD data can be found in the secondary SRC of B181BAF5 logged by hardware services.
The callouts that PRD would have made are in the user data of that error log.

A problem was fixed for a Qualys network scan for security vulnerabilities causing a core dump in the Intelligent Platform Management Interface (IPMI) process on the service processor with SRC B181EF88. The error occurs anytime the Qualys scan is run because it sends an invalid IPMI session ID that should have been handled and discarded without a core dump.

A security problem was fixed in OpenSSL for a possible service processor reset on a null pointer de-reference during RSA PSS signature verification. The Common Vulnerabilities and Exposures issue number is CVE-2015-3194.

A security problem was fixed in the lighttpd server on the service processor, where a remote attacker, while attempting authentication, could insert strings into the lighttpd server log file. Under normal operations on the service processor, this does not impact anything because the log is disabled by default. The Common Vulnerabilities and Exposures issue number is CVE-2015-3200.

A problem was fixed for the service processor going to the reset state instead of the termination state when the anchor card is missing or broken. At the termination state, the Advanced System Management Interface (ASMI) can be used to collect failure data and debug the problem with the anchor card.

A problem was fixed for error log entries created by Hostboot not getting written to the error log in some situations. This can cause hardware detected as failed by Hostboot to not get reported or have a call-home generated. This problem will occur whenever Hostboot commits a recovered or informational error as its last error log in the current IPL. In the next IPL, one or more error logs from Hostboot will be lost.

A problem was fixed for a service processor failure during a system power off that causes a reset of the service processor. The service processor is in the correct state for a normal system power on after the error.
The frequency for this error should be low as it is caused by a very rare race condition in the power off process. A problem was fixed so that service processor NVRAM bit flips are now detected and reported as predictive errors after a certain threshold of failures have occurred. The SRCs reported are B151F109 (threshold of NVRAM errors was reached) or B151F10A (a NVRAM address has failed multiple times). Previously, these normal wear errors in the NVRAM were ignored. The bit flip is self-corrected and does not cause a problem but a high occurrence of these could mean that a service processor card FRU or system backplane FRU, as called out in the SRC, is in need of service. A security problem was fixed in OpenSSL for a possible service processor reset on a null pointer de-reference during SSL certificate management. The Common Vulnerabilities and Exposures issue number is CVE-2016-0797. System firmware changes that affect certain systems DEFERRED: On systems using PowerVM firmware, a performance improvement was made by disabling the Hot/Cold Affinity (HCA) hardware feature, which gathers memory usage statistics for consumption by partition operating system memory management algorithms. The statistics gathering can, in rare cases, cause performance to degrade. The workloads that may experience issues are memory-intensive workloads that have little locality of reference and thus cannot take advantage of hardware memory cache. As a consequence, the problem occurs very infrequently or not at all except for very specific workloads in a HPC environment. This performance fix requires an IPL of the system to activate it after it is applied. On systems using PowerVM firmware and NovaLink co-management of the partitions, a problem was fixed with the Hardware Management Console (HMC) not showing the co-management master name with the HMC lscomgmt command. The command displayed blank text for the master owner when NovaLink established the master mode. 
This problem occurred whenever Novalink powered on and took the master mode that had been released by the HMC. On systems using OPAL firmware, a problem was fixed for Enhanced Error Handling (EEH) recoverable errors on network adapters behind a PLX switch having the backplane called out by OPAL instead of the adapter slot. On systems with a PowerVM Active Memory Sharing (AMS) partition with AIX Level 7.2.0.0 or later with Firmware Assisted Dump enabled, a problem was fixed for a Restart Dump operation failing into KDB mode. If "q" is entered to exit from KDB mode, the partition fails to start. The AIX partition must be powered off and back on to recover. The problem can be circumvented by disabling Firmware Assisted Dump (default is enabled in AIX 7.2). On a PowerVM system, a problem was fixed for an incorrect date in partitions created with a Simplified Remote Restart-Capable (SRR) attribute where the date is created as Epoch 01/01/1970 (MM/DD/YYYY). Without the fix, the user must change the partition time of day when starting the partition for the first time to make it correct. This problem only occurs with SRR partitions. On a PowerVM system with licensed Power Integrated Facility for Linux (IFL) processors, a problem was fixed for a system hang that could occur if the system contains both 1) dedicated processor partitions configured to share processors while active and 2) shared processor partitions. This problem is more likely to occur on a system with a low number of non-IFL processors. On systems using PowerVM firmware with dedicated processor partitions, a problem was fixed for the dedicated processor partition becoming intermittently unresponsive. The problem can be circumvented by changing the partition to use shared processors. This is a follow-on to the fix provided in 840.11 for a different issue for delays in dedicated processor partitions that were caused by low I/O utilization. 
A problem was fixed for transmit time-outs on a Virtual Function (VF) during stressful network traffic, on systems using PCIe adapters in Single Root I/O Virtualization (SR-IOV) shared-mode. This fix updates adapter firmware to 10.2.252.1918, for the following Feature Codes: EN15, EN16, EN17, EN18, EN0H, EN0J, EL38, EN0M, EN0N, EN0K, EN0L, and EL3C. The SR-IOV adapter firmware level update for the shared-mode adapters happens under user control to prevent unexpected temporary outages on the adapters. A system reboot will update all SR-IOV shared-mode adapters with the new firmware level. In addition, when an adapter is first set to SR-IOV shared mode, the adapter firmware is updated to the latest level available with the system firmware (and it is also updated automatically during maintenance operations, such as when the adapter is stopped or replaced). And lastly, selective manual updates of the SR-IOV adapters can be performed using the Hardware Management Console (HMC). To selectively update the adapter firmware, follow the steps given at the IBM Knowledge Center for using HMC to make the updates: https://www.ibm.com/support/knowledgecenter/HW4M4/p8efd/p8efd_updating_sriov_firmware.htm. Note: Adapters that are capable of running in SR-IOV mode, but are currently running in dedicated mode and assigned to a partition, can be updated concurrently either by the OS that owns the adapter or the managing HMC (if OS is AIX or VIOS and RMC is running). On systems using OPAL firmware, a problem was fixed in the PCI Host Bridge (PHB) to prevent adapter interrupts from being lost when two interrupts come in at the same time. The lost interrupts could result in a slow down for the workload using the affected adapter. This fixes a problem seen with some CAPI workloads that have lots of interrupt masking at the same time as a high interrupt load. However, the fix is not specific to the CAPI adapters. 
On systems using OPAL firmware, a problem was fixed for an extraneous SRC BB822411 being logged during service processor termination occurrences. This SRC is unrelated to the root cause of the termination and should be ignored. On systems using OPAL firmware, a problem was fixed for an incomplete reporting of a Hypervisor Maintenance Interrupt (HMI) to the host Linux OS. The fix ensures the CPU Processor Identification Register (PIR) is reported correctly instead of having an all-zero value. HMIs are caused by hardware failures occurring in the SLW (sleep winkle image) for processors or in CAPI (Coherent Accelerator Processor Interface) adapters. These cause the hypervisor to investigate the cause of the error by reading SCOM registers to isolate the fault and send an HMI. On systems using OPAL firmware, support was added to allow the Linux OS to send alphanumeric strings to the operations panel. The OS program must use a device driver for /dev/oppanel. The driver implements a 32-character buffer which a user can read/write by accessing the device (/dev/oppanel). This buffer is then displayed on the operator panel display. A problem was fixed for the Advanced System Management Interface (ASMI) "System Service Aids/Call-home Setup" menu not being able to clear the old Service center phone numbers. The blank or null characters are now accepted and can be used to overlay the existing values. Without the fix, the characters input to clear the phone number field are rejected and replaced with the old values. The ASMI Call-home option is not available for systems that are managed by the Hardware Management Console (HMC). On PowerVM systems using Elastic Capacity on Demand (CoD) (also known as On/Off CoD), a problem was fixed for losing entitlement amounts when upgrading from FW820 or FW830. If you upgrade to a service pack level that does not have this fix and lose the entitlement, you can get another On/Off (Elastic) CoD Enablement code from IBM Support. 
This problem only pertains to the E850 (8408-E8E), E870 (9119-MME), and E880 (9119-MHE) models. On IBM Power System S822 (8284-22A) using PowerVM for IBM i partitions, a problem was fixed for the User-based pricing indicator being off. This was changed to be on. The IBM i licensing fees involve a distinction between User-based and non-User-based pricing. The model S822 for PurePower (IBM i) now shows User-based pricing as required.
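For the OPAL operator panel support described above (the /dev/oppanel device with its 32-character buffer), the following is a minimal Python sketch of how a Linux program might prepare and write a message. The device path and buffer width come from the text; the helper names, the space padding, and the replacement of non-printable characters with "?" are illustrative assumptions, not documented driver behavior.

```python
PANEL_WIDTH = 32  # buffer size stated in the release notes


def format_panel_text(text, width=PANEL_WIDTH):
    """Return `text` as exactly `width` ASCII bytes.

    Assumption: padding with spaces overwrites any stale characters,
    and characters outside printable ASCII are replaced with '?'.
    """
    cleaned = "".join(ch if " " <= ch <= "~" else "?" for ch in text)
    return cleaned[:width].ljust(width).encode("ascii")


def write_panel(text, device="/dev/oppanel"):
    """Write the formatted message to the panel device.

    The path is the one named in the text; it exists only on an OPAL
    system with the operator panel driver loaded.
    """
    with open(device, "wb", buffering=0) as panel:
        panel.write(format_panel_text(text))
```

On a system without the device, only format_panel_text can be exercised; write_panel would typically need root privileges on the target host.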
SV840_087_056 / FW840.11 03/18/16	Impact: Availability Severity: ATT New features and functions Support for PowerVM co-management mode on the Hardware Management Console (HMC). This feature allows the HMC and PowerVM NovaLink to both have a live management connection to the system. This is different than the traditional dual-HMC model however, and results in some behavior changes in the HMC. For hardware and service management functions, the HMC works as it does when not in co-management mode. However, when in co-management mode, only the PowerVM Co-Management Master can make changes to the PowerVM configuration and change the state of the system. Power System firmware updates must be done using the HMC, with the HMC as the Co-Management Master. All management entities (HMC(s) and NovaLink) have read-access to the partition configuration regardless of whether they are the designated master. Typically NovaLink will be the co-management master, however if a virtualization task or a firmware update is needed, one can explicitly request master authority for the HMC, perform the action, and then relinquish the authority back to NovaLink. The minimum firmware and HMC levels for this feature are FW840.11 and HMC V8R8.4.0.1. If using PowerVC with NovaLink co-management, the minimum level is PowerVC 1.3.0.2. Please refer to IBM KnowledgeCenter link "http://ibm.biz/novalink-kc" for more information on the PowerVM NovaLink feature and changing the master authority when doing co-management. Note: If a firmware update is attempted from a co-managing HMC that is not in the master role, the update operation will fail with the following message: "Could not start the update because this management console is not the master console. Check to see if there is another management console program is attached to the target server {0} (HSCF0261E)" along with HMC SRC E302FB11. 
The default setting for the "Enlarged I/O Memory Capacity" feature was disabled on newly manufactured E850, E870 & E880 models to reduce hypervisor memory usage. Customers of the new systems using PCI adapters that leverage "Enlarged I/O Memory Capacity" will need to explicitly enable this feature for the supported PCI slots, using ASMI Menus while the system is powered off. Existing systems will not see a change in their current setting. For existing systems with only AIX and IBM i partitions that do not benefit from this feature, it can be disabled by using the Advanced System Management Interface (ASMI) for the "System Configuration-> I/O Adapter Enlarged Capacity" panel to uncheck the option for the "I/O Adapter Enlarged Adapter Capacity" feature. System firmware changes that affect certain systems On systems using PowerVM partitions, a problem was fixed for error recovery from failed Live Partition Mobility (LPM) migrations. The recovery error is caused by a partition reset that leaves the partition in an unclean state with the following consequences: 1) A retry on the migration for the failed source partition may not be allowed; and 2) With enough failed migration recovery errors, it is possible that any new migration attempts for any partition will be denied. This error condition can be cleared by a re-IPL of the system. The partition recovery error after a failed migration is much more likely to occur for partitions managed by NovaLink but it is still possible to occur for Hardware Management Console (HMC) managed partitions.
SV840_079_056 / FW840.10 03/04/16	Impact: Availability Severity: SPE New features and functions Support was added to block a full Hardware Management Console (HMC) connection to the service processor when the HMC is at a lower firmware major and minor release level than the service processor. In the past, this check was done only for the major version of the firmware release but it now has been extended to the minor release version level as well. The HMC at the lower firmware level can still make a limited connection to the higher firmware level service processor. This will put the CEC in a "Version Mismatch" state. Firmware updates are allowed with the CEC in the "Version Mismatch" state so that the condition can be corrected with either a HMC update or a firmware update of the CEC. Support for Processor Capacity on Demand (CoD) for the IBM Power System S822 (8284-22A) that includes Elastic (On/Off) Processor CoD and Trial Processor CoD. Support was removed in the Advanced Systems Management Interface (ASMI) and IPMI for allowing the IBM Power System S822 (8284-22A) to change between OPAL and PowerVM hypervisor modes. The default for new 8284-22A systems is PowerVM mode and it cannot be changed to OPAL. For existing customers with 8284-22A systems, both hypervisor modes (PowerVM & OPAL) are still available after the firmware is upgraded to 840.10, so they are not affected by the change. Support was added for a 4-Core 3.02 GHz POWER8 Processor Card with CCIN 54E9 and feature code #EPXK for the S822 (8284-22A), S812L(8247-21L), and S822L (8247-22L) models. Support for PowerVM vNIC with more vNIC client adapters for each partition, up to 10 from a limit of 6 at the FW840.00 level. PowerVM vNIC combines many of the best features of SR-IOV and PowerVM SEA to provide a network solution with options for advanced functions such as Live Partition Mobility along with better performance and I/O efficiency when compared to PowerVM SEA. 
In addition PowerVM vNIC provides users with bandwidth control (QoS) capability by leveraging SR-IOV logical ports as the physical interface to the network. Support for the IBM Power System E850 (8408-E8E) with AIX and Linux partitions. The default setting for the "Enlarged I/O Memory Capacity" feature was disabled on newly manufactured E850, E870 & E880 models to reduce hypervisor memory usage. Customers using PCI adapters that leverage "Enlarged I/O Memory Capacity" will need to explicitly enable this feature for the supported PCI slots, using ASMI Menus while the system is powered off. System firmware changes that affect all systems A problem was fixed for false errors logs for SRC B181A40F where upper domain fans are incorrectly reported as missing on a reboot of the service processor. This problem only pertains to the IBM Power System E850 (8408-E8E). A problem was fixed for not being able to control all I/O slots for Huge Dynamic DMA Window (HDDW) capability on the IBM Power System E850 (8408-E8E). There are 13 I/O slots enabled for HDDW on this system but only 8 could be controlled by the Advanced System Management Interface (ASMI) panel for "I/O Enlarged Capacity". This prevented enabling all slots to be HDDW enabled, limiting DMA bandwidth on some of the I/O slots. A problem was fixed for a system IPL hang at C100C1B0 with SRC 1100D001 when the power supplies have failed to supply the necessary 12-volt output for the system. The 1100D001 SRC was calling out the planar when it should have called out the power supplies. With the fix, the system will terminate as needed and call out the power supply for replacement. One mode of power supply failure that could trigger the hang is sync-FET failures that disrupt the 12-volt output. A problem was fixed for a PCIe3 I/O expansion drawer (#EMX0) not getting all error logs reported when its error log queue is full. 
In the case where the error log queue is full with 16 entries, only one entry is returned to the hypervisor for reporting. This error log truncation only occurs during periods of high error activity in the expansion drawer. A problem was fixed for the callout of a VPD collection fault and system termination with SRC 11008402 to include the 1.2vcs VRM FRU. The power good fault for the 1.2 volts would be a primary cause of this error. Without the fix, the VRM is missing from the callout list, which has only the VPDPART isolation procedure. A problem was fixed for excessive logging of the SRC 11002610 on a power good (pgood) fault when detected by the Digital Power Subsystem Sweep (DPSS). Multiple pgood interrupts are signaled by the DPSS in the interval between the first pgood failure and the node power down. A threshold was added to limit the number of error logs for the condition. A problem was fixed to speed recovery for VPD collection time-out errors for PCIe resources in an I/O drawer logged with SRC 10009133 during concurrent firmware updates. With the fix, the hypervisor is notified as soon as the VPD collection has finished so the PCIe resources can report as available. Without the fix, there is a delay as long as two hours for the recovery to complete. A problem was fixed to allow IPMI entity IDs to be used in ipmitool raw commands on the service processor to get the temperature reading. Without the fix, the DCMI entity IDs have to be used in the raw command for the "Get temperature" function. A problem was fixed for a false unrecoverable error (UE) logged for B1822713 when an invalid cooling zone is found during the adjustment of the system fan speeds. This error can be ignored as it does not represent a problem with the fans. A problem was fixed for loss of back-level protection during firmware updates if an anchor card has been replaced. The Power system manufacturing process sets the minimum code level a system is allowed to have for proper operation. 
If an anchor card is replaced, it is possible that the replacement anchor card is one that has the Minimum MIF Level (MinMifLevel) given as "blank", and this removes the system back-level protection. With the fix, blanks or nulls on the anchor card for this field are handled correctly to preserve the back-level protection. Systems that have already lost the back-level protection due to anchor card replacement remain vulnerable to an accidental downgrade of code level by operator error, so code updates to a lower level for these systems should only be performed under guidance from IBM Support. The following command can be run in the Advanced System Management Interface (ASMI) to determine if the system has lost the back-level protection with the presence of "blanks" or ASCII hex 20 (space) values for MinMifLevel: "registry -l cupd/MinMifLevel" with output: "cupd/MinMifLevel: 2020202020202020 2020202020202020 [ ] 2020202020202020 2020202020202020 [ ]" A problem was fixed for a code update from FW830 to a FW840 level that causes temperature sensors to be lost, so that the ipmitool command to list the temperature sensors fails with an IPMI program core dump. If the temperature sensors are already corrupted due to a preceding code update, this fix adds back in the temperature sensors to allow the ipmitool to work for listing the temperature sensors. A problem was fixed for a system checkstop caused by an L2 cache least-recently used (LRU) error that should have been a recoverable error for the processor and the cache. The cache error should not have caused an L2 HW CTL error checkstop. A problem was fixed for a re-IPL with power on failure with B181A40F SRC logged for VPD not found for a DIMM FRU. The DIMM had been moved to another slot or just removed. In this situation, an IPL of the system from power off will work without errors, but a re-IPL with power on, such as that done after processing a hardware dump, will fail with the B181A40F. Power off the system and IPL to recover. 
Until the fix is applied, the problem can be circumvented after a DIMM memory move by putting the PNOR flash memory in genesis mode by running the following commands in ASMI with the CEC powered off: 1) "hwsvPnorCmd -c" and 2) "hwsvPnorCmd -g". A problem was fixed for the service processor becoming inaccessible when having a dynamic IP address and being in DCMI "non-random" mode for DHCP discovery by customer configuration. The problem can occur intermittently during an AC power on of the system. If the service processor does not respond on the network, AC power cycle to recover. Without the fix, the problem can be circumvented by using the DHCP client in the DCMI "random" mode for DHCP discovery, which is the default on the service processor. A problem was fixed for a memory initialization error reported with SRC BC8A0506 that terminates the IPL. This problem is unlikely to occur because it depends on a specific memory location being used by the code load. The system can be recovered from the error by doing another IPL. System firmware changes that affect certain systems On PowerVM systems a problem was fixed to address a performance degradation. The problem surfaces under the following conditions: 1) There is at least one VIOS or Linux partition that is running with dedicated processors AND 2) There is at least one VIOS or Linux partition running with shared processors AND 3) There is at least one AIX or IBM i partition configured with shared processors. If ALL the above conditions are met AND one of the following actions occurs, 1) VIOS/Linux dedicated processor partition is configured to share processors while active OR 2) A dynamic platform optimization operation (HMC 'optmem' command) is performed OR 3) Processors are unlicensed via a capacity on demand operation, there is an exposure for a loss in performance. 
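The exposure rule for the PowerVM performance degradation above combines three conditions (all required) with three triggering actions (any one suffices). A minimal Python sketch of that logic follows; the class and field names are hypothetical labels for the conditions and actions listed in the text, not part of any IBM tool or API.

```python
from dataclasses import dataclass


@dataclass
class PartitionMix:
    # Conditions 1-3 from the text (all must hold)
    has_dedicated_vios_or_linux: bool
    has_shared_vios_or_linux: bool
    has_shared_aix_or_ibmi: bool
    # Actions 1-3 from the text (any one is enough)
    dedicated_set_to_share_while_active: bool = False
    ran_optmem: bool = False
    unlicensed_processors_via_cod: bool = False


def is_exposed(mix):
    """Return True when the text's conditions and actions indicate exposure."""
    all_conditions = (
        mix.has_dedicated_vios_or_linux
        and mix.has_shared_vios_or_linux
        and mix.has_shared_aix_or_ibmi
    )
    any_action = (
        mix.dedicated_set_to_share_while_active
        or mix.ran_optmem
        or mix.unlicensed_processors_via_cod
    )
    return all_conditions and any_action
```

For example, a system that meets all three conditions is flagged only after one of the actions has taken place, such as an 'optmem' run.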
On systems using PowerVM firmware, a problem was fixed for PCIe switch recovery to prevent a partition switch failure during the IPL with error logs for SRC B7006A22 and B7006971 reported. This problem can occur when doing recovery for an informational error on the switch. If this problem occurs, the partition must be restarted to recover the affected I/O adapters. On systems using PowerVM firmware, a problem was fixed for a concurrent FRU exchange of a CAPI (Coherent Accelerator Processor Interface) adapter for a standard I/O adapter that results in a vary off failure. If this failure occurs, the system needs to be re-IPLed to fix the adapter. The trigger for this failure is a dual exchange where the CAPI adapter is exchanged first for a standard (non-like-typed) adapter. Then an attempt is made to exchange the standard adapter for a CAPI adapter which fails. On systems using PowerVM firmware, a problem was fixed for a CAPI (Coherent Accelerator Processor Interface) device going to a "Defined" state instead of "Available" after a partition boot. If the CAPI device is doing recovery and logging error data at the time of the partition boot, the error may occur. To recover from the error, reboot the partition. With the fix, the hypervisor will wait for the logging of error data from the CAPI device to finish before proceeding with the partition boot. On systems using PowerVM firmware, a problem was fixed for a hypervisor adjunct partition failed with "SRC B2009008 LP=32770" for an unexpected SR-IOV adapter configuration. Without the fix, the system must be re-IPLed to correct the adjunct error. This error is infrequent and can only occur if an adapter port configuration is being changed at the same time that error recovery is occurring for the adapter. 
On systems using PowerVM firmware and PCIe adapters in SR-IOV mode, the following problem was addressed with a Broadcom Limited (formerly known as Avago Technologies and Emulex) adapter firmware update to 10.2.252.1913: Transmit time-outs on a Virtual Function (VF) during stressful network traffic. On systems using PowerVM firmware with an invalid P-side or T-side in the firmware, a problem was fixed in the partition firmware Real-Time Abstraction System (RTAS) so that system Vital Product Data (VPD) is returned at least from the valid side instead of returning no VPD data. This allows AIX host commands such as lsmcode, lsvpd, and lsattr that rely on the VPD data to work to some extent even if there is one bad code side. Without the fix, all the VPD data is blocked from the OS until the invalid code side is recovered by either rejecting the firmware update or attempting to update the system firmware again. On systems using PowerVM firmware without an HMC (and in Manufacturing Default Configuration (MDC) mode with a single host partition), a problem was fixed for missing dumps of type SYSDUMP, FSPDUMP, LOGDUMP, and RSCDUMP that were not off-loaded to the host OS. This is an infrequent error caused by a timing error that causes the dump notification signal to the host OS to be lost. The missing/pending dumps can be retrieved by rebooting the host OS partition. The rebooted host OS will receive new notifications of the dumps that have to be off-loaded. On systems using PowerVM firmware, a problem was fixed for truncation on the memory fields displayed in the Advanced System Management Interface on the COD panels. ASMI shows three fields of memory called "Installed memory", "Permanent memory", and "Inactive memory". The largest value that can be displayed in the fields was "9999" GB. This has been expanded to a maximum of "999999" GB for each of the ASMI fields. 
The truncation was only in the displayed memory value, not in the actual memory size being used by the system which was correct. On systems using PowerVM firmware and a partition using Active memory Sharing (AMS), a problem was fixed for a Live Partition Mobility (LPM) migration of the AMS partition that can hang the hypervisor on the target CEC. When an AMS partition migrates to the target CEC, a hang condition can occur after processors are resumed on the target CEC, but before the migration operation completes. The hang will prevent the migration from completing, and will likely require a CEC reboot to recover the hung processors. For this problem to occur, there needs to be memory page-based activity (e.g. AMS dedup or Pool paging) that occurs exactly at the same time that the Dirty Page Manager's PSR data for that page is being sent to the target CEC. On systems using PowerVM firmware, a problem was fixed for PCIe adapter hangs and network traffic error recovery during Live Partition Mobility (LPM) and SR-IOV vNIC (virtual ethernet adapter) operations. An error in the PCI Host Bridge (PHB) hardware can persist in the L3 cache and fail all subsequent network traffic through the PHB. The PHB error recovery was enhanced to flush the PHB L3 cache to allow network traffic to resume. On systems using PowerVM firmware with AIX or Linux partitions with greater than 8TB of memory, a problem was fixed for Dynamic DMA Window (DDW) enabled adapters IPLing into a "Defined" state, instead of "Available", and unusable with a "0" size DMA window. If a DDW enabled adapter is plugged into an HDDW (Huge Dynamic DMA Window) slot in a partition with the large memory size, the OS changes the default DMA window to "0" in size. To prevent this problem, the Advanced System Management Interface (ASMI) in the service processor can be used to set "I/O Enlarged Capacity" to "0" (which is off), and all the DDW enabled adapters will work on the next IPL. 
On systems using OPAL firmware, a problem was fixed for a held PSI link in delayed power off during a reset/reload of the service processor. This error makes the service processor do a forced recovery of the PSI link on the next IPL. For this problem, the PSI SRCs and error logs can be ignored as there is no problem in the PSI link. On systems using OPAL firmware, a problem was fixed for intermittent errors in the module autoload function in the ibmpowernv driver. A compatible property "ibm.opal-sensor" was added to implement the fix for a smooth autoload in Linux. On systems using OPAL firmware, a problem was fixed for lost console output for serial consoles during power downs and reboots. If a power down or reboot is detected, the console output buffer is now flushed before proceeding with the operation. On systems using OPAL firmware, an informational message was added that OPAL does not support opal-prd since the processor runtime diagnostics (PRD) are handled by the service processor. On systems using OPAL firmware, a performance problem was fixed in the OPAL hypervisor PCI Host Bridge (PHB) to prevent the PHB L3 cache from retrying defunct entries in the L3 after an MSI end of information (EOI) has been received. The cache line is now flushed after updating the P/Q bits in the priority queue. The situation is improved (and thus performance) by sending a DCBF (Data Cache Block Flush) to force a flush of PHB cache. This improves interrupt performance, reducing latency per interrupt. The improvement will vary by workload. On systems using OPAL firmware, a problem was fixed for the OPAL hypervisor not releasing the PSI link after a power off of the CEC. With the PSI link unavailable, the service processor has to forcibly reclaim it on the next IPL, causing erroneous SRCs and error logs for the PSI link when no problem exists. On systems using OPAL firmware, a problem was fixed for an infinite loop in the boot of a host OS Linux kernel. 
Under rare error conditions in the real time clock, a bad error code returned to the host could cause it to get stuck in an infinite loop. On systems using PowerVM firmware and NovaLink management of the partitions, a problem was fixed for error recovery for the NovaLink partition in cases where it has gone unresponsive with a heartbeat failure. Without the fix, the system would have to be re-IPLed. With the fix, the hypervisor reboots the NovaLink partition to resume normal operations. On PowerVM systems with partitions running Linux, a problem was fixed for intermittent hangs following a Live Partition Mobility (LPM) migration of a Linux partition. A partition migrating from a source system running FW840.00 to a system running any other supported firmware level may become unresponsive and unusable once it arrives on the target system. The problem only affects Linux partitions and is intermittent. Only partitions that have previously been migrated to a FW840.00 system are susceptible to a hang on subsequent migration to another system. If a partition is hung following a LPM migration, it must be rebooted on the target system to resume operations. On systems using OPAL firmware, a problem was fixed that prevented multiple NVIDIA Tesla K80 GPUs from being attached to one PCIe adapter. This prevented using a PCIe attached GPU drawer. This fix increases the PCIe MMIO (memory-mapped I/O) space to 1 TB from a previous maximum of 64 GB per PHB/PCIe slot. On PowerVM systems with dedicated processor partitions with low I/O utilization, the dedicated processor partition may become intermittently unresponsive. The problem can be circumvented by changing the partition to use shared processors. On systems using OPAL firmware, a problem was fixed in OPAL to identify the PCI Host Bridge (PHB) on CAPI adapter errors and not always assume PHB0. 
On systems using OPAL firmware, a problem was fixed in the OPAL gard utility to remove gard records after guarded components have been replaced, Without the fix, Hostboot and the gard utility could be in disagreement on the replaced components, causing some components to still display as guarded after a repair. On systems using PowerVM firmware with partitions with very large number of PCIe adapters, a problem was fixed for partitions that would hang because the partition firmware ran out of memory for the OpenFirmware FCode device drivers for PCIe adapters. With the fix, the hypervisor is able to dynamically increase the memory to accommodate the larger partition configurations of I/O slots and adapters. On PowerVM systems with vNIC adapters, a problem was fixed for doing a network boot or install from the adapter using a VLAN tag. Without the fix, the support is missing for doing a network boot from the VLAN tag from the SMS RIPL menu. On systems using PowerVM firmware, a problem was fixed for a Live Partition Mobility (LPM) migration of a partition with large memory that had a migration abort when the partition took longer than five minutes to suspend. This is a rare problem and is triggered by an abnormally slow response time from the migrating partition. With the fix, the five minute time limit on the suspend operation has been removed. On systems using PowerVM firmware at FW840.00 with an AIX VIO client partition at level 7.1 TL04 SP03 or 7.2 TL01 SP00 or later, a problem was fixed for virtual ethernet adapters adapters with a IPv6 largesend packet (-i.e., data packets of size greater than the maximum transmission unit (MTU)) that hung and/or ran slow because largesend packets were discarded by the hypervisor. For example, telnet and ping commands for the system will be working but as soon as a send of a large packet of data is attempted, the network connection hangs. 
This firmware fix requires AIX levels 7.1 TL04 SP03 or 7.2 TL01 SP00 or later for the largesend feature to work. The problem can be circumvented by disabling "mtu_bypass" (largesend) on the AIX VIO client. The "mtu_bypass" attribute is disabled by default, but many network administrators enable it for a performance gain. To disable "mtu_bypass" on the AIX VIO client, use the following steps:
(0) This change may impact existing connections, so shut down the affected NICs (where X is the interface number) prior to the change.
(1) Log in to the AIX VIO client from the console as root.
(2) ifconfig enX down;ifconfig enX detach
(3) chdev -l enX -a mtu_bypass=off
(4) chdev -l enX -a state=up
(5) mkdev -l inet0
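As a rough aid for the mtu_bypass steps above, the following sketch only prints the commands for a given interface so they can be reviewed before being run as root on the AIX VIO client. The function name mtu_bypass_off_cmds is hypothetical and is not part of AIX; the printed commands are the ones listed in steps (2) through (5).

```shell
# Hypothetical helper: print (do not run) the commands from steps (2)-(5)
# for one interface, so they can be reviewed before use on the AIX VIO client.
mtu_bypass_off_cmds() {
    ifc=$1   # interface name such as en0
    case $ifc in
        en[0-9]*) ;;   # accept names of the form enX
        *) echo "usage: mtu_bypass_off_cmds enX" >&2; return 1 ;;
    esac
    printf '%s\n' \
        "ifconfig $ifc down" \
        "ifconfig $ifc detach" \
        "chdev -l $ifc -a mtu_bypass=off" \
        "chdev -l $ifc -a state=up" \
        "mkdev -l inet0"
}
```

For example, mtu_bypass_off_cmds en0 lists the five commands for interface en0. Printing rather than executing is a deliberate choice here, because the change may interrupt existing connections.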
SV840_056_056 / FW840.00 12/04/15	Impact: New Severity: New New features and functions NOTE: POWER8 (and later) servers include an “update access key” that is checked when system firmware updates are applied to the system. The initial update access keys include an expiration date which is tied to the product warranty. System firmware updates will not be processed if the GA date of the desired firmware level occurred after the update access key’s expiration date. As these update access keys expire, they need to be replaced using either the Hardware Management Console (HMC) or the Advanced System Management Interface (ASMI) on the service processor. Update access keys can be obtained via the key management website: http://www.ibm.com/servers/eserver/ess/index.wss. Support for allowing the PowerVM hypervisor to continue to run when communication between the service processor and platform firmware has been lost and cannot be re-established. An SRC B1817212 may be logged and any active partitions will continue to run but they will not be able to be managed by the management console. The partitions can be allowed to run until the next scheduled service window at which time the service processor can be recovered with an AC power cycle or a pin-hole reset from the operator panel. This error condition would only be seen on a system that had been running with a single service processor (no redundancy for the service processor). Support for an HVDC (180-400 VDC) 1400W power supply in a one plus one or two plus two configuration to support redundancy. Supported in rack models only, with F/C EB2N for the S822 (8284-22A), S814 (8286-41A), S824 (8286-42A), and E850 (8408-E8E) models, and with F/C EL1D for the S812L (8247-21L), S822L (8247-22L), and S824L (8247-42L) models. Support in the Advanced System Management Interface (ASMI) for managing certificates on the service processor with option "System Configuration/Security/Certificate Management". 
Certificate management includes 1) Generation of Certificate Signing Request (CSR) 2) Download of CSR and 3) Upload of signed certificates. For more information on managing certificates, go to the IBM KnowledgeCenter link for "Certificate Management" (https://www-01.ibm.com/support/knowledgecenter/P8ESS/p8hby/p8hby_securitycertificate.htm). Support for water cooling of the processor module in place of air cooling fins with feature code #ER2C. The PCIe C5 slot carries the water lines so a PCIe adapter cannot be used there when the water cooling is installed. This feature is available for the S822 (8284-22A) and S822L (8247-22L) models only. Support for a High Frequency Trading policy to speed the processors. When this policy is enabled, the processor cores are allowed to run at a higher frequency and voltage for better performance. A new panel was created in the Advanced Systems Management Interface (ASMI) "System Configuration/High Frequency Trading" to enable and disable this policy. In PowerVM mode, this feature applies only to the S822 (8284-22A), S812L (8247-21L), and S822L (8247-22L) models. In OPAL mode, this feature applies to S812L (8247-21L) and S822L (8247-22L) with Ubuntu 14.04.3 bare-metal, Ubuntu 15.10 bare-metal, or RHEL 7.2 LE bare-metal. Support for enhanced power management on PowerKVM systems with memory throttling and in-band power measurement capability. This feature applies to S812L (8247-21L) and S822L (8247-22L) models only. Support for service processor call home of error logs over ethernet (no dial-up modem required). The call home setup is done through an option on the Advanced System Management Interface called "System Service Aids/Call-Home Setup". This feature is only available for systems that are not attached to a management console. 
For guidance on how to set up the call-home on the service processor, go to the IBM KnowledgeCenter link for "Configuring the call-home policy" (https://www-01.ibm.com/support/knowledgecenter/P8DEA/p8hby/callhomesetup.htm). PowerVM support for Coherent Accelerator Processor Interface (CAPI) adapters. The PCIe3 LP CAPI Accelerator Adapter with F/C #EJ16 is used on the S812L (8247-21L) and S822L (8247-22L) models. The PCIe3 CAPI FlashSystem Accelerator Adapter with F/C #EJ17 is used on the S814 (8286-41A) and S824 (8286-42A) models. The PCIe3 CAPI FlashSystem Accelerator Adapter with F/C #EJ18 is used on the S822 (8284-22A), E870 (9119-MME), and E880 (9119-MHE) models. This feature does not apply to the S824L (8247-42L) model. Support for PCIe3 Expansion Drawer (#EMX0) lower cable failover, using lane reversal mode to bring up the expansion drawer from the top cable. This eliminates a single point of failure by supporting lane reversal in case of problems with the lower cable. Expanded support of Virtual Ethernet Large send from IPv4 to the IPv6 protocol in PowerVM. Support for IBM i network install on an IEEE 802.1Q VLAN. The OS supported levels are IBM i 7.2 TR3 or later. This feature applies only to S814 (8286-41A), S824 (8286-42A), E870 (9119-MME), and E880 (9119-MHE) models. Support for PowerVM vNIC with up to six vNIC client adapters for each partition. PowerVM vNIC combines many of the best features of SR-IOV and PowerVM SEA to provide a network solution with options for advanced functions such as Live Partition Mobility along with better performance and I/O efficiency when compared to PowerVM SEA. In addition, PowerVM vNIC provides users with bandwidth control (QoS) capability by leveraging SR-IOV logical ports as the physical interface to the network. Note: If more than six vNIC client adapters are used in a partition, the partition will run, as there is no check to prevent the extra adapters, but certain operations such as Live Partition Mobility may fail. 
Enhanced handling of errors to allow partial data in a Shared Storage Pool (SSP) cluster. Under partial data error conditions, the management console "Manage PowerVM" gui will correctly show the working VIOS clusters along with information about the broken VIOS clusters, instead of showing no data. PowerVM enhanced to support Little Endian (LE) Linux guest OSes with Nvidia Compute Intensive Accelerator (PCIe attached GPU) with F/C EC47 and EC4B. These adapters are only supported on the IBM Power System S824L (8247-42L) model. Little Endian must be used because the Nvidia software stack is only enabled for LE mode. Live Partition Mobility (LPM) was enhanced to allow the user to specify VIOS concurrency level overrides. Support was added for PowerVM hard compliance enforcement of the Power Integrated Facility for Linux (IFL). IFL is an optional lower cost per processor core activation for Linux-only workloads on IBM Power Systems. Power IFL processor cores can be activated that are restricted to running Linux workloads. In contrast, processor cores that are activated for general-purpose workloads can run any supported operating system. PowerVM will block partition activation, LPM and DLPAR requests on a system with IFL processors configured if the total entitlement of AIX and IBMi partitions exceeds the amount of licensed general-purpose processors. For AIX and IBMi partitions configured with uncapped processors, the PowerVM hypervisor will limit the entitlement and uncapped resources consumed to the amount of expensive processors that are currently licensed. Support was added to allow Power Enterprise Pools to convert permanently-licensed (static) processors to Pool Processors using a CPOD COD activation code provided by the management console. Previously, only unlicensed processors were able to become Pool Processors. The management console was enhanced to allow a Live Partition Mobility (LPM) if there is a failed VIOS in a redundant pair. 
During LPM, if the VIOS is inactive, the management console will use stored configuration information to perform the LPM. The firmware update process from the management console and from in-band OS (except for IBM i PTFs) has been enhanced to download new "Update access keys" as needed to prevent the access key from expiring. This provides an automatic renewal process for the entitled customer. Live Partition Mobility support was added to allow the user to specify a different virtual Ethernet switch on the target server. PowerVM was enhanced to support an AIX Live Update where the AIX kernel is updated without rebooting the kernel. The AIX OS level must be 7.2 or later. Starting with AIX Version 7.2, the AIX operating system provides the AIX Live Update function which eliminates downtime associated with patching the AIX operating system. Previous releases of AIX required systems to be rebooted after an interim fix was applied to a running system. This new feature allows workloads to remain active during a Live Update operation and the operating system can use the interim fix immediately without needing to restart the entire system. In the first release of this feature, AIX Live Update will allow customers to install interim fixes (ifixes) only. For more information on AIX Live Update, go to the IBM KnowledgeCenter link for "Live Update" (https://www-01.ibm.com/support/knowledgecenter//ssw_aix_72/com.ibm.aix.install/live_update_install.htm). The management console has been enhanced to use standard FTP in its firmware update process instead of a custom implementation. This will provide a more consistent interface for the users. Support for setting Power Management Tuning Parameters from the management console (Fixed Maximum Frequency (FMF), Idle Power Save, and DPS Tunables) without needing to use the Advanced System Management Interface (ASMI) on the service processor. 
This allows FMF mode to be set by default without having to modify any tunable parameters using ASMI. Support for a Corsa PCIe adapter with accelerator FPGA for low latency connection using CAPI (Coherent Accelerator Processor Interface) attached to a FlashSystem 900 using two 8Gb optical SR Fibre Channel (FC) connections. Supported IBM Power Systems for this feature are the following:
1) E880 (9119-MHE) with CAPI Activation feature #EC19 and Corsa adapter #EJ18 Low profile on AIX.
2) E870 (9119-MME) with CAPI Activation feature #EC18 and Corsa adapter #EJ18 Low profile on AIX.
3) S822 (8284-22A) with CAPI Activation feature #EC2A and Corsa adapter #EJ18 Low profile on AIX.
4) S814 (8286-41A) with CAPI Activation feature #EC2A and Corsa adapter #EJ17 Full height on AIX.
5) S824 (8286-42A) with CAPI Activation feature #EC2A and Corsa adapter #EJ17 Full height on AIX.
6) S812L (8247-21L) with CAPI Activation feature #EC2A and Corsa adapter #EJ16 Low profile on Linux.
7) S822L (8247-22L) with CAPI Activation feature #EC2A and Corsa adapter #EJ16 Low profile on Linux.
OS levels that support this feature are PowerVM AIX 7.2 or later and OPAL bare-metal Linux Ubuntu 15.10. The IBM FlashSystem 900 storage system is model 9840-AE2 (one year warranty) or 9843-AE2 (three year warranty) at the 1.4.0.0 or later firmware level with feature codes #AF23, #AF24, and #AF25 supported for 1.2 TB, 2.9 TB, and 5.7 TB modules, respectively. The Digital Power Subsystem Sweep (DPSS) FPGA, used to control P8 fan speeds and memory voltages, was enhanced to support the 840 GA level. This DPSS update is delayed to the next IPL of the CEC and adds 18 to 20 minutes to the IPL. See the "Concurrent Firmware Updates" section above for details. Support for Data Center Manageability Interface (DCMI) V1.5 and Energy Star compliance. DCMI features were added to the Intelligent Platform Management Interface (IPMI) 2.0 implementation on the service processor. 
DCMI adds platform management capability for monitoring elements such as system temperatures, power supplies, and bus errors. It also includes automatic and manually driven recovery capabilities such as local or remote system resets, power on/off operations, and logging of abnormal or "out-of-range" conditions for later examination. It also allows querying for inventory information that can help identify a failed hardware unit, along with power management options for getting and setting power limits. Note: A deviation from the DCMI V1.5 specification exists for 840.00 for the DCMI Configuration Parameters for DHCP Discovery. Random back-off mode is enabled by default instead of being disabled. The random back-off puts a random variation delay in the DHCP retry interval so that the DHCP clients are not responding at the same time. Disabling the back-off time is not required for normal operations, but if wanted, the system administrator can override the default and disable the random back-off mode by sending the “SET DCMI Configuration Parameters” command for the random back-off property of the Discovery Configuration parameter. A value of "0" for the bit means "Disabled". Support for PowerVM NovaLink partition management. The NovaLink architecture enables OpenStack to work seamlessly with PowerVM by providing a direct connection to the PowerVM server rather than proxying through an HMC. This allows for vastly improved scalability (from 30 to 200+ servers), better performance, and better alignment with the OpenStack architecture. NovaLink is enabled via a small software package that runs within a Linux partition (Ubuntu) on a POWER8 host. The following are the NovaLink hardware and software requirements:
o POWER8 hardware coupled with System Firmware 840 (or later)
o Virtual IO Server 2.2.4 (or later)
o Ubuntu Linux 15.10 (ppc64le) (or later)
o PowerVC 1.3 (or later)
Support for IBM i operating system over Virtual I/O Server (VIOS) on the IBM Power System S822 (8284-22A) server. 
The IBM i support requires VIOS (no native I/O) and FW840.00. At this level, the S822 supports IBM i 7.2 or IBM i 7.1 with special terms and conditions. Technology Refresh 3 or later for IBM i 7.2 or Technology Refresh 11 or later for IBM i 7.1 is required. Multiple IBM i partitions, each up to a maximum of two cores, are supported. The Power S822 software tier is P10. IBM i partitions that access directly attached disk or SSD through VIOS must use 4k byte sector drives, not 5xx byte sector drives. The 4k drives are required for performance reasons. Note: Async or bisync adapters or crypto-cards are not supported under VIOS. Thus, IBM i applications that require use of these adapters are not a good fit for the Power S822. IBM i 7.2 clients can connect to a LAN-attached OEM device that has downstream async connections.

4.0 How to Determine the Currently Installed Firmware Level

For HMC managed systems: From the HMC, select Updates in the navigation (left-hand) pane, then view the current levels of the desired server(s).

For a standalone system running IBM i without an HMC: From a command line, issue DSPFMWSTS.

For a standalone system running IBM AIX without an HMC: From a command line, issue lsmcode.

Alternatively, use the Advanced System Management Interface (ASMI) Welcome pane. The current server firmware level appears in the top right corner. Example: SV830_yyy.

5.0 Downloading the Firmware Package

Follow the instructions on Fix Central. You must read and agree to the license agreement to obtain the firmware packages.

Note: If your HMC is not connected to the internet, you will need to download the new firmware level to a USB flash memory device or an FTP server.

6.0 Installing the Firmware

The method used to install new firmware will depend on the release level of firmware which is currently installed on your server. The release level can be determined by the prefix of the new firmware's filename.

Example: SVxxx_yyy_zzz

Where xxx = release level

If the release level will stay the same (Example: Level SV830_040_040 is currently installed and you are attempting to install level SV830_071_040), this is considered an update.
If the release level will change (Example: Level SV830_040_040 is currently installed and you are attempting to install level SV840_050_050), this is considered an upgrade.
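
The update/upgrade rule above can be sketched as a small shell helper that compares the release-level prefix (the part before the first underscore) of the installed and target levels. The function names fw_release and fw_change_type are hypothetical and are not part of any IBM tool; the sketch assumes level names of the form SVxxx_yyy_zzz without the leading digits used in download file names.

```shell
# Hypothetical helpers illustrating the update vs. upgrade rule.

# Print the release level prefix (for example SV840) of a level
# such as SV840_168_056: everything before the first underscore.
fw_release() {
    printf '%s\n' "${1%%_*}"
}

# Print "update" when the release level stays the same,
# "upgrade" when it changes.
# $1 = currently installed level, $2 = level to be installed
fw_change_type() {
    installed=$(fw_release "$1")
    target=$(fw_release "$2")
    if [ "$installed" = "$target" ]; then
        echo "update"
    else
        echo "upgrade"
    fi
}
```

Using the examples from the text, fw_change_type SV830_040_040 SV830_071_040 reports an update, and fw_change_type SV830_040_040 SV840_050_050 reports an upgrade.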

HMC Managed Systems:

Instructions for installing firmware updates and upgrades on systems managed by an HMC can be found at:
http://www-01.ibm.com/support/knowledgecenter/8286-42A/p8ha1/updupdates.htm

NovaLink Managed Systems:

A NovaLink managed system does not have an HMC attached and is managed either by PowerVM NovaLink or by PowerVC using PowerVM NovaLink.
Instructions for installing firmware updates and upgrades on systems managed by PowerVM NovaLink can be found at:
http://www.ibm.com/support/knowledgecenter/POWER8/p8eig/p8eig_updating_firmware.htm

HMC and NovaLink Co-Managed Systems:

A co-managed system is managed by HMC and NovaLink, with one of the interfaces in the co-management master mode.
Instructions for installing firmware updates and upgrades on systems co-managed by an HMC and NovaLink are the same as above for HMC managed systems, because the firmware update must be done by the HMC in the co-management master mode. Before the firmware update is attempted, make sure that the HMC is set in the master mode using the steps at the following IBM KnowledgeCenter link for NovaLink co-managed systems:
http://ibm.biz/novalink-kc

Then the firmware updates can proceed with the same steps as for the HMC managed systems:
http://www-01.ibm.com/support/knowledgecenter/8286-42A/p8ha1/updupdates.htm

Systems not Managed by an HMC or NovaLink:

Power Systems:

Instructions for installing firmware on systems that are not managed by an HMC can be found at:
http://www-01.ibm.com/support/knowledgecenter/8286-42A/p8ha5/fix_serv_firm_kick.htm

Systems running Ubuntu operating system:

If Ubuntu will be used to update the system firmware, please follow these instructions to extract the installable binary and update/upgrade the firmware:

1) Download the .gz (tarball) from Fix Central to your Ubuntu system (for example, to /tmp/fwupdate).

2) Extract the .gz file on the Ubuntu system (for example, into /tmp/fwupdate):

Example:
tar -xzf /tmp/fwupdate/01SV840_075_048.tar.gz -C /tmp/fwupdate

3) Use update_flash -v -f <extracted file name> to verify the package.

4) Update your firmware using update_flash:

/usr/sbin/update_flash -f <extracted file name>

The system will reboot during the firmware update. When the system reaches the Ubuntu run-time state, you can then commit or reject the firmware update:
Commit: /usr/sbin/update_flash -c
Reject: /usr/sbin/update_flash -r
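
The Ubuntu steps above can be tied together in a short sketch. The function names run and fw_stage_and_flash, the DRY_RUN variable, and the image file name passed as the third argument are all assumptions for illustration; the actual name of the extracted file depends on the package. By default the sketch only prints the commands, so nothing is flashed until DRY_RUN is set to 0.

```shell
# Hypothetical wrapper: print each command when DRY_RUN=1 (the default),
# run it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Steps 2) through 4): extract, verify, then flash.
# $1 = downloaded tarball, $2 = working directory,
# $3 = extracted image file name (package-specific, supplied by the caller)
fw_stage_and_flash() {
    tarball=$1
    workdir=$2
    image=$3
    run tar -xzf "$tarball" -C "$workdir" || return 1
    # Verify the package first; stop if verification fails.
    run /usr/sbin/update_flash -v -f "$workdir/$image" || return 1
    # The system reboots during this step.
    run /usr/sbin/update_flash -f "$workdir/$image"
}
```

After the reboot, the commit (update_flash -c) or reject (update_flash -r) decision is left as a manual step, since it should follow a check that the system is running as expected.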

IBM i Systems:

Refer to "IBM i Support: Recommended Fixes":
http://www-912.ibm.com/s_dir/slkbase.nsf/recommendedfixes

When ordering firmware for IBM i Operating System managed systems from Fix Central, choose "Select product", under Product Group specify "System i", under Product specify "IBM i", then Continue and specify the desired firmware PTF accordingly.

7.0 Firmware History

The complete Firmware Fix History for this Release Level can be reviewed at the following URL:
http://download.boulder.ibm.com/ibmdl/pub/software/server/firmware/SV-Firmware-Hist.html

8.0 Change History

Date	Description
November 27, 2017	- Fix Description update for SV840_168 / FW840.50.
August 08, 2017	- Fix Description update for firmware level: SV840_168_056 / FW840.50. One of the fixes requires a re-IPL of the system to activate but has not been marked as deferred. This is a fix for improved link stability for the PCIe expansion drawer (F/C #EMX0).
April 26, 2017	- Added fix description for SV840_079 / FW840.10 concerning systems running PowerVM firmware at FW840.00 with an AIX VIO client partition.