Power7 System Firmware

Applies to: 9119-FHB

This document provides information about the installation of Licensed Machine or Licensed Internal Code, which is sometimes referred to generically as microcode or firmware.



1.0 Systems Affected

This package provides firmware for Power 795 (9119-FHB) Servers only.

The firmware level in this package is: AH760_089_043 / FW760.51


1.1 Minimum HMC Code Level

This section describes the "Minimum HMC Code Level" required by the System Firmware to complete the firmware installation process. The HMC level must be equal to or higher than the "Minimum HMC Code Level" before you start the system firmware update. If the HMC managing the server targeted for the System Firmware update is running a lower code level, the firmware update will not proceed.

Note: Due to issues seen when using ASM at certain HMC levels (see section 2.0 Important Information), the Minimum and Recommended HMC Code level for this firmware is listed below:

HMC V7 R7.9.0 Service Pack 2  (PTF MH01451) with ifix (PTF MH01571) or higher.
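
As an illustration of the level comparison described above, the following minimal Python sketch parses an HMC level string and compares it with the minimum level listed in this section. The string format it accepts (for example "V7 R7.9.0 SP2") and the function names are assumptions made for this example; the sketch does not check for the ifix (PTF MH01571), which must be verified separately.

```python
import re

# Minimum level quoted in this section: HMC V7 R7.9.0 Service Pack 2.
# Tuple layout (an assumption of this sketch): version, release parts, service pack.
MINIMUM_HMC_LEVEL = (7, 7, 9, 0, 2)

_LEVEL_PATTERN = re.compile(
    r"\s*V(\d+)\s*R(\d+)\.(\d+)\.(\d+)(?:\s+(?:SP|Service Pack)\s*(\d+))?\s*",
    flags=re.IGNORECASE,
)


def parse_hmc_level(text):
    """Turn a string such as 'V7 R7.9.0 SP2' into a comparable tuple."""
    match = _LEVEL_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognized HMC level: {text!r}")
    version, rel_a, rel_b, rel_c, service_pack = match.groups()
    # A missing service pack is treated as service pack 0.
    return (int(version), int(rel_a), int(rel_b), int(rel_c), int(service_pack or 0))


def meets_minimum(text, minimum=MINIMUM_HMC_LEVEL):
    """Return True when the given HMC level is at or above the minimum."""
    return parse_hmc_level(text) >= minimum


if __name__ == "__main__":
    for level in ("V7 R7.7.0 SP4", "V7 R7.9.0 SP2", "V8 R8.1.0"):
        print(level, "->", "ok" if meets_minimum(level) else "below minimum")
```

Tuple comparison keeps the check simple: a higher version or release wins before the service pack number is considered.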


Important: To prevent vulnerability to security issues, update the HMC to the recommended level above before installing this server firmware level.

For information concerning HMC releases and the latest PTFs, go to the following URL to access Fix Central.
http://www-933.ibm.com/support/fixcentral/

For specific fix level information on key components of IBM Power Systems running the AIX, IBM i and Linux operating systems, we suggest using the Fix Level Recommendation Tool (FLRT):
http://www14.software.ibm.com/webapp/set2/flrt/home

NOTE: You must be logged in as hscroot in order for the firmware installation to complete correctly.

2.0 Important Information

HMC Note:

When attempting to open ASM on a local HMC, an empty/blank window is opened and the logon screen does not appear. The problem affects only ASM to POWER7 servers at the firmware levels listed below.  The problem does not affect accessing ASM from a remote HMC session or  logging in directly from a PC browser.
Versions impacted:
HMC V7R7.1.0 and later
Power 7 Servers: 780 (all levels) or later, 770_062 or later, 760_078 or later
Circumvention:
Use a remote HMC login session or connect a PC directly to the FSP.
Fix:
A fix is planned for HMC 7.7.7 and 7.7.8.

Special Instructions for Upgrading to Server Firmware AH760:
Note: If the dual HMC is not disconnected prior to the upgrade, the upgrade will fail shortly after the "retrieving updates" stage with the following error:

HSCF0999 - Disconnect or power-off the sibling management console(s) from the following list and retry the update. After the update is complete, reconnect or power-on the sibling.
The requested update level can not be applied on the following server from this management console (<HMC performing the upgrade>) while the server is managed by multiple management consoles.
<server MTMS>: Sibling console(s)

To avoid this error, shut down or disconnect the dual HMC before starting the upgrade:
- On the dual HMC, select HMC Management, then the Shut Down and Restart task.
- On the Shutdown or Restart panel, select Shutdown HMC and click OK.
- If the HMC is in a remote or "lights out" data center, the HMC can instead be disconnected from the server and frame.

  See the following document for detailed information: http://www-912.ibm.com/s_dir/slkbase.NSF/DocNumber/650380499



ECA Info:
Before upgrading your system from AH720 to AH730/AH760 firmware release, contact your authorized provider and ask about ECA 256 and ECA 303, as hardware may have to be upgraded.

SPPL NOTE:
In some previous firmware releases, the system firmware was not properly enforcing the system partition processor limit (SPPL) attribute for shared processor partitions.  This service pack fixes that enforcement to ensure that shared processor partitions comply with the limit for virtual processors when the SPPL setting is 24 or 32.

You will be affected by this change if you have the following configuration:
   - 795 class server (model 9119-FHB)
   - The server has 3 or fewer books, or the server has 4 or more books and the SPPL attribute is set to 24 or 32.
   - The server has 24 processor cores per book and you have configured more than 24 virtual processors for a shared processor partition.
   - The server has 32 processor cores per book and you have configured more than 32 virtual processors for a shared processor partition.

After this service pack is installed, the behavior of the shared processor partitions that exceed the SPPL attribute will change as follows:

- Partition activation:
   - Partitions will continue to boot and reboot successfully unless the minimum number of virtual processors is greater than the SPPL.
   - Partitions that are activated will limit the number of active virtual processors to no more than the SPPL limit.

- Partition configuration:
   - Errors that are logged when the SPPL is exceeded can result in HMC errors HSCLA4D6 and HSC0A4D6.
   - Attempts to change the number of virtual processors or entitled processing units via a profile or dynamic LPAR change will be subject to the SPPL setting of 24 or 32.
      For example, if the SPPL is set to 32 and your shared processor partition is configured with 40 virtual processors,   you must reduce the number of virtual processors to 32 or fewer for the change to be successful.
   - If you create a new shared processor partition, the number of virtual processors must not exceed the SPPL value.

- Partition mobility:
   -  A partition must comply with the SPPL of the target server.

- Partition hibernation (suspend/resume):
   - If you have suspended partitions that exceed the SPPL and you install this service pack, you will not be able to resume those suspended partitions.  Before installing this service pack, ensure that all suspended partitions comply with the SPPL (that is, have a number of virtual processors less than or equal to it).
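
The activation behavior described above can be sketched in a few lines of Python. This is only a model of the rules as stated in this note, not firmware logic: the class and function names are hypothetical, and capping the active virtual processors at exactly min(desired, SPPL) is an assumption drawn from the statement that activated partitions use "no more than" the SPPL.

```python
from dataclasses import dataclass


@dataclass
class SharedPartition:
    """Hypothetical view of a shared processor partition profile."""
    name: str
    min_virtual_procs: int
    desired_virtual_procs: int


def activation_outcome(partition, sppl):
    """Return (can_activate, active_virtual_procs) with the SPPL enforced.

    A partition cannot activate when its minimum virtual processor count
    exceeds the SPPL; otherwise its active count is capped at the SPPL.
    """
    if partition.min_virtual_procs > sppl:
        return (False, 0)
    return (True, min(partition.desired_virtual_procs, sppl))


def can_resume(suspended_virtual_procs, sppl):
    """A suspended partition resumes only if it complies with the SPPL."""
    return suspended_virtual_procs <= sppl


if __name__ == "__main__":
    lpar = SharedPartition("example_lpar", min_virtual_procs=8, desired_virtual_procs=40)
    print(activation_outcome(lpar, sppl=32))
    print(can_resume(40, sppl=32))
```

Running such a check against partition profiles before installing the service pack may help identify partitions that need fewer virtual processors.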

Downgrading firmware from any given release level to an earlier release level is not recommended.
If you feel that it is necessary to downgrade the firmware on your system to an earlier release level, please contact your next level of support.

IPv6 Support and Limitations

IPv6 (Internet Protocol version 6) is supported in the System Management Services (SMS) in this level of system firmware. There are several limitations that should be considered.

When configuring a network interface card (NIC) for remote IPL, only the most recently configured protocol (IPv4 or IPv6) is retained. For example, if the network interface card was previously configured with IPv4 information and is now being configured with IPv6 information, the IPv4 configuration information is discarded.

A single network interface card may only be chosen once for the boot device list. In other words, the interface cannot be configured for the IPv6 protocol and for the IPv4 protocol at the same time.

Memory Considerations for Firmware Upgrades

Firmware Release Level upgrades and Service Pack updates may consume additional system memory.
Server firmware requires memory to support the logical partitions on the server. The amount of memory required by the server firmware varies according to several factors.
Factors influencing server firmware memory requirements include the following:
Generally, you can estimate the amount of memory required by server firmware to be approximately 8% of the system installed memory. The actual amount required will generally be less than 8%. However, there are some server models that require an absolute minimum amount of memory for server firmware, regardless of the previously mentioned considerations.
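
As a rough planning aid, the 8% guideline above can be expressed as a short Python sketch. The function name and the choice of gigabytes as the unit are assumptions for this example; the result is an upper-end estimate only and does not account for the model-specific absolute minimums mentioned above.

```python
def firmware_memory_estimate_gb(installed_gb, fraction=0.08):
    """Estimate server firmware memory as a fraction of installed memory.

    The default fraction of 0.08 reflects the approximate 8% guideline;
    the actual amount required will generally be lower.
    """
    if installed_gb < 0:
        raise ValueError("installed memory cannot be negative")
    return installed_gb * fraction


if __name__ == "__main__":
    installed = 1024  # example value in GB, chosen only for illustration
    estimate = firmware_memory_estimate_gb(installed)
    print(f"{installed} GB installed -> about {estimate:.2f} GB for firmware")
```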

Additional information can be found at:
  http://publib.boulder.ibm.com/infocenter/powersys/v3r1m5/topic/p7hat/iphatlparmemory.htm


3.0 Firmware Information and Description

Use the following examples as a reference to determine whether your installation will be concurrent or disruptive.

Note: The concurrent levels of system firmware may, on occasion, contain fixes that are known as Deferred and/or Partition-Deferred. Deferred fixes can be installed concurrently, but will not be activated until the next IPL. Partition-Deferred fixes can be installed concurrently, but will not be activated until a partition reactivate is performed.  Deferred and/or Partition-Deferred fixes, if any, will be identified in the "Firmware Update Descriptions" table of this document. For these types of fixes (Deferred and/or Partition-Deferred) within a service pack, only the fixes in the service pack which cannot be concurrently activated are deferred.

Note: The file names and service pack levels used in the following examples are for clarification only, and are not necessarily levels that have been, or will be released.

System firmware file naming convention:

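01AHXXX_YYY_ZZZ

   - XXX is the release level
   - YYY is the service pack level
   - ZZZ is the last disruptive service pack level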

NOTE: Values of service pack and last disruptive service pack level (YYY and ZZZ) are only unique within a release level (XXX). For example, 01AH330_067_045 and 01AH340_067_053 are different service packs.

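An installation is disruptive if:

   - The release levels (XXX) are different.
     Example: Currently installed release is AH330, new release is AH340.
   - The service pack level (YYY) and the last disruptive service pack level (ZZZ) are equal.
     Example: AH330_120_120 is disruptive, no matter what level of AH330 is currently installed on the system.
   - The service pack level (YYY) currently installed on the system is lower than the last disruptive service pack level (ZZZ) of the service pack to be installed.
     Example: Currently installed service pack is AH330_120_120 and new service pack is AH330_152_130.

An installation is concurrent if:

   - The release level (XXX) is the same, and the service pack level (YYY) currently installed on the system is the same as or higher than the last disruptive service pack level (ZZZ) of the service pack to be installed.
     Example: Currently installed service pack is AH330_126_120, new service pack is AH330_143_120.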
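
The examples above can be turned into a small Python sketch that classifies an update from the two level names. The rules encoded here are inferred from those examples (different release, new service pack equal to its own last disruptive level, or installed service pack below the new last disruptive level); the function names are hypothetical and this is not an IBM tool.

```python
import re
from typing import NamedTuple


class FirmwareLevel(NamedTuple):
    release: str          # XXX, for example "330"
    service_pack: int     # YYY
    last_disruptive: int  # ZZZ


def parse_level(name):
    """Parse names such as 'AH330_126_120' or '01AH760_089_043'."""
    match = re.fullmatch(r"(?:01)?AH(\d{3})_(\d{3})_(\d{3})", name)
    if match is None:
        raise ValueError(f"unrecognized firmware level: {name!r}")
    release, service_pack, last_disruptive = match.groups()
    return FirmwareLevel(release, int(service_pack), int(last_disruptive))


def is_disruptive(installed, new):
    """Return True if moving from `installed` to `new` would be disruptive."""
    current, target = parse_level(installed), parse_level(new)
    if current.release != target.release:
        return True  # release levels differ
    if target.service_pack == target.last_disruptive:
        return True  # the new service pack is itself a disruptive level
    # Disruptive when the installed service pack predates the last disruptive one.
    return current.service_pack < target.last_disruptive


if __name__ == "__main__":
    pairs = [
        ("AH330_126_120", "AH330_143_120"),
        ("AH330_120_120", "AH330_152_130"),
        ("AH330_067_045", "AH340_067_053"),
    ]
    for installed, new in pairs:
        kind = "disruptive" if is_disruptive(installed, new) else "concurrent"
        print(f"{installed} -> {new}: {kind}")
```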

 
Filename               Size        Checksum
01AH760_089_043.rpm    51896633    52409
   
Note: The checksum can be found by running the AIX sum command against the rpm file (only the first 5 digits are listed).
For example: sum 01AH760_089_043.rpm
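
Where the AIX sum command is not available, the size and checksum might be checked with a short Python sketch such as the one below. It assumes that the default AIX sum output corresponds to the classic BSD-style 16-bit rotating checksum computed byte by byte; that assumption should be confirmed against the AIX sum documentation before relying on the result. The function names are hypothetical.

```python
import os


def bsd_sum(data, checksum=0):
    """BSD-style 16-bit rotating checksum (assumed to match default AIX sum).

    `checksum` lets the caller continue a running value across chunks.
    """
    for byte in data:
        # Rotate the 16-bit value right by one bit, then add the next byte.
        checksum = (checksum >> 1) + ((checksum & 1) << 15)
        checksum = (checksum + byte) & 0xFFFF
    return checksum


def verify_package(path, expected_size, expected_checksum, chunk_size=1 << 20):
    """Compare a file's size and checksum with the values listed in the table."""
    if os.path.getsize(path) != expected_size:
        return False
    checksum = 0
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            checksum = bsd_sum(chunk, checksum)
    return checksum == expected_checksum


if __name__ == "__main__":
    package = "01AH760_089_043.rpm"
    if os.path.exists(package):
        print(verify_package(package, 51896633, 52409))
```

Checking the size first is cheap and catches truncated downloads before the whole file is read.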

AH760
For Impact, Severity, and other firmware definitions, refer to the 'Glossary of firmware terms' at the following URL:
http://www14.software.ibm.com/webapp/set2/sas/f/power5cm/home.html#termdefs

The complete Firmware Fix History for this Release Level can be reviewed at the following URL:
http://download.boulder.ibm.com/ibmdl/pub/software/server/firmware/AH-Firmware-Hist.html
AH760_089_043 / FW760.51

04/16/15
Impact: Security         Severity:  HIPER

System firmware changes that affect all systems

  • On systems using Virtual Shared Processor Pools (VSPP), a problem was fixed for an inaccurate pool idle count over a small sampling period.

  • A problem was corrected for a defect in an earlier service pack (AH760_087) that potentially caused an undetected corruption of firmware when the fix was concurrently activated. If the earlier service pack (AH760_087) was concurrently installed, a platform IPL will mitigate potential future exposure to the problem.
System firmware changes that affect certain systems
  • On systems with redundant service processors and unlicensed cores, a problem was fixed with firmware update to prevent SRC B170B838 errors on unlicensed cores after an administrative failover (AFO) to the backup service processor.
AH760_087_043 / FW760.50

01/12/15
Impact: Security         Severity:  HIPER

New Features and Functions

  • Support was added for using the Mellanox ConnectX-3 Pro 10/40/56 GbE (Gigabit Ethernet) adapter as a network install device.

System firmware changes that affect all systems

  • A problem was fixed that caused an intermittent loss of TTY serial port access to the Advanced System Management Interface (ASMI) after a power off of the system.
  • A problem was fixed that prevented guard error logs from being reported for FRUs that were guarded during the system power on.  This could happen if the same FRU had been previously reported as guarded on a different power on of the system.  The requirement is now met that guarded FRUs are logged on every power on of the system.
  • A problem was fixed to prevent a recoverable processor clock error from falsely calling out processor chip FRUs with a SRC B181E550 error log.  Only the predictive error SRC B158CC62 for the oscillator chip should have been reported.
  • A problem was fixed that caused a "code accept" during a concurrent firmware installation from the management console to fail with SRC E302F85C.
  • A problem was fixed for memory relocation failing during a partition reboot with SRC B700F103 logged.  The memory relocation could be part of the processing for the Dynamic Platform Optimizer (DPO), Active Memory Sharing (AMS) between partitions, mirrored memory defragmentation, or a concurrent FRU repair.
  • A problem was fixed that caused the date and time to be incorrect in AIX if a partition is remotely restarted on a different system from the one on which it was hibernated.
  • A problem was fixed that caused the Utility COD display of historical usage data to be truncated on the management console.
  • A problem was fixed for I/O adapters so that BA400002 errors were changed to informational for memory boundary adjustments made to the size of DMA map-in requests.  These DMA size adjustments were marked as UE previously for a condition that is normal.
  • A security problem was fixed in the service processor Lighttpd web server for denial of service vulnerabilities affecting the Advanced System Management Interface (ASMI).  The Common Vulnerabilities and Exposures issue numbers for this problem are CVE-2011-4362 and CVE-2012-5533.
  • A security problem was fixed for the Lighttpd web server that allowed arbitrary SQL commands to be run on the service processor of the CEC.  The Common Vulnerabilities and Exposures issue number is CVE-2014-2323.
  • A security problem was fixed for the Lighttpd web server where improperly-structured URLs could be used to view arbitrary files on the service processor of the CEC.  The Common Vulnerabilities and Exposures issue number is CVE-2014-2324.
  • A security problem was fixed for the Network Time Protocol (NTP) client that allowed remote attackers to execute arbitrary code via a crafted packet containing an extension field.  The Common Vulnerabilities and Exposures issue number is CVE-2009-1252.
  • A security problem was fixed for the Network Time Protocol (NTP) client for a buffer overflow that allowed remote NTP servers to execute arbitrary code via a crafted response.  The Common Vulnerabilities and Exposures issue number is CVE-2009-0159.
  • A problem was fixed for a Live Partition Mobility (LPM) suspend and transfer of a partition that caused the time of day to skip ahead to an incorrect value on the target system.  The problem only occurred when a suspended partition was migrated to a target CEC that had a hypervisor time that was later than the source CEC.
  • A security problem was fixed in the OpenSSL (Secure Socket Layer) protocol that allowed a man-in-the-middle attacker, via a specially crafted fragmented handshake packet, to force a TLS/SSL server to use TLS 1.0, even if both the client and server supported newer protocol versions. The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-3511.
  • A security problem was fixed in OpenSSL for formatting fields of security certificates without null-terminating the output strings.  This could be used to disclose portions of the program memory on the service processor.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-3508.
  • Multiple security problems were fixed in the way that OpenSSL handled Datagram Transport Layer Security (DTLS) packets.  A specially crafted DTLS handshake packet could cause the service processor to reset.  The Common Vulnerabilities and Exposures issue numbers for these problems are CVE-2014-3505, CVE-2014-3506 and CVE-2014-3507.
  • A security problem was fixed in OpenSSL to prevent a denial of service when handling certain Datagram Transport Layer Security (DTLS) ServerHello requests.  A specially crafted DTLS handshake packet with an included Supported EC Point Format extension could cause the service processor to reset.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-3509.
  • A security problem was fixed in OpenSSL to prevent a denial of service by using an exploit of a null pointer de-reference during anonymous Diffie Hellman (DH) key exchange.  A specially crafted handshake packet could cause the service processor to reset.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-3510.
  • A security problem in GNU Bash was fixed to prevent arbitrary commands hidden in environment variables from being run during the start of a Bash shell.  Although GNU Bash is not actively used on the service processor, it does exist in a library so it has been fixed.  This is IBM Product Security Incident Response Team (PSIRT) issue #2211.  The Common Vulnerabilities and Exposures issue numbers for this problem are CVE-2014-6271, CVE-2014-7169, CVE-2014-7186, and CVE-2014-7187.
  • A security problem was fixed in the Advanced System Management Interface (ASMI) to block click-jacking attempts. This prevents framing of the original ASMI page with a top layer on it with dummy buttons that could trick the user into clicking on a link.
  • A problem was fixed for the Advanced System Management Interface (ASMI) that allowed possible cross-site request forgery (CSRF) exploitation of the ASMI user session to do unwanted tasks on the service processor.
  • A security problem was fixed in OpenSSL for memory leaks that allowed remote attackers to cause a denial of service (out of memory on the service processor). The Common Vulnerabilities and Exposures issue numbers are CVE-2014-3513 and CVE-2014-3567.
  • A problem was fixed to prevent a hypervisor to service processor surveillance heartbeat time-out error and host-initiated reset/reload of the service processor.  This problem was caused by an errant long delay in writing an error log entry, resulting in a block of the heartbeat message and subsequent time-out.
  • A security problem was fixed in OpenSSL for padding-oracle attacks known as Padding Oracle On Downgraded Legacy Encryption (POODLE).  This attack allows a man-in-the-middle attacker to obtain a plain text version of the encrypted session data. The Common Vulnerabilities and Exposures issue number is CVE-2014-3566.  The service processor POODLE fix is based on a selective disablement of SSLv3 using the Advanced System Management Interface (ASMI) "System Configuration/Security Configuration" menu options.  The Security Configuration options of "Disabled", "Default", and "Enabled" for SSLv3 determine the level of protection from POODLE.  The management console also requires a POODLE fix for APAR MB03867 (FIX FOR CVE-2014-3566 FOR HMC V7 R7.7.0 SP4 with PTF MH01482) to eliminate all vulnerability to POODLE and allow use of option 1 "Disabled" as shown below:
    1) Disabled:  This highest level of security protection does not allow service processor clients to connect using SSLv3, thereby eliminating any possibility of a POODLE attack.  All clients must be capable of using TLS to make the secured connections to the service processor to use this option.  This requires the management console to be at a minimum level of HMC V7 R7.7.0 SP4 with POODLE PTF MH01482.
    2) Default:  This medium level of security protection disables SSLv3 for the web browser sessions to ASMI and for the CIM clients and assures them of POODLE-free connections.  However, legacy management consoles are still allowed to use SSLv3 to connect to the service processor.  This is intended to allow non-POODLE compliant HMC levels to connect to the CEC servers until their upgrade to POODLE compliant HMC levels can be planned and carried out.  Running a non-POODLE compliant HMC to a service processor in "Default" mode will prevent the ASMI-proxy sessions from the HMC from connecting, as these proxy sessions require SSLv3 support in ASMI.
    3) Enabled:  This basic level of security protection enables SSLv3 for all service processor client connections.  It relies on all clients being at POODLE fix compliant levels to provide full POODLE protection using the TLS Fallback Signaling Cipher Suite Value (TLS_FALLBACK_SCSV) to prevent fallback to vulnerable SSLv3 connections.  This option is intended for customer sites on protected internal networks that have a large investment in legacy hardware that needs SSLv3 to make browser and HMC connections to the service processor.  The level of POODLE protection actually achieved in "Enabled" mode is determined by the percentage of clients that are at the POODLE fix compliant levels.
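
The three SSLv3 Security Configuration options described above can be summarized as a small decision table. The Python sketch below models only what the text states about which client types may still connect using SSLv3 in each mode; the enum and function names are hypothetical, and the sketch does not query or configure a service processor.

```python
from enum import Enum


class Sslv3Mode(Enum):
    DISABLED = "Disabled"
    DEFAULT = "Default"
    ENABLED = "Enabled"


class ClientType(Enum):
    ASMI_BROWSER = "web browser session to ASMI"
    CIM_CLIENT = "CIM client"
    MANAGEMENT_CONSOLE = "management console"


def sslv3_permitted(mode, client):
    """Return True if the client type may connect using SSLv3 in this mode."""
    if mode is Sslv3Mode.DISABLED:
        return False  # no client may use SSLv3
    if mode is Sslv3Mode.ENABLED:
        return True   # SSLv3 is enabled for all client connections
    # Default: only legacy management consoles may still use SSLv3.
    return client is ClientType.MANAGEMENT_CONSOLE


if __name__ == "__main__":
    for mode in Sslv3Mode:
        allowed = [c.value for c in ClientType if sslv3_permitted(mode, c)]
        print(f"{mode.value}: SSLv3 allowed for {allowed or 'no clients'}")
```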
System firmware changes that affect certain systems
  • HIPER/Pervasive:  On systems using PowerVM firmware, a performance problem was fixed that may affect shared processor partitions where there is a mixture of dedicated and shared processor partitions with virtual IO connections, such as virtual ethernet or Virtual IO Server (VIOS) hosting, between them.  In high availability cluster environments this problem may result in a split brain scenario.
  • On a system with partitions with redundant Virtual Asynchronous Services Interface (VASI) streams,  a problem was fixed that caused the system to terminate with SRC B170E540.  The affected partitions include Active Memory Sharing (AMS), encapsulated state partitions, and hibernation-capable partitions.  The problem is triggered when the management console attempts to change the active VASI stream in a redundant configuration.  This may occur due to a stream reconfiguration caused by Live Partition Mobility (LPM); reconfiguring from a redundant Paging Service Partition (PSP) to a single-PSP configuration; or conversion of a partition from AMS to dedicated memory.
  • On systems that have Active Memory Sharing (AMS) partitions and deduplication enabled, a problem was fixed for not being able to resume a hibernated AMS partition.  Previously,  resuming a hibernated AMS partition could give checksum errors with SRC B7000202 logged and the partition would remain in the hibernated state.
  • On systems that have Active Memory Sharing (AMS) partitions, a problem was fixed for Dynamic Logical Partitioning (DLPAR) for a memory remove that leaves a logical memory block (LMB) in an unusable state until partition reboot.
  • On systems with a partition that has a 256MB Real Memory Offset (RMO) region size that has been migrated from a Power8 system to Power7 or Power6 using Live Partition Mobility (LPM), a problem was fixed that caused a failure on the next boot of the partition with a BA210000 log with a CA000091 checkpoint just prior to the BA210000.  The fix dynamically adjusts the memory footprint of the partition to fit on the earlier Power systems.
  • On systems using IPv6 addresses, the firmware was enhanced to reduce the time it takes to install an operating system using the Network Installation Manager (NIM).
  • On systems in IPv6 networks, a problem was fixed for a network boot/install failing with SRC B2004158 and IP address resolution failing using neighbor solicitation to the partition firmware client.
  • On a system with a disk device with multiple boot partitions, a problem was fixed that caused System Management Services (SMS) to list only one boot partition.  Even though only one boot partition was listed in SMS, the AIX bootlist command could still be used to boot from any boot partition.
  • For systems with an IBM i load source disk attached to an Emulex-based fibre channel adapter such as F/C #5735, a problem was fixed that caused an IBM i load source boot to fail with SRC B2006110 logged and a message to the boot console of "SPLIT-MEM Out of Room".  This problem occurred for load source disks that needed extra disk scans to be found, such as those attached to a port other than the first port of a fibre channel adapter (first port requires fewest disk scans).
  • A problem was fixed for systems in networks using the Juniper 1GBe and 10GBe switches (F/Cs #1108, #1145, and #1151) to prevent network ping errors and boot from network (bootp) failures.  The Address Resolution Protocol (ARP) table information on the Juniper aggregated switches is not being shared between the switches and that causes problems for address resolution in certain network configurations.  Therefore, the CEC network stack code has been enhanced to add three gratuitous ARPs (ARP replies sent without a request received) before each ping and bootp request to ensure that all the network switches have the latest network information for the system.
  • On systems using Virtual Shared Processor Pools (VSPP), a problem was fixed for an inaccurate pool idle count over a small sampling period.
  • A problem was fixed that could result in unpredictable behavior if a memory UE is encountered while relocating the contents of a logical memory block during one of these operations:
    - Using concurrent maintenance to perform a hot repair of a node.
    - Reducing the size of an Active Memory Sharing (AMS) pool.
    - On systems using mirrored memory, using the memory mirroring optimization tool.
    - Performing a Dynamic Platform Optimizer (DPO) operation.
Concurrent hot add/repair maintenance (CHARM) firmware fixes
  • A problem was fixed for concurrent maintenance operations to limit hardware retries on failed hardware so that it can be concurrently repaired.
  • A problem was fixed for concurrent maintenance to prevent a hardware unavailable failure when doing consecutive concurrent remove and add operations to an I/O Hub adapter for a drawer.
AH760_079_043 / FW760.41

06/24/14
Impact: Security         Severity:  HIPER

System firmware changes that affect all systems

  • HIPER/Pervasive:  A security problem was fixed in the OpenSSL (Secure Socket Layer) protocol that allowed clients and servers, via a specially crafted handshake packet, to use weak keying material for communication.  A man-in-the-middle attacker could use this flaw to decrypt and modify traffic between the management console and the service processor.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-0224.
  • HIPER/Pervasive:  A security problem was fixed in OpenSSL for a buffer overflow in the Datagram Transport Layer Security (DTLS) when handling invalid DTLS packet fragments.  This could be used to execute arbitrary code on the service processor.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-0195.
  • HIPER/Pervasive:  Multiple security problems were fixed in the way that OpenSSL handled read and write buffers when the SSL_MODE_RELEASE_BUFFERS mode was enabled to prevent denial of service.  These could cause the service processor to reset or unexpectedly drop connections to the management console when processing certain SSL commands.  The Common Vulnerabilities and Exposures issue numbers for these problems are CVE-2010-5298 and CVE-2014-0198.
  • HIPER/Pervasive:  A security problem was fixed in OpenSSL to prevent a denial of service when handling certain Datagram Transport Layer Security (DTLS) ServerHello requests. A specially crafted DTLS handshake packet could cause the service processor to reset.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-0221.
  • HIPER/Pervasive:  A security problem was fixed in OpenSSL to prevent a denial of service by using an exploit of a null pointer de-reference during anonymous Elliptic Curve Diffie Hellman (ECDH) key exchange.  A specially crafted handshake packet could cause the service processor to reset.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-3470.
AH760_078_043 / FW760.40

01/22/14
Impact: Availability    Severity: SPE

New Features and Functions

  • Support was added in Advanced System Management Interface (ASMI) to facilitate capture and reporting of debug data for system performance problems.  The  "System Service Aids/Performance Dump" menu was added to ASMI to perform this function.
  • Support was dropped for Secure Socket Layer (SSL) protocol version 2 and SSL weak and medium cipher suites in the service processor web server (Lighttpd).  Unsupported web browser connections to the Advanced System Management Interface (ASMI) secured port 443 (using https://) will now be rejected if those browsers do not support SSL version 3.  Supported web browsers for Power7 ASMI are Netscape (version 9.0.0.4), Microsoft Internet Explorer (version 7.0), Mozilla Firefox (version 2.0.0.11), and Opera (version 9.24).
  • Support was added in Advanced System Management Interface (ASMI) "System Configuration/Firmware Update Policy" menu to detect and display the appropriate Firmware Update Policy (depending on whether system is HMC managed) instead of requiring the user to select the Firmware Update Policy.  The menu also displays the "Minimum Code Level Supported" value.

System firmware changes that affect all systems

  • A problem was fixed that caused a memory leak of 50 bytes of service processor memory for every call home operation.  This could potentially cause an out of memory condition for the service processor when running over an extended period of time without a reset.
  • A problem was fixed that caused a L2 cache error to not guard out the faulty processor, allowing the system to checkstop again on an error to the same faulty processor.
  • A problem was fixed that prevented a GX adapter from being added to an empty slot for location code P1-C3 using a MES add when the system was powered off.  The P1-C3 location code was not provided as candidate location for the GX add in the Service Focal Point on the management console.
  • A problem was fixed that prevented the file on the service processor that contains partition data from being cleared when the Advanced System Management Interface (ASMI) was used to "Reset Server Firmware Settings" from the Factory Configuration menu.  This problem caused the HMC managed system to go into the recovery state.
  • A problem was fixed that caused an HMC code update of the FSP to fail on the accept operation with SRC B1811402, or left the FSP unable to boot on the updated side.
  • A problem was fixed that caused a built-in self test (BIST) for GX slots to create corrupt error log values that core dumped the service processor with a B18187DA.  The corruption was caused by a failure to initialize the BIST array to 0 before starting the tests.
  • The firmware was enhanced to display on the management console the correct number of concurrent live partition mobility (LPM) operations that is supported.
  • A problem was fixed that caused a 1000911E platform event log (PEL) to be marked as not call home.  The PEL is now a call home to allow for correction.  This PEL is logged when the hypervisor has changed the Machine Type Model Serial Number (MTMS) of an external enclosure to UTMP.xxx.xxxx because it cannot read the vital product data (VPD), or the VPD has invalid characters, or if the MTMS is a duplicate to another enclosure.
  • A problem was fixed that caused a SRC B7006A72 calling out the adapter and the I/O Planar.
  • A problem was fixed that caused the system attention LED to be lit without a corresponding SRC and error log for the event.  This problem typically occurs when an operating system on a partition terminates abnormally.
  • A problem was fixed during resource dump processing that caused a read of an invalid system memory address and a SRC B181C141.  The invalid memory reference resulted from the service processor incorrectly referencing memory that had been relocated by the hypervisor.
  • DEFERRED:  A problem was fixed that caused a system checkstop with SRC B113E504 for a recoverable hardware fault.  This deferred fix addresses a problem that has a very low probability of occurrence.  As such, customers may wait for the next planned service window to activate the deferred fix via a system reboot.
  • A problem was fixed that prevented a HMC-managed system from being converted to manufacturing default configuration (MDC) mode when the management console command "lpcfgop -m <server> -o clear" failed to create the default partition.  The management console went to the incomplete state for this error.
  • A problem was fixed that caused the slot index to be missing for virtual slot number 0 for the dynamic reconfiguration connector (DRC) name for virtual devices.  This error was visible from the management console when using commands such as "lshwres -r virtualio --rsubtype slot -m machine" to show the hardware resources for virtual devices.
  • A problem during a dynamic logical partitioning (DLPAR) memory operation was fixed that caused BA250020 SRCs to be logged unnecessarily for the AIX partition.  There were no memory errors for the partition.
  • DEFERRED: A problem was fixed that caused a system checkstop during hypervisor time keeping services.  This deferred fix addresses a problem that has a very low probability of occurrence.  As such, customers may wait for the next planned service window to activate the deferred fix via a system reboot.
  • A problem was fixed that caused frequent SRC B1A38B24 error logs with a call home every 15 seconds when service processor network interfaces were incorrectly configured on the same subnet.  The frequency of the notification of the network subnet error has been reduced to once every 24 hours.
  • Help text for the Advanced System Management Interface (ASMI) "System Configuration/Hardware Deconfiguration/Clear All Deconfiguration Errors" menu option was enhanced to clarify that when the "Hardware Resources" value "All hardware resources" is selected, the service processor deconfiguration data is not cleared.  "Service processor" must be explicitly selected for that data to be cleared.
  • A problem was fixed that caused Capacity on Demand (COD) to truncate On/Off "Resource Days Enabled" for users with extended amounts.
System firmware changes that affect certain systems
  • On systems running Dynamic Platform Optimizer (DPO), a problem was fixed that caused an incorrect placement of dedicated processors for partitions larger than a single chip.  When this occurs, performance is lower than it would have been with proper placement.
  • On systems with a redundant service processor, a problem was fixed that caused fans to run at a high-speed after a failover to the sibling service processor.
  • On systems with a redundant service processor, a problem was fixed that prevented the deconfiguration details of a guarded sibling service processor from being shown in the Advanced System Management Interface (ASMI).
  • On systems with a redundant service processor, a problem was fixed that caused a SRC B150D15E to be erroneously logged after a failover to the sibling service processor.
  • On systems running Dynamic Platform Optimizer (DPO) with no free memory, a problem was fixed that caused the management console lsmemopt command to incorrectly report a status of completed with no partitions affected.  It should have indicated that DPO failed due to insufficient free memory.  DPO can only run when there is free memory in the system.
  • On systems with partitions using physical shared processor pools, a problem was fixed that caused partition hangs if the shared processor pool was reduced to a single processor.
  • On systems with turbo-core enabled that are a target of logical partition migration (LPM), a problem was fixed where cache properties were not recognized and SRCs BA280000 and BA250010 were reported.
  • On systems involved in a series of consecutive logical partition migration (LPM) operations, a memory leak problem was fixed in the run-time abstraction services (RTAS) that caused an AIX partition to crash at run time with SRC 0c20.  Other possible symptoms include error logs with SRC BA330002 (RTAS memory allocation failure).
  • A problem was fixed in the run-time abstraction services (RTAS) extended error handling (EEH) for fundamental reset that caused partitions to crash during adapter updates.  The fundamental reset of adapters now returns a valid return code.  The adapter drivers using fundamental reset affected by this fix are the following:
    o QLogic PCIe Fibre Channel adapters (combo card)
    o IBM PCIe Obsidian
    o Emulex BE3-based ethernet adapters
    o Broadcom-based PCIe2 4-port 1Gb ethernet
    o Broadcom-based FlexSystem EN2024 4-port 1Gb ethernet for compute node

  • On systems that have configurations that support all the types of Capacity On Demand (Perm/OnOff/Trial/Utility-Processor, Perm/OnOff/Trial-Memory), a problem was fixed to eliminate repeated B7005300 error logs caused by hypervisor asset protection processes using slightly more memory than promised.
  • On systems with a redundant service processor, a problem was fixed where the service processor allowed a clock failover to occur without a SRC B158CC62 error log and without a hardware deconfiguration record for the failed clock source.  This resulted in the system running with only one clock source and without any alerts to warn that clock redundancy had been lost.
  • On systems running Dynamic Platform Optimizer (DPO) with one or more unlicensed processors, a problem was fixed where the system performance was significantly degraded during the DPO operation.  The amount of performance degradation was more for systems with larger numbers of unlicensed processors.
  • On systems with one memory clock deconfigured, a problem was fixed where the system failed to IPL using the second memory clock with SRCs B158CC62 and B181C041 logged.
Concurrent hot add/repair maintenance (CHARM) firmware fixes
  • A problem was fixed that caused a concurrent hot add/repair maintenance operation to fail on an erroneously logged error for the service processor battery with SRCs B15A3303, B15A3305, and B181EA35 reported.
  • A problem was fixed that caused a concurrent processor exchange to terminate during node deactivation with SRC B1814616.
  • A problem was fixed that caused SRC B15A3303 to be erroneously logged as a predictive error on the service processor sibling after a successful concurrent repair maintenance operation for the real-time clock (RTC) battery.
  • A problem was fixed that caused Capacity on Demand (COD) "Out of Compliance" messages during concurrent maintenance operations when the system was actually in compliance for the licensed amount of resources in use.
AH760_069_043 / FW760.31

07/25/13
Impact: Performance    Severity: ATT

System firmware changes that affect certain systems

  • On systems running Dynamic Platform Optimizer (DPO), a problem was fixed that caused an incorrect placement of dedicated processors for partitions larger than a single chip.  When this occurs, performance is lower than it would have been with proper placement.
AH760_068_043 / FW760.30

06/24/13
Impact: Availability    Severity: SPE

System firmware changes that affect all systems

  • A problem was fixed that caused a service processor dump to be generated with SRC B18187DA "NETC_RECV_ER" logged.
  • A problem was fixed that was caused by an attempt to modify a virtual adapter from the management console command line when the command specifies it is an Ethernet adapter, but the virtual ID specified is for an adapter type other than Ethernet.  The managed system has to be rebooted to restore communications with the management console when this problem occurs; SRC B7000602 is also logged.
  • The Hypervisor was enhanced to allow the system to continue to boot using the redundant data chip on the anchor (VPD) card, instead of stopping the Hypervisor boot and logging SRC B7004715,  when the primary data chip on the anchor card has been corrupted.
  • A problem was fixed that caused a migrated partition to have to be rebooted on the target system.
  • A problem was fixed that caused a performance loss after a configuration change, such as un-licensing a processor, because the Hypervisor is unable to dispatch a partition to a shared processor.
  • A problem was fixed that may cause inaccurate processor utilization reporting.
  • A problem was fixed that caused erroneous A70047xx SRCs to be logged that called out the Anchor (VPD) card.   This led to unnecessary replacements of the Anchor card.
System firmware changes that affect certain systems
  • On systems running Active Memory Sharing (AMS) partitions, a problem was fixed that may arise due to the incorrect handling of a return code in an error path during the logical partition migration (LPM) of an AMS partition.
  • A problem was fixed that caused the On/Off Capacity on Demand (CoD) entitlement to erroneously go to zero when the system firmware was upgraded from the AH730 release to the AH760 release.
  • On systems running Dynamic Platform Optimizer (DPO), a problem was fixed that caused the current DPO score for a partition to be incorrect.  When this occurs, it appears that DPO would not improve performance when in fact it would.
Concurrent hot add/repair maintenance (CHARM) firmware fixes
  • A problem was fixed that caused a concurrent hot add/repair maintenance operation to time out and fail when the system controller (service processor) was heavily loaded.
  • On systems in which there are no processors in the shared processor pool, a problem was fixed that caused the Hypervisor to become unresponsive (the service processor starts logging time-out errors against the Hypervisor, and the HMC can no longer talk to the Hypervisor) during a concurrent hot add/repair maintenance operation.  SRC B182953C will also be called home.
AH760_062_043 / FW760.20

02/27/13
Impact: Availability    Severity: SPE

System firmware changes that affect all systems

  • A problem was fixed that caused a card (and its children) that was removed after the system was booted to continue to be listed in the guard menus in the Advanced System Management Interface (ASMI).
  • A problem was fixed that caused a firmware update to fail with SRC B1818A0F.
  • A problem was fixed that caused a partition to become unresponsive when the AIX command "update_flash -s" is run.
  • A problem was fixed that caused the service processor (or system controller) to crash when it boots from the new level during a concurrent firmware installation.
  • A problem was fixed that caused SRC B1812A40 to be erroneously logged; a memory DIMM  and the symbolic FRU AMBTEMP were listed in the FRU list.
System firmware changes that affect certain systems
  • On systems running iSCSI, a problem was fixed that caused pinging from the iSCSI menu in the System Management Services (SMS) to fail.
  • On a partition with a large number of potentially bootable devices, a problem was fixed that caused the partition to fail to boot with a default catch, and SRC BA210000 may also be logged.
  • On a system running a Live Partition Mobility (LPM) operation, a problem was fixed that caused the partition to successfully appear on the target system, but hang with a 2005 SRC.
  • On a partition with the virtual Trusted Platform Module (vTPM) enabled, a problem was fixed that caused errors to occur when the memory assigned to the partition was changed.
  • On a partition with the virtual Trusted Platform Module (vTPM) enabled, a problem was fixed that caused the partition to stop functioning after certain operations.  When this problem occurs, the client partition may not power off.
  • On a system using the modem/serial port on the service processor, a problem was fixed that caused a service processor dump (with SRC B181EF88 logged) to be erroneously generated when the connection was dropped.
  • On systems that support all types of both memory and processor Capacity on Demand (CoD) operations, and on which CoD operations are frequently performed, the firmware was enhanced to reduce the number of informational SRC B7005300 logged.
  • A problem was fixed that caused the sibling system controller state to show up as "unknown" in the service processor error log if a code synchronization problem was detected after a system controller was replaced.
  • On a partition with the virtual Trusted Platform Module (vTPM) enabled, a problem was fixed that caused SRC B200F00F to be logged when the partition was resumed after hibernation.
  • On a partition with the virtual Trusted Platform Module (vTPM) enabled, the Hypervisor was enhanced to display (on the management console) the minimum maximum memory required to support the partition.
  • On systems running AIX or Linux, a problem was fixed that caused a partition to fail to boot with SRC CA260203.  This problem also can cause concurrent firmware updates to fail.
  • On systems with TurboCore processors and unlicensed processors, a problem was fixed that caused the output of the AIX lparstat command for "Active Physical CPUs in system" to be incorrect.
Concurrent hot add/repair maintenance (CHARM) firmware fixes
  • A problem was fixed that caused a concurrent hot add/repair maintenance operation to fail due to an FSP reset.
  • On large system configurations, a problem was fixed that caused concurrent hot add/repair maintenance operations to fail.
  • On large system configurations running hundreds of partitions, a problem was fixed that caused the managed system to go to the incomplete state on the HMC during a concurrent hot add/repair maintenance operation.
  • A problem was fixed that caused a concurrent hot add/repair maintenance operation to fail if a memory channel failure on the CEC was followed by a service processor reset/reload.
  • A problem was fixed that caused a concurrent hot add/repair maintenance operation to fail when run after a service processor reset/reload.
AH760_043_043

11/21/12
Impact:  New      Severity:  New

New Features and Functions

  • Support for the GX++ dual-port 10GB Ethernet/Fibre Channel over Ethernet (FCoE) adapter, feature code (F/C) EN22.
  • Support for the GX++ dual-port Fibre Channel adapter, feature code (F/C) EN23.
  • Support for 0.05 processor granularity.
  • Support for 64GB DIMMs.
  • Support for Dynamic Platform Optimizer (DPO).
  • The Hypervisor was enhanced to enforce broadcast storm prevention between the primary and backup SEAs (Shared Ethernet Adapters).  This fix requires VIOS 2.2.2.0 or later on all VIOS partitions with SEA devices.

    Additional Requirements:
  • FC EB33, available at no charge, needs to be ordered for DPO
  • Partitions included in DPO optimization need to be running an affinity aware version of the operating system OR need to be restarted after DPO completes. If not, partitions can be excluded from participation in optimization through a command line option on the optmem command.

    Notes:
    - Affinity aware operating system (OS) levels that support DPO:
        ◦ AIX 6.1 TL8 or later
        ◦ AIX 7.1 TL2 or later
        ◦ VIOS 2.2.2.0
        ◦ IBM i 7.1 PTF MF56058
    - No integrated support for DPO in current RHEL or SUSE Enterprise versions. Linux partitions can either be excluded from participation in optimization or restarted after DPO operation completes.
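
The affinity requirements above can be expressed as a small check. The following is a minimal Python sketch, not an IBM tool: the function names, the partition record fields, and the treatment of VIOS levels above 2.2.2.0 as affinity aware are assumptions made for illustration. Only the OS levels themselves come from the notes above.

```python
from typing import Iterable

# Minimum AIX technology level (TL) per AIX release, taken from the notes above.
AIX_MIN_TL = {"6.1": 8, "7.1": 2}
# VIOS level from the notes above; accepting later levels is an assumption.
VIOS_MIN = (2, 2, 2, 0)
# IBM i 7.1 PTF from the notes above.
IBMI_PTF = "MF56058"


def is_affinity_aware(os_name: str, version: str, tl: int = 0,
                      ptfs: Iterable[str] = ()) -> bool:
    """Return True if the OS level is listed above as affinity aware for DPO."""
    name = os_name.strip().lower()
    if name == "aix":
        min_tl = AIX_MIN_TL.get(version)
        # AIX releases not listed in the notes are treated conservatively.
        return min_tl is not None and tl >= min_tl
    if name == "vios":
        level = tuple(int(part) for part in version.split("."))
        return level >= VIOS_MIN
    if name == "ibm i":
        return version == "7.1" and IBMI_PTF in set(ptfs)
    # Linux (RHEL, SUSE) and anything unrecognized: no integrated DPO support.
    return False


def needs_restart_or_exclusion(partitions: Iterable[dict]) -> list:
    """Names of partitions that must be restarted after DPO or excluded from it."""
    return [
        p["name"]
        for p in partitions
        if not is_affinity_aware(p["os"], p["version"],
                                 p.get("tl", 0), p.get("ptfs", ()))
    ]
```

For example, passing a list of partition records such as {"name": "lpar1", "os": "AIX", "version": "6.1", "tl": 7} returns the names of those partitions that should be restarted after DPO completes or excluded from the optimization.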

4.0 How to Determine Currently Installed Firmware Level

You can view the server's current firmware level on the Advanced System Management Interface (ASMI) Welcome pane. It appears in the top right corner. Example: AH760_123.

5.0 Downloading the Firmware Package

Follow the instructions on Fix Central. You must read and agree to the license agreement to obtain the firmware packages.

Note: If your HMC or SDMC is not internet-connected, you will need to download the new firmware level to a CD-ROM or FTP server.


6.0 Installing the Firmware

The method used to install new firmware will depend on the release level of firmware which is currently installed on your server. The release level can be determined by the prefix of the new firmware's filename.

Example: AHXXX_YYY_ZZZ

Where XXX = release level
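
As an illustration of the naming convention above, the following minimal Python sketch splits a firmware level name into its fields and compares the release level of two names. The function names are hypothetical, and the YYY and ZZZ fields are parsed but not interpreted, because this section defines only XXX. The two-field form (for example AH760_123, as shown on the ASMI Welcome pane) is also accepted.

```python
import re
from typing import NamedTuple


class FirmwareLevel(NamedTuple):
    prefix: str   # two-letter prefix, e.g. "AH"
    release: str  # XXX = release level, e.g. "760"
    field_y: str  # YYY, kept as-is
    field_z: str  # ZZZ, empty string if the name has only two fields


# Matches names such as AH760_087_043 or AH760_123.
_LEVEL_RE = re.compile(r"^([A-Z]{2})(\d{3})_(\d{3})(?:_(\d{3}))?$")


def parse_level(name: str) -> FirmwareLevel:
    """Split a firmware level name of the form AHXXX_YYY[_ZZZ] into fields."""
    match = _LEVEL_RE.match(name.strip())
    if match is None:
        raise ValueError(f"not a recognized firmware level name: {name!r}")
    prefix, release, field_y, field_z = match.groups()
    return FirmwareLevel(prefix, release, field_y, field_z or "")


def same_release(installed: str, new: str) -> bool:
    """True if both names carry the same prefix and release level (XXX)."""
    a, b = parse_level(installed), parse_level(new)
    return (a.prefix, a.release) == (b.prefix, b.release)
```

For example, same_release("AH760_043_043", "AH760_087_043") compares the release level of the installed firmware with that of the new package, which is the value the installation method depends on.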

Instructions for installing firmware updates and upgrades can be found at http://publib.boulder.ibm.com/infocenter/powersys/v3r1m5/index.jsp?topic=/p7ha1/updupdates.htm

IBM i Systems:
See "IBM Server Firmware and HMC Code Wizard":
http://www-912.ibm.com/s_dir/slkbase.NSF/DocNumber/408316083

NOTE:
For all systems running with the IBM i Operating System, the following IBM i PTFs must be applied to all IBM i partitions prior to installing AH760_089:
These PTFs can be ordered through Fix Central.

When ordering firmware for IBM i Operating System managed systems from Fix Central, choose "Select product", under Product Group specify "System i", under Product specify "IBM i", then Continue and specify the desired firmware PTF accordingly.

7.0 Firmware History

The complete Firmware Fix History for this Release level can be reviewed at the following URL:
http://download.boulder.ibm.com/ibmdl/pub/software/server/firmware/AH-Firmware-Hist.html

8.0 Change History

Date                Description
October 19, 2015    CHARM fix description update for AH760_087_043 / FW760.50