Power7 System Firmware

Applies to: 9119-FHB

This document provides information about the installation of Licensed Machine or Licensed Internal Code, which is sometimes referred to generically as microcode or firmware.



1.0 Systems Affected

This package provides firmware for Power 795 (9119-FHB) Servers only.

The firmware level in this package is: AH780_054 / FW780.02


1.1 Minimum HMC Code Level

This section describes the "Minimum HMC Code Level" that the System Firmware requires to complete the firmware installation process. Before starting the system firmware update, the HMC level must be equal to or higher than the "Minimum HMC Code Level".  If the HMC managing the server targeted for the System Firmware update is running a code level lower than the "Minimum HMC Code Level", the firmware update will not proceed.

The Minimum HMC Code level for this firmware is:  HMC V7 R7.8.0 (PTF MH01377) with mandatory efix (PTF MH01388).

Although the Minimum HMC Code level for this firmware is listed above, HMC level V7 R7.8.0 Service Pack 1 (MH01397) with mandatory efixes (PTF MH01416 and MH01423), or higher, is suggested for this firmware level.
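The level comparison described above can be sketched in a few lines of Python. This is a minimal illustration only: the function names are hypothetical, the string format "V7 R7.8.0" is taken from this document, and the sketch does not check whether the required PTFs or efixes are installed.

```python
import re

# Minimum HMC level stated in this document: V7 R7.8.0
MINIMUM_HMC_LEVEL = (7, 7, 8, 0)


def parse_hmc_level(text):
    """Turn a string such as 'V7 R7.8.0' into a comparable tuple (7, 7, 8, 0)."""
    match = re.fullmatch(r"\s*V(\d+)\s*R(\d+)\.(\d+)\.(\d+)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized HMC level: {text!r}")
    return tuple(int(part) for part in match.groups())


def meets_minimum_hmc_level(text, minimum=MINIMUM_HMC_LEVEL):
    """Return True if the HMC level is equal to or higher than the minimum."""
    return parse_hmc_level(text) >= minimum
```

Tuples compare element by element, so a higher version, release, or modification number is treated as a higher level.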

For information concerning HMC releases and the latest PTFs, go to Fix Central at the following URL:
http://www-933.ibm.com/support/fixcentral/

For specific fix level information on key components of IBM Power Systems running the AIX, IBM i and Linux operating systems, we suggest using the Fix Level Recommendation Tool (FLRT):
http://www14.software.ibm.com/webapp/set2/flrt/home

NOTE: You must be logged in as hscroot in order for the firmware installation to complete correctly.

2.0 Important Information


Special Instructions for Upgrading to Server Firmware AH780:
Note: If the dual HMC is not disconnected prior to the upgrade, the upgrade will fail shortly after the "retrieving updates" stage with the following error:

HSCF0999 - Disconnect or power-off the sibling management console(s) from the following list and retry the update. After the update is complete, reconnect or power-on the sibling management console(s).
The requested update level can not be applied on the following server from this management console (<HMC performing the upgrade>) while the server is managed by multiple management consoles.
<server MTMS>: Sibling console(s)

To disconnect the dual HMC before the upgrade:
- On the dual HMC, select HMC Management, then the Shut Down and Restart task.
- On the Shutdown or Restart panel, select Shutdown HMC and click OK.
- If the HMC is in a remote or "lights out" data center, the HMC can instead be disconnected from the server and frame.

 
See the following document for detailed information: http://www-01.ibm.com/support/docview.wss?uid=nas8N1010700


ECA Info:
Before upgrading your system from AH720 to AH730/AH760/AH780 firmware release, contact your authorized provider and ask about ECA 256 and ECA 303, as hardware may have to be upgraded.

SPPL NOTE:
In some previous firmware releases, the system firmware was not properly enforcing the system partition processor limit (SPPL) attribute for shared processor partitions.  This service pack fixes that enforcement to ensure that shared processor partitions comply with the limit for virtual processors when the SPPL setting is 24 or 32.

You will be affected by this change if you have the following configuration:
   - 795 class server (model 9119-FHB)
   - The server has 3 or fewer books, or the server has 4 or more books and the SPPL attribute is set to 24 or 32.
   - The server has 24 processor cores per book and you have configured more than 24 virtual processors for a shared processor partition.
   - The server has 32 processor cores per book and you have configured more than 32 virtual processors for a shared processor partition.
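As a rough illustration, the configuration test above can be written as a small Python function. This is a sketch under stated assumptions: the function and parameter names are hypothetical, the last two bullets are read as alternatives (24-core books or 32-core books), and an SPPL that is not set to 24 or 32 is passed as None.

```python
def sppl_change_applies(books, sppl, cores_per_book, virtual_processors):
    """Return True if a shared processor partition on a 9119-FHB matches
    the affected configuration listed above."""
    if cores_per_book not in (24, 32):
        return False
    # 3 or fewer books, or 4 or more books with the SPPL set to 24 or 32.
    limit_enforced = books <= 3 or (books >= 4 and sppl in (24, 32))
    # More virtual processors configured than there are cores in one book.
    return limit_enforced and virtual_processors > cores_per_book
```

For example, a partition with 40 virtual processors on a two-book server with 32 cores per book matches the affected configuration, while the same partition on a four-book server whose SPPL is not set to 24 or 32 does not.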

After this service pack is installed, the behavior of the shared processor partitions that exceed the SPPL attribute will change as follows:

- Partition activation:
   - Partitions will continue to boot and reboot successfully unless the minimum number of virtual processors is greater than the SPPL.
   - Partitions that are activated will limit the number of active virtual processors to no more than the SPPL limit.

- Partition configuration:
   - Errors that are logged when the SPPL is exceeded can result in HMC errors HSCLA4D6 and HSC0A4D6.
   - Attempts to change the number of virtual processors or entitled processing units via a profile or dynamic LPAR change will be subject to the SPPL setting of 24 or 32.
      For example, if the SPPL is set to 32 and your shared processor partition is configured with 40 virtual processors, you must reduce the number of virtual processors to 32 or fewer for the change to be successful.
   - If you create a new shared processor partition, the number of virtual processors must not exceed the SPPL value.

- Partition mobility:
   - A partition must comply with the SPPL of the target server.

- Partition hibernation (suspend/resume):
   - If you have suspended partitions that exceed the SPPL and you install this service pack, you will not be able to resume those partitions.  Before installing this service pack, ensure that every suspended partition complies with the new limit (its number of virtual processors is less than or equal to the SPPL).
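The partition activation behavior listed above can be modeled as follows. This is only an illustrative sketch: the function name and parameters are hypothetical, and it covers the activation rules only, not profile changes, mobility, or resume.

```python
def active_virtual_processors(minimum_vps, desired_vps, sppl):
    """Model partition activation once the SPPL is enforced.

    Returns the number of virtual processors that become active, or None
    if the partition cannot activate because its minimum exceeds the SPPL.
    """
    if minimum_vps > sppl:
        return None  # activation fails: the minimum cannot be satisfied
    # An activated partition is capped at the SPPL.
    return min(desired_vps, sppl)
```

With an SPPL of 32, a partition that asks for 40 virtual processors and has a minimum of 1 activates with 32, while a partition with a minimum of 33 does not activate.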

Downgrading firmware from any given release level to an earlier release level is not recommended.
If you feel that it is necessary to downgrade the firmware on your system to an earlier release level, please contact your next level of support.

IPv6 Support and Limitations

IPv6 (Internet Protocol version 6) is supported in the System Management Services (SMS) in this level of system firmware. There are several limitations that should be considered.

When configuring a network interface card (NIC) for remote IPL, only the most recently configured protocol (IPv4 or IPv6) is retained. For example, if the network interface card was previously configured with IPv4 information and is now being configured with IPv6 information, the IPv4 configuration information is discarded.

A single network interface card may only be chosen once for the boot device list. In other words, the interface cannot be configured for the IPv6 protocol and for the IPv4 protocol at the same time.

Memory Considerations for Firmware Upgrades

Firmware Release Level upgrades and Service Pack updates may consume additional system memory.
Server firmware requires memory to support the logical partitions on the server. The amount of memory required by the server firmware varies according to several factors.
Factors influencing server firmware memory requirements include the following:
Generally, you can estimate the amount of memory required by server firmware to be approximately 8% of the system installed memory. The actual amount required will generally be less than 8%. However, there are some server models that require an absolute minimum amount of memory for server firmware, regardless of the previously mentioned considerations.
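The 8% rule of thumb can be turned into a quick planning estimate. The sketch below is illustrative only: the function name is hypothetical, the result is an approximate upper bound, and it does not model the absolute minimum that some server models require.

```python
def estimate_firmware_memory_gb(installed_memory_gb, fraction=0.08):
    """Estimate server firmware memory as roughly 8% of installed memory.

    The actual amount required is generally lower than this figure.
    """
    if installed_memory_gb < 0:
        raise ValueError("installed memory cannot be negative")
    return installed_memory_gb * fraction
```

For a server with 1024 GB of installed memory, the estimate is about 82 GB reserved for server firmware.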

Additional information can be found at:
  http://publib.boulder.ibm.com/infocenter/powersys/v3r1m5/topic/p7hat/iphatlparmemory.htm


3.0 Firmware Information and Description

Use the following examples as a reference to determine whether your installation will be concurrent or disruptive.

Note: The concurrent levels of system firmware may, on occasion, contain fixes that are known as Deferred and/or Partition-Deferred. Deferred fixes can be installed concurrently, but will not be activated until the next IPL. Partition-Deferred fixes can be installed concurrently, but will not be activated until a partition reactivate is performed.  Deferred and/or Partition-Deferred fixes, if any, will be identified in the "Firmware Update Descriptions" table of this document. For these types of fixes (Deferred and/or Partition-Deferred) within a service pack, only the fixes in the service pack which cannot be concurrently activated are deferred.

Note: The file names and service pack levels used in the following examples are for clarification only, and are not necessarily levels that have been, or will be released.

System firmware file naming convention:

01AHXXX_YYY_ZZZ

Where XXX = release level
      YYY = service pack level
      ZZZ = last disruptive service pack level

NOTE: Values of service pack and last disruptive service pack level (YYY and ZZZ) are only unique within a release level (XXX). For example, 01AH330_067_045 and 01AH340_067_053 are different service packs.

An installation is disruptive if:

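   - The release levels (XXX) are different.
     Example: Currently installed release is AH330, new release is AH340.
   - The service pack level (YYY) and the last disruptive service pack level (ZZZ) are equal.
     Example: AH330_120_120 is disruptive, no matter what level of AH330 is currently installed on the system.
   - The service pack level (YYY) currently installed on the system is lower than the last disruptive service pack level (ZZZ) of the service pack to be installed.
     Example: Currently installed service pack is AH330_120_120 and new service pack is AH330_152_130.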

An installation is concurrent if:

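   - The release level (XXX) is the same, and the service pack level (YYY) currently installed on the system is the same as or higher than the last disruptive service pack level (ZZZ) of the service pack to be installed.
     Example: Currently installed service pack is AH330_126_120, new service pack is AH330_143_120.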
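The examples above can be checked mechanically. The following Python sketch parses a level name of the form AHXXX_YYY_ZZZ and classifies an update as disruptive or concurrent. The helper names are hypothetical, and the rules are inferred from the examples in this section: an update is treated as disruptive when the release levels differ, when the new level's YYY equals its ZZZ, or when the installed YYY is lower than the new ZZZ.

```python
import re
from typing import NamedTuple


class FirmwareLevel(NamedTuple):
    release: str          # XXX, for example "330"
    service_pack: int     # YYY
    last_disruptive: int  # ZZZ


# Accepts "AH330_126_120" as well as "01AH780_054_040.rpm".
_LEVEL_PATTERN = re.compile(r"(?:01)?AH(\d{3})_(\d{3})_(\d{3})(?:\.rpm)?")


def parse_level(name):
    """Split a firmware level or file name into its three fields."""
    match = _LEVEL_PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"unrecognized firmware level: {name!r}")
    release, service_pack, last_disruptive = match.groups()
    return FirmwareLevel(release, int(service_pack), int(last_disruptive))


def is_disruptive(installed, new):
    """Return True if moving from `installed` to `new` is disruptive."""
    if installed.release != new.release:
        return True  # release levels differ
    if new.service_pack == new.last_disruptive:
        return True  # the new service pack is itself a disruptive level
    # Disruptive if the installed level predates the last disruptive level.
    return installed.service_pack < new.last_disruptive
```

Applied to the examples in this section, AH330_120_120 to AH330_152_130 is classified as disruptive, and AH330_126_120 to AH330_143_120 as concurrent.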

 
Filename                 Size        Checksum
01AH780_054_040.rpm      55314813    21542
   
Note: The checksum can be found by running the AIX sum command against the rpm file (only the first 5 digits are listed).
For example: sum 01AH780_054_040.rpm
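Before running the checksum, the size of the downloaded file can also be compared with the value in the table. The Python sketch below is a minimal example: the function name is hypothetical, and it assumes that the listed size is in bytes. It does not replace the sum check.

```python
import os

# Size listed in this document for 01AH780_054_040.rpm (assumed to be bytes).
EXPECTED_SIZE_BYTES = 55314813


def size_matches(path, expected_size=EXPECTED_SIZE_BYTES):
    """Return True if the file at `path` has exactly the expected size."""
    return os.path.getsize(path) == expected_size
```

A size mismatch usually indicates an incomplete download. A matching size says nothing about the file contents, so the checksum should still be verified.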

AH780
For Impact, Severity, and other firmware definitions, refer to the 'Glossary of firmware terms' at the following URL:
http://www14.software.ibm.com/webapp/set2/sas/f/power5cm/home.html#termdefs

The complete Firmware Fix History for this Release Level can be reviewed at the following URL:
http://download.boulder.ibm.com/ibmdl/pub/software/server/firmware/AH-Firmware-Hist.html

AH780_054_040 / FW780.02

04/18/14
Impact: Security         Severity:  HIPER

System firmware changes that affect all systems
  • HIPER/Pervasive:  A security problem was fixed in the OpenSSL Montgomery ladder implementation for the ECDSA (Elliptic Curve Digital Signature Algorithm) to protect sensitive information from being obtained with a flush and reload cache side-channel attack to recover ECDSA nonces from the service processor.  The Common Vulnerabilities and Exposures issue number is CVE-2014-0076.  The stolen ECDSA nonces could be used to decrypt the SSL sessions and compromise the Hardware Management Console (HMC) access password to the service processor.  Therefore, the HMC access password for the managed system should be changed after applying this fix.
  • HIPER/Pervasive:  A security problem was fixed in the OpenSSL Transport Layer Security (TLS) and Datagram Transport Layer Security (DTLS) implementation so that Heartbeat Extension packets can no longer trigger a buffer over-read to steal private keys for the encrypted sessions on the service processor.  The Common Vulnerabilities and Exposures issue number is CVE-2014-0160, also known as the Heartbleed vulnerability.  The stolen private keys could be used to decrypt the SSL sessions and compromise the Hardware Management Console (HMC) access password to the service processor.  Therefore, the HMC access password for the managed system should be changed after applying this fix.
  • A security problem was fixed for the Lighttpd web server that allowed arbitrary SQL commands to be run on the service processor.  The Common Vulnerabilities and Exposures issue number is CVE-2014-2323.
  • A security problem was fixed for the Lighttpd web server where improperly-structured URLs could be used to view arbitrary files on the service processor.  The Common Vulnerabilities and Exposures issue number is CVE-2014-2324.

AH780_050_040 / FW780.01

03/10/14
Impact:  Data      Severity:  HIPER

System firmware changes that affect all systems

  • HIPER/Non-Pervasive:  A problem was fixed for a potential silent data corruption issue that may occur when a Live Partition Mobility (LPM) operation is performed from a system (source system) running a firmware level earlier than AH780_040 or AM780_040 to a system (target system) running AH780_040 or AM780_040.

AH780_040_040 / FW780.00

12/06/13
Impact:  New      Severity:  New

New Features and Functions

  • Support was added to the Virtual I/O Server (VIOS) for shared storage pool mirroring (RAID-1) using the virtual SCSI (VSCSI) storage adapter to provide redundancy for data storage.
  • Support was added to upgrade the service processor to OpenSSL version 1.0.1 and for compliance with National Institute of Standards and Technology (NIST) Special Publication 800-131A.  SP 800-131A compliance requires the use of stronger cryptographic keys and more robust cryptographic algorithms.
  • Support was added to the Management Console command line to allow configuring a shared control channel for multiple pairs of Shared Ethernet Adapters (SEAs).  This simplifies the control channel configuration to reduce network errors when the SEAs are in fail-over mode.
  • Support was added in Advanced System Management Interface (ASMI) to facilitate capture and reporting of debug data for system performance problems.  The  "System Service Aids/Performance Dump" menu was added to ASMI to perform this function.
  • Support was added to the Management Console for group-based LDAP authentication.
  • Partition Firmware was enhanced to recognize and boot from disks formatted with the GUID Partition Table (GPT) format, which can be greater than 2TB in size.  GPT is a standard for the layout of the partition table on a physical hard disk, using globally unique identifiers (GUID), that does not have the 2TB limit that is imposed by the DOS partition format.
  • The call home data for every serviceable event of the system was enhanced to include information on every guarded element (processor, memory, I/O chip, etc.), including the part numbers and location codes of the FRUs and the service processor de-configuration policy settings.
  • Support for IBM PCIe 3.0 x8 dual 4-port SAS RAID adapter with 12 GB cache with feature code EJ0L and CCIN 57CE.
  • Support for Dynamic Platform Optimizer (DPO) enhancements to show the logical partition current and potential affinity scores.  The Management Console has also been enhanced to show the partition scoring.  The operating system (OS) levels that support DPO:

                ◦ AIX 6.1 TL8 or later
                ◦ AIX 7.1 TL2 or later
                ◦ VIOS 2.2.2.0
                ◦ IBM i 7.1 PTF MF56058
                ◦ Linux RHEL7
                ◦ Linux SLES12

         Note: If DPO is used with an older version of the OS that predates the above levels, either:
                   - The partition needs to be rebooted after DPO completes to optimize placement, or
                   - The partition is excluded from participating in the DPO operation (through a command line option on the "optmem" command that is used to initiate a DPO operation).

  • Support was added to the Management Console and the Virtual I/O Server (VIOS) to provide the capability to enable and disable individual virtual Ethernet adapters from the management console.
  • Support for Management Console logical partition Universally Unique IDs (UUIDs) so that the HMC preserves the UUID for logical partitions on backup/restore and migration.
  • Support for Management Console command line to configure the ECC call home path for SSL proxy support.
  • Support for Management Console to minimize recovery state problems by using the hypervisor and VIOS configuration data to recreate partition data when needed.
  • Support for Management Console to provide scheduled operations to check if the partition affinity falls below a threshold and alert the user that Dynamic Platform Optimizer (DPO) is needed.
  • Support for enhanced platform serviceability to extend call home to include hardware in need of repair and to issue periodic service events to remind of failed hardware.
  • Support for IBM PCIe 3.0 x8 non-caching 2-port SAS RAID adapter with feature code EJ0J and CCIN 57B4.
  • Support for Virtual I/O Server (VIOS) to support 4K block size DASD as a virtual device.
  • Support for performance improvements for concurrent Live Partition Mobility (LPM) migrations.
  • Support for Management Console to handle all Virtual I/O Server (VIOS) configuration tasks and provide assistance in configuring partitions to use redundant VIOS.
  • Support for Management Console to maintain a profile that is synchronized with the current configuration of the system, including Dynamic Logical Partitioning (DLPAR) changes.
  • Support for Power System Pools allows for the aggregation of Capacity on Demand (CoD) resources, including processors and memory, to be moved from one pool server to any other pool server as needed.
  • Support for a Management Console Performance and Capacity Monitor (PCM) function to monitor and manage both physical and virtual resources.
  • Support for virtual server network (VSN) Phase 2 that delivers IEEE standard 802.1Qbg based on Virtual Ethernet Port Aggregator (VEPA) switching.  This supports the Management Console assignment of the VEPA switching mode to virtual Ethernet switches used by the virtual Ethernet adapters of the logical partitions.  The server properties in the Management Console will show the capability "Virtual Server Network Phase 2 Capable" as "True" for the system.
  • Support for Virtual I/O Server (VIOS) for an IBM i client data connection to a SIS64 device driver backed by VSCSI physical volumes.
  • Support for the Power 795 GX++ 1-port 4X Infiniband QDR adapter with CCIN 2B76 and feature code EN25.
  • Support was dropped for Secure Sockets Layer (SSL) protocol version 2 and for SSL weak and medium cipher suites in the service processor web server (Lighttpd).  Web browser connections to the Advanced System Management Interface (ASMI) secured port 443 (using https://) will now be rejected if the browser does not support SSL version 3.  Supported web browsers for Power7 ASMI are Netscape (version 9.0.0.4), Microsoft Internet Explorer (version 7.0), Mozilla Firefox (version 2.0.0.11), and Opera (version 9.24).
  • Support was added in Advanced System Management Interface (ASMI) "System Configuration/Firmware Update Policy" menu to detect and display the appropriate Firmware Update Policy (depending on whether system is HMC managed) instead of requiring the user to select the Firmware Update Policy.  The menu also displays the "Minimum Code Level Supported" value.

System firmware changes that affect all systems

  • A problem was fixed that caused a service processor kernel panic on an out-of-memory condition with SRC B181720D when an incorrect MTMS was specified for a frame in the Advanced System Management Interface (ASMI).
  • A problem was fixed that caused a service processor OmniOrb core dump with SRC B181EF88 logged.
  • A problem was fixed that caused the system attention LED to stay lit when a bad FRU was replaced.
  • A problem was fixed that caused a memory leak of 50 bytes of service processor memory for every call home operation.  This could potentially cause an out of memory condition for the service processor when running over an extended period of time without a reset.
  • A problem was fixed that caused an L2 cache error to not guard out the faulty processor, allowing the system to checkstop again on an error to the same faulty processor.
  • A problem was fixed that caused an HMC code update for the FSP to fail on the accept operation with SRC B1811402, or caused the FSP to be unable to boot on the updated side.
  • A problem was fixed that caused a system checkstop during hypervisor time keeping services.
  • A problem was fixed that caused a built-in self test (BIST) for GX slots to create corrupt error log values that core dumped the service processor with a B18187DA.  The corruption was caused by a failure to initialize the BIST array to 0 before starting the tests.
  • The Hypervisor was enhanced to allow the system to continue to boot using the redundant Anchor (VPD) card, instead of stopping the Hypervisor boot and logging SRC B7004715,  when the primary Anchor card has been corrupted.
  • A problem was fixed with the Dynamic Platform Optimizer (DPO) that caused memory affinity to be incorrectly reported to the partitions before the memory was optimized.  When this occurs, performance is lower than it would have been with the optimized memory values.
  • A problem was fixed that caused a migrated partition to reboot during transfer to a target system running VIOS 2.2.2.0 or later. A manual reboot would be required if transferred to a target system running an earlier VIOS release. Migration recovery may also be necessary.
  • A problem was fixed that can cause Anchor (VPD) card corruption and A70047xx SRCs to be logged.  Note: If a serviceable event with SRC A7004715 is present or was logged previously, damage to the VPD card may have occurred. After the fix is applied, replacement of the Anchor VPD card is recommended in order to restore full redundancy.
  • The firmware was enhanced to display on the management console the correct number of concurrent Live Partition Mobility (LPM) operations that are supported.
  • A problem was fixed that caused a 1000911E platform event log (PEL) to be marked as not call home.  The PEL is now a call home to allow for correction.  This PEL is logged when the hypervisor has changed the Machine Type Model Serial Number (MTMS) of an external enclosure to UTMP.xxx.xxxx because it cannot read the vital product data (VPD), or the VPD has invalid characters, or if the MTMS is a duplicate to another enclosure.
  • A problem was fixed that caused the state of the Host Ethernet Adapter (HEA) port to be reported as down when the physical port is actually up.
  • When powering on a system partition, a problem was fixed that caused the partition universal unique identifier (UUID) to not get assigned, causing a B2006010 SRC in the error log.
  • For the sequence of a reboot of a system partition followed immediately by a power off of the partition, a problem was fixed where the hypervisor virtual service processor (VSP) incorrectly retained locks for the powered off partition, causing the CEC to go into recovery state during the next power on attempt.
  • A problem was fixed that caused an error log generated by the partition firmware to show conflicting firmware levels.  This problem occurs after a firmware update or a Live Partition Mobility (LPM) operation on the system.
  • A problem was fixed that caused the system attention LED to be lit without a corresponding SRC and error log for the event.  This problem typically occurs when an operating system on a partition terminates abnormally.
  • A problem was fixed that caused the slot index to be missing for virtual slot number 0 for the dynamic reconfiguration connector (DRC) name for virtual devices.  This error was visible from the management console when using commands such as "lshwres -r virtualio --rsubtype slot -m machine" to show the hardware resources for virtual devices.
  • A problem was fixed that caused a system checkstop with SRC B113E504 for a recoverable hardware fault.
  • A problem was fixed during resource dump processing that caused a read of an invalid system memory address and an SRC B181C141.  The invalid memory reference resulted from the service processor incorrectly referencing memory that had been relocated by the hypervisor.

System firmware changes that affect certain systems

  • On systems with a redundant service processor, a problem was fixed that caused fans to run at a high-speed after a failover to the sibling service processor.
  • On systems with a redundant service processor, a problem was fixed that prevented the deconfiguration details of a guarded sibling service processor from being shown in the Advanced System Management Interface (ASMI).
  • On systems with a redundant service processor, a problem was fixed that caused an SRC B150D15E to be erroneously logged after a failover to the sibling service processor.
  • When switching between turbocore and maxcore mode, a problem was fixed that caused the number of supported partitions to be reduced by 50%.
  • On systems in turbocore mode with unlicensed processors, a problem was fixed that caused an incorrect processor count.  The AIX command lparstat gave too high a value for "Active Physical CPUs in system" when it included unlicensed turbocore processors in the count instead of just counting the licensed processors.
  • A problem was fixed that was caused by an attempt to modify a virtual adapter from the management console command line when the command specifies it is an Ethernet adapter, but the virtual ID specified is for an adapter type other than Ethernet.  The managed system has to be rebooted to restore communications with the management console when this problem occurs; SRC B7000602 is also logged.
  • On systems running AIX or Linux, a problem was fixed that caused the operating system to halt when an InfiniBand Host Channel Adapter (HCA) fails or malfunctions.
  • On systems running AIX or Linux, a hang in a Live Partition Mobility (LPM) migration for remote restart-capable partitions was fixed by adding a time-out for the required paging space to become available.  If after five minutes the required paging space is not available, the start migration command returns an error code of 0x40000042 (PagingSpaceNotReady) to the management console.
  • On systems running Dynamic Platform Optimizer (DPO) with no free memory, a problem was fixed that caused the Hardware Management Console (HMC) lsmemopt command to report the incorrect status of completed with no partitions affected.  It should have indicated that DPO failed due to insufficient free memory.  DPO can only run when there is free memory in the system.
  • On systems with partitions using physical shared processor pools, a problem was fixed that caused partition hangs if the shared processor pool was reduced to a single processor.
  • On a system running a Live Partition Mobility (LPM) operation, a problem was fixed that caused the partition to successfully appear on the target system, but hang with a 2005 SRC.
  • On systems using IPv6 addresses, the firmware was enhanced to reduce the time it takes to install an operating system using the Network Installation Manager (NIM).
  • On systems managed by a management console, a problem was fixed that caused a partition to become unresponsive when the AIX command "update_flash -s" is run.
  • On systems with turbocore enabled that are a target of Live Partition Mobility (LPM), a problem was fixed where cache properties were not recognized and SRCs BA280000 and BA250010 were reported.

Concurrent hot add/repair maintenance firmware fixes

  • A problem was fixed that caused a concurrent hot add/repair maintenance operation to fail on an erroneously logged error for the service processor battery with SRCs B15A3303, B15A3305, and B181EA35 reported.
  • The firmware was enhanced to reduce the number of concurrent hot add/repair maintenance failures due to the operation timing out on fully-configured systems.
  • A problem was fixed that caused a concurrent hot add/repair maintenance operation to fail if a memory channel failure on the CEC was followed by a service processor reset/reload.
  • A problem was fixed that caused SRC B15A3303 to be erroneously logged as a predictive error on the service processor sibling after a successful concurrent repair maintenance operation for the real-time clock (RTC) battery.
  • A problem was fixed that caused a concurrent hot add/repair maintenance operation to fail with SRC B181C350.
  • A problem was fixed that prevented the I/O slot information from being presented on the management console after a concurrent node repair.
  • A problem was fixed that caused Capacity on Demand (CoD) "Out of Compliance" messages during concurrent maintenance operations when the system was actually in compliance for the licensed amount of resources in use.

4.0 How to Determine Currently Installed Firmware Level

You can view the server's current firmware level on the Advanced System Management Interface (ASMI) Welcome pane. It appears in the top right corner. Example: AH780_123.

5.0 Downloading the Firmware Package

Follow the instructions on Fix Central. You must read and agree to the license agreement to obtain the firmware packages.

Note: If your HMC or SDMC is not internet-connected you will need to download the new firmware level to a CD-ROM or ftp server.


6.0 Installing the Firmware

The method used to install new firmware will depend on the release level of firmware which is currently installed on your server. The release level can be determined by the prefix of the new firmware's filename.

Example: AHXXX_YYY_ZZZ

Where XXX = release level

Instructions for installing firmware updates and upgrades can be found at http://publib.boulder.ibm.com/infocenter/powersys/v3r1m5/index.jsp?topic=/p7ha1/updupdates.htm

IBM i Systems:
See "IBM Server Firmware and HMC Code Wizard":
http://www-912.ibm.com/s_dir/slkbase.NSF/DocNumber/408316083

NOTE:
For all systems running with the IBM i Operating System, the following IBM i PTFs must be applied to all IBM i partitions prior to installing AH780_054:
These PTFs can be ordered through Fix Central.

7.0 Firmware History

The complete Firmware Fix History for this Release Level can be reviewed at the following URL:
http://download.boulder.ibm.com/ibmdl/pub/software/server/firmware/AH-Firmware-Hist.html