Power7 High-End System Firmware
Applies to: 9119-FHB
This document provides information about the installation of Licensed
Machine or Licensed Internal Code, which is sometimes referred to
generically as microcode or firmware.
1.0 Systems Affected
This package provides firmware for Power 795 (9119-FHB) Servers
only.
The firmware level in this package is:
1.1 Minimum HMC Code Level
This section describes the "Minimum HMC Code Level" that the System
Firmware requires to complete the firmware installation process. The
HMC level must be equal to or higher than the "Minimum HMC Code Level"
before you start the system firmware update. If the HMC that manages
the server targeted for the System Firmware update is at a lower level
than the "Minimum HMC Code Level", the firmware update will not proceed.
The Minimum HMC Code level for this firmware is: HMC V7 R7.2.0
(PTF MH01233 or MH01234) and PTF MH01246 (Service Pack 1).
Although the Minimum HMC Code level for this firmware is listed
above, HMC level V7 R7.2.0 with PTF MH01276 (Service Pack 3), or
higher, is suggested for this firmware level.
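The "equal to or higher" comparison described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the "V<version> R<release>.<mod>.<fix>" string layout is assumed from how the level is written in this document, and the sketch does not check the required PTFs, which must still be verified separately.

```python
import re

# Minimum HMC level stated in this section.
MINIMUM_HMC_LEVEL = "V7 R7.2.0"


def parse_hmc_level(level):
    """Turn a level string like 'V7 R7.2.0' into the tuple (7, 7, 2, 0).

    The string layout is an assumption made for this sketch.
    """
    match = re.fullmatch(r"\s*V(\d+)\s*R(\d+)\.(\d+)\.(\d+)\s*", level)
    if match is None:
        raise ValueError(f"unrecognized HMC level: {level!r}")
    return tuple(int(part) for part in match.groups())


def meets_minimum_hmc_level(installed, minimum=MINIMUM_HMC_LEVEL):
    """Return True if the installed level is equal to or higher than the minimum."""
    return parse_hmc_level(installed) >= parse_hmc_level(minimum)
```

For example, meets_minimum_hmc_level("V7 R7.1.0") is False, so under this sketch an update from such an HMC would not be expected to proceed.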
For information concerning HMC releases and the latest PTFs, go to the
following URL to access Fix Central:
http://www-933.ibm.com/support/fixcentral/
For specific fix level information on key components of IBM Power
Systems running the AIX, IBM i and Linux operating systems, we suggest
using the Fix Level Recommendation Tool (FLRT):
http://www14.software.ibm.com/webapp/set2/flrt/home
NOTES:
- You must be logged in as hscroot in order for the firmware
installation to complete correctly.
- Systems Director Management Console (SDMC) does not support this
System Firmware level.
2.0 Important Information
Prior to this service pack, the system firmware was not properly
enforcing the system partition processor limit (SPPL) attribute for
shared processor partitions. This service pack fixes that enforcement
to ensure that shared processor partitions comply with the limit for
virtual processors when the SPPL setting is 24 or 32.
You will be affected by this change if you have the following
configuration:
- 795 class server (model 9119-FHB)
- The server has 3 or fewer books, or the server has 4 or more books
and the SPPL attribute is set to 24 or 32.
- The server has 24 processor cores per book and you have configured
more than 24 virtual processors for a shared processor partition.
- The server has 32 processor cores per book and you have configured
more than 32 virtual processors for a shared processor partition.
After this service pack is installed, the behavior of the shared
processor partitions that exceed the SPPL attribute will change as
follows:
- Partition activation:
  - Partitions will continue to boot and reboot successfully unless the
    minimum number of virtual processors is greater than the SPPL.
  - Partitions that are activated will limit the number of active
    virtual processors to no more than the SPPL limit.
- Partition configuration:
  - Errors that are logged when the SPPL is exceeded can result in HMC
    errors HSCLA4D6 and HSC0A4D6.
  - Attempts to change the number of virtual processors or entitled
    processing units via a profile or dynamic LPAR change will be
    subject to the SPPL setting of 24 or 32. For example, if the SPPL
    is set to 32 and your shared processor partition is configured with
    40 virtual processors, you must reduce the number of virtual
    processors to 32 or fewer for the change to be successful.
  - If you create a new shared processor partition, the number of
    virtual processors must not exceed the SPPL value.
- Partition mobility:
  - A partition must comply with the SPPL of the target server.
- Partition hibernation (suspend/resume):
  - If you have suspended partitions that have exceeded the SPPL limit
    and install this service pack, you will not be able to successfully
    resume those suspended partitions. You should ensure all suspended
    partitions comply with (have virtual processors fewer than or equal
    to) the new SPPL limit before installing this service pack.
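The rules above can be summarized in a small Python sketch, which may help when reviewing partition profiles before installing the service pack. The function and its parameter names are hypothetical and are not part of any HMC or firmware interface; the sketch only restates the behavior described in this section for an SPPL of 24 or 32.

```python
def check_against_sppl(min_virtual_procs, desired_virtual_procs, sppl):
    """Summarize how a shared processor partition behaves under the SPPL.

    All names here are illustrative; the values would come from the
    partition profile and the server's SPPL attribute.
    """
    if sppl not in (24, 32):
        raise ValueError("this sketch only models SPPL settings of 24 or 32")

    # Activation succeeds unless the minimum number of virtual processors
    # is greater than the SPPL.
    can_activate = min_virtual_procs <= sppl

    # An activated partition runs with no more virtual processors than the SPPL.
    active = min(desired_virtual_procs, sppl) if can_activate else 0

    # Profile or dynamic LPAR changes, new partitions, mobility and resume
    # all require the virtual processor count to be within the SPPL.
    complies = desired_virtual_procs <= sppl

    return {
        "can_activate": can_activate,
        "active_virtual_procs": active,
        "complies": complies,
    }
```

With the example from the text, check_against_sppl(8, 40, 32) reports a partition that still activates, runs with 32 active virtual processors, and does not comply until it is reduced to 32 or fewer.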
Downgrading firmware from any given release level to an earlier
release level is not recommended.
If you feel that it is necessary to downgrade the firmware on your
system to an earlier release level, please contact your next level of
support.
IPv6 Support and Limitations
IPv6 (Internet Protocol version 6) is supported in the System
Management Services (SMS) in this level of system firmware. There are
several limitations that should be considered.
When configuring a network interface card (NIC) for remote IPL, only
the most recently configured protocol (IPv4 or IPv6) is retained. For
example, if the network interface card was previously configured with
IPv4 information and is now being configured with IPv6 information,
the IPv4 configuration information is discarded.
A single network interface card may only be chosen once for the boot
device list. In other words, the interface cannot be configured for the
IPv6 protocol and for the IPv4 protocol at the same time.
Memory Considerations for Firmware Upgrades
Firmware Release Level upgrades and Service Pack updates may consume
additional system memory.
Server firmware requires memory to support the logical partitions on
the server. The amount of memory required by the server firmware varies
according to several factors.
Factors influencing server firmware memory requirements include the
following:
- Number of logical partitions
- Partition environments of the logical partitions
- Number of physical and virtual I/O devices used by the logical
partitions
- Maximum memory values given to the logical partitions
Generally, you can estimate the amount of memory required by server
firmware to be approximately 8% of the system installed memory. The
actual amount required will generally be less than 8%. However, there
are some server models that require an absolute minimum amount of
memory for server firmware, regardless of the previously mentioned
considerations.
Additional information can be found at:
http://publib.boulder.ibm.com/infocenter/powersys/v3r1m5/topic/p7hat/iphatlparmemory.htm
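As a worked example of the 8% rule of thumb described above, the following minimal Python sketch computes the estimate for a given amount of installed memory. The function name and the gigabyte unit are illustrative assumptions. The result is an upper-end planning figure only, and the sketch does not model the model-specific absolute minimums mentioned above.

```python
def estimate_firmware_memory_gb(installed_memory_gb, fraction=0.08):
    """Estimate the memory reserved for server firmware, in GB.

    Uses the approximate 8% guideline from this document; the actual
    amount required will generally be lower.
    """
    if installed_memory_gb < 0:
        raise ValueError("installed memory cannot be negative")
    return installed_memory_gb * fraction
```

For a server with 1024 GB installed, the estimate is roughly 82 GB.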
3.0 Firmware Information and Description
Use the following examples as a reference to determine whether your
installation will be concurrent or disruptive.
Note: The concurrent levels of system firmware may, on occasion,
contain fixes that are known as deferred. These deferred fixes can be
installed concurrently, but will not be activated until the next IPL.
Deferred fixes, if any, will be identified in the "Firmware Update
Descriptions" table of this document. For deferred fixes within a
service pack, only the fixes in the service pack which cannot be
concurrently activated are deferred.
Note: The file names and service pack levels used in the following
examples are for clarification only, and are not necessarily levels
that have been, or will be released.
System firmware file naming convention:
01AHXXX_YYY_ZZZ
- XXX is the release level
- YYY is the service pack level
- ZZZ is the last disruptive service pack level
NOTE: Values of service pack and last disruptive service pack level
(YYY and ZZZ) are only unique within a release level (XXX). For
example, 01AH330_067_045 and 01AH340_067_053 are different service
packs.
An installation is disruptive if:
- The release levels (XXX) are different.
Example: Currently installed release is AH330, new release is AH340.
- The service pack level (YYY) and the last disruptive service pack
level (ZZZ) are the same.
Example: AH330_120_120 is disruptive, no matter what level of AH330 is
currently installed on the system.
- The service pack level (YYY) currently installed on the system is
lower than the last disruptive service pack level (ZZZ) of the service
pack to be installed.
Example: Currently installed service pack is AH330_120_120 and new
service pack is AH330_152_130.
An installation is concurrent if:
- The release level (XXX) is the same, and
- The service pack level (YYY) currently installed on the system is
the same or higher than the last disruptive service pack level (ZZZ)
of the service pack to be installed.
Example: Currently installed service pack is AH330_126_120, new
service pack is AH330_143_120.
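The disruptive and concurrent rules above can be expressed as a short Python sketch. The function and class names are hypothetical, and the sketch assumes the AH prefix used by this package. Checking the three disruptive conditions first is one reading of the rules, based on the statement that a level with matching YYY and ZZZ is disruptive no matter what is installed.

```python
import re
from typing import NamedTuple


class FirmwareLevel(NamedTuple):
    release: str          # XXX, the release level
    service_pack: int     # YYY, the service pack level
    last_disruptive: int  # ZZZ, the last disruptive service pack level


def parse_level(name):
    """Parse names such as '01AH330_067_045' or 'AH330_067_045'."""
    match = re.fullmatch(r"(?:01)?AH(\d{3})_(\d{3})_(\d{3})", name)
    if match is None:
        raise ValueError(f"unrecognized firmware level: {name!r}")
    release, service_pack, last_disruptive = match.groups()
    return FirmwareLevel(release, int(service_pack), int(last_disruptive))


def installation_type(installed, new):
    """Return 'disruptive' or 'concurrent' for installing `new` over `installed`."""
    current, target = parse_level(installed), parse_level(new)
    if current.release != target.release:
        return "disruptive"  # release levels (XXX) are different
    if target.service_pack == target.last_disruptive:
        return "disruptive"  # YYY and ZZZ of the new level are the same
    if current.service_pack < target.last_disruptive:
        return "disruptive"  # installed YYY is lower than the new ZZZ
    return "concurrent"
```

Applied to the examples above, installing AH330_152_130 over AH330_120_120 is reported as disruptive, and installing AH330_143_120 over AH330_126_120 is reported as concurrent.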
Filename | Size | Checksum
01AH720_113_064.rpm | 49351217 | 25578
Note: The checksum can be found by running the AIX sum command against
the rpm file (only the first 5 digits are listed).
Example: sum 01AH720_113_064.rpm
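If the rpm file has been downloaded to a workstation that does not run AIX, a comparable check can be sketched in Python. This assumes that the default AIX sum output is the byte-by-byte 16-bit rotating checksum (the classic BSD-style algorithm); that assumption should be confirmed against the AIX documentation, and running sum on AIX remains the reference check. The function names are hypothetical.

```python
def bsd_sum(data, checksum=0):
    """Update a 16-bit rotating checksum with the given bytes."""
    for byte in data:
        # Rotate right by one bit within 16 bits, then add the next byte.
        checksum = (checksum >> 1) | ((checksum & 1) << 15)
        checksum = (checksum + byte) & 0xFFFF
    return checksum


def bsd_sum_file(path, chunk_size=65536):
    """Compute the rotating checksum of a file, reading it in chunks."""
    checksum = 0
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            checksum = bsd_sum(chunk, checksum)
    return checksum


def matches_listed_checksum(path, listed):
    """Compare a file's checksum with the 5-digit value from the table."""
    return f"{bsd_sum_file(path):05d}" == str(listed).zfill(5)
```

A call such as matches_listed_checksum("01AH720_113_064.rpm", "25578") would then compare the downloaded file with the value in the table above. The pure Python loop is slow for a file of this size but needs no extra libraries.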
AH720
For Impact, Severity and other firmware definitions, please refer to
the 'Glossary of firmware terms' at the following URL:
http://www14.software.ibm.com/webapp/set2/sas/f/power5cm/home.html#termdefs
The complete Firmware Fix History for this Release Level can be
reviewed at the following URL:
http://download.boulder.ibm.com/ibmdl/pub/software/server/firmware/AH-Firmware-Hist.html
AH720_113_064 (06/28/12)
Impact: Availability
Severity: SPE
New Features and Functions
- PARTITION-DEFERRED:
Support for Live Partition Mobility (LPM) between systems running Ax720
system firmware, and 8246-L2S systems.
System firmware changes that affect all systems
- The firmware was enhanced to increase the threshold of soft
NVRAM errors on the service processor to 32 before SRC B15xF109 is
logged. (Replacement of the service processor is recommended if
more than one B15xF109 is logged per week.)
- The firmware was enhanced to call out the correct field
replaceable units (FRUs) when SRC B124E504 with description "Chnl init
TO due to SN stuck in recovery" was logged.
- A problem was fixed that caused informational SRC A70047FF,
which may indicate that the Anchor (VPD) card should be replaced, to be
erroneously logged again after the Anchor card was replaced.
- A problem was fixed that caused booting from a virtual
fibre channel tape device to fail with SRC B2008105.
- A problem was fixed that caused a dynamic LPAR (DLPAR) add
operation to fail on an empty PCI slot that is not hot-pluggable.
- The firmware was enhanced to more gracefully handle the
system shutdown that is required when a hypervisor hang condition was
encountered. SRCs B7000602, B182951C, B1813918 and A7001151 were
logged, and a service processor failover occurred, when the hypervisor
hang condition and subsequent system crash occurred.
- A problem was fixed that caused a system to crash when the
system was in low power (or safe) mode, and the system attempted to
switch over to nominal mode.
- A problem was fixed that caused the system to crash after a
recoverable error was logged on an I/O hub.
- A problem was fixed that caused SRC B7006990 to be
incorrectly logged, instead of SRC B7006991, when the hypervisor is
unable to communicate with the secondary system controller at system
boot.
System firmware changes that affect certain systems
- The firmware was enhanced to fix a potential performance degradation
on systems utilizing the stride-N stream prefetch instructions dcbt
(with TH=1011) or dcbtst (with TH=1011). Typical applications executing
these algorithms include High Performance Computing, data intensive
applications exploiting streaming instruction prefetches, and
applications utilizing the Engineering and Scientific Subroutine
Library (ESSL) 5.1.
- A problem was fixed that caused the hypervisor to hang
during a concurrent operation on a F/C 5802, 5803, 5873 or 5877 I/O
drawer. Recovering from the hypervisor hang required a platform
reboot.
- A problem was fixed that impacted performance if profiling
was enabled in one or more partitions. Performance profiling is
enabled:
  - In an AIX or VIOS partition using the tprof (-a, -b, -B, -E option)
    command or pmctl (-a, -E option) command.
  - In an IBM i partition when the PEX *TRACE profile (TPROF)
    collections or PEX *PROFILE collections are active.
  - In a Linux partition using the perf command, which is available in
    RHEL6 and SLES11; profiling with oprofile does not cause the
    problem.
Concurrent hot add/repair maintenance firmware fixes
- A problem was fixed
that caused multiple types of failures (CHARM node operations and
Advanced Energy Manager (AEM) state changes, among others), after a
CHARM hot node operation on the first (top) drawer (or the first
logical node) was followed by a concurrent firmware installation.
- A problem was fixed that caused unrecoverable SRCs B1813918
and B182953C during a CHARM operation.
AH720_108_064 (01/23/12)
Impact: Availability
Severity: HIPER - High Impact/PERvasive. Should be installed as soon
as possible.
System firmware changes that affect all systems
- HIPER/Not pervasive: A problem was fixed that caused the system to
crash with SRC B18187DA.
- The firmware was enhanced to
log SRC B1768B76 as informational instead of unrecoverable.
- The firmware was enhanced to
increase the threshold for recoverable SRC B113E504 so that the
processor core reporting the SRC is not guarded out. This
prevents performance loss and the unnecessary replacement of processor
modules.
- A problem was fixed that prevented
a platform system dump from being deleted when the file system space on
the service processor was full.
- The firmware was enhanced to
log SRC B1812A11 as informational, instead of "service action
required", when the thermal/power management device (TPMD) is
successfully reset.
- The field replaceable unit
(FRU) callouts were enhanced for SRC B181E550.
- A problem was fixed that
caused the message "500 - Internal Server Error." to be displayed when
a setting was changed on the Advanced System Management Interface's
(ASMI's) power on/off menu while the system was powering down.
- A problem was fixed that
erroneously caused SRC B1818601 to be logged and an FSP dump to be
generated.
- The firmware was
enhanced to log an error, instead of causing a kernel panic, if a guard
record was corrupted or truncated.
- A problem was fixed that
caused the wrong error code to be logged when the memory test took
longer than normal during system boot.
- A problem was fixed that caused a system's partition
dates to revert back to 1969 after the service processor or its battery
was replaced. This occurred regardless of whether or not the
service processor's time-of-day (TOD) clock was correctly set during
the service action.
- A problem was fixed that
caused the system to appear to hang, and a service processor
reset/reload to occur, when multiple hardware errors occurred.
- A problem was
fixed that caused SRC B7005442 to be erroneously logged, and functional
processor cores to be guarded out, when an error occurred in the
operating system or an application.
- A problem was fixed that
caused multiple service processor dumps to be unnecessarily taken
during a concurrent firmware update. SRC B181EF9A, which
indicates that the dump space on the service processor is full, was
logged as a result.
- The firmware was enhanced by
the addition of a new option in the system management services (SMS)
"Multi-boot" menu that facilitates zoning of physical and virtual fibre
channel adapters.
- A problem was fixed that
caused a firmware installation from the HMC with the "do not auto
accept" option selected to fail.
- A problem was fixed
that caused SRC B18138B7 to be erroneously logged, and the service
processor to terminate, when errors were continuously logged due to
failing hardware. This problem can cause both node controllers to
terminate, which disables the node.
- Please see the
detailed description of this defect in the "Important Information"
section of this document.
- A problem was fixed that
caused SRC B1754201, with memory DIMMs in the FRU list, to be
erroneously logged after the reset/reload of a node controller.
- The firmware was enhanced to
correctly log an error when the bulk power controllers' firmware levels
don't match.
System firmware changes that affect certain systems
- HIPER/Pervasive on systems with a Virtual Input/Output (VIO) client
running AIX, and with a F/C 5803 or 5873 I/O drawer attached: A problem
was fixed that caused the system to crash with SRC B700F103.
- HIPER/Not pervasive: On systems running the Advanced Energy Manager,
a problem was fixed that caused the system to crash with SRC B114E504.
- On systems running more than
100 logical partitions, a problem was fixed that caused a concurrent
firmware installation to fail.
- On systems running the
Advanced Energy Manager (AEM) that terminate while in dynamic power
save mode, a problem was fixed that caused SRCs B150B943, B113C660, and
B113C661 to be erroneously logged when the system rebooted.
- On systems running
Active Memory Sharing (AMS), the firmware was enhanced to reduce the
time required to migrate an AMS partition.
- On systems running Active
Memory Sharing (AMS), a problem was fixed that caused the system to
crash during the creation of a logical partition (LPAR).
- On systems running Active
Memory Sharing (AMS), a problem was fixed that caused the activation of
an AMS partition to fail with SRC B2006009.
- On systems running VIOS, a
problem was fixed that caused the location code in the output of the
VIOS command "lsmap -npiv -all" to be incorrect.
- A problem was
fixed that caused a shared processor partition that is configured with
two virtual processors and an entitled capacity of 1.0 processors to
hang when only one processor is in the physical shared pool.
- On systems running iSCSI, a
problem was fixed that caused the system to hang when booting from an
iSCSI device in the system management services (SMS) menus.
- On the System Management
Services (SMS) remote IPL (RIPL) menus, a problem was fixed that caused
the SMS menu to continue to show that an Ethernet device is configured
for iSCSI, even though the user has changed it to BOOTP.
- On systems running the
Advanced Energy Manager (AEM), a problem was fixed that caused the work
rate calculation for a processor to be incorrect if the system dropped
into safe mode.
- On systems from which a node
has been removed, a problem was fixed that caused the node to continue
to be listed when the Processing Unit Deconfiguration option was
selected on the Advanced System Management Interface (ASMI) menus.
- On systems in which a service
processor had been guarded out manually, a problem was fixed that
caused the Deconfiguration Records option, which is under the System
Service Aids in the Advanced System Management Interface (ASMI), to
display null data for that service processor.
- The firmware was enhanced to
allow the Enhanced Cache Option (also known as Turbo Core) to be
enabled when three or more processor nodes are present.
4.0 How to Determine Currently Installed Firmware Level
You can view the server's current firmware level on the Advanced System
Management Interface (ASMI) Welcome pane. It appears in the top right
corner.
Example: AH350_038.
5.0 Downloading the Firmware Package
Follow the instructions on the web page. You must read and agree to the
license agreement to obtain the firmware packages.
Note: If your HMC is not internet-connected, you will need to download
the new firmware level to a CD-ROM or ftp server.
6.0 Installing the Firmware
The method used to install new firmware will depend on the release
level of firmware that is currently installed on your server. The
release level can be determined by the prefix of the new firmware's
filename.
Example: AHXXX_YYY_ZZZ
Where XXX = release level
- If the release level will stay the same (Example: Level AH330_075_075
is currently installed and you are attempting to install level
AH330_081_075), this is considered an update.
- If the release level will change (Example: Level AH330_081_075 is
currently installed and you are attempting to install level
AH340_096_096), this is considered an upgrade.
Instructions for installing firmware updates and upgrades can be found
at http://publib.boulder.ibm.com/infocenter/powersys/v3r1m5/index.jsp?topic=/p7ha1/updupdates.htm
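The update and upgrade distinction described in this section depends only on the release level (XXX). A minimal Python sketch, using hypothetical function names and assuming a two-letter prefix followed by a three-digit release level, could look like this:

```python
import re


def release_level(name):
    """Extract the release level XXX from a name such as 'AH330_081_075'."""
    match = re.match(r"(?:01)?[A-Z]{2}(\d{3})_", name)
    if match is None:
        raise ValueError(f"unrecognized firmware level: {name!r}")
    return match.group(1)


def install_kind(installed, new):
    """Return 'update' if the release level stays the same, else 'upgrade'."""
    if release_level(installed) == release_level(new):
        return "update"
    return "upgrade"
```

With the examples above, install_kind("AH330_075_075", "AH330_081_075") gives "update", and install_kind("AH330_081_075", "AH340_096_096") gives "upgrade".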
7.0 Firmware History
The complete Firmware Fix History for this Release Level can be
reviewed at the following url:
http://download.boulder.ibm.com/ibmdl/pub/software/server/firmware/AH-Firmware-Hist.html