This document provides information about the installation of Licensed
Machine or Licensed Internal Code, which is sometimes referred to generically
as microcode or firmware.
When configuring a network interface card (NIC) for remote IPL, only
the most recently configured protocol (IPv4 or IPv6) is retained. For example,
if the network interface card was previously configured with IPv4 information
and is now being configured with IPv6 information, the IPv4 configuration
information is discarded.
A single network interface card may only be chosen once for the boot
device list. In other words, the interface cannot be configured for the
IPv6 protocol and for the IPv4 protocol at the same time.
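The two rules above can be illustrated with a minimal sketch. It is a model only, not firmware or ASMI code: the class RemoteIplNic, the function add_to_boot_list, the interface name "ent0", and the example addresses are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RemoteIplNic:
    """Hypothetical model of one NIC configured for remote IPL."""
    name: str
    protocol: Optional[str] = None   # "IPv4" or "IPv6"
    settings: Optional[dict] = None

    def configure(self, protocol: str, settings: dict) -> None:
        if protocol not in ("IPv4", "IPv6"):
            raise ValueError(f"unsupported protocol: {protocol}")
        # Only the most recent configuration is retained;
        # whatever was stored for the other protocol is discarded.
        self.protocol = protocol
        self.settings = dict(settings)


def add_to_boot_list(boot_list: list, nic: RemoteIplNic) -> None:
    """Add a NIC to the boot device list, allowing each card only once."""
    if any(entry.name == nic.name for entry in boot_list):
        raise ValueError(f"{nic.name} is already in the boot device list")
    boot_list.append(nic)
```

In this model, calling configure("IPv4", ...) and then configure("IPv6", ...) on the same object leaves only the IPv6 settings, and a second add_to_boot_list call for the same card is rejected.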
For systems that are not managed by an HMC, the installation of system
firmware is always disruptive.
Note: The concurrent levels of system firmware may, on occasion, contain
fixes that are known as deferred. These deferred fixes can be installed
concurrently, but will not be activated until the next IPL. Deferred fixes,
if any, will be identified in the "Firmware Update Descriptions" table
of this document. For deferred fixes within a service pack, only the fixes
in the service pack which cannot be concurrently activated are deferred.
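The behavior described in this note can be sketched as a small model. It is an illustration under stated assumptions, not an IBM tool: the names Fix and SystemFirmware are hypothetical, and the "deferred" flag stands for an entry marked DEFERRED in the table.

```python
from dataclasses import dataclass, field


@dataclass
class Fix:
    description: str
    deferred: bool = False   # True for entries marked "DEFERRED:" in the table


@dataclass
class SystemFirmware:
    """Hypothetical model that tracks which installed fixes are active."""
    active: list = field(default_factory=list)
    pending: list = field(default_factory=list)   # installed, awaiting the next IPL

    def install_concurrently(self, fixes) -> None:
        for fix in fixes:
            # Deferred fixes are installed now but are not activated until the next IPL;
            # all other fixes in the service pack take effect immediately.
            (self.pending if fix.deferred else self.active).append(fix)

    def ipl(self) -> None:
        # An IPL activates every fix that was waiting.
        self.active.extend(self.pending)
        self.pending.clear()
```

For a service pack with one deferred fix and one non-deferred fix, install_concurrently leaves one fix active and one pending; after ipl, both are active.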
EM340 |
EM340_101_039
09/23/09 |
Impact: Serviceability Severity: Attention
System firmware changes that affect all systems
-
DEFERRED: The firmware was enhanced to reduce the number of
correctable errors (CEs) being erroneously logged against the memory bus
with SRC B124E504.
-
The firmware was enhanced such that SRC B181F126 is correctly managed,
and no longer calls home unnecessarily for this problem.
|
EM340_095_039
08/20/09 |
Impact: Function Severity: HIPER
System firmware changes that affect all systems
-
DEFERRED: This fix corrects the handling of a specific processor
instruction sequence that was generated on a particular heavily-tuned High
Performance Computing (HPC) application. This specific instruction sequence
has the potential to produce an incorrect result. This instruction sequence
has only been observed in a single HPC application. However, it is
strongly recommended that you apply this fix.
System firmware changes that affect certain systems
-
HIPER: for systems with F/C 5802 or 5877 drawers attached:
A problem was fixed that prevented node concurrent maintenance operations
on systems with F/C 5802 or 5877 drawers attached to them.
-
On systems with F/C 5802 or 5877 drawers attached, a problem was fixed
that prevented an I/O slot's power LED from accurately reflecting the state
of the I/O slot in a 5802 or 5877 drawer, under certain circumstances.
-
On systems running system firmware EM340_075 and Active Memory Sharing,
a problem was fixed that might have caused a partition to fail to boot
with SRC B700F103 if the partition had more than 24 virtual processors
assigned to it.
-
On systems running system firmware EM340_075 and Active Memory Sharing,
a problem was fixed that might have caused a partition to lose I/O entitlement
after the partition was moved from one system to another using PowerVM
Mobility.
-
On systems running system firmware release EM340, a problem was fixed that
might have caused the I/O performance to be degraded if a node evacuation
operation was performed (as part of a concurrent maintenance operation
to fix a failing I/O adapter or drawer) after the repair was complete.
-
On systems with external I/O towers attached, the firmware was enhanced
so that the system will not crash when SRC B7006981 is logged for certain
types of I/O hardware failures.
Concurrent maintenance (CM) firmware fixes
-
On model MMA systems, the firmware was enhanced such that an invalid enclosure
serial number will not cause the node evacuation phase of a concurrent
maintenance operation to fail. A small number of model MMA enclosures
may have an invalid serial number (such as "DQ1234 " or "DQ1234#") due to
the I/O backplane having been replaced in a previous maintenance operation.
-
A problem was fixed that might have caused the performance of an I/O loop
(attached to a 12X I/O adapter) to be degraded if a B7006982, B7006984,
B7006985, B70069F2, B70069F3, or B70069F4 SRC is logged after a concurrent
maintenance operation on that loop.
-
A problem was fixed that caused concurrent maintenance operations on memory
DIMMs to fail if the replacement DIMMs were functionally equivalent to
the original DIMMs, but did not have the same CCIN (customer card identification
number).
-
A problem was fixed that caused B1xxB889 SRCs to be erroneously logged
during a node evacuation operation. (Node evacuation is one step
in a concurrent maintenance operation on a node.)
-
A problem was fixed that caused the system to crash during a hot node or
GX adapter repair with certain hardware configurations.
-
A problem was fixed that caused the system to crash during a hot node repair
or upgrade.
|
EM340_075_039
05/26/09 |
Impact: Function Severity: Special Attention
New features and functions:
- DEFERRED: Support for F/C 5802 (19" I/O drawer) and 5803 (24"
I/O drawer).
Attention: After this level of firmware is installed, the platform
must be powered off, then powered on, before the 5802 or 5803 I/O drawer
is added to the system.
- DEFERRED: Support for PowerVM Active Memory Sharing.
Attention: After this level of firmware is installed, the platform
must be powered off, then powered on, to activate the PowerVM Active Memory
Sharing function.
Attention: If EM340_075 has been installed, and the new PowerVM
Active Memory Sharing function has been activated, and you want to back-level
the system firmware, the active memory sharing pool must be deactivated
and deleted prior to back-leveling the system firmware. IBM does not recommend
back-leveling the system firmware.
System firmware changes that affect all systems:
-
A problem was fixed that caused the detailed data at the end of an "early
power off warning type 5" AIX error log entry to be filled with invalid
data instead of zeros.
-
A problem was fixed that prevented all of the necessary files from being
synchronized between the primary and the secondary service processors.
One possible symptom of this problem was the time-of-day clocks being out
of synch after a service processor failover.
-
The firmware was enhanced to include processor card #1 in the list of field
replaceable units (FRUs) that are called out if an I2C bus error occurs
when accessing the processor backplane's vital product data (VPD).
-
A problem was fixed that caused SRC B1818601 to be logged, and a service
processor dump to be generated, at runtime.
-
A problem was fixed that caused the number of empty GX adapter slots displayed
by the advanced system management interface (ASMI) to be incorrect.
-
A problem was fixed that caused the amber identify LED, instead of the
green power-on LED, to be lit on the first drawer of a model MMA (Power
570) system.
-
The firmware was enhanced so that if the secondary service processor remains
hung after the primary service processor successfully boots, a predictive
error will be logged, and a call home will be made.
-
A problem was fixed that caused the service processor diagnostics to report
a "TOD (time-of-day) overflow" error, instead of an uncorrectable memory
error, when failures occurred on memory DIMMs.
-
The firmware was enhanced such that if an attempt is made to enable redundancy
when the system is booting, the error log entry that is made will be informational
instead of predictive.
-
The firmware was enhanced so that a call home will be made if the hypervisor
issues a "terminate immediate" interrupt.
-
The firmware was enhanced so that the service processor only logs SRC B1A38B24
when a valid network setup error is found. The callouts for this SRC were
also improved.
-
The firmware was enhanced so that SRCs B181720D, B1818A13, and B1818A0F,
and occasionally a service processor dump, will not be generated when the
service processor's two Ethernet interfaces are on the same subnet. (This
is an invalid configuration.)
System firmware changes that affect certain systems:
-
In systems using InfiniBand switches for processor clustering, a problem
was fixed that caused packets to be dropped under certain circumstances.
-
A problem was fixed that caused the migration of a partition with more
than 900 virtual slots defined, from a system running firmware EM320 to
a system running firmware EM340, to fail.
-
On systems running firmware release EM340, a problem was fixed that caused
data in the platform dump to be invalid.
-
On systems with external drawers or towers, a problem was fixed that caused
SRC xxxx6981, xxxx6982, or xxxx6985 to be logged. When this problem occurred,
some I/O slots might also be missing from the resource lists.
-
On systems using on/off (temporary) memory capacity on demand (COD), the
firmware was enhanced to improve memory COD's interaction with other tools
(such as Inventory Scout in AIX), and to make the billing process easier.
-
On systems with two hardware management consoles (HMCs), the firmware was
enhanced so that the system will not restart and generate a service processor
dump when the two HMCs are in the same subnet. (This is an invalid configuration.)
Concurrent maintenance (CM) firmware fixes:
-
DEFERRED: A problem was fixed that caused SRC B150A422 to be erroneously
logged, and the advanced system management interface (ASMI) to erroneously
show deconfigured processor cores, if system firmware was installed while
a node was deactivated due to a concurrent maintenance operation.
-
DEFERRED: A problem was fixed that caused SRC B181B171 to be logged,
and the system to crash, during a concurrent node repair or concurrent
GX adapter repair.
-
A problem was fixed that might cause a concurrent node repair, a concurrent
I/O expansion unit repair, a concurrent PCI slot repair, or a DLPAR removal
or moving of I/O slots to fail if the I/O hardware involved is in a failed
state.
-
A problem was fixed that caused a hot node repair operation to fail if
16GB huge pages were configured on the system.
-
A problem was fixed that caused a concurrent node add or repair operation
to fail if the operation immediately followed an upgrade of system firmware
from EM330_xxx to EM340_039, then a concurrent installation of EM340_061.
-
A problem was fixed that caused a partition reboot to hang at AIX progress
code 0581, after the concurrent replacement of the I/O backplane in a model
MMA drawer, when the partition owned resources in the drawer that was repaired.
|
EM340_061_039
04/20/09 |
Impact: Function Severity: Special Attention
System firmware changes that affect all systems:
-
DEFERRED: A problem was fixed that caused the advanced system management
interface (ASMI) menus to become unresponsive, and the system to appear
to hang, when a GX adapter slot reservation was attempted when the system
was at service processor standby.
-
A problem was fixed that caused the service processor diagnostics to report
a "TOD (time-of-day) overflow" error, instead of an uncorrectable memory
error, when failures occurred on memory DIMMs.
-
A problem was fixed that prevented the service processor from automatically
booting from the permanent (or P) side if the temporary (or T) side of
the firmware flash was corrupted. When the problem occurred, the service
processor stopped instead of booting from the P side.
-
A problem was fixed that might have caused the system to crash when a processor
was dynamically removed when the system was running. If the system is running
the EM340 release of system firmware, this problem can also occur during
a concurrent maintenance operation.
-
The firmware was enhanced such that data corruption in the Anchor (VPD)
will be corrected by the firmware, rather than having to have the Anchor
card replaced.
-
A problem was fixed that prevented the system from powering on after the
"reset to factory settings" option was selected in the advanced system
management interface (ASMI) menus.
-
The firmware was enhanced to improve the service processor's capability
to recover from bad bits in the flash memory. A predictive error, or an
unrecoverable error, will be logged against the card that contains the
system firmware if the number of correctable or uncorrectable errors exceeds
the threshold.
-
A problem was fixed that caused non-terminating SRCs (such as B1818A1E)
that indicate registry read errors to be logged during a disruptive installation
of system firmware.
-
A problem was fixed that caused a partition being migrated to crash on
the target system.
-
On systems running the EM340 release of system firmware, a problem was
fixed that caused an abort code to be logged in the virtual input/output
system (VIOS) error log on the source system after a successful partition
migration.
-
A problem was fixed that caused a partition being migrated to become unresponsive
on the target system when firmware-assisted dump was enabled.
-
The firmware was enhanced so that SRC BA210012 will not generate a call
home when logged.
-
The callouts for SRC B181E6ED, which is logged when a system is booted
with service processor redundancy disabled, were improved to indicate that
redundancy was disabled rather than calling out a firmware failure.
-
A problem was fixed that caused hardware to be deconfigured when the system
encountered network errors, even though the SRCs were being logged as informational.
-
A problem was fixed that prevented all of the necessary files from being
synchronized between the primary and secondary service processors. One
possible symptom of this problem was the time-of-day clocks being out of
synch after a service processor failover.
System firmware changes that affect certain systems:
-
On systems with external I/O drawers, a problem was fixed that could cause
the system to hang on checkpoint C700406E during a "warm" reboot (a reboot
in which the processor drawer is power-cycled but the I/O drawers are not).
-
On systems running system firmware release EM340 and IBM i partitions,
a problem was fixed that caused message CPF9E7F, CPF9E2D or CPF9E5E (which
indicates a licensing key problem) to be received by the IBM i partitions
when the number of physical processors was greater than the number of IBM
i licenses.
-
On systems with virtual fiber channel disks, a problem was fixed that prevented
the system management services (SMS) from displaying the virtual fiber
channel disks if the virtual fiber channel server reported that any of
them were reserved.
Concurrent maintenance (CM) firmware fixes
-
DEFERRED: On systems running system firmware release EM340, a problem
was fixed that caused the system to checkstop during the "hot add" of a
GX I/O adapter card.
-
A problem was fixed that caused the fans in a drawer that was added in
a "hot drawer add" operation to run at high speed.
-
A problem was fixed that caused a concurrent maintenance operation to be
halted with SRC B181A433 being logged.
-
A problem was fixed that caused concurrent maintenance operations, if attempted
immediately after a disruptive firmware installation, to be disabled.
-
A problem was fixed that caused SRC B150D15E to be erroneously logged during
a concurrent drawer addition or concurrent memory upgrade.
-
A problem was fixed that caused concurrent maintenance operations, if attempted
immediately after a concurrent firmware installation, to be disabled.
-
A problem was fixed that caused a concurrent node add to fail after a disruptive
firmware installation with SRC B181A422 being logged.
-
A problem was fixed that prevented a concurrent add or repair of a GX adapter
from being re-attempted if a reset/reload of the primary service processor
occurred during the GX add part of the initial procedure.
|
EM340_041_039
12/09/08 |
Impact: Availability Severity: HIPER
System firmware changes that affect certain systems:
-
On model 9117-MMA systems with F/C 7540 (POWER6, 64-bit, 4.2 GHz, four
core processor) installed, and all model 8234-EMA systems, a problem was
fixed that caused a processor to checkstop after a reset/reload of the
service processor. SRC B181D15F, B181E911 and/or B150B145 may be logged,
and service processor dumps may be present, when this problem occurs.
|
EM340_039_039
11/21/08 |
Impact: Function Severity: Attention
New Features and Functions:
-
Support for the model 8234-EMA.
-
Support for the 8GB fiber channel adapter, F/C 5735.
-
Support for a virtual tape device.
-
Support for USB flash memory storage devices.
-
Support in the service processor firmware for IPv6.
-
Support in the hypervisor for three types of hardware performance monitors.
-
Support for installing AIX and Linux using the integrated virtualization
manager (IVM).
-
On systems running AIX, support was added for an enhanced power and thermal
management capability. When static power save mode is selected, AIX will
"fold" processors to free processors which can then be put in the "nap"
state.
-
Support for CIM (common information model) power instrumentation in the
service processor firmware.
-
Support for enhanced power management, including dynamic voltage and frequency
slewing.
-
Support for processor cards with two dual-core module (DCM) processors;
the maximum configuration of the model MMA with these processor cards is
32 processors.
-
On systems that have temperature and power management device (TPMD) hardware,
support was added for a "soft" power cap.
-
Support for concurrent processor node addition, as well as hot and cold
node repair.
System firmware changes that affect all systems:
-
A problem was fixed that prevented the default partition environment in
the advanced system management interface (ASMI) power on/off menu from
being set to "i5/OS" when it was blank.
-
The firmware was enhanced so that SRC B1xx3409, which indicates an invalid
state change (such as pushing the power on button twice quickly) will be
logged as informational instead of predictive, and will not call home.
-
A problem was fixed that caused a service processor dump to be taken and
SRC B181EF88 to be logged, even though the operation of the system was
not affected.
-
A problem was fixed that, under certain rare circumstances, caused SRC
B181E411 to be logged, a call home to be made, and a service processor
dump to be taken.
-
The firmware was enhanced so that SRC B1812224, which indicates that the
user attempted to enable redundancy when the managed system was booting,
will be logged as informational instead of predictive.
-
A problem was fixed that prevented error log entries on the secondary service
processor from generating a serviceable event on the hardware management
console (HMC).
-
A problem was fixed that prevented some of the service processor error
log entries from being seen when the advanced system management interface
(ASMI) menus were accessed on a TTY terminal.
System firmware changes that affect certain systems:
-
On systems with the integrated xSeries adapter (IXA), a problem was fixed
that prevented the creation of a system plan on the HMC.
-
On model MMA systems shipped before mid-May 2008, a problem was fixed that
prevented RB keyword0 from being set in the advanced system management
interface (ASMI) system keywords menu.
-
On systems with multiple host channel adapter (HCA) cards, a problem was
fixed that caused logical ports on the HCA cards to be intermittently inactive.
-
In networks using a time server, a problem was fixed that caused the date
on a client system to be reset to 1969 if the client system lost power.
|
EM320 |
EM320_101_045
10/22/09 |
Impact: Function Severity: HIPER
System firmware changes that affect all systems
-
HIPER: A problem was fixed that caused the migration of a
partition using shared processors to fail with a reason code of 4180043,
or caused the source system to hang or crash.
-
DEFERRED: This fix corrects the handling of a specific processor
instruction sequence that was generated on a particular heavily-tuned High
Performance Computing (HPC) application. This specific instruction sequence
has the potential to produce an incorrect result. This instruction sequence
has only been observed in a single HPC application. However, it is
strongly recommended that you apply this fix.
-
The firmware was enhanced such that SRCs B181F126, B181F127, and B181F129
are correctly managed, and no longer cause unnecessary calls home.
-
A problem was fixed that caused SRC B7005603 to be erroneously logged when
a F/C 5802 or 5877 19" drawer was concurrently added to the system.
-
A problem was fixed that caused SRC B1817201 to be erroneously logged during
the installation of system firmware.
System firmware changes that affect certain systems
-
On systems using on/off (temporary) memory capacity on demand (COD), the
firmware was enhanced to improve the billing process for this feature.
|
EM320_093_045
05/04/09 |
Impact: Function Severity: Special Attention
System firmware changes that affect all systems:
-
DEFERRED: The firmware was enhanced so that the system recovers
gracefully from an I/O load time-out, rather than issuing a machine check,
which crashes the system.
-
A problem was fixed that caused the service processor diagnostics to report
a "TOD (time-of-day) overflow" error, instead of an uncorrectable memory
error, when failures occurred on memory DIMMs.
-
A problem was fixed that, in certain configurations, caused the removal
of a host Ethernet adapter (HEA) port to fail when using a dynamic LPAR
(DLPAR) operation.
-
A problem was fixed that, under certain circumstances, prevented the operating
system from recovering a PCI-E adapter on which a temporary enhanced error
handling (EEH) error occurred.
-
A problem was fixed that caused the hardware management console (HMC) to
show the managed system's status as incomplete after adding a drawer using
the concurrent maintenance operation.
-
The firmware was enhanced to improve the service processor's capability
to recover from bad bits in the flash memory. A predictive error, or an
unrecoverable error, will be logged against the card that contains the
system firmware if the number of correctable or uncorrectable errors exceeds
the threshold.
-
The firmware was enhanced so that a call home will be made if the hypervisor
issues a "terminate immediate" interrupt.
-
A problem was fixed that prevented service processor and hypervisor error
log entries from being reported to the operating system after a successful
partition migration. This problem only affected the partition that was
migrated.
-
The firmware was enhanced so that if a system with redundant service processors
is booted with redundancy disabled, a call home error will be logged.
-
A problem was fixed that prevented the system from powering on after the
"reset to factory settings" option was selected in the advanced system
management interface (ASMI) menus.
-
A problem was fixed that caused the migration of an AIX or Linux partition
to fail when firmware-assisted dump was enabled. When this problem occurs,
the partition becomes unresponsive on the target system, and the target
system may have to be rebooted to recover.
-
A problem was fixed that prevented the service processor from automatically
booting from the permanent (or P) side if the temporary (or T) side of
the firmware flash was corrupted. When the problem occurred, the service
processor stopped instead of booting from the P side.
-
A problem was fixed that caused SRC B1818601 to be logged, and a service
processor dump to be generated, at runtime.
-
The firmware was enhanced to include processor card #1 in the list of field
replaceable units (FRUs) that are called out if an I2C bus error occurs
when accessing the processor backplane's vital product data (VPD).
-
A problem was fixed that prevented all of the necessary files from being
synchronized between the primary and the secondary service processors.
One possible symptom of this problem was the time-of-day clocks being out
of synch after a service processor failover.
System firmware changes that affect certain systems:
-
On systems with a host Ethernet adapter (HEA) or host channel adapter (HCA)
assigned to a Linux partition, a problem was fixed that prevented the partition
from booting if 512 GB, 1 TB, or 1.5 TB of memory was assigned to the partition.
When this problem occurred, SRC B700F105 was logged.
-
On systems with multiple host channel adapter (HCA) cards, a problem was
fixed that caused logical ports on the HCA cards to be intermittently inactive.
-
On systems with the integrated xSeries adapter (IXA), a problem was fixed
that prevented the creation of a system plan on the HMC.
-
On systems with redundant service processors, a problem was fixed that
caused registry read errors or registry value errors to be generated during
the installation of system firmware.
-
On systems running AIX partitions, a problem was fixed that caused AIX
to erroneously log a hardware error in which the LABEL field is "INTRPPC_ERR",
and the INTERRUPT LEVEL is "0009 0001", after a concurrent firmware update
or partition migration. This error did not affect the operation of the
system or partition.
|
EM320_083_045
09/24/08 |
Impact: Serviceability Severity: HIPER
System firmware changes that affect all systems:
-
DEFERRED and HIPER: A problem was fixed in which, under certain rarely
occurring circumstances, an application could cause a processor to go into
an error state, and the system to crash.
-
DEFERRED and HIPER: The system initialization settings were changed
to reduce the likelihood of a system crash under certain circumstances.
-
HIPER: A problem was fixed that caused the system to terminate abnormally
with SRC B131E504.
-
HIPER: A problem was fixed that caused a system to fail to reboot
after a B1xxE504 SRC was logged due to a processor interconnection bus
failure. The same SRC, B1xxE504, was logged when the reboot failed.
-
HIPER: A problem was fixed that might cause a partition to crash
during a partition migration before the migration was complete.
-
DEFERRED: A problem was fixed in which, under certain rare circumstances,
if a service processor failover occurred, the new secondary service processor
was not able to communicate with the system.
-
A problem was fixed that caused SRC B1818A10 to be erroneously generated
after the successful installation of system firmware.
-
Enhancements were made to the firmware to improve the FRU callouts for
certain types of failures of the time-of-day clock circuitry.
-
A problem was fixed that, under certain rarely occurring circumstances,
caused the system to crash if an L2 or L3 cache failure occurred.
-
The firmware was enhanced so that the contents of /tmp are included when
a service processor dump is taken.
-
A problem was fixed that caused a predictive SRC, B181EF88, to be erroneously
logged after a successful installation of system firmware, and a subsequent
slow-mode IPL, of the system.
-
A problem was fixed that, under certain rarely occurring circumstances,
caused the system to crash with SRC B7005191 being logged.
-
A problem was fixed that prevented the system from rebooting if an error
occurred during a memory-preserving IPL.
-
A problem was fixed that prevented the diagnostic commands in AIX (diag
and lsmcode, for example) from working after a partition migration.
-
A problem was fixed that, under certain rarely occurring circumstances,
caused a partition shutdown or partition reboot to hang with SRC D200B077.
-
A problem was fixed that, under certain rarely occurring circumstances,
caused the hypervisor to lose its communication link to the service processor
and log SRC A181D000.
-
A problem was fixed that, under certain rarely occurring circumstances,
might have caused dynamic LPAR (DLPAR) operations on memory to fail.
-
A problem was fixed that prevented I/O hardware operations from completing
before dynamic LPAR (DLPAR) operations were performed on memory. This caused
PCI bus errors, and multiple instances of SRC B7006971 to be logged.
-
A problem was fixed in the hypervisor that, under certain rarely occurring
circumstances, caused a system-level activation to fail.
-
A problem was fixed that caused SRC B7006971 to be generated because the
firmware was incorrectly performing operations on PCI-Express I/O adapters
during dynamic LPAR (DLPAR) operations on memory.
-
A problem was fixed that might have caused a processor checkstop after
a node repair or node add operation.
-
A problem was fixed that caused the message "BA330000malloc error!" to
be displayed on the operating system console after a partition migration,
even though SRC BA330000 had not been logged. When this problem occurred,
the partition migration appeared to be successful. However, a process within
the partition was either hung or had failed, and in most cases the partition
had to be rebooted to fully recover.
-
The firmware was enhanced to improve the description and service actions
that are logged with SRC BA210012.
-
A problem was fixed that, under certain rare circumstances, prevented a
partition migration from completing successfully if processors were removed
from the partition being migrated prior to the migration using dynamic
LPAR (DLPAR) operations.
-
A problem was fixed that, under certain rare circumstances, caused a system
crash during partition migration operations.
-
A problem was fixed that, under certain rare circumstances, caused the
hypervisor to crash when it was booting.
System firmware changes that affect certain systems:
-
On systems that are managed by a hardware management console (HMC), a problem
was fixed that caused the HMC to show an "Incomplete" state after it attempted
to read a file with an incorrect size from the service processor (or system
controller). This problem also occurred if the "factory configuration"
option was used on the advanced system management interface (ASMI) menus.
-
On systems with I/O drawers attached, a problem was fixed that might have
caused some I/O slots in the drawers not to be configured when the system
was booted.
-
On i5 partitions using IOP-based I/O adapters which are configured to use
i5 clustering (SAN), a problem was fixed that caused the failover of an
I/O drawer or tower, to a system which previously owned the drawer or tower,
to fail.
-
On systems with a large number of fibre channel disks, a problem was fixed
that caused SRC BA210003 to be logged (which called out the fibre channel
adapter) when the system management services (SMS) boot firmware was searching
for a boot disk.
-
In systems with clustered processors, various problems were fixed in the
InfiniBand interconnection networks.
|
EM320_076_045
06/09/08 |
Impact: Serviceability Severity: HIPER
System firmware changes that affect all systems:
-
DEFERRED and HIPER: The processor initialization settings were changed
to reduce the likelihood of a processor going into an error state and causing
a checkstop or system crash.
-
HIPER: A problem was fixed in the hypervisor that might cause a
partition migration to fail.
-
HIPER: A problem was fixed that caused large numbers of enhanced
error handling (EEH) errors to be logged against the 4-port gigabit Ethernet
adapter, F/C 5740, under certain circumstances.
-
HIPER: A problem was fixed that caused the firmware to erroneously
log VPD errors against the processors. This prevented drawers from powering
on.
-
HIPER: On systems with a redundant service processor installed and
enabled, a problem was fixed that caused a communications hang between
the two service processors. When this occurred, it triggered a reset/reload
of the primary service processor, and the resulting fail-over to the secondary
service processor failed in such a way that the system crashed and logged
SRC B1813410. Service processor dumps were also taken.
-
HIPER: On systems with redundant service processors installed and
enabled, the firmware was enhanced so that if a failure occurs during a
service processor failover, the firmware will attempt to reset/reload one
of the service processors. This may allow the system to recover and stay
up instead of crashing.
-
HIPER: On systems with redundant service processors installed and
enabled, a problem was fixed that caused the system to crash if a service
processor failover occurred when the VPD files were being synchronized.
-
The firmware was enhanced to improve the system memory error recovery.
-
A problem was fixed that caused the /tmp directory on the service processor
to fill up, which resulted in an out-of-memory condition. When this problem
occurred, the service processor usually performed a reset/reload. This
is one possible cause of SRC B1817201 being logged.
-
A problem was fixed that caused panel function 02 to fail when trying to
set the "next IPL speed" or "next IPL side".
-
The firmware was enhanced so that serial port S1 is not automatically designated
the local console, even if the console is not selected within 60 seconds
of the system first being booted. This enhancement allows the console to be
selected again, if no selection was made on the previous boot, instead
of defaulting to the S1 port.
|
EM320_061_031
Mfg Only
05/09/08 |
Impact: Serviceability Severity:
HIPER
-
HIPER: A problem was fixed that caused a concurrent firmware installation
to hang with SRC BA00E840 being logged. This problem may also cause a partition
migration to hang, under certain circumstances, with the same SRC, BA00E840,
being logged. This SRC will be logged when this level of firmware is installed
and will generate a call home; it should be ignored. It will not be logged
during subsequent installations.
|
EM320_059_031
Mfg Only
05/06/08 |
Impact: Function Severity:
Special Attention
New features and functions:
-
Support for the concurrent addition of a node (drawer) was added.
-
Support for the "cold" repair of a node (repair with power off while other
nodes are running) was added.
-
Support for IPv6 was added. For more information, see Section 2.1 Cautions,
paragraph Concurrent Maintenance.
-
Support for logical volumes bigger than 2 TB was added.
-
Virtual switch support for virtual Ethernet devices was added. This requires
HMC V7 R3.3.0.0 with efix MH01102 to be running on the HMC.
Fixes that affect all systems:
-
HIPER: A problem was fixed that caused capacity-on-demand (COD)
data to be retrieved in an unreadable format from the Anchor (VPD) card.
-
HIPER: A problem was fixed that caused enhanced error handling (EEH)
to fail on certain I/O adapters.
-
HIPER: A problem was fixed that might cause the system to terminate
while IPLing partitions soon after a system boot. This problem might also
have been seen if the partitions were set to "autostart". This failure
is typically seen on systems with a large amount of memory; SRC B181D138
is usually logged when this error occurs.
-
DEFERRED: A problem was fixed that caused the system to appear to
hang with C10090B8 in the control (operator) panel during a slow mode boot.
-
A problem was fixed that prevented the processor clock from being deconfigured
with the fabric bus after a hardware error.
-
A problem was fixed that caused the L2 deconfiguration option to be displayed
on the advanced system management interface (ASMI) menus on systems on which it
is not supported.
-
A problem was fixed that caused the GX adapter slot reservation option
to be displayed on the advanced system management interface (ASMI) menus
on systems on which it is not supported.
-
A problem was fixed that caused the wrong slot location to be provided in
the message displayed when no slot reservations were available for adding
the next Feature Code 1800 or 1802 adapter.
-
A problem was fixed that caused the location code reported with enhanced
error handling (EEH) errors on certain imbedded slots to have a -Cx suffix
instead of the correct -T# suffix for the underlying adapter. This also
impacted the HMC's System Planning tool.
-
A problem was fixed that caused the Linux boot loader to lose its command
line parameters (and fail to boot a Linux partition) during a reconfiguration
reboot.
-
A problem was fixed that caused the "iSCSI" and "network1" aliases to be
created incorrectly in the SMS menus; this might have prevented the system
or partition from booting from that device.
-
A problem was fixed that caused this informational message to be erroneously
sent to the operating system console:
subq[5][0] destination address is 0!!!
Check whether the subq is needed. If it is, allocate MEM.
-
A problem was fixed that caused the AIX command lsvpd to hang if it was
executed during a partition migration.
-
A problem was fixed that caused the system or partition to hang at the
"Welcome to AIX" banner, following an iSCSI boot, during the installation
of AIX.
-
A problem was fixed that caused an iSCSI login to fail under certain circumstances.
When this failure occurred, the message sent to the console looked something
like this:
iscsiFailed to LOGIN to target, rc = 1
failed to login.
could not open target 0x9034751 :system04 for r/w, aborting...
tcpOPEN: iscsi open failed
!BA012010 !
-
A problem was fixed that caused the location codes of devices attached
to the integrated USB ports to have a duplicate port suffix. For example,
when this problem occurred, the location code of the device was shown as:
/usb-scsi@1 U789D.001.DQDGARW-P1-T2-T2-L1
instead of the correct location code, which is
/usb-scsi@1 U789D.001.DQDGARW-P1-T2-L1
-
Two translation issues were fixed. The first one caused the string "No
alias" to always be displayed on the iSCSI menus in SMS in English even
though it should have been translated into the other languages that the
SMS menus support. The second one caused the NIC (network interface card)
parameters such as the client IP address in the SMS ping menu to be displayed
with message strings in English; these should have been translated as well.
-
A problem was fixed that caused the SMS menus to drop into the open firmware
prompt with the message "DEFAULT CATCH!" when the ping test failed.
-
A problem was fixed that prevented the operating system from setting the
boot device list in NVRAM.
-
A problem was fixed that caused approximately 20-25 occurrences of informational
SRC B7005300 to be logged during every IPL, which was filling up the error
logs.
-
A problem was fixed that prevented the "100 Mbps/full duplex" setting for
the HEA 1 Gbps ports from being implemented from the HMC. When this occurred,
there was no error message on the HMC, but the setting never took effect.
-
A problem was fixed that caused the MAC addresses displayed on the HMC,
in the HEA logical port information for the second port group, to show
invalid addresses.
-
A problem was fixed that caused a service processor dump to be generated
with SRC B181EF88 when the advanced system management interface (ASMI)
client was closed abruptly, or a network failure disconnected the client
and the ASMI.
-
Enhancements were made to improve the field replaceable unit (FRU) isolation
for phase-locked loop (PLL) clock failures on multi-CEC drawer systems.
SRCs B114F6D2, B114F6C1, B113F6C1, B157F12E, B18187EF, and B158E500 were
typically seen with this type of failure.
-
Enhancements were made to the error analysis firmware to provide better
FRU callouts for certain types of processor fabric bus failures. SRCs B114E504,
B114B2DF, and B181B10B were typically seen with this type of failure.
-
Enhancements were made to the firmware to improve the reliability of memory
DIMMs.
-
A change was made to the firmware such that predictive SRCs B18138B0, B1813862,
or B1813882 are now logged as informational.
System firmware changes that affect certain model MMA systems:
-
On systems using the EnergyScale(TM) technology,
enhancements were made to include status, log, and error information about
the Power Save mode in the service processor error logs.
-
On systems with redundant service processors enabled, a problem was fixed
that caused the "restore factory configuration" function on the Advanced
System Management Interface (ASMI) to fail.
-
On systems with 7314-G30 drawers attached, a problem was fixed that caused
the InfiniBand I/O device to drop packets, which resulted in an unrecoverable
error.
-
On systems with 7314-G30 drawers attached, a problem was fixed that caused
the drawer to fail when performing concurrent maintenance on the associated
InfiniBand loop.
-
On systems with 7314-G30 drawers attached, a problem was fixed that caused
the partition to become unresponsive when an InfiniBand cable in a redundantly-cabled
loop was disconnected.
Note: The last two defects in this section corrected the issues detailed
in the section titled Signal Cable in an InfiniBand loop, and InfiniBand
I/O drawer power on/off in earlier levels of the firmware description file. |
EM320_046_031
06/09/08 |
Impact: Serviceability Severity:
HIPER
Fixes that affect all model MMA systems:
-
HIPER: A problem was fixed that caused a concurrent firmware installation
to hang with SRC BA00E840 being logged. This problem may also cause a partition
migration to hang, under certain circumstances, with the same SRC, BA00E840,
being logged. This SRC will be logged when this level of firmware is installed
and will generate a call home; it should be ignored. It will not be logged
during subsequent installations.
-
HIPER: On systems with redundant service processors installed and
enabled, a problem was fixed that caused the system to crash if a service
processor failover occurred when the VPD files were being synchronized.
-
HIPER: On systems with redundant service processors installed and
enabled, the firmware was enhanced so that if a failure occurs during a
service processor failover, the firmware will attempt to reset/reload one
of the service processors. This may allow the system to recover and stay
up instead of crashing.
-
HIPER: A problem was fixed that caused the firmware to erroneously
log VPD errors against the processors. This prevented drawers from powering
on.
|
EM320_040_031
03/03/08 |
Impact: Serviceability Severity: Special
Attention
Fixes that affect all model MMA systems:
-
DEFERRED: A problem was fixed that caused a system crash (with SRC
B131E504) by changing the initialization settings of the I/O control hardware.
-
A problem was fixed that could cause the hypervisor to hang after a reset/reload
of the service processor.
-
A problem was fixed that, under certain circumstances, caused the InfiniBand
adapter to stop responding to InfiniBand requests.
-
A problem was fixed that caused SRC B1813014 to be logged after a successful
system firmware installation. This SRC will be logged when this level of
firmware is installed and will generate a call home; it should be ignored.
It will not be logged during subsequent installations.
-
The FRU list was changed so that clock card failures in a multi-drawer
system will be easier to debug and require fewer parts to fix.
-
A problem was fixed that caused the service processor to get stuck in a
reset/reload loop, which prevented the system from booting to standby.
System firmware changes that affect certain model MMA systems:
-
On systems with redundant service processors enabled, a problem was fixed
that could cause a significant increase in system boot time.
-
On systems with two service processors installed and with redundancy disabled,
a problem was fixed that caused the secondary service processor to go into
the dump state, and remain in the dump state, after a platform dump.
-
On systems with redundant service processors, SRCs B1813833 and B1813834,
which were being logged intermittently after a side-switch IPL, were changed
to informational.
-
On systems with a 1519-100 tower attached, a problem was fixed that caused
the location code of a connector on the integrated virtual IOP to be displayed
as Un-SE1-SE1-T1 instead of Un-SE1-T1.
-
On systems with 7314-G30 I/O drawers attached in certain cabling configurations,
a problem was fixed that prevented the I/O port labels from being displayed
for the port location codes on the hardware topology screens.
|
EM320_031_031
12/03/07 |
Impact: Function Severity: Attention
New Features and Functions:
-
Support for redundant service processors with failover on model MMA systems.
-
Support for the concurrent addition of a RIO/HSL adapter on model MMA systems.
-
Support for the concurrent replacement of a RIO/HSL adapter on model MMA
systems.
-
Support for the "hyperboot" boot speed option in the power on/off menu
on the Advanced System Management Interface (ASMI).
-
Support for the creation of multiple virtual shared processor pools (VSPPs)
within the one physical pool. (In order for AIX performance tools to report
the correct information on systems configured with multiple shared processor
pools, a minimum of AIX 5.3 TL07 or AIX 6.1 must be running.)
-
Support for the capability to move a running AIX or Linux partition from
one system to another compatible system with a minimum of disruption.
-
Support for the collection of extended I/O device information (independent
of the presence of an operating system) when a system is first connected
to an HMC and is still in the manufacturing default state.
-
Improved VPD collection time on model MMA systems.
-
Support for the migration of DDR2 memory DIMMs during the MES upgrade from
a 9117-570 server to a 9117-MMA server, if processor card F/C 5621 is
ordered when the initial system upgrade MES order is placed.
-
Support for EnergyScale(TM) and Active Energy Manager(TM).
For more information on the energy management features now available, please
see the EnergyScale(TM) white paper.
|
EM310 |
EM310_074_048
11/10/2008 |
Impact: Serviceability Severity: HIPER
System firmware changes that affect all systems:
-
DEFERRED and HIPER: The system initialization settings were changed
to reduce the likelihood of a system crash under certain circumstances.
-
HIPER: A problem was fixed that caused a system to fail to reboot
after a B1xxE504 SRC was logged, due to a processor interconnection bus
failure. The same SRC, B1xxE504, was logged when the reboot failed.
-
A problem was fixed that, under certain rarely occurring circumstances,
caused the system to crash if an L2 or L3 cache failure was not discovered
and repaired when it initially occurred.
-
The firmware was enhanced so that the contents of /tmp are included when
a service processor dump is taken.
-
A problem was fixed that, in certain configurations, caused the removal
of a host Ethernet adapter (HEA) port using a dynamic LPAR (DLPAR) operation
to fail.
-
A problem was fixed that, under certain rare circumstances, caused the
hypervisor to crash when it was booting, with SRC B6000103 being logged.
-
A problem was fixed that, under certain circumstances, prevented the operating
system from recovering a PCI-E adapter on which a temporary enhanced error
handling (EEH) error occurred.
-
A problem was fixed that prevented service processor and hypervisor error
log entries from being reported to the operating system after a successful
partition migration. This problem only affected the partition that was
migrated.
-
The firmware was enhanced so that a call home will be made if the hypervisor
issues a "terminate immediate" interrupt.
System firmware changes that affect certain systems:
-
In systems with clustered processors, various problems were fixed in the
InfiniBand interconnection networks.
-
On systems with a host Ethernet adapter (HEA) or host channel adapter (HCA)
assigned to a Linux partition, a problem was fixed that prevented the partition
from booting if 512 GB, 1 TB, or 1.5 TB of memory was assigned to the partition.
When this problem occurred, SRC B700F105 was logged.
|
EM310_071_048
07/30/2008 |
Impact: Serviceability Severity: HIPER
System firmware changes that affect all systems:
-
DEFERRED and HIPER: The processor initialization settings were changed
to reduce the likelihood of a processor going into an error state and causing
a checkstop or system crash.
-
HIPER: A problem was fixed that caused large numbers of enhanced
error handling (EEH) errors to be logged against the 4-port gigabit Ethernet
adapter, F/C 5740, under certain circumstances.
-
DEFERRED: A problem was fixed that caused informational SRCs B181B964
and B150D134 to be logged multiple times, and fill the service processor
error log, during normal operation of the system.
-
DEFERRED: The firmware was enhanced so that if an L3 cache controller
gets deconfigured at runtime, the associated processor cores will also
be deconfigured. This prevents the system from going into an error state
and causing a checkstop or system crash.
-
A problem was fixed that caused the /tmp directory on the service processor
to fill up, which resulted in an out-of-memory condition. When this problem
occurred, the service processor usually performed a reset/reload. This
is one possible cause of SRC B1817201 being logged.
-
Enhancements were made to improve the field replaceable unit (FRU) isolation
for phase-locked loop (PLL) clock failures on multi-CEC drawer systems.
SRCs B114F6D2, B114F6C1, B113F6C1, B157F12E, B18187EF, and B158E500 were
typically seen with this type of failure.
-
A problem was fixed that caused SRC B1813014 to be erroneously generated
when a new level of system firmware was installed on the managed system.
-
A problem was fixed that caused SRC B7006971 to be erroneously generated
during dynamic LPAR (DLPAR) operations on memory.
-
A problem was fixed that caused an "HTML viewer error", followed by the
message "Cannot complete service action for reference code 'xxxxyyyy' "
to occur in Service Focal Point on the HMC when trying to perform the service
actions for certain SRCs.
-
A problem was fixed in partition firmware that could cause a partition
running AIX to crash under certain circumstances.
System firmware changes that affect certain systems:
-
On a partition running Linux, a problem was fixed that might cause the
hypervisor to erroneously deconfigure a processor core.
-
On partitions with a large number of hard disks attached to fibre channel
adapters, a problem was fixed that might cause SRC BA210003 to be erroneously
generated when the partition is booting. The partition might or might not
boot when this error occurs.
-
On systems with 7314-G30 drawers attached, a problem was fixed that caused
the port labels to be missing on the hardware topology screens with certain
cable configurations.
-
On systems with 7314-G30 drawers attached, a problem was fixed that caused
the partition to become unresponsive when an InfiniBand cable in a redundantly-cabled
loop was disconnected.
-
On systems with 7314-G30 drawers attached, a problem was fixed that might
have caused some I/O slots in the drawers not to be configured when the
system was booted.
Note: The last two defects in this section corrected the issues detailed
in the section titled Signal Cable in an InfiniBand loop, and InfiniBand
I/O drawer power on/off in earlier levels of the firmware description file. |
EM310_069_048
02/11/2008 |
Impact: Availability Severity: HIPER
Fixes that affect all model MMA systems:
-
HIPER: A problem was fixed that caused some functions that perform
hardware operations during runtime to generate temporary enhanced error
handling (EEH) errors.
-
DEFERRED: A problem was fixed that caused a system crash (with SRC
B131E504) by changing the initialization settings of the I/O control hardware.
Note: This fix is not in the EM320_031_031 level listed above; it is included
in the EM320_040_031 level.
-
A problem was fixed that prevented a system from recovering after SRC B1xxB9xx
was logged.
-
A problem was fixed that caused a firmware installation to fail with SRC
B1813028.
-
A problem was fixed that caused SRC B1818A10 to be erroneously logged during
a disruptive firmware installation.
-
A problem was fixed that, under certain circumstances, caused the buttons
on the control (operator) panel to be inoperative.
-
A problem was fixed that prevented the system planning tool from deploying
a sysplan with certain HEA MCS values.
-
A problem was fixed that caused SRC B1813108 to be erroneously logged during
system boot.
-
A problem was fixed that, under certain circumstances, caused the InfiniBand
adapter to stop responding to InfiniBand requests.
-
A problem was fixed that caused the error "MSGVIOSE0300E002-0154 There
is insufficient memory available for firmware" to be logged on the HMC.
System firmware changes that affect certain model MMA systems:
-
On model MMA systems with multiple drawers, a problem was fixed that prevented
the pin-hole reset switch on the control (operator) panel from resetting
the system.
-
On model MMA systems with an uninterruptible power supply (UPS) attached,
a problem was fixed that prevented the UPS from notifying the operating
system that a utility failure or low battery condition had occurred.
-
On systems with 3 or more licensed processors and 2 or more unlicensed
processors, a problem was fixed that caused the system boot to be slower
than normal, or to hang with SRC C700406E.
-
On model MMA systems with 7314-G30 I/O expansion drawers attached, problems
were fixed that caused the wrong FRUs to be called out with SRC B70069ED,
and caused the hypervisor to loop if certain invalid cabling configurations
were encountered.
-
On model MMA systems with a large number of I/O towers attached, a problem
was fixed that caused the HMC to go to the incomplete state when an additional
tower was added to a loop.
|
EM310_063_048
11/19/07 |
Impact: Availability Severity:
HIPER
-
HIPER: A problem was fixed that caused a time-out in a hardware
device driver. When this time-out occurs, both SRC B181B920 and SRC B181D147
are logged. Other SRCs may be present including, but not limited to, B1xxB9xx,
B1xxE504, and B150D141. Occasionally the system crashes. If B181B920 and
B181D147 SRCs are logged, check for any resources that were deconfigured
at the time of these errors and reconfigure them using the ASMI menus.
No hardware should be replaced. To recover from this error condition, the
service processor must be reset by removing, then reapplying, the managed
system's power.
-
DEFERRED: On multi-drawer model MMA systems, a problem found in
testing was fixed that, when the L3 cache was disabled, might under very
rare circumstances have caused data in the cache to be overwritten and the
system to crash. Although the exposure to this issue is very low,
and there have been no reported problems from the field, the system impact
if this occurred would be high. Product Engineering recommends that you
schedule time to install this deferred fix at your earliest convenience.
|
EM310_057_048
9/14/07 |
Impact: Availability Severity:
HIPER
Additional features and functions:
-
Added support for 9406-MMA.
System firmware changes that affect all 9117-MMA systems:
-
HIPER: A problem was fixed that caused the system to crash with
SRC B170E450.
-
HIPER: A problem was fixed that, in rare circumstances, could cause
the system to hang due to the improper handling of certain exceptions.
-
HIPER: A problem was fixed that prevented the operating system from
being notified of certain EPOW conditions that could lead to the system
or partition being shut down, with the possible loss of data. These EPOW
conditions included the ambient temperature being too high, the loss of
utility power (with or without UPS backup), and a user-initiated power
off using the white power button or the HMC.
-
A problem was fixed that could cause a firmware installation from the HMC
to fail with SRC E302F85C on the HMC, and SRC B1813088, B1818A0F, or B1813011
logged in the service processor error log.
-
A change was made so that if a failure occurs during a memory-preserving
reboot, the system continues to reboot rather than remaining in the termination
(powered off) state.
-
A problem was fixed that caused EEH (enhanced error handling) errors to
be erroneously logged against certain I/O adapters.
-
A problem was fixed that prevented "linked" resources that had been guarded
out from being reconfigured during the next reboot after a service action
on one of the guarded parts.
-
A problem was fixed that, after the backplane was replaced in a 7314-G30
I/O drawer, prevented the partition that owned the drawer from seeing those
resources.
-
A problem was fixed that caused the serial connection to a partition to
be lost. When this occurred, SRCs B181D307, B200E0AA, and/or B200813A were
generated by the service processor and the hypervisor.
-
A problem was fixed in partition firmware that, in some circumstances,
prevented a CD-ROM or tape device from being in the default service mode
boot list, even if one was present in the system.
-
A problem was fixed that caused the HMC to go to the incomplete state,
and SRC B182953C to be logged in the service processor error log every
five minutes or so, when the managed system was booted.
-
A problem was fixed that caused the system to intermittently fail to configure
devices attached to the integrated USB port when booting.
-
A problem was fixed that might have caused erroneous callouts if a problem
was found with certain levels of memory controller chips.
-
A problem was fixed that caused the system to call home and reboot instead
of allowing the failing part (a memory controller or DIMM) to be deconfigured
by PRD (processor runtime diagnostics).
Additional information concerning this service pack:
In addition to the fixes described above, this service pack also contains
a fix for a low probability problem and content intended for newly-manufactured
systems, or enhancements to system internal interfaces, which is not required
for systems already in production use. This content will not be activated
on systems that install this service pack concurrently. Even though this
content is not required for systems which are already installed and in
use, a disruptive installation of this service pack or a re-IPL after installing
it will cause this content to become active. It is not necessary to plan
a window to re-IPL the system to activate this content. |
EM310_048_048
6/22/07 |
Impact: New Severity:
New
|