AM760
For Impact, Severity and other Firmware definitions, please
refer to the 'Glossary of firmware terms' URL below:
http://www14.software.ibm.com/webapp/set2/sas/f/power5cm/home.html#termdefs
The complete Firmware Fix History for this Release Level can be
reviewed at the following URL:
http://download.boulder.ibm.com/ibmdl/pub/software/server/firmware/AM-IOCp-Firmware-Hist.html
|
AM760_089_034 / FW760.51
04/16/15 |
Systems
8412-EAD, 9117-MMD and 9179-MHD
Impact: Security
Severity: HIPER
System firmware changes that affect all systems
- On systems using Virtual Shared Processor Pools (VSPP), a
problem was fixed for an inaccurate pool idle count over a small
sampling period.
- A problem was corrected for a defect in an earlier service pack
(AM760_087) that potentially caused an undetected corruption of
firmware when the fix was concurrently activated. If the earlier
service pack (AM760_087) was concurrently installed, a platform IPL will
mitigate potential future exposure to the problem.
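The level identifiers used in this history, such as AM760_089_034, can be compared programmatically when checking whether a given service pack (for example AM760_087) was ever installed. The following is a minimal sketch only. It assumes the usual IBM naming convention in which the first field is the release, the second is the service pack level, and the third is the last disruptive service pack level; this document does not itself define those fields. The names FirmwareLevel and parse_level are hypothetical.

```python
import re
from typing import NamedTuple


class FirmwareLevel(NamedTuple):
    """Hypothetical container for one firmware level identifier."""
    release: str          # e.g. "AM760"
    service_pack: int     # e.g. 89 (assumed meaning of the second field)
    last_disruptive: int  # e.g. 34 (assumed meaning of the third field)


# Two letters and three digits, then two three-digit fields.
_LEVEL_RE = re.compile(r"^([A-Z]{2}\d{3})_(\d{3})_(\d{3})$")


def parse_level(text: str) -> FirmwareLevel:
    """Split an identifier such as 'AM760_089_034' into its three fields."""
    match = _LEVEL_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized firmware level: {text!r}")
    release, service_pack, last_disruptive = match.groups()
    return FirmwareLevel(release, int(service_pack), int(last_disruptive))


def includes_service_pack(history, release: str, service_pack: int) -> bool:
    """Return True if any level in an install history matches the given pack."""
    return any(
        level.release == release and level.service_pack == service_pack
        for level in map(parse_level, history)
    )
```

For example, includes_service_pack(["AM760_078_034", "AM760_087_034"], "AM760", 87) reports whether the affected service pack appears in a list of installed levels. It cannot tell whether that installation was concurrent, which still has to be confirmed from the update records.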
System firmware changes that affect certain systems
- On systems with redundant service processors and unlicensed cores, a
firmware update problem was fixed to prevent SRC B170B838 errors on
unlicensed cores after an administrative failover (AFO) to the backup
service processor.
|
AM760_087_034 / FW760.50
01/12/15 |
Systems
8412-EAD, 9117-MMD and 9179-MHD
Impact: Security
Severity: HIPER
New features and functions
- System recovery for interrupted AC power and Voltage
Regulator Module (VRM) failures has been enhanced for systems with
multiple CEC enclosures such that an AC power or VRM fault on one CEC
drawer will no longer block the other CEC drawers from powering
on. Previously, all CEC enclosures in a system needed valid AC
power before the power on of the system could proceed.
This system recovery feature does not pertain to the IBM Power ESE
(8412-EAD) systems because they are single CEC enclosure systems.
- Support was added for using the Mellanox ConnectX-3 Pro
10/40/56 GbE (Gigabit Ethernet) adapter as a network install device.
System firmware changes that affect all systems
- A problem was fixed that caused an intermittent loss of TTY
serial port access to the Advanced System Management Interface (ASMI)
after a power off of the system.
- A problem was fixed that prevented guard error logs from
being reported for FRUs that were guarded during the system power
on. This could happen if the same FRU had been previously
reported as guarded on a different power on of the system. The
requirement is now met that guarded FRUs are logged on every power on
of the system.
- Help text for the Advanced System Management Interface
(ASMI) "System Configuration/Power Management/Power Supply Idle
Control" menu option was enhanced to clarify that an idle power supply
is in a low power state and not powered off. The new help text
states "Power supply idle mode helps to reduce overall power
usage when the system load is very light by having one power supply
deliver all the power while the second is in a low power state".
- A problem was fixed where a 12V DC power-good (pGood) input
fault was reported as an SRC 11002620 with the wrong FRU callout of
Un-P1 for system backplane. The FRU callout for SRC 11002620 has
been corrected to Un-P2 for I/O card.
- A problem was fixed to prevent a recoverable processor
clock error from falsely calling out processor chip FRUs with an SRC
B181E550 error log. Only the predictive error SRC B158CC62 for
the oscillator chip should have been reported.
- A problem was fixed that caused a "code accept" during a
concurrent firmware installation from the management console to fail
with SRC E302F85C.
- A problem was fixed for memory relocation failing during a
partition reboot with SRC B700F103 logged. The memory relocation
could be part of the processing for the Dynamic Platform Optimizer
(DPO), Active Memory Sharing (AMS) between partitions, mirrored memory
defragmentation, or a concurrent FRU repair.
- A problem was fixed that caused the date and time to be
incorrect in AIX if a partition is remotely restarted on a different
system from the one on which it was hibernated.
- A problem was fixed that caused the Utility COD display of
historical usage data to be truncated on the management console.
- A problem was fixed for I/O adapters so that BA400002
errors were changed to informational for memory boundary adjustments
made to the size of DMA map-in requests. These DMA size
adjustments were marked as UE previously for a condition that is normal.
- A security problem was fixed in the service processor
Lighttpd web server that allowed denial of service attacks against
the Advanced System Management Interface (ASMI). The Common
Vulnerabilities and Exposures issue numbers for this problem are
CVE-2011-4362 and CVE-2012-5533.
- A security problem was fixed for the Lighttpd web
server that allowed arbitrary SQL commands to be run on the service
processor of the CEC. The Common Vulnerabilities and Exposures
issue number is CVE-2014-2323.
- A security problem was fixed for the Lighttpd web server
where improperly-structured URLs could be used to view arbitrary files
on the service processor of the CEC. The Common Vulnerabilities
and Exposures issue number is CVE-2014-2324.
- A security problem was fixed for the Network Time Protocol
(NTP) client that allowed remote attackers to execute arbitrary code
via a crafted packet containing an extension field. The Common
Vulnerabilities and Exposures issue number is CVE-2009-1252.
- A security problem was fixed for the Network Time Protocol
(NTP) client for a buffer overflow that allowed remote NTP servers to
execute arbitrary code via a crafted response. The Common
Vulnerabilities and Exposures issue number is CVE-2009-0159.
- A power supply fan speed problem was fixed that slowed the
power supply fans down to a very low level for a minute about once
every hour, with possible thermal shutdown of the power supply.
- A problem was fixed for a Live Partition Mobility (LPM)
suspend and transfer of a partition that caused the time of day to skip
ahead to an incorrect value on the target system. The problem
only occurred when a suspended partition was migrated to a target CEC
that had a hypervisor time that was later than the source CEC.
- A security problem was fixed in the OpenSSL (Secure Socket
Layer) protocol that allowed a man-in-the-middle attacker, via a
specially crafted fragmented handshake packet, to force a TLS/SSL
server to use TLS 1.0, even if both the client and server supported
newer protocol versions. The Common Vulnerabilities and Exposures issue
number for this problem is CVE-2014-3511.
- A security problem was fixed in OpenSSL for formatting
fields of security certificates without null-terminating the output
strings. This could be used to disclose portions of the program
memory on the service processor. The Common Vulnerabilities and
Exposures issue number for this problem is CVE-2014-3508.
- Multiple security problems were fixed in the way that
OpenSSL handled Datagram Transport Layer Security (DTLS) packets.
A specially crafted DTLS handshake packet could cause the service
processor to reset. The Common Vulnerabilities and Exposures
issue numbers for these problems are CVE-2014-3505, CVE-2014-3506 and
CVE-2014-3507.
- A security problem was fixed in OpenSSL to prevent a denial
of service when handling certain Datagram Transport Layer Security
(DTLS) ServerHello requests. A specially crafted DTLS handshake
packet with an included Supported EC Point Format extension could cause
the service processor to reset. The Common Vulnerabilities and
Exposures issue number for this problem is CVE-2014-3509.
- A security problem was fixed in OpenSSL to prevent a denial
of service by using an exploit of a null pointer de-reference during
anonymous Diffie Hellman (DH) key exchange. A specially crafted
handshake packet could cause the service processor to reset. The
Common Vulnerabilities and Exposures issue number for this problem is
CVE-2014-3510.
- A security problem in GNU Bash was fixed to prevent
arbitrary commands hidden in environment variables from being run
during the start of a Bash shell. Although GNU Bash is not
actively used on the service processor, it does exist in a library so
it has been fixed. This is IBM Product Security Incident Response
Team (PSIRT) issue #2211. The Common Vulnerabilities and
Exposures issue numbers for this problem are CVE-2014-6271,
CVE-2014-7169, CVE-2014-7186, and CVE-2014-7187.
- A security problem was fixed in the Advanced System
Management Interface (ASMI) to block click-jacking attempts. This
prevents framing of the original ASMI page with a top layer on it with
dummy buttons that could trick the user into clicking on a link.
- A problem was fixed for the Advanced System Management
Interface (ASMI) that allowed possible cross-site request forgery
(CSRF) exploitation of the ASMI user session to do unwanted tasks on
the service processor.
- A security problem was fixed in OpenSSL for memory leaks
that allowed remote attackers to cause a denial of service (out of
memory on the service processor). The Common Vulnerabilities and
Exposures issue numbers are CVE-2014-3513 and CVE-2014-3567.
- A problem was fixed to prevent a hypervisor to service
processor surveillance heartbeat time-out error and host-initiated
reset/reload of the service processor. This problem was caused by
an errant long delay in writing an error log entry, resulting in a
block of the heartbeat message and subsequent time-out.
- A security problem was fixed in OpenSSL for padding-oracle
attacks known as Padding Oracle On Downgraded Legacy Encryption
(POODLE). This attack allows a man-in-the-middle attacker to
obtain a plain text version of the encrypted session data. The Common
Vulnerabilities and Exposures issue number is CVE-2014-3566. The
service processor POODLE fix is based on a selective disablement of
SSLv3 using the Advanced System Management Interface (ASMI) "System
Configuration/Security Configuration" menu options. The Security
Configuration options of "Disabled", "Default", and "Enabled" for SSLv3
determine the level of protection from POODLE. The management
console also requires a POODLE fix for APAR MB03867 (FIX FOR
CVE-2014-3566 FOR HMC V7 R7.7.0 SP4 with PTF MH01482) to eliminate all
vulnerability to POODLE and allow use of option 1 "Disabled" as shown
below:
-1) Disabled: This highest level of security protection does not
allow service processor clients to connect using SSLv3, thereby
eliminating any possibility of a POODLE attack. All clients must
be capable of using TLS to make the secured connections to the service
processor to use this option. This requires the management
console be at a minimum level of HMC V7 R7.7.0 SP4 with POODLE PTF
MH01482.
-2) Default: This medium level of security protection disables
SSLv3 for the web browser sessions to ASMI and for the CIM clients and
assures them of POODLE-free connections. But the legacy
management consoles are allowed to use SSLv3 to connect to the service
processor. This is intended to allow non-POODLE compliant HMC
levels to be able to connect to the CEC servers until they can be
planned and upgraded to the POODLE compliant HMC levels. Running
a non-POODLE compliant HMC to a service processor in "Default"
mode will prevent the ASMI-proxy sessions from the HMC from connecting
as these proxy sessions require SSLv3 support in ASMI.
-3) Enabled: This basic level of security protection enables
SSLv3 for all service processor client connections. It relies on
all clients being at POODLE fix compliant levels to provide full POODLE
protection using the TLS Fallback Signaling Cipher Suite Value
(TLS_FALLBACK_SCSV) to prevent fallback to vulnerable SSLv3
connections. This option is intended for customer sites on
protected internal networks that have a large investment in legacy
hardware that needs SSLv3 to make browser and HMC connections to the
service processor. The level of POODLE protection actually
achieved in "Enabled" mode is determined by the percentage of clients
that are at the POODLE fix compliant levels.
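The three SSLv3 settings described above can be summarized as a small decision table. The sketch below is illustrative only and is not the service processor implementation. The names Sslv3Mode, Client, and sslv3_allowed are hypothetical, and the client categories are simplified to the three mentioned in the text.

```python
from enum import Enum


class Sslv3Mode(Enum):
    """ASMI Security Configuration options for SSLv3, as named in the text."""
    DISABLED = "Disabled"
    DEFAULT = "Default"
    ENABLED = "Enabled"


class Client(Enum):
    """Simplified client categories (hypothetical grouping)."""
    ASMI_BROWSER = "web browser session to ASMI"
    CIM = "CIM client"
    MANAGEMENT_CONSOLE = "management console"


def sslv3_allowed(mode: Sslv3Mode, client: Client) -> bool:
    """Return True if the given client may connect using SSLv3 in this mode."""
    if mode is Sslv3Mode.DISABLED:
        # No client may use SSLv3; every client must be capable of TLS.
        return False
    if mode is Sslv3Mode.DEFAULT:
        # Browsers and CIM clients are restricted to TLS, but legacy
        # management consoles may still connect using SSLv3.
        return client is Client.MANAGEMENT_CONSOLE
    # ENABLED: SSLv3 is accepted from all clients; protection then depends
    # on the clients supporting TLS_FALLBACK_SCSV.
    return True
```

Evaluating sslv3_allowed for every mode and client pair gives a quick view of which legacy connections would be refused before a setting is changed.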
System firmware changes that affect certain systems
- HIPER/Pervasive:
On systems using PowerVM firmware, a performance problem was fixed that
may affect shared processor partitions where there is a mixture of
dedicated and shared processor partitions with virtual IO connections,
such as virtual ethernet or Virtual IO Server (VIOS) hosting, between
them. In high availability cluster environments this problem may
result in a split brain scenario.
- On systems with a F/C 5802 or 5877 I/O drawer installed, a
problem was fixed that occurred during Offline Converter Assembly (OCA)
replacement operations. The fix prevents a false Voltage
Regulator Module (VRM) fault and the logging of SRCs 10001511 or
10001521. The false fault resulted in the OCA LED
getting stuck in an on or "fault" state and the OCA not powering on.
- On systems with a redundant service processor with AC power
missing to the node containing the anchor card, a problem was fixed
that caused an IPL failure with SRC B181C062 when the anchor card could
not be found in the vital product data (VPD) for the system. With
the fix, the system is able to find the anchor card and IPL since the
anchor card gets its power from the service processor cable, not from
the node where it resides.
- On a system with partitions with redundant Virtual
Asynchronous Services Interface (VASI) streams, a problem was
fixed that caused the system to terminate with SRC B170E540. The
affected partitions include Active Memory Sharing (AMS), encapsulated
state partitions, and hibernation-capable partitions. The problem
is triggered when the management console attempts to change the active
VASI stream in a redundant configuration. This may occur due to a
stream reconfiguration caused by Live Partition Mobility (LPM);
reconfiguring from a redundant Paging Service Partition (PSP) to a
single-PSP configuration; or conversion of a partition from AMS to
dedicated memory.
- A problem was fixed for Live Partition Mobility (LPM)
migrations from Power7+ systems that use the nest accelerator (NX) for
compression and encryption usage that caused the migrated partition to
revert to software compression instead of using the NX hardware.
Some operating system negotiated functions may not operate correctly
and could impact performance.
- On systems that have Active Memory Sharing (AMS) partitions
and deduplication enabled, a problem was fixed for not being able to
resume a hibernated AMS partition. Previously, resuming a
hibernated AMS partition could give checksum errors with SRC B7000202
logged and the partition would remain in the hibernated state.
- On systems that have Active Memory Sharing (AMS)
partitions, a problem was fixed for Dynamic Logical Partitioning
(DLPAR) for a memory remove that leaves a logical memory block (LMB) in
an unusable state until partition reboot.
- On systems with a partition that has a 256MB Real Memory
Offset (RMO) region size that has been migrated from a Power8 system
to Power7 or Power6 using Live Partition Mobility (LPM), a
problem was fixed that caused a failure on the next boot of the
partition with a BA210000 log with a CA000091 checkpoint just prior to
the BA210000. The fix dynamically adjusts the memory footprint of
the partition to fit on the earlier Power systems.
- On systems using IPv6 addresses, the firmware was enhanced
to reduce the time it takes to install an operating system using the
Network Installation Manager (NIM).
- On systems in IPv6 networks, a problem was fixed for
a network boot/install failing with SRC B2004158 and IP address
resolution failing using neighbor solicitation to the partition
firmware client.
- On a system with a disk device with multiple boot
partitions, a problem was fixed that caused System Management Services
(SMS) to list only one boot partition. Even though only one boot
partition was listed in SMS, the AIX bootlist command could still be
used to boot from any boot partition.
- On systems with a F/C 5802 or 5877 I/O drawer installed, a
problem was fixed for a hypervisor hang at progress code C7004091
during the IPL or hangs during serviceability tasks to the I/O drawer.
- For systems with an IBM i load source disk attached to an
Emulex-based fibre channel adapter such as F/C #5735, a problem was
fixed that caused an IBM i load source boot to fail with SRC B2006110
logged and a message to the boot console of "SPLIT-MEM Out of
Room". This problem occurred for load source disks that needed
extra disk scans to be found, such as those attached to a port other
than the first port of a fibre channel adapter (first port requires
fewest disk scans).
- A problem was fixed for systems in networks using the
Juniper 1GbE and 10GbE switches (F/Cs #1108, #1145, and #1151) to
prevent network ping errors and boot from network (bootp)
failures. The Address Resolution Protocol (ARP) table information
on the Juniper aggregated switches is not being shared between the
switches and that causes problems for address resolution in certain
network configurations. Therefore, the CEC network stack code has
been enhanced to add three gratuitous ARPs (ARP replies sent without a
request received) before each ping and bootp request to ensure that all
the network switches have the latest network information for the system.
- On systems using Virtual Shared Processor Pools (VSPP), a
problem was fixed for an inaccurate pool idle count over a small
sampling period.
- A problem was fixed that could result in unpredictable
behavior if a memory UE is encountered while relocating the contents of
a logical memory block during one of these operations:
- Using concurrent maintenance to perform a hot repair of a node.
- Reducing the size of an Active Memory Sharing (AMS) pool.
- On systems using mirrored memory, using the memory mirroring
optimization tool.
- Performing a Dynamic Platform Optimizer (DPO) operation.
- On systems with redundant service processors, a
problem was fixed so that a backup memory clock failure with SRC
B120CC62 is handled without terminating the system running on the
primary memory clock.
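The Juniper switch fix in the list above relies on gratuitous ARPs, that is, ARP replies sent without a request, to refresh the address tables on every switch. As a rough illustration of what such a frame contains, the sketch below builds one in memory. It is not the CEC network stack code. The function name, the MAC address, and the IP address are hypothetical, and using the broadcast address as the target hardware address is one common convention rather than something this document specifies.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6
ETHERTYPE_ARP = 0x0806
ARP_OP_REPLY = 2
GRATUITOUS_ARP_COUNT = 3  # the text states three are sent before each ping or bootp request


def build_gratuitous_arp_reply(mac: bytes, ip: str) -> bytes:
    """Build an Ethernet frame carrying an unsolicited ARP reply for ip."""
    if len(mac) != 6:
        raise ValueError("MAC address must be exactly 6 bytes")
    ip_bytes = socket.inet_aton(ip)
    # Ethernet header: destination, source, EtherType.
    ethernet_header = struct.pack("!6s6sH", BROADCAST_MAC, mac, ETHERTYPE_ARP)
    # ARP payload: hardware type 1 (Ethernet), protocol type 0x0800 (IPv4),
    # address lengths 6 and 4, opcode 2 (reply). The sender and target
    # protocol addresses are the same, which is what makes it gratuitous.
    arp_payload = struct.pack(
        "!HHBBH6s4s6s4s",
        1, 0x0800, 6, 4, ARP_OP_REPLY,
        mac, ip_bytes,
        BROADCAST_MAC, ip_bytes,
    )
    return ethernet_header + arp_payload


# Hypothetical adapter address and documentation-range IP address.
frame = build_gratuitous_arp_reply(bytes.fromhex("02005e000001"), "192.0.2.10")
frames_to_send = [frame] * GRATUITOUS_ARP_COUNT
```

Actually transmitting the frames requires a raw socket and elevated privileges, so that step is left out here.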
Concurrent hot add/repair maintenance (CHARM) firmware fixes
- A problem was fixed for concurrent maintenance operations
to limit hardware retries on failed hardware so that it can be
concurrently repaired.
- A problem was fixed for concurrent maintenance to prevent a
hardware unavailable failure when doing consecutive concurrent remove
and add operations to an I/O Hub adapter for a drawer.
|
AM760_079_034 / FW760.41
06/24/14 |
Systems
8412-EAD, 9117-MMD and 9179-MHD
Impact: Security
Severity: HIPER
System firmware changes that affect all systems
- HIPER/Pervasive:
A security problem was fixed in the OpenSSL (Secure Socket Layer)
protocol that allowed clients and servers, via a specially crafted
handshake packet, to use weak keying material for communication.
A man-in-the-middle attacker could use this flaw to decrypt and modify
traffic between the management console and the service processor.
The Common Vulnerabilities and Exposures issue number for this problem
is CVE-2014-0224.
- HIPER/Pervasive:
A security problem was fixed in OpenSSL for a buffer overflow in the
Datagram Transport Layer Security (DTLS) when handling invalid DTLS
packet fragments. This could be used to execute arbitrary code on
the service processor. The Common Vulnerabilities and Exposures
issue number for this problem is CVE-2014-0195.
- HIPER/Pervasive:
Multiple security problems were fixed in the way that OpenSSL handled
read and write buffers when the SSL_MODE_RELEASE_BUFFERS mode was
enabled to prevent denial of service. These could cause the
service processor to reset or unexpectedly drop connections to the
management console when processing certain SSL commands. The
Common Vulnerabilities and Exposures issue numbers for these problems
are CVE-2010-5298 and CVE-2014-0198.
- HIPER/Pervasive:
A security problem was fixed in OpenSSL to prevent a denial of service
when handling certain Datagram Transport Layer Security (DTLS)
ServerHello requests. A specially crafted DTLS handshake packet could
cause the service processor to reset. The Common Vulnerabilities
and Exposures issue number for this problem is CVE-2014-0221.
- HIPER/Pervasive:
A security problem was fixed in OpenSSL to prevent a denial of service
by using an exploit of a null pointer de-reference during anonymous
Elliptic Curve Diffie Hellman (ECDH) key exchange. A specially
crafted handshake packet could cause the service processor to
reset. The Common Vulnerabilities and Exposures issue number for
this problem is CVE-2014-3470.
|
AM760_078_034 / FW760.40
01/22/14 |
Systems
8412-EAD, 9117-MMD and 9179-MHD
Impact: Availability
Severity: SPE
New features and functions
- Support was added in Advanced System Management Interface
(ASMI) to facilitate capture and reporting of debug data for system
performance problems. The "System Service Aids/Performance
Dump" menu was added to ASMI to perform this function.
- Support was added to the Advanced System Management
Interface (ASMI) to provide a menu for "Power Supply Idle Mode".
Using the "Power Supply Idle Mode" menu, the power supplies can
be either set enabled to save power by idling power supplies when
possible or set disabled to keep all power supplies fully on and allow
a balanced load to be maintained on the power distribution units (PDUs)
of the system. Power supply idle mode enabled helps to reduce
overall power usage when the system load is very light by having one
power supply deliver all the power while the second power supply is
maintained in a low power state.
- Support was dropped for Secured Socket Layer (SSL) protocol
version 2 and SSL weak and medium cipher suites in the service
processor web server (Lighttpd). Unsupported web browser
connections to the Advanced System Management Interface (ASMI) secured
port 443 (using https://) will now be rejected if those browsers do not
support SSL version 3. Supported web browsers for Power7 ASMI are
Netscape (version 9.0.0.4), Microsoft Internet Explorer (version 7.0),
Mozilla Firefox (version 2.0.0.11), and Opera (version 9.24).
- Support was added in Advanced System Management Interface
(ASMI) "System Configuration/Firmware Update Policy" menu to detect and
display the appropriate Firmware Update Policy (depending on whether
system is HMC managed) instead of requiring the user to select the
Firmware Update Policy. The menu also displays the "Minimum Code
Level Supported" value.
System firmware changes that affect all systems
- A problem was fixed
that caused a memory leak of 50 bytes of service processor memory for
every call home operation. This could potentially cause an out of
memory condition for the service processor when running over an
extended period of time without a reset.
- A problem was fixed that caused an L2 cache error to not
guard out the faulty processor, allowing the system to checkstop again
on an error to the same faulty processor.
- A problem was fixed that prevented a GX adapter from being
added to an empty slot for location code P1-C3 using a MES add when the
system was powered off. The P1-C3 location code was not provided
as candidate location for the GX add in the Service Focal Point on the
management console.
- A problem was fixed that caused an HMC code update of the FSP to
fail on the accept operation with SRC B1811402, or that left the FSP
unable to boot on the updated side.
- A problem was fixed that caused a system checkstop during
hypervisor time keeping services.
- A problem was fixed that caused a built-in self test (BIST)
for GX slots to create corrupt error log values that core dumped the
service processor with a B18187DA. The corruption was caused by a
failure to initialize the BIST array to 0 before starting the tests.
- The firmware was enhanced to display on the management
console the correct number of concurrent live partition mobility (LPM)
operations that are supported.
- A problem was fixed that caused a 1000911E platform event
log (PEL) to be marked as not call home. The PEL is now a call
home to allow for correction. This PEL is logged when the
hypervisor has changed the Machine Type Model Serial Number (MTMS) of
an external enclosure to UTMP.xxx.xxxx because it cannot read the vital
product data (VPD), or the VPD has invalid characters, or if the MTMS
is a duplicate to another enclosure.
- A problem was fixed that caused an SRC B7006A72 calling out
the adapter and the I/O Planar.
- A problem was fixed that caused the system attention LED
to be lit without a corresponding SRC and error log for the
event. This problem typically occurs when an operating system on
a partition terminates abnormally.
- A problem was fixed during resource dump processing that
caused a read of an invalid system memory address and an SRC
B181C141. The invalid memory reference resulted from the service
processor incorrectly referencing memory that had been relocated by the
hypervisor.
- DEFERRED: A
problem was fixed that caused a system checkstop with SRC B113E504 for
a recoverable hardware fault. This deferred fix addresses a
problem that has a very low probability of occurrence. As such,
customers may wait for the next planned service window to activate the
deferred fix via a system reboot.
- A problem was fixed that prevented an HMC-managed system
from being converted to manufacturing default configuration (MDC) mode
when the management console command "lpcfgop -m <server> -o
clear" failed to create the default partition. The management
console went to the incomplete state for this error.
- A problem was fixed that caused the slot index to be
missing for virtual slot number 0 for the dynamic reconfiguration
connector (DRC) name for virtual devices. This error was visible
from the management console when using commands such as "lshwres -r
virtualio --rsubtype slot -m machine" to show the hardware resources
for virtual devices.
- A problem during a dynamic logical partitioning (DLPAR)
memory operation was fixed that caused BA250020 SRCs to be logged
unnecessarily for the AIX partition. There were no memory errors
for the partition.
- A problem was fixed that caused frequent SRC B1A38B24 error
logs with a call home every 15 seconds when service processor network
interfaces were incorrectly configured on the same subnet. The
frequency of the notification of the network subnet error has been
reduced to once every 24 hours.
- Help text for the Advanced System Management Interface
(ASMI) "System Configuration/Hardware Deconfiguration/Clear All
Deconfiguration Errors" menu option was enhanced to clarify that when
selecting "Hardware Resources" value of "All hardware resources", the
service processor deconfiguration data is not cleared. The
"Service processor" must be explicitly selected for that to be cleared.
- A problem was fixed that caused a memory clock failure to
be called out as failure in the processor clock FRU.
- A problem was fixed that caused Capacity on Demand (COD) to
truncate On/Off "Resource Days Enabled" for users with extended amounts.
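The SRC B1A38B24 item in the list above concerns service processor network interfaces configured on the same subnet, with the notification now limited to once every 24 hours. The sketch below shows one way to check a planned configuration for that condition and to model the reduced notification rate. It is an illustration only, not the service processor logic. The function names and the addresses, which come from documentation ranges, are hypothetical.

```python
import ipaddress
from datetime import datetime, timedelta
from typing import Optional

NOTIFY_INTERVAL = timedelta(hours=24)  # interval stated in the fix description


def same_subnet(first: str, second: str) -> bool:
    """Return True if two 'address/prefix' strings fall in overlapping subnets."""
    net_a = ipaddress.ip_interface(first).network
    net_b = ipaddress.ip_interface(second).network
    # IPv4 and IPv6 networks never overlap, so compare versions first.
    return net_a.version == net_b.version and net_a.overlaps(net_b)


def should_notify(last_sent: Optional[datetime], now: datetime) -> bool:
    """Allow a notification if none was sent yet or the interval has elapsed."""
    return last_sent is None or now - last_sent >= NOTIFY_INTERVAL


# Hypothetical interface settings for two service processor ports.
conflict = same_subnet("192.0.2.10/24", "192.0.2.20/24")
no_conflict = same_subnet("192.0.2.10/24", "198.51.100.10/24")
```

Running same_subnet over each pair of planned interface addresses before applying them avoids triggering the error in the first place.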
System firmware changes that affect certain systems
- On systems running
Dynamic Platform Optimizer (DPO), a problem was fixed that
caused an incorrect placement of dedicated processors for partitions
larger than a single chip. When this occurs, the performance is
impacted over what would have been gained with proper placement.
- On systems with a redundant service processor, a problem
was fixed that caused fans to run at a high-speed after a failover to
the sibling service processor.
- On systems with a redundant service processor, a problem
was fixed that prevented the deconfiguration details of a guarded
sibling service processor from being shown in the Advanced
System Management Interface (ASMI).
- On systems with a F/C 5802 or 5877 I/O drawer installed,
the firmware was enhanced to guarantee that an SRC will be generated
when there is a power supply voltage fault. If no SRC is
generated, a loss of power redundancy may not be detected, which can
lead to a drawer crash if the other power supply goes down. This
also fixes a problem that causes an 8 GB Fiber channel adapter in
the drawer to fail if the 12V level fails in one Offline Converter
Assembly (OCA).
- On systems managed by an HMC with a F/C 5802 or 5877 I/O
drawer installed, a problem was fixed that caused the hardware topology
on the management console for the managed system to show "null" instead
of "operational" for the affected I/O drawers.
- On systems with a redundant service processor, a problem
was fixed that caused an SRC B150D15E to be erroneously logged after a
failover to the sibling service processor.
- On systems with a F/C 5802 or 5877 I/O drawer installed, a
problem was fixed where an Offline Converter Assembly (OCA) fault
would appear to persist after an OCA micro-reset or OCA
replacement. The fault bit reported to the OS may not be cleared,
indicating a fault still exists in the I/O drawer after it has been
repaired.
- On systems running Dynamic Platform Optimizer (DPO) with no
free memory, a problem was fixed that caused the management
console lsmemopt command to report the wrong status of completed with
no partitions affected. It should have indicated that DPO failed
due to insufficient free memory. DPO can only run when there is
free memory in the system.
- On systems with partitions using physical shared processor
pools, a problem was fixed that caused partition hangs if the shared
processor pool was reduced to a single processor.
- On systems involved in a series of consecutive logical
partition migration (LPM) operations, a memory leak problem was fixed
in the run time abstraction service (RTAS) that caused a partition run
time AIX crash with SRC 0c20. Other possible symptoms include
error logs with SRC BA330002 (RTAS memory allocation failure).
- A problem was fixed in the run-time abstraction services
(RTAS) extended error handling (EEH) for fundamental reset that caused
partitions to crash during adapter updates. The fundamental reset
of adapters now returns a valid return code. The adapter drivers
using fundamental reset affected by this fix are the following:
o QLogic PCIe Fibre Channel adapters (combo card)
o IBM PCIe Obsidian
o Emulex BE3-based ethernet adapters
o Broadcom-based PCIe2 4-port 1Gb ethernet
o Broadcom-based FlexSystem EN2024 4-port 1Gb ethernet for compute node
- On systems that have configurations that support all the
types of Capacity On Demand
(Perm/OnOff/Trial/Utility-Processor, Perm/OnOff/Trial-Memory), a problem
was fixed to eliminate repeated B7005300 error logs caused by
hypervisor asset protection processes using slightly more memory than
promised.
- On systems with a redundant service processor, a problem
was fixed where the service processor allowed a clock failover to occur
without an SRC B158CC62 error log and without a hardware deconfiguration
record for the failed clock source. This resulted in the system
running with only one clock source and without any alerts to warn that
clock redundancy had been lost.
- DEFERRED: On
systems with a redundant service processor, a problem was fixed that
caused a system termination with SRC B158CC62 during a clock failover
initiated by certain types of clock card failures. This deferred
fix addresses a problem that has a very low probability of
occurrence. As such, customers may wait for the next planned
service window to activate the deferred fix via a system reboot.
- On systems in manufacturing default configuration (MDC), a
problem was fixed that caused the system to change from MDC to Hardware
Management Console (HMC)-managed mode even though the HMC was unable to
authenticate to the service processor. A system must be
successfully discovered by an HMC as a prerequisite to becoming
HMC-managed.
- On systems running Dynamic Platform Optimizer (DPO) with
one or more unlicensed processors, a problem was fixed where the system
performance was significantly degraded during the DPO operation.
The performance degradation was greater on systems with larger
numbers of unlicensed processors.
- On systems with one memory clock deconfigured, a problem
was fixed where the system failed to IPL using the second memory clock
with SRCs B158CC62 and B181C041 logged.
Concurrent hot add/repair maintenance (CHARM) firmware fixes
- A problem was fixed that caused a concurrent hot add/repair
maintenance operation to fail on an erroneously logged error for the
service processor battery with SRCs B15A3303, B15A3305, and
B181EA35 reported.
- A problem was fixed that caused a concurrent processor
exchange to terminate during node deactivation with SRC B1814616.
- A problem was fixed that caused SRC B15A3303 to be
erroneously logged as a predictive error on the service processor
sibling after a successful concurrent repair maintenance operation for
the real-time clock (RTC) battery.
- A problem was fixed that caused Capacity on Demand (COD)
"Out of Compliance" messages during concurrent maintenance operations
when the system was actually in compliance for the licensed amount of
resources in use.
|
AM760_069_034 / FW760.31
07/25/13 |
Systems
8412-EAD, 9117-MMD and 9179-MHD
Impact: Performance
Severity: ATT
System firmware changes that affect certain systems
- On systems running
Dynamic Platform Optimizer (DPO), a problem was fixed that
caused an incorrect placement of dedicated processors for partitions
larger than a single chip. When this occurs, performance is
lower than it would have been with proper placement.
|
AM760_068_034 / FW760.30
06/24/13 |
Systems
8412-EAD, 9117-MMD and 9179-MHD
Impact: Availability
Severity: SPE
New features and functions
- Support for the 8412-EAD.
System firmware changes that affect all systems
- A problem was fixed
that caused a service processor dump to be generated with SRC B18187DA
"NETC_RECV_ER" logged.
- A problem was fixed that prevented 1100xxxx SRCs from being
sent to the partitions.
- A problem was fixed that was caused by an attempt to modify
a virtual adapter from the management console command line when the
command specifies it is an Ethernet adapter, but the virtual ID
specified is for an adapter type other than Ethernet. The managed
system has to be rebooted to restore communications with the management
console when this problem occurs; SRC B7000602 is also logged.
- The Hypervisor was enhanced to allow the system to continue
to boot using the redundant data chip on the anchor (VPD) card, instead
of stopping the Hypervisor boot and logging SRC B7004715, when
the primary data chip on the anchor card has been corrupted.
- A problem was fixed that caused a migrated partition to
have to be rebooted on the target system.
- A problem was fixed that caused a performance loss after a
configuration change, such as un-licensing a processor, because the
Hypervisor was unable to dispatch a partition to a shared processor.
- A problem was fixed that may cause inaccurate processor
utilization reporting.
- A problem was fixed that caused erroneous A70047xx SRCs to
be logged that called out the Anchor (VPD) card. This led
to unnecessary replacements of the Anchor card.
System firmware changes that affect certain systems
- When switching
between turbocore and maxcore mode, a problem was fixed that caused the
number of supported partitions to be reduced by 50%.
- On systems running Active Memory Sharing (AMS) partitions,
a problem was fixed that may arise due to the incorrect handling of a
return code in an error path during the logical partition migration
(LPM) of an AMS partition.
- On systems running Dynamic Platform Optimizer (DPO), a
problem was fixed that caused the current DPO score for a partition to
be incorrect. When this occurs, it appears that DPO would not
improve performance when in fact it would.
Concurrent hot add/repair maintenance (CHARM) firmware fixes
- On systems in which there are no processors in the shared
processor pool, a problem was fixed that caused the Hypervisor to
become unresponsive (the service processor starts logging time-out
errors against the Hypervisor, and the HMC can no longer talk to the
Hypervisor) during a concurrent hot add/repair maintenance
operation. SRC B182953C will also be called home.
|
AM760_062_034 / FW760.20
02/27/13 |
Systems
9117-MMD and 9179-MHD
Impact: Availability
Severity: SPE
New Features and Functions
- Enablement of concurrent hot add/repair maintenance on
9117-MMD and 9179-MHD systems.
System firmware changes that affect all systems
- A problem was fixed
that caused a card (and its children) that was removed after the system
was booted to continue to be listed in the guard menus in the Advanced
System Management Interface (ASMI).
- A problem was fixed that caused a firmware update to fail
with SRC B1818A0F.
- A problem was fixed that caused a partition to become
unresponsive when the AIX command "update_flash -s" is run.
- A problem was fixed that caused the service processor (or
system controller) to crash when it boots from the new level during a
concurrent firmware installation.
- A problem was fixed that can cause fans in the server to
run at maximum speed and generate a serviceable event during system
boot (B130B8AF, a predictive error with hardware callout) as a result
of an incorrect calibration of a particular thermal sensor.
- A problem was fixed that caused system fans to be
erroneously called out as failing with one or more of the following
SRCs: 11007610, 11007620, 11007630, 11007640, or 11007650.
- A problem was fixed that caused SRCs B70069F4 and B130E504
to be erroneously logged when a system was powered down. This
also results in I/O hardware being guarded out, and the hypervisor is
not able to "unguard" the I/O hardware at runtime.
- A problem was fixed that caused SRC B1812A40 to be
erroneously logged; a memory DIMM and the symbolic FRU AMBTEMP
were listed in the FRU list.
System firmware changes that affect certain systems
- On systems running
iSCSI, a problem was fixed that caused pinging from the iSCSI menu in
the System Management Services (SMS) to fail.
- On a partition with a large number of potentially bootable
devices, a problem was fixed that caused the partition to fail to boot
with a default catch, and SRC BA210000 may also be logged.
- On a system running a Live Partition Mobility (LPM)
operation, a problem was fixed that caused the partition to
successfully appear on the target system, but hang with a 2005 SRC.
- On a partition with the virtual Trusted Platform Module
(vTPM) enabled, a problem was fixed that caused errors to occur when
the memory assigned to the partition was changed.
- On a partition with the virtual Trusted Platform Module
(vTPM) enabled, a problem was fixed that caused the partition to stop
functioning after certain operations. When this problem occurs,
the client partition may not power off.
- On a system using the modem/serial port on the service
processor, a problem was fixed that caused a service processor dump
(with SRC B181EF88 logged) to be erroneously generated when the
connection was dropped.
- On systems that support all types of both memory and
processor Capacity on Demand (CoD) operations, and on which CoD
operations are frequently performed, the firmware was enhanced to
reduce the number of informational SRC B7005300 logged.
- On systems with redundant service processors, a problem was
fixed that caused the sibling service processor state to show up as
"unknown" in the service processor error log if a code synchronization
problem was detected after a service processor was replaced.
- On a partition with the virtual Trusted Platform Module
(vTPM) enabled, a problem was fixed that caused SRC B200F00F to be
logged when the partition was resumed after hibernation.
- On a partition with the virtual Trusted Platform Module
(vTPM) enabled, the Hypervisor was enhanced to display (on the
management console) the minimum "maximum memory" value required to support the
partition.
- On systems running AIX or Linux, a problem was fixed that
caused a partition to fail to boot with SRC CA260203. This
problem also can cause concurrent firmware updates to fail.
- On systems with TurboCore processors and unlicensed
processors, a problem was fixed that caused the output of the AIX
lparstat command for "Active Physical CPUs in system" to be incorrect.
- On systems running Active Memory Sharing (AMS) partitions,
a problem was fixed that caused the system to hang after an AMS
partition was deleted or mobilized, combined with either an AMS pool
resize or relocation of AMS pool memory.
- On systems with an I/O tower attached, a problem was
fixed that caused multiple service processor reset/reloads if the tower
was flooding the System Power Control Network (SPCN) with bad
data.
|
AM760_051_034
12/05/12 |
Systems
9117-MMD and 9179-MHD
Impact: Serviceability
Severity: ATT
System firmware changes that affect all systems
- A problem was fixed
that can cause fans in the server to run at maximum speed and generate
a serviceable event during system boot (B130B8AF, a predictive error
with hardware callout) as a result of an incorrect calibration of a
particular thermal sensor.
|
AM760_044_034
11/28/12 |
Systems
9117-MMD and 9179-MHD
Impact: Availability
Severity: SPE
System firmware changes that affect all systems
- A
problem was fixed that caused an uncorrectable error (SRC B123E504) to
be erroneously logged when 64GB DIMMs were installed in a system that
already had 16GB or 32GB DIMMs.
- A problem was
fixed that caused the Advanced System Management Interface (ASMI) to
produce a core dump when changing the admin user password.
- A problem was fixed that caused SRC B1813221, which
indicates a failure of the battery on the service processor, to be
erroneously logged after a service processor reset or power cycle.
- A problem was fixed that caused the Hardware Management
Console (HMC) to erroneously indicate that a partition was using
hardware encryption and memory compression co-processors when those
co-processors were not installed in the managed system.
- A problem was fixed that caused various parts to be
erroneously guarded out when an AC power cord was unplugged while the
system was powered on.
- A problem was fixed that caused invalid temperature sensor
failures to be reported on memory DIMMs. SRC B124B8A4 was logged
when this problem occurred.
System firmware changes that affect certain systems
- On systems running
the Dynamic Platform Optimizer (DPO), a problem was fixed that caused
an incomplete status output when using the "lsmemopt" HMC CLI
command. Specifically, the "requested" and "protected" sets of
partitions will appear empty in the lsmemopt output, even though the
user may have explicitly specified partitions in these sets on the
optmem command.
- On systems running the virtual Trusted Platform Module
(vTPM), a problem was fixed that caused a memory leak when a
vTPM-enabled partition was disabled, migrated, or deleted.
- On systems running IBM i, a problem was fixed that caused
the P7+ random number generator to be unavailable.
- The Power Hypervisor was enhanced to ensure better
synchronization of vSCSI and NPIV I/O interrupts to partitions.
|
AM760_034_034
10/24/12 |
Systems
9117-MMD and 9179-MHD
Impact: New
Severity: New
New Features and Functions
- Support for the 9117-MMD and 9179-MHD systems.
- On 9117-MMD and 9179-MHD systems, support for attachment of
the F/C 5888 I/O drawer.
- Support for a new processor power-saving deep-sleep mode.
- Enablement of the encryption accelerator.
- Enablement of the compression accelerator.
- Support for Dynamic Platform Optimizer.
- Support for 0.05 processor granularity.
- The Hypervisor was enhanced to enforce broadcast storm
prevention between the primary and backup SEAs (Shared Ethernet
Adapters). This fix requires VIOS 2.2.2.0 or later on all VIOS
partitions with SEA devices.
|
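The level identifiers that head each entry above (for example AM760_069_034) are commonly read as release level, service pack level, and last disruptive service pack level. The following Python sketch is an illustration only, under that assumption: it parses an identifier and applies the commonly documented rule for whether an update between two levels is disruptive. The names FirmwareLevel, parse_level and is_disruptive are hypothetical, and the sketch does not account for deferred fixes, which still need a system reboot to activate.

```python
import re
from typing import NamedTuple


class FirmwareLevel(NamedTuple):
    """Hypothetical container for one level identifier, e.g. AM760_069_034."""
    release: str          # release level, e.g. "AM760"
    service_pack: int     # service pack level, e.g. 69
    last_disruptive: int  # last disruptive service pack level, e.g. 34


# Assumed layout: two letters and three digits, then two 3-digit fields.
# An optional leading "01" is tolerated because some tools print it.
_LEVEL_RE = re.compile(r"^(?:01)?([A-Z]{2}\d{3})_(\d{3})_(\d{3})$")


def parse_level(text: str) -> FirmwareLevel:
    """Parse an identifier such as 'AM760_069_034' into its three fields."""
    match = _LEVEL_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognised level identifier: {text!r}")
    release, service_pack, last_disruptive = match.groups()
    return FirmwareLevel(release, int(service_pack), int(last_disruptive))


def is_disruptive(installed: FirmwareLevel, target: FirmwareLevel) -> bool:
    """Return True if moving from installed to target needs a platform IPL.

    Assumed rule: the update is disruptive when the release levels differ,
    when the target's service pack equals its own last disruptive level,
    or when the installed service pack is older than the target's last
    disruptive level. Otherwise it can be applied concurrently.
    """
    if installed.release != target.release:
        return True
    if target.service_pack == target.last_disruptive:
        return True
    return installed.service_pack < target.last_disruptive


if __name__ == "__main__":
    current = parse_level("AM760_051_034")
    new = parse_level("AM760_069_034")
    kind = "disruptive" if is_disruptive(current, new) else "concurrent"
    print(f"{current.release}_{current.service_pack:03d} -> "
          f"{new.release}_{new.service_pack:03d}: {kind}")
```

Under these assumptions, every AM760 service pack listed here shares the last disruptive level 034, so an update between any two of them would be classified as concurrent, while an installation of AM760_034_034 itself would be classified as disruptive.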