AM730
For Impact, Severity and other firmware definitions, please
refer to the 'Glossary of firmware terms' URL below:
http://www14.software.ibm.com/webapp/set2/sas/f/power5cm/home.html#termdefs
The complete Firmware Fix History for this Release Level can be
reviewed at the following URL:
http://download.boulder.ibm.com/ibmdl/pub/software/server/firmware/AM-Firmware-Hist.html
|
AM730_146_035 / FW730.A0
01/28/15 |
Impact: Security
Severity: ATT
New Features and Functions
- Support was added for using the Mellanox ConnectX-3 Pro
10/40/56 GbE (Gigabit Ethernet) adapter as a network install device.
- An enhancement was made for the Global Interrupt Queue
(GIQ) so that interrupts are presented in a round-robin fashion in
partitions that have idle processors instead of GIQ directed interrupts
favoring lower numbered processors.
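The round-robin presentation described for the GIQ enhancement can be pictured with a small sketch. This is only an illustration of the scheduling idea, not the hypervisor implementation; the class name `RoundRobinGiq`, the `idle` set, and the method `next_target` are hypothetical names chosen for this example.

```python
class RoundRobinGiq:
    """Toy model: rotate interrupt targets across idle processors."""

    def __init__(self, num_processors):
        self.num_processors = num_processors
        # Index of the processor that received the previous interrupt.
        self._last = -1

    def next_target(self, idle):
        """Return the next idle processor after the previous target, or None."""
        for offset in range(1, self.num_processors + 1):
            candidate = (self._last + offset) % self.num_processors
            if candidate in idle:
                self._last = candidate
                return candidate
        return None  # no idle processor is available


# Processors 0, 2 and 3 are idle; processor 1 is busy.
giq = RoundRobinGiq(4)
idle = {0, 2, 3}
targets = [giq.next_target(idle) for _ in range(5)]
```

In this sketch the search starts just past the previous target rather than at processor 0, so lower numbered idle processors are not favored.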
System firmware changes that affect all systems
- A security problem was fixed for the Lighttpd web
server that allowed arbitrary SQL commands to be run on the service
processor. The Common Vulnerabilities and Exposures issue number
is CVE-2014-2323.
- A security problem was fixed for the Lighttpd web server
where improperly-structured URLs could be used to view arbitrary files
on the service processor. The Common Vulnerabilities and
Exposures issue number is CVE-2014-2324.
- A security problem was fixed for the Network Time Protocol
(NTP) client that allowed remote attackers to execute arbitrary code
via a crafted packet containing an extension field. The Common
Vulnerabilities and Exposures issue number is CVE-2009-1252.
- A security problem was fixed for the Network Time Protocol
(NTP) client for a buffer overflow that allowed remote NTP servers to
execute arbitrary code via a crafted response. The Common
Vulnerabilities and Exposures issue number is CVE-2009-0159.
- A security problem was fixed in the service processor
TCP/IP stack to discard illegal TCP/IP packets that have the SYN and
FIN flags set at the same time. An explicit packet discard was
needed to prevent further processing of the packet that could result in
a bypass of the iptables firewall rules.
- A security problem was fixed in the OpenSSL (Secure Socket
Layer) protocol that allowed a man-in-the-middle attacker, via a
specially crafted fragmented handshake packet, to force a TLS/SSL
server to use TLS 1.0, even if both the client and server supported
newer protocol versions. The Common Vulnerabilities and Exposures issue
number for this problem is CVE-2014-3511.
- A security problem was fixed in OpenSSL for formatting
fields of security certificates without null-terminating the output
strings. This could be used to disclose portions of the program
memory on the service processor. The Common Vulnerabilities and
Exposures issue number for this problem is CVE-2014-3508.
- Multiple security problems were fixed in the way that
OpenSSL handled Datagram Transport Layer Security (DTLS) packets.
A specially crafted DTLS handshake packet could cause the service
processor to reset. The Common Vulnerabilities and Exposures
issue numbers for these problems are CVE-2014-3505, CVE-2014-3506 and
CVE-2014-3507.
- A security problem was fixed in OpenSSL to prevent a denial
of service when handling certain Datagram Transport Layer Security
(DTLS) ServerHello requests. A specially crafted DTLS handshake
packet with an included Supported EC Point Format extension could cause
the service processor to reset. The Common Vulnerabilities and
Exposures issue number for this problem is CVE-2014-3509.
- A security problem was fixed in OpenSSL to prevent a denial
of service by using an exploit of a null pointer de-reference during
anonymous Diffie Hellman (DH) key exchange. A specially crafted
handshake packet could cause the service processor to reset. The
Common Vulnerabilities and Exposures issue number for this problem is
CVE-2014-3510.
- A security problem was fixed in OpenSSL for memory leaks
that allowed remote attackers to cause a denial of service (out of
memory on the service processor). The Common Vulnerabilities and
Exposures issue numbers are CVE-2014-3513 and CVE-2014-3567.
- A security problem was fixed in the Advanced System
Management Interface (ASMI) to block click-jacking attempts. This
prevents framing of the original ASMI page with a top layer on it with
dummy buttons that could trick the user into clicking on a link.
- A problem was fixed that caused a "code accept" during a
concurrent firmware installation from the management console to fail
with SRC E302F85C.
- A problem was fixed for the callout on power good (pgood)
fault SRC 11002634 so that it includes the CEC enclosure and the
failing FRU. Previously, the callout was missing the failing FRU.
- A security problem was fixed in OpenSSL for padding-oracle
attacks known as Padding Oracle On Downgraded Legacy Encryption
(POODLE). This attack allows a man-in-the-middle attacker to
obtain a plain text version of the encrypted session data. The Common
Vulnerabilities and Exposures issue number is CVE-2014-3566. The
service processor POODLE fix is based on a selective disablement of
SSLv3 using the Advanced System Management Interface (ASMI) "System
Configuration/Security Configuration" menu options. The Security
Configuration options of "Disabled", "Default", and "Enabled" for SSLv3
determine the level of protection from POODLE. The management
console also requires a POODLE fix for APAR MB03867 (Fix for
CVE-2014-3566 for HMC V7 R7.9.0 SP1 with PTF MH01484) to eliminate all
vulnerability to POODLE and allow use of option 1 "Disabled" as shown
below:
-1) Disabled: This highest level of security protection does not
allow service processor clients to connect using SSLv3, thereby
eliminating any possibility of a POODLE attack. All clients must
be capable of using TLS to make the secured connections to the service
processor to use this option. This requires the management
console be at a recommended minimum level of HMC V7 R7.9.0 SP1 with
POODLE PTF MH01484.
-2) Default: This medium level of security protection disables
SSLv3 for the web browser sessions to ASMI and for the CIM clients and
assures them of POODLE-free connections. But the legacy
management consoles are allowed to use SSLv3 to connect to the service
processor. This is intended to allow non-POODLE compliant HMC
levels to be able to connect to the CEC servers until they can be
planned and upgraded to the POODLE compliant HMC levels. Running
a non-POODLE compliant HMC to a service processor in "Default"
mode will prevent the ASMI-proxy sessions from the HMC from connecting
as these proxy sessions require SSLv3 support in ASMI.
-3) Enabled: This basic level of security protection enables
SSLv3 for all service processor client connections. It relies on
all clients being at POODLE fix compliant levels to provide full POODLE
protection using the TLS Fallback Signaling Cipher Suite Value
(TLS_FALLBACK_SCSV) to prevent fallback to vulnerable SSLv3
connections. This option is intended for customer sites on
protected internal networks that have a large investment in legacy
hardware that needs SSLv3 to make browser and HMC connections to the
service processor. The level of POODLE protection actually
achieved in "Enabled" mode is determined by the percentage of clients
that are at the POODLE fix compliant levels.
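The three SSLv3 options above amount to a small decision table. The following is a minimal sketch of that table as described in this fix entry, not the service processor code; the enum and function names (`Sslv3Mode`, `ClientType`, `sslv3_allowed`) are hypothetical. It also simplifies the "Default" case by not modeling the ASMI-proxy sessions from the HMC.

```python
from enum import Enum


class Sslv3Mode(Enum):
    DISABLED = "Disabled"
    DEFAULT = "Default"
    ENABLED = "Enabled"


class ClientType(Enum):
    ASMI_BROWSER = "asmi_browser"
    CIM_CLIENT = "cim_client"
    MANAGEMENT_CONSOLE = "management_console"


def sslv3_allowed(mode, client):
    """Return True if a client of this type may connect using SSLv3."""
    if mode is Sslv3Mode.DISABLED:
        # Highest protection: no client may use SSLv3.
        return False
    if mode is Sslv3Mode.DEFAULT:
        # Browser sessions to ASMI and CIM clients must use TLS;
        # only legacy management consoles may still use SSLv3.
        return client is ClientType.MANAGEMENT_CONSOLE
    # Enabled: SSLv3 is permitted for all client connections.
    return True


# Build the full table for inspection.
table = {
    (mode.value, client.value): sslv3_allowed(mode, client)
    for mode in Sslv3Mode
    for client in ClientType
}
```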
- A problem was fixed for a Live Partition Mobility (LPM)
suspend and transfer of a partition that caused the time of day to skip
ahead to an incorrect value on the target system. The problem
only occurred when a suspended partition was migrated to a target CEC
that had a hypervisor time that was later than the source CEC.
- A problem was fixed that could result in latency or timeout
issues with I/O devices.
- A problem was fixed for I/O adapters so that BA400002
errors were changed to informational for memory boundary adjustments
made to the size of DMA map-in requests. These DMA size
adjustments were marked as UE previously for a condition that is normal.
- A problem was fixed for the Advanced System Management
Interface (ASMI) that allowed possible cross-site request forgery
(CSRF) exploitation of the ASMI user session to do unwanted tasks on
the service processor.
- A problem was fixed for intermittent B181EF88 SRCs and
netsSlp core dumps during network configurations on the service
processor. This error caused call home activity for the SRC and
dumps but otherwise had no impact to the CEC functionality.
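The SYN and FIN packet discard fix listed above comes down to a flag test on the TCP header. The following is a minimal sketch of such a test, not the service processor implementation. The flag bit values and the position of the flags byte follow the standard 20-byte TCP header layout; the helper names `build_tcp_header` and `is_illegal_syn_fin` are hypothetical.

```python
import struct

TCP_FIN = 0x01
TCP_SYN = 0x02
TCP_ACK = 0x10


def build_tcp_header(flags, src_port=12345, dst_port=443):
    """Build a minimal 20-byte TCP header (no options) for testing."""
    data_offset = 5 << 4  # header length of 5 32-bit words, reserved bits zero
    return struct.pack(
        "!HHIIBBHHH",
        src_port, dst_port,
        0, 0,               # sequence and acknowledgment numbers
        data_offset, flags,
        65535, 0, 0,        # window, checksum, urgent pointer
    )


def is_illegal_syn_fin(tcp_header):
    """Return True if both SYN and FIN are set, so the packet should be dropped."""
    if len(tcp_header) < 20:
        raise ValueError("TCP header must be at least 20 bytes")
    flags = tcp_header[13]  # the flags byte follows the data-offset byte
    return bool(flags & TCP_SYN) and bool(flags & TCP_FIN)


illegal = is_illegal_syn_fin(build_tcp_header(TCP_SYN | TCP_FIN))
normal_open = is_illegal_syn_fin(build_tcp_header(TCP_SYN))
normal_close = is_illegal_syn_fin(build_tcp_header(TCP_FIN | TCP_ACK))
```

A filter built on this check would drop the packet as soon as the test returns True, before any later rule is evaluated.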
System firmware changes that affect certain systems
- On systems with a F/C 5802 or 5877 I/O drawer installed, a problem was
fixed for a hypervisor hang at progress code C7004091 during the IPL or
hangs during serviceability tasks to the I/O drawer.
- On systems that have Active Memory Sharing (AMS)
partitions, a problem was fixed for Dynamic Logical Partitioning
(DLPAR) for a memory remove that leaves a logical memory block (LMB) in
an unusable state until partition reboot.
- On systems using the Virtual I/O Server (VIOS) to share
physical I/O resources among client logical partitions, a problem was
fixed for memory relocation errors during page migrations for the
virtual control blocks. These errors caused a CEC termination
with SRC B700F103. The memory relocation could be part of the
processing for the Dynamic Platform Optimizer (DPO), Active Memory
Sharing (AMS) between partitions, mirrored memory defragmentation, or a
concurrent FRU repair.
- A problem was fixed that could result in unpredictable
behavior if a memory UE is encountered while relocating the contents of
a logical memory block during one of these operations:
- Using concurrent maintenance to perform a hot repair of a node.
- Reducing the size of an Active Memory Sharing (AMS) pool.
- On systems using mirrored memory, using the memory mirroring
optimization tool.
- A problem was fixed for systems in networks using the
Juniper 1GbE and 10GbE switches (F/Cs #1108, #1145, and #1151) to
prevent network ping errors and boot from network (bootp)
failures. The Address Resolution Protocol (ARP) table information
on the Juniper aggregated switches is not being shared between the
switches and that causes problems for address resolution in certain
network configurations. Therefore, the CEC network stack code has
been enhanced to add three gratuitous ARPs (ARP replies sent without a
request received) before each ping and bootp request to ensure that all
the network switches have the latest network information for the system.
- On systems in IPv6 networks, a problem was fixed for
a network boot/install failing with SRC B2004158 and IP address
resolution failing using neighbor solicitation to the partition
firmware client.
- For systems with an IBM i load source disk attached to an
Emulex-based fibre channel adapter such as F/C #5735, a problem was
fixed that caused an IBM i load source boot to fail with SRC B2006110
logged and a message to the boot console of "SPLIT-MEM Out of
Room". This problem occurred for load source disks that needed
extra disk scans to be found, such as those attached to a port other
than the first port of a fibre channel adapter (first port requires
fewest disk scans).
- On systems with a partition that has a 256MB Real Memory
Offset (RMO) region size that has been migrated from a Power8 system
to Power7 or Power6 using Live Partition Mobility (LPM), a
problem was fixed that caused a failure on the next boot of the
partition with a BA210000 log with a CA000091 checkpoint just prior to
the BA210000. The fix dynamically adjusts the memory footprint of
the partition to fit on the earlier Power systems.
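The gratuitous ARP behavior described in the Juniper switch fix above can be sketched as follows. This is an illustration only, not the CEC network stack code. The packet layout is the standard 28-byte ARP payload for Ethernet and IPv4; using the broadcast address as the target hardware address is an assumption, and the function names `build_gratuitous_arp` and `announce_then_send` are hypothetical.

```python
import socket
import struct

ARP_REPLY = 2
BROADCAST_MAC = b"\xff" * 6


def build_gratuitous_arp(mac, ip):
    """Build an ARP reply whose sender and target IP are both our own address."""
    if len(mac) != 6 or len(ip) != 4:
        raise ValueError("expected a 6-byte MAC and a 4-byte IPv4 address")
    return struct.pack(
        "!HHBBH6s4s6s4s",
        1,            # hardware type: Ethernet
        0x0800,       # protocol type: IPv4
        6, 4,         # hardware and protocol address lengths
        ARP_REPLY,    # a reply sent without a request having been received
        mac, ip,      # sender hardware and protocol address
        BROADCAST_MAC, ip,  # target: broadcast MAC (assumption), same IP
    )


def announce_then_send(send_frame, send_request, mac, ip, count=3):
    """Send `count` gratuitous ARPs, then the actual ping or bootp request."""
    for _ in range(count):
        send_frame(build_gratuitous_arp(mac, ip))
    send_request()


# Record what would be transmitted instead of touching a real network.
sent = []
mac = bytes.fromhex("020000000001")
ip = socket.inet_aton("192.0.2.10")
announce_then_send(sent.append, lambda: sent.append(b"PING"), mac, ip)
```

The MAC and IP values here are placeholder test addresses, and the `send_frame` and `send_request` callables stand in for whatever transmit routines a real stack would use.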
Concurrent hot add/repair maintenance (CHARM) firmware fixes
- A problem was fixed for concurrent maintenance operations
to limit hardware retries on failed hardware so that it can be
concurrently repaired.
- A problem was fixed for a power off failure of an expansion
drawer (F/C 5802 or F/C 5877) during a concurrent repair. The
power off commands to the drawer are now tried again using the System
Power Control Network (SPCN) serial connection to the drawer to allow
the repair to continue.
- A problem was fixed for concurrent maintenance to prevent a
hardware unavailable failure when doing consecutive concurrent remove
and add operations to an I/O Hub adapter for a drawer.
|
AM730_142_035 / FW730.91
06/24/14 |
Impact: Security
Severity: HIPER
System firmware changes that affect all systems
- HIPER/Pervasive:
A security problem was fixed in the OpenSSL (Secure Socket Layer)
protocol that allowed clients and servers, via a specially crafted
handshake packet, to use weak keying material for communication.
A man-in-the-middle attacker could use this flaw to decrypt and modify
traffic between the management console and the service processor.
The Common Vulnerabilities and Exposures issue number for this problem
is CVE-2014-0224.
- HIPER/Pervasive:
A security problem was fixed in OpenSSL for a buffer overflow in the
Datagram Transport Layer Security (DTLS) when handling invalid DTLS
packet fragments. This could be used to execute arbitrary code on
the service processor. The Common Vulnerabilities and Exposures
issue number for this problem is CVE-2014-0195.
- HIPER/Pervasive:
Multiple security problems were fixed in the way that OpenSSL handled
read and write buffers when the SSL_MODE_RELEASE_BUFFERS mode was
enabled to prevent denial of service. These could cause the
service processor to reset or unexpectedly drop connections to the
management console when processing certain SSL commands. The
Common Vulnerabilities and Exposures issue numbers for these problems
are CVE-2010-5298 and CVE-2014-0198.
- HIPER/Pervasive:
A security problem was fixed in OpenSSL to prevent a denial of service
when handling certain Datagram Transport Layer Security (DTLS)
ServerHello requests. A specially crafted DTLS handshake packet could
cause the service processor to reset. The Common Vulnerabilities
and Exposures issue number for this problem is CVE-2014-0221.
- HIPER/Pervasive:
A security problem was fixed in OpenSSL to prevent a denial of service
by using an exploit of a null pointer de-reference during anonymous
Elliptic Curve Diffie Hellman (ECDH) key exchange. A specially
crafted handshake packet could cause the service processor to
reset. The Common Vulnerabilities and Exposures issue number for
this problem is CVE-2014-3470.
|
AM730_127_035 / FW730.90
04/02/14 |
Impact: Availability
Severity: SPE
System firmware changes that affect all systems
- A problem was fixed that caused a built-in self test (BIST)
for GX slots to create corrupt error log values that core dumped the
service processor with a B18187DA. The corruption was caused by a
failure to initialize the BIST array to 0 before starting the tests.
- Help text for the Advanced System Management Interface
(ASMI) "System Configuration/Hardware Deconfiguration/Clear All
Deconfiguration Errors" menu option was enhanced to clarify that when
selecting "Hardware Resources" value of "All hardware resources", the
service processor deconfiguration data is not cleared. The
"Service processor" must be explicitly selected for that to be cleared.
- A problem was fixed that prevented guard error logs from
being reported for FRUs that were guarded during the system power
on. This could happen if the same FRU had been previously
reported as guarded on a different power on of the system. The
requirement is now met that guarded FRUs are logged on every power on
of the system.
- DEFERRED: A problem
was fixed that caused a system checkstop with SRC B113E504 for a
recoverable hardware fault. This deferred fix addresses a problem
that has a very low probability of occurrence. As such, customers
may wait for the next planned service window to activate the deferred
fix via a system reboot.
- A problem was fixed that caused a memory clock failure to
be called out as failure in the processor clock FRU.
System firmware changes that affect certain systems
- On systems with a redundant service processor, a problem was fixed where
the service processor allowed a clock failover to occur without an SRC
B158CC62 error log and without a hardware deconfiguration record for
the failed clock source. This resulted in the system running with
only one clock source and without any alerts to warn that clock
redundancy had been lost.
- On systems with a F/C 5802 or 5877 I/O drawer installed, a
problem was fixed where an Offline Converter Assembly (OCA) fault
would appear to persist after an OCA micro-reset or OCA
replacement. The fault bit reported to the OS may not be cleared,
indicating a fault still exists in the I/O drawer after it has been
repaired.
- On systems with a F/C 5802 or 5877 I/O drawer installed, a
problem was fixed that occurred during Offline Converter Assembly (OCA)
replacement operations. The fix prevents a false Voltage
Regulator Module (VRM) fault and the logging of SRCs 10001511 or
10001521. The false fault resulted in the OCA LED
getting stuck in an on or "fault" state and the OCA not powering on.
- On systems involved in a series of consecutive Live
Partition Mobility (LPM) operations, a memory leak problem was fixed in
the run time abstraction service (RTAS) that caused a partition run
time AIX crash with SRC 0c20. Other possible symptoms include
error logs with SRC BA330002 (RTAS memory allocation failure).
- On a system with partitions with redundant Virtual
Asynchronous Services Interface (VASI) streams, a problem was
fixed that caused the system to terminate with SRC B170E540. The
affected partitions include Active Memory Sharing (AMS), encapsulated
state partitions, and hibernation-capable partitions. The problem
is triggered when the management console attempts to change the active
VASI stream in a redundant configuration. This may occur due to a
stream reconfiguration caused by Live Partition Mobility (LPM);
reconfiguring from a redundant Paging Service Partition (PSP) to a
single-PSP configuration; or conversion of a partition from AMS to
dedicated memory.
- On systems with one memory clock deconfigured, a problem
was fixed where the system failed to IPL using the second memory clock
with SRCs B158CC62 and B181C041 logged.
- On a system with a disk device with multiple boot
partitions, a problem was fixed that caused System Management Services
(SMS) to list only one boot partition. Even though only one boot
partition was listed in SMS, the AIX bootlist command could still be
used to boot from any boot partition.
- On a system with a partition with an AIX and Linux boot
source to support dual booting, a problem was fixed that caused the
Host Ethernet Adapter (HEA) to be disabled when rebooting from Linux to
AIX. Linux had disabled interrupts for the HEA on power down,
causing an error for AIX when it tried to use the HEA to access the
network.
|
AM730_122_035 / FW730.80
09/18/13 |
Impact: Availability
Severity: SPE
Note: This service
pack includes several critical concurrent fixes and a deferred fix
which has a very low probability of occurrence. IBM
recommends that customers concurrently install the service pack to
protect their system against known issues; customers can wait to activate the
deferred fix, via a system reboot, until the next scheduled service
window.
New Features and Functions
- Support was dropped for Secured Socket Layer (SSL) Version
2 and SSL weak and medium cipher suites in the service processor web
server (Lighttpd). Unsupported web browser connections to the
Advanced System Management Interface (ASMI) secured port 443 (using
https://) will now be rejected if those browsers do not support SSL
version 3. Supported web browsers for Power7 ASMI are Netscape
(version 9.0.0.4), Microsoft Internet Explorer (version 7.0), Mozilla
Firefox (version 2.0.0.11), and Opera (version 9.24).
System firmware changes that affect all systems
- On systems with utility processors, an accounting
problem with utility processor minutes was fixed.
- A problem was fixed that caused a migrated partition to
reboot during transfer to a target system running VIOS 2.2.2.0 or later. A
manual reboot would be required if transferred to a target system
running an earlier VIOS release. Migration recovery may also be
necessary.
- A problem was fixed that caused a service processor dump to
be generated with SRC B18187DA "NETC_RECV_ER" logged.
- A problem was fixed that caused an L2 cache error to not
guard out the faulty processor, allowing the system to checkstop again
on an error to the same faulty processor.
- A problem was fixed that caused an HMC code update for the FSP
to fail on the accept operation with SRC B1811402, or left the FSP unable
to boot on the updated side.
- A problem was fixed that caused a 1000911E platform event
log (PEL) to be marked as not call home. The PEL is now a call
home to allow for correction. This PEL is logged when the
hypervisor has changed the Machine Type Model Serial Number (MTMS) of
an external enclosure to UTMP.xxx.xxxx because it cannot read the vital
product data (VPD), or the VPD has invalid characters, or if the MTMS
is a duplicate to another enclosure.
- A problem was fixed that caused the state of the Host
Ethernet Adapter (HEA) port to be reported as down when the physical
port is actually up.
- A problem was fixed that caused the system attention LED
to be lit without a corresponding SRC and error log for the
event. This problem typically occurs when an operating system on
a partition terminates abnormally.
- DEFERRED: A
problem was fixed that caused a system checkstop during
hypervisor time keeping services. This deferred fix addresses a problem
that has a very low probability of occurrence. As such, customers
may wait for the next planned service window to activate the deferred
fix via a system reboot.
System firmware changes that affect certain systems
- On systems with a redundant service processor, a problem was fixed that
caused fans to run at a high-speed after a failover to the sibling
service processor.
- On systems with a redundant service processor, a problem
was fixed that prevented the deconfiguration details of a guarded
sibling service processor from being shown in the Advanced
System Management Interface (ASMI).
- On systems with a F/C 5802 or 5877 I/O drawer installed,
the firmware was enhanced to guarantee that an SRC will be generated
when there is a power supply voltage fault. If no SRC is
generated, a loss of power redundancy may not be detected, which can
lead to a drawer crash if the other power supply goes down. This
also fixes a problem that causes an 8 GB fibre channel adapter in
the drawer to fail if the 12V level fails in one Offline Converter
Assembly (OCA).
- On systems managed by an HMC with a F/C 5802 or 5877 I/O
drawer installed, a problem was fixed that caused the hardware topology
on the management console for the managed system to show "null" instead
of "operational" for the affected I/O drawers.
- On systems with a redundant service processor, a problem
was fixed that caused an SRC B150D15E to be erroneously logged after a
failover to the sibling service processor.
- On systems with turbo-core enabled that are a target of a
Live Partition Mobility (LPM) operation, a problem was fixed
where cache properties were not recognized and SRCs BA280000 and
BA250010 were reported.
- When switching between turbocore and maxcore mode, a
problem was fixed that caused the number of supported partitions to be
reduced by 50%.
- On systems running AIX or Linux, a problem was fixed that
caused the operating system to halt when an InfiniBand Host Channel
Adapter (HCA) fails or malfunctions.
- A problem was fixed in the run-time abstraction services
(RTAS) extended error handling (EEH) for fundamental reset that caused
partitions to crash during adapter updates. The fundamental reset
of adapters now returns a valid return code. The adapter drivers
using fundamental reset affected by this fix are the following:
- QLogic PCIe Fibre Channel adapters (combo card)
- Emulex BE3-based ethernet adapters
- Broadcom-based PCIe2 4-port 1Gb ethernet
- Broadcom-based FlexSystem EN2024 4-port 1Gb ethernet
for compute nodes
Concurrent hot add/repair maintenance (CHARM) firmware fixes
- A problem was fixed that caused a concurrent hot add/repair
maintenance operation to fail on an erroneously logged error for the
service processor battery with SRCs B15A3303, B15A3305, and
B181EA35 reported.
- A problem was fixed that caused SRC B15A3303 to
be erroneously logged as a predictive error on the service processor
sibling after a successful concurrent repair maintenance operation for
the real-time clock (RTC) battery.
|
AM730_114_035 / FW730.70
04/03/13 |
Impact: Availability
Severity: SPE
System firmware changes that affect all systems
- A problem was fixed that caused a card (and its children)
that was removed after the system was booted to continue to be listed
in the guard menus in the Advanced System Management Interface (ASMI).
- A problem was fixed that prevented predictive guard errors
from being deleted on the secondary service processor. This
caused hardware to be erroneously guarded out if a service processor
failover occurred, then the system was rebooted.
- A problem was fixed that caused SRC B1813221, which
indicates a failure of the battery on the service processor, to be
erroneously logged after a service processor reset or power cycle.
- A problem was fixed that caused various SRCs to be
erroneously logged at boot time including B181E6C7 and B1818A14.
- A problem was fixed that caused a code update operation to
fail with a time-out error, creating a call-home with SRC B1818A0F.
This problem is more likely to occur on HMC-managed systems
experiencing a high level of management activity during a code update.
- A problem was fixed that caused system fans to be
erroneously called out as failing with one or more of the following
SRCs: 11007610, 11007620, 11007630, 11007640, or 11007650.
- A problem was fixed that caused the service processor (or
system controller) to crash when it boots from the new level during a
concurrent firmware installation.
- A problem was fixed that caused SRC B7006A72 to be
erroneously logged.
- A problem was fixed that caused the system power to be
throttled, resulting in decreased performance. This problem
typically occurs after a PCI adapter is plugged into a node (CEC
drawer), and can also happen when a dedicated I/O partition is powered
on or off.
- The Power Hypervisor was enhanced to ensure better
synchronization of vSCSI and NPIV I/O interrupts to partitions.
- A problem was fixed that caused SRC B15A3303 ("CEC
Hardware: Time-Of-Day Hardware Predictive Error") to be erroneously
logged, and the time-of-day to be set to Jan 1, 1970.
- A problem was fixed that was caused by an attempt to modify
a virtual adapter from the management console command line when the
command specifies it is an Ethernet adapter, but the virtual ID
specified is for an adapter type other than Ethernet. The managed
system has to be rebooted to restore communications with the management
console when this problem occurs; SRC B7000602 is also logged.
System firmware changes that affect certain systems
- On
systems with an I/O tower attached, a problem was fixed that caused
SRCs 10009135 and 10009139 to be erroneously logged.
- A problem was fixed that caused various parts to be
erroneously guarded out in some cases, and the clock card being called
out as defective in other cases, when both ac cords providing power to
a drawer were unplugged when the system was powered on.
- On systems running Selective Memory Mirroring (SMM), a
problem was fixed that caused the hypervisor to hang or crash when an
uncorrectable hardware error occurred in a memory DIMM.
- On systems with redundant service processors, a problem was
fixed that caused the sibling service processor state to show up as
"unknown" in the service processor error log if a code synchronization
problem was detected after a service processor was replaced.
- On systems with an I/O tower attached, a problem was fixed
that caused multiple service processor reset/reloads if the tower was
continuously sending invalid System Power Control Network (SPCN) status
data.
- A problem was fixed that caused the HMC to display
incorrect data for a virtual Ethernet adapter's transactions statistics.
- A problem was fixed that caused a hibernation resume
operation to hang if the connection to the paging space is lost near
the end of the resume processing. This is more likely on a
partition that supports remote restart.
- A problem was fixed that caused the system to terminate
with a bad address checkstop during mirroring defragmentation.
- A problem was fixed that prevented the HMC command
"lshwres" from showing any I/O adapters if any adapter name contained
the ampersand character in the VPD.
- On a system running a Live Partition Mobility (LPM)
operation, a problem was fixed that caused the partition to
successfully appear on the target system, but hang with a 2005 SRC.
- On a partition with a large number of potentially bootable
devices, a problem was fixed that caused the partition to fail to boot
with a default catch, and SRC BA210000 may also be logged.
- On systems running Active Memory Sharing (AMS) partitions,
a problem was fixed that may arise due to the incorrect handling of a
return code in an error path during the Live Partition Mobility
(LPM) of an AMS partition.
- On systems running Active Memory Sharing (AMS) partitions,
a timing problem was fixed that may occur if the system is undergoing
AMS pool size changes.
Concurrent hot add/repair maintenance (CHARM) firmware fixes
- A problem was fixed that caused SRC B15738B0 to be
erroneously logged after a successful concurrent hot add/repair
maintenance operation.
- A problem was fixed that caused a concurrent hot add/repair
maintenance (CHARM) operation to fail after this sequence of events
occurred:
1. A user-initiated platform system dump is requested (from the
ASMI or management console).
2. A service processor reset/reload takes place while dump
collection is in progress.
3. A concurrent hot add/repair maintenance operation is attempted.
- On systems in which there are no processors in the shared
processor pool, a problem was fixed that caused the Hypervisor to
become unresponsive (the service processor starts logging time-out
errors against the Hypervisor, and the HMC can no longer talk to the
Hypervisor) during a concurrent hot add/repair maintenance operation.
- A problem was fixed that caused a hypervisor memory leak
during a concurrent hot add/repair maintenance operation.
- A problem was fixed that caused a concurrent node repair or
upgrade to fail during the system deactivation step with a hypervisor
error code of 0x300.
- A problem was fixed that caused the system to terminate
with a bad address checkstop during a concurrent hot add/repair
maintenance operation.
- A problem was fixed that caused the system to hang if
memory relocation is performed during a concurrent hot add/repair
maintenance operation.
- A problem was fixed that caused partition activations to
fail during or after a node repair operation.
- A problem was fixed that caused synchronization problems in
an application using the Barrier Synchronization Register (BSR)
facility during the memory relocation that occurs in a concurrent hot
add/repair maintenance operation.
- A problem was fixed that prevented the I/O slot information
from being presented on the management console after a concurrent node
repair.
- On systems running multiple IBM i partitions that are
configured to communicate with each other via virtual Opticonnect,
concurrent hot add/repair maintenance operations may time out.
When this problem occurs, a platform reboot may be required to recover.
|
AM730_099_035
10/24/12 |
Impact: Availability
Severity: HIPER - High Impact/PERvasive, Should be installed as soon as
possible.
System firmware changes that affect all systems
- HIPER/Non-Pervasive: DEFERRED: A problem was fixed
that caused a system crash with SRC B170E540.
- HIPER/Non-Pervasive: A related problem was also fixed that could
cause a live lock on the power bus resulting in a system crash.
- To address poor placement of partitions following a reboot
of a server with unlicensed cores, the firmware was enhanced to run the
affinity manager when the initialize configuration operation is done
from the HMC. A problem was also fixed that caused the hypervisor
to be left in an inconsistent state after a partition create operation
failed.
|
AM730_095_035
08/23/12 |
Impact: Availability
Severity: SPE
New Features and Functions
- Support for booting the IBM i operating system from a USB
tape drive.
System firmware changes that affect all systems
- A problem was fixed that caused a partition with dedicated
processors to hang with SRC BA33xxxx when rebooted, after it was
migrated using a Live Partition Mobility (LPM) operation from a system
running Ax730 to a system running Ax740, or vice versa.
- The firmware was enhanced to call out the correct field
replaceable units (FRUs) when SRC B124E504 with description "Chnl init
TO due to SN stuck in recovery" was logged.
- A problem was fixed that caused SRC B1818A10 to be
erroneously logged after a system firmware installation.
- A problem was fixed that caused booting from a virtual
fibre channel tape device to fail with SRC B2008105.
- The firmware was enhanced to log SRCs BA180030 and BA180031
as informational instead of predictive.
- A problem was fixed that caused a "code accept" during a
concurrent firmware installation from the HMC to fail with SRC
E302F85C. This is most likely to occur on model FHB systems.
- On systems running the AIX operating system, a problem was
fixed that caused the hypervisor to crash with SRC B7000103, after an
HEA (Host Ethernet Adapter) error was logged, when there is a lot of
AIX activity on the HEAs.
- A problem was fixed that caused the suspension of a
partition to fail if a large amount of data has to be stored to resume
the partition.
- A problem was fixed that caused a system crash with
unrecoverable SRC B7000103 with "ErFlightRecorder" in the failing
stack.
- On systems booting from an NPIV (N-port ID virtualization)
device, a problem was fixed that caused the boot to intermittently
terminate with the message "PReP-BOOT: unable to load full PReP
image.". This problem occurs more frequently on the IBM V7000
Storage System running the SAN Volume Controller (SVC), but not on
every boot.
- A problem was fixed that caused SRC B181E6F1 with the
description "RMGR_PERSISTENT_EVENT_TIMEOUT" to be erroneously logged.
- A problem was fixed that prevented a change to the system
operating mode ("M" or "N") made in the Advanced System Management
Interface (ASMI) menu from being displayed in the physical control
(operator) panel.
- A problem was fixed that caused a memory leak in the
service processor firmware.
- A problem was fixed that caused SRC B155A491 to be
erroneously logged during multiple system IPLs. This SRC may
cause the system to terminate.
- A problem was fixed that caused the lsstat command on the
HMC to display an erroneously high number of packets transmitted and
received on a VLAN interface.
System firmware changes that affect certain systems
- The firmware was enhanced to fix a potential performance degradation on
systems utilizing the stride-N stream prefetch instructions dcbt (with
TH=1011) or dcbtst (with TH=1011). Typical applications executing
these algorithms include High Performance Computing, data intensive
applications exploiting streaming instruction prefetches, and
applications utilizing the Engineering and Scientific Subroutine
Library (ESSL) 5.1.
- On systems on which Internet Explorer (IE) is used to
access the Advanced System Management Interface (ASMI) on the Hardware
Management Console (HMC), a problem was fixed that caused IE to hang
for about 10 minutes after saving changes to network parameters on the
ASMI.
- A problem was fixed that caused informational SRC A70047FF,
which may indicate that the Anchor (VPD) card should be replaced, to be
erroneously logged again after the Anchor card was replaced.
- A problem was fixed that caused a network installation of
IBM i to fail when the client was on the same subnet as the server.
- On systems with a 5796 or 5797 I/O drawer attached, a
problem was fixed that could cause a system hang.
- On systems with a feature code (F/C) 5802 or 5877 I/O
drawer attached, a problem was fixed that prevented the system from
booting with SRC B1818903, with a signature of
"SINK_REASON_CODE_FILE_LOCK_TIMEOUT".
- On systems with the F/C 1804 (Integrated 4 Port (2x1Gb and
2x10Gb SFP+ Optical-SR ports)) or F/C 1813 (Integrated, 4 Port (2x1Gb
and 2x10Gb SFP+ Copper twinax ports)), the firmware was enhanced to
prevent the attached network switch from prematurely shutting down the
Ethernet port due to link flaps detected during IPL.
Concurrent hot add/repair maintenance (CHARM) firmware fixes
- A problem was fixed that prevented the DASD roll-up fault
LED from working properly after a node add or node remove operation.
- A problem was fixed that caused a hot node repair operation
to fail with PhypRc=0x0300, indicating the deactivate system resource
operation failed.
- During a CHARM replacement of a memory card on a system
running with mirrored memory, a problem was fixed that caused the
operation to fail with "PhypRc = 0x0326".
|
AM730_087_035
05/18/12 |
Impact: Availability
Severity: SPE
New Features and Functions
- Support for IBM i Live Partition Mobility (LPM)
System firmware changes that affect all systems
- A problem was fixed that prevented the user from changing
the boot mode or keylock setting after a remote restart-capable
partition is created, even after the partition's paging device is
on-line.
System firmware changes that affect certain systems
- The firmware resolves undetected N-mode stability problems
and improves error reporting on the feature code (F/C) 5802 and 5877
I/O drawer power subsystem.
|
AM730_078_035
03/14/12 |
Impact: Availability
Severity: SPE
System firmware changes that affect all systems
- The firmware was enhanced to properly display a memory
controller that has been guarded out manually on the "Deconfiguration
Records" menu option (under "System Service Aids") on the Advanced
System Management Interface (ASMI).
- A problem was fixed that caused multiple service processor
dumps to be unnecessarily taken during a concurrent firmware
update. SRC B181EF9A, which indicates that the dump space on the
service processor is full, was logged as a result.
- The firmware was enhanced to increase the threshold for
recoverable SRC B113E504 so that the processor core reporting the SRC
is not guarded out. This prevents unnecessary performance loss
and the unnecessary replacement of processor modules.
- A problem was fixed that caused SRC B7000602 to be
erroneously logged at power on.
- The firmware was enhanced to recognize new USB-attached
devices so that they will be listed as boot devices in the System
Management Services (SMS) menus.
- A problem was fixed that caused booting or installing a
partition or system from a USB device to fail with error code
BA210012. This usually occurs when an operating system (OS) other
than the OS that is already on the partition or system is booted or
installed.
- On the System Management Services (SMS) remote IPL (RIPL)
menus, a problem was fixed that caused the SMS menu to continue to show
that an Ethernet device is configured for iSCSI, even though the user
has changed it to BOOTP.
- The firmware was enhanced to log SRCs BA180030 and BA180031
as informational instead of predictive.
- The firmware was enhanced to increase the threshold of soft
NVRAM errors on the service processor to 32 before SRC B15xF109 is
logged. (Replacement of the service processor is recommended if
more than one B15xF109 is logged per week.)
- A problem was fixed that caused a system to crash when the
system was in low power (or safe) mode, and the system attempted to
switch over to nominal mode.
- On a multi-drawer system, a problem was fixed that
prevented the system attention LED from correctly reflecting the status
of the DASD fault LEDs in drawers 2, 3, and 4.
- A problem was fixed that caused the system to fail to boot
with SRC B1xxB507.
- A problem was fixed that prevented a node from being
deconfigured manually using the Advanced System Management
Interface (ASMI).
- A problem was fixed that caused system fans to be
erroneously called out as failing.
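Several entries above identify SRCs with placeholder characters (for example B15xF109 and B1xxB507), and one entry recommends replacing the service processor if more than one B15xF109 is logged per week. The following is a minimal sketch of how an administrator might check an exported error log against such a pattern. It assumes that each lowercase "x" stands for exactly one hexadecimal digit; the function names and the sample log entries are hypothetical and are not part of any IBM tool.

```python
import re
from datetime import datetime, timedelta


def src_pattern_to_regex(pattern):
    """Build a regex from an SRC pattern such as 'B15xF109'.

    Assumption: each lowercase 'x' is a placeholder for one hex digit.
    """
    parts = ["[0-9A-F]" if ch == "x" else re.escape(ch) for ch in pattern]
    return re.compile("".join(parts))


def matches_src(pattern, src):
    """Return True if the logged SRC fits the placeholder pattern."""
    return src_pattern_to_regex(pattern).fullmatch(src.upper()) is not None


def exceeds_rate(timestamps, limit=1, window=timedelta(days=7)):
    """Return True if more than `limit` events fall inside any one window."""
    ordered = sorted(timestamps)
    for i in range(len(ordered) - limit):
        # ordered[i] .. ordered[i + limit] are limit + 1 consecutive events.
        if ordered[i + limit] - ordered[i] < window:
            return True
    return False


# Hypothetical sample log: (SRC, time logged).
log = [
    ("B15AF109", datetime(2012, 3, 1, 8, 0)),
    ("B7000602", datetime(2012, 3, 2, 9, 30)),
    ("B151F109", datetime(2012, 3, 5, 14, 15)),
]

hits = [when for src, when in log if matches_src("B15xF109", src)]
replace_recommended = exceeds_rate(hits)
print(len(hits), replace_recommended)
```

In this sample, two log entries match the pattern and they are four days apart, so the check reports that the "more than one per week" condition is met.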
System firmware changes that affect certain systems
- A problem was fixed that caused the hypervisor to hang during a
concurrent operation on a F/C 5802, 5803, 5873 or 5877 I/O
drawer. Recovering from the hypervisor hang required a platform
reboot.
- A problem was fixed that impacted performance if profiling
was enabled in one or more partitions. Performance profiling is
enabled:
- In an AIX or VIOS partition using the tprof (-a, -b, -B, -E option)
command or pmctl (-a, -E option) command.
- In an IBM i partition when the PEX *TRACE profile (TPROF) collections
or PEX *PROFILE collections are active.
- In a Linux partition using the perf command, which is available in
RHEL6 and SLES11; profiling with oprofile does not cause the problem.
- A problem was fixed that prevented the operating system
from being notified that a F/C 5802 or 5877 I/O drawer had recovered
from an input power fault (SRC 10001512 or 10001522).
- On a system that is being upgraded from Ax720 system
firmware to Ax730 system firmware, the firmware was enhanced to log
B1818A0F as informational instead of predictive if it occurs during the
firmware upgrade.
- On systems running Active Memory Sharing (AMS), the
allocation of the memory was enhanced to improve performance.
- A problem was fixed that caused the suspension of a logical
partition running Active Memory Sharing (AMS) to fail because the disk
headers had not been erased.
- On systems with an iSCSI network, when booting a logical
partition using that iSCSI network, a problem was fixed that caused the
iSCSI gateway parameter displayed on the screen to be incorrect.
It did not impact iSCSI boot functionality.
- On systems running Active Memory Sharing (AMS) and Active
Memory Mirroring (AMM), a problem was fixed that caused memory
allocation to fail. This in turn caused a partition to fail to
boot with SRC A2009030.
- On systems using affinity groups, a problem was fixed that
prevented one of the partitions from being placed correctly.
- On 9117-MMB and 9179-MHB systems without an optional GX
adapter, a problem was fixed that caused the system fans to ramp up to
their maximum speed.
Concurrent hot add/repair maintenance (CHARM) firmware fixes
- A problem was fixed that caused a checkstop to occur during
a node repair operation.
- A problem was fixed that caused the system to hang
during a CHARM operation.
- A problem was fixed that caused multiple types of failures
(CHARM node operations and Advanced Energy Manager (AEM) state
changes, among others), after a CHARM hot node operation on the first
(top) drawer was followed by a concurrent firmware installation.
- On systems with more than one node, a problem was fixed
that caused a CHARM operation on node B to fail with a Repair and
Verify (R&V) panel that indicated a "Deactivate power domain for
the FruType.CEC_ENCLOSURE at U78C0.001.xxxxxx" failure due to a "0x0007
COMMAND_TIMEOUT".
|
AM730_066_035
12/08/11 |
Impact: Availability
Severity: HIPER - High Impact/PERvasive, Should be installed as soon as
possible.
System firmware changes that affect certain systems
- HIPER/Pervasive on systems with a Virtual Input/Output (VIO) client
running AIX, and with a F/C 5802 or 5877 I/O drawer attached: A problem
was fixed that caused the system to crash with SRC B700F103.
|
AM730_065_035
11/22/11 |
Impact: Availability
Severity: HIPER - High Impact/PERvasive, Should be installed as soon as
possible.
System firmware changes that affect all systems
- HIPER/Pervasive: On systems running firmware level AM730_049 or
AM730_058, a problem was fixed that caused the target server to hang,
or go to the incomplete state on the management console, after a Live
Partition Mobility (LPM) operation. This problem can also occur when a
partition hibernation operation is done.
|