Power9 System Firmware
Applies to: 9040-MR9

This document provides information about the installation of Licensed Machine or Licensed Internal Code, which is sometimes referred to generically as microcode or firmware.
1.0 Systems Affected

This package provides firmware for Power Systems E950 (9040-MR9) servers only.

The firmware level in this package is: VM920_057_057 / FW920.10
1.1 Minimum HMC Code Level

This section describes the "Minimum HMC Code Level" required by the System Firmware to complete the firmware installation process. When installing the System Firmware, the HMC level must be equal to or higher than the "Minimum HMC Code Level" before starting the system firmware update. If the HMC managing the server targeted for the System Firmware update is running a code level lower than the "Minimum HMC Code Level", the firmware update will not proceed. (A sketch of this level check follows the lists below.)
The Minimum HMC Code levels for this firmware for HMC x86, ppc64 or ppc64le are listed below.

x86 - This term is used to reference the legacy HMC that runs on x86/Intel/AMD hardware for both the 7042 Machine Type appliances and the Virtual HMC that can run on the Intel hypervisors (KVM, VMWare, Xen).
- The Minimum HMC Code level for this firmware is: HMC V9R1M920 (PTF MH01759).
- Although the Minimum HMC Code level for this firmware is listed above, HMC V9R1M920 (PTF MH01759) with iFix (PTF MH01787) or higher is recommended.
ppc64 or ppc64le - This term describes the Linux code that is compiled to run on Power-based servers or LPARs (Logical Partitions).
- The Minimum HMC Code level for this firmware is: HMC V9R1M920 (PTF MH01760).
- Although the Minimum HMC Code level for this firmware is listed above, HMC V9R1M920 (PTF MH01760) with iFix (PTF MH01788) or higher is recommended.
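As an illustration only, the level check described above can be sketched in a few lines of Python. This is not an IBM tool: the helper names are hypothetical, it assumes HMC levels of the form VxRyMzzz as shown above, and PTF/iFix levels are not modeled.

    import re

    def parse_hmc_level(level):
        # Parse an HMC level such as 'V9R1M920' into a sortable
        # (version, release, modification) tuple.
        m = re.fullmatch(r"V(\d+)R(\d+)M(\d+)", level)
        if not m:
            raise ValueError("unrecognized HMC level: " + level)
        return tuple(int(g) for g in m.groups())

    def meets_minimum(installed, minimum="V9R1M920"):
        # True if the installed HMC level is at or above the minimum;
        # the firmware update will not proceed otherwise.
        return parse_hmc_level(installed) >= parse_hmc_level(minimum)

    print(meets_minimum("V9R1M920"))   # True  - update may proceed
    print(meets_minimum("V9R1M910"))   # False - update will not proceed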
For information concerning HMC releases and the latest PTFs, go to the following URL to access Fix Central:
http://www-933.ibm.com/support/fixcentral/

For specific fix level information on key components of IBM Power Systems running the AIX, IBM i and Linux operating systems, we suggest using the Fix Level Recommendation Tool (FLRT):
http://www14.software.ibm.com/webapp/set2/flrt/home
NOTES:
- You must be logged in as hscroot in order for the firmware installation to complete correctly.
- The Systems Director Management Console (SDMC) does not support this System Firmware level.
2.0 Important Information

Downgrading firmware from any given release level to an earlier release level is not recommended. If you feel that it is necessary to downgrade the firmware on your system to an earlier release level, please contact your next level of support.
2.1 IPv6 Support and Limitations

IPv6 (Internet Protocol version 6) is supported in the System Management Services (SMS) in this level of system firmware. There are several limitations that should be considered.

When configuring a network interface card (NIC) for remote IPL, only the most recently configured protocol (IPv4 or IPv6) is retained. For example, if the network interface card was previously configured with IPv4 information and is now being configured with IPv6 information, the IPv4 configuration information is discarded.

A single network interface card may only be chosen once for the boot device list. In other words, the interface cannot be configured for the IPv6 protocol and for the IPv4 protocol at the same time.
2.2 Concurrent Firmware Updates

Concurrent system firmware update is supported on HMC Managed Systems only.
2.3 Memory Considerations for Firmware Upgrades

Firmware Release Level upgrades and Service Pack updates may consume additional system memory.

Server firmware requires memory to support the logical partitions on the server. The amount of memory required by the server firmware varies according to several factors. Factors influencing server firmware memory requirements include the following:
- Number of logical partitions
- Partition environments of the logical partitions
- Number of physical and virtual I/O devices used by the logical partitions
- Maximum memory values given to the logical partitions

Generally, you can estimate the amount of memory required by server firmware to be approximately 8% of the system installed memory. The actual amount required will generally be less than 8%. However, there are some server models that require an absolute minimum amount of memory for server firmware, regardless of the previously mentioned considerations.
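As a rough planning aid only, the 8% rule of thumb above can be expressed as a one-line estimate. The function name and the 512 GB example are illustrative, not from this document:

    def estimate_firmware_memory_gb(installed_memory_gb, fraction=0.08):
        # Upper estimate of memory consumed by server firmware, per the
        # ~8% rule of thumb; actual usage is generally lower, and some
        # models enforce an absolute minimum regardless of this estimate.
        return installed_memory_gb * fraction

    # Example: a server with 512 GB of installed memory
    print("~%.0f GB (upper estimate)" % estimate_firmware_memory_gb(512))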
Additional information can be found at:
https://www.ibm.com/support/knowledgecenter/9040-MR9/p9hat/p9hat_lparmemory.htm
3.0 Firmware Information

Use the following examples as a reference to determine whether your installation will be concurrent or disruptive.

For systems that are not managed by an HMC, the installation of system firmware is always disruptive.

Note: The concurrent levels of system firmware may, on occasion, contain fixes that are known as Deferred and/or Partition-Deferred. Deferred fixes can be installed concurrently, but will not be activated until the next IPL. Partition-Deferred fixes can be installed concurrently, but will not be activated until a partition reactivate is performed. Deferred and/or Partition-Deferred fixes, if any, will be identified in the "Firmware Update Descriptions" table of this document. For these types of fixes (Deferred and/or Partition-Deferred) within a service pack, only the fixes in the service pack which cannot be concurrently activated are deferred.

Note: The file names and service pack levels used in the following examples are for clarification only, and are not necessarily levels that have been, or will be released.
System firmware file naming convention:
01VMxxx_yyy_zzz
- xxx is the release level
- yyy is the service pack level
- zzz is the last disruptive service pack level
NOTE: Values of the service pack and last disruptive service pack levels (yyy and zzz) are only unique within a release level (xxx). For example, 01VM900_040_040 and 01VM910_040_045 are different service packs.
An installation is disruptive if:
- The release levels (xxx) are different.
  Example: Currently installed release is 01VM900_040_040, new release is 01VM910_050_050.
- The service pack level (yyy) and the last disruptive service pack level (zzz) are the same.
  Example: VM910_040_040 is disruptive, no matter what level of VM910 is currently installed on the system.
- The service pack level (yyy) currently installed on the system is lower than the last disruptive service pack level (zzz) of the service pack to be installed.
  Example: Currently installed service pack is VM910_040_040 and new service pack is VM910_050_045.
An installation is concurrent if:
- The release level (xxx) is the same, and
- The service pack level (yyy) currently installed on the system is the same or higher than the last disruptive service pack level (zzz) of the service pack to be installed.
  Example: Currently installed service pack is VM910_040_040, new service pack is VM910_041_040.

These rules are condensed in the sketch that follows.
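As an illustration only (not IBM tooling), here is a minimal Python sketch of the disruptive/concurrent rules above. It assumes level names of the form VMxxx_yyy_zzz, with or without the 01 prefix; the function names are hypothetical.

    import re

    def parse_level(name):
        # Split '01VM920_057_057' or 'VM910_040_040' into integer
        # (release, service_pack, last_disruptive) fields.
        m = re.fullmatch(r"(?:01)?VM(\d+)_(\d+)_(\d+)", name)
        if not m:
            raise ValueError("unrecognized firmware level: " + name)
        return tuple(int(g) for g in m.groups())

    def install_type(installed, new):
        # Classify an installation per the rules in this section.
        cur_rel, cur_sp, _ = parse_level(installed)
        new_rel, new_sp, new_ldsp = parse_level(new)
        if cur_rel != new_rel:    # release levels differ
            return "disruptive"
        if new_sp == new_ldsp:    # service pack is its own last disruptive level
            return "disruptive"
        if cur_sp < new_ldsp:     # installed level below the last disruptive level
            return "disruptive"
        return "concurrent"

    print(install_type("VM910_040_040", "VM910_041_040"))  # concurrent
    print(install_type("VM910_040_040", "VM910_050_045"))  # disruptive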
3.1 Firmware Information and Description

Filename            | Size      | Checksum | md5sum
01VM920_057_057.rpm | 117965484 | 12875    | 8b2e648abbf426e878e83a447ce5d28d

Note: The Checksum can be found by running the AIX sum command against the rpm file (only the first 5 digits are listed).
i.e.: sum 01VM920_057_057.rpm
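For a scripted integrity check, the md5sum in the table above can be verified with Python's standard hashlib module. A minimal sketch, assuming the rpm file sits in the current directory:

    import hashlib

    def md5_of(path, chunk_size=1 << 20):
        # Hash the file in 1 MiB chunks so a large rpm
        # does not need to fit in memory.
        h = hashlib.md5()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                h.update(chunk)
        return h.hexdigest()

    expected = "8b2e648abbf426e878e83a447ce5d28d"  # from the table above
    if md5_of("01VM920_057_057.rpm") != expected:
        raise SystemExit("md5 mismatch - re-download the firmware package")
    print("md5 verified")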
VM920
For Impact, Severity and other firmware definitions, please refer to the 'Glossary of firmware terms' at the following URL:
http://www14.software.ibm.com/webapp/set2/sas/f/power5cm/home.html#termdefs

The complete Firmware Fix History for this Release Level can be reviewed at the following URL:
http://download.boulder.ibm.com/ibmdl/pub/software/server/firmware/VM-Firmware-Hist.html
VM920_057_057 / FW920.10 (09/24/18)
Impact: Data    Severity: HIPER
New features and functions
- DISRUPTIVE: Support was added for installing and running mixed levels of P9 processors on the system in compatibility mode.
- Support was added for the PCIe4 2-port 100Gb ROCE EN adapter with feature code #EC66 for AIX and IBM i. This PCIe Gen4 x16 Ethernet adapter provides two 100 GbE QSFP28 ports.
- Support was added to enable mirrored Hostboot memory.
System firmware changes that affect all systems
- HIPER/Non-Pervasive: A fix was made for a potential problem that could result in undetected data corruption.
- DEFERRED:
A problem was fixed for the Input
Offset Voltage (VIO) to the processor being set too low, having less
margin for PCIe and XBUS errors that could cause a higher than normal
rate of processor or PCIe device failures during the IPL or at run time.
- A problem was fixed for truncated firmware assisted dumps
(fadump/kdump). This can happen when the dumps are configured
with chunks > 1Gb.
- A problem was fixed for the default gateway in the Advanced
System Management Interface (ASMI) IPv4 network configurations showing
as 0.0.0.0 which is an invalid gateway IP address. This problem
can occur if ASMI is used to clear the gateway value with
blanks.
- A problem was fixed for the Advanced System Management
Interface (ASMI) displaying the IPv6 network prefix in decimal instead
of hex character values. The service processor command line
"ifconfig" can be used to see the IPv6 network prefix value in hex as a
circumvention to the problem.
- A problem was fixed for link speed for PCIe Generation 4
adapters showing as "unknown" in the Advanced System Management
Interface (ASMI) PCIe Hardware Topology menu.
- A problem was fixed for the system crashing on PCIe
errors that result in guard action for the FRU.
- A problem was fixed for an extraneous SRC B7000602 being logged intermittently when the system is being powered off. The trigger for the error log is an HMC request for information that does not complete before the system is shut down. If the HMC sends certain commands to get capacity information (e.g., 0x8001/0x0107) while the CEC is shutting down, the SFLPHMCCMD task can fail with this assertion. This error log may be ignored.
- A problem was fixed for the service processor Thermal
Management not being made aware of a Power Management failure that the
hypervisor had detected. This could cause the system to go into
Safe Mode with degraded performance if the error does not have recovery
done.
- A problem was fixed for the On-Chip Controller (OCC) being
held in reset after a channel error for the memory. The system
would remain in Safe Mode (with degraded performance) until a re-IPL of
the system. The trigger for the problem requires the memory channel
checkstop and the OCC not being able to detect the error. Both of
these conditions are rare, making the problem unlikely to occur.
- A problem was fixed for the memory bandwidth sensors for
the P9 memory modules being off by a factor of 2. As a
workaround, divide memory sensor values by 2 to get a corrected
value.
- A problem was fixed for known bad DRAM bits having error logs generated repeatedly with each IPL. With the fix, the error logs occur only one time at the initial failure, and thereafter the known bad DRAM bits are repaired as part of the normal memory initialization.
- A problem was fixed for a Hostboot run time memory channel
error where the processor could be called out erroneously instead of
the memory DIMM. For this error to happen, there must be an RCD
parity error on the memory DIMM with a channel failure attention on the
processor side of the bus and no channel failure attention on the
memory side of the bus, and the system must recover from the channel
failure.
- A problem was fixed for DDR3 DIMM memory training where the
ranks not being calibrated had their outputs enabled. The JEDEC
specification requires that the outputs be disabled. Adding the
termination settings on the non-calibrating ranks can improve memory
margins (thereby reducing the rate of memory failures), and it matches
the memory training technique used for the DDR4 memory.
- A problem was fixed for a PCIe2 4-port Slot Adapter
with feature code #2E17 that cannot recover from a double EEH
error if the second error occurs during the EEH recovery. Because this
is a double-error scenario, the problem should be very infrequent.
- A rare problem was fixed for slow downs in a Live Partition
Mobility migration of a partition with Active Memory Sharing
(AMS). The AMS partition does not fail but the slower performance
could cause time-outs in the workload if there are time constraints on
the operations.
- A problem was fixed for isolation of memory channel failure
attentions on the processor side of the differential memory interface
(DMI) bus. This only is a problem if there are no attentions from
the memory module side of the bus and it could cause the service
processor run time diagnostics to get caught in a hang condition, or
result in a system checkstop with the processor called out.
- A problem was fixed for the memory bandwidth sensors for
the P9 memory modules sometimes being zero.
- A problem was fixed for deconfiguring checkstopped
processor cores at run time. Without the fix, the processor core
checkstop error could cause a checkstop of the system and a
re-IPL, or it could force the system into Safe Mode.
- A problem was fixed for a failed TPM card preventing a
system IPL, even after the card was replaced.
- A problem was fixed for differential memory interface (DMI)
lane sparing to prevent shutting down a good lane on the TX side of the
bus when a lane has been spared on the RX side of the bus. If
the XBUS or DMI bus runs out of spare lanes, it can checkstop the
system, so the fix helps use these resources more efficiently.
- A problem was fixed for IPL failures with SRC BC50090F when
replacing Xbus FRUs. The problem occurs if VPD has a stale bad
memory lane record and that record does not exist on both ends of the
bus.
- A problem was fixed for SR-IOV adapter dumps hanging with
low-level EEH events causing failures on VFs of other non-target SR-IOV
adapters.
- A problem was fixed for an SR-IOV VF configured with a PVID that fails to function correctly after a virtual function reset. It will allow receiving untagged frames but not be able to transmit the untagged frames.
- A problem was fixed for SR-IOV VFs, where a VF configured
with a PVID priority may be presented to the OS with an
incorrect priority value.
- A problem was fixed for a Self Boot Engine (SBE)
recoverable error at run time causing the system to go into Safe Mode.
- A problem was fixed for a rare Live Partition Mobility
migration hang with the partition left in VPM (Virtual Page Mode) which
causes performance concerns. This error is triggered by a
migration failover operation occurring during the migration state of
"Suspended" and there has to be insufficient VASI buffers available to
clear all partition state data waiting to be sent to the migration
target. Migration failovers are rare and the migration state of
"Suspended" is a migration state lasting only a few seconds for most
partitions, so this problem should not be frequent. On the HMC,
there will be an inability to complete either a migration stop or a
recovery operation. The HMC will show the partition as migrating
and any attempt to change that will fail. The system must be
re-IPLed to recover from the problem.
- A problem was fixed for Self Boot Engine (SBE) failure data
being collected from the wrong processor if the SBE is not running on
processor 0. This can result in the wrong FRU being called out
for SBE failures.
System firmware changes that affect certain systems
- On systems which do not have an HMC attached, a problem was fixed for a firmware update initiated from the OS from FW920.00 to FW920.10 that caused a system crash one hour after the code update completed. This does not fix the case of an OS-initiated firmware update back to FW920.00 from FW920.10, which will still result in a crash of the system. Do not initiate a FW920.10 to FW920.00 code update via the operating system. Use only HMC or USB methods of code update for this case. If an HMC or USB code update is not an option, please contact IBM support.
- A problem was fixed
for Linux or AIX partitions crashing during a firmware assisted dump or
when using Linux kexec to restart with a new kernel. This problem
was more frequent for the Linux OS with kdump failing with "Kernel
panic - not syncing: Attempted to kill init" in some cases.
- On a system with an IBM i partition with more than 64 virtual processors assigned to it, a problem was fixed for a possible system crash or other unexpected behavior when doing a partition dump IPL.
- On a system with an IBM i partition, a problem was fixed for I/O operation timeouts with SRCs B600512D and B6005275 logged and IBM i performance degradation. This problem only occurs with heavy I/O traffic.
VM920_040_040 / FW920.00 (08/20/18)
Impact: New    Severity: New
New Features and Functions
4.0 How to Determine The Currently Installed Firmware Level

You can view the server's current firmware level on the Advanced System Management Interface (ASMI) Welcome pane. It appears in the top right corner.
Example: VM920_123.
5.0 Downloading the Firmware Package

Follow the instructions on Fix Central. You must read and agree to the license agreement to obtain the firmware packages.

Note: If your HMC is not internet-connected, you will need to download the new firmware level to a USB flash memory device or FTP server.
6.0 Installing the Firmware

The method used to install new firmware will depend on the release level of firmware which is currently installed on your server. The release level can be determined by the prefix of the new firmware's filename.

Example: VMxxx_yyy_zzz, where xxx = release level
- If the release level will stay the same (Example: Level VM910_040_040 is currently installed and you are attempting to install level VM910_041_040), this is considered an update.
- If the release level will change (Example: Level VM900_040_040 is currently installed and you are attempting to install level VM910_050_050), this is considered an upgrade, as shown in the sketch below.
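As an illustration only, the update/upgrade distinction can be sketched using the VMxxx_yyy_zzz naming convention from section 3.0. The helper names are hypothetical:

    import re

    def release_of(name):
        # Return the release prefix (e.g. 'VM910') of a firmware level
        # named per the VMxxx_yyy_zzz convention.
        return re.match(r"(?:01)?(VM\d+)", name).group(1)

    def install_kind(installed, new):
        # 'update' if the release level stays the same, 'upgrade' otherwise.
        return "update" if release_of(installed) == release_of(new) else "upgrade"

    print(install_kind("VM910_040_040", "VM910_041_040"))  # update
    print(install_kind("VM900_040_040", "VM910_050_050"))  # upgrade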
Instructions for installing firmware updates and upgrades can be found at:
https://www.ibm.com/support/knowledgecenter/9040-MR9/p9eh6/p9eh6_updates_sys.htm

IBM i Systems:
For information concerning IBM i Systems, go to the following URL to access Fix Central:
http://www-933.ibm.com/support/fixcentral/
Choose "Select product", under Product Group specify "System i", under Product specify "IBM i", then Continue and specify the desired firmware PTF accordingly.
7.0 Firmware History

The complete Firmware Fix History (including HIPER descriptions) for this Release Level can be reviewed at the following URL:
http://download.boulder.ibm.com/ibmdl/pub/software/server/firmware/VM-Firmware-Hist.html