Fix (APAR): PI15205
Status: Fix
Release: 8.5.5.2,8.5.5.1,8.5.5
Operating System: AIX,HP-UX,IBM i,Linux,Mac OS,Solaris,Windows,z/OS
Supersedes Fixes:
CMVC Defect: xxxxxx
Byte size of APAR: 262977
Date: 2014-06-10
Abstract: cannot connect to messaging engine after failover
Description/symptom of problem:
PI15205 resolves the following problem:
ERROR DESCRIPTION:
A messaging engine stopped on one cluster member due to database
errors. The messaging engine subsequently started successfully
on another cluster member. However other servers within the
cell cannot connect to it.
Possible symptom message:
CWSIT0019E: No suitable messaging engine is available on bus
that matched the specified connection properties
Traces for WLM*=all show the following exception occurring
repeatedly:
com.ibm.ws.cluster.selection.NoAvailableTargetExceptionImpl
Within the detailed exception message, the string
"removed: no CFEndPoint matching criteria" appears.
LOCAL FIX:
PROBLEM SUMMARY
USERS AFFECTED:
All users of IBM WebSphere Application Server
PROBLEM DESCRIPTION:
Client cannot connect to messaging engine after it fails over
RECOMMENDATION:
None
A messaging engine stopped due to database errors. It
subsequently started successfully on another cluster member.
However clients cannot connect to it.
Possible symptom message:
CWSIT0019E: No suitable messaging engine is available on bus
that matched the specified connection properties
Traces for WLM*=all show the following exception occurring
repeatedly:
com.ibm.ws.cluster.selection.NoAvailableTargetExceptionImpl
Within the detailed exception message, the string "removed: no
CFEndPoint matching criteria" appears.
PROBLEM CONCLUSION:
The code has been changed to correct a specific cause of this
problem.
This is a difficult problem to identify because there are
other valid causes for the symptom message CWSIT0019E to
appear. There are also other valid causes for the exception
NoAvailableTargetExceptionImpl to appear in traces with the
reason "removed: no CFEndPoint matching criteria".
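As a first triage step, the log and trace files can be searched for the three strings named in the symptom description. The following is a minimal Python sketch of such a search, not part of the fix. The function name count_indicators and the sample text are hypothetical and used only for illustration. Because the same strings have other valid causes, a match only shows that the symptom is present. It does not confirm that this APAR is the cause.

```python
import re

# Strings taken from the symptom description in this document.
SYMPTOM_MESSAGE = "CWSIT0019E"
EXCEPTION_CLASS = "com.ibm.ws.cluster.selection.NoAvailableTargetExceptionImpl"
REASON_TEXT = "removed: no CFEndPoint matching criteria"


def count_indicators(log_text):
    """Return how often each symptom string occurs in log_text.

    Runs of whitespace (including line breaks) are collapsed to a
    single space first, so a reason string that wraps across lines
    is still counted.
    """
    normalized = re.sub(r"\s+", " ", log_text)
    return {
        "symptom_message": normalized.count(SYMPTOM_MESSAGE),
        "exception": normalized.count(EXCEPTION_CLASS),
        "reason": normalized.count(REASON_TEXT),
    }


# Made-up sample text for illustration only; not real server output.
sample = (
    "CWSIT0019E: No suitable messaging engine is available\n"
    "com.ibm.ws.cluster.selection.NoAvailableTargetExceptionImpl: removed: no\n"
    "CFEndPoint matching criteria\n"
)
print(count_indicators(sample))
```

To apply the sketch to a real file, read the file contents into a string and pass that string to count_indicators. Nonzero counts justify a closer look at the WLM trace, and zero counts suggest the symptom described here is not present in that file.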
The problem occurs because an incorrect relationship forms
between two objects, the RuleEtiquette and the
ChannelTargetRule. It may only be possible to observe the
incorrect relationship by having WLM traces active at the time
the messaging engine fails over. The specific trace sequence
is shown below.
[3/25/14 23:06:08:106 CST] 00000024 RuleEtiquette >
handleNotification Entry
{CELL=PSCell1, COMPONENT=CF,
NAME=InboundSecureMessaging, NODE=Node2,
SERVER=MEClusterMember2, TYPE=CHANNEL}
type.memento.updated
[EndPointImpl$MementoImpl#-415927217{CELL=PSCell1,
COMPONENT=CF, NAME=InboundSecureMessaging, NODE=Node2,
SERVER=MEClusterMember2, TYPE=CHANNEL}][true::{CELL=PSCell1,
COMPONENT=CF, NAME=InboundSecureMessaging, NODE=Node2,
SERVER=MEClusterMember2, TYPE=CHANNEL}:false]