Qflex 4

Qflex Startup Guide and WebSphere MQ Monitoring Tips

Thought I would post a simplified version of best practices for 
monitoring. If anyone would like to add to it or provide feedback, please 
go right ahead. Cheers. 

WebSphere MQ Monitoring Tips. 

Monitoring, like anything else, is about process. We hope you will find 
these tips beneficial regardless of the tool/product 
you choose for monitoring WebSphere MQ. 

1. Queue Manager 

   1.1 Set up Queue Manager Down Monitor 
   1.2 Set up Command Server Down Monitor. Many applications, including 
Qflex, depend on the command server being available. 
   1.3 Configure a TCP Connections Equal or Greater Than monitor with a 
threshold close to the limit specified in the queue manager's qm.ini file. 
   The IBM default is only about 20 connections, and this is a common cause 
of problems. You usually want to increase that value 
   and monitor for situations when the number of connections is close to 
the maximum or has exceeded it. 
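
As a rough illustration of tip 1.3, here is a minimal Python sketch that reads 
a connection limit out of qm.ini text and classifies the current connection 
count against it. It assumes the limit in question is the MaxChannels 
attribute in the Channels stanza and that comment lines start with # or ;. 
The function names and the 80% warning ratio are made up for this example, 
and how you obtain the current connection count is left to your tool. 

```python
def read_max_channels(qm_ini_text, default=None):
    """Return MaxChannels from the Channels stanza of qm.ini text, or default.

    Assumption: the limit being monitored is MaxChannels in the Channels stanza.
    """
    in_channels = False
    for raw in qm_ini_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        if line.endswith(":"):
            # Stanza header such as "Channels:" or "Log:"
            in_channels = line[:-1].strip().lower() == "channels"
            continue
        if in_channels and "=" in line:
            key, _, value = line.partition("=")
            if key.strip().lower() == "maxchannels":
                return int(value.strip())
    return default


def connection_alert(current, maximum, warn_ratio=0.8):
    """Classify the current connection count against the configured maximum.

    warn_ratio is a hypothetical default; tune it for your environment.
    """
    if current >= maximum:
        return "critical"
    if current >= maximum * warn_ratio:
        return "warning"
    return "ok"
```

For example, with a limit of 200 this would warn from 160 connections upward 
and go critical at 200. 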

2. Important Queues 

   2.1 SYSTEM.DEAD.LETTER.QUEUE. This should be monitored for 
conditions when the depth is more than 0. Each message on the dead 
   letter queue is potentially a problem that a developer might not 
know about. It is important to know that a message 
   arrived on the dead letter queue, as well as to investigate why. 
   2.2 All transmission queues should be monitored. If messages begin 
to back up on an xmitq, it might indicate that there 
   is an ongoing or intermittent problem with the channel. 

3. Error, Failure and Backout Queues 

   3.1 Applications have their own error queues. It is good practice 
for each application to have at least one reserved 
   error queue. It is wise to monitor for conditions when the depth on 
those queues is more than 0, the same as for the dead letter 
   queue. 
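
The checks in sections 2 and 3 can be sketched in a few lines of Python. This 
is only an illustration: the function names are invented, the queue depths are 
assumed to have been collected already (for example from the CURDEPTH 
attribute), and "backing up" is defined here, as an assumption, as a depth 
that keeps rising over the last few samples. 

```python
def queues_with_messages(depths, watched):
    """Return the watched queues whose current depth is greater than 0.

    depths:  dict mapping queue name -> current depth
    watched: queue names that should normally be empty (dead letter
             queue, application error queues, and so on)
    """
    return sorted(q for q in watched if depths.get(q, 0) > 0)


def is_backing_up(samples, min_samples=3):
    """Return True if the last min_samples depth readings strictly increase.

    samples is a list of depth readings for one transmission queue,
    oldest first. min_samples is a hypothetical default.
    """
    recent = samples[-min_samples:]
    if len(recent) < min_samples:
        return False  # not enough history to judge a trend
    return all(a < b for a, b in zip(recent, recent[1:]))
```

A single non-zero reading on an xmitq is often normal while a channel is 
moving messages, which is why the second function looks at the trend rather 
than at one value. 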

4. Application Input Queues 

   4.1 By monitoring the application input queue, we can tell how well 
the application is processing messages, if at all. 
   Each application will have a specific threshold 
that we should know about. For example, for 
   some applications it is not abnormal to have more than a few thousand 
messages queued, whereas for others anything more than 
   10 might imply a serious problem. 

   4.2 If an application is supposed to be reading from a queue or 
writing to a queue at all times, it is a good 
   idea to monitor the input and output counts (the number of handles 
that have the queue open for reading or writing). If we see that the 
   input count has dropped below 1, we know that the application 
might have crashed or stopped. 
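
Here is one way the per-application rules from section 4 could be expressed, 
as a small Python sketch. The QueueRule class and check_input_queue function 
are hypothetical names for this example. The depth and the open-for-input 
count are assumed to come from the queue's CURDEPTH and IPPROCS attributes, 
collected by whatever tool you use. 

```python
from dataclasses import dataclass


@dataclass
class QueueRule:
    max_depth: int              # depth above which the application is considered behind
    needs_reader: bool = False  # True if the queue should always be open for input


def check_input_queue(name, curdepth, ipprocs, rule):
    """Return a list of alert strings for one application input queue."""
    alerts = []
    if curdepth > rule.max_depth:
        alerts.append(f"{name}: depth {curdepth} exceeds threshold {rule.max_depth}")
    if rule.needs_reader and ipprocs < 1:
        alerts.append(f"{name}: no application has the queue open for input")
    return alerts
```

Keeping one rule per queue makes it easy to give a batch queue a threshold of 
several thousand while an online queue alerts at 10. 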

5. Channels. 

   5.1 If an application is supposed to be connected at all times via 
an SVRCONN channel, it is a good idea to set up a 
   monitor which will detect when that channel is not 
running. This, just like the read and write count monitors, 
   can act as an early sign of a potential problem with the application. 

   5.2 Sender and receiver channels should also be monitored based on 
their status. 
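
A channel status monitor for section 5 might look like the Python sketch 
below. It assumes the statuses have already been collected into a dictionary 
(for example from DISPLAY CHSTATUS output) and treats only RUNNING as healthy, 
which is a simplification you may want to relax. A channel with no status 
entry at all is reported as having no active instance. The function name is 
invented for this example. 

```python
HEALTHY_STATUSES = {"RUNNING"}  # assumption: anything else is worth an alert


def channel_alerts(expected, observed):
    """Return alert strings for channels that are not in a healthy state.

    expected: channel names that should be running at all times
    observed: dict mapping channel name -> status string; a channel
              with no current status is simply absent from the dict
    """
    alerts = []
    for name in expected:
        status = observed.get(name)
        if status is None:
            alerts.append(f"{name}: no active instance")
        elif status.upper() not in HEALTHY_STATUSES:
            alerts.append(f"{name}: status {status.upper()}")
    return alerts
```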

6. FDC Files. 

   6.1 The presence of FDC files sometimes means that a serious error has 
occurred on the server and should be looked into, 
   though that is not always the case. Sometimes FDC files are 
generated by minor events such as a client 
   disconnecting abruptly. 

   6.2 AMQERR log files (AMQERR01.LOG and so on). It is a good idea to 
keep an eye on those log files as well. There are certain AMQ error codes 
   that definitely should be monitored for. We are working on the list 
of the most severe ones and will post it as soon as 
   it is ready.
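
Until that list is posted, a simple way to watch both FDC files and the error 
logs is sketched below in Python. The function names are invented for this 
example, the error directory path is whatever applies to your installation, 
and the set of codes to watch is supplied by the caller rather than suggested 
here. The pattern assumes message IDs of the form AMQ followed by four digits. 

```python
import re
from collections import Counter
from pathlib import Path

# AMQ followed by exactly four digits; any trailing severity letter is ignored.
AMQ_CODE = re.compile(r"\b(AMQ\d{4})(?!\d)")


def recent_fdc_files(errors_dir, since_epoch):
    """Return sorted names of *.FDC files modified at or after since_epoch."""
    return sorted(
        p.name
        for p in Path(errors_dir).glob("*.FDC")
        if p.stat().st_mtime >= since_epoch
    )


def count_amq_codes(log_text, watch=None):
    """Count AMQ message IDs in error log text.

    If watch is given (a set of IDs such as {"AMQ9999"}), only those
    IDs are counted; otherwise every ID found is counted.
    """
    counts = Counter(AMQ_CODE.findall(log_text))
    if watch is not None:
        counts = Counter({code: n for code, n in counts.items() if code in watch})
    return counts
```

Running recent_fdc_files with the time of the previous check reports only new 
FDC files, so a one-off FDC from an abrupt client disconnect does not keep 
alerting forever. 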