WebSphere MQ Monitoring Tips
Monitoring, like anything else, is about process. We hope you find these tips useful regardless of the tool or product you choose for monitoring WebSphere MQ.
- Set up a Queue Manager Down monitor.
- Set up a Command Server Down monitor. Many applications, including Qflex, depend on the command server being available.
- Configure a TCP Connections Equal or Greater Than monitor with a threshold close to the limit specified in the queue manager's qm.ini file. The IBM default is only about 20 connections, and this is a common cause of problems. You usually want to increase that value and monitor for situations where the number of connections is close to the maximum or has exceeded it.
- SYSTEM.DEAD.LETTER.QUEUE. Monitor this queue for any depth greater than 0. Each message on the dead letter queue is potentially a problem that a developer might not know about. It is important both to know that a message arrived on the dead letter queue and to investigate why.
- All transmission queues should be monitored. If messages begin to back up on an xmitq, it might indicate an ongoing or intermittent problem with the channel.
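As a rough illustration of the first few tips, the sketch below parses text in the `QMNAME(...) STATUS(...)` layout printed by `dspmq` and classifies connection usage against a configured maximum. The function names and the `warn_ratio` value of 0.8 are assumptions made for this example, not part of any product; how the current connection count is collected is left out.

```python
import re

# Matches lines such as: QMNAME(QM1)   STATUS(Running)
DSPMQ_LINE = re.compile(r"QMNAME\((?P<name>[^)]+)\)\s+STATUS\((?P<status>[^)]+)\)")


def parse_dspmq(output):
    """Return {queue manager name: status} from dspmq-style text."""
    return {m.group("name"): m.group("status") for m in DSPMQ_LINE.finditer(output)}


def queue_managers_down(output):
    """List queue managers whose status is anything other than 'Running'.

    This deliberately strict rule also reports states such as a standby
    instance; relax it if that is normal at your site.
    """
    statuses = parse_dspmq(output)
    return sorted(name for name, status in statuses.items() if status != "Running")


def connection_alert(current, maximum, warn_ratio=0.8):
    """Classify connection usage against the configured maximum.

    warn_ratio is a hypothetical threshold; tune it per site.
    """
    if maximum <= 0:
        raise ValueError("maximum must be positive")
    if current >= maximum:
        return "critical"
    if current >= warn_ratio * maximum:
        return "warning"
    return "ok"
```

A scheduler could feed the captured `dspmq` output to `queue_managers_down` and raise an alert for every name it returns, while `connection_alert` gives an early warning before the limit in qm.ini is actually reached.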
Error, Failure and Backout Queues
- Applications have their own error queues. It's good practice for each application to have at least one reserved error queue. It is wise to monitor these queues for any depth greater than 0, just as with the dead letter queue.
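The "depth greater than 0" rule for dead letter and error queues can be sketched as follows. The example assumes queue attributes arrive as MQSC-style `QUEUE(...)` and `CURDEPTH(...)` text, such as the output of a `DISPLAY QLOCAL(*) CURDEPTH` command. The helper names and the queue name `APP1.ERROR.QUEUE` are hypothetical.

```python
import re

# Matches MQSC-style attributes such as QUEUE(NAME) or CURDEPTH(3)
ATTR = re.compile(r"([A-Z]+)\(([^)]*)\)")


def parse_queue_depths(mqsc_output):
    """Map queue name -> current depth from MQSC-style attribute text."""
    depths = {}
    current = None
    for key, value in ATTR.findall(mqsc_output):
        if key == "QUEUE":
            # A QUEUE attribute starts the record for a new queue
            current = value.strip()
        elif key == "CURDEPTH" and current is not None:
            depths[current] = int(value)
    return depths


def nonempty_queues(depths, watched):
    """Return the watched queues whose depth is above 0, with their depths."""
    return {name: depth for name, depth in depths.items()
            if name in watched and depth > 0}


# Hypothetical watch list: the dead letter queue plus one application error queue
WATCHED = {"SYSTEM.DEAD.LETTER.QUEUE", "APP1.ERROR.QUEUE"}
```

Any queue returned by `nonempty_queues(parse_queue_depths(text), WATCHED)` holds at least one message that someone should look at.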
Application Input Queues
- Monitoring the application input queue tells us several things, chiefly how well the application is processing messages, if at all. Each application has a specific threshold that we should know about. For example, for some applications a backlog of a few thousand messages is not abnormal, whereas for others anything more than 10 might imply a serious problem.
- If an application is supposed to be reading from or writing to a queue at all times, it is a good idea to monitor the queue's open input and output counts. If the input count drops below 1, the application might have crashed or stopped.
- If an application is supposed to be connected at all times via an SVRCONN channel, set up a monitor that detects when that channel is not running. Like the input and output count monitors, this can act as an early sign of a potential problem with the application.
- Sender and receiver channels should also be monitored based on their status.
- The presence of FDC files sometimes means that a serious error has occurred on the server and should be looked into, though that is not always the case. Sometimes FDC files are generated by minor events, such as a client disconnecting abruptly.
- AMQERR log files. It is a good idea to keep an eye on these log files as well. Certain AMQ error codes should definitely be monitored for. We are working on a list of the most severe ones and will post it as soon as it is ready.
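The application-level checks above might be combined along these lines. This is a minimal sketch that assumes the depth, open input count, and channel status values have already been collected by whatever tool you use. `QueueRule`, the function names, and the idea of a caller-supplied watch list of AMQ codes are all assumptions for illustration; no particular code is claimed here to be severe.

```python
import re
from collections import Counter
from dataclasses import dataclass

# Message identifiers in the error logs start with AMQ followed by four digits
AMQ_CODE = re.compile(r"\bAMQ\d{4}")


@dataclass
class QueueRule:
    max_depth: int              # application-specific backlog threshold
    needs_reader: bool = False  # True if the queue should always be open for input


def evaluate_queue(name, depth, open_input_count, rule):
    """Return alert strings for one application input queue."""
    alerts = []
    if depth > rule.max_depth:
        alerts.append(f"{name}: depth {depth} exceeds threshold {rule.max_depth}")
    if rule.needs_reader and open_input_count < 1:
        alerts.append(f"{name}: no application has the queue open for input")
    return alerts


def channels_not_running(statuses, required):
    """List required channels whose status is not RUNNING.

    statuses maps channel name -> status string; a channel absent from
    the mapping is treated as not running.
    """
    return sorted(ch for ch in required if statuses.get(ch) != "RUNNING")


def count_watched_codes(log_text, watched):
    """Count occurrences of the caller-supplied AMQ codes in log text."""
    found = Counter(code for code in AMQ_CODE.findall(log_text) if code in watched)
    return dict(found)
```

Keeping the thresholds in one `QueueRule` per queue makes it easy to record that a few thousand messages is normal for one application while more than 10 is alarming for another.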