Embedded Systems November 2000 Vol13_12


NIALL MURPHY

Watchdog Timers

To keep a watchdog timer from resetting your system, you've got to kick it regularly. But that's not all there is to watchdog science. We will examine the use and testing of a watchdog, as well as the integration of a watchdog into a multitasking environment.

Making proper use of a watchdog timer is not as simple as restarting a counter. If you have a watchdog timer in your system, you must choose the timeout period carefully, ensure that the watchdog timer is tested regularly, and, if you are multitasking, monitor all of the tasks. In addition, the recovery actions you implement can have a big impact on overall system reliability.

A watchdog timer is a piece of hardware, often built into a microcontroller, that can cause a processor reset when it judges that the system has hung or is no longer executing the correct sequence of code. This article will discuss exactly the sort of failures a watchdog can detect, and the decisions that must be made in the design of your watchdog system. The first half of the article will assume that there is no RTOS present. The second half covers a scheme for making use of a watchdog in a multitasking system.

The hardware component of a watchdog is a counter that is set to a certain value and then counts down towards zero. It is the responsibility of the software to set the count to its original value often enough to ensure that it never reaches zero. If it does reach zero, it is assumed that the software has failed in some manner and the CPU is reset.

In other texts you will see various terms for restarting the timer: strobing, stroking, or updating the watchdog. However, in this article we will use the more visual metaphor of a man kicking the dog periodically, with apologies to animal lovers. If the man stops kicking the dog, the dog will take advantage of the hesitation and bite the man.
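The countdown-and-reload behavior described above can be modeled in a few lines of C. This is only a software sketch for reasoning about timeout periods, not a driver for any real part; the type `wdt_model` and the functions `wdt_init`, `wdt_kick`, `wdt_tick`, and `wdt_run` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of a watchdog down-counter (not a real device). */
typedef struct {
    uint32_t reload;  /* value the counter is restored to on each kick */
    uint32_t count;   /* current count, decremented once per tick */
    bool     bitten;  /* set when the count reaches zero (CPU would be reset) */
} wdt_model;

static void wdt_init(wdt_model *w, uint32_t reload)
{
    assert(reload > 0);  /* a zero reload value would bite immediately */
    w->reload = reload;
    w->count  = reload;
    w->bitten = false;
}

/* Kick the dog: software restores the counter to its original value. */
static void wdt_kick(wdt_model *w)
{
    if (!w->bitten) {
        w->count = w->reload;
    }
}

/* One hardware clock tick. Returns true once the watchdog has bitten. */
static bool wdt_tick(wdt_model *w)
{
    if (!w->bitten && --w->count == 0) {
        w->bitten = true;
    }
    return w->bitten;
}

/* Run for `ticks` ticks, kicking every `kick_period` ticks (0 = never kick).
 * Returns true if the watchdog bit during the run. */
static bool wdt_run(wdt_model *w, uint32_t ticks, uint32_t kick_period)
{
    for (uint32_t t = 1; t <= ticks; t++) {
        if (wdt_tick(w)) {
            return true;
        }
        if (kick_period != 0 && t % kick_period == 0) {
            wdt_kick(w);
        }
    }
    return w->bitten;
}
```

In this model, a reload value of 5 survives indefinitely when kicked every 3 ticks, but bites when the kick arrives every 5 ticks or never arrives: the kick period has to be strictly shorter than the timeout.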
The watchdog is kicked by writing to an I/O line or a specific memory address. In some implementations a combination of addresses must be accessed in successive bus cycles. This reduces the chance that an errant program will accidentally kick the dog on a regular basis, preventing a bite. It is also possible to design the hardware so that a kick that occurs too soon will cause a bite, but in order to use such a system, very precise knowledge of the timing characteristics of the main loop of your program is required.

What errors are caught

A properly designed watchdog mechanism should, at the very least, catch events that hang the system. In electrically noisy environments, a power glitch may corrupt the program counter, stack pointer, or data in RAM.
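The two kick mechanisms described earlier, a sequence of address accesses and a window that punishes kicks arriving too soon, can be sketched as a small software model. Everything here is an assumption made for illustration: the addresses `WDT_ADDR1` and `WDT_ADDR2`, the type `win_wdt`, and the function names do not correspond to any particular device, and real hardware defines its own sequence and timing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical kick addresses; real hardware defines its own. */
#define WDT_ADDR1 0x5555u
#define WDT_ADDR2 0x2AAAu

typedef struct {
    uint32_t timeout;      /* ticks after the last kick at which the dog bites */
    uint32_t window_open;  /* a kick earlier than this many ticks also bites */
    uint32_t elapsed;      /* ticks since the last valid kick */
    bool     armed;        /* true if the previous access was WDT_ADDR1 */
    bool     bitten;       /* true once a reset would have been issued */
} win_wdt;

static void win_wdt_init(win_wdt *w, uint32_t window_open, uint32_t timeout)
{
    assert(window_open < timeout);  /* otherwise no kick could ever be valid */
    w->timeout     = timeout;
    w->window_open = window_open;
    w->elapsed     = 0;
    w->armed       = false;
    w->bitten      = false;
}

/* Advance the model by a number of clock ticks. */
static void win_wdt_advance(win_wdt *w, uint32_t ticks)
{
    while (ticks-- > 0 && !w->bitten) {
        w->elapsed++;
        if (w->elapsed >= w->timeout) {
            w->bitten = true;  /* kick came too late, or never */
        }
    }
}

/* Model one bus access. Only WDT_ADDR1 followed immediately by WDT_ADDR2
 * counts as a kick; any other access in between breaks the sequence. */
static void win_wdt_access(win_wdt *w, uint16_t addr)
{
    if (w->bitten) {
        return;
    }
    if (w->armed && addr == WDT_ADDR2) {
        w->armed = false;
        if (w->elapsed < w->window_open) {
            w->bitten = true;  /* kick came too soon */
        } else {
            w->elapsed = 0;    /* valid kick: restart the timeout */
        }
        return;
    }
    w->armed = (addr == WDT_ADDR1);
}
```

With `window_open` set to 3 and `timeout` set to 10, a complete sequence after 5 ticks restarts the count, an interrupted sequence is ignored, and a complete sequence after only 1 tick causes a bite. The last case shows why the windowed scheme demands precise knowledge of the main loop's timing.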
