Critical Regions and the Effect on RTOS Responsiveness
A key design concern for any RTOS is that of its responsiveness to interrupts. Because an interrupt can occur at any time, the RTOS must be prepared to recognize the interrupt as quickly as possible and then accommodate the interrupt service routine (ISR) that handles it. The issue becomes one of managing the internal data structures of the kernel in such a way that regardless of what is executing at the time of the interrupt, none of those data structures get corrupted. The time-honored way of preventing data corruption in an RTOS is to make use of one or more critical regions in places where an interruption of processing flow within a kernel service and possible reentrancy might lead to damage.
The critical region can assume several forms but it is primarily a section of code that executes with interrupts disabled. The critical region starts by disabling interrupts and ends by enabling them. In between, the code can execute without fear of interruption and hence, corruption. There are various ways to disable and enable interrupts, especially those processors that have multiple levels, making critical regions a bit more complex. But for now, let’s just keep it simple for the sake of clarity.
Because critical regions disable interrupts, system responsiveness is greatly affected by the duration of the critical region (if any) that is in effect at the time an interrupt occurs. Simply put, the shorter the critical region, the better the system’s responsiveness. If a task is interrupted in the middle of a kernel service it has invoked, there is a good chance that the kernel’s data structures are in a state of flux at the time of the interrupt. Without the kernel executing the code that keeps the data structure pristine, the interrupt and its ISR could invoke a kernel service that totally corrupts the data structure, leading to les than desirable results for both the task and the ISR. Obviously, such situations must be avoided.
Thus, when designing an RTOS one has to make a fundamental decision as to how kernel services are to be treated at the interrupt level. There are basically two choices and both are legitimate.
- Allow ISRs to make kernel service requests just like a task with the exception that an ISR cannot block, (i.e., wait for a service to reach completion such as waiting to get data from an empty queue).
- ISRs can only issue kernel service requests from a limited set of services that restricts what the ISR can do/change in the system.
The first choice naturally leads to a reentrant kernel and great flexibility because tasks and ISRs are treated the same, with that one exception about waiting. That flexibility comes at a price of designing kernel services that require lengthy critical regions in order to protect the underlying data structures in the kernel. An ISR must not be allowed to corrupt an internal data structure in use by a task at the time of the interrupt. Thus, any and all parts of the kernel service code that would be vulnerable to the ISR must be protected by one or more critical regions. Hence, the critical region may be rather large, leading to reduced interrupt responsiveness. So there is a trade-off: Great coding flexibility for reduced responsiveness.
The second design choice limits the capability of the kernel services that may be invoked from an ISR. By making such a restriction, the kernel designer can precisely determine how a task’s use of a service and its underlying data structures would be vulnerable to an ISR. There are still critical regions to consider but the design generally results in fewer and smaller critical regions, making this choice more responsive to interrupts. The trade-off for this concept then is: Reduced flexibility for improved responsiveness.
While both of these choices are legitimate, the RTXC Quadros RTOS uses the second choice in the belief that responsiveness is preferable to ISR coding flexibility, provided that there is a reasonable set of kernel services available to the ISR designer. And, of course, RTXC Quadros RTOS does provide such a set of kernel services using the same form as services available to tasks but with name space separation. Task services have a prefix of KS_ while the services available to ISRs have a prefix of IS_.
Given these two choices and how they can affect an RTOS design, it should be pointed out that it is also possible to create an RTOS design that reduces the number of critical regions in the entire kernel to only a few, perhaps two or three. The resulting design, while atypical, can mitigate the undesirable elements of the two choices just described yielding an RTOS that is highly responsive to interrupts and has high ISR coding flexibility through a common set of kernel services for tasks and ISRs.