August 25, 2006

All systems are real-time and need extra processing power

Real-time systems have a lot of hype around them, and as always, the hype needs a special form of speech to cover for the shortcuts. In the case of real-time, that speech is about "soft real-time".

The problem is that "hard real-time" systems (the ones that ALWAYS meet their execution deadlines) are fiendishly difficult to build, especially on mainstream hardware, using mainstream practices, under any kind of realistic load.

"Soft real-time" systems are thus not required to always meet execution deadlines, but rather have a probabilistic process imposed on them. Yes, it's bad if you fail to react in time, but it's still better if you react 3 ms late 40% of time than 10 ms late 10% of time (depends on the application of course).

But then, aren't all systems real-time? After all, no one needs a system which never finishes processing, and sooner is always better than later. Systems may have more or less relaxed requirements, but it's the same probabilistic process at work: for example, 90% of users would wait 5 seconds for a transaction to complete, but only 10% wouldn't be calling tech support after a 30-second delay.

Look closely at the term "real-time" itself. Everything is executed in real time: every process runs against the flow of time, every execution has a start and an end, whether it has been a success is judged by an external entity (a user, in most cases), and timing is far from the least important criterion.

Anyhow, I'd like to point out one other thing. There is a theorem (Liu and Layland's utilization bound for rate-monotonic scheduling, if I remember correctly) which says that for a real-time system to be guaranteed to meet its deadlines under certain conditions, it must keep a significant fraction of its CPU processing power free. The bound works out to roughly 70% (it approaches ln 2, about 69.3%, as the number of tasks grows), which means that if the CPU is busier than that, the system can no longer be guaranteed to meet its deadlines.
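For the curious, here is a minimal sketch of that bound, assuming the theorem in question is indeed the Liu and Layland result for periodic tasks under rate-monotonic scheduling:

    # Liu & Layland: n periodic tasks are guaranteed to meet their
    # deadlines under rate-monotonic scheduling if total CPU
    # utilization does not exceed n * (2**(1/n) - 1).
    import math

    def rms_bound(n: int) -> float:
        """Worst-case schedulable utilization for n periodic tasks."""
        return n * (2 ** (1.0 / n) - 1)

    for n in (1, 2, 3, 5, 10, 100):
        print(f"{n:4d} tasks: {rms_bound(n):.1%}")

    # The bound falls toward ln 2 as the task count grows:
    print(f"   limit: {math.log(2):.1%}")

Running it shows the bound sliding from 100% for a single task down past 71.8% at ten tasks toward the 69.3% limit, which is where the "around 70%" figure comes from.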

If this result is blindly (or philosophically) applied to all systems, it leads to an interesting conclusion: no system should be operating under stress load (perhaps above the mentioned 70% CPU), otherwise it starts slipping behind schedule and, being real-time, degrades its service significantly.

It would thus be a mistake to believe that an application quality of service would remain steady across all of the CPU load scale. Instead, as CPU usage climbs up to 70%, the application may scale perfectly linearly, but as the load keeps rising, the scaling breaks. It's therefore more safe to consider 70% to be the top acceptable load and start adding more processing power as this line is crossed.