From: D Yuniskis on
Hi George,

George Neuner wrote:
> On Sun, 18 Apr 2010 01:56:10 -0400, George Neuner
> <gneuner2(a)comcast.net> wrote:
>
> Re: deadline scheduling
>
>> And, for the record, I'm not describing something impossible - only
>> something which is VERY HARD. I can point you to reading on the
>> subject if you're interested.
>
> I realized right after I sent this that some might consider this
> statement foolishness because - as Don said previously, there are
> situations where no possible scheduling works.

Exactly! My point is that EDF *will* do reliable deadline
scheduling *if* the workload at every instant *is* "schedulable"
(at <= 100% utilization).

So, you could conceivably implement a preemptive MTOS (which
is *truly* preemptive and not just "sort of") *without* timeslicing
and achieve your results with a scheduler that does:

void reschedule(void) {
    task_t *task;

    /* always pick the READY task with the nearest deadline and let
       it run to completion -- no timeslicing */
    task = find_task_with_earliest_deadline();
    run_to_completion(task);
}

(N.B. the "run TO COMPLETION")

> A scheduler can only work within the bounds of the resources it has
> .... if those are inadequate for the problem specification, then that
> is not the fault of the scheduler.
>
> However, a guarantee scheduler - of which "deadline" is just one form
> - is an active agent in the system which serves to double check the
> specification. If there is an irresolvable scheduling, that may
> reveal a flaw, either in the spec or in the implementation. If, as

Or, in the environment in which it is deployed.
"/* CAN'T HAPPEN */" often *does*! :>

> Don said:
>
>>> ... that "can't work" -- with unconstrained task requirements.
>>> E.g., I have three "tasks" each needing 100 time units to complete
>>> and each must be completed within 110 time units. "Coincidence"
>>> has caused them to be started (asynchronous events!) at the same
>>> instant. No scheduling algorithm "works" in overload.
>
> The phrase "unconstrained task requirements" is the key here. What
> Don describes above is a specification error.

Or, a misapplication of a scheduling algorithm, or poor
expectations about how the system *will* respond in these
cases.

I.e., my EDF scheduler uses priorities to tell the scheduler what
*must* get done. This lets the scheduler be leaner and focus on
the "high priority" tasks without being distracted by the
order of deadlines.

E.g., a low priority task may have the "earliest" deadline;
it shouldn't compete with a higher priority task that has a "further out"
deadline. In other words, I can use priorities in an intuitive
manner: what *needs* to get done vs. what *wants* to get done.
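
Roughly, the selection ends up with this flavor -- just a sketch, the
task_t layout and the READY-queue walk are invented for illustration;
only the highest READY priority is considered, and deadlines merely
order tasks *within* that priority:

#include <stddef.h>

typedef struct task {
    struct task   *next;       /* link in the READY queue */
    unsigned char  priority;   /* bigger == more important ("needs") */
    unsigned long  deadline;   /* absolute deadline, in ticks */
} task_t;

/* Only tasks at the highest READY priority are candidates; EDF just
   breaks ties *within* that priority. */
task_t *select_task(task_t *ready)
{
    task_t *best = ready;

    for (task_t *t = ready; t != NULL; t = t->next) {
        if (t->priority > best->priority ||
            (t->priority == best->priority && t->deadline < best->deadline))
            best = t;
    }

    return best;   /* NULL if nothing is READY */
}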

> What I'm getting at here is that we are not discussing scheduling in
> general purpose operating systems with an unbounded mix of user tasks.
> Rather, we are discussing scheduling in tightly crafted, coherent
> systems having fairly rigid design specifications. Those are the
> constraints that allow guarantee scheduling to work.
>
> EDF is a very simple form of guarantee scheduling. There are more
> complex forms which take into account more parameters.

But, as Vladimir said, at some point you start consuming lots of
resources trying to juggle competing requests. I.e., you divert
resources that could be used to accomplish those "tasks" (jobs)
to decide *how* to allocate the resources. :<

RT systems really *are* "an acquired taste". Often, there
are no hard and fast rules describing how the system
should be designed, deployed, etc. And, for all but "small"
or "simple" systems (leaving those undefined for now), you
usually have to deploy several different technologies to
get the results you want with the reliability/robustness
that you need (since many RT systems are dedicated, deeply
embedded systems that you can't just "reboot" without
serious consequences).

E.g., I allow multiple scheduling algorithms to be active
at any given time. Because tasks often have very different
criteria governing how they *want* to be scheduled.
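
One way to picture it (a sketch only -- the "class" hook and the
particular disciplines are hypothetical, not lifted from any actual
product): give each discipline its own "pick next" hook and consult
them in order of importance, so tasks only compete against others
scheduled the same way:

#include <stddef.h>

typedef struct task task_t;

/* Each scheduling discipline supplies its own "pick next READY task"
   hook; NULL means it has nothing runnable right now. */
typedef struct {
    task_t *(*pick_next)(void);
} sched_class_t;

/* Hypothetical disciplines, ordered by importance: hard-deadline
   work first, then fixed priority, then background round-robin. */
extern const sched_class_t edf_class, fixed_prio_class, round_robin_class;

static const sched_class_t *classes[] = {
    &edf_class, &fixed_prio_class, &round_robin_class
};

task_t *pick_next_task(void)
{
    for (size_t i = 0; i < sizeof classes / sizeof classes[0]; i++) {
        task_t *t = classes[i]->pick_next();
        if (t != NULL)
            return t;
    }
    return NULL;   /* nothing READY -- idle */
}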

Trying to open such a system up to "third party applications"
is just a mess looking for a place to happen -- since you
can't arbitrarily "unconstrain" things just to accommodate
something that *might* come along (e.g., in one product, I
support third party apps by forcing them to run on a VM
that has its own place in the "priority structure" of the
product. If that doesn't work for a particular 3rd party
app... <shrug> "Sorry")
From: D Yuniskis on
Hi Dimiter,

Didi wrote:
> On Apr 16, 9:37 pm, D Yuniskis <not.going.to...(a)seen.com> wrote:
>> ...
>> Note the non-believers who complained of my use of dynamic
>> memory allocation in RT systems. :>
>
> It does not take much belief to see the usefulness of dynamic memory
> allocation, just some (not very high) complexity of the task at
> hand :-) .

<shrug> It takes a mindset that is willing to find the
certainties in the apparent UNcertainty. :> It seems all too often
that folks adopt "guidelines" as *fact* -- "Don't use dynamic
memory allocation", "Don't use pointers", etc. -- instead of
understanding the hazards associated with each (heck, division
can be hazardous if the divisor approaches zero!). "Don't
run with scissors".

[is your email address working? I have to forward some stuff
I wrote re: the above as I think you might have some insights.
Though I'm busy with other "distractions" for the next few weeks]

>>> So a task switcher that has an unpredictable task switch time can still
>>> be real time _if_ that unpredictable switch time has an upper bound.
>> Yes. See above. OTOH, if that scheduler examines *every* task in
>> the system -- even those known to be "not READY" -- then its
>> performance will be abysmal.
>
> I believe most people put too much into task switch timing
> for real time. While it has to be kept in sane limits, it is rarely
> if ever the system limiting factor. Interrupt latency is.

Hmmm... I'll agree that the time *for* a context switch is
just "specsmanship". But, I'm not sure I would unilaterally
accept IRQ latency as *the* limiting factor. I'd have to think
about that as I've never tried to identify such a thing (if there
really *is* "one")

In recent projects, communications have been the biggest
factor (transport delay, decoding delay, marshalling, etc.).
Especially in heterogeneous systems :<

Having said that, I would, I guess, have to say the design of
the interconnects is the limiting factor (?)

> Vladimirs original question was about fairness of lower priority
> tasks. How I have implemented it in DPS (around 1994 and nearly
> unchanged
> since because of no necessity) can be looked up. I have yet to see
> a scheme which could rival this one (which I have likely reinvented,
> by 1994 all that stuff had all been long beaten to death; but
> reinventing the obvious is often the fastest way :-) ).

Pointers?
From: D Yuniskis on
Hi Paul,

Paul Keinanen wrote:
> On Sat, 17 Apr 2010 13:49:53 -0700, D Yuniskis
>> Vladimir Vassilevsky wrote:
>
>>> I thought for a while about complete unification of interrupts and
>>> tasks. I.e. having them under the same abstraction. Unfortunately, that
>>> implies a lot of OS overhead.
>> I treat the "application layer" AS IF it was just N layers
>> of ISRs competing and coexisting. E.g., I try to design
>> APIs so there aren't arbitrary "can't be called from an ISR"
>> restrictions. Sometimes this affects the types of
>> objects that can be supported (as you need to keep the
>> interfaces "light").
>>
>> The problem I always see with traditional "priority based"
>> scheduling is there are no hard and fast rules for
>> assigning priorities.
>
> For decades I have used the following guidelines:
>
> 1. Each real-time task should spend most of its time idle, waiting
> for an event to process (waiting more than 99 % of the time for the
> highest priority tasks).

I think that depends a lot on the application. E.g., my
multimedia RTOS is designed with the expectation that events
are frequent and need timely servicing (i.e., delivering
audio or video packets to clients). I'm notorious for
trimming hardware down to (just barely) fit the needs of
the application so there isn't much time spent "waiting" :>

> 2. Do as little as possible in the highest priority tasks, i.e. lower
> priority tasks can execute longer at a time than the high priority
> tasks.

Yes. This is analogous to my treating the code as "N levels of
interrupts". You want to get higher priority ISR's out of the
way *quickly* (e.g., NMI) because they are so abusive of
resources. Let lower priority handlers deal with most of
the workload as they can be more "elastic".

> 3. At the bottom of the priority list the null tasks (which normally
> consumes the unused CPU time in an infinite loop) can be replaced by
> e.g. some round robin scheduler running non-RT tasks such as user
> interfaces etc.

I've been bitten by this, in the past! I.e., the user interface
can't simply run at "no priority", as there can be times when it is
squeezed out completely (again, I tend to downsize hardware).
I've found, instead, that you should split the user interface
into (at least?) two layers. One layer needs to be very
responsive. I.e., after a few hundred ms, people get
anxious about whether or not the device is "working". And,
they also tend to repeat actions. If you don't design
with that in mind (often the case for many UIs!), the
user gets annoyed when you do *exactly* what he told you
to do -- except skewed in time, because he assumed you *didn't*
see one of his actions (which you did -- though you buffered it
and reacted "when you had a timeslice")

>> Instead, folks tend to discover
>> belatedly that the structure of their system is "wrong"
>> and try to *make* it work by tweeking priorities.
>
> Inexperienced people seem to increase the priority of those tasks that
> do not respond fast enough.

Yes. Then they increase the priority of the *other* tasks that *now*
aren't responding properly, etc.

Figure out what your priorities *should* be. On paper.
If they aren't working as you expected, figure out why.
What's wrong with your assumptions? Maybe something
is *broken* (even a hardware -- gasp -- failure that
your software is gallantly trying to cope with).

I've found two features very helpful in designing
MTOS's which can more easily be tuned:
- dynamic priority modification
- timeslice adjustment

In the first, I create each "task" with a range of
legal priorities at which it can operate (these can
be overridden by the MTOS as *it* sees fit -- e.g.,
priority inheritance protocols). A "task" (or,
any task having an appropriate credential for
doing so) can modify another task's priority
dynamically.

So, if a task (or a task monitoring that task)
determines that something isn't getting done when
(or as often as) it should, the deficient task's
priority can be elevated and the results observed.
(this is tricky as it can lead to oscillations at
runtime -- but, it can also be an excellent
debugging tool as you can *look* at the actual
priority that something algorithmically determined
was "correct" based on real *measured* conditions)

In the second, I allow the size of the timeslice for
each task to be specified (and modified). So,
two tasks might execute at the same priority,
but, I might give one of them larger blocks of
time to get its work accomplished. Note that this
is not the same as elevating its priority! (consider
the case of two tasks that are "always READY")
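
A sketch of what the task control block ends up carrying to support
both features (the field and function names are invented for
illustration, not taken from any particular MTOS of mine):

#include <stdbool.h>
#include <stdint.h>

typedef uint8_t priority_t;

typedef struct {
    priority_t min_priority;   /* floor of the legal range */
    priority_t max_priority;   /* ceiling of the legal range */
    priority_t priority;       /* current (dynamic) priority */
    uint32_t   timeslice;      /* quantum, in ticks, per task */
} tcb_t;

/* Change another task's priority (the caller is assumed to hold the
   appropriate credential). The request is clamped to the range the
   task was created with; the MTOS may still override it later, e.g.
   for priority inheritance. */
bool set_task_priority(tcb_t *tcb, priority_t requested)
{
    if (requested < tcb->min_priority || requested > tcb->max_priority)
        return false;              /* outside the declared range */
    tcb->priority = requested;
    return true;
}

/* Give one of two equal-priority tasks a larger quantum: it gets more
   CPU per turn, but it does NOT preempt its peer the way a higher
   priority would. */
void set_task_timeslice(tcb_t *tcb, uint32_t ticks)
{
    tcb->timeslice = ticks;
}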

>> And,
>> this turns into a "tipping canoe" exercise -- you move one
>> way and the canoe starts to tip... so you move the other
>> way and it tips a different way, etc. I.e., there is
>> no "engineering" involved in picking the priority levels.
>
> When trying to fix messed up RT systems that others have
> made, the first thing is to look for tasks, for which the priority can
> be _lowered_, without causing significant performance loss for that
> task. When less critical tasks priorities have been moved lower, they
> are no longer competing with the most time critical tasks.
>
> Then it is time to look at those tasks running a long time at each
> activation (but still needing high priority to avoid e.g. Rx buffer
> overrun) and move those parts to a lower priority task (which is
> allowed to execute longer at a time). For instance the received bytes
> are assembled into a complete message frame in a high priority task,
> while any application level message data processing is moved to a
> lower priority task (or even to the null-task).

To be clear, I assume you mean "decompose those tasks running a
long time at each activation". I.e., factor out the portion
of the task that *needs* to run at a higher priority from
that portion that doesn't. (?)
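
I.e., roughly this shape -- the primitives (uart_get_byte(),
queue_put(), etc.) are hypothetical stand-ins for whatever the real
driver/OS actually provides:

#include <stdbool.h>
#include <stdint.h>

#define FRAME_MAX 64

typedef struct {
    uint8_t data[FRAME_MAX];
    uint8_t len;
} frame_t;

typedef struct queue queue_t;

/* Hypothetical driver/OS primitives. */
extern uint8_t uart_get_byte(void);                 /* blocks for a byte */
extern bool    frame_complete(const frame_t *f);
extern void    queue_put(queue_t *q, const frame_t *f);
extern void    queue_get(queue_t *q, frame_t *f);   /* blocks for a frame */
extern void    process_message(const frame_t *f);
extern queue_t *frame_queue;

/* High priority: runs briefly per byte, only assembles frames, never
   parses them -- so the Rx side can't overrun. */
void rx_framer_task(void)
{
    static frame_t frame;

    for (;;) {
        frame.data[frame.len++] = uart_get_byte();
        if (frame_complete(&frame) || frame.len == FRAME_MAX) {
            queue_put(frame_queue, &frame);
            frame.len = 0;
        }
    }
}

/* Lower priority: free to run "long" at each activation. */
void msg_processing_task(void)
{
    frame_t frame;

    for (;;) {
        queue_get(frame_queue, &frame);
        process_message(&frame);
    }
}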

>> It's always amusing to see folks evaluate the *number* of
>> priority levels supported in an OS AS IF "more is better".
>> (Then 50,000 levels has *got* to be better than *20*!)
>
> A large number of priorities is nice, since you can assign a separate
> priority for each task. Of course, in most cases tasks with similar
> priorities could operate as well if the internal order is changed.
>
> A moderate level 8-16 of priority levels is usually enough _provided_
> that you have a full control of the priority levels. The situation
> gets complicated, if the OS is using some fixed priorities of its own.

I don't let the MTOS impose itself on the application. Any
services that it runs (e.g., network, timing, etc.) compete
for resources at the designer's discretion. "Who am I to decide
*this* is more important than something the designer must meet?"

> Assuming that the OS has a task receiving messages and at a priority
> level immediately below it, an other OS tasks that consumes some of
> these messages addressed to it at a low urgency.
>
> Assume that you want to process at a high urgency some messages
> received and addressed to you by the high priority task. Your task
> should run below the producer task, but above the low urgency OS

You need not run at a priority below that of the producer.
I.e., if the producer's actions are implicitly or explicitly
constrained by other aspects of the design, then it might
end up (for example) blocking immediately after sending
the first message (e.g., imagine a synchronous interface).
So, its priority drops out of the equation.

Since you (consumer) were *waiting* for that message, *your*
priority drops out of the equation while blocked.

> consumer task. In this example, there is no free priority levels
> between the two OS tasks, so you end up putting your consumer task at
> the same priority level as the OS consumer task, fighting unnecessary
> for CPU time.

I find using a uint8 to be sufficient for "priority". I
dynamically build the structures that are needed within the
MTOS so it's of no real consequence to my designs.

Since I define ranges of priorities for each task and
let the task's priority vary during execution, having a
slightly larger range of *possible* priority_t's helps.
It, for example, lets me define *overlapping* ranges
for tasks and watch to see how they might rearrange
their respective priorities (e.g., maybe B *should*
have a higher priority than A).

> Please note, I have carefully used the words "addressed to", since
> otherwise typically in such discussions, someone will sooner or later
> claim that I am trying to use priorities to affect which task is
> processing which messages :-).
>
>> IME, an MTOS gives its biggest bang for buck by helping
>> you decompose an application into independent "tasks"
>> (using the meaning: "jobs") so you can focus on one
>> at a time without having to juggle all of them at once.
>
> Absolutely, the OS should make it easy to isolate time critical and
> non-time critical parts into separate tasks.

The problem with decomposition is that it increases
communication costs. Even in a uniprocessor, passing
information between two tasks has synchronization and
signalling overheads, etc. Worse if the tasks
reside on different processors (perhaps even in NORMA
deployments!)

>> Beyond that, its a question of how much you want to
>> migrate application requirements into the OS's domain
>> instead of the application's.
>
> The requirements for various RT systems vary so greatly that it is
> very hard for the OS designer to create a one size fits all solutions.

Exactly. This is the problem with OTS solutions, IME.
They either force you to do things "their way". Or,
offer too *many* ways of doing things, each with an
associated overhead (inefficiency).

For most applications, the specific needs placed on the
MTOS/RTOS are usually pretty easily defined. Once you've
written one MTOS/RTOS, you *know* what the issues are in
writing same. I find it easier to just write an MTOS/RTOS
specific for each individual application (borrowing from
previous designs). It is also a great way to see what
the native instruction set "does well" on a particular
processor and what things it does "not so well".

> In the worst case, they may even harm the total RT system, when
> forcing or at least tempting (e.g. providing an easy to use API) to
> use some solution that is not optimal for a particular RT system.

yes. "Let's use POSIX!" :<