From: Rich Webb on
On Fri, 05 Feb 2010 13:31:30 +0000, Tim Watts <tw(a)dionic.net> wrote:

>Hi,
>
>I can hack my way around AVRs OK - but I realise my limitations... I
>normally program in perl on larger computers, so wasting cycles has never
>been something I've worried about(!). Nor do I normally do timing sensitive
>stuff where timing < 0.1s
>
>Can anyone recommend a good book that would cover things like:
>
>1) Compare-nybble (ie is there a more clever more efficient way than AND-
>mask and compare-word?)

You don't need a book for this. First do what is the clearest and
easiest to understand and maintain. Then, if and only if that approach
really is time constrained does it become appropriate to look for
alternate algorithms, micro-optimization, or dropping into assembler.

>2) Bitstream input techniques - eg efficient sampling of either a self
>clocking serial stream or a timing critical one (eg 1-wire) where the
>timings are so small (10's of uS) that one cannot really afford to waste
>instructions on a 20MIPS device. Use of interrupts and timers...

The usual way is with a peripheral, ideally one that's integrated into
the silicon. Serial UART (and variations on that theme), SPI, I2C, and
CAN are common. The Dallas 1-Wire might be, I've never needed to look
into it. If you need, e.g, just one more serial input then it's not that
hard to whiz one up with a timer that has a capture pin. Details differ
but most microcontrollers will have app notes for thing like this.

>3) Cool ways to semi-virtualise a timer - eg ideally you'd like 10 hardware
>timers but you have 2.

volatile unsigned long Tick;

>4) Bomb proof "boot sector" and live firmware (flash) update tricks.

Vendor-specific. See their app notes.

>5) Efficient algorithms for certain maths operations, eg integer square root
>(as has been mentioned recently, if not here, on a USENET group I'm
>subscribed to).

"Math Toolkit for Real Time programming" Jack Crenshaw, ISBN 1929629095.

>6) Keypad debounce.

http://www.ganssle.com/debouncing.pdf is a good place to start.

>7) Serial comms (SPI, I2C, 1-wire, RS485, roll-yer-own-with-a-GPIO-pin)

See above.

>8) Working with an OS, eg FreeRTOS.

See vendor.

--
Rich Webb Norfolk, VA
From: Tim Wescott on
On Fri, 05 Feb 2010 13:31:30 +0000, Tim Watts wrote:

> Hi,
>
> I can hack my way around AVRs OK - but I realise my limitations... I
> normally program in perl on larger computers, so wasting cycles has
> never been something I've worried about(!). Nor do I normally do timing
> sensitive stuff where timing < 0.1s
>
> Can anyone recommend a good book that would cover things like:
>
> 1) Compare-nybble (ie is there a more clever more efficient way than
> AND- mask and compare-word?)
>
> 2) Bitstream input techniques - eg efficient sampling of either a self
> clocking serial stream or a timing critical one (eg 1-wire) where the
> timings are so small (10's of uS) that one cannot really afford to waste
> instructions on a 20MIPS device. Use of interrupts and timers...
>
> 3) Cool ways to semi-virtualise a timer - eg ideally you'd like 10
> hardware timers but you have 2.
>
> 4) Bomb proof "boot sector" and live firmware (flash) update tricks.
>
> 5) Efficient algorithms for certain maths operations, eg integer square
> root (as has been mentioned recently, if not here, on a USENET group I'm
> subscribed to).
>
> 6) Keypad debounce.
>
> 7) Serial comms (SPI, I2C, 1-wire, RS485, roll-yer-own-with-a-GPIO-pin)
>
> 8) Working with an OS, eg FreeRTOS.
>
> And lots more in the same vein...
>
> Although I'm most likely to use AVRs or maybe PIC24s, the book doesn't
> have to be that specific - just "how to hack around in 8 bits". Pretty
> much a "Knuth for uControllers".
>
> Many thanks in advance :)
>
> Cheers
>
> Tim

http://www.ganssle.com/book.htm

I think you want the one on the bottom first.

Then go buy mine, even if you don't need it :-).

--
www.wescottdesign.com
From: D Yuniskis on
Hi Tim,

Tim Watts wrote:
> Can anyone recommend a good book that would cover things like:
>
> 1) Compare-nybble (ie is there a more clever more efficient way than AND-
> mask and compare-word?)

The list of micro-optimizations is limitless. :> Some idioms
are easily expressed in HLL's while others are best left
to ASM implementations (e.g., compare nybble's could exploit
a "swap nybble" instruction on some processors).

Usually, these optimizations are only essential in ISR's
or in very tight loops where they can materially contribute
to overall performance. E.g., "find rightmost set bit".

Look for changes in choice of *algorithm* to give you the most joy.

> 2) Bitstream input techniques - eg efficient sampling of either a self
> clocking serial stream or a timing critical one (eg 1-wire) where the
> timings are so small (10's of uS) that one cannot really afford to waste
> instructions on a 20MIPS device. Use of interrupts and timers...

If events are 10's of usec's (uS are micro Siemens :> ) apart,
chances are, you won't be using an ISR to look at them (unless
you have a very fast context switch *and* not much else happening
in the processor :> ). I had an (ancient) barcode reader
application with IRQ's at 75usec and it would visibly slow the
system down when you swiped a barcode (though this was on a
CPU with 1/10th the horsepower you're mentioning)

Most vendors of these sorts of devices give pointers to how
they expect you to implement the interfaces to their devices.
Some with dedicated silicon (that they will sell you) or with
just an app note illustrating a particular approach.

> 3) Cool ways to semi-virtualise a timer - eg ideally you'd like 10 hardware
> timers but you have 2.

Ah, perhaps you are misunderstanding how (hardware) timers are
typically used! :>

Most "soft" timing requirements are handled by what you are calling
"virtual timers". E.g., if you want to "wait" for 5 seconds, you
don't "waste" a hardware timer on that. Or, even 0.5 seconds.

Instead, you typically implement a system timer (based on *some*
sort of "reliable" timebase -- most often a hardware timer, though
not necessarily) that maintains a "system time" and "timing
service". "System time" can also drive "time of day" time
(i.e., "calendar time") but it need not. E.g., your washing
machine doesn't care (well, *need* not care!) what time of day
it is; though it does need to know how much time is *passing*
as it performs it's wash cycle (e.g., agitate for 5 minutes,
then spin for 3, etc.).

A timing service allows your code to reference absolute and
relative times (wrt its own "sense of time"; e.g., absolute
time 12345 means something only in the sense of when your
device considers "time 0" to have occurred).

This is usually closely related to the type of OS, if any, that
you implement. E.g., a single threaded system has a lot easier
time dealing with things since it *knows* it is the only active
object and can be aware of everything that *it* has done.
OTOH, a single threaded system has to do everything for itself!

In multithreaded systems, one typically sets aside a thread
(which might run completely in an ISR) that manages "time".
"Tasks" (processes, threads, etc. -- depends on the distinction
your "OS" places on "active objects"; e.g., classic UNIX had
process = thread. More modern systems treat processes as
"resource containers" and threads as "active objects") issue
requests for timing services (pause(), wait_until(), etc.)
which are then hooked to the timing service for completion.

One of the most trivial way to implement such a service is to set
aside a "variable" to contain the "time remaining" on that
particular timer (e.g., you can implement an array of timers
or scatter timers throughout memory -- as long as the timing
service knows how to get to them). Then, each time the jiffy
(periodic timer interrupt) comes around, "decrement" *each*
timer, clamping the result to "0".

This appears wasteful but can actually be very efficient (depending
on the number of timers vs threads, etc.).

One advantage to this approach is that the responsibility for
resuming the thread at the end of the time interval (*or*,
WHATEVER OTHER ACTIVITY is associated with timer expiration!)
can be "outsourced" instead of requiring the timing service
to perform this activity on the threads' behalf. E.g., a
thread can sit and *watch* it's timer. So, pause(time) can be

timerX <- jiffies_per_time_unit(time)

while (timerX > 0)
yield()

Looks pretty lame :> But, in reality, the cost of such a
spin-wait falls through the cracks as you are just testing
a group of contiguous RAM locations for a nonzero value.
E.g., even an 8 bit processor can do this quickly by or-ing
all of the bytes together and checking for non-zero result.

What isn't readily apparent is how versatile this can be! E.g.,

loop:

LED <- on

timerX <- jiffies_per_time_unit(on_time)
while (timerX > 0)
yield()

LED <- off

timerX <- jiffies_per_time_unit(off_time)
while (timerX > 0)
yield()

goto loop

instead of:

loop:
LED <- on
pause(on_time)
LED <- off
pause(off_time)

goto loop

I.e., there is little lost in terms of clarity (especially
as the "pause()" can wrap around the timerX load and following
while loop).

It can also be used for timeouts:

timerX <- jiffies_per_time_unit(time_limit)

while (timerX > 0)
if (reply_received)
break
yield()

And, can also be used to *measure* elapsed time:

timerX <- jiffies_per_time_unit(a_long_time)

-- do stuff --

elapsed_time <- a_long_time - timerX

I.e., the "task" does the work that a "richer" timing service
would otherwise perform on its behalf.

Instead of "updating" the "remaining time" on all active
timers with each jiffy, a smarter timing service can order
"time events" relative to each other. E.g., the "soonest"
event is X time units from now; the next one happens Y units
after that with the third happening Z units after *that*.
I.e., the first is X units from now, the second X+Y from now and
the third X+Y+Z from now.

In this way, the timing service need only update the soonest
event at each jiffy. When the time for the soonest event
has arrived, it's "time remaining" will have shrunk to "0".
The *next* event will, by definition, occur exactly Y units
later. This next event is now the "soonest event".

Etc.

> 4) Bomb proof "boot sector" and live firmware (flash) update tricks.

These all depend on the resources that you have available to
your application (primarily hardware). E.g., if you have enough
flash to store two copies of your application, then you burn
the new version in place of the "oldest" version using code
running in the *current* version. Once that burn has
completed successfully, you flip a pointer (toggle a bit)
that tells the application launcher to use that "new" copy
in place of the "current" copy.

Few folks have the luxury of 2X the flash they need! :>

If you are lucky enough to have a comparable amount of RAM
available, then you can load the new image into RAM, verify
its integrity and then start overlaying the rewritable
portion of your flash with that new image -- flagging
the flash's contents as "corrupt" before starting so that
an unexpected RESET will prohibit the application loader from
trying to execute the flash image that is no longer valid.
If the update procedure successfully runs to completion
and verifies the integrity of the "new" flashed image, then
it can mark the flash's contents as once again "valid".

If you don't have the luxury of sufficient scratchpad RAM
for the entire image, then you load a portion of the image
into whatever resources you have available and flash that
portion before proceeding on to the next portion (note that
you can flash blocks in any random order as long as you
verify the entire image's integrity before flagging it as
"valid" and, thus, executable.)

Some applications have to continue operating while being updated.
These are considerably harder to implement upgrade procedures.
Often, you have to structure the code so that *portions* of
the application can be either disabled (for brief periods
while they are updated) *or* relocated to a different resource
(RAM or another portion of FLASH) while they are updated.
This is so application-specific that it almost defies
characterization. :<

> 5) Efficient algorithms for certain maths operations, eg integer square root
> (as has been mentioned recently, if not here, on a USENET group I'm
> subscribed to).

Google is your friend. Start with Knuth just because he teaches
you a good way of thinking about costs and how to look for
places that can benefit from optimization (vs. wasting your
time saving a clock cycle "here" and three more "there").
My single most favorite class in school was "Introduction
to Algorithms (6.033?)" (Saltzer?). I don't think any of the
algorithms covered would be considered "introductory"
material. But, it showed just how many clever ways there are
of looking at a problem differently and how big the gains can
be! I am amazed at how often I go looking for my class notes
as I will vaguely recall some clever trick that I can apply
to a problem at hand -- *if* I convince myself not to look
at it the "obvious" way.

> 6) Keypad debounce.

Argh! There are myriad ways to doing this (software and
hardware). And, many different ways to "signal" the "key"
event. E.g., do you signal the key *after* it has been
debounced? Or, on initial closure and debounce thereafter?
Do you signal on the downstroke or the release? Are you
dealing with a key matrix or a single contact closure?
What does your hardware interface to the key look like?

> 7) Serial comms (SPI, I2C, 1-wire, RS485, roll-yer-own-with-a-GPIO-pin)

Google manufacturers for app notes on all of the above.
Note that there are subtle variations on several of these
("SPI", EIA485, etc.). Even EIA232 has many bastardized forms
(each with costs, benefits and risks)

> 8) Working with an OS, eg FreeRTOS.

That will depend on the OS itself. Note that people tend to play
fast-and-loose with OS terms -- especially "RTOS". If you
aren't familiar with the subtleties of what distinguishes an
RTOS from an MTOS from a "big loop of code", then you are best
served doing the academic research before wasting time on
the details of a particular "xxOS" (e.g., some so-called RTOS's
don't have truly deterministic behavior for all of the services
they provide; some only provide simple scheduling options;
some fail to implement hooks for priority inversion; etc.).

And, of course, beware the naive assumption that "real-time"
means "real fast"! :>

> And lots more in the same vein...
>
> Although I'm most likely to use AVRs or maybe PIC24s, the book doesn't have
> to be that specific - just "how to hack around in 8 bits". Pretty much a
> "Knuth for uControllers".

If you can find a copy of the (large) black applications handbook
that motogorilla issued for the 6800 (1980-ish?), you would probably
get a much better feel for how to think about microcontrollers.
In that era, you were dealing with a fraction of a (VAX) MIPS
so you truly saw the cost of each instruction you executed.

(sigh) I have tutorials that I have written re: several of
these subjects but they are on a machine that has been disassembled
(migrating from one machine to another has got to be the geek
equivalent of "giving dry birth" :< ) so they aren't easily
available. But, I can offer answers to specific questions
you might have. I know I have a note on timing services,
another describing switch debounce techniques (hardware
and software), another on "resource starved" multitasking, one
on barcode decoding, etc.

<shrug> Maybe, when I retire, I will have time to post
them someplace! :>

--don
From: Jon Kirwan on
On Fri, 05 Feb 2010 09:15:45 -0500, Rich Webb
<bbew.ar(a)mapson.nozirev.ten> wrote:

>"Math Toolkit for Real Time programming" Jack Crenshaw, ISBN 1929629095.

Kind of good, very creative in places and insightful in
others, but it contains errors and isn't comprehensive enough
elsewhere. One needs some skill to know when. I don't refer
to it much.

Jon
From: Jon Kirwan on
On Fri, 05 Feb 2010 13:31:30 +0000, Tim Watts <tw(a)dionic.net>
wrote:

>I can hack my way around AVRs OK - but I realise my limitations... I
>normally program in perl on larger computers, so wasting cycles has never
>been something I've worried about(!). Nor do I normally do timing sensitive
>stuff where timing < 0.1s

Despite your comments above, your questions below tell you
and us that you _do_ have to worry about cycles. It's just
that you want to have someone tell you how to not worry about
them, once again. In practice, there are often parts of an
application where it is important and parts where it is far,
far less important. You are encountering this reality.

Maybe we should say you are entering "The Twilight Zone." ;)

>Can anyone recommend a good book that would cover things like:

Um.. Not likely all in one book.

>1) Compare-nybble (ie is there a more clever more efficient way than AND-
>mask and compare-word?)

In terms of c, sounds like all you want are some idioms to
use. But why worry about this? Either you are doing this
for a LOT of short fields and need the extra boost in
performance, in which case it may be better to write a
routine in assembly, or else this is a one-time affair and
your earlier comment about not worrying about wasting cycles
seems to enter back in. So I am not sure why you need to
care that much about this. Some c compilers will do better
than others, here. Normally, you let the c compiler authors
worry about these details. They generally do it well enough
for most one-time type uses -- especially in the general
situation you earlier described. Which makes me think you
are trying to do this fast, perhaps to handle a serial stream
of data that is arriving at a fair clip?

You didn't mention the shift operator or the % operator or
using bit fields and unions. Have you played with those,
yet? Are you familiar enough with looking at the assembly
output of your c compiler to know which works well and which
does not? (Bit fields aren't always portable, but I assume
you are using a single c compiler and stuck on the AVRs, so
it may not be an issue for you.)

>2) Bitstream input techniques - eg efficient sampling of either a self
>clocking serial stream or a timing critical one (eg 1-wire) where the
>timings are so small (10's of uS) that one cannot really afford to waste
>instructions on a 20MIPS device. Use of interrupts and timers...

Almost by your own definition here, it's likely you won't be
using c for this for the "timints are so small" part of this
question, right? Assembly, yes? Regards self-clocking
streams, it is application specific and the best place to go
for the answer here will be manuals/whitepapers that talk
about the particular method in detail. If the stream is slow
enough for c, you just need to follow the description well in
writing your code. It's not about some special idiomatic c
expression that a general book could fairly discuss.

>3) Cool ways to semi-virtualise a timer - eg ideally you'd like 10 hardware
>timers but you have 2.

I assume by this you'd like software routines to execute at
certain times you can set and that these times are slow
enough that virtualizing the timer works well for you in
terms of latency and variability in that latency.

On this point, I could talk at length. However, I'll
recommend one very good, old book which has an excellent
chapter on the topic of delta queues which solve this problem
with very little code and very repeatable, reliable
performance. Douglas Comer's first book on XINU.

And if you'd like, I can send a file or two that show one
style of implementation details to supplement what you might
find in that book.

I think it is "cool."

>4) Bomb proof "boot sector" and live firmware (flash) update tricks.

Normally, I think you need to provide some care in defining
what is meant by "bomb proof" and "firmware update." For
example, a firmware update might be to a specific driver
placed in a specific location. Making that bomb proof might
mean that it must be movable. Or not.

I guess you are looking for a book that provides a variety of
techniques that might be used, with descriptions of how they
are implemented, things to watch out for in doing so, what
benefits and downsides they each may have, etc., so that you
can get a comprehensive view and make choices here and there,
from time to time. If that is the case, I don't know where
to go and would ask that if _you_ find such a book to let me
know about it, too! ;)

>5) Efficient algorithms for certain maths operations, eg integer square root
>(as has been mentioned recently, if not here, on a USENET group I'm
>subscribed to).

One of the best places to go for things like this are integer
DSPs. For example, Analog Devices printed an excellent book
on the general topic regarding their ADSP-21xx processors. I
think some of it (or all of it) may be available online, too.

The pair of books I'm thinking of here are: "DIGITAL SIGNAL
PROCESSING IN VLSI" and "DIGITAL SIGNAL PROCESSING
APPLICATIONS: Using the ADSP-2100 Family." The latter one,
for example, covers a host of fixed point and floating point
arithmetic operations, function approximations, and so on. In
enough detail to implement on other processor families. I've
ported the ideas from time to time, so I know this is quite
true.

Another excellent reference of another nature is to have is
one of the incarnations of "Numerical Recipes." If you don't
have it get that, too.

If you aren't fully sharpened on your own algebra skills, get
a good algebra book and work on that. If you don't have at
least a 1st year's understanding of calculus, focus on that,
as well, and get a book book on calculus. My own preference
would be that you go further and complete at least some of a
1st/2nd year's diff-eq. If you have a community college
available, that would be an excellent resource to take
advantage of sooner than later.

It goes a LONG way to be able to actually _read_ with
_understanding_ what you see so that you can modify/tailor it
to your specific needs. Otherwise, you are just blind.

Just to provide an example that you bring up, the integer
square root question was answered about the way I might have
by Hans-Bernhard Br�ker. When I first faced the need for one
of these on my own, I didn't go to a book at all. Instead, I
just remembered the hand-method I'd been trained to use in
high school (or was it slightly earlier?) and implemented it
with an algorithm. Worked perfectly well, once I nailed the
rounding issue correctly. (That was the only detail that
didn't get nailed down on the first try because I didn't
think about it before writing the first code sample.) So
training helps you when all else fails you.

>6) Keypad debounce.

You _must_ be able to find unending sources on this topic.
For example, most everyone points to Jack Ganssle's "A Guide
to Debouncing" which is often named "debouncing.pdf" as a
place to start on the topic. He provides a nice survey.

Once you land on something you like to use, you will probably
stick with it for most uses because you'll understand it
well. And besides, I don't see people caring that much to
gaining a comprehensive view on it, anyway. Many find
"something that works" and stop dead in their tracks after
that. I think there is more to learn and it is fun to
continue that education. But most seem to disagree with me
and stop as soon as they find something that works for them
and they feel they understand.

I use the following logic applied to an interrupt interval of
8ms:
IF current <> previous THEN state = 1
ELSEIF state = 1 THEN state = 0
ELSE debounced = current
In other words, the above logic executes once each 8ms.

With a little thought to the above logic, you should be able
to see that the resulting state is always the value that
results from XORing the current and previous values. An XOR
operation is usually pretty efficient. Also, note that the
debounced value is changed only when the prior state is 0 and
the current and previous values are the same. This results
in the following sequential steps:

1: previous = current
2: current = READ PORT PIN(S)
3: temp = (previous XOR current) OR state
4: debounced = (debounced AND temp) OR (current AND NOT temp)
5: state = previous XOR current

This logic then requires the current condition of a switch to
remain at the same level for three observations before the
debounced value is updated.

It's not just a sugestion that this method can be applied to
8 switches at a time as easily as it is to one, when the
switches are lined up in a single port byte. If you want
this to be efficient and usable for more than one switch it
helps to place them on a single port.

>7) Serial comms (SPI, I2C, 1-wire, RS485, roll-yer-own-with-a-GPIO-pin)

??? I tend to go to the standards docs for some of these, or
the datasheets for others. What are you really wanting here?
A "complete skillset" stuffed into your brain?

>8) Working with an OS, eg FreeRTOS.

3rd year CS classes get into general ideas and force you to
write test programs to analyze _some_ of them. I write my
own and don't use any commercial or free system others write
because my needs are specific enough to require my having a
general skill that is deduced to specific cases. I gather
you want to simply use something as a drop-in and use it
reasonably well. For that, I'd probably go to (and hope for)
the good documentation often provided by those writing (and
using) what you are using.

>And lots more in the same vein...
>
>Although I'm most likely to use AVRs or maybe PIC24s, the book doesn't have
>to be that specific - just "how to hack around in 8 bits". Pretty much a
>"Knuth for uControllers".

Oh. I see. If you find Knuth for Micros, let me know.
Actually, I was kind of thinking of recommending you read
Knuth until you said this. I guess you already have that
much and want to see such a tour de force done anew.

If you think about what happened with Knuth, he took on a
HUGE project and stated that there would be (if I recall)
four more books. In fact, I think he titled them back at the
start and listed what he expected to produce. However, in
doing those first three, which took years, he just stopped.
It was that huge. Then he decided he needed to take on
typesetting so that he could get back to writing. And that
one he expected to see last only a few years. Instead, he
took a decade and wrote a nice 5-volume set on typesetting
and then went on to develop a toolset (TeX.) I know he is
working on some of what he'd intended 30 years ago, now, but
I've no idea if he will ever finish!!

And you want someone to take that on, now??? Most sane
people would look at Knuth and say... "Uh, no. I have a
life. I think."

But yes, if you find the new Knuth please let me know!!

>Many thanks in advance :)

Best luck to you, too.

Jon



>
>Cheers
>
>Tim