From: D Yuniskis on
Hi George,

[attributions elided]

>>> Is it feasible to have the client load a new image concurrently with
>>> the old and switch at some well defined point(s)?
>> Not the whole image. I am structuring the application so I
>> can swap out pieces of it and update it incrementally. E.g.,
>> update the RTOS, then switch it in; update parts of the
>> library, then switch them in; etc.
>
> That's definitely the right approach, particularly if you want to

Yeah, but while it makes the implementation more robust
(in terms of uptimes), it makes the development process
considerably more brittle -- trying to make sure everything
"lines up" nicely. :-/

> scale to low wireless rates ... on 100Mb WiFi a 1GB file takes minutes
> to transfer even with a single client and no interference. You can
> just forget about using something like 3-G cellular.

Yes. I am more concerned with the reduction in bandwidth
(overall as well as) due to contention, interference, etc.
Quite a different world than a wired network with switches.

So, it's doubly important to make sure I can recover from
all those "Can't Happen"s that *do* happen! :>

>>>> - The protocol can't be spoofed by "unfriendlies".
>>> There's only one real solution to spoofing and/or "man in the middle"
>>> attacks and that is to encrypt the image binaries. You want it to be
>>> as hard as possible for someone to decompile your application and
>>> figure out how to spoof it.
>> The sources for the application will be available.
>>
>> So, shared secret/public key is the only way to do the encryption.
>> The problem there is it forces all devices to have the same
>> keys (or, I have to build special images for each device).
>
> Multiple devices sharing the same key is not a problem. Although one
> of the keys is "public" and the other "private", it actually doesn't
> matter which is which: public(private(text)) == private(public(text)).

Yes. Just like digital signatures. And, I don't have to worry
about a real key server as I can embed that process in the
development (and update) protocols.
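The key symmetry mentioned above -- public(private(text)) == private(public(text)) -- can be demonstrated with a toy RSA example. This is purely illustrative: the primes are absurdly small and there is no padding, so nothing here is usable for real security.

```python
# Toy RSA with tiny primes, only to show that the two halves of the
# key pair commute: applying them in either order recovers the message.
# Real systems use vetted crypto libraries with padded, full-size keys.

p, q = 61, 53            # toy primes (never use sizes like this in practice)
n = p * q                # modulus, shared by both keys
phi = (p - 1) * (q - 1)  # Euler's totient of n
e = 17                   # "public" exponent
d = pow(e, -1, phi)      # "private" exponent, modular inverse of e mod phi

def apply_key(m, exp):
    """Apply one half of the key pair: m^exp mod n."""
    return pow(m, exp, n)

m = 1234                                   # a "message", must be < n
assert apply_key(apply_key(m, d), e) == m  # private then public (signing)
assert apply_key(apply_key(m, e), d) == m  # public then private (encryption)
```

Successful decryption with the matching half of the pair is what authenticates the origin -- exactly the digital-signature case when the private key is applied first.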

> The point is to authenticate the origin of the image - the client's
> "public" key can only decrypt a file encrypted by your private key -
> successful decryption is the authentication. (Of course you need to
> verify by signature that the decryption was successful.)
>
> You only need to worry about more complicated key handling if you are
> creating programming tools for end users that themselves will need to
> worry about spoofing. AFAIHS, your previous comments didn't indicate
> that.

Right.

> Regardless, it is neither desirable nor necessary to use public key
> encryption for the whole image (or its parts). Symmetric single key
> encryption (like DES, AES, etc.) requires fewer resources and runs
> faster. The only parts that require public key for secure
> communication and authentication are the symmetric key(s) and image
> signature(s).

Yes, this is an advantage on the "bigger" applications
(what's another "module" among friends? :> ). But, for these
smaller devices, it means adding a second piece of code
that just "runs faster" than the one piece of cryptocode
that is already *required*. So, to save resources (since
updates are infrequent in these "throw-away" projects),
it's easier to just wrap the PK utilities in a portable
wrapper (forward thinking) and use *them* as is.

E.g., even something as simple as AES uses a bit of resources
(more so than wrapping an existing PK implementation)

>> The more realistic problem is to guard against something
>> messing with the image or process and effectively leaving
>> the device(s) in "half updated" states. So, it is even
>> more important that updates be supported "in pieces"
>> (otherwise the window of opportunity for screwing things up
>> is just too big)
>>
>>> Building on the idea of embedding the signature in a header, you could
>>> also embed a decryption key for the image. The header itself should
>>> be encrypted with a public key algorithm so that the client can verify
>>> that _you_ created the image. (Obviously the whole image could be
>>> public key encrypted, but it isn't necessary ... symmetric key
>>> algorithms are more performant. Successfully completing the process
>>> of decrypting the header, extracting the image decryption key,
>>> decrypting and verifying the image signature proves that the whole
>>> thing is legitimate.)
>> Yes, but it would have to be done for each "piece".
>
> True, but signatures for all the pieces can be bundled for en masse
> checking. If the pieces need to be encrypted, you only need to
> provide one decryption key for all of them.

Yes.

>>> And don't store the decrypted image on the device ... design the
>>> update process so that the local image can be decrypted into RAM and
>>> reuse it to do the startup.
>> Huh? I *think* you are assuming there is enough RAM to
>> support the decrypted image *and* my run-time data requirements?
>> If so, that's not the case -- far more code than data.
>
> I only assumed that the device stored the program for cold start and
> that you wanted to keep the code secret. However, since the source
> will be available there is no particular reason to encrypt images at
> all - runtime image encryption is typically only done to protect trade
> secrets.

If you don't encrypt the images, then you have to ensure that
*every* "secret" is bundled in with the keys, signatures, etc.
E.g., if your code has any "passwords" (e.g., for a telnet session)
to access the device, then those must be passed to the device
in that "one" encrypted bundle, etc. Encrypting the entire
image gets around those "lapses" where you forget that something
you have exposed *can* be used against you.

> What I was saying was that the decryption mechanism could be reused in
> the general program loader, enabling the image(s) to remain encrypted
> on storage. That way, it would be much harder to gain access to and
> reverse engineer the code.

Understood. There have been devices that did this in hardware
more than 30 years ago. :-/

>> I'm not too worried about people reverse engineering devices.
>> Rather, I am more concerned with someone having access to
>> the medium and screwing with (spoofing or otherwise) the
>> update "undetected" (consider the wireless case as well as
>> how easy it is to have a hostile client sitting "somewhere"
>> on a network...)
>
> So then the only things that require authentication are the image
> signatures. You don't need to encrypt the image binaries at all -
> just the image signature bundle.

See above. Imagine passing the contents of /etc/passwd in
cleartext (easy to *sniff*).

>>>> - The protocol should be as light as possible -- but
>>>> no lighter! ;-)
>>> This feeds back to the size of the images. TFTP is about as simple as
>>> you can get using a generic farm server, but TFTP does not have any
>>> way to request a partial transfer (e.g., a "head" command). Of
>>> course, for version checking, you can hack your client to start the
>>> transfer and then abort after fetching the predetermined image header.
>> I was figuring I could just have the client request whatever
>> blocks it wants in whatever order, etc. E.g., TFTP doesn't
>> *force* me to request blocks in sequential order...
>
> It has been a long time since I've read any of the relevant RFCs, so I
> may be completely wrong here (and please feel free to correct me 8),
> but AFAIK, none of the popular file copy/transfer protocols allow
> random access.

Grrrr... (slaps head). No, you are right. I was thinking
of something else. Yes, all I can do is order things in
the image in such a way that I can abort a transfer once
I have obtained "what I need".

This also means that I need to arrange the modules *in* the
image in the order that I want them to be burned -- else I
would have to open the file repeatedly to get *to* the module
that I needed "next" (since I can't buffer the entire image).
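Arranging the modules in burn order behind a small index, so a strictly sequential transfer (all TFTP allows) can be cut short once the needed bytes have arrived, might look like the sketch below. The image format here (count, per-module name/size entries, then payloads in the same order) is entirely hypothetical.

```python
import io
import struct

def pack_image(modules):
    """Pack (name, bytes) pairs, in burn order, into one image blob.

    Hypothetical layout: u16 module count; per module a u8 name length,
    the name, and a u32 payload size; then the payloads in that order.
    """
    out = io.BytesIO()
    out.write(struct.pack(">H", len(modules)))
    for name, data in modules:
        nb = name.encode()
        out.write(struct.pack(">B", len(nb)) + nb + struct.pack(">I", len(data)))
    for _, data in modules:
        out.write(data)
    return out.getvalue()

def read_first(image, wanted):
    """Consume the stream sequentially and return the wanted module.

    Mirrors aborting a TFTP transfer: once the wanted payload has been
    read, nothing after it in the stream needs to be fetched.
    """
    buf = io.BytesIO(image)
    (count,) = struct.unpack(">H", buf.read(2))
    index = []
    for _ in range(count):
        (nlen,) = struct.unpack(">B", buf.read(1))
        name = buf.read(nlen).decode()
        (size,) = struct.unpack(">I", buf.read(4))
        index.append((name, size))
    for name, size in index:
        data = buf.read(size)   # sequential read only, no seeking
        if name == wanted:
            return data         # "abort" the transfer here
    return None
```

The index up front also lets the client stop after one UDP block's worth of header when it only needs to version-check.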

> My understanding is that protocols like FTP, TFTP, Kermit, etc. just
> send file blocks in order and wait for the receiver to ACK/NAK them.
> Some of the sliding window implementations allow OoO within the window
> but, AFAIK, none allows arbitrary positioning of the window or
> advancing the window beyond a missing block. The so-called
> "restartable" protocols periodically checkpoint the file/window
> position and restart from the last checkpoint if the same client
> re-attaches and requests the same file within some timeout period.
>
> If you want/need essentially random access without needing your own
> server application you might be forced into a remote file system
> protocol (like SMB, NFS, etc.). I have no experience implementing
> such protocols ... I've only used them.

No, I will live within something like TFTP. The others are
more demanding in terms of resources. And, present entirely
new sets of problems (stale handles, etc.)

>>>> - Minimize unnecessary network traffic as well as
>>>> load on the server ...
>> What I am more concerned with are the apps that follow.
>> Those are so big that updates can be much more frequent.
>
> True, but that situation is mitigated by piecewise updates.

Yes. I'm taking on extra "requirements" for these throw-away
devices which wouldn't be needed, otherwise. It gets hard
keeping track of which context applies the more stringent
requirements on each design decision ;-)

>> The more pressing issue for those apps is turnaround time.
>> I.e., the update would want to start as soon as it was
>> available (on the server). So, checking "once a day" would
be inappropriate. One minute might be nice, as the expected
delay to update start would only be half that (recall that
>> those devices would need much longer to perform their entire
>> update).
>
> Downloading a small header (up to 1 UDP block) once every couple of
> minutes would not be too bad - at least for a limited number of
> devices and images (I'm presuming here that "like" devices can use the
> same image?).

Yes, in general, each type of device will use the same image
(for the throwaway devices). There may be some deviations as
I play with different images during development or to test
different recovery strategies, etc.

The "devices to come" are a different story. The images there
will tend to have bigger (in terms of numbers of affected
bytes) discrepancies. I think I will end up breaking those
images into pieces *on* the server. This will make
server-side management more error-prone (e.g., device A
used images 1Ai, 2Ai, 3Ai (<module_number><device><version>)
while device B uses 1Bj, 2Bj, 3Bj -- even though 1Ai == 1Bj
etc.).

But, if I use the idea of having an "update module" as the first
part of any update and make this an *active* object used *in*
the update procedure, then that can explicitly go looking
for whatever pieces it needs...

> Obviously it would be far better to broadcast/multicast that updates
> are available and forgo client polling at all ... but that requires an
> intelligent server.

And also places more constraints on the network fabric.

>> The idea (mentioned elsewhere in this thread) of having
>> devices collaborate to:
>> - reduce the polling rate "per device" to keep the overall
>> polling rate (seen by the update server) constant
>> - share discovery of valid updates with other devices
>
> Couple of possible solutions:
>
> - have all the "like" devices on a network segment (i.e. those that
> can share an image) cooperate. Use a non-routable protocol to elect
> one to do the polling and have it tell the others when an update is
> available.

If you concentrate that functionality in one (elected) device,
then that device must remain on-line for the protocol to
work -- else you need to elect his replacement. For things
like the A/V clients, I expect them to see frequent power
up/down cycles. E.g., walk into the kitchen, power up the
audio clients in that room so you can listen to "whatever"
you were listening to in the living room. Finish your
kitchen task, leave the room and the clients can be powered
down. I.e., you want a protocol that can react quickly
in the face of a changing network context.

So, what I have instead opted to do is have each device do
semi-random polling. But, as a device becomes aware of
other peers "on-line", have it increase its average polling
interval proportionately -- *knowing* that each of its
peers is aware of *its* presence and will inform it if
they discover something on their own.

As such, the total polling traffic on the network remains
reasonably constant. And, if a node goes offline, its
disappearance doesn't directly affect the polling -- all
of the other nodes will (eventually) increase their
polling frequency to maintain the same overall polling
"load" on the (update) server.

> - run the devices in promiscuous mode and have them "sniff" update
> checks. This way when any device on the net polls, all devices on the
> segment see the result.

Yes, but that means every device has to process every packet.
And it means each device must "see" every packet (what if there is polling
traffic on a different subnet -- not routed locally?)

> - combine election and sniffing so only one device actually talks to
> the server and the others eavesdrop on the conversation.

I think the approach I outlined will work. It means each device
can operate independently of any other. Yet, implicitly cooperate
in their relations with any "shared resources" (the server).

> Sniffing might be more complicated than it's worth - ideally you'd
> like to extend it to cover the whole update process. But it may be
> worth doing for update checks because the file name, block number,
> etc. could be fixed and so easy to search for in sniffed packets.
>
>>> Use a "well known" public name and hide the file updates behind hard
>>> links.
>>>
>>> Changing a hard link is an atomic process so there will be no race
>>> conditions with the update - a process trying to open the file through
>>> the link will either get some version of the file or fail if the open
>>> call sneaks between unlinking the old file and linking the new.
>>>
>>> So updating the server then is a 3 step (scriptable) process. You
>>> upload the new version alongside the old, change the well known hard
>>> link to point to the new version, then delete the old version.
>> <frown> Requires someone do something "special" on the server.
>> When that someone (IT) screws up, The Boss doesn't see it as *his*
>> problem. The IT guy *claims* he did everything right and there
>> must be something wrong with the update software or the devices.
>> So, *vendor* gets a frantic call from an angry customer complaining
>> that the system has been "down" for "over an hour now"...
>
> Yes. But the key is that it is scriptable. Bleary eyed, overworked
> IT person or dumbass noob makes no difference ... it's hard to get
> much simpler than "<script> filename". If you must have a GUI you can

Yes. But it still relies on the user knowing/remembering that
he has to use <script>. If it is something done infrequently,
people tend to forget the prerequisites involved (I am
always amazed at how few people maintain journals!)

> use Javascript in a web browser so the (l)user only needs to specify
> the file (or better yet drop it on the browser page).
>
> In any event, the server update process can be stupid friendly.
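The three-step scriptable swap described above reduces to very little code. A minimal sketch, assuming a POSIX server and a hypothetical well-known name "firmware.img": os.replace() is an atomic rename, so a client's open() sees either the old file or the new one, never a partial write, and the old version is unlinked by the rename itself.

```python
import os
import tempfile

def publish(new_image_bytes, well_known="firmware.img", directory="."):
    """Atomically publish a new image under the well-known public name.

    Step 1: write the new version alongside the old (to a temp file in
    the same directory, so the rename stays within one filesystem).
    Steps 2+3: atomically repoint the public name; the previous file
    is unlinked as part of the rename.
    """
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        f.write(new_image_bytes)       # upload the new version
    os.replace(tmp, os.path.join(directory, well_known))  # atomic swap
```

The operator (or a drag-and-drop web page) only ever invokes publish(); the atomicity lives entirely in the rename, so there is nothing to forget.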
From: George Neuner on
On Fri, 12 Mar 2010 12:50:09 -0700, D Yuniskis
<not.going.to.be(a)seen.com> wrote:

>[attributions elided]
>
>>>> Is it feasible to have the client load a new image concurrently with
>>>> the old and switch at some well defined point(s)?
>>>
>>> Not the whole image. I am structuring the application so I
>>> can swap out pieces of it and update it incrementally. E.g.,
>>> update the RTOS, then switch it in; update parts of the
>>> library, then switch them in; etc.
>>
>> That's definitely the right approach, particularly if you want to
>
>Yeah, but while it makes the implementation more robust
>(in terms of uptimes), it makes the development process
>considerably more brittle -- trying to make sure everything
>"lines up" nicely. :-/

Incrementally updating a running program can get tricky even in the
best of situations where the program and runtime cooperate. It's
particularly difficult if you need to update concurrently with normal
execution, especially if you need to convert existing data structures
before new code can use them.

In one of my former lives I helped design a modular programming
language and the compiler/runtime for a programmable image
co-processor board. The runtime allowed demand loading and unloading
of arbitrary code modules (i.e. not overlays) under program control.
I've had some ideas on how compiler and runtime could manage code
modules automagically and incrementally update running programs behind
the scenes without them really being aware of it, but I've never had
the opportunity to actually try out any of them.


>> Regardless, it is neither desirable nor necessary to use public key
>> encryption for the whole image (or its parts). Symmetric single key
>> encryption (like DES, AES, etc.) requires fewer resources and runs
>> faster. The only parts that require public key for secure
>> communication and authentication are the symmetric key(s) and image
>> signature(s).
>
>Yes, this is an advantage on the "bigger" applications
>(what's another "module" among friends? :> ). But, for these
>smaller devices, it means adding a second piece of code
>that just "runs faster" than the one piece of cryptocode
>that is already *required*. So, to save resources (since
>updates are infrequent in these "throw-away" projects),
>it's easier to just wrap the PK utilities in a portable
>wrapper (forward thinking) and use *them* as is.
>
>E.g., even something as simple as AES uses a bit of resources
>(more so than wrapping an existing PK implementation)

That's true, but I think you can arrange that the two different
decrypt modules will never be needed simultaneously.

AES isn't particularly "simple" ... it is simply less costly than some
of the PK algorithms. Cost depends on how secure you need it to be
... if you want strong protection you have to pay for it - in code and
data space and in cycles. If the intent is simply to prevent casual
abuse, you could as well just use a reversible "swizzler".


>If you don't encrypt the images, then you have to ensure that
>*every* "secret" is bundled in with the keys, signatures, etc.
>E.g., if your code has any "passwords" (e.g., for a telnet session)
>to access the device, then those must be passed to the device
>in that "one" encrypted bundle, etc. Encrypting the entire
>image gets around those "lapses" where you forget that something
>you have exposed *can* be used against you.

Point taken with the observation that sensitive information embedded
in the image can be segregated such that only a small "resource"
portion of the image need be encrypted.



>> AFAIK, none of the popular file copy/transfer protocols allow
>> random access.
>
>Grrrr... (slaps head). No, you are right. I was thinking
>of something else. Yes, all I can do is order things in
>the image in such a way that I can abort a transfer once
>I have obtained "what I need".
>
>This also means that I need to arrange the modules *in* the
>image in the order that I want them to be burned -- else I
>would have to open the file repeatedly to get *to* the module
>that I needed "next" (since I can't buffer the entire image).

Hopefully such an arrangement is possible ... circular dependencies in
a hot patch situation are a really big PITA.


>> If you want/need essentially random access without needing your own
>> server application you might be forced into a remote file system
>> protocol (like SMB, NFS, etc.). I have no experience implementing
>> such protocols ... I've only used them.
>
>No, I will live within something like TFTP. The others are
>more demanding in terms of resources. And, present entirely
>new sets of problems (stale handles, etc.)

NFS clients are stateless if you don't use file locks ... but I don't
know how complex their implementation is. I don't know offhand
whether SMB clients are similarly stateless, you might want to take a
look at some before you dismiss it.

Using hard link public names on the server would permit the clients to
ignore file locking issues.


>I think I will end up breaking those
>images into pieces *on* the server. This will make
>server-side management more error-prone (e.g., device A
>used images 1Ai, 2Ai, 3Ai (<module_number><device><version>)
>while device B uses 1Bj, 2Bj, 3Bj -- even though 1Ai == 1Bj
>etc.).
>
>But, if I use the idea of having an "update module" as the first
>part of any update and make this an *active* object used *in*
>the update procedure, then that can explicitly go looking
>for whatever pieces it needs...

An active updater is an excellent idea that solves a bunch of
problems: it can be a throw-away module that implements things like
secondary fast decryption, remote file access, etc. that the
application needs only while updating.

It does not solve any hot patch issues though.

George
From: D Yuniskis on
Hi George,

George Neuner wrote:
> On Fri, 12 Mar 2010 12:50:09 -0700, D Yuniskis
> <not.going.to.be(a)seen.com> wrote:
>
>> [attributions elided]
>>
>>>>> Is it feasible to have the client load a new image concurrently with
>>>>> the old and switch at some well defined point(s)?
>>>> Not the whole image. I am structuring the application so I
>>>> can swap out pieces of it and update it incrementally. E.g.,
>>>> update the RTOS, then switch it in; update parts of the
>>>> library, then switch them in; etc.
>>> That's definitely the right approach, particularly if you want to
>> Yeah, but while it makes the implementation more robust
>> (in terms of uptimes), it makes the development process
>> considerably more brittle -- trying to make sure everything
>> "lines up" nicely. :-/
>
> Incrementally updating a running program can get tricky even in the
> best of situations where the program and runtime cooperate. It's

Yup! ;-)

> particularly difficult if you need to update concurrently with normal
> execution, especially if you need to convert existing data structures
> before new code can use them.

Ideally, you change as little as is necessary.

> In one of my former lives I helped design a modular programming
> language and the compiler/runtime for a programmable image
> co-processor board. The runtime allowed demand loading and unloading
> of arbitrary code modules (i.e. not overlays) under program control.
> I've had some ideas on how compiler and runtime could manage code
> modules automagically and incrementally update running programs behind
> the scenes without them really being aware of it, but I've never had
> the opportunity to actually try out any of them.

Note that the "update" module lets me, in a pinch, throw
my hands in the air and say, "Do this update off-hours
as the system will be down for XXXX minutes/hours".

I think in the projects to come, some of this will be easier
(in that I will have more resources available) -- though
also harder (in that there will be a greater chance for
more things to be "in flux").

The throwaway projects will be easier to constrain (interfaces,
etc.) but with far fewer resources, that constraint will
almost be *imperative*.

>>> Regardless, it is neither desirable nor necessary to use public key
>>> encryption for the whole image (or its parts). Symmetric single key
>>> encryption (like DES, AES, etc.) requires fewer resources and runs
>>> faster. The only parts that require public key for secure
>>> communication and authentication are the symmetric key(s) and image
>>> signature(s).
>> Yes, this is an advantage on the "bigger" applications
>> (what's another "module" among friends? :> ). But, for these
>> smaller devices, it means adding a second piece of code
>> that just "runs faster" than the one piece of cryptocode
>> that is already *required*. So, to save resources (since
>> updates are infrequent in these "throw-away" projects),
>> it's easier to just wrap the PK utilities in a portable
>> wrapper (forward thinking) and use *them* as is.
>>
>> E.g., even something as simple as AES uses a bit of resources
>> (more so than wrapping an existing PK implementation)
>
> That's true, but I think you can arrange that the two different
> decrypt modules will never be needed simultaneously.

But they will have to reside in the "active image" concurrently (?)
E.g., I need the PK stuff to decode the keys, signatures, etc.
So, need it early in the update (else can't decode the modules
as they are downloaded). Then, I *quickly* need whatever
code is required to decrypt the module images, themselves.
And, before the process has finished, I need the PK code
(again) in preparation for the *next* update (i.e., I can't
discard the PK code once I have decoded the keys and use
that "space" for the module decrypt code)

> AES isn't particularly "simple" ... it is simply less costly than some
> of the PK algorithms. Cost depends on how secure you need it to be
> ... if you want strong protection you have to pay for it - in code and
> data space and in cycles. If the intent is simply to prevent casual
> abuse, you could as well just use a reversible "swizzler".
>
>> If you don't encrypt the images, then you have to ensure that
>> *every* "secret" is bundled in with the keys, signatures, etc.
>> E.g., if your code has any "passwords" (e.g., for a telnet session)
>> to access the device, then those must be passed to the device
>> in that "one" encrypted bundle, etc. Encrypting the entire
>> image gets around those "lapses" where you forget that something
>> you have exposed *can* be used against you.
>
> Point taken with the observation that sensitive information embedded
> in the image can be segregated such that only a small "resource"
> portion of the image need be encrypted.

If you take this approach, you have to be sure folks modifying
the code are aware of how *any* piece of information can be
used to compromise things. I think it easier just to tell
them they only have to "protect the crypto keys" and the
"process" will then protect the rest.

>>> AFAIK, none of the popular file copy/transfer protocols allow
>>> random access.
>> Grrrr... (slaps head). No, you are right. I was thinking
>> of something else. Yes, all I can do is order things in
>> the image in such a way that I can abort a transfer once
>> I have obtained "what I need".
>>
>> This also means that I need to arrange the modules *in* the
>> image in the order that I want them to be burned -- else I
>> would have to open the file repeatedly to get *to* the module
>> that I needed "next" (since I can't buffer the entire image).
>
> Hopefully such an arrangement is possible ... circular dependencies in
> a hot patch situation are a really big PITA.

Exactly. Rather than trying to engineer some set of
dependencies on the code a priori, I think the "update
module" gives me a way to defer those requirements
to "update time". :>

>>> If you want/need essentially random access without needing your own
>>> server application you might be forced into a remote file system
>>> protocol (like SMB, NFS, etc.). I have no experience implementing
>>> such protocols ... I've only used them.
>> No, I will live within something like TFTP. The others are
>> more demanding in terms of resources. And, present entirely
>> new sets of problems (stale handles, etc.)
>
> NFS clients are stateless if you don't use file locks ... but I don't
> know how complex their implementation is. I don't know offhand
> whether SMB clients are similarly stateless, you might want to take a
> look at some before you dismiss it.

NFS introduces security issues (to the server as well as the
clients). E.g., some shops won't use NFS as it exposes bits
of their server "needlessly" (hmmm... bad choice of word :< )

TFTP is the ideal transport protocol as it is easy to implement,
runs on UDP, has very little overhead, is supported in lots of places,
etc. If I augment it with all this other protocol stuff
(encryption, module sequencing, etc.) it looks like it
will do what I need done.

> Using hard link public names on the server would permit the clients to
> ignore file locking issues.
>
>> I think I will end up breaking those
>> images into pieces *on* the server. This will make
>> server-side management more error-prone (e.g., device A
>> used images 1Ai, 2Ai, 3Ai (<module_number><device><version>)
>> while device B uses 1Bj, 2Bj, 3Bj -- even though 1Ai == 1Bj
>> etc.).
>>
>> But, if I use the idea of having an "update module" as the first
>> part of any update and make this an *active* object used *in*
>> the update procedure, then that can explicitly go looking
>> for whatever pieces it needs...
>
> An active updater is an excellent idea that solves a bunch of
> problems: it can be a throw-away module that implements things like
> secondary fast decryption, remote file access, etc. that the
> application needs only while updating.

But you still need at least part of this to persist to
the "next update". Hmmm... maybe cut that module in half
and treat part of it as "update IPL" and the rest as the
"updater module" (this latter part being disposable after
an update completes?)

> It does not solve any hot patch issues though.
From: George Neuner on
On Sat, 13 Mar 2010 07:18:39 -0700, D Yuniskis
<not.going.to.be(a)seen.com> wrote:

>George Neuner wrote:
>> On Fri, 12 Mar 2010 12:50:09 -0700, D Yuniskis
>> <not.going.to.be(a)seen.com> wrote:
>>
>> ... I think you can arrange that the two different
>> decrypt modules will never be needed simultaneously.
>
>But they will have to reside in the "active image" concurrently (?)

Terminology?

They need to be loadable from persistent storage, but _not_ in memory
as they are not needed simultaneously.


>E.g., I need the PK stuff to decode the keys, signatures, etc.
>So, need it early in the update (else can't decode the modules
>as they are downloaded). Then, I *quickly* need whatever
>code is required to decrypt the module images, themselves.
>And, before the process has finished, I need the PK code
>(again) in preparation for the *next* update (i.e., I can't
>discard the PK code once I have decoded the keys and use
>that "space" for the module decrypt code)

See below.


>>> I think I will end up breaking those
>>> images into pieces *on* the server. This will make
>>> server-side management more error-prone (e.g., device A
>>> used images 1Ai, 2Ai, 3Ai (<module_number><device><version>)
>>> while device B uses 1Bj, 2Bj, 3Bj -- even though 1Ai == 1Bj
>>> (etc.).
>>>
>>> But, if I use the idea of having an "update module" as the first
>>> part of any update and make this an *active* object used *in*
>>> the update procedure, then that can explicitly go looking
>>> for whatever pieces it needs...
>>
>> An active updater is an excellent idea that solves a bunch of
>> problems: it can be a throw-away module that implements things like
>> secondary fast decryption, remote file access, etc. that the
>> application needs only while updating.
>
>But you still need at least part of this to persist to
>the "next update". Hmmm... maybe cut that module in half
>and treat part of it as "update IPL" and the rest as the
>"updater module" (this latter part being disposable after
>an update completes?)

Yes, you need at least some of the decrypt code to persist, but since
you are modular and your modules can be unloaded, you can easily
arrange that it doesn't all need to be *active* at the same time.

Consider the following scenario:

=---------

- Each module is verifiable by strong signature. Computing the
signature for the encrypted version (to verify the download) is
sufficient but you can also compute the plaintext version as a check
on the decryption.

- The device maintains a manifest of installed modules to be compared
to potential updates.


1) Load the PK decryption module.

2) Download a PK encrypted manifest of the latest release. The
manifest contains signatures for the modules that comprise the
release, a symmetric encryption key for decrypting them, and a
plaintext signature for itself (to check that PK decrypted it
properly).

3) Compare the update manifest to the running image manifest and note
differences. If the manifests match, go back to step 2.

4) Unload the PK decryption module.
5) Load the symmetric decryption module.

6) Download and store new (updated) modules.

Ideally you shouldn't replace memory resident code until you've
collected the whole set of updates, but obviously that depends on
whether you have enough local storage to hold multiple versions.

7) Unload the symmetric decryption module.

8) Save the new manifest and the updated module file locations for
your boot loader. Delete the old modules.

9) Finally, replace the memory resident modules with their updated
versions. Could be a warm boot or more complex if the program needs
to continue running through the replacement.

Rinse, Repeat.

=---------

This process is sound and I *think* it meets your requirements for
updating on the client side as you've described them. It may not meet
your sensibilities but I can't do anything about that 8-)

WRT symmetric encryption: as you say, this project is a prototype for
more capable devices - but even if you don't use it later the reason
I'm pushing symmetric encryption now is that you have expressed much
concern about the speed of the update process. Symmetric encryption
implementations are lighter weight in terms of memory use and far more
performant than PK ... attributes that are important to you now on
your current low(er) powered devices and also maybe in the future as
you contemplate WiFi.

With this process you can make download decryption configurable and
tweak it later to eliminate one of the modules or replace them with
different implementations.

George
From: D Yuniskis on
Hi George,

George Neuner wrote:
>>> ... I think you can arrange that the two different
>>> decrypt modules will never be needed simultaneously.
>> But they will have to reside in the "active image" concurrently (?)
>
> Terminology?

I think that's the problem (see below)

> They need to be loadable from persistent storage, but _not_ in memory
> as they are not needed simultaneously.

There is no (separate) "persistent storage". The images are XIP
(execute-in-place): they are flashed and execute directly out of
the flash.

I.e., the only way to "discard" part of the image ("unload"
it in your description below) is to erase that portion of
the flash.

>> E.g., I need the PK stuff to decode the keys, signatures, etc.
>> So, need it early in the update (else can't decode the modules
>> as they are downloaded). Then, I *quickly* need whatever
>> code is required to decrypt the module images, themselves.
>> And, before the process has finished, I need the PK code
>> (again) in preparation for the *next* update (i.e., I can't
>> discard the PK code once I have decoded the keys and use
>> that "space" for the module decrypt code)
>
> See below.
>
>>>> I think I will end up breaking those
>>>> images into pieces *on* the server. This will make
>>>> server-side management more error-prone (e.g., device A
>>>> used images 1Ai, 2Ai, 3Ai (<module_number><device><version>)
>>>> while device B uses 1Bj, 2Bj, 3Bj -- even though 1Ai == 1Bj
>>>> (etc.).
>>>>
>>>> But, if I use the idea of having an "update module" as the first
>>>> part of any update and make this an *active* object used *in*
>>>> the update procedure, then that can explicitly go looking
>>>> for whatever pieces it needs...
>>> An active updater is an excellent idea that solves a bunch of
>>> problems: it can be a throw-away module that implements things like
>>> secondary fast decryption, remote file access, etc. that the
>>> application needs only while updating.
>> But you still need at least part of this to persist to
>> the "next update". Hmmm... maybe cut that module in half
>> and treat part of it as "update IPL" and the rest as the
>> "updater module" (this latter part being disposable after
>> an update completes?)
>
> Yes, you need at least some of the decrypt code to persist, but since
> you are modular and your modules can be unloaded, you can easily
> arrange that it doesn't all need to be *active* at the same time.

That's the misunderstanding! If it's *in* the device,
it sits in the address space. There's no "secondary storage"
to load/unload from/to.

> Consider the following scenario:
>
> =---------
>
> - Each module is verifiable by strong signature. Computing the
> signature for the encrypted version (to verify the download) is
> sufficient but you can also compute the plaintext version as a check
> on the decryption.
>
> - The device maintains a manifest of installed modules to be compared
> to potential updates.
>
> 1) Load the PK decryption module.

The PK module sits in memory. Whether or not it is *active*
is just a function of whether or not it has been "CALLed".

> 2) Download a PK encrypted manifest of the latest release. The
> manifest contains signatures for the modules that comprise the
> release, a symmetric encryption key for decrypting them, and a
> plaintext signature for itself (to check that PK decrypted it
> properly).
>
> 3) Compare the update manifest to the running image manifest and note
> differences. If the manifests match, go back to step 2.
>
> 4) Unload the PK decryption module.

No "unloading". Module just "RETurns" when done.

> 5) Load the symmetric decryption module.

See above. I.e., this module resides in memory alongside
the PK module. Just one or the other is typically
"executing" at any time.

> 6) Download and store new (updated) modules.
>
> Ideally you shouldn't replace memory resident code until you've
> collected the whole set of updates, but obviously that depends on
> whether you have enough local storage to hold multiple versions.

There isn't enough RAM to hold more than ~5% of the image
(I need to keep the device *running*, which uses most of
the RAM resources -- modules have to be really small so
they can have minimal impact on that RAM usage "while
being updated"). There's no "scratch memory" (disk, etc.)
to spool things into. RAM + FLASH is all there is.

> 7) Unload the symmetric decryption module.
>
> 8) Save the new manifest and the updated module file locations for
> your boot loader. Delete the old modules.
>
> 9) Finally, replace the memory resident modules with their updated
> versions. Could be a warm boot or more complex if the program needs
> to continue running through the replacement.
>
> Rinse, Repeat.
>
> =---------
>
> This process is sound and I *think* it meets your requirements for
> updating on the client side as you've described them. It may not meet
> your sensibilities but I can't do anything about that 8-)
>
> WRT symmetric encryption: as you say, this project is a prototype for
> more capable devices - but even if you don't use it later the reason
> I'm pushing symmetric encryption now is that you have expressed much
> concern about the speed of the update process. Symmetric encryption
> implementations are lighter weight in terms of memory use and far more
> performant than PK ... attributes that are important to you now on
> your current low(er) powered devices and also maybe in the future as
> you contemplate WiFi.
>
> With this process you can make download decryption configurable and
> tweak it later to eliminate one of the modules or replace them with
> different implementations.

The point is conserving *ROM* on the first projects. They
are SoCs, so once you use up the FLASH, there's no more to play with.
If I can eliminate one module, then its space in the FLASH
makes room for one of the "update" copies of "a module".