From: Gordon Burditt on
> Accordingly, let us agree for the sake of argument, that plaintext and
> ciphertext are encoded in the same alphabet and that the act of
> deciphering should yield the original plaintext as output, verbatim.
....
> Theoretically, it _is_ possible to create a lossless cipher scheme
> that will take arbitrary plaintext and guarantee that the ciphertext
> will not be larger.
>
> So as a practical matter, useful ciphers tend to produce ciphertext
> output that is either the same size as the plaintext input or, at
> least, not much larger.

If you're using compression, and a block cipher, you can guarantee
a limit on the amount of expansion. For the compression stage,
either output a '1' bit (for "this is compressed") followed by the
compressed plaintext if it happens to be shorter, or a '0' bit (for
"this is not compressed") followed by the original plaintext, if
the compression ended up expanding the plaintext. In practice, the
extra bit for the compression flag will take a whole byte. You
could also use that byte to indicate what type of compression was
used. The compressor could try up to 255 different methods, then
take the shortest result. It won't be fast. It does have a better
chance of shrinking the plaintext than one compression type, but
the worst case is still 1 byte of expansion.
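
A minimal sketch of the flag-byte scheme described above, assuming Python's standard zlib, bz2 and lzma modules as the candidate compressors. The method IDs and the names pack and unpack are made up for illustration; only three of the possible 255 methods are filled in.

```python
import bz2
import lzma
import zlib

# Method byte 0 means "stored, not compressed". The other IDs are
# arbitrary choices for this sketch, not part of any standard.
COMPRESSORS = {
    1: (zlib.compress, zlib.decompress),
    2: (bz2.compress, bz2.decompress),
    3: (lzma.compress, lzma.decompress),
}

def pack(plaintext: bytes) -> bytes:
    """Return one method byte followed by the shortest encoding found."""
    best_id, best_body = 0, plaintext
    for method_id, (compress, _) in COMPRESSORS.items():
        candidate = compress(plaintext)
        # Keep a compressed form only if it beats the best so far,
        # so incompressible input falls back to method 0 (stored).
        if len(candidate) < len(best_body):
            best_id, best_body = method_id, candidate
    return bytes([best_id]) + best_body

def unpack(packed: bytes) -> bytes:
    """Undo pack(): read the method byte, then decompress or copy."""
    method_id, body = packed[0], packed[1:]
    if method_id == 0:
        return body
    return COMPRESSORS[method_id][1](body)

for sample in (b"the quick brown fox " * 50, bytes(range(256)), b""):
    packed = pack(sample)
    assert unpack(packed) == sample          # lossless
    assert len(packed) <= len(sample) + 1    # never more than 1 byte bigger
```

Trying every method costs time in proportion to the number of compressors, which is the "it won't be fast" part; the one-byte bound holds no matter how many are tried.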

For the block cipher, you probably have to round up to the next
block boundary.
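
As a rough illustration of the rounding, here is the arithmetic for one common convention. It assumes PKCS#7-style padding (which always adds at least one byte) and a 16-byte block as in AES; the function name is hypothetical, and modes that need no padding, as well as any IV sent along, are not counted.

```python
def padded_length(n_bytes: int, block_size: int = 16) -> int:
    """Length after PKCS#7-style padding to a whole number of blocks.

    This padding always adds at least one byte, so an input that is
    already a multiple of the block size grows by one full block.
    """
    return (n_bytes // block_size + 1) * block_size

# Lengths just below, at, and just above one block.
assert padded_length(15) == 16
assert padded_length(16) == 32
assert padded_length(17) == 32

# With a one-byte compression flag in front, total growth over the
# original plaintext stays between 2 and 17 bytes.
assert all(2 <= padded_length(n + 1) - n <= 17 for n in range(1000))
```

So under these assumptions the flag byte plus padding gives a fixed ceiling on expansion, independent of the message size.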


> For good or ill, the near-universal paradigm for electronic data
> storage and interchange has come to be the contiguous sequence
> of eight-bit bytes, usually together with a knowable length.
>
> I'm in complete agreement that this byte-stream format is
> something a generic crypto scheme ought to be able to handle.

From: adacrypt on
On Jul 16, 4:28 pm, gordonb.j4...(a)burditt.org (Gordon Burditt) wrote:
> >My question is this:  How in general, in modern Western cryptography
> >is textual data for encryption entered into a computer.  Is it keyed
> >in interactively at the keyboard or is it read in from prepared
> >external batch files.  I am speaking of ordinary plaintext from the
> >character set of printable ASCII.
>
> I do not enter ordinary plaintext (whether it's for encryption or
> not) from the character set of printable ASCII.  Even one line of
> plaintext ends in a newline.
>
> If it's anything complicated (even one line can be complicated),
> and it's not already machine-readable, I'm going to be typing it
> into a text editor.  (That text editor may be creating files on an
> encrypted filesystem.  Or maybe the message only needs encryption
> for sending over the Internet via email.)
>
> In pretty much all cases of text files, it is important that the
> newlines stay where they are.  Failure to do this can render source
> code unusable, and make poetry unreadable.  It makes CSV files
> unusable.  In no case is it acceptable to limit line lengths by
> inserting a newline in the middle of a word.
>
> Stuff that I encrypt is generally not so sensitive that I can't
> keep it available unencrypted to work on it.  (Why keep around
> secret stuff if you can't use it for something?)  It may be stored
> on an encrypted file system, but when it's mounted, I can access
> it transparently without having to deal with the encryption at that
> point.  Unmount the encrypted file system, and it can't be accessed
> without the passphrase.  The threat is more of an email going astray
> or getting intercepted over the Internet, or perhaps a lost or
> stolen laptop, than someone using my computer when I turn away from
> the computer to talk to someone else with the encrypted filesystem
> mounted.
>
> The data is generally stored in a file in a form so I can *USE* it,
> whether that is a text file, a word processing file, a spreadsheet,
> an image, or whatever.  You may think of those as "batch files" for
> encryption, but I'm more concerned with being able to *use* the
> files, and these are file types appropriate to the application.  If
> I send someone an encrypted email, it's probably important enough
> that I keep a copy, at least until I'm sure he got it and answers.
>
> >My own ciphers are designed to 1) handle large-volume secure
> >communications by reading plaintext in from external batch files and
> >writing out the ciphertext that emanates from the encryption to
> >another dedicated external file that may then be transmitted by
> >electronic means and  2) short impromptu email messages are keyed in
> >from the keyboard in interactive mode, encrypted, and the ciphertext
> >is then written to an external file for transmission by electronic
> >means.
>
> It's unlikely I'm going to use (2) much at all.  If it's used often
> enough, I'll have a script that brings up a text editor, then
> encrypts and sends the message when it is done.
>
> Your ciphers (you claim) are theoretically unbreakable but they are
> theoretically unbreakable *EVEN WITH THE KEY*, since they mangle
> plaintext irretrievably.
>
> >To me this seems the obvious thing to do.  I can’t understand why I am
> >being admonished to read bytes instead of plaintext as it appears.
>
> >Surely it would be retrogressive to prepare plaintext as bytes instead
>
> It is unnecessary to "prepare plaintext as bytes".  It already *is*
> bytes.  If ADA can't do this (and it *can* do this, but you don't
> seem to know enough ADA to figure out how), dump ADA.  You don't
> seriously think someone is suggesting that you "prepare" a file
> filled with ASCII 1's and 0's in groups of 8, do you?
>
> I've got programs on my computer that can encrypt an arbitrary file
> in one of many ciphers, then decrypt it back and get exactly the
> same file contents as the original.  Why can't yours do the same
> thing?
>
> >of just letting the computer do that.
>
> Don't you get it?  I don't have to "prepare a batch file", an
> application file is already there.  I have a spreadsheet with an
> inventory of the computers on the network.  It gets updated often
> as things change.  If the boss wants a new copy, I send it as an
> attachment to an email.  No, it's generally NOT needed to do an
> "export to ASCII", and in any case, messing up the newlines in
> an export will mangle the data when it's imported.  The file gets
> run through encryption, then base64 to give a palatable character
> set for email, and made into an attachment.
>
> >What gives here? - adacrypt
>
> You don't understand how people use computers.  Perhaps your brain
> is theoretically impenetrable by common sense.  You've designed
> cryptography that can only encrypt and decrypt a near-empty subset
> of documents used today.

Hi Gordon,

>Don't you get it? I don't have to "prepare a batch file", an
>application file is already there. I have a spreadsheet with an
>inventory of the computers on the network. It gets updated often
>as things change. If the boss wants a new copy, I send it as an
>attachment to an email.

Many thanks for trying so hard - the penny has dropped now with what
you say but no thanks to anyone else - my simple question is answered
OK now so maybe I can tell you the piece of idealism I was aiming at.

In my view a cipher should be able to go solo completely on the back
of a theoretically unbreakable mathematical algorithm, that is by
being used in the hands of a non-specialist office worker with minimal
training (that means no user-assistance whatever). Clearly that is
not going to be the case ever with either AES (that you are so
obsessively single-minded about) and even more so the RSA cipher.
These are always going to be costly to run in that they require
specialist management that must be provided by a highly informed,
interactive operator. That is a serious drawback in a competitive
market that is certain to hot up in the years ahead through normal
healthy competition to say nothing of the threat from increasing
computer power that may simply put the kibosh on both of them in one
fell swoop.

From a programming point of view, interactive real-time and batch mode
of operation are quite different program designs that are possible in
mathematically-driven cryptography but may not be possible with
complexity-driven cryptography (AES).

What amazes me most is the arrant presumption that AES and RSA are
unquestionably here to stay - no doubt about it it seems to the
pundits.

The hip-pocket nerve is most sensitive to cost in the world of
e-commerce security of information. This is a very discerning market
that unlike the national security arm of cryptography is not a captive
one and is not bound by so-called 'advanced' 'standard' (neither of
these is true) - they will kick the 'standard' bit (pardon the pun)
into touch at the drop of a hat if it is demonstrated to them that they
can go it alone in managing their own network - that is now a distinct
possibility.

I wouldn't take any bets on AES or RSA ciphers being around more than
another ten years. Old ciphers are as useless as old newspapers.

- thanks for your help - adacrypt

From: Bruce Stephens on
adacrypt <austin.obyrne(a)hotmail.com> writes:

[...]

> that is by being used in the hands of a non-specialist office worker
> with minimal training (that means no user-assistance whatever).
> Clearly that is not going to be the case ever with either AES (that
> you are so obsessively single-minded about) and even more so the RSA
> cipher. These are always going to be costly to run in that they
> require specialist management that must be provided by a highly
> informed, interactive operator.

Be more specific. What "specialist management", precisely, do you think
is required?

As I mentioned before, just about everybody who uses a computer uses RSA
and some symmetric cipher (more commonly RC4, I'd guess) routinely and
mostly without noticing. Everybody who uses gmail uses RSA and RC4 (or
some other symmetric cipher, if they're sophisticated enough to change
it) every time they check for email.

Fewer will use encryption for other purposes, but I see no reason to
expect that they'd find S/MIME or GPG (using RSA, DSA, etc.)
particularly problematic---it's just that few people see a particular
desire to use those.

Almost all office workers who work from home for some of the time
currently use cryptography in connecting to the office, again without a
particular problem as far as I can tell---not one that would be helped
by your software, anyway. Many will use TLS even inside the office to
connect to mail servers and things inside the office, just because
that's how things are set up (rather than for any particular security
reasons).

And please stop mentioning RSA: RSA does things which your software
doesn't attempt to do. Asymmetric systems are a different category to
symmetric systems, and yours is symmetric.

[...]

From: rossum on
On Thu, 15 Jul 2010 23:03:16 -0700 (PDT), adacrypt
<austin.obyrne(a)hotmail.com> wrote:

>
>My question is this: How in general, in modern Western cryptography
>is textual data for encryption entered into a computer. Is it keyed
>in interactively at the keyboard or is it read in from prepared
>external batch files. I am speaking of ordinary plaintext from the
>character set of printable ASCII.
You are asking the wrong question. Modern cryptography is able to
encrypt *any* data, whether it is text or not. Images, text,
spreadsheets, databases, compressed archives, PDFs, proprietary
formats, open formats etc. Any and all possible formats can be
encrypted and decrypted.

Block cyphers, such as AES or DES, work on blocks of bytes. Stream
cyphers, such as Salsa20 or RC4, work on single bytes. All possible
formats can be reduced to bytes, and the cyphers just deal with bytes.
Bytes are encrypted to other bytes and decrypted back to the original
bytes. The cypher does not know what the bytes mean, and does not
need to know what the bytes mean.

>
>My own ciphers are designed to 1) handle large-volume secure
>communications by reading plaintext in from external batch files and
>writing out the ciphertext that emanates from the encryption to
>another dedicated external file that may then be transmitted by
>electronic means and 2) short impromptu email messages are keyed in
>from the keyboard in interactive mode, encrypted, and the ciphertext
>is then written to an external file for transmission by electronic
>means.
As others have mentioned, it is possible that you have misunderstood
the meaning of "plaintext" in a cryptographic context. It means "the
original data", nothing more. There is no implication that the
plaintext is human-readable. A ZIP file can be plaintext for
cryptographic purposes, even though it is not human readable.

If you want other people to use your crypto system then it should be
able to encrypt any file at all and to decrypt that file back to a
byte-for-byte identical file at the receiving end.

>
>To me this seems the obvious thing to do. I can't understand why I am
>being admonished to read bytes instead of plaintext as it appears.
Plaintext *is* bytes. Do not be misled by the -text suffix into
thinking that plaintext must be ASCII text.

>
>Surely it would be retrogressive to prepare plaintext as bytes instead
>of just letting the computer do that.
Not retrogressive, just practical. Take a bunch of bytes. Encrypt
them to a different bunch of bytes. Decrypt them back to the original
bytes. That is all a cryptosystem is meant to do. Have a look at
AES, Salsa20 or any other modern cryptosystem. All of them take bytes
as input and produce bytes as output.
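
To make "bytes in, bytes out" concrete, here is a toy sketch. The keystream built from SHA-256 and a counter is an assumption made purely for illustration; it stands in for a real stream cipher, has had no security review, and should not be used to protect anything. The point is only that the code never asks what the bytes mean.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 of key plus a block counter, concatenated."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def xor_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

# Every possible byte value, plus newlines, a NUL and 0xFF.
original = bytes(range(256)) + b"line one\nline two\r\n\x00\xff"
sealed = xor_cipher(b"demo key", original)

assert len(sealed) == len(original)                 # no expansion
assert xor_cipher(b"demo key", sealed) == original  # byte-for-byte round trip
```

Newlines, control characters and non-ASCII values all survive because they are never treated as anything other than numbers from 0 to 255.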

>
>What gives here? - adacrypt
Bytes.

rossum

From: Gordon Burditt on
>In my view a cipher should be able to go solo completely on the back
>of a theoretically unbreakable mathematical algorithm, that is by
>being used in the hands of a non-specialist office worker with minimal
>training (that means no user-assistance whatever).

Unfortunately, these two parts have nothing to do with each other.
The encryption algorithm is something you can change out with a
software update and the users don't have to care at all. The
strength of the algorithm has nothing to do with the amount of
administration an office worker needs to do. The *type* of the
algorithm (symmetric vs. asymmetric, for example) may affect this.

That means that if the user clicks on a PDF, MP3, or video file to
attach to an email, you need to handle it correctly, not quietly
corrupt it, and not have to have someone around to explain that
encryption is for the printable subset of ASCII only.

You will have these problems with user/manager administration using
*ANY* crypto system:

1. You have to keep the key secret. Don't fall for phishing requests
for a copy of whatever file a key is kept in.

2. For two users to communicate by encrypted email, they need to
set up a key first. New users (e.g. new hires) will enter the
group. Asymmetric cryptography, which yours is *not*, has the
advantage that it can send the public key with a message, so anyone
receiving it can send an encrypted reply using a crypto-aware email
client that stores keys for correspondents. No setup, it Just
Works. Also, you can set up public key registries without blowing
security.

Now explain how two users set up to communicate by encrypted mail,
using your cryptography. You can't send the secret key with the
message, that would blow all your security. Of course, your answer
is "Duh, that's a management problem". Of course it is, but it's
very relevant to the problem that your cryptography (and any symmetric
cipher) takes a lot more administration than an asymmetric one.

3. You have to educate users about what must be sent encrypted and
what need not be. You will have cases where you need to send email
to someone who has no key set up (e.g. ordering pizza). A smart
email client can automatically encrypt if a key is available, and
warn if it's about to send something unencrypted, but users have
to understand the warning and deal with the fact that it's OK to
order pizza unencrypted but not send client lists and sales reports
to the home office unencrypted.

4. Somehow the user, or email client program, has to figure out
what key to use for what message, both encrypting and decrypting.
Email client programs can be pretty smart but they have to have
something to go on. That includes contending with the fact that
users may have multiple e-mail addresses, and users may have multiple
keys for various purposes (such as "work" and "home").
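
Points 3 and 4 can be sketched as a lookup an email client might do before sending. Everything here is hypothetical: the keyring contents, the addresses and the function names are invented for illustration, and no real encryption takes place.

```python
from typing import Optional, Tuple

# Hypothetical keyring: one person may appear under several addresses,
# each with its own key for a different purpose.
KEYRING = {
    "alice@work.example": "alice-work-key",
    "alice@home.example": "alice-home-key",
    "boss@work.example": "boss-key",
}

def key_for(recipient: str) -> Optional[str]:
    """Look up the key for an address, ignoring case and stray spaces."""
    return KEYRING.get(recipient.strip().lower())

def prepare(recipient: str, body: bytes) -> Tuple[str, bytes]:
    """Decide how a message would go out; returns (action, body)."""
    key_id = key_for(recipient)
    if key_id is None:
        # No key on file: the user has to judge whether that is
        # acceptable (pizza order) or not (client list).
        return ("WARNING: sending unencrypted", body)
    return ("encrypt with " + key_id, body)

assert prepare("Boss@Work.example", b"sales report")[0] == "encrypt with boss-key"
assert prepare("orders@pizza.example", b"one large")[0].startswith("WARNING")
```

The program can pick the key and raise the warning, but deciding what the warning means for a given message is still up to the user, whatever cipher sits underneath.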


>Clearly that is
>not going to be the case ever with either AES (that you are so
>obsessively single-minded about) and even more so the RSA cipher.

I didn't mention AES in my post. It deserves mention that it is a
proof-of-concept that you *can* read plaintext as bytes and output
plaintext as bytes, without mangling the message. That code still
works even if AES is broken and can only be trusted for 99-cent app
downloads.

>These are always going to be costly to run in that they require
>specialist management that must be provided by a highly informed,
>interactive operator.

Why interactive? Periodic key changes, adding and deleting users,
and updating the email client program and encryption program are
occasional jobs; none of them has to be done the moment it comes up.

>The hip-pocket nerve is most sensitive to cost in the world of
>e-commerce security of information.

e-commerce (and particularly SSL) uses RSA for a very significant
reason: It's an asymmetric cipher. An email client can include
my encrypted mail certificate in my outgoing email, and the recipient
can then reply with no further information. An asymmetric cipher
can also be used to authenticate identity with certificates. A
symmetric cipher cannot do that. RSA beats the snot out of any
symmetric cipher for some very important properties crucial to
e-commerce.
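
For anyone who has not seen why the public key can travel with the message, here is textbook RSA with tiny primes. It is a toy: the numbers are far too small, there is no padding, and real systems use vetted libraries. The variable and function names are chosen for this sketch only.

```python
# Textbook RSA with tiny primes. Insecure; for illustration only.
p, q = 61, 53
n = p * q                  # public modulus
phi = (p - 1) * (q - 1)
e = 17                     # public exponent
d = pow(e, -1, phi)        # private exponent (needs Python 3.8+)

public_key = (n, e)        # safe to attach to every outgoing message
private_key = (n, d)       # never leaves the owner's machine

def apply_key(value: int, key: tuple) -> int:
    """RSA is modular exponentiation with whichever key is supplied."""
    modulus, exponent = key
    return pow(value, exponent, modulus)

# A stranger who holds only public_key can encrypt a reply,
# and only the holder of private_key can read it.
message = 65
ciphertext = apply_key(message, public_key)
assert apply_key(ciphertext, private_key) == message

# Authentication runs the other way: sign with the private key,
# and anyone can check the result with the public key.
signature = apply_key(message, private_key)
assert apply_key(signature, public_key) == message
```

A symmetric cipher has only one key, so there is nothing that can be published without giving away the ability to decrypt as well.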

Remember, SSL needs to be able to encrypt images.

You *still* haven't explained how two users who want to communicate
using your ciphers set up a key. Especially if the only communication
method they have is the Internet.



>This is a very discerning market that
>unlike the national security arm of cryptography is not a captive one
>and is not bound by so-called 'advanced' 'standard' (neither of these
>is true) - they will kick the 'standard' bit (pardon the pun) into
>touch at the drop of a hat if it is demonstrated to them that they can
>go it alone in managing their own network - that is now a distinct
>possibility.

They will need an asymmetric cipher. Yours is not one. If you do come
up with a theoretically unbreakable *asymmetric* cipher, the world will
beat a path to your door. Assuming, of course, that it handles images
and UTF-8.

>I wouldn't take any bets on AES or RSA ciphers being around more than
>another ten years. Old ciphers are as useless as old newspapers.

True. But a symmetric cipher cannot replace RSA for use in e-commerce.