From: Dmitry A. Kazakov on 10 Aug 2010 06:36

On Tue, 10 Aug 2010 01:56:22 -0700 (PDT), Natacha Kerensikova wrote:

> On Aug 9, 12:56 pm, "Dmitry A. Kazakov" <mail...(a)dmitry-kazakov.de>
> wrote:
>> On Mon, 9 Aug 2010 02:55:03 -0700 (PDT), Natacha Kerensikova wrote:
>>> On Aug 8, 5:15 pm, "Dmitry A. Kazakov" <mail...(a)dmitry-kazakov.de>
>>> wrote:
>>> S-expressions are not a format on top or below that, it's a format
>>> *besides* that, at the same level. Objects are serialized into byte
>>> sequences forming S-expression atoms, and relations between objects/
>>> atoms are serialized by the S-expression format. This is how one gets
>>> the canonical representation of a S-expression.
>>
>> I thought you wanted to represent *objects* ... as S-sequences?
>
> It depends what you call object. Here again, my vocabulary might have
> been tainted by C Standard. Take for example a record, I would call
> each component an object, as well as the whole record itself.

That's OK. Using this definition, an S-sequence in memory is an object.
Which was the question: what was wrong with the first object so that you
wanted another instead?

My first take was S-sequence used as an object presentation outside the
memory, because you used the word "format". Now it looks like a solution
before the problem. You seem to be going to convert objects to S-sequences
*in* the memory and then dump the result into files. Is it so? What was the
problem then? Because it cannot work without a conversion between
S-sequence in the memory (object) and S-sequence in the file
(representation). Why do you need S-sequence in the memory, while dumping
objects directly into files as S-sequences (if you insist on having them)
is simpler, cleaner, thinner, faster.

>>>> The difference is that Character represents code points and octet does
>>>> atomic arrays of 8 bits.
>>
>>> Considering Ada's Character also spans over 8 bits (256-element
>>> enumeration), both are equivalent, right?
>>
>> Equivalent defined as? In terms of types they are not, because the types
>> are different. In terms of the relation "=" they are not either, because
>> "=" is not defined on the tuple Character x Unsigned_8 (or whatever).
>
> Sorry, "equivalent" in the mathematical sense that there is a bijection
> between the set of Characters and the set of Octets, which allows using
> either of them to represent the other. Agreed, this is a very weak
> equivalence, it just means there are exactly as many octet values as
> Character values.

No, it is OK. Character can be used to represent octets. Ada provides means
for that, e.g.:

      type Octet is private;        -- Interface
   private
      type Octet is new Character;  -- Implementation

> On the other hand, Storage_Element and Character are not bijection-
> equivalent because there is no guarantee they will always have the
> same number of values, even though they often do.

Yes.

>>> Actually I've started to wonder whether Stream_Element might be even
>>> more appropriate: considering a S-expression atom is the serialization
>>> of an object, and I guess objects which know how to serialize themselves
>>> do so using the Stream subsystem, so maybe I could more easily
>>> leverage existing serialization code if I use Stream_Element_Array
>>> for atoms.
>>
>> Note that Stream_Element is machine-dependent as well.
>
> I'm sadly aware of that. I need an octet-sequence to follow the
> S-expression standard, and there is here an implementation trade-off:
> assuming objects already know how to serialize themselves into a
> Stream_Element_Array, I can either code a converter from
> Stream_Element_Array to octet-sequence, or reinvent the wheel and code
> a converter for each type directly into an octet-sequence. For some
> strange reason I prefer by far the first possibility.

That depends on your goal. Streams are machine-dependent. Streams of octets
are not. If you want to exchange objects in the form of S-sequences across
the network you have to drop standard stream implementations of the objects
and replace them with your own, based on the stream octets. In this case
you will not use Stream_Element_Array directly. You will read and write
octets, by Octet'Read and Octet'Write. Provided that octet streams work,
which is the case about 99.9% of the time, I guess. When they are not
capable of handling octets properly, you will have to implement I/O
manually. If you wrap Octet'Read into a function, you will be able to
exchange the I/O layer without affecting the upper ones. If we look at all
this mechanics we will see the good old OSI model.

> If it helps, you can think of S-expressions as a standardized way of
> serializing some relations between objects. However the objects still
> have to be serialized, and that's outside of the scope of
> S-expressions. From what I understood, the existing serializations of
> objects use Stream_Element_Array as a low-level type. So the natural
> choice for serializing the relations seems to be taking the
> Stream_Element_Array from each object, and hand over to the lower-
> level I/O a unified Stream_Element_Array.
>
> Does it make sense or am I completely missing something?

In another post Jeffrey Carter described this as low-level. Why not tell
the object: store yourself and all relations you need, I just don't care
which and how?

In fact I did something like that, persistent objects with dependencies
between them and collection of unreferenced objects. But I did it as
Jeffrey suggested. There is Put (Store, Object), the rest is hidden.

BUT, S-expressions do not support references, they are strictly by-value,
so you don't need that stuff anyway.

>> The point is that you never meet 80 before knowing that this is a "port",
>> you never meet "port" before knowing it is of "tcp-connect". You always
>> know all types in advance. It is strictly top-down.
>
> Right, in that simple example it is the case. It is even quite often the
> case, hence my thinking about a Sexp_Stream in another post, which
> would allow S-expression I/O without having more than a single node at
> the same time in memory.
>
> But there are still situations where S-expressions have to be stored in
> memory.

There are no such cases!

> For example the templates, where S-expressions represent a
> kind of limited programming language that is re-interpreted for each
> template extension.

I am not sure what you mean here, but in general template is not the
object, its instance is. You do not need S-expressions here either. You can
store/restore templates as S-sequences. A template in the memory would be
an object with some operations like Instantiate_With_Parameters etc. The
result of instantiation will be again an object and no S-sequence.

BTW, for an interpreter I would certainly prefer the Reverse Polish
Notation to a tree. (I think this too boils down to a solution before the
problem.)

> the latter being
> roughly the low-level (as in "close to the hardware", at least close
> enough not to rule out programming for embedded platforms and system
> programming) and the performance.

(BTW, Ada is closer to the hardware than C is. You can even describe
interrupt handlers in Ada. Try it in C.)

> Now I can't explain why your posts often make me feel Ada is
> completely out of my tastes in programming languages,

In the way of programming you mean? I wanted to convey that a
"C-programmer" will have to change some habits when switching to Ada. Ada
enforces a certain attitude to programming. Some people would say to
software engineering. It is not obvious because you can do in Ada anything
you can in C. But a stubborn hardcore "C-programmer" might become very
frustrated very soon. A competent C developer will only enjoy Ada.

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
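A minimal sketch of the layering suggested in the post above, where Octet'Read and Octet'Write are wrapped so that the I/O layer can be swapped without touching the upper layers. Everything named here is an assumption made for illustration: the names Get, Put and Octet_Layer_Demo, the file name, and the use of an 8-bit modular type in place of the private Octet derived from Character. It also assumes a platform where the default stream attributes of such a type write exactly one 8-bit stream element, which the thread notes is common but not guaranteed.

```ada
with Ada.Streams;           use Ada.Streams;
with Ada.Streams.Stream_IO;
with Ada.Text_IO;
with Interfaces;

procedure Octet_Layer_Demo is

   --  Assumption: an 8-bit modular type stands in for the private
   --  Octet type discussed in the thread.
   type Octet is new Interfaces.Unsigned_8;

   --  The only two places that touch the stream attributes. Upper
   --  layers call Get and Put, so the I/O layer can be replaced here
   --  without changing them.
   function Get
     (Stream : not null access Root_Stream_Type'Class) return Octet
   is
      Result : Octet;
   begin
      Octet'Read (Stream, Result);
      return Result;
   end Get;

   procedure Put
     (Stream : not null access Root_Stream_Type'Class;
      Item   : Octet) is
   begin
      Octet'Write (Stream, Item);
   end Put;

   File_Name : constant String := "octets.bin";  --  hypothetical name
   File      : Stream_IO.File_Type;
   Value     : Octet;

begin
   --  Write two octets through the wrapper ...
   Stream_IO.Create (File, Stream_IO.Out_File, File_Name);
   Put (Stream_IO.Stream (File), 16#28#);  --  ASCII '('
   Put (Stream_IO.Stream (File), 16#29#);  --  ASCII ')'
   Stream_IO.Close (File);

   --  ... and read them back the same way.
   Stream_IO.Open (File, Stream_IO.In_File, File_Name);
   while not Stream_IO.End_Of_File (File) loop
      Value := Get (Stream_IO.Stream (File));
      Ada.Text_IO.Put_Line (Octet'Image (Value));
   end loop;
   Stream_IO.Close (File);
end Octet_Layer_Demo;
```

On a machine where the default attributes do not handle octets properly, only the bodies of Get and Put would need a manual implementation; callers would stay unchanged.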
From: _FrnchFrgg_ on 10 Aug 2010 07:06

On 10/08/2010 09:16, Dmitry A. Kazakov wrote:

> On Mon, 09 Aug 2010 23:54:00 +0200, _FrnchFrgg_ wrote:
>> I think that you want pattern matching
>> (http://en.wikipedia.org/wiki/Standard_ML#Algebraic_datatypes_and_pattern_matching)
>
> No, I don't believe in type inference, in fact I strongly disbelieve in it.

Unification and pattern matching are independent of type inference. Sure,
most of the time you find both in the same languages, but IIRC in the
course of my master's, I have encountered languages with one but not the
other.
From: Dmitry A. Kazakov on 10 Aug 2010 07:19

On Tue, 10 Aug 2010 13:06:58 +0200, _FrnchFrgg_ wrote:

> On 10/08/2010 09:16, Dmitry A. Kazakov wrote:
>> On Mon, 09 Aug 2010 23:54:00 +0200, _FrnchFrgg_ wrote:
>>> I think that you want pattern matching
>>> (http://en.wikipedia.org/wiki/Standard_ML#Algebraic_datatypes_and_pattern_matching)
>>
>> No, I don't believe in type inference, in fact I strongly disbelieve in it.
>
> Unification and pattern matching are independent of type inference.

Did you mean the standard meaning of pattern matching instead of Standard
ML's Volapük?

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
From: Natacha Kerensikova on 10 Aug 2010 08:06

On Aug 10, 12:36 pm, "Dmitry A. Kazakov" <mail...(a)dmitry-kazakov.de>
wrote:

> On Tue, 10 Aug 2010 01:56:22 -0700 (PDT), Natacha Kerensikova wrote:
> > On Aug 9, 12:56 pm, "Dmitry A. Kazakov" <mail...(a)dmitry-kazakov.de>
> >> I thought you wanted to represent *objects* ... as S-sequences?
>
> > It depends what you call object. Here again, my vocabulary might have
> > been tainted by C Standard. Take for example a record, I would call
> > each component an object, as well as the whole record itself.
>
> That's OK. Using this definition, an S-sequence in memory is an object.

Just to be sure, what is it exactly you call S-sequence? For the rest
of this post I will assume it is a synonym of S-expression atom; I hope
my answers won't be too misguided.

> Which was the question: what was wrong with the first object so that you
> wanted another instead?

The first object is the internal memory representation designed for
actual efficient use. For example, an integer will probably be
represented by its binary value with machine-defined endianness and
machine-defined size.

The other object is a "serialized" representation, in the sense that
it's designed for communication and storage. For example the same
integer, in a context where it will be sent over a network, can be
represented as an ASCII-encoded decimal number, or in binary but with a
predefined size and endianness.

These are really the same considerations as when storing or sending an
object directly, except that it has to reside in memory for a short
time. There are no more conversions or representations than when
S-expression-lessly storing or sending objects; the only difference is
the memory buffering to allow S-expression-specific information to be
inserted around the stream.

> My first take was S-sequence used as an object presentation outside the
> memory, because you used the word "format". Now it looks like a solution
> before the problem. You seem to be going to convert objects to
> S-sequences *in* the memory and then dump the result into files. Is it
> so?

Yes, it is.

> What was the problem then?

The problem is to organize different objects inside a single file.
S-expressions standardize the organization and relations between
objects, while something else has to be done beyond S-expressions to
turn objects into representations suitable to live in a file. Or the
same thing with, instead of a file, an IPC socket or a network socket or
whatever. I just don't know how to call it generically without resorting
to a word derived from "serialize", but I hope you get the point anyway.

> Because it cannot work without a conversion between
> S-sequence in the memory (object) and S-sequence in the file
> (representation).

The S-expression standard describes which conversions are allowed and
what they consist of. I cannot follow the standard without such a
conversion anyway, so either I do it or I drop the idea of
S-expressions, but then I don't have anything to portably store or
transmit objects, so back to square one.

> Why do you need S-sequence in the memory, while dumping
> objects directly into files as S-sequences (if you insist on having them)
> is simpler, cleaner, thinner, faster.

Because I need to examine the S-sequence before writing it to disk, in
order to have enough information to write S-expression metadata. At the
very least, I need to know the total size of the atom before allowing
its first byte to be sent into the file.

> >> Note that Stream_Element is machine-dependent as well.
>
> > I'm sadly aware of that. I need an octet-sequence to follow the
> > S-expression standard, and there is here an implementation trade-off:
> > assuming objects already know how to serialize themselves into a
> > Stream_Element_Array, I can either code a converter from
> > Stream_Element_Array to octet-sequence, or reinvent the wheel and code
> > a converter for each type directly into an octet-sequence. For some
> > strange reason I prefer by far the first possibility.
>
> That depends on your goal. Streams are machine-dependent. Streams of octets
> are not. If you want to exchange objects in the form of S-sequences across
> the network you have to drop standard stream implementations of the objects
> and replace them with your own, based on the stream octets.

This looks like a very strong point in favor of using an
array-of-octets to represent S-expression atoms.

> In this case
> you will not use Stream_Element_Array directly. You will read and write
> octets, by Octet'Read and Octet'Write. Provided that octet streams work,
> which is the case about 99.9% of the time, I guess. When they are not
> capable of handling octets properly, you will have to implement I/O
> manually. If you wrap Octet'Read into a function, you will be able to
> exchange the I/O layer without affecting the upper ones. If we look at
> all this mechanics we will see the good old OSI model.

That sounds like a very nice way of doing it. So in the most common
case, there will still be a stream, provided by the platform-specific
socket facilities, which will accept an array-of-octets, and said array
would have to be created from objects by custom code, right? (I just
want to be sure I understood correctly.)

> In another post Jeffrey Carter described this as low-level. Why not tell
> the object: store yourself and all relations you need, I just don't care
> which and how?

That's indeed a higher-level question. That's how it will happen at
some point in my code; however at some other point I will still have to
actually implement said object storage, and that's when I will really
care about which and how. I've been aware from the very beginning that
an S-expression library is low-level and is only used by mid-level
objects before reaching the application.

I've discussed this with a friend of mine who has a better knowledge
than me about what actually is a parser and a lexer and things like
that. It happens that what my S-expression code is intended to be is
actually a partial parser, requiring some more specific stuff on top of
it to actually be called a parser. Just like S-expression is a partial
format, in that it describes how to serialize relations between atoms
without describing how objects are serialized into atoms.

At some point in my projects I will have to write various configuration
file parsers, template parsers, and maybe a lot of other parsers. They
can all be treated as independent parsers, and implemented with
independent code, maybe derived from YACC or something. I didn't choose
that path; I chose rather to use what I called a S-expression library,
which is sort of a common core to all these parsers, so I only have left
to write the specific part of each situation, which is matching the
keywords and typing/deserializing the objects.

> > But there are still situations where S-expressions have to be stored in
> > memory.
>
> There are no such cases!

Right, I guess just like goto-use, they can always be avoided, but I'm
still not convinced it's always the best.

> > For example the templates, where S-expressions represent a
> > kind of limited programming language that is re-interpreted for each
> > template extension.
>
> I am not sure what you mean here, but in general template is not the
> object, its instance is.

Just to be sure we use the same words, I'm talking here about HTML
templates. The basic definition of a S-expression is a list of nodes,
each of which can be a list or an atom; for a template I copy atom
contents directly into the output, and lists are interpreted as
functions, with an "instance" (or whatever is called the thing
containing data to populate the template) handling the actual
"execution" of the function.

Now the interest I find in the recursive definition of S-expression is
that it's easy to pass template-fragments as arguments of these
functions. For this it's much easier to let the S-expression describing
the template reside in memory. S-expressions are of course not required
here, but provide a nice and simple unified format.

> You do not need S-expressions here either. You can
> store/restore templates as S-sequences. A template in the memory would be
> an object with some operations like Instantiate_With_Parameters etc. The
> result of instantiation will be again an object and no S-sequence.

Well, how would you solve the problem described above without
S-expressions? (That's a real question; if something simpler and/or more
efficient than my way of doing it exists, I'm genuinely interested.)

> > the latter being
> > roughly the low-level (as in "close to the hardware", at least close
> > enough not to rule out programming for embedded platforms and system
> > programming) and the performance.
>
> (BTW, Ada is closer to the hardware than C is. You can even describe
> interrupt handlers in Ada. Try it in C.)

Fantastic \o/

> > Now I can't explain why your posts often make me feel Ada is
> > completely out of my tastes in programming languages,
>
> In the way of programming you mean? I wanted to convey that a
> "C-programmer" will have to change some habits when switching to Ada.

I'm unsure what you mean here by "C-programmer". As I said in other
posts, I don't really code the same way as other C coders I've met. I
still don't know whether it's a good thing or not. I'm willing to change
some of my habits to switch to Ada, while I won't give up some others
even if it means giving up Ada. The main criterion being that coding
must remain fun, because it makes no sense to continue doing a leisure
activity that isn't fun anymore.

> But a stubborn hardcore "C-programmer" might become very frustrated
> very soon. A competent C developer will only enjoy Ada.

I somehow hope I'm more of the latter than the former. BTW, that post of
yours is one that encourages me rather than restrains me.

Thanks for your interesting replies,
Natacha
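The point about needing an atom's total size before its first byte can be written follows from the canonical S-expression form, in which each atom is emitted as a decimal length, a colon, and then the raw bytes. A small sketch of that encoding is given here under some assumptions: the helper name Atom is hypothetical, atoms are held in a String purely for brevity (the thread itself leans towards an array of octets), and the nesting chosen for the "tcp-connect"/"port"/"80" example quoted earlier in the thread is a guess.

```ada
with Ada.Strings;       use Ada.Strings;
with Ada.Strings.Fixed; use Ada.Strings.Fixed;
with Ada.Text_IO;

procedure Canonical_Atom_Demo is

   --  Hypothetical helper: encode one fully buffered atom as
   --  "<decimal length>:<raw bytes>". The length prefix is why the
   --  whole atom has to be known before anything is written.
   function Atom (Data : String) return String is
   begin
      return Trim (Natural'Image (Data'Length), Left) & ":" & Data;
   end Atom;

begin
   --  Lists are simply wrapped in parentheses around their nodes.
   Ada.Text_IO.Put_Line
     ("(" & Atom ("tcp-connect")
      & "(" & Atom ("port") & Atom ("80") & "))");
   --  prints (11:tcp-connect(4:port2:80))
end Canonical_Atom_Demo;
```

Only the atom being written needs to be buffered for this; a list can still be streamed node by node, which fits the single-node-in-memory Sexp_Stream idea mentioned in the thread.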
From: Robert A Duff on 10 Aug 2010 08:50

"Randy Brukardt" <randy(a)rrsoftware.com> writes:

> Not sure if it still exists in the real world, but the compiler we did for
> the Unisys mainframes used Stream_Element'Size = 9.

Interesting. Can these machines communicate with non-Unisys machines
over a regular TCP/IP network? E.g. send an e-mail using standard
protocols, that can be read on an x86?

I assume Storage_Element'Size = 9, too. Correct?

Next question: Is (was) there any Ada implementation where
Stream_Element'Size /= Storage_Element'Size?

>...(The Unisys mainframes
> are 36-bit machines.) Stream_Element'Size = 8 would have made the code for
> handling arrays awful.
>
> Similarly, Character'Size = 9 on that machine.

That sounds like a non-conformance, at least if the SP Annex is
supported. Maybe you mean X'Size = 9, where X is of type Character?
You'd certainly want 'Component_Size = 9 for array-of-Character.

> That would have made a compiler for the Unisys machines impossible; it would
> have made streaming impossible. There is no sane way to put 36-bit values
> into Octets - the only way that would have worked would have been to use
> 16-bits for every 9-bit byte.
>
> Whether this is a significant consideration today (in 2010) is debatable,
> but it surely was a real consideration back in 1993-4. So Ada 95 could not
> have made this choice.

I think it would not be a good idea to make Ada unimplementable
on "odd-ball" machines.

- Bob
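The sizes discussed in this post can be queried directly on any given implementation. The following is only an illustrative sketch (the procedure name Element_Sizes is made up); it prints whatever the compiler in use reports, so the results depend on the target. On the 36-bit Unisys compiler described above, the first value would reportedly be 9.

```ada
with Ada.Streams;
with Ada.Text_IO;
with System.Storage_Elements;

procedure Element_Sizes is
   use Ada.Text_IO;
begin
   --  Size in bits of one stream element (implementation-defined).
   Put_Line ("Stream_Element'Size  ="
             & Integer'Image (Ada.Streams.Stream_Element'Size));

   --  Size in bits of one storage element (implementation-defined).
   Put_Line ("Storage_Element'Size ="
             & Integer'Image
                 (System.Storage_Elements.Storage_Element'Size));

   --  Size in bits of the Character subtype.
   Put_Line ("Character'Size       ="
             & Integer'Image (Character'Size));
end Element_Sizes;
```

Comparing the first two values answers, for one particular implementation, the question of whether Stream_Element'Size differs from Storage_Element'Size.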