From: Natacha Kerensikova on 7 Aug 2010 08:56

On Aug 7, 10:39 am, "Dmitry A. Kazakov" <mail...(a)dmitry-kazakov.de> wrote:
> One cannot judge a format without knowing what is the purpose of. Most of the formats like S-expressions are purposeless, in the sense that there is no *rational* purpose behind them. As you wrote above, it is either legacy (we have to overcome some limitations of some other poorly designed components of the system) or personal preferences (some people like angle brackets others do curly ones).

Why can't there be a general-purpose format, just like there are general-purpose programming languages?

Can we at least agree on the fact that a sequence of bytes is a general-purpose format, widely used for storing and transmitting data? (this question is just a matter of vocabulary)

Now byte sequences are a very crude format, because they don't have any semantics besides what the application specifically puts into them.

So let's add as few semantics as possible, to keep as much generality as possible. We end up with a bunch of byte sequences, whose semantics are still left to the application, linked together by some kind of semantic link. When the chosen links are "brother of" and "sublist of" you get exactly S-expressions. The almost-RFC I linked is only this definition along with a standardization of how to serialize the links and the byte sequences.

This is undoubtedly still a crude format. You might argue it's useless to add so little semantics on top of byte sequences, and that serialization should be engineered only when you know what you are about to serialize, i.e. make a much larger leap between byte sequences and meaningful objects. I might even agree on a philosophical point of view.

However from a purely practical point of view, and using the fact that in my background languages (C and 386 asm) byte sequences and strings are so similar, these crude semantics are all I need (or at least, all I've ever needed so far).
Now if we agree that simplicity is a desirable quality (because it leads to less bugs, more productivity, etc), I still fail to see the issues of such a format.

When I mentioned earlier flexibility as a strong point for S-expressions, I meant that just like byte sequences, they can accommodate whatever purpose you might want to put on top of them.

Now regarding personal preferences about braces, I have to admit I'm always shocked at the way so many people dismiss S-expressions on first sight because of the LISP-looking parentheses. I'm very glad I can have a higher-level conversation about them.

> > The other application is actual serialization,
>
> That should not be a text.

I tend to agree with that as a generality, however I believe some particular cases might benefit from a text-based serialization, in order to harness the power of existing text-based tools.

> >>> But now that I think about it, I'm wondering whether I'm stuck in my C way of thinking and trying to apply it to Ada. Am I missing an Ada way of storing structured data in a text-based way?
> >
> >> I think yes. Though it is not Ada-specific, rather commonly used OOP design patterns.
>
> > I heard people claiming that the first language shapes the mind of coders (and they continue saying a whole generation of programmers has been mind-crippled by BASIC). My first language happened to be 386 assembly, that might explain things.
>
> I see where mixing abstraction layers comes from...

Could you please point me where I am mixing what? I genuinely want to learn, but I just don't understand what you're referring to.

> > Anyway, I genuinely tried OOP with C++ (which I dropped because it's way too complex for me (and I'm tempted to say way too complex for the average coder, it should be reserved to the few geniuses actually able to fully master it)), but I never felt the need of anything beyond what can be done with a C struct containing function pointers.
>
> Everything is Turing-complete you know... (:-))

I should know, I was accessing from assembly some (DirectX) C++ objects' vtable array before I knew anything about OOP.

My point is, most of my (currently non-OOP) code can be expressed as well in an OOP style. When I define a C structure along with a bunch of functions that perform operations on it, I'm conceptually defining a class and its methods, only expressed in a non-OOP language. I sometimes put function pointers in the structure, to have a form of dynamic dispatch or virtual methods. I occasionally even leave the structure declared but undefined publicly, to hide internals (do you call that encapsulation?), along with functions that could well be called accessors and mutators. In my opinion that doesn't count as OOP because it doesn't use OOP-specific features like inheritance.

> > The problem is, I just can't manage to imagine how to go in a single step from the byte sequence containing a S-expression describing multiple objects to the internal memory representation and vice-versa.
>
> You need not, that is the power of OOP you dislike so much.

I don't dislike it at all. I just seldom think that way. I've met people who saw objects everywhere, while I tend to see a bunch of bits everywhere. As a part of the demoscene said when I learned programming, "100% asm, a way of life".

> Consider each object knows how to construct itself from a stream of octets. It is trivial to simple objects like number. E.g. you read until the octets are '0'..'9' and generate the result interpreting it as a decimal representation. Or you take four octets and treat them as big-endian binary representation etc. For a container type, you call the constructors for each container member in order. If the container is unbounded, e.g. has variable length, you read its bounds first or you use some terminator in the stream to mark the container end.
> For containers of dynamically typed elements you must learn the component type before you construct it.

That's exactly how I use S-expressions, except that instead of starting from a stream of octets, I start from an array of octets (whose length is known), but if I understood correctly that doesn't change your point.

And the reason why I started this thread is only to know how to buffer into memory the arrays of octets, because I need (in my usual use of S-expressions) to resolve the links between atoms before I can know the type of atoms. So I need a way to delay the typing, and in the meantime handle data as a generic byte sequence whose only known information is its size and its place in the S-expression tree.

What exactly is so bad with that approach?

I hope I don't bother you too much with my noobity and my will to understand,
Natacha
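[Editorial sketch] The approach described in the post above (keep each atom as a plain byte sequence with a known size and a place in the tree, and decide its type only later) might look roughly like this in C. This is a minimal sketch, not the poster's actual code: all names (`sx_node`, `sx_parse`, `sx_atom_to_ulong`) are hypothetical, the input is assumed to be in a canonical length-prefixed form (`<decimal length>:<bytes>` for atoms, parentheses for lists), and nesting depth is not bounded.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* One tree node. Atoms are untyped byte sequences; lists keep their
 * first element in `child`. Every name here is hypothetical. */
struct sx_node {
    struct sx_node *next;       /* "brother of" link */
    struct sx_node *child;      /* "sublist of" link (lists only) */
    const unsigned char *data;  /* atom bytes, pointing into the input */
    size_t len;                 /* atom length in bytes */
    int is_list;
};

void sx_free(struct sx_node *n)
{
    while (n != NULL) {
        struct sx_node *next = n->next;
        sx_free(n->child);
        free(n);
        n = next;
    }
}

/* Parse siblings until ')' or end of input. Canonical form is assumed:
 * an atom is "<decimal length>:<bytes>", a list is "(" siblings ")". */
static struct sx_node *parse_seq(const unsigned char *buf, size_t end,
                                 size_t *pos, int *ok)
{
    struct sx_node *head = NULL, **tail = &head;

    while (*ok && *pos < end && buf[*pos] != ')') {
        struct sx_node *n = malloc(sizeof *n);
        if (n == NULL) { *ok = 0; break; }
        *n = (struct sx_node){0};
        *tail = n;
        tail = &n->next;
        if (buf[*pos] == '(') {
            n->is_list = 1;
            ++*pos;
            n->child = parse_seq(buf, end, pos, ok);
            if (*ok && *pos < end && buf[*pos] == ')') ++*pos;
            else *ok = 0;
        } else {
            size_t len = 0, digits = 0;
            while (*pos < end && buf[*pos] >= '0' && buf[*pos] <= '9') {
                size_t d = (size_t)(buf[*pos] - '0');
                if (len > (SIZE_MAX - d) / 10) { *ok = 0; break; }
                len = len * 10 + d;
                ++*pos;
                ++digits;
            }
            if (!*ok || digits == 0 || *pos >= end || buf[*pos] != ':'
                || len > end - *pos - 1) { *ok = 0; break; }
            n->data = buf + *pos + 1;
            n->len = len;
            *pos += 1 + len;
        }
    }
    return head;
}

/* Returns the first top-level node, or NULL on malformed or empty input. */
struct sx_node *sx_parse(const void *buf, size_t len)
{
    size_t pos = 0;
    int ok = 1;
    struct sx_node *root;

    assert(buf != NULL || len == 0);
    root = parse_seq(buf, len, &pos, &ok);
    if (!ok || pos != len) {    /* pos != len means a stray ')' */
        sx_free(root);
        return NULL;
    }
    return root;
}

/* Typing happens late: only when its place in the tree says "this atom
 * is a number" are the bytes read as decimal. Returns 0 on failure. */
int sx_atom_to_ulong(const struct sx_node *atom, unsigned long *out)
{
    unsigned long v = 0;
    size_t i;

    if (atom == NULL || atom->is_list || atom->len == 0) return 0;
    for (i = 0; i < atom->len; i++) {
        unsigned long d;
        if (atom->data[i] < '0' || atom->data[i] > '9') return 0;
        d = (unsigned long)(atom->data[i] - '0');
        if (v > (ULONG_MAX - d) / 10) return 0;
        v = v * 10 + d;
    }
    *out = v;
    return 1;
}
```

With an input such as `(4:port2:80)`, the parser would yield one list whose children are the atoms `port` and `80`; only the caller, knowing that the second atom follows a `port` key, would ask for it to be read as a number. The nodes point into the caller's buffer, so that buffer has to outlive the tree.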
From: Dmitry A. Kazakov on 7 Aug 2010 10:23

On Sat, 7 Aug 2010 05:56:50 -0700 (PDT), Natacha Kerensikova wrote:

> On Aug 7, 10:39 am, "Dmitry A. Kazakov" <mail...(a)dmitry-kazakov.de> wrote:
>> One cannot judge a format without knowing what is the purpose of. Most of the formats like S-expressions are purposeless, in the sense that there is no *rational* purpose behind them. As you wrote above, it is either legacy (we have to overcome some limitations of some other poorly designed components of the system) or personal preferences (some people like angle brackets others do curly ones).
>
> Why can't there be a general-purpose format, just like there are general-purpose programming languages?

An interesting question. I would say no, there cannot be such formats. Any presentation format is of course a language. The only difference to the true programming languages is in complexity, and maybe in a tendency to being declarative rather than imperative. There are border cases like Postscript, which IMO illustrate the point, more general purpose it has to be, less "format" it would become.

> Can we at least agree on the fact that a sequence of bytes is a general-purpose format, widely used for storing and transmitting data? (this question is just a matter of vocabulary)

I don't think so. Namely, I don't think that "general" is a synonym to "completeness." It is rather about the abstraction level under the condition of completeness.

> So let's add as few semantics as possible, to keep as much generality as possible. We end up with a bunch of byte sequences, whose semantics are still left to the application, linked together by some kind of semantic link. When the chosen links are "brother of" and "sublist of" you get exactly S-expressions.

Yes, the language of S-expressions is about hierarchical structures of elements lacking any semantics. I see no purpose in such descriptions. But this is a very old and bearded issue.
The same question arises when it is discussed why RDBMS are so boring. For the same reason: a naked structure, be it relational, hierarchical, whichever, is useless without the semantics. The semantics, when dealt with, is capable of catching such simple relationships as "sibling" with no effort. DB people believe that one could bridge the gap and somehow come to the semantics from the structure's side. Translated into your S-expressions, it is by putting a proper pattern of opening and closing brackets one could describe everything...

> However from a purely practical point of view, and using the fact that in my background languages (C and 386 asm) bytes sequences and strings are so similar, these crude semantics are all I need (or at least, all I've ever needed so far).

The lower you descend down the abstraction levels, the fewer differences you see. Everything is a bunch of transistors...

> Now if we agree that simplicity is a desirable quality (because it leads to less bugs, more productivity, etc), I still fail to see the issues of such a format.

Programs in 386 Assembler are sufficiently more complex than programs in Ada. Simplicity of nature by no means implies simplicity of use.

> Now regarding personal preferences about braces, I have to admit I'm always shocked at the way so many people dismiss S-expressions on first sight because of the LISP-looking parentheses.

Do you mean LISP does not deserve its fame? (:-))

>>>>> But now that I think about it, I'm wondering whether I'm stuck in my C way of thinking and trying to apply it to Ada. Am I missing an Ada way of storing structured data in a text-based way?
>>
>>>> I think yes. Though it is not Ada-specific, rather commonly used OOP design patterns.
>>
>>> I heard people claiming that the first language shapes the mind of coders (and they continue saying a whole generation of programmers has been mind-crippled by BASIC).
>>> My first language happened to be 386 assembly, that might explain things.
>>
>> I see where mixing abstraction layers comes from...
>
> Could you please point me where I am mixing what?

Encoding, representation, states, behavior, values, objects, everything is a sequence of bytes, so?

> My point is, most of my (currently non-OOP) code can be expressed as well in an OOP style. When I defined a C structure along with a bunch of functions that perform operations on it, I'm conceptually defining a class and its methods, only expressed in a non-OOP language. I sometimes put function pointers in the structure, to have a form of dynamic dispatch or virtual methods. I occasionally even leave the structure declared but undefined publicly, to hide internals (do you call that encapsulation?), along with functions that could well be called accessors and mutators. In my opinion that doesn't count as OOP because it doesn't use OOP-specific features like inheritance.

I disagree because in my view this is all that OO is about. OO is not about the tools (OOPL), it is about the way of programming.

>>> The problem is, I just can't manage to imagine how to go in a single step from the byte sequence containing a S-expression describing multiple objects to the internal memory representation and vice-versa.
>>
>> You need not, that is the power of OOP you dislike so much.
>
> I don't dislike it at all. I just seldom think that way. I've met people who saw objects everywhere, while I tend to see bunch of bits everywhere.

Yes, I know them too. I don't believe that everything is object. But I do believe in abstract type systems, that every object in a well-designed program must have a type and that type shall describe the behavior as precisely as possible.
> And the reason why I started this thread is only to know how to buffer into memory the arrays of octets, because I need (in my usual use of S-expressions) to resolve the links between atoms before I can know the type of atoms. So I need a way to delay the typing, and in the meantime handle data as a generic byte sequence whose only known information is its size and its place in the S-expression tree. What exactly is so bad with that approach?

Nothing wrong when at the implementation level. However I don't see why links need to be resolved first. In comparable cases - I do much messy protocol/communication stuff - I usually first restore objects and then resolve links.

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
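[Editorial sketch] The C idiom debated in the post above (a structure carrying function pointers, used as a form of dynamic dispatch) can be illustrated as follows. This is only a sketch under assumptions: the `page`, `page_ops` and `*_render` names are invented for the example and do not come from the thread, and the two "classes" are deliberately trivial.

```c
#include <assert.h>
#include <stdio.h>

struct page;

/* The "virtual method table": one function pointer per operation. */
struct page_ops {
    int (*render)(const struct page *self, char *out, size_t cap);
};

/* The "object": a pointer to its operations plus instance data. */
struct page {
    const struct page_ops *ops;
    const char *body;
};

/* First concrete behaviour: emit the body unchanged. */
static int raw_render(const struct page *self, char *out, size_t cap)
{
    return snprintf(out, cap, "%s", self->body);
}

/* Second concrete behaviour: wrap the body in a paragraph element. */
static int html_render(const struct page *self, char *out, size_t cap)
{
    return snprintf(out, cap, "<p>%s</p>", self->body);
}

static const struct page_ops raw_ops = { raw_render };
static const struct page_ops html_ops = { html_render };

/* Callers dispatch through the table and never name a concrete type. */
int page_render(const struct page *p, char *out, size_t cap)
{
    assert(p != NULL && p->ops != NULL && p->ops->render != NULL);
    return p->ops->render(p, out, cap);
}
```

A caller would build `struct page raw = { &raw_ops, "hello" };` and call `page_render(&raw, buf, sizeof buf)`; swapping `&raw_ops` for `&html_ops` changes the behaviour without changing the call site. Whether this counts as OOP is exactly the point on which the two posters differ.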
From: Jeffrey Carter on 7 Aug 2010 11:38

On 08/07/2010 12:23 AM, Natacha Kerensikova wrote:
>
> I heard people claiming that the first language shapes the mind of coders (and they continue saying a whole generation of programmers has been mind-crippled by BASIC). My first language happened to be 386 assembly, that might explain things. Anyway, I genuinely tried OOP with C++ (which I dropped because it's way too complex for me (and I'm tempted to say way too complex for the average coder, it should be reserved to the few geniuses actually able to fully master it)), but I never felt the need of anything beyond what can be done with a C struct containing function pointers.

Capers Jones, the function-point person, collected statistics on function points in various languages. One statistic was the average number of LOC to implement a function point. Function points may or may not be a good metric, but the values fell into 3 groups, which he labeled low-level languages (assembler and C), mid-level (FORTRAN, Pascal), and high-level (Ada).

Thus, there is a big difference in abstraction between C and Ada. C is about translating the problem into the capabilities of the solution language; Ada is (or should be) about modeling the problem in the software. C is about coding: mapping everything onto a small set of predefined representations. Ada is about SW engineering: creating useful abstractions that represent important aspects of the problem.

Effectively using Ada requires a different way of thinking from using C or assembler. You'll more quickly gain that mindset by asking how to approach specific problems in Ada, rather than how to do something you did in C.

I think there may be an analogy between languages and data representation formats: everything can be implemented in machine code, but that doesn't mean it's a good language for development.
Similarly, everything may be represented as a sequence of bytes, but that doesn't mean it's an appropriate representation for an application.

An S-expression library might be a useful thing at a low-level, but few applications should call it directly. Instead, there should probably be at least one layer of abstraction on top of the low-level storage representation, so that the application only deals with appropriate application-level representations of the data.

Here, your problem seems to be how to have a human-readable storage format for your applications. While there is merit to discussing the pros and cons of the many existing formats to use, the Ada approach is to use an application-specific abstraction, hiding the implementation detail of which such format is eventually chosen.

--
Jeff Carter
"I blow my nose on you."
Monty Python & the Holy Grail
03

--- news://freenews.netfront.net/ - complaints: news(a)netfront.net ---
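[Editorial sketch] The layering recommended in the post above (application code sees only application-level data, and the storage format is an implementation detail) can be approximated in C as follows. The example is in C rather than Ada only to stay close to the idioms discussed in the thread. Everything in it is an assumption made for illustration: the `server_config` fields, the `config_source` interface, and the section and key names `network`/`port` and `files`/`root`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Application-level view: typed fields, no trace of the file format. */
struct server_config {
    unsigned long port;
    const char *doc_root;
};

/* Storage-level interface: any backend able to answer "what is the
 * value of this key in this section?". Returns NULL when missing. */
struct config_source {
    const char *(*lookup)(void *ctx, const char *section, const char *key);
    void *ctx;
};

/* The only place where raw strings become typed values.
 * Returns 1 on success, 0 if a value is missing or invalid. */
int server_config_load(const struct config_source *src,
                       struct server_config *cfg)
{
    const char *port, *root;
    char *end;
    unsigned long value;

    assert(src != NULL && src->lookup != NULL && cfg != NULL);
    port = src->lookup(src->ctx, "network", "port");
    root = src->lookup(src->ctx, "files", "root");
    if (port == NULL || root == NULL) return 0;
    value = strtoul(port, &end, 10);
    if (end == port || *end != '\0' || value == 0 || value > 65535) return 0;
    cfg->port = value;
    cfg->doc_root = root;
    return 1;
}

/* A stand-in backend: a fixed in-memory table ended by a NULL section.
 * An S-expression reader, or a reader for any other format, would
 * supply a different lookup function behind the same interface. */
struct entry { const char *section, *key, *value; };

const char *table_lookup(void *ctx, const char *section, const char *key)
{
    const struct entry *e;

    for (e = ctx; e->section != NULL; e++)
        if (strcmp(e->section, section) == 0 && strcmp(e->key, key) == 0)
            return e->value;
    return NULL;
}
```

The rest of the program would only ever handle a `struct server_config`. Replacing the table with a backend that reads S-expressions (or something else entirely) would touch `lookup` and nothing above it.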
From: Natacha Kerensikova on 7 Aug 2010 13:01

On Aug 7, 5:38 pm, Jeffrey Carter <spam.jrcarter....(a)spam.not.acm.org> wrote:
> On 08/07/2010 12:23 AM, Natacha Kerensikova wrote:
> Thus, there is a big difference in abstraction between C and Ada. C is about translating the problem into the capabilities of the solution language; Ada is (or should be) about modeling the problem in the software. C is about coding: mapping everything onto a small set of predefined representations. Ada is about SW engineering: creating useful abstractions that represent important aspects of the problem.

Funnily, you're the first one shaking my resolve to learn Ada.

Let's get everything straight: I'm an amateur. I'm coding for fun. Well, I'll so be coding for a living too, but then I won't have a say in the language chosen. I like coding in C, and I don't care how efficient it is. There is no waste of time in a leisure activity. I'd rather have fun with C than do the same thing 10x faster without fun. The only thing bothering me in C is that I often end up using dangerous constructs. For example, *(struct foo **)((char *)whatever + bar.offset). I'm perfectly fine with that, because I'm confident in what I'm doing, but I can understand it looks sloppy from the outside. My main motivation to learn Ada is to publicize the concerns for robustness and correctness that might not be obvious from my C code. I was hoping to do in Ada whatever I used to be doing in C: network programming, DS homebrew, etc.

Am I misguided? Should I stop now?

> Effectively using Ada requires a different way of thinking from using C or assembler. You'll more quickly gain that mindset by asking how to approach specific problems in Ada, rather than how to do something you did in C.

Ok, so let's have a look at the grand picture. My main objective right now is to code a webserver in Ada. Yes, that's reinventing the wheel, but it does wonders for learning.
Here is how I intended to do it, admittedly exactly like I would do it in C, could you please tell me how far I am from the Ada approach?

I start by dividing the work into "modules" or "components", each containing a structure or a few related structures, along with whatever functions or procedures to deal with them. I thought this would map perfectly into Ada packages.

Configuration files are S-expressions, in the form of (key value) pairs, gathered in sections like (section-name (key value) (key value)).

Webpage templates are also S-expressions, in the form "raw-html" (function arg1 arg2) "raw-html" (function) "raw-html". The interesting thing being that arg1 arg2 etc are S-expressions and thus can be subtemplates too.

As I'm more comfortable using components already coded and tested, I would code them from the lowest to the highest level:
- first a component dealing with S-expression I/O, hence this topic.
- then a component for configuration, which uses the S-expression library and is used by other components either for program-wide configuration variables or for instance-specific configuration
- then a network component, gluing the rest of my program with whatever socket library I will use (AdaSockets or GNAT.stuff or C interfacing or whatever, don't know yet)
- then a HTTP parsing component, taking data from the network component and configuration
- then a general page component, dispatching requests to the relevant page objects
- then a raw file component, a specific page responding to HTTP request with data taken directly from a file
- then a template component, interpreting the function calls from S-expression templates
- then a templated page component, another specialization of a page object, dealing with HTTP response and containing instance-specific data used by the template component.

And that should be about it, I might encounter the need for other components, maybe for network I/O multiplexing or for logging or for caching templates etc.
So, how bad is it?

> An S-expression library might be a useful thing at a low-level, but few applications should call it directly. Instead, there should probably be at least one layer of abstraction on top of the low-level storage representation, so that the application only deals with appropriate application-level representations of the data.

I have to admit in the above I don't really know what belongs to a library and what belongs to the application. But indeed, a S-expression package is a low-level thing, I'm well aware of that, I just can't begin with high-level stuff if I don't have strong and tested low-level stuff to build upon.

However the point of coding so many separate components is to be able to change the internals of one without having to touch everything else. Should I find someday a format so much better than S-expressions, I would only have one component to change. Should I want different formats for configuration and templates, that's a component to add, and maybe little modifications to configuration and/or template modules. And so on.

> Here, your problem seems to be how to have a human-readable storage format for your applications. While there is merit to discussing the pros and cons of the many existing formats to use, the Ada approach is to use an application-specific abstraction, hiding the implementation detail of which such format is eventually chosen.

Is that so different than what I explained above?

Thanks for your help,
Natacha
From: Jeffrey Carter on 8 Aug 2010 02:52
On 08/07/2010 10:01 AM, Natacha Kerensikova wrote:
>
> Let's get everything straight: I'm an amateur. I'm coding for fun. Well, I'll so be coding for a living too, but then I won't have a say in the language chosen. I like coding in C, and I don't care how efficient it is. There is no waste of time in a leisure activity. I'd rather have fun with C than do the same thing 10x faster without fun. The only thing bothering me in C is that I often end up using dangerous constructs. For example, *(struct foo **)((char *)whatever + bar.offset). I'm perfectly fine with that, because I'm confident in what I'm doing, but I can understand it looks sloppy from the outside. My main motivation to learn Ada is to publicize the concerns for robustness and correctness that might not be obvious from my C code. I was hoping to do in Ada whatever I used to be doing in C: network programming, DS homebrew, etc.
>
> Am I misguided? Should I stop now?

One can do anything you can do in C in Ada. Better, since creating buffer overflow and signed-integer overflow vulnerabilities takes effort in Ada, while they're the default in C. (Virtually every "important security update" I see for Linux is a buffer overflow or signed-integer overflow vulnerability. I doubt if people are creating these on purpose. My conclusion is that it is impossible in practice to use C without creating these errors.)

> Ok, so let's have a look at the grand picture. My main objective right now is to code a webserver in Ada. Yes, that's reinventing the wheel, but it does wonders for learning.
>
> Here is how I intended to do it, admittedly exactly like I would do it in C, could you please tell me how far I am from the Ada approach?

It's hard to comment meaningfully, since you mostly describe your intended implementation, not your requirements.

I'd use Ada Web Server (AWS), and perhaps you should try that, too.
Using a significant, existing, high-level Ada application framework like that might help introduce you to how some experienced Ada people thought this kind of thing should be approached.

A "web server" can be a variety of things, from a simple page server that serves static files to a highly-dynamic system generating everything on the fly. It appears that you intend something that serves static files and expanded page templates.

Initially, I'd observe that the system talks to the network and to the permanent storage that stores the configuration information, static pages, and so on. So my initial decomposition would identify interface modules for communicating with these. (This is an "edges-in" approach.)

At a higher level, there is something that responds to incoming requests to serve the appropriate responses. There's something this uses to obtain the configuration from the permanent storage. This could make use of something that can serve a static page and something that can serve an expanded page template. There's also clearly a place for something that expands a page template.

I'm doing this off the top of my head, so I won't be surprised if I've missed something or otherwise screwed up.

This identifies the major high-level modules in the system. I could now define the package specifications for them and have the compiler check that they are syntactically correct and semantically consistent. Then I could pick one and design its implementation. It's likely at some point tasking would be involved, allowing the processing of multiple requests at once, so this would all have to be done keeping concurrency in mind.

At some point I'd get to a low enough level to start thinking about representations, which seems to be where you begin your thinking about the problem.
> As I'm more comfortable using components already coded and tested, I would code them from the lowest to the highest level:

In Ada, one can create package specifications, then create other units that make use of those specifications before they are implemented. This is an important concept in Ada called the separation of specification and body. Sometimes it is useful to create stub bodies for such packages, which can then be used to test the units that make use of these packages. Thus it is often possible to implement and test higher-level modules before lower-level modules that they use have been implemented. This may not be especially useful on a single-person project, but can be quite valuable in projects with more than one developer.

This often seems to be a foreign concept to those used to C.

While your approach seems quite different to mine, many aspects of the final result seem to be similar. This probably bodes well for you being able to use Ada effectively.

--
Jeff Carter
"I blow my nose on you."
Monty Python & the Holy Grail
03
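[Editorial sketch] The workflow described in the post above (write a specification, build and test a higher-level unit against it, and supply only a stub body at first) is an Ada concept built on package specifications and bodies. The C code below is a loose approximation using a function declaration as the "specification"; it does not reproduce the checking an Ada compiler performs on package specifications. The names `expand_template` and `build_response`, and the response layout, are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* "Specification": what the template component promises. In a real
 * project this declaration would live in its own header file. */
int expand_template(const char *name, char *out, size_t cap);

/* Higher-level unit, written and compiled against the declaration
 * only. Returns the response length, or -1 if the body did not fit. */
int build_response(const char *template_name, char *out, size_t cap)
{
    char body[128];
    int n;

    assert(template_name != NULL && out != NULL);
    n = expand_template(template_name, body, sizeof body);
    if (n < 0 || (size_t)n >= sizeof body) return -1;
    return snprintf(out, cap,
                    "HTTP/1.0 200 OK\r\nContent-Length: %d\r\n\r\n%s",
                    n, body);
}

/* Stub body: just enough behaviour to exercise build_response. The
 * real implementation can replace it later without touching callers. */
int expand_template(const char *name, char *out, size_t cap)
{
    return snprintf(out, cap, "[stub:%s]", name);
}
```

With the stub in place, `build_response("index", buf, sizeof buf)` can already be tested for its header and length handling, long before a real template expander exists.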