From: Pascal J. Bourguignon on 5 Jul 2010 07:27

Fred Nurk <albert.xtheunknown0(a)gmail.com> writes:

> My answer to the subject of this thread is:
> It is necessary to classify data into different types because a computer
> must use a system of codes to classify the data processed.
>
> The assumption, of course, is that I'm doing basic reading comprehension
> from a textbook.
>
> My answer is very much derived from this paragraph of my textbook:
> A data type refers to how the computer classifies data. We can all see at
> a glance whether data is numerical or expressed as a percentage or
> currency but a computer must use a system of codes to classify the
> different types of data that are processed. The different data types in
> computer programming include integers, characters, Boolean, floating
> point numbers, real numbers, dates, pointers, records, algebraic data
> types, abstract data types, reference types, classes and function types.
>
> Does my answer make a lot of sense?

    (123
     1.23
     "123"
     ABC
     #S(point :x 12 :y 34)
     #(1 2 3 4 5))

You can see that we have here a list containing an integer, a floating
point, a string, a symbol, a structure of type POINT with two slots x
and y bound to integers, and a vector of integers.

You, as a human, can see it, and the computer, as a lisp implementation,
can parse it too. Therefore you're both at the same level:

    there is no need to classify data according to its type

because there are clues that will tell what kind of data we have and
both a human and a computer program can take these clues to interpret
the data as being of one type or another.

Now it's a question of representation. What you see above is a textual
representation of the data. Internally, we have broadly two classes of
representation:

- boxed (tagged, OO),
- unboxed (untagged, "primitive type").

In the boxed representation, the binary pattern used to represent the
data is tagged with a code that indicates the type of data, and therefore
the interpretation to be given to the binary pattern. Languages such as
Lisp, Smalltalk, Perl, Ruby, Python, JavaScript, and the dynamic object
oriented languages work with boxed representations.

In the unboxed representation, there is only the raw binary pattern, and
it's up to the program to know what type of data it represents.
Languages such as Assembler, Fortran, C, C++, Pascal, Ada, and static
object oriented languages (C++), work with unboxed representations.

In the latter case, types take an overwhelming importance, because if you
get the wrong type, that is, if you interpret a bit pattern as a
different type of data, you get wrong results and program crashes.

With the boxed representation, since the type is attached to the values
(and not to the variables), in a way data is "self classifying". In most
circumstances, you may care less about types, and they can change more
easily or even dynamically (at run-time). For this reason, amongst
others, OO is successful.

Now it remains to answer your question: why is it necessary to
distinguish data types?

It all comes from the "interpret the data" part. This "interpretation"
is done by the functions processing the data.

There are a lot of functions that are entirely generic (or there should
be!):

    CL-USER> (mapcar (function type-of)
                     '(123 1.23 "123" ABC #S(point :x 12 :y 34) #(1 2 3 4 5)))
    ((INTEGER 0 1152921504606846975) SINGLE-FLOAT (SIMPLE-ARRAY CHARACTER (3))
     SYMBOL POINT (SIMPLE-VECTOR 5))
    CL-USER> (mapcar (function list)
                     '(123 1.23 "123" ABC #S(point :x 12 :y 34) #(1 2 3 4 5)))
    ((123) (1.23) ("123") (ABC) (#S(POINT :X 12 :Y 34)) (#(1 2 3 4 5)))
    CL-USER>

which shows that you don't really need to classify the data by type.

But there are also a lot of functions that work only on a single type or
a small set of types.
For example, arithmetic operations work only on numbers:

    CL-USER> (mapcar (lambda (x) (ignore-errors (* 2 x)))
                     '(123 1.23 "123" ABC #S(point :x 12 :y 34) #(1 2 3 4 5)))
    (246 2.46 NIL NIL NIL NIL)

    CL-USER> (* 2 "abc")
    ; Evaluation aborted.
    Argument Y is not a NUMBER: "abc"
       [Condition of type SIMPLE-TYPE-ERROR]

Or only on sequences:

    CL-USER> (mapcar (lambda (x) (ignore-errors (length x)))
                     '(123 1.23 "123" ABC #S(point :x 12 :y 34) #(1 2 3 4 5)))
    (NIL NIL 3 NIL NIL 5)

Therefore when you plan to use functions that have type restrictions for
their arguments, you need to be careful about the type of data you give
them.

This is the reason why it is necessary to classify data according to its
type in computer programming: because some functions only work on
specific data types.

Finally, notice that in languages that don't require you to declare the
type of the parameters of the functions, you can more easily write
generic functions, or at least functions more generic than what you would
write in a language where you'd declare the type. For example:

    CL-USER> (defun fact (x)
               (if (< x 1)
                   1
                   (* x (fact (- x 1)))))
    FACT

Gives us of course a function to compute the factorial of integers:

    CL-USER> (fact 15)
    1307674368000

But also of floating points:

    CL-USER> (fact 15.0)
    1.3076743e12

or double floating points:

    CL-USER> (fact 15.0d0)
    1.307674368d12

or even of rationals:

    CL-USER> (fact 15/4)
    1155/64

Or, to put it a different way: when programmers must specify the types,
they often do it in a way that is too restrictive; the classification is
too specific.

-- 
__Pascal Bourguignon__                     http://www.informatimago.com
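The same two points (some operations only accept certain types, and undeclared parameter types make a function more generic) can be sketched in another boxed language. This is only an illustrative Python analogue of the Lisp FACT in the post above, not part of the original post; the use of `fractions.Fraction` to stand in for Lisp rationals is an assumption made for the sketch.

```python
from fractions import Fraction


def fact(x):
    """Multiply x, x-1, x-2, ... for as long as the factor is >= 1."""
    # No parameter type is declared: any value supporting <, - and * works.
    return 1 if x < 1 else x * fact(x - 1)


print(fact(15))               # 1307674368000
print(fact(15.0))             # 1307674368000.0
print(fact(Fraction(15, 4)))  # 1155/64

# A string carries a type tag too, and the comparison with 1 rejects it.
try:
    fact("15")
except TypeError as err:
    print("rejected:", err)
```

The function body never names a type, yet the run-time tag on each value decides which multiplication and comparison are used, and whether the call is accepted at all.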
From: Ben Bacarisse on 5 Jul 2010 08:37

Fred Nurk <albert.xtheunknown0(a)gmail.com> writes:

> It is necessary to classify data into different types because a computer
> must use a system of codes to classify the data processed.

That's a circular answer: it is necessary because it must be used.

> That's my answer (which I've tried to take from my textbook) to this
> thread's subject.
>
> My textbook has this paragraph:
> A data type refers to how the computer classifies data. We can all see at
> a glance whether data is numerical or expressed as a percentage or
> currency but a computer must use a system of codes to classify the
> different types of data that are processed. The different data types in
> computer programming include integers, characters, Boolean, floating
> point numbers, real numbers, dates, pointers, records, algebraic data
> types, abstract data types, reference types, classes and function types.
>
> Does my answer make sense? Could I do a better job of showing my
> understanding of the importance of data types using the textbook
> paragraph?

The quoted paragraph is largely about what a type is. It does not say
anything about why such a system might be helpful -- all it does is
state (without any support) that a computer system must use such a
system. Using that paragraph alone you are pretty much stuck, but the
text probably says more about types than this one paragraph.

You could always step outside the text and see if you can come up with
an answer on your own.

-- 
Ben.
From: Daniel T. on 5 Jul 2010 09:28

pjb(a)informatimago.com (Pascal J. Bourguignon) wrote:

> Fred Nurk <albert.xtheunknown0(a)gmail.com> writes:
>
> > My answer to the subject of this thread is:
> > It is necessary to classify data into different types because a computer
> > must use a system of codes to classify the data processed.
> >
> > The assumption, of course, is that I'm doing basic reading comprehension
> > from a textbook.
> >
> > My answer is very much derived from this paragraph of my textbook:
> > A data type refers to how the computer classifies data. We can all see at
> > a glance whether data is numerical or expressed as a percentage or
> > currency but a computer must use a system of codes to classify the
> > different types of data that are processed. The different data types in
> > computer programming include integers, characters, Boolean, floating
> > point numbers, real numbers, dates, pointers, records, algebraic data
> > types, abstract data types, reference types, classes and function types.
> >
> > Does my answer make a lot of sense?
>
>
> (123
>  1.23
>  "123"
>  ABC
>  #S(point :x 12 :y 34)
>  #(1 2 3 4 5))
>
>
> You can see that we have here a list containing an integer, a floating
> point, a string, a symbol, a structure of type POINT with two slots x
> and y bound to integers, and a vector of integers.
>
> You, as a human, can see it, and the computer, as a lisp implementation,
> can parse it too. Therefore you're both at the same level:
>
>     there is no need to classify data according to its type
>
> because there are clues that will tell what kind of data we have and
> both a human and a computer program can take these clues to interpret
> the data as being of one type or another.

That's a funny answer... There is no need to classify data according to
its type because we (and the computer) can classify data according to
its type. :-)
From: Daniel T. on 5 Jul 2010 09:35

Fred Nurk <albert.xtheunknown0(a)gmail.com> wrote:

> It is necessary to classify data into different types because a computer
> must use a system of codes to classify the data processed.
>
> That's my answer (which I've tried to take from my textbook) to this
> thread's subject.
>
> My textbook has this paragraph:
> A data type refers to how the computer classifies data. We can all see at
> a glance whether data is numerical or expressed as a percentage or
> currency but a computer must use a system of codes to classify the
> different types of data that are processed. The different data types in
> computer programming include integers, characters, Boolean, floating
> point numbers, real numbers, dates, pointers, records, algebraic data
> types, abstract data types, reference types, classes and function types.
>
> Does my answer make sense? Could I do a better job of showing my
> understanding of the importance of data types using the textbook
> paragraph?

A computer must classify data according to its type for the same reason
a person must classify information according to its type. We classify
the data in order to determine what operations can meaningfully be
applied to that data.

The question is quite existential and belongs in a philosophy class
rather than a computer class. For a computer class, it should be enough
to understand that the program must be able to classify data according
to its type, and then discuss the different means by which a program can
do that.
From: tm on 5 Jul 2010 09:58
On 5 Jul., 10:24, Fred Nurk <albert.xtheunkno...(a)gmail.com> wrote:

> It is necessary to classify data into different types because a computer
> must use a system of codes to classify the data processed.

Data itself is just a number of bits, and a type assigns a meaning to the
bits. The same bits might be seen as an integer, a character, or some
reference (pointer) to other (possibly more complex) data. A type can
also specify other properties of data, such as size or valid and invalid
bit patterns.

In higher level programming languages the type also defines the
operations you can do with the data bits. E.g.: adding two integers
makes sense, but adding two characters might be prohibited.

Some languages attach a type to each variable (parameter ...) at compile
time which cannot be changed at runtime (so called static typing), while
others maintain a type (which possibly can change) for each variable at
runtime (so called dynamic typing).

Both (static and dynamic) typing systems have assets and drawbacks.
There have been endless flamewars about "the true typing system".

To some extent dynamic features can be introduced in statically typed
languages. This is the approach used in the Seed7 programming language.

My arguments why I prefer static typing (and dynamic features inside a
static typing framework) can be found here:

http://seed7.sourceforge.net/faq.htm#static_type_checking

Greetings Thomas Mertes

Seed7 Homepage: http://seed7.sourceforge.net
Seed7 - The extensible programming language: User defined statements
and operators, abstract data types, templates without special
syntax, OO with interfaces and multiple dispatch, statically typed,
interpreted or compiled, portable, runs under linux/unix/windows.
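The point that "a type assigns a meaning to the bits" can be shown with a small sketch. It is not taken from the post above: the byte values and the choice of big-endian layout are assumptions made for illustration, and Python's standard `struct` module is used only to reinterpret the same four bytes under different types.

```python
import struct

# Four raw bytes; on their own they carry no type information.
raw = bytes([0x41, 0x42, 0x43, 0x44])

# Interpreted as a big-endian unsigned 32-bit integer.
as_int = struct.unpack(">I", raw)[0]

# The same bits interpreted as a big-endian IEEE 754 single-precision float.
as_float = struct.unpack(">f", raw)[0]

# The same bits interpreted as four ASCII characters.
as_text = raw.decode("ascii")

print(as_int)    # 1094861636
print(as_float)  # roughly 12.14
print(as_text)   # ABCD
```

Nothing in `raw` says which of the three readings is intended; that knowledge has to come from a type, whether it is declared in the program text (static typing) or carried alongside the value at run time (dynamic typing).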