"Why is it necessary to classify data according to its type in computer programming?" [General Programming]


From: Pascal J. Bourguignon on 5 Jul 2010 07:27

Fred Nurk <albert.xtheunknown0(a)gmail.com> writes:

> My answer to the subject of this thread is:
> It is necessary to classify data into different types because a computer
> must use a system of codes to classify the data processed.
>
> The assumption, of course, is that I'm doing basic reading comprehension
> from a textbook.
>
> My answer is very much derived from this paragraph of my textbook:
> A data type refers to how the computer classifies data. We can all see at
> a glance whether data is numerical or expressed as a percentage or
> currency but a computer must use a system of codes to classify the
> different types of data that are processed. The different data types in
> computer programming include integers, characters, Boolean, floating
> point numbers, real numbers, dates, pointers, records, algebraic data
> types, abstract data types, reference types, classes and function types.
>
> Does my answer make a lot of sense?

(123
 1.23
 "123"
 ABC
 #S(point :x 12 :y 34)
 #(1 2 3 4 5))

You can see that we have here a list containing an integer, a floating
point, a string, a symbol, a structure of type POINT with two slots x
and y bound to integers, and a vector of integers.

You, as a human, can see it, and the computer, as a lisp implementation,
can parse it too. Therefore you're both at the same level:

there is no need to classify data according to its type

because there are clues that will tell what kind of data we have and
both a human and a computer program can use these clues to interpret
the data as being of one type or another.

Now it's a question of representation.

What you see above, is a textual representation of the data.
Internally, we have broadly two classes of representation:

- boxed (tagged, OO),
- unboxed (untagged, "primitive type").

In the boxed representation, the binary pattern used to represent the
data is tagged with a code that indicates the type of the data, and
therefore the interpretation to be given to the binary pattern.
Languages such as Lisp, Smalltalk, Perl, Ruby, Python, JavaScript, and
other dynamic object-oriented languages work with boxed representations.

In the unboxed representation, there is only the raw binary pattern, and
it's up to the program to know what type of data it represents.
Languages such as assembler, Fortran, C, C++, Pascal, Ada, and other
static object-oriented languages work with unboxed representations.

In the latter case, types take on overwhelming importance, because if
you get the type wrong, that is, if you interpret a bit pattern as a
different type of data, you get wrong results and program crashes.

With the boxed representation, since the type is attached to the values
(and not to the variables), data is in a way "self-classifying". In most
circumstances, you may care less about types, and they can change more
easily or even dynamically (at run-time). For this reason, amongst
others, OO is successful.
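
The difference between the two representations can be sketched roughly
as follows. This is only an illustration: the names INTERPRET-RAW, BOX
and INTERPRET-BOXED are made up for this example, a cons cell stands in
for the machine-level tag, and an ASCII-compatible character encoding is
assumed.

```lisp
;; Hypothetical sketch: real implementations tag values at the machine
;; level; here a cons cell (tag . bits) stands in for a boxed value.

(defun interpret-raw (bits assumed-type)
  "Unboxed: nothing in BITS says what it is; the caller supplies the type."
  (ecase assumed-type
    (:integer   bits)
    ;; Assumes an ASCII-compatible encoding, so that code 65 is #\A.
    (:character (code-char bits))))

(defun box (tag bits)
  "Boxed: attach a type tag to the raw bits."
  (cons tag bits))

(defun interpret-boxed (boxed)
  "Boxed: the value itself says how its bits are to be read."
  (interpret-raw (cdr boxed) (car boxed)))
```

With the raw pattern 65, (interpret-raw 65 :integer) and
(interpret-raw 65 :character) give different answers, and nothing stops
the caller from picking the wrong one. With (box :character 65) the
choice travels with the value.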

Now it remains to answer your question: why is it necessary to
distinguish data types? It all comes from the "interpret the data"
part. This "interpretation" is done by the functions processing the data.

There are a lot of functions that are entirely generic (or there should
be!):

CL-USER> (mapcar (function type-of)
                 '(123
                   1.23
                   "123"
                   ABC
                   #S(point :x 12 :y 34)
                   #(1 2 3 4 5)))
((INTEGER 0 1152921504606846975) SINGLE-FLOAT (SIMPLE-ARRAY CHARACTER (3)) SYMBOL POINT (SIMPLE-VECTOR 5))
CL-USER> (mapcar (function list)
                 '(123
                   1.23
                   "123"
                   ABC
                   #S(point :x 12 :y 34)
                   #(1 2 3 4 5)))
((123) (1.23) ("123") (ABC) (#S(POINT :X 12 :Y 34)) (#(1 2 3 4 5)))
CL-USER>

which shows that you don't really need to classify the data by type.

But there are also a lot of functions that work only on a single type or
a small set of types. For example, arithmetic operations work only on
numbers:

CL-USER> (mapcar (lambda (x) (ignore-errors (* 2 x)))
                 '(123
                   1.23
                   "123"
                   ABC
                   #S(point :x 12 :y 34)
                   #(1 2 3 4 5)))
(246 2.46 NIL NIL NIL NIL)

CL-USER> (* 2 "abc")
; Evaluation aborted.

Argument Y is not a NUMBER: "abc"
[Condition of type SIMPLE-TYPE-ERROR]

Or only on sequences:

CL-USER> (mapcar (lambda (x) (ignore-errors (length x)))
                 '(123
                   1.23
                   "123"
                   ABC
                   #S(point :x 12 :y 34)
                   #(1 2 3 4 5)))
(NIL NIL 3 NIL NIL 5)

Therefore, when you plan to use functions that have type restrictions on
their arguments, you need to be careful about the type of data you give
them. This is the reason why it is necessary to classify data according
to its type in computer programming: because some functions only work on
specific data types.
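
One way to be careful is to look at the type before choosing the
operation. A minimal sketch, where APPLY-TYPED-OPERATION is a made-up
name for this example and the keyword results are arbitrary:

```lisp
;; Hypothetical sketch: pick the operation according to the type of X.
(defun apply-typed-operation (x)
  (typecase x
    ;; Arithmetic is only meaningful on numbers.
    (number   (list :doubled (* 2 x)))
    ;; LENGTH is only meaningful on sequences (lists, strings, vectors).
    (sequence (list :length (length x)))
    ;; Anything else is reported back instead of signalling an error.
    (t        (list :unsupported x))))
```

Called on 123 it doubles the number, called on "123" it returns the
length, and called on the symbol ABC it does neither.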

Finally, notice that in languages that don't require you to declare the
types of function parameters, you can more easily write generic
functions, or at least functions more generic than what you would write
in a language where you'd declare the types.

For example:

CL-USER> (defun fact (x)
           (if (< x 1)
               1
               (* x (fact (- x 1)))))

FACT

Gives us of course a function to compute the factorial of integers:

CL-USER> (fact 15)
1307674368000

But also of floating points:

CL-USER> (fact 15.0)
1.3076743e12

or double floating points:

CL-USER> (fact 15.0d0)
1.307674368d12

or even of rationals:

CL-USER> (fact 15/4)
1155/64

Or, to put it differently: when programmers must specify the types, they
often do so in a way that is too restrictive; the classification is too
specific.
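
As a rough illustration of such an over-specific classification, here is
the same factorial with an explicit integer restriction added. The name
FACT-INTEGER-ONLY is made up for this sketch, and CHECK-TYPE is used
only to stand in for a parameter type declaration:

```lisp
;; Hypothetical variant of FACT that insists on an integer argument.
(defun fact-integer-only (x)
  ;; Signals a TYPE-ERROR for anything that is not an integer.
  (check-type x integer)
  (if (< x 1)
      1
      (* x (fact-integer-only (- x 1)))))

(defun try-fact-integer-only (x)
  "Return the factorial, or :REJECTED if X is refused because of its type."
  (handler-case (fact-integer-only x)
    (type-error () :rejected)))
```

(try-fact-integer-only 15) still computes the factorial, but
(try-fact-integer-only 15/4) and (try-fact-integer-only 15.0) are
rejected, although the arithmetic itself would work on them.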

--
__Pascal Bourguignon__
http://www.informatimago.com

From: Ben Bacarisse on 5 Jul 2010 08:37

Fred Nurk <albert.xtheunknown0(a)gmail.com> writes:

> It is necessary to classify data into different types because a computer
> must use a system of codes to classify the data processed.

That's a circular answer: it is necessary because it must be used.

> That's my answer (which I've tried to take from my textbook) to this
> thread's subject.
>
> My textbook has this paragraph:
> A data type refers to how the computer classifies data. We can all see at
> a glance whether data is numerical or expressed as a percentage or
> currency but a computer must use a system of codes to classify the
> different types of data that are processed. The different data types in
> computer programming include integers, characters, Boolean, floating
> point numbers, real numbers, dates, pointers, records, algebraic data
> types, abstract data types, reference types, classes and function types.
>
> Does my answer make sense? Could I do a better job of showing my
> understanding of the importance of data types using the textbook
> paragraph?

The quoted paragraph is largely about what a type is. It does not say
anything about why such a system might be helpful -- all it does is
state (without any support) that a computer system must use such a
system. Using that paragraph alone you are pretty much stuck, but the
text probably says more about types than this one paragraph.

You could always step outside the text and see if you can come up with
an answer on your own.

--
Ben.

From: Daniel T. on 5 Jul 2010 09:28

pjb(a)informatimago.com (Pascal J. Bourguignon) wrote:
> Fred Nurk <albert.xtheunknown0(a)gmail.com> writes:
>
> > My answer to the subject of this thread is:
> > It is necessary to classify data into different types because a computer
> > must use a system of codes to classify the data processed.
> >
> > The assumption, of course, is that I'm doing basic reading comprehension
> > from a textbook.
> >
> > My answer is very much derived from this paragraph of my textbook:
> > A data type refers to how the computer classifies data. We can all see at
> > a glance whether data is numerical or expressed as a percentage or
> > currency but a computer must use a system of codes to classify the
> > different types of data that are processed. The different data types in
> > computer programming include integers, characters, Boolean, floating
> > point numbers, real numbers, dates, pointers, records, algebraic data
> > types, abstract data types, reference types, classes and function types.
> >
> > Does my answer make a lot of sense?
>
>
> (123
> 1.23
> "123"
> ABC
> #S(point :x 12 :y 34)
> #(1 2 3 4 5))
>
>
> You can see that we have here a list containing an integer, a floating
> point, a string, a symbol, a structure of type POINT with two slots x
> and y bound to integers, and a vector of integers.
>
> You, as a human, can see it, and the computer, as a lisp implementation,
> can parse it too. Therefore you're both at the same level:
>
> there is no need to classify data according to its type
>
> because there are clues that will tell what kind of data we have and
> both a human and a computer program can use these clues to interpret
> the data as being of one type or another.

That's a funny answer... There is no need to classify data according to
its type because we (and the computer) can classify data according to
its type. :-)

From: Daniel T. on 5 Jul 2010 09:35

Fred Nurk <albert.xtheunknown0(a)gmail.com> wrote:

> It is necessary to classify data into different types because a computer
> must use a system of codes to classify the data processed.
>
> That's my answer (which I've tried to take from my textbook) to this
> thread's subject.
>
> My textbook has this paragraph:
> A data type refers to how the computer classifies data. We can all see at
> a glance whether data is numerical or expressed as a percentage or
> currency but a computer must use a system of codes to classify the
> different types of data that are processed. The different data types in
> computer programming include integers, characters, Boolean, floating
> point numbers, real numbers, dates, pointers, records, algebraic data
> types, abstract data types, reference types, classes and function types.
>
> Does my answer make sense? Could I do a better job of showing my
> understanding of the importance of data types using the textbook
> paragraph?

A computer must classify data according to its type for the same reason
a person must classify information according to its type. We classify
the data in order to determine what operations can meaningfully be
applied to that data.

The question is quite existential and belongs in a philosophy class
rather than a computer class. For a computer class, it should be enough
to understand that the program must be able to classify data according
to its type, and then discuss the different means by which a program can
do that.

From: tm on 5 Jul 2010 09:58

On 5 Jul., 10:24, Fred Nurk <albert.xtheunkno...(a)gmail.com> wrote:
> It is necessary to classify data into different types because a computer
> must use a system of codes to classify the data processed.

Data itself is just a number of bits, and a type assigns a meaning to
the bits. The same bits might be seen as an integer, a character or a
reference (pointer) to other (possibly more complex) data. A type
can also specify other properties of data such as size or valid
and invalid bit patterns. In higher-level programming languages the
type also defines the operations you can do with the data bits.
E.g.: Adding two integers makes sense but adding two characters
might be prohibited.

Some languages attach a type to each variable (parameter ...) at
compile time which cannot be changed at runtime (so-called static
typing) while others maintain a type (which possibly can change)
for each variable at runtime (so-called dynamic typing).
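
The dynamic case can be sketched in Common Lisp, the language used
earlier in this thread. The names KIND-OF and REBINDING-DEMO are made up
for this illustration:

```lisp
;; Hypothetical sketch of dynamic typing: one variable, two types over time.
(defun kind-of (x)
  "Classify X at runtime from the value itself."
  (typecase x
    (integer :integer)
    (string  :string)
    (t       :other)))

(defun rebinding-demo ()
  "Record the kind of value held by V before and after reassignment."
  (let ((v 123)
        (kinds '()))
    (push (kind-of v) kinds)
    ;; The same variable now holds a value of a different type.
    (setf v "one two three")
    (push (kind-of v) kinds)
    (nreverse kinds)))
```

(rebinding-demo) reports an integer first and a string afterwards. In a
statically typed language the assignment of a string to V would
typically be refused at compile time.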

Both (static and dynamic) typing systems have assets and drawbacks.
There have been endless flamewars about "the true typing system".

To some extent dynamic features can be introduced in statically
typed languages. This is the approach used in the Seed7 programming
language. My arguments why I prefer static typing (and dynamic
features inside a static typing framework) can be found here:

http://seed7.sourceforge.net/faq.htm#static_type_checking

Greetings Thomas Mertes

Seed7 Homepage: http://seed7.sourceforge.net
Seed7 - The extensible programming language: User defined statements
and operators, abstract data types, templates without special
syntax, OO with interfaces and multiple dispatch, statically typed,
interpreted or compiled, portable, runs under linux/unix/windows.
