Alternate uses: See Data (disambiguation)
A datum is a statement accepted at face value. Data is the plural of datum. A large class of practically important statements are measurements or observations of a variable. Such statements may comprise numbers, words, or images.
Etymology
The word data is the plural of Latin datum, neuter past participle of dare, "to give", hence "something given". The past participle of "to give" has been used for millennia, in the sense of a statement accepted at face value; one of the works of Euclid, circa 300 BC, was the Dedomena (in Latin, Data). In discussions of problems in geometry, mathematics, engineering, and so on, the terms givens and data are used interchangeably. Such usage is the origin of data as a concept in computer science: data are numbers, words, images, etc., accepted as they stand.
Usage in English
In English, the word datum is still used in the general sense of "something given", and more specifically in cartography, geography, and geology to mean a reference point, reference line, or reference surface. The Latin plural data is also used as a plural in English, but it is commonly treated as a mass noun and used in the singular, as in "This is all the data from the experiment". This usage is inconsistent with the rules of Latin grammar, which would suggest "These are the data ...", each measurement or result being a single datum. However, given the variety and irregularity of English plural constructions, there seem to be no grounds for arguing that data is incorrect as a singular mass noun in English.
Uses of data in computing
Raw data are numbers, characters, images, or other outputs from devices that convert physical quantities into symbols, in a very broad sense. Such data are typically further processed by a human or input into a computer, stored and processed there, or transmitted (output) to another human or computer. Raw data is a relative term; data processing commonly occurs by stages, and the "processed data" from one stage may be considered the "raw data" of the next.
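The staged view of processing described above can be sketched in a few lines of Python. This is only an illustration: the function names (parse_readings, summarize) and the sample readings are hypothetical and do not come from any particular system.

```python
def parse_readings(raw_lines):
    """Stage 1: turn raw text lines from a (hypothetical) device into integers."""
    return [int(line.strip()) for line in raw_lines]


def summarize(values):
    """Stage 2: reduce the parsed values to a small summary record."""
    return {"count": len(values), "min": min(values), "max": max(values)}


# Raw data as it might arrive from an instrument: text, one reading per line.
raw = ["215\n", "220\n", "225\n"]

# The "processed data" of stage 1 ...
parsed = parse_readings(raw)

# ... is the "raw data" of stage 2.
summary = summarize(parsed)

print(parsed)   # [215, 220, 225]
print(summary)  # {'count': 3, 'min': 215, 'max': 225}
```

Here the list of integers is processed data from the point of view of the parser, but raw data from the point of view of the summary step.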
Mechanical computing devices are classified according to the means by which they represent data. An analog computer represents a datum as a voltage, distance, position, or other physical quantity. A digital computer represents a datum as a sequence of symbols drawn from a fixed alphabet. The most common digital computers use a binary alphabet, that is, an alphabet of two characters, typically denoted "0" and "1". More familiar representations, such as numbers and letters, are then constructed from the binary alphabet.
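As a rough illustration of how familiar representations are built from a binary alphabet, the following Python sketch writes letters and a number as strings of "0" and "1". It assumes the ASCII character encoding and 8-bit groups, which are one common convention among many; the helper names to_bits and from_bits are hypothetical.

```python
def to_bits(text):
    """Encode ASCII text as a list of 8-character strings of '0' and '1'."""
    return [format(byte, "08b") for byte in text.encode("ascii")]


def from_bits(bits):
    """Decode a list of 8-character bit strings back into ASCII text."""
    return bytes(int(group, 2) for group in bits).decode("ascii")


bits = to_bits("Hi")
print(bits)             # ['01001000', '01101001']
print(from_bits(bits))  # Hi

# A number written in the same two-symbol alphabet.
print(format(5, "b"))   # 101
```

The same two symbols serve for both letters and numbers; what a given sequence means depends on the convention used to interpret it.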
Some special forms of data are distinguished. A computer program is a collection of data that can be interpreted as instructions. Most computer languages make a distinction between programs and the other data on which programs operate, but in some languages, notably Lisp and similar languages, programs are essentially indistinguishable from other data. It is also useful to distinguish metadata, that is, a description of other data. The prototypical example of metadata is the library catalog, which is a description of the contents of books.
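The idea that a program can be ordinary data may be easier to see in a small sketch. The Python example below stores an arithmetic expression as a nested list, loosely in the style of Lisp, and interprets it with a tiny evaluator. The evaluate function and the OPS table are hypothetical names introduced for illustration; this is not how Lisp itself is implemented.

```python
import operator

# Hypothetical table mapping operator symbols to functions.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(expr):
    """Interpret a nested list such as ["+", 1, ["*", 2, 3]] as instructions."""
    if isinstance(expr, (int, float)):
        return expr  # a plain number evaluates to itself
    op, *args = expr
    values = [evaluate(arg) for arg in args]
    result = values[0]
    for value in values[1:]:
        result = OPS[op](result, value)
    return result


program = ["+", 1, ["*", 2, 3]]

# Treated as instructions, the list computes a value.
print(evaluate(program))  # 7

# Treated as data, the same list can be inspected and modified.
print(len(program))       # 3
program[0] = "*"
print(evaluate(program))  # 6
```

The list is an ordinary data structure until it is handed to the evaluator, at which point it is interpreted as instructions.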
Meaning of data and information
Data on its own has no meaning; only when interpreted by some kind of data processing system does it take on meaning and become information. People or computers can find patterns in data to perceive information, and information can be used to enhance knowledge. Since knowledge is a prerequisite to wisdom, we always want more data and information. But, as modern societies verge on information overload, we especially need better ways to find patterns.
See also
data processing -- data mining -- data warehouse -- datasheet
This article (or an earlier version of it) contains material from FOLDOC, used
with permission.