Data

data - “1. *Information, in any form, on which computer programs operate. The distinction between program (instructions) and data is a fundamental one in computing (see von Neumann machine). It is in this fundamental sense that the word is used in terms such as data, *data break, *data bus, *data cartridge, *data communications, *data compression, *data name, *data protection, *data subject, and *data type.” (Fair Use ODCS)

“2. In a more limited sense, data is distinguished from other contrasting forms of information on which computers operate, such as text, graphics, speech, and image. The distinguishing characteristic is that it is organized in a structured, repetitive, and often compressed way. Typically the structure takes the form of sets of *fields, where the field names are omitted (this omission being a main means of achieving compression). The ‘meaning’ of such data is not apparent to anyone who does not know what each field signifies (for example, only a very limited meaning can be attached to ‘1234’ unless you know that it occupies the ‘employee number’ field). That characteristic gives rise to the popular fallacy that ‘data is meaningless’.” (Fair Use ODCS)

“Terms such as *database, *data dictionary, *data hierarchy, *data independence, *data model, *data preparation, and *data processing normally carry this second sense — though not invariably; the context should determine which sense is intended.” (Fair Use ODCS)
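The ‘1234’ example in the second sense can be illustrated with a small sketch. It stores a record as bare values with the field names omitted, then attaches names from a separately held layout. Only the ‘employee number’ field comes from the quoted definition; the other field names, the sample values, and the helper `label_fields` are hypothetical and chosen purely for illustration.

```python
# Hypothetical field layout. Only "employee_number" is taken from the quoted
# definition; "department_code" and "start_year" are assumptions for this sketch.
FIELD_NAMES = ("employee_number", "department_code", "start_year")


def label_fields(raw_record, field_names=FIELD_NAMES):
    """Pair each positional value with the name of the field it occupies."""
    if len(raw_record) != len(field_names):
        raise ValueError("record length does not match the field layout")
    return dict(zip(field_names, raw_record))


# Stored form: values only, field names omitted to save space.
raw_record = ("1234", "17", "2009")

# Interpreted form: the same values, now readable because the layout is known.
labelled = label_fields(raw_record)
print(labelled["employee_number"])
```

Without `FIELD_NAMES`, the tuple is just three strings; with it, each value gains the limited meaning that its field confers.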

“3. See statistical methods, statistics.” (Fair Use ODCS)

Snippet from Wikipedia: Data

In common usage, data is a collection of discrete or continuous values that convey information, describing the quantity, quality, fact, statistics, other basic units of meaning, or simply sequences of symbols that may be further interpreted formally. A datum is an individual value in a collection of data. Data is usually organized into structures such as tables that provide additional context and meaning, and which may themselves be used as data in larger structures. Data may be used as variables in a computational process. Data may represent abstract ideas or concrete measurements. Data is commonly used in scientific research, economics, and in virtually every other form of human organizational activity. Examples of data sets include price indices (such as consumer price index), unemployment rates, literacy rates, and census data. In this context, data represents the raw facts and figures from which useful information can be extracted.

Data is collected using techniques such as measurement, observation, query, or analysis, and is typically represented as numbers or characters which may be further processed. Field data is data that is collected in an uncontrolled in-situ environment. Experimental data is data that is generated in the course of a controlled scientific experiment. Data is analyzed using techniques such as calculation, reasoning, discussion, presentation, visualization, or other forms of post-analysis. Prior to analysis, raw data (or unprocessed data) is typically cleaned: Outliers are removed and obvious instrument or data entry errors are corrected.
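The cleaning step mentioned above can be sketched in a few lines. The snippet does not say how outliers are identified, so this example assumes one common convention, the interquartile-range (IQR) rule with a multiplier of 1.5. The function name `remove_outliers_iqr` and the sample readings are hypothetical.

```python
import statistics


def remove_outliers_iqr(values, k=1.5):
    """Keep values within k * IQR of the first and third quartiles.

    The IQR rule and k = 1.5 are assumptions of this sketch, not something
    the quoted text prescribes.
    """
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


# Made-up raw readings; 250.0 stands in for an obvious instrument error.
readings = [20.1, 19.8, 20.3, 20.0, 19.9, 20.2, 250.0]
cleaned = remove_outliers_iqr(readings)
print(cleaned)
```

In practice the choice of rule and threshold depends on the data and the analysis; a value flagged this way may be a genuine measurement rather than an error.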

Data can be seen as the smallest units of factual information that can be used as a basis for calculation, reasoning, or discussion. Data can range from abstract ideas to concrete measurements, including, but not limited to, statistics. Thematically connected data presented in some relevant context can be viewed as information. Contextually connected pieces of information can then be described as data insights or intelligence. The stock of insights and intelligence that accumulates over time resulting from the synthesis of data into information, can then be described as knowledge. Data has been described as "the new oil of the digital economy". Data, as a general concept, refers to the fact that some existing information or knowledge is represented or coded in some form suitable for better usage or processing.

Advances in computing technologies have led to the advent of big data, which usually refers to very large quantities of data, usually at the petabyte scale. Using traditional data analysis methods and computing, working with such large (and growing) datasets is difficult, even impossible. (Theoretically speaking, infinite data would yield infinite information, which would render extracting insights or intelligence impossible.) In response, the relatively new field of data science uses machine learning (and other artificial intelligence (AI)) methods that allow for efficient applications of analytic methods to big data.

data.txt · Last modified: 2021/02/15 10:13 by 127.0.0.1