Types of Data
Most data that we encounter in the real world can be classified into two broad categories: qualitative data and quantitative data. Qualitative data can be further classified into nominal and ordinal data, whereas quantitative data can be further classified into continuous and discrete data.
Nominal attributes are those qualitative attributes in which there is no natural ordering in the values that an attribute can take.
- gender of a person (male, female, etc)
- type of crop (Kharif, Rabi, all-season)
- type of dismissal in cricket (lbw, caught, bowled, stumped, run out, etc)
- type of bowler (fast, medium pace, spin)
- nationality of a person (Indian, American, Chinese, etc)
- brand of a mobile phone (Apple, Samsung, LG, Sony, etc)
Ordinal attributes are those qualitative attributes in which there is a natural ordering in the values that an attribute can take.
- income range (low, medium, high)
- health risks (low, medium, high)
- opinion on a policy (strongly disagree, disagree, neutral, agree, strongly agree)
- movie ratings (very bad, bad, ok, good, excellent)
- competence in programming (novice, intermediate, advanced, expert)
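The practical difference between nominal and ordinal attributes can be sketched in a few lines of Python. This is only an illustration: the names `BOWLER_TYPES`, `COMPETENCE_LEVELS`, `same_category` and `is_at_least` are hypothetical, and the category values are taken from the examples above. Nominal values support only equality checks, while ordinal values can also be ranked and sorted.

```python
# Nominal: a set of labels with no order (values from the examples above).
BOWLER_TYPES = {"fast", "medium pace", "spin"}

# Ordinal: a list, because the position of each label carries meaning.
COMPETENCE_LEVELS = ["novice", "intermediate", "advanced", "expert"]


def same_category(a, b, categories=BOWLER_TYPES):
    """Nominal values can only be tested for equality."""
    if a not in categories or b not in categories:
        raise ValueError("unknown category")
    return a == b


def is_at_least(level, threshold, ordered_levels=COMPETENCE_LEVELS):
    """Ordinal values can be ranked: is `level` at or above `threshold`?"""
    return ordered_levels.index(level) >= ordered_levels.index(threshold)


print(same_category("fast", "spin"))            # equality is meaningful
print(is_at_least("advanced", "intermediate"))  # ranking is meaningful

# Sorting by rank only makes sense for the ordinal attribute.
ranked = sorted(["expert", "novice", "advanced"], key=COMPETENCE_LEVELS.index)
print(ranked)
```

Asking whether "fast" is greater than "spin" has no sensible answer, which is why the nominal attribute is stored as an unordered set here.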
Types of numbers
Whole numbers include zero and all positive numbers which do not contain a fractional component (0, 1, 2, 3, 4, 5, …).
Integers include all whole numbers and, in addition, all negative numbers which do not contain a fractional component (…, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, …).
Rational numbers include all numbers which can be expressed as a ratio of two integers with a non-zero denominator (e.g. 1/2, 1/3, 1/4). Notice that, trivially, all integers are also rational numbers, since any integer x can be written as x = x/1, which is indeed a ratio of two integers.
Irrational numbers are numbers which cannot be written as a fraction of two integers (e.g. π, √2).
Real numbers include both irrational and rational numbers (and hence integers also).
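The way these sets nest inside one another can be checked with Python's standard `numbers` and `fractions` modules. The following is a minimal sketch, and the helper `describe` is a hypothetical name introduced only for illustration. One caveat: a Python `float` can only approximate an irrational number such as √2, so the last call shows how Python labels the approximation, not a true irrational.

```python
import math
import numbers
from fractions import Fraction


def describe(value):
    """List which of the number sets described above `value` belongs to."""
    labels = []
    if isinstance(value, numbers.Integral):
        if value >= 0:
            labels.append("whole")
        labels.append("integer")
    if isinstance(value, numbers.Rational):
        labels.append("rational")
    if isinstance(value, numbers.Real):
        labels.append("real")
    return labels


# Any integer x equals the ratio x/1, so it is also a rational number.
x = 7
print(Fraction(x, 1) == x)

print(describe(0))               # whole, integer, rational, real
print(describe(-3))              # integer, rational, real
print(describe(Fraction(1, 2)))  # rational, real
print(describe(math.sqrt(2)))    # real only (a float approximating sqrt(2))
```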
Discrete attributes are those quantitative attributes which can take on only a finite or countable set of values (typically integers).
- number of bedrooms in a house
- number of farmers in a village
- number of runs scored by a batsman
- number of wickets taken by a bowler
- number of mobile phones owned by a person
Continuous attributes are those quantitative attributes which can take on fractional values (real numbers).
- height of a person
- blood sugar level
- interest rates
- distance between two cities
- strike rate of a batsman
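The cricket examples make the contrast concrete: runs and balls faced are counts (discrete), but the strike rate derived from them can be fractional (continuous). The sketch below assumes the usual definition of a batting strike rate, runs scored per 100 balls faced. The function name `strike_rate` and the sample figures are hypothetical.

```python
def strike_rate(runs, balls_faced):
    """Runs scored per 100 balls faced (assumed definition)."""
    if balls_faced <= 0:
        raise ValueError("balls_faced must be positive")
    return 100 * runs / balls_faced


# Discrete inputs: both are whole-number counts.
runs, balls = 50, 37

# Continuous output: the result is generally not a whole number.
rate = strike_rate(runs, balls)
print(round(rate, 2))
```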
Notice that ordinal data is sometimes expressed as numbers. For example, instead of asking a customer to choose between very poor, poor, ok, good and very good, one could simply ask her to pick a number from 1, 2, 3, 4 and 5 (1 being very poor and 5 being very good). Is this ordinal data or discrete quantitative data? It is ordinal data. Although the values are expressed as numbers, there is no notion of distance between them. For example, the difference between very poor and poor may not be the same as that between poor and ok, although each pair is separated by a rating of 1. In such cases, the numbers 1, 2, 3, 4, 5 are still being used as classes or categories with some ordering between them, and hence the data is ordinal.
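One consequence for number-coded ratings is that summaries relying only on order, such as the median or the mode, map back to a category, whereas the arithmetic mean silently assumes equal gaps between the codes. The following is a rough sketch using Python's standard `statistics` module. The `ratings` list is made-up sample data and `LABELS` is a hypothetical mapping that follows the 1 to 5 scale described above.

```python
import statistics

# Codes are category labels with an order, not measured quantities.
LABELS = {1: "very poor", 2: "poor", 3: "ok", 4: "good", 5: "very good"}

# Hypothetical responses from seven customers.
ratings = [1, 2, 2, 4, 5, 5, 5]

# Median and mode use only ordering/counting, so they are valid categories.
median_code = statistics.median_low(ratings)
mode_code = statistics.mode(ratings)
print(LABELS[median_code], LABELS[mode_code])

# The mean treats the gap between every pair of adjacent codes as equal,
# which ordinal data does not guarantee; it also matches no category.
print(statistics.mean(ratings))
```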