Introduction
Radio, television, newspapers, and magazines are the most common media for disseminating statistics. Statistics emphasizes data analysis as a basis for informed decision-making.
Over the last decade, the role of statistics in policy and decision-making has been increasingly recognized among business people, policymakers, academics, government economists, and sociologists, as well as in education, psychology, medicine, and business circles. As a result, many colleges and universities now include statistics in their curricula.
In the late twentieth and early twenty-first centuries, quality in industry improved in the most progressive countries. Japan's "industrial miracle" has been credited to the adoption of statistical tools and statistical thinking among management staff (Walpole, 2002).
Statistics is today a vibrant intellectual discipline, and basic findings may have an impact on statistical practice within the lifespan of the book's readers (Kenny, 2002).
What is Statistics?
What do we mean when we say "statistics"? The term has multiple meanings, and it is a word we hear often in ordinary conversation. The definitions below are drawn from several authors.
According to Reston (2004), the term statistics simply refers to a large amount of data, or a collection of facts and numbers. Examples include the grades of students in the registrar's office, the results of incoming freshmen's admission examinations at particular institutions or colleges, the average monthly temperature and rainfall, and the weekly sales of Company A.
Freund et al. (1986) and Nocon et al. (2000) define statistics as a branch of mathematics concerned with the theory and methods of gathering, analyzing, interpreting, and presenting data. The data may be categorical or numerical.
Walpole (2000) and Lind et al. (2000) define statistics as a branch of science concerned with the concepts and techniques used in the collection, display, analysis, and interpretation of data in order to help people make better decisions.
Formally, then, we define statistics as the branch of science that deals with the collection, presentation, summarization, analysis, and interpretation of data.
Data
The facts and figures that are collected, evaluated, and summarized for presentation and interpretation are referred to as data. Name, age, sex, gender, birth date, TIN, grade, height, weight, religion, hobbies, learning style, program, and so on are examples of data. The data set for a study refers to all of the data acquired during that investigation (Anderson et al., 2015, p. 5).
Is it possible to lose weight on a low-carbohydrate diet? Are individuals more inclined to visit a Starbucks after seeing a recent Starbucks commercial on television? Collecting information is at the heart of answering such questions. Data are the information we collect through trials and surveys.
Characteristics of Data
Volume, velocity, and variety, often called the three Vs, are usually considered the three fundamental characteristics that define big data. Volume refers to the sheer amount of data. Velocity means that data are arriving faster than ever before and must be stored just as quickly. Variety refers to the wide range of data structures that may be stored.
Table 1 shows a data set containing information for 16 regions of the Philippines affected by COVID-19.
Elements are the entities on which data are collected. They are also called individuals. For the data set in Table 1, each location is an element/individual: the element/individual names appear in the first column. With 16 locations, the data set contains 16 elements.
A variable is a characteristic of interest for the elements. The data set in Table 1 includes the following three variables:
Measurements collected on each variable for every element in a study provide the data. The set of measurements obtained for a particular element is called an observation. Referring to Table 1, we see that the set of measurements for the first observation (Metro Manila) is 59,805; 34,346; and 1,033. The set of measurements for the second observation (Central Visayas) is 16,954; 12,069; and 685. A data set with 16 elements contains 16 observations.
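The relationship between elements, variables, and observations can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it holds just the two observations quoted above, and the variable names (`variable_1` and so on) are hypothetical placeholders, since the variables are not named in this passage.

```python
# Hypothetical placeholder names: the three variables are not named in the text.
VARIABLES = ("variable_1", "variable_2", "variable_3")

# Each key is an element (a region); each value is that element's observation.
# Only the two observations quoted in the text are included here.
data_set = {
    "Metro Manila": (59805, 34346, 1033),
    "Central Visayas": (16954, 12069, 685),
}

elements = list(data_set)               # the entities on which data are collected
observations = list(data_set.values())  # one set of measurements per element

# Every observation holds exactly one measurement per variable.
assert all(len(obs) == len(VARIABLES) for obs in observations)

# A data set contains as many observations as it has elements.
print(len(elements), len(observations))

# Pair the variable names with the first observation to look up one measurement.
first = dict(zip(VARIABLES, data_set["Metro Manila"]))
print(first["variable_2"])
```

Extending `data_set` to all 16 regions in Table 1 would give 16 elements and 16 observations, which is the point made in the paragraph above.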
There are four scales of measurement: nominal, ordinal, interval, and ratio. Nominal scales label variables without assigning any quantitative value; examples include sex, gender, and hair color. With ordinal scales, the order of the values is what matters, but the size of the difference between them is not known; examples include ratings of satisfaction, discomfort, and happiness. Interval scales are numeric scales in which both the order and the exact differences between values are known; temperature in degrees Celsius is an example. Finally, ratio scales are the most informative of the four: they convey the order and the exact differences between values, and they also have an absolute zero, which allows a wide range of descriptive and inferential statistics to be applied. Examples of ratio variables include height, weight, and duration (Guy, 2020).
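One way to see the difference between the four scales is to look at which summaries make sense for each. The sketch below uses only Python's standard library; every value in it is invented for illustration, and the ordering `low < medium < high` is an assumption about how the satisfaction labels rank.

```python
from collections import Counter
from statistics import mean, median_low

# All values below are made-up illustrations, not data from this post.
hair_colors = ["black", "brown", "black", "blond"]          # nominal
satisfaction = ["low", "high", "medium", "high", "medium"]  # ordinal
celsius = [20.0, 25.0, 30.0]                                # interval
weights_kg = [50.0, 75.0, 100.0]                            # ratio

# Nominal: labels can only be counted, so the mode is the natural summary.
most_common_color = Counter(hair_colors).most_common(1)[0][0]

# Ordinal: order is meaningful, so rank the labels and take the median rank.
ORDER = ["low", "medium", "high"]  # assumed ranking of the labels
median_rank = median_low([ORDER.index(s) for s in satisfaction])
median_satisfaction = ORDER[median_rank]

# Interval: differences and means are meaningful, but ratios are not
# (30 degrees Celsius is not "1.5 times as hot" as 20 degrees Celsius).
mean_celsius = mean(celsius)
celsius_gap = celsius[-1] - celsius[0]

# Ratio: an absolute zero makes ratios meaningful as well.
weight_ratio = weights_kg[-1] / weights_kg[0]

print(most_common_color, median_satisfaction, mean_celsius, celsius_gap, weight_ratio)
```

Each scale supports the summaries of the scales before it plus one more, which is why ratio data admit the widest range of statistics.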
Data can be classified as either categorical or quantitative. Data that can be grouped by specific categories are referred to as categorical data. Categorical data use either the nominal or ordinal scale of measurement. A categorical variable is a variable with categorical data (Anderson et al., 2015, p. 7).
Quantitative data, on the other hand, can be measured and not just observed. They can be represented numerically and used in calculations. A quantitative variable is a variable with quantitative data. For example, the number of female and male students in Computer Programming is numerical and is therefore quantitative data (Data: Types of Data, Primary Data, Secondary Data, Solved Examples, 2020).
For purposes of statistical analysis, it is necessary to distinguish between time-series data and cross-sectional data. Time-series data are observations collected at discrete, usually equally spaced, time intervals. The daily closing price of a certain stock recorded over the last six weeks is an example of time-series data. Cross-sectional data are observations that come from different individuals or groups at a single point in time. Examples include an inventory of all ice creams in stock at a particular store and a list of grades obtained by a class of students on a specific test (Team, n.d.).
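The distinction can be made concrete with a small Python sketch. The records, stock names, dates, and prices below are all hypothetical, and the two helper functions are illustrative names rather than part of any library: a time series follows one entity across dates, while a cross-section looks at many entities on one date.

```python
from datetime import date

# Hypothetical records of the form (entity, date, closing price).
records = [
    ("Stock A", date(2024, 1, 2), 10.0),
    ("Stock A", date(2024, 1, 3), 10.5),
    ("Stock A", date(2024, 1, 4), 10.2),
    ("Stock B", date(2024, 1, 3), 7.1),
    ("Stock C", date(2024, 1, 3), 3.4),
]

def time_series(records, entity):
    """Return (date, value) pairs for one entity, sorted by date."""
    return sorted((d, v) for e, d, v in records if e == entity)

def cross_section(records, when):
    """Return {entity: value} for every entity observed on one date."""
    return {e: v for e, d, v in records if d == when}

series_a = time_series(records, "Stock A")            # one stock, many days
snapshot = cross_section(records, date(2024, 1, 3))   # many stocks, one day

print(series_a)
print(snapshot)
```

The same pool of records yields either kind of data set; what changes is whether time or the entity is held fixed.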
Data retrieved first-hand are known as primary data, whereas data retrieved from preexisting sources are known as secondary data. Primary data sources include information collected and processed directly by the researcher, such as observations, surveys, interviews, and focus groups. Secondary data sources include information retrieved from preexisting sources, such as research articles and Internet or library searches. Preexisting data may also include records and data already within the program: publications and training materials, financial records, student/client data, and performance reviews (University of Minnesota & The United States Department of Agriculture’s National Institute of Food and Agriculture, n.d.).
I'm sure I don't have all of the answers or information about statistics and data right here. I'd love to hear your thoughts on this topic in the comments section. You can follow this blog to receive notifications of new posts.