Let's start with an example: imagine we want to discover and quantify the factors that influence labor market earnings. A moment's thought reveals a plethora of characteristics linked to differences in earnings between individuals: occupation, age, experience, educational attainment, drive, and intrinsic talent spring to mind, perhaps along with race and gender, which are of particular importance to lawyers. For the time being, let's focus on a single factor, which we'll call education. A regression analysis with only one explanatory variable is called a simple linear regression.
The term "regression" and the methods for studying the correlations between two variables are said to have originated around 100 years ago. It was first suggested by Francis Galton, a distinguished British biologist, in 1908, when he was researching heredity. One of his findings was that children of tall parents are taller than normal, but not as tall as their parents. The term "regression toward mediocrity" was used to describe these statistical procedures. The term regression, as well as its history, refers to statistical relationships that exist between variables. Simple regression, in particular, is a regression approach for examining the connection between one dependent variable (y) and one independent variable (x).
The simple linear regression model is typically stated in the form y = β0 + β1x + ε, where y is the dependent variable, β0 is the y-intercept, β1 is the slope of the simple linear regression line, x is the independent variable, and ε is the random error. The independent variable is also known as the explanatory or predictor variable, and the dependent variable is also known as the response variable; the explanatory variable accounts for changes in the response variable. A more general representation of a regression model is y = E(y) + ε, where E(y) denotes the mathematical expectation of the response variable. The model is a linear regression when E(y) is a linear combination of explanatory variables x1, x2, ..., xk; when k = 1, it is a simple linear regression.
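The decomposition y = E(y) + ε can be made concrete with a small sketch. The intercept and slope values below are made up purely for illustration (they are not estimates from any real earnings data):

```python
# Sketch of the simple linear regression model y = b0 + b1*x + e.
# The mean response E(y) is the systematic part; e is the random error.

def expected_response(b0, b1, x):
    """Mean response E(y) = b0 + b1*x for a given value of x."""
    return b0 + b1 * x

# Hypothetical example: earnings (in $1000s) vs. years of education,
# with an assumed intercept b0 = 10 and slope b1 = 2.5.
b0, b1 = 10.0, 2.5
x = 16  # years of education
print(expected_response(b0, b1, x))  # E(y) = 10 + 2.5*16 = 50.0
```

An observed y would then be this mean response plus a random error ε with E(ε) = 0, so observations scatter around the line rather than lying exactly on it.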
The classical assumptions on the error term are E(ε) = 0 and a constant variance Var(ε) = σ^2. In the typical experiment for simple linear regression, we observe n pairs of data (x1, y1), (x2, y2), ..., (xn, yn) from a scientific experiment, and the model in terms of the n pairs of data can be written as

yi = β0 + β1xi + εi, i = 1, 2, ..., n,
with E(εi) = 0, a constant variance Var(εi) = σ^2, and all εi's independent. It is worth noting that the true value of σ^2 is usually unknown. The values of the xi's are assumed to be measured "precisely," meaning there is no measurement error. Once the model has been specified and the data collected, the next step is to find "good" estimates of β0 and β1 for the simple linear regression model that best describes the data. In the following section, we derive these estimates and examine their statistical properties.
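Anticipating the derivation in the next section, the standard least-squares estimates have closed forms: the slope estimate is Sxy/Sxx (the ratio of the sum of cross-products to the sum of squares of x about their means), and the intercept estimate is ȳ − β̂1 x̄. A minimal sketch, using only those textbook formulas:

```python
# Least-squares estimates for simple linear regression, computed from
# n observed pairs (x_i, y_i) under the model y_i = b0 + b1*x_i + e_i.

def least_squares(pairs):
    """Return (b0_hat, b1_hat) for a list of (x, y) pairs."""
    n = len(pairs)
    xbar = sum(x for x, _ in pairs) / n
    ybar = sum(y for _, y in pairs) / n
    sxx = sum((x - xbar) ** 2 for x, _ in pairs)       # sum of squares of x
    sxy = sum((x - xbar) * (y - ybar) for x, y in pairs)  # cross-products
    b1_hat = sxy / sxx
    b0_hat = ybar - b1_hat * xbar
    return b0_hat, b1_hat

# Toy data lying exactly on y = 1 + 2x, so the fit recovers those values.
data = [(0, 1), (1, 3), (2, 5), (3, 7)]
print(least_squares(data))  # (1.0, 2.0)
```

With real data the points will not lie exactly on a line, and these estimates give the line that minimizes the sum of squared vertical deviations from the observations.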
Your comments and suggestions on this topic are highly appreciated; type them in the comment section below. You can follow this blog to receive notifications of new posts.