How to Interpret P-Values in Linear Regression (With Example) - Statology (2024)

In statistics,linear regression models are used to quantify the relationship between one or more predictor variables and a response variable.

Whenever you perform regression analysis using some statistical software, you will receive a regression table that summarizes the results of the model.

Two of the most important values in a regression table are the regression coefficients and their corresponding p-values.

The p-values tell you whether or not there is a statistically significant relationship between each predictor variable and the response variable.

The following example shows how to interpret the p-values of a multiple linear regression model in practice.

Example: Interpreting P-Values in Regression Model

Suppose we want to fit a regression modelusing the following variables:

Predictor Variables

  • Total number of hours studied (between 0 and 20)
  • Whether or not a student used a tutor (yes or no)

Response Variable

  • Exam score (between 0 and 100)

We want to examine the relationship between the predictor variables and the response variable to find out if hours studied and tutoring actually have a meaningful impact on exam score.

Suppose we run a regression analysis and get the following output:

TermCoefficientStandard Errort StatP-value
Intercept48.5614.323.390.002
Hours studied2.030.673.030.009
Tutor8.345.681.470.138

Here’s how to interpret the output for each term in the model:

Interpreting the P-value for Intercept

Theinterceptterm in a regression table tells us the average expected value for the response variable when all of the predictor variables are equal to zero.

In this example, the regression coefficient for the intercept is equal to48.56. This means that for a student who studied for zero hours, the average expected exam score is 48.56.

The p-value is 0.002, which tells us that the intercept term is statistically different than zero.

In practice, we don’t usually care about the p-value for the intercept term. Even if the p-value isn’t less than some significance level (e.g. 0.05), we would still keep the intercept term in the model.

Interpreting the P-value for a Continuous Predictor Variable

In this example, Hours studied is a continuous predictor variable that ranges from 0 to 20 hours.

From the regression output, we can see that the regression coefficient forHours studied is 2.03. This means that, on average, each additional hour studied is associated with an increase of 2.03 points on the final exam, assuming the predictor variableTutoris held constant.

For example, consider student A who studies for 10 hours and uses a tutor. Also consider student B who studies for 11 hours and also uses a tutor. According to our regression output, student B is expected to receive an exam score that is 2.03 points higher than student A.

The corresponding p-value is0.009, which is statistically significant at an alpha level of 0.05.

This tells us that that the average change in exam score for each additional hour studied is statistically significantly different than zero.

Another way to put this: Hours studied has a statistically significant relationship with the response variable exam score.

Interpreting the P-value for a Categorical Predictor Variable

In this example,Tutoris a categorical predictor variable that can take on two different values:

  • 1 = the student used a tutor to prepare for the exam
  • 0 = the student did not used a tutor to prepare for the exam

From the regression output, we can see that the regression coefficient forTutoris8.34. This means that, on average, a student who used a tutor scored 8.34 points higher on the exam compared to a student who did not used a tutor, assuming the predictor variableHours studiedis held constant.

For example, consider student A who studies for 10 hours and uses a tutor. Also consider student B who studies for 10 hours and does not use a tutor. According to our regression output, student A is expected to receive an exam score that is 8.34 points higher than student B.

The corresponding p-value is0.138, which is not statistically significant at an alpha level of 0.05.

This tells us that that the average change in exam score for each additional hour studied is not statistically significantly different than zero.

Another way to put this: The predictor variable Tutor does not have a statistically significant relationship with the response variable exam score.

This indicates that although students who used a tutor scored higher on the exam, this difference could have been due to random chance.

Additional Resources

The following tutorials provide additional information about linear regression:

How to Interpret the F-Test of Overall Significance in Regression
The Five Assumptions of Multiple Linear Regression
Understanding the t-Test in Linear Regression

Featured Posts

How to Interpret P-Values in Linear Regression (With Example) - Statology (1)

5 Statistical Biases to Avoid

April 25, 2024

How to Interpret P-Values in Linear Regression (With Example) - Statology (2)

5 Free Statistics Courses for Beginners

April 19, 2024

How to Interpret P-Values in Linear Regression (With Example) - Statology (3)

5 MIT Statistics Courses That Are Free

April 18, 2024

How to Interpret P-Values in Linear Regression (With Example) - Statology (4)

5 Free Books to Learn Statistics

April 18, 2024

How to Interpret P-Values in Linear Regression (With Example) - Statology (5)

How to Use the info() Method in Pandas

April 12, 2024

How to Interpret P-Values in Linear Regression (With Example) - Statology (6)

How to Use pct_change() in Pandas

April 12, 2024

How to Interpret P-Values in Linear Regression (With Example) - Statology (2024)

FAQs

How do you interpret the p-value example? ›

For example, suppose that a vaccine study produced a P value of 0.04. This P value indicates that if the vaccine had no effect, you'd obtain the observed difference or more in 4% of studies due to random sampling error.

What if the p-value is less than 0.05 in linear regression? ›

They provided a simple explanation of the problem in making an inference from p-value, for example, if the p-value is less than 0.05, we have enough evidence to reject the null hypothesis and accept the claim. By this conviction in the regression framework, we must reject the null hypothesis ( H 0 : β = 0 ) .

Is p-value 0.04 significant? ›

A p-value of 0.05 or lower is generally considered statistically significant. P-value can serve as an alternative to—or in addition to—preselected confidence levels for hypothesis testing.

What does p-value tell you in regression? ›

In the context of regression, the p-value reported in this table (Prob > F) gives us an overall test for the significance of our model. The p-value is used to test the hypothesis that there is no relationship between the predictor and the response.

How do you know if a regression is significant with p-value? ›

Coefficients having p-values less than alpha are statistically significant. For example, if you chose alpha to be 0.05, coefficients having a p-value of 0.05 or less would be statistically significant (i.e., you can reject the null hypothesis and say that the coefficient is significantly different from 0).

What is the p-value and significance level for dummies? ›

Over the years, the value of 0.05 has become accepted as a reasonable criterion for declaring significance. If you adopt the criterion that p must be less than or equal to 0.05 to declare significance, then you'll keep the chance of making a Type I error to no more than 5 percent.

Does p-value matter in linear regression? ›

How Do I Interpret the P-Values in Linear Regression Analysis? The p-value for each term tests the null hypothesis that the coefficient is equal to zero (no effect). A low p-value (< 0.05) indicates that you can reject the null hypothesis.

Is p-value important in linear regression? ›

The p-value of a linear regression model checks if there is a significant linear relationship or correlation between your predictors (in this example the days Monday to Friday) and the target variable (Return/Earnings). If the p-value is low, this means its relationship is significant.

What if the p-value is not significant in regression? ›

These are as follows: if the P value is 0.05, the null hypothesis has a 5% chance of being true; a nonsignificant P value means that (for example) there is no difference between groups; a statistically significant finding (P is below a predetermined threshold) is clinically important; studies that yield P values on ...

How do you explain p-value to non-technicians? ›

The p-value helps you understand the likelihood that your results could have occurred by chance if the null hypothesis were true. A low p-value indicates that your findings are unlikely to be due to random chance, suggesting that the effect you're investigating may indeed be real.

Is p-value 0.75 significant? ›

If the p-value is larger than 0.05, we cannot conclude that a significant difference exists. That's pretty straightforward, right?

Is p-value of 0.0000 significant? ›

A p-value of less than 0.05 implies significance and that of less than 0.01 implies high significance. Therefore p=0.0000 implies high significance. Article Making friends with your data: Improving how statistics are ...

What does a p-value of 0.3 mean? ›

E.g. a p-value of 0.3 means "repeating the study many times, given that the null hypothesis + all other assumptions are true, I would see the result I'm seeing (or a more extreme result) 30% of time, so it wouldn't be super unusual.

What does a high p-value of a feature in linear regression implies? ›

Often, after training a linear regression model on data, some variables/features would have high P-value, which means they are not statistically significant. Although there are automated methods like variable selection, such as Step-wise, LASSO, etc.

How to interpret r squared and p-value? ›

The greater R-square the better the model. Whereas p-value tells you about the F statistic hypothesis testing of the “fit of the intercept-only model and your model are equal”. So if the p-value is less than the significance level (usually 0.05) then your model fits the data well.

Is p 0.0001 statistically significant? ›

Conventionally, p < 0.05 is referred as statistically significant and p < 0.001 as statistically highly significant.

Top Articles
Latest Posts
Article information

Author: Dong Thiel

Last Updated:

Views: 6190

Rating: 4.9 / 5 (79 voted)

Reviews: 86% of readers found this page helpful

Author information

Name: Dong Thiel

Birthday: 2001-07-14

Address: 2865 Kasha Unions, West Corrinne, AK 05708-1071

Phone: +3512198379449

Job: Design Planner

Hobby: Graffiti, Foreign language learning, Gambling, Metalworking, Rowing, Sculling, Sewing

Introduction: My name is Dong Thiel, I am a brainy, happy, tasty, lively, splendid, talented, cooperative person who loves writing and wants to share my knowledge and understanding with you.