Countless factors impact every facet of business. How can you consider those factors and know their true impact?
Imagine you seek to understand the factors that influence people’s decision to buy your company’s product. They range from customers’ physical locations to satisfaction levels among sales representatives to your competitors' Black Friday sales.
Understanding the relationships between each factor and product sales can enable you to pinpoint areas for improvement, helping you drive more sales.
To learn how each factor influences sales, you need to use a statistical analysis method called regression analysis.
If you aren’t a business or data analyst, you may not run regressions yourself, but knowing how regression analysis works can provide important insight into which factors impact product sales and, thus, which are worth improving.
Before diving into regression analysis, you need to build foundational knowledge of statistical concepts and relationships.
Start with the basics. What relationship are you aiming to explore? Try formatting your answer like this: “I want to understand the impact of [the independent variable] on [the dependent variable].”
The independent variable is the factor that could impact the dependent variable. For example, “I want to understand the impact of employee satisfaction on product sales.”
In this case, employee satisfaction is the independent variable, and product sales is the dependent variable. Identifying the dependent and independent variables is the first step toward regression analysis.
One of the cardinal rules of statistically exploring relationships is to never assume correlation implies causation. In other words, just because two variables move in the same direction doesn’t mean one caused the other to occur.
If two or more variables are correlated, their directional movements are related. If two variables are positively correlated, it means that as one goes up or down, so does the other. Alternatively, if two variables are negatively correlated, one goes up while the other goes down.
A correlation’s strength can be quantified by calculating the correlation coefficient, sometimes represented by r. The correlation coefficient falls between negative one and positive one.
r = -1 indicates a perfect negative correlation.
r = 1 indicates a perfect positive correlation.
r = 0 indicates no linear correlation.
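To make the correlation coefficient concrete, here is a minimal Python sketch that computes r from two lists of numbers. The function name pearson_r and the satisfaction and sales figures are made up for illustration; they are not from the Business Analytics course.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the correlation coefficient r for two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How the two variables move together, and how much each moves on its own.
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("r is undefined when a variable never changes")
    return co_movement / sqrt(spread_x * spread_y)


# Hypothetical data: satisfaction scores and two made-up sales series.
satisfaction = [1, 2, 3, 4, 5]
sales_up = [10, 20, 30, 40, 50]    # rises in lockstep with satisfaction
sales_down = [50, 40, 30, 20, 10]  # falls in lockstep with satisfaction

r_up = pearson_r(satisfaction, sales_up)
r_down = pearson_r(satisfaction, sales_down)
print(r_up, r_down)
```

The first series is a perfect positive correlation and the second a perfect negative one, so the two results sit at the opposite ends of the range described above. Real business data will almost always land somewhere in between.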
Causation means that one variable caused the other to occur. Establishing a causal relationship between variables requires a true experiment with a control group (which isn’t exposed to the change in the independent variable) and an experimental group (which is).
While regression analysis provides insights into relationships between variables, it doesn’t prove causation. It can be tempting to assume that one variable caused the other—especially if you want it to be true—which is why you need to keep this in mind any time you run regressions or analyze relationships between variables.
With the basics under your belt, here’s a deeper explanation of regression analysis so you can leverage it to drive strategic planning and decision-making.
Regression analysis is the statistical method used to determine the structure of a relationship between two variables (single variable linear regression) or three or more variables (multiple regression).
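According to the Harvard Business School Online course Business Analytics, regression is used for two primary purposes: to study the magnitude and structure of the relationship between variables, and to forecast a variable based on its relationship with another variable.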
Both of these insights can inform strategic business decisions.
“Regression allows us to gain insights into the structure of that relationship and provides measures of how well the data fit that relationship,” says HBS Professor Jan Hammond, who teaches Business Analytics, one of three courses that comprise the Credential of Readiness (CORe) program. “Such insights can prove extremely valuable for analyzing historical trends and developing forecasts.”
One way to think of regression is by visualizing a scatter plot of your data with the independent variable on the X-axis and the dependent variable on the Y-axis. The regression line is the line that best fits the scatter plot data. The regression equation describes that line: its intercept, its slope (how much the dependent variable is expected to change for each one-unit change in the independent variable), and an error term for the variation the line doesn’t explain.
Plotting your data this way is a natural starting point for parsing out the relationships between variables.
There are two types of regression analysis: single variable linear regression and multiple regression.
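Single variable linear regression is used to determine the relationship between two variables: the independent and dependent. The equation for a single variable linear regression looks like this:
y = α + βx + ε
In the equation:
y is the dependent variable.
x is the independent variable.
α is the y-intercept, the expected value of y when x is zero.
β is the slope of the regression line, the expected change in y for each one-unit increase in x.
ε is the error term, the part of y that the line doesn’t explain.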
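For readers curious what a statistical program does behind the scenes, the following Python sketch fits a line of the form y = intercept + slope * x using ordinary least squares, the standard method for finding the best-fitting line. The function name fit_line and the data are hypothetical, and the sketch skips the significance and confidence calculations a real analysis would need.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    if spread_x == 0:
        raise ValueError("the independent variable must vary")
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = co_movement / spread_x
    # The least squares line always passes through the point of averages.
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: satisfaction scores and monthly unit sales.
satisfaction = [1, 2, 3, 4, 5]
sales = [12, 14, 17, 18, 21]

intercept, slope = fit_line(satisfaction, sales)

# Residuals are the gaps between actual sales and the line's estimate;
# they play the role of the error term for each observation.
residuals = [y - (intercept + slope * x) for x, y in zip(satisfaction, sales)]
print(intercept, slope, residuals)
```

With this made-up data, the slope comes out to about 2.2 and the intercept to about 9.8, meaning each additional satisfaction point is associated with roughly 2.2 more units sold. The residuals show how far each actual observation falls from the line.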
Multiple regression, on the other hand, is used to determine the relationship between three or more variables: the dependent variable and at least two independent variables. The multiple regression equation looks complex but is similar to the single variable linear regression equation:
y = α + β₁x₁ + β₂x₂ + … + βₖxₖ + ε
Each component represents the same thing as in the single variable equation: y is the dependent variable, α is the intercept, and ε is the error term. The subscripts run from 1 to k, where k is the total number of independent variables being examined. Each independent variable has its own coefficient, so for each one you include in the regression, multiply its coefficient by its value and add the result to the rest of the equation.
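As a rough illustration of multiple regression, the Python sketch below estimates an intercept and one coefficient per independent variable by solving the least squares "normal equations" directly. Everything here is an assumption for demonstration: the function names, the second variable (ad spend), and the data, which was constructed so that sales = 5 + 2 * satisfaction + 3 * ad spend exactly. In practice you would rely on a statistical program rather than hand-rolled code.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so each row operation applies to both sides.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("independent variables are perfectly collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution, from the last unknown to the first.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def fit_multiple(rows, ys):
    """Least squares fit of y = alpha + beta_1*x_1 + ... + beta_k*x_k.

    rows holds one list of k independent-variable values per observation.
    Returns [alpha, beta_1, ..., beta_k].
    """
    if len(rows) != len(ys):
        raise ValueError("need one dependent value per observation")
    design = [[1.0] + list(row) for row in rows]  # leading 1 carries the intercept
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Hypothetical observations: [satisfaction score, ad spend].
observations = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6]]
# Made-up sales built from 5 + 2 * satisfaction + 3 * ad spend, with no noise.
sales = [13, 12, 23, 22, 33]

alpha, beta_1, beta_2 = fit_multiple(observations, sales)
print(alpha, beta_1, beta_2)
```

Because the sample data contains no noise, the fit should recover values very close to 5, 2, and 3. With real data, the coefficients are estimates, and the error term absorbs whatever the independent variables don't explain.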
You can use a host of statistical programs, such as Microsoft Excel, SPSS, and Stata, to run both single variable linear and multiple regressions. If you’re interested in hands-on practice with this skill, Business Analytics teaches learners how to create scatter plots and run regressions in Microsoft Excel, as well as make sense of the output and use it to drive business decisions.
Note that this overview of regression analysis is introductory and doesn’t delve into calculations of confidence level, significance, variance, and error. Statistical programs may report these values automatically or require you to run a specific function to get them. When conducting regression analysis, these metrics are important for gauging how significant your results are and how much weight to place on them.
Once you’ve generated a regression equation for a set of variables, you effectively have a roadmap for the relationship between your independent and dependent variables. If you input a specific X value into the equation, you can see the expected Y value.
This can be critical for predicting the outcome of potential changes, allowing you to ask, “What would happen if this factor changed by a specific amount?”
Returning to the earlier example, running a regression analysis could allow you to find the equation representing the relationship between employee satisfaction and product sales. You could input a higher level of employee satisfaction and see how sales might change accordingly. This information could lead to improved working conditions for employees, backed by data that shows the tie between high employee satisfaction and sales.
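The "what if" question above can be sketched in a few lines of Python. The intercept and slope here are hypothetical placeholders standing in for the output of a real regression, and predict_sales is an illustrative name, not part of any statistical package.

```python
def predict_sales(intercept, slope, satisfaction_score):
    """Expected sales from a fitted line: intercept + slope * satisfaction."""
    return intercept + slope * satisfaction_score


# Hypothetical regression output, for illustration only.
intercept = 9.8
slope = 2.2

current_sales = predict_sales(intercept, slope, 3)   # today's satisfaction score
improved_sales = predict_sales(intercept, slope, 4)  # score after improvements
expected_change = improved_sales - current_sales

print(current_sales, improved_sales, expected_change)
```

The expected change equals the slope multiplied by the change in the independent variable, so here a one-point rise in satisfaction is associated with about 2.2 more units sold. Treat the result as an estimate of an association rather than a guaranteed outcome, since regression alone doesn’t prove causation.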
Whether predicting future outcomes, determining areas for improvement, or identifying relationships between seemingly unconnected variables, understanding regression analysis can enable you to craft data-driven strategies and determine the best course of action with all factors in mind.
Do you want to become a data-driven professional? Explore our eight-week Business Analytics course and our three-course Credential of Readiness (CORe) program to deepen your analytical skills and apply them to real-world business problems.
Catherine Cote is a marketing coordinator at Harvard Business School Online. Prior to joining HBS Online, she worked at an early-stage SaaS startup where she found her passion for writing content, and at a digital consulting agency, where she specialized in SEO. Catherine holds a B.A. from Holy Cross, where she studied psychology, education, and Mandarin Chinese. When not at work, you can find her hiking, performing or watching theatre, or hunting for the best burger in Boston.