In research, many terms share the same underlying meaning, which can be confusing for researchers as they try to complete a project. People have different backgrounds and learn different terms during their studies, and when they try to work with others there is often confusion over what is what.
In this post, we will try to clarify as much as possible the various terms that are used when referring to variables. We will look at the following during this discussion:
- Definition of a variable
- Minimum knowledge of the characteristics of a variable in research
- Various synonyms of variable
The word variable has the root “vary” and the suffix “able”. This literally means that a variable is something that is able to change. Examples include such concepts as height, weight, salary, etc. All of these change as you gather data from different people. Statistics is primarily about trying to explain and/or understand the variability of variables.
However, to make things more confusing, there are times in research when a variable does not change, or remains constant. This will be explained in greater detail in a moment.
Minimum You Need to Know
Two broad concepts that you need to understand, regardless of the specific variable terms you encounter, are the following:
- Whether the variable(s) are independent or dependent
- Whether the variable(s) are categorical or continuous
When we speak of independent and dependent variables we are looking at the relationship(s) between variables. Dependent variables are explained by independent variables. Therefore, one dimension of variables is understanding how they relate to each other and the most basic way to see this is independent vs dependent.
The second dimension to consider when thinking about variables is how they are measured, which is captured by the terms categorical and continuous. A categorical variable has a finite number of values that can be used. Examples include gender, eye color, or cellphone brand. In a given dataset, a person might be recorded as male or female, as having blue or brown eyes, and as owning one brand of cellphone.
Continuous variables are variables that can take on an infinite number of values. Salary, temperature, etc. are all continuous in nature. It is possible to reduce a continuous variable to a categorical variable by creating intervals in which to place values. This is commonly done when creating bins for histograms. In sum, here are the four possible general variable types:
- Independent categorical
- Independent continuous
- Dependent categorical
- Dependent continuous
Naturally, most models have one dependent variable, either categorical or continuous. However, you can have any combination of continuous and categorical variables as independents. Remember that all variables have the above characteristics regardless of which term is used for them.
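The idea of placing a continuous variable into intervals, mentioned above in connection with histogram bins, can be sketched as follows. The bin edges, the labels, and the function name `salary_band` are assumptions chosen only for illustration.

```python
from bisect import bisect_right

# Hypothetical bin edges for monthly salary; the labels are illustrative only.
BIN_EDGES = [2000, 4000]                 # boundaries between neighbouring intervals
BIN_LABELS = ["low", "middle", "high"]   # one label per interval

def salary_band(salary):
    """Map a continuous salary onto one of a finite set of categories."""
    # bisect_right counts how many edges are <= salary, which is the interval index.
    return BIN_LABELS[bisect_right(BIN_EDGES, salary)]

salaries = [1250.50, 2000.00, 3999.99, 7400.00]
bands = [salary_band(s) for s in salaries]
```

With these edges, a salary exactly on a boundary falls into the higher interval. That is a design choice of this sketch, not a rule.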
Below is a list of various names that variables go by in different disciplines. This is by no means an exhaustive list.
An experimental variable is a variable whose values are independent of any changes in the values of other variables. In other words, it is just another term for an independent variable.
Often called a manipulated variable, this is a variable that is independent in an experiment and whose value or behavior the researcher is able to control or manipulate. It is another term for an independent variable.
A control variable is a variable whose value does not change. This is the constant variable mentioned earlier. Controlling a variable helps to explain the relationship between the independent and dependent variables in an experiment by making sure the control variable has not influenced the model.
Often called a responding variable, this is the dependent variable in an experiment. It responds to the experimental variable.
An intervening variable is a hypothetical variable used to explain the causal links between other variables. Because intervening variables are hypothetical, they are not observed in an actual experiment. For example, suppose you are looking at the relationship between income and life expectancy and find a strong positive relationship. The intervening variable may be access to health care: people who make more money have more access to health care, and this contributes to them often living longer.
A mediating variable is the same thing as an intervening variable. The difference is that a mediating variable is not always hypothetical in nature and is often measured itself.
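A confounder, or confounding variable, is a variable that influences both the independent and dependent variables, causing a spurious or false association between them. Confounding is a causal idea and often cannot be described purely in terms of correlations or associations with other variables. It is easy to mix up with an intervening variable, but an intervening variable lies on the path between the independent and dependent variables, whereas a confounder influences both from outside that path.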
An explanatory variable is the same as an independent variable, with one difference: an independent variable is not influenced by any other variables. When that independence is not certain, the variable is called an explanatory variable.
A predictor variable is an independent variable. This term is commonly used for regression analysis.
An outcome variable is a dependent variable in the context of regression analysis.
An observed variable is a variable that is measured directly. Examples would be gender or height. No psychological construct is needed to infer the meaning of such variables.
Unobserved variables are constructs that cannot be measured directly. In such situations, observed variables are used to try to determine the characteristics of the unobserved variable. For example, it is hard to measure addiction directly. Instead, other things will be measured to infer addiction, such as health, drug use, performance, etc. The measures of these observed variables will indicate the level of the unobserved variable of addiction.
A feature is an independent variable in the context of machine learning and data science.
A target variable is the dependent variable in the context of machine learning and data science.
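In machine learning code, the split between features and target is usually made explicit before any model is fitted. The sketch below shows that split in plain Python. The records and the column names are hypothetical, and the names `X` and `y` follow a common convention rather than any requirement.

```python
# Hypothetical records; the column names are illustrative only.
records = [
    {"height_cm": 170, "weight_kg": 65, "salary": 3200},
    {"height_cm": 182, "weight_kg": 80, "salary": 4100},
    {"height_cm": 158, "weight_kg": 54, "salary": 2900},
]

FEATURE_NAMES = ["height_cm", "weight_kg"]  # features: the independent variables
TARGET_NAME = "salary"                      # target: the dependent variable

# X holds one row of feature values per record; y holds the matching target values.
X = [[row[name] for name in FEATURE_NAMES] for row in records]
y = [row[TARGET_NAME] for row in records]
```

The same data described in regression terms would call the columns of `X` predictors and `y` the outcome.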
To conclude this, below is a summary of the different variables discussed and whether they are independent, dependent, or neither.
You can see how confusing this can be. Even though variables are mostly independent or dependent, there is a class of variables that do not fall into either category. However, for most purposes, the first two columns cover the majority of needs in simple research.
The confusion over variables is mainly due to inconsistent terms across disciplines. There is nothing right or wrong about the different terms. They all developed in different places to address the same common problem. However, for students or those new to research, the terms can be confusing, and this post hopefully helps to clarify them.