Tag Archives: research process

Data Science Research Questions

Developing research questions is an absolute necessity in completing any research project. The questions you ask help to shape the type of analysis that you need to conduct.

The types of questions you ask in the context of analytics and data science are similar to those found in traditional quantitative research. Yet data science, like any other field, has its own distinct traits.

In this post, we will look at six different types of questions that are frequently used in data science. The six types are:

  1. Descriptive
  2. Exploratory
  3. Inferential
  4. Predictive
  5. Causal
  6. Mechanistic

Understanding the types of questions that can be asked will help anyone involved in data science to determine exactly what it is that they want to know.


A descriptive question seeks to describe a characteristic of the dataset. For example, if I collect the GPA of 100 university students, I may want to know what the average GPA of the students is. Seeking the average is one example of a descriptive question.

With descriptive questions, there is no need for a hypothesis as you are not trying to infer, establish a relationship, or generalize to a broader context. You simply want to know a trait of the dataset.
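As a minimal sketch of a descriptive question, the snippet below summarizes a small list of GPAs. The values and the helper name `describe` are made up for illustration; they are not data from an actual study.

```python
from statistics import mean, median, stdev

# Hypothetical GPAs for ten students (made-up values for illustration only).
gpas = [3.2, 2.8, 3.9, 3.5, 2.4, 3.0, 3.7, 3.1, 2.9, 3.5]


def describe(values):
    """Return a few descriptive statistics for a list of numbers."""
    return {
        "n": len(values),
        "mean": round(mean(values), 2),
        "median": round(median(values), 2),
        "stdev": round(stdev(values), 2),  # sample standard deviation
    }


summary = describe(gpas)
print(summary)
```

Nothing here is tested or generalized. The output only describes the ten values that were collected.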


Exploratory questions seek to identify things that may be “interesting” in the dataset. Examples of things that may be interesting include trends, patterns, and relationships among variables.

Exploratory questions generate hypotheses. This means that they lead to something that may be more formally questioned and tested. For example, if you have GPA and hours of sleep for university students, you may explore the possibility that there is a relationship between these two variables.
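One way to explore the GPA and sleep example is to compute a correlation coefficient. The sketch below uses a hand-written Pearson correlation so that it runs with the standard library alone. The data and the function name `pearson_r` are assumptions made for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cross / sqrt(ss_x * ss_y)


# Hypothetical exploratory sample (made-up values for illustration only).
sleep_hours = [5, 6, 6, 7, 7, 8, 8, 9]
gpa = [2.5, 2.9, 2.7, 3.1, 3.3, 3.4, 3.6, 3.5]

r = pearson_r(sleep_hours, gpa)
print(f"exploratory correlation between sleep and GPA: r = {r:.2f}")
```

A large value of `r` in an exploratory sample is a reason to form a hypothesis. It is not yet a tested conclusion.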


Inferential questions are an extension of exploratory questions. What this means is that the exploratory question is formally tested by developing an inferential question. Often, the difference between an exploratory and an inferential question is the following:

  1. Exploratory questions are usually developed first
  2. Exploratory questions generate inferential questions
  3. Inferential questions are often tested on a different dataset from exploratory questions

In our example, if we find a relationship between GPA and sleep in our dataset, we may test this relationship in a different, perhaps larger dataset. If the relationship holds, we can then generalize this to the population of the study.
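As a rough sketch of what formally testing the relationship on a second dataset could look like, the code below runs a permutation test on a separate confirmation sample. The permutation test is one possible choice among many, not a method prescribed by this post. The data and the names `pearson_r` and `permutation_p_value` are hypothetical.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cross / sqrt(ss_x * ss_y)


def permutation_p_value(xs, ys, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for the correlation between xs and ys.

    Shuffling ys breaks any real link with xs, so the shuffled correlations
    show what values arise by chance alone.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            at_least_as_extreme += 1
    # Add one to numerator and denominator so the p-value is never exactly 0.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical confirmation sample, separate from the exploratory data.
confirm_sleep = [5, 5, 6, 6, 7, 7, 7, 8, 8, 8, 9, 9]
confirm_gpa = [2.4, 2.7, 2.8, 3.0, 3.0, 3.2, 3.1, 3.4, 3.3, 3.6, 3.5, 3.8]

p_value = permutation_p_value(confirm_sleep, confirm_gpa)
print(f"permutation p-value on the confirmation sample: {p_value:.4f}")
```

A small p-value on the second dataset supports the hypothesis generated during exploration. Whether the result generalizes still depends on how the sample was drawn.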


Causal questions address whether a change in one variable directly affects another. In analytics, A/B testing is one form of data collection that can be used to answer causal questions. For example, we may develop two versions of a website and see which one generates more sales.

In this example, the type of website is the independent variable and sales is the dependent variable. By controlling the type of website people see we can see if this affects sales.
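The website example can be sketched as a two-proportion z-test, one common way to compare conversion rates in an A/B test. The visitor and sales counts below are invented, and the function names are hypothetical. The sketch also assumes visitors were randomly assigned to the two versions, which is what allows a causal reading of the difference.

```python
from math import erf, sqrt


def normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1 + erf(x / sqrt(2)))


def two_proportion_z_test(sales_a, visitors_a, sales_b, visitors_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    rate_a = sales_a / visitors_a
    rate_b = sales_b / visitors_b
    # Pooled rate under the null hypothesis that both versions convert equally.
    pooled = (sales_a + sales_b) / (visitors_a + visitors_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    if std_err == 0:
        raise ValueError("cannot test when the pooled rate is 0 or 1")
    z = (rate_b - rate_a) / std_err
    p_value = 2 * (1 - normal_cdf(abs(z)))
    return z, p_value


# Hypothetical A/B results (made-up counts for illustration only).
z, p = two_proportion_z_test(sales_a=120, visitors_a=2400,
                             sales_b=156, visitors_b=2400)
print(f"z = {z:.2f}, p = {p:.4f}")
```

Here the website version is the independent variable and whether a visitor buys is the dependent variable, matching the description above.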


Mechanistic questions deal with how one variable affects another. This is different from causal questions, which focus on whether one variable affects another. Continuing with the website example, we may take a closer look at the two different websites and see what it was about them that made one more successful in generating sales. It may be that one had more banners than the other or fewer pictures. Perhaps there were different products offered on the home page.

All of these different features, of course, require data that helps to explain what is happening. This leads to an important point: the questions that can be asked are limited by the available data. You can’t answer a question if you do not have the data needed to answer it.


Answering questions is essentially what research is about. In order to do this, you have to know what your questions are. This information will help you to decide on the analysis you wish to conduct. Familiarity with the types of research questions that are common in data science can help you to approach and complete an analysis much faster than when this is unclear.


Developing a Data Analysis Plan

It is extremely common for beginners and perhaps even experienced researchers to lose track of what they are trying to achieve when completing a research project. The open nature of research allows for a multitude of equally acceptable ways to complete a project. This can lead to an inability to make decisions or stay on course when doing research.

One way to reduce or even eliminate this roadblock to decision making and focus in research is to develop a plan. In this post, we will look at one version of a data analysis plan.

Data Analysis Plan

A data analysis plan includes many features of a research project in it with a particular emphasis on mapping out how research questions will be answered and what is necessary to answer the question. Below is a sample template of the analysis plan.


The majority of this diagram should be familiar to anyone who has done research. At the top, you state the problem, which is the overall focus of the paper. Next comes the purpose, which is the over-arching goal of a research project.

After the purpose come the research questions. The research questions are questions about the problem that are answerable. People struggle with developing clear and answerable research questions. It is critical that research questions are written in a way that they can be answered and that the questions are clearly derived from the problem. Poor questions mean poor answers or even no answers.

After the research questions, it is important to know what variables are available for the entire study and specifically what variables can be used to answer each research question. Lastly, you must indicate what analysis or visual you will develop in order to answer your research questions about your problem. This requires you to know how you will answer your research questions.
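The parts of the plan described above can also be written down as a small data structure. This is only a sketch: the class names, variable names, and the analyses listed for each question are hypothetical and are not taken from the template. The `missing_variables` check captures the idea that every question must be answerable with the variables that are actually available.

```python
from dataclasses import dataclass, field


@dataclass
class ResearchQuestion:
    text: str
    variables: list   # variables needed to answer this question
    analysis: str     # planned analysis or visual


@dataclass
class AnalysisPlan:
    problem: str
    purpose: str
    available_variables: list
    questions: list = field(default_factory=list)

    def missing_variables(self):
        """Map each question to the needed variables that are not available."""
        available = set(self.available_variables)
        gaps = {}
        for question in self.questions:
            missing = [v for v in question.variables if v not in available]
            if missing:
                gaps[question.text] = missing
        return gaps


# Hypothetical plan; the analyses named here are illustrative assumptions.
plan = AnalysisPlan(
    problem="Students' views of cafeteria food are unknown",
    purpose="Understand perceptions of food quality and satisfaction",
    available_variables=["sex", "class_level", "major",
                         "food_quality", "satisfaction"],
    questions=[
        ResearchQuestion("What is the demographic profile?",
                         ["sex", "class_level", "major"], "percentages"),
        ResearchQuestion("Is food quality related to satisfaction?",
                         ["food_quality", "satisfaction"], "correlation"),
    ],
)

print(plan.missing_variables())  # an empty dict means every question is covered
```

If a question lists a variable that was never collected, it shows up in the result, which is a cue to either collect the data or rewrite the question.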


Below is an example of a completed analysis plan for a simple undergraduate-level research paper.


In the example above, the student wants to understand the perceptions of university students about the cafeteria food quality and their satisfaction with the university. There were four research questions: a demographic descriptive question, a descriptive question about the two main variables, a comparison question, and lastly a relationship question.

The variables available for answering the questions are listed off to the left side. Under that, the student indicates the variables needed to answer each question. For example, the demographic variables of sex, class level, and major are needed to answer the question about the demographic profile.

The last section is the analysis. For the demographic profile, the student found the percentage of the population in each subgroup of the demographic variables.
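The demographic profile analysis can be sketched in a few lines. The responses below are invented for illustration, and `percentage_breakdown` is a hypothetical helper name.

```python
from collections import Counter


def percentage_breakdown(values):
    """Return the percentage of responses falling in each category."""
    if not values:
        raise ValueError("need at least one response")
    counts = Counter(values)
    total = len(values)
    return {category: round(100 * count / total, 1)
            for category, count in counts.items()}


# Hypothetical class-level responses from ten students (made-up data).
class_levels = (["freshman"] * 4 + ["sophomore"] * 3
                + ["junior"] * 2 + ["senior"] * 1)

profile = percentage_breakdown(class_levels)
print(profile)
```

The same function could be applied to each demographic variable in turn, such as sex and major.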


A data analysis plan provides an excellent way to determine what needs to be done to complete a study. It also helps researchers to clearly understand what they are trying to do and provides a visual for those with whom the researcher wants to communicate about the progress of a study.

Research Process

The research process or scientific method is the default mode for systematically gathering information for the purpose of answering questions and solving problems. This process serves the purpose of defining the goals of research, making predictions, gathering data, and interpreting results.

In general, there are six steps to the research process as listed below.

  1. Identify the research problem
  2. Review the literature
  3. Specify the purpose of the research or develop research questions
  4. Collect data
  5. Analyze and interpret data
  6. Report and evaluate results

Identify the Problem

The problem can come from personal observation, readings, other people, or any of a host of other sources. Finding a problem also helps in focusing your study. When identifying a problem, it is important to justify why it should be investigated and why it matters. People need to know why they should care about what you are studying. This has to do with relevancy.

Reviewing the Literature

Reviewing the literature is about knowing what has been done before your study so that you can see how you can build on existing knowledge. Most research tends to add to an existing conversation rather than start a new one. Looking at the literature also helps you to see your contribution to the existing body of knowledge. This is one way in which you can find the “gap” in the knowledge that your study will address.

Purpose of Research or Research Questions

The research purpose is the overall objective of the study. It is a restatement of the research problem. Another term for this is the research questions. The research questions are the questions you are asking about the problem. Many times, you do not solve a problem; instead, you ask questions about a problem. The answers to these questions may or may not help to solve the problem. Many people confuse the research purpose with the research questions when they are one and the same. Your goal at this step is to break apart the aspects of the problem into answerable questions. The answer to each question may contribute to solving the research problem.

Collecting Data

This is where the research design begins. Data collection is influenced by the research questions. What you want to know influences what data you will collect. Data collection includes sampling, methods, procedures, and more.

Analysis and Interpretation

Once data is collected, it is analyzed. The method of analysis is also influenced by the nature of the research questions. Interpretation is where you answer the research questions. For example, you either found a relationship between variables or you didn’t. These answers to your research questions can be used to solve the research problem.

Reporting and Evaluating Research

At this step, the information is compiled in a way that lets you communicate with your audience. The format of communication depends on who you are writing for. From journal articles to science fair projects, all researchers must know the expected format for communication.

Evaluation is the experience of having your work judged by others based on a certain standard. These standards are not agreed upon. This lack of agreement is another reason to know who you are writing for so you can communicate in a way that is acceptable to them.


The research process serves the purpose of finding answers to questions about problems. A researcher needs to follow the six steps of the research process in order to communicate their findings in a way that is appropriate to their audience.