Author Archives: Dr. Darrin

Completing the Square of Binomial Expression VIDEO

Understanding Variables

In research, many terms share the same underlying meaning, which can confuse researchers as they try to complete a project. People come from different backgrounds and learn different terms during their studies, so when they try to work with others there is often confusion over what is what.

In this post, we will try to clarify as much as possible the various terms that are used when referring to variables. We will look at the following during this discussion:

  • Definition of a variable
  • Minimum knowledge of the characteristics of a variable in research
  • Various synonyms of variable

Definition

The word variable has the root “vary” and the suffix “able”. This literally means that a variable is something that is able to change. Examples include such concepts as height, weight, salary, etc. All of these concepts change as you gather data from different people. Statistics is primarily about trying to explain and/or understand the variability of variables.

However, to make things more confusing, there are times in research when a variable does not change or remains constant. This will be explained in greater detail in a moment.

Minimum You Need to Know

Two broad concepts that you need to understand regardless of the specific variable terms you encounter are the following:

  • Whether the variable(s) are independent or dependent
  • Whether the variable(s) are categorical or continuous

When we speak of independent and dependent variables we are looking at the relationship(s) between variables. Dependent variables are explained by independent variables. Therefore, one dimension of variables is understanding how they relate to each other and the most basic way to see this is independent vs dependent.

The second dimension to consider when thinking about variables is how they are measured, which is captured with the terms categorical and continuous. A categorical variable has a finite number of values that can be used. Examples include gender, eye color, or cellphone brand. In such data a person is recorded as male or female, as having blue or brown eyes, and as owning one brand of cellphone.

Continuous variables are variables that can take on an infinite number of values. Salary, temperature, etc. are all continuous in nature. It is possible to reduce a continuous variable to a categorical variable by creating intervals in which to place values. This is commonly done when creating bins for histograms. In sum, there are four possible general variable types:

  1. Independent categorical
  2. Independent continuous
  3. Dependent categorical
  4. Dependent continuous

Naturally, most models have one dependent variable, either categorical or continuous. However, you can have any combination of continuous and categorical variables as independents. Remember that all variables have the above characteristics regardless of whatever term is used for them.
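
The binning idea mentioned above, turning a continuous variable into a categorical one by placing values into intervals, can be sketched in a few lines of JavaScript. This is only an illustration: the function name binValue, the salary cut-offs, and the labels are all made up for the example.

```javascript
// Hypothetical helper: place a continuous value into one of several labeled intervals.
// `edges` are ascending upper bounds; values above the last edge get the final label,
// so `labels` must have one more entry than `edges`.
function binValue(value, edges, labels) {
  for (let i = 0; i < edges.length; i++) {
    if (value <= edges[i]) {
      return labels[i];
    }
  }
  return labels[labels.length - 1];
}

// Illustrative salary cut-offs (invented for this example).
const edges = [30000, 60000];
const labels = ["low", "middle", "high"];

const salaries = [18000, 45000, 60000, 120000];
const binned = salaries.map((s) => binValue(s, edges, labels));
console.log(binned); // [ 'low', 'middle', 'middle', 'high' ]
```

After binning, salary has only three possible values and can be treated like any other categorical variable.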

Variable Synonyms

Below is a list of various names that variables go by in different disciplines. This is by no means an exhaustive list.

Experimental variable

A variable whose values are independent of any changes in the values of other variables. In other words, an experimental variable is just another term for independent variable.

Manipulated Variable

A variable that is independent in an experiment but whose value/behavior the researcher is able to control or manipulate. This is also another term for an independent variable.

Control Variable

A variable whose value does not change. Controlling a variable helps to explain the relationship between the independent and dependent variable in an experiment by making sure the control variable has no influence in the model.

Responding Variable

The dependent variable in an experiment. It responds to the experimental variable.

Intervening Variable

This is a hypothetical variable. It is used to explain the causal links between variables. Since intervening variables are hypothetical, they are not observed in an actual experiment. For example, suppose you are looking at the relationship between income and life expectancy and find a strong positive relationship. The intervening variable for this may be access to healthcare. People who make more money have more access to healthcare, and this contributes to them often living longer.

Mediating Variable

This is the same thing as an intervening variable. The difference is that the mediating variable is not always hypothetical in nature and is often measured itself.

Confounding Variable

A confounder is a variable that influences both the independent and dependent variable, causing a spurious or false association. Often a confounding variable is a causal idea and cannot be described in terms of correlations or associations with other variables. In this respect it resembles an intervening variable, although a confounder acts on both variables rather than sitting on the path between them.

Explanatory Variable

This variable is the same as an independent variable. The difference is that an independent variable is not influenced by any other variables. When that independence is not certain, the variable is called an explanatory variable.

Predictor Variable

A predictor variable is an independent variable. This term is commonly used for regression analysis.

Outcome Variable

An outcome variable is a dependent variable in the context of regression analysis.

Observed Variable

This is a variable that is measured directly. An example would be gender or height. No psychological construct is needed to infer the meaning of such variables.

Unobserved Variable

Unobserved variables are constructs that cannot be measured directly. In such situations, observed variables are used to try to determine the characteristics of the unobserved variable. For example, it is hard to measure addiction directly. Instead, other things will be measured to infer addiction, such as health, drug use, performance, etc. The measures of these observed variables will indicate the level of the unobserved variable of addiction.

Features

A feature is an independent variable in the context of machine learning and data science.

Target Variable

A target variable is the dependent variable in the context of machine learning and data science.

To conclude this, below is a summary of the different variables discussed and whether they are independent, dependent, or neither.

Independent    Dependent    Neither
Experimental   Responding   Control
Manipulated    Target       Explanatory
Predictor      Outcome      Intervening
Feature                     Mediating
                            Observed
                            Unobserved
                            Confounding

You can see how confusing this can be. Even though variables are mostly independent or dependent, there is a class of variables that does not fall into either category. However, for most purposes, the first two columns cover the majority of needs in simple research.

Conclusion

The confusion over variables is mainly due to an inconsistency in terms across disciplines. There is nothing right or wrong about the different terms. They all developed in different places to address the same common problem. However, for students or those new to research, this can be confusing, and this post hopefully helps to clarify this.

T-SNE Visualization and R

It is common in research to want to visualize data in order to search for patterns. When the number of features increases, this can often become even more important. Common tools for visualizing numerous features include principal component analysis and linear discriminant analysis. Not only do these tools work for visualization, they can also be beneficial for dimension reduction.

However, the available tools are not limited to these two options. Another option for achieving either of these goals is t-Distributed Stochastic Neighbor Embedding (t-sne). This relatively young algorithm (2008) is the focus of the post. We will explain what it is and provide an example using a simple dataset from the Ecdat package in R.

t-sne Defined

t-sne is a nonlinear dimension reduction and visualization tool. Essentially, what it does is identify observed clusters. However, it is not a clustering algorithm because it reduces the dimensions (normally to 2) for visualizing. This means that the input features are no longer present in their original form, and this limits the ability to make inferences. Therefore, t-sne is often used for exploratory purposes.

The non-linear character of t-sne is what often makes it appear superior to PCA, which is linear. Without getting too technical, t-sne takes a global and a local approach to mapping points simultaneously, while PCA can only use a global approach.

The downside to the t-sne approach is that it requires a large amount of calculation. The calculations are often pairwise comparisons, and the number of pairs grows quadratically with the number of observations, which becomes expensive in large datasets.
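
To get a feel for how quickly pairwise comparisons pile up, here is a small JavaScript sketch. The number of distinct pairs among n observations is n(n-1)/2; the function name pairCount is made up for the example, and it counts pairs only, it does not run t-sne.

```javascript
// Number of distinct pairs among n observations: n * (n - 1) / 2.
function pairCount(n) {
  return (n * (n - 1)) / 2;
}

// 601 is the number of rows reported in the Rtsne output later in this post;
// the larger sizes are hypothetical datasets ten and a hundred times as big.
for (const n of [601, 6010, 60100]) {
  console.log(n + " observations -> " + pairCount(n) + " pairs");
}
```

Doubling the number of observations roughly quadruples the number of pairs, which is why large datasets become slow.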

Initial Packages

We will use the “Rtsne” package for the analysis, and we will use the “Fair” dataset from the “Ecdat” package. The “Fair” dataset is survey data on extramarital affairs. We want to see if we can find patterns among the respondents based on their occupation. Below is some initial code.

library(Rtsne)
library(Ecdat)

Dataset Preparation

To prepare the data, we first remove any rows with missing data using the “na.omit” function. This is saved in a new object called “train”. Next, we change our outcome variable into a factor variable. The categories range from 1 to 9:

  1. Farm laborer, day laborer,
  2. Unskilled worker, service worker,
  3. Machine operator, semiskilled worker,
  4. Skilled manual worker, craftsman, police,
  5. Clerical/sales, small farm owner,
  6. Technician, semiprofessional, supervisor,
  7. Small business owner, farm owner, teacher,
  8. Mid-level manager or professional,
  9. Senior manager or professional.

Below is the code.

train<-na.omit(Fair)
train$occupation<-as.factor(train$occupation)

Visualization Preparation

Before we do the analysis we need to set the colors for the different categories. This is done with the code below.

colors<-rainbow(length(unique(train$occupation)))
names(colors)<-unique(train$occupation)

We can now do our analysis. We will use the “Rtsne” function. When you input the dataset, you must exclude the dependent variable as well as any other factor variables. You also set the dimensions and the perplexity. Perplexity roughly determines how many neighbors are used to determine the location of each datapoint. Verbose just provides information during the calculation, which is useful if you want to know what progress is being made. max_iter is the number of iterations used to complete the analysis, and check_duplicates checks for duplicate rows, which could be a problem in the analysis. Below is the code.

tsne<-Rtsne(train[,-c(1,4,7)],dims=2,perplexity=30,verbose=T,max_iter=1500,check_duplicates=F)
## Performing PCA
## Read the 601 x 6 data matrix successfully!
## OpenMP is working. 1 threads.
## Using no_dims = 2, perplexity = 30.000000, and theta = 0.500000
## Computing input similarities...
## Building tree...
## Done in 0.05 seconds (sparsity = 0.190597)!
## Learning embedding...
## Iteration 1450: error is 0.280471 (50 iterations in 0.07 seconds)
## Iteration 1500: error is 0.279962 (50 iterations in 0.07 seconds)
## Fitting performed in 2.21 seconds.

Below is the code for making the visual.

plot(tsne$Y,t='n',main='tsne',xlim=c(-30,30),ylim=c(-30,30))
# index colors by name so each label gets the same color as its legend entry
text(tsne$Y,labels=train$occupation,col = colors[as.character(train$occupation)])
legend(25,5,legend=unique(train$occupation),col = colors,pch=c(1))

[Figure: t-sne plot with each point labeled and colored by occupation]

You can see that there are clusters; however, the clusters are all mixed with the different occupations. What this indicates is that the features we used to make the two dimensions do not discriminate between the different occupations.

Conclusion

T-SNE is an improved way to visualize data. This is not to say that there is no place for PCA anymore. Rather, this newer approach provides a different way of quickly visualizing complex data without the limitations of PCA.

Force-Directed Graph with D3.js

Network visualizations involve displaying interconnected nodes commonly associated with social networks. D3.js has powerful capabilities to create these visualizations. In this post, we will learn how to make a simple force-directed graph.

A force-directed graph uses an algorithm that spaces the nodes in the graph away from each other based on a value you set. There are several different ways to determine how the force influences the distance of the nodes from each other, some of which will be explored in this post.

The Data

To make the visualization, it is necessary to have data. We will use a simple JSON file that has nodes and edges. Below is the file.

{
  "nodes": [
    { "name": "Tom" },
    { "name": "Sue" },
    { "name": "Jina" },
    { "name": "Soli" },
    { "name": "Lala" }
  ],
  "edges": [
    { "source": 0, "target": 1 },
    { "source": 0, "target": 4 },
    { "source": 0, "target": 3 },
    { "source": 0, "target": 4 },
    { "source": 0, "target": 2 },
    { "source": 1, "target": 2 }
  ]
}

The nodes in this situation will represent the circles that we will make. In this case, the nodes have names. However, we will not print the names in the visualization for the sake of simplicity. The edges represent the lines that will connect the circles/nodes. The source number is the index of the node where the line starts, and the target number is the index of the node where it ends. For example, “source”: 0 represents Tom and “target”: 1 represents Sue, so this edge means draw a line from Tom to Sue.
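
To make the index lookup concrete, the short JavaScript sketch below resolves each edge's source and target numbers to node names. It uses the same data as the file above; the function name describeEdges is made up for the example and is not part of D3.js.

```javascript
// The same nodes and edges as the JSON file above (the repeated 0 -> 4 edge is kept as-is).
const graph = {
  nodes: [
    { name: "Tom" }, { name: "Sue" }, { name: "Jina" }, { name: "Soli" }, { name: "Lala" }
  ],
  edges: [
    { source: 0, target: 1 },
    { source: 0, target: 4 },
    { source: 0, target: 3 },
    { source: 0, target: 4 },
    { source: 0, target: 2 },
    { source: 1, target: 2 }
  ]
};

// Look up each edge's endpoints by index and describe the line it draws.
function describeEdges(g) {
  return g.edges.map((e) => g.nodes[e.source].name + " -> " + g.nodes[e.target].name);
}

console.log(describeEdges(graph)); // first entry is "Tom -> Sue"
```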

Setup

To begin the visualization, we have to create the svg element inside our HTML doc. Lines 6-17 do this as shown below.

[Image: code listing, lines 6-17]

Next, we need to create the layout of the graph. The .nodes() and .links() functions affect the location of the nodes and links. The .size() function affects the gravitational center and initial position of the visualization. There is also some code that is commented out below that will be discussed later. Below are lines 18-25 of our code.

[Image: code listing, lines 18-25]

Now we can write the code that will render or draw our object. We need to append the edges and nodes and indicate the color for both, as well as the radius of the circles of the nodes. All of this is captured in lines 26-44.

[Image: code listing, lines 26-44]

The final step is to handle the ticks. To put it simply, the tick handler recalculates the position of the nodes. Below is the code for this.

[Image: code listing for the tick handler]

We can finally see our visual as shown below.

You can clearly see that the nodes are on top of each other. This is because we need to adjust the use of the force in the force-directed graph. There are many ways to adjust this, but we will look at two functions. These are .linkDistance() and .charge().

The .linkDistance() function indicates how far linked nodes should be from each other at the end of the simulation. To add this to our code, you need to remove the comments on line 22 as shown below.

[Image: code listing with line 22 uncommented]

Below is an update of what our visualization looks like.

Things are better, but the nodes are still on top of each other. The only real difference is that the edges are longer. To fix this, we need to use the .charge() function. The .charge() function indicates how much nodes attract or repel each other. To use this function, you need to remove the comments on line 23 as shown below.

[Image: code listing with line 23 uncommented]

The negative charge will cause the nodes to push away from each other. Below is what this looks like.

You can see that as the nodes were moved around they stayed far from each other. This is because of the negative charge. Of course, there are other ways to modify this visualization, but this is enough for now.
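
To see why a negative charge spreads nodes out, here is a toy one-dimensional sketch in plain JavaScript. It is not D3's algorithm. The function name stepApart, the charge-divided-by-distance force, and the step size are all assumptions chosen only to illustrate the sign of the charge.

```javascript
// Toy model: two nodes on a line act on each other with strength charge / distance.
// A positive charge pulls them together, a negative charge pushes them apart.
// Assumes the two nodes are not at exactly the same position.
function stepApart(xA, xB, charge, stepSize) {
  const distance = Math.abs(xB - xA);
  const direction = xB >= xA ? 1 : -1;            // points from A toward B
  const pull = (charge / distance) * stepSize;    // > 0 attracts, < 0 repels
  return [xA + direction * pull, xB - direction * pull];
}

// Start with two nodes one unit apart and apply a negative charge five times.
let positions = [0, 1];
for (let i = 0; i < 5; i++) {
  positions = stepApart(positions[0], positions[1], -100, 0.01);
}
console.log(positions); // the gap between the two nodes has widened
```

With a small positive charge the same function moves the two nodes closer together instead.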

Conclusion

Force-directed graphs are a powerful tool for conveying what could be a large amount of information. This post provided some simple ways that this visualization can be developed and utilized for practical purposes.

Essentialist Teacher

Essentialism was an educational philosophy that reacted to the superficiality of instruction associated with progressivism and the aristocratic air linked with perennialism. Essentialism was a call to teach the basics. This position of providing a no-frills basic education for employment is the primary position of most educational systems in the world.

Background

Emerging in the 1930s, essentialism is based on the philosophies of idealism and realism. Essentialism’s supporters have stressed the need to return to a subject-centered approach rather than a child-centered one. Transmission of knowledge is more important than transforming society.

There were two major moments in American history that propelled essentialism to the forefront of education. The first happened in 1957, when the Russians launched the Sputnik satellite. Critics of progressivism stated that all this child-centered teaching had crippled an entire generation, who lacked the basic skills in math and science to compete with the Russians. This was a major blow to progressivism as schools refocused on teaching math and science and having a subject-centered curriculum.

If essentialism was not already triumphant, it certainly was by the 1980s, when the report “A Nation at Risk” was published. This report stated that American education was mediocre and led to schools needing to focus on the five basics. By the 1990s, such ideas as “core knowledge” or “common core” were being pushed. Such ideas demonstrate the belief that there are basic truths and ideas that supposedly all students need to have.

Philosophy

School is a place where students master basic skills in preparation for working in society. This includes the three R’s (reading, writing, arithmetic) and some of the humanities. The subject matter cannot always be interesting or even immediately relevant for students.

The mind needs to be trained and some memorization is required. However, there is less of a focus on raw intellectualism such as is found in perennialism. The center of learning is the teacher, and the students are there to follow the teacher.

Essentialism has similarities to perennialism. However, there are differences, such as the fact that essentialism does not have a problem with adapting ideas from progressivism for its own purposes. There is also a general indifference to the time-honored classics in the humanities for the training of the mind.

In Education

An essentialist teacher is going to focus on developing skills and competency rather than on learning knowledge for the sake of knowledge. There will be a focus on the basics of education, and the classroom will be subject centered. There will not be much tolerance for meeting individual needs or understanding differences among students.

Focus on job skills and training towards employment would also be stressed. The focus of the education is in training people to be equipped for the workplace and not for personal fulfillment. If students enjoy what they learn this is an added bonus but not necessarily critical for the learning experience.

Conclusion

Essentialism was in many ways a working-class version of perennialism. Stripped of the humanities and focused on developing job skills, essentialism is the engine of education in many parts of America. As long as the economy and employment are most important to people, we will continue to see support for essentialism.

Perennialist Teacher

Perennialism was a strong educational movement in the early part of the 20th century. It called for a return to older ways of learning and instruction in order to strengthen the individual in preparation for life. In this post, we will look briefly at the history, philosophy, and how a teacher with a perennialist perspective may approach their classroom.

Background

Perennialism came about as a strong reaction against progressivism. The emotional focus of the child-centered approach of progressivism was seen as anti-intellectual by perennialists. In place of a child-centered focus was a call for a return to long-established truths and time-honored classics.

Supporters of perennialism wanted a liberal education, which implies an education rich with the classical works of man. The purpose of education was the development of the mind rather than the learning of a specific job skill. This position has often been seen as elitist and has clashed with the working class’s need for a more practical education for their children.

A major influence on perennialism is neo-scholasticism, which is also a supporter of classical studies and was based on idealism. Perennialism was originally focused on higher education and high school, but by the 1980s its influence had spread to elementary education. Prominent supporters of this style include Mortimer Adler and Robert Maynard Hutchins.

Philosophical Position

Perennialism believes that people are rational rather than primarily emotional beings. This is the opposite of progressivism which is always worried about feelings. Furthermore, human nature is steady and predictable which allows for everyone to have the same education. Thus, the individual is lost in a strong perennial classroom.

The focus of the classroom is not on the student but rather on the subject matter. The classroom is preparation for life and is not designed around real-life situations as in progressivism. The mind needs to be developed properly before taking action. It is assumed that studying the greats will help the student become great.

Perennialism and Education

A perennialist teacher would have a classroom in which all the students are treated the same way. Material is taught and delivered to the students whether they like it or not. This is because material is taught that is good for them rather than what they like.

This material would include ancient, time-tested ideas because that is where truth is, and exposure to these great minds would make great minds. The learning experiences would be mostly theoretical in nature because training in this manner allows for intellectual development.

The classroom might actually be a little cold by the progressivist’s standard, which focuses on group work and interaction. This is because of the rational focus of perennialism. The assumption is that everyone is rational and only needs exposure to the content, with or without an emotional experience.

Conclusion

Reacting is not always the best way to push for change. Yet this is exactly what brought perennialism into existence. Seeing the loss of absolute truth and long-held traditions, perennialism strove to protect these pillars of education. There are some problems. For example, its emphasis on the rational nature of man seems strange, as the average person is lacking in the ability to reason and control their emotions. In addition, the one-size-fits-all approach to education is obviously flawed, as we need people who have a classical education but also people who can build a house or fix a car. In other words, we need vocational training as well in order to have a balanced society.

Another problem is the fallacy of the appeal to tradition. Just because something is a classic does not make it true or worthy of study. This simply allows the traditions of the past to rule the present. If all people do is look at the past, how will they develop relevant ideas for the present or future?

The main benefit of these different schools of thought is that through these conflicts of opinion a balanced approach to learning can take place for students.

Pie Charts with D3.js

Pie charts are one of many visualizations that you can create using D3.js. We are going to learn how to do the following in this post:

  • Make a circle
  • Make a donut
  • Make a pie wedge
  • Make a segment
  • Make a pie chart

Most of these examples require just a minor adjustment in a standard piece of code. This will make more sense in a minute.

Make a Circle

Making a circle involves using the .arc() method. In this method there are four parameters that you manipulate to get different shapes. They are explained below.

  • .innerRadius() makes a hole in your circle to give the appearance of a donut
  • .outerRadius() determines the size of your circle
  • .startAngle() is used in combination with .endAngle() to make a pie wedge

Therefore, to make several different shapes we manipulate these different parameters. In the code below, we create the svg element first (lines 7-10), then our circle (lines 11-15). Lastly, we append the path to the svg element (lines 16-21). Below is the code and picture.

[Image: code listing and the resulting circle]

The rest of the examples primarily deal with manipulating the existing code.

Donut

To make the donut, you need to change the value inside the .innerRadius() parameter. The larger the value, the bigger the hole in the middle of the circle will become. In order to generate the donut below, you need to change the value found in line 12 of the code to 100.

[Image: the resulting donut]

Pie Wedge

To make a wedge, you need to replace lines 14-15 of the code with the following.

.startAngle(0*Math.PI * 2/360)
.endAngle(90*Math.PI * 2/360);

This is telling d3.js to start the angle at 0 degrees and stop at 90 degrees. This is another way of saying make a wedge that is 1/4 of the circle. Doing this will create the following.
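
The expression degrees*Math.PI * 2/360 used above converts degrees into radians, which is the unit the angle methods expect. A small plain-JavaScript sketch of that arithmetic follows; the helper names toRadians and wedgeFraction are made up for illustration and are not part of D3.js.

```javascript
// Convert degrees to radians using the same arithmetic as the wedge code above.
function toRadians(degrees) {
  return degrees * Math.PI * 2 / 360;
}

// Fraction of the full circle covered by a wedge between two angles given in degrees.
function wedgeFraction(startDegrees, endDegrees) {
  return (toRadians(endDegrees) - toRadians(startDegrees)) / (2 * Math.PI);
}

console.log(toRadians(90));        // about 1.5708, a quarter turn
console.log(wedgeFraction(0, 90)); // about 0.25
```

A helper like this lets you write .startAngle(toRadians(0)) and .endAngle(toRadians(90)) and think in degrees throughout.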

[Image: the resulting pie wedge]

Segment

In order to make the segment, you keep the code the same as above for the pie wedge but add a change to the .innerRadius() parameter, as shown below.

.innerRadius(100)
.outerRadius(170)
.startAngle(0*Math.PI * 2/360)
.endAngle(90*Math.PI * 2/360);

[Image: the resulting segment]

Pie Chart

A pie chart is just a more complex version of what we have already done. You still need to set up your svg element. This is done in lines 7-14. Notice that we also had to add a g element and a transform attribute.

Line 16 contains the data for making the pie chart. This is hard coded, but we can also use other forms of data. Line 17 uses the .pie() method with the data to set the stage for our pie chart.
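
Conceptually, the pie step turns each data value into a start angle and an end angle so that every wedge is proportional to its value. The plain-JavaScript sketch below illustrates that idea only. It is not D3's implementation: it ignores details such as ordering that D3 handles, the function name pieAngles is an assumption, and the data values are invented because the post's own values are not reproduced here.

```javascript
// Sketch: turn a list of values into wedges whose angles are proportional to each value.
// Wedges are laid out in input order, starting at angle 0 and ending at 2 * PI.
function pieAngles(values) {
  const total = values.reduce((sum, v) => sum + v, 0);
  let start = 0;
  return values.map((value) => {
    const end = start + (value / total) * 2 * Math.PI;
    const wedge = { value: value, startAngle: start, endAngle: end };
    start = end; // the next wedge begins where this one stops
    return wedge;
  });
}

// Made-up data for illustration.
console.log(pieAngles([10, 20, 30, 40]));
```

Each object carries the startAngle and endAngle that an arc needs, which is why the arc code for a pie chart reads the angles from the data instead of hard coding them.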

Lines 19-27 are for generating the arc. This code is mostly the same except for the functions that are used for the .startAngle() and .endAngle() methods. Line 29 sets the color, and lines 30-42 draw the paths for creating the image. The code with the // will be explained in a moment. Below is the code and the pie chart.

[Image: code listing and the resulting pie chart]

Pie Chart Variation

Below is the same pie chart but with some different features.

  • It now has a donut (line 20 change .innerRadius(0) to .innerRadius(50))
  • There are now separations between the different segments to give the appearance that it is pulling apart. Remove the // in lines 36-37 to activate this.

Below is the pie chart

[Image: the pie chart variation]

You can perhaps now see that the possibilities are endless with this.

Conclusion

Pie charts and their related visualizations are another option for communicating insights from data.