Extracting & matching in R VIDEO

This video provides examples of how you can extract and match patterns using regular expressions in R. This is a great tool for manipulating strings as needed.

Understanding the Preface of a Textbook VIDEO

The preface of a textbook provides important information in terms of understanding what the book is about. In the video below, we will look at the different components of a preface and how they work together to help the reader.

Bokeh Display Customization VIDEO

The video below shows you how to modify the display of interactive graphs using Bokeh.

Essay on Liberation-Subverting Forces & Solidarity

This post will examine chapters three and four of Herbert Marcuse’s “Essay on Liberation.” This highly influential essay, written in the 1960s, lays out many of the left’s goals and desires regarding the reshaping of society.

Subverting Forces

Chapter 3 is mostly a rehash of complaints and solutions that Marcuse has already addressed in his essay. It begins with a litany of complaints, including the terrible jobs people have to work, the exploitation of minorities, increased violence, and the waste of resources. All of these complaints are blamed on capitalism. It needs to be noted that every system has flaws and even oppression within it, including the communist system that Marcuse supports.

Marcuse also mentions how technology can be used to end capitalism rather than support it. The challenge is that the technocrats are using technology to continue the existing system of oppression. Not only is this terrible but the current system must be abolished as reformation is not even an option for Marcuse. This is a sentiment that is shared by many leftists today regarding the destruction of the current system in order to set up a completely new one.

Marcuse also calls on universities to radicalize students by developing and/or awakening their true consciousness. A true consciousness is a mind that has awakened to its true socialist nature. It appears the universities have heeded Marcuse’s call as many of them are considered bastions of liberal left-wing thinking. Again, the problem isn’t that Marcuse believes these things but that he wants everyone else to believe them and thinks it’s okay to use the educational system for this. If we are really free we should be able to accept or reject this worldview that Marcuse so vehemently supports.

Marcuse repeats his desire to radicalize the ghetto (black) population as well. Again, the reason for radicalizing students and minorities is to replace the proletariat workers who are enjoying their middle-class lifestyle. Marcuse never mentions how the ghetto populations were to be radicalized but it would probably involve the use of former university students who have achieved their true consciousness and are educating and working among the ghetto populations and pointing out the oppression these people are facing. Paulo Freire may be one example of this as he worked exclusively among the poor and minority populations as a language teacher in Brazil pointing out oppression.

One shocking comment Marcuse makes about the black population of his time is that they are expendable. Now, expendable does not mean that blacks should be eliminated or that they have no value. Rather, Marcuse used the term “expendable” to mean that the majority of blacks are not contributing significantly to the current economic system. For Marcuse, this is an advantage because these oppressed individuals are potential recruits for the revolution.

Correlation is not causation but there was a surprising number of radical black groups that arose in the 1960s and 1970s. Examples include the Black Panthers and the Black Liberation Army. There are also a host of other left-leaning groups such as the Symbionese Liberation Army, Weather Underground, and Students for a Democratic Society. The example provided explains why Marcuse is often called the “father of the new left.”

Solidarity

The final chapter of Marcuse’s essay shares how the revolution was successful in both Cuba and Vietnam. With such recent success as this (Marcuse was writing in the 1960s), Marcuse is implying that such success can be experienced in the US. At the time it was unclear what to expect from the communist revolutions in Cuba and Vietnam. However, history shows us that these revolutions were not blessings to the citizens of either of these countries.

Marcuse then goes on to ponder what life after the revolution will look like. He essentially implies that it is unclear what life will truly be like after the communist revolution. This is a common criticism of communism in that the proponents want a different world but have no idea what to do if they take power. Given the track record of communist governments, it is better that communists pursue power rather than obtain it.

Conclusion

Marcuse had a strong vision for what he wanted to see happen in America. His desire was for the fall of capitalism and the rise of a socialist/communist utopia. In his essay, he lays out this dream of his. Unfortunately, the general outcome of communist revolutions is often negative and leads to huge loss of life as people’s freedoms are curtailed for the sake of the collective.

Bokeh Display Multiple Plots VIDEO

The video below shows you how to make multiple plots with Bokeh. This is a valuable tool when you are trying to develop multiple visualizations for comparison purposes.

Essay on Liberation-The New Sensibility

This post will look at the second chapter of Herbert Marcuse’s essay “Essay on Liberation.” The general gist of this influential essay is to bemoan capitalism and champion the benefits and superiority of socialism. The focus of this chapter in particular is mostly on the benefits and implementation of socialism.

The New Sensibility

A key word in this chapter is the word “sensibility.” From what I can determine it seems that the word “sensibility” in the title relates to worldview or perhaps world order. Therefore, in this chapter, Marcuse is attempting to explain the new worldview or values of individuals who have been liberated from capitalism.

Within this chapter, Marcuse talks about a world in which injustice and misery have been abolished and there is a controlled economy in place. By controlling the economy, people are free from the evils of capitalism. The evils of capitalism appear to be hard work and consumerism as these are concepts Marcuse seems to criticize and complain about.

Marcuse also tries to explain what a liberated consciousness is. A liberated consciousness is someone who has been awakened to the evils of capitalism and understands the natural state of man, which is a socialist being. The way Marcuse describes this is similar to Plato’s Cave Analogy of someone who realizes the way they see the world is a shadow of the actual reality with the chains representing capitalism. I cannot confirm this but Marcuse’s concept of the liberated consciousness may have inspired Freire’s critical consciousness which sounds similar and is focused on realizing the oppression that is found in the pedagogical process.

Marcuse goes on to share how praxis is key. An appropriate definition of praxis would be social action, which generally involves protesting and other forms of destabilizing the existing society. In other words, it is not enough to be awakened, as one must push for the manifestation of this awakening in the real world. Freire also speaks of social action and unrest in his work. Socialism is not content to exist along with other worldviews; it wants to overtake the world and bring about the utopia that has never existed in recorded human history.

Another aspect of this chapter was Marcuse’s exploration of how art shapes reality. Art can be used to influence and shape reality through the ability to express what is ideal. By reshaping reality through art, society can be changed for the better as well. Marcuse only briefly touches on this idea in this essay but he explores it in greater detail in his other works.

Conclusion

Marcuse lays out his claims for the need for socialism and how people would act if they were awakened to their true nature. The main failure of Marcuse’s argument is its theoretical nature. The reality of socialism and communism is a system that lacks the benefits and resources it claims to provide.

Essay on Liberation-Biological Foundation

Herbert Marcuse wrote a famous essay in the 1960s entitled “Essay on Liberation.” The writing is somewhat difficult and convoluted, which means interpretation can be challenging. However, the main thesis of Marcuse’s essay appears to be that the productivity of capitalism is inhibiting the rise of the socialist revolution. He develops this thesis by addressing how a man can take care of himself without being dependent on the capitalist system and by asserting there can be no freedom from labor in the current capitalist system.

In this post, we will attempt to provide a succinct summary of this essay. In particular, we will focus only on chapter one of this essay, entitled “Biological Foundation of Socialism.”

Biological Foundation for Socialism

The first part of Marcuse’s essay addresses the biological foundation for socialism. From what I can assess the term “biological” means the innate need or basis for socialism. In other words, Marcuse builds a case for socialism as a natural state of man in the first part of his essay.

Marcuse lays out two problems with capitalism, which are the increase in production and the exploitation of products. For Marcuse, capitalist societies overproduce but at the same time do not provide enough for the people trapped in this oppressive system. For people to be free they must break their dependence on this market system with its focus on consumption. However, Marcuse later goes on to prescribe a controlled market as the alternative which has its problems of efficiency as demonstrated by other communist states such as the Soviet Union.

Marcuse also shares that capitalism is transformative. By transformative Marcuse is probably referring to how capitalism changes the nature, character, and/or values of the individual. The accusation of the transformative nature of capitalism may also be why Marxists in general speak of transformation. However, when Marxists speak of transformation they believe it relates to awakening man to his true socialist nature rather than the capitalist lie. For Marcuse, the change of an individual brought about by capitalism causes exploitation as the individual buys into an oppressive system. Anyone familiar with the term “rat race” may have sympathy with Marcuse’s views.

Marcuse desires to free man from this exploitative system. This gives the impression that people should not have to do anything they don’t want to do. The problem is that many communist and socialist countries still have exploitive systems that force people to do things after the revolution. In other words, there is no system in which man is truly free. Everyone has to spend time doing things they do not want. The only difference is who is your master and what are the benefits of serving him.

Marcuse then goes on to explain why the Marxist revolution has not taken place. He claims that poverty doesn’t bring revolution, as Marx argued. With the success of capitalism, the proletariat was beginning to move into the middle class. The problem with the economic success of the middle class is that they hate the idea of revolution. This disdain for revolution is because of the middle class’s investment in the current system. In other words, capitalism blunts the desire for true freedom because it bribes individuals with economic gain.

Marcuse’s solution to the middle class’s stabilization was to focus on the radicalization of the super poor and blacks. In later parts of his essay, he adds students to this potential pool of revolutionaries. By shifting the focus away from the traditional proletariat, who are essentially sell-outs, to other oppressed groups, the revolution can continue.

The impact of this statement is felt today. Now, we have a plethora of groups who are crying out about the oppression of capitalism and other norms of society such as sexuality, health, race, etc. The idea of radicalizing various ethnic, sexual, and other minorities for the sake of revolution may have started with the ideas of Marcuse in the 1960s.

Conclusion

Marcuse lays out several key terms of his essay in this first chapter. Establishing this foundation is key as we will see how the rest of the essay is a variation of the ideas presented here.

Bokeh Tools and Tooltips VIDEO

In the video below, we will take a look at Bokeh tools and tooltips in Python. Tools and tooltips are great options for modifying your data visualization’s interactivity.

Creating Multiple Plots Using Bokeh in Python

In this post, we will look at how to make multiple plots at once using Bokeh in Python. This technique can be a powerful tool when you need to create visualizations rapidly for whatever purpose you may have.

Needed Initial Libraries & Data Preparation

Below are the initial libraries we need to begin this example and the data preparation.

from pydataset import data
from bokeh.plotting import figure
from bokeh.io import output_file, show
import pandas as pd

df=data("Duncan")

The first line loads the data() function from pydataset, which gives us access to the data we will use. Next, we load the figure() function from bokeh, which will allow us to create our plots. After this, we load the output_file() and show() functions, which will allow us to save and display our plots, and pandas, which we will use later for data preparation. Lastly, we create our object df, which holds the Duncan dataset with job type, prestige, income, and education as variables.

Multiple Scatterplots

Below is an example of displaying multiple scatterplots. The code and visualization are below followed by an explanation.

# SCATTER PLOT
from bokeh.layouts import column

wc = df.loc[df["type"] == "wc"]
prof = df.loc[df["type"] == "prof"]

fig_one = figure(x_axis_label="Education",y_axis_label="Prestige")
fig_two = figure(x_axis_label="Education",y_axis_label="Prestige")
fig_one.circle(x="education", y="prestige",source=wc,color="blue", legend_label="wc")
fig_two.circle(x="education", y="prestige",source=prof,color="red", legend_label="prof")

output_file(filename="column_plots.html")
show(column(fig_one, fig_two))

You can see the plots are stacked into a single column. The actual setup for this is simple.

We loaded the column() function which allows us to display visualizations in columns
We subsetted the data so that wc workers are in one object and prof are in the other object
We created two figures (fig_one, fig_two) for each of the other datasets. The figures are identical and both will contain education and prestige as the variables
We then added the data to both figures distinguishing the plots by having different colored dots
We created a name for the output
Inside the show() function we used the column() function to display the visualization in columns

Most of this code is a review. The only new thing is the use of the column() function within the show() function.

Multiple Bar Plots

In this example, we use bar plots and rows instead of columns. The code is followed by the visualization and the explanation.

# bar PLOT
from bokeh.layouts import row

income=pd.DataFrame(df.groupby('type')['income'].mean())
prestige=pd.DataFrame(df.groupby('type')['prestige'].mean())

types = ["prof", "wc", "bc"]
income_type = figure(x_axis_label="type", y_axis_label="income", 
                       x_range=types)
prestige_type = figure(x_axis_label="type", y_axis_label="prestige",
                   x_range=types)

# Add bar glyphs
income_type.vbar(x="type", top="income", source=income)
prestige_type.vbar(x="type", top="prestige", source=prestige)

# Generate HTML file and display the subplots
output_file(filename="my_first_column.html")
show(row(income_type, prestige_type))

Here is what we did

We loaded the row() function, which allows us to arrange plots in rows as you can see.
We calculated the group means for each job type
We created a list called types which included the three job types in our dataset
Next, we made our two figures. One for income and the other for prestige
After this, we added the data to the plots
Lastly, we created the output and showed the visualizations this time using rows

Gridplots

The code below takes a different approach using grid plots. This allows you to set columns and rows for your multiple plots. Below is the code, output, and explanation of this.

from bokeh.layouts import gridplot
from bokeh.models import ColumnDataSource
from bokeh.models import NumeralTickFormatter

plots = []
#df['type'] = df.type.astype('category')
# Complete for loop to create plots
for job_type in ["bc","wc"]:
    # Subset first so each plot only holds one job type
    subset = df.loc[df["type"] == job_type]
    source = ColumnDataSource(data=subset)
    fig = figure(x_axis_label="education", y_axis_label="income")
    fig.circle(x="education", y="income", source=source, legend_label=job_type)
    fig.yaxis[0].formatter = NumeralTickFormatter(format="$0a")
    plots.append(fig)

# Display plot
output_file(filename="gridplot.html")
show(gridplot(plots, ncols=2))

We began by loading gridplot(), ColumnDataSource, and NumeralTickFormatter. gridplot() makes the grid, ColumnDataSource creates a data structure native to Bokeh, and NumeralTickFormatter allows us to format the numbers on the axes.
We created an empty list called plots that we will use in our for loop
We used a for loop to generate a plot for each job type in the list. The plots graph education vs income. The NumeralTickFormatter allowed us to display dollar signs on the y-axis
We then displayed the plots in a grid with two columns
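
To make the role of the ncols argument more concrete, the sketch below mimics in plain Python how a flat list of plots can be split into grid rows. It is only an illustration of the layout idea and not Bokeh's own implementation. The helper name arrange_in_grid and the placeholder plot names are hypothetical.

```python
def arrange_in_grid(items, ncols):
    """Split a flat list into rows holding at most ncols items each."""
    return [items[i:i + ncols] for i in range(0, len(items), ncols)]


# Placeholder names standing in for figure objects (hypothetical)
plots = ["bc_plot", "wc_plot", "prof_plot"]

# With two columns, the third plot wraps onto a second row
grid = arrange_in_grid(plots, ncols=2)
print(grid)
```

With two plots and ncols=2, as in the example above, everything fits on a single row.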

Conclusion

This post provided an example of how to make multiple plots with Bokeh. These tools can be used in many different ways for your data purposes.

Bokeh Display Customization in Python

In this post, we will examine how to modify the default display of a plot in Bokeh, a library for interactive data visualizations in Python. Below are the initial libraries that we need.

from pydataset import data
from bokeh.plotting import figure
from bokeh.io import output_file, show

The first line of code is where our data comes from. We are using the data() function from pydataset for loading our data. The next two lines are for making the plot’s figure (x and y axes) and for the output file.

Data Preparation

There is no data preparation beyond loading the dataset using the data() function. We pick the dataset “Duncan” and load it into an object called “df.” The code is below, followed by a brief view of the actual data using the .head() method.

df=data('Duncan')
df.head()

This dataset includes various occupations measured in four ways: job type, income, education, and prestige.

Default Graph’s Appearance

Before we modify the appearance of the plot, it is important to know what the default appearance of the plot is for comparison purposes. Below is the code for a simple plot followed by the actual output and then lastly an explanation.

# Create a new figure
fig = figure(x_axis_label="Education", y_axis_label="Income")

# Add circle glyphs
fig.circle(x=df["education"], y=df["income"])

# Call function to produce html file and display plot
output_file(filename="my_first_plot.html")
show(fig)

The first line of code sets up the fig or figure. We use the figure() function to label the axes which are education and income. The second line of code creates the actual data points in the figure using the .circle() method. The last two lines create the output and display it.

So the figure above is the default appearance of a graph. Below we will look at several modifications.

Modification 1

In the code below, we are making the following changes to the plot.

Identifying data points by job type using color
Change the background color to black

Below is the code followed by the output and the explanation

# Import curdoc
from bokeh.io import curdoc

prof = df.loc[df["type"] == "prof"]
bc = df.loc[df["type"] == "bc"]

# Change theme to contrast
curdoc().theme = "contrast"
fig = figure(x_axis_label="Education", y_axis_label="Income")

# Add prof circle glyphs
fig.circle(x=prof["education"], y=prof["income"], color="yellow", legend_label="prof",size=10)

# Add bc circle glyphs
fig.circle(x=bc["education"], y=bc["income"], color="red", legend_label="bc",size=10)

output_file(filename="prof_vs_bc.html")
show(fig)

Here is what happened.

We load the curdoc() function, which allows us to modify the appearance of the plot
Next, we do some data preparation, separating the data for types that are “prof” and those that are “bc” into separate objects.
We change the theme of the plot to contrast using curdoc().theme
We also created the figure as done previously
We use the .circle() method twice. Once to set the “prof” data points on the plot and a second time to place the “bc” data points on the plot. We also make the data points larger by setting the size and using different colors for each job type.
The last two lines of code are for creating the output and displaying it.

You can see the difference between this second plot and the first one. This also shows the flexibility that is inherent in the use of Bokeh. Below we add one more variation to the display.

Modified Graph’s Appearance

The plot below is mostly the same except for the following

We add a third job type “wc”
We modify the shapes of the data points

Below is the code followed by the graph and the explanation

# Create figure
wc = df.loc[df["type"] == "wc"]
prof = df.loc[df["type"] == "prof"]
bc = df.loc[df["type"] == "bc"]

fig = figure(x_axis_label="Education", y_axis_label="Income")

# Add circle glyphs for wc
fig.circle(x=wc["education"], y=wc["income"], legend_label="wc", color="purple",size=10)

# Add square glyphs for prof
fig.square(x=prof["education"], y=prof["income"], legend_label="prof", color="red",size=10)

# Add triangle glyphs for bc
fig.triangle(x=bc["education"], y=bc["income"], legend_label="bc", color="green",size=10)

output_file(filename="education_vs_income_by_type.html")
show(fig)

The code is almost all the same. The main difference is there are now three job types and each type has a different shape for their data points. The shapes are determined by using either .circle(), .triangle(), or .square() methods.
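
As a possible alternative, Bokeh figures also have a general scatter() method that takes the shape as a marker argument, so the three glyph calls can be written as one loop. The sketch below uses a small made-up dataset in place of the Duncan data, and the points and markers dictionaries are hypothetical names chosen for this illustration.

```python
from bokeh.plotting import figure

# Made-up (education, income) pairs for illustration only
points = {
    "wc": [(50, 40), (60, 45)],
    "prof": [(85, 70), (90, 80)],
    "bc": [(20, 15), (30, 25)],
}

# Hypothetical mapping from job type to marker shape and color
markers = {
    "wc": ("circle", "purple"),
    "prof": ("square", "red"),
    "bc": ("triangle", "green"),
}

fig = figure(x_axis_label="Education", y_axis_label="Income")

# One scatter() call per job type, with the shape set by the marker argument
for job_type, (marker, color) in markers.items():
    xs = [x for x, _ in points[job_type]]
    ys = [y for _, y in points[job_type]]
    fig.scatter(x=xs, y=ys, marker=marker, color=color, size=10,
                legend_label=job_type)
```

Passing fig to show() displays the result in the same way as the earlier plots.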

Conclusion

There are many more ways to modify the appearance of a visualization in Bokeh. The goal here was to provide some basic examples that may lead to additional exploration.

Bokeh-Scatter Plot basics in Python VIDEO

In the video below we will look at making scatterplots using Bokeh. Bokeh is a Python library that makes interactive visualizations.

Bokeh Tools and Tooltips

In this post, we will look at how to manipulate the different tools and tooltips that you can use to interact with data visualizations that are made using Bokeh in Python. Tools are the icons that are displayed by default to the right of a visual when looking at a Bokeh output. To the right is what default tools look like. Tooltips provide interactive data based on the position of the mouse.

We will now go through the process of changing these tools and tooltips for various reasons and purposes.

Load Libraries

First, we need to load the libraries we need to make our tools. Below is the code followed by an explanation.

from pydataset import data
from bokeh.plotting import figure
from bokeh.io import output_file, show

We start by loading “data” from “pydataset”. This library contains the actual data we are going to use. The remaining imports all come from Bokeh. The “figure” function will create the frame for our visualization. In addition, we will need the “output_file” function to make our HTML document and the “show” function to display our visualization.

Data Preparation

Data preparation is straightforward. All we have to do is load our data into an object. We will use the “Duncan” dataset, which contains data on various jobs’ income, education, and prestige. Below is the code followed by a snippet of the actual data.

df=data('Duncan')
df.head()

Default Settings for Tools

We will now look at a basic plot with the basic tools. Below is the code.

# Create a new figure
fig = figure(x_axis_label="income", y_axis_label="prestige")

# Add circle glyphs
fig.circle(x=df["income"], y=df["prestige"])

# Call function to produce html file and display plot
output_file(filename="my_first_plot.html")
show(fig)

There is nothing new here. We create the figure for our axes first. Then we add the points in the next line of code. Lastly, we write some code to create an output. The default toolbar has 7 options. Below they are explained from top to bottom.

At the top, is the logo that takes you to the Bokeh website
Pan tool
Box zoom
Wheel zoom
Save figure
Reset figure
Takes you to Bokeh documentation

We will now show how to customize the available tools.

Custom Settings for Tools

In order to make a set of custom tools, we need to make some small modifications to the previous code as shown below.

# Create a list of tools
tools = ["lasso_select", "wheel_zoom", "reset","save"]

# Create figure and set tools
# Create a new figure
fig = figure(x_axis_label="income", y_axis_label="prestige",tools=tools)

# Add circle glyphs
fig.circle(x=df["income"], y=df["prestige"])

# Call function to produce html file and display plot
output_file(filename="my_first_plot.html")
show(fig)

What is new in the code is the object called “tools”. This object contains a list of the tools we want to be available in our plot. The names of the tools are available in the Bokeh documentation. We then pass this object “tools” to the argument called “tools” in the line of code where we create the “fig” object. If you compare the second plot to the first plot you can see we have fewer tools in the second one, as determined by our code.

Hover Tooltip

The hover tooltip allows you to place your mouse over the plot and have information displayed about what your mouse is resting upon. Being able to do this can be useful for gaining insights about your data. Below is the code and the output followed by an explanation.

# Import ColumnDataSource
from bokeh.models import ColumnDataSource

# Create source
source = ColumnDataSource(data=df)

# Create TOOLTIPS and add to figure
TOOLTIPS = [("Education", "@education"), ("Position", "@type"), ("Income", "@income")]
fig = figure(x_axis_label="education", y_axis_label="income", tooltips=TOOLTIPS)

# Add circle glyphs
fig.circle(x="education", y="income", source=source)
output_file(filename="first_tooltips.html")
show(fig)

Here is what happened.

We loaded a new class called ColumnDataSource. This class allows us to create a data structure that is unique to Bokeh. This is not required but will appear in the future.
We then saved our dataset using the new class and called it “source”
Next, we create a list called “TOOLTIPS”. This list contains tuples, which are in parentheses. The first string in the parentheses will be the name that appears in the hover. The second string in the parentheses accesses the value in the dataset. For example, if you look at the hover in the plot above the first line says “Education” and the number 72. The string “Education” is just the first string in the tuple and the value 72 is the value of education from the dataset for that particular data point
The rest of the code is a review of what has been done previously. The only difference is that we use the argument “tooltips” instead of “tools”
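
The way a tooltip tuple pairs a label with an “@” field can be mimicked in plain Python. The sketch below is only an illustration of the lookup idea and not how Bokeh renders hovers internally. The helper render_tooltip and the sample data point are hypothetical.

```python
TOOLTIPS = [("Education", "@education"), ("Position", "@type"), ("Income", "@income")]


def render_tooltip(tooltips, row):
    """Build hover lines by looking up each "@column" field in one data row."""
    lines = []
    for label, field in tooltips:
        column = field.lstrip("@")  # "@education" -> "education"
        lines.append(f"{label}: {row[column]}")
    return lines


# Hypothetical data point used only for illustration
sample_row = {"education": 72, "type": "prof", "income": 55}
print(render_tooltip(TOOLTIPS, sample_row))
```

Each label comes straight from the first string in the tuple, while the value is pulled from the column named after the “@” sign.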

Conclusion

With tooltips and tools you can make some rather professional-looking visualizations with a minimum amount of code. That is what makes the Bokeh library so powerful.

Bar Graphs Using Bokeh and Python VIDEO

The video below provides an introduction to making bar graphs using the Bokeh library and Python.

Make a Bar Graph with Bokeh in Python

Bokeh is a data visualization library available in Python with the unique ability of interaction. In this post, we will look at how to make a basic bar graph using Bokeh.

To begin we need to load certain libraries as shown below.

from pydataset import data
import pandas as pd
from bokeh.plotting import figure
from bokeh.io import output_file, show

In the code above, we load the “pydataset” library to gain access to the data we will use. Next, we load “pandas” which will help us with some data preparation. The last two libraries are related to “bokeh.” The “figure” function will be used to set the actual plot, the “output_file” function will allow us to save our plot as an HTML file and the “show” function will allow us to display our plot.

Data Preparation

We need to do two things to be ready to create our bar graph. First, we need to load the data. Second, we need to calculate group means for the bar graph. Below is the code for the first step followed by the output.

df=data('Duncan')
df.head()

In the code above we use the “data” function to load the “Duncan” dataset into an object called “df”. Next, we display the output of this. The “Duncan” dataset contains data on different jobs, the type of job, income, education, and prestige. We want to graph prestige and job type as a bar graph which will require us to calculate the mean of prestige by type. The code for this is below.

# Calculate group means of prestige
positions = df.groupby('type', as_index=False)['prestige'].mean()
positions

In the code above we use the “groupby” method on the “df” object. Inside the method, we indicate we want to group by “type”. The “as_index” argument is set to False so that the “type” column is kept as a regular column instead of being set as the index, or you can say as the row labels. Next, we subset the data using square brackets to only include the “prestige” column. Lastly, we indicate that we want to calculate the “mean”. The result is that there are three job types and we have the mean for each job type’s prestige. The job types and means from the table above are what we will use for making our visualization.
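
For readers who want to see what the grouped mean amounts to, here is a minimal plain-Python sketch of the same kind of calculation. The rows are made-up values for illustration and not the actual Duncan data, and the helper name group_means is hypothetical.

```python
from collections import defaultdict
from statistics import mean


def group_means(rows, key, value):
    """Return the mean of `value` for each distinct `key`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return {group: mean(values) for group, values in groups.items()}


# Made-up rows for illustration only
rows = [
    {"type": "prof", "prestige": 80},
    {"type": "prof", "prestige": 90},
    {"type": "wc", "prestige": 40},
    {"type": "bc", "prestige": 20},
    {"type": "bc", "prestige": 30},
]

result = group_means(rows, "type", "prestige")
print(result)
```

The pandas version does the same thing in one line and returns a table with one row per job type.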

Bar Graph

We are now ready to make our bar graph. Below is the code followed by the output.

# Instantiate figure
fig = figure(x_axis_label="positions", y_axis_label="Prestige", x_range=positions["type"]) 

# Add bars
fig.vbar(x=positions["type"], top=positions["prestige"],width=0.9)

# Produce the html file and display the plot
output_file(filename="Prestige.html")
show(fig)

Here are the steps.

We began by creating the “fig” object. We labeled our x and y axes and also indicated the range of the x values, which means determining the categories of our data. For our purposes, this was the unique job types in the “type” column.
Next, we use the “vbar” method to make our bar graph. The x values were set to the “type” column from the “positions” object. The y or “top” values were set to the means of “prestige” from the “positions” object. The “width” argument was set to 0.9 to ensure there was a little whitespace between the bars.
The “output_file” creates a saved plot and the “show” function displays the bar graph.

Conclusion

Bokeh has lots of cool tools available for the data analyst. This post was focused on bar graphs but this is only the most basic information that has been shared here. There is much more possible with this library.

Bokeh-Scatter Plot Basics in Python

Bokeh is another data visualization library available in Python. One of Bokeh’s unique features is that it allows for interaction. In this post, we will learn how to make a basic scatterplot in Bokeh while also exploring some of the basic interactions that are provided by default.

Data Preparation

We are going to make a scatterplot using the “Duncan” data set that is available in the “pydataset” library. Below is the initial code.

from pydataset import data
from bokeh.plotting import figure
from bokeh.io import output_file, show

The code above is just the needed libraries. We loaded “pydataset” because this is where our data will come from. All of the other libraries are related to “bokeh.” “Figure” allows us to set up our axes for the scatterplot. “Output_file” allows us to create the file of our plot. Lastly, “show” allows us to show the plot of our visualization. In the code below we will load our dataset, give it a name, and print the first few rows.

df=data('Duncan')
df.head()

In the code above we store the “Duncan” dataset in an object called “df” using the data() function. We then display a snippet of the data using the .head() function. The “Duncan” data shares information on jobs as defined by several variables. We will now proceed to making the scatterplot.

Making the Scatterplot

We will now make our scatterplot. We have to do this in three steps.

Make the axis
Add the data to the plot
Create the output file and show the results

Below is the code with the output

# Create a new figure
fig = figure(x_axis_label="education", y_axis_label="income")  # labels the axes

# Add circle glyphs
fig.circle(x=df["education"], y=df["income"])  # adds the dots

# Call function to produce html file and display plot
output_file(filename="my_first_plot.html")
show(fig)

At the top of the code, we create our axis information using the “figure” function. Here we are plotting education vs income and storing all of this in an object called “fig”. Next, we insert the data into our plot using the “circle” function. To insert the data we also have to subset the “df” dataframe for the variables that we want. Note that the data added to a plot are called “glyphs” in Bokeh. Lastly, we create an output file using a function with the same name and show the results.

To the right of your plot, there are also some interaction buttons as shown below

Here is what they do from top to bottom.

Takes you to bokeh.org
Pan the image
Box zoom
Wheel zoom
Download image
Resets image
It takes you to information about the bokeh function

There are other interactions possible but these are the default ones when you make a plot.

Conclusion

Bokeh is one of many tools used in Python for data visualization. It is a powerful tool that can be used in certain contexts. The interactive tools can also enhance the user experience.

Mistakes in Evaluation Writing

Writing program evaluation reports is always a tricky task to accomplish. As a writer, you have to be concerned about the style of writing, and the audience of the report, among other challenges. In addition, there are several common mistakes made when writing as shown below.

Small sample
No comparison group
Instrument use
Sharing too little or too much
Hasty generalization

Small Sample Sizes

The sample size is highly important, particularly in quantitative reports. If a sample is small it will be difficult to make strong conclusions and the findings will be considered questionable. Naturally, there is disagreement over what is thought of as an adequate sample size. However, this can be calculated mathematically. The general rule of thumb for statistical tests is a sample size of at least 30 observations.
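As one illustration of calculating a sample size mathematically, the sketch below uses Cochran’s formula for estimating a proportion in a large population, n = z² · p · (1 − p) / e². The confidence level, margin of error, and assumed proportion are example inputs chosen here, not values from any particular evaluation, and other designs (for example, comparing two group means) call for different formulas.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(confidence=0.95, margin_of_error=0.05, p=0.5):
    """Cochran's formula for a large population: n = z^2 * p * (1 - p) / e^2."""
    # Two-sided critical value, roughly 1.96 for 95% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = (z ** 2) * p * (1 - p) / margin_of_error ** 2
    # Round up so the requested margin of error is not exceeded
    return math.ceil(n)


n_needed = sample_size_for_proportion()
print(n_needed)
```

With these inputs the formula asks for several hundred respondents, far more than the rule of thumb of 30, because it targets a narrow margin of error on a survey-style proportion.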

Even if the sample size starts adequate there is still the challenge of attrition. As time progresses, people will drop out of programs and this can make the data collected on them useless.

If the sample size drops below an acceptable level all is not lost. It is important to communicate the limitations of the report and not oversell the results due to the small sample size. If you know in advance that the sample size will be small, it may be more appropriate to focus more on a qualitative study rather than a quantitative one.

Lack of Comparison Group

A problem that is often associated with sample size is the lack of a comparison group. Quantitative research is about comparing different values to see if they are the same or different. If a program is implemented, there is no way to assess the quality of it unless it is compared to individuals who did not participate in the program. Without a comparison group, there is no way to interpret the program quality.

You can’t say a program is “good” or “bad” in a vacuum. Such a statement must be made in comparison to a situation that is similar to, or the same as, the context of the program but without the effect of the program. In other words, quality is generally a relative concept rather than an absolute one.

There is an argument that it is unethical to deny some individuals participation in a program for the sake of a comparison group. However, it can also be said that it is unethical to state that a program is good or bad without having a comparison group.

Instrument Use

There are two common mistakes with instruments.

Lack of information on the instruments
Mixing and matching survey items from different instruments

Sometimes people will use instruments without explaining anything about them. In general, the writer of a report should provide enough information about an instrument that a reader knows that the instrument is psychometrically appropriate. This can include sharing how many items are in the instrument, the reliability score, the purpose of the instrument (what it measures), and how the instrument was used in the current study. Providing this information on the instrument helps to provide context to the study and allows for the reproducibility of the study.

A common problem, especially among people without a strong background in research, is mixing and matching items from various instruments. Sometimes people think that they can take two items from one instrument along with three items from another instrument and make a new instrument.

The problem with this mix-and-match approach is that instruments are tested and developed as a block of items. To add or subtract from this block would mean that the instrument is no longer measuring what it used to measure. This new instrument would have to be retested to make sure that it is reliable and valid. Therefore, whenever employing an instrument it must be unaltered to ensure that it is capturing the data that it was set out to collect.

Sharing too Little or too Much

When writing, the evaluator must find a balance between sharing too little and too much information. This is more of an art than a science but it is something that a writer needs to know.

Too little information would be to make statements and provide no supporting data for the statement. For example, “The scores were low here”. Such a statement needs actual numbers to support it.

Another mistake would be to share too much information. Using the same example, this would mean stating that “the scores were low here” and then sharing the individual scores of every participant. Quantitative research is focused on the aggregation of data and not individual scores.

How much information to share is also influenced by the nature of the report. Quantitative reports will have fewer words and more numbers that share broad conclusions. A qualitative report will be much more focused on individual stories and will not have the same broad conclusions.

Generalizing

The results of a study are limited to the context. To make broad sweeping statements from a limited context is to overgeneralize. For example, if a study is conducted using reading software among 35 fifth graders in rural Texas the results of this study only apply to a similar context. You cannot say that since the program was successful here it will be successful in a different context.

However, to be fair, it is possible that the program would succeed in a different context; it just has not been proven yet. This is one reason why further study is always encouraged in academic writing. As the program is proven in different contexts, there is evidence to make a strong general conclusion about the strength of the program.

Conclusion

There are other ways mistakes can be made in the writing process. The focus here was on common errors and mental miscalculations that obscure the hard work of evaluators. When writing, it is important to make sure that the conclusions that are drawn are accurate and supported by a rigorous methodology.

Treatment Fidelity

Whenever a program is implemented there are always ways for things to go wrong. Treatment fidelity is a term used to describe the degree to which a program is implemented as intended in the grant proposal. Below is a list of common ways that treatment fidelity can become a problem.

Adherence to implementation
Implementation competence
Difference in treatment
Program drift

We will look at each of these below

Adherence to Implementation

Implementation adherence is whether the provider of the program follows the intended procedures. For example, suppose we have a reading lab program to boost students’ reading comprehension. The procedures may be as follows.

Fifth-grade students are to use the reading lab on Monday, Wednesday, and Friday for 30 minutes each. (Dosage)
The students must be engaged actively in using the reading software

If the provider wanders from these procedures it can quickly become an implementation issue. This is common. A teacher may take their kids on a field trip, there could be holidays, the teacher might do 1 hour one day and skip another day, etc. In other words, providers agree to a program but essentially do what they want when necessary. Every time these modifications happen it impacts the quality of the results as other factors are introduced into the study that were not originally planned for.

Implementation Competence

Implementation competence is defined as the provider’s ability to follow directions. If the procedures are too complicated the provider may not be able to follow them for the benefit of the students in the program.

An example would be if a provider is not comfortable with using computers and the reading software they may not be able to help students who are having technical issues. If too many students are unable to use the computers because the provider or teacher cannot help them this could lead to implementation competence concerns.

Difference in Treatment

Difference in treatment means that the treatment received by participants in the program must not be the same as what is received by those who are not in the program. The treatments must be different so that comparisons can be made.

Sometimes when a new program is implemented providers will want all students to experience it. In our reading lab example, the procedures might call for allowing only half of the fifth-graders who are below grade level in reading comprehension to participate. However, a teacher might decide to have all students participate in the reading lab because of the obvious benefits. If this happens, there is no way to compare the results of those who participate and those who do not.

Such well-meaning actions may benefit the students but damage the scientific process. It is always critical that there are differences in treatment so that it can be determined if the treatment makes a difference.

Program Drift

Program drift is the gradual weakening of the implementation of a program. People naturally lose discipline over time and this can apply to obeying the procedures of a program. For example, a provider might vigilantly follow the procedures of the reading lab in the beginning but may slowly allow more or less time for the students.

Program drift is hard to notice. One way to prevent it is to constantly re-train providers so that they are reminded about how to implement the program. Retraining is beneficial when providers want to implement the program correctly.

Conclusion

Treatment fidelity is critical to determine the quality and influence of a program. Evaluators need to be familiar with these common threats to fidelity so that they can provide the needed support to help providers.

UNION and INTERSECT in SQL VIDEO

In the video below we will look at how to use UNION and INTERSECT clauses in SQL. Both of these clauses have to do with joining data in ways that are an alternative to JOIN statements.

John Holt and Youthism

In this post, we will look at John Holt and his views on education in the US.

Bio

John Holt was an early proponent of homeschooling in the US. What makes him unique is that Holt was a left-wing or progressive voice for homeschooling. Homeschooling has often been associated with conservatives and Christianity but this was not the case with Holt. By most accounts, Holt was a devout Atheist.

Holt viewed the traditional education experience of children as oppressive. The reason for this oppression was the students did not have control of their learning experience. For Holt, children should be able to choose what they study. The factory-style education in the US was a major criticism of Holt as he believed it stripped young people of their individuality.

Holt’s views were not limited to education. He also supported other left-leaning views involving feminism, environmentalism, and a guaranteed income for all. His motivation behind a guaranteed income was to liberate women and children from being dependent on men or the husband and father of the family. Holt is also considered the father of the Children’s rights movement. In many ways, Holt had issues with traditional views of family.

Youthism

The Children’s Rights Movement has many names such as Youth Rights or Youthism. The main premise of proponents of this belief system is that adults discriminate and oppress young people and children. This belief system is similar to other Communist/Critical Theory-inspired belief systems such as Critical Race Theory, Feminism, etc.

What all of these –isms or theories of oppression have in common is a power struggle between two groups. In Communism or Marxism, the bourgeoisie control the means of production and oppress the proletariat. The proletariat needs to rise up, rebel, overthrow the bourgeoisie, and seize the means of production.

In Critical Theory, the oppressors maintain the country’s current cultural structure (often portrayed as White, male, and Christian) and the various social institutions (school, church, etc.). People who are not producers of the current culture are oppressed and should rise up and overthrow those who control the production of culture.

Critical Race Theory states that the oppressors are White Americans and the oppressed are people of color. Whites control access to various things through their production of privilege or culture. People of color need to rise up, abolish the privilege of Whites, and destroy the ability of Whites to reproduce the current societal structure or have any form of privilege.

Feminism states that the oppressors are men and the oppressed are women. Men oppressed women through the use of cultural and traditional beliefs and reproduced these beliefs through various social institutions. Women need to rise up and rebel and stop the reproduction of traditional beliefs in society so that women can have emancipation from male leadership.

Queer studies state that the oppressors are people who are straight and the oppressed are people with alternative sexual identities and preferences. Heterosexuals control the means of reproducing heterosexuality through culture, families, and schools. Queer individuals need to rise up, overthrow heteronormativity, and liberate society from those false beliefs.

In Youthism the struggle is between children and adults. Adults oppress children and want to maintain their power and authority over them. Children, in turn, should rebel and seize their autonomy and rights from the adults. By leaving schools, children can seize some of the power and take control of their education. Below is a table that briefly summarizes what has been shared.

Philosophy	Oppressor	Oppressed	Means of Production	Goal
Communism	Bourgeoisie	Proletariat	Financial/factories	Revolution
Critical Theory	Majority race	Minority race	Culture, schools, family, religion	Revolution
Feminism	Men	Women	Culture, schools, family, religion	Revolution
Queer Studies	Heterosexuals	Alternative sexualities	Culture, schools, family, religion	Revolution
Youthism	Adults	Children	Schools	Revolution

The end game is the same: to overthrow the existing society from one angle or another. The reason for these various theories and belief systems is the same as why there are different flavors of ice cream, which is to attract the highest number of people possible. All of these various oppressed groups can agree on the need for change and can work together for this. In addition, these various movements create a multi-front assault on the existing society, which is much more difficult to defend against than a single enemy. Multiple groups of oppressed people also create a picture that something is seriously wrong with society when so many people are dissatisfied with it.

Holt’s Beliefs

Returning to the focus on Holt, he also had some unusual beliefs about children’s freedom. For example, he believed that a child should be able to drive whenever they possess the ability rather than at 16. He criticized how adults speak to children by calling them “cute” and patting them on the head. Holt also had issues with how adults are sometimes dismissive of the feelings and problems of children, which to him was a form of oppression.

Perhaps one of Holt’s most shocking beliefs was in the sexual freedom of children. Essentially, he believed that children should make their own decisions about sexuality. It may be possible that Holt’s views on this were inspired by Kinsey, whose research focused on providing evidence that this was a viable position for children.

Conclusion

John Holt was a trailblazing liberal in the world of homeschooling. He radically supported a conservative idea in his unique way. His influence on homeschooling is significant, whether or not people agree with him on a personal level.

Subqueries in SQL VIDEO

Subqueries are used in SQL to accomplish specific goals. In the video below we will discuss the use of subqueries.

Window Functions in SQL VIDEO

The video below explains the use of window functions in SQL. These functions allow you to calculate various values based on the criteria in that specific row. This can lead to some interesting insights depending on the situation.

Cost-Effectiveness Analysis

The purpose of a cost-effectiveness analysis is to determine the relationship between the benefits and expenses of a program. Naturally, there are many different ways to do this but there are some common steps for approaching this as shown below

Define the program and outcome indicators
Determine what you want to know
Compute cost
Determine the scope of program outcome data
Compute outcome data
Compute cost-effectiveness ratio

Define the Program and Outcome Indicators

Defining the program means knowing all the components and features of the program. For example, a reading lab program might have the following components.

Online reading in a computer lab that develops reading comprehension and pronunciation skills
Participation 30 minutes a day twice a week
The program lasts one semester
Participants are 30 fifth graders who are reading 2 levels below grade level

The example above is highly simplified but serves our purpose. Once the program is defined it is necessary to determine the outcome indicators. Outcomes are measurable changes in behavior. For our reading lab example, below is the outcome we want to measure.

Number of fifth-grade students who are reading at or above grade level at the end of the semester of reading lab participation.

With the information above we can move to step 2.

Developing Questions

Once you know what the program is about and the outcome you want to measure it is now time to shape questions for the study. This might seem like a wasted step because obviously we want to gather data about the outcome indicator. However, there might be more than one thing we want to know about the outcome. For example, we might want to know if there are differences by gender, race, or socioeconomic factors. Since we can nuance and complicate the study it is important to state explicitly what we want to know. Below are the questions for our reading lab example.

How many fifth-grade students reach grade level for reading comprehension through the use of the reading lab twice a week for 30 minutes?
How many fifth-grade students are unable to reach grade level for reading comprehension through the use of the reading lab twice a week for 30 minutes?
What is the cost per fifth-grade student for the use of the reading lab over the semester?

The next step is to determine the cost of the program

Determine Cost

It is now time to find out how much money was spent. This is a straightforward process that includes calculating the expenses for personnel, facilities, equipment, and other expenses. For our reading lab example, the costs are simple to compute and are shown below.

Personnel: The total cost is zero dollars because the teachers are already paid by the school and no additional staff was necessary.
Facilities: Again, the total cost was zero because an existing computer lab was used.
Equipment: The expense for the license for the reading software is $30,000 for the length of the program.
Other expenses: Zero dollars for other expenses.

For our example, only $30,000 is used for this program.

Determine Scope of Data Collection

The amount of data to collect depends on the questions to answer and the maturity of the program. If the program has been around for several years you have to decide if you want to collect data from all years or a subset. In our example, this is a new program so we will take all data from the fifth graders who participated in the reading lab program.

Compute Outcome

Once the program has run its course it is time to determine outcomes. For our example, after the reading lab program was completed, each student took a reading comprehension test to assess what grade level they were at. For our purposes, students at or above grade level are considered successful. Below are the results

Successful	Unsuccessful
20	10

The information in the table above has already answered our first two questions for this study. We can now use this information to determine the cost-effectiveness ratio to answer the last question.

Cost-Effectiveness Ratio

The cost-effectiveness ratio can be calculated by dividing the cost of the program by the outcome. For our example, this would mean dividing $30,000 by 20 (the number of successful students). In the table below we have several important calculations.

	Reading Lab Program
Program cost	$30,000
Success rate	20 / 30 * 100 ≈ 67%
Number of students at grade level	20
Cost per successful student	$30,000 / 20 = $1,500

The table above provides all the information we need to assess this program within the scope that we defined. Right now it is hard to tell if this program is good or not because there is no standard or another program to compare it to. However, having external standards or another program for comparison is often expected with real examples.
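The arithmetic behind these figures can be sketched in a few lines of Python. The numbers come from the reading lab example; the cost per participant is an extra figure computed here for comparison and does not appear in the table.

```python
program_cost = 30_000   # software license, the only expense in the example
successful = 20         # students at or above grade level
unsuccessful = 10       # students still below grade level

participants = successful + unsuccessful

# Share of participants who reached grade level, as a percentage
success_rate = successful / participants * 100

# Cost-effectiveness ratio: program cost divided by the outcome
cost_per_success = program_cost / successful

# Extra figure (not in the table): cost spread over every participant
cost_per_participant = program_cost / participants

print(f"Success rate: {success_rate:.1f}%")
print(f"Cost per successful student: ${cost_per_success:,.0f}")
print(f"Cost per participant: ${cost_per_participant:,.0f}")
```

Keeping the cost per successful student and the cost per participant as separate figures makes it clearer which question each number answers.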

Sometimes an additional step that is taken is a sensitivity analysis. A sensitivity analysis is especially important when there are a lot of estimations in the model. When it is necessary to estimate it is important to adjust these values high and low to see how they affect outcomes. For our example, this is not applicable.
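Although it is not needed for the reading lab example, the idea of a sensitivity analysis can be sketched briefly. The plus or minus 10% range below is a hypothetical assumption chosen only for illustration; in a real study the range would come from how uncertain each estimate actually is.

```python
base_cost = 30_000   # program cost from the example
successful = 20      # students at or above grade level

# Hypothetical scenarios: cost as a percentage of the base estimate
scenarios = {"low": 90, "expected": 100, "high": 110}

# Recompute the cost per successful student under each scenario
cost_per_success = {
    name: base_cost * percent / 100 / successful
    for name, percent in scenarios.items()
}

for name, value in cost_per_success.items():
    print(f"{name}: ${value:,.0f} per successful student")
```

If the conclusion about the program would change between the low and high scenarios, the estimate deserves more scrutiny before the results are reported.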

Conclusion

Cost-effectiveness analysis is an important tool in determining the value of a program. The goals of a program are normally to help people while keeping in mind cost and effectiveness. The analysis presented here allows an evaluator to assess programs so that services can be rendered efficiently.

Window Function in SQL

SQL window functions perform calculations across a set of rows related to the current row and place the result in a new column. They are similar to aggregate functions but differ in that a normal aggregate query, such as one that uses GROUP BY, collapses the rows into a single value per group. With a window function, every row is kept and the result is placed in a new column for each row.

Window functions allow users to perform many analytical tasks, from ranking and partitioning data to calculating moving averages, cumulative sums, and differences between consecutive rows. Again, what is important here is that window functions can be used to calculate values across rows.

Basic Example

In this first example, we will figure out how many customers we have from Texas and California and put this in a new column. Below is the code followed by the output.

SELECT first_name, last_name, state,
COUNT(*) OVER () AS total_customers
FROM customers
WHERE state IN ('CA','TX')
ORDER BY customer_id;

In the select statement, we pull first_name, last_name, and state, and we then have our window function. In this window function, we are counting the number of customers in the customers table who are from CA and TX. The OVER() clause is used to define the window. Since it is blank, it tells SQL that the window is the entire result set, which here means every row that passes the WHERE filter. This may not seem important right now but is critical information for future examples. Lastly, we are ordering the data by customer_id.

The output indicates that 9,903 customers are from CA or TX. We can confirm this by running a separate line of code.

SELECT COUNT(*)
FROM customers
WHERE state IN ('CA','TX');

The output from the window function is repeated in every row of the column called total_customers. Repeating this information doesn’t make sense but it shows us what the window function does. For example, in row 1, the function sees that this person is from TX and then outputs that the total number of customers is 9,903 for somebody from TX or CA. This is a basic example of what window functions can do. Of course, there are things much more insightful than this that can be calculated.

Intermediate Example

We are now going to run mostly the same code with one slight difference. We want to know not just how many total customers there are but how many are from TX and how many are from CA. To do this we will have to use PARTITION BY which is the group by clause for window functions.

SELECT customer_id, first_name, last_name, state,
COUNT(*) OVER (PARTITION BY state) AS total_customers
FROM customers
WHERE state IN ('CA','TX')
ORDER BY customer_id;

In the output, we have all of the same rows from the previous SELECT statement but the total_customers column is different. When a person is from TX this column shows how many people are from TX but when a person is from CA it shows how many people are from CA. If you add up the number of people from TX and CA you get the following.

4,865 + 5,038 = 9,903

This is the same amount as the total number of customers in our previous example. The PARTITION BY clause breaks the number of customers into two groups, those from TX and those from CA, and assigns the appropriate value based on where the customer in that row is from.
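To try the difference between an empty OVER () and PARTITION BY without access to the original database, here is a minimal sketch that runs the same kind of query through Python’s built-in sqlite3 module. The six-row customers table is hypothetical, the state_customers alias is a name chosen here, and the sketch assumes the bundled SQLite is version 3.25 or newer, which is when window functions were added.

```python
import sqlite3

# Hypothetical miniature customers table; the real one has thousands of rows
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER, first_name TEXT, state TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Ana", "TX"), (2, "Ben", "CA"), (3, "Cy", "NY"),
     (4, "Di", "TX"), (5, "Ed", "CA"), (6, "Flo", "TX")],
)

query = """
SELECT customer_id, state,
       COUNT(*) OVER () AS total_customers,                  -- window = all filtered rows
       COUNT(*) OVER (PARTITION BY state) AS state_customers -- window = rows in the same state
FROM customers
WHERE state IN ('CA', 'TX')
ORDER BY customer_id;
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
conn.close()
```

The NY customer is filtered out before the windows are built, so total_customers counts five rows rather than six, and the two state counts (three for TX, two for CA) add up to that same total.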

Advanced Examples

The next example will involve using the SUM aggregation function. We are going to add customer_id in a new column. This will be a running total. In other words, SQL will keep adding the customer_id until they get through all of the data. Below is the code followed by the output and explanation.

SELECT customer_id, title, first_name, last_name, state,
SUM(customer_id) OVER (ORDER BY customer_id) AS total_customers
FROM customers
WHERE state IN ('CA','TX')
ORDER BY customer_id;

Here is what is happening. In the total_customers column, a running total of customer_id is being created. For example, row 1 has the value 10 because that is the first customer_id of someone from TX. Row 2 has a customer_id of 13. Adding 10 + 13 gives 23, which is the value in row 2 of total_customers. This continues for the rest of the table.
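The running total can be reproduced on a tiny table with Python’s sqlite3 module. The ids 10 and 13 match the rows described above; the other rows and the running_total alias are hypothetical additions for this sketch, which again assumes SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER, state TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(10, "TX"), (13, "CA"), (14, "NY"), (20, "TX")],
)

# ORDER BY inside OVER() turns SUM into a cumulative (running) sum
running = conn.execute("""
SELECT customer_id, state,
       SUM(customer_id) OVER (ORDER BY customer_id) AS running_total
FROM customers
WHERE state IN ('CA', 'TX')
ORDER BY customer_id;
""").fetchall()

for row in running:
    print(row)
conn.close()
```

Each row’s running_total is its own customer_id plus every smaller customer_id that passed the filter, so the NY row with id 14 never contributes.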

Here is another example, this time with the RANK() function. The RANK() function allows you to create a new column that ranks the data based on a criterion. For our example, we will rank the data based on their customer_id, with the lower the id number the higher the ranking. To make this even more interesting, we will partition the data by state so that the lowest value customer_id will be number 1 for TX and the lowest value customer_id will be number 1 for CA. Below is the code.

SELECT customer_id, title, first_name, last_name, state,
RANK() OVER (PARTITION BY state ORDER BY customer_id) AS total_customers
FROM customers
WHERE state IN ('CA','TX')
ORDER BY customer_id;

Row number 1 is rank 1 because it is the lowest value customer_id of all TX. Row 2 is also ranked 1 because it is the lowest value customer_id of all CA.
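Below is a small sqlite3 sketch of the same ranking idea. The five rows are hypothetical, and the column is aliased as id_rank here (a name chosen for this sketch) so the output is easier to read. It assumes SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER, state TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(10, "TX"), (13, "CA"), (20, "TX"), (25, "CA"), (31, "TX")],
)

# The rank restarts at 1 inside each state partition
ranked = conn.execute("""
SELECT customer_id, state,
       RANK() OVER (PARTITION BY state ORDER BY customer_id) AS id_rank
FROM customers
WHERE state IN ('CA', 'TX')
ORDER BY customer_id;
""").fetchall()

for row in ranked:
    print(row)
conn.close()
```

The first TX row and the first CA row both receive rank 1 because each state is ranked independently.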

Conclusion

The possibilities are almost endless with window functions. These tools allow you to get into the data and find insightful answers to complex questions. The examples here are only scratching the surface of what can be achieved.

UNION and INTERSECT in SQL

UNION and INTERSECT are two useful statements used in SQL for specific purposes. In this post, we will define each statement, provide examples, and compare and contrast UNION and INTERSECT with the JOIN command.

UNION

The UNION statement is used to append rows together from different select statements. Below is the code followed by the output and explanation.

(
SELECT street_address, city, state, postal_code
FROM customers
WHERE street_address IS NOT NULL
)
UNION
(
SELECT street_address, city, state, postal_code
FROM dealerships
WHERE street_address IS NOT NULL
);

Notice how the select statements are both in their own set of parentheses with the union statement between them. In addition, you can see that both select statements used the same columns. Remember we are trying to combine data from different places that have the same columns. SQL itself only requires that both select statements return the same number of columns with compatible data types, so you can combine columns that mean different things, but the output will be hard to interpret.

Essentially, in the example above, we took the same columns from two different tables and created one table. UNION removes duplicates. If you want to keep duplicates you must use UNION ALL.

In contrast, joins are used to append columns together based on criteria. If we were to join two or more tables the joined table would increase in its number of columns and possibly its rows. A UNION will have a specified number of columns while growing in terms of the number of rows that are present in the output.
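The difference between UNION and UNION ALL can be checked with a short sqlite3 sketch. The tables and rows are hypothetical. One caveat: SQLite does not accept parentheses around the individual select statements in a compound query, so they are omitted here even though the example above uses them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (city TEXT, state TEXT);
CREATE TABLE dealerships (city TEXT, state TEXT);
INSERT INTO customers VALUES ('Austin', 'TX'), ('Austin', 'TX'), ('Miami', 'FL');
INSERT INTO dealerships VALUES ('Austin', 'TX'), ('Reno', 'NV');
""")

# UNION appends the rows and removes duplicates
union_rows = conn.execute("""
SELECT city, state FROM customers
UNION
SELECT city, state FROM dealerships
ORDER BY city;
""").fetchall()

# UNION ALL appends the rows and keeps every copy
union_all_rows = conn.execute("""
SELECT city, state FROM customers
UNION ALL
SELECT city, state FROM dealerships
ORDER BY city;
""").fetchall()

print(union_rows)
print(len(union_all_rows))
conn.close()
```

Both results have the same two columns; only the number of rows changes, which is the contrast with a join described above.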

INTERSECT

INTERSECT finds common rows between select statements. It is highly similar to JOIN with the main difference being INTERSECT removes duplicates while JOIN will not. Below is an example.

SELECT state,postal_code FROM customers  
INTERSECT 
SELECT state,postal_code FROM dealerships;

The code is simple. You make your select statements and place INTERSECT in between them. The results above show us what data these two select statements share. When state and postal_code are the criteria for these two tables only these three rows are in common. If we did a JOIN we would get every instance of these three state and postal codes rather than just the unique ones.
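The contrast between INTERSECT and JOIN can be seen on a pair of small hypothetical tables using sqlite3. The rows below are invented for the sketch; only the table and column names follow the post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (state TEXT, postal_code TEXT);
CREATE TABLE dealerships (state TEXT, postal_code TEXT);
INSERT INTO customers VALUES
    ('TX', '75001'), ('TX', '75001'), ('CA', '90001'), ('NY', '10001');
INSERT INTO dealerships VALUES
    ('TX', '75001'), ('CA', '90001'), ('FL', '33101');
""")

# INTERSECT returns each shared row once
intersect_rows = conn.execute("""
SELECT state, postal_code FROM customers
INTERSECT
SELECT state, postal_code FROM dealerships
ORDER BY state;
""").fetchall()

# An inner join on the same columns returns every matching pair of rows
join_rows = conn.execute("""
SELECT c.state, c.postal_code
FROM customers c
JOIN dealerships d
  ON c.state = d.state AND c.postal_code = d.postal_code
ORDER BY c.state;
""").fetchall()

print(intersect_rows)
print(join_rows)
conn.close()
```

The TX postal code appears twice in customers, so the join returns it twice while INTERSECT returns it once.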

Conclusion

UNION and INTERSECT have their place in the data analyst’s toolbelt. UNION is for appending rows together. INTERSECT is for finding commonalities between different select statements. Both UNION and INTERSECT remove duplicates, which is not done when using a JOIN.

Subqueries in SQL

Subqueries allow you to use the tables produced by SELECT queries. One advantage of this is instead of referencing an existing table in your database you can pull from the table you are making when making your SQL query. This is best explained with an example.

WHERE Clause

I have two tables I want to pull data from called salespeople and dealerships. The titles of these two tables explain what they contain. One column these two tables have in common is dealership_id, which identifies the dealership in the dealerships table and where the salesperson works in the salespeople table.

Now the question we want to answer is “Which of my salespeople work in Texas?” This can only be determined by using the two tables together: the salespeople table does not have the state the person is in, while the dealerships table has the state but does not have the salesperson’s information.

This problem can be solved with a subquery. Below is the code followed by the output and an explanation.

SELECT *
FROM salespeople
WHERE dealership_id IN (
	SELECT dealership_id FROM dealerships
	WHERE dealerships.state = 'TX'
	)

The first two lines are standard SQL coding. Starting in line three, we have our subquery in the WHERE clause. The subquery inside the parentheses returns the dealership_id of every dealership in Texas, and the outer query keeps only the salespeople whose dealership_id appears in that list. This leads to a note that subqueries are always inside parentheses.
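The same two-step logic can be sketched in base R, which may help if you are more at home there. The data frames below are invented stand-ins for the salespeople and dealerships tables.

```r
# Hypothetical stand-ins for the two tables
salespeople <- data.frame(
  first_name = c("Ann", "Raj", "Lee"),
  dealership_id = c(1, 2, 3),
  stringsAsFactors = FALSE
)
dealerships <- data.frame(
  dealership_id = c(1, 2, 3),
  state = c("TX", "CA", "TX"),
  stringsAsFactors = FALSE
)

# Inner step (the "subquery"): ids of the dealerships located in Texas
tx_ids <- dealerships$dealership_id[dealerships$state == "TX"]

# Outer step: keep only salespeople whose dealership_id is in that list
texas_staff <- salespeople[salespeople$dealership_id %in% tx_ids, ]

texas_staff$first_name  # "Ann" "Lee"
```

The %in% operator plays the role of IN, and the vector tx_ids plays the role of the result returned by the subquery.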

SELECT Clause

Using subqueries in the WHERE clause is most common but we can also do them in the SELECT clause as well. In our first example we learned where the employees work but let’s say we want to know the city and not just the state. Below is a way to pull the city data for one dealership with the salespeople data.

SELECT *,       
       (SELECT city 
        FROM dealerships d         
        WHERE d.dealership_id = 19) as City 
FROM salespeople
WHERE salespeople.dealership_id =19;

In this example, we pull all data from salespeople while pulling only data related to the city from the dealership while also filtering for only dealership_id 19.

FROM Clause

We can also place subqueries in a FROM clause. In the example below, I want the first and last name of the salesperson followed by the city and state they work in. The name info is on the salespeople table and the city and state are on the dealership table. Below is the code and results.

SELECT * 
FROM 
   (SELECT s.first_name, s.last_name, d.city, d.state
    FROM salespeople s, dealerships d 
    WHERE s.dealership_id = d.dealership_id
   ) AS staff_locations; -- alias name is arbitrary; some databases (e.g. MySQL) require one

The code should be self-explanatory. Inside the parentheses, we are creating a table we want to pull data from. The subquery is essentially a join of the two tables based on the criteria. This brings up an important point. Subqueries and joins (inner joins in particular) can often serve the same purpose. As a general rule of thumb, joins are better for large amounts of data while subqueries are better for smaller amounts of data.

Conclusion

Subqueries are another tool that can be used to deal with data questions. They can be used in the WHERE, SELECT, or FROM clauses. When to use them is a matter of preference and efficiency.

Single System Research Design

Definition

Single system research design is a term that is associated with program evaluation. This form of research design is highly similar to the experimental designs that are taught in a typical research methods book. There are several differences between single-system research design and experimental design.

The first is the context. Single system research design is associated with program evaluation while experimental design is related to hard and social science research. Another difference is how rigorous each research method is. Generally, single system research design is not as rigorous as experimental design.

Among the reasons for this lack of rigor is sampling. Single system research design is intended to assess how well a program is doing. Therefore, the sample size is limited to the number of people participating in the program. Often there is no sample and all participants are also in the study. The sample size is usually larger for experimental design, but again this depends on the context.

Another difference is random assignment. With single system research design random assignment is not possible which means that there is a lack of independence. Sometimes this lack of independence can also be a problem in experimental design as well but not always.

Since the sampling and the lack of independence are problems this leads to problems with external and internal validity as well for single system research design. It is often not possible to generalize due to sample size or to assess cause and effect due to the limitations of single-system research designs.

Finally, because of these issues, it is often unnecessary to use inferential statistics. If no sampling is done there are no inferences to make about the population. However, for single system research design the purpose is to determine the health and state of the program and not to draw strong conclusions. The goal is intervention rather than the development of theory.

To put it simply, single system research design is not concerned with being strongly scientific in the traditional sense like experimental design. The goal of this approach is to assess programs and not necessarily to publish data that would withstand the scrutiny of the peer review process.

Common Steps

There are several steps involved with single system research design. First, an outcome measure or variable needs to be selected and measured several times. Whatever outcome measure is selected it must be reliably measured and must vary with time.

The results of this measurement must then be graphed to see if the program has had any potential influence on the outcome, i.e., a before-and-after effect. However, this depends on the situation. There are times when an outcome measure is tracked first to determine whether there is a problem at all. For example, reading rates may be monitored to see if they ever fall below a certain threshold, which would justify the implementation of a program.

Whatever the case, the graphed data is used to make decisions. Again the focus is on that specific outcome measure and not generalizing or making strong cause and effect claims.

Measuring Over Time

The actual design of a single system research design is the same as in experimental design and includes some of the following and more…

AB
ABA
BAB
ABAB
Some include a C or a different intervention

These design formats are explained in most research design textbooks. The “A” represents the state of the program without an intervention. The “B” represents measurement during the intervention. The goal is to see a difference in the graphs when the intervention is present to provide evidence that the program is working.
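As a rough illustration of an AB design, the base R sketch below compares the average of an outcome measure in each phase. The scores are invented, and four measurements per phase is only an assumption for the example.

```r
# Hypothetical outcome scores measured weekly
scores <- c(12, 11, 13, 12, 16, 17, 18, 17)

# First four weeks are the "A" phase, last four are the "B" phase
phase <- rep(c("A", "B"), each = 4)

# Average outcome within each phase
phase_means <- tapply(scores, phase, mean)

# Difference between the two phases
change <- unname(phase_means["B"] - phase_means["A"])

phase_means  # A = 12, B = 17
change       # 5

# To draw the graph described above, uncomment the next two lines
# plot(scores, type = "b", xlab = "Week", ylab = "Outcome")
# abline(v = 4.5, lty = 2)  # dashed line marks the start of phase B
```

A visible jump between the phases is only suggestive evidence, in keeping with the limits on cause and effect claims discussed earlier.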

How Single System Research Design is Used

To consolidate this information into one place, single system research design is used for the following.

Formative: adjust and enhance an existing program
Summative: appraise results and outcomes of a program
Quality assurance: check compliance with regulations, i.e., audits of behaviors

The new one here is quality assurance. It is similar to formative or summative evaluation, with the difference being a focus on compliance with external standards.

Conclusion

Single system research design is for program evaluators and not really researchers. The difference is what is focused upon. Since it has limited scope, single system research design normally does not meet the standards of science. Despite this, it is still a tool for assessing the strength and quality of a program.

Components of Process Evaluation

Process evaluation is focused on the implementation of a program. There are three main components to a process evaluation and they are…

Program description
Program monitoring
Quality assurance

We will now look at each of these individually.

Program Description

Program description is about documentation for replication. In other words, a program description is used to determine the operational steps of a program if someone else wants to implement the same program. The main challenge of program description is determining what data to find and use. Data can come from clients, staff, program activities, meetings, etc. All of this data has to be organized to explain what the program does and how it does it.

There are some generally recommended steps for this as outlined below.

1. Determine what the program leaders are interested in knowing. This provides a working framework for shaping the data that is collected.
2. Develop a plan for collecting data.
3. Determine stakeholders to interview. These are the people who provide qualitative data about the program.
4. Develop surveys based on step 3. The survey allows you to reach many stakeholders using quantitative means.
5. Conduct the interviews and issue the surveys.
6. Examine any documents about the program.
7. Analyze all information.
8. Share results.

The steps above will allow you to determine what to collect and how to collect it to describe a program.

Program Monitoring

Program monitoring is used to determine what happens within a program and who it happens to. The focus on what happened and to who is to make sure that a program stays focused on its mission and does not wander away from it. Over time, there are changes to a program in terms of the staff, and resources are often reshuffled as other problems arise. This leads to a program losing focus and not staying committed to its original mission.

To prevent a loss of focus program monitoring involves determining what events and activities within a program should be counted. Below are some examples of events and activities that could be counted in many programs.

Number of clients served
Number of new clients served
Number of counseling sessions provided.

As these metrics are gathered it can be determined if the program is staying focused. If these numbers begin to change it will be possible to question and explain why this is happening. For example, the number of clients served may drop due to people moving out of the area. This may lead to the program being shut down or to a change in the demographic of those who are being supported by the program.

Determining what to count can often be decided by looking at the mission, goals, and objectives of a program. The mission is the overarching purpose of a program. The goals are unmeasurable ideas of what the program wants to achieve. The objectives are measurable actions the program takes to achieve its goals.

Quality Assurance

Program monitoring is great for figuring out what is happening but it does not explain how well things are happening. For this reason, we need quality assurance. Quality assurance compares the metrics of the program to an external standard. By doing this, it is possible to determine how well the program is doing.

For example, for a program that supports juvenile offenders, the standard may be that repeat offenders do not exceed 10% of the participants in the program. The 10% value is the standard. If the program stays within this value it would be considered a good program. However, if repeat offenders exceed 10% of the participants then it would be necessary to determine what types of support and adjustments are needed for the program to meet this standard.
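The comparison against a standard is simple arithmetic, sketched below in base R. The participant counts are made up, and treating a rate of exactly 10% as meeting the standard is an assumption of this sketch.

```r
# Hypothetical program counts
participants <- 120
repeat_offenders <- 15

# External standard: repeat offenders should not exceed 10%
standard <- 0.10

# Observed rate of repeat offenders
rate <- repeat_offenders / participants

# TRUE if the program is within the standard, FALSE otherwise
meets_standard <- rate <= standard

rate            # 0.125
meets_standard  # FALSE
```

In this made-up case the program is at 12.5%, so it would need support and adjustments to get back within the standard.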

These standards are often set by outside authorities such as accreditation agencies and/or the government. Sometimes even the funders of the program will have standards. This is often the case when working with government funding.

Conclusion

Process evaluation is a key component of program evaluation. It allows a team to see the immediate actions of a program in terms of measurable metrics. This analysis can help to document a program, determine if it is on the right path, and assess the quality of the program.

Lenin and the Communist Youth League

In 1920, Lenin gave a speech to the Communist Youth League of Russia. In this speech, Lenin lays out some of his theories on education, describes how communists seize power, and explains the ethics of communism.

What makes this speech so fascinating is how it has inspired directly or indirectly many arguments made today to attack the establishment. There are ideas in this speech that seem to come directly from Freire, as well as proponents of the various forms of critical studies found today. In this post, we will look at Lenin’s definition of communism as well as his views on education, power, and ethics.

Definitions

Lenin defines communism as a society in which all things are owned in common and the people work in common. This definition is much broader than the definition that is commonly shared today. Many people define communism or socialism as common ownership of the means of production. Generally, the means of production are controlled by capitalists and many today want to strip the capitalists of the means of production while allowing for individual ownership of consumer items such as cars, houses, clothes, etc.

In other words, consuming is permissible among socialists today but production must be controlled by the people. Lenin’s definition makes it clear that the people, which is really just government bureaucrats, want to control everything under the guise of common ownership.

Educational Views

While speaking to the youth, Lenin made it clear that it is the youth who will be the face of communist society. Realizing their responsibility the young people need to learn. Lenin explains that learning and teaching must be redesigned. Rote learning is not true learning as it lacks practical application. Students should not cram knowledge into their minds. Education must be practical and not theoretical with a need for participation.

Lenin’s critique of rote learning or memorization is similar to Freire’s criticism of banking education in which the teacher deposits knowledge into the students’ heads without any form of critical thought. This style of teaching is oppressive as the student is only going to reproduce the existing society rather than transform it. It would be difficult to prove that Freire was inspired by this particular speech of Lenin but the similarities are interesting.

Freire also talks about Praxis, which is essentially a form of practical political protesting or pushback against the norms of the existing society. Once students have a critical consciousness (awakened to the oppression of the world) they need to mobilize and find ways to resist those who are oppressing them. In other words, just as Lenin stated the need for practical learning, Freire emphasized this in the political education of students.

Lenin also states that books plus struggle is what learning truly is. The choice of the word “struggle” in the English translation is interesting. Stalin and Mao later developed criticism and self-criticism in which people would criticize themselves and other people who were not living up to the expectations of the Communist revolution. People were expected to publicly confess their “sins” and call out the “sins” of others. If your confession wasn’t good enough it could lead to additional consequences. In China, these were called struggle sessions and have been accused of being a form of brainwashing.

The Plan

Lenin provides an example of how criticism was used to gain power. He states that the Communists must criticize the bourgeoisie to arouse hatred of them. Once the bourgeoisie is hated the communists can unite the people (proletariat) to take power. This is what happened in Russia. The Czar and capitalists were criticized, people began to hate them, and the working class seized power under the leadership of the communists.

In the various “studies” of today the same strategy is used. Critical race theory criticizes one racial group to stir up hatred in other groups to unite them and take power. Feminism criticizes men in order to develop hatred among women towards men in order to unite them and take power. Queer studies criticize normalcy to stir up hatred against “normal” people so that the queer will unite and take power. The whole goal is to divide the people so that a revolution takes place between those who are “woke” and those who are not. To see how this strategy was laid out over 100 years ago and is still successful is shocking.

Ethics

Lenin also explains the ethical position of communism. He states clearly that there is no belief in God in the communist worldview. Since there is no God, God is not a source of right or wrong. There is no morality outside of the morality defined by society. In other words, men will decide for themselves what is right or wrong.

Communists have a moral duty to share all resources. Nothing can belong to a person as all resources must be shared. To have private property is to encourage selfishness and is bourgeois. For society to flourish the old ways must be destroyed.

Communists have tried to impose this ethical worldview. However, it never works because people aren’t motivated unless there is something in it for them. Despite this, even to this day, people criticize the capitalist system because it inspires people to work hard for the benefit of themselves and others.

Conclusion

The foundational ideas that Lenin explains here have echoed down over the decades to have powerful effects. Lenin’s views influenced Freire, Lenin’s views influenced the criticism and self-criticism practiced under Stalin and Mao, and Lenin’s views have also influenced the various “studies” that have impacted society today.

Fuzzy Joins with R

In this post, we will look at how you can make joins between datasets using fuzzy criteria in which the matches between the datasets are not 100% the same. In order to do this, we will use the following packages in R.

library(stringdist)
library(stringr)
library(fuzzyjoin)

The “stringdist” package will be used to measure the differences between various strings that we will use. The “stringr” package will be used to manipulate some text. Lastly, the “fuzzyjoin” package will be used to join datasets.

String Distance

The stringdist() function is used to measure the differences between strings. This is measured in many different ways. For our purposes, the distance is measured by the number of changes the function has to make so that the second string is the same as the original string of comparison. Below is code that uses three different methods each to compare the strings in the function.

> stringdist("darrin", "darren", method = "lv")
[1] 1
> stringdist("darrin", "darren", method = "dl")
[1] 1
> stringdist("darrin", "darren", method = "osa")
[1] 1

This code is simple. First, we call the function. Inside the function, the first string is the ground truth string which is the string everything else is compared to. The second string is the other string that is compared to the first one. The method is how the difference is measured. For each example, we picked a different method. “lv” stands for Levenshtein distance, “dl” stands for full Damerau-Levenshtein distance, and “osa” stands for Optimal String Alignment distance. The details of each of these methods can be found by looking at the documentation for the “stringdist” package. Also, note that there are other methods beyond this that are available as well.

The value for these methods is 1, which means that only one change is needed to convert “darren” to “darrin”. Most of the time the methods are highly similar in their results but just to demonstrate, below is an example where the methods disagree.

> stringdist("darrin", "dorirn", method = "lv")
[1] 3
> stringdist("darrin", "dorirn", method = "dl")
[1] 2
> stringdist("darrin", "dorirn", method = "osa")
[1] 2

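Now, the values are different. In short, “dl” and “osa” count the swap of two adjacent letters as a single change, while “lv” does not. The details behind these differences are explained in the documentation.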
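One way to see where the gap comes from is to check the Levenshtein numbers with adist() from base R, which is used here only as a cross-check and is not part of the “stringdist” package.

```r
# adist() ships with base R (utils) and computes Levenshtein distance
lv_distance <- adist("darrin", "dorirn")[1, 1]

# Plain Levenshtein has no "swap" operation, so turning "ri" into "ir"
# costs two substitutions
swap_cost <- adist("ri", "ir")[1, 1]

lv_distance  # 3
swap_cost    # 2
```

Changing “a” to “o” costs one edit and the swapped “ri” costs two under Levenshtein, which gives 3. The “dl” and “osa” methods treat a swap of two adjacent letters as one edit, which is why they report 2.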

amatch()

The amatch() function allows you to compare multiple strings to the ground truth and indicate which one is closest to the original ground truth string. Below is the code and output from this function.

amatch(
  x = "Darrin",
  table = c("Darren", "Daren", "Darien"),
  maxDist = 1,
  method = "lv" 
) 
[1] 1

Here is what we did.

The x argument is the ground truth. In other words, all other strings are compared to the value of x.
The “table” argument contains all the strings that are being compared to the x argument.
“maxDist” is the maximum distance, or number of changes, allowed for a string in the “table” to be considered a match.
“method” is the method used to calculate the distance.
The output is 1. This is the position of the best match in the “table”, not the distance itself. In other words, the first string, “Darren”, is the closest match. If no string were within “maxDist” the function would return NA.
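To make the idea of “closest match” concrete, here is a small base R sketch that computes the distance to every candidate and then picks the smallest one. It uses adist() rather than amatch(), so it is only an approximation of what amatch() does internally.

```r
candidates <- c("Darren", "Daren", "Darien")

# Levenshtein distance from the ground truth to each candidate
distances <- adist("Darrin", candidates)[1, ]

# Position of the smallest distance
best <- which.min(distances)

distances         # 1 2 2
best              # 1
candidates[best]  # "Darren"
```

Only “Darren” is within a distance of 1, which is why the first position comes back as the answer.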

Fuzzy Join

The fuzzy join is used to join tables that have columns that are similar but not the same. Normally, joins work on exact matches but the fuzzy join does not require this. Before we use this function we have to prepare some data. We will modify the “Titanic” dataset to run this example. The “Titanic” dataset is a part of R by default and there is no need for any packages. Below is the code for the data preparation.

Titanic_1<-as.data.frame(Titanic)
Titanic_2<-as.data.frame(Titanic)

Titanic_1$Sex<-str_to_lower(Titanic_1$Sex)
Titanic_1$Age<-str_to_lower(Titanic_1$Age)

Here is what we did.

We saved two copies of the “Titanic” dataset as dataframes. This was done because the fuzzy join function needs dataframes.
Next, we made clear differences between the two datasets. For “Titanic_1” we lowercased the sex and age columns so that there was not an exact match when joining these two dataframes with the fuzzy join function.

We will now use the fuzzy join function. Below is the code followed by the output.

stringdist_join(
  Titanic_1,
  Titanic_2,
  by = c("Age" = "Age","Sex"="Sex"),
  method = "lv",
  max_dist = 1,
  distance_col = "distance"
)

The stringdist_join() function is used to perform the fuzzy join. “Titanic_1” is the x dataset and “Titanic_2” is the y dataset. The “by” argument tells the function which columns are being used in the join. The “method” argument indicates how the distance is calculated between the values in each dataset. The “max_dist” argument is the criterion by which a join is made. In other words, if the distance is greater than one no join will take place. Lastly, the “distance_col” argument creates new columns that show the distance between the compared columns.

Because we did not set the “mode” argument, the function used its default, which is an inner join, so only rows that matched within the maximum distance are returned. All columns from both datasets are present. The columns with “.x” are from “Titanic_1” while the columns with “.y” are from “Titanic_2”. The “.distance” columns tell us the difference when that row of data was compared from each dataset. For example, in row 1 the “Age.distance” is 1. This means that the difference between “Age.x” and “Age.y” is 1. The only difference is that “Age.x” is lowercase while “Age.y” is capitalized.
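If it helps to see the mechanics, a fuzzy join can be imitated in base R by pairing every row with every other row and keeping the pairs that are close enough. This is only a sketch of the idea on tiny invented data frames and is not how the “fuzzyjoin” package is implemented.

```r
# Hypothetical miniature datasets
left  <- data.frame(sex_x = c("male", "female"), stringsAsFactors = FALSE)
right <- data.frame(sex_y = c("Male", "Female"), stringsAsFactors = FALSE)

# by = NULL makes merge() return every combination of rows
pairs <- merge(left, right, by = NULL)

# Levenshtein distance for each pair of values
pairs$distance <- mapply(
  function(a, b) adist(a, b)[1, 1],
  pairs$sex_x, pairs$sex_y,
  USE.NAMES = FALSE
)

# Keep only the pairs within the maximum distance
matched <- pairs[pairs$distance <= 1, ]

nrow(pairs)    # 4
nrow(matched)  # 2
```

Only “male” with “Male” and “female” with “Female” survive, each with a distance of 1 because of the capital letter.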

Conclusion

The tools mentioned here allow you to match data that is different with a clear metric of the difference. This can be powerful when you have to join data that does not have a matching column in both datasets. Therefore, there is a place for the tools in the life of any data analyst who deals with fuzzy data like this.

Using glue_collapse() in R VIDEO

This video will provide examples of how to use the glue_collapse() function from the glue package in R. This function is often used for dealing with strings and we will see a few other tricks as well with this function.

Glue Collapse in R

The glue_collapse() function is another powerful tool from the glue package. In this post, we will look at practical ways to use this function.

Collapsing Text

As you can probably tell from the name, the glue_collapse() function is useful for collapsing a string. In the code below, we will create an object containing several names and use glue_collapse to collapse the values into one string.

> library(glue)
> many_people<-c("Mike","Bob","James","Sam")
> glue_collapse(many_people)
MikeBobJamesSam

In the code above we called the glue package. Next, we created an object called “many_people” which contains several names separated by commas. Lastly, we called the glue_collapse() function, which joins the separate names into one string with nothing between them.

Separate & Last

Another great tool of the glue_collapse() function is the ability to place a separator between the strings and to use a different separator before the last item. This technique helps to make the output highly readable. Below is the code.

> glue_collapse(many_people,sep = ", ", last = ", and ")
Mike, Bob, James, and Sam

In the code above we separate each string with a comma followed by a space in the “sep” argument. The “last” argument tells R what to do before the last word in the string. In this example, we have a comma followed by a space and the word and.
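For comparison, the same two results can be reproduced in base R. The helper collapse_with_last() below is a made-up function written for this illustration and is not part of the glue package.

```r
many_people <- c("Mike", "Bob", "James", "Sam")

# Plain collapse with nothing between the names
squashed <- paste(many_people, collapse = "")

# Hypothetical helper: join with `sep`, but use `last` before the final item
collapse_with_last <- function(x, sep = ", ", last = ", and ") {
  n <- length(x)
  if (n <= 1) {
    return(paste(x, collapse = ""))
  }
  paste0(paste(x[-n], collapse = sep), last, x[n])
}

squashed                         # "MikeBobJamesSam"
collapse_with_last(many_people)  # "Mike, Bob, James, and Sam"
```

The base R version takes more typing, which is part of the appeal of having “sep” and “last” built into glue_collapse().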

Collapse a Dataframe

The glue_collapse() function can also be used with data frames. In the example below, we will take a column from a dataframe and collapse it.

> head(iris$Species)
[1] setosa setosa setosa setosa setosa setosa
Levels: setosa versicolor virginica
> glue_collapse(iris$Species)
setosasetosasetosasetosasetosasetosaseto.....

In the code above, we first take a look at the “Species” column from the “iris” dataset using the head() function. Next, we use the glue_collapse() function on the “Species” column. You can see how the rows are all collapsed into one long string in this example.

glue and glue_collapse working together

You can also use the glue() and glue_collapse() functions together as a team.

> glue(
+   "Hi {more_people}.",
+   more_people = glue_collapse(
+     many_people,sep = ", ",last = ", and "
+   )
+ )
Hi Mike, Bob, James, and Sam.

This code is a little bit more complicated but here is what happened.

On the outside, we start with the glue() function in the first line.
Inside the glue() function we create a string that contains the word Hi and a temporary variable called “more_people”.
Next, we define the temporary variable “more_people” with the help of the glue_collapse() function.
Inside the glue_collapse() function we separate the strings inside the “many_people” object.

As you can see, the ways to combine the glue_collapse() and glue() functions are endless.

Conclusion

The glue_collapse() function is another useful tool that can be used with text data. Knowing what options are available for data analysis makes the process much easier.

Using Glue in R VIDEO

The glue package provides a lot of great tools for working with strings in R. In this video, we will focus on one particular function within this package that is highly useful for various purposes.

Using Glue in R

The glue package in R provides a lot of great tools for building and manipulating strings. In this post, we will look at examples of using just the glue() function from this package.

Paste vs Glue

The paste() function is an older way of achieving the same things that we can achieve with the glue() function. paste() allows you to combine strings. Below we will load our packages and execute a command with the paste() function.

> library(glue)
> library(dplyr)

> people<-"Dan"
> paste("Hello",people)
[1] "Hello Dan"

In the code above, we load the glue and the dplyr package (we will need dplyr later). We then create an object called “people” that contains the string “Dan”. We then used the paste() function to combine the “people” vector with the string “Hello”. The output is at the bottom of the code.

Below is an example of the same output but using the glue() function.

> glue("Hello {people}")
Hello Dan

Inside the glue() function everything is inside quotation marks. However, the object “people” is inside curly braces and this indicates to the glue() function to look for what “people” represents. The printout is the same but without the quotation marks.

Multiple Strings

Below is an example of including multiple strings in the same glue() function

> people<-"Dan"
> people_2<-"Darrell"
> glue("Hello {people} and {people_2}")
Hello Dan and Darrell

In the first two lines above we make our objects. In line 3 we used the glue() function again and inside we included both objects in curly braces.

In another example using multiple strings we will replace text if it meets a certain criterion.

> people<-"Dan"
> people_2<-NA
> glue("Hi {people} and {people_2}",.na="What")
Hi Dan and What

In the code above we start by creating two objects. The second object (people_2) has stored NA. The code in the third line is the same with the exception of the “.na” argument. The “.na” argument is set to the string “What” which tells R to replace any NA values with the string “What”. The output is in the final line.
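To see what the “.na” argument saves you from doing by hand, here is a rough base R equivalent. It is only a comparison sketch and does not use the glue package.

```r
people <- "Dan"
people_2 <- NA

# Without any handling, paste() turns NA into the text "NA"
unhandled <- paste("Hi", people, "and", people_2)

# Swap NA for a fallback value before building the string
fallback <- ifelse(is.na(people_2), "What", people_2)
greeting <- paste("Hi", people, "and", fallback)

unhandled  # "Hi Dan and NA"
greeting   # "Hi Dan and What"
```

With glue() the replacement happens inside the function call, so the extra ifelse() step is not needed.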

Temporary Variables

It is also possible to make variables that are temporary. The temporary variable can be named or unnamed. Below is an example with a named variable.

> glue("Dan is {height} cm tall.",height=175)
Dan is 175 cm tall.

The temporary variable “height” is inside the curly braces. The value for “height” is set inside the function to 175.

It is also possible to have unnamed variables inside the function. Below we will use a function inside the curly braces.

> glue("The average number is {mean(c(2,3,4,5))}")
The average number is 3.5

The example is self-explanatory. We used the mean() function inside the curly braces to get a calculated value. As you can see, the potential is endless.

Using Dataframes

In our last example, we will see how you can create a data frame and use input from one column to create a new column.

> df<-data.frame(column_1="Dan")
> df
  column_1
1      Dan
> df %>% mutate(new_column = glue("Hi {column_1}"))
  column_1 new_column
1      Dan     Hi Dan

Here is what we did.

We made a dataframe called df. This dataframe contains one column called column_1. In column_1 we have the string Dan.
In line 2 we display the values of the dataframe.
Next, we use the mutate() function to create a new column. Inside the mutate function we use the glue function and set it to create a string that uses the word “Hi” in front of the values of column_1.
Lastly, we print out the results.
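For comparison, the same new column can be built in base R without dplyr or glue. The second name in this sketch is added only to show that the operation runs once per row.

```r
# Small example data frame with two rows
df <- data.frame(column_1 = c("Dan", "Darrell"), stringsAsFactors = FALSE)

# paste() is vectorized, so each row gets its own greeting
df$new_column <- paste("Hi", df$column_1)

df$new_column  # "Hi Dan" "Hi Darrell"
```

The glue() version does the same thing inside mutate(), with the column name placed in curly braces instead of being passed as a separate argument.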

Conclusion

The glue package provides many powerful tools for manipulating data. The examples provided here only focus on one function. As such, this package is sure to provide useful ways in which data analysts can modify their data.

Confusing Words for Small Children VIDEO

Communicating with children is always difficult. However, sometimes it is the adult’s fault that children do not understand. The video below provides examples of terms adults love to use that can be hard for children to understand.

Regular Expression with R

Regular expressions are used for a variety of reasons. One of the main reasons is for finding data in your dataset that meets specific criteria. In this post, we will use regular expressions for several different purposes.

Initial Setup

The only package we need is the stringr package. We will also create a vector of names that will serve as our data for the first few examples. Below is the code.

library(stringr)
people<-c("Bob","Brad","Dan","Jason","Tony","Tom")

Commonly Used Symbols

We are going to use the people vector for our data. The first function we will use is the str_detect() function. This function detects strings within your data that meet your criteria. The str_detect() function takes the data as the first argument and then a pattern for the second argument.

What we are going to do is subset the people vector using str_detect(). We want to find all words that start with the letter B. To tell R to look for words that start with B we must use the caret (^) symbol in front of the letter B in the pattern argument. Below is the code and the output.

> people[str_detect(people,pattern = "^B")]
[1] "Bob"  "Brad"

The code starts with the vector people. Next, we place all of the code for searching inside brackets. The brackets are used in this example for subsetting the data or for finding data that meets our criteria. Inside the brackets, we are using the str_detect() function. Inside the function is the data we are subsetting followed by the pattern argument. Inside the quotes, we have the caret symbol which means “at the beginning” followed by the letter B. Our output shows the two words that meet this criteria.

The caret symbol is used to indicate finding letters at the beginning. However, the dollar sign “$” is used to find letters at the end of a string. Below is the code and output for this symbol.

> people[str_detect(people,pattern = "n$")]
[1] "Dan"   "Jason"

The code is mostly the same as in the previous example. The only difference is the pattern, which shows we want words that end with the letter “n”. The output shows two words that meet this criterion.

The next symbol we will learn is the period “.”. The period matches any single character. Placing it after a letter, as in “a.”, finds strings in which the letter “a” is followed by at least one more character. Below is the code and output.

> people[str_detect(people,pattern = "a.")]
[1] "Brad"  "Dan"   "Jason"

Again, the only difference is the pattern. We told R we want to find any words in which the letter “a” is followed by any one character. Three words match: Brad (“ad”), Dan (“an”), and Jason (“as”).
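The difference between searching for “a” and searching for “a.” only shows up when the letter is the last character in a string. The sketch below uses a hypothetical vector, more_names, which is not part of the data above, to illustrate this.

```r
library(stringr)

# Hypothetical names chosen to show the difference; not from the post's data
more_names <- c("Anna", "Brad", "Tom")

# "a" alone: is there a lowercase a anywhere in the string?
has_a <- str_detect(more_names, pattern = "a")
print(has_a)

# "a.": is there a lowercase a followed by at least one more character?
has_a_then_any <- str_detect(more_names, pattern = "a.")
print(has_a_then_any)
```

“Anna” contains a lowercase “a”, but only as its final character, so the second pattern has nothing left for the period to match.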

Multiple Criteria

All of the previous examples were limited to looking for one character. However, there are several shortcuts that allow you to look for whole classes of characters when using regular expressions. For the next examples, we need a different vector of data, and we will now use the str_match_all() function, which returns every match found in each string.

In the code below, we create a new vector that has words and numbers as data. Next, we will use the str_match_all() function to find all strings that contain numbers. To find numbers we will use the “\\d” expression.

> people_and_numbers<-c("Bob","Brad","Dan",1,2,3)
> str_match_all(people_and_numbers,"\\d")
[[1]]
     [,1]

[[2]]
     [,1]

[[3]]
     [,1]

[[4]]
     [,1]
[1,] "1" 

[[5]]
     [,1]
[1,] "2" 

[[6]]
     [,1]
[1,] "3"

The output is a little strange. The actual output is a list. Since there are six strings in our original vector there are six items in our list. The first three items in the list contain nothing because the first three entries in our vector do not contain any numbers. The last three items in the list each contain a number because these are the numbers contained in the original vector.
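Two details in this example are easy to miss, so here is a small sketch of them. It assumes the same people_and_numbers vector and uses str_extract_all(), a related stringr function not used in this post, which returns plain character vectors instead of matrices.

```r
library(stringr)

people_and_numbers <- c("Bob", "Brad", "Dan", 1, 2, 3)

# c() coerces mixed input to a single type, so the numbers become strings
vector_class <- class(people_and_numbers)
print(vector_class)

# Keep only the elements that contain at least one digit
with_digits <- people_and_numbers[str_detect(people_and_numbers, "\\d")]
print(with_digits)

# unlist() flattens the list of matches; empty entries simply drop out
digits <- unlist(str_extract_all(people_and_numbers, "\\d"))
print(digits)
```

Because the numbers are stored as text, a regular expression can search them just like the names.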

The next expression we will learn is for finding word characters, which is the “\\w” expression. This expression will find any letter, digit, or underscore. Below is an example.

> str_match_all(people_and_numbers,"\\w")
[[1]]
     [,1]
[1,] "B" 
[2,] "o" 
[3,] "b" 

[[2]]
     [,1]
[1,] "B" 
[2,] "r" 
[3,] "a" 
[4,] "d" 

[[3]]
     [,1]
[1,] "D" 
[2,] "a" 
[3,] "n" 

[[4]]
     [,1]
[1,] "1" 

[[5]]
     [,1]
[1,] "2" 

[[6]]
     [,1]
[1,] "3"

Notice how the output splits apart the characters in each word. This happens because “\\w” matches one character at a time. Besides this, the output is to be expected.
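As a rough illustration of what counts as a word character, the sketch below runs “\\w” on a hypothetical string, sample_text, that also contains a space and a hyphen. It uses str_extract_all() for compact output and adds the “+” quantifier (one or more), which this post does not otherwise cover, to show one way of keeping whole words together.

```r
library(stringr)

# Hypothetical string, not part of the post's data
sample_text <- "user_1 a-b"

# "\\w" matches letters, digits, and the underscore, one character at a time
single_chars <- str_extract_all(sample_text, "\\w")[[1]]
print(single_chars)

# Adding "+" (one or more) keeps runs of word characters together
whole_words <- str_extract_all(sample_text, "\\w+")[[1]]
print(whole_words)
```

The space and the hyphen are skipped in both cases because neither is a word character.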

We can also indicate that we want only letters. This is done by using brackets and dashes. Below is the code and output.

> str_match_all(people_and_numbers,"[A-Za-z]")
[[1]]
     [,1]
[1,] "B" 
[2,] "o" 
[3,] "b" 

[[2]]
     [,1]
[1,] "B" 
[2,] "r" 
[3,] "a" 
[4,] "d" 

[[3]]
     [,1]
[1,] "D" 
[2,] "a" 
[3,] "n" 

[[4]]
     [,1]

[[5]]
     [,1]

[[6]]
     [,1]

The output is mostly the same. The first three words are split apart. However, the last three items are empty because the numbers do not contain letters.

You can put almost anything inside the brackets. In the example below, we are only looking for vowels.

> str_match_all(people_and_numbers,"[aeiou]")
[[1]]
     [,1]
[1,] "o" 

[[2]]
     [,1]
[1,] "a" 

[[3]]
     [,1]
[1,] "a" 

[[4]]
     [,1]

[[5]]
     [,1]

[[6]]
     [,1]

Now only the first three items contain matches, one lowercase vowel each. The last three items are still in the list, but they are empty.
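Bracket expressions are case sensitive, and a caret placed inside the brackets negates the set. The sketch below uses a hypothetical vector, names_mixed, to show both points. It relies on str_extract_all() rather than str_match_all() only to keep the output short.

```r
library(stringr)

# Hypothetical names; two of them start with an uppercase vowel
names_mixed <- c("Bob", "Amy", "Ed")

# Lowercase vowels only: "A" and "E" are not matched
lower_only <- str_extract_all(names_mixed, "[aeiou]")
print(lower_only)

# Listing both cases inside the brackets picks them up
any_case <- str_extract_all(names_mixed, "[AEIOUaeiou]")
print(any_case)

# A caret inside the brackets means "anything except these characters"
non_vowels <- str_extract_all("Bob", "[^aeiou]")[[1]]
print(non_vowels)
```

Note that the caret means “at the beginning” outside brackets but “not” when it is the first character inside them.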

Conclusion

These are just some of the amazing things that regular expressions allow you to do. Whenever you need to wrestle with text, it is important to remember how regular expressions can help you.

Regular Expressions with R VIDEO

The video below will provide examples of using regular expressions with R. Regular expressions are a great tool for finding various information within strings.

Defining a Program

Programs play a critical role in providing services for individuals in need. In this post, we will look at what a program is and the traits that are often associated with a good program.

Definition

A program is an organized assortment of activities that are designed to achieve specific objectives. For example, a school might put together a math tutoring program. Within this program, there may be activities such as one-on-one tutoring, group tutoring, and peer support. All of these activities are being used to improve participants’ math ability to grade level.

A key point about programs is that the activities are not random but intentional. In other words, the program developers have the end in mind when they select the activities that they will use.

Good Program

There are several characteristics of a good program as well. Good programs will usually have one or more individuals who are directly responsible for the success of the program. These people are usually local staff who are dedicated to the implementation of the activities associated with the program. For our math tutoring program, it would be necessary to place someone in charge and to find tutors as needed to support the students.

Not only do good programs have committed staff, but they also have dedicated financial support in the form of a budget. Money must be set aside to implement the program. This can include paying staff, purchasing equipment/software, and other necessary items. For math tutoring, money may be needed for finding a location, advertising the program, paying tutors, purchasing supplies, etc.

Successful programs also have an identity. A program’s identity is the level of visibility it has with the public. Some programs have a national or international identity such as the United Nations Development Programme (UNDP). The level of identity depends on the purpose of the program. Many programs need a level of identity that reaches a neighborhood or city.

Associated with the program identity is the service philosophy. The service philosophy is the program’s beliefs about who should be served from the target population. Some programs believe they should not turn anybody away while others do not. As an example, our math program might only serve students who are at least one grade below grade level and no more than two grades below. A service philosophy can help a program focus on who it is trying to serve.

Conclusion

The first step in developing a program is to have a clear idea of what a program is and what makes a program good or bad. This post provides a definition of a program as well as the characteristics of a good program. With this foundational knowledge, you can take the first step in developing a program that can help people in need.

How Teachers Address Parental Resistance

Parents are viewed as gatekeepers for their children. For teachers, who have certain ideas and values they want to share, the gatekeepers can help or hinder this process. If the parents provide resistance, some teachers may see them as supporters of the status quo rather than as defenders of the underrepresented and marginalized. In such situations, it leads to a question of who should prevail.

The difference in values between parents and teachers can lead to this struggle over whose values should be shared or taught in the classroom. For teachers, what is right or wrong is often judged through a critical lens, which means looking for who has and does not have power and/or who is representing the powerless and the powerful. If a teacher is convinced that they stand with the oppressed and the parents do not, the teacher may believe that their values and beliefs are of a higher moral character than the parents’ (by being more inclusive/respectful). When this happens, the teacher may be convinced that subverting parental values by any reasonable means is necessary.

Goals of Queer Teachers

The goal of many teachers is to directly disrupt social norms. Often these teachers are inspired by Queer theory or any other critical-inspired belief system, which essentially states that societal norms exclude people who do not conform to existing norms from full participation in society. Therefore, the liberation of these oppressed individuals can only happen when norms are destroyed. Of course, there is no safe space for people who disagree with the idea of a world without norms. People who cannot function in a world without norms would now be just as oppressed as the current people who cannot conform to the existing norms of society. Funnier still, having no norms is a social norm in itself which means there is no such thing as a normless society.

Queer-inspired teachers challenge almost everything. They are against the idea that heterosexual relations are normal (heteronormativity). They are even against the idea that homosexual relations are normal (homonormativity). The reason for this is that the war of the queer is against whatever is normal.

Queer-inspired teachers are also against the idea of childhood innocence concerning sexuality. Inspired by Alfred Kinsey’s research, proponents of this believe that children are sexual beings from birth and should be treated as such. This is one reason for the increased introduction of sexual topics to children at younger and younger ages in schools because this is intended to be liberating.

Many teachers are also focused on investigating multiple viewpoints (as there is no objective truth). The focus is also on political problems to stir angst about injustice through the abusive norms that marginalize individuals and groups. From all of this, the goal is to encourage social action against the current structure and function of society.

How to Address Parental Challenges

To raise normless revolutionaries, teachers have had to find ways to bring their values into the classroom without raising the concerns of gatekeeping parents. One approach that has proven to be successful is inserting controversial ideas into a broader, vague curriculum.

For example, a curriculum may be focused on problem-solving, which is a vague topic to address. During such a curriculum, topics on sexuality, racism, and/or classism are covered from a perspective of problem-solving. If parents object, the teacher can point to the problem-solving emphasis of the curriculum while sharing norm-busting values with the students.

Another way this tactic is used is through inserting side topics from a main curricular topic such as speaking on sexual relationships during a history lesson. Another strategy is using project-based learning which can incorporate almost anything.

The focus is to make sure the controversial material is not taught in isolation but in connection with something that is considered acceptable. This is similar to the wolf in sheep’s clothing analogy. Bad ideas mixed with good do more damage than bad ideas in isolation. Whenever a teacher is attacked about controversial material (e.g., sexuality), they can retreat to the main “theme” of the curriculum, such as problem-solving.

Accommodation is another strategy. In this situation, when the parents complain the teacher acknowledges their concern and states that their child does not have to participate. When controversial information is being taught the child is removed from the classroom. This is essentially an isolation technique that may frustrate the child. When isolated, the child may believe they are missing out and that the main problem is their parents which can drive a wedge between them. The weakness of this approach is that too many kids may need accommodation. This can shut down the teacher’s plans as too many kids cannot be accommodated.

Dialog is the final strategy here. With this approach, the teacher hears the concerns of the parent but doesn’t change anything. The teacher explains things to the parents, stands by their subject matter expertise, and explains how teaching this material prevents the horrors that happen to marginalized people.

Conclusion

The end game is the same: find a way to win over the parents or to work around them. Parents who resist these values are the ones who need to change in the eyes of these teachers. Even though these teachers believe in freedom, it is a freedom in which only their values are accepted.

Goals, Objectives, and Evaluation

In program evaluation, goals, objectives, and the evaluation process work together to provide a team with insights into the success of a program. In this post, we will look at the synergy between these concepts and how they help evaluators of programs.

Goals & Objectives

Goals are long-term ideas that provide a general sense of direction for a program. Usually, goals are not measurable or achievable but rather serve an inspirational purpose in shaping the direction of a program. An example of a goal for a project might be:

Increase the degree of reading comprehension among young children in north Texas

The goal above is a goal because it lacks the details of knowing when this goal is achieved. What does “degree” mean or how much “increase” is necessary? How are “young children” defined? How much time does the program have to achieve any of this? All of these questions and more are addressed when developing objectives.

Objectives are short-term, measurable, achievable, and set guidelines for the type of intervention that a program will provide. Objectives provide the details that are missing from goals. There are different acronyms for developing objectives such as SMART (Specific, Measurable, Achievable, Relevant, and Time-Bound). Another format for objectives that is used in curriculum development is action, condition, and proficiency. Below is an example of an objective that is derived from the example goal mentioned earlier.

By the end of the semester, minority students in the 5th grade class will improve their reading comprehension one grade level through using the reading lab software.

The objective above specifies a clear context (End of semester, 5th-grade minority students). The objective also provides the action or what the students will be doing (using reading software). Lastly, there is a clear sense of knowing when success takes place (one grade-level improvement in reading comprehension). It is also important to show that this objective is linked to the original goal of increasing reading comprehension.

Types of Objectives

Within program evaluation, three types of objectives can be developed. These are process, outcome, and impact objectives.

Process objectives define which activities will be carried out during a program. Process objectives provide evidence that the program did what it planned to do. An example is below.

Enroll all minority students into the reading lab by the end of the first month of school.

It may seem silly to make such an objective but doing so helps to keep the program on track and to make interventions if the objectives are not achieved in the timeline that was set.

Outcome objectives measure the results of an intervention and answer the question “How well did we do?” Below is an example.

By the end of the semester, 90% of the minority students in the 5th grade class will improve their reading comprehension one grade level through using the reading lab software.

The objective above is similar to others but now it has a clear metric for success stipulating the 90% threshold. This objective value helps to determine how well the program did in helping the students.

The last type of objective used in program evaluation is the impact objective. This objective measures the collective results of an intervention and answers the question “So what?” Below is an example.

At the end of the semester, the students will share what they think of the reading program.

The objective above is one way in which the overall impact of the program can be assessed by determining what the target population thinks of the intervention.

Evaluation Plan

The evaluation plan is linked with the objectives. The evaluation plan assesses the achievement of the program through the use of the results of the objectives. Just as there are three types of objectives there are also three types of data that are collected for an evaluation and these are process, outcome, and impact data.

Process data is documentation of the implementation of the strategies of the program and assesses what happened. Examples of process data can include a spreadsheet showing the number of kids who were enrolled in the reading lab. Such documentation shows that the process of enrolling the kids was completed.

Outcome data is a measure of the success or failure of a program. An example of outcome data would be a spreadsheet showing how many kids were able to improve one grade level in their reading comprehension from the use of the reading lab.

Lastly, impact data is data for the impact objective. An example would be the results of the survey that measures students’ opinions of the reading lab.

Conclusion

What was learned here was the cooperation that needs to take place between goals, objectives, and the evaluation process. When these concepts are working together it can benefit all stakeholders of a particular program.

Critical Race Theory as Defined in Education

This post will summarize Gloria Ladson-Billings’ article “Just What is Critical Race Theory and What’s it Doing in a Nice Field like Education?” written in 1998 for the journal “Qualitative Studies in Education.”

Definition

Critical race theory grew out of critical legal studies. Critical legal studies attempted to move the focus of legal scholarship away from doctrinal and policy analysis to a focus on groups in cultural and social contexts. What critical race theory did that was unique was to focus primarily on race instead of other groups such as gender, class, etc. A criticism of critical race theory was its obsession only with race rather than looking at injustice in broader ways.

When dealing with ideas such as critical race theory it is impossible to find consensus on what it is about. However, according to Ladson-Billings, critical race theory has some of the following tenets.

Racism is normal in the US
Racial reform through traditional means is too slow and thus there is a need for radical reform

Race is the main idea discussed within critical race theory. However, race is not just one’s appearance or genetic phenotype. Ladson-Billings states this because who is considered white has changed throughout US history. For example, Mexicans at one time were considered white. Therefore, there is more to race than biology as race is also a social construct. Essentially, one goal of critical race theory is to break the subordination of blacks to whites by changing the dynamics of law and power even though what is defined as white has been fluid throughout history.

For critical race theory scholars, a major problem with America is that “Whiteness” is positioned as normative and everyone is categorized or ranked according to how well they align with the norms of this culture and people group. For example, a black man who goes to college, speaks American English, and dresses in a suit and tie is more aligned with being “white” than a black man who dropped out of high school, uses slang, and wears baggy clothes. However, even the black man who conforms to “whiteness” is a second-class citizen to a person who has the appearance of being white while having the behavior of the unsuccessful black man.

The goal of critical race theory is to deconstruct, reconstruct, and construct equitable power by exposing the injustice of “whiteness” as normative. All of the critical theories do this with the difference being from what angle. Critical race theory attacks race, queer studies attack everything that is normative, fat studies attack norms around weight, etc.

Traditional means of reforming the system are moving too slowly for critical race theorists. Therefore, they want rapid and radical reform. This is a polite way of saying revolution, which is also at the heart of all Marxist-derived philosophies. By stirring up racial frustration it is possible to radicalize people so that they push or cause rapid changes in the system.

Ladson-Billings also discusses the use of storytelling within critical race theory. Storytelling allows the speaker to name their reality and connect emotionally with the listener. Notice how there is no mention of reasoning or thinking as these are Western forms of communication. Sharing emotional stories of how individuals have suffered under racism helps to shame oppressors and elicit anger from people who are not considered “white.” It is difficult to refute the lived experience of someone who has experienced racism without sounding harsh and callous. It is also difficult to dispute the claims of individuals since one cannot fact-check them.

Examples of Race Relations

Ladson-Billings also shares that white people were the main beneficiaries of the Civil Rights movement. She supports this claim with the example of how anti-discrimination laws benefit white women first before people of color. Allowing white women to get jobs first helped their families which were probably also white.

Another example is Brown v. Board of Education. Ladson-Billings states that this court ruling benefited whites by stopping the spread of communism in the USA among frustrated blacks; it also reassured black WW II veterans of their place in society. To be fair, the Soviet Union used to point out the racism in the US during the Cold War.

CRT and Citizenship

The latter half of the article focuses on “whiteness” as property. This argument is not unique to this article. Ladson-Billings’ point is that the US is built on property rights and not individual rights. A person was free because of property ownership and not because of self-worth. This is a problem because blacks did not have property but were rather considered property. Therefore, over time, “whiteness” becomes a form of property that provides privileges that others do not have.

Ladson-Billings then provides examples of how non-whites are pushed to the sides. Within the curriculum, black stories have traditionally been missing in place of the status quo. Another focus has been on supporting a colorblind perspective which may be something no critical race scholar would agree with. Lastly, there is an emphasis on critical thinking, reasoning, and logic in Western schools that discounts other ways of knowing.

When it comes to learning in the classroom, black students are often seen as deficient. However, Ladson-Billings argues that this is due to poor curriculum and teaching. Another major problem has been school funding. Schools receive money from local property taxes. Therefore, schools in nicer neighborhoods have more tax dollars available. For Ladson-Billings, this is unfair and a form of oppression.

Ladson-Billings ends the article with some warnings. First, she warns against letting critical race theory become watered down like cooperative learning and multicultural education. Cooperative learning was originally about helping students of color perform better, but it was eventually reduced to workshops and lesson plans without regard to race. Multicultural education was originally about reconstructing society and examining the contradictions within it. This too was reduced to singing ethnic songs and eating foreign foods.

A much more interesting warning Ladson-Billings made was to protect critical race theory from becoming a tool of the radical left. This warning was not heeded, and the political left has used critical race theory to stir up their base and to galvanize society in ways that make Ladson-Billings’ warning from the late 1990s seem prophetic.

Conclusion

Ladson-Billings’ article provides a great overview of critical race theory and some of its main tenets and beliefs. The merit of this belief system is left to the individual to judge.
