# Principal Component Analysis in R

This post will demonstrate the use of principal component analysis (PCA). PCA is useful for several reasons. First, it allows you to place your examples into groups, similar to linear discriminant analysis, but you do not need to know beforehand what the groups are. Second, PCA is used for dimension reduction. For example, if you have 50 variables, PCA can allow you to reduce this number while retaining a chosen share of the variance. If you are working with a large dataset, this can greatly reduce the computational time and general complexity of your models.

Keep in mind that there really is not a dependent variable, as this is unsupervised learning. What you are trying to see is how different examples can be mapped in space based on whatever independent variables are used. For our example, we will use the “Carseats” dataset from the “ISLR” package. Our goal is to understand the relationship among the variables when examining the shelf location of the car seat. Below is the initial code to begin the analysis.

library(ggplot2)
library(ISLR)
data("Carseats")

We first need to rearrange the data and remove the variables we are not going to use in the analysis. Below is the code.

Carseats1<-Carseats
Carseats1<-Carseats1[,c(1,2,3,4,5,6,8,9,7,10,11)]
Carseats1$Urban<-NULL
Carseats1$US<-NULL

Here is what we did:

1. We made a copy of the “Carseats” data called “Carseats1”.
2. We rearranged the order of the variables so that the factor variables are at the end. This will make sense later.
3. We removed the “Urban” and “US” variables from the table as they will not be a part of our analysis.
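If counting column positions feels error-prone, the same three steps can also be written with column names. The following is only a minimal sketch, not the code used in this post. It runs on a tiny made-up stand-in for the “Carseats” data so that it works without the “ISLR” package; the values and the names “toy”, “kept”, and “reordered” are hypothetical.

```r
# Toy stand-in for Carseats: made-up values, only a few of the real columns
toy <- data.frame(
  Sales     = c(9.5, 11.2, 10.1),
  ShelveLoc = factor(c("Bad", "Good", "Medium")),
  Age       = c(42, 65, 59),
  Urban     = factor(c("Yes", "Yes", "No")),
  US        = factor(c("Yes", "Yes", "No"))
)

# Step 1 and 3: work on a copy and drop the unused columns by name
drop_cols <- c("Urban", "US")
kept <- toy[, !(names(toy) %in% drop_cols)]

# Step 2: put numeric columns first and factor columns last
is_num <- sapply(kept, is.numeric)
reordered <- kept[, c(names(kept)[is_num], names(kept)[!is_num])]

names(reordered)
```

Selecting by name keeps working if the column order of the source data changes, which is the main reason to prefer it over numeric positions.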

We will now do the PCA. We need to scale and center our data; otherwise, the variables with larger numbers will have a much stronger influence on the results than those with smaller numbers. Fortunately, the “prcomp” function has a “scale.” and a “center” argument. We will also use only the first 7 columns for the analysis, as “ShelveLoc” is a factor and is not useful for this analysis. Note that selecting columns 1 to 7 also leaves out “Education”, which sits in column 8 after the reordering. If we hadn’t moved “ShelveLoc” to the end of the data frame, it would cause some headache. Below is the code.

Carseats.pca<-prcomp(Carseats1[,1:7],scale. = TRUE,center = TRUE)
summary(Carseats.pca)
## Importance of components:
##                           PC1    PC2    PC3    PC4    PC5     PC6     PC7
## Standard deviation     1.3315 1.1907 1.0743 0.9893 0.9260 0.80506 0.41320
## Proportion of Variance 0.2533 0.2026 0.1649 0.1398 0.1225 0.09259 0.02439
## Cumulative Proportion  0.2533 0.4558 0.6207 0.7605 0.8830 0.97561 1.00000

The summary of “Carseats.pca” tells us how much of the variance each component explains. Keep in mind that the number of components is equal to the number of variables. The “Proportion of Variance” row tells us the contribution each component makes, and the “Cumulative Proportion” row gives the running total as components are added.
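The two proportion rows in the summary can be recomputed from the standard deviations, and the cumulative row can then be used to pick how many components to keep. The following is a minimal sketch under a few assumptions: it uses the built-in “USArrests” data instead of “Carseats” so that it runs without extra packages, and the 90% threshold and the names “prop_var”, “cum_var”, “n_keep”, and “reduced” are chosen only for illustration.

```r
# PCA on a built-in dataset (stand-in for Carseats), scaled and centered
pca <- prcomp(USArrests, scale. = TRUE, center = TRUE)

# Variance of each component is its standard deviation squared
prop_var <- pca$sdev^2 / sum(pca$sdev^2)  # "Proportion of Variance" row
cum_var  <- cumsum(prop_var)              # "Cumulative Proportion" row

# Hypothetical threshold: keep the fewest components reaching 90% of variance
threshold <- 0.90
n_keep <- which(cum_var >= threshold)[1]

# The component scores live in pca$x; keep only the first n_keep columns
reduced <- as.data.frame(pca$x[, seq_len(n_keep), drop = FALSE])

round(cum_var, 4)
n_keep
```

The “reduced” data frame is what you would carry forward into a later model in place of the original variables.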

If your goal is dimension reduction, then the number of components to keep depends on the threshold you set. For example, if you need close to 90% of the variance, you would keep the first 5 components, which retain about 88%. If you need 95% or more of the variance, you would keep the first six. To actually use the components, you would take the “Carseats.pca$x” data and move it to your data frame.

Keep in mind that the actual components have no conceptual meaning; each is a numerical representation of a combination of several variables that PCA has reduced to fewer variables, such as going from 7 variables to 5. This means that PCA is great for reducing variables for prediction purposes but is much harder to use in explanatory studies unless you can explain what the new components represent.

For our purposes, we will keep 5 components. This means that we have reduced our dimensions from 7 to 5 while still keeping about 88% of the variance. Graphing our results is tricky because we have 5 dimensions, but the human mind can conceptualize 3 at best and normally 2. As such, we will plot the first two components and label them by shelf location using ggplot2. Below is the code.

scores<-as.data.frame(Carseats.pca$x)
pcaplot<-ggplot(scores,aes(PC1,PC2,color=Carseats1$ShelveLoc))+geom_point()
pcaplot

From the plot you can see there is little separation when using the first two components of the PCA. This makes sense, as we can only graph two components, and together they capture only about 46% of the variance. However, for demonstration purposes the analysis is complete.

# Writing as a Process or Product

In writing pedagogy, there are at least two major ways of seeing writing. These two approaches see writing as a process or as a product. This post will explain each along with some of the drawbacks of both.

## Writing as a Product

Writing as a product entails the teacher setting forth standards in terms of rhetoric, vocabulary use, organization, etc. The students are given several different examples that can be used as models on which to base their own paper. The teacher may be available for one-on-one support, but this is not necessarily embedded in the learning experience. In addition, the teacher is probably only going to see the final draft.

For immature writers, this is an intimidating learning experience. To be required to develop a paper with only out-of-context examples from former students is difficult to deal with. In addition, without prior feedback on their progress, students have no idea if they are meeting expectations. The teacher is also clueless as to student progress, and this means that both students and teachers can be “surprised” by poorly written papers and failing students.

The lack of communication while writing can encourage students to try to overcome their weaknesses through plagiarism. This is especially true for ESL students, who lack mastery of the language while also often having different perspectives on what academic dishonesty is.

Another problem is that ‘A’ students will simply copy the examples the teacher provided and just put their own topic and words in them.
This leads to an excellent yet mechanical paper that does not allow the students to develop as writers. In other words, the product approach provides too much support for strong students and not enough support for weak ones.

## Writing as a Process

In writing as a process, the teacher supports the student through several revisions of a paper. The teacher provides support for the development of ideas, organization, coherency, and other aspects of writing. All this is done through the teacher providing feedback to the student as well as dealing with any questions or concerns the student may have about their paper.

This style of teaching writing helps students to understand what kind of writer they are. Students are often so focused on completing writing assignments that they never learn what their tendencies and habits as a writer are. Understanding their own strengths and weaknesses can help them to develop compensatory strategies to complete assignments. This kind of self-discovery can happen through one-on-one conferences with the teacher.

Of course, such personal attention takes a great deal of time. However, even brief 5-minute conferences with students can reap huge rewards in their writing. It also saves time at the end when marking, because you as the teacher are already familiar with what the students are writing about, and the check of the final papers is just to see if the students have revised their papers according to the advice you gave.

The process perspective gives each student individual attention to grow as an individual. ‘A’ students get what they need, as do weaker students. Everyone is compared to their own progress as a writer.

## Conclusion

Generally, the process approach is more appropriate for teaching writing. The exceptions are when the students are unusually competent or are already familiar with your expectations from prior writing experiences.

# Discourse Markers and ESL

Discourse markers are used in writing to help organize ideas.
They are often those “little words” that native speakers use effortlessly as they communicate but that are misunderstood by ESL speakers. This post will provide examples of various discourse markers.

## Logical Sequence

Logical sequence discourse markers are used to place ideas in an order that is comprehensible to the listener or reader. They can be summative, for concluding a longer section, or resultative, for indicating the effect of something.

Examples of summative discourse markers include

• overall, to summarize, therefore, so far

An example of a summative discourse marker is below. The bold word is the marker.

Smoking causes cancer. Studies show that people who smoke have higher rates of lung, esophagus, and larynx cancer. **Therefore**, it is dangerous to smoke.

The paragraph is clear. The marker “Therefore” is summarizing what was said in the prior two sentences.

Examples of resultative discourse markers include the following

• so, consequently, therefore, as a result

An example of a resultative discourse marker is below. The bold phrase is the marker.

Bob smoked cigarettes for 20 years. **As a result**, he developed lung cancer.

Again, the second sentence with the marker “As a result” explains the consequence of smoking for 20 years.

## Contrastive

Contrastive markers are words that indicate that the next idea is the opposite of the previous idea. There are three ways that this can be done. Replacive markers share an alternative idea, and antithetic markers share ideas in opposition to the previous one. Lastly, concessive markers share unexpected information given the context.

Below are several words and phrases that are replacive markers

• alternatively, on the other hand, rather

Below is an example of a replacive contrast marker used in a short paragraph. The bold phrase is the replacive marker.

Smoking is a deadly lifestyle choice. This bad habit has killed millions of people.
**On the other hand**, a vegetarian lifestyle has been found to be beneficial to the health of many people.

Antithetic markers include the following

• conversely, instead, by contrast

Below is an example of an antithetic marker used in a paragraph.

A long and healthy life is unusual for those who choose to smoke. Instead, people who smoke live lives that are shorter and more full of disease and sickness.

Concessive markers include some of the words below

• in spite of, nevertheless, anyway, anyhow

Below is an example of a concessive marker used in a paragraph.

Bob smoked for 20 years. In spite of this, he was an elite athlete and had perfect health.

## Conclusion

Discourse markers play a critical role in communicating the finer points of ideas. Understanding how these words are used can help ESL students in comprehending what they hear and read.

# Developing a Data Analysis Plan

It is extremely common for beginners, and perhaps even experienced researchers, to lose track of what they are trying to achieve when completing a research project. The open nature of research allows for a multitude of equally acceptable ways to complete a project. This leads to an inability to make decisions or stay on course when doing research.

One way to reduce or eliminate this roadblock to decision making and focus in research is to develop a plan. In this post we will look at one version of a data analysis plan.

## Data Analysis Plan

A data analysis plan includes many features of a research project, with a particular emphasis on mapping out how research questions will be answered and what is necessary to answer them. Below is a sample template of the analysis plan.

The majority of this diagram should be familiar to anyone who has ever done research. At the top, you state the problem; this is the overall focus of the paper. Next comes the purpose, which is the overarching goal of a research project.
After the purpose come the research questions. The research questions are questions about the problem that are answerable. People struggle with developing clear and answerable research questions. It is critical that research questions are written in a way that they can be answered and that the questions are clearly derived from the problem. Poor questions mean poor or even no answers.

After the research questions, it is important to know what variables are available for the entire study and specifically what variables can be used to answer each research question. Lastly, you must indicate what analysis or visual you will develop in order to answer your research questions about your problem. This requires you to know how you will answer your research questions.

## Example

Below is an example of a completed analysis plan for a simple undergraduate-level research paper.

In the example above, the student wants to understand the perceptions of university students about the cafeteria food quality and their satisfaction with the university. There are four research questions: a demographic descriptive question, a descriptive question about the two main variables, a comparison question, and lastly a relationship question.

The variables available for answering the questions are listed off to the left side. Under that, the student indicates the variables needed to answer each question. For example, the demographic variables of sex, class level, and major are needed to answer the question about the demographic profile.

The last section is the analysis. For the demographic profile, the student found the percentage of the population in each subgroup of the demographic variables.

## Conclusion

A data analysis plan provides an excellent way to determine what needs to be done to complete a study. It also helps a researcher to clearly understand what they are trying to do and provides a visual for those with whom the researcher wants to communicate about the progress of a study.
# Developing Purpose to Improve Reading Comprehension

Many of us are familiar with the experience of being able to read almost anything but perhaps not being able to understand what it is that we read. As the ability to sound out words becomes automatic, there is not always a corresponding increase in the ability to comprehend text.

It is common, especially in school, for students to be required to read something without much explanation. For more mature readers, what is often needed is a sense of purpose for reading. In this post, we will look at ways to develop a sense of purpose in reading.

## Purpose Provides Motivation

Students who know why they are reading know what they are looking for while reading. The natural result of this is that students are less likely to get distracted by information that is not useful for them.

For example, suppose the teacher tells their students to “read the passage and identify all of the animals in it and be ready to share tomorrow.” Students know what they are supposed to do (identify all animals in the passage) and why they need to do it (share tomorrow). The clear directions prevent students from getting distracted by other information in the reading.

Providing purpose doesn’t necessarily require that the students love and enjoy the rationale, but it is helpful if a teacher can provide a purpose that is motivating.

## Different Ways to Instill Purpose

In addition to the example above, there are several quick ways to provide purpose.

• Provide a vocabulary list: Having the students search for the meaning of specific words provides a clear sense of purpose and provides a context in which the words appear naturally. However, students often get bogged down with the minutiae of the definitions and completely miss the overall meaning of the reading passage. This approach is great for beginning and low intermediate readers.
• Identify the main ideas in the reading: This is a great way to get students to see the “big picture” of a reading.
It is especially useful for short to moderately long readings such as articles and perhaps chapters, and for intermediate to advanced readers in particular.
• Let students develop their own questions about the text: By far my favorite strategy. Students will initially skim the passage to get an idea of what it is about. After this, they develop several questions about the passage that they want to find the answers to. While reading the passage, the students answer their own questions. This approach provides opportunities for metacognition as well as developing autonomous learning skills. This strategy is for advanced readers who are comfortable with vocabulary and summarizing text.

## Conclusion

Students, like most people, need a raison de faire (reason to do) something. The teacher can provide this, which has benefits. Another approach would be to allow the students to develop their own purpose. How this is done depends on the philosophy of the teacher as well as the abilities and tendencies of the students.

# Linear Discriminant Analysis in R

In this post we will look at an example of linear discriminant analysis (LDA). LDA is used to develop a statistical model that classifies examples in a dataset. In the example in this post, we will use the “Star” dataset from the “Ecdat” package. What we will do is try to predict the type of class the students learned in (regular, small, regular with aide) using their math scores, reading scores, and the teaching experience of the teacher. Below is the initial code.

library(Ecdat)
library(MASS)
data(Star)

We first need to examine the data by using the “str” function.

str(Star)
## 'data.frame':    5748 obs. of  8 variables:
##  $ tmathssk: int  473 536 463 559 489 454 423 500 439 528 ...
##  $ treadssk: int  447 450 439 448 447 431 395 451 478 455 ...
##  $ classk  : Factor w/ 3 levels "regular","small.class",..: 2 2 3 1 2 1 3 1 2 2 ...
##  $ totexpk : int  7 21 0 16 5 8 17 3 11 10 ...
##  $ sex     : Factor w/ 2 levels "girl","boy": 1 1 2 2 2 2 1 1 1 1 ...
##  $ freelunk: Factor w/ 2 levels "no","yes": 1 1 2 1 2 2 2 1 1 1 ...
##  $ race    : Factor w/ 3 levels "white","black",..: 1 2 2 1 1 1 2 1 2 1 ...
##  $ schidkn : int  63 20 19 69 79 5 16 56 11 66 ...
##  - attr(*, "na.action")=Class 'omit'  Named int [1:5850] 1 4 6 7 8 9 10 15 16 17 ...
##   .. ..- attr(*, "names")= chr [1:5850] "1" "4" "6" "7" ...

We will use the following variables

• dependent variable = classk (class type)
• independent variable = tmathssk (Math score)
• independent variable = treadssk (Reading score)
• independent variable = totexpk (Teaching experience)

We now need to examine the data visually by looking at histograms for our independent variables and a table for our dependent variable.

hist(Star$tmathssk)

hist(Star$treadssk)

hist(Star$totexpk)

prop.table(table(Star$classk))
## 
##           regular       small.class regular.with.aide 
##         0.3479471         0.3014962         0.3505567

The data mostly looks good. The results of the “prop.table” function will help us when we develop our training and testing datasets. The only problem is with the “totexpk” variable. It is not anywhere near normally distributed. To deal with this, we will use the square root of teaching experience. Below is the code.

star.sqrt<-Star
star.sqrt$totexpk.sqrt<-sqrt(star.sqrt$totexpk)
hist(sqrt(star.sqrt$totexpk))

Much better. We now need to check the correlation among the variables as well and we will use the code below.

cor.star<-data.frame(star.sqrt$tmathssk,star.sqrt$treadssk,star.sqrt$totexpk.sqrt)
cor(cor.star)
##                        star.sqrt.tmathssk star.sqrt.treadssk
## star.sqrt.tmathssk             1.00000000          0.7135489
## star.sqrt.treadssk             0.71354889          1.0000000
## star.sqrt.totexpk.sqrt         0.08647957          0.1045353
##                        star.sqrt.totexpk.sqrt
## star.sqrt.tmathssk                 0.08647957
## star.sqrt.treadssk                 0.10453533
## star.sqrt.totexpk.sqrt             1.00000000

The math and reading scores are correlated at about 0.71, but none of the correlations are too bad. We can now develop our model using linear discriminant analysis. First, we need to scale our scores because the test scores and the teaching experience are measured differently. Then, we need to divide our data into a train and test set, as this will allow us to determine the accuracy of the model. Below is the code.

star.sqrt$tmathssk<-scale(star.sqrt$tmathssk)
star.sqrt$treadssk<-scale(star.sqrt$treadssk)
star.sqrt$totexpk.sqrt<-scale(star.sqrt$totexpk.sqrt)
train.star<-star.sqrt[1:4000,]
test.star<-star.sqrt[4001:5748,]

Now we develop our model. In the code below, the “prior” argument indicates what we expect the probabilities to be. In our data the distribution of the three class types is about the same, which means that the prior probability is 1/3 for each class type.

train.lda<-lda(classk~tmathssk+treadssk+totexpk.sqrt, data = train.star,prior=c(1,1,1)/3)
train.lda
## Call:
## lda(classk ~ tmathssk + treadssk + totexpk.sqrt, data = train.star, 
##     prior = c(1, 1, 1)/3)
## 
## Prior probabilities of groups:
##           regular       small.class regular.with.aide 
##         0.3333333         0.3333333         0.3333333 
## 
## Group means:
##                      tmathssk    treadssk totexpk.sqrt
## regular           -0.04237438 -0.05258944  -0.05082862
## small.class        0.13465218  0.11021666  -0.02100859
## regular.with.aide -0.05129083 -0.01665593   0.09068835
## 
## Coefficients of linear discriminants:
##                      LD1         LD2
## tmathssk      0.89656393 -0.04972956
## treadssk      0.04337953  0.56721196
## totexpk.sqrt -0.49061950  0.80051026
## 
## Proportion of trace:
##    LD1    LD2 
## 0.7261 0.2739

The printout is mostly readable.
At the top is the actual code used to develop the model, followed by the prior probabilities of each group. The next section shares the means of the groups. The coefficients of linear discriminants are the values used to classify each example. The coefficients are similar to regression coefficients. The computer places each example in both equations and probabilities are calculated. Whichever class has the highest probability is the winner. In addition, the larger the coefficient in absolute value, the more weight it has. For example, “tmathssk” is the most influential on LD1 with a coefficient of 0.89. The proportion of trace is similar to the proportion of variance in principal component analysis: it is the share of the between-group variance that each discriminant function accounts for.

Now we will take the trained model and see how it does with the test set. We create a new object called “predict.lda” using our “train.lda” model and the test data called “test.star”.

predict.lda<-predict(train.lda,newdata = test.star)

We can use the “table” function to see how well our model has done. We can do this because we actually know what class our data is beforehand because we divided the dataset. What we need to do is compare this to what our model predicted. Therefore, we compare the “classk” variable of our “test.star” dataset with the “class” predicted by the “predict.lda” model.

table(test.star$classk,predict.lda$class)
##                    
##                     regular small.class regular.with.aide
##   regular               155         182               249
##   small.class           145         198               174
##   regular.with.aide     172         204               269

The results are pretty bad. For example, in the first row called “regular” we have 155 examples that were actually “regular” and predicted as “regular” by the model. In the next column, 182 examples that were actually “regular” were predicted as “small.class”, etc. To find out how well our model did, you add together the examples across the diagonal from left to right and divide by the total number of examples. Below is the code.

(155+198+269)/1748
## [1] 0.3558352

Only 36% accurate, terrible but ok for a demonstration of linear discriminant analysis.
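The accuracy calculation can also be done without typing the diagonal by hand. The sketch below rebuilds the confusion table from the counts printed above so that it runs on its own; the names “conf”, “accuracy”, and “per_class” are hypothetical and are not part of the original analysis.

```r
# Confusion table copied from the output above (rows = actual, columns = predicted)
classes <- c("regular", "small.class", "regular.with.aide")
conf <- matrix(
  c(155, 182, 249,
    145, 198, 174,
    172, 204, 269),
  nrow = 3, byrow = TRUE,
  dimnames = list(actual = classes, predicted = classes)
)

# Overall accuracy: correctly classified examples (the diagonal) over all examples
accuracy <- sum(diag(conf)) / sum(conf)

# Share of each actual class that the model got right
per_class <- diag(conf) / rowSums(conf)

accuracy
round(per_class, 3)
```

With the real objects, the same two lines would work on the result of the “table” call. Since there are three classes of roughly equal size, guessing at random would be right about a third of the time, so 36% is only slightly better than chance.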
Since we have only two discriminant functions, or two dimensions, we can plot our model. Below I provide a visual of the first 50 examples classified by the predict.lda model.

plot(predict.lda$x[1:50])
text(predict.lda$x[1:50],as.character(predict.lda$class[1:50]),col=as.numeric(predict.lda$class[1:50]))
abline(h=0,col="blue")
abline(v=0,col="blue")

Note that “predict.lda$x[1:50]” picks out the first 50 values of LD1 only, so as written the horizontal axis is the row index and the vertical axis is the LD1 score. To plot LD1 against LD2 instead, use “predict.lda$x[1:50,]” in both calls. The vertical line doesn’t seem to discriminate anything, as it is off to the side and not separating any of the data. However, the horizontal line does a good job of dividing the “regular.with.aide” from the “small.class”. Yet, there are problems with distinguishing the class “regular” from either of the other two groups. In order to improve our model, we need additional independent variables to help distinguish the groups in the dependent variable.

# Factors that Affect Pronunciation

Understanding and teaching pronunciation has been controversial in TESOL for many years. At one time, pronunciation was taught in a highly bottom-up, behavioristic manner. Students were drilled until they had the appropriate “accent” (American, British, Australian, etc.). To be understood meant capturing one of the established accents.

Now there is more of an emphasis on top-down features such as stress, tone, and rhythm. There is also an emphasis on being more non-directive and focusing not on the sounds being generated by the student but on the comprehensibility of what they say.

This post will explain several common factors that influence pronunciation. These common factors include

• Motivation & Attitude
• Age & Exposure
• Native language
• Natural ability

## Motivation & Language Ego

For many people, it’s hard to get something done when they don’t care. Excellent pronunciation is often affected by motivation. If the student does not care, they will probably not improve much. This is particularly true when the student reaches a level where people can understand them. Once they are comprehensible, many students lose interest in further pronunciation development.

Fortunately, a teacher can use various strategies to motivate students to focus on improving their pronunciation.
Creating relevance is one way in which students’ intrinsic motivation can be developed.

Attitude is closely related to motivation. If the students have negative views of the target language and are worried that learning the target language is a cultural threat, this will make language acquisition difficult. Students need to understand that language learning does involve learning the culture of the target language.

## Age & Exposure

Younger students, especially those 1-12 years of age, have the best chance of developing native-like pronunciation. If the student is older, they will almost always retain an “accent.” However, fluency and accuracy can reach the same levels regardless of the initial age at which language study began.

Exposure is closely related to age. The more authentic experiences that a student has with the language, the better their pronunciation normally is. The quality of the exposure depends on the naturalness of the setting and the actual engagement of the student in hearing and interacting with the language.

For example, an ESL student who lives in America will probably have much more exposure to the actual use of English than someone in China. This in turn will impact their pronunciation.

## Native Language

The similarities between the mother tongue and the target language can influence pronunciation. For example, it is much easier to move from Spanish to English pronunciation than from Chinese to English.

For the teacher, understanding the sound systems of your students’ languages can help a great deal in helping them with difficulties in pronunciation.

## Innate Ability

Lastly, some just get it while others don’t. Different students have varying ability to pick up the sounds of another language. A way around this is helping students to know their own strengths and weaknesses. This will allow them to develop strategies to improve.

## Conclusion

Whatever your position on pronunciation, there are ways to improve your students’ pronunciation if you are familiar with what influences it. The examples in this post provide some basic insight into what affects it.

# Tips for Developing Techniques for ESL Students

Technique development is the actual practice of TESOL. All of the ideas expressed in approaches and methods are just ideas. The development of a technique is the application of knowledge in a way that benefits the students. This post will provide ideas and guidelines on developing speaking and listening techniques.

## Techniques should Encourage Intrinsic Motivation

When developing techniques for your students, you need to consider the goals, abilities, and interests of the students whenever possible. If the students are older adults who want to develop conversational skills, a heavy stress on reading would be demotivating. This is because reading was not one of the students’ goals.

When techniques do not align with student goals, there is a loss of relevance, which is highly demotivating. Of course, as the teacher, you do not always give them what they want, but general practice suggests some sort of dialog over the direction of the techniques.

## Techniques should be Authentic

The point here is closely related to the first one on motivation. Techniques should generally be as authentic as possible. If you have a choice between a real text and a textbook, it is usually better to go with the real-world text.

Realistic techniques provide a context in which students can apply their skills in a setting that is similar to the world but within the safety of a classroom.

## Techniques should Develop Skills through Integration and Isolation

When developing techniques, there should be a blend of techniques that develop skills in an integrated manner, such as listening and speaking or some other combination. There should also be an equal focus on techniques that develop one skill, such as writing. The reason for this is so that the students develop balanced skills.
Skill-integrated techniques are highly realistic, but students can use one skill to compensate for weaknesses in others. For example, a talker just keeps on talking without ever really listening.

When skills are worked on in isolation, it allows for deficiencies to be clearly identified and worked on. Doing this will only help the students in integrated situations.

## Encourage Strategy Development

Through techniques, students need to develop their abilities to learn on their own autonomously. This can be done by having students practice learning strategies you have shown them in the past. Examples include using context clues, finding main ideas, distinguishing facts from opinions, etc.

The development of skills takes a well-planned approach to how you will teach and provide students with the support to succeed.

## Conclusion

Understanding some of the criteria that can be used in creating techniques for the ESL classroom is beneficial for teachers. The ideas presented here provide some basic guidance for enabling technique development.

# Generalized Additive Models in R

In this post, we will learn how to create a generalized additive model (GAM). GAMs are a non-parametric extension of generalized linear models. This means that the linear predictor of the model uses smooth functions of the predictor variables. As such, you do not need to specify the functional relationship between the response and the continuous variables. This allows you to explore the data for potential relationships that can be more rigorously tested with other statistical models.

In our example, we will use the “Auto” dataset from the “ISLR” package and use the variables “mpg”, “displacement”, “horsepower”, and “weight” to predict “acceleration”. We will also use the “mgcv” package. Below is some initial code to begin the analysis.

library(mgcv)
library(ISLR)
data(Auto)

We will now make the model. We want to understand the response of “acceleration” to the explanatory variables “mpg”, “displacement”, “horsepower”, and “weight”.
After setting the model we will examine the summary. Below is the code.

model1<-gam(acceleration~s(mpg)+s(displacement)+s(horsepower)+s(weight),data=Auto)
summary(model1)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## acceleration ~ s(mpg) + s(displacement) + s(horsepower) + s(weight)
## 
## Parametric coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 15.54133    0.07205   215.7   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                   edf Ref.df      F  p-value    
## s(mpg)          6.382  7.515  3.479  0.00101 ** 
## s(displacement) 1.000  1.000 36.055 4.35e-09 ***
## s(horsepower)   4.883  6.006 70.187  < 2e-16 ***
## s(weight)       3.785  4.800 41.135  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.733   Deviance explained = 74.4%
## GCV = 2.1276  Scale est. = 2.0351    n = 392

All of the explanatory variables are significant and the adjusted R-squared is 0.73, which is excellent. edf stands for “effective degrees of freedom”. This modified version of the degrees of freedom is due to the smoothing process in the model. An edf of 1, as for “displacement”, corresponds to a straight-line fit. GCV stands for generalized cross validation, and this number is useful when comparing models. The model with the lowest number is the better model.

We can also examine the model visually by using the “plot” function. This will allow us to examine if the curvature fitted by the smoothing process was useful or not for each variable. Below is the code.

plot(model1)

We can also look at a 3d graph that shows the linear predictor against two of the predictors, which by default are the first two in the model. This is done with the “vis.gam” function. Below is the code.

vis.gam(model1)

If multiple models are developed, you can compare the GCV values to determine which model is the best. In addition, another way to compare models is with the “AIC” function.
In the code below, we will create an additional model that includes “year”, compare the GCV scores, and calculate the AIC.

model2<-gam(acceleration~s(mpg)+s(displacement)+s(horsepower)+s(weight)+s(year),data=Auto)
summary(model2)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## acceleration ~ s(mpg) + s(displacement) + s(horsepower) + s(weight) +
##     s(year)
##
## Parametric coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) 15.54133    0.07203   215.8   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
##                   edf Ref.df      F p-value
## s(mpg)          5.578  6.726  2.749  0.0106 *
## s(displacement) 2.251  2.870 13.757 3.5e-08 ***
## s(horsepower)   4.936  6.054 66.476 < 2e-16 ***
## s(weight)       3.444  4.397 34.441 < 2e-16 ***
## s(year)         1.682  2.096  0.543  0.6064
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) =  0.733   Deviance explained = 74.5%
## GCV = 2.1368  Scale est. = 2.0338    n = 392
#model1 GCV
model1$gcv.ubre
##   GCV.Cp
## 2.127589
#model2 GCV
model2$gcv.ubre
##   GCV.Cp
## 2.136797

As you can see, the second model has a higher GCV score when compared to the first model. This indicates that the first model is a better choice. This makes sense because in the second model the variable “year” is not significant. To confirm this we will calculate the AIC scores using the “AIC” function.

AIC(model1,model2)
##              df      AIC
## model1 18.04952 1409.640
## model2 19.89068 1411.156

Again, you can see that model1 is better due to its fewer degrees of freedom and slightly lower AIC score.

Conclusion

Using GAMs is most common for exploring potential relationships in your data. This is because they are difficult to interpret and summarize. Therefore, it is normally better to develop a generalized linear model over a GAM due to the difficulty in understanding what the data is trying to tell you when using GAMs.

# Listening Techniques for the ESL Classroom

Listening is one of the four core skills of language acquisition along with reading, writing, and speaking. This post will explain several broad categories of listening that can happen within the ESL classroom.

Reactionary Listening

Reactionary listening involves having the students listen to an utterance and repeat it back to you as the teacher. The student is not generating any meaning. This can be useful for developing pronunciation in speaking.

Common techniques that utilize reactionary listening are drills and choral speaking. Both of these techniques are commonly associated with audiolingualism.

Responsive Listening

Responsive listening requires the student to create a reply to something that they heard. Not only does the student have to understand what was said, but they must also be able to generate a meaningful reply. The response can be verbal, such as answering a question, and/or non-verbal, such as obeying a command.

Common techniques that are responsive in nature include anything that involves asking questions and/or obeying commands.
As such, almost all methods and approaches have some aspect of responsive listening in them.

Discriminatory Listening

Discriminatory listening techniques involve listening that is selective. The listener needs to identify what is important from a dialog or monologue. The listener might need to identify the name of a person, the location of something, or develop the main idea of the recording.

Discriminatory listening is probably a universal technique used by almost everyone. It is also popular with English proficiency tests such as the IELTS.

Intensive Listening

Intensive listening is focused on breaking down what the student has heard into various aspects of grammar and speaking. Examples include intonation, stress, phonemes, contractions, etc.

This is more of an analytical approach to listening. In particular, using intensive listening techniques may be useful to help learners understand the nuances of the language.

Extensive Listening

Extensive listening is about listening to a monologue or dialog and developing an overall summary and comprehension of it. Examples of this could be having students listen to a clip from a documentary or a newscast.

Again, this is so common in language teaching that almost all styles incorporate this in one way or another.

Interactive Listening

Interactive listening is the mixing of all of the previously mentioned types of listening simultaneously. Examples include role plays, debates, and various other forms of group work.

All of the examples mentioned require repeating what others say (reactionary), replying to others’ comments (responsive), identifying main ideas (discriminatory & extensive), and perhaps some focus on intonation and stress (intensive). As such, interactive listening is the goal of listening in a second language.

Interactive listening is used by most methods, most notably communicative language teaching, which has had a huge influence on the last 40 years of TESOL.

Conclusion

The listening technique categories provided here give some insight into how one can organize various listening experiences in the classroom. What combination of techniques to employ depends on many different factors, but knowing what’s available empowers the teacher to determine what course of action to take.

# Wire Framing with Moodle

Before teaching a Moodle course it is critical that a teacher design what they want to do. Many teachers believe that they begin the design process by going to Moodle and adding activities and other resources to their class. For someone who is thoroughly familiar with Moodle and has developed courses before, this might work. However, the majority of online teachers need to wire frame what they want their Moodle course to look like online.

Why Wire Frame a Moodle Course

In the world of web developers a wire frame is a prototype of what a potential website will look like. The actual wire frame can be made in many different platforms, from Word and PowerPoint to just paper and pencil. Since Moodle is online, a Moodle course is in many ways a website, so wire framing applies to this context.

It doesn’t matter how you wire frame your Moodle course. What matters is that you actually do this. Designing what you want to see in your course helps you to make decisions much faster when you are actually adding activities and resources to your Moodle course. It also helps your Moodle support to help you if they have a picture of what you want rather than wild hand gestures and frustration.

Wire framing a course also reduces the cognitive load on the teacher. Instead of designing and building the course at the same time, wire framing splits this task into two steps: designing and then building. This prevents extreme frustration, as it is common for a teacher just to stare at the computer screen when trying to design and develop a Moodle course simultaneously.
You never see an architect making his plans while building the building. This would seem careless and even dangerous because the architect doesn’t even know what he wants while he is throwing around concrete and steel. The same analogy applies to designing Moodle courses. A teacher must know what they want, write it down, and then implement it by creating the course.

Another benefit of planning in Word is that it is easier to change things in Word when compared to Moodle. Moodle is amazing but it is not easy to use for those who are not tech-savvy. However, it’s easy for most of us to copy, paste, and edit in Word.

One Way to Wire Frame a Moodle Course

When supporting teachers to wire frame a Moodle course, I always encourage them to start by developing the course in Microsoft Word. The reason is that the teacher is already familiar with Word and does not have to struggle to make decisions when using it. This helps them to focus on content and not on how to use Microsoft Word.

One of the easiest ways to wire frame a Moodle course is to take the default topics of a course such as General Information, Week 1, Week 2, etc. and copy these headings into Word, as shown below.

Now, all that is needed is to type in, using bullets, exactly what activities and resources you want in each section. It is also possible to add pictures and other content to the Word document that can be added to Moodle later. Below is a preview of a generic Moodle sample course with the general info and week 1 of the course completed.

You can see for yourself how this class is developed. The General Info section has an image to serve as a welcome and includes the name of the course. Under this are the course outline and rubrics for the course. The information in the parentheses indicates what type of module it is.

For Week 1, there are several activities. There is a forum for introducing yourself. A page that shares the objectives of that week.
Following this are the readings for the week, then a discussion forum, and lastly an assignment. This process repeats for however many weeks or topics you have in the course.

Depending on your need to plan, you can even plan other pages on the site besides the main page. For example, I can wire frame what I want my “Objectives” page to look like or even the discussion topics for my “Discussion” forum.

Of course, the ideas for all these activities come from the course outline or syllabus that was developed first. In other words, before we even wire frame we have some sort of curriculum document with what the course needs to cover.

Conclusion

The example above is an extremely simple way of utilizing the power of wire framing. With this template, you can confidently go to Moodle and find the different modules to make your class come to life. Trying to conceptualize this in your head is possible but much more difficult. As such, thorough planning is a hallmark of learning.

# Generalized Models in R

Generalized linear models are another way to approach linear regression. The advantage of GLM is that it allows the error to follow many different distributions rather than only the normal distribution, which is an assumption of traditional linear regression.

Often GLM is used for response or dependent variables that are binary or represent count data. This post will provide a brief explanation of GLM as well as provide an example.

Key Information

There are three important components to a GLM and they are

• Error structure
• Linear predictor
• Link function

The error structure is the type of distribution you will use in generating the model. There are many different distributions in statistical modeling such as binomial, Gaussian, Poisson, etc. Each distribution comes with certain assumptions that govern its use.

The linear predictor is the sum of the effects of the independent variables.
Lastly, the link function determines the relationship between the linear predictor and the mean of the dependent variable. There are many different link functions, and the best link function is the one that reduces the residual deviance the most.

In our example, we will try to predict if a house will have air conditioning based on the interaction between the number of bedrooms and bathrooms, the number of stories, and the price of the house. To do this, we will use the “Housing” dataset from the “Ecdat” package. Below is some initial code to get started.

library(Ecdat)
data("Housing")

The dependent variable “airco” in the “Housing” dataset is binary. This calls for us to use a GLM. To do this we will use the “glm” function in R. Furthermore, in our example, we want to determine if there is an interaction between the number of bedrooms and bathrooms. Interaction means that the influence of the two independent variables (bathrooms and bedrooms) on the dependent variable (airco) is not additive, which means that the combined effect of the independent variables is different than if you just added them together. Below is the code for the model followed by a summary of the results.

model<-glm(Housing$airco ~ Housing$bedrooms * Housing$bathrms + Housing$stories + Housing$price, family=binomial)
summary(model)
##
## Call:
## glm(formula = Housing$airco ~ Housing$bedrooms * Housing$bathrms +
##     Housing$stories + Housing$price, family = binomial)
##
## Deviance Residuals:
##     Min       1Q   Median       3Q      Max
## -2.7069  -0.7540  -0.5321   0.8073   2.4217
##
## Coefficients:
##                                    Estimate Std. Error z value Pr(>|z|)
## (Intercept)                      -6.441e+00  1.391e+00  -4.632 3.63e-06
## Housing$bedrooms                  8.041e-01  4.353e-01   1.847   0.0647
## Housing$bathrms                   1.753e+00  1.040e+00   1.685   0.0919
## Housing$stories                   3.209e-01  1.344e-01   2.388   0.0170
## Housing$price                     4.268e-05  5.567e-06   7.667 1.76e-14
## Housing$bedrooms:Housing$bathrms -6.585e-01  3.031e-01  -2.173   0.0298
##
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
##     Null deviance: 681.92  on 545  degrees of freedom
## Residual deviance: 549.75  on 540  degrees of freedom
## AIC: 561.75
##
## Number of Fisher Scoring iterations: 4

To check how good our model is we need to check for overdispersion as well as compare this model to other potential models. Overdispersion is a measure of whether there is too much variability in the model. It is calculated by dividing the residual deviance by the residual degrees of freedom. Below is the calculation.

549.75/540
## [1] 1.018056

Our answer is 1.02, which is pretty good: a ratio close to 1 suggests no overdispersion, while a ratio well above 1 suggests too much variability. Now we will make several models and compare their results.

Model 2

#add recroom and garagepl
model2<-glm(Housing$airco ~ Housing$bedrooms * Housing$bathrms + Housing$stories + Housing$price + Housing$recroom + Housing$garagepl, family=binomial)
summary(model2)
##
## Call:
## glm(formula = Housing$airco ~ Housing$bedrooms * Housing$bathrms +
##     Housing$stories + Housing$price + Housing$recroom + Housing$garagepl,
##     family = binomial)
##
## Deviance Residuals:
##     Min       1Q   Median       3Q      Max
## -2.6733  -0.7522  -0.5287   0.8035   2.4239
##
## Coefficients:
##                                    Estimate Std. Error z value Pr(>|z|)
## (Intercept)                      -6.369e+00  1.401e+00  -4.545 5.51e-06
## Housing$bedrooms                  7.830e-01  4.391e-01   1.783   0.0745
## Housing$bathrms                   1.702e+00  1.047e+00   1.626   0.1039
## Housing$stories                   3.286e-01  1.378e-01   2.384   0.0171
## Housing$price                     4.204e-05  6.015e-06   6.989 2.77e-12
## Housing$recroomyes                1.229e-01  2.683e-01   0.458   0.6470
## Housing$garagepl                  2.555e-03  1.308e-01   0.020   0.9844
## Housing$bedrooms:Housing$bathrms -6.430e-01  3.054e-01  -2.106   0.0352
##
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
##     Null deviance: 681.92  on 545  degrees of freedom
## Residual deviance: 549.54  on 538  degrees of freedom
## AIC: 565.54
##
## Number of Fisher Scoring iterations: 4
#overdispersion calculation
549.54/538
## [1] 1.02145

Model 3

model3<-glm(Housing$airco ~ Housing$bedrooms * Housing$bathrms + Housing$stories + Housing$price + Housing$recroom + Housing$fullbase + Housing$garagepl, family=binomial)
summary(model3)
##
## Call:
## glm(formula = Housing$airco ~ Housing$bedrooms * Housing$bathrms +
##     Housing$stories + Housing$price + Housing$recroom + Housing$fullbase +
##     Housing$garagepl, family = binomial)
##
## Deviance Residuals:
##     Min       1Q   Median       3Q      Max
## -2.6629  -0.7436  -0.5295   0.8056   2.4477
##
## Coefficients:
##                                    Estimate Std. Error z value Pr(>|z|)
## (Intercept)                      -6.424e+00  1.409e+00  -4.559 5.14e-06
## Housing$bedrooms                  8.131e-01  4.462e-01   1.822   0.0684
## Housing$bathrms                   1.764e+00  1.061e+00   1.662   0.0965
## Housing$stories                   3.083e-01  1.481e-01   2.082   0.0374
## Housing$price                     4.241e-05  6.106e-06   6.945 3.78e-12
## Housing$recroomyes                1.592e-01  2.860e-01   0.557   0.5778
## Housing$fullbaseyes              -9.523e-02  2.545e-01  -0.374   0.7083
## Housing$garagepl                 -1.394e-03  1.313e-01  -0.011   0.9915
## Housing$bedrooms:Housing$bathrms -6.611e-01  3.095e-01  -2.136   0.0327
##
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
##     Null deviance: 681.92  on 545  degrees of freedom
## Residual deviance: 549.40  on 537  degrees of freedom
## AIC: 567.4
##
## Number of Fisher Scoring iterations: 4
#overdispersion calculation
549.4/537
## [1] 1.023091
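
The same ratio can be pulled straight from a fitted model rather than typed in by hand. The sketch below uses a hypothetical helper, “dispersion_ratio”, built on the base R functions “deviance” and “df.residual”. To stay self-contained it fits a small logistic model to the built-in “mtcars” data, which is only a stand-in and not part of this analysis.

```r
# Hypothetical helper: residual deviance divided by residual degrees of freedom
dispersion_ratio <- function(fit) {
  deviance(fit) / df.residual(fit)
}

# Stand-in example on a built-in dataset (not the Housing data)
fit <- glm(am ~ wt, data = mtcars, family = binomial)
dispersion_ratio(fit)

# The three ratios from this post, computed from the printed deviances
ratios <- c(model = 549.75 / 540, model2 = 549.54 / 538, model3 = 549.40 / 537)
round(ratios, 3)

# Name of the model with the smallest ratio
names(which.min(ratios))
```

With the models from this post you would call the helper as “dispersion_ratio(model)”, “dispersion_ratio(model2)”, and so on.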

Now we can assess the models by using the “anova” function with the “test” argument set to “Chi” for the chi-square test.

anova(model, model2, model3, test = "Chi")
## Analysis of Deviance Table
##
## Model 1: Housing$airco ~ Housing$bedrooms * Housing$bathrms + Housing$stories +
##     Housing$price
## Model 2: Housing$airco ~ Housing$bedrooms * Housing$bathrms + Housing$stories +
##     Housing$price + Housing$recroom + Housing$garagepl
## Model 3: Housing$airco ~ Housing$bedrooms * Housing$bathrms + Housing$stories +
##     Housing$price + Housing$recroom + Housing$fullbase + Housing$garagepl
##   Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1       540     549.75
## 2       538     549.54  2  0.20917   0.9007
## 3       537     549.40  1  0.14064   0.7076
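
To see where the “Pr(>Chi)” column comes from, the p-values can be recomputed from the table. Each one is the upper tail of a chi-square distribution, with the drop in deviance as the statistic and the number of added parameters as the degrees of freedom. The sketch below only reuses the rounded numbers printed above, and the variable names are illustrative.

```r
# "Deviance" and "Df" columns from the analysis of deviance table above
dev_drop <- c(model2_vs_model = 0.20917, model3_vs_model2 = 0.14064)
df_added <- c(2, 1)

# Upper-tail chi-square probabilities
p_values <- pchisq(dev_drop, df = df_added, lower.tail = FALSE)
round(p_values, 4)   # compare with the Pr(>Chi) column
```

Both drops in deviance are tiny relative to their degrees of freedom, which is why neither added set of variables is significant.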

The results of the anova indicate that the models are all essentially the same, as there is no statistically significant difference between them. The remaining criterion on which to select a model is the measure of overdispersion. The first model has the lowest overdispersion ratio (and also the lowest AIC, 561.75) and so is the best when using this criterion. Therefore, determining if a house has air conditioning depends on examining the number of bedrooms and bathrooms simultaneously as well as the number of stories and the price of the house.
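
The AIC values printed in the three summaries can be checked by hand as well. For a binary response like this one, the AIC equals the residual deviance plus twice the number of estimated coefficients. The sketch below reuses the printed numbers, and the vector names are illustrative.

```r
# Residual deviances from the three summaries
resid_dev <- c(model = 549.75, model2 = 549.54, model3 = 549.40)

# Number of rows in each coefficient table, including the intercept
n_coef <- c(model = 6, model2 = 8, model3 = 9)

# AIC = residual deviance + 2 * number of coefficients (binary response)
aic <- resid_dev + 2 * n_coef
aic

# Name of the model with the lowest AIC
names(which.min(aic))
```

Because the extra variables barely lower the deviance, the penalty of 2 per coefficient is what separates the three models.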

Conclusion

This post explained how to use and interpret GLM in R. GLM is used primarily for fitting data to distributions that are not normal.

# Common Challenges with Listening for ESL Students

Listening is always a challenge as students acquire any language. Both teachers and students know that it takes time to develop comprehension when listening to a second language.

This post will explain some of the common obstacles to listening for ESL students. Generally, some common roadblocks include the following.

• Slang
• Contractions
• Rate of Delivery
• Emphasis in speech
• Clustering
• Repetition
• Interaction

Slang

Slang or colloquial language is a major pain for language learners. There are so many ways that we communicate in English that do not follow the prescribed “textbook” way. This can leave ESL learners completely lost as to what is going on.

A simple example would be to say “what’s up”. Even the most austere English teacher knows what this means but this is in no way formal English. For someone new to English it would be confusing at least initially.

Contractions

Contractions are a unique form of slang or colloquialism that is more readily accepted as standard English. A challenge with contractions is their omission of information. With this missing information there can be confusion.

An example would be “don’t” or “shouldn’t”. More complicated contractions can include “djeetyet” for “did you eat yet”. These common phrases leave out or do not pronounce important information.

Rate of Delivery

When listening to someone in a second language it always seems too fast. The speed at which we speak our own language is always too swift for someone learning it.

Pausing at times during the delivery is one way to allow comprehension without actually slowing the speed at which one speaks. For the learner, the main way to overcome this is to learn to listen faster, if this makes any sense.

Emphasis in Speech

In many languages there are complex rules for understanding which vowels to stress, which do not make sense to a non-native speaker. In fact, native speakers do not always agree on the vowels to stress. English speakers have been arguing over how to pronounce potato and tomato for ages.

Another aspect is intonation. The inflection in many languages can change when asking a question, making a statement, or expressing boredom, anger, or some other emotion. These little nuances of language are difficult to replicate and understand.

Clustering

Clustering is the ability to break language down into phrases. This helps in capturing the core of a language and is not easy to do. Language learners normally try to remember everything which leads to them understanding nothing.

The teacher needs to help students determine what is essential information and what is not. This takes practice and demonstrations of what is considered critical and what is not in listening comprehension.

Repetition

Repetition is closely related to clustering and involves the redundant use of words and phrases. Constantly re-sharing the same information can become confusing for students. An example would be someone saying “you know” and “you see what I’m saying.” This information is not critical to understanding most conversations and can throw off the comprehension of a language learner.

Interaction

Interaction has to do with a language learner understanding how to negotiate a conversation. This means being able to participate in a discussion, ask questions, and provide feedback.

The ultimate goal of listening is to speak. Developing interactive skills is yet another challenge in listening, as students must develop participatory skills.

Conclusion

The challenges mentioned here are intended to help teachers identify what may be impeding their students from growing in their ability to listen. Naturally, this is not an exhaustive list but serves as a brief survey.

# Types of Oral Language

Within communication and language teaching there are actually many different forms or types of oral language. Understanding this is beneficial if a teacher is trying to support students to develop their listening skills. This post will provide examples of several oral language forms.

Monologues

A monologue is the use of language without any verbal feedback from others. There are two types of monologue, which are planned and unplanned. Planned monologues include such examples as speeches, sermons, and verbatim reading.

When a monologue is planned there is little repetition of the ideas and themes of the subject. This makes it very difficult for ESL students to follow and comprehend the information. ESL students need to hear the content several times to better understand what is being discussed.

Unplanned monologues are more improvisational in nature. Examples can include classroom lectures and one-sided conversations. There is usually more repetition in unplanned monologues which is beneficial. However, the stop and start of unplanned monologues can be confusing at times as well.

Dialogues

A dialogue is the use of oral language involving two or more people. Within dialogues there are two main sub-categories, which are interpersonal and transactional. Interpersonal dialogues encourage the development of personal relationships. Dialogues that involve asking people how they are or talking over dinner may fall in this category.

Transactional dialogue is dialogue for sharing factual information. An example might be if someone you do not know asks you “Where is the bathroom?” Such a question is not for developing relationships but rather for seeking information.

Both interpersonal and transactional dialogues can be either familiar or unfamiliar. Familiarity has to do with how well the people speaking know each other. The more familiar the people talking are, the more assumptions and hidden meanings they bring to the discussion. For example, people who work at the same company in the same department use all types of acronyms to communicate with each other that outsiders do not understand.

When two people are unfamiliar with each other, effort must be made to provide information explicitly to avoid confusion. This carries over when a native speaker speaks in a familiar manner to ESL students. The style of communication is inappropriate because of the ESL students’ lack of familiarity with the language.

Conclusion

The boundary between monologue and dialogue is much clearer than the boundaries between the other categories mentioned such as planned/unplanned, interpersonal/transactional, and familiar/unfamiliar. In general, the ideas presented here represent a continuum and not either/or propositions.

# Proportion Test in R

Proportions are a fraction or “portion” of a total amount. For example, if there are ten men and ten women in a room, the proportion of men in the room is 50% (10 / 20). There are times when doing an analysis that you want to evaluate proportions in your data rather than individual measurements such as the mean, correlation, or standard deviation.

In this post we will learn how to do a test of proportions using R. We will use the dataset “Default”, which is found in the “ISLR” package. We will compare the proportion of those who are students in the dataset to a theoretical value. We will calculate the results using the z-test and the binomial exact test. Below is some initial code to get started.

library(ISLR)
data("Default")

We first need to determine the actual number of students that are in the sample. This is calculated below using the “table” function.

table(Default$student)
##
##   No  Yes
## 7056 2944

We have 2944 students in the sample and 7056 people who are not students. We now need to determine how many people are in the sample, which we get by summing the results from the table. Below is the code.

sum(table(Default$student))
## [1] 10000

There are 10000 people in the sample. To determine the proportion of students we take 2944 / 10000, which equals 0.2944 or 29.44%. Below is the code to calculate this.

table(Default$student) / sum(table(Default$student))
##
##     No    Yes
## 0.7056 0.2944
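
A slightly shorter route to the same proportions is the base R function “prop.table”. The sketch below rebuilds a stand-in for the student column from the counts shown above so that it runs without the “ISLR” package. With the real data you would pass “Default$student” instead.

```r
# Stand-in vector with the same counts as Default$student
student <- factor(rep(c("No", "Yes"), times = c(7056, 2944)))

# Proportion of each level
props <- prop.table(table(student))
props

# The proportion of students is also the mean of a logical vector
mean(student == "Yes")
```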

The proportion test is used to compare a particular value with a theoretical value. For our example, the particular value we have is that 29.44% of the people were students. We want to compare this value with a theoretical value of 50%. Before we do so it is better to state specifically what our hypotheses are.

NULL = The value of 29.44% of the sample being students is the same as the 50% found in the population.

ALTERNATIVE = The value of 29.44% of the sample being students is NOT the same as the 50% found in the population.

Below is the code to complete the z-test.

prop.test(2944,n = 10000, p = 0.5, alternative = "two.sided", correct = FALSE)
##
##  1-sample proportions test without continuity correction
##
## data:  2944 out of 10000, null probability 0.5
## X-squared = 1690.9, df = 1, p-value < 2.2e-16
## alternative hypothesis: true p is not equal to 0.5
## 95 percent confidence interval:
##  0.2855473 0.3034106
## sample estimates:
##      p
## 0.2944

Here is what the code means.

1. prop.test is the function used
2. The first value of 2944 is the total number of students in the sample
3. n = is the sample size
4. p = 0.5 is the theoretical proportion
5. alternative = “two.sided” means we want a two-tailed test
6. correct = FALSE means we do not want a continuity correction applied to the z-test. The correction is useful for small sample sizes but not for our sample of 10000

The p-value is essentially zero. This means that we reject the null hypothesis and conclude that the proportion of students in our sample is different from a theoretical proportion of 50% in the population.
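
To make the link to the z-test concrete, the statistic can be computed by hand. Without the continuity correction, the X-squared value that “prop.test” reports is the square of the usual z statistic for one proportion. The sketch below uses only the counts from this post, and the variable names are illustrative.

```r
x  <- 2944    # number of students
n  <- 10000   # sample size
p0 <- 0.5     # theoretical proportion

# z statistic: observed minus theoretical, divided by the standard error under the null
p_hat <- x / n
z     <- (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
z^2           # compare with X-squared in the output above

# Two-sided p-value from the normal distribution
p_value <- 2 * pnorm(-abs(z))

# The same statistic taken from prop.test
res <- prop.test(x, n = n, p = p0, alternative = "two.sided", correct = FALSE)
unname(res$statistic)
```

A z value this far from zero is why the p-value prints as smaller than 2.2e-16.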

Below is the same analysis using the binomial exact test.

binom.test(2944, n = 10000, p = 0.5)
##
##  Exact binomial test
##
## data:  2944 and 10000
## number of successes = 2944, number of trials = 10000, p-value <
## 2.2e-16
## alternative hypothesis: true probability of success is not equal to 0.5
## 95 percent confidence interval:
##  0.2854779 0.3034419
## sample estimates:
## probability of success
##                 0.2944

The results are essentially the same. Whether to use “prop.test” or “binom.test” is a matter of debate among statisticians. The purpose here was to provide an example of the use of both.
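
One place the two tests do differ slightly is the confidence interval. The sketch below puts the two intervals side by side so the size of the gap is visible. It uses only the counts from this post, and the object names are illustrative.

```r
x <- 2944    # number of students
n <- 10000   # sample size

# 95 percent confidence intervals from each test
ci_prop  <- prop.test(x, n = n, p = 0.5, correct = FALSE)$conf.int
ci_binom <- binom.test(x, n = n, p = 0.5)$conf.int

# One row per test, one column per interval bound
ci <- rbind(prop.test = as.numeric(ci_prop), binom.test = as.numeric(ci_binom))
colnames(ci) <- c("lower", "upper")
round(ci, 4)

# Largest difference between the two intervals
max(abs(ci["prop.test", ] - ci["binom.test", ]))
```

With a sample this large the two intervals agree to about four decimal places, so the choice between the tests matters more for small samples.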