Developing an Artificial Neural Network in R

In this post, we are going to build an artificial neural network (ANN) by analyzing some data about computers. Specifically, we are going to make an ANN that predicts the price of a computer.

We will be using the “Computers” data set from the “Ecdat” package. In addition, we are going to use the “neuralnet” package to conduct the ANN analysis. Below is the code for loading the packages and the data set.

library(Ecdat);library(neuralnet)
## Loading required package: Ecfun
## 
## Attaching package: 'Ecdat'
## 
## The following object is masked from 'package:datasets':
## 
##     Orange
## 
## Loading required package: grid
## Loading required package: MASS
## 
## Attaching package: 'MASS'
## 
## The following object is masked from 'package:Ecdat':
## 
##     SP500
#load data set
data("Computers")

Explore the Data

The first step is always data exploration. We will first look at the nature of the data using the “str” function and then use the “summary” function. Below is the code.

str(Computers)
## 'data.frame':    6259 obs. of  10 variables:
##  $ price  : num  1499 1795 1595 1849 3295 ...
##  $ speed  : num  25 33 25 25 33 66 25 50 50 50 ...
##  $ hd     : num  80 85 170 170 340 340 170 85 210 210 ...
##  $ ram    : num  4 2 4 8 16 16 4 2 8 4 ...
##  $ screen : num  14 14 15 14 14 14 14 14 14 15 ...
##  $ cd     : Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 2 1 1 1 ...
##  $ multi  : Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 1 1 1 1 ...
##  $ premium: Factor w/ 2 levels "no","yes": 2 2 2 1 2 2 2 2 2 2 ...
##  $ ads    : num  94 94 94 94 94 94 94 94 94 94 ...
##  $ trend  : num  1 1 1 1 1 1 1 1 1 1 ...
lapply(Computers, summary)
## $price
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     949    1794    2144    2220    2595    5399 
## 
## $speed
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   25.00   33.00   50.00   52.01   66.00  100.00 
## 
## $hd
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    80.0   214.0   340.0   416.6   528.0  2100.0 
## 
## $ram
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   2.000   4.000   8.000   8.287   8.000  32.000 
## 
## $screen
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   14.00   14.00   14.00   14.61   15.00   17.00 
## 
## $cd
##   no  yes 
## 3351 2908 
## 
## $multi
##   no  yes 
## 5386  873 
## 
## $premium
##   no  yes 
##  612 5647 
## 
## $ads
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    39.0   162.5   246.0   221.3   275.0   339.0 
## 
## $trend
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1.00   10.00   16.00   15.93   21.50   35.00

An ANN works primarily with numerical data rather than categorical data or factors. As such, we will remove the factor variables cd, multi, and premium from further analysis. Below are the code and histograms of the remaining variables.

hist(Computers$price);hist(Computers$speed);hist(Computers$hd);hist(Computers$ram)


hist(Computers$screen);hist(Computers$ads);hist(Computers$trend)

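As an aside, dropping the factor variables is not the only option. They could instead be dummy coded into 0/1 indicator columns and kept as inputs. Below is a small sketch of this alternative using base R’s “model.matrix” function on made-up toy data (this is not the approach taken in the rest of this post).

```r
#toy factors standing in for cd and premium (values are made up)
cd      <- factor(c("no", "yes", "yes", "no"))
premium <- factor(c("yes", "yes", "no", "yes"))
toy     <- data.frame(cd, premium)

#model.matrix expands factors into 0/1 indicator columns;
#the "+ 0" drops the intercept column
dummies <- model.matrix(~ cd + premium + 0, data = toy)
dummies
```

The resulting 0/1 columns could then be normalized and fed to the network like any other numeric variable.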

Clean the Data

Looking at the summary combined with the histograms indicates that we need to normalize our data, as this is a requirement for ANN. We want all of the variables to have an equal influence initially. Below is the code for the function we will use to normalize our variables.

#rescale a vector to the 0-1 range
normalize<-function(x) {
        return((x-min(x)) / (max(x)-min(x)))
}
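As a quick sanity check, min-max rescaling in the standard (x − min)/(max − min) form maps the smallest value of any numeric vector to 0 and the largest to 1. Below is a self-contained check on a toy vector.

```r
#self-contained check of min-max rescaling on a toy vector
normalize <- function(x) {
        (x - min(x)) / (max(x) - min(x))
}

normalize(c(10, 20, 30, 40))  # 0, 1/3, 2/3, 1
```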

Explore the Data Again

We now need to make a new data frame that contains only the variables we are going to use in the analysis. Then we will use our “normalize” function to rescale the variables appropriately. Lastly, we will re-explore the data to make sure it is OK using the “summary” and “hist” functions. Below is the code.

#make a data frame without the factor variables
#(note: Computers$ad partially matches the ads column)
Computers_no_factors<-data.frame(Computers$price,Computers$speed,Computers$hd,
                              Computers$ram,Computers$screen,Computers$ad,
                              Computers$trend)
#make a normalized data frame of the data
Computers_norm<-as.data.frame(lapply(Computers_no_factors, normalize))
#re-examine the normalized data
lapply(Computers_norm, summary)
## $Computers.price
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.0000  0.1565  0.2213  0.2353  0.3049  0.8242 
## 
## $Computers.speed
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.0000  0.0800  0.2500  0.2701  0.4100  0.7500 
## 
## $Computers.hd
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.00000 0.06381 0.12380 0.16030 0.21330 0.96190 
## 
## $Computers.ram
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.0000  0.0625  0.1875  0.1965  0.1875  0.9375 
## 
## $Computers.screen
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.00000 0.00000 0.00000 0.03581 0.05882 0.17650 
## 
## $Computers.ad
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.0000  0.3643  0.6106  0.5378  0.6962  0.8850 
## 
## $Computers.trend
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.0000  0.2571  0.4286  0.4265  0.5857  0.9714
hist(Computers_norm$Computers.price);hist(Computers_norm$Computers.speed);


hist(Computers_norm$Computers.hd);hist(Computers_norm$Computers.ram);


hist(Computers_norm$Computers.screen);hist(Computers_norm$Computers.ad);


hist(Computers_norm$Computers.trend)


Develop the Model

Everything looks good, so we will now split our data into a training set and a testing set and develop the ANN model that will predict computer price. Below is the code for this.

#split data into training and testing
Computer_train<-Computers_norm[1:4694,]
Computers_test<-Computers_norm[4695:6259,]
#model
computer_model<-neuralnet(Computers.price~Computers.speed+Computers.hd+
                                  Computers.ram+Computers.screen+
                                  Computers.ad+Computers.trend, Computer_train)
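One caveat: the split above simply takes the first 4,694 rows for training and the rest for testing. When rows are ordered (as the “trend” variable suggests they are here), a random split is usually safer. Below is a minimal sketch of how that could look (the row counts come from this data set; the seed value is arbitrary).

```r
set.seed(123)                        # arbitrary seed for reproducibility
n         <- 6259                    # number of rows in Computers
train_idx <- sample(n, size = 4694)  # random ~75% of rows for training

# the split would then be:
# Computer_train <- Computers_norm[train_idx, ]
# Computers_test <- Computers_norm[-train_idx, ]
```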

Our initial model is a simple feedforward network with a single hidden node. You can visualize the model using the “plot” function, as shown in the code below.

plot(computer_model)


We now need to evaluate our model’s performance. We will use the “compute” function to generate predictions for the test data. Because we are predicting a continuous value rather than classifying, we cannot measure accuracy with a confusion matrix; instead, the predictions will be compared to the actual prices in the test data set using a Pearson correlation. Below is the code followed by the results.

evaluate_model<-compute(computer_model, Computers_test[2:7])
predicted_price<-evaluate_model$net.result
cor(predicted_price, Computers_test$Computers.price)
##              [,1]
## [1,] 0.8809571295

The correlation between the predicted prices and the actual prices is 0.88, which is a strong relationship. This indicates that our model does an excellent job of predicting the price of a computer based on speed, hard drive size, RAM, screen size, advertising, and trend.
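Correlation is unit-free; to express the error in dollars, the normalized predictions can be mapped back to the original scale and summarized with a root mean squared error (RMSE). Below is a self-contained sketch assuming the standard (x − min)/(max − min) rescaling; 949 and 5399 are the observed price minimum and maximum from the summary above, while the prediction values themselves are made up for illustration.

```r
#invert min-max scaling: x = norm * (max - min) + min
denormalize <- function(x, min_x, max_x) {
        x * (max_x - min_x) + min_x
}

predicted_norm <- c(0.20, 0.25, 0.30)  # hypothetical network outputs
actual_norm    <- c(0.22, 0.24, 0.35)  # hypothetical true values

predicted <- denormalize(predicted_norm, 949, 5399)
actual    <- denormalize(actual_norm, 949, 5399)

sqrt(mean((predicted - actual)^2))  # RMSE in dollars
```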

Develop Refined Model

Just for fun, we are going to make a more complex model with three hidden nodes and see how the results change. Below is the code.

computer_model2<-neuralnet(Computers.price~Computers.speed+Computers.hd+
                                   Computers.ram+Computers.screen+ 
                                   Computers.ad+Computers.trend, 
                           Computer_train, hidden =3)
plot(computer_model2)
evaluate_model2<-compute(computer_model2, Computers_test[2:7])
predicted_price2<-evaluate_model2$net.result
cor(predicted_price2, Computers_test$Computers.price)


##              [,1]
## [1,] 0.8963994092

The correlation improves only slightly, to 0.89. The increased complexity did not yield much of an improvement, so the simpler single-node model is more appropriate.

Conclusion

In this post, we explored an application of artificial neural networks. This black-box method is useful for making powerful predictions from highly complex data.
