Introduction
Fits a neural network for classification or regression.
A random 30% of the data is held out for cross-validation to find the number of epochs that minimizes the cross-validation loss. The final network is then trained on all of the data for that number of epochs.
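The procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the product's implementation: the function names `holdout_split` and `optimal_epochs` and the per-epoch loss values are hypothetical, and the actual network training is left out.

```python
import random

def holdout_split(n_cases, holdout_fraction=0.3, seed=0):
    """Randomly assign case indices to a training set and a cross-validation set."""
    indices = list(range(n_cases))
    random.Random(seed).shuffle(indices)
    n_holdout = round(n_cases * holdout_fraction)
    return indices[n_holdout:], indices[:n_holdout]  # (training, validation)

def optimal_epochs(validation_losses):
    """Return the 1-based epoch with the lowest cross-validation loss."""
    best_index = min(range(len(validation_losses)), key=validation_losses.__getitem__)
    return best_index + 1

train_idx, val_idx = holdout_split(n_cases=10)

# Hypothetical losses measured on val_idx after each epoch of training on train_idx.
losses = [0.90, 0.62, 0.48, 0.51, 0.57]
n_epochs = optimal_epochs(losses)

# The final network would then be refit on all 10 cases for n_epochs epochs.
```

In this made-up example the loss starts rising again after the third epoch, so three epochs would be used for the final fit.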
Requirements
A data set containing an outcome variable and predictor variables to use in the predictive model.
This method is only available in Q5.
Method
Usage
To run Deep Learning:
1. Select Create > Classifier > Deep Learning.
2. Under Inputs > Deep Learning > Outcome select the outcome variable.
3. Under Inputs > Deep Learning > Predictors select the predictor variables.
4. Change any other settings as required.
Example
The Cross Validation output of a deep learning model. Loss is the quantity minimized by the model, which is mean squared error in the case of a numeric output. The model mean absolute error is also shown.
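For reference, the two error measures named above can be computed as follows. This is a small sketch with made-up observed and predicted values; the function names are illustrative and are not taken from the software.

```python
def mean_squared_error(observed, predicted):
    """Average of the squared differences between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)

def mean_absolute_error(observed, predicted):
    """Average of the absolute differences between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

observed = [3.0, 5.0, 2.0, 7.0]   # hypothetical outcome values
predicted = [2.0, 5.0, 4.0, 8.0]  # hypothetical model predictions

mse = mean_squared_error(observed, predicted)   # (1 + 0 + 4 + 1) / 4
mae = mean_absolute_error(observed, predicted)  # (1 + 0 + 2 + 1) / 4
```

Mean squared error penalizes large misses more heavily than mean absolute error, which is why the two charts can differ in shape.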
Options
Outcome  The variable to be predicted by the predictor variables. It may be either a numeric or categorical variable.
Predictors  The variable(s) to predict the Outcome.
Algorithm  The machine learning algorithm. Defaults to Deep Learning but may be changed to other machine learning methods.
Output

 Accuracy  When Outcome is categorical, produces a table of accuracy by class. When Outcome is numeric, calculates Root Mean Squared Error and R-squared.
 Prediction-Accuracy Table  Produces a table relating the observed and predicted outcome, also known as a confusion matrix.
 Cross Validation  Produces charts of loss (i.e., network error) and of accuracy (categorical Outcome) or mean absolute error (numeric Outcome) against training epoch.
 Network Layers  Returns a description of the layers of the network.
Missing data  See Missing Data Options.
Variable names  Displays Variable Names in the output.
Maximum epochs  The maximum number of epochs for which to train the network. The actual number of epochs may be lower if the cross-validation error stops improving.
Hidden layers  A comma-delimited list of the number of units in each hidden layer.
Normalize predictors  Whether the predictor variables are normalized to zero mean and unit variance. This is recommended if the variables differ significantly in their ranges. Note that categorical variables are also converted to dummy variables.
Random seed  Seed used to initialize the (pseudo)random number generator for the model fitting algorithm. Different seeds may lead to slightly different answers, but should normally not make a large difference.
Increase allowed output size  Check this box if you encounter a warning message "The R output had size XXX MB, exceeding the 128 MB limit..." and you need to reference the output elsewhere in your document; e.g., to save predicted values to a Data Set or examine diagnostics.
Maximum allowed size for output (MB)  This control only appears if Increase allowed output size is checked. Use it to set the maximum allowed size for the output in megabytes. The warning referred to above about the R output size states the minimum size to which the limit must be increased to return the full output. Note that having many large outputs in one document or page may slow down your document and increase load times.
Weight  Where a weight has been set for the R Output, a new data set is generated via resampling, and this new data set is used in the estimation.
Filter  The data is automatically filtered using any filters prior to estimating the model.
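The Hidden layers, Normalize predictors, and Weight options above each describe a simple data transformation, sketched below in Python. This is an illustration only. All function names are hypothetical, and several details are assumptions not confirmed by this page: that normalization uses the population standard deviation, that dummy coding creates one indicator per category, and that the resampled data set has the same size as the original.

```python
import random
from statistics import mean, pstdev

def parse_hidden_layers(text):
    """Turn a comma-delimited string such as '256, 128' into a list of unit counts."""
    return [int(part) for part in text.split(",") if part.strip()]

def standardize(values):
    """Rescale a numeric predictor to zero mean and unit variance."""
    center, spread = mean(values), pstdev(values)
    if spread == 0:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - center) / spread for v in values]

def dummy_code(values):
    """Convert a categorical predictor to one 0/1 indicator column per category."""
    levels = sorted(set(values))
    return {level: [int(v == level) for v in values] for level in levels}

def resample_by_weight(cases, weights, seed=0):
    """Draw a new data set, sampling cases with probability proportional to weight."""
    return random.Random(seed).choices(cases, weights=weights, k=len(cases))

layers = parse_hidden_layers("256, 128")                 # two hidden layers
scaled = standardize([1.0, 2.0, 3.0, 4.0])               # mean 0, variance 1
dummies = dummy_code(["red", "blue", "red"])             # one column per color
resampled = resample_by_weight(["a", "b", "c"], weights=[1.0, 0.0, 3.0])
```

A case with zero weight, such as "b" here, can never appear in the resampled data, while heavily weighted cases tend to appear more than once.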
DIAGNOSTICS
Prediction-Accuracy Table  Creates a table showing the observed and predicted values, as a heatmap.
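The counts behind such a table can be built as in the following sketch. It uses made-up class labels, and the function name `prediction_accuracy_table` is hypothetical. Placing observed classes in rows and predicted classes in columns is an assumption, not something this page specifies.

```python
from collections import Counter

def prediction_accuracy_table(observed, predicted):
    """Count cases for every (observed, predicted) pair of classes."""
    classes = sorted(set(observed) | set(predicted))
    counts = Counter(zip(observed, predicted))
    # Rows are observed classes; columns are predicted classes.
    return classes, [[counts[(o, p)] for p in classes] for o in classes]

observed = ["cat", "dog", "dog", "cat", "dog"]   # hypothetical true classes
predicted = ["cat", "dog", "cat", "cat", "dog"]  # hypothetical model output

classes, table = prediction_accuracy_table(observed, predicted)

# Overall accuracy is the share of cases on the diagonal of the table.
accuracy = sum(table[i][i] for i in range(len(classes))) / len(observed)
```

Off-diagonal cells show which classes the model confuses with one another; a heatmap simply shades each cell by its count.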
SAVE VARIABLE(S)
Predicted Values  Creates a new variable containing predicted values for each case in the data.
Probabilities of Each Response  Creates new variables containing predicted probabilities of each response.
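For a categorical Outcome, the two saved outputs are closely related. The sketch below assumes, as is common for classification networks but not stated on this page, that the network ends in a softmax layer and that the predicted value is the response with the highest probability. The function names and the raw output values are hypothetical.

```python
import math

def softmax(scores):
    """Convert raw network outputs into probabilities that sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predicted_class(probabilities, labels):
    """Pick the label with the highest predicted probability."""
    best = max(range(len(labels)), key=probabilities.__getitem__)
    return labels[best]

labels = ["Low", "Medium", "High"]
probabilities = softmax([0.2, 1.5, 0.3])  # hypothetical raw outputs for one case
prediction = predicted_class(probabilities, labels)
```

Under these assumptions, Probabilities of Each Response would save one variable per label, and Predicted Values would save the single most probable label for each case.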
Acknowledgments
Uses the keras package, which uses TensorFlow.
More information
See this blog post for an introduction to deep learning.