

Fits a random forest of classification or regression trees.

1. In Displayr, select Anything > Advanced Analysis > Machine Learning > Random Forest.
2. In Q, select Create > Classifier > Random Forest.
3. Under Inputs > Random Forest > Outcome select your outcome variable.
4. Under Inputs > Random Forest > Predictor(s) select your predictor variables.

The table below shows the variable importance as computed by a Random Forest. The column called MeanDecreaseAccuracy contains a measure of the extent to which a variable improves the accuracy of the forest in predicting the classification. Higher values mean that the variable contributes more to prediction. In a rough sense, it can be interpreted as showing the amount of increase in classification accuracy that is provided by including the variable in the model (a more precise statement of the meaning is complicated, and requires a detailed understanding of the underlying mechanics of random forests).

The first three columns show the importance of each variable for improving accuracy, broken down by category of the outcome variable. In this example, x1 is clearly the most important variable, followed by x2 and then x3.
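
To make the rough interpretation of MeanDecreaseAccuracy more concrete, the following is a minimal Python sketch of the general idea behind permutation-based importance: shuffle one predictor at a time and record how much the accuracy falls. It is an illustration under stated assumptions, not the calculation Displayr or Q performs. The data, the names `toy_predict` and `permutation_importance`, and the simple threshold rule standing in for a fitted forest are all hypothetical. A real random forest typically measures the drop on out-of-bag cases for each tree and may rescale the result, so actual values will differ.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows whose predicted label matches the true label."""
    hits = sum(predict(row) == label for row, label in zip(rows, labels))
    return hits / len(rows)


def permutation_importance(predict, rows, labels, n_repeats=20, seed=0):
    """Mean drop in accuracy when each predictor column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for col in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle one column, breaking its link with the outcome.
            shuffled = [row[col] for row in rows]
            rng.shuffle(shuffled)
            permuted = [
                row[:col] + (value,) + row[col + 1:]
                for row, value in zip(rows, shuffled)
            ]
            drops.append(baseline - accuracy(predict, permuted, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


# Hypothetical data: x1 drives the outcome, x2 helps a little, x3 is noise.
data_rng = random.Random(1)
rows = [(data_rng.random(), data_rng.random(), data_rng.random())
        for _ in range(500)]
labels = [int(x1 + 0.3 * x2 > 0.65) for x1, x2, _ in rows]


def toy_predict(row):
    """Stand-in for a fitted classifier (assumption: not a real forest)."""
    x1, x2, _ = row
    return int(x1 + 0.3 * x2 > 0.65)


importance = permutation_importance(toy_predict, rows, labels)
for name, value in zip(("x1", "x2", "x3"), importance):
    print(f"{name}: mean decrease in accuracy = {value:.3f}")
```

In this toy setup the ordering of the values mirrors the example in the text: shuffling x1 hurts accuracy the most, x2 less so, and x3 not at all because the stand-in rule never uses it.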
