Page 364 - Proceeding The 2nd International Seminar of Science and Technology : Accelerating Sustainable Innovation Towards Society 5.0

               The 2nd International Seminar of Science and Technology
               “Accelerating Sustainable innovation towards Society 5.0”
               ISST 2022 FST UT 2022
               Universitas Terbuka
               assumptions that must be met [10]. Random Forest is a method that
               consists of a structured set of trees, each of which casts one
               vote for a class; the result is the class chosen by the majority
               of trees. The basic technique underlying Random Forest is the
               Decision Tree. In other words, a random forest is a set of
               decision trees used for classification and prediction: an input
               enters at the root at the top and moves down to a leaf at the
               bottom [2].
               Random Forest uses an ensemble bagging strategy that can
               overcome the overfitting problem that occurs when the training
               data set is small [11]. For classification, the Random Forest
               result is the mode of the classes predicted by the trees in the
               forest, while for prediction of a numeric value (regression) it is
               the average of the values predicted by the trees [12]. The
               algorithm for a Random Forest is divided into two parts. The first
               is the creation of k trees to form the random forest. The second is
               making predictions from the random forest that has been built [2].
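               The two aggregation rules described above can be illustrated with
               a minimal Python sketch. The function names are hypothetical and
               the per-tree predictions are made-up values, not results from
               this study.

```python
from collections import Counter
from statistics import mean


def aggregate_classification(tree_predictions):
    """Return the most common class label among the trees (the mode)."""
    return Counter(tree_predictions).most_common(1)[0][0]


def aggregate_regression(tree_predictions):
    """Return the average of the trees' numeric predictions."""
    return mean(tree_predictions)


# Illustrative per-tree outputs (assumed values).
votes = ["yes", "no", "yes", "yes", "no"]
estimates = [2.0, 3.0, 4.0]

print(aggregate_classification(votes))
print(aggregate_regression(estimates))
```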
               Input:
               −  D, a dataset consisting of d rows
               −  k, the number of trees
               The Random Forest procedure for constructing the trees:
               a.  Generate a sample Di by drawing rows at random from
                   dataset D with replacement.
               b.  Use the sample Di to build the i-th tree (i = 1, 2, …, k).
               c.  Repeat steps a and b k times.
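               Steps a to c can be sketched in Python as follows. This is an
               illustration under stated assumptions, not the implementation
               used in this study: the one-split "stump" stands in for a full
               decision tree, the rows are (x, label) pairs with a single
               numeric feature, and all function names are hypothetical. The
               random selection of features at each split, which Random Forest
               normally also applies, is left out because the steps above
               describe only the bootstrap part.

```python
import random
from collections import Counter


def bootstrap_sample(dataset, rng):
    """Step a: draw len(dataset) rows at random, with replacement."""
    return rng.choices(dataset, k=len(dataset))


def fit_stump(sample):
    """Step b with a stand-in tree (assumption): a single-threshold stump.

    Every observed x is tried as a threshold; each side predicts its
    majority label, and the threshold with the fewest mistakes is kept.
    """
    best = None
    for threshold, _ in sample:
        left = [label for x, label in sample if x <= threshold]
        right = [label for x, label in sample if x > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        # If no row falls to the right, reuse the left label.
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(label != left_label for label in left)
        errors += sum(label != right_label for label in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return lambda x: left_label if x <= threshold else right_label


def build_forest(dataset, k, seed=0):
    """Step c: repeat steps a and b k times, giving k trees."""
    rng = random.Random(seed)
    return [fit_stump(bootstrap_sample(dataset, rng)) for _ in range(k)]


# Toy dataset D (assumed values) and k = 25 trees.
D = [(1.0, "low"), (2.0, "low"), (3.0, "low"),
     (7.0, "high"), (8.0, "high"), (9.0, "high")]
forest = build_forest(D, k=25)
print(len(forest))
```

               Each element of forest is a function that maps an x value to a
               class label, so the trees' outputs can then be combined by
               majority vote.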
               In classification, each individual is assigned the class that
               receives the most votes across the collection of trees, while
               regression uses the average of the trees' results.
               Stages of analysis of the Random Forest method:
               1)  The first step in the Random Forest analysis is to input the
                   data into the RStudio software.
               2)  Divide the data into training and testing data. Then fit the
                   Random Forest model to the training data with a predetermined
                   ntree value (number of trees); the testing data is used to
                   measure the error rate of the resulting model.
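
               The split-and-evaluate idea in stage 2 can be sketched as below.
               The study itself works in RStudio; this Python version is only an
               illustration. The 70/30 split ratio, the toy data, the function
               names, and the placeholder majority-class model are all
               assumptions; any fitted classifier could take the model's place.

```python
import random
from collections import Counter


def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle a copy of the rows and return (training, testing) lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def error_rate(predict, rows):
    """Fraction of rows whose predicted label differs from the true label."""
    wrong = sum(predict(x) != label for x, label in rows)
    return wrong / len(rows)


def fit_majority(train_rows):
    """Placeholder model (assumption): always predict the most common label."""
    majority = Counter(label for _, label in train_rows).most_common(1)[0][0]
    return lambda x: majority


# Toy data (assumed values): ten rows of (x, label).
data = [(i, "a" if i < 7 else "b") for i in range(10)]
train, test = train_test_split(data, test_fraction=0.3)
model = fit_majority(train)

print(len(train), len(test))
print(error_rate(model, test))
```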

