Page 25 - Proceeding The 2nd International Seminar of Science and Technology : Accelerating Sustainable Innovation Towards Society 5.0
ISST 2022 FST UT 2022
Universitas Terbuka
extraction of new information taken from large chunks of data that
assists in decision making. The term data mining is sometimes also
called knowledge discovery [13]. It is worth remembering that the word
mining itself means an attempt to obtain a small amount of valuable
material from a large amount of raw material. Data mining has long
roots in fields of science such as artificial intelligence, machine
learning, statistics, and databases [14]. Data mining involves
searching large databases for desired trends or patterns to assist
future decision making [15].
2.2 Cluster Analysis
Cluster analysis finds groups of objects such that the objects in one
group are similar (or related) to one another and different from (or
unrelated to) the objects in other groups [16]. The purpose of the
analysis is to minimize the distance within each cluster and maximize
the distance between clusters [17]. Cluster analysis can be regarded
as a form of classification in that it labels objects with class
(cluster) labels [11]. Many clustering methods have been developed by
experts. Each method has its own characteristics, advantages, and
disadvantages; one of them is the K-Means method [18].
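The objective described above (small distances within a cluster, large distances between clusters) can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names, the use of Euclidean distance, the averaging over point pairs, and the four sample points are all assumptions made for illustration.

```python
from itertools import combinations
from math import dist  # Euclidean distance (Python 3.8+)


def mean_within_distance(clusters):
    """Average distance between pairs of points that share a cluster.

    Assumes at least one cluster holds two or more points.
    """
    pairs = [dist(a, b) for points in clusters for a, b in combinations(points, 2)]
    return sum(pairs) / len(pairs)


def mean_between_distance(clusters):
    """Average distance between pairs of points from different clusters.

    Assumes there are at least two non-empty clusters.
    """
    pairs = [
        dist(a, b)
        for c1, c2 in combinations(clusters, 2)
        for a in c1
        for b in c2
    ]
    return sum(pairs) / len(pairs)


# Hypothetical 2-D data: two tight groups that lie far apart.
good = [[(1.0, 1.0), (1.0, 2.0)], [(8.0, 8.0), (8.0, 9.0)]]
# The same four points grouped badly: each cluster mixes both groups.
bad = [[(1.0, 1.0), (8.0, 8.0)], [(1.0, 2.0), (8.0, 9.0)]]

for name, clusters in (("good", good), ("bad", bad)):
    within = mean_within_distance(clusters)
    between = mean_between_distance(clusters)
    print(f"{name}: within={within:.2f} between={between:.2f}")
```

Under these assumptions, the grouping that keeps nearby points together yields a smaller within-cluster distance and a larger between-cluster distance than the mixed grouping, which is the property a clustering method tries to achieve.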
2.3 K-Means Clustering
The K-Means method is one of the most commonly used non-hierarchical
methods. It is a partitioning technique that divides objects into
separate groups [1]. The purpose of K-Means is to divide n
observations into k groups, where each observation belongs to the
cluster with the nearest mean, which serves as the prototype of the
cluster [18]. The K-Means algorithm repeats its process iteratively to
obtain a cluster database [19]. The K-Means clustering algorithm is
based on optimizing a similarity measure so that similarity between
clusters is lowest and similarity within each cluster is highest; in
other words, K-Means tries to reduce the similarity between clusters
and increase the similarity within each cluster [20]. The K-Means
method selects k patterns at random as the initial centroids. The
number of iterations to reach the cluster centroids will be influenced by