Analysis of Customer Transaction Data Associations Based on the Apriori Algorithm

UD Dian Pertiwi is a small and medium enterprise in the materials sector whose main products are building materials. The business handles a large number of transactions every day, so its transaction data keeps growing; left unused, this data is merely a pile of useless records, commonly called junk data. With a data mining approach based on the Apriori algorithm, the data can instead be used to support the sales process and help UD Dian Pertiwi reach its sales target. Association analysis with the Apriori algorithm, using a minimum support of 1% and a minimum confidence of 70%, produced the ten strongest association rules. UD Dian Pertiwi can use these rules in its sales strategy, including identifying products that are likely to be purchased together, increasing the stock of those products, and running promotions.


Introduction
UD Dian Pertiwi is a small and medium enterprise in the materials sector whose main products are building materials. The business handles a large number of transactions every day, so the data it collects keeps growing. If this data is not used, it remains a pile of useless records, commonly called data garbage, even though it could help UD Dian Pertiwi reach a sales target that it has not yet achieved. A data mining approach, in particular the Apriori algorithm, which has the advantage of mining data quickly, makes it possible to use the data to support the sales process. This research is expected to find association rules with good confidence values that support the sales process, so that the target that has been set can be achieved.

Data Mining
Data mining is a process for obtaining useful information from a large database warehouse [1]. It can also be interpreted as extracting new information from large volumes of data to aid decision making [2]. The term data mining is sometimes used interchangeably with knowledge discovery [3] [4]. The data mining process consists of the following stages.

a. Cleaning and Integration
Data cleaning also affects the data mining system, because it reduces the amount and complexity of the data to be handled.
Data integration is performed on attributes that identify unique entities, such as name, product type, and customer number. It must be done carefully, because errors in data integration can produce deviant results and even lead to misleading actions later [5].
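
As an illustration of this stage, the following is a minimal Python sketch, not the procedure used in this study. It assumes that every source record is a (transaction ID, item name) pair; the names clean_and_integrate, cashier_log, and delivery_log and the sample records are hypothetical.

```python
def clean_and_integrate(*sources):
    """Merge records from several sources into one basket per transaction ID."""
    baskets = {}
    for source in sources:
        for txn_id, item in source:
            # Cleaning: skip records with a missing ID or an empty item name.
            if txn_id is None or not item or not item.strip():
                continue
            # Cleaning: normalize spelling variants such as " Cement " and "cement".
            name = " ".join(item.split()).lower()
            # Integration: the transaction ID is the unique key linking the sources.
            baskets.setdefault(str(txn_id).strip(), set()).add(name)
    return baskets


# Hypothetical sample records from two sources (not data from the study).
cashier_log = [("T001", "Cement"), ("T001", " sand "), ("T002", "Nail"), (None, "sand")]
delivery_log = [("T001", "cement"), ("T002", "  "), ("T003", "Ceramic")]

baskets = clean_and_integrate(cashier_log, delivery_log)
for txn_id in sorted(baskets):
    print(txn_id, sorted(baskets[txn_id]))
```

Using the transaction ID as the only integration key is a deliberate simplification; as noted above, an incorrect key would merge unrelated purchases and distort every later result.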

b. Selection and Transformation
The transformation and selection of data also determine the quality of the data mining results, because several characteristics of specific data mining techniques depend on this stage [6].

c. Data Mining Process
Applying a data mining technique is only one part of the overall data mining process. Several data mining techniques are already in common use [7]; they are discussed in more detail in the next section. Note that the general-purpose data mining techniques available on the market are sometimes insufficient for data mining in certain fields or on specific data.

d. Evaluation and Presentation
The last phase of the data mining process is to formulate decisions or actions from the results of the analysis. This sometimes involves people who are not familiar with data mining. Presenting the results as knowledge that everyone can understand is therefore a necessary step in the data mining process.
Data mining involves searching an extensive database for desired patterns to support future decision making. These patterns are recognized by specific tools that provide useful and insightful analysis of the data, which can then be studied more thoroughly, possibly with other decision-support tools [8].

Association Rule
Association analysis, or association rule mining, is a data mining technique for finding associative rules among combinations of items. It also forms the basis of several other data mining techniques. One stage of association analysis in particular, the analysis of high-frequency patterns (frequent pattern mining), has attracted the attention of many researchers seeking efficient algorithms [8].

Apriori Algorithm
The Apriori algorithm is one of the algorithms that can be used in market basket analysis to find association rules that meet minimum support and confidence thresholds [9]. In the first stage, the algorithm searches systematically for frequent itemsets without exploring all possible candidates; in the second stage, strong rules are extracted from those itemsets. A frequent itemset is a collection of items that often appear together in transactional data [10][15].
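
The first stage, a level-by-level search that skips candidates that cannot be frequent, can be sketched in Python as follows. This is a simplified illustration on made-up toy transactions, not the Weka implementation used in this study; the function name apriori_frequent_itemsets and the 40% threshold are assumptions chosen so that the toy data gives a small result.

```python
from itertools import combinations


def apriori_frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    total = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / total

    # Level 1: frequent single items.
    singles = {frozenset([item]) for t in transactions for item in t}
    current = {s: support(s) for s in singles if support(s) >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a, b in combinations(list(current), 2) if len(a | b) == k}
        # Prune step: drop a candidate if any (k-1)-subset is not frequent,
        # so its support never has to be counted.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {c: support(c) for c in candidates if support(c) >= min_support}
        frequent.update(current)
        k += 1
    return frequent


# Toy transactions (hypothetical, not data from the study).
toy_transactions = [
    {"cement", "sand", "nail"},
    {"cement", "sand"},
    {"cement", "sand", "ceramic"},
    {"nail", "ceramic"},
    {"cement", "nail"},
]

frequent = apriori_frequent_itemsets(toy_transactions, min_support=0.4)
for itemset, value in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), round(value, 2))
```

In this toy data the candidate {cement, nail, sand} is discarded in the prune step because {nail, sand} is not frequent, which is the saving the algorithm relies on.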

Weka
WEKA is open-source (GPL) data mining software written in Java that provides APIs for data mining tasks. It is developed by the University of Waikato in New Zealand and is free software available under the GNU General Public License [11] [14]. Among other techniques, WEKA provides classification with decision trees through the J48 algorithm.

Problem Identification
The first step in this research is to identify the problem. Problem identification is a vital part of the research process because it determines the quality of the research and formulates the problem that forms the background of the research object. The problem identified here is how to analyze sales data at UD Dian Pertiwi using the Apriori algorithm.

Data Collection
After the problem is identified, the next stage is data collection. At this stage, the researchers collect the data needed to find association rules with the Apriori algorithm [12] [13]. The data come from direct interviews with the admin of UD Dian Pertiwi, conducted to understand the existing problems and to serve as a reference for the research; from a literature study; and from collecting and studying customer purchase transaction data at UD Dian Pertiwi over 5 months, from January to May 2019. Documentation was used to complete the data collection.

Application of the Association Method Using the Apriori Algorithm
By applying the Apriori algorithm, the authors look for association rules with good confidence values, built from combinations of items, to support the sales process at UD Dian Pertiwi. The first step is to find the item combinations that meet the minimum support requirement. The support of an item A is the proportion of transactions in the database that contain it: the number of transactions containing A divided by the total number of transactions. The second step is to find the association rules that meet the minimum confidence requirement, where the confidence of a rule "A then B" is the number of transactions containing both A and B divided by the number of transactions containing A.
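
The two measures can be computed directly from their definitions. The sketch below is illustrative only: the function names and the five toy transactions are assumptions, not data or code from this study.

```python
def support(transactions, itemset):
    """Share of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t)) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Share of transactions containing the antecedent that also contain the consequent."""
    base = support(transactions, antecedent)
    both = support(transactions, set(antecedent) | set(consequent))
    return both / base if base else 0.0


# Toy transactions (hypothetical, not data from the study).
toy_transactions = [
    {"cement", "sand", "nail"},
    {"cement", "sand"},
    {"cement", "sand", "ceramic"},
    {"nail", "ceramic"},
    {"cement", "nail"},
]

# "sand" appears in 3 of 5 transactions, and all 3 also contain "cement".
print(support(toy_transactions, {"sand"}))
print(confidence(toy_transactions, {"sand"}, {"cement"}))
# "cement" appears in 4 of 5 transactions, 3 of which also contain "sand".
print(confidence(toy_transactions, {"cement"}, {"sand"}))
```

The last two lines show that confidence is directional: the rule from "sand" to "cement" holds in every toy transaction that contains "sand", while the reverse rule holds in only three out of four.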

Results and Discussion
This study uses customer purchase transaction data from UD Dian Pertiwi covering 5 months, from January to May 2019.

Pre-processing Data
In this research, the authors pre-process the data to transform the raw data into a format suitable for analysis. To simplify the analysis, each item is first assigned a code. After coding, a table is built to make it easier to see how many items were purchased in each transaction. Tabulation is the creation of a table containing the coded item data in the form required by the analysis. Each transaction is marked with "Y" (yes) where it contains an item and with "?" where it does not.
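
The tabulation step can be sketched as follows. The "Y" and "?" markers come from the description above; the ARFF export is an assumption (ARFF is Weka's native file format, in which "?" denotes a missing value, but the file format used in this study is not stated), and the function names and sample baskets are hypothetical.

```python
def tabulate(baskets, items):
    """One row per transaction: "Y" if the item was bought, "?" otherwise."""
    return [["Y" if item in basket else "?" for item in items] for basket in baskets]


def to_arff(rows, items, relation="transactions"):
    """Render the table as ARFF text with one single-valued nominal attribute per item."""
    lines = [f"@relation {relation}", ""]
    for item in items:
        # Quoting the name keeps item names that contain spaces valid.
        lines.append(f"@attribute '{item}' {{Y}}")
    lines += ["", "@data"]
    lines += [",".join(row) for row in rows]
    return "\n".join(lines)


# Hypothetical coded items and baskets (not data from the study).
items = ["cement", "sand", "nail"]
baskets = [{"cement", "sand"}, {"sand", "nail"}]

rows = tabulate(baskets, items)
arff_text = to_arff(rows, items)
print(arff_text)
```

Marking absent items with "?" rather than a second value such as "N" keeps non-purchases out of the mined patterns, so rules describe only goods that were actually bought.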

Apriori with Weka
At this stage, the Apriori algorithm is run in Weka to obtain the combination patterns of goods. The minimum support and minimum confidence are determined in advance; the researchers set the minimum support to 1% and the minimum confidence to 70%. A minimum support of 1% means that a combination of goods must be purchased together in at least 1% of all transactions, and a minimum confidence of 70% means that when a transaction contains the first item or items, there is at least a 70% probability that it also contains the other item. Running Apriori in Weka with these thresholds produced 10 final association rules:
a. If a customer buys "faucets" and "cement," they will also buy "sand."
b. If a customer buys "faucets" and "sand," they will also buy "cement."
c. If a customer buys "ceramic," "nail," and "sand," they will also buy "cement."
d. If a customer buys "glue isarplas" and "cement," they will also buy "sand."
e. If a customer buys "ceramic" and "sand," they will also buy "cement."
f. If a customer buys "Kloset" and "sand," they will also buy "cement."
g. If a customer buys "cat aviat" and "sand," they will also buy "cement."
h. If a customer buys "glue isarplas" and "sand," they will also buy "cement."
i. If a customer buys "Lank lips" and "sand," they will also buy "cement."
j. If a customer buys "nail" and "sand," they will also buy "cement."
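
One way such rules could support a sales strategy is to look up which goods to offer for a given basket. The sketch below encodes the ten rules listed above as (antecedent, consequent) pairs; the RULES structure and the suggest function are hypothetical illustrations, not part of the Weka output or of the system used at UD Dian Pertiwi.

```python
# The ten rules a. to j. as (antecedent items, consequent item).
RULES = [
    ({"faucets", "cement"}, "sand"),
    ({"faucets", "sand"}, "cement"),
    ({"ceramic", "nail", "sand"}, "cement"),
    ({"glue isarplas", "cement"}, "sand"),
    ({"ceramic", "sand"}, "cement"),
    ({"Kloset", "sand"}, "cement"),
    ({"cat aviat", "sand"}, "cement"),
    ({"glue isarplas", "sand"}, "cement"),
    ({"Lank lips", "sand"}, "cement"),
    ({"nail", "sand"}, "cement"),
]


def suggest(basket):
    """Return goods implied by the rules that are not yet in the basket."""
    basket = set(basket)
    return sorted({
        consequent
        for antecedent, consequent in RULES
        if antecedent <= basket and consequent not in basket
    })


print(suggest(["nail", "sand"]))       # ['cement']
print(suggest(["faucets", "cement"]))  # ['sand']
print(suggest(["cement"]))             # []
```

Every rule ends in either "cement" or "sand," so in practice the lookup mostly signals that these two goods should be kept in stock and promoted alongside the other items.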

Conclusion
The final association pattern for purchases of material goods at UD Dian Pertiwi was obtained by applying the Apriori algorithm in Weka to 1,302 transaction records covering 5 months. With a minimum support of 1% and a minimum confidence of 70%, the test produced 10 final association rules, which are the best rules found.
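
To put the 1% threshold in concrete terms, the short calculation below converts it into a number of transactions. It assumes the plain definition of support as a share of all 1,302 transactions; Weka's own rounding of this threshold may differ, and the function name min_transaction_count is hypothetical.

```python
import math
from fractions import Fraction


def min_transaction_count(total_transactions, min_support_percent):
    """Smallest number of transactions whose share reaches the support threshold."""
    # Fractions avoid floating-point rounding before the ceiling is taken.
    return math.ceil(Fraction(min_support_percent, 100) * total_transactions)


# 1% of 1,302 transactions is 13.02, so a combination needs at least 14 of them.
print(min_transaction_count(1302, 1))  # 14
```

Under this assumption, each of the 10 rules describes a combination of goods that appeared together in at least 14 of the 1,302 transactions.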