Sales Transaction Data Analysis using Apriori Algorithm to Determine the Layout of the Goods

Shops usually apply a sales strategy. One such strategy is to arrange the layout of goods so that related items are placed close to one another. The layout can be based on items that are often purchased together. Such items can be found using data mining techniques, which process data into more useful information. Sales transaction data can be processed with the Apriori algorithm, one of the best-known algorithms for finding high-frequency patterns and generating association rules. The analysis produced 3 (three) association rules: "If you buy Milo Active 18 grm, then you buy ABC Kopi Susu 31G" with support 0.36% and confidence 75%; "If you buy Dancow 1 + Honey 200 grm, then you buy Ice Cream Corneto" with support 0.36% and confidence 60%; and "If you buy SIIP Roasted 6.5 grm, then you buy Davos Strong 10 grm" with support 0.36% and confidence 75%. These association rules can support decisions about the layout of goods that are likely to be purchased together.


Introduction
In the minimarket business, data is vital information for expanding the scope of the business. There are several ways to achieve this, namely improving product quality, ensuring stock inventory, and determining the layout of goods. All of these can be done by analyzing minimarket data. Businesses such as shops, minimarkets, and supermarkets are required to develop their business strategy for selling products [1] [2]. One such sales strategy is to determine the layout of items based on the items that are often purchased. Pandak Baturraden Minimarket sells a variety of daily necessities. To increase the convenience and service quality of the minimarket, a decision is needed on its sales strategy, namely placing goods that are related to one another close together [3] [4].
Based on an interview with the owner, Minimarket Pandak Baturraden does not yet have a sales strategy such as determining the layout of goods based on items that are often purchased. The goods are not yet neatly arranged, and their placement is still very random, so related goods are not placed together. Placement is currently based on price: products with a high selling price are placed behind the cashier to avoid the risk of loss, and fast-moving products are placed on the front shelves. The product shelves come in several types, namely TPE 29, 49, and 56, meaning that one shelf holds 29, 49, or 56 sample products. The products on the shelves are grouped by category.
Because Minimarket Pandak Baturraden arranges goods by price and category only, it is difficult for the minimarket to determine which products are often bought by consumers. According to the interview, the minimarket experienced a decrease in the number of transactions, from the usual 200 transactions per day in February 2018 to only 100 transactions per day in March 2018. Therefore, the minimarket requires a new sales strategy, namely determining the layout of goods based on the items that are often purchased. The frequently purchased items can be identified from the minimarket's sales transaction data. One technique for processing transaction data is data mining.
Data mining is a series of processes for extracting valuable knowledge from a data collection that could not be discovered manually [5]. The data in question can be the sales transaction data of a minimarket, which was previously used only for sales reports, inventory control, and so on [6] [7]. With data mining techniques, transaction data can be processed to obtain more useful information for decision making. Data mining offers several methods for obtaining such information; one of them is the association method. Several algorithms are often used to analyze shopping carts from sales transaction data, such as the Apriori algorithm, the FP-Growth algorithm, and the Linear Congruent Method (LCM) algorithm [8] [11].
In this research, the Apriori algorithm is used to process the minimarket's sales transaction data because it produces combination patterns of items and rules as knowledge and information from the transaction data [9]. Processing the transaction data with the Apriori algorithm yields association rules, that is, associative rules over combinations of items in an "if-then" pattern. The search for association rules starts with processing the transaction data and then looks for relationships between the purchased products.
The information obtained from the Apriori algorithm can be used for decision making in arranging the product layout in the minimarket. A layout in which related products are placed near one another makes it easier for customers to find the desired product. In the minimarket, employees still ignore the problem of product layout, so products are arranged irregularly. The layout can be based on which products are most commonly purchased at the same time, by implementing those rules [10].
Previous research [12] used the Apriori algorithm as a decision-making process for determining the layout of goods in Buaran Market Puwokerto. Based on the explanation above, this research conducts a shopping cart analysis using data mining techniques, namely the association method and the Apriori algorithm, to determine the layout of goods from the transaction data of the Pandak Baturraden convenience store, which can be used as a product sales strategy.

Method
This study followed the research stages described below.

Data Collection
In this research, primary data is used in the form of sales transaction data from Minimarket Pandak Baturraden. The data consists of 829 transactions and 829 product names recorded from 16 March to 17 April 2018. Each transaction contains 1 to 5 different product names. The preliminary data obtained is in the form of random sales receipts that have not yet been processed into data ready for use in the research.

Data transformation
This stage transforms and selects the data in order to obtain clean, ready-to-use data. In the transformation process, one attribute is used, namely the product name, shortened to a number from 1 to 829. The data is then converted into the .arff format, using the product code as an attribute, a "T" (true) to mean purchased, and a "?" to mean not purchased.

Application of Apriori algorithm
In this stage, itemsets are formed, ranging from 1-itemsets and 2-itemsets up to k-itemsets. An itemset is kept only if it meets the minimum support. The support value indicates how often a combination of items occurs in the database; it is the number of transactions containing A divided by the total number of transactions.
The itemsets that meet the minimum support are then formed into Apriori rules. After all high-frequency patterns have been obtained, the association rules that meet the minimum confidence requirement are sought. Confidence indicates the strength of the relationship between items in an association rule; it is the number of transactions containing A and B divided by the number of transactions containing A.
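The two measures described above can be illustrated with a minimal Python sketch. The function names and the toy baskets are assumptions made for illustration only; they are not the minimarket data or the Weka implementation used in the study.

```python
def count(transactions, itemset):
    """Number of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for basket in transactions if itemset <= set(basket))


def support(transactions, itemset):
    """Share of all transactions that contain the itemset."""
    return count(transactions, itemset) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Share of transactions containing A that also contain B."""
    n_a = count(transactions, antecedent)
    n_ab = count(transactions, set(antecedent) | set(consequent))
    return n_ab / n_a if n_a else 0.0


# Hypothetical toy baskets (items A, B, C are placeholders).
baskets = [{"A", "B"}, {"A", "B"}, {"A"}, {"B", "C"}, {"C"}]
print(support(baskets, {"A"}))            # A appears in 3 of 5 baskets
print(support(baskets, {"A", "B"}))       # A and B together in 2 of 5 baskets
print(confidence(baskets, {"A"}, {"B"}))  # 2 of the 3 baskets with A also have B
```

Working from raw counts rather than from rounded percentages keeps the confidence ratio exact up to floating-point precision.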

Conclusion
At this stage, it is concluded from the research results that the association method provides a solution for decision making in determining the product layout at the Pandak Baturraden minimarket by applying the item combinations that have been generated.

Data Collection
This research uses primary data obtained directly from Minimarket Pandak Baturraden. The data is sales transaction data that is entered when each transaction occurs and is then saved for the transaction report. The data used in the study covers 16 March to 17 April 2018, with a total of 829 transactions and 829 product names.

Data Transformation
This research uses preprocessing/cleaning stages to remove duplicate data, check the data, and correct errors in the data. Each product is then given an initial (code) to facilitate the research process. The initial is based on the brand or product name in each transaction, and the brands/product names are sorted alphabetically to speed up the coding process. Table 4.3 shows the sales transaction data with initials assigned to each purchased product, which makes it easier to see how many items were purchased in each transaction. The next process is data tabulation, that is, creating a table in which each product-name attribute is marked "T" (true) if the product was purchased and "?" if the product was not purchased by the consumer. The data in Table 4.4 is then converted to the .arff format, which is the Weka format. Figure 2 shows the .arff file ready to be processed in Weka. In that format, a "T" mark indicates that the item was purchased, and each product-name attribute, replaced by its initial, is of nominal type.
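The tabulation and .arff conversion can be sketched in Python as follows. This is only an illustration under stated assumptions: the attribute names (`P1`, `P2`, ...), the relation name, and the toy receipts are hypothetical, and the study's actual file layout is the one shown in Figure 2.

```python
def to_arff(transactions, product_codes, relation="sales"):
    """Render baskets as ARFF text with one nominal {T} attribute per product code."""
    lines = [f"@relation {relation}", ""]
    for code in product_codes:
        # Single-valued nominal attribute: "T" means the product was purchased.
        lines.append(f"@attribute P{code} {{T}}")
    lines += ["", "@data"]
    for basket in transactions:
        # "?" is the ARFF missing-value marker, used here for "not purchased".
        lines.append(",".join("T" if code in basket else "?" for code in product_codes))
    return "\n".join(lines) + "\n"


# Hypothetical toy data: three receipts over four product codes.
codes = [1, 2, 3, 4]
receipts = [{1, 3}, {2}, {1, 3, 4}]
print(to_arff(receipts, codes))
```

Each data row has one column per product code, so with 829 products every row of the real file would contain 829 comma-separated marks.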

Application of Apriori algorithm
In this stage, the Apriori algorithm is run to obtain the combination patterns of items. When forming itemsets with Weka, the support and confidence values must be determined in advance. This study used a minimum support of 0.35% and a minimum confidence of 60%. A support of 0.35% means that the products are purchased together in at least 0.35% of all analyzed transactions, and a confidence of 60% means that if a buyer buys product 1, there is at least a 60% chance that the buyer also buys product 2. Entering these support and confidence values into Weka produces the following results:

Fig. 3. Process results on Weka
Based on Figure 3, the analysis using the Apriori algorithm in Weka with a minimum support of 0.35% and a minimum confidence of 60% generated 3 (three) combinations that are related in the sales transaction data.

a. High-frequency pattern analysis
The first step forms the 1-item combinations with a minimum support of 0.35% and a total of 829 transactions. The support value can be calculated using the following formula:

$$\mathrm{Support}(A) = \frac{\text{number of transactions containing } A}{\text{total number of transactions}} \times 100\%$$

For example, Aqua 750 ml appears in 6 transactions, so its support is 6/829 = 0.72%. After forming the 1-item combinations in Table 4.5 with a minimum support of 0.35%, 230 items meet the minimum support requirement. The next step forms the 2-item combinations by combining one item with another item. Table 6 shows the 2-item combinations that satisfy the 0.35% support, a total of 9 combinations. The next step forms the 3-item combinations in the same way. Because no 3-itemset meets the minimum support, the formation of combinations is discontinued. Next, the confidence value of each 2-item combination is calculated.
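The level-wise search for high-frequency patterns can be sketched in Python as below. This is a simplified illustration, not the Weka implementation used in the study: the helper names and the toy baskets are hypothetical, and Weka's internal handling of the support bound may differ. The first helper converts the relative threshold into a transaction count: 0.35% of 829 transactions is about 2.9, so an itemset must appear in at least 3 transactions.

```python
import math
from itertools import combinations


def min_count(min_support, n_transactions):
    """Smallest transaction count whose relative support reaches min_support."""
    # The tiny epsilon guards against float noise when the product is a whole number.
    return math.ceil(min_support * n_transactions - 1e-9)


def frequent_itemsets(transactions, threshold):
    """Return {itemset: count} for every itemset found in at least `threshold` baskets."""
    baskets = [frozenset(t) for t in transactions]
    level = [frozenset([i]) for i in sorted({i for b in baskets for i in b})]
    result, k = {}, 1
    while level:
        counts = {c: sum(1 for b in baskets if c <= b) for c in level}
        frequent = {c: n for c, n in counts.items() if n >= threshold}
        if not frequent:
            break  # no k-itemset is frequent, so the search stops here
        result.update(frequent)
        k += 1
        # Join: unite pairs of frequent (k-1)-itemsets that form a k-itemset.
        candidates = {a | b for a, b in combinations(frequent, 2) if len(a | b) == k}
        # Prune: keep a candidate only if all of its (k-1)-subsets are frequent.
        level = [c for c in candidates
                 if all(frozenset(s) in frequent for s in combinations(c, k - 1))]
    return result


# With the study's settings: 0.35% of 829 transactions.
print(min_count(0.0035, 829))  # 3

# Hypothetical toy baskets, not the minimarket data.
toy = [{"milk", "bread"}, {"milk", "bread"}, {"milk", "bread", "eggs"}, {"eggs"}, {"milk"}]
for itemset, n in frequent_itemsets(toy, 3).items():
    print(sorted(itemset), n)
```

In the toy run the search ends after the 2-itemsets, which mirrors the way the study stops once no 3-item combination reaches the minimum support.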

b. Establishment of association rules
Association rules are formed from the selected itemsets. An association rule is established by calculating the confidence value and keeping the combinations of items that meet the minimum confidence of 60%. The confidence value can be calculated using the following formula:

$$\mathrm{Confidence}(A \rightarrow B) = \frac{\text{number of transactions containing } A \text{ and } B}{\text{number of transactions containing } A} \times 100\%$$

c) 691 → 155 is "If buying SIIP Roasted 6.5 grm, then buying Davos Strong 10 grm", with confidence 75%. This means that a buyer who bought SIIP Roasted 6.5 grm has a 75% probability of also buying Davos Strong 10 grm, and that SIIP Roasted 6.5 grm and Davos Strong 10 grm are bought together in 0.36% of transactions.
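The filtering of 2-item combinations by minimum confidence can be sketched in Python as follows. The function name, the product codes `X` and `Y`, and all counts in the example are hypothetical and are not taken from the study's tables; the sketch only shows that each frequent pair is tested in both directions.

```python
def rules_from_pairs(pair_counts, item_counts, n_transactions, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples passing min_confidence."""
    rules = []
    for (a, b), n_ab in pair_counts.items():
        # Each frequent pair can yield a rule in either direction.
        for antecedent, consequent in ((a, b), (b, a)):
            conf = n_ab / item_counts[antecedent]
            if conf >= min_confidence:
                rules.append((antecedent, consequent, n_ab / n_transactions, conf))
    return rules


# Hypothetical counts for two product codes, not taken from the study's tables.
pairs = {("X", "Y"): 3}
singles = {"X": 4, "Y": 10}
for rule in rules_from_pairs(pairs, singles, n_transactions=100, min_confidence=0.6):
    print(rule)  # ('X', 'Y', 0.03, 0.75)
```

Only the direction X → Y survives here, because 3 of the 4 baskets with X contain Y (75%), while only 3 of the 10 baskets with Y contain X (30%).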

Conclusion
Scanning the database, which consists of 829 transactions and 829 attributes, with a minimum support of 0.35% and a minimum confidence of 60% generated the 3 best final rules. Milo Active 18 grm was purchased 4 times, 3 of which were together with ABC Kopi Susu 31G, giving a confidence of 75%.
Dancow 1 + Honey 200 grm was purchased 5 times, 3 of which were together with Ice Cream Corneto, giving a confidence of 60%. SIIP Roasted 6.5 grm was purchased 4 times, 3 of which were together with Davos Strong 10 grm, giving a confidence of 75%.
This shows that buyers have bought these 6 products at the Pandak Baturraden minimarket. The association rules formed can be used to place related goods adjacent to one another, making it easier for buyers to find related products, and can also serve as a reference for stocking goods.
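The reported support and confidence values can be re-derived from the purchase counts stated above (829 transactions; antecedents bought 4, 5, and 4 times; each pair bought together 3 times). The short Python check below is a sketch for verification only; the helper name `describe` is an assumption.

```python
N_TRANSACTIONS = 829  # total transactions in the study

# (antecedent, consequent, times antecedent was bought, times both were bought together)
REPORTED = [
    ("Milo Active 18 grm", "ABC Kopi Susu 31G", 4, 3),
    ("Dancow 1 + Honey 200 grm", "Ice Cream Corneto", 5, 3),
    ("SIIP Roasted 6.5 grm", "Davos Strong 10 grm", 4, 3),
]


def describe(n_antecedent, n_both, n_total=N_TRANSACTIONS):
    """Return (support, confidence) formatted as percentages."""
    return f"{n_both / n_total:.2%}", f"{n_both / n_antecedent:.0%}"


for antecedent, consequent, n_a, n_ab in REPORTED:
    support, confidence = describe(n_a, n_ab)
    print(f"{antecedent} -> {consequent}: support {support}, confidence {confidence}")
```

All three rules share the same support because each pair occurs in exactly 3 of the 829 transactions, which rounds to 0.36%.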

Conclusion
Based on the research that has been done, the following can be concluded:
1. Three final association rules were formed that meet the minimum support and confidence values:
a. If buying Milo Active 18 grm, then buying ABC Kopi Susu 31G, with support 0.36% and confidence 75%.
b. If buying Dancow 1 + Honey 200 grm, then buying Ice Cream Corneto, with support 0.36% and confidence 60%.
c. If buying SIIP Roasted 6.5 grm, then buying Davos Strong 10 grm, with support 0.36% and confidence 75%.
2. Based on these rules, it is advisable to place Milo Active 18 grm adjacent to ABC Kopi Susu 31G, Dancow 1 + Honey 200 grm adjacent to Ice Cream Corneto, and SIIP Roasted 6.5 grm adjacent to Davos Strong 10 grm.
3. Pandak Baturraden minimarket can use the association rules formed to establish a sales strategy, namely placing related goods close to one another.