Analysis of Sequential Book Loan Data Pattern Using Generalized Sequential Pattern (GSP) Algorithm

As a center for learning and information services, the STMIK Amikom Purwokerto Library is a source of learning and intellectual activity that is very important for the entire academic community in supporting the achievement of the college's Tridharma program. Book-lending transaction data can yield information that supports decision making when analyzed further. One useful result is the pattern of user behavior in borrowing books, which can be used to keep the stock of related books balanced. This study uses the Generalized Sequential Pattern (GSP) algorithm, which can determine the behavior patterns of users in each transaction and show relationships or associations between books, whether borrowed simultaneously or sequentially. The calculations produced 295 frequent sequences, consisting of 3 sequence patterns, formed from a minimum support of 0.53%, which corresponds to a minimum of 2 books borrowed. Three books are very strongly linked in the lending transactions, namely book codes 6690, 2026, and 8131.


Introduction
A library is a place or building reserved for the maintenance and use of books and similar materials; it can also be understood as a collection of books, magazines, and other literature kept for reading, study, and discussion [1]. Beyond serving collections of writings, prints, and/or recordings, the library is now regarded as an information resource that drives an institution [2] [3].
Book-lending transaction data can produce valuable information to support decision making when analyzed further. Useful examples include student behavior patterns in borrowing books, information about related books, keeping the stock of related books balanced, arranging the placement of related books on bookshelves, and many other strategies that support decisions about managing book stock in the STMIK Amikom Purwokerto library.
Further analysis of book loan data can be done by applying data mining. Data mining is the process of extracting information or patterns from large databases [4]. Applying data mining to the book-lending data of the STMIK Amikom Purwokerto library is expected to support decision making by revealing book-lending patterns.
Previous research on the book-lending data of the STMIK Amikom Purwokerto library, conducted by [5], combined the Apriori algorithm and the FP-Growth algorithm and obtained 5 association rules with a minimum support of 0.01 (1%) and a minimum confidence of 0.5 (50%) [6].
The research conducted by [7] developed a market basket analysis application for supermarkets using the Generalized Sequential Pattern algorithm. The results of the analysis can be used as a business strategy, for example to recommend the layout of goods and to keep the stock of related products balanced. Testing the application on transaction data from 10 August to 25 August with a minimum support of 50% produced 18 rules from 10 transactions, 4 customers, and 25 items, with an execution time of 00:00:00:604.

Generalized Sequential Pattern (GSP) algorithm
The GSP algorithm is used for sequence mining and solves many sequence-mining problems based on the Apriori algorithm [8]. The primary function of the GSP algorithm is to find patterns [9]. Sequential pattern mining tries to find relationships between occurrences of sequential events in order to search for a specific sequence of events. In other words, sequential pattern mining aims to find frequently occurring sequences in order to describe the data, to predict future data, or to mine periodic patterns [10].
The Generalized Sequential Pattern (GSP) algorithm can extract information about books that are frequently borrowed together (the rules) and books that are frequently borrowed one after another (the sequential pattern rules) by the same borrower. With the GSP algorithm, both kinds of information are obtained simultaneously in one process [11][12].
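The two kinds of pattern described above can be illustrated with a minimal sketch of support counting. It assumes that each borrower's history is a list of loans and that each loan is a set of book codes; the names `contains`, `support`, and `toy_db`, as well as the toy data, are hypothetical and are not taken from the study.

```python
from typing import FrozenSet, List

# A sequence is an ordered list of loans; each loan is a set of book codes.
Sequence = List[FrozenSet[int]]


def contains(sequence: Sequence, pattern: Sequence) -> bool:
    """Return True if every itemset of `pattern` is a subset of some loan in
    `sequence`, with the matching loans appearing in the same order."""
    matched = 0
    for itemset in sequence:
        # Greedy left-to-right matching is sufficient when no time
        # constraints are applied.
        if matched < len(pattern) and pattern[matched] <= itemset:
            matched += 1
    return matched == len(pattern)


def support(database: List[Sequence], pattern: Sequence) -> int:
    """Count the sequences (borrowers) that contain the pattern."""
    return sum(1 for sequence in database if contains(sequence, pattern))


# Hypothetical toy data: three borrowers, book codes used only as examples.
toy_db = [
    [frozenset({2026, 8131}), frozenset({6690})],
    [frozenset({2026}), frozenset({8131})],
    [frozenset({2026, 8131})],
]
same_loan = [frozenset({2026, 8131})]                # borrowed together
successive = [frozenset({2026}), frozenset({8131})]  # borrowed one after another

print(support(toy_db, same_loan))   # 2
print(support(toy_db, successive))  # 1
```

The same counting function serves both pattern types: a one-itemset pattern describes books borrowed together, while a multi-itemset pattern describes books borrowed in successive loans.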

Sequential Pattern Mining Framework (SPMF)
Sequential Pattern Mining Framework (SPMF) is a library written in the Java programming language for data mining tasks, released under the open-source GPL v3 license [13]. For a given input file, the output is a text file in a defined format. Each row is a frequent sequential pattern. Each item of a sequential pattern is a positive integer, and items of the same itemset are separated by a single space. The value "-1" indicates the end of an itemset. The value "-2" indicates the end of a sequence (this appears at the end of each line). On each row, the sequential pattern comes first. Then the keyword "#SUP:" appears, followed by an integer giving the support of the pattern as a number of sequences [14] [15].
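The line format described above can be read with a few lines of code. The following is a minimal sketch rather than part of SPMF itself: the function name `parse_spmf_pattern` is hypothetical, and the parser treats the "-2" terminator as optional so that it accepts lines with or without it.

```python
from typing import List, Tuple


def parse_spmf_pattern(line: str) -> Tuple[List[List[int]], int]:
    """Split one pattern line into its itemsets and its support count."""
    pattern_part, keyword, support_part = line.partition("#SUP:")
    if not keyword or not support_part.strip():
        raise ValueError("line has no '#SUP:' field: %r" % line)

    itemsets: List[List[int]] = []
    current: List[int] = []
    for token in pattern_part.split():
        value = int(token)
        if value == -1:        # end of the current itemset
            itemsets.append(current)
            current = []
        elif value == -2:      # end of the sequence
            break
        else:                  # a positive integer item
            current.append(value)
    if current:                # tolerate a missing trailing -1
        itemsets.append(current)

    # Only the first token after the keyword is read as the support.
    return itemsets, int(support_part.split()[0])


# Illustrative lines in the format described above.
print(parse_spmf_pattern("22 49 -1 #SUP: 5"))
print(parse_spmf_pattern("22 -1 49 -1 -2 #SUP: 2"))
```

In the first line both items belong to one itemset (books in the same loan); in the second they are separate itemsets (books borrowed in successive loans).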

Method
This research was conducted in the library of STMIK AMIKOM Purwokerto, located at Jl. Letjend. Pol. Sumarto (in front of NES), Watumas, Purwokerto. The research uses lending data from 20 March 2017 to 16 August 2017.

Fig 1. Research Framework
The research flow in this study is as follows:

Problem Identification
Problem identification is carried out to understand the problems in the Amikom library and to choose a new method, so that the points for analyzing the algorithm's performance on book-lending data can be determined.

Data Collection
The primary data used in this study are book-lending data obtained from the library of STMIK AMIKOM Purwokerto. The data consist of 1425 records covering 20 March 2017 to 16 August 2017. Supporting information about the library was also collected in the form of the library profile, graphs, and interviews.

Pre-Processing Stage
At this stage, the data are selected and cleaned so that they are ready to be used as research material. The steps are data selection, data cleaning, and data transformation. This research analyzes which books appear in each transaction, not how many copies are borrowed, in order to find relations between books. Therefore, the prepared data are filtered again so that only one record is kept when the same book title appears more than once in a transaction.
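The filtering step described above can be sketched as follows. This is only an illustration under stated assumptions: the record layout `(borrower_id, loan_date, book_code)` and the function names `build_sequences` and `to_spmf_line` are hypothetical, dates are assumed to be ISO strings so that they sort chronologically, and the output line follows the "-1" / "-2" convention described in the SPMF section.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Record = Tuple[str, str, int]  # (borrower_id, loan_date as YYYY-MM-DD, book_code)


def build_sequences(records: Iterable[Record]) -> Dict[str, List[List[int]]]:
    """Group records per borrower and per loan date, keeping each book once."""
    loans: Dict[str, Dict[str, set]] = defaultdict(lambda: defaultdict(set))
    for borrower, loan_date, book_code in records:
        # A set keeps a single entry when the same book is recorded twice
        # in one transaction.
        loans[borrower][loan_date].add(book_code)
    return {
        borrower: [sorted(by_date[d]) for d in sorted(by_date)]
        for borrower, by_date in loans.items()
    }


def to_spmf_line(sequence: List[List[int]]) -> str:
    """Write one sequence with -1 after each itemset and -2 at the end."""
    parts: List[str] = []
    for itemset in sequence:
        parts.extend(str(item) for item in itemset)
        parts.append("-1")
    parts.append("-2")
    return " ".join(parts)


# Hypothetical records: borrower "A" has book 2026 recorded twice in one loan.
records = [
    ("A", "2017-03-20", 2026),
    ("A", "2017-03-20", 2026),
    ("A", "2017-03-20", 8131),
    ("A", "2017-04-02", 6690),
]
sequences = build_sequences(records)
print(to_spmf_line(sequences["A"]))  # 2026 8131 -1 6690 -1 -2
```

Items inside each itemset are sorted so that every loan is written in a consistent order.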

Data Analysis Using Generalized Sequential Pattern Algorithm
The algorithm used in this study is the Generalized Sequential Pattern algorithm.

Conclusion
At this stage, the authors draw conclusions from the results of the study, namely the book-lending patterns formed by using the Generalized Sequential Pattern algorithm.

Results and Discussion
The sequential patterns are determined with the GSP algorithm with the help of the SPMF application. This application reads input data in sequential form. Before the data are processed in SPMF, each book code is replaced by an initialization number. Running the GSP algorithm in the SPMF software with a minimum support of 0.53% generated 302 frequent sequences; the software also reports the execution time the algorithm needed to form the sequential patterns.
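A relative minimum support such as 0.53% is usually turned into an absolute count of sequences before mining. The sketch below assumes the common convention of rounding up; the function name `min_support_count` is hypothetical, and the number of sequences passed in the example is an illustrative value, not a figure reported in this study.

```python
import math


def min_support_count(min_support_ratio: float, n_sequences: int) -> int:
    """Convert a relative minimum support into an absolute sequence count.

    Rounding up is an assumed convention; at least one sequence is required.
    """
    return max(1, math.ceil(min_support_ratio * n_sequences))


# Illustrative only: 377 sequences is a hypothetical database size.
print(min_support_count(0.0053, 377))  # 2
```

Under this convention, a pattern is frequent when the number of sequences containing it is at least the returned count.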

Fig. 2 SPMF Data input display
The following are the results of the GSP algorithm output from the SPMF software. After the data analysis in SPMF is complete, the resulting dataset is processed further in Microsoft Excel 2010. This stage aims to obtain the association rules using the Generalized Sequential Pattern algorithm. With a minimum support of 0.53% and a minimum confidence of 50%, the following results were obtained.

The Generate Frequent Itemset Process
The generate frequent itemset process forms candidate itemsets and computes their support in order to obtain the frequent itemsets that satisfy the minimum support using the GSP algorithm.
a. Calculate the support of each item and compare it with the minimum support. The non-sequential items field holds items borrowed at the same time, while the sequential items field holds items borrowed one after another.
b. Table 3 and Table 4 show the 2nd candidate itemsets that meet the minimum support. The non-sequential frequent 2-itemsets comprise 30 patterns (books in the same loan), the highest support being 5 for the initials 22 49 (book codes 2026 and 8131). The sequential frequent 2-itemsets comprise 16 patterns borrowed sequentially by users, all with a support of 2.
c. The 3rd candidate itemsets are formed by joining the frequent 1-itemsets with the frequent 2-itemsets and then calculating their support. This yields 1 non-sequential pattern and 1 sequential pattern, each with a support of 2.
d. The iteration of the GSP algorithm stops when no further candidate itemset can be formed. Table 5 and Table 6 are examples of the final frequent itemsets obtained from the transaction data.
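The first two passes of the steps above can be sketched as follows. This is a simplified illustration, not the SPMF implementation: it covers only the frequent single items and the length-2 candidates (same-loan pairs and ordered successive pairs), omits the general join and prune steps and any time constraints, and uses hypothetical function names and toy data.

```python
from itertools import combinations, product
from typing import FrozenSet, List, Tuple

Sequence = List[FrozenSet[int]]


def contains(sequence: Sequence, pattern: Sequence) -> bool:
    """True if the pattern's itemsets occur, in order, inside the sequence."""
    matched = 0
    for itemset in sequence:
        if matched < len(pattern) and pattern[matched] <= itemset:
            matched += 1
    return matched == len(pattern)


def frequent_items(db: List[Sequence], min_count: int) -> List[int]:
    """Pass 1: items that occur in at least `min_count` sequences."""
    counts = {}
    for sequence in db:
        for item in set().union(*sequence):  # count each item once per sequence
            counts[item] = counts.get(item, 0) + 1
    return sorted(item for item, n in counts.items() if n >= min_count)


def candidates_2(items: List[int]) -> List[Sequence]:
    """Pass 2 candidates: pairs in one loan, then ordered pairs across loans."""
    same_loan = [[frozenset(pair)] for pair in combinations(items, 2)]
    successive = [[frozenset([a]), frozenset([b])]
                  for a, b in product(items, repeat=2)]
    return same_loan + successive


def frequent_2(db: List[Sequence], min_count: int) -> List[Tuple[Sequence, int]]:
    """Keep the length-2 candidates whose support reaches `min_count`."""
    result = []
    for candidate in candidates_2(frequent_items(db, min_count)):
        sup = sum(contains(sequence, candidate) for sequence in db)
        if sup >= min_count:
            result.append((candidate, sup))
    return result


# Hypothetical toy database of four borrowers (items are initialization numbers).
toy_db = [
    [frozenset({22, 49}), frozenset({7})],
    [frozenset({22, 49})],
    [frozenset({22}), frozenset({7})],
    [frozenset({5})],
]
for pattern, sup in frequent_2(toy_db, min_count=2):
    print([sorted(itemset) for itemset in pattern], sup)
```

On this toy data, one same-loan pattern (22 and 49 together) and one successive pattern (22 followed by 7) reach the minimum count of 2; longer candidates would be built from these survivors in later passes.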

The Generate Rules
The generate rules process produces rules from the frequent itemset table created in the generate frequent itemset process.
a. If a user borrows the book "Multimedia: Concept & Application in Education", then the user will borrow the books "Multimedia: Concept & Application in Education" and "Digital Multimedia: Basic Theory + Development".
b. If a user borrows the book "Rational Rose for Object-Oriented Modeling", then the user will borrow the books "Rational Rose for Object-Oriented Modeling" and "Software-Oriented Engineering with the USDP (Unified Software Development Process) Method".
c. If a user borrows the book "Multimedia: Concept & Application in Education" together with the book "Quantitative, Qualitative, and R & D Research Methods", then the user will borrow the books "Multimedia: Concept & Application in Education", "Digital Multimedia: Basic Theory + Development", and "Quantitative, Qualitative, and R & D Research Methods".
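Rules of the form listed above are typically filtered by confidence, the support of the full pattern divided by the support of the antecedent. The sketch below assumes that definition and a minimum confidence of 50%; it handles only single-item antecedents for same-loan itemsets, and the function name `generate_rules` and all support values are hypothetical, not results from the study.

```python
from typing import Dict, FrozenSet, List, Tuple

Rule = Tuple[int, Tuple[int, ...], float]  # (antecedent item, full itemset, confidence)


def generate_rules(supports: Dict[FrozenSet[int], int],
                   min_confidence: float = 0.5) -> List[Rule]:
    """Build rules 'item -> itemset' whose confidence reaches the threshold."""
    rules: List[Rule] = []
    for itemset, sup in supports.items():
        if len(itemset) < 2:
            continue  # a rule needs at least two books
        for item in sorted(itemset):
            antecedent = frozenset([item])
            if antecedent not in supports:
                continue  # antecedent support unknown, so no rule is formed
            confidence = sup / supports[antecedent]
            if confidence >= min_confidence:
                rules.append((item, tuple(sorted(itemset)), confidence))
    return rules


# Hypothetical support counts keyed by itemset (initialization numbers).
toy_supports = {
    frozenset([22]): 8,
    frozenset([49]): 6,
    frozenset([7]): 20,
    frozenset([22, 49]): 5,
    frozenset([7, 22]): 2,
}
for item, itemset, confidence in generate_rules(toy_supports):
    print(f"if {item} then {itemset}: confidence {confidence:.2f}")
```

With these toy counts, only the two rules derived from the itemset {22, 49} pass the 50% threshold; the rules derived from {7, 22} are discarded because their confidence is 0.10 and 0.25.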

Conclusion
The number of rules generated is influenced by the volume of transaction data, the number of items borrowed simultaneously, and the number of users. In addition, a book-lending behavior pattern observed at one time can reappear at a later time.