Modelling Customer Lifetime Value for Non-Contractual Businesses

Because customer equity is becoming increasingly important in today's business environment, many companies are focusing on customer loyalty and profitability as a way to grow market share. Successful Customer Relationship Management (CRM) begins with identifying each customer's true value and loyalty, since customer value provides the basic information needed to deliver more targeted and personalized marketing. In this paper, customer lifetime value (CLV) is used to segment customers of non-contractual businesses. The findings are promising: the CLV model produced reasonably strong estimates of the value of each customer and of whether they will make a repeat transaction.


Introduction
Companies must innovate to capture consumer desires and improve customer loyalty and retention in today's market, which is increasingly dynamic and competitive [1]. Customer relationship management (CRM) is a well-known approach to attracting and retaining customers. CRM's key aim is to develop long-term, successful customer relationships [2]. In this context, large databases containing detailed demographic and purchase data are available, and various CRM tools can be used to analyze this data and evaluate customer equity. Customer Lifetime Value (CLV) is a CRM concept that reflects the present value of all future income generated by a customer [3]. CLV estimation has many uses, and many authors have developed models for them, including performance assessment [4], customer segmentation [5], allocation of marketing resources [6,7], product offering [8], and pricing [9]; see also [10-12].
The relationship between a company and its buyers in e-commerce or retail is non-contractual. In a non-contractual setting customers do leave, but they do so silently; they do not have to tell us they are going. This makes calculating CLV more complex. We must look at the time elapsed since a customer's last transaction to judge whether the customer is alive but inactive or "dead" ("alive" means the customer still interacts with us; "dead" means they have become inactive as a customer). In this paper, we model customer lifetime for non-contractual businesses.

Customer Lifetime Value
The CLV concept grew out of Customer Relationship Management (CRM). CRM is a company-wide approach to understanding and shaping customer behaviour through meaningful communication in order to improve customer acquisition, retention, satisfaction, and profitability [13]. Its aim is to create closer and stronger relationships with customers in order to increase their lifetime value to a brand [14]. Previous research has classified CLV models in several ways. One classification, by Gupta et al., describes six modeling approaches, including: the Recency, Frequency, and Monetary (RFM) model; probability models, based on the Pareto/NBD model and Markov chains; econometric models, which like probability models build on the Pareto/NBD model and cover customer acquisition, customer retention, and customer margin and expansion; persistence models, which focus on modeling the behavior of the CLV components, namely acquisition, retention, and cross-selling; and computer science models, described as theory-based (e.g., utility theory) and easy to interpret [15].
Several scholars have recently suggested using WRFM (weighted RFM) instead of RFM, in which R, F, and M each carry their own weight. Different weights must be applied to the RFM parameters depending on industry characteristics. For example, Wei [16] suggests putting the most weight on frequency, then recency, and finally the monetary measure [18], whereas Chuang and Shen (2008) suggest putting the most weight on monetary and the least weight on recency [1]. The relative importance (weight) of the RFM variables is calculated using the Analytic Hierarchy Process (AHP).
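As a minimal sketch of how a weighted RFM score could be computed, the Python snippet below combines normalised R, F, and M scores with relative weights. The function name, the weights, and the two example customers are hypothetical; in particular, the weights are picked by hand here rather than derived with AHP, and each score is assumed to be scaled to [0, 1] with higher meaning better.

```python
def weighted_rfm_score(r, f, m, weights):
    """Combine normalised R, F and M scores (each in [0, 1], higher is better)
    into one score, using relative weights that need not sum to 1."""
    w_r, w_f, w_m = weights
    total = w_r + w_f + w_m
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return (w_r * r + w_f * f + w_m * m) / total


# Hypothetical weights (R, F, M): frequency first, then recency, then monetary.
weights = (3, 5, 2)

# Hypothetical customers with already-normalised (R, F, M) scores.
customers = {
    "A": (0.9, 0.2, 0.5),  # recent, infrequent, medium spend
    "B": (0.3, 0.9, 0.4),  # not recent, very frequent
}

scores = {name: weighted_rfm_score(*rfm, weights) for name, rfm in customers.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
```

With frequency weighted most heavily, customer B ranks above customer A; swapping the weights to favour recency would reverse that order, which is why the choice of weights has to reflect the industry.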

Data Mining
Data mining is the process of automatically discovering useful information in large data repositories. Data mining methods are used to sift through large datasets in search of new and useful patterns that would otherwise go unnoticed [19]. There are two types of data mining methods: descriptive and predictive. Clustering is a descriptive method, whereas classification is a predictive one. Classification is the process of finding a model (or function) that describes and distinguishes data classes or concepts, with the intention of using the model to predict the class of objects whose class label is unknown [20]. Unlike classification and prediction, which analyze data objects with class labels, clustering analyzes data objects without them. In this paper, the CLV for each segment is determined using the k-means clustering approach. K-means, originally known as Forgy's method [21], is a well-known clustering algorithm that has been widely used in a number of fields, including data mining, statistical data analysis, and other business applications.
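To illustrate the clustering step, here is a small, self-contained k-means sketch in plain Python. It is not the implementation used in this study: the function names, the seed, and the two-feature example points are all assumptions made for illustration, and the initial centroids are simply k randomly chosen observations.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iter=100, seed=0):
    """Basic k-means: returns (centroids, clusters) for a list of tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen observations
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:  # assignments are stable, so stop
            break
        centroids = new_centroids
    return centroids, clusters


# Hypothetical two-feature customer scores: a low-value and a high-value group.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, clusters = kmeans(points, k=2)
```

Because k-means only finds a local optimum, results on real data can depend on the initial centroids; running it several times with different seeds is a common precaution.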

Method
Before starting the modeling and prediction process, we describe the data used in this study. We use the Online Retail dataset, which is openly available for download from the UCI Machine Learning Repository. We first take a closer look at the features in the dataset. Figure 1 shows the structure of the dataset along with some sample data. The dataset contains approximately 550,000 records, but not all of them will be used for building (training and testing) the model. Some records still need to be cleaned because they contain NaN values, which would seriously harm model training. After this cleanup, we create a new data frame containing only CustomerID and InvoiceDate (with the time of day removed), and add a new column, "sales", as shown in Figure 2.

Fig. 2. New Dataframe
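The cleanup described above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the code used in the study: rows are assumed to arrive as dictionaries of strings, the date format is assumed, a missing CustomerID is assumed to appear as an empty value, and "sales" is assumed to be Quantity multiplied by UnitPrice, which the text does not state explicitly.

```python
from datetime import datetime


def clean_transactions(rows):
    """Drop rows without a CustomerID, strip the time of day, and add 'sales'."""
    cleaned = []
    for row in rows:
        customer_id = row.get("CustomerID")
        if customer_id in (None, ""):  # missing ID (NaN in a data frame)
            continue
        # Assumed timestamp format; keep only the calendar date.
        invoice_date = datetime.strptime(row["InvoiceDate"], "%Y-%m-%d %H:%M:%S").date()
        # Assumption: sales = quantity * unit price.
        sales = float(row["Quantity"]) * float(row["UnitPrice"])
        cleaned.append({"CustomerID": customer_id, "InvoiceDate": invoice_date, "sales": sales})
    return cleaned


# Hypothetical sample rows; the second has no CustomerID and is dropped.
rows = [
    {"CustomerID": "17850", "InvoiceDate": "2010-12-01 08:26:00", "Quantity": "6", "UnitPrice": "2.5"},
    {"CustomerID": "", "InvoiceDate": "2010-12-01 08:30:00", "Quantity": "1", "UnitPrice": "4.0"},
]
cleaned = clean_transactions(rows)
```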
The following nomenclature is used for CLV models. Frequency is the number of repeat purchases a customer has made, that is, one less than the total number of purchases. T denotes the customer's age in the chosen time unit (days, in our dataset): the time between the customer's first transaction and the end of the period under study. Recency is the customer's age at the time of their most recent purchase: the time between the customer's first and last transactions. (Recency is 0 if they made only one purchase.)

Fig. 3. CLV Dataframe
Our dataset currently includes 4,339 customers. CustomerID 12346, for example, made only one purchase (no repeat), so the frequency and recency are both zero and the age is 325 days (i.e., the time between the first purchase and the end of the period under analysis). More than 35% of all customers in our dataset made only a single purchase (no repeat).
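The definitions of frequency, recency, and T can be made concrete with a short Python sketch. The function name and the transaction dates below are hypothetical, and counting several purchases on the same day as one purchase is an assumption. The date for customer 12346 was chosen so that T comes out at 325 days when the study period ends on 2011-12-09.

```python
from datetime import date


def summarize(transactions, observation_end):
    """Per-customer frequency, recency and T (in days) from (customer_id, date) pairs."""
    purchase_days = {}
    for customer_id, day in transactions:
        if day <= observation_end:
            # A set keeps one entry per day (assumption: same-day purchases count once).
            purchase_days.setdefault(customer_id, set()).add(day)
    summary = {}
    for customer_id, days in purchase_days.items():
        first, last = min(days), max(days)
        summary[customer_id] = {
            "frequency": len(days) - 1,            # repeat purchases only
            "recency": (last - first).days,        # age at the last purchase
            "T": (observation_end - first).days,   # age at the end of the period
        }
    return summary


# Hypothetical transactions.
transactions = [
    ("12346", date(2011, 1, 18)),
    ("99999", date(2011, 1, 1)),
    ("99999", date(2011, 1, 1)),   # same-day duplicate, counted once
    ("99999", date(2011, 1, 11)),
    ("99999", date(2011, 2, 10)),
]
summary = summarize(transactions, observation_end=date(2011, 12, 9))
```

Here the single-purchase customer gets frequency 0 and recency 0, while the second customer has two repeat purchases and a recency of 40 days.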

Visualizing our frequency/recency matrix
Consider a customer who made a purchase every day for four weeks, and from whom we have not heard in months. What are the chances that he is still "alive"? Quite small. On the other hand, a customer who has historically bought once a year and has bought twice in the most recent quarter is most likely still alive. This relationship can be visualized with the frequency/recency matrix, which shows the expected number of purchases a customer will make in the next time period given their recency (age at last purchase) and frequency (number of repeat transactions made).

Fig. 6. Expected Number of Future Purchases for 1 Unit of time
A customer who has made 120 purchases and whose most recent purchase came when he was 350 days old (i.e., recency: the time between the first and last transaction was 350 days) is our best customer (bottom right). Customers who have bought frequently and recently are likely to be the best customers in the future, and we can never have enough of them. Customers who have purchased a lot but not recently (top right corner) have probably left. There are also customers around (40, 300): these buy infrequently and we have not seen them recently, so they might buy again, but we cannot be sure whether they have left or are simply between purchases. Finally, we can predict which customers are very likely still alive:
Fig. 7. Probability Customer is Alive
From Figure 7 we can conclude that customers who have bought recently are almost certainly still "alive". Customers who have bought a lot but not recently have most likely left, and the more they bought in the past, the more likely it is that they have left. They appear at the top right.
We rank customers from highest to lowest expected purchases in the next period. The model exposes a method that predicts a customer's expected purchases in the next period from their history. Figure 8 lists the top 5 customers the model expects to purchase in the next day. The predicted purchases column shows the expected number of purchases, while the other three columns show their current recency and frequency metrics. The BG/NBD (Beta-Geometric/Negative Binomial Distribution) model expects these customers to make more purchases in the near future because they are currently our best customers.
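The pattern in the probability-alive plot follows from the standard BG/NBD expression for the probability that a customer is alive, given frequency x, recency t_x, and age T. The Python sketch below evaluates that expression; the parameter values r, alpha, a, and b are hypothetical placeholders, not the values fitted in this study.

```python
import math


def bgnbd_p_alive(x, t_x, T, r, alpha, a, b):
    """P(alive | x, t_x, T) under BG/NBD:
    1 / (1 + a / (b + x - 1) * ((alpha + T) / (alpha + t_x)) ** (r + x)) for x > 0,
    and 1 for x = 0 (no repeat purchase means no opportunity to drop out)."""
    if x == 0:
        return 1.0
    # Work in logs so that large frequencies do not overflow.
    log_odds_dead = (
        math.log(a) - math.log(b + x - 1)
        + (r + x) * (math.log(alpha + T) - math.log(alpha + t_x))
    )
    if log_odds_dead > 700:  # exp() would overflow; the probability is effectively 0
        return 0.0
    return 1.0 / (1.0 + math.exp(log_odds_dead))


# Hypothetical model parameters, for illustration only.
params = dict(r=0.25, alpha=4.0, a=0.8, b=2.5)

frequent_recent = bgnbd_p_alive(120, 350, 350, **params)  # bought a lot, and recently
frequent_stale = bgnbd_p_alive(120, 100, 350, **params)   # bought a lot, long ago
moderate_stale = bgnbd_p_alive(10, 100, 350, **params)    # bought less, long ago
```

Under these placeholder parameters the frequent and recent customer is almost certainly alive, while for the same long silence the customer with 120 past purchases is less likely to be alive than the one with 10, matching the reading of the plot given above.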

Assessing Model Fit
The results so far are acceptable and the fitted model appears usable, so we continue with the analysis. We now partition the dataset into a calibration period and a holdout period. This is important because we want to test the performance of our model on unseen data (similar to cross-validation in machine learning practice). In the plot above (Figure 10), the data is divided into in-sample (calibration) and validation (holdout) periods. The holdout period runs from 2011-06-09 to 2011-12-09, while the calibration period runs from the start of the data to 2011-06-08. The plot groups all customers by the number of repeat purchases they made in the calibration period (x-axis) and averages their purchases over the holdout period (y-axis). The green and blue lines show the model prediction and the actual outcome, respectively. As we can see, the model predicts the out-of-sample behavior of the customer base with considerable precision, although it under-forecasts at 4 and 5 transactions.
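A minimal sketch of the calibration/holdout split is shown below, using the two period boundaries quoted above. The function name, the output field names, and the sample transactions are hypothetical. Two further assumptions are made: same-day purchases count once, and only customers first seen during the calibration period are kept.

```python
from datetime import date


def split_calibration_holdout(transactions, calibration_end, observation_end):
    """Count repeat purchases in calibration and purchases in holdout per customer."""
    calibration_days, holdout_days = {}, {}
    for customer_id, day in transactions:
        if day <= calibration_end:
            calibration_days.setdefault(customer_id, set()).add(day)
        elif day <= observation_end:
            holdout_days.setdefault(customer_id, set()).add(day)
    result = {}
    # Assumption: customers who first appear in the holdout period are excluded.
    for customer_id, days in calibration_days.items():
        result[customer_id] = {
            "frequency_cal": len(days) - 1,
            "frequency_holdout": len(holdout_days.get(customer_id, ())),
        }
    return result


# Hypothetical transactions.
transactions = [
    ("A", date(2011, 1, 10)), ("A", date(2011, 3, 5)),
    ("A", date(2011, 7, 1)), ("A", date(2011, 7, 1)), ("A", date(2011, 11, 20)),
    ("B", date(2011, 5, 1)),
    ("C", date(2011, 8, 1)),  # first seen in the holdout period
]
split = split_calibration_holdout(
    transactions,
    calibration_end=date(2011, 6, 8),
    observation_end=date(2011, 12, 9),
)
```

Grouping customers by frequency_cal and averaging frequency_holdout within each group gives the "actual" curve that the model's predictions are compared against.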

Fig. 11. Customer Transaction Prediction
Based on a customer's history, we can now predict their individual future purchases. Our model predicts that customer 12347 will make 0.157 transactions in the next 10 days. Using the trained model, we can also calculate a customer's historical probability of being alive from their transaction history. For example, we can look at the transaction history of our best customer and see whether they are still alive:

Results and Conclusion
The current study focuses on customer segmentation as one of the applications of CLV. As a case study, customer data from a non-contractual business was examined. Using the k-means algorithm, we divide customers into segments based on RFM and extended RFM parameters. Customer segmentation allows decision makers to identify market segments more clearly and to develop more effective marketing and sales strategies for customer retention. Because the RFM weights vary with industry characteristics, the AHP method is used to determine the relative importance of the RFM variables based on the judgment of experts in the sales department. For each customer segment, the CLV value is calculated using the weighted RFM parameters. After that, each segment is given a CLV ranking based on its CLV value. Potential value represents cross-selling opportunities, while present value provides a financial perspective. By analyzing the CLV rankings of the segmented customer groups, we can develop a refined marketing strategy for each segment. Our future work will be to implement this strategy in the company.