Marketing Metrics: Leveraging predictive analytics to estimate customer lifetime value
By Matthew Lulay
Customer lifetime value (CLV) is not a new tool for marketers. It has been used for decades to understand a customer’s financial value. It comes in many forms, ranging from historical CLV, which is based only on what a customer has previously spent with a business, to predictive CLV, which leverages both observed historical behavior and predicted retention to estimate a discounted stream of future (lifetime) revenue.
Historical CLV has several drawbacks. The most important is that, because it is the sum of past revenue or profit for a particular customer or group, it describes only what has already occurred and thus offers little insight into the value of new subscribers. Predictive CLV, however, with its ability to incorporate expected retention, allows marketers to obtain several key insights, including what types of subscribers will be the most profitable over a specific time period, where acquisition dollars earn the highest return on investment and what customer attributes drive retention. These types of actionable insights can help marketers make better-informed, data-driven decisions that promote efficiency, savings and revenue growth. This article explores the basic tenets of predictive CLV, illustrated by examples from the newspaper industry.
Major Components of CLV
The calculation at the bottom of the page shows the three major components of predictive CLV: profitability, predicted retention and discounting.
Profitability: Profitability is the simplest component of the CLV metric, as it is a straightforward calculation of revenues minus costs. In the newspaper industry, revenue for a particular subscriber includes the subscription rate and the subscriber’s share of the market’s advertising revenue, which comes in the form of pre-print advertising inserted into each day’s paper, as well as digital advertising revenue via impressions on the market’s website. The subscription rate can vary based on a variety of factors, including the number of delivery days (e.g., Sunday only vs. seven-day), the period length (e.g., 13-week vs. 52-week), acquisition source (e.g., direct mail vs. telemarketing) and payment method (e.g., check vs. credit card). Pre-print advertising value is dependent upon the subscriber’s demographic profile, which is normally measured at the zip code or zip+4 level. Costs at the subscriber level for newspapers include print and ink, delivery and acquisition.
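To make the revenue-minus-cost arithmetic concrete, here is a minimal sketch in Python. The class name, field names and dollar amounts are all hypothetical and chosen only for illustration; they are not drawn from any actual newspaper market.

```python
from dataclasses import dataclass


@dataclass
class SubscriberPeriod:
    """Revenue and cost items for one subscriber over one period.

    Every field name and value here is a hypothetical illustration.
    """
    subscription_revenue: float
    preprint_ad_revenue: float
    digital_ad_revenue: float
    print_and_ink_cost: float
    delivery_cost: float
    acquisition_cost: float

    def revenue(self) -> float:
        # Subscription rate plus the subscriber's share of advertising revenue.
        return (self.subscription_revenue
                + self.preprint_ad_revenue
                + self.digital_ad_revenue)

    def cost(self) -> float:
        # Subscriber-level costs named in the text: print and ink, delivery, acquisition.
        return self.print_and_ink_cost + self.delivery_cost + self.acquisition_cost

    def profit(self) -> float:
        return self.revenue() - self.cost()


# Hypothetical annual figures for a single subscriber.
example = SubscriberPeriod(
    subscription_revenue=260.0,
    preprint_ad_revenue=90.0,
    digital_ad_revenue=15.0,
    print_and_ink_cost=110.0,
    delivery_cost=95.0,
    acquisition_cost=40.0,
)
print(example.revenue(), example.cost(), example.profit())
```

In practice each of these inputs would vary by delivery frequency, term length, acquisition source, payment method and demographic profile, as described above.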
Predicted retention: Once revenues and costs are calculated and we arrive at a profit level, the next component of predictive CLV is estimating retention probability, which provides us with the risk-adjusted profit. By “risk-adjusted,” we simply mean profit that has been weighted by the probability that a particular customer will be retained over a certain time period, which accounts for the risk of subscriber churn. In the newspaper industry, while all subscribers come up for renewal at different points throughout the year based on the term length of the subscription, not all subscribers exhibit the same propensity to renew. In fact, subscribers with different characteristics can retain at drastically different rates. While an average newspaper may experience overall annual retention of 75 percent, pockets of subscribers within the market may be retaining at 90+ percent, while others retain at less than 40 percent. Mather Economics uses an econometric method called “survival analysis” to estimate the retention probabilities among different subscriber groups.
Survival analysis, originally developed for application in the biosciences, is a method of estimating the probability of an event occurring within a particular time interval. Examples include the probability of survival for a heart transplant patient, the probability of transmission failure on new cars or the probability of divorce after marriage, each estimated as a function of time. Applied to the newspaper industry, survival analysis lets us calculate the probability of subscriber retention at different intervals of time. More specifically, we leverage historical transaction information to fit a parametric survival model with a log-logistic distribution.
We use a parametric model because we understand the underlying distribution of our dependent variable, which is retention probability. The distribution of that variable is log-logistic in nature, where the rate of decline in the probability of retention increases in the early stages and decreases later. This creates a curve that is downward sloping with a slope that decreases in severity over time. An example of this is shown in Figure 1, where we estimate survival probability for subscribers in different income groups, revealing that the most affluent subscribers in this particular market had a retention probability approximately three times higher than those in the lowest income level after 365 days.
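For readers who want to see the shape of such a curve, the following is a minimal sketch of the standard log-logistic survival function, S(t) = 1 / (1 + (t / alpha)^beta). It only evaluates the curve and does not fit a model to data. The function name, the group labels and the alpha and beta values are assumptions made for illustration, not estimates from any market.

```python
def loglogistic_survival(t_days: float, alpha: float, beta: float) -> float:
    """Log-logistic survival function S(t) = 1 / (1 + (t / alpha) ** beta).

    alpha (scale) is the time at which half of subscribers remain;
    beta (shape) controls how quickly retention falls away.
    """
    if t_days < 0 or alpha <= 0 or beta <= 0:
        raise ValueError("t_days must be >= 0; alpha and beta must be > 0")
    return 1.0 / (1.0 + (t_days / alpha) ** beta)


# Hypothetical parameters for two illustrative subscriber groups.
GROUPS = {
    "group A": {"alpha": 900.0, "beta": 1.3},
    "group B": {"alpha": 250.0, "beta": 1.3},
}

for name, params in GROUPS.items():
    # Retention probability at acquisition and after 90, 180 and 365 days.
    curve = [loglogistic_survival(day, **params) for day in (0, 90, 180, 365)]
    print(name, [round(p, 3) for p in curve])
```

With a shape parameter above 1, the curve falls more steeply at first and then flattens. In a real application the parameters would be estimated from historical transaction data, typically with a statistics package rather than set by hand.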
Figure 1: Estimated survival probability for subscribers in different income groups.
Figure 2: Day-to-day predicted retention of a new subscriber over a two-year period.
Figure 1 shows only the expected retention probabilities for subscribers grouped by one variable. But when we combine all of the information we have on a particular subscriber, we can estimate a unique survival curve for every single subscriber in a database. In Figure 2, predicted retention is plotted for a new subscriber by day from the point of acquisition to a point two years out from acquisition. The area under the curve gives us the second component of predictive CLV – estimated retention (expected lifetime).
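The area-under-the-curve idea can be sketched numerically. The snippet below applies the trapezoid rule to a list of daily retention probabilities and returns the expected number of retained days. The function names and the curve parameters are hypothetical; any daily survival curve could be passed in.

```python
def expected_retained_days(survival_by_day):
    """Trapezoid-rule area under a daily survival curve.

    survival_by_day[i] is the probability the subscriber is still
    active i days after acquisition.
    """
    if len(survival_by_day) < 2:
        raise ValueError("need at least two points to integrate")
    area = 0.0
    for prev, curr in zip(survival_by_day, survival_by_day[1:]):
        area += (prev + curr) / 2.0  # one-day-wide trapezoid
    return area


def hypothetical_survival(day, alpha=400.0, beta=1.2):
    # Illustrative log-logistic curve; alpha and beta are assumed values.
    return 1.0 / (1.0 + (day / alpha) ** beta)


# Day 0 through day 730, i.e. a two-year window.
curve = [hypothetical_survival(day) for day in range(731)]
print(round(expected_retained_days(curve), 1), "expected retained days out of 730")
```

A subscriber who is certain to stay for the whole window would score the full 730 days; lower survival probabilities pull the expected lifetime down accordingly.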
Discounting: Predictive CLV attempts to capture the present value of a customer’s stream of lifetime revenue. Since we’re trying to capture the present value of future revenue, we need to incorporate a discount rate to account for the positive time value, or positive time preference, of money, which essentially states that money today is worth more than the same amount at some point in the future. This concept is why interest rates tend to be positive and why the need for a discount rate exists for valuing future dollars in present value terms. The selection of a discount rate is an important decision, as values are highly sensitive to this rate, especially in estimations in which predictions are made over longer periods of time. A variety of factors are taken into account when choosing a discount rate, including the length of time of the estimation, costs of capital, rate of return on private investment, interest rates on government and corporate bonds and output growth. With this in mind, government agencies in the United States tend to leverage discount rates of 2 percent to 3 percent on intra-generational projects. At Mather Economics, we normally estimate CLV as the risk-adjusted present value of five years of expected earnings for an individual subscriber and use a discount rate of 2 percent.
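Putting the three components together, one simple way to express predictive CLV is as the sum, over each year, of profit multiplied by retention probability and divided by (1 + discount rate) raised to the year number. The sketch below follows that form. The five-year horizon and 2 percent rate come from the text; the annual time step, the end-of-year discounting convention, the function name and all dollar and retention figures are assumptions for illustration only.

```python
def predictive_clv(profit_per_year, retention_by_year, discount_rate=0.02):
    """Risk-adjusted present value of a stream of annual profits.

    profit_per_year[i]   -- expected profit in year i + 1 if the subscriber is retained
    retention_by_year[i] -- probability the subscriber is still active in year i + 1
    discount_rate        -- annual discount rate (0.02 means 2 percent)
    """
    if len(profit_per_year) != len(retention_by_year):
        raise ValueError("profit and retention lists must be the same length")
    total = 0.0
    for year, (profit, retention) in enumerate(
        zip(profit_per_year, retention_by_year), start=1
    ):
        risk_adjusted = profit * retention                     # weight by retention
        total += risk_adjusted / (1.0 + discount_rate) ** year  # discount to today
    return total


# Hypothetical five-year inputs for one subscriber.
profits = [120.0, 125.0, 130.0, 135.0, 140.0]
retention = [0.75, 0.60, 0.50, 0.43, 0.38]

for rate in (0.02, 0.03):
    print(f"discount rate {rate:.0%}: CLV = {predictive_clv(profits, retention, rate):.2f}")
```

Running the loop with both rates gives a feel for how much the choice of discount rate moves the result, an effect that grows as the horizon lengthens.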
Predictive CLV has a variety of useful applications, varying from acquisition optimization to upgrade campaign targeting to customer service prioritization. Consider the simplified example in Figure 3.
While the telemarketing source delivers a cost per order one-tenth that of direct mail, the customers acquired through telemarketing are much less valuable to the firm and thus, in the aggregate, less profitable, even after accounting for the much lower cost to acquire them. In this example, CLV helps marketers prioritize their acquisition efforts to acquire subscribers with the most lifetime value to the firm.
Additionally, CLV can help inform upgrade campaigns. Since the value of customers is known with CLV, steps can be taken to deepen engagement and raise the value of existing subscribers. In the newspaper industry, for example, publishers use CLV to target lower-value Sunday-only subscribers with paid upgrade offers to weekend subscriptions. This promotes a higher level of engagement with the product from the subscriber, which may lead to better retention in the long term and also provides value to advertisers, as the number of circulation units increases, improving advertising frequency.
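The comparison can be sketched as a simple ranking of acquisition sources by lifetime value net of acquisition cost. The numbers below are hypothetical and are not the values from Figure 3; they only preserve the one-tenth cost-per-order ratio mentioned above. The sketch also assumes the CLV figure is stated before acquisition cost is subtracted.

```python
from typing import NamedTuple


class AcquisitionSource(NamedTuple):
    name: str
    cost_per_order: float  # hypothetical acquisition cost per new subscriber
    avg_clv: float         # hypothetical average CLV, before acquisition cost


def net_value_per_order(source: AcquisitionSource) -> float:
    """Lifetime value left over after paying to acquire the subscriber."""
    return source.avg_clv - source.cost_per_order


sources = [
    AcquisitionSource("telemarketing", cost_per_order=20.0, avg_clv=150.0),
    AcquisitionSource("direct mail", cost_per_order=200.0, avg_clv=600.0),
]

# Rank sources from most to least valuable per acquired subscriber.
ranked = sorted(sources, key=net_value_per_order, reverse=True)
for source in ranked:
    print(f"{source.name}: net value per order = {net_value_per_order(source):.2f}")
```

Under these assumed figures the cheaper source ranks second, which is the pattern the example describes: a low cost per order does not by itself identify the better channel.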
Figure 3: Simplified example of CLV in telemarketing vs. direct mail.
Another application of CLV is prioritizing the customer service experience. Once a current subscriber base is scored, meaning their lifetime value has been estimated, customer service teams can leverage that data to improve the customer experience and make it more efficient. One example of this in the newspaper industry is having dedicated customer service representatives for high-value subscribers, where calls from these customers are prioritized to minimize wait times and provide the highest level of service possible. CLV data can also be used in the customer service department to create customized retention (stop save) scripting based on estimated subscriber value, where representatives are given the ability to offer more aggressive save offers to keep high-value subscribers on the books.
The three application examples above are just a small sample of the multitude of positive impacts CLV can have on businesses in a variety of industries.
We’ve seen that predictive CLV has several advantages over historical CLV, the most important being that it is a dynamic, forward-looking tool that allows users to calculate the risk-adjusted value of a customer’s lifetime. Leveraging some of the econometric tools at our disposal, including survival analysis, allows us to add the predictive portion to CLV by estimating retention probabilities for individual customers. Once calculated, CLV provides the analytical foundation for many applications, including those in acquisition, retention and customer service. As such, predictive CLV can serve as a valuable asset in a firm’s analytical toolset to help inform strategic, data-driven and profit-maximizing decisions.
As a director for Mather Economics, Matthew Lulay helps media companies optimize pricing through data-driven analysis. In addition, he has authored several white papers on environmental economics, including a model forecasting the employment effects of increased funding along the Gulf Coast region following the Deepwater Horizon oil spill. He has extensive experience collecting, formatting and analyzing data from various state, national and international sources including the Bureau of Labor Statistics, the Federal Reserve and the World Bank. Lulay has a degree in economics from the University of Minnesota and a master’s degree in economics from Florida Atlantic University, where his research focused on macroeconomics and U.S. tax policy.