

Analytics Magazine

New Revenue Streams: Predictive analytics in the publishing industry

September/October 2015

The transformation of traditional print media and the growing intelligence behind monetization strategies

By Arvid Tchivzhel

Declining circulation, bankruptcies, the short attention spans of millennials, technology disruption, new competition from content aggregators … the list goes on and on. These have been common threads in most articles about print and digital media during the seven years I have worked with newspapers and media companies. The stereotype of the slowly dying newspaper is easy to accept at face value. And it is true that since the financial crisis, dozens of newspapers have shuttered their printing presses and hundreds have drastically cut staff to remain solvent. However, two closely related narratives are rarely discussed and go against the stereotype:

  1. Traditionally profitable and bulky newspapers finally had to come to terms with a competitive and rapidly changing technological environment, and thus had to face the exact same challenges as airlines, retailers, hotels, car rental companies and cable television.
  2. Newspapers have begun and are continuing to expand their use of analytics and data to improve profitability, much like companies in the competitive industries mentioned above have used it to improve yield and customer retention.

This article focuses on the second narrative, which is itself an outcome of the first.

While the term “creative destruction” may not elicit sympathy from those who have suffered layoffs and budget cuts, the positive externality is that it has led to leaner, more adaptive organizations. The newspaper of the new millennium is embracing big data technologies and learning how to become a data-driven decision-maker. No longer are advertising, circulation and pricing decisions guided by the phrase “this is what we have always done”; instead, the business wing of the newspaper (the other being journalism) is looking to data scientists and predictive modeling to drive important decisions. Among the things being driven by data:

  1. products/prices to offer new and existing customers;
  2. setting advertising rate card and inventory premiums;
  3. targeted marketing messages (who gets a sports-themed creative vs. a business newsletter, etc.);
  4. customized customer service experience (which complaining customer should get a 50 percent discount vs. who should be kept at full price);
  5. where to set the paywall (how much content should be paid vs. free); and
  6. what content to share on social media (and when to share on Facebook vs. other platforms).

Countless decisions can be made by looking at data. However, data is only as valuable as the person looking at it. Derrick Harris summarized it nicely: “Data is the new oil – it’s very valuable to the companies that have it, but only after it has been mined and processed.”

The analogy makes some sense, but it ignores the fact that many people and companies don’t have the means to collect the data they need or the ability to process it once they have it. A lot of us just need gasoline.

The newspaper of the new millennium is embracing big data technologies and learning how to become a data-driven decision-maker.

Mixing Gasoline and Newspapers

The most amazing impact of the Internet is that publishing content online makes it immediately accessible to every part of the globe. This allows a local newspaper like the Peninsula Clarion in Kenai, Alaska, to be readily available to a curious individual in Durban, South Africa, who can read about salmon dip-netting on the North Beach. Seemingly, the potential audience of any digital content could be infinite if the content and marketing are right.

Big data tools reveal detailed information about the engagement of users, both paid and anonymous, by capturing and crunching billions of rows of data. Before these tools, publishers had only surveys, focus groups and angry customers to tell them what resonated and what did not. By using “revealed preference” instead of “stated preference,” publishers can learn exactly which users and what content drive engagement. While this is a simple data-clustering exercise to a data scientist, user segmentation has informed how content producers sell and market their products, helped set advertising rates on premium inventory and created more effective targeted marketing campaigns. At last, publishers can understand in real time what content attracts loyal users and what content attracts flyby users.
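The categorization described above can be sketched with a toy example: assign each user a “content anchor” when one section accounts for most of their pageviews. The data, section names and threshold below are hypothetical, not from any actual engagement study:

```python
from collections import Counter

# Hypothetical pageview logs: (user_id, content_section)
pageviews = [
    ("u1", "sports"), ("u1", "sports"), ("u1", "news"),
    ("u2", "news"), ("u2", "politics"), ("u2", "business"),
    ("u3", "sports"), ("u3", "sports"), ("u3", "sports"),
]

def segment_users(views, threshold=0.6):
    """Label each user by a dominant 'content anchor' if one section
    accounts for at least `threshold` of their pageviews."""
    by_user = {}
    for user, section in views:
        by_user.setdefault(user, Counter())[section] += 1
    segments = {}
    for user, counts in by_user.items():
        section, hits = counts.most_common(1)[0]
        if hits / sum(counts.values()) >= threshold:
            segments[user] = section
        else:
            segments[user] = "general"
    return segments

print(segment_users(pageviews))
# → {'u1': 'sports', 'u2': 'general', 'u3': 'sports'}
```

A production version would run over billions of log rows and use proper clustering, but the idea is the same: revealed behavior, not stated preference, decides the segment.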

I will refrain from naming specific newspapers for the sake of privacy, but below are some quick examples of how newspapers have used detailed engagement data:

  1. The newspaper: a large metro daily with a rigid paywall at five free articles per month. The analytics: Detailed log-level website traffic was analyzed and segmented using simplistic clustering and categorization. The output: Major “content anchors” were identified that attracted loyal and returning users. A significant subset of users anchored themselves exclusively in sports content with minimal overlap in other content. This was compared to the other significant user group anchored in news content who overlapped with politics, community news and some business content. The outcome: It was determined that the users anchored in news content would be more amenable to the “all access” offer, which already existed. However, the users anchored in sports content would respond to a targeted offer so a digital-only sports product at a lesser price point was proposed to monetize this highly segmented audience.
  2. The newspaper: a digital-only national sports publication with premium content. The analytics: Detailed log-level website traffic was analyzed and matched at the user level (via account ID and login ID) to a customer data warehouse. Detailed customer records, transactions and payments were reconciled with online behavior to create a complete profile of individual customers using both online and offline data. Predictive analytics, using survival analysis, were applied to measure expected retention based on historical data. The output: A Customer Lifetime Value (CLV) score was assigned to each customer using the mix of online and offline data as well as predictive modeling. Customers were scored 1-100 in terms of expected lifetime value. The outcome: It was proved statistically that recent new starts had a significantly lower CLV compared to new starts in earlier years, exposing the need to redesign and reinforce the value of the product to new subscribers. New initiatives were taken to redesign the website, adjust promotional pricing and offers, create a mobile-friendly experience and improve new start retention and value.
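The CLV scoring in the second example can be sketched under heavy simplification by assuming each cohort has a constant monthly retention rate; a real survival analysis would estimate a full retention curve from historical data. All margins, rates and horizons below are hypothetical:

```python
def clv(monthly_margin, retention_rate, monthly_discount=0.01, horizon=60):
    """Expected discounted lifetime value over a fixed horizon, assuming
    a constant monthly retention rate (a simplification of the survival
    curves a full model would estimate)."""
    value = 0.0
    survival = 1.0  # probability the subscriber is still active
    for month in range(horizon):
        value += survival * monthly_margin / (1 + monthly_discount) ** month
        survival *= retention_rate
    return value

# A cohort retaining 95%/month is worth far more than one retaining 85%/month,
# which is the kind of gap that exposed the weaker recent cohorts
older_cohort = clv(monthly_margin=10.0, retention_rate=0.95)
recent_cohort = clv(monthly_margin=10.0, retention_rate=0.85)
```

Scoring each customer 1-100 is then just ranking these values into percentiles; the business insight comes from comparing cohorts, as in the example above.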

The outcomes above were achieved by doing fairly straightforward analysis, data matching and segmentation. The next example uses a more advanced algorithm to find an optimal outcome.

Subscribers are the New Black

Advertising has been and will continue to be an important revenue stream for media companies, but many content producers have embraced paywalls. The Wall Street Journal, New York Times and Financial Times, for example, understand that valuable content should not be given away, whether it is printed on a piece of paper or published via a content management system to an iPad app. Valuable content and compelling journalism can indeed attract a paying audience large enough to justify foregoing potential ad impressions and revenue. Given generally low CPMs (cost per thousand impressions), sell-through rates and click-through rates, it is not surprising that publishers are monetizing through a digital subscription or a print/digital “all access” subscription. Recent trends even suggest a larger shift in the industry toward paid journalism rather than ad-supported journalism. That said, there is still a risk of losing valuable advertising revenue if the paywall is too aggressive. The challenge is determining what should be paid vs. free.

The challenge posed above is common for publishers and represents the constant pendulum swing between advertising dollars and paid digital subscribers. On the one hand, leaving a website completely free guarantees the maximum amount of ad revenue and impressions, but it creates a conundrum: publishers ask subscribers to pay a recurring monthly price for a print product while leaving the same content online for free, and they forego potential new digital subscriptions. On the other hand, a hard paywall will maximize paid digital subscriptions and reinforce the value of the product, but it puts all potential ad revenue at risk. It is not an “either-or” question; the optimal point for a paywall is where the two revenue streams together are maximized.

Using detailed traffic patterns and advertising data, this problem can be solved. Measuring user engagement by geography (local vs. non-local), device, day of week, time of day and content type makes it possible to model expected conversion probability per user segment. Measuring key advertising metrics such as click-through, sell-through and CPMs by the exact same segments shows the expected ad value in an “apples-to-apples” context. An optimization algorithm can then be deployed to dynamically adjust and reset the paywall, continuously balancing advertising revenue against subscriber revenue.
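A toy sketch of that balancing act, with entirely hypothetical numbers: for each candidate meter setting (free articles per month), sum the expected ad revenue from free pageviews and the expected subscription revenue from conversions, then pick the meter that maximizes the total. The hump-shaped conversion curve is an assumption, not an estimated model:

```python
import math

def expected_revenue(meter, monthly_views, rev_per_view, conv_prob_at, sub_value):
    """Toy model: free pageviews earn ad revenue; a tighter meter trades
    impressions for a chance of conversion to a paid subscription."""
    free_views = min(meter, monthly_views)
    return free_views * rev_per_view + conv_prob_at(meter) * sub_value

def conv_prob(meter):
    # Hypothetical hump-shaped curve: users need some free sampling
    # before converting, but too loose a meter removes the reason to pay
    return 0.002 * meter * math.exp(-0.15 * meter)

# Search candidate meter limits for the revenue-maximizing setting
best = max(range(16), key=lambda m: expected_revenue(
    m, monthly_views=12, rev_per_view=0.01,
    conv_prob_at=conv_prob, sub_value=120.0))
# With these toy numbers the optimum lands at a meter of 8 free articles
```

Running this per segment (entertainment vs. sports, local vs. non-local, mobile vs. desktop) is what lets the paywall sit at different heights for different content, as described below.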

The outcome from this type of analysis might be to set the paywall for entertainment content (let’s assume high CPMs, high sell-through and low subscriber engagement) to be light, perhaps even letting this content be completely free. Conversely, for sports content (let’s assume low CPMs, low sell-through and high subscriber engagement), the paywall could be set much more aggressively. In this way, advertising revenue is realized for content with high ad value, but subscriber revenue can also be realized for content that is higher value for subscribers. Local vs. non-local traffic, mobile web vs. desktop vs. native apps, etc., also could be split in similar ways to take advantage of very different ROIs for the two major revenue streams.

Data and analytics can bring a newfound longevity to a traditional print product and allow efficient use of new digital content. Publishers are learning to take advantage of analytical tools and analysts who can help find insights and create real action plans to drive actual dollars to their bottom line. They are learning to become efficient and competitive, as have other industries that found themselves in the same position.

Arvid Tchivzhel, a director with Mather Economics, oversees the delivery and operations for all Mather Economics consulting engagements, along with internal processes, analytics, staffing, and new product and services development. He has led numerous consulting engagements across various industries, working with econometric modeling, forecasting, economic analysis, statistics, financial analysis and other rigorous quantitative methods.
