Robot Newspaper Editors and Why Social Data Isn’t Enough


The idea for a robot-generated newspaper fueled by an algorithm sounds like something out of a science fiction novel. But it is very real, and it is happening today.

With no media company today immune to the noise and fierce competition of digital publishing, and with giants like The New York Times struggling to keep audiences engaged and loyal, newspapers and media sources are looking for new ways to retain and excite readers. Enter #Open001, a new project published by The Guardian: a limited print newspaper fully generated by robots, with curation based on social sharing activity, both by your friends and across social media as a whole.

I think the challenge that #Open001 is trying to answer is this: how do media and publishing brands keep readers’ attention and drive revenue and pageviews when so many other outlets are vying to pull those readers away from your site? If a brand bases “engaging content” on what people appear to be interacting with socially, then, in theory, it should keep readers onsite and coming back.

While the idea and approach are incredibly interesting, I have a few thoughts on how it could be improved and why human editors may not agree with the approach (job security aside).

Expanding the POV and serving individual interests

To keep readers returning for more content, there needs to be a relevant and engaging content mix. That mix needs to take into account both an individual’s interests and content discovery, or the “serendipity factor”, to broaden their point of view. This is table stakes for news and media editors; they want to push their readers to expand their content breadth. Serving only popular content runs counter to the media’s mission of raising awareness of topics that readers don’t already know about. That mission is what differentiates one media source from the next, and if media outlets move to this social-based model, it will most likely erode any form of brand loyalty in the media space. What users come back to is a tone that resonates and the ability to access content that is relevant to them as individuals.

That said, creating discovery is possible via algorithms (like the one at Sailthru). Rather than using a single data source to drive content discovery, sophisticated algorithms combine historical behavioral and interest data sets at the user level. This gives a view into not only what readers SAY they like, but also what their behavior proves they are interested in. The right algorithm can then recommend new content based on historical interest in a topic, even if the user has never stated an explicit interest. And the more users engage with a site powered by the right algorithm, the better it gets at recommending what’s truly relevant.
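
To make the idea concrete, here is a minimal sketch in Python of combining stated interests with observed reading behavior at the user level. It is an illustration only, not Sailthru’s or The Guardian’s actual algorithm: the function names, the weights, and the sample articles are all hypothetical.

```python
from collections import Counter


def build_interest_profile(stated_topics, read_titles, articles,
                           stated_weight=1.0, behavior_weight=2.0):
    """Blend what a reader says they like with what they actually read.

    The weights are illustrative assumptions, not tuned values.
    """
    profile = Counter()
    for topic in stated_topics:
        profile[topic] += stated_weight
    # Spread the behavioral weight evenly across the articles read.
    per_read = behavior_weight / len(read_titles) if read_titles else 0.0
    for title in read_titles:
        for topic in articles[title]:
            profile[topic] += per_read
    return profile


def recommend(profile, articles, read_titles, k=3):
    """Return up to k unread articles, best topic match first."""
    scored = []
    for title, topics in articles.items():
        if title in read_titles:
            continue
        score = sum(profile.get(topic, 0.0) for topic in topics)
        if score > 0:
            scored.append((score, title))
    # Highest score first; ties broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:k]]


# Hypothetical catalog: article title -> set of topics.
articles = {
    "Election recap": {"politics"},
    "Mars rover update": {"science"},
    "New vaccine trial": {"science", "health"},
    "Budget vote preview": {"politics"},
    "Telescope finds exoplanet": {"science"},
    "Celebrity gossip roundup": {"entertainment"},
}
stated = {"politics"}  # the only interest the reader has declared
history = ["Election recap", "Mars rover update", "New vaccine trial"]

profile = build_interest_profile(stated, history, articles)
picks = recommend(profile, articles, history)
print(picks)
```

In this toy example the reader never declared an interest in science, yet a science story is recommended because their reading history shows it. The entertainment piece is left out because nothing in the profile supports it.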

Virality or social activity feeds simply can’t support discovery based on relevancy, which is a great segue into my next point…

Social data is not created equal

There are two main challenges with using social data alone to try and construct a customer interest and engagement framework.

1) Reality vs. portrayal. As discussed in The New York Times’ Psychology of Sharing, 68% of people share content in order to define themselves to others. If people are projecting an image of how they want to be perceived, what they share says considerably less about their true interests. Further, the CEO of Chartbeat found, in the company’s massive troves of data, little to no correlation between social sharing and actually reading (and especially finishing) an article. Chartbeat’s lead data scientist wrote to The Verge with the following:

“There is obviously a correlation between number of tweets and total volume of traffic that goes to an article. But just not a relationship between stories that are most heavily consumed and stories that are most heavily tweeted.”

Shares cannot and do not show that an article is being consumed en masse. They indicate virality, not a propensity for engagement with the content. Story selection based on sharing therefore perpetuates the challenge that the media industry is currently facing: how do you get users to stay onsite after they click through on a shared link? What the industry needs are solutions for keeping users engaged, not for delivering more users who will exit.

2) Social data, while meaningful, is disconnected without context. Social data on its own is a very siloed form of understanding what people want. Everyone uses and interacts with social media for different reasons. While one reader might only use her account for work and thus only share articles on digital marketing, another person with a deep passion for digital marketing might only share articles on tech gadgets. There’s a time-bound element to social, too. With channels like Twitter shedding light on the latest trends, the newspaper would end up focused more on trends and gossip fodder than on news meant to create loyalty.

The Guardian’s algorithm rests on the premise that media companies are struggling to get users onsite, rather than to keep them onsite. So while we understand the pull of viral, shareable content, if you can’t keep readers onsite and coming back over and over, how useful is the algorithm?

Social media as an island lacks the contextual details that a robot editor would need in order to work out what people care about, both on a personalized level and at a larger production scale. Omnichannel data modeling (meaning learning about our readers whether they’re onsite, on the app, on social media, etc.) is what fills in the missing pieces and builds a true interest graph that can deliver relevant content.

How robots can understand individuals

By aggregating historical behavioral and interest data across all channels, here’s what a smarter, more contextual content selection algorithm should accomplish:

1) Pull readers onsite through relevant content

2) Keep readers onsite by offering more relevant content through smart discovery interfaces

3) Learn more about users’ interests and behaviors to make each experience richer

4) Incrementally increase engagement across all channels

This approach allows for a living, breathing, real-time data ecosystem of your audience, so that a robot-generated newspaper could actually transform how readers interact with your brand by keeping them engaged longer and creating a chance for real brand loyalty. It will also begin to change your perception of a “user” into an individual: someone with a unique story in how they interact with the world, and with your brand, whom you can personally engage on a 1:1 level, even with a robot editor.

Kristine Lowery is a Senior Marketing Associate at Sailthru.