Tip Sheets

Generative AI’s ‘insatiable hunger’ for data could chill user participation online

Media Contact

Becka Bowyer

The owner of Tumblr and WordPress.com, Automattic, is reportedly in talks with major AI companies to provide training data from users’ posts. Following the news, both sites now have a policy allowing users to opt out of data sharing with third parties.


Frank Pasquale

Professor of Law

Frank Pasquale, a professor of law at Cornell Tech, is an expert on the law of artificial intelligence, algorithms and machine learning. He predicts that deals like this will slow participation on online platforms.

Pasquale says:

"This deal is more evidence of the need for deep changes in how online data is governed. Few if any users of Tumblr could anticipate the use of their work on the platform for AI training. They are very unlikely to personally benefit from the training. Now they risk having very personal information and postings ripped out of context and potentially regurgitated by an AI system, indefinitely. I foresee deals like this chilling participation on online platforms, particularly public participation, as ordinary users realize just how exploitable their work is.

"It is a sad commentary on AI hype that generative AI's insatiable hunger for largely amateur, user-generated content continues to attract investment, while other forms of AI focused on advancing health care, logistics, agriculture and other fields more directly connected to human welfare often languish or get less attention."  

Cornell University has television, ISDN and dedicated Skype/Google+ Hangout studios available for media interviews.