Today, we provided comments to the UK's consultation on #copyright and #AI. This is part of our ongoing policy strategy to bring a public interest perspective to the policies and legislation that will regulate the AI ecosystem. In our submission, we continue to call for a limited copyright exception to facilitate text and data mining (#TDM). Studying and analyzing materials to derive facts, ideas, and other uncopyrightable elements should be permitted - even where that analysis requires making a copy as an intermediate step. We also believe that a copyright exception should be complemented by ways for rightsholders to express their preferences about use of their works in TDM, including AI training. You can read our submission to the UK below. 👇
While I agree with the underlying philosophy and understand that this is a consistent application of that philosophy, I think that the perspective outlined in these comments has the potential to fail creators in practical ways. TDM is technically complex and extremely expensive. By not standing up for (at the very least) a right of attribution, we effectively cede the commons to be harvested by large, for-profit companies building large, for-profit systems. We require people to provide citations and ostracize human plagiarists, but give ground when corporate developers tell us that citation in their for-profit products is too hard? It is a hard problem, but I have serious doubts that it's an investment priority for the folks building foundation models. Without strong advocates for creator attribution, it never will be. I understand that this document is more narrowly addressing copyright in the UK, and that attribution is a slippery topic, especially given the technology and the all-or-nothing nature of copyright enforcement. But a healthy commons requires active creators, with good reasons to participate. I would really love to see a point of view that better reflects that. Thanks for your continued work in this space.
Is there a non-LinkedIn URL for this so I can refer other people to it? (I went to the CC home page but did not find it.)
How do you reconcile the "broken link" of attribution with a sustainable commons? Further, are the -SA and -NC licenses not already clear enough in terms of signalled preference? Perhaps I missed discussion of these topics in your response, but it seems to me that users of CC0/BY families have very different needs and expectations than -SA, -NC, and -ND users, and this response seems to lump them all together.