The Society of Authors has written to Google ahead of the launch of their latest generative AI offering, Google Gemini, following reports of a continuing lack of transparency regarding the ‘datasets’ used to train it. You can download our letter below.
Dear Google,
We understand that the next evolution of Google’s AI offering, Google Gemini, was released earlier this month in three versions: Gemini Ultra, Gemini Pro and Gemini Nano. The release precedes the launch of a more powerful Gemini model, set to arrive sometime in 2024.
Capable of analysing more nuanced text information, as well as being ‘natively multimodal’ – that is, capable of understanding images, video, audio and code – and generating responses to more ‘complicated’ questions from users, it is presented as an advancement of existing AI models such as OpenAI’s GPT-3.5. Users will be able to access Gemini Pro through Bard.
As multinational commercial entities continue to compete for market dominance in the generative AI space, we are very concerned that there continues to be a total lack of transparency from the developers of these systems regarding the creative works they copy to create their systems. The Society of Authors was greatly concerned to hear that, at the launch of Gemini Pro, ‘Google repeatedly refused to answer questions from reporters about how it collected Gemini’s training data, where the training data came from and whether any of it was licensed from a third party’. Given the reportedly extensive infringement of copyright-protected works that took place to build existing Large Language Models (LLMs), this continued lack of transparency is alarming and leads us to assume that Google has used creators’ works unlawfully and unfairly, without consent or any kind of remuneration.
We call on Google to give complete transparency regarding the musical, artistic and literary works used to develop Gemini. Where copyright-protected works have been used, Google must comply with existing copyright laws and provide fair remuneration to the creators whose works were used. Authors make their living from licensing creative works. Not paying them undermines the law and risks harming all those whose livelihoods depend on it. This will ultimately harm the quality of future models of AI systems, which cannot exist without individual humans creating the beautiful, inspirational, informative and transformative works which give meaning to our lives.
We look forward to hearing from you with links to a full database of all materials used to develop your new programmes.
Yours,
Society of Authors
Notes to authors
You can learn more about where we stand on artificial intelligence here, and on our dedicated website pages. If you are concerned about the use of your work by generative AI systems, please see our guidance on how to protect yourself and your work from the impact of new technologies.