Authorship Complexities in Using AI

Have you ever created an AI-generated image? Have you ever found yourself on ChatGPT prompting it to write a story or song? In a world that is rapidly adopting AI in the workforce, schools, and social media, all in the name of efficiency, two questions arise: What does it mean to have authorship and ownership of an original work, especially when using AI? And can AI itself be the author or owner of an original work? The answers should be clear; instead, authorship has become a muddled issue governed by unclear guidelines from the Copyright Office, and many content creators have spoken out against the copyright infringement that AI brings.

Currently, the Copyright Office defines originality and authorship as:

Works are original when they are independently created by a human author and have a minimal degree of creativity. Independent creation simply means that you create it yourself, without copying.

Based on this definition, original works must be created independently and only by humans, and the human creator of that work is its author or owner. However, some have proposed that AI could be an author, or a joint author alongside human creators; examples include Dr. Stephen Thaler in Thaler v. Perlmutter and the scholars Wei Liu and Weijie Huang. In Thaler's case, he attempted to register his AI system, the Creativity Machine, which generates images, as the author of a work with the Copyright Office. The Office denied the registration, and he took the issue to court, appealing the decision before several judges, but lost the case. He presented three theories in his argument.

In the first theory, operation of law, he argued that the AI was the author because it created the work, and that, under property doctrine (the general rules governing transfers of property ownership), Thaler became the author because he owns the machine. The problem with this theory is that it contradicts the Copyright Office's definition of authorship: Thaler is suggesting that authorship follows from owning a possession, which would be like saying that someone who buys a book becomes its author even though they never wrote it.

In the second theory, the work-for-hire doctrine, under which an employer owns work created by an employee or hired person, Thaler argued that the AI system was the employee who created the work, making Thaler the employer who owns it. This theory has two problems. First, Thaler cannot hire his AI system, because American law does not recognize a contractual relationship between a human programmer and their machine the way it does between a human employer and an employee. Second, the argument again treats the AI system as an author, which contradicts the Copyright Office's requirement that an author be human.

In Thaler's last theory, that he is the indirect, but-for originator of the work, he argued that because he owns the machine and designed its automated creative framework, he is the work's author. The problem with this theory is that it would be like saying the person who built a camera owns a photographer's work. Thaler is not alone in these views: Wei Liu and Weijie Huang also believe that AI can hold authorship and that joint authorship is possible.

In an article published on ScienceDirect, Wei Liu and Weijie Huang argue that AI can hold sole authorship or joint authorship with its users, and that both parties can own the copyright in AI-generated content under a "creative control" theory. The theory holds that as long as users can guide the AI with detailed prompts toward their desired results, the output can be copyrighted. However, this theory conflicts with how the Copyright Office defines authorship and originality: work created by a human, independently and without copying. AI is a machine, so for it to become an author, copyright law would have to change to extend authorship to machines. The machine would also need to think independently, understand what it is doing, and draw on feelings and life experience to create an original work, as a human does. At this time, AI has many limitations: it does not know what it is doing, and it has no feelings or life experience; it simply produces output based on the training data it has learned from.

Since AI cannot be an author, despite what Thaler and the ScienceDirect authors suggest, the next question is: if a content creator uses AI, does the content belong to the creator or to the AI developer? According to the Copyright Office, if the human creator alters most of the AI output, the human creator holds authorship and copyright in that work. The problem is that the Office has issued no guidelines for determining how much of the AI output users must alter to remain the work's author. And if most of the content comes from AI, things get sticky because of that same lack of guidance. The human creator prompting the AI cannot copyright the work, but the AI developer cannot hold copyright either, because the developer did not create the training data. The machine learns from public-domain data and from copyrighted material taken from many content creators without permission: blogs, songs, books, photos, news articles, peer-reviewed articles, academic materials, and more. It is at this point that the question of copyright infringement becomes vital.

According to the Copyright Office, AI developers who feed copyrighted material into training data models can fall under fair use because the model is not producing "the same work." But why is this allowed when YouTubers need permission to use popular songs in their videos to avoid copyright infringement? Why must creators of articles, blogs, and books get permission or purchase a license to use copyrighted photos? Because those works (books, songs, photos, art pieces, and so on) belong to content creators who hold complete authorship or ownership of the work others use. Yet the Copyright Office does not believe AI developers should have to license content creators' work for their training data. The Copyright Office's inaction not only creates gray areas; it also fails to protect authors, including companies that are already losing money and may lose more in the future. For example, The New York Times sued OpenAI and Microsoft for using its articles, including subscription articles, in their software without permission. Many AI users could effectively read these articles, the responses occasionally contained misinformation, and, to make matters worse, the AI never cited any sources to back up the content. The New York Times argued that OpenAI and Microsoft were not only taking money and readers from it but also misrepresenting the newspaper through unsourced and inaccurate AI responses.

However, there are solutions to these authorship and copyright issues. First, the Copyright Office needs to create clear guidelines on how much authors can use AI and still retain copyright in their work. Second, AI developers should be required to use public-domain content, purchase licenses for copyrighted content, and compensate authors or content creators whose work they use in training data. Adobe and Getty Images are already implementing similar ideas because they understand this is necessary to protect authors and their work: Adobe trains its generative AI only on its own Adobe Stock library, and Getty Images does the same with its own Getty Images and iStock content; this way, they stay within copyright law and protect authors.


Creator Statement: Using AI in this Article

In the name of transparency: I am a college student, and this article is an assignment for my writing class on the use of AI in content creation, such as writing. Since the class was about learning how to use AI, my professor required me to use an AI program that could help me research, gather sources, and summarize them. For my research, I used NotebookLM, into which I loaded several articles from the Copyright Office and from law schools for it to read and analyze. The software summarized the material, and I prompted it to show me connections and differences between the articles and their main points, and to draw questions from the materials to inspire this article. I will never use the program again, because it made the research process more tiresome and drawn out: to make sure the information it produced was factual, I had to read every article beforehand and then all of the NotebookLM summaries. Besides NotebookLM, the only other AI program I used was Grammarly.

Grammarly is a program I use for all my writing: academic, fictional, and work-related. For this article, I wrote every word in the system so that it could check my grammar and spelling. During the process, Grammarly suggested alternate words and sentence rearrangements, which I applied so that the writing would be clear, correct, and engaging, with a good tone and delivery. Grammarly also helps identify AI-generated text and plagiarism within writing. However, as I wrote the article, for the first time in my many years of using the software, it flagged two sentences I had written as AI text when they were not, forcing me to alter them. Here are the sentences:

Can AI be an author or the owner of original work? These questions are crucial because it’s an issue muddled with unclear guidelines set by the Copyright Office, and many content creators have spoken out against the problem of copyright infringement.

I have never been a fan of using AI in content creation. Still, the exception to the rule has always been Grammarly, because it has helped improve my grammar and taught me rules I had difficulty understanding when I was first starting to write. Now I am concerned about using it, since it produced a false positive for AI text, which is worrying for a college student whose professors run papers through software that detects AI text and plagiarism. My experience is one more reason we need caution when using AI, even tools as beneficial as Grammarly.
