A recent study has shown that the general public has difficulty distinguishing between poems written by famous poets, such as Shakespeare, and those generated by the AI model ChatGPT-3.5. When participants were asked to evaluate both types of poetry, many showed a surprising preference for AI-generated works.
The study also found that participants often misread the complexity of human-written poems as the kind of incoherence they expected from AI, underestimating how closely AI can mimic human creativity.
The study, led by Dr. Brian Porter and his team from the University of Pittsburgh, was published on Friday in Scientific Reports. The research involved presenting participants with poems from ten renowned poets alongside poems generated by ChatGPT-3.5, which was tasked with imitating their writing styles. Participants were asked to distinguish between the two and evaluate them.
In the first experiment, 1,634 participants were given a series of poems and tasked with identifying whether each poem was written by a famous poet or generated by AI. The second experiment involved 696 participants who were asked to rate 14 characteristics of each poem, including beauty, rhythm, emotion, and originality.
The study featured 50 poems from ten famous poets, including 14th-century poet Geoffrey Chaucer, William Shakespeare, Walt Whitman, and T.S. Eliot. These were presented alongside 50 AI-generated poems created by ChatGPT-3.5, which were modeled after the styles of these poets.
In the first experiment, participants were randomly shown five poems written by famous poets and five generated by AI and asked to distinguish between them. The results were striking: participants identified the correct author only 46.6% of the time, slightly below the 50% expected from random guessing.
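A below-chance accuracy like 46.6% can be checked for statistical reliability with a simple proportion test. The sketch below is illustrative only: the total number of judgments is an assumption (1,634 participants times ten poems each, not a figure reported here), and the published study reports its own statistics.

```python
import math

# Assumed total number of author judgments: 1,634 participants x 10 poems each.
# This is a hypothetical count for illustration, not taken from the study.
n_trials = 1634 * 10
p_hat = 0.466      # observed identification accuracy
p_chance = 0.5     # accuracy expected from random guessing

# Normal approximation to the binomial: z-score of the observed proportion
# relative to chance. A large negative z indicates reliably below-chance accuracy.
se = math.sqrt(p_chance * (1 - p_chance) / n_trials)
z = (p_hat - p_chance) / se
print(f"z = {z:.2f}")
```

Under these assumed numbers the deviation from chance is far larger than sampling noise alone would produce, which is consistent with the study's conclusion that participants could not tell the two kinds of poems apart.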
In the second experiment, participants were divided into three groups. Each group was provided with additional information about the poems—whether they were written by a person, written by AI, or had unavailable source information. They were then asked to evaluate the poems based on 14 different characteristics, including quality, beauty, emotion, rhythm, and originality.
The results showed that participants who were told a poem was written by AI rated it lower on 13 of the 14 characteristics than participants told the same poem was written by a human. In other words, regardless of a poem's actual origin, the perceived authorship shaped how it was rated.
A notable twist emerged, however, when participants were given no information about a poem's source: in that condition, the AI-generated poems received higher ratings than those written by human poets.
The research team noted that while poetry has traditionally been seen as a domain where generative AI struggles to create works indistinguishable from human creations, their study reveals that AI’s capabilities in poetry have already surpassed public expectations.