What Is Waiting for Us at the End? Inherent Biases of Game Story Endings in Large Language Models / Taveekitworachai P.; Abdullah F.; Gursesli M.C.; Dewantoro M.F.; Chen S.; Lanata A.; Guazzini A.; Thawonmas R. - Electronic. - 14384 LNCS (2023), pp. 274-284. (Paper presented at the 16th International Conference on Interactive Digital Storytelling) [DOI: 10.1007/978-3-031-47658-7_26].

What Is Waiting for Us at the End? Inherent Biases of Game Story Endings in Large Language Models

Gursesli M. C.; Lanata A.; Guazzini A.
2023

Abstract

This study investigates biases present in large language models (LLMs) when used for narrative tasks, specifically game story generation and story ending classification. Our experiment uses popular LLMs, including GPT-3.5, GPT-4, and Llama 2, to generate game stories and classify their endings into three categories: positive, negative, and neutral. The results reveal a notable bias towards positive-ending stories in the LLMs under examination. Moreover, we observe that GPT-4 and Llama 2 tend to classify stories into uninstructed categories, underscoring the importance of carefully designing downstream systems that consume LLM-generated outputs. These findings lay the groundwork for developing systems that incorporate LLMs in game story generation and classification, and they underscore the need for vigilance in addressing biases and improving system performance. By acknowledging and rectifying these biases, we can build fairer and more accurate LLM applications for narrative-based tasks.
2023
Interactive Storytelling
16th International Conference on Interactive Digital Storytelling
Taveekitworachai P.; Abdullah F.; Gursesli M.C.; Dewantoro M.F.; Chen S.; Lanata A.; Guazzini A.; Thawonmas R.
Files in this item:

978-3-031-47658-7_26.pdf
Access: Closed access (request a copy)
License: Open Access
Size: 2.02 MB
Format: Adobe PDF

Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this item: https://hdl.handle.net/2158/1346080
Citations
  • Scopus: 1