Researchers Get AI To Generate Harmful Content
May 21, 2024
Researchers used prompts from an academic paper, along with questions of their own, to elicit responses from language models, producing harmful content such as Holocaust denial articles, sexist emails, and text encouraging suicide. In response, developers of these large language models (LLMs) reiterated their commitment to safety. OpenAI, the company behind ChatGPT, stated that its technology is not intended to generate hateful or violent content. Anthropic, developer of the Claude chatbot, said that avoiding harmful responses was a priority for its Claude 2 model. These incidents highlight concerns about the potential misuse of AI language models.