AI chatbots tend to lose their safety filters over the course of longer conversations, increasing the risk of harmful or inappropriate replies. A new report found that users can override safeguards in AI tools with just a few simple prompts.
Cisco Tests Major Chatbots
Cisco examined large language models from OpenAI, Mistral, Meta, Google, Alibaba, Deepseek, and Microsoft to determine how many prompts it took to coax unsafe information out of them. Researchers ran 499 conversations using “multi-turn attacks”, in which users ask a series of questions to slip past safety checks. Each chat contained between five and ten exchanges.
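In outline, a multi-turn attack simply carries the full conversation history into each new request and escalates the ask a little at a time. The sketch below shows that loop in Python; the send_fn and judge_fn callables and the message layout are hypothetical stand-ins, not Cisco's actual test harness.

from typing import Callable, Dict, List

Message = Dict[str, str]

def run_multi_turn_attack(
    prompts: List[str],
    send_fn: Callable[[List[Message]], str],  # wraps whichever chat API is under test
    judge_fn: Callable[[str], bool],          # returns True if a reply is judged unsafe
) -> bool:
    """Carry the full history across turns and flag the first unsafe reply."""
    history: List[Message] = []
    for prompt in prompts:                    # each turn nudges the request a little further
        history.append({"role": "user", "content": prompt})
        reply = send_fn(history)              # the model sees the whole conversation so far
        history.append({"role": "assistant", "content": reply})
        if judge_fn(reply):
            return True                       # the attack succeeded on this turn
    return False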
The team compared responses to single prompts with those from multi-turn exchanges to see how easily the chatbots shared harmful or unethical material, such as confidential company data or misinformation. They extracted harmful content in 64 percent of multi-turn sessions but only 13 percent of single-prompt ones.
Success rates differed sharply, from 26 percent with Google’s Gemma to 93 percent with Mistral’s Large Instruct model. Cisco warned that these multi-turn methods could spread harmful content or grant hackers access to private data.
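Those figures are straightforward success rates: the share of sessions in each bucket that produced at least one unsafe reply. A minimal aggregation sketch, assuming hypothetical session records with model, multi_turn, and unsafe fields (Cisco's actual data format is not public):

from collections import defaultdict
from typing import Dict, Iterable, Tuple

def success_rates(sessions: Iterable[dict]) -> Dict[Tuple[str, bool], float]:
    """Share of sessions per (model, multi_turn) bucket that yielded unsafe output."""
    totals: Dict[Tuple[str, bool], int] = defaultdict(int)
    hits: Dict[Tuple[str, bool], int] = defaultdict(int)
    for s in sessions:  # each record: {"model": str, "multi_turn": bool, "unsafe": bool}
        key = (s["model"], s["multi_turn"])
        totals[key] += 1
        hits[key] += int(s["unsafe"])
    return {key: hits[key] / totals[key] for key in totals}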
Open Models Shift Safety Responsibility
The study found that AI systems tend to forget their safety rules as conversations grow longer, letting attackers gradually adjust their prompts and dodge safeguards. Mistral, Meta, Google, OpenAI, and Microsoft all release open-weight models, which give the public access to the safety parameters the models were trained with. Cisco explained that these open models usually ship with lighter built-in protections so that users can modify them freely, which shifts the responsibility for safety to whoever customizes the model.
Cisco noted that Google, OpenAI, Meta, and Microsoft have worked to limit malicious fine-tuning. However, AI developers still face criticism for weak guardrails that allow criminal misuse. In one case, U.S. firm Anthropic admitted that criminals used its Claude model for large-scale data theft and extortion, demanding ransoms exceeding $500,000 (€433,000).

