Long Conversations Weaken AI Safety

AI systems lose their safety filters during longer chats, increasing the risk of harmful or inappropriate replies. A new report revealed that users can override safeguards in AI tools with just a few simple prompts.

Cisco Tests Major Chatbots

Cisco examined large language models from OpenAI, Mistral, Meta, Google, Alibaba, Deepseek, and Microsoft to determine how many prompts triggered unsafe information. Researchers ran 499 conversations using “multi-turn attacks,” where users asked several questions to slip past safety checks. Each chat contained between five and ten exchanges.

The team compared responses from single and multiple prompts to see how easily chatbots shared dangerous or unethical data, such as private company information or misinformation. They extracted harmful content in 64 percent of multi-question sessions but only 13 percent of single-prompt ones.

Success rates differed sharply, from 26 percent with Google’s Gemma to 93 percent with Mistral’s Large Instruct model. Cisco warned that these multi-turn methods could spread harmful content or grant hackers access to private data.

Open Models Shift Safety Responsibility

The study found that AI systems often forget their rules over longer conversations, letting attackers gradually adjust prompts and dodge safeguards. Mistral, Meta, Google, OpenAI, and Microsoft all use open-weight models, allowing the public to view their safety parameters. Cisco explained that these open systems usually contain lighter protections so users can modify them freely. This shifts safety responsibility to whoever customizes the model.

Cisco noted that Google, OpenAI, Meta, and Microsoft have worked to limit malicious fine-tuning. However, AI developers still face criticism for weak guardrails that allow criminal misuse. In one case, U.S. firm Anthropic admitted that criminals used its Claude model for large-scale data theft and extortion, demanding ransoms exceeding $500,000 (€433,000).

What's Hot

Trump Canada Tariffs Hit New Imports

US Cuba Relations Report Sparks Fresh Debate

US Iran Strait Hormuz Tensions Escalate

Long Conversations Weaken AI Safety

NHS AI Patient Triage Set To Transform Care

US Stock Market Today Ends Mixed On Tech Losses

OpenAI GPT 56 Release Faces White House Limits

2026 World Cup Golden Boot Race Heats Up Fast

Trump Iran War Risks Fuel Debate Ahead of US Midterm Elections

Rikers Island World Cup Watch Brings New Focus

Uganda Ebola Recovery Nears Major Health Milestone

UN warns global climate pledges fall far short of 1.5C target

China’s AI Chip Revolution: Closing in on Global Leadership

Lecornu Resigns After Short-Lived Premiership

US Cuba Relations Report Sparks Fresh Debate

Latest News

US Cuba Relations Report Sparks Fresh Debate

US Iran Strait Hormuz Tensions Escalate

Trump Faces Boos At World Cup Trophy Ceremony

What's Hot

Long Conversations Weaken AI Safety

Cisco Tests Major Chatbots

Open Models Shift Safety Responsibility

Related Posts