
    Long Conversations Weaken AI Safety

    By Rachel Maddow, November 6, 2025

    AI systems' safety guardrails weaken during longer chats, raising the risk of harmful or inappropriate replies. A new report from Cisco found that users can override safeguards in AI tools with just a few simple prompts.

    Cisco Tests Major Chatbots

    Cisco examined large language models from OpenAI, Mistral, Meta, Google, Alibaba, DeepSeek, and Microsoft to determine how many prompts it took to make them reveal unsafe information. Researchers ran 499 conversations using “multi-turn attacks,” in which users ask a series of questions to slip past safety checks. Each chat contained between five and ten exchanges.

    The team compared results from single-prompt and multi-prompt sessions to see how readily chatbots shared dangerous or unethical information, such as private company data or misinformation. They extracted harmful content in 64 percent of multi-question sessions but only 13 percent of single-prompt ones.
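The comparison above comes down to a success rate per attack style. The following is a minimal sketch of that arithmetic, not Cisco's actual tooling: the `Session` record, its field names, and the sample data are all hypothetical, with the sample sized only to mirror the reported percentages.

```python
from dataclasses import dataclass


@dataclass
class Session:
    """One test conversation (hypothetical record layout)."""
    mode: str        # "single" or "multi"
    turns: int       # number of exchanges in the chat
    breached: bool   # True if harmful content was extracted


def success_rate(sessions, mode):
    """Share of sessions in the given mode that ended in a breach."""
    selected = [s for s in sessions if s.mode == mode]
    if not selected:
        return 0.0
    return sum(s.breached for s in selected) / len(selected)


# Made-up sample data, sized to mirror the reported 64 and 13 percent.
sessions = (
    [Session("multi", 5, True)] * 64 + [Session("multi", 5, False)] * 36
    + [Session("single", 1, True)] * 13 + [Session("single", 1, False)] * 87
)

multi = success_rate(sessions, "multi")
single = success_rate(sessions, "single")
print(f"multi-turn: {multi:.0%}, single-turn: {single:.0%}, "
      f"ratio: {multi / single:.1f}x")
```

On these figures, a multi-turn attack succeeds roughly five times as often as a single prompt.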

    Success rates differed sharply across models, from 26 percent with Google’s Gemma to 93 percent with Mistral’s Large Instruct model. Cisco warned that these multi-turn methods could spread harmful content or give hackers access to private data.

    Open Models Shift Safety Responsibility

    The study found that AI systems often fail to apply their safety rules over longer conversations, letting attackers gradually refine their prompts and dodge safeguards. Mistral, Meta, Google, OpenAI, and Microsoft all offer open-weight models, which give the public access to the safety parameters the models were trained on. Cisco explained that these open systems usually ship with lighter built-in protections so users can download and modify them freely. This shifts responsibility for safety to whoever customizes the model.
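To illustrate why gradual escalation can slip past a check that looks at one message at a time, here is a toy model. It is an assumption-laden sketch, not the method Cisco tested or any vendor's real guardrail: the per-message risk scores, both thresholds, and both function names are invented for illustration.

```python
PER_TURN_LIMIT = 0.7        # hypothetical threshold for a single message
CONVERSATION_BUDGET = 1.2   # hypothetical cap on risk accumulated in a chat


def per_turn_filter(scores):
    """Check each message in isolation; return the index of the first
    blocked turn, or None if every message passes."""
    for i, score in enumerate(scores):
        if score >= PER_TURN_LIMIT:
            return i
    return None


def conversation_filter(scores):
    """Track a running total across the chat; return the index of the
    turn at which the total reaches the budget, or None."""
    total = 0.0
    for i, score in enumerate(scores):
        total += score
        if total >= CONVERSATION_BUDGET:
            return i
    return None


# Invented risk scores: one blunt request versus a slow five-turn escalation.
blunt_request = [0.9]
gradual_escalation = [0.2, 0.3, 0.4, 0.5, 0.6]

print(per_turn_filter(blunt_request))           # blocked at turn 0
print(per_turn_filter(gradual_escalation))      # None: every turn stays under 0.7
print(conversation_filter(gradual_escalation))  # blocked at turn 3 (total 1.4)
```

In this toy setup the blunt request is stopped immediately, while the escalating chat passes every per-message check and is caught only when the conversation is scored as a whole.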

    Cisco noted that Google, OpenAI, Meta, and Microsoft have worked to limit malicious fine-tuning. However, AI developers still face criticism for weak guardrails that allow criminal misuse. In one case, U.S. firm Anthropic reported that criminals had used its Claude model for large-scale data theft and extortion, with ransom demands that sometimes exceeded $500,000 (€433,000).

    Rachel Maddow is a freelance journalist based in the USA, with over 20 years of experience covering Politics, World Affairs, Business, Health, Technology, Finance, Lifestyle, and Culture. She earned her degree in Political Science and Journalism from Stanford University. Throughout her career, she has contributed to outlets such as MSNBC, The New York Times, and The Washington Post. Known for her thorough reporting and compelling storytelling, Rachel delivers accurate and timely news that keeps readers informed on both national and global developments.
