AI has moved from novelty to infrastructure, embedding itself in apps, editors, workflows, and now directly into browsers, turning the browser from a passive tool into an active thinking partner. That shift promises huge productivity gains, but it also creates a new class of risk: language and trust become attack surfaces rather than just features of convenience.
Perplexity’s Comet: Adoption and Expectations
Perplexity’s Comet browser recently rolled out broadly, putting agentic browsing into the hands of millions and leaning on a promise of trustworthy, referenced answers that users have come to expect from Perplexity’s search product. The rollout reached a wide audience through promotions and trials accompanying the public launch, rapidly expanding the number of people who rely on the agent’s in-browser reasoning.
The Experiment: How a Trusted Agent was Influenced
ActiveFence’s team set out with one objective: to see whether hidden or indirect instructions could influence the assistant’s trusted outputs. They began black-box testing with embedded prompts within web content and documents.. Initial attempts were blocked until they hit Comet’s rate limits.
After that point, their embedded prompts began to succeed. The researchers propose several possible explanations for this inconsistent behavior, including differences between user tiers, fallback to alternate model configurations, or caching effects that altered responses over time.
From Instruction to Exploitation: Phishing Through Trusted Language
Once prompts were executed reliably, ActiveFence proved that attackers could manipulate the assistant into generating misleading interface elements, closely resembling legitimate browser content. By directing users to a convincing external page, the researchers demonstrated how these linguistic exploits could be used to solicit sensitive information. This technique exploits the browser’s built-in trust: the AI is expected to summarize and render page content, but those same mechanisms can be redirected into social-engineering attacks.
Hiding the Payload: The Invisible Vector
Perhaps the most alarming finding came when researchers explored how hidden prompts could persist within ordinary documents. They discovered that AI agents can read and act on textual signals embedded in areas of web or document metadata not visible to human users, such as alternative descriptions or structural attributes that assist accessibility or indexing. These properties are invisible during normal viewing but still processed by the agent, allowing hidden instructions to influence behavior without user awareness. This makes prompt injection both stealthy and transferable across platforms like collaborative document suites.
Normal Behavior: Abnormal Consequences
ActiveFence emphasizes that the assistant was often doing what it was designed to do, such as rendering markdown, summarizing pages, and following instructions. However, that is the point: normal agent behavior can be weaponized. In some cases, Comet refused to summarize malicious content, which prevented information leakage but still denied service and wasted users’ tokens. These tradeoffs reveal design choices that privilege functionality over safety for certain account tiers.
Risk and Recommendations: Security Cannot Be a Premium Feature
The payloads did not work on Pro accounts where model selection and stricter guardrails appear enabled, but free users remained exposed.
ActiveFence’s core warning is that protections shouldn’t be reserved for paying customers. As agentic features proliferate across browsers and productivity tools, all tiers must receive baseline security controls that detect and neutralize malicious language-based attacks. Words have become a new form of exploit, and trusted assistants must be engineered to resist being co-opted.
An Engineering Challenge, Not Just a Patch
This episode with Comet shows that when instruction-following is fundamental to a product’s design, language must be treated like executable code and trust like a protected resource. ActiveFence’s findings demonstrate how quickly convenience can become exposure, and why security must be integral to model design, feature rollout, and user experience from the outset.