ChatGPhish: Markdown Rendering in ChatGPT Web Summaries Enables Prompt Injection and Credential Harvesting

Permiso Security has identified a rendering vulnerability in OpenAI's ChatGPT web interface, specifically in the feature that summarises external web content. The response renderer implicitly trusts Markdown formatting and does not adequately sanitise it, so an attacker can embed malicious links and images that appear legitimate within the summarised content.

The technical mechanism exploits a common pattern in AI systems: rendering user-supplied or third-party data with minimal filtering. When ChatGPT summarises a webpage containing specially crafted content, the attacker can achieve two things. First, instructions hidden in the page can alter how the AI responds to subsequent user queries. Second, Markdown carried into the summary can render as hyperlinks that direct users to credential harvesting pages. Because the attacker controls the source webpage, the attack is loosely analogous to server-side template injection, except that the target is the AI's rendering pipeline rather than the web server itself.
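To make the link half of this concrete, the following is a minimal sketch of a heuristic check for the kind of Markdown described above: a link whose visible text names one host while its target points at another, and any embedded image. It is illustrative only and is not taken from Permiso's research or from ChatGPT's renderer. The function names, the regular expressions, and the example hosts are all assumptions, and the pattern matching is a rough approximation rather than a full Markdown parser.

```python
import re
from urllib.parse import urlsplit

# Inline Markdown link or image: [text](target) or ![alt](target).
# A heuristic pattern for illustration, not a full Markdown parser.
INLINE_RE = re.compile(r"(!?)\[([^\]]*)\]\(\s*<?([^\s)>]+)>?[^)]*\)")

# Rough pattern for a hostname that appears in the visible link text.
HOST_RE = re.compile(r"\b((?:[a-z0-9-]+\.)+[a-z]{2,})\b", re.IGNORECASE)


def host_of(url: str) -> str:
    """Return the lower-cased hostname of a URL, or '' if there is none."""
    try:
        return (urlsplit(url).hostname or "").lower()
    except ValueError:
        return ""


def find_suspicious_markdown(markdown: str) -> list[dict]:
    """Flag embedded images and links whose visible host differs from the target host."""
    findings = []
    for bang, text, target in INLINE_RE.findall(markdown):
        target_host = host_of(target)
        if bang:
            # Any embedded image is reported; the caller decides what to do with it.
            findings.append({"kind": "image", "target_host": target_host})
            continue
        shown = HOST_RE.search(text)
        if shown and shown.group(1).lower() != target_host:
            findings.append({
                "kind": "host_mismatch",
                "shown_host": shown.group(1).lower(),
                "target_host": target_host,
            })
    return findings
```

Given a page fragment such as `[sign in at accounts.example.com](https://login.attacker.example/reset)`, the sketch reports a host mismatch, because the text the user sees names a different host from the one the link opens. The comparison is deliberately strict (an exact hostname match), so it will also flag benign cases such as a `www.` prefix. It does nothing about hidden prompt instructions, which need separate handling.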

The attack surface is significant because users may be less sceptical of links inside a web summary than of URLs they encounter directly. They see the content as pre-vetted by ChatGPT, which creates a false sense of trust. An attacker could compromise a legitimate domain and inject malicious Markdown into it, or create a lookalike site that ChatGPT summarises on user request. The result combines two threat vectors: prompt injection (which affects the AI's behaviour) and phishing (which affects user behaviour).

Defenders should counsel users to verify URLs independently before entering credentials, to treat summarised content with the same caution as raw web content, and not to assume that AI-rendered summaries have been security-reviewed. OpenAI should implement strict Markdown sanitisation, disable formatting elements that summaries do not need (embedded images and hyperlinks are the ones abused here), and consider rendering summaries in a sandboxed context that limits link functionality.
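One way to picture the renderer-side mitigations is as a policy applied at the point where a Markdown parser hands each link or image to the renderer. The sketch below assumes such a hook exists. The `render_link` and `render_image` names, the `trusted_hosts` allowlist, and the `link-destination` CSS class are hypothetical and do not describe OpenAI's implementation. Under these assumptions, images are never fetched, only HTTPS links to allowlisted hosts stay clickable, and every other link is reduced to plain text with its real destination host shown.

```python
from html import escape
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"https"}


def render_link(text: str, href: str, trusted_hosts: frozenset[str]) -> str:
    """Return HTML for a link found in summarised (untrusted) content."""
    try:
        parts = urlsplit(href)
        host = (parts.hostname or "").lower()
    except ValueError:
        # Malformed URL: keep the visible text, drop the link.
        return escape(text)

    if parts.scheme.lower() not in ALLOWED_SCHEMES or not host:
        # javascript:, data:, relative or otherwise unexpected targets lose the link.
        return escape(text)

    if host in trusted_hosts:
        return (
            f'<a href="{escape(href, quote=True)}" rel="noopener noreferrer">'
            f"{escape(text)}</a>"
        )

    # Untrusted host: not clickable, and the real destination is made visible.
    return f'{escape(text)} <span class="link-destination">[{escape(host)}]</span>'


def render_image(alt: str, src: str) -> str:
    """Summaries never load remote images; only the escaped alt text is shown."""
    return escape(alt)
```

With `trusted_hosts=frozenset({"example.org"})`, a link to `https://login.attacker.example/reset` is rendered as non-clickable text followed by `[login.attacker.example]`, so the user sees where the link actually pointed. Applying the policy to parsed link nodes, rather than rewriting raw Markdown with regular expressions, is a deliberate choice: nested or escaped brackets make text-level rewriting easy to bypass.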

This vulnerability highlights a broader architectural weakness in AI systems: the assumption that rendering third-party content is safe because it has passed through an AI filter. As AI assistants become information intermediaries, the security model must shift from trusting the AI to independently validating the sources the AI presents.