Malicious prompts in Trivy VS Code extension hijack local AI agents

Two OpenVSX releases injected hidden prompts to coerce local AI CLIs into reconnaissance and exfiltration; users must update, audit tokens and check GitHub activity.

Dr. Elena Rodriguez · 3/4/2026

Published 10:13 AM

Researchers found that two recent releases of the Aqua Trivy VS Code extension on the OpenVSX registry contained hidden natural‑language prompts designed to weaponize developers' local AI coding agents. The compromised artifacts, published as versions 1.8.12 and 1.8.13 on OpenVSX, were identified shortly after their February 27–28, 2026 publication and analyzed in a technical writeup published March 3, 2026.

Socket.dev researchers, supported by technical breakdowns from Gbhackers and reporting by Cybersecuritynews and StepSecurity, traced the tampering to the OpenVSX package aquasecurityofficial.trivy-vulnerability-scanner. Through version 1.8.11 the extension code matched the open GitHub upstream (aquasecurity/trivy-vscode-extension), but 1.8.12 and 1.8.13 included additional code absent from the public repository and from any tagged release.
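
As a rough illustration of how a defender might check for the two affected versions, the sketch below filters a list of extension directory names. The extension ID and version numbers come from the reporting; the directory naming scheme "<publisher>.<name>-<version>" (optionally followed by a platform suffix) and the helper name find_affected are assumptions, not details confirmed by the researchers.

```python
import re

# Extension ID and affected versions are taken from the article.
EXTENSION_ID = "aquasecurityofficial.trivy-vulnerability-scanner"
AFFECTED_VERSIONS = {"1.8.12", "1.8.13"}

# Assumption: installed extensions live in directories named
# "<publisher>.<name>-<version>" with an optional "-<platform>" suffix.
_DIR_PATTERN = re.compile(
    r"^" + re.escape(EXTENSION_ID) + r"-(\d+\.\d+\.\d+)(?:-.+)?$",
    re.IGNORECASE,
)


def find_affected(dir_names):
    """Return the directory names that correspond to an affected version."""
    hits = []
    for name in dir_names:
        match = _DIR_PATTERN.match(name)
        if match and match.group(1) in AFFECTED_VERSIONS:
            hits.append(name)
    return hits
```

In practice the input would be a listing of the editor's extensions directory, whose location varies by editor and is not given in the reporting. An empty result only means no matching directory name was found; it does not show that a machine was never exposed.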

The injected artifacts concealed long natural‑language prompts whose explicit objective was to coerce local AI CLIs into performing system inspection and potential data exfiltration. Version 1.8.12 embedded a "2,000-word prompt" that framed the AI as a "forensic agent" instructed to scan for "compromises, credentials, financial data" and to exfiltrate via channels such as email or Slack. In 1.8.13 the prompt shifted to target tools, tokens and the GitHub CLI with instructions to create a repository named posture-report-trivy and write findings to REPORT.MD.

Gbhackers' analysis found the malicious code hidden inside the workspace activation function pl(), where it ran before Trivy's normal setup. The tampered block spawned five local AI CLIs (Claude, Codex, Gemini, Copilot and Kiro) with permissive runtime flags such as "--dangerously-skip-permissions --add-dir /" for Claude and "--ask-for-approval never --sandbox danger-full-access" for Codex. Cybersecuritynews noted that in v1.8.13 the harmful block was wrapped in an if statement using JavaScript's comma operator, which evaluates its operands left to right, so the malicious commands would run first. "All five AI commands ran as detached background processes with silent error handling, any tool not installed simply failed without visible noise," the reporting said.
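
The flag combinations quoted for Claude and Codex lend themselves to a simple pattern match over recorded command lines, for example lines taken from shell history or process listings. The following is a minimal sketch under stated assumptions: the binary names "claude" and "codex", the double-hyphen spelling of the flags, and the helper name risky_matches are illustrative. Flags for Gemini, Copilot and Kiro are not quoted in the reporting, so those tools are deliberately left out rather than guessed.

```python
import os
import shlex

# Flag sequences quoted in the reporting, keyed by an assumed binary name.
RISKY = {
    "claude": [("--dangerously-skip-permissions",)],
    "codex": [
        ("--ask-for-approval", "never"),
        ("--sandbox", "danger-full-access"),
    ],
}


def risky_matches(command_line):
    """Return the risky flag sequences found in one command line."""
    try:
        tokens = shlex.split(command_line)
    except ValueError:
        # Unbalanced quotes in a history line: fall back to whitespace split.
        tokens = command_line.split()
    if not tokens:
        return []
    tool = os.path.basename(tokens[0])
    found = []
    for seq in RISKY.get(tool, []):
        n = len(seq)
        # Slide a window of len(seq) tokens over the arguments.
        if any(tuple(tokens[i:i + n]) == seq
               for i in range(1, len(tokens) - n + 1)):
            found.append(" ".join(seq))
    return found
```

This sketch only recognizes space-separated flag values and exact binary names, so a hit is a lead for further review and a miss is not proof of absence.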

Analysts linked the injected prompts to a wider AI‑powered bot campaign that targeted GitHub Actions workflows across multiple projects. Mayura Kathir and StepSecurity documented the campaign, named "hackerbot-claw" in one account, describing theft of a personal access token, repository takeover and push of malicious artifacts in related incidents. At the time of reporting there was no public repository named posture-report-trivy and no public evidence that exfiltration succeeded.

Security teams should treat this as a supply‑chain compromise. Recommended steps include: uninstalling the affected extension if installed from OpenVSX and updating to a patched release beyond 1.8.13; rotating any GitHub tokens, cloud credentials and SSH keys that may have been exposed; and scanning shell history for invocations of claude, codex, gemini, copilot or kiro-cli with permissive flags. Investigators should also check for unexpected GitHub activity or a posture-report-trivy repository and inspect AI agent logs and permissions.
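
One way to check an account for the repository name mentioned in the reporting is to filter a JSON listing of that account's repositories. The sketch below assumes the input is a JSON array of objects with a "name" key, which is the shape produced by a command such as `gh repo list --json name`; the helper name suspect_repos is hypothetical.

```python
import json

# Repository name taken from the article.
SUSPECT_REPO = "posture-report-trivy"


def suspect_repos(repo_list_json):
    """Return names in a JSON repo listing that match the suspect name.

    Assumes a JSON array of objects that each carry a "name" key.
    """
    repos = json.loads(repo_list_json)
    return [
        repo["name"]
        for repo in repos
        if repo.get("name", "").lower() == SUSPECT_REPO
    ]
```

A repository that was created and later deleted would not appear in such a listing, so this complements the review of account activity rather than replacing it.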

The incident underscores a shifting attacker technique: delegating reconnaissance and exfiltration to locally trusted AI agents to evade traditional signature‑based detection. Trivy remains a widely used vulnerability scanner and its VS Code extension was intended to bring that functionality into developers' editors. Researchers cautioned defenders that, absent file hashes or network indicators from the published summaries, detection will require careful host and account auditing rather than simple IOC matching. A Reddit commenter summarized the stakes bluntly: "Ransomware seems to be the most dangerous because it can completely shut a business down."
