Agent Skills – Open Security Database

34 points by 4ppsec 21 hours ago on hackernews | 4 comments

About the Index

The Skills Security Index is a centralized repository providing security risk analysis for agentic AI skill definitions. As AI agents increasingly rely on modular skills to perform tasks, the instructions used to define these skills become a critical attack surface. This index helps security engineers and developers understand the potential "blast radius" of any given skill before deployment.

Inside the Lab

Each entry in the index represents a unique skill found across major platform registries on GitHub. We perform a deep scan of the skill's identity, its instructions, and associated code to build a comprehensive security profile.

Assessment Method

Analyses are performed against a standardized security schema and focus on instructional risk, such as identifying when a skill's prompts encourage an agent to bypass guardrails or perform sensitive operations without oversight.

Risk Ranking Framework

Risk is calculated dynamically across three dimensions, and a skill is assigned the highest (most severe) level detected in any of them. The levels, from least to most severe, are:

  • Pass: No significant risks detected in instructions or tools.
  • Low: Minor capability risk with appropriate scoping context.
  • Medium: Potentially risky tool use or instructions that lack clear restrictions.
  • High: Direct instructions for sensitive operations (e.g., broad file system write or unencrypted network use).
  • Critical: Encouragement of malicious actions, data exfiltration, or explicit bypasses.
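
The "highest level wins" rule can be sketched in a few lines of Python. This is a minimal illustration, not the index's actual implementation: the `RiskLevel` enum, the `overall_risk` function, and the dimension names in the example are all assumptions made for the sketch.

```python
from enum import IntEnum


class RiskLevel(IntEnum):
    """Severity levels from the list above, ordered least to most severe."""
    PASS = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def overall_risk(dimension_levels: dict[str, RiskLevel]) -> RiskLevel:
    """Return the most severe level across all dimensions (PASS if empty)."""
    return max(dimension_levels.values(), default=RiskLevel.PASS)


# Hypothetical dimension names; the source does not name the three dimensions.
example = {
    "capabilities": RiskLevel.MEDIUM,
    "findings": RiskLevel.HIGH,
    "permissions": RiskLevel.LOW,
}
print(overall_risk(example).name)  # prints "HIGH"
```

Taking the maximum rather than an average means a single severe result cannot be diluted by otherwise clean dimensions.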

Capabilities

We classify instructions into eight buckets: Tools, Code Execution, Web Access, File System, Data Access, Authentication, Network, and System. "Detected" means the skill explicitly encourages the agent to use that capability.

Findings

Findings report specific deviations from security best practices, such as Prompt Injection vulnerabilities, Credential Exposure, or Excessive Permissions.

Permissions

Permissions are the underlying resource requests implied by the skill. We evaluate whether each request is justified by the skill's stated purpose.
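
One simple way to picture the justification check is as a set difference between what a skill requests and what its stated purpose would plausibly need. The sketch below assumes that framing; the function name `unjustified_permissions` and the permission strings are hypothetical and do not come from the index's schema.

```python
def unjustified_permissions(
    requested: set[str], expected_for_purpose: set[str]
) -> set[str]:
    """Return requested permissions not covered by the stated purpose."""
    return requested - expected_for_purpose


# Hypothetical example: a skill described as a read-only file summarizer.
requested = {"file_system:read", "network:outbound", "auth:read_tokens"}
expected = {"file_system:read"}

extra = sorted(unjustified_permissions(requested, expected))
print(extra)  # prints ['auth:read_tokens', 'network:outbound']
```

In practice, deciding what a purpose "plausibly needs" is a judgment call; this sketch only shows the comparison once that expected set has been chosen.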