AI incident investigator (prototype)
Local tooling and workflows for AI-assisted, read-only incident investigation of DigitalOcean droplets managed through Ploi.io. The investigation user should remain a normal Linux account that you can see and manage in Ploi; do not create hidden users manually on the server.
This repository is a prototype only for now: SSH private keys stay untracked, and nothing here is assumed to be published yet.
Goal
When an infrastructure alert fires, a local agent (or operator) SSHs into the affected server, inspects it like a human SRE with read-only access, and produces a local report that covers:
- likely cause
- supporting evidence
- confidence level
- suggested remediation (for a human to approve and run)
- commands and data that were checked
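One way to keep reports consistent is to give them a fixed shape. The sketch below is a minimal illustration, not the repository's actual format: the `IncidentReport` class, its field names, the three-level confidence scale, and the sample values are all assumptions made up for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IncidentReport:
    """Hypothetical container mirroring the report sections listed above."""

    likely_cause: str
    evidence: List[str]
    confidence: str  # assumed scale: "low", "medium", or "high"
    suggested_remediation: List[str]  # a human approves and runs these
    commands_checked: List[str] = field(default_factory=list)

    def to_text(self) -> str:
        """Render the report as plain text for a local file or terminal."""
        lines = [
            f"Likely cause: {self.likely_cause}",
            f"Confidence: {self.confidence}",
            "Supporting evidence:",
        ]
        lines += [f"- {item}" for item in self.evidence]
        lines.append("Suggested remediation (needs human approval):")
        lines += [f"- {step}" for step in self.suggested_remediation]
        lines.append("Commands and data checked:")
        lines += [f"- {cmd}" for cmd in self.commands_checked]
        return "\n".join(lines)


# Placeholder values for illustration only; not real findings.
report = IncidentReport(
    likely_cause="example: queue workers holding memory",
    evidence=["example: worker processes top the memory listing"],
    confidence="medium",
    suggested_remediation=["example: restart queue workers after review"],
    commands_checked=["free -m", "ps aux --sort=-%mem"],
)
print(report.to_text())
```

Keeping remediation as a list of suggestions, separate from the commands that were actually run, reflects the read-only intent: the agent records what it inspected and leaves any change to a human.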
Typical questions this supports:
- Why is RAM over a threshold (for example 70%)?
- Is CPU/load from traffic, PHP workers, MySQL, queues, or something else?
- Is there evidence of an HTTP flood or DDoS?
- Which IPs, URLs, services, or processes look suspicious?
- What should a human do next?
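The questions above can be mapped to a small set of read-only commands. The following is a rough sketch under stated assumptions: the `READ_ONLY_CHECKS` mapping, the helper names, the user name `investigator`, the key path, and the host address are hypothetical, and the command list is only a starting point. The memory calculation assumes a kernel that reports `MemAvailable` in `/proc/meminfo`.

```python
from typing import Dict, List

# Assumed mapping from question to read-only commands; adjust per server.
READ_ONLY_CHECKS: Dict[str, List[str]] = {
    "memory": ["cat /proc/meminfo", "free -m", "ps aux --sort=-%mem"],
    "cpu_load": ["uptime", "ps aux --sort=-%cpu"],
    "connections": ["ss -s"],
}


def build_ssh_argv(host: str, user: str, key_path: str, remote_command: str) -> List[str]:
    """Build an ssh argument list; the caller decides whether to run it."""
    return [
        "ssh",
        "-i", key_path,
        "-o", "BatchMode=yes",       # fail instead of prompting for input
        "-o", "ConnectTimeout=10",   # give up quickly on unreachable hosts
        f"{user}@{host}",
        remote_command,
    ]


def used_memory_percent(meminfo_text: str) -> float:
    """Compute used RAM as a percentage from /proc/meminfo contents."""
    values: Dict[str, int] = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            values[name.strip()] = int(parts[0])  # values are in kB
    total = values["MemTotal"]
    available = values["MemAvailable"]
    return 100.0 * (total - available) / total


# Made-up sample input, not output captured from a real server.
sample_meminfo = "MemTotal: 4000000 kB\nMemAvailable: 1000000 kB\n"
percent = used_memory_percent(sample_meminfo)
over_threshold = percent > 70.0  # the 70% figure is the example threshold above

argv = build_ssh_argv("203.0.113.10", "investigator", "keys/investigator_ed25519", "uptime")
print(percent, over_threshold)
print(argv)
```

Building the argument list as a Python list rather than a shell string avoids local shell interpolation, and restricting the agent to a fixed command mapping makes it easier to audit what was run on the server.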