AI Safety & Security

Build AI systems that are safe, aligned, and resilient.

Frontier AI safety and security work informed by direct fellowship experience training models for accurate intent assessment, refining safety policies, and evaluating edge cases that balance security constraints with legitimate user needs.

Discuss an engagement

Intent Assessment & Model Evaluation

Train and evaluate AI systems to accurately interpret user intent — distinguishing legitimate use from misuse, and improving response quality across complex prompts.

AI Red-Teaming & Adversarial Testing

Probe frontier models for jailbreaks, prompt injection, data exfiltration, and policy bypasses using techniques from TCM AI Hacking and applied fellowship work.

Safety Policy Refinement

Translate organizational risk posture into concrete safety policies, refine refusal behavior, and tune responses that balance security with legitimate user needs.

Edge-Case & Harm Evaluation

Surface and document edge cases where models behave unexpectedly, harmfully, or inconsistently — and recommend mitigations grounded in established AI safety practice.

Our approach

We bring decades of enterprise security discipline — NIST, FISMA, and offensive security — to AI systems that increasingly drive business decisions and customer experiences.

  1. 1

    Threat Modeling for AI

    Map the attack surface: prompt injection, training-data risks, model theft, and downstream misuse.

  2. 2

    Red-Team & Evaluation

    Adversarially probe models against documented harm taxonomies and your specific policy boundaries.

  3. 3

    Policy & Guardrail Design

    Refine system prompts, refusal behavior, and runtime guardrails to encode your safety posture.

  4. 4

    Continuous Monitoring

    Stand up evaluation harnesses and review cadences so safety posture keeps pace with model updates.

Credentials & experience

Backed by direct frontier AI fellowship work and a current AI security credential stack.

Handshake AI fellowship — intent assessment & safety policy
TCM AI Hacking (2026)
TCM AI Fundamentals (2026)
ISC2 — Building AI Strategies (2025)
20+ years of enterprise security and red-team experience
Aligned with NIST AI RMF and emerging federal AI guidance