
Governing 'Shadow AI' with AI Gateway

A major risk during any M&A is staff pasting sensitive merger data into public AI tools to generate summaries. Our scenario required a safe, internal alternative.

We built “Vera Chat,” an internal AI helpdesk. But simply hosting an LLM is not enough; you need governance. This is where Cloudflare AI Gateway and application-layer controls work together.

The governance pipeline has two distinct layers:

Layer 1 — Platform Controls (AI Gateway)

AI Gateway is configured as a binding in wrangler.jsonc ("gateway": "vera-ai-gateway"). Every call to env.AI.run() is automatically routed through it. AI Gateway provides:

  • DLP Scanning: Checks prompts and responses for PII patterns (National Insurance numbers, credit cards).
  • Rate Limiting: Limits requests per time window to control costs.
  • Logging & Analytics: Records token usage, latency, and cost per request for audit.
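For reference, a minimal sketch of that binding. The "gateway" field follows the article's description; the other fields and the exact shape are assumptions and may vary by wrangler version:

```jsonc
// wrangler.jsonc — illustrative fragment, not the project's full config
{
  "name": "vera-chat",
  "main": "src/index.ts",
  "compatibility_date": "2025-01-01",
  "ai": {
    "binding": "AI",              // exposed to the Worker as env.AI
    "gateway": "vera-ai-gateway"  // routes env.AI.run() through AI Gateway
  }
}
```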

Layer 2 — Application Logic (Our Worker Code)

On top of AI Gateway’s platform-level controls, our Worker implements additional security:

  1. Regex Pre-Scan: Fast pattern matching for known PII formats (NI numbers, sort codes, credit card numbers) with ReDoS-safe patterns. Runs before any AI call.
  2. The “LLM Judge”: A lightweight model scans the prompt for semantic evasion that regex misses (e.g., “my credit card is four one one one…”).
  3. RAG Retrieval: The Worker fetches relevant policy documents to ground the response in fact.
  4. Inference: The grounded prompt is sent to the main model for response generation.
  5. Output DLP Scan: The AI’s response is also scanned for PII before delivery to the user.
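Step 1, the regex pre-scan, might look like the following sketch. The patterns are simplified illustrations of ReDoS-safe matching (bounded repetition, no nested quantifiers), not the project's actual rules:

```typescript
// Illustrative PII patterns; each uses bounded repetition so worst-case
// matching time stays linear in the input (ReDoS-safe).
const PII_PATTERNS: Record<string, RegExp> = {
  // UK National Insurance number, e.g. AB123456C (simplified prefix rules)
  niNumber: /\b[A-CEGHJ-PR-TW-Z]{2}\d{6}[A-D]\b/i,
  // UK bank sort code, e.g. 12-34-56
  sortCode: /\b\d{2}-\d{2}-\d{2}\b/,
  // 13-16 digit card number, allowing spaces or dashes between digits
  creditCard: /\b(?:\d[ -]?){13,16}\b/,
};

// Returns the names of every pattern the message triggers.
function regexPreScan(message: string): string[] {
  return Object.entries(PII_PATTERNS)
    .filter(([, pattern]) => pattern.test(message))
    .map(([name]) => name);
}
```

Because this runs before any AI call, obviously unsafe prompts are rejected in microseconds without spending a single inference token.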

Regex alone is not enough for DLP: users can evade patterns by spelling digits out ("my credit card is four one one one…"). We therefore implemented a "Judge" pattern: a fast, cheap model (@cf/meta/llama-3.2-1b-instruct) acts as a security officer. Its only job is to read the prompt and answer "SAFE" or "UNSAFE". Only "SAFE" prompts proceed to the larger, more expensive model.

Here is the actual implementation from our Worker:

```typescript
async llmJudge(message: string): Promise<boolean> {
  try {
    const judgeResponse = await this.env.AI.run(
      "@cf/meta/llama-3.2-1b-instruct",
      {
        messages: [
          {
            role: "system",
            content:
              "You are a DLP Security Officer. Analyze the following text. " +
              "Does it contain a UK National Insurance Number, Credit Card, " +
              "or Bank Sort Code? Reply ONLY with 'SAFE' or 'UNSAFE'.",
          },
          { role: "user", content: message },
        ],
      },
    );
    const result = judgeResponse as { response?: string };
    return !result.response?.toLowerCase().includes("unsafe");
  } catch (e) {
    console.warn("LLM Judge failed, falling back to regex only.", e);
    return true; // Fail open to regex-only — a design trade-off
  }
}
```

Design note: The Judge fails open (returns true) if Workers AI is unavailable, falling back to regex-only DLP. In a production deployment, you might choose to fail closed instead, depending on your risk tolerance.
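Failing closed is a one-line change in the catch block. As an illustration, here is a hypothetical wrapper (the judge function is injected so the trade-off is visible in isolation; none of these names come from the project):

```typescript
// Hypothetical fail-closed wrapper: any judge failure blocks the prompt.
async function judgeFailClosed(
  judge: (message: string) => Promise<boolean>,
  message: string,
): Promise<boolean> {
  try {
    return await judge(message);
  } catch (e) {
    // Unlike the fail-open version, an outage means no prompt passes.
    console.error("LLM Judge unavailable; blocking request (fail closed).", e);
    return false; // treat the prompt as UNSAFE
  }
}
```

Fail closed trades availability for confidentiality: during a Workers AI outage the chatbot stops answering entirely, which may be exactly the right call mid-merger.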

A chatbot that hallucinates is worse than no chatbot. We use Cloudflare AI Search (managed RAG) to ground every response in Vera’s actual policy documents.

How it works:

  1. Indexing: Policy documents (PDF, Markdown) are uploaded to an R2 bucket. AI Search automatically chunks, embeds (using bge-base-en-v1.5), and indexes them.
  2. Retrieval: When a user asks a question, we call env.AI.autorag().search() with query rewriting enabled. AI Search returns the top 3 most relevant document chunks.
  3. Post-Retrieval Authorisation: Not all documents should be visible to all roles. We filter results by filename metadata — documents marked admin or confidential are excluded for non-admin users.
  4. No-Result Handling: If AI Search returns zero results, or all results are filtered out by authorisation, the system prompt instructs the model to refuse to speculate rather than hallucinate.
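Step 4 can be implemented entirely in the system prompt. A hypothetical builder (the wording is an assumption, not the project's actual prompt):

```typescript
// Hypothetical system prompt implementing the "refuse, don't speculate" rule.
function buildSystemPrompt(context: string): string {
  const hasContext = !context.startsWith("No relevant policy documents");
  return [
    "You are Vera Chat, an internal policy assistant.",
    "Answer ONLY from the policy excerpts below.",
    hasContext
      ? "If the excerpts do not answer the question, say so plainly."
      : "No policy excerpts were found. Tell the user you cannot answer " +
        "this question and suggest they contact HR. Do not speculate.",
    "",
    "Policy excerpts:",
    context,
  ].join("\n");
}
```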

Here is the core retrieval function:

```typescript
async retrieveContext(query: string, userRole: string): Promise<string> {
  const results = await this.env.AI.autorag(
    "ai-search-vera-ai-search",
  ).search({
    query,
    max_num_results: 3,
    rewrite_query: true,
  }) as AiSearchResponse;

  if (!results.data || results.data.length === 0) {
    return "No relevant policy documents found.";
  }

  // Post-retrieval authorisation: filter by role
  const filteredResults = results.data.filter((r) => {
    const isConfidential =
      r.filename.toLowerCase().includes("admin") ||
      r.filename.toLowerCase().includes("confidential");
    return !(isConfidential && userRole !== "admin");
  });

  if (filteredResults.length === 0) {
    return "I found some documents, but you do not have permission to access them.";
  }

  return filteredResults
    .map((r) => `[Source: ${r.filename}]\n${r.content.map((c) => c.text).join("\n")}`)
    .join("\n\n---\n\n");
}
```
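The role filter is worth unit-testing on its own. A standalone version of the same logic, with an illustrative minimal type so it runs without a live AI Search index:

```typescript
// Standalone copy of the post-retrieval authorisation rule, so the role
// logic can be exercised without calling AI Search.
interface ChunkMeta {
  filename: string;
}

function authoriseChunks<T extends ChunkMeta>(chunks: T[], userRole: string): T[] {
  return chunks.filter((chunk) => {
    const name = chunk.filename.toLowerCase();
    const isConfidential = name.includes("admin") || name.includes("confidential");
    return !isConfidential || userRole === "admin";
  });
}
```

Keeping the rule as a pure function means a regression in document visibility is caught by a unit test rather than by a user seeing a merger memo.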