GroveAI
Technical Free Template

Prompt Engineering Playbook Template

A practical playbook for designing, testing, and managing prompts for LLM-powered applications. Covers prompt patterns, structured prompt design, testing methodology, version control, and continuous optimisation strategies.

Overview

What's included

  • Prompt design framework with examples
  • Common prompt patterns and when to use them
  • Prompt testing and evaluation methodology
  • Version control and change management process
  • Optimisation strategies for quality and cost
  • Prompt library management approach

Prompt Design Framework

The CRAFT Method

Use this structure when designing a new prompt:

C — Context: Set the stage. Who is the AI? What domain is it operating in?

You are a senior financial analyst at a UK investment firm.
You help clients understand their portfolio performance.

R — Role & Rules: Define constraints and guidelines.

Rules:
- Only use data provided in the context. Do not make up figures.
- Always express returns as percentages.
- If uncertain, say so rather than guessing.
- Respond in British English.

A — Action: Specify what the AI should do.

Analyse the provided portfolio data and produce a quarterly
performance summary highlighting top and bottom performers.

F — Format: Define the output structure.

Format your response as:
## Portfolio Summary
[2-3 sentence overview]

## Top Performers
[Table: Fund Name | Return % | Benchmark Delta]

## Underperformers
[Table: Fund Name | Return % | Recommendation]

T — Tone: Set the communication style.

Tone: Professional and clear. Avoid jargon.
Audience: Clients with moderate financial literacy.
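The five components above can be kept as separate strings and assembled into one system prompt, which makes each part independently editable and reviewable. A minimal Python sketch (the function name and joining order are illustrative, not part of the framework):

```python
def build_craft_prompt(context: str, rules: list[str], action: str,
                       output_format: str, tone: str) -> str:
    """Assemble a system prompt from the five CRAFT components."""
    rules_block = "Rules:\n" + "\n".join(f"- {r}" for r in rules)
    return "\n\n".join([
        context,
        rules_block,
        action,
        "Format your response as:\n" + output_format,
        tone,
    ])

prompt = build_craft_prompt(
    context="You are a senior financial analyst at a UK investment firm.",
    rules=["Only use data provided in the context.",
           "Respond in British English."],
    action="Produce a quarterly performance summary.",
    output_format="## Portfolio Summary\n[2-3 sentence overview]",
    tone="Tone: Professional and clear. Avoid jargon.",
)
```

Storing the components separately also lets you diff and A/B test one component (say, the rules) while holding the others constant.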

Common Prompt Patterns

1. Chain of Thought

When to use: Complex reasoning, multi-step problems, calculations

Think through this step by step:
1. First, identify the key variables
2. Then, calculate the intermediate result
3. Finally, derive the answer
Show your working.

2. Few-Shot Examples

When to use: When the AI needs to match a specific format or style

Classify the following customer messages.

Example 1:
Input: "My order hasn't arrived yet"
Category: Delivery Issue
Urgency: Medium

Example 2:
Input: "I'd like to cancel my subscription"
Category: Cancellation
Urgency: High

Now classify:
Input: "{user_message}"

3. Structured Output

When to use: When you need parseable, consistent output

Respond ONLY with valid JSON matching this schema:
{
  "summary": "string (max 100 words)",
  "sentiment": "positive | negative | neutral",
  "confidence": number (0.0 to 1.0),
  "key_topics": ["string"]
}
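Because the model is instructed to emit JSON, the application side should parse and validate the reply rather than trust it. A minimal validator for the schema above, using only the standard library (raising ValueError is one choice; a retry loop that re-prompts the model is another):

```python
import json

REQUIRED_KEYS = {"summary", "sentiment", "confidence", "key_topics"}
ALLOWED_SENTIMENTS = {"positive", "negative", "neutral"}

def parse_structured_reply(raw: str) -> dict:
    """Parse the model's JSON reply and enforce the schema."""
    data = json.loads(raw)  # raises if the reply is not valid JSON
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if data["sentiment"] not in ALLOWED_SENTIMENTS:
        raise ValueError(f"bad sentiment: {data['sentiment']!r}")
    if not 0.0 <= float(data["confidence"]) <= 1.0:
        raise ValueError("confidence out of range")
    return data

reply = parse_structured_reply(
    '{"summary": "Positive review of delivery speed.", '
    '"sentiment": "positive", "confidence": 0.92, '
    '"key_topics": ["delivery"]}'
)
```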

4. Self-Critique

When to use: When accuracy is critical

After generating your answer:
1. Review it for factual accuracy
2. Check it answers all parts of the question
3. Verify any numbers or calculations
4. If you find errors, correct them before responding

5. Persona + Audience

When to use: When tone and expertise level matter

You are explaining {topic} to a {audience}.
Adjust your language complexity accordingly.
Use analogies from their domain where helpful.

Prompt Testing & Versioning

Prompt Version Tracking

Version | Date | Author | Change Description      | Test Result     | Status
v1.0    |      |        | Initial prompt          |  /5 avg quality | Production
v1.1    |      |        | Added few-shot examples |  /5 avg quality | Testing
v1.2    |      |        | Refined output format   |  /5 avg quality | Draft

A/B Testing Framework

Test | Prompt A (Control)  | Prompt B (Variant) | Metric        | Result
     | v1.0 (current prod) | v1.1 (new)         | Quality score | A:  /5 vs B:  /5
     |                     |                    | Latency       | A:  ms vs B:  ms
     |                     |                    | Token cost    | A:  vs B:
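The A/B comparison can be computed from per-request logs. A sketch, assuming each logged result records a quality score (1 to 5), latency, and token count (the field names are illustrative):

```python
from statistics import mean

def compare_variants(results_a: list[dict], results_b: list[dict]) -> dict:
    """Average quality, latency, and token usage for two prompt variants."""
    def summary(results: list[dict]) -> dict:
        return {
            "quality": mean(r["quality"] for r in results),
            "latency_ms": mean(r["latency_ms"] for r in results),
            "tokens": mean(r["tokens"] for r in results),
        }
    return {"A": summary(results_a), "B": summary(results_b)}

report = compare_variants(
    results_a=[{"quality": 3, "latency_ms": 800, "tokens": 450},
               {"quality": 4, "latency_ms": 760, "tokens": 430}],
    results_b=[{"quality": 4, "latency_ms": 900, "tokens": 520},
               {"quality": 5, "latency_ms": 880, "tokens": 510}],
)
```

With the summaries in hand, the decision is a trade-off: here variant B scores higher on quality but costs more tokens and latency per request.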

Prompt Test Checklist

Before promoting a prompt to production:

  • Tested on __+ evaluation examples
  • Quality score meets minimum threshold: __ /5
  • No regression on previously passing test cases
  • Adversarial inputs handled correctly
  • Output format is consistent and parseable
  • Token usage is within budget: < __ tokens average
  • Reviewed by a second team member
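The quantitative parts of this checklist can be enforced mechanically as a promotion gate in CI. A sketch with illustrative metric names and thresholds (set them per project):

```python
def passes_promotion_gate(metrics: dict, min_quality: float = 4.0,
                          max_avg_tokens: int = 600) -> bool:
    """Check a candidate prompt's evaluation metrics against the checklist.

    Thresholds and metric names here are illustrative, not prescriptive.
    """
    return (
        metrics["avg_quality"] >= min_quality
        and metrics["regressions"] == 0
        and metrics["adversarial_failures"] == 0
        and metrics["avg_tokens"] <= max_avg_tokens
    )

ok = passes_promotion_gate({"avg_quality": 4.3, "regressions": 0,
                            "adversarial_failures": 0, "avg_tokens": 480})
```

The human review step stays manual; the gate only blocks candidates that fail the measurable criteria.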

Prompt Library Structure

prompts/
  ├── classification/
  │   ├── v1.0.txt
  │   ├── v1.1.txt
  │   └── README.md (changelog)
  ├── summarisation/
  │   ├── v1.0.txt
  │   └── README.md
  └── extraction/
      ├── v1.0.txt
      └── README.md

Each prompt file includes:

  • The full prompt template with variable placeholders
  • CRAFT metadata (context, rules, action, format, tone)
  • Test results summary
  • Known limitations
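A loader for this layout can resolve a task and version to a file and substitute the variable placeholders. A sketch using Python's str.format for {placeholder} variables, matching the {user_message} style used earlier (note that literal braces in a prompt, e.g. embedded JSON, would need escaping as {{ }}):

```python
from pathlib import Path
import tempfile

def load_prompt(library_root, task: str, version: str, **variables) -> str:
    """Read prompts/<task>/<version>.txt and fill {placeholder} variables."""
    path = Path(library_root) / task / f"{version}.txt"
    return path.read_text(encoding="utf-8").format(**variables)

# Demo: build a throwaway library entry, then load and render it.
root = Path(tempfile.mkdtemp())
(root / "classification").mkdir()
(root / "classification" / "v1.0.txt").write_text(
    'Classify this message: "{user_message}"', encoding="utf-8")
rendered = load_prompt(root, "classification", "v1.0",
                       user_message="Where is my order?")
```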

Instructions

How to use this template

1. Use the CRAFT framework for new prompts

Start every prompt with Context, Role/Rules, Action, Format, and Tone. This structure ensures completeness and consistency.

2. Select appropriate patterns

Choose from the common patterns based on your task type. Chain of thought for reasoning, few-shot for formatting, structured output for parsing.

3. Test before deploying

Run every prompt change against your evaluation dataset. Compare quality, cost, and latency against the current production version.

4. Version control all prompts

Store prompts in your code repository alongside the application code. Track changes, review as code, and link to test results.

5. Iterate based on production feedback

Monitor user feedback and quality metrics. Use low-scoring outputs as new test cases and iterate on the prompt.

Watch Out

Common mistakes to avoid

  • Writing vague instructions — be as specific as possible about what the AI should and should not do.
  • Not testing prompt changes — even small wording changes can significantly affect output quality.
  • Overloading a single prompt — if a prompt tries to do too much, split it into multiple focused prompts.
  • Ignoring token cost — long system prompts consume tokens on every request; optimise for both quality and efficiency.
  • Not using examples — few-shot examples are one of the most reliable ways to improve output consistency.

FAQ

Frequently asked questions

How long should a system prompt be?

As short as possible while achieving the desired output quality. Most effective system prompts are 200-500 tokens. If your prompt exceeds 1000 tokens, consider whether all instructions are necessary or if some can be moved to few-shot examples.

When should I use few-shot examples versus detailed instructions?

Use few-shot examples when the output format or style is hard to describe in words. Use detailed instructions when the rules are clear and logical. Often a combination works best: clear rules plus 1-2 examples.

How can I reduce token costs?

Cache responses for repeated queries, use shorter prompts where possible, move static context to system messages (which can be cached by some providers), and consider using smaller models for simpler tasks.
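The caching idea can be sketched as a thin wrapper keyed on a hash of the full prompt. Here `call_model` is a stand-in for whatever client function sends the request; note that exact-match caching only helps when prompts repeat verbatim:

```python
import hashlib

_cache: dict[str, str] = {}

def cached_completion(prompt: str, call_model) -> str:
    """Return a cached reply for an identical prompt, else call the model."""
    key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    if key not in _cache:
        _cache[key] = call_model(prompt)
    return _cache[key]

calls = []
def fake_model(prompt: str) -> str:  # stub standing in for a real API call
    calls.append(prompt)
    return "stub reply"

first = cached_completion("Summarise Q3 results.", fake_model)
second = cached_completion("Summarise Q3 results.", fake_model)  # cache hit
```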

Who should manage prompts on the team?

Centralise prompt management with 1-2 prompt engineers who understand the patterns and testing methodology. Other team members can propose changes, but a prompt engineer should review and test them before deployment.

Will the same prompt work across different models?

Different models respond differently to prompts. If you need to support multiple models, maintain model-specific prompt variants and test each variant against the target model. Common differences include: structured output handling, instruction following, and reasoning capabilities.
