🔬 Core Innovation

Code Verification

The revolutionary idea: don't trust AI output — generate test code and prove the answer is correct.

Don't Trust — Test

Standard AI gives you an answer and says "trust me." Turbo says "let me prove it."

For each candidate answer, Turbo asks the AI to write verification code — a JavaScript program that independently tests whether the answer is correct. This code runs in a secure sandbox, and its output determines whether the candidate passes or fails.
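The flow described above can be sketched roughly as follows. This is a minimal illustration, not Turbo's actual implementation: `verifyCandidates`, `demoGenerate`, and `demoRun` are hypothetical names, and the two stand-in functions replace the real model call (which would normally be asynchronous) and the real sandbox.

```javascript
// Hypothetical pipeline sketch: every name here is illustrative, not Turbo's API.
function verifyCandidates(question, candidates, generateCode, runInSandbox) {
  return candidates.map((answer) => {
    const code = generateCode(question, answer);
    let passed = false;
    try {
      // Only a strict boolean true counts as a pass.
      passed = runInSandbox(code).output === true;
    } catch (err) {
      // Verification code that throws fails the candidate.
      passed = false;
    }
    return { answer, passed };
  });
}

// Stand-ins for demonstration only. A real generator would ask the model for
// code, and a real runner would execute that code in a sandbox.
const demoGenerate = (question, answer) =>
  `47 * 83 === ${answer.replace(/,/g, "")};`;
// Canned runner: it recognises only the one check known to be correct.
const demoRun = (code) => ({ output: code === "47 * 83 === 3901;" });

const results = verifyCandidates(
  "What is 47 × 83?",
  ["3,901", "3,891"],
  demoGenerate,
  demoRun
);
console.log(results);
```

Passing the generator and the runner in as arguments keeps the pass/fail rule separate from how the code is produced and executed, which also makes the rule easy to test on its own.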

🔬 The Scientific Method for AI

This is the same approach scientists use: form a hypothesis (the candidate answer), design an experiment (the verification code), run the experiment (sandbox execution), and observe the result (pass/fail). It's the scientific method applied to every AI response.

The Verification Pipeline

Here's a complete walkthrough of how verification works for a math question:

1. Question
"What is 47 × 83?"

2. Candidate Answer
"The answer is 3,901"

3. AI-Generated Verification Code
const result = 47 * 83;
const expected = 3901;
result === expected; // → true

4. Sandbox Execution Result
✅ PASSED: code returned true, answer verified

How Verification Code Is Generated

Turbo sends a specialised prompt to the AI asking it to write verification code. The prompt spells out the constraints the code must meet:

// The verification prompt (simplified)
const verifyPrompt = `
Given this question: "${question}"
And this answer: "${candidateAnswer}"

Write JavaScript that verifies whether the answer is correct.
The code must:
- Be self-contained (no imports, no fetch, no require)
- Return true if the answer is correct, false otherwise
- Use only pure computation
- Complete within 5 seconds

Return ONLY the JavaScript code, nothing else.
`;

const verificationCode = await llm.generate(verifyPrompt);
const result = sandbox.execute(verificationCode);
// result.output === true → candidate passes
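The page does not say how the sandbox is built, so the following is only a sketch of what the execution step might look like in Node.js, using the built-in `node:vm` module. `runVerification` is a hypothetical helper, and the sketch assumes that the "returned" value is the value of the last expression in the generated code, as in the walkthrough above. Note that Node's documentation states that `node:vm` is not a security mechanism, so a production sandbox would need stronger isolation (for example a separate process or container).

```javascript
const vm = require("node:vm");

// Hypothetical helper: runs generated code in a fresh context with a time limit.
// node:vm hides Node globals but is NOT a security boundary on its own.
function runVerification(code, timeoutMs = 5000) {
  try {
    // The empty context object means require, fetch, and process are not visible.
    const output = vm.runInNewContext(code, {}, { timeout: timeoutMs });
    // Only a strict boolean true passes; truthy values such as 1 do not.
    return { output, passed: output === true, error: null };
  } catch (err) {
    // Syntax errors, runtime errors, and timeouts all count as a failure.
    return { output: undefined, passed: false, error: String(err) };
  }
}

const ok = runVerification(
  "const result = 47 * 83; const expected = 3901; result === expected;"
);
const bad = runVerification(
  "const result = 47 * 83; const expected = 3891; result === expected;"
);
const hung = runVerification("while (true) {}", 50); // short limit for the demo
console.log(ok.passed, bad.passed, hung.passed);
```

Treating errors and timeouts as a failed verification, rather than as a crash, means a badly written check cannot accidentally let a candidate through.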

What Can Be Verified

Verification works best when the answer can be independently computed or checked with code:

🔢 Mathematics: arithmetic, algebra, calculus (compute and compare)
🧩 Logic: Boolean logic, set theory, deductive reasoning
💻 Code Correctness: run the code, check output against expected results
📊 Data Transforms: sorting, filtering, aggregation (test with sample data)
🔤 String Operations: regex, parsing, formatting (pattern matching)
📐 Conversions: units, bases, encodings (deterministic transforms)
⚠️ Limitations

Verification works best for objectively testable claims. Subjective questions ("Is this poem good?"), creative writing, and opinion-based answers rely more on the fuzzy scoring system than on code verification.
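For the objectively testable categories, a check does not always have to recompute the answer; it can test properties that any correct answer must have. As an illustration for the Data Transforms category, the sketch below checks a candidate that claims to be an ascending sort of some input. The function name and the question shape are assumptions made for this example, not part of Turbo.

```javascript
// Hypothetical check for a "sort these numbers in ascending order" question.
function isValidAscendingSort(input, candidate) {
  if (input.length !== candidate.length) return false;

  // Property 1: no element is smaller than the one before it.
  for (let i = 1; i < candidate.length; i++) {
    if (candidate[i - 1] > candidate[i]) return false;
  }

  // Property 2: the candidate contains exactly the same values as the input.
  const counts = new Map();
  for (const x of input) counts.set(x, (counts.get(x) || 0) + 1);
  for (const x of candidate) {
    const remaining = counts.get(x) || 0;
    if (remaining === 0) return false; // value missing or used too often
    counts.set(x, remaining - 1);
  }
  return true;
}

console.log(isValidAscendingSort([3, 1, 2], [1, 2, 3]));
console.log(isValidAscendingSort([3, 1, 2], [1, 2, 2]));
```

Checking both properties matters: an ordered list that drops or duplicates a value would satisfy the first property alone.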

Verification in Action

Here's a real verification scenario showing a passing candidate and a failing one:

✅ CANDIDATE A — PASSES
Answer: "3,901"
// Verification code
const a = 47;
const b = 83;
const expected = 3901;
a * b === expected; // → true ✅
❌ CANDIDATE C — FAILS
Answer: "3,891"
// Verification code
const a = 47;
const b = 83;
const expected = 3891;
a * b === expected; // → false ❌

The verification code is generated independently of the candidate's reasoning, and it does more than check string equality: it computes the result from scratch and compares it with the claimed value. So even if the reasoning behind an answer was flawed, the verification code can catch the error.
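To make the difference from string equality concrete, here is a small sketch in which the claimed value is extracted from the answer text, the product is recomputed, and the two are compared as numbers. `extractNumber` and `checkProduct` are hypothetical helpers written for this illustration; the page does not describe how Turbo turns an answer such as "3,901" into the `expected` constant.

```javascript
// Hypothetical helper: pull the first numeric claim out of a free-text answer.
function extractNumber(answerText) {
  const match = answerText.match(/-?\d[\d,]*(?:\.\d+)?/);
  if (!match) return null;
  // Drop thousands separators before converting, so "3,901" becomes 3901.
  return Number(match[0].replace(/,/g, ""));
}

// Recompute the product independently, then compare numbers rather than strings.
function checkProduct(a, b, answerText) {
  const claimed = extractNumber(answerText);
  return claimed !== null && a * b === claimed;
}

console.log(checkProduct(47, 83, "The answer is 3,901")); // true
console.log(checkProduct(47, 83, "3901"));                // true
console.log(checkProduct(47, 83, "3,891"));               // false
```

A plain string comparison would treat "3,901" and "3901" as different answers, while the numeric comparison accepts both and still rejects "3,891".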
