Kognitos + GPT-3: OCR Takes a Huge Leap Forward

Name: By combining Kognitos and GPT3, Optical Character Recognition is taking a huge leap forward
Uploaded: 2026-04-08
Duration: 3 min 16 s

What's in this video

A 3-minute demonstration of how Kognitos combines traditional OCR with English-defined logic to extract fields that template-based OCR cannot, and how every fix becomes a permanent learning.

The OCR problem this video solves

Most OCR pipelines need a hand-built template per document layout. Damaged, unstructured, or unfamiliar documents break them. A human reading the document would say something like “the customer ID is always the line below the trailer number.” That is simple logic, but until now it has been hard for automation to use.

How Kognitos teaches OCR new tricks

Base OCR pass: Kognitos runs traditional OCR first and reports a confidence score for every field, alongside the values it could and could not extract.
The exception: in this run, the customer ID could not be extracted, so the brain pauses and surfaces the gap.
Mini playground: a sandboxed test environment where the user types phrases like “grab the line below the document's trailer number” and verifies the result without touching the real system of record.
Validate and save: once the value comes out correct, the user clicks Teach a Technique and the rule is saved.
Fallback at scale: every future document from that vendor runs regular OCR first; if the customer ID is still missing, Kognitos applies the saved location-based rule automatically.
Manage techniques: all learnings for a process live in one place, where they can be reviewed, edited, or removed.
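The steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not Kognitos's implementation: the function names (line_below, teach, extract), the in-memory techniques store, the vendor name, and the sample document values are all hypothetical.

```python
from typing import Optional

def line_below(lines: list[str], anchor: str) -> Optional[str]:
    """Return the first non-empty line after the line containing `anchor`."""
    for i, line in enumerate(lines):
        if anchor.lower() in line.lower():
            for candidate in lines[i + 1:]:
                if candidate.strip():
                    return candidate.strip()
            return None  # anchor found, but nothing follows it
    return None  # anchor not found in the document

# Hypothetical store of saved techniques: vendor -> field -> anchor phrase.
techniques: dict[str, dict[str, str]] = {}

def teach(vendor: str, field: str, anchor: str) -> None:
    """Save a location-based rule ("the line below <anchor>") for a vendor."""
    techniques.setdefault(vendor, {})[field] = anchor

def extract(vendor: str, field: str,
            ocr_fields: dict[str, str], ocr_lines: list[str]) -> Optional[str]:
    """Use the regular OCR value if present, else fall back to the saved rule."""
    value = ocr_fields.get(field)
    if value:
        return value
    anchor = techniques.get(vendor, {}).get(field)
    if anchor is None:
        return None  # nothing taught yet: a human needs to look at this one
    return line_below(ocr_lines, anchor)

# Made-up sample document: OCR found the trailer number but no customer ID.
sample_lines = ["Trailer Number: TR-1187", "", "C-20931", "Total: 4 pallets"]
sample_fields = {"trailer_number": "TR-1187"}

teach("acme-freight", "customer_id", "trailer number")
customer_id = extract("acme-freight", "customer_id", sample_fields, sample_lines)
```

Before teach is called, extract returns None for the customer ID, which corresponds to the exception step. After the rule is saved, the same call recovers the value from the line below the trailer number.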

Why pair OCR with an LLM brain

OCR alone can read pixels, but it cannot apply the logic a human reader would. An LLM alone can hallucinate values that are not on the page. Kognitos combines them: OCR does the deterministic reading, the LLM-driven brain understands the English logic that fills the gaps, and a learnings library makes the combination repeatable on damaged or variable documents.

Questions answered in this video

What's the example used in this OCR demo?

A vendor document where regular OCR successfully extracts most fields but cannot find the customer ID. The user teaches Kognitos a location-based rule, “the customer ID is the line below the trailer number,” to handle that vendor going forward.

What is the Kognitos mini playground used for?

It's a sandboxed test environment where you can try different English phrases or location rules against a document and see the result, without touching any live application. Once the logic works, you save it as a technique.

How does Kognitos use the saved technique on future documents?

For each new document from the same vendor, Kognitos runs traditional OCR first. If the field it needs is missing or low-confidence, it falls back to the saved location-based rule, recovering the value without human intervention.
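As a rough sketch of that decision, assuming each OCR field carries a confidence between 0 and 1: the OcrField type, the resolve function, the 0.8 threshold, and the sample values below are illustrative assumptions, not details taken from the video.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class OcrField:
    value: Optional[str]   # None when OCR could not extract the field
    confidence: float      # assumed scale: 0.0 (no trust) to 1.0 (certain)

MIN_CONFIDENCE = 0.8  # assumed cutoff; a real deployment would tune this

def resolve(field: OcrField,
            fallback: Optional[Callable[[], Optional[str]]],
            min_confidence: float = MIN_CONFIDENCE) -> tuple[Optional[str], str]:
    """Return (value, source), trying OCR first and the saved rule second."""
    if field.value is not None and field.confidence >= min_confidence:
        return field.value, "ocr"
    if fallback is not None:
        recovered = fallback()
        if recovered is not None:
            return recovered, "technique"
    # Neither OCR nor a saved rule produced a value: hand off to a person.
    return None, "needs_review"

# Made-up examples of the three outcomes.
confident = resolve(OcrField("TR-1187", 0.97), None)
recovered = resolve(OcrField(None, 0.0), lambda: "C-20931")
unresolved = resolve(OcrField("C-2O931", 0.41), None)
```

Returning the source alongside the value makes it easy to see, per document, whether a field came from regular OCR, from a saved technique, or still needs a human.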

Can I review and remove techniques later?

Yes. All learnings for a process are stored in a managed list where you can review, edit, or delete them at any time.
