DOCUMENTITIS

How this started

2024. First-year MBBS.

"Why is all of this still on paper?"

So instead of drawing the brachial plexus in our notebooks, we started sketching system architecture.

We walked in expecting stethoscopes and scrubs. We got mountains of paper instead — handwritten records, illegible prescriptions, lab reports stuffed into folders that no one would ever open again.

Between Anatomy lectures and Biochemistry vivas, we kept asking each other the same question — and we couldn't let it go.

Three medical students. One conviction: clinical records should never be a barrier to better care.

The real bottleneck

80% of clinical data is trapped in unstructured form.

Before a study can even begin enrolling, someone has to read it all by hand.

30–60 min

Per patient

Manual chart abstraction from scanned notes, handwritten prescriptions and free-text discharge summaries — for every single record.

100s of hrs

Per trial

Across a 500-patient study, 30–60 minutes per record adds up to 250–500 hours of pure data entry before the first patient is enrolled.

Integrity risk

Inter-rater variability

Different abstractors read the same chart differently — quietly threatening the integrity of multi-site research data.

What DOCUMENTITIS does

Five gated stages. All under chain of custody.

A fully local pipeline that converts unstructured clinical documents into structured, research-ready data — every step on the hospital's own servers.

Upload → OCR Quality Gate → AI Extraction → Validation → Export

On-premise: Every processing step runs on the hospital's own servers.

Source-linked: Every extracted data point links back to its source line in the original document.

Knows its limits: Uncertain extractions are flagged for human review, never silently trusted.

Export-ready: Outputs flow directly into REDCap and Medidata Rave.
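
One way to picture the five gated stages is as a chain of functions, where each gate either passes the document on or holds it for review. The following is a minimal Python sketch under stated assumptions, not the DOCUMENTITIS implementation: `Document`, `GateError`, `run_pipeline` and the stage functions are hypothetical names, and the OCR check and the extraction step are simple stand-ins.

```python
from dataclasses import dataclass, field


class GateError(Exception):
    """Raised when a stage refuses to pass a document on (hypothetical)."""


@dataclass
class Document:
    name: str
    lines: list[str]                       # OCR text, one entry per source line
    fields: dict = field(default_factory=dict)
    log: list[str] = field(default_factory=list)


def upload(doc):
    if not doc.lines:
        raise GateError("upload: empty document")
    return doc


def ocr_quality_gate(doc, min_chars=10):
    # Stand-in check: a real gate would use OCR confidence, not text length.
    if sum(len(line) for line in doc.lines) < min_chars:
        raise GateError("ocr: too little legible text")
    return doc


def ai_extraction(doc):
    # Stub for the LLM step: read "key: value" lines and keep the line index,
    # so every value stays linked to the source line it came from.
    for i, line in enumerate(doc.lines):
        if ":" in line:
            key, value = line.split(":", 1)
            doc.fields[key.strip().lower()] = {
                "value": value.strip(),
                "source_line": i,
            }
    return doc


def validation(doc, required=("hb",)):
    missing = [k for k in required if k not in doc.fields]
    if missing:
        raise GateError(f"validation: missing {missing}")
    return doc


def export(doc):
    doc.log.append("exported")
    return doc


STAGES = [upload, ocr_quality_gate, ai_extraction, validation, export]


def run_pipeline(doc):
    """Run the stages in order and stop at the first gate that fails."""
    for stage in STAGES:
        try:
            doc = stage(doc)
            doc.log.append(f"{stage.__name__}: pass")
        except GateError as err:
            doc.log.append(f"held for review ({err})")
            break
    return doc
```

In this sketch a document that fails any gate never reaches `export`, and its log records which gate stopped it, which is the property the gated design is meant to guarantee.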

The intelligence behind it

Two AI layers. Both running locally.

OCR Layer

Reads scanned & handwritten documents, block-by-block

dots.ocr · Tesseract 5 · EasyOCR — ensemble OCR. Low-confidence blocks are auto-flagged for human review.
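
How the ensemble combines its engines is not described here, so the following Python sketch shows only one plausible rule: keep the most confident reading for each block, and flag the block when that confidence is low or the engines disagree. The `Reading` class, the `merge_block` function, the 0.80 threshold and the agreement rule are all assumptions for illustration. No OCR engine is called; the readings are passed in as plain data.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    engine: str        # a label such as "tesseract"; no engine is invoked here
    text: str
    confidence: float  # assumed to be normalised to the 0..1 range


def merge_block(readings, min_confidence=0.80):
    """Pick the most confident reading for one block and decide on review.

    The 0.80 threshold and the agreement rule are illustrative assumptions.
    """
    best = max(readings, key=lambda r: r.confidence)
    # Engines "agree" when their text matches after ignoring case and spacing.
    normalised = {" ".join(r.text.lower().split()) for r in readings}
    needs_review = best.confidence < min_confidence or len(normalised) > 1
    return {
        "text": best.text,
        "engine": best.engine,
        "confidence": best.confidence,
        "needs_review": needs_review,
    }
```

Treating disagreement as a review trigger is deliberately conservative: two engines that read a dose differently are a reason to show the block to a human even when one of them reports high confidence.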

Local LLMs via Ollama

Schema-constrained extraction, validation & summarisation

Nemotron-mini — extracts schema-constrained clinical variables.

DeepSeek-R1 (8B/14B) — validates findings, normalises units, generates patient summaries.

Llama 3.1 — powers a clinical assistant chatbot across all documents.

All of this runs on consumer-grade hardware (Apple Silicon M4). No cloud APIs. No GPU clusters.
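
As a rough illustration of schema-constrained extraction against a local model, the Python sketch below builds a request for Ollama's `/api/generate` endpoint and passes a JSON schema in the `format` field, which recent Ollama versions use to constrain the output. The schema fields, the prompt wording and the helper names `build_request` and `extract` are hypothetical, and the model tag `nemotron-mini` is assumed to have been pulled locally.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local port

# Hypothetical schema: the real variable list would come from the study protocol.
SCHEMA = {
    "type": "object",
    "properties": {
        "haemoglobin_g_dl": {"type": "number"},
        "diagnosis": {"type": "string"},
    },
}


def build_request(note_text, model="nemotron-mini"):
    """Build the JSON body for a schema-constrained, non-streaming call."""
    return {
        "model": model,
        "prompt": "Extract the requested fields from this clinical note. "
                  "Omit a field when it is not stated.\n\n" + note_text,
        "format": SCHEMA,   # asks Ollama to constrain output to this schema
        "stream": False,
        "options": {"temperature": 0},
    }


def extract(note_text):
    """Send the request to a locally running Ollama server and parse the reply."""
    body = json.dumps(build_request(note_text)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        reply = json.load(resp)
    # The generated text arrives as a JSON string in the "response" field.
    return json.loads(reply["response"])
```

Because the request goes to `localhost`, the note text never leaves the machine. Calling `extract` requires a running Ollama server; `build_request` can be inspected without one.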

Why local

How DOCUMENTITIS differs from existing tools.

	Cloud Clinical AI	DOCUMENTITIS
Data location	Sent to cloud	Stays in hospital
Infrastructure	Data-center GPUs	Consumer hardware
Traceability	Black-box outputs	Source-linked, audit-grade
Regulatory fit	Difficult in India / EU	Built natively for data sovereignty
Cost to hospital	High recurring fees	One-time deployment

Where this fits best

Built for clinical research teams.

Retrospective Research

Automates chart abstraction for studies on sepsis, readmission rates and oncology outcomes — turning months of manual review into hours.

Multi-Site Trial Consistency

Standardises extraction across hospitals, eliminating the site-to-site inter-rater variability that compromises trial integrity.

Adverse Event Detection

Continuously evaluates records to flag abnormal lab values and safety signals — supporting real-time pharmacovigilance.
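
Flagging abnormal lab values can be sketched as a range check over extracted results. The Python example below is a simplified illustration, not the DOCUMENTITIS logic: the `flag_abnormal` function, the result keys and the two reference ranges are assumptions, and real limits depend on the laboratory and the site.

```python
# Illustrative reference ranges only; real limits are lab- and site-specific.
REFERENCE_RANGES = {
    "potassium_mmol_l": (3.5, 5.0),
    "sodium_mmol_l": (135.0, 145.0),
}


def flag_abnormal(results):
    """Return one flag per result that falls outside its reference range.

    `results` is a list of dicts with hypothetical keys: test, value, source_line.
    Tests with no known range are skipped rather than guessed at.
    """
    flags = []
    for r in results:
        limits = REFERENCE_RANGES.get(r["test"])
        if limits is None:
            continue
        low, high = limits
        if r["value"] < low:
            flags.append({**r, "flag": "low"})
        elif r["value"] > high:
            flags.append({**r, "flag": "high"})
    return flags
```

Each flag keeps the `source_line` of the original result, so a reviewer can go straight from the safety signal to the line in the document that produced it.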

Built within legal & ethical frameworks

Trust is the architecture — not a feature.

HIPAA-aligned

GDPR-aligned

FDA 21 CFR Part 11

ICH E6 (R2) GCP

Any extraction with a confidence score below 0.89 is automatically paused and flagged for mandatory clinician verification. Development uses strictly de-identified records only.
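
The 0.89 rule amounts to a single routing decision per extraction. Here is a minimal Python sketch, assuming each extraction is a dict with a `confidence` key; the `route` function and the dict layout are hypothetical.

```python
REVIEW_THRESHOLD = 0.89  # below this, a clinician must verify the extraction


def route(extractions):
    """Split extractions into auto-accepted and held-for-review lists."""
    accepted, held = [], []
    for item in extractions:
        if item["confidence"] < REVIEW_THRESHOLD:
            held.append(item)
        else:
            accepted.append(item)
    return accepted, held
```

An extraction scoring exactly 0.89 is accepted in this sketch, because the rule as stated pauses only scores below the threshold.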

What we're seeking

A clinical research partner.

We're three medical students with a working v1.0.0. Here's where the right institution changes everything.

Institutional Partnership

A clinical research institution willing to host a pilot deployment, strictly on de-identified historical data.

Validation Dataset Access

Permission to validate our extraction accuracy against ground-truth data, under proper ethical clearance.

Mentorship & Direction

Clinical research expertise to help us refine which use cases to prioritise first.

admin@produsa.dev

A working console, today.

Clinical Intelligence Dashboard

Cross-Document Patient Timeline

AI Patient Summary

Quality & Audit Queue

Clinical Assistant Chatbot

Open the full console →
