PDF BulkX Enterprise

Extract hundreds of PDFs into Excel — set up once, run forever. Self-hosted, PDPA-aligned, use your own AI keys.

More flexible than OCR. Faster than AI. Runs on your server.

Just want to try without IT setup?

PDF BulkX Cloud gives you self-serve extraction in 5 minutes. 8 free credits/month, no card required.

Try PDF BulkX Cloud Free
😤

The Problem

Businesses receive PDFs daily — invoices, reports, forms — and manually re-key data into spreadsheets. It's slow, error-prone, and doesn't scale.

The Solution

With PDF BulkX, you define what data to extract and upload sample PDFs, and AI generates the extraction logic. You then run bulk extractions across hundreds of files and download a clean Excel result.

How PDF BulkX Works

From raw PDFs to clean Excel data in 4 simple steps

1

Upload Sample PDFs

Upload representative sample files of your PDF type

2

Define Columns

Name the fields to extract (e.g. Invoice Number, Date, Total)

3

Generate & Extract

AI builds extraction logic or reads PDFs directly — picks the best mode automatically

4

Run Bulk Extraction

Upload hundreds of PDFs, download one clean Excel file

Two Extraction Modes

PDF BulkX automatically picks the best mode for your PDFs

Default

Logic Mode

AI learns your PDF layout during setup, then extraction runs locally as deterministic logic — fast, repeatable, highly accurate on consistent layouts.

  • Fast & deterministic
  • Works offline after setup
  • Higher accuracy than per-run AI
  • Best for consistent layouts

Logic Mode is built for speed and accuracy at scale. Once configured, extraction runs as deterministic logic — same input produces the same output every time. No AI interpretation drift between runs. Process thousands of PDFs in minutes, with the same structured columns on every bulk run.
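To make "deterministic logic" concrete, here is a minimal sketch that applies fixed per-column patterns to text already read from a PDF. It is a hypothetical illustration only, not the logic PDF BulkX generates: the `RULES` table, the `extract_row` function, and the sample text are all assumptions, and the column names simply echo the example fields from step 2.

```python
import re

# Hypothetical per-column rules. In the product, AI generates the real
# extraction logic during setup; these patterns are illustrative only.
RULES = {
    "Invoice Number": re.compile(r"Invoice\s*(?:No\.?|Number)\s*[:#]?\s*(\S+)"),
    "Date": re.compile(r"Date\s*:?\s*(\d{4}-\d{2}-\d{2})"),
    "Total": re.compile(r"Total\s*:?\s*([\d,]+\.\d{2})"),
}


def extract_row(text: str) -> dict:
    """Apply each fixed rule to the page text; missing fields become None."""
    row = {}
    for column, pattern in RULES.items():
        match = pattern.search(text)
        row[column] = match.group(1) if match else None
    return row


sample = "Invoice No: INV-0042\nDate: 2024-03-15\nTotal: 12,500.00"
first = extract_row(sample)
second = extract_row(sample)  # same input, same output, with no AI call involved
```

Because the rules are fixed after setup, running the same file twice yields an identical row, which is the property the paragraph above describes.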

Fallback

AI Extract Mode

AI reads each PDF directly on every run. Handles scanned PDFs and varied/inconsistent layouts.

  • Handles scanned PDFs
  • Works with varied layouts
  • Multi-language support
  • Auto-fallback when needed

AI Extract Mode is built for flexibility — it handles scanned PDFs, varied layouts, multilingual documents, and supplier format changes without re-configuration. Pair it with your own Gemini API key (more providers coming): choose the AI model that fits your organisation, and keep direct control of your AI billing costs.

Key Features

🖥️

Local Execution

Bulk extraction runs on your own server — your documents stay within your network

💰

Predictable Cost

Perpetual license — no per-page AI billing from us

👤

Non-Technical Friendly

Anyone can create a new PDF extraction project without IT involvement

📄

Scanned PDF Support

AI fallback handles scanned and messy PDFs

🔌

REST API Ready

Integrate with your existing systems

📚

Thai Manual

Thai-language manual included in the product

Refinement Prompts

Fine-tune extraction results with natural language instructions

Target Use Cases

📑

Accounting & Finance

Extract data from invoices, purchase orders, receipts, and financial statements into structured Excel files.

🏦

Banking & Insurance

Digitize bank statements, loan documents, and insurance forms for faster processing.

🚚

Logistics

Process shipping documents, customs forms, and delivery receipts at scale.

🏛️

Government & Compliance

Extract data from digital forms, regulatory reports, and compliance documents.

🔄

Any PDF Workflow

Any repetitive PDF-to-spreadsheet workflow — if it has a consistent structure, PDF BulkX can extract it.

Frequently Asked Questions

What's the difference between Logic Mode and AI Extract Mode?

Logic Mode uses AI once during setup to learn your layout, then runs locally, which makes it fast and deterministic. AI Extract Mode sends each PDF to an AI provider on every run, which lets it handle scanned PDFs and inconsistent layouts.

What types of PDFs are supported?

PDF BulkX works with both digital (machine-readable) and scanned PDFs. Logic Mode handles digital PDFs best, while AI Extract Mode can process scanned documents.

Can it handle scanned PDFs?

Yes. AI Extract Mode processes scanned PDFs using your chosen AI provider. Logic Mode works best with machine-readable PDFs but falls back automatically to AI Extract Mode for unreadable files.
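The fallback idea can be sketched roughly as follows. This is an assumption-laden illustration, not the product's actual routing: `choose_mode`, `extract`, and the `MIN_TEXT_CHARS` threshold are hypothetical names, and the two extractors are stubs.

```python
from typing import Callable, Dict

MIN_TEXT_CHARS = 20  # assumed cut-off for a usable text layer; not a product setting


def choose_mode(text_layer: str) -> str:
    """Pick 'logic' for machine-readable text, 'ai_extract' for scans or empty text."""
    return "logic" if len(text_layer.strip()) >= MIN_TEXT_CHARS else "ai_extract"


def extract(text_layer: str,
            run_logic: Callable[[str], Dict[str, str]],
            run_ai: Callable[[], Dict[str, str]]) -> Dict[str, str]:
    """Run the chosen extractor and record which mode produced the row."""
    mode = choose_mode(text_layer)
    row = run_logic(text_layer) if mode == "logic" else run_ai()
    return {"mode": mode, **row}


# Stub extractors stand in for the real ones, which this sketch does not model.
digital = extract("Invoice No: INV-0042 Total: 100.00",
                  run_logic=lambda text: {"Total": "100.00"},
                  run_ai=lambda: {"Total": "unused"})
scanned = extract("",
                  run_logic=lambda text: {"Total": "unused"},
                  run_ai=lambda: {"Total": "100.00"})
```

A scanned file with no text layer ends up in the AI branch, while a digital file stays on the local, deterministic path.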

What are refinement prompts?

Refinement prompts let you fine-tune extraction results using natural language instructions. For example, you can tell the system to format dates differently or combine certain columns.

What are the deployment requirements?

PDF BulkX runs on any Linux server with Docker, PostgreSQL, and Redis. It's self-hosted — you maintain full control of your infrastructure.

Does it work offline?

Logic Mode extraction runs fully offline after the initial setup. AI Extract Mode requires an internet connection to reach AI providers. Full local AI support is coming soon.

Is PDF BulkX PDPA-compliant?

PDF BulkX Enterprise is designed to align with PDPA requirements. Because it runs on your own server and uses your own AI keys for AI Extract Mode, you retain full control over personal data end-to-end, which is the foundation of PDPA compliance. In Logic Mode, documents never leave your network. In AI Extract Mode, extraction routes through your own AI provider key with full audit logs.

Can we deploy on AWS/Azure/GCP private cloud, or only on-prem?

Both. PDF BulkX Enterprise runs in Docker, so it deploys anywhere Docker runs — on-prem Linux, AWS EC2, Azure VM, GCP Compute, or any Thai-local VPS provider. "Self-hosted" means on infrastructure you control, not necessarily on-premise.

How long does setup take for a new PDF type?

For Logic Mode, first-time setup per PDF type takes 3–15 minutes, depending on document complexity. You upload a few sample PDFs, define the columns to extract, and the AI generates extraction logic that runs locally for every future bulk run. AI Extract Mode setup is shorter: you only define the columns, since the AI reads each PDF directly on every run. After setup, the system is ready for bulk extraction. Typical throughput is 400+ pages per minute per slot in Logic Mode and 12 pages per minute per slot in AI Extract Mode. Your first run can be ready before the first coffee.
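For a rough feel of what those per-slot rates mean for a real batch, the sketch below turns them into run-time estimates. The function name is hypothetical, and the linear scaling across slots is an assumption rather than a published figure; since Logic Mode is quoted as "400+", its estimate is an upper bound.

```python
# Pages per minute per slot, taken from the figures quoted above.
PAGES_PER_MIN = {"logic": 400, "ai_extract": 12}


def estimated_minutes(pages: int, mode: str, slots: int = 1) -> float:
    """Rough run time, assuming throughput scales linearly with slots."""
    return pages / (PAGES_PER_MIN[mode] * slots)


logic_minutes = estimated_minutes(10_000, "logic")      # 25.0
ai_minutes = estimated_minutes(10_000, "ai_extract")    # about 833.3
```

On these numbers, a 10,000-page batch takes about 25 minutes in Logic Mode and roughly 14 hours in AI Extract Mode on a single slot.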

How can I try PDF BulkX Enterprise?

Trying PDF BulkX Enterprise starts with a demo, then a Proof of Concept on our online test server using your real PDFs — no install required on your side. If it fits, we issue a quotation, install on your server, deliver training, and complete a User Acceptance Test (UAT). The Purchase Order is issued after UAT passes. Three months of free Tier 1 customer support is included from that point.

Transparent Pricing

Predictable cost. Cheaper than Cloud at scale.

PDF BulkX Enterprise License — perpetual, paid once

Installation (one-time, with first slot) ฿100,000
1st concurrent slot (perpetual) ฿150,000
Additional slots (2–4, perpetual) ฿100,000 each
Slots 5+ (perpetual, 15% off) ฿85,000 each
10+ slots Enterprise Agreement — contact us

Total year-1 cost by slot count:

  • 1 slot: ฿250,000 (install + ฿150K)
  • 3 slots: ฿450,000 (install + ฿150K + 2×฿100K)
  • 5 slots: ฿635,000 (install + ฿150K + 3×฿100K + ฿85K)
  • 9 slots: ฿975,000 (install + ฿150K + 3×฿100K + 5×฿85K)

  • 1 slot ≈ 400+ pages/min in Logic Mode, 12 pages/min in AI Extract Mode
  • Use your own AI keys (Gemini today; no AI billing from us)
  • Year 2+ optional support: ฿20,000/year/slot (see Support & Maintenance below)
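The tiered pricing above can be checked with a short calculation. This is a sketch built only from the listed prices; `license_cost` is a hypothetical helper name, and 10+ slots are excluded because the page routes them to an Enterprise Agreement.

```python
INSTALL = 100_000       # one-time, with first slot
FIRST_SLOT = 150_000    # 1st concurrent slot
SLOT_2_TO_4 = 100_000   # each of slots 2-4
SLOT_5_PLUS = 85_000    # each of slots 5-9


def license_cost(slots: int) -> int:
    """Year-1 license cost in baht for 1 to 9 slots."""
    if not 1 <= slots <= 9:
        raise ValueError("10+ slots are quoted under an Enterprise Agreement")
    mid_tier = min(slots - 1, 3)    # how many of slots 2-4 are bought
    high_tier = max(slots - 4, 0)   # how many slots from the 5th onward
    return INSTALL + FIRST_SLOT + mid_tier * SLOT_2_TO_4 + high_tier * SLOT_5_PLUS


costs = {n: license_cost(n) for n in (1, 3, 5, 9)}
# {1: 250000, 3: 450000, 5: 635000, 9: 975000}
```

The results match the four totals listed above.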

Support & Maintenance

INCLUDED WITH EVERY LICENSE

  • 3 months Standard support (email + LINE, 48-hr SLA, business hours)
  • Security patches — indefinitely
  • AI provider compatibility updates
  • Thai + English product manuals

After 3 months, pick one:

Self-serve (Free)

Free

  • Docs + community
  • Best-effort fixes
  • No SLA

Ad-hoc

  • Subscriber: ฿3,000/hr
  • Non-subscriber: ฿3,500/hr
  • Urgent +50%
  • 1-hr minimum

v2 Major Upgrade (when released):

  • Active Subscription customers: free
  • All other v1 customers: ฿75,000/slot (50% off list)

ROI vs hiring — at two workload levels

"All-in" FTE cost = base salary + employer social security + workers' compensation + 1-month year-end bonus + workspace + equipment.

฿25,000/month all-in ≈ ฿17,000–฿18,000 listed base salary.

Each FTE handles ~4,000 PDFs/month (at 2 min/file × 6 productive hrs/day). 1 slot of PDF BulkX Enterprise handles tens of thousands of PDFs/month.
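The ~4,000 PDFs/month figure follows from the stated pace. The sketch below reproduces it; the 22 working days per month is an assumption, since the page does not state it.

```python
MINUTES_PER_FILE = 2
PRODUCTIVE_HOURS_PER_DAY = 6
WORKING_DAYS_PER_MONTH = 22  # assumption, not stated on the page

files_per_day = PRODUCTIVE_HOURS_PER_DAY * 60 // MINUTES_PER_FILE   # 180
files_per_month = files_per_day * WORKING_DAYS_PER_MONTH            # 3960, i.e. ~4,000
```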

Scenario A — Replace 1 person handling ~4,000 PDFs/month:

Hire 1 FTE for 1 year: ฿350,000
Hire 1 FTE for 3 years: ฿1,050,000
PDF BulkX Enterprise (1 slot, year 1): ~฿252,000*
PDF BulkX Enterprise (1 slot, 3-year total): ~฿296,000*
Year 1 saving vs hiring: ~฿98,000
3-year saving vs hiring: ~฿754,000

Scenario B — Replace a 3-person team handling ~12,000 PDFs/month:

Hire 3 FTEs for 1 year: ฿1,050,000
Hire 3 FTEs for 3 years: ฿3,150,000
PDF BulkX Enterprise (1 slot, year 1): ~฿257,000*
PDF BulkX Enterprise (1 slot, 3-year total): ~฿310,000*
Year 1 saving vs hiring: ~฿793,000
3-year saving vs hiring: ~฿2,840,000

The license is perpetual — Year 2+ costs only the optional ฿20,000/year/slot support subscription. Every subsequent year, the gap vs hiring widens.

* Estimated AI cost ~฿2,000/year per slot (Scenario A) / ~฿6,500/year per slot (Scenario B).
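The scenario figures can be reproduced with the sketch below. It assumes the 3-year totals include the optional ฿20,000/year support subscription in years 2 and 3 plus the estimated AI cost every year, which appears to be how the numbers above are built; the function names are hypothetical. The page rounds its totals, so the Scenario B results here differ by a few hundred baht.

```python
LICENSE_1_SLOT = 250_000    # installation + first slot
SUPPORT_PER_YEAR = 20_000   # optional subscription, from year 2 onward
FTE_PER_YEAR = 350_000      # per-FTE yearly cost used in both scenarios


def bulkx_total(years: int, ai_cost_per_year: int) -> int:
    """License, plus support from year 2, plus estimated AI cost each year."""
    return LICENSE_1_SLOT + SUPPORT_PER_YEAR * (years - 1) + ai_cost_per_year * years


def saving(years: int, ftes: int, ai_cost_per_year: int) -> int:
    """Hiring cost minus PDF BulkX cost over the same period."""
    return FTE_PER_YEAR * ftes * years - bulkx_total(years, ai_cost_per_year)


scenario_a = (saving(1, 1, 2_000), saving(3, 1, 2_000))   # (98000, 754000)
scenario_b = (saving(1, 3, 6_500), saving(3, 3, 6_500))   # (793500, 2840500)
```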

Ready to Automate Your PDF Processing?

See PDF BulkX Enterprise in action with your own documents

Request a Demo

Get more information

Tell us about your use case and we'll get back to you within 24 hours