PDF BulkX Enterprise
Extract hundreds of PDFs into Excel — set up once, run forever. Self-hosted, PDPA-aligned, use your own AI keys.
More flexible than OCR. Faster than AI. Runs on your server.
The Problem
Businesses receive PDFs daily — invoices, reports, forms — and manually re-key data into spreadsheets. It's slow, error-prone, and doesn't scale.
The Solution
With PDF BulkX, you define what data to extract and upload sample PDFs, and the AI generates the extraction logic. You can then run bulk extractions across hundreds of files and download a clean Excel result.
How PDF BulkX Works
From raw PDFs to clean Excel data in 4 simple steps
Upload Sample PDFs
Upload representative sample files of your PDF type
Define Columns
Name the fields to extract (e.g. Invoice Number, Date, Total)
Generate & Extract
AI builds extraction logic or reads PDFs directly — picks the best mode automatically
Run Bulk Extraction
Upload hundreds of PDFs, download one clean Excel file
Two Extraction Modes
PDF BulkX automatically picks the best mode for your PDFs
Logic Mode
AI learns your PDF layout during setup, then extraction runs locally as deterministic logic — fast, repeatable, highly accurate on consistent layouts.
- Fast & deterministic
- Works offline after setup
- Higher accuracy than per-run AI
- Best for consistent layouts
Logic Mode is built for speed and accuracy at scale. Once configured, extraction runs as deterministic logic — same input produces the same output every time. No AI interpretation drift between runs. Process thousands of PDFs in minutes, with the same structured columns on every bulk run.
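To make "deterministic logic" concrete, here is a minimal sketch of what fixed extraction rules could look like once a layout is known. It assumes the PDF text has already been extracted to a string. The field names, regular expressions, and the `extract_fields` helper are hypothetical illustrations, not the logic PDF BulkX actually generates.

```python
import re

# Hypothetical patterns for one invoice layout. These names and regexes are
# illustrative assumptions, not output produced by PDF BulkX.
FIELD_PATTERNS = {
    "Invoice Number": re.compile(r"Invoice\s+No\.?\s*:\s*(\S+)"),
    "Date": re.compile(r"Date\s*:\s*(\d{4}-\d{2}-\d{2})"),
    "Total": re.compile(r"Total\s*:\s*([\d,]+\.\d{2})"),
}


def extract_fields(page_text: str) -> dict:
    """Apply each fixed pattern to the text; fields not found become None."""
    row = {}
    for column, pattern in FIELD_PATTERNS.items():
        match = pattern.search(page_text)
        row[column] = match.group(1) if match else None
    return row


sample_text = "Invoice No: INV-0042\nDate: 2024-03-15\nTotal: 12,500.00"

# Running the same rules twice on the same input gives identical rows.
first_run = extract_fields(sample_text)
second_run = extract_fields(sample_text)
print(first_run)
print(first_run == second_run)
```

Because the rules are fixed after setup, the same text always yields the same row, which is the property the paragraph above describes. A layout that deviates from the patterns yields empty fields rather than a guess.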
AI Extract Mode
AI reads each PDF directly on every run. Handles scanned PDFs and varied/inconsistent layouts.
- Handles scanned PDFs
- Works with varied layouts
- Multi-language support
- Auto-fallback when needed
AI Extract Mode is built for flexibility — it handles scanned PDFs, varied layouts, multilingual documents, and supplier format changes without re-configuration. Pair it with your own Gemini API key (more providers coming): choose the AI model that fits your organisation, and keep direct control of your AI billing costs.
Key Features
Local Execution
Bulk extraction runs on your own server — your documents stay within your network
Predictable Cost
Perpetual license — no per-page AI billing from us
Non-Technical Friendly
Anyone can create a new PDF extraction project without IT involvement
Scanned PDF Support
AI fallback handles scanned and messy PDFs
REST API Ready
Integrate with your existing systems
Thai Manual
A Thai-language manual is included with the product
Refinement Prompts
Fine-tune extraction results with natural language instructions
Target Use Cases
Accounting & Finance
Extract data from invoices, purchase orders, receipts, and financial statements into structured Excel files.
Banking & Insurance
Digitize bank statements, loan documents, and insurance forms for faster processing.
Logistics
Process shipping documents, customs forms, and delivery receipts at scale.
Government & Compliance
Extract data from digital forms, regulatory reports, and compliance documents.
Any PDF Workflow
Any repetitive PDF-to-spreadsheet workflow — if it has a consistent structure, PDF BulkX can extract it.
Frequently Asked Questions
What's the difference between Logic Mode and AI Extract Mode?
Logic Mode uses AI once during setup to learn your layout, then runs locally — fast and deterministic. AI Extract Mode sends each PDF to an AI provider on every run — handles scanned PDFs and inconsistent layouts.
What types of PDFs are supported?
PDF BulkX works with both digital (machine-readable) and scanned PDFs. Logic Mode handles digital PDFs best, while AI Extract Mode can process scanned documents.
Can it handle scanned PDFs?
Yes. AI Extract Mode processes scanned PDFs using your chosen AI provider. Logic Mode works best with machine-readable PDFs but will automatically fall back to AI Extract Mode for unreadable files.
What are refinement prompts?
Refinement prompts let you fine-tune extraction results using natural language instructions. For example, you can tell the system to format dates differently or combine certain columns.
What are the deployment requirements?
PDF BulkX runs on any Linux server with Docker, PostgreSQL, and Redis. It's self-hosted — you maintain full control of your infrastructure.
Does it work offline?
Logic Mode extraction runs fully offline after the initial setup. AI Extract Mode requires an internet connection to reach AI providers. Full local AI support is coming soon.
Is PDF BulkX PDPA-compliant?
PDF BulkX Enterprise is designed to align with PDPA requirements. Because it runs on your own server and uses your own AI keys for AI Extract Mode, you retain full control over personal data end-to-end, which is the foundation of PDPA compliance. In Logic Mode, documents never leave your network. In AI Extract Mode, extraction routes through your own AI provider key with full audit logs.
Can we deploy on AWS/Azure/GCP private cloud, or only on-prem?
Both. PDF BulkX Enterprise runs in Docker, so it deploys anywhere Docker runs — on-prem Linux, AWS EC2, Azure VM, GCP Compute, or any Thai-local VPS provider. "Self-hosted" means on infrastructure you control, not necessarily on-premise.
How long does setup take for a new PDF type?
For Logic Mode, first-time setup takes 3 to 15 minutes per PDF type, depending on document complexity. You upload a few sample PDFs, define the columns to extract, and the AI generates extraction logic that runs locally on every future bulk run. AI Extract Mode setup is shorter: you only define the columns, since the AI reads each PDF directly on each run. After setup, the system is ready for bulk extraction. Typical throughput is 400+ pages per minute per slot in Logic Mode and 12 pages per minute per slot in AI Extract Mode. Your first run can be ready before the first coffee.
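As a rough planning aid, the per-slot throughput figures above can be turned into run-time estimates. The sketch below is only arithmetic on the quoted numbers. It assumes throughput scales linearly with the number of slots, which the text does not state, and the `minutes_to_process` helper is a hypothetical name.

```python
import math

# Per-slot throughput quoted in the text, in pages per minute.
LOGIC_MODE_PPM = 400       # the text says "400+", so treat this as a lower bound
AI_EXTRACT_MODE_PPM = 12


def minutes_to_process(pages: int, pages_per_minute: int, slots: int = 1) -> int:
    """Whole minutes needed for a batch.

    Assumption: throughput scales linearly with slot count.
    """
    return math.ceil(pages / (pages_per_minute * slots))


batch_pages = 10_000
logic_minutes = minutes_to_process(batch_pages, LOGIC_MODE_PPM)
ai_minutes = minutes_to_process(batch_pages, AI_EXTRACT_MODE_PPM)
print(logic_minutes, ai_minutes)
```

For a 10,000-page batch on one slot, this gives about 25 minutes in Logic Mode and about 834 minutes (roughly 14 hours) in AI Extract Mode, which is why consistent layouts are worth setting up in Logic Mode.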
How can I try PDF BulkX Enterprise?
Trying PDF BulkX Enterprise starts with a demo, then a Proof of Concept on our online test server using your real PDFs — no install required on your side. If it fits, we issue a quotation, install on your server, deliver training, and complete a User Acceptance Test (UAT). The Purchase Order is issued after UAT passes. Three months of free Tier 1 customer support is included from that point.
Transparent Pricing
Predictable cost. Cheaper than Cloud at scale.
PDF BulkX Enterprise License — perpetual, paid once
Total year-1 cost by slot count:
- 1 slot: ฿250,000 (install + ฿150K)
- 3 slots: ฿450,000 (install + ฿150K + 2×฿100K)
- 5 slots: ฿635,000 (install + ฿150K + 3×฿100K + ฿85K)
- 9 slots: ฿975,000 (install + ฿150K + 3×฿100K + 5×฿85K)
- 1 slot ≈ 400+ pages/min in Logic Mode, 12 pages/min in AI Extract Mode
- Use your own AI keys (Gemini today; no AI billing from us)
- Year 2+ optional support: ฿20,000/year/slot (see Support & Maintenance below)
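The year-1 totals above follow a tiered per-slot price. The sketch below reproduces the four listed totals. The ฿100,000 install fee is not stated directly: it is inferred from the 1-slot line (฿250,000 minus the ฿150K first slot). Prices for slot counts not listed above, and the `year_one_total` helper itself, are assumptions drawn from the same pattern, so confirm with a quotation.

```python
INSTALL_FEE = 100_000  # inferred from the 1-slot total: 250,000 - 150,000


def slot_price(slot_number: int) -> int:
    """Price in baht of the n-th slot, read off the breakdowns in the list above."""
    if slot_number == 1:
        return 150_000
    if slot_number <= 4:   # slots 2 to 4
        return 100_000
    return 85_000          # slot 5 onward (assumed to continue past 9)


def year_one_total(slots: int) -> int:
    """Install fee plus the tiered license price for the given slot count."""
    if slots < 1:
        raise ValueError("at least one slot is required")
    return INSTALL_FEE + sum(slot_price(n) for n in range(1, slots + 1))


for slots in (1, 3, 5, 9):
    print(slots, year_one_total(slots))
```

The printed totals match the listed figures of ฿250,000, ฿450,000, ฿635,000, and ฿975,000.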
Support & Maintenance
INCLUDED WITH EVERY LICENSE
- 3 months Standard support (email + LINE, 48-hr SLA, business hours)
- Security patches — indefinitely
- AI provider compatibility updates
- Thai + English product manuals
After 3 months, pick one:
Self-serve
Free
- Docs + community
- Best-effort fixes
- No SLA
Subscription
฿20,000/yr/slot
- 8 tickets/year
- 48-hour SLA
- Priority queue
- Multi-ticket for complex issues
Ad-hoc
- Subscriber: ฿3,000/hr
- Non-subscriber: ฿3,500/hr
- Urgent +50%
- 1-hr minimum
v2 Major Upgrade (when released):
- Active Subscription customers: free
- All other v1 customers: ฿75,000/slot (50% off list)
ROI vs hiring — at two workload levels
"All-in" FTE cost = base salary + employer social security + workers' compensation + 1-month year-end bonus + workspace + equipment.
฿25,000/month all-in ≈ ฿17,000–฿18,000 listed base salary.
Each FTE handles ~4,000 PDFs/month (at 2 min/file × 6 productive hrs/day). 1 slot of PDF BulkX Enterprise handles tens of thousands of PDFs/month.
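The ~4,000 PDFs/month figure can be checked from the stated inputs. The sketch below assumes 22 working days per month, which the text does not state. The other two inputs come from the line above.

```python
MINUTES_PER_FILE = 2            # from the text
PRODUCTIVE_HOURS_PER_DAY = 6    # from the text
WORKING_DAYS_PER_MONTH = 22     # assumption: not stated in the text

# Files one person can re-key per day and per month.
files_per_day = PRODUCTIVE_HOURS_PER_DAY * 60 // MINUTES_PER_FILE
files_per_month = files_per_day * WORKING_DAYS_PER_MONTH
print(files_per_day, files_per_month)
```

This gives 180 files per day and 3,960 per month, consistent with the rounded ~4,000 figure.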
Scenario A — Replace 1 person handling ~4,000 PDFs/month:
Scenario B — Replace a 3-person team handling ~12,000 PDFs/month:
The license is perpetual — Year 2+ costs only the optional ฿20,000/year/slot support subscription. Every subsequent year, the gap vs hiring widens.
* Estimated AI cost ~฿2,000/year per slot (Scenario A) / ~฿6,500/year per slot (Scenario B).
Ready to Automate Your PDF Processing?
See PDF BulkX Enterprise in action with your own documents
Request a Demo
Get more information
Tell us about your use case and we'll get back to you within 24 hours