Home
Teal is an API for working with PDF documents. Whether you want to automate PDF processing or add PDF functionality to an existing workflow, Teal covers the common tasks through a single interface.
For the source code, see https://github.com/rueedlinger/teal.
Key Features
- Digitize documents to searchable PDF or archivable PDF/A.
- Extract metadata, text, and tables as structured data.
- Convert different document types to PDF.
- Convert PDFs to PDF/A.
- Check PDF/A compliance.
Understanding Different Types of PDFs
Digitally Created PDFs:
- Created using software like Microsoft Word or Excel, or via the "print" function within applications.
- Contain text and images, with the text stored as electronic characters rather than as pixels.
- Text and images can be easily edited, searched, and manipulated.
Image-only PDFs:
- Generated from scanned hard copy documents or images.
- Content is locked in a snapshot-like image without a text layer.
- Not searchable or editable without OCR (Optical Character Recognition).
Searchable PDFs:
- Result from applying OCR to scanned or image-based documents.
- Have a text layer added underneath the image layer, making them fully searchable.
- Text can be selected, copied, and marked up just as in digitally created PDFs.
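The three categories above come down to two questions per page: is the page a scanned image, and does it carry a text layer? The following is a minimal sketch of that decision logic, not Teal's implementation. `PageInfo`, `PdfKind`, and `classify_pdf` are hypothetical names introduced for illustration, and the sketch assumes the per-page facts have already been extracted by some PDF library.

```python
from dataclasses import dataclass
from enum import Enum


class PdfKind(Enum):
    DIGITAL = "digitally created"
    IMAGE_ONLY = "image-only"
    SEARCHABLE = "searchable"


@dataclass
class PageInfo:
    # Hypothetical per-page facts; a real tool would read them from the PDF.
    has_text_layer: bool       # the page carries selectable text
    has_full_page_image: bool  # the page is covered by a scanned image


def classify_pdf(pages: list[PageInfo]) -> PdfKind:
    """Heuristically map a document to one of the three PDF types."""
    if not pages:
        raise ValueError("document has no pages")

    scanned = [p for p in pages if p.has_full_page_image]
    if not scanned:
        # No scanned pages: treat the document as digitally created.
        return PdfKind.DIGITAL
    if all(p.has_text_layer for p in scanned):
        # Every scanned page has a text layer, so OCR has been applied.
        return PdfKind.SEARCHABLE
    # At least one scanned page has no text layer and still needs OCR.
    return PdfKind.IMAGE_ONLY


if __name__ == "__main__":
    scan_without_ocr = [PageInfo(has_text_layer=False, has_full_page_image=True)]
    print(classify_pdf(scan_without_ocr).value)
```

Treating a mixed document as image-only whenever any scanned page lacks a text layer is a deliberate choice in this sketch: such a document still needs OCR before it is fully searchable.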
Alternatives
Here are some alternatives to Teal:
- Gotenberg converts documents to PDF using LibreOffice and Chromium.
- OCRmyPDF is designed as a command-line tool, but it can also be used in a web service.
- Paperless-ngx is a community-supported open-source document management system.