Home

Teal is an API for working with PDF documents. It is aimed at developers who want to automate PDF processing or add PDF functionality to an existing workflow.

For the source code, see https://github.com/rueedlinger/teal.

Key Features

  • Digitize documents to searchable PDF or archivable PDF/A.
  • Extract metadata, text, and tables as structured data.
  • Convert different document types to PDF.
  • Convert PDFs to PDF/A.
  • Check PDF/A compliance.

Understanding Different Types of PDFs

Digitally Created PDFs:

  • Created using software like Microsoft Word or Excel, or via the "print" function within applications.
  • Contain text stored as encoded characters rather than pixels, alongside any images.
  • Text and images can be easily edited, searched, and manipulated.

Image-only PDFs:

  • Generated from scanned hard copy documents or images.
  • Content is locked in a snapshot-like image without a text layer.
  • Not searchable or editable without OCR (Optical Character Recognition).

Searchable PDFs:

  • Result from applying OCR to scanned or image-based documents.
  • Have a text layer added underneath the image layer, making them fully searchable.
  • Text can be selected, copied, and marked up like in original documents.
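
The distinction between the three types can be sketched as a small classifier. This is only an illustrative heuristic, not how Teal works: the `PageInfo` flags and the `classify_pdf` function are hypothetical names, and a real PDF library would have to supply the per-page information.

```python
from dataclasses import dataclass
from enum import Enum


class PdfKind(Enum):
    DIGITAL = "digitally created"
    IMAGE_ONLY = "image-only"
    SEARCHABLE = "searchable"
    UNKNOWN = "unknown"


@dataclass
class PageInfo:
    """Hypothetical per-page summary; not part of any real library."""
    has_text_layer: bool       # the page contains extractable text
    has_full_page_image: bool  # the page is covered by one scanned image


def classify_pdf(pages: list[PageInfo]) -> PdfKind:
    """Classify a document from per-page flags (assumed heuristic)."""
    if not pages:
        return PdfKind.UNKNOWN

    scanned = [p for p in pages if p.has_full_page_image]
    if not scanned:
        # No scans at all: text means the file was created digitally.
        has_text = any(p.has_text_layer for p in pages)
        return PdfKind.DIGITAL if has_text else PdfKind.UNKNOWN

    # At least one scanned page without text still needs OCR.
    if any(not p.has_text_layer for p in scanned):
        return PdfKind.IMAGE_ONLY

    # Every scanned page already carries a text layer.
    return PdfKind.SEARCHABLE
```

For example, `classify_pdf([PageInfo(False, True)])` yields `PdfKind.IMAGE_ONLY`, which is the case where OCR is needed before the text can be searched or extracted.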

Alternatives

Here are some alternatives to Teal:

  • Gotenberg converts documents to PDF using LibreOffice and Chromium.
  • OCRmyPDF is designed as a command-line tool, but it can also be used in a web service.
  • Paperless-ngx is a community-supported open-source document management system.