PHP Classes

What is the best PHP pdf to html class?: Convert PDF to HTML




by VEDPRAKASH PODDAR - 8 years ago (2016-07-01)

Convert PDF to HTML


I need to convert a PDF document to HTML using PHP.


2 Recommendations

PHP PDF to HTML: Convert PDF to HTML using Poppler

This class can convert PDF to HTML using the Poppler program.

It can take the path of the Poppler program tools and execute several operations to extract information from PDF documents.

Currently, the class can convert whole PDF documents or individual pages to HTML, get the document information, return the page count, etc.

Several parameters can be configured, like the preferred format of the pictures inside the document, the zoom scale, whether to place images and CSS inline within the HTML or in external files, etc.
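
For readers who want to see what a Poppler-based conversion can look like, here is a minimal sketch. It is not the API of the class recommended here; `buildPdfToHtmlCommand` and all of its parameter names are hypothetical. It only builds a command line for Poppler's `pdftohtml` tool using its `-f`, `-l`, `-zoom`, `-fmt`, `-c` and `-s` options, and it assumes `pdftohtml` is installed and on the PATH.

```php
<?php
// Hypothetical helper, not part of the recommended class: builds a command
// line for Poppler's pdftohtml tool. It does not run the command.
function buildPdfToHtmlCommand(
    string $pdfPath,
    string $outputBase,
    ?int $firstPage = null,
    ?int $lastPage = null,
    float $zoom = 1.5,
    string $imageFormat = 'png',
    string $binary = 'pdftohtml' // assumed to be on the PATH
): string {
    // pdftohtml accepts png or jpg for the -fmt option.
    if (!in_array($imageFormat, ['png', 'jpg'], true)) {
        throw new InvalidArgumentException('Image format must be png or jpg');
    }

    $parts = [escapeshellarg($binary)];

    // Optional page range: -f is the first page, -l is the last page.
    if ($firstPage !== null) {
        $parts[] = '-f ' . $firstPage;
    }
    if ($lastPage !== null) {
        $parts[] = '-l ' . $lastPage;
    }

    // -c: complex output, -s: single HTML document with all pages.
    $parts[] = '-c';
    $parts[] = '-s';
    $parts[] = '-zoom ' . number_format($zoom, 2, '.', '');
    $parts[] = '-fmt ' . $imageFormat;

    // Input file and output base name are escaped for the shell.
    $parts[] = escapeshellarg($pdfPath);
    $parts[] = escapeshellarg($outputBase);

    return implode(' ', $parts);
}
```

If Poppler is installed, the returned string could be passed to `exec()`. Building the command separately makes it easy to log or inspect before anything is executed.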

by Anton N Nikolaev (package author, reputation 215) - 8 years ago (2016-12-02)

I also needed this, and here is the result!


PHP PDF to Text: Extract text contents from PDF files

This package can extract the text contents from a PDF file using pure PHP code (no external tools are needed).

It provides the following features:

- Text is extracted from PDF files as a single text property. Individual page contents are also available separately
- Text strings can be searched over the whole file contents, or through individual pages
- Support for multiple character sets: parsed text is returned in UTF-8
- Embedded images can be extracted if desired
- Several option flags are available to adjust PDF contents processing
- RTL language processing
- Basic page layout rendering
- PDF Form data extraction
- Ability to extract areas of text as well as line and column contents, using XML-based capture definitions

by Manuel Lemos (reputation 26695) - 8 years ago (2016-07-03)

Converting PDF to HTML in pure PHP is hard. There are some packages for that but they rely on external programs, so they are not in pure PHP.

On the other hand, this PDF to text class can be the basis for generating HTML from a PDF document. Maybe with some work it can extract more than just the text.
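
As a rough illustration of using extracted text as the basis for HTML, here is a minimal sketch. It assumes the text has already been extracted into a plain string by whatever means; `textToHtml` is a hypothetical helper, not part of any package mentioned here. It only wraps blank-line-separated blocks in paragraphs and escapes special characters, so layout, fonts and images are lost.

```php
<?php
// Hypothetical helper: turns plain extracted text into a basic HTML page.
// Blocks separated by blank lines become paragraphs; single line breaks
// inside a block become <br> tags.
function textToHtml(string $text, string $title = 'Document'): string
{
    // Normalize line endings so splitting works the same for all inputs.
    $text = str_replace(["\r\n", "\r"], "\n", $text);
    $blocks = preg_split('/\n{2,}/', trim($text));

    $body = '';
    foreach ($blocks as $block) {
        $block = trim($block);
        if ($block === '') {
            continue;
        }
        // Escape HTML special characters before adding any markup.
        $escaped = htmlspecialchars($block, ENT_QUOTES, 'UTF-8');
        $body .= '<p>' . str_replace("\n", "<br>\n", $escaped) . "</p>\n";
    }

    return "<!DOCTYPE html>\n<html>\n<head>\n<meta charset=\"utf-8\">\n<title>"
        . htmlspecialchars($title, ENT_QUOTES, 'UTF-8')
        . "</title>\n</head>\n<body>\n" . $body . "</body>\n</html>\n";
}
```

Escaping before inserting tags matters here, because text extracted from a PDF may contain characters such as `<` or `&` that would otherwise break the generated page.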

  • 1 Comment
  • 1. by Christian Vigh (package author) - 8 years ago (2016-07-03)

    As Manuel said, extracting text from a PDF is hard, since you have to face so many different situations.

    Commercial products exist for this.

    On the other hand, my PdfToText class is aimed at extracting only text from a PDF file.

    Thanks to phpclasses users, this class is constantly evolving: I have received many samples that presented issues, which helped make it better at interpreting PDF contents.

    So, please feel free to give it a try. It is a completely standalone PHP class that does not use any external tool at all. It can even extract individual page contents and images.

    And, of course, if you encounter issues when extracting text from your samples, please feel free to send them to me at this address:

    christian.vigh@wuthering-bytes.com

    I will be happy to handle the issues that will help me enhance my class.

