leaf-focus 0.6.2

Description:

leaf-focus
Extract structured text from pdf files.
Install
Install from PyPI using pip:
pip install leaf-focus




Download the Xpdf command line tools and extract the executable files.
Provide the directory containing the executable files as --exe-dir.
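Before passing a directory as --exe-dir, it can help to confirm that the extracted executables are actually in it. The following is a minimal sketch, not part of leaf-focus: the function name missing_xpdf_tools is hypothetical, and the list of tool names (pdfinfo, pdftotext, pdftopng) is an assumption about which Xpdf programs are needed, so adjust it to match your installation.

```python
from pathlib import Path

# Assumption: these are the Xpdf programs of interest. The list is
# illustrative and is not taken from the leaf-focus documentation.
ASSUMED_TOOLS = ("pdfinfo", "pdftotext", "pdftopng")


def missing_xpdf_tools(exe_dir, tool_names=ASSUMED_TOOLS):
    """Return the tool names that have no matching file in exe_dir."""
    directory = Path(exe_dir)
    if not directory.is_dir():
        # Nothing can be found in a directory that does not exist.
        return list(tool_names)
    # Compare on the file stem so 'pdfinfo' and 'pdfinfo.exe' both match.
    present = {p.stem.lower() for p in directory.iterdir() if p.is_file()}
    return [name for name in tool_names if name.lower() not in present]
```

An empty list means every assumed tool was found; any names returned point to files that are missing from the directory.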
Usage
usage: leaf-focus [-h] [--version] --exe-dir EXE_DIR [--page-images] [--ocr]
                  [--first FIRST] [--last LAST]
                  [--log-level {debug,info,warning,error,critical}]
                  input_pdf output_dir

Extract structured text from a pdf file.

positional arguments:
  input_pdf             path to the pdf file to read
  output_dir            path to the directory to save the extracted text files

optional arguments:
  -h, --help            show this help message and exit
  --version             show program's version number and exit
  --exe-dir EXE_DIR     path to the directory containing xpdf executable files
  --page-images         save each page of the pdf as a separate image
  --ocr                 run optical character recognition on each page of the
                        pdf
  --first FIRST         the first pdf page to process
  --last LAST           the last pdf page to process
  --log-level {debug,info,warning,error,critical}
                        the log level: debug, info, warning, error, critical

Examples
# Extract the pdf information and embedded text.
leaf-focus --exe-dir [path-to-xpdf-exe-dir] file.pdf file-pages

# Extract the pdf information, embedded text, an image of each page, and Optical Character Recognition results of each page.
leaf-focus --exe-dir [path-to-xpdf-exe-dir] file.pdf file-pages --ocr
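
The same commands can be assembled from a Python script, which is convenient when processing many files or a page range. This is a minimal sketch that only shells out to the leaf-focus command line using the options listed under Usage. The helper names build_leaf_focus_command and run_leaf_focus are hypothetical and are not part of the leaf-focus package.

```python
import subprocess


def build_leaf_focus_command(exe_dir, input_pdf, output_dir, *,
                             page_images=False, ocr=False,
                             first=None, last=None, log_level=None):
    """Build the argument list for the documented leaf-focus command line."""
    command = ["leaf-focus", "--exe-dir", str(exe_dir)]
    if page_images:
        command.append("--page-images")
    if ocr:
        command.append("--ocr")
    if first is not None:
        command += ["--first", str(first)]
    if last is not None:
        command += ["--last", str(last)]
    if log_level is not None:
        command += ["--log-level", log_level]
    # The two positional arguments come last: input pdf, then output directory.
    command += [str(input_pdf), str(output_dir)]
    return command


def run_leaf_focus(*args, **kwargs):
    """Run leaf-focus and raise CalledProcessError on a non-zero exit code."""
    return subprocess.run(build_leaf_focus_command(*args, **kwargs), check=True)
```

Passing the arguments as a list, rather than as one shell string, avoids quoting problems when a path contains spaces. Calling run_leaf_focus requires leaf-focus to be installed and available on the PATH.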

Dependencies

Xpdf
keras-ocr
TensorFlow (can optionally run more efficiently on one or more GPUs)

License:

For personal and professional use. You cannot resell or redistribute these repositories in their original state.
