To convert a PDF file to plain text on a Linux or UNIX system, you can use pdftotext, a command-line utility from the poppler-utils package that extracts the text from a PDF file and saves it to a plain text file.
To install the poppler-utils package, use your system's package manager. On a Debian-based system, such as Ubuntu, you can use the following command:
sudo apt-get install poppler-utils
On a Red Hat-based system, such as CentOS, you can use the following command:
sudo yum install poppler-utils
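After installing, it can help to confirm that the shell can actually find the tool. The following is a minimal sketch; have_pdftotext is a hypothetical helper name chosen for this example, not something shipped with poppler-utils.

```shell
# have_pdftotext is a hypothetical helper name, not part of poppler-utils.
# It succeeds when a command named pdftotext can be found by the shell.
have_pdftotext() {
    command -v pdftotext >/dev/null 2>&1
}

if have_pdftotext; then
    echo "pdftotext is available"
else
    echo "pdftotext not found; install poppler-utils first" >&2
fi
```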
Once poppler-utils is installed, you can use the pdftotext command to convert a PDF file to text. For example, to convert the file input.pdf to text and save the result to output.txt, use the following command:
pdftotext input.pdf output.txt
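The same command can be applied to many files with a small loop. The sketch below assumes all the PDFs sit in one directory and that each text file should be written next to its PDF; convert_all_pdfs is a hypothetical helper name used only for illustration.

```shell
# Convert every .pdf file in a directory to a .txt file next to it.
# convert_all_pdfs is a hypothetical helper name, not part of poppler-utils.
convert_all_pdfs() {
    dir="${1:-.}"   # default to the current directory
    for pdf in "$dir"/*.pdf; do
        # Skip the unexpanded pattern when the directory has no PDFs.
        [ -e "$pdf" ] || continue
        # Strip the .pdf suffix and append .txt for the output name.
        pdftotext "$pdf" "${pdf%.pdf}.txt"
    done
}
```

For example, convert_all_pdfs ./reports would turn ./reports/a.pdf into ./reports/a.txt. Quoting "$pdf" keeps file names with spaces intact.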
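You can also use the -layout option to keep the physical layout of the original text, such as columns and spacing, as closely as possible (fonts and other styling are not carried over to plain text):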
pdftotext -layout input.pdf output.txt
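If you would rather inspect or pipe the text than save it, pdftotext accepts a single dash in place of the output file name and writes to standard output. A minimal sketch, where pdf_to_stdout is a hypothetical wrapper name introduced for this example:

```shell
# Print the extracted text of a PDF to standard output, keeping the layout.
# pdf_to_stdout is a hypothetical helper name used only for this sketch.
pdf_to_stdout() {
    # A "-" in place of the output file name makes pdftotext write to stdout.
    pdftotext -layout "$1" -
}
```

For example, pdf_to_stdout input.pdf | grep -n "total" would search the extracted text without creating an intermediate file.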
You can use the -f and -l options to specify the first and last pages