Welcome to the CollectiveAccess support forum! Here the developers and community answer questions related to use of the software. Please include the following information in every new issue posted here:

  1. Version of the software that is used, along with browser and version

  2. If the issue pertains to Providence, Pawtucket or both

  3. What steps you’ve taken to try to resolve the issue

  4. Screenshots demonstrating the issue

  5. The relevant sections of your installation profile or configuration including the codes and settings defined for your local elements.


If your question pertains to data import or export, please also include:

  1. Data sample

  2. Your mapping


Answers may be delayed for posts that do not include sufficient information.

pdf2txt problem

Hi, I have just completed a new installation on Ubuntu 18.04 and have run into a problem with pdf2txt. I installed the PDFMiner package from both the repository and the apt download. In both cases I tried running pdf2txt directly from the command line and always got the following error:
    raise PDFSyntaxError('No /Root object! - Is this really a PDF?')
    pdfminer.pdfparser.PDFSyntaxError: No /Root object! - Is this really a PDF?

  • pdftotext converts the same file without errors.
  • I have checked the PDF file with an online PDF validator and it passes.
  • This does not seem to be a CA problem, but if anyone has had the same problem and solved it, or could point me in the right direction, I would be most thankful.
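
For anyone hitting the same error, a quick structural check of the file can help narrow things down before blaming the converter. The sketch below is only a rough heuristic, not a validator: it looks for the `%PDF-` header, the `%%EOF` and `startxref` markers near the end of the file, and any mention of `/Root`. The function name `pdf_sanity_report` and the byte windows it uses are my own assumptions, not anything from PDFMiner or CollectiveAccess.

```python
import sys
from pathlib import Path


def pdf_sanity_report(path):
    """Run a few cheap byte-level checks on a PDF file.

    Heuristic only: passing these checks does not prove the file is valid,
    and failing one does not prove it is broken.
    """
    data = Path(path).read_bytes()
    tail = data[-2048:]  # trailer markers normally sit near the end of the file
    return {
        # A well-formed PDF begins with a "%PDF-x.y" header line.
        "starts_with_header": data.startswith(b"%PDF-"),
        # Some files carry junk bytes before the header; note if it appears late.
        "header_in_first_kb": b"%PDF-" in data[:1024],
        # The file should end with a startxref offset followed by %%EOF.
        "has_startxref": b"startxref" in tail,
        "has_eof_marker": b"%%EOF" in tail,
        # The trailer (or cross-reference stream dictionary) names the /Root object.
        "mentions_root": b"/Root" in data,
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    for name, ok in pdf_sanity_report(sys.argv[1]).items():
        print(f"{name}: {ok}")
```

If every check passes and pdf2txt still raises the error, the installed PDFMiner version is a more likely suspect than the file itself.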

Comments

  • OK, I got it to run in the terminal. It converts the PDF file but still ends the process with the PDFSyntaxError message. Can anyone tell me where the converted file is stored in Providence? I would like to see what is in there.
    Thanks
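
When a converter produces text but still exits with an error, it can help to capture the output, the exit code, and the error stream separately, and to compare against pdftotext on the same file. Here is a minimal sketch of that, assuming the executables are named `pdf2txt` and `pdftotext` on your PATH (a pip install of PDFMiner may name it `pdf2txt.py` instead). The helper `run_extractor` and the `{pdf}` placeholder are made up for this example.

```python
import shutil
import subprocess
import sys

# Command templates; "{pdf}" is replaced by the file path.
# pdftotext needs a trailing "-" to write the text to standard output.
COMMANDS = {
    "pdf2txt": ["pdf2txt", "{pdf}"],
    "pdftotext": ["pdftotext", "{pdf}", "-"],
}


def run_extractor(template, pdf_path):
    """Run one extractor and report what it produced and how it exited."""
    exe = shutil.which(template[0])
    if exe is None:
        return {"tool": template[0], "installed": False}
    args = [exe] + [part.replace("{pdf}", pdf_path) for part in template[1:]]
    proc = subprocess.run(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return {
        "tool": template[0],
        "installed": True,
        "returncode": proc.returncode,      # non-zero means the tool reported failure
        "text_bytes": len(proc.stdout),     # how much text was extracted before exiting
        "stderr_tail": proc.stderr.decode("utf-8", "replace")[-300:],
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    for template in COMMANDS.values():
        print(run_extractor(template, sys.argv[1]))
```

A non-zero `returncode` together with a large `text_bytes` would match what is described here: text comes out, but the process still ends with an error.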

  • I've never seen this one. What version of pdf2txt are you running? There's a port to Python 3 as well as the original Python 2 version.

  • Hello Seth, I tried both versions with the same result. However, I did a reinstall to try to resolve another problem, and the issue seems to have gone away. UniversalViewer works fine now and I can do text searches in uploaded PDFs. I have one more issue to resolve but will put it in a separate post. Great job, and many thanks for your patience with all the amateurs.
