Some pdf files imported as binary

edited February 2018 in Troubleshooting
Hi. I'm having this problem: the CA batch import stores some PDF files as binary files and does not generate any thumbnails for them, even though I can normally open those files with any PDF viewer.
If I download the original imported file, I can open it without any problem.
Also, if I upload the same file directly through the object interface, it works...

Attached is my configuration and a screenshot of the media display bundle.

In app.conf, I set 
dont_use_imagemagick_to_identify_pdfs = 1
dont_use_graphicsmagick_to_identify_pdfs = 0
and
dont_use_zendpdf_to_identify_pdfs = 1 

Any help is REALLY appreciated!

Thank you.


Version information
Component | Version
Application version | 1.7.5
Schema revision | 147
Release type | GIT
System GUID | 2898bdfc-702e-4705-a7cf-5b14e0e69646
Last change log ID | 149768

Search engine: SQL Search
Setting | Description | Status
MySQL is the back-end database | The SqlSearch search engine requires that MySQL be the back-end database for your CollectiveAccess installation. | ok
SqlSearch database tables exist | The SqlSearch search engine requires that certain tables be present in your database. They are installed by default and should be present, but if they are not SqlSearch will not be able to operate. | ok

Media Processing Plugins
Plugin | Information | Status
Audio | Provides audio processing and conversion using ffmpeg | Available
GD | Provides limited image processing and conversion services using libGD. Didn't load because GraphicsMagick is available and preferred | Not used
Gmagick | Provides image processing and conversion services using ImageMagick via the PECL Gmagick PHP extension. Didn't load because Gmagick is not available | Not available
GraphicsMagick | Provides image processing and conversion services using GraphicsMagick via exec() calls | Available
ImageMagick | Provides image processing and conversion services using ImageMagick via exec() calls to the ImageMagick executables. Didn't load because the ImageMagick executables could not be found. Didn't load because GraphicsMagick is available and preferred | Not used
Imagick | Provides image processing and conversion services using ImageMagick via the PECL Imagick PHP extension. Didn't load because Imagick is not available | Not available
Mesh | Accepts files describing 3D models | Available
Office | Accepts and processes Microsoft Word, Excel and PowerPoint documents | Available
PDFWand | Provides PDF conversion services using ImageMagick or the Zend_PDF library. Will use Ghostscript to generate image-previews of PDF files. | Available
QuicktimeVR | Provides services for processing of QuicktimeVR files | Available
Spin360 | Accepts ZIP archives containing 360 spinnable images in SpinCar format (http://SpinCar.com) | Available
Video | Provides ffmpeg-based video processing | Available
XMLDoc | Accepts and processes XML documents | Available
BinaryFile | Accepts any file unrecognized by other media plugins and stores it as-is | Not available

PDF Rendering Plugins
Plugin | Information | Status
PhantomJS | Renders HTML as PDF using PhantomJS | Not available
domPDF | Renders HTML as PDF using domPDF. Didn't load because wkhtmltopdf is available and preferred | Not used
wkhtmltopdf | Renders HTML as PDF using wkhtmltopdf | Available

Barcode generation
Component | Information | Status
GD | GD is a graphics processing library required for any barcode generation. | Available

Application Plugins
Plugin | Information | Status
ULAN | Imports artist records from ULAN | Available
WorldCat | Imports bibliographic information from WorldCat | Available
duplicateMenu | Adds a "duplicate" menu listing all recently duplicated items and providing an easy way to create additional duplicates. | Available
historyMenu | Adds a "history" menu listing all recently edited items | Available
hspExport | Enforces HSP-specific export rules | Available
ns11mmServices | Implements Memex services for National September 11th Museum. | Available
prepopulate | This plugin allows prepopulating field values based on display templates. See http://docs.collectiveaccess.org/wiki/Prepopulate for more info. | Available
relationshipGenerator | Automatically assigns an object to a collection, based upon rules you specify in the configuration file associated with the plugin | Available
travelogue | Accepts submissions of media via email. | Available

Comments

  • Hi,

    I'm having a similar issue. Did you ever figure this out?

  • I noticed that the issue is triggered only by some PDF files. The only workaround I have found is to batch-optimize the files, saving them as PDF/A using Acrobat.
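    Since only some files are affected, it may help to find out which ones look unusual before re-saving everything. The sketch below is an assumption, not something confirmed in this thread: it guesses that files a strict identifier rejects may have a `%PDF-` header that does not start at byte 0, or may lack an `%%EOF` marker near the end. The function names `check_pdf` and `scan_folder` are made up for illustration, and a clean result does not guarantee that CA will recognize the file.

```python
from pathlib import Path

PDF_MAGIC = b"%PDF-"    # every PDF is expected to start with this signature
EOF_MARKER = b"%%EOF"   # and to carry this marker close to the end of the file


def check_pdf(path, tail_bytes=1024):
    """Return a list of structural oddities found in the file (empty list = none found)."""
    data = Path(path).read_bytes()
    problems = []

    if not data.startswith(PDF_MAGIC):
        # Lenient viewers often accept a header that appears a little later in the file.
        offset = data.find(PDF_MAGIC, 0, 1024)
        if offset == -1:
            problems.append("no %PDF- header in first 1024 bytes")
        else:
            problems.append(f"%PDF- header at byte {offset}, not 0")

    if EOF_MARKER not in data[-tail_bytes:]:
        problems.append("no %%EOF marker near end of file")

    return problems


def scan_folder(folder):
    """Map each suspicious *.pdf file under `folder` to its list of problems."""
    report = {}
    for path in sorted(Path(folder).rglob("*.pdf")):
        problems = check_pdf(path)
        if problems:
            report[str(path)] = problems
    return report
```

    Calling `scan_folder("/path/to/import/folder")` (a placeholder path) returns a dictionary of the files worth re-saving first. Note that `rglob("*.pdf")` is case-sensitive on Linux, so files ending in `.PDF` would need a second pattern.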

  • Hi everyone, we have hit this exact same problem. We have just uploaded 2000 (locally working) PDFs, and the majority of them are appearing as 'binary' in CA, and thus not rendering or usable. I am now running Acrobat to batch upgrade to PDF/A, which is fine.

    However, what's the best, easiest way to replace all the PDFs? I didn't put the PDFs I uploaded into a Set (and now wish I had). Is there a way to create a Set of all PDFs, or ideally of all binary files? Then I could mass-delete them and just upload again.

    Otherwise, I have to go through thousands of objects and manually delete the corrupted files, then upload, which will take days of work.

    Anybody got a good idea of how to quickly mass-replace the PDFs? Thanks for any help.

  • I found that:
    a) Re-saving all PDFs as PDF/A, and more importantly
    b) Increasing the vCPUs and RAM of our shared server (up to 4 vCPUs and 8 GB RAM)

    solved the problem. I can now upload and render hundreds of PDFs in one go, with no problems.
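    For anyone without Acrobat, here is a rough sketch of batch re-writing PDFs through Ghostscript's `pdfwrite` device. This is a substitute technique, not what was done in this thread: it produces a freshly written PDF, not a validated PDF/A file, and it is untested whether that alone makes CA recognize the files. The helper names `build_gs_command` and `rewrite_folder` are hypothetical, and the `gs` binary name is an assumption (it differs on Windows).

```python
import shutil
import subprocess
from pathlib import Path


def build_gs_command(src, dst, gs_binary="gs"):
    """Build a Ghostscript command line that re-writes `src` into `dst`."""
    return [
        gs_binary,
        "-dBATCH",            # exit when the input has been processed
        "-dNOPAUSE",          # do not prompt between pages
        "-dQUIET",            # suppress routine start-up messages
        "-sDEVICE=pdfwrite",  # write a new PDF file
        f"-sOutputFile={dst}",
        str(src),
    ]


def rewrite_folder(src_dir, dst_dir, gs_binary="gs"):
    """Re-write every *.pdf in `src_dir` into `dst_dir`; return names that failed."""
    if shutil.which(gs_binary) is None:
        raise RuntimeError(f"Ghostscript binary '{gs_binary}' not found on PATH")

    dst_dir = Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)

    failed = []
    for src in sorted(Path(src_dir).glob("*.pdf")):
        dst = dst_dir / src.name  # originals are left untouched
        result = subprocess.run(build_gs_command(src, dst, gs_binary), capture_output=True)
        if result.returncode != 0:
            failed.append(src.name)
    return failed
```

    Writing into a separate output folder keeps the originals intact, so the re-written copies can be test-imported in a small batch before replacing anything.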
