Welcome to the CollectiveAccess support forum! Here the developers and community answer questions related to use of the software. Please include the following information in every new issue posted here:

  1. Version of the software that is used, along with browser and version

  2. If the issue pertains to Providence, Pawtucket or both

  3. What steps you’ve taken to try to resolve the issue

  4. Screenshots demonstrating the issue

  5. The relevant sections of your installation profile or configuration including the codes and settings defined for your local elements.

If your question pertains to data import or export, please also include:

  1. Data sample

  2. Your mapping

Answers may be delayed for posts that do not include sufficient information.

Batch deleting media representations based on format

I need to figure out how to delete a set of media representations based on the representation identifier and format.

I have a few thousand (eek) media representations (mp4s) that were imported incorrectly over the past year or so. I need to delete these incorrect representations and re-import the mp4s. Unfortunately, the identifier prefix alone does not single them out -- for each object there is at least one jpg and one mp4 (for example, FM02210.jpg and FM02210.mp4 are both linked to the same object). Basically, I need to delete any media representation whose identifier starts with FM and ends with .mp4.

I have configured my advanced media representation search to include the representation identifier, and I can do a wildcard search for "FM*", but I can't figure out how to exclude the jpgs from my search results. Any ideas for how I can get these media representations in a set so that I can delete them in one fell swoop?


  • I figured out how to do what I needed (this might not be the most elegant solution, but I'm documenting it here for others who might have a similar issue).

    I needed to replace mp4 media representations that should have included watermarks. Several thousand mp4s were uploaded without watermarks because our media_processing.conf file wasn't configured correctly. After poking around, I realized I could use the caUtils reprocess-media command to just reprocess that media based on the new rules outlined in our media_processing.conf file. 

    Next step: isolating the media representations that needed reprocessing. I created a new display for object representations that included the CollectiveAccess ID. Then I ran an advanced media representation search for "FM*" in the Representation Identifier field. I exported these results as a tab-delimited file, opened that file in Excel, and filtered for cells that contain .mp4 in the Representation Identifier column. The goal was to isolate all representations that start with FM (handled by the search within Providence) and end in .mp4 (handled by the Excel filter).

    I then went back to the command line to run the bin/caUtils reprocess-media command, using the --ids flag to limit the command to a "comma separated list of representation ids to reload." One thing that hung me up here: the help page for reprocess-media asks for a list of "representation ids," which I thought meant the values from the Representation Identifier column. It actually means the CollectiveAccess ID numbers of the representations that you want to reprocess.

    And then finally, I had to change the display configuration for Pawtucket to use the h264_hi version instead of the original video.
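
For anyone who would rather script the Excel filtering step from the reply above, here is a rough Python sketch. It reads a tab-delimited export, keeps rows whose identifier starts with FM and ends with .mp4, and prints the matching IDs as a comma-separated list suitable for pasting after the --ids flag. The column headers "CollectiveAccess ID" and "Representation Identifier", the function name, and the sample rows are assumptions made for illustration; your export will use whatever headers your display defines, so adjust accordingly.

```python
import csv
import io


def select_representation_ids(tsv_text, prefix="FM", suffix=".mp4",
                              identifier_col="Representation Identifier",
                              id_col="CollectiveAccess ID"):
    """Return the IDs of rows whose identifier matches the prefix and suffix.

    The column names are assumptions; change them to match your export.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    ids = []
    for row in reader:
        identifier = (row.get(identifier_col) or "").strip()
        # Case-insensitive suffix check so .MP4 is caught as well.
        if identifier.startswith(prefix) and identifier.lower().endswith(suffix.lower()):
            rep_id = (row.get(id_col) or "").strip()
            if rep_id:
                ids.append(rep_id)
    return ids


# Made-up sample standing in for the tab-delimited export from Providence.
SAMPLE_EXPORT = (
    "CollectiveAccess ID\tRepresentation Identifier\n"
    "101\tFM02210.jpg\n"
    "102\tFM02210.mp4\n"
    "103\tFM02211.MP4\n"
    "104\tXX00001.mp4\n"
)

if __name__ == "__main__":
    # Prints a comma-separated list of the matching IDs.
    print(",".join(select_representation_ids(SAMPLE_EXPORT)))
```

To use it on a real export, read the file contents (for example with `open(path, newline="", encoding="utf-8").read()`) and pass the text in place of `SAMPLE_EXPORT`. Check the caUtils help output for the exact way your version expects the --ids value to be written.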