Welcome to the CollectiveAccess support forum! Here the developers and community answer questions related to use of the software. Please include the following information in every new issue posted here:

  1. Version of the software that is used, along with browser and version

  2. If the issue pertains to Providence, Pawtucket or both

  3. What steps you’ve taken to try to resolve the issue

  4. Screenshots demonstrating the issue

  5. The relevant sections of your installation profile or configuration including the codes and settings defined for your local elements.


If your question pertains to data import or export, please also include:

  1. Data sample

  2. Your mapping


Answers may be delayed for posts that do not include sufficient information.

objectRepresentationSplitter wildcards / setting preferred labels from filename

A couple of objectRepresentationSplitter questions. Any help is greatly appreciated, and I'm more than happy to do this another way if I'm going down the wrong path entirely.


First question: Is it possible to use wildcards to import images from a folder specified in the data source when that folder contains many differently named images? I tried regular-expression wildcards but couldn't figure out how to get them to work. So far, the basic objectRepresentationSplitter parameters I'm imagining are something like:

{
    "mediaPrefix": "^2/",
    "attributes": {
        "media": "[WILDCARD?]"
    }
}


Second question: Is it possible for each object representation to have its preferred label set from the filename during this same import? I think the element is ca_object_representations.preferred_labels.name, but I'm not sure how, or whether, it can be set in the refinery parameters.


Background info:

I'm creating an import mapping that works decently for most columns, but I'm having trouble with the object representation column. To simplify, here is a two-column example of what we're working with:

Column 1 contains individual object names. Column 2 contains, for each entry, the name of the directory holding the relevant JPGs. The folder names are mostly consistent (I've exaggerated the differences for this example), but the images have an inconsistent naming scheme.

Example Spreadsheet:

Object 1 Name | Object 1 Images Folder
Object 2 Name | 1999 Object 2 Folder
Object 3 Name | Some Folder for Object 3

Example Directory Tree:

Object 1 Images Folder/
01 One Test.jpg
02 Test One2.jpg
03 Another Testing1 Picture.jpg

1999 Object 2 Folder/
01 Two2 Test.jpg
02 2Test Two.jpg
03 Another2 Testing Picture.jpg

Some Folder for Object 3/
01 Test Three.jpg
02 Test 3Three.jpg
03 Another Testing Picture3.jpg

Comments

  • edited February 29

    If it's useful to anyone else, here's the method we came up with for this import:

    1. Added a new column to the source spreadsheet (A) for idno values:

    ITEM0001 | Object 1 Name | Object 1 Images Folder
    ITEM0002 | Object 2 Name | 1999 Object 2 Folder
    ITEM0003 | Object 3 Name | Some Folder for Object 3

    2. Created a mapping to import these objects with the relevant metadata (ignoring representations for now).

    3. Created an index spreadsheet (B) of the image directories we placed in the import directory. Added a reference table to this spreadsheet (B) to match the idno from spreadsheet (A) with the folder names. I can go into the method we used for automating this if anyone needs it. We ended up with something that looks like:

    ITEM0001 | Object 1 Images Folder | 01 One Test.jpg
    ITEM0001 | Object 1 Images Folder | 02 Test One2.jpg
    ITEM0001 | Object 1 Images Folder | 03 Another Testing1 Picture.jpg

    ITEM0002 | 1999 Object 2 Folder | 01 Two2 Test.jpg
    ITEM0002 | 1999 Object 2 Folder | 02 2Test Two.jpg
    ITEM0002 | 1999 Object 2 Folder | 03 Another2 Testing Picture.jpg

    ITEM0003 | Some Folder for Object 3 | 01 Test Three.jpg
    ITEM0003 | Some Folder for Object 3 | 02 Test 3Three.jpg
    ITEM0003 | Some Folder for Object 3 | 03 Another Testing Picture3.jpg

    4A. Created a new mapping solely for the elements ca_objects.idno and ca_object_representations, both with the option "skipIfEmpty": 1

    4B. The ca_object_representations mapping uses the objectRepresentationSplitter refinery with these parameters:

    {
        "objectRepresentationType": "front",
        "attributes": {
            "media": "^3",
            "idno": "^3"
        },
        "dontCreate": "1"
    }

    4C. existingRecordPolicy is set to merge_on_idno.

    5. Imported spreadsheet (B) using this second mapping. All images were found and associated with the previously imported records by idno.
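The index spreadsheet (B) described in step 3 could be generated with a short script. The following is only a minimal sketch and not necessarily the method used for this import: the function names `build_index` and `write_index`, the `folder_to_idno` lookup (built from spreadsheet A), and the choice of CSV output are all assumptions made for illustration.

```python
import csv
from pathlib import Path


def build_index(import_root, folder_to_idno, extensions=(".jpg", ".jpeg")):
    """Return (idno, folder name, file name) rows for the images in each folder.

    import_root    -- directory holding the image folders (hypothetical path)
    folder_to_idno -- dict mapping folder name -> idno, taken from spreadsheet A
    """
    rows = []
    root = Path(import_root)
    for folder_name, idno in folder_to_idno.items():
        folder = root / folder_name
        if not folder.is_dir():
            # Folder is listed in spreadsheet A but missing on disk: skip it.
            continue
        for path in sorted(folder.iterdir()):
            if path.is_file() and path.suffix.lower() in extensions:
                rows.append((idno, folder_name, path.name))
    return rows


def write_index(rows, out_path):
    """Write the rows as a three-column CSV: idno, folder, file name."""
    with open(out_path, "w", newline="", encoding="utf-8") as fh:
        csv.writer(fh).writerows(rows)
```

Calling `build_index` with a lookup such as `{"Object 1 Images Folder": "ITEM0001"}` and passing the result to `write_index` yields one row per image, with the file name in column 3, which is the column the "^3" placeholders in the refinery parameters above point at.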

    We tried specifying mediaPrefix for each subdirectory of images but never got it working. The images were only imported correctly after we removed the mediaPrefix parameter completely; it looks like subdirectories of the import folder are scanned for matching files automatically. Maybe the import runs faster if the prefix is set?

    Regarding the second question above, about setting the preferred label: the images were imported with the label defaulting to the filename minus the extension, so that appears to be the default behavior.
