Import: Use non-idno field for matching entries?

In our project, project-internal identifiers are auto-generated in the "idno" field.
Another field, "extID" (external identifier), holds one or more official identifiers (e.g. ISO).
I'd now like to import data from various sources that refer to these common extIDs.

The importer documentation mentions a setting, "existingRecordPolicy", but only for "idno" and "preferred_labels".

Is there any way to match by the "extID" field?

Thank you very much in advance.

Comments

  • I've found "entitySplitter.matchOn" in the docs, which sounds pretty much like what I need for creating relationships based on "extID".

  • I'm having a similar problem, trying to match non-preferred labels with the listItemSplitter. It seems the "labels" option for matchOn only matches preferred labels.

  • There is currently no way to match on anything other than idno or preferred labels. If you want that facility added, please file a JIRA ticket for it.

  • @seth: Thanks for the info.
    There are 2 use cases:

    1. existingRecordPolicy = works only with idno/preferred_labels
    2. entitySplitter.matchOn = works with other fields, too? (at least how I interpret the docs)

    Is that correct?

    Thanks again :smile:

  • No, matchOn works with idno and labels only.

  • edited October 12

    Oh.
    Then I've misinterpreted this part of the documentation:

    {"matchOn": ["^ca_entities.your_custom_code"]} will match on a custom metadata element in the entity record. Use the syntax ^ca_entities.metadataElement code.

    I trust you know your code, so is the documentation wrong here?

    The source code only mentions this:

    'description' => _t('List indicating sequence of checks for an existing record; values of array can be "label", "idno" or "displayname". Ex. array("idno", "label", "displayname") will first try to match on idno and label [forename, surname] if the first match fails, then finally the displayname field')
    ),

  • As I understand it, the idno is the "house-internal" identifier (usually own syntax per institution).

    Do you have a suggestion for how to import and match data that comes from mixed sources but refers to common (external) identifiers?

    For example, we're importing a list of people (=agents, entities) who happen to be listed in the "GND" catalogue of the German National Library.

    Here's an example: http://d-nb.info/gnd/120904144
    The unique identifier here for "George Stevens" is "120904144".

    Following the EN15907 standard for cinematographic works, the identifier is stored as:
    * identifier.schema = "gnd"
    * identifier.value = "120904144"

    The CA importer engine seems very elaborate, and I hope I'm not the first person to import 3rd party data, so is there any chance I can pull this off? :smile:

    Thank you very much in advance!
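
For illustration, the schema/value pair described above can be derived from the GND URI before import. This is a minimal Python sketch, not part of CollectiveAccess; the helper name `split_identifier` is hypothetical, and it assumes the URI path always ends in `/<schema>/<value>`.

```python
from urllib.parse import urlparse


def split_identifier(uri: str) -> dict:
    """Split an authority URI into schema and value (hypothetical helper).

    Assumes the last two path segments are <schema>/<value>,
    as in http://d-nb.info/gnd/120904144.
    """
    parts = [p for p in urlparse(uri).path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"cannot derive schema and value from {uri!r}")
    return {"schema": parts[-2], "value": parts[-1]}


identifier = split_identifier("http://d-nb.info/gnd/120904144")
print(identifier)  # {'schema': 'gnd', 'value': '120904144'}
```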

  • @seth: Sorry, I just saw that there's another "matchOn" option "for a straight mapping (no splitters)". But reading the text, it seems it's for list items?

    It mentions "item_id" as another option.
    Now I'm endlessly confused.

  • @seth
    Since I'm really stuck with this, I just wanted to ask again whether you meant that it's not possible for the "straight" import, and whether the documentation for matchOn of the splitter is correct or not.

    Thank you very much in advance!
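
Since the replies above say that matching works on idno and labels only, one possible workaround is to resolve extID to idno outside the importer and add an "idno" column to the incoming data before running the import. The following is a minimal Python sketch under stated assumptions: it presumes you can export the existing records as CSV with "idno" and "extID" columns, and the sample values (ENT-0001, 999999999, and so on) and function names are hypothetical. It is not a CollectiveAccess feature.

```python
import csv
import io


def build_lookup(existing_rows):
    """Map extID -> idno from rows exported out of the existing database."""
    lookup = {}
    for row in existing_rows:
        ext_id = row["extID"].strip()
        if ext_id:  # skip records that have no external identifier
            lookup[ext_id] = row["idno"].strip()
    return lookup


def add_idno(incoming_rows, lookup):
    """Return copies of the incoming rows with an added 'idno' column.

    Rows whose extID is unknown get an empty idno so they can be
    reviewed or handled separately.
    """
    return [
        dict(row, idno=lookup.get(row["extID"].strip(), ""))
        for row in incoming_rows
    ]


# Hypothetical sample data standing in for real export and source files.
existing_csv = "idno,extID\nENT-0001,120904144\nENT-0002,\n"
incoming_csv = "extID,name\n120904144,George Stevens\n999999999,Unknown Person\n"

existing = list(csv.DictReader(io.StringIO(existing_csv)))
incoming = list(csv.DictReader(io.StringIO(incoming_csv)))

prepared = add_idno(incoming, build_lookup(existing))
for row in prepared:
    print(row)
```

The prepared rows could then be written back out with `csv.DictWriter` and mapped so that the new "idno" column feeds the record's idno, letting the existing idno-based matching apply.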
