Indexing process aborts abruptly

I wanted to give Elasticsearch a test run, so I set it up and started a reindex. However, the indexing process aborts abruptly after the first few elements for all central primary types: objects, entities, places and occurrences. For example, here is the output when indexing objects:

CollectiveAccess 1.8 (166/GIT) Utilities
(c) 2013-2020 Whirl-i-Gig

TRUNCATING ca_objects

WILL INDEX [objects]

tput: No value for $TERM and no -T specified

Ind...0.0% 0/45220 ETC: ???. Elapsed: < 1 sec [>                              ]
Memory: 60.50...0.0% 1/45220 ETC: < 1 sec. Elapsed: < 1 sec [>                              ]
Mem...0.0% 2/45220 ETC: 06h:16m. Elapsed: 01s [>                              ]

Places and occurrences also abort that early. Entity indexing gets a little further, but it also aborts, at 7%. Here are the last few lines from the entity indexing:

...7.0% 2201/31270 ETC: 11m:13s. Elapsed: 51s [==>                            ]
...7.0% 2202/31270 ETC: 11m:13s. Elapsed: 51s [==>                            ]
...7.0% 2203/31270 ETC: 11m:12s. Elapsed: 51s [==>                            ]

Indexing of other tables, e.g. list_items or users, works well.

What could be the problem here and how could I investigate those issues further? Can these issues be related to our quite complex data model?

I then switched back to SqlSearch and started another reindex to test if the reindexing would work correctly with SqlSearch. This worked well for entities, places and occurrences, but also failed for objects. Here are the last lines of the object indexing:

Memory: 1...26.0% 11538/45220 ETC: 12h:46m. Elapsed: 04h:22m [=======>                       ]
Memory: 1...26.0% 11539/45220 ETC: 12h:46m. Elapsed: 04h:22m [=======>                       ]
Memory: 1...26.0% 11540/45220 ETC: 12h:46m. Elapsed: 04h:22m [=======>                       ]

How can I find out which object is causing issues? It seems that the whole indexing process is aborted when it fails to index a certain object. Is it possible to somehow simply skip a failed indexing operation of an object, but resume the indexing process in general?
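
One way to narrow this down is to capture the indexer's console output to a file and read off the last counter it printed. The sketch below is only an illustration: the function name `last_progress` is hypothetical, and it assumes the progress lines keep the `current/total ETC:` shape shown above. Whether the counter maps directly to a record ID depends on the order in which records are indexed, which this thread does not confirm.

```python
import re

# Matches the "current/total" counter in a progress line, e.g. "11540/45220 ETC:".
PROGRESS_RE = re.compile(r"(\d+)/(\d+) ETC:")


def last_progress(log_text):
    """Return (current, total) from the last progress line, or None if none is found."""
    matches = PROGRESS_RE.findall(log_text)
    if not matches:
        return None
    current, total = matches[-1]
    return int(current), int(total)


# Sample taken from the object indexing output quoted above.
sample = """\
Memory: 1...26.0% 11538/45220 ETC: 12h:46m. Elapsed: 04h:22m [=======>                       ]
Memory: 1...26.0% 11539/45220 ETC: 12h:46m. Elapsed: 04h:22m [=======>                       ]
Memory: 1...26.0% 11540/45220 ETC: 12h:46m. Elapsed: 04h:22m [=======>                       ]
"""

current, total = last_progress(sample)
print(f"Last record reported: {current} of {total}; {total - current} not reached")
```

If the indexer processes records in a stable order, the record following the last reported position would be the first candidate to inspect.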

Comments

  • What is the PHP memory limit set to?

  • The PHP memory limit is set to "999M"

    What I also observed is that the object indexing is trying to index 45K objects, while there are currently only 36K objects visible in CA. Around 9K objects were deleted, so this could explain the discrepancy. Maybe this is related to an issue with objects that have been deleted?

  • It should trap errors and not die. I don't think it's a deleted record issue, but you may want to purge those records fully to test the theory.
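
For illustration, "trap errors and not die" generally means catching failures per record and carrying on with the rest. The following is a generic sketch of that pattern, not CollectiveAccess code: `index_all`, `index_one` and `fake_index_one` are hypothetical names, and the failing record is invented for the example.

```python
def index_all(record_ids, index_one):
    """Index every record, collecting failures instead of aborting on the first one."""
    failed = []
    for record_id in record_ids:
        try:
            index_one(record_id)
        except Exception as exc:
            # Trap the per-record error and remember it, so the loop keeps going.
            failed.append((record_id, str(exc)))
    return failed


def fake_index_one(record_id):
    # Hypothetical stand-in for a real indexer: record 3 is treated as broken.
    if record_id == 3:
        raise ValueError("cannot index record")


failures = index_all([1, 2, 3, 4, 5], fake_index_one)
print(failures)
```

A hard failure of the whole process, such as exhausting the memory limit, would not be caught by a handler like this, which may be why the memory limit was the first thing asked about.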
