Indexing & ElasticSearch Error After Updating to 1.7.11

Hello,

I'm new to CA and in the process of migrating an existing instance.

We are testing the update of our CA app from 1.7.6 to 1.7.11. The database seemed to migrate successfully and the app looks and functions as expected so far. We are now testing ElasticSearch, as we've previously only used SqlSearch.

I've got ES installed, connected, and configured. I'm able to manually re-index all of the tables except for the ca_objects table. Each time I reindex the objects table it fails at 752/23485 objects (error pasted below). I'm using the default conf for search and indexing. I've tried using the 1.7.11 and master-fix branches, but get the same error each time.

In php.ini: memory_limit = 1000M

Here's the result from curl http://localhost:9200

{
 "name" : "KDz2a0M",
 "cluster_name" : "elasticsearch",
 "cluster_uuid" : "ZLSwhbnkTrCZvpJXqmNOgg",
 "version" : {
  "number" : "5.6.16",
  "build_hash" : "3a740d1",
  "build_date" : "2019-03-13T15:33:36.565Z",
  "build_snapshot" : false,
  "lucene_version" : "6.6.1"
 },
 "tagline" : "You Know, for Search"
}

Here's the output from ./bin/caUtils rebuild-search-index -t ca_objects

CollectiveAccess 1.7.11 (158/RELEASE) Utilities
(c) 2013-2019 Whirl-i-Gig


TRUNCATING ca_objects


WILL INDEX [objects]

Memory: 385.26M                            3.0% 752/23485 ETC: 13 mins, 5 secs. Elapsed: 26 secs [>               ]PHP Fatal error: Uncaught Elasticsearch\Common\Exceptions\BadRequest400Exception in /collectiveaccess/dev/admin/vendor/elasticsearch/elasticsearch/src/Elasticsearch/Connections/Connection.php:630
Stack trace:
#0 /collectiveaccess/dev/admin/vendor/elasticsearch/elasticsearch/src/Elasticsearch/Connections/Connection.php(293): Elasticsearch\Connections\Connection->process4xxError()
#1 /collectiveaccess/dev/admin/vendor/react/promise/src/FulfilledPromise.php(28): Elasticsearch\Connections\Connection->Elasticsearch\Connections\{closure}()
#2 /collectiveaccess/dev/admin/vendor/guzzlehttp/ringphp/src/Future/CompletedFutureValue.php(55): React\Promise\FulfilledPromise->then()
#3 /collectiveaccess/dev/admin/vendor/guzzlehttp/ringphp/src/Core.php(341): GuzzleHttp\Ring\Future\CompletedFutureValue->then()
#4 /collectiveaccess/dev/admin/vendor/elasticsearch/elasticsearch/src/Elasticsearch/Connections/Connection.php(314): GuzzleHttp\Ring\Core::proxy()
#5 /collectiveaccess/dev/admin/vendor/elasticsearch/elasticsearch/src/Elasticsearch/Conne in /collectiveaccess/dev/admin/vendor/elasticsearch/elasticsearch/src/Elasticsearch/Connections/Connection.php on line 630

Thanks for any help.

Joshua

The New School

Comments

  • My first guess would be there are some odd characters in a record that Elastic is choking on. Try indexing with SqlSearch to see if that completes. If it does (or doesn't) it's a clue.
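
One way to test the odd-characters theory outside of CA is to export the text values for the records in question and scan them for control characters or lone surrogates. The following is only a minimal sketch: the function names and the (record_id, text) input shape are hypothetical, nothing here calls CollectiveAccess or Elasticsearch, and whether these character classes are what triggers the 400 error is an assumption.

```python
import unicodedata

# Hypothetical helper: flags characters that are often rejected or mangled
# when text is serialized to JSON for a search index. This is an assumption
# about the failure mode, not a confirmed cause.
SUSPECT_CATEGORIES = ("Cc", "Cs", "Co", "Cn")  # control, surrogate, private use, unassigned
ALLOWED_CONTROLS = "\t\n\r"                    # ordinary whitespace controls are fine


def find_suspect_chars(text):
    """Return a list of (index, codepoint, category) for suspect characters."""
    suspects = []
    for i, ch in enumerate(text):
        cat = unicodedata.category(ch)
        if cat in SUSPECT_CATEGORIES and ch not in ALLOWED_CONTROLS:
            suspects.append((i, f"U+{ord(ch):04X}", cat))
    return suspects


def scan_records(records):
    """Yield (record_id, hits) for each (record_id, text) pair with suspect characters."""
    for record_id, text in records:
        hits = find_suspect_chars(text)
        if hits:
            yield record_id, hits


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    sample = [
        (1, "Plain title"),
        (2, "Title with a vertical tab\x0b inside"),
        (3, "Title with a lone surrogate \ud800"),
    ]
    for record_id, hits in scan_records(sample):
        print(record_id, hits)
```

If a scan like this flags a record, cleaning that value and reindexing would show whether the character was the culprit; if nothing is flagged, the cause is probably elsewhere.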

  • Thanks Seth!

    I was able to reindex the objects table after switching back to SqlSearch.

    If the reindex always stops at 752/23485, does that correspond to the object at this link: CA_URL/index.php/editor/objects/ObjectEditor/Edit/object_id/752 ? If so, that object is part of a series and shows no red flags; I can't see why it would cause an error when almost identical objects don't.

    Part of this process is transferring our installation to new hardware, including a MySQL update. The original installation runs MySQL 5.5 and the new server runs MySQL 8.0.21. To transfer the DB we did a simple dump to a .sql file and a command-line import on the new server (mysql ca_database < database.sql), then ran the update script. Could this be introducing an error? Is there a better way to ensure data integrity between versions? They are both RHEL machines.

    Thanks again.

  • Hi,

    The number doesn't necessarily indicate object_id. The current versions run fine on MySQL 8. If the reindexing is dying at the same record using both Elastic and SqlSearch then it's very likely some kind of odd data issue with the record. It should be displaying the object identifier along with the record count while reindexing. Do you see anything else when it fails?



    seth

  • Hey Seth,

    The reindexing does work for SqlSearch. It completes all of the ca_objects table.

    I don't see the object identifier while reindexing. I'm using the caUtils method. Does it show the object ID if you re-index through the GUI?

    CollectiveAccess 1.7.11 (158/RELEASE) Utilities
    (c) 2013-2019 Whirl-i-Gig


    WILL INDEX [set items, list items, object representations, objects, entities, collections, sets, relationship types, users, comments, user groups, occurrences, places, storage locations, loans, movements, tours, object lots, tour stops, tags, object checkouts]

    Memory: 78.00M                                    100.0% 63804/63804 ETC: < 1 sec. Elapsed: 1 mins [===============================]
    Memory: 78.00M                                   100.0% 36688/36688 ETC: < 1 sec. Elapsed: 55 secs [===============================]
    Memory: 92.00M                          100.0% 35418/35418 ETC: < 1 sec. Elapsed: 55 mins, 10 secs [===============================]
    Memory: 525.66M                                                                4.0% 884/23485 ETC: 48 mins, 34 secs. Elapsed: 1 mins, 54 secs [=>                             ]PHP Fatal error:  Uncaught Elasticsearch\Common\Exceptions\BadRequest400Exception in /collectiveaccess/providence/vendor/elasticsearch/elasticsearch/src/Elasticsearch/Connections/Connection.php:630
    Stack trace:
    #0 /collectiveaccess/providence/vendor/elasticsearch/elasticsearch/src/Elasticsearch/Connections/Connection.php(293): Elasticsearch\Connections\Connection->process4xxError()
    #1 /collectiveaccess/providence/vendor/react/promise/src/FulfilledPromise.php(28): Elasticsearch\Connections\Connection->Elasticsearch\Connections\{closure}()
    #2 /collectiveaccess/providence/vendor/guzzlehttp/ringphp/src/Future/CompletedFutureValue.php(55): React\Promise\FulfilledPromise->then()
    #3 /collectiveaccess/providence/vendor/guzzlehttp/ringphp/src/Core.php(341): GuzzleHttp\Ring\Future\CompletedFutureValue->then()
    #4 /collectiveaccess/providence/vendor/elasticsearch/elasticsearch/src/Elasticsearch/Connections/Connection.php(314): GuzzleHttp\Ring\Core::proxy()
    #5 /collectiveaccess/providence/vendor/elasticsearch/elasticsearch/src/Elasticsearc in /collectiveaccess/providence/vendor/elasticsearch/elasticsearch/src/Elasticsearch/Connections/Connection.php on line 630

    The ES logs aren't giving me any clues either.

    Thanks!

  • Ok, sorry. I had read your previous email as it failing across the board. Regarding idno, it's in the CLI but perhaps not in the version you're running. In the text above it's saying record 884... does it change every time, or is it always the same record?

  • No worries Seth. The number changes depending on which command I run.

    If I run ./bin/caUtils rebuild-search-index -t ca_objects , it stops at 752/23485 every time.

    If I run ./bin/caUtils rebuild-search-index , it stops at 884 or sometimes 960. I've run both commands today and got 752 and 884 respectively.
