solr - SolrCloud indexing issue




We are facing an indexing issue in SolrCloud.
Solr version: 6.3.0
SolrCloud configuration:

A 3-node ZooKeeper ensemble and 3 Solr instances. 1 collection with 3 shards and 2 replicas. When indexing 10000 documents, the indexed document counts are shard1=3393, shard2=3351, shard3=3256.

Testing: we take down a leader replica while indexing is running. A connection refused exception is then logged continuously from the sendUpdateStream method of the ConcurrentUpdateSolrClient class. The data import reports 10000 records indexed, but when we run a match-all (*:*) query, Solr returns 6607 documents; the other 3393 documents are not in the result.
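To see which shard the missing documents belong to, each core can be queried on its own with distrib=false, so that only its local index is counted. The sketch below only builds such a URL and does not contact Solr; the host, port, and core name are hypothetical placeholders, not values taken from this cluster.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Sketch only: host, port, and core name are hypothetical placeholders.
class LocalCountUrl {

    /** Builds a select URL that counts the documents in one core's local index. */
    static String build(String hostPort, String coreName) {
        String q = URLEncoder.encode("*:*", StandardCharsets.UTF_8);
        return "http://" + hostPort + "/solr/" + coreName
                + "/select?q=" + q      // match-all query
                + "&rows=0"             // only numFound is needed, no documents
                + "&distrib=false";     // do not fan out to the other shards
    }

    public static void main(String[] args) {
        System.out.println(build("machine3:8093", "mycollection_shard1_replica2"));
    }
}
```

Running the same query against one core of each shard and comparing numFound with the expected per-shard counts shows where the gap is.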

Example: we issued the data import command to machine1:8091. The Solr instance on machine1:8091 is responsible for document routing: it indexes some of the documents in its local index and sends the rest to their dedicated shards.

Suppose we take down the Solr instance on machine2:8092, which hosts the leader replica of shard1. The replica on machine3:8093 is then selected as the new leader.

The data import screen shows 10000 documents indexed/processed, but a search finds only 6607 indexed documents. A search on shard1 finds 0 documents: the 3393 documents of shard1 are not indexed, and connection refused exceptions are generated.
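The numbers above can be reconciled per shard. This is a small sketch using only the counts quoted in this post; the class and method names are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: reconciles the per-shard counts quoted in this post.
class ShardCountCheck {

    /** Returns, per shard, expected count minus observed count. */
    static Map<String, Integer> missingPerShard(Map<String, Integer> expected,
                                                Map<String, Integer> observed) {
        Map<String, Integer> missing = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : expected.entrySet()) {
            int seen = observed.getOrDefault(e.getKey(), 0);
            missing.put(e.getKey(), e.getValue() - seen);
        }
        return missing;
    }

    public static void main(String[] args) {
        Map<String, Integer> expected = new LinkedHashMap<>();
        expected.put("shard1", 3393);
        expected.put("shard2", 3351);
        expected.put("shard3", 3256);

        Map<String, Integer> observed = new LinkedHashMap<>();
        observed.put("shard1", 0);      // leader was taken down during indexing
        observed.put("shard2", 3351);
        observed.put("shard3", 3256);

        // prints {shard1=3393, shard2=0, shard3=0}
        System.out.println(missingPerShard(expected, observed));
    }
}
```

The whole gap of 10000 - 6607 = 3393 documents falls on shard1, which matches the shard whose leader was stopped.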

From debugging the code, we came to know the following.

In the ConcurrentUpdateSolrClient class, each document that is sent to its dedicated shard and replica is wrapped in a SolrJ request and added to the runner queue; these requests are run by a scheduler later on.

When the leader goes down, the runner queue may still hold many requests that point to the old leader, which is down. The runner tries to send these requests to the old leader and gets a connection refused exception. This happens because the pending requests are not modified even though a new leader has been selected for the shard: the runner queue keeps holding client requests that point to the old leader, not the new one. There is no code handling this issue; if an exception is generated in ConcurrentUpdateSolrClient’s sendInputStream(), it is not handled by the caller.
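The behaviour described above can be illustrated with a toy model. This is not SolrJ code: the classes and methods are hypothetical and only mimic a queue whose entries keep the target chosen at enqueue time.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

// Simplified model, not SolrJ code: every name here is hypothetical.
class StaleQueueDemo {

    /** A queued update whose target node was fixed when it was enqueued. */
    static final class PendingUpdate {
        final String docId;
        final String targetNode;

        PendingUpdate(String docId, String targetNode) {
            this.docId = docId;
            this.targetNode = targetNode;
        }
    }

    /** Drains the queue and counts the updates addressed to a node that is down. */
    static int drain(Queue<PendingUpdate> queue, Set<String> liveNodes) {
        int refused = 0;
        PendingUpdate update;
        while ((update = queue.poll()) != null) {
            if (!liveNodes.contains(update.targetNode)) {
                refused++; // stands in for a "connection refused" failure
            }
        }
        return refused;
    }

    static int simulate() {
        String oldLeader = "machine2:8092";
        String newLeader = "machine3:8093";
        Set<String> liveNodes = new HashSet<>();
        liveNodes.add(oldLeader);
        liveNodes.add(newLeader);

        Queue<PendingUpdate> queue = new ArrayDeque<>();
        for (int i = 1; i <= 3; i++) {
            queue.add(new PendingUpdate("doc-" + i, oldLeader)); // queued before the failure
        }

        liveNodes.remove(oldLeader); // old leader goes down, a new leader takes over

        for (int i = 4; i <= 5; i++) {
            queue.add(new PendingUpdate("doc-" + i, newLeader)); // routed after the election
        }
        return drain(queue, liveNodes);
    }

    public static void main(String[] args) {
        System.out.println("refused updates: " + simulate()); // prints refused updates: 3
    }
}
```

In the model, only the three updates queued before the failure are lost; anything routed after the election reaches the new leader.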

We have tried to solve this issue by modifying the ConcurrentUpdateSolrClient class. We added code that modifies the current client request when an IOException is generated in ConcurrentUpdateSolrClient’s sendInputStream method and resubmits the current request to the new leader node.
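The general shape of such a change might look like the sketch below. It is not the actual patch and does not use SolrJ: Sender, the leader lookup, and sendWithReroute are hypothetical names standing in for the real send call and for whatever mechanism reports the current shard leader.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.function.Supplier;

// Sketch of "re-resolve the leader and resubmit"; all names are hypothetical.
class RerouteSketch {

    /** Stand-in for the code that actually ships one update to one node. */
    interface Sender {
        void send(String url, String docId) throws IOException;
    }

    /**
     * Tries to send docId to initialUrl. On an IOException it asks
     * currentLeader for the leader's URL and retries there, for at most
     * maxAttempts sends in total. Returns the URL that accepted the update.
     */
    static String sendWithReroute(String docId, String initialUrl,
                                  Supplier<String> currentLeader,
                                  Sender sender, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        String url = initialUrl;
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                sender.send(url, docId);
                return url;
            } catch (IOException e) {
                last = e;
                url = currentLeader.get(); // the pending request is re-pointed here
            }
        }
        throw new UncheckedIOException("gave up on " + docId, last);
    }

    public static void main(String[] args) {
        Sender sender = (url, id) -> {
            if (url.contains("machine2:8092")) {
                throw new IOException("Connection refused");
            }
        };
        String used = sendWithReroute("doc-1", "http://machine2:8092/solr",
                () -> "http://machine3:8093/solr", sender, 3);
        System.out.println(used); // prints http://machine3:8093/solr
    }
}
```

A retry like this only covers failures that surface as an exception on the sending side, and it has to give up eventually, which is why the attempt count is bounded.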

By modifying the code we are able to index the failed documents, but some documents are still missing: we get 9850 documents when searching.
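One way to narrow down the remaining 150 documents (10000 - 9850) is to compare the ids that were submitted with the ids that a search returns. A minimal sketch, assuming both id sets have already been collected by some other means; the class name and the sample ids are made up.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Illustrative only: how the two id sets are collected is outside this sketch.
class MissingIds {

    /** Returns the ids that were submitted but are not in the index, sorted. */
    static Set<String> missing(Set<String> submitted, Set<String> indexed) {
        Set<String> result = new TreeSet<>(submitted);
        result.removeAll(indexed);
        return result;
    }

    public static void main(String[] args) {
        Set<String> submitted = new HashSet<>(Arrays.asList("doc-1", "doc-2", "doc-3"));
        Set<String> indexed = new HashSet<>(Arrays.asList("doc-1", "doc-3"));
        System.out.println(missing(submitted, indexed)); // prints [doc-2]
    }
}
```

Knowing which ids are missing makes it possible to check which shard they route to and whether they were in flight at the moment the leader changed.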

Note: the schema file has 140 fields, of text_general type, and 5 copy fields. One of them is a metadata field that contains the metadata information of the document, and one is full_text, which contains the text of a file of 10 to 500 MB.




