logging - Elasticsearch cluster design for ~200G logs a day




I've created an ES cluster (version 5.4.1) with 4 data nodes, 3 master nodes, and 1 client node (for Kibana).

The data nodes are r4.2xlarge AWS instances (61 GB memory, 8 vCPUs), with 30 GB of memory allocated to the ES Java heap.

We're writing around 200 GB of logs every day and keep the last 14 days.

I'm looking for recommendations on how to improve our cluster's performance, especially search performance (Kibana).

More data nodes? More client nodes? Bigger nodes? More replicas? Anything that can improve performance is an option.

Does anyone have a similar design or load? I'll be happy to hear about other designs and loads.

Thanks, Moshe

  1. How many shards are you using? The default of 5? That would be a pretty reasonable number. Depending on who you ask, a shard should be between 10 GB and 50 GB; for a logging use-case, rather on the 50 GB side.
  2. Which queries do you want to speed up? Do they mostly target recent data or long time spans? If you are mostly interested in recent data, you could use different node types in a hot-warm architecture: more powerful nodes that hold the recent data and less of it, with the bulk of the older, less frequently accessed data on less powerful nodes.
  3. Generally, you'll need to find your bottleneck. I'd get the free monitoring plugin and take a look at how both Kibana and Elasticsearch are doing.
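As a rough back-of-the-envelope check for point 1, the shard size and total footprint can be estimated from the numbers in the question. This is only a sketch under assumptions that the question does not state: one index per day, 1 replica, and an on-disk index size roughly equal to the raw log volume. The function names are made up for illustration.

```python
import math


def shards_per_index(daily_gb: float, target_shard_gb: float) -> int:
    """Primary shards needed so that each shard stays at or below the target size."""
    return max(1, math.ceil(daily_gb / target_shard_gb))


def cluster_footprint(daily_gb: float, retention_days: int, primaries: int,
                      replicas: int, data_nodes: int) -> dict:
    """Estimate shard size, shard count, and disk usage.

    Assumes one index per day and that on-disk size is about the raw volume;
    both are assumptions, not facts from the question.
    """
    copies = 1 + replicas  # each primary plus its replicas
    total_shards = retention_days * primaries * copies
    total_gb = daily_gb * retention_days * copies
    return {
        "shard_size_gb": daily_gb / primaries,
        "total_shards": total_shards,
        "shards_per_node": total_shards / data_nodes,
        "disk_gb_total": total_gb,
        "disk_gb_per_node": total_gb / data_nodes,
    }


if __name__ == "__main__":
    # 200 GB/day with a 50 GB target shard size
    print(shards_per_index(200, 50))
    # 200 GB/day, 14 days, 5 primaries, 1 replica (assumed), 4 data nodes
    print(cluster_footprint(200, 14, 5, 1, 4))
```

Under these assumptions, 5 primaries per daily index gives shards of about 40 GB, which sits inside the 10 GB to 50 GB range mentioned above.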

Wild guess: you are limited on IO. Prefer local disks over EBS, prefer SSDs over spinning disks, and, if you can, get as many IOPS as you can afford for this use-case.
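The hot-warm idea from point 2 is usually expressed through shard allocation filtering: each data node is tagged with a custom attribute, and each index gets an `index.routing.allocation.require.<attribute>` setting. The sketch below only builds the settings bodies you might send to each index's `_settings` endpoint. The attribute name `box_type`, the `logstash-YYYY.MM.dd` index naming, and the 2-day hot window are assumptions for illustration, not something stated in the question.

```python
from datetime import date, timedelta

HOT_DAYS = 2  # assumption: how many recent days stay on the powerful nodes


def index_name(day: date, prefix: str = "logstash-") -> str:
    """Daily index name; the logstash-YYYY.MM.dd pattern is an assumption."""
    return f"{prefix}{day:%Y.%m.%d}"


def allocation_settings(tier: str) -> dict:
    """Index settings body that pins an index to nodes tagged with the given tier.

    'box_type' is an arbitrary node attribute name; the nodes would need a
    matching 'node.attr.box_type: hot' (or 'warm') line in elasticsearch.yml.
    """
    return {"index.routing.allocation.require.box_type": tier}


def plan_tiers(today: date, retention_days: int = 14,
               hot_days: int = HOT_DAYS) -> dict:
    """Map every retained daily index to the settings for its tier."""
    plan = {}
    for age in range(retention_days):
        day = today - timedelta(days=age)
        tier = "hot" if age < hot_days else "warm"
        plan[index_name(day)] = allocation_settings(tier)
    return plan


if __name__ == "__main__":
    for name, settings in plan_tiers(date.today()).items():
        print(name, settings)
```

Changing the setting on an existing index causes Elasticsearch to relocate its shards to matching nodes, so a daily job that applies the "warm" settings to indices older than the hot window would be one way to implement the split.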




