Optimal way to set up ELK stack on three servers
I am looking to set up an ELK stack and have three servers to do so. While I have found plenty of documentation and tutorials about how to actually install, and configure elasticsearch, logstash, and kibana, I have found less information about how I should set up the software across my servers to maximize performance. For example, would it be better to set up elasticsearch, logstash, and kibana on all three instances, or perhaps install elasticsearch on two instances and logstash and kibana on the third? Related to that question, if i have multiple elasticsearch servers in my cluster, will I need a load balancer to spread requests to them, or can I send the data to one server, and it will distribute it accordingly?
The size of your machines would also be important. Three machines with 8GB of RAM is much different than three with 64GB or more... Kibana takes very few resources. Logstash is more CPU-heavy. Elasticsearch is more RAM heavy. With an elasticsearch cluster, you usually want a replica of each shard for redundancy. That's usually done with two servers. If you have a third elasticsearch server, then you'll get an IO boost (writing two copies of the data to three servers lowers the load). Also, an even number of servers can get confused as to which is the master, so three will help prevent "split brain" problems. Those two or three nodes would be "data" nodes, so if you throw queries or indexing requests at them, they may need to move the request to a different server (the one with the data, etc). A request also has a "reduce" phase, where the data from each node is combined before being returned. Having a smaller "client" node - where queries and index requests go - helps with that. Of course, you'd want two, to make them redundant. Logstash is best run multithreaded, so having multiple cpus that you can dedicate is nice. Having a redundant/load-balanced logstash machine is also nice. Kibana could run on these machines as well. So, we're quickly up to 7 machines. Not what you wanted to hear, right? If you're firmly limited to 3 machines, you'd want to run elasticsearch on all three as mentioned above. You need to shoehorn in the rest. Logstash on two, kibana on one? Then you have a single point of failure for kibana. How about logstash on all three and kibana on all three? The load would be distributed around, so hopefully would be a small increment for each server. And, if the machines are beefy enough, it should be OK. I have machines in one cluster that run logstash, The general recommendation is to allocate 1/2 the system RAM (up to ~31GB) to elasticsearch, leaving the rest to the operating system. If you were going to run logstash and kibana on the same machines, you'd want to lower that (to maybe 40%?), give logstash some (15%?) and leave the rest to the OS. Clearly, the size of your machines is important here.
aws cloudsearch/lucene query street names
Getting cardinality of multiple fields?
Aggregating a Key/Value list in ElasticSearch
“reverse cardinality” in elasticsearch?
ElasticSearch- Using Fields doesn't return any documents on Nest
Analyzer to find , e.g: “starbucks” when mistakenly querying “star bucks”
Elasticsearch - boost document based on field's specific value
How to get elasticsearch most used words?
Umlaut in Elastic Suggesters
Index creation move elastic search cluster to red
Is there multiword synonyms with slop in ES
Elasticsearch: How do you delete a mapping type without deleting an entire index?
What are the performance drawbacks of flat documents vs. nested ones?
Template does not exists for cookbook elasticsearch
Elastic search hits are N but returned results are much less
Is it possible to search nested objects in ElasticSearch with the lucene query syntax?