In a single-machine deployment, one instance of Splunk handles the entire end-to-end process, from data input through indexing to search. A single-machine deployment can be useful for testing and evaluation, and might serve the needs of department-sized environments. For larger environments, where data originates on many machines and many users need to search the data, you’ll want to distribute functionality across multiple instances of Splunk.

How Splunk Scales

Splunk performs three key functions as it moves data through the data pipeline:
* First, Splunk consumes data from files, the network, and elsewhere.
* It then indexes the data. (Actually, it first parses and then indexes the data, but for purposes of this discussion, we consider parsing to be part of the indexing process.)
* Finally, it runs interactive or scheduled searches on the indexed data.
This functionality can be split across multiple specialized instances of Splunk, ranging in number from just a few to thousands, depending on the quantity of data you’re dealing with and other variables in your environment.
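To make the three functions concrete, here is a minimal Python sketch of a toy pipeline: consume raw lines, parse them into events, index them, and search the index. It is only an illustration under simplifying assumptions. The function names (`consume`, `parse`, `index`, `search`) and the whitespace-based term splitting are hypothetical and are not Splunk's API or its actual indexing format.

```python
from collections import defaultdict

def consume(lines):
    """Input stage: yield raw, non-empty lines from any iterable source."""
    for line in lines:
        line = line.strip()
        if line:
            yield line

def parse(raw_lines):
    """Parsing stage: turn each raw line into an event with an id and terms."""
    for n, raw in enumerate(raw_lines):
        yield {"id": n, "raw": raw, "terms": set(raw.lower().split())}

def index(events):
    """Indexing stage: store raw events and map each term to event ids."""
    store, inverted = {}, defaultdict(set)
    for event in events:
        store[event["id"]] = event["raw"]
        for term in event["terms"]:
            inverted[term].add(event["id"])
    return store, inverted

def search(store, inverted, term):
    """Search stage: return raw events containing the term, in input order."""
    return [store[i] for i in sorted(inverted.get(term.lower(), ()))]

# Hypothetical sample data standing in for files or network input.
raw = ["ERROR disk full", "", "INFO login ok", "ERROR login failed"]
store, inverted = index(parse(consume(raw)))
hits = search(store, inverted, "error")
print(hits)
```

Because each stage only depends on the output of the previous one, the stages could in principle run in separate processes or on separate machines, which is the idea behind splitting the work across specialized instances.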
You might, for example, create a deployment with many Splunk instances that only consume data, several other instances that index the data, and one or more instances that handle search requests.
The specialized instances of Splunk are known collectively as components. There are several types of components. For a typical mid-size deployment, for example, you can deploy lightweight versions of Splunk, called forwarders, on the machines where the data originates. The forwarders consume data locally and then forward it across the network to another Splunk component, called the indexer.
The indexer does the heavy lifting: it indexes the data and runs searches. It should reside on a machine by itself. The forwarders, on the other hand, can easily coexist on the machines generating the data, because the data-consuming function has minimal impact on machine performance.
As you scale up, you can add more forwarders and indexers. For a larger deployment, you might have hundreds of forwarders sending data to a number of indexers. You can configure load balancing on the forwarders, so that they distribute their data across some or all of the indexers.
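The load-balancing idea above can be illustrated with a small Python simulation in which a forwarder rotates through its indexers and skips any that are unavailable. This is a simplified round-robin sketch, not Splunk's actual load-balancing algorithm or configuration. The `Indexer` and `Forwarder` classes, the `alive` flag, and the indexer names are all hypothetical.

```python
from itertools import cycle

class Indexer:
    """Hypothetical indexer that accepts events while it is alive."""
    def __init__(self, name):
        self.name = name
        self.alive = True
        self.events = []

    def receive(self, event):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        self.events.append(event)

class Forwarder:
    """Hypothetical forwarder that rotates through a list of indexers."""
    def __init__(self, indexers):
        self.indexers = list(indexers)
        self._ring = cycle(self.indexers)

    def send(self, event):
        # Try each indexer at most once per event, continuing the rotation.
        for _ in range(len(self.indexers)):
            target = next(self._ring)
            try:
                target.receive(event)
                return target.name
            except ConnectionError:
                continue  # this indexer is down; move on to the next one
        raise RuntimeError("no indexers available")

indexers = [Indexer("idx1"), Indexer("idx2"), Indexer("idx3")]
fwd = Forwarder(indexers)

first = [fwd.send(f"event-{i}") for i in range(3)]
indexers[1].alive = False  # simulate one indexer going down
later = [fwd.send(f"event-{i}") for i in range(3, 7)]
print(first, later)
```

With all indexers up, events are spread evenly across them. Once `idx2` is marked down, the forwarder keeps delivering to the remaining two without any change on the sending side.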
Not only does load balancing help with scaling, but it also provides a failover capability if one of the indexers goes down: the forwarders automatically switch and start sending their data to any indexers that remain alive.
To coordinate search activities across multiple indexers, you can also separate out the functions of indexing and searching. In this type of deployment, called distributed search, the indexers might just index data. A Splunk instance dedicated to searching, called the search head, then runs searches across the set of indexers, consolidating the results and presenting them to the user.
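The search head's role can be sketched as a scatter-gather pattern: send the same query to every indexer, then merge the partial results. The following Python sketch assumes in-memory indexers and a plain substring match. The `Indexer` and `SearchHead` classes, the `(timestamp, text)` event shape, and the newest-first ordering are hypothetical choices for illustration, not how Splunk implements distributed search.

```python
from concurrent.futures import ThreadPoolExecutor

class Indexer:
    """Hypothetical indexer holding (timestamp, text) events in memory."""
    def __init__(self, name, events):
        self.name = name
        self.events = events

    def search(self, term):
        # Return matching events, tagged with the indexer they came from.
        return [(ts, text, self.name) for ts, text in self.events if term in text]

class SearchHead:
    """Hypothetical search head that fans a query out and merges results."""
    def __init__(self, indexers):
        self.indexers = indexers

    def search(self, term):
        # Query all indexers concurrently, one worker thread per indexer.
        with ThreadPoolExecutor(max_workers=len(self.indexers)) as pool:
            partials = list(pool.map(lambda ix: ix.search(term), self.indexers))
        # Consolidate the partial results, newest event first.
        merged = [hit for partial in partials for hit in partial]
        return sorted(merged, key=lambda hit: hit[0], reverse=True)

head = SearchHead([
    Indexer("idx1", [(1, "login ok"), (4, "login failed")]),
    Indexer("idx2", [(2, "disk full"), (3, "login failed")]),
])
results = head.search("failed")
print(results)
```

The user sees a single, merged result list and does not need to know which indexer holds which events.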
Splunk. (2016, Oct 04). Retrieved from https://graduateway.com/splunk/