The Graylog community is what makes the product so exciting. It is awesome to see our community members take the time to help one another on our community forums, Twitter, Reddit, or their own private channels. I want to highlight a blog post by community member BlueTeamNinja (aka Big Abe) who, after tackling a Graylog deployment, shared his lessons learned from the perspective of a non-Linux, non-ELK person.
He originally reached out to the community for advice on setting up a Graylog deployment, because what he was reading was not getting him to the right starting point. After figuring it all out, BlueTeamNinja did a great job of writing up an overview of Graylog and sharing some great pro tips. I want to expand on the sections of his article with additional thoughts and considerations for when you are first getting started.
I recommend starting with Big Abe’s blog post here.
INPUTS
Inputs can be tied to an individual Graylog node, and to a specific network interface on that node, if you want to be strict about where Graylog listens for incoming data. Additionally, if you have a cluster of Graylog nodes and want a syslog listener on all of them, you can select the “Global” option at the top of the input to deploy it everywhere.
If you would like to encrypt communications from your agents, you can enable TLS by following the guide in our docs.
You can also add a static field to every log collected by an input to help with processing and visualization down the line. For example, you could add a “from_paloalto” tag on an input that accepts all your firewall data, then use a stream rule to route those messages to their own stream.
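To make the static field and stream rule idea concrete, here is a small conceptual sketch in Python. It is not Graylog code or its API. The field name `tag`, the stream name “Palo Alto Firewalls”, and both helper functions are hypothetical, and the sketch assumes a field already on the message takes precedence over a static field with the same name.

```python
from typing import Any, Dict

Message = Dict[str, Any]


def add_static_fields(message: Message, static_fields: Message) -> Message:
    """Return a copy of the message with the input's static fields attached.

    Assumption of this sketch: a field already present on the message wins
    over a static field with the same name.
    """
    enriched = dict(static_fields)
    enriched.update(message)  # existing message fields take precedence
    return enriched


def route(message: Message, field: str, value: str, stream: str,
          default_stream: str = "Default Stream") -> str:
    """Toy stream rule: an exact match on one field selects the stream."""
    return stream if message.get(field) == value else default_stream


# Hypothetical firewall log arriving on an input configured with a static field.
firewall_log = add_static_fields(
    {"message": "deny tcp 10.0.0.5 -> 203.0.113.7"},
    {"tag": "from_paloalto"},
)
target = route(firewall_log, "tag", "from_paloalto", "Palo Alto Firewalls")
print(target)
```

In Graylog itself you would configure both pieces in the web interface: the static field on the input, and the matching rule on the stream.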
STREAMS
Streams allow for different index retention times, because each stream is tied to an index set. Along with retention times, you can apply role-based access control to each stream through the Authentication Management module. For example, you can create roles for Windows administrators and network administrators, and give each role access only to the streams that hold its relevant data.
Streams can also have their own outputs, if you would like to take one stream of log data and send it to a third-party system for further use.
INDICES
Indices, or index sets, are groupings of logs saved on the Graylog server/node. Many streams can write to one index set, or you can have one index set per stream. This really comes down to your retention strategy, your desire to segregate your logs, and performance as you grow. Each index set allows for replicas (backup copies) if the data is highly critical and, with Enterprise features, lets you archive your logs for long-term storage. You can also select your desired retention length and type of compression to save on disk space.
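When weighing retention length against replicas, a rough back-of-the-envelope estimate can help. The sketch below is only an approximation under stated assumptions: the function name, the example numbers, and the `overhead_factor` parameter (a stand-in for compression savings or indexing overhead) are all hypothetical, and real disk usage depends on your data and settings.

```python
def estimate_index_set_storage_gb(
    daily_ingest_gb: float,
    retention_days: int,
    replicas: int = 0,
    overhead_factor: float = 1.0,
) -> float:
    """Rough disk estimate for one index set.

    Each replica is a full extra copy, so stored copies = 1 + replicas.
    overhead_factor is a hypothetical multiplier: below 1.0 models
    compression savings, above 1.0 models indexing overhead.
    """
    if daily_ingest_gb < 0 or retention_days < 0 or replicas < 0:
        raise ValueError("inputs must be non-negative")
    if overhead_factor <= 0:
        raise ValueError("overhead_factor must be positive")
    return daily_ingest_gb * retention_days * (1 + replicas) * overhead_factor


# Hypothetical example: 10 GB/day, kept for 30 days, with one replica.
print(estimate_index_set_storage_gb(10, 30, replicas=1))  # 600.0
```

Running the same numbers per index set makes it easier to see whether a high-volume stream deserves its own index set with a shorter retention time.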
PIPELINES
Pipelines allow for full message processing, from cleaning up the logs to adding lookup table data and changing field names. While the article lists many possibilities for pipelines, I want to add that pipelines also let you enrich your data with geolocation information, threat intelligence data, and local lookup table data. Please look at these blog posts for further details on pipelines: Windows Threat Intel
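As a conceptual illustration of lookup-based enrichment and field renaming, here is a short Python sketch. Real Graylog pipelines are written in Graylog's own rule language, not Python. Every name here (the `src_ip` and `threat_category` fields, the lookup table contents, and both helper functions) is hypothetical and chosen only for the example.

```python
from typing import Any, Dict

Message = Dict[str, Any]

# Hypothetical local lookup table: IP address -> threat category.
# 203.0.113.0/24 is a range reserved for documentation.
THREAT_LOOKUP: Dict[str, str] = {"203.0.113.7": "known-scanner"}


def rename_field(message: Message, old: str, new: str) -> Message:
    """Return a copy of the message with one field renamed, if present."""
    renamed = dict(message)
    if old in renamed:
        renamed[new] = renamed.pop(old)
    return renamed


def enrich_from_lookup(message: Message, lookup: Dict[str, str],
                       source_field: str, target_field: str) -> Message:
    """Return a copy with target_field set when source_field hits the table."""
    enriched = dict(message)
    key = message.get(source_field)
    if key in lookup:
        enriched[target_field] = lookup[key]
    return enriched


raw = {"SourceAddress": "203.0.113.7", "message": "connection attempt"}
step1 = rename_field(raw, "SourceAddress", "src_ip")
step2 = enrich_from_lookup(step1, THREAT_LOOKUP, "src_ip", "threat_category")
print(step2)
```

The two steps mirror the stages of a pipeline: normalize field names first, so later stages can rely on a consistent field to look up.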
CONCLUSION
I can’t say thank you enough to our great community for putting out articles like the one above, and to Big Abe at blueteam.ninja for writing down all his experience and knowledge for everyone to use. Please join our community forum or follow the r/graylog subreddit for more great help from your fellow Graylog users.