Use the puppetlabs-puppet_metrics_collector module to get data that helps you diagnose performance issues, including failed catalog compilations and a growing command queue in PuppetDB.
Version and installation information
PE version: All supported
Note: Links to our documentation are for the latest version of Puppet Enterprise. Use the right version for your deployment.
Installation type: All supported
The first step in fixing performance issues is getting data. You can get metrics from Puppet Server and PuppetDB using the puppetlabs-puppet_metrics_collector module, which is installed by default with PE. You can use the module’s output to measure and optimize two key metrics in your deployment: average-free-jrubies for Puppet Server and queue_depth for PuppetDB.
You can view data gathered by the module as text, or as graphs if you install additional tools. If you aren’t able to install tools right away, view the data as text output; if you are, graphs make it easier to see trends.
View data as text
Metrics are organized in a directory structure like the following, with one directory per service and host and one timestamped JSON file per collection:
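/opt/puppetlabs/puppet-metrics-collector/
  puppetserver/
    server.example.com/
      20190404T170501Z.json
      20190404T171001Z.json
  puppetdb/
    server.example.com/
      20190404T170501Z.json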
By default, each JSON metrics file is a single line, which makes it difficult to read. Prettify the files to make them easier to read.
To use jq to non-destructively improve formatting:
cd /opt/puppetlabs/puppet-metrics-collector
cat <SERVICE>/<HOSTNAME>/*.json | jq '.' | grep <QUERY>
To use Python’s json.tool to improve formatting (unlike the jq example, this rewrites the files in place):
cd /opt/puppetlabs/puppet-metrics-collector
for i in <SERVICE>/<HOSTNAME>/*.json; do echo "$(python -m json.tool < $i)" > $i; done
grep <QUERY> <SERVICE>/<HOSTNAME>/*.json
View data as graphs
You can configure the puppetlabs-puppet_metrics_dashboard module to display metrics as graphs.
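If you manage the dashboard host with Puppet, here is a minimal sketch of getting started. It assumes the module’s main class is named puppet_metrics_dashboard and that its default parameters suit your environment; check the module’s README before relying on either:
# Install the module from the Forge on a monitoring node outside your PE infrastructure
puppet module install puppetlabs-puppet_metrics_dashboard
# Declare the module's main class with default parameters
puppet apply -e 'include puppet_metrics_dashboard'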
Puppet Server metrics
Your Puppet Server infrastructure must be able to provide enough JRuby instances for the amount of activity it handles. To process a request, at least one JRuby instance must be free; otherwise, the request waits. That makes average-free-jrubies one of the most important metrics for Puppet Server. See that metric over time by running:
grep average-free-jrubies puppetserver/<HOSTNAME>/*.json
Example metrics output:
cd /opt/puppetlabs/puppet-metrics-collector
grep average-free-jrubies puppetserver/server.example.com/*.json
puppetserver/server.example.com/20190404T170501Z.json:    "average-free-jrubies": 0.9950009285369501,
puppetserver/server.example.com/20190404T171001Z.json:    "average-free-jrubies": 0.9999444653324225,
puppetserver/server.example.com/20190404T171502Z.json:    "average-free-jrubies": 0.9999993830655706,
If average-free-jrubies is consistently below one, improve performance by increasing the max-active-instances setting, which raises the number of JRubies available to Puppet Server.
The default recommendation for this setting is the number of CPUs minus one, but it varies based on your deployment. You can use the puppet infrastructure tune command with the --estimate flag to find the number of JRubies needed for your installation.
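For example, on the primary server (output varies by deployment; nproc is just one way to check the CPU count):
# See how many CPUs are available before tuning JRuby counts
nproc
# Print recommended tuning settings, including JRuby capacity, for this deployment
puppet infrastructure tune --estimate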
Be careful not to set the number of JRubies so high that Puppet Server competes with PuppetDB or console services for resources. Try increasing max-active-instances by one and watch CPU utilization in conjunction with Puppet Server metrics. Depending on how many CPUs are allocated to your primary server, you might need to add CPUs before you can further increase JRubies.
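In PE, one way to raise this setting is with the puppet_enterprise::master::puppetserver::jruby_max_active_instances parameter in Hiera; a sketch, where the value 5 and the common.yaml location are only examples:
# In your control repo's Hiera data (for example, data/common.yaml):
puppet_enterprise::master::puppetserver::jruby_max_active_instances: 5
# Then run the agent on the primary server to apply the change:
puppet agent -t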
When you increase the number of JRubies, increase the Puppet Server heap size linearly, using the steps in our documentation. For example:
- 3 JRubies: 2GB Java heap
- 4 JRubies: 2.66GB = 2GB + (2GB / 3 JRubies)
- 6 JRubies: 5GB Java heap
- 7 JRubies: 5.83GB = 5GB + (5GB / 6 JRubies)
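In PE, the Puppet Server heap is set through its Java arguments; a sketch assuming the puppet_enterprise::profile::master::java_args Hiera parameter, with values matching the 7-JRuby example above (5.83GB is roughly 5970m):
# In your control repo's Hiera data (for example, data/common.yaml):
puppet_enterprise::profile::master::java_args:
  Xmx: '5970m'
  Xms: '5970m'
# Then run the agent on the primary server to apply the change:
puppet agent -t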
PuppetDB metrics
One of the most important metrics for PuppetDB is queue_depth, also referred to as the Command Queue in the PuppetDB dashboard.
See that metric over time by running:
grep queue_depth puppetdb/<HOSTNAME>/*.json
Example metrics output:
cd /opt/puppetlabs/puppet-metrics-collector
grep queue_depth puppetdb/server.example.com/*.json
puppetdb/server.example.com/20170404T170501Z.json: "queue_depth": 0,
puppetdb/server.example.com/20170404T171001Z.json: "queue_depth": 0,
puppetdb/server.example.com/20170404T171502Z.json: "queue_depth": 0,
The queue_depth metric should stay around zero, or slightly above zero in busy systems. The queue_depth can ebb and flow, but it should not show a pattern of consistent growth. If queue_depth is continually growing, increase your command processing threads.
The number of command processing threads defaults to half the number of CPUs (CPUs/2). If PuppetDB is on its own node, increase the number of command processing threads.
If PuppetDB is on the same node as the primary server, increase the number of CPUs available to the primary server to provide enough resources to increase the number of command processing threads.
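In PE, one way to adjust the thread count is with the puppet_enterprise::puppetdb::command_processing_threads parameter in Hiera; a sketch, where the value 4 and the common.yaml location are only examples:
# In your control repo's Hiera data (for example, data/common.yaml):
puppet_enterprise::puppetdb::command_processing_threads: 4
# Then run the agent on the node that hosts PuppetDB:
puppet agent -t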
You might want to collect metrics on compiler nodes. PuppetDB is installed on all compilers, so you can collect PuppetDB metrics there. By default, you can’t get metrics data from compilers remotely for the following reasons:
- The /metrics/v1 endpoint is disabled by default.
- Access to the /metrics/v2 endpoint is restricted to localhost.
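You can still read the v2 endpoint locally on the compiler itself; a sketch, assuming PuppetDB is listening on its default cleartext port of 8080:
# Run on the compiler; lists the metric namespaces that PuppetDB exposes
curl http://localhost:8080/metrics/v2/list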
If you have questions about tuning your deployment, open a support ticket. Please attach the following files to help us troubleshoot your performance issues and resolve them quickly:
- Metrics output from the module
- A support script tarball