Mitigate a thundering herd by spreading agents out when too many check in at once.
Version and installation information
PE version: All supported versions
When too many agents check in at once, configure the pe-puppetserver service to return 503 Service Unavailable error responses with randomized Retry-After headers. Each agent sleeps for the amount of time set in the Retry-After field and then checks in again, breaking up the herd. For example, set the following parameters:
"puppet_enterprise::master::puppetserver::jruby_puppet_max_queued_requests": 48
"puppet_enterprise::master::puppetserver::jruby_puppet_max_retry_delay": 600
The jruby_puppet_max_queued_requests setting limits the number of requests that can wait in the queue before pe-puppetserver starts sending 503 responses to spread agents out. Set it based on the number of JRuby workers that Puppet Server is running: start with a limit of 12 queued requests per JRuby instance. The example above is based on the default JRuby worker pool of 4 instances (4 × 12 = 48). The maximum value for jruby_puppet_max_queued_requests is 150.
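The sizing rule above can be sketched as a small calculation. This is only an illustration of the arithmetic. The function name suggested_max_queued_requests is hypothetical and is not part of PE or Puppet Server.

```python
# Rule of thumb from the text: 12 queued requests per JRuby instance,
# never exceeding the documented maximum of 150.
REQUESTS_PER_JRUBY = 12
MAX_QUEUED_REQUESTS = 150


def suggested_max_queued_requests(jruby_instances: int) -> int:
    """Return a starting value for jruby_puppet_max_queued_requests.

    Hypothetical helper for illustration only.
    """
    if jruby_instances < 1:
        raise ValueError("jruby_instances must be at least 1")
    return min(jruby_instances * REQUESTS_PER_JRUBY, MAX_QUEUED_REQUESTS)


if __name__ == "__main__":
    print(suggested_max_queued_requests(4))   # 48, matching the example
    print(suggested_max_queued_requests(16))  # 150, because 192 exceeds the cap
```

Treat the result as a starting point and adjust it based on how the primary server behaves under load.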
The jruby_puppet_max_retry_delay setting is the upper limit, in seconds, on the time that pe-puppetserver returns in the Retry-After header on 503 responses. This limit is multiplied by a random number, so each agent sleeps for a different amount of time, which prevents a thundering herd. The example above uses a limit of 600 seconds (10 minutes).
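To see why a randomized delay breaks up the herd, the following sketch models the behavior described above. It is not Puppet Server's actual implementation: it assumes the random number is drawn uniformly between 0 and 1, and the function name retry_after_seconds is hypothetical.

```python
import random


def retry_after_seconds(max_retry_delay: int, rng: random.Random) -> int:
    """Model a Retry-After value as the limit scaled by a random fraction.

    Assumption: the random multiplier is uniform in [0, 1). The source
    states only that the limit is multiplied by a random number.
    """
    return int(max_retry_delay * rng.random())


if __name__ == "__main__":
    rng = random.Random()
    # Ten agents rejected at the same moment, with a 600-second limit.
    delays = sorted(retry_after_seconds(600, rng) for _ in range(10))
    print(delays)  # the retry times are scattered across the 10-minute window
```

Because each agent receives a different delay, the retries arrive spread across the window instead of all at once.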