After you’ve added hundreds of nodes to your deployment, you might notice that your agents are running slowly or timing out. By default in PE, an agent checks in to retrieve a catalog when the agent service starts and every 30 minutes thereafter. As a result, if you restart the Puppet agent service on all of your nodes at once, the agents might all try to check in at the same time. This can cause a thundering herd of requests, degrading CPU and memory performance.
PE version: All supported
OS: Any *nix
Installation type: Any
Solution
Check if you have a thundering herd.
If you do, use the following article to stop it and prevent it from happening again.
Use one or more of the following to prevent a thundering herd in the long term.
- Prevent a thundering herd: Run Puppet with cron or the reidmv-puppet_run_scheduler module
- Spread out agent catalog requests using splay
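
To see why spreading out check-ins helps, here is a minimal Python sketch that simulates the first check-in after every agent is restarted at the same moment. It is an illustration only, not Puppet code: the function names are hypothetical, and it assumes that splay delays each agent's first run by a random amount up to a limit equal to the 30-minute run interval.

```python
import random
from collections import Counter

# Default agent run interval described in this article: 30 minutes.
RUN_INTERVAL_S = 30 * 60


def first_checkin_offsets(num_agents, splay, splay_limit_s=RUN_INTERVAL_S, seed=0):
    """Return, per agent, the seconds between a simultaneous restart and its first check-in.

    Assumption: with splay enabled, each agent waits a random delay between
    0 and splay_limit_s before its first run. Without splay, it checks in at once.
    """
    rng = random.Random(seed)  # seeded so the simulation is repeatable
    if not splay:
        return [0] * num_agents
    return [rng.randint(0, splay_limit_s) for _ in range(num_agents)]


def peak_per_minute(offsets):
    """Return the largest number of check-ins that fall into the same one-minute bucket."""
    buckets = Counter(offset // 60 for offset in offsets)
    return max(buckets.values())


if __name__ == "__main__":
    agents = 600  # hypothetical deployment size
    print("peak without splay:", peak_per_minute(first_checkin_offsets(agents, splay=False)))
    print("peak with splay:   ", peak_per_minute(first_checkin_offsets(agents, splay=True)))
```

Without splay, every agent lands in the same minute, so the peak equals the number of agents. With splay, the same requests are distributed across the run interval, which is the effect the options above aim for.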