Yet more Subversion repository polling headaches
Either there is something very wrong with the way our repository is set up (i.e. my fault) or there is something wrong about the polling model for detecting changes in the repository.
This is the second or third time I’ve run into this issue. Granted, we have not had the most fastidious design for our repository. Still, I can’t believe that what we are doing on this small scale is so far outside what other, much larger organizations (such as open-source hosters) are doing. I just don’t think we’re that special.
The problem manifested itself today in the form of hundreds of authentications per second against Active Directory from a particular account and server. The server is our Subversion repository server, naturally, and the account is the special build server account set up to monitor and build from our product locations in the repository.
The flood was sustained over a long stretch of time and seemed to be contributing to visible load on the domain controller. If not harmful, this is at the least embarrassing and annoying for our build infrastructure, which I’m very much invested in. Nothing I know of should be causing this kind of traffic, and if it is intrinsic to the system, I should find a way to make it less demanding.
I’ve run into similar issues with other tools before. Evidently, polling a repository for changes is demanding on the server that hosts it. We had an application called SVN Monitor which was very nice. Every time a commit was made to the repository, a notification would pop up from the system tray giving you the details, which you could then inspect with a pretty nice repository browser. The polling interval could be configured as well.
Unfortunately, we would see enormous spikes in CPU usage on the SVN server when a few users would be polling for changes. Limiting the number of polled locations and frequency of polling helped performance but greatly hampered the usefulness of the tool. We eventually replaced it with the repository view from Redmine.
Back to tracking down the current issue. The SVN server (VisualSVN) is configured for secure connections with Windows Authentication. The authentication requests reaching AD from this server are triggered by SVN clients making requests of the server. Turning off the SVN server verified that it was indeed the source of the authentication requests. Only two servers were configured to poll the Subversion server using the account in question: Redmine and TeamCity, our build server.
The Redmine repository view does take some time (and therefore, presumably, effort) to refresh. It uses the Subversion command-line tools to do so. However, it should only refresh when you visit the page, so it isn’t a likely culprit unless something is very wrong.
TeamCity, on the other hand, is constantly polling to show you what changes might be pending for the next build. This was the likely culprit. Shutting it down quickly killed all of the authentication requests. That was the winner.
However, what to do about it is an issue. Obviously, it isn’t necessary to poll every 60 seconds for every VCS root in TeamCity, but there has to be a happy medium. I increased the polling interval globally to 10 minutes, which definitely helped. There are only four VCS roots, and the TeamCity VCS log shows that they are polled individually each minute (by default), so why the hundreds of authentication requests? Something about the way TeamCity polls must break up each poll into lots of small requests, each requiring authentication. This is what is nailing our domain controller.
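To put rough numbers on that mismatch, here is a back-of-the-envelope sketch in Python. The four VCS roots and the 60-second and 10-minute intervals come from the setup described above; the figure of 200 authentications per second is only a stand-in for "hundreds", and the function names are made up for illustration. It also assumes that every authentication comes from polling and that each poll fans out into the same number of requests, neither of which I have verified.

```python
def requests_per_poll(auths_per_second: float, vcs_roots: int, interval_seconds: float) -> float:
    """Authenticated requests implied per single poll of one VCS root."""
    # Each root is polled once per interval, so across all roots there are
    # vcs_roots / interval_seconds polls per second on average.
    return auths_per_second * interval_seconds / vcs_roots


def projected_auth_rate(fan_out: float, vcs_roots: int, interval_seconds: float) -> float:
    """Authentications per second if each poll keeps the same fan-out."""
    return fan_out * vcs_roots / interval_seconds


VCS_ROOTS = 4                     # from the TeamCity configuration described above
DEFAULT_INTERVAL_SECONDS = 60     # default polling interval
NEW_INTERVAL_SECONDS = 600        # the 10-minute interval
ASSUMED_AUTHS_PER_SECOND = 200.0  # hypothetical stand-in for "hundreds per second"

fan_out = requests_per_poll(ASSUMED_AUTHS_PER_SECOND, VCS_ROOTS, DEFAULT_INTERVAL_SECONDS)
projected = projected_auth_rate(fan_out, VCS_ROOTS, NEW_INTERVAL_SECONDS)

print(f"Implied requests per poll: {fan_out:.0f}")
print(f"Projected auths/second at the 10-minute interval: {projected:.1f}")
```

Under those assumptions, each poll would have to generate about 3,000 authenticated requests to explain the load, and the 10-minute interval would only bring the average down to about 20 per second. The longer interval reduces how often the polls happen, not how many requests each one makes.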
Aside from reducing the polling interval, I’m not sure what else there is to do at the moment. I haven’t posted a question on their support forum, but that’s on tap. Perhaps there is more to come.