From 17:39-18:12 UTC GitHub was down in parts of North America, particularly the US East coast, and South America.

GitHub takes measures to ensure that we have redundancy in our system for various disaster scenarios. We have been working on building redundancy to an earlier single point of failure in our network architecture at a second Internet edge facility. This second Internet edge facility was completed in January and has been actively routing production traffic since then. Today we were performing a live failover test to validate that we could in fact use this second Internet edge facility if the primary were to fail. Unfortunately, during this failover we inadvertently caused a production outage.

During the test we exposed that the secondary site had a network pathing configuration issue that prevented it from properly functioning as the primary facility. This caused issues with Internet connectivity to GitHub, ultimately resulting in an outage. We were immediately notified of the issue in our monitoring and alerting. Within two minutes of being alerted we reverted the change and brought the primary facility back online. Once online it took time for traffic to be rebalanced and for our border routers to reconverge, restoring public connectivity to affected GitHub systems.

We recognize the severity of this outage and apologize for the impact it had on our customers. This failover test helped expose the configuration issue, and we are addressing the gaps in both configuration and our failover testing, which will help make GitHub more resilient.

I installed Git on my Windows 10 a couple of months ago. My git status was very slow (up to one minute), because the global .gitignore file was located in my Windows user profile, which was stored on an inaccessible network share. git config --global core.

git status takes about 500ms to run lstat with 20 threads to check for changes to tracked files, and around 1.5s using readdir() single-threaded.

As others have stated, the issue is the FS interop. Use git.exe instead. git.exe status ran in 5 seconds and returned instantly when run again immediately afterwards. Wow, that's a huge performance difference.

git status can be pathologically slow when you have leftover index.lock files.

Scalar maximizes your Git command performance by setting recommended config values and running background maintenance. It is a .NET Core application with installers available for Windows and macOS. You can clone a repository using the GVFS protocol if your repository is hosted by Azure Repos.

Ok, this turned out to be a systems issue, unrelated to the GitLab upgrade. Specifically, it looks like “systemd-logind” got into a bad state where it was unable to terminate user sessions, so it is possibly related to one of these bugs/issues: 'loginctl terminate-session ' sometimes return error: Failed to issue method call: Resource deadlock avoided.

If you are seeing similar behaviour, you should be able to diagnose whether this is the issue by running “loginctl list-sessions” and checking for a large number of sessions from the “git” user (or whatever user GitLab SSH runs as), for a start. Running “systemctl status systemd-logind” should also show the process in a restart loop (failing Watchdog/Health checks and being restarted constantly), as well as errors being thrown when attempting to terminate sessions manually with “loginctl terminate-session”.
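
As a rough illustration of the “loginctl list-sessions” check described above, here is a minimal Python sketch that counts sessions per user so that a pile-up under one account stands out. The helper names, the sample listing and the user names in it are hypothetical, and the column layout (SESSION, UID, USER, ...) is assumed from typical loginctl output and may differ between systemd versions.

```python
import subprocess
from collections import Counter


def count_sessions_by_user(listing: str) -> Counter:
    """Count sessions per user in `loginctl list-sessions` output."""
    counts = Counter()
    for line in listing.splitlines():
        fields = line.split()
        # Assumed row layout: SESSION UID USER [SEAT] [TTY] ...
        # The header and the "N sessions listed." footer have no numeric
        # second field, so they are skipped by the isdigit() check.
        if len(fields) >= 3 and fields[1].isdigit():
            counts[fields[2]] += 1
    return counts


def current_sessions() -> str:
    """Return the live session listing (requires a systemd host)."""
    result = subprocess.run(
        ["loginctl", "list-sessions", "--no-legend"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Hypothetical sample listing, used so the sketch runs anywhere.
SAMPLE = """\
SESSION  UID USER  SEAT  TTY
      1 1000 alice seat0 tty1
   c101  998 git
   c102  998 git
   c103  998 git

4 sessions listed.
"""

if __name__ == "__main__":
    for user, n in count_sessions_by_user(SAMPLE).most_common():
        print(f"{user}: {n}")
```

On an affected host, passing the result of current_sessions() to count_sessions_by_user() instead of the sample should show whether the “git” user (or whichever user GitLab SSH runs as) holds an unusually large number of sessions.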