As a developer, if you have discovered that you have just exposed a sensitive file or secrets to a public git repository, there are some very important steps to follow.
What is a secret? In this document, the term secret refers to anything that is used to authenticate or authorize ourselves. The most common are API keys, database credentials, and security certificates.
The first step, breathe: in most circumstances, if you follow this guide carefully, it will only take a few minutes for you to nullify most of the potential damage. This post will go through the four steps needed to remove the risk and make sure it doesn't happen in the future.
1. Revoke the secret or credentials
2. (Optional) Permanently delete all evidence of the leak
3. Check access logs for intruders
4. Implement future tools and best practices
If I delete the file or repository, then I’m safe, right?
Unfortunately, no. If you make the repository private or delete the files, you can reduce the risk of someone new discovering your leak, but the reality is that your files will very likely still exist for those who know where to look. Git keeps a history of everything you do, so even if you delete the file, it will still exist in your git history. Even if you make the repository private, delete your history, or delete the entire repository, your secrets are still compromised.
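To see why deleting the file is not enough, the sketch below builds a throwaway repository in a temporary directory, commits a config.py containing a made-up key, deletes the file in a second commit, and then reads the key back from history. The key value, commit messages, and identity settings are placeholders chosen for the demo.

```shell
# Throwaway demo: a deleted file is still readable from git history.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
git init -q .

# Placeholder identity so the commits work on a machine with no git config.
git config user.name "Demo User"
git config user.email "demo@example.com"

# Commit a file that contains a fake secret.
echo 'API_KEY = "not-a-real-key-123"' > config.py
git add config.py
git commit -q -m "add config"

# "Fix" the leak by deleting the file in a new commit.
git rm -q config.py
git commit -q -m "remove config"

# The file is gone from the working tree...
ls config.py 2>/dev/null || echo "config.py is not in the working tree"

# ...but the previous commit still holds its full contents.
recovered=$(git show HEAD~1:config.py)
echo "recovered from history: $recovered"
```

Anyone who cloned or scraped the repository before the cleanup can do the same, which is why revoking the secret comes first.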
There are easy ways to monitor public git repositories. For instance, GitHub has a public API through which you can monitor every single git commit that is made. That means that anyone can (and people do) monitor this API to find credentials and sensitive information within repositories. For this reason, it is best to assume that if you have leaked a secret, it is compromised forever.
Leaking secrets onto GitHub and then removing them is like accidentally posting an embarrassing tweet, deleting it, and just hoping no one saw it or took a screenshot.
Step 1. Revoke the secret and remove the risk
The first thing we need to do is make sure that the secret you have exposed is no longer active so no one can exploit it.
If you need specific advice on how to revoke keys you can see instructions within our API Best Practices Document.
If this key belongs to an organization you work for, then it is important to speak to the senior developers at your organization. It can be scary to let your company know you have leaked sensitive information, particularly if it has happened on a personal repository. But honesty is the best approach: the company may already have discovered the leak, and mistakes can be forgiven if the problem is resolved and you show that you genuinely care.
Don’t hope that the problem will go away - mistakes happen and it is best to simply be honest and upfront.
Did you know? Slack keys are among the rare API tokens that can revoke themselves! It is as simple as calling the auth.revoke endpoint!
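As a rough illustration, a call to auth.revoke could look like the sketch below. The helper name revoke_slack_token and the LEAKED_SLACK_TOKEN variable are made up for this example. It assumes curl is installed and that the leaked token itself is sent as the bearer token.

```shell
# Sketch: revoke a Slack token through the auth.revoke Web API method.
# revoke_slack_token is a hypothetical helper name, not part of any Slack tool.
revoke_slack_token() {
    token=$1
    if [ -z "$token" ]; then
        echo "usage: revoke_slack_token <token>" >&2
        return 1
    fi
    # The token being revoked is also the one used to authenticate the call.
    curl --silent --show-error --request POST \
        --header "Authorization: Bearer $token" \
        https://slack.com/api/auth.revoke
}

# Example (not run here): revoke_slack_token "$LEAKED_SLACK_TOKEN"
```

Slack's Web API answers with a JSON body whose ok field tells you whether the call succeeded, so check the response rather than assuming the token is gone.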
Step 2. (Optional) Get rid of the evidence
Once the secret has been revoked, it cannot be used anymore. Nonetheless, leaving any credential in a repository, even an expired one, can look unprofessional and raise concerns. Additionally, some secrets cannot be revoked (for example, database records), and for some credentials no one can guarantee that they were properly revoked (for example, SSH keys that can be used in many different places). So we will go through the steps to remove the secret from your history. Please note that this is not a trivial task, and it is advisable to seek the guidance of a senior developer.
a. Either delete your repository or make it private
It is often a good idea to buy yourself some time first: navigate to your GitHub repository then click "Settings".
Then scroll all the way down to the "Danger Zone" and click "Make Private" to hide the repository from the public.
Note: alternatively, you may wish to make a backup and then click "Delete this repository". You can push the repository back later.
b. Rewrite git history
⚠️ Warning: before jumping in, be advised that rewriting git history is not a trivial act, especially if there are many developers contributing to your repository. You will either have to completely delete the repository and then push the cleaned version back, or git push --force to your initial repository. In either case, you’ll completely break other contributing developers’ workflow.
We are going to use the well-known BFG Repo-Cleaner.
The BFG is a simpler, faster (10-720x faster) alternative to git-filter-branch for cleansing bad data out of your Git repository.
Let's suppose you committed a sensitive file called config.py and this contains a secret key.
i. Make sure you have Java installed.
ii. Clone your repository
git clone https://github.com/YOUR-USER-NAME/YOUR-REPOSITORY.git
iii. Delete the sensitive file
The very latest commit on your current branch is protected by BFG, so you have to make sure it is clean. Delete "config.py" and commit your changes.
Branches other than the current one are not protected, so if "config.py" can be found on other branches, it will be cleaned by BFG.
git rm config.py
git commit -m "clean commit"
iv. Run BFG
Download the latest version of BFG from their website, place the .jar file in the directory that contains your cloned repository, and run the command below from that directory.
Note: replace bfg-VERSION with the latest version (bfg-1.13.0).
java -jar bfg-VERSION.jar YOUR-REPOSITORY/.git --delete-files "config.py"
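BFG rewrites the commits, but the old objects remain in the local repository until they are expired and pruned. The BFG documentation recommends following the run with git reflog expire and git gc. The sketch below wraps those two commands in a helper whose name, purge_unreachable_objects, is made up for this example.

```shell
# Sketch: drop unreachable objects after a history rewrite.
# purge_unreachable_objects is a hypothetical helper name.
purge_unreachable_objects() {
    repo_dir=$1
    # Expire every reflog entry so the old commits are no longer referenced.
    git -C "$repo_dir" reflog expire --expire=now --all || return 1
    # Garbage-collect and prune the objects that are now unreachable.
    git -C "$repo_dir" gc --quiet --prune=now --aggressive
}

# Example (not run here): purge_unreachable_objects YOUR-REPOSITORY
```

This only cleans your local copy. Other clones and the remote keep the old objects until they are replaced.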
vi. Check your history.
You can use the git log -p command to show the difference (called a "patch") introduced in each commit. If you navigate through your different branches, you should see that the sensitive file no longer appears.
git log -p
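Reading patches branch by branch is easy to get wrong, so a narrower query can help. The sketch below lists every commit reachable from any branch or tag that touches a given path. The helper name commits_touching_path is made up for this example, and an empty result is what you want to see for config.py after the cleanup.

```shell
# Sketch: list every commit, on any branch or tag, that touches a given path.
# commits_touching_path is a hypothetical helper name.
commits_touching_path() {
    repo_dir=$1
    file_path=$2
    # --all walks every ref; --full-history keeps commits that history
    # simplification would otherwise hide.
    git -C "$repo_dir" log --all --full-history --oneline -- "$file_path"
}

# Example (not run here): commits_touching_path YOUR-REPOSITORY config.py
```

If the command still prints commits, the file survives somewhere in the reachable history and the rewrite needs another pass.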
vii. Push your repository back
Create a new repository and push your cleaned history to it. Make sure everybody deletes their old clones and uses your new version.
Did you know? Even though git prevents you from overwriting the central repository’s history by rejecting non-fast-forward push requests, you can still push changes using the --force flag. This forces your remote branch to match your local one. Be careful though, this is a dangerous command!
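The behaviour of --force can be tried safely without touching a real remote. In the sketch below, a local bare repository in a temporary directory stands in for the central repository. The directory names, file name, and commit messages are placeholders.

```shell
# Throwaway demo: a rewritten branch is rejected by a normal push
# and accepted with --force.
work_root=$(mktemp -d)
cd "$work_root" || exit 1

# A local bare repository stands in for the central remote.
git init -q --bare remote.git
git init -q work
cd work || exit 1
git config user.name "Demo User"
git config user.email "demo@example.com"
git remote add origin "$work_root/remote.git"

# Publish an initial commit.
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "first version"
branch=$(git symbolic-ref --short HEAD)
git push -q origin "$branch"

# Rewrite the published commit locally so local and remote history diverge.
echo "v2" > notes.txt
git commit -q --amend -a -m "rewritten version"

# A plain push is refused because it is not a fast-forward.
if git push -q origin "$branch" 2>/dev/null; then
    plain_push="accepted"
else
    plain_push="rejected"
fi
echo "plain push: $plain_push"

# --force overwrites the remote branch with the local one.
git push -q --force origin "$branch"
echo "remote now at: $(git -C "$work_root/remote.git" rev-parse --short "$branch")"
```

On a shared repository, git push --force-with-lease is a more cautious variant: it refuses to overwrite the remote branch if someone else has pushed to it since your last fetch.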
Step 3. Check your access logs
This step is very important, depending on the keys that were leaked. Sometimes a leaked access key creates a domino effect and leads to new secrets being exposed. For example, an access key to Slack may give a bad actor access to messages containing new credentials and access codes, so it is very important to make sure that there is no suspicious activity!
How you check access logs depends on the type of credential that was leaked. For example, AWS logs are sometimes centralized in CloudWatch, and Slack has a dedicated API endpoint that gives access to audit logs. This is probably a good moment to get closer to your SRE or Application Security team to make sure everything’s fine!
What now?
So the credential has been revoked and the repository's history has been cleaned. What should you do now? Now that you have had a good scare, it is a good time to start implementing some good practices.
1. Get Protected with GitGuardian
GitGuardian is a service that scans every single commit in public GitHub repositories in real time for leaks.
2. Review API Best Practices
To better protect your secrets in the future, we advise that you look at our API Best Practices guide. It has lots of helpful tips on how to avoid accidentally leaking secrets.
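As one small example of such a practice, a local check can stop some secrets before they are ever committed. The sketch below is an illustration only and is not how GitGuardian or any other tool works. The helper name staged_changes_contain_key_id is made up, and the pattern only matches the common shape of an AWS access key ID ("AKIA" followed by 16 uppercase letters or digits), so it will miss most other kinds of secrets.

```shell
# Sketch: detect something shaped like an AWS access key ID in staged changes.
# staged_changes_contain_key_id is a hypothetical helper name.
staged_changes_contain_key_id() {
    repo_dir=${1:-.}
    # Inspect only the lines being added in the staged diff.
    git -C "$repo_dir" diff --cached --no-color --unified=0 |
        grep -q -E '^\+.*AKIA[0-9A-Z]{16}'
}

# Possible use inside .git/hooks/pre-commit (not run here):
#   if staged_changes_contain_key_id; then
#       echo "Possible AWS access key ID in staged changes" >&2
#       exit 1
#   fi
```

A hook like this runs only on machines where it has been installed, so it complements, rather than replaces, scanning on the server side.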
Author's note: this post was originally written on GitGuardian's blog.