I learned a neat tip from my co-worker, Craig Silverstein (more on Craig joining Khan Academy), recently and I thought others might find it to be useful.
It has to do with the eternal question: How do you store sensitive configuration options (such as usernames, passwords, etc.) in source control? Typically what I’ve done is to just punt on the problem entirely. I create a dummy configuration file, such as conf/sample-settings.json, which has the basic structure but none of the details filled out. For example:
conf/sample-settings.json
// Copy to conf/settings.json
// and fill these in with your login details!
{
  "db": {
    "username": "",
    "password": ""
  }
}
If someone else needed the details I would just email them over, or some such. This is not ideal, especially when it comes time to add additional information to the file or make other changes.
The technique I picked up from Craig was to, instead, keep an encrypted version of the configuration file in source control and then provide a means through which the user can encrypt and decrypt that data.
In this case you can still have a dummy config file, if you wish.
To start you’ll want to make sure you have your source control ignore the configuration file — just to make super-sure that no one ever accidentally commits it. In Git you’d add a line like this to your .gitignore
file:
.gitignore
conf/settings.json
Next you’ll want to create your actual config file and populate it with the real values.
conf/settings.json
(* Do not check this in to source control!!)
{
  "db": {
    "username": "cool_guy",
    "password": "A1B2C3!"
  }
}
Finally you’ll want to create a script (I’m using a Makefile) that the user can run to encrypt and decrypt the file. This script uses OpenSSL, and specifically CAST5, to encrypt/decrypt the file. OpenSSL was chosen in particular as it worked out-of-the-box on both Linux and Mac machines.
OpenSSL reads in the appropriate files (depending upon if you’re encrypting or decrypting) then will prompt you for a password to encrypt/decrypt the file. (You’re free to use any encryption scheme that OpenSSL supports, of course.)
Makefile
.PHONY: _pwd_prompt decrypt_conf encrypt_conf

CONF_FILE=conf/settings.json

# 'private' task for echoing instructions
_pwd_prompt:
	@echo "Contact [email protected] for the password."

# to create conf/settings.json
decrypt_conf: _pwd_prompt
	openssl cast5-cbc -d -in ${CONF_FILE}.cast5 -out ${CONF_FILE}
	chmod 600 ${CONF_FILE}

# for updating conf/settings.json
encrypt_conf: _pwd_prompt
	openssl cast5-cbc -e -in ${CONF_FILE} -out ${CONF_FILE}.cast5
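If make or the openssl binary isn't available (on Windows, for example), roughly the same idea can be sketched with Node's built-in crypto module. This is only an illustration under assumptions of my own: the helper names encryptConf and decryptConf are hypothetical, it uses AES-256-GCM with a scrypt-derived key rather than CAST5, and the byte layout it writes (salt, IV, auth tag, ciphertext) is not compatible with files produced by the openssl command above.

```javascript
var crypto = require("crypto");

// Hypothetical helper: encrypts a Buffer with a key derived from a password.
// Output layout (an assumption of this sketch): salt(16) | iv(12) | tag(16) | ciphertext
function encryptConf(plainBuf, password) {
  var salt = crypto.randomBytes(16);
  var iv = crypto.randomBytes(12);
  var key = crypto.scryptSync(password, salt, 32);
  var cipher = crypto.createCipheriv("aes-256-gcm", key, iv);
  var body = Buffer.concat([cipher.update(plainBuf), cipher.final()]);
  return Buffer.concat([salt, iv, cipher.getAuthTag(), body]);
}

// Hypothetical helper: reverses encryptConf. Throws if the password is wrong
// or the data was tampered with, because the GCM auth tag will not verify.
function decryptConf(encBuf, password) {
  var salt = encBuf.subarray(0, 16);
  var iv = encBuf.subarray(16, 28);
  var tag = encBuf.subarray(28, 44);
  var body = encBuf.subarray(44);
  var key = crypto.scryptSync(password, salt, 32);
  var decipher = crypto.createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(body), decipher.final()]);
}
```

You would read conf/settings.json with fs.readFileSync, pass the Buffer through encryptConf, and write the result next to it. The rest of this post sticks with the Makefile.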
With all this in place the next step is simple. You’ll run:
make encrypt_conf
and you’ll enter in a password with which to encrypt the config file:
Contact [email protected] for the password.
enter cast5-cbc encryption password:
Make sure you write this down and don’t forget it — it’ll be very hard (if not impossible) to get your config file back if you forget the password.
At this point you’ll have a conf/settings.json.cast5
file and you can commit all the changes, using something like:
git add .gitignore Makefile conf/settings.json.cast5
git commit -m "Adding in an encrypted config file."
Now whenever someone downloads the code from source control they’ll need to either fill in their own values into the config file or they’ll need to get the password from you (the one you entered when you ran make encrypt_conf
— or even better, use a shared password safe to manage this). Once they have the password they just run the following and enter it:
make decrypt_conf
If you ever need to update the values in the config file, it’s really straightforward. Just update the config file, run make encrypt_conf again, and commit the new conf/settings.json.cast5 file.
One extra bit that you can add to your application, to make this process more intuitive, is a check for a missing config file that prints instructions for using the Makefile.
For example if you were using Node.js you could do:
var fs = require("fs");

// Bail out early with a hint if the decrypted config is missing
if (!fs.existsSync("conf/settings.json")) {
  console.error("Config file [conf/settings.json] missing!");
  console.error("Did you forget to run `make decrypt_conf`?");
  process.exit(1);
}
Also, you may want to consider having a check to see if the decrypted file is out of date (which can happen if some changes were made in the source control, then were checked out, but you didn’t also run make decrypt_conf
). Perhaps something like the following:
(function() {
  var conf_time = fs.statSync("conf/settings.json").mtime.getTime();
  var cast5_time = fs.statSync("conf/settings.json.cast5").mtime.getTime();

  if (conf_time < cast5_time) {
    console.error("Your config file is out of date!");
    console.error("You need to run `make decrypt_conf` to update it.");
    process.exit(1);
  }
})();
And that's it! Simpler than passing around config files manually and you still get all the benefit of using revision control to manage the file and changes.
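If you'd rather keep both startup checks in one place, they can be folded into a single helper. This is a minimal sketch, not something from our codebase: the name confStatus and its return values ("missing", "stale", "ok") are made up for illustration, and it treats a missing encrypted file as nothing to compare against.

```javascript
var fs = require("fs");

// Hypothetical helper: reports the state of the decrypted config file
// relative to its encrypted counterpart.
function confStatus(confPath, encPath) {
  // No decrypted config at all: the user needs to run `make decrypt_conf`
  if (!fs.existsSync(confPath)) {
    return "missing";
  }
  // Assumption: without an encrypted file there is nothing to compare against
  if (!fs.existsSync(encPath)) {
    return "ok";
  }
  var confTime = fs.statSync(confPath).mtime.getTime();
  var encTime = fs.statSync(encPath).mtime.getTime();
  // The encrypted file is newer, so the decrypted copy is out of date
  return confTime < encTime ? "stale" : "ok";
}
```

At startup you would call confStatus("conf/settings.json", "conf/settings.json.cast5") and print the matching message before exiting.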
Ash (February 6, 2013 at 4:33 pm)
I wrote about a similar solution recently: http://www.tewari.info/2012/09/02/protecting-your-api-keys/
Your Encrypt/Decrypt step is a neat addition. I need to figure out how to incorporate that step in a Windows/Visual Studio environment.
Harley (February 6, 2013 at 4:39 pm)
Not really, it’s impossible to diff the file without checking it out. Basically I’d say this would be ok to do for only the username/password (or perhaps more specifically the database information.) If you have much else in that config any kind of reasonably useful repository actions like blame or diff become useless.
John Resig (February 6, 2013 at 4:45 pm)
@Harley: blame/diff aren’t as important in a situation like this, though — what’s more important is having the changes stored within a commit, meaning that you can check out old versions and have the config synced with the code base at that time. You still have “blame” to some degree, in that you could see that the conf is broken and see who was the last person to commit to it (doubtful that it’ll be changing that often).
Thomas (February 6, 2013 at 4:51 pm)
Great tip. However my main concern with passwords in source control comes when you deploy your app to Heroku (or equivalent). Your app will be stuck without config/settings.json.
Now I am using https://github.com/flatiron/nconf in each of my Node apps that needs passwords. Locally, credentials are stored in a file; on Heroku, credentials are stored as environment variables.
I am curious to know how other people deal with this problem :)
John Resig (February 6, 2013 at 4:54 pm)
@Thomas: At least at Khan Academy we don’t have that issue because we compile and push the code to the production server from a local machine (or, more specifically, to App Engine). It’s possible that a different strategy would have to be used for Heroku, but I’m not entirely sure off-hand!
Oh and thanks for the pointer to nconf, it looks really cool!
Bob (February 6, 2013 at 5:09 pm)
.PHONY: _pwd_prompt decrypt_conf encrypt_conf
This is non-portable, it won’t work on pmake or bmake (i.e. most BSDs)
Also, I like your choice of openssl over gpg :)
Casey Marshall (February 6, 2013 at 5:38 pm)
Encrypting your passwords in the build like this is a very clever improvement over plaintext, but as you mention, it does have its limitations — remembering that passphrase, not to mention, rotating it or securely sharing it among a team.
At Gazzang, I lead development of a key management service called zTrustee (http://gazzang.com/products/ztrustee) which can securely store and manage passwords like these, with many policy options. It’s built on crypto standards including OpenPGP and can be easily integrated into build scripts.
Sam (February 6, 2013 at 5:39 pm)
Why not source your configuration settings from environment variables? You can achieve basically the same thing without the “DON’T CHECK THIS IN!!!” fear. Don’t even let it be a possibility.
bingo (February 6, 2013 at 6:13 pm)
What do you think of storing the encrypted passwords in the config? E.g.:
{
"db": {
"username": "cool_guy",
"password": ""
},
"encryption_key": ""
}
Put the encryption key in a file at the specified location. When you retrieve these config values they are decrypted. E.g.:
$username = $config->get("db.username");
$password = $config->decrypt("db.password");
You don’t lose your entire config file if you lose the key. Developers can forget about the key after they create that file (assuming everyone uses the same path). Diffing is easy, etc.
bingo (February 6, 2013 at 6:14 pm)
Lost some text there, oops:
{
"db": {
"username": "cool_guy",
"password": "encrypted password string goes here"
},
"encryption_key": "path to a file outside of the project containing the encryption key"
}
Neil (February 6, 2013 at 7:07 pm)
A similar strategy is used by https://github.com/oreoshake/passw3rd
Store passwords in encrypted form in files.
Store the key(s) elsewhere
Deploy to the same place
Use in code/config/commands
Configurable command line client to do reads, changes, key rotations, etc
Fred (February 6, 2013 at 7:55 pm)
I’m sorry but storing sensitive information (e.g. passwords) in SCM is a terrible idea even if they are encrypted. Why are you mixing deployment with development? They should be two different things IMHO.
ntnt (February 6, 2013 at 9:07 pm)
At my company, we wanted a scheme which would involve no secret values stored unencrypted on disk at all, but still enough ease of use around fetching them for use at runtime that people would not feel compelled to stash them off in e.g. my.cnf or other places. I’ll describe the scheme we arrived at here and see what people think. I think we’re pretty realistic about the flaws and limitations, but I’m curious to hear other people’s opinions.
There is a machine called the secret server. It has no other purposes but to run a daemon that handles (via SSL with self-signed certs and verified signatures) requests for secret values. Once it’s bootstrapped, it contains, in memory, a file which contains all the secret values, each identified with a path, plus an access control list which dictates whether a given client is allowed access to a particular secret. Clients are identified using SSH Agent identities, and when you request a secret, the server will challenge you using the same SSH Agent protocol that sshd uses to log you in. Your private key doesn’t leave your local machine — your ssh agent simply proves you have it by doing a signing request that the remote server verifies via your public key. Once your challenge is satisfied, it checks the ACL and decides whether your identity has access to the requested secret and returns it accordingly.
There is a simple client script and corresponding ruby library to allow applications and tools to request secrets from the secret server, all access-controlled based on the SSH agent you are running on your local machine (note: ssh agent forwarding is obviously required, and we are aware that this requires trusting the remote machine). So when, for example, I use capistrano to restart the apps on a particular server, my SSH agent gets forwarded to the deployment machine where I’m running cap, and the cap command forwards my agent to the app server, and when the app starts up and discovers that it needs a secret called /prod/db/foo/password, it calls the ruby library, which contacts the secret server, which challenges all the way back to my laptop’s ssh agent before giving up the goods. The secret then remains in memory in the app process (yes, RAM inspection by privileged processes is a vulnerability, but it’s outside the scope of the goal — no secrets on disk — and eliminating it is orders of magnitude more difficult). Once my capistrano session disconnects, that app process loses its ability to retrieve any more secrets on my behalf.
And a final piece is that one might wonder how the secret server gets access to all these secrets without storing them somewhere. The data file is stored symmetrically encrypted (it’s in SVN too) and the server itself does not know the key. When the server starts up, it is in a mode where it is unable to provide any secrets. In order to unlock it, a client must connect and provide the master key. The master key itself is stored in SVN, but GPG-encrypted with only a few individuals able to decrypt it. So after a reboot, someone with a high level of access needs to run gpg --decrypt masterkeyfile | secretclient --send-unlock-command, which requires giving their own GPG private key interactively. There are also redundant secret servers and the client script and library fail over appropriately.
The result is that we have a bunch of app servers which get access to the secrets they need without them having to be stored on disk. If we power cycle a machine, it comes up in a mode that should theoretically contain nothing sensitive. The secrets end up in RAM, so we need to worry about swap (we generally just try to avoid swapping), and there is the in-memory data to worry about, but the scope of the secrets-laying-around problem is drastically reduced. In addition, this level of indirection makes it super easy to rotate passwords frequently without even having to notify anyone, because we all are in the habit of using the tools to avoid ever even looking at the godawful passwords on various things. All the mysql tools, for example, are wrapped so you simply have to specify the path to the secret you want to use for the credentials. If you have access, you can run innotop as mysql root without worrying about copy/pasting the root password. This doesn’t prevent a malicious user who has access from retrieving the password and writing it down, but that wasn’t the goal.
ntnt (February 6, 2013 at 9:19 pm)
I know this is way outside the scope of your initial question, which is about dev configs, but I’ve never heard anyone else talk about managing this and I am curious what others think of this approach. It admittedly sounds pretty convoluted, and is an additional point of failure, but it serves its purpose.
Joshn (February 6, 2013 at 10:02 pm)
It’s a great thing to do, but isn’t that dangerous?
mpmedia (February 6, 2013 at 10:32 pm)
My opinion is: don’t keep passwords in source control. There is no reason why you can’t rely on environment variables at some point in the code. What a config file should do is point to them when required, i.e.
{config:{password:process.env.PASS,env:"dev"}} in a js file, or come up with some scheme that you can parse then cache for a json file:
{config:{password:"%%PASS%%",env:"dev"}} where "%%PASS%%" would be transformed by the reviver.
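A minimal sketch of that reviver idea, under some assumptions: the function name parseWithEnv is hypothetical, the placeholder syntax is taken to be exactly %%NAME%% with an upper-case variable name, and a missing variable is treated as an error rather than silently left in place.

```javascript
// Hypothetical loader: parses JSON and replaces any string value of the
// form "%%NAME%%" with the environment variable NAME.
function parseWithEnv(jsonText, env) {
  env = env || process.env;
  return JSON.parse(jsonText, function (key, value) {
    // Only string values can be placeholders
    if (typeof value !== "string") {
      return value;
    }
    var match = /^%%([A-Z0-9_]+)%%$/.exec(value);
    if (!match) {
      return value;
    }
    // Assumption: fail loudly instead of running with an unresolved placeholder
    if (env[match[1]] === undefined) {
      throw new Error("Missing environment variable: " + match[1]);
    }
    return env[match[1]];
  });
}
```

The JSON file itself can then be checked in safely, since it only ever contains the placeholder names.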
ntnt (February 7, 2013 at 12:24 am)
For those who are advocating using environment variables: what sets up the environment variables? Where is the data stored before the ENV is set up?
Andre (February 7, 2013 at 1:38 am)
If you want to keep developers from getting security-sensitive knowledge, you better keep that knowledge out of the development stream, encrypted or not.
A good way to do it is to set up a second repository which stores only the security-relevant entries, e.g. as a .properties file. Restrict access to this repository to the deployment team. This way you can set up the deployment process so that the deployment team can just run it without having to manually enter anything. Less steps required = less errors. And your security-relevant data is as safe as the deployment repository.
Shyam Sundar C S (February 7, 2013 at 2:29 am)
We did something similar recently in our Java/ant based deployment scripts.
What we did was store the sensitive information encrypted in the configuration files. We wrote an ant custom task (using the Jasypt library http://www.jasypt.org/) to decrypt the password during runtime. The password of course is set as an environment variable.
Davide Fugazza (February 7, 2013 at 4:41 am)
I also keep credentials and sensitive information as environment and/or system variables.
Property files eventually override them and serve as a reference, so they are under version control, but without values.
Ciantic (February 7, 2013 at 4:53 am)
That seems more hassle than worth it, e.g. in Windows there is no make command, and many developers haven’t bothered to find one for Windows.
If I had to come up with a solution it would simply ignore the original settings file (like in the idea above) and create a simple script that SCP’d the latest pre-filled settings.conf during Git hooks for those developers who are allowed to SCP it from somewhere.
Anonymous (February 7, 2013 at 6:59 am)
If you use proper packaging/deployment tools like maven or ant you will always have the option to do something like:
configuration file, checked into source control:
———————–
{
  "db": {
    "username": "${username}",
    "password": "${password}"
  }
}
credentials.properties (not checked into source control)
—————————
username : actual_user_name
password : actual_password
Dave Cottlehuber (February 7, 2013 at 9:18 am)
There is a well-tested solution in use for well over a decade, called Kerberos http://web.mit.edu/kerberos/ to provide mutually assured authentication of hosts and services, even when one party cannot guarantee the trust of the other. It’s well supported in all modern-day OS more or less out of the box, and integrated into OpenSSH as well as PAM & ActiveDirectory support.
I’m curious to know if this was tried and discarded?
GhiOm (February 7, 2013 at 11:23 am)
I may have missed something, but it looks like replacing a system where one needs to ask info via email with a system where one needs to ask info via email and then run decryption programs. The net gain seems negative. I wonder what I missed.
carlivar (February 7, 2013 at 1:46 pm)
There is a packaging and deployment system that handles sensitive files quite well:
http://tpkg.github.com/
tpkg sounds like it reinvents the wheel, but if you’ve ever tried to create your own RPMs (or DEBs) you might take a second look at it. Or if you’ve ever struggled with specifying your own Perl or Python dependencies while leaving system packages alone. Or mixing system package dependencies with custom packages.
Anyway, tpkg has native support for encryption in the way you describe here.
carlivar (February 7, 2013 at 1:58 pm)
ntnt, your system is exactly what I’ve always wanted to implement. Is it completely custom-built in-house? Any chance of open-sourcing?
jd (February 7, 2013 at 3:44 pm)
Storing passwords in plaintext on disk (i.e., the decrypted settings.json file) is not a great idea.
ntnt (February 7, 2013 at 3:52 pm)
@carlivar, yes, completely in-house. Probably no chance of open-sourcing it any time soon.
@Dave Cottlehuber, are you talking about using kerberos for securing miscellaneous credentials in deployed configs, or are you just talking about having apps rely on services which use kerberos for authentication? If the former, could you describe how it’s done? If the latter, I think most people need a solution for services which do not support kerberos.
lexual (February 7, 2013 at 5:38 pm)
@john is one weakness of this approach that if you need to change your encryption password, it makes it difficult to checkout previous versions of the “secret” files?
Greg (February 8, 2013 at 7:26 am)
Passwords aren’t source CODE. They are DATA. Why would you store them under source control?
ntnt (February 9, 2013 at 1:24 am)
Why wouldn’t you want your critical configuration data version-controlled?
gotofritz (February 9, 2013 at 9:06 am)
I agree with @GhiOm – I don’t see the point of this. You still need to get hold of the decryption password. Plus, if you want to create your own version, you still need a conf/sample-settings.json to use as blueprint.
I also use the system where each developer sets up environment vars / a local properties file (up to them) and they are substituted into the config file like "username": "${username}"
Graham (February 10, 2013 at 5:11 pm)
I suspect – but haven’t tried – that you can achieve something similar using Git hooks, and storing the password in the .git/config file. Every developer will need to have the hook scripts and the password in their config, but then when you try to update or commit the special files the hooks can automatically encrypt/decrypt them behind the scenes.
Luis Atencio (February 11, 2013 at 8:15 am)
Hey John,
Quick question: What if the password is to configure an interface with a third party service? In this case, the password is given to you and you have no way of generating this password. How would you store this in source control?
Matt Hickford (February 12, 2013 at 6:01 pm)
> You still get all the benefit of using revision control to manage the file and changes.
No you don’t! Passwords are fundamentally inconsistent with version control. Suppose you write an application with a Twitter API key (a password). There’s no value to knowing historical revisions of the password. If you revert your app to last month’s code, you’ll still need to use *today’s* password. Think about it.
Your original set-up was simpler too!
ntnt (February 13, 2013 at 8:48 pm)
There’s still value in having a record of who changed what.
F. Haupt (February 20, 2013 at 5:08 am)
I agree with Greg, passwords aren’t source code and so they shouldn’t be under source control.
In the example there is also a user name, so it is a user dependent password, why should anyone else have it?
I think it is easier to ask the user at first start and save the data somewhere in a user dependent store or use the already mentioned environment variables.
I also do not see the advantage of the encryption/decryption as you still have to exchange some password.
irctc (February 20, 2013 at 2:03 pm)
Yes, I agree with most of the above, as storing passwords in a file on disk always carries a danger of hacking. So there must be some strong form of encryption to avoid the risk of passwords being stolen.