This comic strip might be incomprehensible to you, but once you have read this article, you will understand its meaning.

Many experts give this advice, yet almost everybody ignores it: for your password to be secure, you need at least 8 characters, with punctuation, numbers, uppercase and lowercase letters, and above all it must not be a common English (or French, German, etc.) word.

What the experts say is half true. Well, it is true if you have an 8 to 10-character password (and it is true that less than that is insecure). But such a password is hard to remember and extremely hard to type (not to mention foreign keyboards as in "where's that backslash key again? and do we need the SHIFT key for typing a number, or not?"). However, if you can consider having a longer password (called in this case a passphrase), you can avoid these inconveniences (it will be easy to remember and to type), while keeping it even more secure than if you used a standard password.

How to do that? First we must explain some theory.

First of all, all passwords come from an unpredictable process (scientists use the terms random or stochastic, but these terms can cause confusion because of the assumptions some people have about randomness). Whether it's a word a human brain has thought of, a pseudo-random number generator in a computer, or a dice roll, it has some degree of unpredictability. And it's this unpredictability that makes a password difficult to guess. The thing about computers is that something which is hard and tedious for a human to guess (e.g. a six-digit number) can be found very easily by an automated process (i.e. a computer program), just by testing all possibilities (we suppose here a system where a "lock account after 3 failed attempts" scheme is not feasible, such as any system where we don't demand the possession of a physical device unique to the user in order to log in, for instance a website or e-mail account).
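To make "testing all possibilities" concrete, here is a minimal sketch of such an automated process. It assumes the attacker has obtained an unsalted SHA-256 hash of a six-digit code; that scenario, the function name brute_force_pin and the example code are my own illustration, not a description of any particular system.

```python
import hashlib
from itertools import product

def brute_force_pin(target_hash, digits=6):
    """Try every numeric code of the given length until one hashes to target_hash."""
    for combo in product("0123456789", repeat=digits):
        guess = "".join(combo)
        if hashlib.sha256(guess.encode()).hexdigest() == target_hash:
            return guess
    return None  # no code of that length matches

# Hypothetical secret, for illustration only.
stolen_hash = hashlib.sha256(b"831245").hexdigest()
print(brute_force_pin(stolen_hash))  # 831245
```

There are only one million six-digit codes, so an ordinary computer gets through the whole search in about a second.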

In security and cryptography, very little is perfectly secure (that's a scientific term for the mathematical impossibility of getting any information about the contents of a secret). Rather, security is based on being more secure with respect to the resources (time, physical, imaginative, etc.) that a potential attacker can use to break it.

To return to our unpredictable processes: in order to be secure, it is not necessary for the definition of the unpredictable process to be a secret, nor for any other part of our security system. Only the outcome of our unpredictable process should be kept secret, and the process needs to be unpredictable enough that attackers cannot guess the outcome with the resources available to them (so, for these purposes, computer-generated random numbers are not as secure as they may seem, because if someone can guess the exact configuration of the memory when the process was launched, they can easily guess the outcome).

So, how do we measure the unpredictability of such a process? The measurement used in practice is called informational entropy. I won't bore you with mathematical formulas; let's just use this metaphor. The entropy of an unpredictable process is the theoretical number of perfectly balanced coin tosses (mathematicians call them Bernoulli random variables with p=0.5) that are necessary to be as unpredictable as that process (it is not necessarily an integer; for instance, the entropy of a perfectly balanced six-sided die roll is approximately 2.585 - it's the base-2 logarithm of 6, for those mathematically inclined). The unit of entropy is the bit. If all outcomes are equally probable, the entropy of such a process is log2(number of outcomes). Or equivalently, the number of outcomes is 2^(entropy). As such, entropy becomes a practical way of talking about the number of possible outcomes of a process with equally likely outcomes (it's easier to talk and reason about 39.86 bits of entropy than 1 trillion possible outcomes). Also, a logarithmic scale such as entropy takes Moore's law into account better than the number of outcomes does (based on that law, the number of bits that can be broken with fixed financial resources increases by a constant amount every N months - according to the standard formulation of Moore's law, where processing power doubles every 18 months, it's one additional bit every 18 months).
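For those who would rather check the numbers than take my word for it, here is a small sketch of the equally-likely-outcomes case. The function names entropy_bits and outcomes_for are mine, chosen for illustration.

```python
import math

def entropy_bits(outcomes):
    """Entropy, in bits, of a process with `outcomes` equally likely results."""
    return math.log2(outcomes)

def outcomes_for(bits):
    """Number of equally likely outcomes that corresponds to `bits` of entropy."""
    return 2 ** bits

print(round(entropy_bits(2), 3))       # one fair coin toss: 1.0
print(round(entropy_bits(6), 3))       # one fair six-sided die: 2.585
print(round(entropy_bits(10**12), 2))  # one trillion outcomes: 39.86
```

Note that this only holds when every outcome is equally probable; a process with a favourite outcome has less entropy than log2 of its number of outcomes.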

Now that we have defined entropy, let's talk about a random process that will allow you to generate secure passphrases. This method (diceware) was first defined by A. G. Reinhold in 1995.

First of all, check out Reinhold's webpage about this method. Then, download a word list from it (I chose the list compiled by Alan Beale, but take any one whose words are mostly familiar to you - if English is not your mother tongue, choose a list in a language you are more familiar with - although try to take a list whose characters are on most keyboards and whose keys you can find easily - Latin letters, spaces, maybe numbers). Take some dice and a pad of paper. Roll the dice, and write down the results in rows of 5 elements (the number of rows is the number of words chosen for the passphrase). Then look up your list, searching for the word corresponding to each row of numbers (search for that number using the search feature in your browser/text editor/word processor). The resulting passphrase will be those words separated by a space.
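The lookup step can be sketched in a few lines. This assumes a word list in which each line holds a five-digit key (digits 1 to 6) followed by a word; the six-entry SAMPLE_LIST below is invented for illustration and is not taken from any real diceware list. The roll_key function simulates dice with Python's secrets module, which is an assumption of convenience: for a passphrase you will actually use, real dice remain the method described above.

```python
import secrets

# Invented sample entries, NOT from a real diceware list.
SAMPLE_LIST = """
11111 alpha
11112 bravo
11113 charlie
11114 delta
11115 echo
11116 foxtrot
"""

def parse_word_list(text):
    """Map each five-digit dice key to its word, skipping any other lines."""
    table = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and len(parts[0]) == 5 and set(parts[0]) <= set("123456"):
            table[parts[0]] = parts[1]
    return table

def roll_key(dice=5):
    """Simulate one row of dice rolls, e.g. '41625' (stand-in for real dice)."""
    return "".join(str(secrets.randbelow(6) + 1) for _ in range(dice))

def passphrase(table, keys):
    """Look up one word per row of rolls and join the words with spaces."""
    return " ".join(table[key] for key in keys)

table = parse_word_list(SAMPLE_LIST)
rows = ["11114", "11112", "11116", "11111"]  # rows written on the pad of paper
print(passphrase(table, rows))  # delta bravo foxtrot alpha
```

With a complete list, every one of the 6^5 = 7776 possible rows has a word, so rows produced by roll_key could be looked up the same way.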

Update: keep spaces as separators. If you put no separator, it is possible for two distinct dice rolls to generate the same passphrase. For instance, given a word list including such words (the official English lists don't have the word string), the passphrase fragments "fir string" and "first ring", if used with no separator, would both give the same sequence "firstring". Unless you know for sure that no such combinations are possible in your list, it is better to put a space between words.
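The collision is easy to reproduce. In this short sketch, join_words is a helper name of my own, and the two word sequences are the ones from the example above.

```python
def join_words(words, separator):
    """Build a passphrase from a list of words and a separator string."""
    return separator.join(words)

first = ["fir", "string"]
second = ["first", "ring"]

# Without a separator, two different dice outcomes collapse into one passphrase.
print(join_words(first, ""), join_words(second, ""))         # firstring firstring
# With a space, they stay distinct.
print(join_words(first, " ") == join_words(second, " "))     # False
```

Every such collision means two outcomes count as one, which slightly lowers the real entropy of the passphrase.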

Now, the question you ask me: how secure is this system, and how many words should I choose?

To answer this, let's compare a 4-word passphrase with a perfect 8-character password. Such a password is a completely random sequence of characters, each drawn uniformly from the 95 printable ASCII characters (see http://www.asciitable.com/, the characters in question are #32 to #126). Here is one I just generated:

]W8w=PRq

And here is a 4-word passphrase generated with diceware (not using real dice, I must admit, but it is just an example, not something I will use, nor should you use this particular one, or any example password found on the internet, that goes without saying):

qs shake cold orb

Which is easier to remember? The latter, definitely. Easier to type? Ditto. The most secure? The entropy of the former is 52.6 bits, while the entropy of the latter is 51.7 bits, so technically the passphrase is less secure. By an underwhelming margin, but if you are a purist, you will want to be on the safe side. Now, if you capitalize the first letter of one word chosen at random (using, for instance, 2 coin tosses to pick among the 4 words), you will get 2 extra bits of entropy, making it 53.7 bits. This last manipulation is typically unnecessary. If you really want more security, instead of capitalizing a word, add a fifth word, making the entropy 64.6 bits.
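These figures can be recomputed from the sizes involved: 95 printable ASCII characters, and 6^5 = 7776 possible rows of five dice, hence 7776 words in the list. The sketch below assumes every character and every word is drawn uniformly and independently; the variable names are mine.

```python
import math

PRINTABLE_ASCII = 95     # characters #32 to #126
DICEWARE_WORDS = 6 ** 5  # one word per row of five dice: 7776

password_bits = 8 * math.log2(PRINTABLE_ASCII)
four_word_bits = 4 * math.log2(DICEWARE_WORDS)
capitalized_bits = four_word_bits + math.log2(4)  # which of the 4 words is capitalized
five_word_bits = 5 * math.log2(DICEWARE_WORDS)

print(round(password_bits, 1))     # 52.6
print(round(four_word_bits, 1))    # 51.7
print(round(capitalized_bits, 1))  # 53.7
print(round(five_word_bits, 1))    # 64.6
```

Each extra word adds about 12.9 bits, which is why a fifth word buys far more than any capitalization trick.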

That said, how many of you who have an 8-character password generated it uniformly, by computer or dice rolls, over all printable ASCII characters?

Now am I suggesting you change EVERY SINGLE ONE OF YOUR PASSWORDS for all websites and accounts in one day? Not really. You perhaps don't need to change your personal computer log-in (unless it's also a server on the internet, in that case you should disable remote password-based log-in). I would instead advise you to change your passwords on websites you use often when you know your password is insecure (and you probably have better ideas about what is an insecure password now).

Even if these passwords are easy to remember while staying secure, you might wonder how to remember all the different passwords in question (yes, you still need a different password per service - currently there is no guarantee a given host will hash and salt its passwords properly). A subsequent article will cover storing your different passwords encrypted by a master passphrase on-line. We will cover host-proof services and online file-lockers (such as Dropbox and Ubuntu One).