I still really like how it generates random passwords from a permitted character set. It uses tr -cd to read only the matching bytes from the random device, discarding everything else. If you instead try to be less wasteful in turning random bytes into characters from the chosen set, you quickly end up in a sticky situation where your passwords may be less random than they should be. Bytes from the random device are cheap, so just throw away all the ones that aren't suitable and read more. So simple, and yet I'm sure it wouldn't have occurred to me.
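The same per-byte rejection idea can be sketched in Python (the function name and use of os.urandom are my own choices for illustration, not pass's actual code, which is a shell pipeline):

```python
import os

def random_password(alphabet: str, length: int) -> str:
    """Rejection sampling, like `tr -cd`: read bytes from the OS's
    random source and keep only those that fall in the alphabet."""
    allowed = set(alphabet.encode("ascii"))
    out = bytearray()
    while len(out) < length:
        b = os.urandom(1)[0]   # one random byte at a time
        if b in allowed:       # keep matching bytes,
            out.append(b)      # discard everything else
    return out.decode("ascii")
```

Because every surviving byte is drawn uniformly from the alphabet, no bias is introduced no matter how awkwardly the alphabet size divides 256.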
If you have an M-sized alphabet and you need an n-character password, why not just take ⌈log₂ Mⁿ⌉ bits and interpret them as a number in base M? That seems even simpler to me.
For example, say I'd like a 1-character password from the alphabet of A, B and C.
So that's 2 bits. We read two bits (awkward: the random device is of course byte-oriented). Now we have a 2-bit value and we're trying to pick A, B or C. But what do we do with the fourth possibility from our 2-bit value?
We could decide that's too bad and treat it as A (or B, or C) anyway, but then we've introduced a non-random bias into our supposedly random password.
The non-horrible option is to throw that possibility away and read two more bits, hoping for a different outcome. That's the exact same algorithm pass uses, except you've added the extra complexity of reading less than one byte at a time from the device...
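That 2-bit rejection loop can be written out in a few lines of Python (reading a whole byte and masking it down to 2 bits, since the device is byte-oriented; the function name is mine):

```python
import os

def pick_abc() -> str:
    """Draw 2 bits; values 0..2 map to A/B/C, and the fourth
    possibility (3) is rejected so we draw again - the same
    rejection idea pass applies per byte."""
    while True:
        two_bits = os.urandom(1)[0] & 0b11   # 0, 1, 2 or 3
        if two_bits < 3:
            return "ABC"[two_bits]
```

Each of A, B and C comes out with probability exactly 1/3, at the cost of an occasional redraw.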
With a non-trivial (= much more realistic) length of your random number, how do you ever lose more than a fraction of a bit of entropy this way? Let's say you want a 16-character alphanumeric (a-zA-Z0-9) password. log₂ 62¹⁶ ≈ 95.267. Let's say you pull 96 bits and ignore the imbalance. Then you have 2×62¹⁶−2⁹⁶ = 16116640899382729306982711296 options with a probability of just 2⁻⁹⁶ each, and 2⁹⁶−62¹⁶ options (hit twice by the reduction) with a probability of 2×2⁻⁹⁶. That comes out as around 95.203 bits of entropy, which isn't significantly different from 95.267. What obvious fault am I missing here?
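For reference, the arithmetic can be checked in Python, with the counts as exact big integers (reducing a 96-bit value mod 62¹⁶ hits the first 2⁹⁶−62¹⁶ passwords twice and the remaining 2×62¹⁶−2⁹⁶ exactly once):

```python
from math import log2

M, n, bits = 62, 16, 96
T = M ** n          # number of 16-char alphanumeric passwords
N = 2 ** bits       # number of 96-bit values

n_double, n_single = N - T, 2 * T - N
assert n_single == 16116640899382729306982711296
assert 2 * n_double + n_single == N      # every 96-bit draw accounted for

# Shannon entropy of the resulting biased password distribution.
p_double, p_single = 2 / N, 1 / N
H = -(n_double * p_double * log2(p_double) +
      n_single * p_single * log2(p_single))

print(round(log2(T), 3), round(H, 3))    # → 95.267 95.203
```

So the loss from ignoring the imbalance is indeed about 0.06 bits over the whole password.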
Well, let's begin with the fact that it's worse. You argue it's not "much" worse, but it is clearly worse, not the same. Under 'pass' all passwords conforming to chosen rules are equally likely, under your approach some are twice as likely as others.
But then the other characteristic is simplicity. Using 'tr -cd' is, by my counting, five characters. How big is your best implementation of this supposedly "even simpler" but worse solution?
> Under 'pass' all passwords conforming to chosen rules are equally likely, under your approach some are twice as likely as others.
But they're passwords. At most half of the passwords will have a different probability than the rest, which means you're still left with a staggering number of un-brute-forceable passwords. Two to the power of ninety-nine is practically indistinguishable from two to the power of one hundred.
By the way, if you're still obsessed with completely equal probabilities here, you can just drop the whole sequence of bits whenever it encodes a number outside the range of alternatives and fetch a new one. It seems to me that the expected number of bits consumed by such a process won't exceed twice the number that's technically necessary (the probabilities of the successive attempt counts form a geometric sequence with a ratio of at most 0.5 summing to exactly 1, which gives an expected value of at most 2 attempts), which is around 12 bits per character here. With a filter like tr -dc "0-z" mentioned below you'll consume roughly 27.3 bits per character on average.

So "better" is a subjective term here; one solution may be simpler to write on a shell command line, the other consumes much less entropy. If you're deciding what to do, you need to weigh your criteria properly when you have more than one. You may very well have different criteria than I do; that happens.
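A sketch of that whole-number rejection in Python (the helper name is mine; 62¹⁶ has a 96-bit length, so 12 bytes are drawn per attempt), plus a quick check of the two per-character costs quoted above:

```python
import os

def base_m_password(alphabet: str, length: int) -> str:
    """Draw ceil(log2 M**n) bits; reject and redraw whenever the
    value falls outside [0, M**n), so every password is exactly
    equally likely. Digits are peeled off with divmod."""
    m = len(alphabet)
    limit = m ** length
    nbits = limit.bit_length()          # 96 for 62**16
    nbytes = (nbits + 7) // 8
    while True:
        raw = int.from_bytes(os.urandom(nbytes), "big")
        x = raw >> (nbytes * 8 - nbits)  # keep exactly nbits bits
        if x < limit:
            break
    out = []
    for _ in range(length):
        x, r = divmod(x, m)             # division with remainder, base M
        out.append(alphabet[r])
    return "".join(out)

alphanumeric = ("abcdefghijklmnopqrstuvwxyz"
                "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789")
pw = base_m_password(alphanumeric, 16)

# Expected cost: at most 2 draws of 96 bits for 16 chars = 12 bits/char;
# tr -dc "0-z" keeps 75 of 256 byte values, i.e. 8*256/75 bits/char.
print(round(2 * 96 / 16, 1), round(8 * 256 / 75, 1))  # → 12.0 27.3
```

Which cost matters more is exactly the subjective trade-off described above.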
> Using 'tr -cd' is, by my counting, five characters
I probably wouldn't use this in the first place because it requires a subprocess and a binary that I might not have on some computers. So by my counting this could be hundreds of kilobytes of extra code. Division with a remainder is most likely something you already have in your standard library regardless of whether you use it or not.
To the extent that it means anything to "consume" entropy from the random device, both approaches output the same size of password, so they're consuming roughly the same amount of entropy. Assuming the random device works as intended, this is a distinction without a difference anyway.
> a binary that I might not have on some computers
A binary that's included in the Linux Standard Base. Which computers don't you have it on?
> this could be hundreds of kilobytes of extra code
More like a few dozen kilobytes, although of course on tiny Linux systems it's bundled into something like busybox so much less.
> Division with a remainder is most likely something you already have in your standard library regardless of whether you use it or not.
The shell, which is what pass is written in, does have division with a remainder, but it cannot do this operation on 96-bit integers. So now you've added another constraint: "rewrite this software in a language which has 96-bit integers, or more if longer passwords are allowed". [Hint: longer passwords are very much allowed in pass.]
So far what you've achieved is to show that this is trickier than you thought, yet produces worse results. Again, this is why I like what Jason did here.
> A binary that's included in the Linux Standard Base. Which computers don't you have it on?
I'm using some Windows computers. But for example I can use (and often do) a dumped SBCL binary on them just fine, even on those where I can't install anything.
That assumes a perfect RNG, but PRNGs don't have infinite state. So for any given PRNG, the rejection loop running forever is either impossible or happens with some finite probability; weighting those cases by how likely each is gives an overall finite (non-zero) subjective probability of it running forever (in ideal program-space; obviously it will never run forever in real life).
No, you don't need a perfect RNG. You need anything that's not completely horrible.
Remember, we're not trying to read a whole password from the device, just one matching byte at a time. So even if our password character set is a single character (pass will let you do that, but obviously you should not use such passwords in real life), we're talking about statistically 1 out of every 256 bytes matching, or else our PRNG is useless.
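That 1-in-256 match rate is easy to check empirically; a quick sketch in Python (drawing from os.urandom rather than reading the device file directly):

```python
import os

N = 256_000
data = os.urandom(N)
matches = data.count(ord("A"))   # single-character "alphabet"

# With uniform bytes, roughly 1 in 256 should match: about 1000
# hits here, i.e. about 256 bytes read per kept character.
print(round(matches / N * 256, 2))
```

Any remotely sane random source lands very close to 1.0 here; only a catastrophically broken PRNG would make the rejection loop's cost blow up.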
You correctly observe that the PRNG has finite state, but its state isn't so tiny that it only emits a handful of the 256 possible bytes before getting back to where it started, nobody would use such a busted algorithm.
RC4 is considered broken and unusable because its keystream is biased and thus distinguishable from random, to the extent that if you read a few million bytes from it you can detect the bias. But you're asking us to imagine that the random device has a PRNG many orders of magnitude worse than that in order for this to have any effect at all.
> You correctly observe that the PRNG has finite state, but its state isn't so tiny that it only emits a handful of the 256 possible bytes before getting back to where it started, nobody would use such a busted algorithm.
If the PRNG always loops through all possible states, then sure. But if there's a possibility, however small, of it getting stuck in a small loop, very rarely?
> But if there's a possibility, however small, of it getting stuck in a small loop, very rarely?
We already went around this particular "small loop" so I'm guessing you aren't learning anything from repeating it. That would be a lousy design for a PRNG, so, nobody does that.