enable monkeysphere use without allowing host enumeration (hashed User IDs)

Some administrators might be leery about publishing a map of all their hosts to the public keyservers.

One proposed way to avoid this would be to publish digested hostnames instead. This was fairly formally proposed in November of 2008 on the monkeysphere list:

https://lists.riseup.net/www/arc/monkeysphere/2008-11/msg00013.html

There was an observation that simply publishing host names in User IDs stored in public keyservers would basically expose a list of your hosts to the outside world. Administrators of a realm that does not want host enumeration to be possible might see this as a reason to avoid using the Monkeysphere.

To provide a way that these administrators could use the Monkeysphere without enumerating their hosts, it was proposed that we allow for hashing of hostnames in the User ID field. Then clients could search for the hashed hostnames via the public keyservers, but anyone trying to establish the name of, say, all hosts in a domain based on a WoT inventory would be stymied.

Specifically, if i cared about being this kind of sneaky, instead of having published a key with User ID of ssh://squeak.fifthhorseman.net i would have published something like: ssh+sha1://99c5789b129598051ca82578c14ba3923c71c73d
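A minimal sketch of how such a User ID might be derived. The proposal does not say exactly which bytes get hashed, so this assumes the digest covers only the lowercased hostname, encoded as UTF-8; the function name `hashed_user_id` and its `scheme` parameter are hypothetical. Whether it reproduces the digest quoted above depends on that unstated input format.

```python
import hashlib


def hashed_user_id(hostname: str, scheme: str = "ssh+sha1") -> str:
    """Return a hashed User ID of the form <scheme>://<40 hex digits>."""
    # Assumption: the digest covers the lowercased hostname only (no scheme,
    # no port), encoded as UTF-8. The proposal does not pin this down.
    digest = hashlib.sha1(hostname.lower().encode("utf-8")).hexdigest()
    return f"{scheme}://{digest}"


print(hashed_user_id("squeak.fifthhorseman.net"))
```

Keeping the scheme as a parameter makes it cheap to try the other prefixes under discussion (for example `ssh+hash`) without touching the hashing step.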

Connecting clients would look up the hashed host name first, and if they failed to find it (and were configured to be allowed to look up unhashed hosts), they would query a second time for the unhashed hostname before falling back to non-monkeysphere ssh connections.
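The lookup order described here could be sketched roughly as follows. This is not existing Monkeysphere code: `lookup` is a hypothetical stand-in for whatever performs the keyserver query, `allow_unhashed` stands in for the client configuration setting, and the hash input is assumed to be the bare hostname.

```python
import hashlib
from typing import Callable, List, Optional


def candidate_user_ids(hostname: str, allow_unhashed: bool) -> List[str]:
    """User IDs to query, in order: hashed first, then unhashed if permitted."""
    # Assumption: the digest is taken over the bare hostname as UTF-8 bytes.
    digest = hashlib.sha1(hostname.encode("utf-8")).hexdigest()
    candidates = [f"ssh+sha1://{digest}"]
    if allow_unhashed:
        candidates.append(f"ssh://{hostname}")
    return candidates


def find_host_keys(
    hostname: str,
    lookup: Callable[[str], Optional[List[str]]],
    allow_unhashed: bool = False,
) -> Optional[List[str]]:
    """Return keys for the first User ID that yields any, else None.

    None means the caller should fall back to a non-monkeysphere connection.
    """
    for user_id in candidate_user_ids(hostname, allow_unhashed):
        keys = lookup(user_id)  # hypothetical keyserver query
        if keys:
            return keys
    return None


# A dict stands in for the keyserver in this example.
fake_keyserver = {"ssh://host.example": ["KEY-A"]}
print(find_host_keys("host.example", fake_keyserver.get, allow_unhashed=True))
```

Returning on the first hit is only one possible policy; a client that wants to consider both kinds of records would collect the results of every query instead of stopping early.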

A few outstanding questions:

What should we do with hosts which have both kinds of keys available? For a client who is allowed to look up unhashed hosts, should a published hashed record preclude it from considering the published unhashed record? I think it should not (otherwise anyone could create a DoS attack simply by publishing a forged hashed record that masks the legitimate unhashed one).
What prefix should we use for the hashed hostnames? One proposal was to just keep using ssh://. Another proposal was to use ssh+hash://. Above, i've used ssh+sha1://. One concern with using ssh:// is that ICANN might eventually open up the root zone to public registration the same way that .com and the other TLDs are open (i.e. anyone with enough money could claim a TLD for a certain period of time). In principle, i see no reason why they shouldn't do this, and it could be extremely lucrative for them if they could pull it off. However, if this happened, someone could actually register 99c5789b129598051ca82578c14ba3923c71c73d as a TLD, which would conflict with squeak. This seems unlikely at present, but the collisions that could be forced if such a change were eventually made would suck for the monkeysphere if it used the same prefix for hashed and non-hashed hosts.
How do we handle alternate ports? If the connection is on a non-standard port, should a connection to squeak on port 2222 look first for ssh+sha1://18ea3840c81c8f6647c8f5af7f5add419650d2d8 (port folded into the hash) or for ssh+sha1://99c5789b129598051ca82578c14ba3923c71c73d:2222 (port left in the clear)? Why?
Could we provide similar options for users (as opposed to hosts) who prefer to mask their User IDs in a similar way? How would that work?
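For the alternate-port question, the two candidate forms can be generated side by side to compare them. This is only a sketch: it assumes that the first form hashes the string `hostname:port`, which the examples above suggest but do not state, and the name `port_candidates` is hypothetical.

```python
import hashlib
from typing import List


def _sha1_hex(text: str) -> str:
    """Hex SHA-1 digest of a string, encoded as UTF-8."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def port_candidates(hostname: str, port: int, default_port: int = 22) -> List[str]:
    """Return the hashed User ID forms that might describe hostname:port."""
    host_only = f"ssh+sha1://{_sha1_hex(hostname)}"
    if port == default_port:
        return [host_only]
    # Assumption: "port inside the digest" means hashing "hostname:port".
    host_and_port = f"{hostname}:{port}"
    return [
        f"ssh+sha1://{_sha1_hex(host_and_port)}",  # port hidden in the digest
        f"{host_only}:{port}",                     # port visible in the User ID
    ]


for user_id in port_candidates("squeak.fifthhorseman.net", 2222):
    print(user_id)
```

The first form hides which ports a host listens on; the second reveals the port but lets one digest identify the host across all of its ports.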

(from redmine: created on 2009-08-02)