Picking identifiers... I suppose UUIDv7's random component (up to 74 bits) is almost always enough, but I like ksuid's 128-bit "kill it with fire" approach. Seems like the end game for UUIDs is time-ordered, millisecond-precision Unix epoch timestamps, a >128-bit random component, and a sortable string representation.
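The wishlist above can be sketched in a few lines. This is a hypothetical format, not any published standard: the 6-byte timestamp, 19-byte random part, and the names `new_id` and `timestamp_ms` are all assumptions chosen so the total (200 bits) divides evenly into 40 base32 characters.

```python
import secrets
import time

# Hypothetical layout (an assumption, not a standard):
#   6 bytes  big-endian Unix epoch milliseconds (48 bits)
#   19 bytes of randomness (152 bits, i.e. more than 128)
TIMESTAMP_BYTES = 6
RANDOM_BYTES = 19
# base32hex alphabet in lowercase; its characters are in ascending ASCII
# order, so fixed-length strings sort the same way as the raw bytes.
ALPHABET = "0123456789abcdefghijklmnopqrstuv"


def new_id(now_ms=None):
    """Return a 40-character, time-ordered, sortable identifier."""
    if now_ms is None:
        now_ms = time.time_ns() // 1_000_000
    raw = now_ms.to_bytes(TIMESTAMP_BYTES, "big") + secrets.token_bytes(RANDOM_BYTES)
    n = int.from_bytes(raw, "big")
    chars = len(raw) * 8 // 5  # 200 bits / 5 bits per character = 40
    return "".join(ALPHABET[(n >> (5 * i)) & 31] for i in reversed(range(chars)))


def timestamp_ms(identifier):
    """Recover the millisecond timestamp stored in the leading 48 bits."""
    return int(identifier, 32) >> (RANDOM_BYTES * 8)
```

Two IDs minted in the same millisecond are ordered only by their random bits, so this gives time ordering across milliseconds, not strict monotonicity.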
UUIDv9 should be 256 bits (maybe)
I've been a fan of xid for a while now tbh: github.com/rs/xid. The time-sortability is nice, you get 24 bits of unique IDs per second per host per process (>16M unique IDs per second), and they're small, don't need hyphens, and are somewhat human readable.
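For reference, a rough sketch of an xid-style layout: 4 bytes of Unix seconds, 3 bytes of machine ID, 2 bytes of process ID, and a 3-byte counter, rendered as 20 base32hex characters. This only illustrates the layout and is not the rs/xid implementation; the function names and the way the fields are passed in are assumptions.

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuv"
COUNTER_BITS = 24  # 2**24 = 16,777,216 counter values per second per process


def pack_xid_like(ts_seconds, machine_id, pid, counter):
    """Pack the four fields into 12 bytes (illustrative layout only)."""
    if len(machine_id) != 3:
        raise ValueError("machine_id must be exactly 3 bytes")
    return (
        ts_seconds.to_bytes(4, "big")
        + machine_id
        + (pid & 0xFFFF).to_bytes(2, "big")
        + (counter % (1 << COUNTER_BITS)).to_bytes(3, "big")
    )


def encode(raw):
    """Encode 12 bytes (96 bits) as 20 base32hex characters."""
    n = int.from_bytes(raw, "big") << 4  # pad 96 bits up to 100 bits
    return "".join(ALPHABET[(n >> (5 * i)) & 31] for i in reversed(range(20)))
```

Because the timestamp leads and the alphabet is in ASCII order, the encoded strings sort by creation second, which is where the time-sortability comes from.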
Yeah, xid is great, but no random component and second-level precision are deal-breakers for some use cases...
I went with random UUID for scholar.archive.org, trying to plan ahead a bit. did end up burning a lot of page cache on secondary tables (maybe I could have optimized pg schemas better), but the main unexpected cost was how poorly dumps and exports would compress. similar to CIDs in atproto or IPFS
Definitely not a bad option but at least some of the time I really need (want) the ability to use the timestamp stored in the ID for TTLs, range/prefix searches, etc. 💭 If I used sufficiently large random UUIDs the data I want would be in the ID itself *somewhere*.
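As a sketch of what using the embedded timestamp looks like, here is how it could work with the UUIDv7 bit layout (48-bit millisecond timestamp in the top bits, then version and variant fields). The helper names are hypothetical, and this assumes the IDs really are UUIDv7.

```python
import time
import uuid


def uuid7_timestamp_ms(u):
    """Read the 48-bit Unix millisecond timestamp from the top of a UUIDv7."""
    return u.int >> 80


def uuid7_lower_bound(ts_ms):
    """Smallest UUIDv7 value carrying the given millisecond timestamp."""
    value = ((ts_ms & ((1 << 48) - 1)) << 80) | (0x7 << 76) | (0x2 << 62)
    return uuid.UUID(int=value)


def is_expired(u, ttl_ms, now_ms=None):
    """TTL check driven only by the timestamp inside the ID."""
    if now_ms is None:
        now_ms = time.time_ns() // 1_000_000
    return now_ms - uuid7_timestamp_ms(u) >= ttl_ms
```

A range search for IDs created in `[start_ms, end_ms)` then becomes `uuid7_lower_bound(start_ms) <= id < uuid7_lower_bound(end_ms)`, which an ordinary index on the ID column can serve.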
I've also yet to come across a case where I really care about leaking the timestamp in the ID. Or maybe once or twice and just had a secondary field that could be used.
I just don't want people inferring any other semantics from the identifier! ... and then I inevitably end up doing exactly that when debugging
And it certainly solves the problem of having to evaluate all the options every so often. The fact that there's a UUIDv8 that is customizable really drives home the frustration!
sounds about right! though boy do i enjoy something a little more compact than hex.
Good point, that's another problem with the typical string version of UUIDs. I'd like base32 at least. Added to the wishlist!
The problem you get with distributed databases (like CockroachDB) is hot partitions. The database splits the table into ranges. If all the new IDs have the same prefix, all the writes will end up going to the same node, and overload that node, while the rest of the nodes will not be doing anything.
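A toy illustration of the effect, and of one common mitigation (prefixing the key with a small hash-derived shard number, similar in spirit to a hash-sharded index). The shard count, key layout, and function names are assumptions made for the sketch, not how any particular database stores keys.

```python
import hashlib
import secrets
from collections import Counter

NUM_SHARDS = 8  # hypothetical shard count


def time_ordered_id(ts_ms):
    """16-byte ID with a 48-bit millisecond timestamp prefix."""
    return ts_ms.to_bytes(6, "big") + secrets.token_bytes(10)


def shard_prefixed_key(id_bytes):
    """Prepend a shard byte derived from a hash of the whole ID."""
    shard = hashlib.sha256(id_bytes).digest()[0] % NUM_SHARDS
    return bytes([shard]) + id_bytes


# 1000 IDs created within one second of each other.
ids = [time_ordered_id(1_700_000_000_000 + i) for i in range(1000)]

# Plain keys all share the same leading byte, so they land in one key range.
plain_prefixes = Counter(i[:1] for i in ids)
# Shard-prefixed keys spread across up to NUM_SHARDS leading bytes.
sharded_prefixes = Counter(shard_prefixed_key(i)[:1] for i in ids)
```

The trade-off is that the shard prefix breaks global time ordering: a time-range scan has to query every shard and merge the results.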