Jake Gold: Picking identifiers... I suppose UUIDv7's 74-bit random component is almost always enough, but I like ksuid's 128-bit "kill it with fire" approach. Seems like the end game for UUIDs is time-ordered, millisecond-precision Unix epoch timestamps, a >128-bit random component, and a sortable string representation.
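For anyone who wants to see the layout being discussed, here is a rough sketch of a UUIDv7-style generator. It assumes the RFC 9562 field layout (48-bit millisecond timestamp, 4-bit version, 12 random bits, 2-bit variant, 62 random bits); the name `uuid7_sketch` is made up for illustration, and real libraries may add counters or other monotonicity tricks that this skips.

```python
import os
import time
import uuid


def uuid7_sketch(unix_ms=None):
    """Build a UUIDv7-style value: 48-bit ms timestamp + 74 random bits."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits, 74 are used
    rand_a = rand >> 68                 # top 12 bits
    rand_b = rand & ((1 << 62) - 1)     # low 62 bits

    value = (unix_ms & ((1 << 48) - 1)) << 80  # timestamp in the top 48 bits
    value |= 0x7 << 76                         # version 7
    value |= rand_a << 64
    value |= 0b10 << 62                        # RFC variant bits
    value |= rand_b
    return uuid.UUID(int=value)


if __name__ == "__main__":
    print(uuid7_sketch())
```

Because the timestamp sits in the most significant bits, the usual hex string form sorts by creation time at millisecond granularity. Ordering within the same millisecond is random in this sketch.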

UUIDv9 should be 256 bits (maybe)

I've been a fan of Xid for a while now tbh: github.com/rs/xid. The time-sortability is nice, you get 24 bits of unique IDs per second per host per process (>16M unique IDs per sec), and they're small, don't need hyphens, and are somewhat human readable.

GitHub - rs/xid (github.com): xid is a globally unique id generator thought for the web.
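To make the "24 bits per second per host per process" arithmetic concrete, here is a small sketch of an xid-like 12-byte layout. It assumes the layout described in the xid README (4-byte seconds, 3-byte machine id, 2-byte process id, 3-byte counter); `xid_like` is a hypothetical helper, not the library's API, and it leaves out xid's string encoding.

```python
import struct

COUNTER_BITS = 24
MAX_IDS_PER_SECOND = 1 << COUNTER_BITS  # 16,777,216 per second per host per process


def xid_like(seconds: int, machine: bytes, pid: int, counter: int) -> bytes:
    """Pack an xid-style 12-byte ID: seconds | machine | pid | counter."""
    if len(machine) != 3:
        raise ValueError("machine id must be exactly 3 bytes")
    return (
        struct.pack(">I", seconds)                            # 4-byte Unix seconds
        + machine                                             # 3-byte machine id
        + struct.pack(">H", pid & 0xFFFF)                     # 2-byte process id
        + (counter % MAX_IDS_PER_SECOND).to_bytes(3, "big")   # 3-byte counter
    )


if __name__ == "__main__":
    print(MAX_IDS_PER_SECOND)
    print(xid_like(1_700_000_000, b"abc", 4242, 1).hex())
```

Since the seconds field comes first and is big-endian, comparing the raw bytes sorts IDs by creation second.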

Yeah, xid is great, but no random component and second-level precision are deal-breakers for some use-cases...

I went with random UUID for scholar.archive.org, trying to plan ahead a bit. did end up burning a lot of page cache on secondary tables (maybe I could have optimized pg schemas better), but the main unexpected cost was how poorly dumps and exports would compress. similar to CIDs in atproto or IPFS
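The compression point is easy to poke at with a toy experiment. This is only a sketch: it assumes 16-byte IDs packed back to back and uses zlib as a stand-in for whatever compressor a real dump would go through. The helper names are made up.

```python
import os
import zlib


def random_ids(n):
    """Fully random 16-byte IDs (UUIDv4-like, ignoring version bits)."""
    return [os.urandom(16) for _ in range(n)]


def time_prefixed_ids(n, start_ms=1_700_000_000_000):
    """16-byte IDs with a 6-byte millisecond timestamp prefix and 10 random bytes."""
    return [(start_ms + i).to_bytes(6, "big") + os.urandom(10) for i in range(n)]


def compression_ratio(ids):
    """Compressed size divided by raw size; closer to 1.0 means it barely compresses."""
    raw = b"".join(ids)
    return len(zlib.compress(raw, 9)) / len(raw)


if __name__ == "__main__":
    print("random:       ", round(compression_ratio(random_ids(10_000)), 3))
    print("time-prefixed:", round(compression_ratio(time_prefixed_ids(10_000)), 3))
```

The random IDs should come out essentially incompressible, while the shared timestamp prefix gives the compressor something to work with. Real dumps contain more than IDs, so the effect on a whole export will differ.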

Definitely not a bad option but at least some of the time I really need (want) the ability to use the timestamp stored in the ID for TTLs, range/prefix searches, etc. 💭 If I used sufficiently large random UUIDs the data I want would be in the ID itself *somewhere*.
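As a sketch of what "using the timestamp stored in the ID" could look like, assuming a UUIDv7-style layout with the millisecond timestamp in the top 48 bits. The function names are hypothetical, and the range bounds are plain UUID values for comparison, not valid v7 IDs themselves.

```python
import uuid
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def id_timestamp(u: uuid.UUID) -> datetime:
    """Read the 48-bit millisecond timestamp from the top of the ID."""
    return EPOCH + timedelta(milliseconds=u.int >> 80)


def is_expired(u: uuid.UUID, ttl: timedelta, now: datetime) -> bool:
    """TTL check using only the ID, no separate created_at column."""
    return id_timestamp(u) + ttl <= now


def id_range(start: datetime, end: datetime):
    """Bounds such that lower <= id < upper selects IDs created in [start, end)."""
    def to_ms(dt):
        return (dt - EPOCH) // timedelta(milliseconds=1)
    return uuid.UUID(int=to_ms(start) << 80), uuid.UUID(int=to_ms(end) << 80)


if __name__ == "__main__":
    # Hand-built sample ID for 1_700_000_000_000 ms (assumed v7-style layout).
    sample = uuid.UUID(int=(1_700_000_000_000 << 80) | (0x7 << 76) | (0b10 << 62) | 1)
    print(id_timestamp(sample))
    lo, hi = id_range(id_timestamp(sample), id_timestamp(sample) + timedelta(hours=1))
    print(lo <= sample < hi)
```

With a primary-key index on the ID, a range like this becomes an ordinary index range scan, which is the range/prefix search use case mentioned above.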

I've also yet to come across a case where I really care about leaking the timestamp in the ID. Or maybe once or twice and just had a secondary field that could be used.

I just don't want people inferring any other semantics from the identifier! ... and then I inevitably end up doing exactly that when debugging

And it certainly solves the problem of having to evaluate all the options every so often. The fact that there's a UUIDv8 that is customizable really drives home the frustration!

check out this uuid compatible id generator algorithm that specifically addresses the database locality/prefix issue: www.2ndquadrant.com/en/blog/sequ...

Sequential UUID Generators - 2ndQuadrant | PostgreSQL (www.2ndquadrant.com): the sequential-uuids extension introduces generators of sequential UUIDs, addressing some of the common issues: random I/O patterns and WAL write amplification.

sounds about right! though boy do i enjoy something a little more compact than hex.

Good point, that's another problem with the typical string version of UUIDs. I'd like base32 at least. Added to the wishlist!

ksuids are base62, and at 27 chars it still feels long. But... we probably don't need base1024 haha!
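For a feel of how much each alphabet buys you, here is a quick back-of-the-envelope calculation. It assumes fixed-width encodings of the whole ID as one big integer, with no hyphens or padding; `encoded_length` is a made-up helper.

```python
def encoded_length(bits: int, base: int) -> int:
    """Smallest number of base-`base` digits that can represent any `bits`-bit value."""
    digits, capacity = 0, 1
    while capacity < (1 << bits):
        capacity *= base
        digits += 1
    return digits


if __name__ == "__main__":
    for label, bits, base in [
        ("128-bit, hex", 128, 16),
        ("128-bit, base32", 128, 32),
        ("128-bit, base62", 128, 62),
        ("160-bit (ksuid-sized), base62", 160, 62),
        ("128-bit, base1024", 128, 1024),
    ]:
        print(f"{label}: {encoded_length(bits, base)} chars")
```

So base32 would take a 128-bit ID from 32 hex characters (36 with hyphens) down to 26, and base62 down to 22. ksuid's 27 characters come from its 160 bits rather than from the alphabet.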

GitHub - keith-turner/ecoji (github.com): Encodes (and decodes) data as emojis.

The problem you get with distributed databases (like CockroachDB) is hot partitions. The database splits the table into ranges. If all the new IDs have the same prefix, all the writes will end up going to the same node, and overload that node, while the rest of the nodes will not be doing anything.
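A toy simulation shows the effect. This is a deliberately simplified sketch: it assumes the keyspace is cut into a fixed number of equal ranges by the first two bytes of the ID, one range per node. Real systems such as CockroachDB split and rebalance ranges dynamically, so treat the numbers as illustrative only. All names here are hypothetical.

```python
import os
from collections import Counter


def range_for(id_bytes: bytes, num_ranges: int) -> int:
    """Map an ID to one of `num_ranges` equal, contiguous key ranges by its 2-byte prefix."""
    prefix = int.from_bytes(id_bytes[:2], "big")
    return prefix * num_ranges // 65536


def write_distribution(ids, num_ranges: int) -> Counter:
    """Count how many writes land on each range (one range per node here)."""
    return Counter(range_for(i, num_ranges) for i in ids)


def random_ids(n):
    return [os.urandom(16) for _ in range(n)]


def time_prefixed_ids(n, start_ms=1_700_000_000_000):
    # 6-byte millisecond timestamp prefix followed by 10 random bytes.
    return [(start_ms + i).to_bytes(6, "big") + os.urandom(10) for i in range(n)]


if __name__ == "__main__":
    print("time-prefixed:", dict(write_distribution(time_prefixed_ids(10_000), 8)))
    print("random:       ", dict(write_distribution(random_ids(10_000), 8)))
```

In this setup every time-prefixed write lands in the same range, while the random IDs spread roughly evenly across all eight.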