I wanna read about appview v2. in lieu of it being open source, imo writing about the design decisions in running an atproto application is the most impactful next step in making developing one feel feasible
there is really not much to it: just a re-implementation of parts of the open source v1 using different datastores that scale differently. could probably still be running the v1 code with postgresql today with full network load (read and write), and especially if only write (indexing) and low read
the big technical thing is that scaling up postgresql (or any trad SQL) "vertically" (huge instances) is super expensive. you can do replicas for read volume, but you start hitting hundreds of thousands of dollars a month in cloud spend quickly
and with complex queries and rapid development (new queries, tables, indices), it gets hard to guess where scaling limits and problems will be. you can horizontally scale (shard), but that is a bunch of dev and ops work
switching to key/value or columnar data is much cheaper (dollars) for larger request volume, and way more predictable scaling. at the cost of being constrained about queries and needing to manually implement indices.
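To make "manually implement indices" concrete, here is a minimal sketch in Python. It is not the appview's actual code: `FollowStore` and its methods are hypothetical names, and plain dicts stand in for a real key/value store. The point is only that the application, not the database, has to keep the secondary lookup in sync on every write.

```python
from collections import defaultdict


class FollowStore:
    """Toy key/value store with a hand-maintained secondary index (illustrative only)."""

    def __init__(self):
        # Primary "table": (actor, subject) -> record. Answers "does A follow B?"
        self.follows = {}
        # Manual index: subject -> set of actors. Answers "who follows B?"
        self.followers_by_subject = defaultdict(set)

    def put_follow(self, actor, subject, record):
        # Every write updates the index too; the store will not do it for us.
        self.follows[(actor, subject)] = record
        self.followers_by_subject[subject].add(actor)

    def delete_follow(self, actor, subject):
        # Only touch the index if the primary row actually existed.
        if self.follows.pop((actor, subject), None) is not None:
            self.followers_by_subject[subject].discard(actor)

    def is_following(self, actor, subject):
        return (actor, subject) in self.follows

    def followers_of(self, subject):
        # Only the access patterns we planned for are cheap to answer.
        return sorted(self.followers_by_subject.get(subject, ()))
```

Any query that was not planned for (say, "follows created last week") would need yet another index written and backfilled by hand, which is the constraint described above.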
that was kind of it? honestly, I think the tricks we did with postgresql to keep v1 scaled up are in some ways more technically impressive and interesting than the v2 stuff. eg, use of sequencers, use of advisory locks, to avoid needing other services
the v2 stuff is probably just never going to be useful for others to run. lots of little infra things that fit our very specific hardware selection and network environment, etc. and (IMHO), we did a great job factoring things so that you can still see how appview works (app logic) in the open code
makes sense! I guess I figured there was more to it infra-wise & wanted to hear about if/how considerations for atproto at scale are different from centralized apps
the market segmentation/pricing for cloud SQL database is pretty wild. the base assumption seems to be that if you have a 200 GB+ postgresql database you surely must have 7-digit+ monthly revenue
NVMe hardware isn't cheap, but it isn't that expensive! there are plenty of 1 TB db tables in the world that are just fun smol projects that you can literally run on a laptop but would cost more than big city housing cost to run in the cloud
eh, lotta margin on the upside and it's one of those things where the people writing checks don't blink at 250k+
What about, like, more classic VPS-style big compute instances or rented bare metal servers and setting up the database manually there instead of a managed DB service, that should be possible?
yeah, bare-metal with dedicated disks is vastly more accessible. we potentially could have done just that transition and kept the appview v1 arch the same and continued to scale very affordably
this is the most "I worked on something for months/years and forget others barely know the surface of it" comment I've read in a while 😁 no offense meant, it happens, I know it's tricky to unpack everything one learns while building stuff
also, I’m out of my depth with running big backends, others in thread seem more experienced. so might just be that
how the appview works and why it was designed that way is pretty complex! bunch of fun little details about "fan out" of timelines, hydration, etc
what I was mostly trying to answer was "what is different between the open postgres and closed scylladb implementations" and that is "not a lot"
in that case we need stories about the trials and tribulations of developing v1
I will say v1 is still pretty intimidating, probably in part because I haven't had much reason to look into it
it's definitely the black box of ATproto. I'd like to know more about it too, but not enough to get buried in docs and source. this is why we need story time: a chance to have juice and a cookie while ATproto engineers talk about the challenges they faced and decisions they made
written in sweat and tears in the git history!
I've been meaning to do a big writeup on it for a blog or something but need to find the time/space to do it and also not quite sure how to lay it out. What would be the things you're most excited to hear about from something like that? Most of the work was in the implementation once we designed it
I think the most valuable perspective would come from placing yourself in the shoes of someone wanting to build a new appview and how you would do it knowing always you know now
oh I totally hijacked the thread, sorry futur
nah I spent a few minutes trying to figure out what I'd really like to see and I think you nailed it
*knowing what you know now
a related thing i'd love to write up is what the common patterns of implementing an appview are likely to be. want to experiment with more apps/impls, but can already guess generic patterns
for example, could build a big generic directional graph database and query references between records; even if you don't know the lexicon schemas! can just look for fields which are valid AT-URIs, put them in a giant table. lots that could be built with a "generic" atproto aggregator/framework
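A rough sketch of that schema-agnostic idea in Python: walk any decoded record, collect every string that looks like an AT-URI, and emit `(source, field path, target)` rows for one big edge table. The regex is a deliberately simplified assumption, looser than the real AT-URI syntax rules, and the function names are made up for illustration.

```python
import re

# Simplified AT-URI shape: at://<authority>[/<collection>[/<rkey>]].
# This is an assumption for the sketch, not the full syntax from the spec.
AT_URI = re.compile(
    r"^at://[A-Za-z0-9._:%-]+(/[A-Za-z0-9.-]+(/[A-Za-z0-9._~:@!$&'()*+,;=%-]+)?)?$"
)


def find_at_uris(value, path=""):
    """Yield (field_path, uri) for every string in a JSON-like value that looks like an AT-URI."""
    if isinstance(value, str):
        if AT_URI.match(value):
            yield path, value
    elif isinstance(value, dict):
        for key, child in value.items():
            yield from find_at_uris(child, f"{path}.{key}" if path else key)
    elif isinstance(value, list):
        for child in value:
            # List positions are collapsed into "[]" so paths stay comparable across records.
            yield from find_at_uris(child, f"{path}[]")


def edges_for_record(source_uri, record):
    """Return (source, field_path, target) rows without knowing the record's Lexicon."""
    return [(source_uri, path, target) for path, target in find_at_uris(record)]
```

Run over a like record, this yields a single edge from the like to the post at field path `subject.uri`. The same code would index references in a record type it has never seen before.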
another angle at this is, "what would Ruby on Rails look like for atproto". or django. can you code-gen most of that given a bunch of Lexicons schema docs? maybe with some additional annotations like "this field should have fulltext search"?
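As a thought experiment, a code generator along those lines could start out as small as the Python sketch below. It reads a record Lexicon document and emits rough PostgreSQL-flavored DDL. Everything here is an assumption for illustration: `x-fulltext` is a made-up annotation (not part of Lexicon), the type mapping is minimal, and anything it does not understand falls back to a JSONB column.

```python
# Minimal mapping from Lexicon primitive types to SQL column types (illustrative only).
TYPE_MAP = {"string": "TEXT", "integer": "BIGINT", "boolean": "BOOLEAN"}


def table_ddl(lexicon):
    """Generate rough DDL for a record Lexicon; 'x-fulltext' is a hypothetical annotation."""
    main = lexicon["defs"]["main"]
    if main["type"] != "record":
        raise ValueError("only record lexicons are handled in this sketch")
    table = lexicon["id"].replace(".", "_")
    schema = main["record"]
    required = set(schema.get("required", []))
    columns = ["uri TEXT PRIMARY KEY"]
    extra = []
    for name, prop in schema["properties"].items():
        # Unmapped types (refs, unions, arrays, blobs) fall back to JSONB.
        sql_type = TYPE_MAP.get(prop["type"], "JSONB")
        null = " NOT NULL" if name in required else ""
        columns.append(f"{name} {sql_type}{null}")
        if prop.get("x-fulltext"):
            extra.append(
                f"CREATE INDEX {table}_{name}_fts ON {table} "
                f"USING GIN (to_tsvector('english', {name}));"
            )
    body = ",\n  ".join(columns)
    return "\n".join([f"CREATE TABLE {table} (\n  {body}\n);", *extra])
```

A real framework would also need migrations, validation, and query generation. The sketch only shows that the schema documents carry enough structure to derive storage from.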
Yeah, all of this! Stuff that might be obvious to y'all now, but not at all to someone who has worked with backends and databases in general, but nothing of that kind
Agreed, a writeup about what worked, what didn't work, and why, for the approaches we tried, gives the most insight. Not even "what if you knew what you know now", but more about the actual development process, what it took to figure out what you know now.
Agree. The best person to write that might be one of the interested people outside the bsky dev team, who would then ask the people on the dev team a lot of questions to find the answers they need. Like, agree to write an article/guide, set a deadline for publishing and schedule a bunch of Q&A calls
Maybe it's easier to find the time for the kind of writeups you guys mention @bnewbold.net @jaz.bsky.social if it's part of an official, scheduled and communicated public project. Just spitballing some "no pressure" 🙃 ideas for @jay.bsky.team or someone to pressure you with
Unfortunately I think "we didn't bother publishing the source" is going to be a rather bad point of divergence for any real prospect of decentralization :/
ehh in this case I don't think it's a big deal, though I do find the reasoning weak. the appview is really only useful as a rough reference at best anyways. while I'd prefer not to have to trust them to keep it updated, crud impl details aren't too high on my list of concerns, relatively speaking