futur: I wanna read about appview v2 in lieu of it being open source. imo writing about the design decisions in running an atproto application is the most impactful next step in making developing one feel feasible

there is really not much to it: just a re-implementation of parts of the open source v1 using different datastores that scale differently. could probably still be running the v1 code with postgresql today with full network load (read and write), and especially if only write (indexing) and low read

the big technical thing is that scaling up postgresql (or any trad SQL) "vertically" (huge instances) is super expensive. you can do replicas for read volume, but you start hitting hundreds of thousands of dollars a month in cloud spend quickly

and with complex queries and rapid development (new queries, tables, indices), it gets hard to guess where scaling limits and problems will be. you can horizontally scale (shard), but that is a bunch of dev and ops work

switching to key/value or columnar data is much cheaper (dollars) for larger request volume, and way more predictable scaling. at the cost of being constrained about queries and needing to manually implement indices.
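To make "manually implement indices" concrete, here is a minimal sketch in Python. It is not how the v2 appview does it: the `PostStore` class, its field names, and the in-memory dicts are all hypothetical stand-ins for a real key/value store. The point is only that every write has to update the index by hand, and every new query shape needs its own index.

```python
from collections import defaultdict


class PostStore:
    """Toy key/value store with a hand-maintained secondary index (hypothetical)."""

    def __init__(self):
        self.posts = {}                    # primary data: post URI -> record
        self.by_author = defaultdict(set)  # manual index: author DID -> post URIs

    def put(self, uri, record):
        # On overwrite, drop the stale index entry first; a SQL index
        # would do this for us automatically.
        old = self.posts.get(uri)
        if old is not None:
            self.by_author[old["author"]].discard(uri)
        self.posts[uri] = record
        self.by_author[record["author"]].add(uri)

    def delete(self, uri):
        old = self.posts.pop(uri, None)
        if old is not None:
            self.by_author[old["author"]].discard(uri)

    def posts_by_author(self, author):
        # The only query this store supports besides lookup by URI.
        uris = self.by_author.get(author, ())
        return [self.posts[u] for u in sorted(uris)]


store = PostStore()
store.put("at://did:plc:alice/app.bsky.feed.post/1", {"author": "did:plc:alice", "text": "hi"})
store.put("at://did:plc:alice/app.bsky.feed.post/2", {"author": "did:plc:alice", "text": "again"})
store.put("at://did:plc:bob/app.bsky.feed.post/1", {"author": "did:plc:bob", "text": "yo"})
store.delete("at://did:plc:alice/app.bsky.feed.post/1")
alice_texts = [p["text"] for p in store.posts_by_author("did:plc:alice")]
```

Lookups by author stay cheap and predictable, but something like "posts containing a word" would need another index written the same way.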

that was kind of it? honestly, I think the tricks we did with postgresql to keep v1 scaled up are in some ways more technically impressive and interesting than the v2 stuff. eg, use of sequencers, use of advisory locks, to avoid needing other services

the v2 stuff is probably just never going to be useful for others to run. lots of little infra things that fit our very specific hardware selection and network environment, etc. and (IMHO), we did a great job factoring things so that you can still see how appview works (app logic) in the open code

makes sense! I guess I figured there was more to it infra-wise & wanted to hear about if/how considerations for atproto at scale are different from centralized apps

's hard

… how much?! 😐

the market segmentation/pricing for cloud SQL database is pretty wild. the base assumption seems to be that if you have a 200 GB+ postgresql database you surely must have 7-digit+ monthly revenue

NVMe hardware isn't cheap, but it isn't that expensive! there are plenty of 1 TB db tables in the world that are just fun smol projects that you can literally run on a laptop but would cost more than big-city housing to run in the cloud

eh, lotta margin on the upside and it's one of those things where the people writing checks don’t blink at 250k+

What about, like, more classic VPS-style big compute instances or rented bare metal servers and setting up the database manually there instead of a managed DB service, that should be possible?

yeah, bare-metal with dedicated disks is vastly more accessible. we potentially could have done just that transition and kept the appview v1 arch the same and continued to scale very affordably

this is the most ”I worked on something for months/years and forget others barely know the surface of it” comment I’ve read in a while 😁 no offense meant, it happens, I know it’s tricky to unpack everything one learns while building stuff

also, I’m out of my depth with running big backends, others in thread seem more experienced. so might just be that

how the appview works and why it was designed that way is pretty complex! bunch of fun little details about "fan out" of timelines, hydration, etc

what I was mostly trying to answer was "what is different between the open postgres and closed scylladb implementations" and that is "not a lot"

in that case we need stories about the trials and tribulations of developing v1

I will say v1 is still pretty intimidating, probably in part because I haven't had much reason to look into it

it's definitely the black box of ATproto. I'd like to know more about it too, but not enough to get buried in docs and source. this is why we need story time. A chance to have juice and a cookie while ATproto engineers talk about the challenges they faced and decisions they made

written in sweat and tears in the git history!

I've been meaning to do a big writeup on it for a blog or something but need to find the time/space to do it and also not quite sure how to lay it out. What would be the things you're most excited to hear about from something like that? Most of the work was in the implementation once we designed it

I think the most valuable perspective would come from placing yourself in the shoes of someone wanting to build a new appview and how you would do it knowing always you know now

oh I totally hijacked the thread, sorry futur

nah I spent a few minutes trying to figure out what I'd really like to see and I think you nailed it

*knowing what you know now

a related thing i'd love to write up is what the common patterns of implementing an appview are likely to be. want to experiment with more apps/impls, but can already guess generic patterns

for example, could build a big generic directional graph database and query references between records; even if you don't know the lexicon schemas! can just look for fields which are valid AT-URIs, put them in a giant table. lots that could be built with a "generic" atproto aggregator/framework
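A rough sketch of that idea in Python, assuming records arrive as already-decoded JSON-like dicts. The regex is a deliberately simplified approximation of AT-URI syntax (authority, collection, record key), not the full spec grammar, and `find_references` / `edges_for` are hypothetical names for illustration.

```python
import re

# Simplified AT-URI shape (assumption): at://<authority>/<collection>/<record key>
AT_URI = re.compile(r"at://[A-Za-z0-9._:%-]+/[A-Za-z0-9.-]+/[A-Za-z0-9._:~-]+")


def find_references(value, path=""):
    """Yield (field_path, uri) for every string that looks like an AT-URI."""
    if isinstance(value, str):
        if AT_URI.fullmatch(value):
            yield path, value
    elif isinstance(value, dict):
        for key, child in value.items():
            yield from find_references(child, f"{path}.{key}" if path else key)
    elif isinstance(value, list):
        for i, child in enumerate(value):
            yield from find_references(child, f"{path}[{i}]")


def edges_for(source_uri, record):
    """Rows for the 'giant table': (source record, field path, target record)."""
    return [(source_uri, field, target) for field, target in find_references(record)]


# A like-shaped record; no knowledge of its lexicon schema is used.
like = {
    "$type": "app.bsky.feed.like",
    "subject": {"uri": "at://did:plc:abc/app.bsky.feed.post/3k", "cid": "bafyexample"},
    "createdAt": "2024-01-01T00:00:00Z",
}
edges = edges_for("at://did:plc:xyz/app.bsky.feed.like/1", like)
```

Here `edges` holds one row, pointing from the like to the post via the `subject.uri` field. Indexing that table by target gives "everything that references this record" for any lexicon.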

another angle at this is, "what would Ruby on Rails look like for atproto". or django. can you code-gen most of that given a bunch of Lexicons schema docs? maybe with some additional annotations like "this field should have fulltext search"?
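As a toy illustration of the code-gen angle, the sketch below turns a Lexicon-style record schema into PostgreSQL DDL. Everything here is an assumption for illustration: the `com.example.note` schema is made up, the `x-fulltext` key is a hypothetical annotation that is not part of Lexicon, and the type mapping is one arbitrary choice among many.

```python
# Made-up schema in the general shape of a Lexicon record definition.
LEXICON = {
    "lexicon": 1,
    "id": "com.example.note",
    "defs": {
        "main": {
            "type": "record",
            "record": {
                "type": "object",
                "required": ["text", "createdAt"],
                "properties": {
                    "text": {"type": "string", "x-fulltext": True},  # hypothetical annotation
                    "createdAt": {"type": "string", "format": "datetime"},
                    "pinned": {"type": "boolean"},
                },
            },
        }
    },
}

# Assumed mapping; anything unrecognized falls back to JSONB.
SQL_TYPES = {"string": "TEXT", "integer": "BIGINT", "boolean": "BOOLEAN"}


def generate_ddl(lexicon):
    """Return a list of SQL statements: one CREATE TABLE plus any indexes."""
    table = lexicon["id"].replace(".", "_")
    record = lexicon["defs"]["main"]["record"]
    required = set(record.get("required", []))
    columns = ["uri TEXT PRIMARY KEY"]
    indexes = []
    for name, spec in record["properties"].items():
        if spec.get("format") == "datetime":
            sql_type = "TIMESTAMPTZ"
        else:
            sql_type = SQL_TYPES.get(spec["type"], "JSONB")
        not_null = " NOT NULL" if name in required else ""
        columns.append(f'"{name}" {sql_type}{not_null}')
        if spec.get("x-fulltext"):
            indexes.append(
                f"CREATE INDEX {table}_{name}_fts ON {table} "
                f"USING GIN (to_tsvector('english', \"{name}\"));"
            )
    body = ",\n  ".join(columns)
    return [f"CREATE TABLE {table} (\n  {body}\n);"] + indexes


statements = generate_ddl(LEXICON)
```

A real generator would also need to handle refs, unions, arrays, and nested objects, which is presumably where most of the framework design work would go.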

Yeah, all of this! Stuff that might be obvious to y'all now, but not at all to someone who has worked with backends and databases in general, but nothing of that kind

Agreed, a writeup about what worked, what didn't work, and why certain approaches were tried gives the most insight. Not even "what if you knew what you know now", but more about the actual development process, what it took to figure out what you know now.

Agree. The best person to write that might be one of the interested people outside the bsky dev team, who would then ask a lot of questions from the people on the dev team to find the answers they need. Like, agree to write an article/guide, set a DL for publish and schedule a bunch of Q&A calls

Maybe it's easier to find the time for the kind of writeups you guys mention @bnewbold.net @jaz.bsky.social if it’s part of an official, scheduled and communicated public project. Just spitballing some ”no pressure” 🙃 ideas for @jay.bsky.team or someone to pressure you with

Unfortunately I think "we didn't bother publishing the source" is going to be a rather bad point of divergence for any real prospect of decentralization :/

ehh in this case I don't think it's a big deal, though I do find the reasoning weak. the appview is really only useful as a rough reference at best anyways. while I'd prefer not to have to trust them to keep it updated, crud impl details aren't too high on my list of concerns, relatively speaking
