My guess is that Grok is just a rebranded version of some model well behind the cutting edge under the hood, though, right? I'm skeptical that it was trained from the ground up on Twitter data.
It would be insane to train an LLM from scratch "on Twitter data". They might have done ground-up training, but it would have used the same broad internet corpora as everyone else.