
I don't get it. Why isn't the model open if it works? If it isn't, this is just a fart in the wind. If it is, the findings should be straightforward to replicate.


Yes, the community should force Nature to up its standards or ditch it. Software replication should be trivial in this day and age.
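To be clear, even "trivial" replication means pinning seeds and determinism flags up front. A minimal sketch of that boilerplate, assuming a PyTorch-based pipeline (names illustrative):

    # Sketch of the determinism boilerplate a replication would pin down
    # (assumes PyTorch; GPU kernels may need extra env config on top of this).
    import random
    import numpy as np
    import torch

    def seed_everything(seed: int = 0) -> None:
        random.seed(seed)
        np.random.seed(seed)
        torch.manual_seed(seed)
        # Raise an error instead of silently using a nondeterministic kernel.
        torch.use_deterministic_algorithms(True)

    seed_everything(42)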


All these papers doing "research" on how to better prompt ChatGPT would then be unpublishable, given that API access to older models gets retired, so their findings can no longer be reproduced.

(I agree with you in principle; my example above is meant to show that standards for things such as reproducibility aren't easily defined. There are so many factors to consider.)
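(For concreteness, here's roughly what the core of such a study reduces to; a minimal sketch using the OpenAI Python SDK, with an illustrative snapshot name. Once that snapshot is retired, the exact experiment can't be re-run, whatever else the paper pins down.)

    # Sketch of an API-based prompting experiment (model name illustrative).
    # Reproducibility hinges on the pinned snapshot still being served.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    response = client.chat.completions.create(
        model="gpt-4-0613",  # pinned snapshot; once retired, unrepeatable
        messages=[{"role": "user", "content": "Summarize: ..."}],
        temperature=0,  # reduces variance, but outputs still aren't guaranteed stable
    )
    print(response.choices[0].message.content)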


Well, since you put "research" in quotes, I think you also agree that this type of work does not really belong in a quality journal with a high impact factor ;)


This. Their training data doesn't seem to be open either, so it's literally impossible to replicate their model. That makes me highly skeptical.



As far as I understand it, only kind of? It's open source, but in their paper they did a tonne of pre-training, and whilst they've released a small pre-training checkpoint, they haven't released the results of the pre-training they did for their paper. So anyone reproducing this will inevitably be accused of failing to pretrain the model correctly?


I think the pre-trained checkpoint uses the same 20 TPU blocks as the original paper, but it probably isn't the exact same checkpoint, as the paper itself is from 2020/2021.
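One cheap way to settle "is this the exact checkpoint from the paper" disputes would be publishing a content hash alongside the weights. A minimal sketch, assuming the checkpoint is a directory of files (path illustrative):

    # Sketch: fingerprint a released checkpoint so reproducers can confirm
    # they're starting from the exact weights the paper used.
    import hashlib
    from pathlib import Path

    def checkpoint_digest(ckpt_dir: str) -> str:
        h = hashlib.sha256()
        root = Path(ckpt_dir)
        for f in sorted(root.rglob("*")):  # stable order across machines
            if f.is_file():
                h.update(str(f.relative_to(root)).encode())
                h.update(f.read_bytes())
        return h.hexdigest()

    print(checkpoint_digest("checkpoints/pretrained"))  # path is illustrative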



