原推:@_BruceX_ Then why does the RLHF model need to have so many parameters?
And is there something I’m misunderstanding in terms of the trade-off between number of parameters and amount of data used to train a model; shouldn’t that ratio be generalizable in some sense?
https://twitter.com/wintonARK/status/1623814712894058498