
CriticGPT is not intended for regular users; it is an internal OpenAI research project on using models to supervise other models. It is somewhat similar to OpenAI's earlier weak-to-strong generalization work, in which weaker models are used to supervise stronger ones.

In simple terms, CriticGPT aims to solve a particular problem: the limitations of human evaluation of model outputs. As models become more powerful, humans may not reliably assess the correctness of their outputs.

CriticGPT's approach is relatively easy to understand and resembles the idea of self-play in multi-agent systems: a GPT-4 model is responsible for generating outputs, while CriticGPT checks those outputs for errors. This time, however, CriticGPT is trained specifically to critique code and catch bugs.
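
As a rough illustration of that division of labor, here is a minimal Python sketch of a generate-then-critique loop. The model names and the `call_model` helper are hypothetical placeholders rather than OpenAI's actual API; the point is only the structure, one model producing an answer and a second model reviewing it.

```python
# Hypothetical sketch of a generator/critic pipeline; not OpenAI's real API.

def call_model(model: str, prompt: str) -> str:
    """Stand-in for a chat-completion call to the named model."""
    return f"[{model} response to: {prompt[:40]}...]"

def generate_answer(task: str) -> str:
    # The generator (e.g. a GPT-4 model) produces a candidate solution.
    return call_model("generator-gpt4", task)

def critique_answer(task: str, answer: str) -> str:
    # The critic (CriticGPT) reviews the candidate and points out bugs.
    prompt = (
        f"Task:\n{task}\n\n"
        f"Candidate answer:\n{answer}\n\n"
        "List any bugs or errors in the answer."
    )
    return call_model("critic-gpt", prompt)

if __name__ == "__main__":
    task = "Write a function that returns the n-th Fibonacci number."
    answer = generate_answer(task)
    print("Answer:", answer)
    print("Critique:", critique_answer(task, answer))
```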

CriticGPT is also trained with RLHF (Reinforcement Learning from Human Feedback). But unlike ChatGPT, which is trained to chat, CriticGPT is trained on a large number of inputs that deliberately contain errors and is required to critique them. (As the names suggest, ChatGPT is for chatting, while CriticGPT is for critiquing and pointing out flaws.)
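
To make that training setup concrete, here is a hedged sketch of what one critique-training example might look like: a trainer takes working code, inserts a known bug, and pairs the result with the critique a good critic should produce. The exact data format OpenAI uses is not public, so the fields below are assumptions for illustration only.

```python
# Hypothetical construction of a single critique-training example.
# The prompt/reference_critique fields are assumed, not OpenAI's actual schema.

correct_code = "def mean(xs):\n    return sum(xs) / len(xs)\n"

# A trainer tampers with the code (here: an off-by-one in the divisor).
buggy_code = "def mean(xs):\n    return sum(xs) / (len(xs) - 1)\n"

training_example = {
    "prompt": f"Critique the following code:\n{buggy_code}",
    "reference_critique": (
        "The divisor should be len(xs), not len(xs) - 1. As written, the "
        "function returns the wrong value and raises ZeroDivisionError for "
        "a single-element list."
    ),
}

print(training_example["prompt"])
print(training_example["reference_critique"])
```

Because the bugs are inserted on purpose, the trainers know where the errors are, which makes it possible to reward the critic for catching them.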
