Codex, a large language model (LLM) trained on a variety of codebases, exceeds the previous state of the art in its capacity to synthesize and generate...
We show that autoregressive language models can learn to infill text after we apply a straightforward transformation to the dataset, which simply moves a span of...
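The span-moving transformation described above (often called fill-in-the-middle, or FIM) can be sketched as below. This is a minimal illustration: the sentinel strings and the character-level split are assumptions for clarity, not the exact tokens or splitting scheme used in training.

```python
import random

def fim_transform(document: str, rng: random.Random,
                  pre: str = "<PRE>", suf: str = "<SUF>",
                  mid: str = "<MID>") -> str:
    """Move a random middle span to the end of the document so an
    autoregressive model can learn to infill it from both sides."""
    # Pick two cut points to split the document into prefix/middle/suffix.
    i, j = sorted(rng.sample(range(len(document) + 1), 2))
    prefix, middle, suffix = document[:i], document[i:j], document[j:]
    # Reorder: the model reads prefix and suffix, then predicts the middle.
    return f"{pre}{prefix}{suf}{suffix}{mid}{middle}"

rng = random.Random(0)
print(fim_transform("def add(a, b):\n    return a + b\n", rng))
```

Because the transformation only reorders text, concatenating the pieces back in prefix-middle-suffix order recovers the original document, so no information is lost from the training data.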
Other existing approaches frequently use smaller, more closely paired audio-text training datasets,[^reference-1][^reference-2][^reference-3] or use broad but unsupervised audio pretraining.[^reference-4][^reference-5][^reference-6] Because Whisper was trained on a large...
In reinforcement learning from human feedback, it is common to optimize against a reward model trained to predict human preferences. Because the reward model is an...
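One simple way to optimize against a learned reward model is best-of-n sampling: generate several candidates and keep the one the reward model scores highest. The sketch below is a generic illustration with a deliberately exploitable toy proxy; the scoring function and candidates are placeholders, not the actual models.

```python
def best_of_n(candidates, proxy_reward):
    """Return the candidate the (imperfect) reward model scores highest.
    Pressing harder on the proxy can drift away from true preferences."""
    return max(candidates, key=proxy_reward)

# Toy proxy reward: prefers longer answers (an exploitable heuristic).
candidates = ["ok", "a longer answer", "a much, much longer answer!!!"]
print(best_of_n(candidates, len))  # the longest string wins
```

The toy proxy makes the failure mode concrete: as n grows, best-of-n increasingly selects outputs that exploit the proxy's quirks rather than outputs humans actually prefer.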
While recent work on text-conditional 3D object generation has shown promising results, the state-of-the-art methods typically require multiple GPU-hours to produce a single sample. This is...
As generative language models improve, they open up new possibilities in fields as diverse as healthcare, law, education and science. But, as with any new technology,...
system: You are a tutor that always responds in the Socratic style. You *never* give the student the answer, but always try to ask just the...
We investigate the potential implications of Generative Pre-trained Transformer (GPT) models and related technologies on the U.S. labor market. Using a new rubric, we assess occupations...
Although the vast majority of our explanations score poorly, we believe we can now use ML techniques to further improve our ability to produce explanations. For...
We’ve trained a model to achieve a new state-of-the-art in mathematical problem solving by rewarding each correct step of reasoning (“process supervision”) instead of simply rewarding...
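The contrast between the two supervision signals can be sketched as two scoring rules over a chain of reasoning steps. The step labels and reward values below are illustrative assumptions, not the paper's actual reward scheme.

```python
def outcome_reward(final_answer_correct: bool) -> float:
    """Outcome supervision: reward depends only on the final answer."""
    return 1.0 if final_answer_correct else 0.0

def process_reward(steps_correct: list) -> float:
    """Process supervision: credit each correct intermediate step."""
    if not steps_correct:
        return 0.0
    return sum(steps_correct) / len(steps_correct)

# A solution with a lucky final answer but a flawed middle step:
steps = [True, False, True]        # step-level human labels
print(outcome_reward(True))        # 1.0 despite the bad step
print(process_reward(steps))       # ~0.67, penalizing the flawed step
```

The point of the sketch is that only the second rule distinguishes a correct answer reached by sound reasoning from one reached by a flawed chain, which is what rewarding each step buys over rewarding the final result alone.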