1). Sparks of AGI - a comprehensive investigation of an early version of GPT-4 when it was still in active development by OpenAI. (paper)
2). Reflexion - proposes an agent with dynamic memory and self-reflection capabilities to enhance its existing reasoning trace and task-specific action choice abilities. (paper)
3). GPT-4 for Medical Challenge Problems - shows that GPT-4 exceeds the passing score on the USMLE by over 20 points and outperforms GPT-3.5 as well as models specifically fine-tuned on medical knowledge (Med-PaLM, a prompt-tuned version of Flan-PaLM 540B). (paper)
4). GPTs are GPTs - investigates the potential implications of GPT models and related systems for the US labor market. (paper)
5). CoLT5 - a long-input Transformer model that employs conditional computation, devoting more resources to important tokens in both feedforward and attention layers. (paper)
6). Artificial muses - compares human-generated ideas with those produced by generative AI (GAI) chatbots such as ChatGPT and YouChat; reports that 9.4% of humans were more creative than the most creative GAI, GPT-4, and concludes that GAIs are valuable assistants in the creative process. (paper)
7). Analysis of GPT-3 and GPT-3.5 - a comprehensive capability analysis of GPT-series models; evaluates performance on 9 natural language understanding tasks using 21 datasets. (paper)
8). Context-faithful Prompting for LLMs - presents a prompting technique that aims to improve LLMs' faithfulness using strategies such as opinion-based prompts and counterfactual demonstrations. (paper)
9). Text2Room - a method for extracting room-scale textured 3D meshes from 2D text-to-image models. (paper)
10). PanGu-Σ - a trillion-parameter language model trained with sparse heterogeneous computing. (paper)