🥇Top ML Papers of the Week

The top ML Papers of the Week (Mar 20 - Mar 26)

Mar 26, 2023

1). Sparks of AGI - a comprehensive investigation of an early version of GPT-4 when it was still in active development by OpenAI. (paper)

Sebastien Bubeck@SebastienBubeck

At @MSFTResearch we had early access to the marvelous #GPT4 from @OpenAI for our work on @bing. We took this opportunity to document our experience. We're so excited to share our findings. In short: time to face it, the sparks of #AGI have been ignited. arxiv.org/abs/2303.12712

12:48 AM · Mar 23, 2023

717 Reposts · 2.93K Likes

2). Reflexion - proposes an agent with dynamic memory and self-reflection capabilities to enhance its existing reasoning trace and task-specific action choice abilities. (paper)

Siqi Chen@blader

🤯 this paper demonstrates you can improve gpt4 performance an astounding 30% by asking gpt4 to reflect on “why were you wrong?”, and generate a new prompt for itself taking that reason into account until it is correct. this is how humans learn! arxiv.org/pdf/2303.11366… 👇

8:40 PM · Mar 25, 2023

218 Reposts · 1.61K Likes
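
The reflect-and-retry loop described above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: `reflexion_loop` and the three callables (`attempt`, `evaluate`, `reflect`) are hypothetical names, and the toy functions below stand in for what would be LLM calls and a task-specific success check.

```python
from typing import Callable, List, Tuple


def reflexion_loop(
    task: str,
    attempt: Callable[[str, List[str]], str],
    evaluate: Callable[[str], bool],
    reflect: Callable[[str, str], str],
    max_trials: int = 3,
) -> Tuple[str, List[str]]:
    """Retry a task, feeding self-reflections from failed trials back in.

    All three callables are assumptions for illustration; in practice
    `attempt` and `reflect` would wrap model calls.
    """
    memory: List[str] = []  # reflections accumulated across trials
    answer = ""
    for _ in range(max_trials):
        answer = attempt(task, memory)
        if evaluate(answer):
            break  # success: stop retrying
        # On failure, store a note on what went wrong for the next trial.
        memory.append(reflect(task, answer))
    return answer, memory


# Toy stand-ins: the first attempt is wrong, the second uses the reflection.
def toy_attempt(task: str, memory: List[str]) -> str:
    return "5" if not memory else "4"


def toy_evaluate(answer: str) -> bool:
    return answer == "4"


def toy_reflect(task: str, answer: str) -> str:
    return f"Answer {answer!r} to {task!r} was wrong; recheck the arithmetic."


final_answer, reflections = reflexion_loop(
    "What is 2 + 2?", toy_attempt, toy_evaluate, toy_reflect
)
```

The loop stops as soon as the evaluator accepts an answer, so the number of stored reflections equals the number of failed trials.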

3). GPT-4 for Medical Challenge Problems - shows that GPT-4 exceeds the passing score on USMLE by over 20 points and outperforms GPT-3.5 as well as models specifically fine-tuned on medical knowledge (Med-PaLM, a prompt-tuned version of Flan-PaLM 540B). (paper)

DAIR.AI@dair_ai

Capabilities of GPT-4 on Medical Challenge Problems
Shows that GPT-4 exceeds the passing score on USMLE by over 20 points and outperforms GPT-3.5 as well as models specifically fine-tuned on medical knowledge (Med-PaLM, a prompt-tuned version of Flan-PaLM 540B).

2:36 PM · Mar 22, 2023

143 Reposts · 642 Likes

4). GPTs are GPTs - investigates the potential implications of GPT models and related systems on the US labor market. (paper)

JB Rubinovitz@rubinovitz

The "Will GPT automate all the jobs?" paper is out
With participation from @OpenAI, OpenResearch and @Penn 🧵 1/9

3:05 AM · Mar 20, 2023

2.12K Reposts · 8.89K Likes

5). CoLT5 - a long-input Transformer model that employs conditional computation, devoting more resources to important tokens in both feedforward and attention layers. (paper)

Aran Komatsuzaki@arankomatsuzaki

CoLT5: Faster Long-Range Transformers with Conditional Computation
Achieves:
- stronger performance than LongT5 with much faster training and inference
- SOTA on the SCROLLS benchmark
- strong gains up to 64k input length
arxiv.org/abs/2303.09752

12:32 AM · Mar 20, 2023

101 Reposts · 642 Likes
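
To make "devoting more resources to important tokens" concrete, here is a toy sketch of conditional computation in a feedforward layer: every token passes through a cheap branch, and only the top-k tokens by routing score also pass through an expensive branch. This is an illustration under stated assumptions rather than the CoLT5 code. The function name, the plain-list vectors, and the routing scores passed in as inputs are all hypothetical; in a real model the scores would come from a learned router.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def conditional_feedforward(
    tokens: Sequence[Vector],
    scores: Sequence[float],
    light: Callable[[Vector], Vector],
    heavy: Callable[[Vector], Vector],
    k: int,
) -> List[Vector]:
    """Apply `light` to every token and `heavy` only to the k highest-scored.

    `scores` stands in for a learned router (an assumption of this sketch).
    """
    by_score = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    routed = set(by_score[:k])  # indices of the "important" tokens

    outputs: List[Vector] = []
    for i, x in enumerate(tokens):
        update = light(x)
        if i in routed:
            # Scale the heavy branch by the routing score before adding it.
            update = [u + scores[i] * h for u, h in zip(update, heavy(x))]
        # Residual connection around both branches.
        outputs.append([xi + ui for xi, ui in zip(x, update)])
    return outputs


# Toy branches: the light branch adds nothing, the heavy branch adds 10.
toy_out = conditional_feedforward(
    tokens=[[1.0], [2.0], [3.0]],
    scores=[0.1, 0.5, 0.25],
    light=lambda x: [0.0 for _ in x],
    heavy=lambda x: [10.0 for _ in x],
    k=1,
)
```

With `k=1`, only the second token (score 0.5) receives the heavy update; the other two pass through unchanged, which is where the compute saving on long inputs would come from.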

6). Artificial muses - compares human-generated ideas with those generated by generative AI (GAI) chatbots like ChatGPT and YouChat; reports that only 9.4% of humans were more creative than GPT-4 and concludes that GAIs are valuable assistants in the creative process. (paper)

Ethan Mollick@emollick

GPT-4 was more creative than all but 9.4% of humans tested in this new paper
It gave the Alternative Uses Test (a measure of creativity where you need to come up with unique uses of everyday objects) to AI & 100 people. GPT-4 got high ratings from judges. arxiv.org/abs/2303.12003

4:39 PM · Mar 23, 2023

142 Reposts · 701 Likes

7). Analysis of GPT-3 and GPT-3.5 - a comprehensive capability analysis of GPT series models; evaluates performance on 9 natural language understanding tasks using 21 datasets. (paper)

Aran Komatsuzaki@arankomatsuzaki

A Comprehensive Capability Analysis of GPT-3 and GPT-3.5 Series Models arxiv.org/abs/2303.10420

1:11 AM · Mar 21, 2023

36 Reposts · 153 Likes

8). Context-faithful Prompting for LLMs - presents a prompting technique that aims to improve LLMs' faithfulness using strategies such as opinion-based prompts and counterfactual demonstrations. (paper)

elvis@omarsar0

Context-faithful Prompting for Large Language Models
Presents a neat prompting technique that aims to improve LLMs' faithfulness using strategies such as opinion-based prompts and counterfactual demonstrations. arxiv.org/abs/2303.11315
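
As a rough illustration of the two strategies named above, the sketch below builds an opinion-based prompt, where the context is attributed to a narrator and the question asks for that narrator's opinion, and optionally prepends counterfactual demonstrations whose contexts deliberately contradict common knowledge. The template wording, the narrator name "Bob", and the function names are assumptions made for this example, not necessarily the paper's exact prompts.

```python
from typing import Iterable, Tuple


def _format_example(context: str, question: str, narrator: str) -> str:
    """Frame the context as the narrator's statement (illustrative template)."""
    stem = question.strip().rstrip("?")
    possessive = narrator + "'s"
    return f'{narrator} said, "{context}"\nQ: {stem} in {possessive} opinion?\nA:'


def opinion_based_prompt(
    context: str,
    question: str,
    narrator: str = "Bob",
    demonstrations: Iterable[Tuple[str, str, str]] = (),
) -> str:
    """Build a prompt that asks what the narrator believes, given the context.

    `demonstrations` holds (context, question, answer) triples; using
    counterfactual contexts there nudges the model to trust the context.
    """
    parts = [
        _format_example(c, q, narrator) + " " + a for c, q, a in demonstrations
    ]
    parts.append(_format_example(context, question, narrator))
    return "\n\n".join(parts)


# The demonstration's context is false on purpose (a counterfactual).
prompt = opinion_based_prompt(
    context="The capital of France is Lyon.",
    question="What is the capital of France?",
    demonstrations=[
        ("The Moon is made of cheese.", "What is the Moon made of?", "cheese"),
    ],
)
```

The resulting string ends with an open `A:` so that a model completing it would answer from the supplied context rather than from memorized facts.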