🥇Top ML Papers of the Week

The top ML Papers of the Week (Feb 27 - Mar 5)

Mar 05, 2023

This issue highlights the top ML Papers of the Week (Feb 27 - Mar 5).

1). Language Is Not All You Need - introduces a multimodal large language model called Kosmos-1; achieves great performance on language understanding, OCR-free NLP, perception-language tasks, visual QA, and more. (paper)

elvis @omarsar0

Here we go! Microsoft introduces a multimodal large language model called Kosmos-1. Achieves great performance on language understanding, OCR-free NLP, perception-language tasks, visual QA, and more.

2). Comparing Brain Activations and Language Models - finds that human brain activity is best explained by the activations of modern language models enhanced with long-range and hierarchical predictions. (paper)

Meta AI @MetaAI

New in Nature Human Behavior, Meta AI researchers show how current language models differ from the human brain & highlight the role of long-range & hierarchical predictions. We hope these findings will help inform the next generation of AI ➡️ go.nature.com/3SKb3gX

3). EvoPrompting - combines evolutionary prompt engineering with soft prompt-tuning to find high-performing models; it relies on few-shot prompting, with an evolutionary search used to improve the in-context examples. (paper)

Angelica Chen @_angie_chen

New paper w/ @dmdohan and @david_r_so! Can LMs be used to design novel model architectures? We propose EvoPrompting, which evolves few-shot prompts to enable a code-pretrained LM to generate novel state-of-the-art architectures. arxiv.org/abs/2302.14838 (1/4)

4). Consistency Models - a new family of generative models that achieve high sample quality without adversarial training. (paper)

AK @_akhaliq

Consistency Models achieve the new state-of-the-art FID of 3.55 on CIFAR10 and 6.20 on ImageNet 64 × 64 for one-step generation abs: arxiv.org/abs/2303.01469

5). D5 - a new task of automatically discovering differences between two text corpora, described in natural language and driven by a user-specified goal; applications include discovering insights from commercial reviews and error patterns in NLP systems. (paper)

Ruiqi Zhong @ZhongRuiqi

I dream of building AI that can do research🤖🔬 We are still far from this, but in our most recent paper, we formalize the D5 task: tell GPT-3 your research goal and provide two large text datasets. It may discover patterns you didn’t notice!😉

6). Reconstructing Images from Human Brain Activity with Diffusion Models - proposes an approach for high-resolution image reconstruction with latent diffusion models from human brain activity. (paper)

deiniolb 🐻👉 aisuite.io @danberridge

I'm speechless. Not peer-reviewed yet but a submitted paper. The 'presented images' were shown to a group of humans. The 'reconstructed images' were the result of an fMRI output to Stable Diffusion. In other words, #stablediffusion literally read people's minds. Source 👇

7). Grounded Decoding - a scalable approach to planning with LLMs in embodied settings through grounding functions; GD is found to be a general, flexible, and expressive approach to embodied tasks. (paper)

Wenlong Huang @wenlong_huang

Large language models gathered tons of world knowledge by speaking human language. But can they ever speak “robot language”? Introducing “Grounded Decoding”: a scalable way to decode *grounded text* from LLM for robots. Website: grounded-decoding.github.io 🧵👇

8). Voltron - a framework for language-driven representation learning from human videos and captions for robotics. (paper)

Siddharth Karamcheti @siddkaramcheti

How can we use language supervision to learn better visual representations for robotics? Introducing Voltron: Language-Driven Representation Learning for Robotics! Paper: arxiv.org/abs/2302.12766 Models: github.com/siddk/voltron-… Evaluation: github.com/siddk/voltron-… 🧵👇(1 / 12)

Voltron Framework – Balancing language conditioning and generation to shape visual representation learning.

9). Dropout Reduces Underfitting - demonstrates that dropout can mitigate underfitting when used at the start of training; it counteracts SGD stochasticity and limits the influence of individual batches when training models. (paper)

Aran Komatsuzaki @arankomatsuzaki

Dropout Reduces Underfitting Finds that models equipped with early dropout achieve lower final training loss compared to their counterparts without dropout. arxiv.org/abs/2303.01500
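
To make the "dropout at the start of training" idea concrete, here is a minimal sketch in plain Python. It assumes a simple schedule where dropout is on at a fixed rate for the first few steps and then switched off entirely. The function names, the `cutoff_step` and `base_rate` parameters, and the constant-then-zero shape are illustrative assumptions, not the paper's exact recipe.

```python
import random


def early_dropout_rate(step: int, cutoff_step: int, base_rate: float = 0.1) -> float:
    """Hypothetical schedule: dropout is active only for the first
    `cutoff_step` training steps and disabled afterwards."""
    if not 0.0 <= base_rate < 1.0:
        raise ValueError("base_rate must be in [0, 1)")
    return base_rate if step < cutoff_step else 0.0


def apply_dropout(values, rate, rng):
    """Standard inverted dropout on a list of floats: zero each value with
    probability `rate` and rescale the survivors by 1 / (1 - rate)."""
    if rate == 0.0:
        return list(values)
    keep = 1.0 - rate
    return [v / keep if rng.random() >= rate else 0.0 for v in values]


# Illustrative usage: activations are perturbed early on, untouched later.
rng = random.Random(0)
activations = [1.0, 2.0, 3.0, 4.0]
for step in (0, 50, 100, 500):
    rate = early_dropout_rate(step, cutoff_step=100, base_rate=0.5)
    print(step, rate, apply_dropout(activations, rate, rng))
```

In a real training loop, the scheduled rate would be fed to the framework's dropout layers at each step; the cutoff and rate are hyperparameters that would need tuning.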

10). LLM for Conversational Interactions with Mobile UIs - an approach that enables versatile conversational interactions with mobile UIs using a single LLM. (paper)

Bryan Wang @bryanhaoenwang

#LLMs are powerful, but can they make existing GUIs interactable with language? Last summer at @GoogleAI, we found that LLMs can perform diverse language-based mobile UI tasks using few-shot prompting. Exciting implications for future interaction design! #chi2023 Thread 🧵

See you next week for another round of awesome ML papers!
