1). GNoME - a new deep learning system for materials discovery that finds 2.2 million new crystals, including 380,000 stable materials; accelerates the speed and efficiency of discovery by predicting the stability of candidate materials. (paper | tweet)
2). Open-Source LLMs vs. ChatGPT - provides an exhaustive overview of tasks where open-source LLMs are claimed to be on par with or better than ChatGPT. (paper | tweet)
3). Adversarial Diffusion Distillation - a novel training approach that efficiently samples large-scale foundation image diffusion models in just 1-4 steps while maintaining high image quality; combines score distillation and an adversarial loss to ensure high image fidelity even in the low-step regime of one or two sampling steps; reaches performance of state-of-the-art diffusion models in only four steps. (paper | tweet)
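The core idea in #3 is training the student with two objectives at once: a score distillation loss against a frozen teacher diffusion model and an adversarial loss from a discriminator. A minimal sketch of how such a combined objective could look (all names and the weighting are illustrative, not the paper's actual implementation):

```python
import math

def distillation_loss(student_out, teacher_out):
    # Mean squared error between the student's and teacher's denoised outputs.
    return sum((s - t) ** 2 for s, t in zip(student_out, teacher_out)) / len(student_out)

def adversarial_loss(disc_scores):
    # Non-saturating generator loss: push discriminator scores toward "real" (1.0).
    return -sum(math.log(s) for s in disc_scores) / len(disc_scores)

def total_loss(student_out, teacher_out, disc_scores, adv_weight=0.5):
    # Weighted sum of the two objectives; adv_weight is a hypothetical knob.
    return distillation_loss(student_out, teacher_out) + adv_weight * adversarial_loss(disc_scores)
```

The adversarial term is what keeps samples sharp in the one-to-two-step regime, where a pure distillation loss tends to blur.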
4). Seamless - a family of research models that enable end-to-end expressive cross-lingual communication in a streaming fashion; introduces an improved SeamlessM4T model trained on more low-resource language data; also applies a red-teaming effort for safer multimodal machine translation. (paper | tweet)
5). MEDITRON-70B - a suite of open-source LLMs with 7B and 70B parameters adapted to the medical domain; builds on Llama-2 and extends pretraining on a curated medical corpus; MEDITRON-70B outperforms GPT-3.5 and Med-PaLM and is within 5% of GPT-4 and 10% of Med-PaLM-2. (paper | tweet)
6). Foundation Models Outcompeting Special-Purpose Tuning - performs a systematic exploration of prompt engineering to boost the performance of LLMs on medical question answering; uses prompt engineering methods that are general purpose and make no use of domain expertise; the resulting prompting strategy enhances GPT-4’s performance and achieves state-of-the-art results on nine benchmark datasets in the MultiMedQA suite. (paper | tweet)
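General-purpose strategies of the kind #6 explores include ensembling the model's answers over prompts with shuffled answer choices and taking a majority vote. A hedged sketch of that voting loop (`ask_model` is a hypothetical callable standing in for the LLM API; this is not the paper's exact pipeline):

```python
import random
from collections import Counter

def ensemble_answer(question, choices, ask_model, n_samples=5, seed=0):
    """Majority vote over prompts with shuffled answer choices.

    ask_model: hypothetical callable (prompt string -> answer string).
    Shuffling the options reduces position bias; voting smooths out noise.
    """
    rng = random.Random(seed)
    votes = []
    for _ in range(n_samples):
        shuffled = choices[:]
        rng.shuffle(shuffled)
        prompt = f"{question}\nOptions: {', '.join(shuffled)}\nAnswer:"
        votes.append(ask_model(prompt))
    # Return the most common answer across the sampled prompts.
    return Counter(votes).most_common(1)[0][0]
```

The point of the paper is that such domain-agnostic tricks, stacked together, can outdo models fine-tuned specifically for medicine.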
7). UniIR - a unified instruction-guided multimodal retriever that handles eight retrieval tasks across modalities; can generalize to unseen retrieval tasks and achieves robust performance across existing datasets and zero-shot generalization to new tasks; presents a multimodal retrieval benchmark to help standardize the evaluation of multimodal information retrieval. (paper | tweet)
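Instruction-guided retrieval as in #7 boils down to conditioning the query representation on a natural-language task instruction before scoring candidates. A minimal sketch, assuming a generic embedding function (`encode` and the fusion-by-concatenation are illustrative placeholders, not UniIR's actual architecture):

```python
def retrieval_score(instruction, query, candidate, encode):
    """Score a candidate for an instruction-conditioned query.

    encode: hypothetical callable (text -> list of floats).
    Fusing the instruction with the query lets one retriever
    serve many retrieval tasks.
    """
    q = encode(instruction + " " + query)
    c = encode(candidate)
    # Dot-product similarity between query-side and candidate-side embeddings.
    return sum(a * b for a, b in zip(q, c))
```

At inference time, candidates are ranked by this score; swapping the instruction switches the retrieval task without retraining.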
8). Safe Deployment of Generative AI - argues that to protect people’s privacy, medical professionals, not commercial interests, must drive the development and deployment of such models. (paper | tweet)
9). On Bringing Robots Home - introduces Dobb-E, an affordable and versatile general-purpose system for learning robotic manipulation within household settings; Dobb-E can learn new tasks with only 5 minutes of user demonstrations; experiments reveal unique challenges absent or ignored in lab robotics, including the effects of strong shadows and variable demonstration quality from non-expert users. (paper | tweet)
10). Translatotron 3 - proposes an unsupervised approach to speech-to-speech translation that can learn from monolingual data alone; combines a masked autoencoder, unsupervised embedding mapping, and back-translation; results show that the model outperforms a baseline cascade system and retains para-/non-linguistic information such as pauses, speaking rates, and speaker identity. (paper | tweet)
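The back-translation component in #10 is what lets the model learn from monolingual data alone: translate a monolingual batch into the other language, then train the reverse direction to reconstruct the originals. A rough sketch of one such round trip (`translate` and `train_on_pair` are hypothetical stand-ins for the model's inference and training calls):

```python
def back_translation_step(src_batch, translate, train_on_pair):
    """One round-trip back-translation update on monolingual source data.

    translate: hypothetical callable (utterance, direction) -> translation.
    train_on_pair: hypothetical callable (input, target) -> training result.
    """
    # Generate pseudo-parallel targets with the current model.
    pseudo_tgt = [translate(x, direction="src->tgt") for x in src_batch]
    # Train the reverse direction to reconstruct the original utterances.
    return [train_on_pair(y, x) for y, x in zip(pseudo_tgt, src_batch)]
```

Alternating this step in both directions bootstraps translation quality without any parallel speech corpus.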