1). Textbooks Are All You Need II - a new 1.3 billion parameter model trained on 30 billion tokens; the dataset consists of "textbook-quality" synthetically generated data; phi-1.5 competes or outperforms other larger models on reasoning tasks suggesting that data quality plays a more important role than previously thought. (paper | tweet)
2). The Rise and Potential of LLM Based Agents - a comprehensive overview of LLM based agents; covers from how to construct these agents to how to harness them for good. (paper | tweet)
3). EvoDiff - combines evolutionary-scale data with diffusion models for controllable protein generation in sequence space; it can generate proteins inaccessible to structure-based models. (paper | tweet)
4). LLMs Can Align Themselves without Finetuning? - discovers that by integrating self-evaluation and rewind mechanisms, unaligned LLMs can directly produce responses consistent with human preferences via self-boosting. (paper | tweet)
5). Robot Parkour Learning - presents a system for learning end-to-end vision-based parkour policy which is transferred to a quadrupedal robot using its ecocentric depth camera; shows that low-cost robots can automatically select and execute parkour skills in a real-world environment. (paper | tweet)
Reach out to hello@dair.ai if you would like to sponsor the next issue of the newsletter. We can help promote your AI tool, research, or company to ~10K AI researchers and practitioners.
6). A Survey of Hallucination in LLMs - classifies different types of hallucination phenomena and provides evaluation criteria for assessing hallucination along with mitigation strategies. (paper | tweet)
7). Agents - an open-source library for building autonomous language agents including support for features like planning, memory, tool usage, multi-agent communication, and more. (paper | tweet)
8). Radiology-Llama2: Best-in-Class LLM for Radiology - presents an LLM based on Llama 2 tailored for radiology; it's tuned on a large dataset of radiology reports to generate coherent and clinically useful impressions from radiology findings. (paper | tweet)
9). Communicative Agents for Software Development - presents ChatDev, a virtual chat-powered software development company mirroring the waterfall model; shows the efficacy of the agent in software generation, even completing the entire software development process in less than seven minutes for less than one dollar. (paper | tweet)
10). MAmmoTH - a series of open-source LLMs tailored for general math problem-solving; the models are trained on a curated instruction tuning dataset and outperform existing open-source models on several mathematical reasoning datasets. (paper | tweet)