1). Voicebox - an all-in-one generative speech model; it synthesizes speech across 6 languages and performs noise removal, content editing, style conversion, and more; it's 20x faster than current models and outperforms single-purpose models through in-context learning. (paper | tweet)
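Voicebox is trained with flow matching. As a minimal sketch of the conditional flow-matching objective family it builds on (the optimal-transport path of Lipman et al.), the snippet below constructs one training pair; the feature dimension, `sigma_min`, and the toy data are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def cfm_training_pair(x1: np.ndarray, sigma_min: float = 1e-4):
    """Build one conditional flow-matching training pair (OT path).

    x1 is a data sample (e.g., a speech feature frame); a network v(xt, t)
    would be trained to regress the returned velocity target.
    """
    x0 = rng.standard_normal(x1.shape)            # noise sample
    t = rng.uniform()                             # random time in [0, 1]
    xt = (1 - (1 - sigma_min) * t) * x0 + t * x1  # point on the probability path
    target = x1 - (1 - sigma_min) * x0            # velocity the network regresses
    return t, xt, target

t, xt, target = cfm_training_pair(rng.standard_normal(80))  # 80-dim toy frame
```

A regression loss such as ||v(xt, t) - target||^2 then trains the vector field, which is integrated at inference time to map noise to speech features.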
2). FinGPT - an open-source LLM for the finance sector; it takes a data-centric approach, providing researchers & practitioners with accessible resources to develop FinLLMs. (paper | tweet)
3). Crowd Workers Widely Use Large Language Models for Text Production Tasks - estimates that 33-46% of crowd workers on MTurk used LLMs when completing a text production task. (paper | tweet)
4). Reliability of Watermarks for LLMs - watermarking helps detect LLM-generated text and can potentially mitigate harms; this work studies the reliability of watermarking for LLMs and finds that watermarks remain detectable even when the watermarked text is rewritten by humans or paraphrased by another, non-watermarked LLM. (paper | tweet)
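For context, the watermark family studied here biases generation toward a pseudorandom "green list" of tokens at each step, and detection runs a z-test on how often observed tokens land in that list. Below is a minimal detection sketch under that scheme; the hash choice, `gamma`, and the decision threshold are illustrative assumptions.

```python
import hashlib
import math
import random

def green_list(prev_token: int, vocab_size: int, gamma: float = 0.25) -> set[int]:
    # Seed a RNG from a hash of the previous token so the green/red split
    # is reproducible at detection time without access to the model.
    # (Rebuilding the list per token is O(V log V); fine for a sketch.)
    seed = int(hashlib.sha256(str(prev_token).encode()).hexdigest(), 16) % (2**32)
    rng = random.Random(seed)
    ids = list(range(vocab_size))
    rng.shuffle(ids)
    return set(ids[: int(gamma * vocab_size)])

def watermark_z_score(tokens: list[int], vocab_size: int, gamma: float = 0.25) -> float:
    """z-score for the null hypothesis 'no watermark' (green hits ~ Binomial(T, gamma))."""
    t = len(tokens) - 1
    if t <= 0:
        return 0.0
    hits = sum(
        1 for prev, tok in zip(tokens, tokens[1:])
        if tok in green_list(prev, vocab_size, gamma)
    )
    return (hits - gamma * t) / math.sqrt(t * gamma * (1 - gamma))

# Toy usage: a large z (e.g., above ~4) is strong evidence of a watermark.
print(watermark_z_score([101, 2009, 318, 257, 1332], vocab_size=50257))
```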
5). Applications of Transformers - a new survey paper highlighting major applications of Transformers for deep learning tasks; includes a comprehensive list of Transformer models. (paper | tweet)
Sponsor message
DAIR.AI presents a new cohort-based course, Prompt Engineering for LLMs, that teaches how to effectively use the latest prompt engineering techniques and tools to improve the capabilities, performance, and reliability of LLMs. Enroll here.
6). Benchmarking NN Training Algorithms - it's currently challenging to rigorously assess which optimizers train neural networks best; this paper presents AlgoPerf, a benchmark that compares neural network training algorithms by time-to-result on realistic workloads. (paper | tweet)
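The benchmark's core measurement is time-to-result: how quickly a training algorithm reaches a fixed validation target on a fixed workload. A stripped-down sketch of that protocol, with `train_step`, `evaluate`, the target, and the budget as caller-supplied stand-ins rather than AlgoPerf's actual harness:

```python
import time
from typing import Callable, Optional

def time_to_target(
    train_step: Callable[[], None],
    evaluate: Callable[[], float],
    target: float,
    budget_s: float,
) -> Optional[float]:
    """Wall-clock seconds until evaluate() reaches target, or None on timeout.

    AlgoPerf-style scoring fixes the workload, hardware, and target so that
    only the training algorithm varies; this sketch captures just the loop.
    """
    start = time.perf_counter()
    while (elapsed := time.perf_counter() - start) < budget_s:
        train_step()
        if evaluate() >= target:
            return elapsed
    return None
```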
7). Unifying LLMs & Knowledge Graphs - provides a roadmap for unifying LLMs and KGs; covers how to incorporate KGs into LLM pre-training and inference, leverage LLMs for KG tasks such as question answering, and enhance both KGs and LLMs for bidirectional reasoning. (paper | tweet)
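One of the simplest integration patterns in this space is retrieval-augmented prompting: pull relevant triples from a KG and prepend them to the question. A toy sketch; the KG contents and the naive lexical matcher are illustrative (a real system would use entity linking and embedding-based retrieval):

```python
# Hypothetical toy KG as (subject, relation, object) triples.
KG = [
    ("Marie Curie", "field", "physics"),
    ("Marie Curie", "award", "Nobel Prize in Physics"),
    ("Nobel Prize in Physics", "first_awarded", "1901"),
]

def retrieve_triples(question: str, kg) -> list[tuple[str, str, str]]:
    # Keep any triple whose subject, relation, or object appears in the question.
    return [t for t in kg if any(part.lower() in question.lower() for part in t)]

def build_prompt(question: str) -> str:
    facts = "\n".join(f"({s}, {r}, {o})" for s, r, o in retrieve_triples(question, KG))
    return f"Known facts:\n{facts}\n\nQuestion: {question}\nAnswer:"

print(build_prompt("What award did Marie Curie win?"))
```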
8). Augmenting LLMs with Long-term Memory - proposes a framework (LongMem) that enables LLMs to memorize long histories; memory-augmented adaptation training teaches the model to cache long past context in a long-term memory and retrieve from it during language modeling; achieves improvements on memory-augmented in-context learning over baseline LLMs. (paper | tweet)
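The core mechanism is a cache of past representations that the model retrieves from once the relevant context no longer fits in its window. A minimal sketch of that retrieval step using cosine similarity over cached keys; the dimensions, scoring, and string values are illustrative, not the paper's exact design (which caches attention key-value pairs):

```python
import numpy as np

class TokenMemory:
    """Toy long-term memory: cache a key vector per past chunk and
    retrieve the top-k most similar entries for the current query."""

    def __init__(self, dim: int):
        self.keys = np.empty((0, dim), dtype=np.float32)
        self.values: list[str] = []

    def add(self, key: np.ndarray, value: str) -> None:
        self.keys = np.vstack([self.keys, key[None, :]])
        self.values.append(value)

    def retrieve(self, query: np.ndarray, k: int = 3) -> list[str]:
        sims = self.keys @ query / (
            np.linalg.norm(self.keys, axis=1) * np.linalg.norm(query) + 1e-9
        )
        return [self.values[i] for i in np.argsort(-sims)[:k]]

mem = TokenMemory(dim=4)
mem.add(np.array([1, 0, 0, 0], dtype=np.float32), "chunk about topic A")
mem.add(np.array([0, 1, 0, 0], dtype=np.float32), "chunk about topic B")
print(mem.retrieve(np.array([0.9, 0.1, 0, 0], dtype=np.float32), k=1))
```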
9). TAPIR - enables tracking any queried point on any physical surface throughout a video sequence; outperforms all baselines and enables fast inference on long, high-resolution videos (tracking points faster than real time on modern GPUs). (paper | tweet)
10). Mind2Web - a new dataset for developing and evaluating generalist agents for the web; contains 2,350 tasks from 137 websites across 31 domains; it enables testing generalization across tasks and environments, covering practical use cases on the web. (paper | tweet)
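To make the dataset concrete: each example pairs a natural-language task with the sequence of grounded actions (target element plus operation) needed to complete it on a real website. The record below is a hypothetical illustration of that shape; the field names and values are not the dataset's actual schema.

```python
# Hypothetical Mind2Web-style example; field names and values are
# illustrative, not the released schema.
example = {
    "website": "example-airline.com",
    "domain": "Travel",
    "task": "Book a one-way flight from Boston to Chicago on July 4",
    "actions": [
        {"operation": "CLICK", "target_element": "<label> One way </label>"},
        {"operation": "TYPE", "target_element": "<input id='from'>", "value": "Boston"},
        {"operation": "TYPE", "target_element": "<input id='to'>", "value": "Chicago"},
        {"operation": "CLICK", "target_element": "<button> Search </button>"},
    ],
}
```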
Reach out to team@dair.ai if you want to sponsor the next issue of the newsletter.