Personalized Daily Arxiv Papers 01/30/2025

Total relevant papers: 2

Paper selection prompt and criteria at the bottom

Table of contents with paper titles:

  1. LEKA: LLM-Enhanced Knowledge Augmentation Authors: Xinhao Zhang, Jinghan Zhang, Fengran Mo, Dongjie Wang, Yanjie Fu, Kunpeng Liu

  2. Probing LLM World Models: Enhancing Guesstimation with Wisdom of Crowds Decoding Authors: Yun-Shiuan Chuang, Nikunj Harlalka, Sameer Narendran, Alexander Cheung, Sizhe Gao, Siddharth Suresh, Junjie Hu, Timothy T. Rogers


1. LEKA: LLM-Enhanced Knowledge Augmentation
ArXiv ID: 2501.17802 Authors: Xinhao Zhang, Jinghan Zhang, Fengran Mo, Dongjie Wang, Yanjie Fu, Kunpeng Liu

Abstract: Humans excel at analogical learning and knowledge transfer and, more importantly, possess a unique ability to identify appropriate sources of knowledge. From a model's perspective, this presents an interesting challenge. If models could autonomously retrieve knowledge useful for transfer or decision-making, they would transition from passively acquiring knowledge to actively accessing and learning from it. However, filling models with knowledge is relatively straightforward -- it simply requires more training and accessible knowledge bases. The harder task is teaching models which knowledge can be analogized and transferred. We therefore design LEKA, a knowledge augmentation method for knowledge transfer that actively searches for knowledge sources that can enrich the target domain. LEKA extracts key information from the target domain's text, retrieves pertinent data from external data libraries, and harmonizes the retrieved data with the target domain data in feature space and marginal probability measures. We validate the effectiveness of our approach through extensive experiments across various domains and demonstrate significant improvements over traditional methods in reducing computational costs, automating data alignment, and optimizing transfer learning outcomes.
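
The harmonization step (matching retrieved data to the target domain in feature space and marginal distribution) can be illustrated with a minimal moment-matching sketch. This is not the paper's actual procedure; `align_marginals` is a hypothetical stand-in that matches only the first two moments of a single feature:

```python
import statistics
from typing import List

def align_marginals(source: List[float], target: List[float]) -> List[float]:
    """Shift and rescale a retrieved source feature so its marginal mean and
    standard deviation match the target domain's (simple moment matching)."""
    mu_s, sd_s = statistics.mean(source), statistics.pstdev(source)
    mu_t, sd_t = statistics.mean(target), statistics.pstdev(target)
    scale = sd_t / sd_s if sd_s else 1.0  # guard against a constant feature
    return [(x - mu_s) * scale + mu_t for x in source]

# Retrieved feature values on a different scale than the target domain's.
src = [10.0, 12.0, 14.0]
tgt = [0.0, 1.0, 2.0]
print(align_marginals(src, tgt))  # -> [0.0, 1.0, 2.0]
```

Real harmonization would operate on whole feature matrices and richer divergence measures, but the core idea -- making marginal statistics comparable before transfer -- is the same.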

Comment: This paper introduces a knowledge augmentation method (LEKA) that aligns with criterion 1 by focusing on active knowledge retrieval and transfer for large language models. It emphasizes the importance of identifying and harmonizing knowledge sources, which is a novel approach to prompt tuning and soft prompt optimization. Relevance: 7 Novelty: 8


2. Probing LLM World Models: Enhancing Guesstimation with Wisdom of Crowds Decoding
ArXiv ID: 2501.17310 Authors: Yun-Shiuan Chuang, Nikunj Harlalka, Sameer Narendran, Alexander Cheung, Sizhe Gao, Siddharth Suresh, Junjie Hu, Timothy T. Rogers

Abstract: Guesstimation, the task of making approximate quantity estimates, is a common real-world challenge. However, it has been largely overlooked in large language model (LLM) and vision language model (VLM) research. We introduce a novel guesstimation dataset, MARBLES. This dataset requires one to estimate how many items (e.g., marbles) can fit into containers (e.g., a one-cup measuring cup), both with and without accompanying images. Inspired by the social science concept of the "Wisdom of Crowds" (WOC) -- taking the median of estimates from a crowd -- which has proven effective in guesstimation, we propose a "WOC decoding" strategy for LLM guesstimation. We show that LLMs/VLMs perform well on guesstimation, suggesting that they possess some level of a "world model" necessary for guesstimation. Moreover, similar to human performance, the WOC decoding method improves LLM/VLM guesstimation accuracy. Furthermore, the inclusion of images in the multimodal condition enhances model performance. These results highlight the value of the WOC decoding strategy for LLMs/VLMs and position guesstimation as a probe for evaluating LLMs/VLMs' world models.
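
The WOC idea is easy to sketch: sample several independent numeric estimates from the model and return their median, which is robust to occasional wild outliers. The `sample_estimate` callable below is a hypothetical stand-in for querying an LLM at nonzero temperature and parsing its numeric answer:

```python
import statistics
from typing import Callable, List

def woc_decode(sample_estimate: Callable[[], float], k: int = 15) -> float:
    """Wisdom-of-Crowds decoding: draw k independent estimates and
    aggregate them with the median."""
    estimates: List[float] = [sample_estimate() for _ in range(k)]
    return statistics.median(estimates)

# Canned samples standing in for repeated LLM queries; note the outlier 300
# barely moves the median, unlike a mean.
canned = iter([100.0, 120.0, 90.0, 300.0, 110.0])
print(woc_decode(lambda: next(canned), k=5))  # -> 110.0
```

Choosing the median rather than the mean is what makes the aggregate tolerant of the heavy-tailed errors typical of numeric guesses.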

Comment: This paper introduces a novel guesstimation dataset (MARBLES) and a decoding strategy (WOC decoding) for LLMs, which aligns with criterion 1 by exploring how LLMs can be enhanced for specific tasks through innovative decoding methods. It also touches on the interpretability of LLMs, which is relevant to prompt tuning. Relevance: 6 Novelty: 7



Paper selection prompt

  1. Novel methods for prompt tuning and soft prompt optimization in large language models
  2. Monte Carlo Tree Search (MCTS) combined with Large Language Models (LLMs) for diverse applications
    • Relevant: Papers that explore the integration of Monte Carlo Tree Search (MCTS) with large language models (LLMs) across a wide range of applications, not limited to decision-making. This includes research where LLMs are used to enhance MCTS in tasks such as game playing, natural language understanding, code generation, creative writing, or scientific discovery. Studies that leverage LLMs to improve MCTS components like state evaluation, action generation, or exploration strategies are highly relevant. Additionally, research that investigates the synergy between MCTS and LLMs in multi-modal tasks, interactive environments, or scenarios requiring iterative refinement (e.g., dialogue systems or program synthesis) is particularly valuable.
    • Not relevant: Papers that focus solely on traditional MCTS methods without leveraging LLMs, or those that apply MCTS to tasks where LLMs play no significant role. Research unrelated to the integration of MCTS and LLMs is also not relevant.
  3. OpenAI's O1-related methods and their applications in large language models
    • Relevant: Papers that investigate or extend OpenAI's O1-related methods (e.g., optimization techniques, scaling laws, or training paradigms) for large language models (LLMs). This includes research that applies or adapts O1-inspired approaches to improve model efficiency, scalability, or performance in tasks such as language generation, reasoning, or multi-modal learning. Studies that explore the theoretical foundations of O1 methods, their practical implications for LLM training, or their integration with other optimization techniques (e.g., sparse training, quantization, or distillation) are highly relevant. Additionally, research that benchmarks or compares O1-related methods against other state-of-the-art approaches in LLM development is of particular interest.
    • Not relevant: Papers that do not explicitly address O1-related methods or their applications in LLMs, or those that focus on unrelated optimization techniques without connecting them to O1 principles. Research that lacks a clear focus on large language models or their development is also not relevant.
  4. Reinforcement Learning with Large Language Models for Complex Decision-Making Tasks
    • Relevant: Papers that study the application of reinforcement learning (RL) to large language models (LLMs) for complex decision-making or interactive tasks. This includes research where LLMs are used as agents in RL environments, such as in multi-agent systems, planning, or control tasks. Methods that enhance RL performance using LLMs for goal-directed behavior or reinforcement learning in interactive or conversational settings (e.g., RLHF) are particularly relevant.
    • Not relevant: Papers that apply RL to simpler tasks or tasks unrelated to language models, or papers that focus solely on task-specific applications without considering how RL can improve or integrate with large language models.
  5. Supervised Fine-Tuning (SFT) methods for improving language model adaptability to diverse real-world tasks
    • Relevant: Papers that introduce new supervised fine-tuning (SFT) methodologies for large language models, with a focus on improving their ability to generalize across a wide range of real-world tasks. This can include methods that adapt language models to specific domains, improve task robustness, or fine-tune models on multi-modal or multilingual data. Research that explores new loss functions, data augmentation techniques, or transfer learning paradigms to enhance SFT performance is highly relevant.
    • Not relevant: Papers that focus only on fine-tuning models for very specific, narrow tasks without addressing generalizability or adaptability to a broader set of real-world tasks.
  6. Swarm intelligence and large language model integration for autonomous drone (cluster) systems
    • Relevant: Papers that explore the use of large language models in the coordination, control, or decision-making of autonomous drone systems, particularly in drone swarms or drone clusters. This can include applications where LLMs assist in high-level decision-making, communication between drones, or adaptive planning in dynamic environments. Research that investigates the integration of LLMs with swarm intelligence algorithms, real-time adaptation, or collaborative behavior among drones is particularly valuable.
    • Not relevant: Papers that focus solely on traditional drone control systems, without the integration of language models or swarm intelligence concepts. Similarly, papers that focus only on non-language model-based approaches for individual drones without exploring multi-drone coordination.

In suggesting papers to your friend, remember that he enjoys papers on statistical machine learning and generative modeling in natural language processing. Your friend also likes learning about surprising empirical results in language models, as well as clever statistical tricks. He does not want to read papers that are primarily applications of methods to specific domains.