ICLR2026 Paper Notes TODO
Total: 1819 papers | Completed: 1819 | Pending update: 0
- \(\textbf{Re}^{2}\): Unlocking LLM Reasoning via Reinforcement Learning with Re-solving | arXiv: 2603.07197
- 3DGEER: 3D Gaussian Rendering Made Exact and Efficient for Generic Cameras | arXiv: 2505.24053
- A Benchmark for Deep Information Synthesis (DeepSynth) | arXiv: 2602.21143
- A Convergence Analysis of Adaptive Optimizers under Floating-point Quantization | arXiv: 2510.21314
- A Cortically Inspired Architecture for Modular Perceptual AI | arXiv: 2603.07295
- A Fano-Style Accuracy Upper Bound for LLM Single-Pass Reasoning in Multi-Hop QA | arXiv: 2509.21199
- A Federated Generalized Expectation-Maximization Algorithm for Mixture Models with an Unknown Number of Components | arXiv: 2601.21160
- A Genetic Algorithm for Navigating Synthesizable Molecular Spaces | arXiv: 2509.20719
- A Geometric Perspective on the Difficulties of Learning GNN-based SAT Solvers | arXiv: 2508.21513
- A Hidden Semantic Bottleneck in Conditional Embeddings of Diffusion Transformers | arXiv: 2602.21596
- A Law of Data Reconstruction for Random Features (and Beyond) | arXiv: 2509.22214
- A Problem-Oriented Perspective and Anchor Verification for Code Optimization | arXiv: 2406.11935
- A Recovery Guarantee for Sparse Neural Networks | arXiv: 2509.20323
- A Representer Theorem for Hawkes Processes via Penalized Least Squares Minimization | arXiv: 2510.08916
- A Scalable Inter-edge Correlation Modeling in CopulaGNN for Link Sign Prediction | arXiv: 2601.19175
- A Single Architecture for Representing Invariance Under Any Space Group | arXiv: 2512.13989
- A State-Transition Framework for Efficient LLM Reasoning | arXiv: 2602.01198
- A Step to Decouple Optimization in 3DGS | arXiv: 2601.16736
- A Unifying View of Coverage in Linear Off-Policy Evaluation | arXiv: 2601.19030
- A universal compression theory for lottery ticket hypothesis and neural scaling laws | arXiv: 2510.00504
- A-TPT: Angular Diversity Calibration Properties for Test-Time Prompt Tuning of Vision-Language Models | arXiv: 2510.26441
- A.I.R.: Adaptive, Iterative, and Reasoning-based Frame Selection For Video Question Answering | arXiv: 2510.04428
- A2D: Any-Order, Any-Step Safety Alignment for Diffusion Language Models | arXiv: 2509.23286
- ABBA-Adapters: Efficient and Expressive Fine-Tuning of Foundation Models | arXiv: 2505.14238
- AbstRaL: Augmenting LLMs' Reasoning by Reinforcing Abstract Thinking | arXiv: 2506.07751
- AC-Foley: Reference-Audio-Guided Video-to-Audio Synthesis with Acoustic Transfer | arXiv: 2603.15597
- Accessible, Realistic, and Fair Evaluation of Positive-Unlabeled Learning Algorithms | arXiv: 2509.24228
- ACPBench Hard: Unrestrained Reasoning about Action, Change, and Planning | arXiv: 2503.24378
- Action-Free Offline-to-Online RL via Discretised State Policies | arXiv: 2602.00629
- Action-Guided Attention for Video Action Anticipation | arXiv: 2603.01743
- Activation Steering for Masked Diffusion Language Models | arXiv: 2512.24143
- ActivationReasoning: Logical Reasoning in Latent Activation Spaces | arXiv: 2510.18184
- Active Learning for Decision Trees with Provable Guarantees | arXiv: 2601.20775
- AdaBlock-dLLM: Semantic-Aware Diffusion LLM Inference via Adaptive Block Size | arXiv: 2509.26432
- AdAEM: An Adaptively and Automated Extensible Measurement of LLMs' Value Difference | arXiv: 2505.13531
- Adapt Data to Model: Adaptive Transformation Optimization for Domain-shared Time Series Foundation Models | arXiv: 2603.00629
- Adaptive Augmentation-Aware Latent Learning for Robust LiDAR Semantic Segmentation | arXiv: 2603.01074
- Adaptive Debiasing Tsallis Entropy for Test-Time Adaptation | arXiv: 2602.11743
- Adaptive Domain Shift in Diffusion Models for Cross-Modality Image Translation | arXiv: 2601.18623
- Adaptive Methods Are Preferable in High Privacy Settings: An SDE Perspective | arXiv: 2603.03226
- Adaptive Rollout Allocation for Online RL with Verifiable Rewards (VIP) | arXiv: 2602.01601
- Adaptive Social Learning via Mode Policy Optimization for Language Agents | arXiv: 2505.02156
- Adaptive Test-Time Training for Predicting Need for Invasive Mechanical Ventilation in Multi-Center Cohorts | arXiv: 2512.06652
- Adaptive Width Neural Networks | arXiv: 2501.15889
- AdaRank: Adaptive Rank Pruning for Enhanced Model Merging | arXiv: 2503.22178
- Addressing Divergent Representations from Causal Interventions on Neural Networks | arXiv: 2511.04638
- AFD-INSTRUCTION: A Comprehensive Antibody Instruction Dataset with Functional Annotations for LLM-Based Understanding and Design | arXiv: 2602.04916
- Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models | arXiv: 2510.04618
- Agentified Assessment of Logical Reasoning Agents | arXiv: 2603.02788
- AgentMath: Empowering Mathematical Reasoning for Large Language Models via Tool-Augmented Agent | arXiv: 2512.20745
- AgentSynth: Scalable Task Generation for Generalist Computer-Use Agents | arXiv: 2506.14205
- AgentTrace: Causal Graph Tracing for Root Cause Analysis in Deployed Multi-Agent Systems | arXiv: 2603.14688
- AgilePruner: An Empirical Study of Attention and Diversity for Adaptive Visual Token Pruning in LVLMs | arXiv: 2603.01236
- Agnostics: Learning to Synthesize Code in Any Programming Language with a Universal RL Environment | arXiv: 2508.04865
- AIMCoT: Active Information-driven Multimodal Chain-of-Thought for Vision-Language Reasoning | arXiv: 2509.25699
- Align Once, Benefit Multilingually: Enforcing Multilingual Consistency for LLM Safety Alignment | arXiv: 2602.16660
- Alignment through Meta-Weighted Online Sampling: Bridging the Gap between Data Generation and Preference Optimization | arXiv: 2509.23371
- AlignTok: Aligning Visual Foundation Encoders to Tokenizers for Diffusion Models | arXiv: 2509.25162
- All-day Multi-scenes Lifelong Vision-and-Language Navigation with Tucker Adaptation | arXiv: 2603.14276
- AlphaSteer: Learning Refusal Steering with Principled Null-Space Constraint | arXiv: 2506.07022
- Ambig-SWE: Interactive Agents to Overcome Underspecificity in Software Engineering | arXiv: 2502.13069
- AMemGym: Interactive Memory Benchmarking for Assistants in Long-Horizon Conversations | arXiv: 2603.01966
- AMiD: Knowledge Distillation for LLMs with α-mixture Assistant Distribution | arXiv: 2510.15982
- AMLRIS: Alignment-aware Masked Learning for Referring Image Segmentation | arXiv: 2602.22740
- Amortising Inference and Meta-Learning Priors in Neural Networks (BNNP) | arXiv: 2602.08782
- AMPED: Adaptive Multi-objective Projection for balancing Exploration and skill Diversification | arXiv: 2506.05980
- An Efficient, Provably Optimal Algorithm for the 0-1 Loss Linear Classification Problem | arXiv: 2306.12344
- An Information-Theoretic Framework For Optimizing Experimental Design To Distinguish Probabilistic Neural Codes | arXiv: 2603.01387
- An Orthogonal Learner for Individualized Outcomes in Markov Decision Processes | arXiv: 2509.26429
- AnesSuite: A Comprehensive Benchmark and Dataset Suite for Anesthesiology Reasoning | arXiv: 2504.02404
- Annotation-Efficient Universal Honesty Alignment | arXiv: 2510.17509
- ANO: Faster is Better in Noisy Landscapes | arXiv: 2508.18258
- Antibody: Strengthening Defense Against Harmful Fine-Tuning for Large Language Models via Attenuating Harmful Gradient Influence | arXiv: 2603.00498
- AntigenLM: Structure-Aware DNA Language Modeling for Influenza | arXiv: 2602.09067
- AnveshanaAI: A Multimodal Platform for Adaptive AI/ML Education through Automated Question Generation and Interactive Assessment | arXiv: 2509.23811
- AnyBCQ: Hardware Efficient Flexible Binary-Coded Quantization for Multi-Precision LLMs | arXiv: 2510.10467
- AnyTouch 2: General Optical Tactile Representation Learning For Dynamic Tactile Perception | arXiv: 2602.09617
- AnyUp: Universal Feature Upsampling | arXiv: 2510.12764
- AP-OOD: Attention Pooling for Out-of-Distribution Detection | arXiv: 2602.06031
- APPLE: Toward General Active Perception via Reinforcement Learning | arXiv: 2505.06182
- AQuA: Toward Strategic Response Generation for Ambiguous Visual Questions | arXiv: 2603.07394
- Arbitrary Generative Video Interpolation | arXiv: 2510.00578
- Are Deep Speech Denoising Models Robust to Adversarial Noise? | arXiv: 2503.11627
- Are Reasoning LLMs Robust to Interventions on Their Chain-of-Thought? | arXiv: 2602.07470
- Are We Measuring Oversmoothing in Graph Neural Networks Correctly? | arXiv: 2502.04591
- ARM-FM: Automated Reward Machines via Foundation Models for Compositional Reinforcement Learning | arXiv: 2510.14176
- Articulation in Motion: Prior-Free Part Mobility Analysis for Articulated Objects | arXiv: 2603.02910
- ASIDE: Architectural Separation of Instructions and Data in Language Models | arXiv: 2503.10566
- ASSESS: A Semantic and Structural Evaluation Framework for Statement Similarity
- AssetFormer: Modular 3D Assets Generation with Autoregressive Transformer
- AstaBench: Rigorous Benchmarking of AI Agents with a Scientific Research Suite | arXiv: 2510.21652
- Astra: General Interactive World Model with Autoregressive Denoising | arXiv: 2512.08931
- Astral: Training Physics-Informed Neural Networks with Error Majorants | arXiv: 2406.02645
- Asynchronous Denoising Diffusion Models for Aligning Text-to-Image Generation | arXiv: 2510.04504
- ATEX-CF: Attack-Informed Counterfactual Explanations for Graph Neural Networks
- ATLAS: Adaptive Transfer Scaling Laws for Multilingual Pretraining, Finetuning, and Decoding the Curse of Multilinguality | arXiv: 2510.22037
- ATPO: Adaptive Tree Policy Optimization for Multi-Turn Medical Dialogue | arXiv: 2603.02216
- Attention Smoothing Is All You Need For Unlearning | arXiv: 2603.01285
- Attributing Response to Context: A Jensen-Shannon Divergence Driven Mechanistic Study of Context Attribution in Retrieval-Augmented Generation | arXiv: 2505.16415
- Attribution-Guided Decoding | arXiv: 2509.26307
- ATTS: Asynchronous Test-Time Scaling via Conformal Prediction | arXiv: 2509.15148
- AudioTrust: Benchmarking the Multifaceted Trustworthiness of Audio Large Language Models | arXiv: 2505.16211
- Auditing Cascading Risks in Multi-Agent Systems via Semantic–Geometric Co-evolution | arXiv: 2603.13325
- Augmented Radiance Field: A General Framework for Enhanced Gaussian Splatting | arXiv: 2602.19916
- Augmenting Representations with Scientific Papers
- AutoFigure: Generating and Refining Publication-Ready Scientific Illustrations | arXiv: 2602.03828
- AutoFly: Vision-Language-Action Model for UAV Autonomous Navigation in the Wild | arXiv: 2602.09657
- autoqd automatic discovery of diverse behaviors with quality-diversity optimizat
- Autoregressive Image Generation with Randomized Parallel Decoding | arXiv: 2503.10568
- autotool automatic scaling of tool-use capabilities in rl via decoupled entropy
- AVERE: Improving Audiovisual Emotion Reasoning with Preference Optimization | arXiv: 2602.07054
- AWM: Accurate Weight-Matrix Fingerprint for Large Language Models | arXiv: 2510.06738
- BA-MCTS: Bayes Adaptive Monte Carlo Tree Search for Offline Model-based RL | arXiv: 2410.11234
- Back to Square Roots: An Optimal Bound on the Matrix Factorization Error for Multi-Epoch Differentially Private SGD | arXiv: 2505.12128
- BAH Dataset for Ambivalence/Hesitancy Recognition in Videos for Digital Behavioural Change | arXiv: 2505.19328
- Bayesian Attention Mechanism: A Probabilistic Framework for Positional Encoding and Context Length Extrapolation | arXiv: 2505.22842
- Bayesian Influence Functions for Hessian-Free Data Attribution | arXiv: 2509.26544
- BEAT: Visual Backdoor Attacks on VLM-based Embodied Agents via Contrastive Trigger Learning | arXiv: 2510.27623
- Behavior Learning (BL): Learning Hierarchical Optimization Structures from Data | arXiv: 2602.20152
- Benchmarking ECG FMs: A Reality Check Across Clinical Tasks | arXiv: 2509.25095
- Benchmarking Overton Pluralism in LLMs | arXiv: 2512.01351
- Beware Untrusted Simulators -- Reward-Free Backdoor Attacks in Reinforcement Learning | arXiv: 2602.05089
- Beyond Confidence: The Rhythms of Reasoning in Generative Models | arXiv: 2602.10816
- Beyond Linear Probes: Dynamic Safety Monitoring for Language Models | arXiv: 2509.26238
- Beyond Linearity in Attention Projections: The Case for Nonlinear Queries | arXiv: 2603.13381
- Beyond Match Maximization and Fairness: Retention-Optimized Two-Sided Matching | arXiv: 2602.15752
- Beyond Pairwise: Empowering LLM Alignment With Ranked Choice Modeling | arXiv: 2510.23631
- Beyond Prompt-Induced Lies: Investigating LLM Deception on Benign Prompts | arXiv: 2508.06361
- Beyond RAG vs. Long-Context: Learning Distraction-Aware Retrieval for Efficient Knowledge Grounding | arXiv: 2509.21865
- Beyond RLHF and NLHF: Population-Proportional Alignment under an Axiomatic Framework | arXiv: 2506.05619
- Beyond Scattered Acceptance: Fast and Coherent Inference for DLMs via Longest Stable Prefixes | arXiv: 2603.05454
- Beyond Simple Graphs: Neural Multi-Objective Routing on Multigraphs | arXiv: 2506.22095
- BeyondBench: Contamination-Resistant Evaluation of Reasoning in Language Models | arXiv: 2509.24210
- BiasBusters: Uncovering and Mitigating Tool Selection Bias in Large Language Models | arXiv: 2510.00307
- BiasFreeBench: a Benchmark for Mitigating Bias in Large Language Model Responses | arXiv: 2510.00232
- BiasScope: Towards Automated Detection of Bias in LLM-as-a-Judge Evaluation | arXiv: 2602.09383
- Bilinear Representation Mitigates Reversal Curse and Enables Consistent Model Editing | arXiv: 2509.21993
- BindWeave: Subject-Consistent Video Generation via Cross-Modal Integration | arXiv: 2510.00438
- BioCAP: Exploiting Synthetic Captions Beyond Labels in Biological Foundation Models | arXiv: 2510.20095
- Biologically Plausible Online Hebbian Meta-Learning: Two-Timescale Local Rules for Spiking Neural Brain Interfaces | arXiv: 2509.14447
- BiomedSQL: Text-to-SQL for Scientific Reasoning on Biomedical Knowledge Bases | arXiv: 2505.20321
- Block-Sample MAC-Bayes Generalization Bounds | arXiv: 2602.12605
- Blueprint-Bench: Comparing Spatial Intelligence of LLMs, Agents and Image Models | arXiv: 2509.25229
- Bongard-RWR+: Real-World Representations of Fine-Grained Concepts in Bongard Problems | arXiv: 2508.12026
- Boolean Satisfiability via Imitation Learning | arXiv: 2509.25411
- Boomerang Distillation Enables Zero-Shot Model Size Interpolation | arXiv: 2510.05064
- Boosting Entropy with Bell Box Quantization | arXiv: 2603.01599
- Boosting Medical Visual Understanding From Multi-Granular Language Learning | arXiv: 2511.15943
- Bootstrapping MLLM for Weakly-Supervised Class-Agnostic Object Counting (WS-COC) | arXiv: 2602.12774
- BOTS: A Unified Framework for Bayesian Online Task Selection in LLM Reinforcement Finetuning | arXiv: 2510.26374
- Brain-IT: Image Reconstruction from fMRI via Brain-Interaction Transformer | arXiv: 2510.25976
- Brain-Semantoks: Learning Semantic Tokens of Brain Dynamics with a Self-Distilled Foundation Model | arXiv: 2512.11582
- Branched Schrödinger Bridge Matching | arXiv: 2506.09007
- Breaking Barriers: Do Reinforcement Post Training Gains Transfer To Unseen Domains? | arXiv: 2506.19733
- Breaking Scale Anchoring: Frequency Representation Learning for Accurate High-Resolution Inference from Low-Resolution Training | arXiv: 2512.05132
- Breaking the Correlation Plateau: On the Optimization and Capacity Limits of Attention-Based Regressors | arXiv: 2602.17898
- Breaking the Limits of Open-Weight CLIP: An Optimization Framework for Self-supervised Fine-tuning of CLIP | arXiv: 2601.09859
- Breaking the SFT Plateau: Multimodal Structured Reinforcement Learning for Chart-to-Code Generation | arXiv: 2508.13587
- BridgeDrive: Diffusion Bridge Policy for Closed-Loop Trajectory Planning in Autonomous Driving | arXiv: 2509.23589
- Bridging Degradation Discrimination and Generation for Universal Image Restoration | arXiv: 2602.00579
- Bridging Explainability and Embeddings: BEE Aware of Spuriousness | arXiv: 2410.18970
- Bridging Fairness and Explainability: Can Input-Based Explanations Promote Fairness in Hate Speech Detection? | arXiv: 2509.22291
- Bridging Generalization Gap of Heterogeneous Federated Clients Using Generative Models | arXiv: 2508.01669
- Bridging Kolmogorov Complexity and Deep Learning: Asymptotically Optimal Description Length Objectives for Transformers | arXiv: 2509.22445
- BTZSC: A Benchmark for Zero-Shot Text Classification Across Cross-Encoders, Embedding Models, Rerankers and LLMs | arXiv: 2603.11991
- Building Spatial World Models from Sparse Transitional Episodic Memories | arXiv: 2505.13696
- ByteFlow: Language Modeling through Adaptive Byte Compression without a Tokenizer | arXiv: 2603.03583
- C2AL: Cohort-Contrastive Auxiliary Learning for Large-scale Recommendation Systems | arXiv: 2510.02215
- CaDrift: A Time-dependent Causal Generator of Drifting Data Streams | arXiv: 2602.20329
- cadrille: Multi-modal CAD Reconstruction with Reinforcement Learning | arXiv: 2505.22914
- CAGE: A Framework for Culturally Adaptive Red-Teaming Benchmark Generation | arXiv: 2602.20170
- Calibrating Verbalized Confidence with Self-Generated Distractors | arXiv: 2509.25532
- Can SAEs Reveal and Mitigate Racial Biases of LLMs in Healthcare? | arXiv: 2511.00177
- Can Vision Language Models Assess Graphic Design Aesthetics? A Benchmark, Evaluation, and Dataset Perspective | arXiv: 2603.01083
- Can Vision-Language Models Answer Face to Face Questions in the Real-World? | arXiv: 2503.19356
- Can You Hear Me Now? A Benchmark for Long-Range Graph Propagation and Beyond | arXiv: 2512.17762
- Capability-Based Scaling Trends for LLM-Based Red-Teaming | arXiv: 2505.20162
- Capacity-Aware Inference: Mitigating the Straggler Effect in Mixture of Experts | arXiv: 2503.05066
- CAPO: Curvature-Aware Policy Optimization for Sample-Efficient RL in LLM Reasoning | arXiv: 2510.00819
- CARD: Towards Conditional Design of Multi-agent Topological Structures | arXiv: 2603.01089
- CARE: Towards Clinical Accountability in Multi-Modal Medical Reasoning with an Evidence-Grounded Agentic Framework | arXiv: 2603.01607
- Causal Interpretation of Neural Network Computations with Contribution Decomposition | arXiv: 2603.06557
- Celo2: Towards Learned Optimization Free Lunch | arXiv: 2602.19142
- CGSA: Class-Guided Slot-Aware Adaptation for Source-Free Object Detection | arXiv: 2602.22621
- Chain-of-Context Learning: Dynamic Constraint Understanding for Multi-Task VRPs | arXiv: 2603.01667
- CHAMMI-75: Pre-training multi-channel models with heterogeneous microscopy images | arXiv: 2512.20833
- Characterizing and Optimizing the Spatial Kernel of Multi Resolution Hash Encodings | arXiv: 2602.10495
- Characterizing Human Semantic Navigation in Concept Production as Trajectories in Embedding Space | arXiv: 2602.05971
- Chart Deep Research in LVLMs via Parallel Relative Policy Optimization | arXiv: 2603.06677
- Chasing the Tail: Effective Rubric-based Reward Modeling for Large Language Model Post-Training | arXiv: 2509.21500
- ChatInject: Abusing Chat Templates for Prompt Injection in LLM Agents | arXiv: 2509.22830
- CHLU: The Causal Hamiltonian Learning Unit as a Symplectic Primitive for Deep Learning | arXiv: 2603.01768
- CityLens: Evaluating Large Vision-Language Models for Urban Socioeconomic Sensing | arXiv: 2506.00530
- CLARC: C/C++ Benchmark for Robust Code Search | arXiv: 2603.04484
- CLIP Behaves like a Bag-of-Words Model Cross-modally but not Uni-modally | arXiv: 2502.03566
- CloDS: Visual-Only Unsupervised Cloth Dynamics Learning in Unknown Conditions | arXiv: 2602.01844
- Closing the Curvature Gap: Full Transformer Hessians and Their Implications for Scaling Laws | arXiv: 2510.16927
- Closing the Modality Gap Aligns Group-Wise Semantics | arXiv: 2601.18525
- CMT: Mid-Training for Efficient Learning of Consistency, Mean Flow, and Flow Map Models | arXiv: 2509.24526
- Co-LoRA: Collaborative Model Personalization on Heterogeneous Multi-Modal Clients | arXiv: 2506.11024
- Co-rewarding: Stable Self-supervised RL for Eliciting Reasoning in Large Language Models | arXiv: 2508.00410
- CogFlow: Bridging Perception and Reasoning through Knowledge Internalization for Visual Mathematical Problem Solving | arXiv: 2601.01874
- COLD-Steer: Steering Large Language Models via In-Context One-step Learning Dynamics | arXiv: 2603.06495
- CollectiveKV: Decoupling and Sharing Collaborative Information in Sequential Recommendation | arXiv: 2601.19178
- Color3D: Controllable and Consistent 3D Colorization with Personalized Colorizer | arXiv: 2510.10152
- COMI: Coarse-to-fine Context Compression via Marginal Information Gain | arXiv: 2602.01719
- CoMind: Towards Community-Driven Agents for Machine Learning Engineering | arXiv: 2506.20640
- Common Corpus: The Largest Collection of Ethical Data for LLM Pre-Training | arXiv: 2506.01732
- COMPASS: Robust Feature Conformal Prediction for Medical Segmentation Metrics | arXiv: 2509.22240
- Compose Your Policies! Improving Diffusion-based or Flow-based Robot Policies via Test-time Distribution-level Composition | arXiv: 2510.01068
- Compositional amortized inference for large-scale hierarchical Bayesian models | arXiv: 2505.14429
- Compositional Diffusion with Guided Search for Long-Horizon Planning | arXiv: 2601.00126
- Compositional Generalization from Learned Skills via CoT Training: A Theoretical and Structural Analysis for Reasoning | arXiv: 2502.04667
- Compositional-ARC: Assessing Systematic Generalization in Abstract Spatial Reasoning | arXiv: 2504.01445
- Compute-Optimal Quantization-Aware Training | arXiv: 2509.22935
- Concept-TRAK: Understanding how diffusion models learn concepts through concept-level attribution | arXiv: 2507.06547
- Concepts' Information Bottleneck Models | arXiv: 2602.14626
- Condition Errors Refinement in Autoregressive Image Generation with Diffusion Loss | arXiv: 2602.07022
- Condition Matters in Full-head 3D GANs | arXiv: 2602.07198
- Conditionally Whitened Generative Models for Probabilistic Time Series Forecasting | arXiv: 2509.20928
- ConfHit: Conformal Generative Design with Oracle Free Guarantees | arXiv: 2603.07371
- Conflict-Aware Fusion: Resolving Logic Inertia in Large Language Models via Structured Cognitive Priors | arXiv: 2512.06393
- Conformal Prediction Adaptive to Unknown Subpopulation Shifts | arXiv: 2506.05583
- ConFu: Contemplate the Future for Better Speculative Sampling | arXiv: 2603.08899
- Conjuring Semantic Similarity | arXiv: 2410.16431
- Consistent Low-Rank Approximation | arXiv: 2603.02148
- Consistent Text-to-Image Generation via Scene De-Contextualization | arXiv: 2510.14553
- Constraint Matters: Multi-Modal Representation for Reducing Mixed-Integer Linear programming | arXiv: 2508.18742
- Constructive Distortion: Improving MLLMs with Attention-Guided Image Warping | arXiv: 2510.09741
- Contact Wasserstein Geodesics for Non-Conservative Schrödinger Bridges | arXiv: 2511.06856
- Contact-Guided 3D Genome Structure Generation of E. coli via Diffusion Transformers | arXiv: 2603.07472
- Contamination Detection for VLMs using Multi-Modal Semantic Perturbation | arXiv: 2511.03774
- Context Tokens are Anchors: Understanding the Repetition Curse in dMLLMs from an Information Flow Perspective | arXiv: 2601.20520
- ContextBench: Modifying Contexts for Targeted Latent Activation | arXiv: 2506.15735
- Contextual and Seasonal LSTMs for Time Series Anomaly Detection | arXiv: 2602.09690
- Continual Unlearning for Text-to-Image Diffusion Models: A Regularization Perspective | arXiv: 2511.07970
- Continuous Chain of Thought Enables Parallel Exploration and Reasoning | arXiv: 2505.23648
- Continuous-Time Value Iteration for Multi-Agent Reinforcement Learning | arXiv: 2509.09135
- Contractive Diffusion Policies: Robust Action Diffusion via Contractive Score-Based Sampling with Differential Equations | arXiv: 2601.01003
- Controllable Exploration in Hybrid-Policy RLVR for Multi-Modal Reasoning | arXiv: 2602.20197
- Controllable Sequence Editing for Biological and Clinical Trajectories | arXiv: 2502.03569
- Controlling Repetition in Protein Language Models | arXiv: 2602.00782
- Converge Faster, Talk Less: Hessian-Informed Federated Zeroth-Order Optimization | arXiv: 2506.02370
- Convergence of Muon with Newton-Schulz | arXiv: 2601.19156
- Convex Dominance in Deep Learning I: A Scaling Law of Loss and Learning Rate | arXiv: 2602.07145
- Cooperative Sheaf Neural Networks | arXiv: 2507.00647
- COOPERTRIM: Adaptive Data Selection for Uncertainty-Aware Cooperative Perception | arXiv: 2602.13287
- Copy-Paste to Mitigate Large Language Model Hallucinations | arXiv: 2510.00508
- CORDS: Continuous Representations of Discrete Structures | arXiv: 2601.21583
- CORE-3D: Context-aware Open-vocabulary Retrieval by Embeddings in 3D | arXiv: 2509.24528
- COSMO-INR: Complex Sinusoidal Modulation for Implicit Neural Representations | arXiv: 2505.11640
- CoT-RVS: Zero-Shot Chain-of-Thought Reasoning Segmentation for Videos | arXiv: 2505.18561
- CounselBench: A Large-Scale Expert Evaluation and Adversarial Benchmarking of LLMs in Mental Health QA | arXiv: 2506.08584
- Counterfactual Explanations on Robust Perceptual Geodesics | arXiv: 2601.18678
- Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss | arXiv: 2512.23447
- CPiRi: Channel Permutation-Invariant Relational Interaction for Multivariate Time Series Forecasting | arXiv: 2601.20318
- CREPE: Controlling Diffusion with Replica Exchange | arXiv: 2509.23265
- CRISP: Contact-Guided Real2Sim from Monocular Video with Planar Scene Primitives | arXiv: 2512.14696
- Cross-Domain Lossy Compression via Rate- and Classification-Constrained Optimal Transport
- Cross-Domain Policy Optimization via Bellman Consistency and Hybrid Critics | arXiv: 2603.12087
- Cross-Embodiment Offline Reinforcement Learning for Heterogeneous Robot Datasets | arXiv: 2602.18025
- Cross-Modal Redundancy and the Geometry of Vision-Language Embeddings | arXiv: 2602.06218
- CryoNet.Refine: A One-step Diffusion Model for Rapid Refinement of Structural Models with Cryo-EM Density Map Restraints | arXiv: 2602.22263
- Ctrl&Shift: High-Quality Geometry-Aware Object Manipulation in Visual Generation | arXiv: 2602.11440
- CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning | arXiv: 2507.14111
- Customizing Visual Emotion Evaluation for MLLMs: An Open-vocabulary, Multifaceted, and Scalable Approach | arXiv: 2509.21950
- Cut Less, Fold More: Model Compression through the Lens of Projection Geometry | arXiv: 2602.18116
- CyclicReflex: Improving Reasoning Models via Cyclical Reflection Token Scheduling | arXiv: 2506.11077
- d\(^2\)Cache: Accelerating Diffusion-Based LLMs via Dual Adaptive Caching | arXiv: 2509.23094
- D-REX: Differentiable Real-to-Sim-to-Real Engine for Learning Dexterous Grasping | arXiv: 2603.01151
- D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI | arXiv: 2510.05684
- DA-AC: Distributions as Actions — A Unified RL Framework for Diverse Action Spaces | arXiv: 2506.16608
- DAG-Math: Graph-of-Thought Guided Mathematical Reasoning in LLMs | arXiv: 2510.19842
- DARE-bench: Evaluating Modeling and Instruction Fidelity of LLMs in Data Science | arXiv: 2602.24288
- Dataless Weight Disentanglement in Task Arithmetic via Kronecker-Factored Approximate Curvature | arXiv: 2602.17385
- Dataset Color Quantization: A Training-Oriented Framework for Dataset-Level Compression | arXiv: 2602.20650
- Dataset Distillation as Pushforward Optimal Quantization | arXiv: 2501.07681
- Death of the Novel(ty): Beyond n-Gram Novelty as a Metric for Textual Creativity | arXiv: 2509.22641
- Decentralized Attention Fails Centralized Signals: Rethinking Transformers for Medical Time Series | arXiv: 2602.18473
- Decoding Open-Ended Information Seeking Goals from Eye Movements in Reading | arXiv: 2505.02872
- Decomposing Representation Space into Interpretable Subspaces with Unsupervised Learning | arXiv: 2508.01916
- Deconstructing Positional Information: From Attention Logits to Training Biases | arXiv: 2505.13027
- Decoupling Dynamical Richness from Representation Learning: Towards Practical Measurement | arXiv: 2410.04264
- Deep FlexQP: Accelerated Nonlinear Programming via Deep Unfolding | arXiv: 2512.01565
- Deep Hierarchical Learning with Nested Subspace Networks for Large Language Models | arXiv: 2509.17874
- Deep Learning for Subspace Regression | arXiv: 2509.23249
- Deep SPI: Safe Policy Improvement via World Models | arXiv: 2510.12312
- DeepAFL: Deep Analytic Federated Learning | arXiv: 2603.00579
- Delta-XAI: A Unified Framework for Explaining Prediction Changes in Online Time Series Monitoring | arXiv: 2511.23036
- DenseGRPO: From Sparse to Dense Reward for Flow Matching Model Alignment | arXiv: 2601.20218
- DESIGNER: Design-Logic-Guided Multidisciplinary Data Synthesis for LLM Reasoning | arXiv: 2508.12726
- Detecting and Mitigating Memorization in Diffusion Models through Anisotropy of the Log-Probability | arXiv: 2601.20642
- Detecting Misbehaviors of Large Vision-Language Models by Evidential Uncertainty Quantification | arXiv: 2602.05535
- Deterministic Bounds and Random Estimates of Metric Tensors on Neuromanifolds | arXiv: 2505.13614
- DGNet: Discrete Green Networks for Data-Efficient Learning of Spatiotemporal PDEs | arXiv: 2603.01762
- DiaBlo: Diagonal Blocks Are Sufficient For Finetuning | arXiv: 2506.03230
- Did You Check the Right Pocket? Cost-Sensitive Store Routing for Memory-Augmented Agents | arXiv: 2603.15658
- Difficult Examples Hurt Unsupervised Contrastive Learning: A Theoretical Perspective | arXiv: 2501.01317
- DiffInk: Glyph- and Style-Aware Latent Diffusion Transformer for Text to Online Handwriting Generation | arXiv: 2509.23624
- Diffusion Alignment as Variational Expectation-Maximization | arXiv: 2510.00502
- Diffusion Blend: Inference-Time Multi-Preference Alignment for Diffusion Models | arXiv: 2505.18547
- Diffusion Fine-Tuning via Reparameterized Policy Gradient of the Soft Q-Function | arXiv: 2512.04559
- DiffusionBlocks: Block-wise Neural Network Training via Diffusion Interpretation | arXiv: 2506.14202
- DiffusionNFT: Online Diffusion Reinforcement with Forward Process | arXiv: 2509.16117
- DiffVax: Optimization-Free Image Immunization Against Diffusion-Based Editing | arXiv: 2411.17957
- DiffWind: Physics-Informed Differentiable Modeling of Wind-Driven Object Dynamics | arXiv: 2603.09668
- Digging Deeper: Learning Multi-Level Concept Hierarchies | arXiv: 2603.10084
- Direct Doubly Robust Estimation of Conditional Quantile Contrasts | arXiv: 2601.19666
- Direct Reward Fine-Tuning on Poses for Single Image to 3D Human in the Wild | arXiv: 2603.02619
- Directional Convergence, Benign Overfitting of Gradient Descent in leaky ReLU two-layer Neural Networks | arXiv: 2505.16204
- Directional Embedding Smoothing for Robust Vision Language Models | arXiv: 2603.15259
- Directional Sheaf Hypergraph Networks: Unifying Learning on Directed and Undirected Hypergraphs | arXiv: 2510.04727
- Directional Textual Inversion for Personalized Text-to-Image Generation | arXiv: 2512.13672
- DISCO: Densely-overlapping Cell Instance Segmentation via Adjacency-aware Collaborative Coloring | arXiv: 2602.05420
- Discount Model Search for Quality Diversity Optimization in High-Dimensional Measure Spaces | arXiv: 2601.01082
- Discovering and Steering Interpretable Concepts in Large Generative Music Models | arXiv: 2505.18186
- Discrete Adjoint Matching | arXiv: 2602.07132
- Discrete Diffusion Trajectory Alignment via Stepwise Decomposition | arXiv: 2507.04832
- Disentangling Shared and Private Neural Dynamics with SPIRE: A Latent Modeling Framework for Deep Brain Stimulation | arXiv: 2510.25023
- Displacement-Resistant Extensions of DPO with Nonconvex \(f\)-Divergences | arXiv: 2602.06788
- Dissecting Chronos: Sparse Autoencoders Reveal Causal Feature Hierarchies in Time Series Foundation Models | arXiv: 2603.10071
- Distillation of Large Language Models via Concrete Score Matching | arXiv: 2509.25837
- Distilling and Adapting: A Topology-Aware Framework for Zero-Shot Interaction Prediction in Multiplex Biological Networks | arXiv: 2603.06618
- DistillKac: Few-Step Image Generation via Damped Wave Equations | arXiv: 2509.21513
- DistMLIP: A Distributed Inference Platform for Machine Learning Interatomic Potentials | arXiv: 2506.02023
- Distributed Algorithms for Euclidean Clustering | arXiv: 2603.08615
- Distributional Consistency Loss: Beyond Pointwise Data Terms in Inverse Problems | arXiv: 2510.13972
- Distributional Equivalence in Linear Non-Gaussian Latent-Variable Cyclic Causal Models: Characterization and Learning | arXiv: 2603.04780
- Distributionally Robust Classification for Multi-source Unsupervised Domain Adaptation | arXiv: 2601.21315
- Distributionally Robust Cooperative Multi-Agent Reinforcement Learning via Robust Value Factorization | arXiv: 2602.11437
- DIVA-GRPO: Enhancing Multimodal Reasoning through Difficulty-Adaptive Variant Advantage | arXiv: 2603.01106
- DiVE-k: Differential Visual Reasoning for Fine-grained Image Recognition | arXiv: 2511.18305
- Diverse Text-to-Image Generation via Contrastive Noise Optimization | arXiv: 2510.03813
- Divide, Harmonize, Then Conquer It: Shooting Multi-Commodity Flow Problems with Multimodal Language Models | arXiv: 2602.11057
- DM4CT: Benchmarking Diffusion Models for Computed Tomography Reconstruction | arXiv: 2602.18589
- DMAP: A Distribution Map for Text | arXiv: 2602.11871
- DND: Boosting Large Language Models with Dynamic Nested Depth | arXiv: 2510.11001
- Do Vision-Language Models Respect Contextual Integrity in Location Disclosure? | arXiv: 2602.05023
- Do We Really Need Permutations? Impact of Model Width on Linear Mode Connectivity | arXiv: 2510.08023
- Does FLUX Already Know How to Perform Physically Plausible Image Composition? | arXiv: 2509.21278
- Does Semantic Noise Initialization Transfer from Images to Videos? A Paired Diagnostic Study | arXiv: 2603.06672
- DoFlow: Flow-based Generative Models for Interventional and Counterfactual Forecasting on Time Series | arXiv: 2511.02137
- Domain Expansion: A Latent Space Construction Framework for Multi-Task Learning | arXiv: 2601.20069
- Don't Just Fine-tune the Agent, Tune the Environment | arXiv: 2510.10197
- Doubly-Robust LLM-as-a-Judge: Externally Valid Estimation with Imperfect Personas | arXiv: 2509.22957
- Doxing via the Lens: Revealing Location-related Privacy Leakage on Multi-modal Large Reasoning Models | arXiv: 2504.19373
- Draft-based Approximate Inference for LLMs | arXiv: 2506.08373
- DragFlow: Unleashing DiT Priors with Region Based Supervision for Drag Editing | arXiv: 2510.02253
- Draw-In-Mind: Rebalancing Designer-Painter Roles in Unified Multimodal Models Benefits Image Editing | arXiv: 2509.01986
- DREAM: Completing Missing Annotation via Multi-Agent Debate for Accurate and Scalable Relevance Assessment | arXiv: 2602.06526
- DreamOn: Diffusion Language Models For Code Infilling Beyond Fixed-size Canvas | arXiv: 2602.01326
- DRIFT-Net: A Spectral-Coupled Neural Operator for PDEs Learning | arXiv: 2509.24868
- DriftLite: Lightweight Drift Control for Inference-Time Scaling of Diffusion Models | arXiv: 2509.21655
- DrivingGen: A Comprehensive Benchmark for Generative Video World Models in Autonomous Driving | arXiv: 2601.01528
- DRO-InstructZero: Distributionally Robust Prompt Optimization for Large Language Models | arXiv: 2510.15260
- DRPO: Efficient Reasoning via Decoupled Reward Policy Optimization | arXiv: 2510.04474
- Dual Distillation for Few-Shot Anomaly Detection | arXiv: 2603.01713
- Dual Goal Representations | arXiv: 2510.06714
- Dual Optimistic Ascent (PI Control) is the Augmented Lagrangian Method in Disguise | arXiv: 2509.22500
- Dual-IPO: Dual-Iterative Preference Optimization for Text-to-Video Generation | arXiv: 2502.02088
- Dual-Robust Cross-Domain Offline Reinforcement Learning Against Dynamics Shifts | arXiv: 2512.02486
- Dual-Solver: A Generalized ODE Solver for Diffusion Models with Dual Prediction | arXiv: 2603.03973
- DVLA-RL: Dual-Level Vision-Language Alignment with Reinforcement Learning Gating for Few-Shot Learning | arXiv: 2602.00795
- Dynamic Multimodal Activation Steering for Hallucination Mitigation in Large Vision-Language Models | arXiv: 2602.21704
- Dynamic Novel View Synthesis in High Dynamic Range | arXiv: 2509.21853
- Dynamic Parameter Memory: Temporary LoRA-Enhanced LLM for Long-Sequence Emotion Recognition in Conversation | arXiv: 2507.09076
- Dynamic Reflections: Probing Video Representations with Text Alignment | arXiv: 2511.02767
- Dynamics Within Latent Chain-of-Thought: An Empirical Study of Causal Structure | arXiv: 2602.08783
- Dynamics-Predictive Sampling for Active RL Finetuning of Large Reasoning Models | arXiv: 2603.10887
- EAMET: Robust Massive Model Editing via Embedding Alignment Optimization | arXiv: 2505.11876
- Earth-Agent: Unlocking the Full Landscape of Earth Observation with Agents | arXiv: 2509.23141
- Easier Painting Than Thinking: Can Text-to-Image Models Set the Stage, but Not Direct the Play? | arXiv: 2509.03516
- Echo: Towards Advanced Audio Comprehension via Audio-Interleaved Reasoning | arXiv: 2602.11909
- EchoMind: An Interrelated Multi-level Benchmark for Evaluating Empathetic Speech Language Models | arXiv: 2510.22758
- EDINET-Bench: Evaluating LLMs on Complex Financial Tasks using Japanese Financial Statements | arXiv: 2506.08762
- EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing | arXiv: 2509.26346
- EditScore: Unlocking Online RL for Image Editing via High-Fidelity Reward Modeling | arXiv: 2509.23909
- Efficient Adversarial Attacks on High-dimensional Offline Bandits | arXiv: 2602.01658
- Efficient Agent Training for Computer Use | arXiv: 2505.13909
- Efficient Audio-Visual Speech Separation with Discrete Lip Semantics and Multi-Scale Global-Local Attention | arXiv: 2509.23610
- Efficient Discriminative Joint Encoders for Large Scale Vision-Language Reranking | arXiv: 2510.06820
- Efficient Ensemble Conditional Independence Test Framework for Causal Discovery | arXiv: 2509.21021
- Efficient Estimation of Kernel Surrogate Models for Task Attribution | arXiv: 2602.03783
- Efficient Reasoning with Balanced Thinking | arXiv: 2603.12372
- Efficient Resource-Constrained Training of Transformers via Subspace Optimization | arXiv: 2510.09160
- Efficient Test-Time Scaling for Small Vision-Language Models | arXiv: 2510.03574
- Efficient-LVSM: Faster, Cheaper, and Better Large View Synthesis Model via Decoupled Co-Refinement Attention | arXiv: 2602.06478
- Efficient-SAM2: Accelerating SAM2 with Object-Aware Visual Encoding and Memory Retrieval | arXiv: 2602.08224
- EGG-SR: Embedding Symbolic Equivalence into Symbolic Regression via Equality Graph | arXiv: 2511.05849
- EgoDex: Learning Dexterous Manipulation from Large-Scale Egocentric Video | arXiv: 2505.11709
- EgoHandICL: Egocentric 3D Hand Reconstruction with In-Context Learning | arXiv: 2601.19850
- EgoNight: Towards Egocentric Vision Understanding at Night with a Challenging Benchmark | arXiv: 2510.06218
- EgoWorld: Translating Exocentric View to Egocentric View using Rich Exocentric Observations | arXiv: 2506.17896
- Einstein Fields: A Neural Perspective To Computational General Relativity | arXiv: 2507.11589
- Eliminating VAE for Fast and High-Resolution Generative Detail Restoration | arXiv: 2602.10630
- ELLMob: Event-Driven Human Mobility Generation with Self-Aligned LLM Framework | arXiv: 2603.07946
- Embedding Compression via Spherical Coordinates | arXiv: 2602.00079
- Embedding-Based Context-Aware Reranker | arXiv: 2510.13329
- Embodied Agents Meet Personalization: Investigating Challenges and Solutions Through the Lens of Memory Utilization | arXiv: 2505.16348
- Embracing Discrete Search: A Reasonable Approach to Causal Structure Learning | arXiv: 2510.04970
- Emergence of Spatial Representation in an Actor-Critic Agent with Hippocampus-Inspired Sequence Generator | arXiv: 2510.09951
- Emergence of Superposition: Unveiling the Training Dynamics of Chain of Continuous Thought | arXiv: 2509.23365
- Emergent Misalignment is Easy, Narrow Misalignment is Hard | arXiv: 2602.07852
- EmotionThinker: Prosody-Aware Reinforcement Learning for Explainable Speech Emotion Reasoning | arXiv: 2601.15668
- Empirical Stability Analysis of Kolmogorov-Arnold Networks in Hard-Constrained Recurrent Physics-Informed Discovery | arXiv: 2602.09988
- Empowering Small VLMs to Think with Dynamic Memorization and Exploration | arXiv: 2506.23061
- EMR-AGENT: Automating Cohort and Feature Extraction from EMR Databases | arXiv: 2510.00549
- Enabling Fine-Grained Operating Points for Black-Box LLMs | arXiv: 2510.17727
- Energy-Regularized Sequential Model Editing on Hyperspheres | arXiv: 2510.01172
- Enhanced Continual Learning of Vision-Language Models with Model Fusion | arXiv: 2503.10705
- Enhancing Generative Auto-bidding with Offline Reward Evaluation and Policy Search | arXiv: 2509.15927
- Enhancing Hallucination Detection through Noise Injection | arXiv: 2502.03799
- Enhancing Instruction Following of LLMs via Activation Steering with Dynamic Rejection | arXiv: 2603.06745
- Enhancing Molecular Property Predictions by Learning from Bond Modelling and Interactions | arXiv: 2603.00568
- Enhancing Multi-Image Understanding through Delimiter Token Scaling | arXiv: 2602.01984
- Enhancing Multivariate Time Series Forecasting with Global Temporal Retrieval | arXiv: 2602.10847
- Enhancing Persona Following at Decoding Time via Dynamic Importance Estimation for Role-Playing Agents | arXiv: 2603.01438
- Entropic Confinement and Mode Connectivity in Overparameterized Neural Networks | arXiv: 2512.06297
- Entropy-Guided Dynamic Tokens for Graph-LLM Alignment in Molecular Understanding | arXiv: 2602.02742
- Entropy-Preserving Reinforcement Learning (REPO / ADAPO) | arXiv: 2603.11682
- Erase or Hide? Suppressing Spurious Unlearning Neurons for Robust Unlearning | arXiv: 2509.22263
- Error as Signal: Stiffness-Aware Diffusion Sampling via Embedded Runge-Kutta Guidance | arXiv: 2603.03692
- Error Notebook-Guided, Training-Free Part Retrieval in 3D CAD Assemblies via Vision-Language Models | arXiv: 2509.01350
- ES-dLLM: Efficient Inference for Diffusion Large Language Models by Early-Skipping | arXiv: 2603.10088
- Estimating the Empowerment of Language Model Agents | arXiv: 2509.22504
- Evaluating GFlowNet from partial episodes for stable and flexible policy-based training | arXiv: 2603.01047
- Evaluating Text Creativity across Diverse Domains: A Dataset and Large Language Model Evaluator | arXiv: 2505.19236
- Evaluating VLMs' Spatial Reasoning Over Robot Motion: A Step Towards Robot Planning with Motion Preferences | arXiv: 2603.13100
- Event-T2M: Event-level Conditioning for Complex Text-to-Motion Synthesis | arXiv: 2602.04292
- Everything in Its Place: Benchmarking Spatial Intelligence of Text-to-Image Models | arXiv: 2601.20354
- EvoEngineer: Mastering Automated CUDA Kernel Code Evolution with Large Language Models | arXiv: 2510.03760
- EvoFlows: Evolutionary Edit-Based Flow-Matching for Protein Engineering | arXiv: 2603.11703
- Evoking User Memory: Personalizing LLM via Recollection-Familiarity Adaptive Retrieval | arXiv: 2603.09250
- Evolution and compression in LLMs: On the emergence of human-aligned categorization | arXiv: 2509.08093
- Evolution of Concepts in Language Model Pre-Training | arXiv: 2509.17196
- Evolutionary Caching to Accelerate Your Off-the-Shelf Diffusion Model | arXiv: 2506.15682
- Exchangeability of GNN Representations with Applications to Graph Retrieval
- Execution-Grounded Credit Assignment for GRPO in Code Generation | arXiv: 2603.16158
- ExGRPO: Learning to Reason from Experience | arXiv: 2510.02245
- Exo-Plore: Exploring Exoskeleton Control Space through Human-aligned Simulation | arXiv: 2601.22550
- ExoPredicator: Learning Abstract Models of Dynamic Worlds for Robot Planning | arXiv: 2509.26255
- Experience-based Knowledge Correction for Robust Planning in Minecraft | arXiv: 2505.24157
- Expert Divergence Learning for MoE-based Language Models | arXiv: 2603.00054
- ExpGuard: LLM Content Moderation in Specialized Domains | arXiv: 2603.02588
- Explaining Grokking and Information Bottleneck through Neural Collapse Emergence | arXiv: 2509.20829
- Exploiting Low-Dimensional Manifold of Features for Few-Shot Whole Slide Image Classification | arXiv: 2505.15504
- Exploration vs Exploitation: Rethinking RLVR through Clipping, Entropy, and Spurious Reward | arXiv: 2512.16912
- Exploratory Memory-Augmented LLM Agent via Hybrid On- and Off-Policy Optimization | arXiv: 2602.23008
- Explore-on-Graph: Incentivizing Autonomous Exploration of Large Language Models on Knowledge Graphs with Path-refined Reward Modeling | arXiv: 2602.21728
- Exploring Diverse Generation Paths via Inference-time Stiefel Activation Steering | arXiv: 2601.22010
- Exploring Interpretability for Visual Prompt Tuning with Cross-layer Concepts | arXiv: 2503.06084
- ExPO-HM: Learning to Explain-then-Detect for Hateful Meme Detection | arXiv: 2510.08630
- Exposing Hidden Biases in Text-to-Image Models via Automated Prompt Search | arXiv: 2512.08724
- Extending Sequence Length is Not All You Need: Effective Integration of Multimodal Signals for Gene Expression Prediction | arXiv: 2602.21550
- Factuality Matters: When Image Generation and Editing Meet Structured Visuals | arXiv: 2510.05091
- Fair in Mind, Fair in Action? A Synchronous Benchmark for Understanding and Generation in UMLLMs | arXiv: 2603.00590
- Faithful Bi-Directional Model Steering via Distribution Matching and Distributed Interchange Interventions | arXiv: 2602.05234
- FAPO: Flawed-Aware Policy Optimization for Efficient and Reliable Reasoning | arXiv: 2510.22543
- FASA: Frequency-aware Sparse Attention | arXiv: 2602.03152
- Fast and Stable Riemannian Metrics on SPD Manifolds via Cholesky Product Geometry | arXiv: 2407.02607
- Fast Catch-Up, Late Switching: Optimal Batch Size Scheduling via Functional Scaling Laws | arXiv: 2602.14208
- Fast Estimation of Wasserstein Distances via Regression on Sliced Wasserstein Distances | arXiv: 2509.20508
- Faster Gradient Methods for Highly-Smooth Stochastic Bilevel Optimization | arXiv: 2509.02937
- FastGHA: Generalized Few-Shot 3D Gaussian Head Avatars with Real-Time Animation | arXiv: 2601.13837
- FastGRPO: Accelerating Policy Optimization via Concurrency-aware Speculative Decoding and Online Draft Learning | arXiv: 2509.21792
- FastLSQ: Solving PDEs in One Shot via Fourier Features with Exact Analytical Derivatives | arXiv: 2602.10541
- FeatureBench: Benchmarking Agentic Coding for Complex Feature Development | arXiv: 2602.10975
- FeDaL: Federated Dataset Learning for General Time Series Foundation Models | arXiv: 2508.04045
- FedDAG: Clustered Federated Learning via Global Data and Gradient Integration for Heterogeneous Environments | arXiv: 2602.23504
- Federated ADMM from Bayesian Duality | arXiv: 2506.13150
- Feedback-driven recurrent quantum neural network universality | arXiv: 2506.16332
- FictionalQA: A Dataset for Studying Memorization and Knowledge Acquisition | arXiv: 2506.05639
- Fine-Grained Activation Steering: Steering Less, Achieving More | arXiv: 2602.04428
- Fine-R1: Make Multi-modal LLMs Excel in Fine-Grained Visual Recognition by Chain-of-Thought Reasoning | arXiv: 2602.07605
- Fine-Tuning Diffusion Models via Intermediate Distribution Shaping | arXiv: 2510.02692
- Fine-tuning Done Right in Model Editing | arXiv: 2509.22072
- Fine-tuning Quantized Neural Networks with Zeroth-order Optimization | arXiv: 2505.13430
- Fine-tuning with RAG for Improving LLM Learning of New Skills | arXiv: 2510.01375
- FingerTip 20K: A Benchmark for Proactive and Personalized Mobile LLM Agents | arXiv: 2507.21071
- FIRE: Frobenius-Isometry Reinitialization for Balancing the Stability-Plasticity Tradeoff | arXiv: 2602.08040
- First is Not Really Better Than Last: Evaluating Layer Choice and Aggregation Strategies in Language Model Data Influence Estimation | arXiv: 2511.04715
- Fixing the Broken Compass: Diagnosing and Improving Inference-Time Reward Modeling | arXiv: 2503.05188
- FlashVID: Efficient Video Large Language Models via Training-free Tree-Based Spatiotemporal Token Merging | arXiv: 2602.08024
- Flattery, Fluff, and Fog: Diagnosing and Mitigating Idiosyncratic Biases in Preference Models | arXiv: 2506.05339
- FlexiCodec: A Dynamic Neural Audio Codec for Low Frame Rates | arXiv: 2510.00981
- FLoC: Facility Location-Based Efficient Visual Token Compression for Long Video Understanding | arXiv: 2511.00141
- Flow Actor-Critic for Offline Reinforcement Learning (FAC) | arXiv: 2602.18015
- Flow Matching with Injected Noise for Offline-to-Online Reinforcement Learning | arXiv: 2602.18117
- Flow of Spans: Generalizing Language Models to Dynamic Span-Vocabulary via GFlowNets | arXiv: 2602.10583
- Flow2GAN: Hybrid Flow Matching and GAN with Multi-Resolution Network for Few-step High-Fidelity Audio Generation | arXiv: 2512.23278
- FlowCast: Advancing Precipitation Nowcasting with Conditional Flow Matching | arXiv: 2511.09731
- FlowCast: Trajectory Forecasting for Scalable Zero-Cost Speculative Flow Matching | arXiv: 2602.01329
- Fly-CL: A Fly-Inspired Framework for Enhancing Efficient Decorrelation and Reduced Training Time in Pre-trained Model-based Continual Representation Learning | arXiv: 2510.16877
- FlyPrompt: Brain-Inspired Random-Expanded Routing with Temporal-Ensemble Experts for General Continual Learning | arXiv: 2602.01976
- Follow-Your-Shape: Shape-Aware Image Editing via Trajectory-Guided Region Control | arXiv: 2508.08134
- ForestPersons: A Large-Scale Dataset for Under-Canopy Missing Person Detection | arXiv: 2603.02541
- Formal Mechanistic Interpretability: Automated Circuit Discovery with Provable Guarantees | arXiv: 2602.16823
- Frame Guidance: Training-Free Guidance for Frame-Level Control in Video Diffusion Models | arXiv: 2506.07177
- Free Energy Mixer | arXiv: 2602.07160
- Free Lunch for Stabilizing Rectified Flow Inversion | arXiv: 2602.11850
- FreqKV: Key-Value Compression in Frequency Domain for Context Window Extension | arXiv: 2505.00570
- FRIEDA: Benchmarking Multi-Step Cartographic Reasoning in Vision-Language Models | arXiv: 2512.08016
- From Abstract to Contextual: What LLMs Still Cannot Do in Mathematics | arXiv: 2601.23048
- From Assumptions to Actions: Turning LLM Reasoning into Uncertainty-Aware Planning for Embodied Agents | arXiv: 2602.04326
- From Conversation to Query Execution: Benchmarking User and Tool Interactions for EHR Database Agents | arXiv: 2509.23415
- From Evaluation to Defense: Advancing Safety in Video Large Language Models | arXiv: 2505.16643
- From Movement to Cognitive Maps: RNNs Reveal How Locomotor Development Shapes Hippocampal Spatial Coding
- From Narrow to Panoramic Vision: Attention-Guided Cold-Start Reshapes Multimodal Reasoning | arXiv: 2603.03825
- From Observations to Events: Event-Aware World Model for Reinforcement Learning | arXiv: 2601.19336
- From Parameters to Behaviors: Unsupervised Compression of the Policy Space | arXiv: 2509.22566
- From Prediction to Perfection: Introducing Refinement to Autoregressive Image Generation | arXiv: 2505.16324
- From Samples to Scenarios: A New Paradigm for Probabilistic Forecasting | arXiv: 2509.19975
- From Spatial to Actions: Grounding Vision-Language-Action Model in Spatial Foundation Priors | arXiv: 2510.17439
- From Static Benchmarks to Dynamic Protocol: Agent-Centric Text Anomaly Detection for Evaluating LLM Reasoning | arXiv: 2602.23729
- From Utterance to Vividity: Training Expressive Subtitle Translation LLM via Adaptive Local Preference Optimization | arXiv: 2602.01068
- From Verifiable Dot to Reward Chain: Harnessing Verifiable Reference-based Rewards for Reinforcement Learning of Open-ended Generation | arXiv: 2601.18533
- From Vicious to Virtuous Cycles: Synergistic Representation Learning for Unsupervised Video Object-Centric Learning | arXiv: 2602.03390
- FrontierCO: Real-World and Large-Scale Evaluation of Machine Learning Solvers for Combinatorial Optimization | arXiv: 2505.16952
- FS-DFM: Fast and Accurate Long Text Generation with Few-Step Diffusion Language Models | arXiv: 2509.20624
- FSOD-VFM: Few-Shot Object Detection with Vision Foundation Models and Graph Diffusion | arXiv: 2602.03137
- Function Induction and Task Generalization: An Interpretability Study with Off-by-One Addition | arXiv: 2507.09875
- Function Spaces Without Kernels: Learning Compact Hilbert Space Representations | arXiv: 2509.20605
- Functional embeddings enable Aggregation of multi-area SEEG recordings over subjects and sessions | arXiv: 2510.27090
- Fused-Planes: Why Train a Thousand Tri-Planes When You Can Share? | arXiv: 2410.23742
- Fusing Pixels and Genes: Spatially-Aware Learning in Computational Pathology | arXiv: 2602.13944
- FutureMind: Equipping Small Language Models with Strategic Thinking-Pattern Priors via Adaptive Knowledge Distillation | arXiv: 2602.01222
- G-reasoner: Foundation Models for Unified Reasoning over Graph-structured Knowledge | arXiv: 2509.24276
- Gaia2: Benchmarking LLM Agents on Dynamic and Asynchronous Environments | arXiv: 2602.11964
- GaitSnippet: Gait Recognition Beyond Unordered Sets and Ordered Sequences | arXiv: 2508.07782
- GASP: Guided Asymmetric Self-Play For Coding LLMs | arXiv: 2603.15957
- Gaussian Certified Unlearning in High Dimensions: A Hypothesis Testing Approach | arXiv: 2510.13094
- GAVEL: Towards Rule-Based Safety through Activation Monitoring | arXiv: 2601.19768
- GenCP: Towards Generative Modeling Paradigm of Coupled Physics | arXiv: 2601.19541
- GenDR: Lighten Generative Detail Restoration | arXiv: 2503.06790
- General Exploratory Bonus for Optimistic Exploration in RLHF | arXiv: 2510.03269
- Generalizable Coarse-to-Fine Robot Manipulation via Language-Aligned 3D Keypoints | arXiv: 2509.23575
- Generalizable End-to-End Tool-Use RL with Synthetic CodeGym | arXiv: 2509.17325
- Generalization Below the Edge of Stability: The Role of Data Geometry | arXiv: 2510.18120
- Generalization of Diffusion Models Arises with a Balanced Representation Space | arXiv: 2512.20963
- Generalizing Linear Autoencoder Recommenders with Decoupled Expected Quadratic Loss | arXiv: 2603.07402
- Generate Any Scene: Scene Graph Driven Data Synthesis for Visual Generation Training | arXiv: 2412.08221
- Generating Directed Graphs with Dual Attention and Asymmetric Encoding | arXiv: 2506.16404
- Generative Value Conflicts Reveal LLM Priorities | arXiv: 2509.25369
- GeoDiv: Framework For Measuring Geographical Diversity In Text-To-Image Models | arXiv: 2602.22120
- GeoGramBench: Benchmarking the Geometric Program Reasoning in Modern LLMs | arXiv: 2505.17653
- Geometry-aware 4D Video Generation for Robot Manipulation | arXiv: 2507.01099
- GeoPurify: A Data-Efficient Geometric Distillation Framework for Open-Vocabulary 3D Segmentation | arXiv: 2510.02186
- GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning | arXiv: 2507.19457
- GGBall: Graph Generative Model on Poincaré Ball | arXiv: 2506.07198
- GIQ: Benchmarking 3D Geometric Reasoning of Vision Foundation Models with Simulated and Real Polyhedra | arXiv: 2506.08194
- Glance and Focus Reinforcement for Pan-cancer Screening | arXiv: 2601.19103
- GLASS Flows: Efficient Inference for Reward Alignment of Flow and Diffusion Models
- GLYPH-SR: Can We Achieve Both High-Quality Image Super-Resolution and High-Fidelity Text Recovery via VLM-guided Latent Diffusion Model? | arXiv: 2510.26339
- GoalRank: Group-Relative Optimization for a Large Ranking Model | arXiv: 2509.22046
- GOT-Edit: Geometry-Aware Generic Object Tracking via Online Model Editing | arXiv: 2602.08550
- GRADIEND: Feature Learning within Neural Networks Exemplified through Biases | arXiv: 2502.01406
- Gradient-Sign Masking for Task Vector Transport Across Pre-Trained Models | arXiv: 2510.09658
- Graph homophily booster: Reimagining the role of discrete features in heterophilic graph learning | arXiv: 2602.07256
- Graph Tokenization for Bridging Graphs and Transformers | arXiv: 2603.11099
- GraphOmni: A Comprehensive and Extensible Benchmark Framework for Large Language Models on Graph-theoretic Tasks | arXiv: 2504.12764
- GraphUniverse: Synthetic Graph Generation for Evaluating Inductive Generalization | arXiv: 2509.21097
- Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs | arXiv: 2510.18876
- Grokking in LLM Pretraining? Monitor Memorization-to-Generalization without Test | arXiv: 2506.21551
- Grounding and Enhancing Informativeness and Utility in Dataset Distillation | arXiv: 2601.21296
- Grounding Generative Planners in Verifiable Logic: A Hybrid Architecture for Trustworthy Embodied AI | arXiv: 2602.08373
- Grounding-IQA: Grounding Multimodal Language Models for Image Quality Assessment | arXiv: 2411.17237
- Group Representational Position Encoding (GRAPE) | arXiv: 2512.07805
- Group-Relative REINFORCE Is Secretly an Off-Policy Algorithm: Demystifying Some Myths About GRPO and Its Friends | arXiv: 2509.24203
- GTM: A General Time-series Model for Enhanced Representation Learning of Time-Series Data | arXiv: 2502.03264
- GTR-Bench: Evaluating Geo-Temporal Reasoning in Vision-Language Models | arXiv: 2510.07791
- GuardAlign: Test-time Safety Alignment in Multimodal Large Language Models | arXiv: 2602.24027
- GuidedSampling: Steering LLMs Towards Diverse Candidate Solutions at Inference-Time | arXiv: 2510.03777
- Hallucination Begins Where Saliency Drops | arXiv: 2601.20279
- HAMLET: A Hierarchical and Adaptive Multi-Agent Framework for Live Embodied Theatrics | arXiv: 2507.15518
- Harder Is Better: Boosting Mathematical Reasoning via Difficulty-Aware GRPO and Multi-Aspect Question Reformulation | arXiv: 2601.20614
- Harnessing Uncertainty: Entropy-Modulated Policy Gradients for Long-Horizon LLM Agents | arXiv: 2509.09265
- Harpoon: Generalised Manifold Guidance for Conditional Tabular Diffusion | arXiv: 2602.07875
- HDR-NSFF: High Dynamic Range Neural Scene Flow Fields | arXiv: 2603.08313
- HEEGNet: Hyperbolic Embeddings for EEG | arXiv: 2601.03322
- Helix: Evolutionary Reinforcement Learning for Open-Ended Scientific Problem Solving | arXiv: 2603.07642
- Heterogeneous Federated Fine-Tuning with Parallel One-Rank Adaptation | arXiv: 2602.16936
- HeurekaBench: A Benchmarking Framework for AI Co-scientist | arXiv: 2601.01678
- Hidden Breakthroughs in Language Model Training | arXiv: 2506.15872
- Hide and Find: A Distributed Adversarial Attack on Federated Graph Learning | arXiv: 2603.07743
- HiDrop: Hierarchical Vision Token Reduction in MLLMs via Late Injection, Concave Pyramid Pruning, and Early Exit | arXiv: 2602.23699
- Hierarchical Concept-based Interpretable Models | arXiv: 2602.23947
- Hierarchical Entity-centric Reinforcement Learning with Factored Subgoal Diffusion | arXiv: 2602.02722
- Hierarchy-of-Groups Policy Optimization for Long-Horizon Agentic Tasks | arXiv: 2602.22817
- HierLoc: Hyperbolic Entity Embeddings for Hierarchical Visual Geolocation | arXiv: 2601.23064
- HiFo-Prompt: Prompting with Hindsight and Foresight for LLM-based Automatic Heuristic Design | arXiv: 2508.13333
- Highly Efficient and Effective LLMs with Multi-Boolean Architectures | arXiv: 2505.22811
- Hilbert-Guided Sparse Local Attention | arXiv: 2511.05832
- HistoPrism: Unlocking Functional Pathway Analysis from Pan-Cancer Histology via Gene Expression Prediction | arXiv: 2601.21560
- HiVid: LLM-Guided Video Saliency For Content-Aware VOD And Live Streaming | arXiv: 2602.14214
- HOG-Diff: Higher-Order Guided Diffusion for Graph Generation | arXiv: 2502.04308
- Horizon Imagination: Efficient On-Policy Rollout in Diffusion World Models | arXiv: 2602.08032
- How Catastrophic is Your LLM? Certifying Risk in Conversation | arXiv: 2510.03969
- How Do Medical MLLMs Fail? A Study on Visual Grounding in Medical Images | arXiv: 2603.14323
- How Do Transformers Learn to Associate Tokens: Gradient Leading Terms Bring Mechanistic Interpretability | arXiv: 2601.19208
- How Far Are LLMs from Professional Poker Players? Revisiting Game-Theoretic Reasoning with Agentic Tool Use | arXiv: 2602.00528
- How Far Can Unsupervised RLVR Scale LLM Training? | arXiv: 2603.08660
- How LLMs Learn to Reason: A Complex Network Perspective | arXiv: 2509.23629
- How Reliable is Language Model Micro-Benchmarking? | arXiv: 2510.08730
- How to make the most of your masked language model for protein engineering | arXiv: 2603.10302
- Human Behavior Atlas: Benchmarking Unified Psychological and Social Behavior Understanding | arXiv: 2510.04899
- Human or Machine? A Preliminary Turing Test for Speech-to-Speech Interaction | arXiv: 2602.24080
- Human-LLM Collaborative Feature Engineering for Tabular Data | arXiv: 2601.21060
- HUME: Measuring the Human-Model Performance Gap in Text Embedding Tasks | arXiv: 2510.10062
- Hybrid Deep Searcher: Scalable Parallel and Sequential Search Reasoning | arXiv: 2508.19113
- HyperKKL: Enabling Non-Autonomous State Estimation through Dynamic Weight Conditioning | arXiv: 2602.22630
- I Can't Believe It's Not Robust: Catastrophic Collapse of Safety Classifiers under Embedding Drift | arXiv: 2603.01297
- ICYM2I: The illusion of multimodal informativeness under missingness | arXiv: 2505.16953
- Identifying and Evaluating Inactive Heads in Pretrained LLMs | arXiv: 2504.03889
- IDER: IDempotent Experience Replay for Reliable Continual Learning | arXiv: 2603.00624
- Ignore All Previous Instructions: Jailbreaking as a de-escalatory peace building practise to resist LLM social media bots | arXiv: 2603.01942
- Image Can Bring Your Memory Back: A Novel Multi-Modal Guided Attack against Image Generation Model Unlearning | arXiv: 2507.07139
- Imagine How To Change: Explicit Procedure Modeling for Change Captioning | arXiv: 2603.05969
- Implicit Bias and Loss of Plasticity in Matrix Completion: Depth Promotes Low-Rankness | arXiv: 2603.04703
- Implicit Bias of Per-sample Adam on Separable Data: Departure from the Full-batch Regime | arXiv: 2510.26303
- Implicit Statistical Inference in Transformers: Approximating Likelihood-Ratio Tests In-Context | arXiv: 2603.10573
- Improved Object-Centric Diffusion Learning with Registers and Contrastive Alignment | arXiv: 2601.01224
- Improving 2D Diffusion Models for 3D Medical Imaging with Inter-Slice Consistent Stochasticity | arXiv: 2602.04162
- Improving Black-Box Generative Attacks via Generator Semantic Consistency | arXiv: 2506.18248
- Improving Code Localization with Repository Memory | arXiv: 2510.01003
- Improving Discrete Diffusion Unmasking Policies Beyond Explicit Reference Policies | arXiv: 2510.05725
- Improving Long-Range Interactions in Graph Neural Simulators via Hamiltonian Dynamics | arXiv: 2511.08185
- Improving Set Function Approximation with Quasi-Arithmetic Neural Networks | arXiv: 2602.04941
- Improving the Trade-off Between Watermark Strength and Speculative Sampling Efficiency for Language Models | arXiv: 2602.01428
- IMSE: Intrinsic Mixture of Spectral Experts Fine-tuning for Test-Time Adaptation | arXiv: 2603.07926
- In Agents We Trust, but Who Do Agents Trust? Latent Source Preferences Steer LLM Generations | arXiv: 2602.15456
- In-Context Algebra | arXiv: 2512.16902
- In-Context Learning for Pure Exploration | arXiv: 2506.01876
- In-Context Learning of Temporal Point Processes with Foundation Inference Models | arXiv: 2509.24762
- Incentive-Aligned Multi-Source LLM Summaries | arXiv: 2509.25184
- Incentives in Federated Learning with Heterogeneous Agents | arXiv: 2509.21612
- Incentivizing Agentic Reasoning in LLM Judges via Tool-Integrated Reinforcement Learning | arXiv: 2510.23038
- Index-Preserving Lightweight Token Pruning for Efficient Document Understanding in Vision-Language Models | arXiv: 2509.06415
- Inference-Time Backdoors via Hidden Instructions in LLM Chat Templates | arXiv: 2602.04653
- Inference-Time Dynamic Modality Selection for Incomplete Multimodal Classification | arXiv: 2601.22853
- Inference-Time Safety For Code LLMs Via Retrieval-Augmented Revision | arXiv: 2603.01494
- InfiAgent: Self-Evolving Pyramid Agent Framework for Infinite Scenarios | arXiv: 2509.22502
- Infinity and Beyond: Compositional Alignment in VAR and Diffusion T2I Models | arXiv: 2512.11542
- InfoDet: A Dataset for Infographic Element Detection | arXiv: 2505.17473
- InfoNCE Induces Gaussian Distribution | arXiv: 2602.24012
- Information Shapes Koopman Representation | arXiv: 2510.13025
- InftyThink: Breaking the Length Limits of Long-Context Reasoning in Large Language Models | arXiv: 2503.06692
- Inherited Goal Drift: Contextual Pressure Can Undermine Agentic Goals | arXiv: 2603.03258
- Initialization Schemes for Kolmogorov-Arnold Networks: An Empirical Study | arXiv: 2509.03417
- InnoGym: Benchmarking the Innovation Potential of AI Agents | arXiv: 2512.01822
- Inoculation Prompting: Eliciting Traits from LLMs during Training Can Suppress Them at Test-Time | arXiv: 2510.04340
- Intention-Conditioned Flow Occupancy Models | arXiv: 2506.08902
- InterActHuman: Multi-Concept Human Animation with Layout-Aligned Audio Conditions | arXiv: 2506.09984
- Internal Planning in Language Models: Characterizing Horizon and Branch Awareness | arXiv: 2509.25260
- Into the Rabbit Hull: From Task-Relevant Concepts in DINO to Minkowski Geometry | arXiv: 2510.08638
- Intrinsic Lorentz Neural Network | arXiv: 2602.23981
- Intrinsic Training Dynamics of Deep Neural Networks | arXiv: 2508.07370
- Inverse Virtual Try-On: Generating Multi-Category Product-Style Images from Clothed Individuals | arXiv: 2505.21062
- Is Finer Better? The Limits of Microscaling Formats in Large Language Models | arXiv: 2601.19026
- Is In-Context Learning Learning? | arXiv: 2509.10414
- Is It Thinking or Cheating? Detecting Implicit Reward Hacking by Measuring Reasoning Effort | arXiv: 2510.01367
- Is On-Policy Data always the Best Choice for Direct Preference Optimization-based LM Alignment? | arXiv: 2508.10530
- Is Pure Exploitation Sufficient in Exogenous MDPs with Linear Function Approximation? | arXiv: 2601.20694
- Is the Reversal Curse a Binding Problem? Uncovering Limitations of Transformers from a Basic Generalization Failure | arXiv: 2504.01928
- Is Your Paper Being Reviewed by an LLM? Benchmarking AI Text Detection in Peer Review | arXiv: 2502.19614
- IterResearch: Rethinking Long-Horizon Agents with Interaction Scaling | arXiv: 2511.07327
- IVC-Prune: Revealing the Implicit Visual Coordinates in LVLMs for Vision Token Pruning | arXiv: 2602.03060
- Jackpot: Optimal Budgeted Rejection Sampling for Extreme Actor-Policy Mismatch RL | arXiv: 2602.06107
- JailNewsBench: Multi-Lingual and Regional Benchmark for Fake News Generation under Jailbreak Attacks | arXiv: 2603.01291
- JanusVLN: Decoupling Semantics and Spatiality with Dual Implicit Memory for Vision-Language Navigation | arXiv: 2509.22548
- JavisDiT++: Unified Modeling and Optimization for Joint Audio-Video Generation | arXiv: 2602.19163
- JavisDiT: Joint Audio-Video Diffusion Transformer with Hierarchical Spatio-Temporal Prior Synchronization | arXiv: 2503.23377
- Joint Shadow Generation and Relighting via Light-Geometry Interaction Maps | arXiv: 2602.21820
- JointDiff: Bridging Continuous and Discrete in Multi-Agent Trajectory Generation | arXiv: 2509.22522
- Journey to the Centre of Cluster: Harnessing Interior Nodes for A/B Testing under Network Interference | arXiv: 2602.04457
- Judge Reliability Harness: Stress Testing the Reliability of LLM Judges | arXiv: 2603.05399
- Judge's Verdict: A Comprehensive Analysis of LLM Judge Capability Through Human Agreement | arXiv: 2510.09738
- JULI: Jailbreak Large Language Models by Self-Introspection | arXiv: 2505.11790
- K-Sort Eval: Efficient Preference Evaluation for Visual Generation via Corrected VLM-as-a-Judge | arXiv: 2602.09411
- KBVQ-MoE: KLT-guided SVD with Bias-Corrected Vector Quantization for MoE Large Language Models | arXiv: 2602.11184
- KeepLoRA: Continual Learning with Residual Gradient Adaptation | arXiv: 2601.19659
- Key and Value Weights Are Probably All You Need: On the Necessity of the Query, Key, Value weight Triplet in Self-Attention Transformers | arXiv: 2510.23912
- Knowing When to Quit: Probabilistic Early Exits for Speech Separation | arXiv: 2507.09768
- Knowledge Fusion of Large Language Models Via Modular SkillPacks | arXiv: 2505.18502
- Knowledgeable Language Models as Black-Box Optimizers for Personalized Medicine | arXiv: 2509.20975
- KV Cache Transform Coding for Compact Storage in LLM Inference | arXiv: 2511.01815
- KVComm: Enabling Efficient LLM Communication through Selective KV Sharing | arXiv: 2510.03346
- LadderSym: A Multimodal Interleaved Transformer for Music Practice Error Detection | arXiv: 2510.08580
- Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models | arXiv: 2503.22165
- Language in the Flow of Time: Time-Series-Paired Texts Weaved into a Unified Temporal Narrative | arXiv: 2502.08942
- Language-guided Open-world Video Anomaly Detection under Weak Supervision | arXiv: 2503.13160
- Laplacian Multi-scale Flow Matching for Generative Modeling | arXiv: 2602.19461
- Large Scale Diffusion Distillation via Score-Regularized Continuous-Time Consistency | arXiv: 2510.08431
- Latent Diffusion Model without Variational Autoencoder | arXiv: 2510.15301
- Latent Equivariant Operators for Robust Object Recognition: Promises and Challenges | arXiv: 2602.18406
- Latent Fourier Transform
- Latent Speech-Text Transformer | arXiv: 2510.06195
- Latent Wasserstein Adversarial Imitation Learning | arXiv: 2603.05440
- LaVCa: LLM-assisted Visual Cortex Captioning | arXiv: 2502.13606
- Layer by layer, module by module: Choose both for optimal OOD probing of ViT | arXiv: 2603.05280
- LCA: Local Classifier Alignment for Continual Learning | arXiv: 2603.09888
- LD-MoLE: Learnable Dynamic Routing for Mixture of LoRA Experts | arXiv: 2509.25684
- Learnability and Privacy Vulnerability are Entangled in a Few Critical Weights | arXiv: 2603.13186
- Learning a distance measure from the information-estimation geometry of data | arXiv: 2510.02514
- Learning Adaptive Distribution Alignment with Neural Characteristic Function for Graph Domain Adaptation | arXiv: 2602.10489
- Learning Concept Bottleneck Models from Mechanistic Explanations | arXiv: 2603.07343
- Learning Domain-Aware Task Prompt Representations for Multi-Domain All-in-One Image Restoration | arXiv: 2603.01725
- Learning from Synthetic Data Improves Multi-hop Reasoning | arXiv: 2603.02091
- Learning Molecular Chirality via Chiral Determinant Kernels | arXiv: 2602.07415
- Learning More with Less: A Dynamic Dual-Level Down-Sampling Framework for Efficient Policy Optimization | arXiv: 2509.22115
- Learning on a Razor's Edge: Identifiability and Singularity of Polynomial Neural Networks | arXiv: 2505.11846
- Learning Ordinal Probabilistic Reward from Preferences | arXiv: 2602.12660
- Learning Part-Aware Dense 3D Feature Field for Generalizable Articulated Object Manipulation | arXiv: 2602.14193
- Learning Patient-Specific Disease Dynamics with Latent Flow Matching for Longitudinal Imaging Generation | arXiv: 2512.09185
- Learning Physics-Grounded 4D Dynamics with Neural Gaussian Force Fields | arXiv: 2602.00148
- Learning Recursive Multi-Scale Representations for Irregular Multivariate Time Series Forecasting | arXiv: 2602.21498
- Learning Robust Intervention Representations with Delta Embeddings | arXiv: 2508.04492
- Learning Structure-Semantic Evolution Trajectories for Graph Domain Adaptation | arXiv: 2602.10506
- Learning to Generate Unit Test via Adversarial Reinforcement Learning | arXiv: 2508.21107
- Learning to Orchestrate Agents in Natural Language with the Conductor | arXiv: 2512.04388
- Learning to Play Multi-Follower Bayesian Stackelberg Games | arXiv: 2510.01387
- Learning to Reason without External Rewards | arXiv: 2505.19590
- Learning to Recall with Transformers Beyond Orthogonal Embeddings | arXiv: 2603.15923
- Learning to Solve Orienteering Problem with Time Windows and Variable Profits | arXiv: 2603.06260
- Learning Unified Representation of 3D Gaussian Splatting | arXiv: 2509.22917
- Learning Video Generation for Robotic Manipulation with Collaborative Trajectory Control | arXiv: 2506.01943
- Learning-guided Kansa Collocation for Forward and Inverse PDE Problems | arXiv: 2602.07970
- Less is More: Clustered Cross-Covariance Control for Offline RL | arXiv: 2601.20765
- Less is More: Towards Simple Graph Contrastive Learning | arXiv: 2509.25742
- Let's Split Up: Zero-Shot Classifier Edits for Fine-Grained Video Understanding | arXiv: 2602.16545
- Let's Think in Two Steps: Mitigating Agreement Bias in MLLMs with Self-Grounded Verification | arXiv: 2507.11662
- Leveraging Data to Say No: Memory Augmented Plug-and-Play Selective Prediction | arXiv: 2601.22570
- LH-Deception: Simulating and Understanding LLM Deceptive Behaviors in Long-Horizon Interactions | arXiv: 2510.03999
- Lifelong Learning with Behavior Consolidation for Vehicle Routing | arXiv: 2509.21765
- LightMem: Lightweight and Efficient Memory-Augmented Generation | arXiv: 2510.18866
- LightRetriever: A LLM-based Text Retrieval Architecture with Extremely Faster Query Inference | arXiv: 2505.12260
- LingOly-TOO: Disentangling Reasoning from Knowledge with Templatised Orthographic Obfuscation | arXiv: 2503.02972
- LipNeXt: Scaling up Lipschitz-based Certified Robustness to Billion-parameter Models | arXiv: 2601.18513
- Lipschitz Bandits with Stochastic Delayed Feedback | arXiv: 2510.00309
- LiTo: Surface Light Field Tokenization | arXiv: 2603.11047
- LiveNewsBench: Evaluating LLM Web Search Capabilities with Freshly Curated News | arXiv: 2602.13543
- LiveWeb-IE: A Benchmark For Online Web Information Extraction | arXiv: 2603.13773
- LLaVA-FA: Learning Fourier Approximation for Compressing Large Multimodal Models | arXiv: 2602.00135
- LLEMA: Evolutionary Search with LLMs for Multi-Objective Materials Discovery | arXiv: 2510.22503
- LLM DNA: Tracing Model Evolution via Functional Representations | arXiv: 2509.24496
- LLM Unlearning with LLM Beliefs | arXiv: 2510.19422
- LLM2Fx-Tools: Tool Calling For Music Post-Production | arXiv: 2512.01559
- LLMs Encode Their Failures: Predicting Success from Pre-Generation Activations | arXiv: 2602.09924
- Locality-Attending Vision Transformer | arXiv: 2603.04892
- Locality-aware Parallel Decoding for Efficient Autoregressive Image Generation | arXiv: 2507.01957
- Localized Concept Erasure in Text-to-Image Diffusion Models via High-Level Representation Misdirection | arXiv: 2602.19631
- LoFT: Low-Rank Adaptation That Behaves Like Full Fine-Tuning | arXiv: 2505.21289
- Log Probability Tracking of LLM APIs | arXiv: 2512.03816
- LogicReward: Incentivizing LLM Reasoning via Step-Wise Logical Supervision | arXiv: 2512.18196
- LogicXGNN: Grounded Logical Rules for Explaining Graph Neural Networks | arXiv: 2503.19476
- Long-Context Generalization with Sparse Attention | arXiv: 2506.16640
- LongRLVR: Long-Context Reinforcement Learning Requires Verifiable Context Rewards | arXiv: 2603.02146
- LongWriter-Zero: Mastering Ultra-Long Text Generation via Reinforcement Learning | arXiv: 2506.18841
- Look Carefully: Adaptive Visual Reinforcements in Multimodal Large Language Models for Hallucination Mitigation | arXiv: 2602.24041
- LookaheadKV: Fast and Accurate KV Cache Eviction by Glimpsing into the Future without Generation | arXiv: 2603.10899
- LoongRL: Reinforcement Learning for Advanced Reasoning over Long Contexts | arXiv: 2510.19363
- Loopholing Discrete Diffusion: Deterministic Bypass of the Sampling Wall | arXiv: 2510.19304
- LoRA-Edit: Controllable First-Frame-Guided Video Editing via Mask-Aware LoRA Fine-Tuning | arXiv: 2506.10082
- LORE: Jointly Learning the Intrinsic Dimensionality and Relative Similarity Structure From Ordinal Data | arXiv: 2602.04192
- Lossless Vocabulary Reduction for Auto-Regressive Language Models | arXiv: 2510.08102
- LPWM: Latent Particle World Models for Object-Centric Stochastic Dynamics | arXiv: 2603.04553
- LUMINA: Detecting Hallucinations in RAG System with Context-Knowledge Signals | arXiv: 2509.21875
- Lumos-1: On Autoregressive Video Generation with Discrete Diffusion from a Unified Model Perspective | arXiv: 2507.08801
- LVTINO: LAtent Video consisTency INverse sOlver for High Definition Video Restoration | arXiv: 2510.01339
- LycheeDecode: Accelerating Long-Context LLM Inference via Hybrid-Head Sparse Decoding | arXiv: 2602.04541
- M\(^2\)-Miner: Multi-Agent Enhanced MCTS for Mobile GUI Agent Data Mining | arXiv: 2602.05429
- MAC-AMP: A Closed-Loop Multi-Agent Collaboration System for Multi-Objective Antimicrobial Peptide Design | arXiv: 2602.14926
- Mamba-3: Improved Sequence Modeling using State Space Principles | arXiv: 2603.15569
- Map the Flow: Revealing Hidden Pathways of Information in VideoLLMs | arXiv: 2510.13251
- Mapping Semantic & Syntactic Relationships with Geometric Rotation | arXiv: 2510.09790
- MAPSS: Manifold-based Assessment of Perceptual Source Separation | arXiv: 2509.09212
- MARC: Memory-Augmented RL Token Compression for Efficient Video Understanding | arXiv: 2510.07915
- Market Games for Generative Models: Equilibria, Welfare, and Strategic Entry | arXiv: 2602.17787
- Markovian Transformers for Informative Language Modeling | arXiv: 2404.18988
- MARS-Sep: Multimodal-Aligned Reinforced Sound Separation | arXiv: 2510.10509
- MATA: A Trainable Hierarchical Automaton System for Multi-Agent Visual Reasoning | arXiv: 2601.19204
- MathFimer: Enhancing Mathematical Reasoning by Expanding Reasoning Steps through Fill-in-the-Middle Task | arXiv: 2502.11684
- Maximizing Asynchronicity in Event-based Neural Networks | arXiv: 2505.11165
- Maximizing Incremental Information Entropy for Contrastive Learning | arXiv: 2603.12594
- MC-Search: Evaluating and Enhancing Multimodal Agentic Search with Structured Long Reasoning Chains | arXiv: 2603.00873
- mCLM: A Modular Chemical Language Model that Generates Functional and Makeable Molecules | arXiv: 2505.12565
- Measuring Physical-World Privacy Awareness of Large Language Models: An Evaluation Benchmark | arXiv: 2510.02356
- Measuring the Intrinsic Dimension of Earth Representations | arXiv: 2511.02101
- Measuring Uncertainty Calibration | arXiv: 2512.13872
- Mechanism of Task-oriented Information Removal in In-context Learning | arXiv: 2509.21012
- MedAgentGym: A Scalable Agentic Training Environment for Code-Centric Reasoning in Biomedical Data Science | arXiv: 2506.04405
- MEGS\(^{2}\): Memory-Efficient Gaussian Splatting via Spherical Gaussians and Unified Pruning | arXiv: 2509.07021
- Memba: Membrane-driven Parameter-Efficient Fine-Tuning for Mamba | arXiv: 2506.18184
- Membership Inference Attacks Against Fine-tuned Diffusion Language Models | arXiv: 2601.20125
- Membership Privacy Risks of Sharpness Aware Minimization | arXiv: 2310.00488
- MemoryVLA: Perceptual-Cognitive Memory in Vision-Language-Action Models for Robotic Manipulation | arXiv: 2508.19236
- MENLO: From Preferences to Proficiency -- Evaluating and Modeling Native-like Quality Across 47 Languages | arXiv: 2509.26601
- MergeMix: A Unified Augmentation Paradigm for Visual and Multi-Modal Understanding | arXiv: 2510.23479
- Meta-Adaptive Prompt Distillation for Few-Shot Visual Question Answering | arXiv: 2506.06905
- Meta-RL Induces Exploration in Language Agents | arXiv: 2512.16848
- Metis-SPECS: Decoupling Multimodal Learning via Self-distilled Preference-based Cold Start | arXiv: 2510.25801
- Minor First, Major Last: A Depth-Induced Implicit Bias of Sharpness-Aware Minimization | arXiv: 2603.08290
- Missing Mass for Differentially Private Domain Discovery | arXiv: 2603.14016
- Mitigating Mismatch within Reference-based Preference Optimization | arXiv: 2602.11902
- Mitigating Spurious Correlation via Distributionally Robust Learning with Hierarchical Ambiguity Sets | arXiv: 2510.02818
- Mitigating the Safety Alignment Tax with Null-Space Constrained Policy Optimization | arXiv: 2512.11391
- Mixing Importance with Diversity: Joint Optimization for KV Cache Compression in Large Vision-Language Models | arXiv: 2510.20707
- MMedAgent-RL: Optimizing Multi-Agent Collaboration for Multimodal Medical Reasoning | arXiv: 2506.00555
- MMR-Life: Piecing Together Real-life Scenes for Multimodal Multi-image Reasoning | arXiv: 2603.02024
- MMSU: A Massive Multi-task Spoken Language Understanding and Reasoning Benchmark | arXiv: 2506.04779
- MMTok: Multimodal Coverage Maximization for Efficient Inference of VLMs | arXiv: 2508.18264
- MobileLLM-R1: Exploring the Limits of Sub-Billion Language Model Reasoners with Open Training Recipes | arXiv: 2509.24945
- Mod-Adapter: Tuning-Free and Versatile Multi-concept Personalization via Modulation Adapter | arXiv: 2505.18612
- Modal Aphasia: Can Unified Multimodal Models Describe Images From Memory? | arXiv: 2510.21842
- Modal Logical Neural Networks for Financial AI | arXiv: 2603.12487
- Modality-free Graph In-context Alignment | arXiv: 2603.13434
- Model Collapse Is Not a Bug but a Feature in Machine Unlearning for LLMs | arXiv: 2507.04219
- Model Predictive Adversarial Imitation Learning for Planning from Observation | arXiv: 2507.21533
- MoE-GS: Mixture of Experts for Dynamic Gaussian Splatting | arXiv: 2510.19210
- MolLangBench: A Comprehensive Benchmark for Language-Prompted Molecular Structure Recognition, Editing, and Generation | arXiv: 2505.15054
- MOLM: Mixture of LoRA Markers | arXiv: 2510.00293
- MoMa: A Modular Deep Learning Framework for Material Property Prediction | arXiv: 2502.15483
- MoMaGen: Generating Demonstrations under Soft and Hard Constraints for Multi-Step Bimanual Mobile Manipulation | arXiv: 2510.18316
- MoNE: Replacing Redundant Experts with Lightweight Novices for Structured Pruning of MoE | arXiv: 2507.00390
- Mono4DGS-HDR: High Dynamic Range 4D Gaussian Splatting from Alternating-exposure Monocular Videos | arXiv: 2510.18489
- Monocular Normal Estimation via Shading Sequence Estimation | arXiv: 2602.09929
- MoSA: Motion-Coherent Human Video Generation via Structure-Appearance Decoupling | arXiv: 2508.17404
- MOSIV: Multi-Object System Identification from Videos | arXiv: 2603.06022
- Motion Prior Distillation in Time Reversal Sampling for Generative Inbetweening | arXiv: 2602.12679
- MotionStream: Real-Time Video Generation with Interactive Motion Controls | arXiv: 2511.01266
- Moving Beyond Medical Exams: A Clinician-Annotated Fairness Dataset of Real-World Tasks and Ambiguity in Mental Healthcare | arXiv: 2502.16051
- mR3: Multilingual Rubric-Agnostic Reward Reasoning Models | arXiv: 2510.01146
- MT-DAO: Multi-Timescale Distributed Adaptive Optimizers with Local Updates | arXiv: 2510.05361
- Multi-agent Coordination via Flow Matching | arXiv: 2511.05005
- Multi-Agent Design: Optimizing Agents with Better Prompts and Topologies | arXiv: 2502.02533
- Multi-Head Low-Rank Attention (MLRA) | arXiv: 2603.02188
- Multi-LLM Adaptive Conformal Inference for Reliable LLM Responses | arXiv: 2602.01285
- Multi-modal Data Spectrum: Multi-modal Datasets are Multi-dimensional | arXiv: 2509.23499
- Multi-View Encoders for Performance Prediction in LLM-Based Agentic Workflows | arXiv: 2505.19764
- Multilingual Routing in Mixture-of-Experts | arXiv: 2510.04694
- MultiMat: Multimodal Program Synthesis for Procedural Materials using Large Multimodal Models | arXiv: 2509.22151
- Multimodal Classification via Total Correlation Maximization | arXiv: 2602.13015
- Multimodal Dataset Distillation Made Simple by Prototype-Guided Data Synthesis | arXiv: 2602.19756
- Multimodal Prompt Optimization: Why Not Leverage Multiple Modalities for MLLMs | arXiv: 2510.09201
- MVAR: Visual Autoregressive Modeling with Scale and Spatial Markovian Conditioning | arXiv: 2505.12742
- MVCustom: Multi-View Customized Diffusion via Geometric Latent Rendering and Completion | arXiv: 2510.13702
- MVR: Multi-view Video Reward Shaping for Reinforcement Learning | arXiv: 2603.01694
- Narrow Finetuning Leaves Clearly Readable Traces in Activation Differences | arXiv: 2510.13900
- Native Reasoning Models: Training Language Models to Reason on Unverifiable Data | arXiv: 2602.11549
- Near-Optimal Online Deployment and Routing for Streaming LLMs | arXiv: 2506.17254
- Near-Optimal Second-Order Guarantees for Model-Based Adversarial Imitation Learning | arXiv: 2510.09487
- Nearly-Optimal Bandit Learning in Stackelberg Games with Side Information | arXiv: 2502.00204
- NeMo-map: Neural Implicit Flow Fields for Spatio-Temporal Motion Mapping | arXiv: 2510.14827
- Neon: Negative Extrapolation From Self-Training Improves Image Generation | arXiv: 2510.03597
- NerVE: Nonlinear Eigenspectrum Dynamics in LLM Feed-Forward Networks | arXiv: 2603.06922
- Neural Force Field: Few-shot Learning of Generalized Physical Reasoning | arXiv: 2502.08987
- Neural Networks Learn Generic Multi-Index Models Near Information-Theoretic Limit | arXiv: 2511.15120
- Neural Synchrony Between Socially Interacting Language Models | arXiv: 2602.17815
- NeuralOS: Towards Simulating Operating Systems via Neural Generative Models | arXiv: 2507.08800
- Neuro-Symbolic Decoding of Neural Activity | arXiv: 2603.03343
- Neurocircuitry-Inspired Hierarchical Graph Causal Attention Networks for Explainable Depression Identification | arXiv: 2511.17622
- NeuroGaze-Distill: Brain-informed Distillation and Depression-Inspired Geometric Priors for Robust Facial Emotion Recognition | arXiv: 2509.11916
- NewtonBench: Benchmarking Generalizable Scientific Law Discovery in LLM Agents | arXiv: 2510.07172
- Next Visual Granularity Generation | arXiv: 2508.12811
- NIMO: a Nonlinear Interpretable MOdel | arXiv: 2506.05059
- No Answer Needed: Predicting LLM Answer Accuracy from Question-Only Linear Probes | arXiv: 2509.10625
- No Caption, No Problem: Caption-Free Membership Inference via Model-Fitted Embeddings | arXiv: 2602.22689
- No Other Representation Component Is Needed: Diffusion Transformers Can Provide Representation Guidance by Themselves | arXiv: 2505.02831
- No Prompt Left Behind: Exploiting Zero-Variance Prompts in LLM Reinforcement Learning via Entropy-Guided Advantage Shaping | arXiv: 2509.21880
- Noise Stability of Transformer Models | arXiv: 2602.08287
- Noise-Aware Generalization: Robustness to In-Domain Noise and Out-of-Domain Generalization | arXiv: 2504.02996
- Noisy-Pair Robust Representation Alignment for Positive-Unlabeled Learning | arXiv: 2510.01278
- Non-Asymptotic Analysis of Efficiency in Conformalized Regression | arXiv: 2510.07093
- Non-Clashing Teaching in Graphs: Algorithms, Complexity, and Bounds | arXiv: 2602.00657
- Non-Collaborative User Simulators for Tool Agents | arXiv: 2509.23124
- Nonparametric Teaching of Attention Learners | arXiv: 2602.20461
- NOVA3R: Non-pixel-aligned Visual Transformer for Amodal 3D Reconstruction | arXiv: 2603.04179
- NRGPT: An Energy-based Alternative for GPT | arXiv: 2512.16762
- Nudging the Boundaries of LLM Reasoning | arXiv: 2509.25666
- Null-Space Filtering for Data-Free Continual Model Merging: Preserving Stability, Promoting Plasticity | arXiv: 2509.21413
- Obscure but Effective: Classical Chinese Jailbreak Prompt Optimization via Bio-Inspired Search | arXiv: 2602.22983
- ODESteer: A Unified ODE-Based Steering Framework for LLM Alignment | arXiv: 2602.17560
- Offline Reinforcement Learning with Generative Trajectory Policies | arXiv: 2510.11499
- OFMU: Optimization-Driven Framework for Machine Unlearning | arXiv: 2509.22483
- Omni-iEEG: A Large-Scale, Comprehensive iEEG Dataset and Benchmark for Epilepsy Research | arXiv: 2602.16072
- Omni-View: Unlocking How Generation Facilitates Understanding in Unified 3D Model based on Multiview images | arXiv: 2511.07222
- OmniEVA: Embodied Versatile Planner via Task-Adaptive 3D-Grounded and Embodiment-aware Reasoning | arXiv: 2509.09332
- OmniSpatial: Towards Comprehensive Spatial Reasoning Benchmark for Vision Language Models | arXiv: 2506.03135
- On Discovering Algorithms for Adversarial Imitation Learning | arXiv: 2510.00922
- On Entropy Control in LLM-RL Algorithms | arXiv: 2509.03493
- On the \(O(1/T)\) Convergence of Alternating Gradient Descent-Ascent in Bilinear Games | arXiv: 2510.03855
- On the Design of KL-Regularized Policy Gradient Algorithms for LLM Reasoning | arXiv: 2505.17508
- On the Eligibility of LLMs for Counterfactual Reasoning: A Decompositional Study | arXiv: 2505.11839
- On the Expressive Power of GNNs for Boolean Satisfiability | arXiv: 2602.08745
- On The Fragility of Benchmark Contamination Detection in Reasoning Models | arXiv: 2510.02386
- On the Generalization Capacities of MLLMs for Spatial Intelligence | arXiv: 2603.06704
- On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification | arXiv: 2508.05629
- On the Impact of the Utility in Semivalue-based Data Valuation | arXiv: 2502.06574
- On the Lipschitz Continuity of Set Aggregation Functions and Neural Networks for Sets | arXiv: 2505.24403
- On the Wings of Imagination: Conflicting Script-based Multi-role Framework for Humor Caption Generation | arXiv: 2602.06423
- One Demo Is All It Takes: Planning Domain Derivation with LLMs from A Single Demonstration | arXiv: 2505.18382
- One Language, Two Scripts: Probing Script-Invariance in LLM Concept Representations | arXiv: 2603.08869
- One Model for All Tasks: Leveraging Efficient World Models in Multi-Task Planning | arXiv: 2509.07945
- One Operator to Rule Them All? On Boundary-Indexed Operator Families in Neural PDE Solvers | arXiv: 2603.01406
- One-Prompt Strikes Back: Sparse Mixture of Experts for Prompt-based Continual Learning | arXiv: 2509.24483
- One2Scene: Geometric Consistent Explorable 3D Scene Generation from a Single Image | arXiv: 2602.19766
- Online Minimization of Polarization and Disagreement via Low-Rank Matrix Bandits | arXiv: 2510.00803
- Online Prediction of Stochastic Sequences with High Probability Regret Bounds | arXiv: 2602.16236
- Online time series prediction using feature adjustment | arXiv: 2509.03810
- OpenAgentSafety: A Comprehensive Framework for Evaluating Real-World AI Agent Safety | arXiv: 2507.06134
- Openfly: A comprehensive platform for aerial vision-language navigation | arXiv: 2502.18041
- Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks | arXiv: 2508.18672
- Optimal Transport-Induced Samples against Out-of-Distribution Overconfidence | arXiv: 2601.21320
- Optimas: Optimizing Compound AI Systems with Globally Aligned Local Rewards | arXiv: 2507.03041
- Optimistic Task Inference for Behavior Foundation Models | arXiv: 2510.20264
- Optimizer choice matters for the emergence of Neural Collapse | arXiv: 2602.16642
- Out of the Shadows: Exploring a Latent Space for Neural Network Verification | arXiv: 2505.17854
- Oversmoothing, Oversquashing, Heterophily, Long-Range, and more: Demystifying Common Beliefs in Graph Machine Learning | arXiv: 2505.15547
- Overthinking Reduction with Decoupled Rewards and Curriculum Data Scheduling | arXiv: 2509.25827
- OWLEYE: Zero-Shot Learner for Cross-Domain Graph Data Anomaly Detection | arXiv: 2601.19102
- P-GenRM: Personalized Generative Reward Model with Test-time User-based Scaling | arXiv: 2602.12116
- PaAno: Patch-Based Representation Learning for Time-Series Anomaly Detection | arXiv: 2602.01359
- PACE: Pretrained Audio Continual Learning | arXiv: 2602.03355
- Pairwise is Not Enough: Hypergraph Neural Networks for Multi-Agent Pathfinding | arXiv: 2602.06733
- Paper Copilot: Tracking the Evolution of Peer Review in AI Conferences | arXiv: 2510.13201
- Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning | arXiv: 2504.17192
- Parallel Token Prediction for Language Models | arXiv: 2512.21323
- ParaS2S: Benchmarking and Aligning Spoken Language Models for Paralinguistic-aware Speech-to-Speech Interaction | arXiv: 2511.08723
- Pareto-Conditioned Diffusion Models for Offline Multi-Objective Optimization | arXiv: 2602.00737
- ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference | arXiv: 2511.10645
- Partially Equivariant Reinforcement Learning in Symmetry-Breaking Environments | arXiv: 2512.00915
- PartSAM: A Scalable Promptable Part Segmentation Model Trained on Native 3D Data | arXiv: 2509.21965
- PASER: Post-Training Data Selection for Efficient Pruned Large Language Model Recovery | arXiv: 2502.12594
- Pay Attention to CTC: Fast and Robust Pseudo-Labelling for Unified Speech Recognition | arXiv: 2602.19316
- PCPO: Proportionate Credit Policy Optimization for Aligning Image Generation Models | arXiv: 2509.25774
- PD²GS: Part-Level Decoupling and Continuous Deformation of Articulated Objects via Gaussian Splatting | arXiv: 2506.09663
- Pedagogically-Inspired Data Synthesis for Language Model Knowledge Distillation | arXiv: 2602.12172
- Peering into the Unknown: Active View Selection with Neural Uncertainty Maps for 3D Reconstruction | arXiv: 2506.14856
- PerfGuard: A Performance-Aware Agent for Visual Content Generation | arXiv: 2601.22571
- PERSONA: Dynamic and Compositional Inference-Time Personality Control via Activation Vector Algebra | arXiv: 2602.15669
- Personalized Collaborative Learning with Affinity-Based Variance Reduction | arXiv: 2510.16232
- PersonaX: Multimodal Datasets with LLM-Inferred Behavior Traits | arXiv: 2509.11362
- Perturbation-Induced Linearization: Constructing Unlearnable Data with Solely Linear Classifiers | arXiv: 2601.19967
- PhyScensis: Physics-Augmented LLM Agents for Complex Physical Scene Arrangement | arXiv: 2602.14968
- pi-Flow: Policy-Based Few-Step Generation via Imitation Distillation | arXiv: 2510.14974
- PI-Light: Physics-Inspired Diffusion for Full-Image Relighting | arXiv: 2601.22135
- PICS: Pairwise Image Compositing with Spatial Interactions | arXiv: 2603.06873
- Pinet: Optimizing hard-constrained neural networks with orthogonal projection layers | arXiv: 2508.10480
- Plan and Budget: Effective and Efficient Test-Time Scaling on Reasoning Large Language Models | arXiv: 2505.16122
- PLANETALIGN: A Comprehensive Python Library for Benchmarking Network Alignment | arXiv: 2505.21366
- PMark: Towards Robust and Distortion-free Semantic-level Watermarking with Channel Constraints | arXiv: 2509.21057
- PoliCon: Evaluating LLMs on Achieving Diverse Political Consensus Objectives | arXiv: 2505.19558
- Policy myopia as a mechanism of gradual disempowerment in Post-AGI governance, Circa 2049 | arXiv: 2603.03267
- PolicyFlow: Policy Optimization with Continuous Normalizing Flow in Reinforcement Learning | arXiv: 2602.01156
- PolyGraph Discrepancy: a classifier-based metric for graph generation | arXiv: 2510.06122
- Polynomial, trigonometric, and tropical activations | arXiv: 2502.01247
- PolySHAP: Extending KernelSHAP with Interaction-Informed Polynomial Regression | arXiv: 2601.18608
- PonderLM: Pretraining Language Models to Ponder in Continuous Space | arXiv: 2505.20674
- PoSh: Using Scene Graphs To Guide LLMs-as-a-Judge For Detailed Image Descriptions | arXiv: 2510.19060
- Post-hoc Probabilistic Vision-Language Models | arXiv: 2412.06014
- Post-training Large Language Models for Diverse High-Quality Responses | arXiv: 2509.04784
- PPE: Positional Preservation Embedding for Token Compression in Multimodal Large Language Models | arXiv: 2510.22936
- Pre-training LLM without Learning Rate Decay Enhances Supervised Fine-Tuning | arXiv: 2603.16127
- PreciseCache: Precise Feature Caching for Efficient and High-fidelity Video Generation | arXiv: 2603.00976
- Predicting kernel regression learning curves from only raw data statistics | arXiv: 2510.14878
- Predicting LLM Reasoning Performance with Small Proxy Model | arXiv: 2509.21013
- Predicting Training Re-evaluation Curves Enables Effective Data Curriculums for LLMs | arXiv: 2509.25380
- Preference Leakage: A Contamination Problem in LLM-as-a-judge | arXiv: 2502.01534
- PreferThinker: Reasoning-based Personalized Image Preference Assessment | arXiv: 2511.00609
- Principled Fast and Meta Knowledge Learners for Continual Reinforcement Learning | arXiv: 2603.00903
- Prior-based Noisy Text Data Filtering: Fast and Strong Alternative For Perplexity | arXiv: 2509.18577
- PrismAudio: Decomposed Chain-of-Thoughts and Multi-dimensional Rewards for Video-to-Audio Generation | arXiv: 2511.18833
- PRISMM-Bench: A Benchmark of Peer-Review Grounded Multimodal Inconsistencies | arXiv: 2510.16505
- Probabilistic Kernel Function for Fast Angle Testing | arXiv: 2505.20274
- Procedural Mistake Detection via Action Effect Modeling | arXiv: 2512.03474
- Prompt and Parameter Co-Optimization for Large Language Models | arXiv: 2509.24245
- Propaganda AI: An Analysis of Semantic Divergence in Large Language Models | arXiv: 2504.12344
- ProPerSim: Developing Proactive and Personalized AI Assistants through User-Assistant Simulation | arXiv: 2509.21730
- Protein as a Second Language for LLMs | arXiv: 2510.11188
- Protein Counterfactuals via Diffusion-Guided Latent Optimization | arXiv: 2603.10811
- Protein Structure Tokenization via Geometric Byte Pair Encoding | arXiv: 2511.11758
- ProtoTS: Learning Hierarchical Prototypes for Explainable Time Series Forecasting | arXiv: 2509.23159
- Provable and Practical In-Context Policy Optimization for Self-Improvement | arXiv: 2603.01335
- Provably Explaining Neural Additive Models | arXiv: 2602.17530
- Pruning as a Cooperative Game: Surrogate-Assisted Layer Contribution Estimation for Large Language Models | arXiv: 2602.07804
- Pseudo-Nonlinear Data Augmentation: A Constrained Energy Minimization Viewpoint | arXiv: 2410.00718
- PT\(^2\)-LLM: Post-Training Ternarization for Large Language Models | arXiv: 2510.03267
- PTQ4ARVG: Post-Training Quantization for AutoRegressive Visual Generation Models | arXiv: 2601.21238
- PURGE: Reinforcement Unlearning via Group Relative Policy Optimization | arXiv: 2601.20568
- Purifying Generative LLMs from Backdoors without Prior Knowledge or Clean Reference | arXiv: 2603.13461
- Purrception: Variational Flow Matching for Vector-Quantized Image Generation | arXiv: 2510.01478
- Pyramidal Patchification Flow for Visual Generation | arXiv: 2506.23543
- pySpatial: Generating 3D Visual Programs for Zero-Shot Spatial Reasoning | arXiv: 2603.00905
- Q-FSRU: Quantum-Augmented Frequency-Spectral For Medical Visual Question Answering | arXiv: 2509.23899
- Q-RAG: Long Context Multi-Step Retrieval via Value-Based Embedder Training | arXiv: 2511.07328
- QKV Projections Require a Fraction of Their Memory | arXiv: 2506.02939
- QuadGPT: Native Quadrilateral Mesh Generation with Autoregressive Models | arXiv: 2509.21420
- QuaMo: Quaternion Motions for Vision-based 3D Human Kinematics Capture | arXiv: 2601.19580
- Quantized Visual Geometry Grounded Transformer | arXiv: 2509.21302
- QuantSparse: Comprehensively Compressing Video Diffusion Transformer with Model Quantization and Attention Sparsification | arXiv: 2509.23681
- Query-Guided Spatial-Temporal-Frequency Interaction for Music Audio-Visual Question Answering | arXiv: 2601.19821
- Query-Level Uncertainty in Large Language Models | arXiv: 2506.09669
- QuRL: Efficient Reinforcement Learning with Quantized Rollout | arXiv: 2602.13953
- QVGen: Pushing the Limit of Quantized Video Generative Models | arXiv: 2505.11497
- RACE Attention: A Strictly Linear-Time Attention for Long-Sequence Training | arXiv: 2510.04008
- RADAR: Reasoning-Ability and Difficulty-Aware Routing for Reasoning LLMs | arXiv: 2509.25426
- Radiometrically Consistent Gaussian Surfels for Inverse Rendering | arXiv: 2603.01491
- RAE: A Neural Network Dimensionality Reduction Method for Nearest Neighbors Preservation in Vector Search | arXiv: 2509.25839
- RAEE: A Robust Retrieval-Augmented Early Exit Framework for Efficient Inference | arXiv: 2405.15198
- RAIN-Merging: A Gradient-Free Method to Enhance Instruction Following in Large Reasoning Models with Preserved Thinking Format | arXiv: 2602.22538
- Randomization Boosts KV Caching, Learning Balances Query Load: A Joint Perspective | arXiv: 2601.18999
- RankLLM: Weighted Ranking of LLMs by Quantifying Question Difficulty | arXiv: 2602.12424
- Rapid training of Hamiltonian graph networks using random features | arXiv: 2506.06558
- RAS: Retrieval-And-Structuring for Knowledge-Intensive LLM Generation | arXiv: 2502.10996
- Rating Quality of Diverse Time Series Data by Meta-learning from LLM Judgment | arXiv: 2506.01290
- RAVENEA: A Benchmark for Multimodal Retrieval-Augmented Visual Culture Understanding | arXiv: 2505.14462
- REA-RL: Reflection-Aware Online Reinforcement Learning for Efficient Reasoning | arXiv: 2505.19862
- Real-Time Robot Execution with Masked Action Chunking | arXiv: 2601.20130
- Reasoned Safety Alignment: Ensuring Jailbreak Defense via Answer-Then-Check | arXiv: 2509.11629
- Reasoning as Representation: Rethinking Visual Reinforcement Learning in Image Quality Assessment | arXiv: 2510.11369
- Reasoning Boosts Opinion Alignment in LLMs | arXiv: 2603.01214
- Reasoning on Time-Series for Financial Technical Analysis | arXiv: 2511.08616
- Reasoning or Retrieval? A Study of Answer Attribution on Large Reasoning Models | arXiv: 2509.24156
- Reasoning-Driven Multimodal LLM for Domain Generalization | arXiv: 2602.23777
- ReasoningBank: Scaling Agent Self-Evolving with Reasoning Memory | arXiv: 2509.25140
- RebuttalAgent: Strategic Persuasion in Academic Rebuttal via Theory of Mind | arXiv: 2601.15715
- RECON: Robust symmetry discovery via Explicit Canonical Orientation Normalization | arXiv: 2505.13289
- Rectified Decoupled Dataset Distillation: A Closer Look for Fair and Comprehensive Evaluation | arXiv: 2509.19743
- Redirection for Erasing Memory (REM): Towards a universal unlearning method for corrupted data | arXiv: 2505.17730
- RedSage: A Cybersecurity Generalist LLM | arXiv: 2601.22159
- RedTeamCUA: Realistic Adversarial Testing of Computer-Use Agents in Hybrid Web-OS Environments | arXiv: 2505.21936
- Reducing Belief Deviation in Reinforcement Learning for Active Reasoning | arXiv: 2510.12264
- Reducing Class-Wise Performance Disparity via Margin Regularization | arXiv: 2602.00205
- Ref-Adv: Exploring MLLM Visual Reasoning in Referring Expression Tasks | arXiv: 2602.23898
- RefAny3D: 3D Asset-Referenced Diffusion Models for Image Generation | arXiv: 2601.22094
- Reference-Guided Machine Unlearning | arXiv: 2603.11210
- References Improve LLM Alignment in Non-Verifiable Domains | arXiv: 2602.16802
- Referring Layer Decomposition | arXiv: 2602.19358
- Refine Now, Query Fast: A Decoupled Refinement Paradigm for Implicit Neural Fields | arXiv: 2602.15155
- ReFORM: Reflected Flows for On-support Offline RL via Noise Manipulation | arXiv: 2602.05051
- ReForm: Reflective Autoformalization with Prospective Bounded Sequence Optimization | arXiv: 2510.24592
- RefTool: Reference-Guided Tool Creation for Knowledge-Intensive Reasoning | arXiv: 2505.21413
- RegionReasoner: Region-Grounded Multi-Round Visual Reasoning | arXiv: 2602.03733
- Regret-Guided Search Control for Efficient Learning in AlphaZero | arXiv: 2602.20809
- Regularized Latent Dynamics Prediction is a Strong Baseline For Behavioral Foundation Models | arXiv: 2603.15857
- REI-Bench: Can Embodied Agents Understand Vague Human Instructions in Task Planning? | arXiv: 2505.10872
- ReIn: Conversational Error Recovery with Reasoning Inception | arXiv: 2602.17022
- Rejuvenating Cross-Entropy Loss in Knowledge Distillation for Recommender Systems | arXiv: 2509.20989
- Relational Feature Caching for Accelerating Diffusion Transformers | arXiv: 2602.19506
- Relational Graph Transformer | arXiv: 2505.10960
- Relational Transformer: Toward Zero-Shot Foundation Models for Relational Data | arXiv: 2510.06377
- Relatron: Automating Relational Machine Learning over Relational Databases | arXiv: 2602.22552
- REMem: Reasoning with Episodic Memory in Language Agent | arXiv: 2602.13530
- ReMix: Reinforcement routing for mixtures of LoRAs in LLM finetuning | arXiv: 2603.10160
- ResCP: Reservoir Conformal Prediction for Time Series Forecasting | arXiv: 2510.05060
- Resisting Contextual Interference in RAG via Parametric-Knowledge Reinforcement | arXiv: 2506.05154
- Resource-Adaptive Federated Text Generation with Differential Privacy | arXiv: 2603.07027
- Resp-Agent: An Agent-Based System for Multimodal Respiratory Sound Generation and Disease Diagnosis | arXiv: 2602.15909
- ResWorld: Temporal Residual World Model for End-to-End Autonomous Driving | arXiv: 2602.10884
- Retaining Suboptimal Actions to Follow Shifting Optima in Multi-Agent Reinforcement Learning | arXiv: 2602.17062
- Rethinking Benign Relearning: Syntax as the Hidden Driver of Unlearning Failures | arXiv: 2602.03379
- Rethinking Code Similarity for Automated Algorithm Design with LLMs | arXiv: 2603.02787
- Rethinking Consistent Multi-Label Classification Under Inexact Supervision | arXiv: 2510.04091
- Rethinking Continual Learning with Progressive Neural Collapse | arXiv: 2505.24254
- Rethinking Policy Diversity in Ensemble Policy Gradient in Large-Scale Reinforcement Learning | arXiv: 2603.01741
- Rethinking Uncertainty Estimation in LLMs: A Principled Single-Sequence Measure | arXiv: 2412.15176
- Retrieval-Augmented Generation for Predicting Cellular Responses to Gene Perturbation | arXiv: 2603.07233
- Revela: Dense Retriever Learning via Language Modeling | arXiv: 2506.16552
- Reverse Distillation: Consistently Scaling Protein Language Model Representations | arXiv: 2603.07710
- Revisit Visual Prompt Tuning: The Expressiveness of Prompt Experts | arXiv: 2501.18936
- Revisiting [CLS] and Patch Token Interaction in Vision Transformers | arXiv: 2602.08626
- Revisiting Matrix Sketching in Linear Bandits: Achieving Sublinear Regret via Dyadic Block Sketching | arXiv: 2410.10258
- Revisiting Node Affinity Prediction in Temporal Graphs | arXiv: 2510.06940
- Revisiting Sharpness-Aware Minimization: A More Faithful and Effective Implementation | arXiv: 2603.10048
- Revisiting the Past: Data Unlearning with Model State History | arXiv: 2506.20941
- Revisiting Weight Regularization for Low-Rank Continual Learning | arXiv: 2602.17559
- RewardMap: Tackling Sparse Rewards in Fine-grained Visual Reasoning via Multi-Stage Reinforcement Learning | arXiv: 2510.02240
- RF-MatID: Dataset and Benchmark for Radio Frequency Material Identification | arXiv: 2601.20377
- RFEval: Benchmarking Reasoning Faithfulness under Counterfactual Reasoning Intervention in Large Reasoning Models | arXiv: 2602.17053
- RIDER: 3D RNA Inverse Design with Reinforcement Learning-Guided Diffusion | arXiv: 2602.16548
- Risk-Sensitive Agent Compositions | arXiv: 2506.04632
- RLP: Reinforcement as a Pretraining Objective | arXiv: 2510.01265
- RM-R1: Reward Modeling as Reasoning | arXiv: 2505.02387
- RMFlow: Refined Mean Flow by a Noise-Injection Step for Multimodal Generation | arXiv: 2602.00849
- RNE: plug-and-play diffusion inference-time control and energy-based training | arXiv: 2506.05668
- RoboCasa365: A Large-Scale Simulation Framework for Training and Benchmarking Generalist Robots | arXiv: 2603.04356
- RoboInter: A Holistic Intermediate Representation Suite Towards Robotic Manipulation | arXiv: 2602.09973
- RoboPARA: Dual-Arm Robot Planning with Parallel Allocation and Recomposition Across Tasks | arXiv: 2506.06683
- Robust Deep Reinforcement Learning against Adversarial Behavior Manipulation | arXiv: 2406.03862
- Robust Multi-Objective Controlled Decoding of Large Language Models | arXiv: 2503.08796
- Robust Preference Alignment via Directional Neighborhood Consensus | arXiv: 2510.20498
- Robust Spiking Neural Networks Against Adversarial Attacks | arXiv: 2602.20548
- Rolling Ball Optimizer: Learning by ironing out loss landscape wrinkles | arXiv: 2505.19527
- ROMI: Model-based Offline RL via Robust Value-Aware Model Learning with Implicitly Differentiable Adaptive Weighting | arXiv: 2603.08118
- Rote Learning Considered Useful: Generalizing over Memorized Data in LLMs | arXiv: 2507.21914
- Routing Channel-Patch Dependencies in Time Series Forecasting with Graph Spectral Decomposition | arXiv: 2603.13702
- Routing Matters in MoE: Scaling Diffusion Transformers with Explicit Routing Guidance | arXiv: 2510.24711
- Routing, Cascades, and User Choice for LLMs | arXiv: 2602.09902
- RRNCO: Towards Real-World Routing with Neural Combinatorial Optimization | arXiv: 2503.16159
- RS-ORT: A Reduced-Space Branch-and-Bound Algorithm for Optimal Regression Trees | arXiv: 2510.23901
- RuleReasoner: Reinforced Rule-based Reasoning via Domain-aware Dynamic Sampling | arXiv: 2506.08672
- S2R-HDR: A Large-Scale Rendered Dataset for HDR Fusion | arXiv: 2504.07667
- SABRE-FL: Selective and Accurate Backdoor Rejection for Federated Prompt Learning | arXiv: 2506.22506
- Saddle-to-Saddle Dynamics Explains A Simplicity Bias Across Neural Network Architectures | arXiv: 2512.20607
- Safe Continuous-time Multi-Agent Reinforcement Learning via Epigraph Form | arXiv: 2602.17078
- SafeDPO: A Simple Approach to Direct Preference Optimization with Enhanced Safety | arXiv: 2505.20065
- SafeFlowMatcher: Safe and Fast Planning using Flow Matching with Control Barrier Functions | arXiv: 2509.24243
- Safety Subspaces are Not Linearly Distinct: A Fine-Tuning Case Study | arXiv: 2505.14185
- SAGE: Spatial-visual Adaptive Graph Exploration for Efficient Visual Place Recognition | arXiv: 2509.25723
- SALVE: Sparse Autoencoder-Latent Vector Editing for Mechanistic Control of Neural Networks | arXiv: 2512.15938
- Same Content, Different Representations: A Controlled Study for Table QA | arXiv: 2509.22983
- Sample-efficient and Scalable Exploration in Continuous-Time RL | arXiv: 2510.24482
- Sample-Efficient Distributionally Robust Multi-Agent Reinforcement Learning via Online Interaction | arXiv: 2508.02948
- Sample-efficient evidence estimation of score based priors for model selection | arXiv: 2602.20549
- SASFT: Sparse Autoencoder-guided Supervised Finetuning to Mitigate Unexpected Code-Switching in LLMs | arXiv: 2507.14894
- Scaf-GRPO: Scaffolded Group Relative Policy Optimization for Enhancing LLM Reasoning | arXiv: 2510.19807
- Scalable Exploration for High-Dimensional Continuous Control via Value-Guided Flow | arXiv: 2601.19707
- Scalable In-Context Q-Learning | arXiv: 2506.01299
- Scalable Multi-Task Low-Rank Model Adaptation | arXiv: 2603.01526
- Scalable Multilingual Multimodal Machine Translation with Speech-Text Fusion | arXiv: 2602.21646
- Scalable Random Wavelet Features: Efficient Non-Stationary Kernel Approximation with Convergence Guarantees | arXiv: 2602.00987
- Scalable Spatio-Temporal SE(3) Diffusion for Long-Horizon Protein Dynamics | arXiv: 2602.02128
- Scaling Generalist Data-Analytic Agents | arXiv: 2509.25084
- Scaling Laws of SignSGD in Linear Regression: When Does It Outperform SGD? | arXiv: 2603.02069
- Scaling Reasoning Hop Exposes Weaknesses: Demystifying and Improving Hop Generalization in Large Language Models | arXiv: 2601.21214
- Scaling Sequence-to-Sequence Generative Neural Rendering | arXiv: 2510.04236
- Scaling Speech Tokenizers with Diffusion Autoencoders | arXiv: 2602.06602
- Scaling with Collapse: Efficient and Predictable Training of LLM Families | arXiv: 2509.25087
- scDFM: Distributional Flow Matching Model for Robust Single-Cell Perturbation Prediction | arXiv: 2602.07103
- SceneCOT: Eliciting Grounded Chain-of-Thought Reasoning in 3D Scenes | arXiv: 2510.16714
- SceneTransporter: Optimal Transport-Guided Compositional Latent Diffusion for Single-Image Structured 3D Scene Generation | arXiv: 2602.22785
- SciTS: Scientific Time Series Understanding and Generation with LLMs | arXiv: 2510.03255
- SCRAPL: Scattering Transform with Random Paths for Machine Learning | arXiv: 2602.11145
- SEAL: Segment Any Events with Language | arXiv: 2601.23159
- SealQA: Raising the Bar for Reasoning in Search-Augmented Language Models | arXiv: 2506.01062
- Search Arena: Analyzing Search-Augmented LLMs | arXiv: 2506.05334
- SecP-Tuning: Efficient Privacy-Preserving Prompt Tuning for Large Language Models via MPC | arXiv: 2506.15307
- SEED-SET: Scalable Evolving Experimental Design for System-level Ethical Testing | arXiv: 2603.01630
- SEED: Towards More Accurate Semantic Evaluation for Visual Brain Decoding | arXiv: 2503.06437
- SeeDNorm: Self-Rescaled Dynamic Normalization | arXiv: 2510.22777
- Seeing Across Views: Benchmarking Spatial Reasoning of Vision-Language Models in Robotic Scenes | arXiv: 2510.19400
- Seek-CAD: A Self-refined Generative Modeling for 3D Parametric CAD Using Local Inference via DeepSeek | arXiv: 2505.17702
- Segment-Level Attribution for Selective Learning of Long Reasoning Traces | arXiv: 2602.00425
- Self-Aug: Query and Entropy Adaptive Decoding for Large Vision-Language Models | arXiv: 2510.13315
- Self-Destructive Language Model | arXiv: 2505.12186
- Self-Evolving Vision-Language Models for Image Quality Assessment via Voting and Ranking | arXiv: 2509.25787
- Self-Harmony: Learning to Harmonize Self-Supervision and Self-Play in Test-Time Reinforcement Learning | arXiv: 2511.01191
- Self-Improving Loops for Visual Robotic Planning | arXiv: 2506.06658
- Self-Improving Skill Learning for Robust Skill-based Meta-Reinforcement Learning | arXiv: 2502.03752
- Self-Supervised Learning from Structural Invariance | arXiv: 2602.02381
- SelfReflect: Can LLMs Communicate Their Internal Answer Distribution? | arXiv: 2505.20295
- SEMA: Simple yet Effective Learning for Multi-Turn Jailbreak Attacks | arXiv: 2602.06854
- Semantic Parallelism: Redefining Efficient MoE Inference via Model-Data Co-Scheduling | arXiv: 2503.04398
- Semantic Regexes: Auto-Interpreting LLM Features with a Structured Language | arXiv: 2510.06378
- Semantic-aware Wasserstein Policy Regularization for Large Language Model Alignment | arXiv: 2602.01685
- SemHiTok: A Unified Image Tokenizer via Semantic-Guided Hierarchical Codebook for Multimodal Understanding and Generation | arXiv: 2503.06764
- SeMoBridge: Semantic Modality Bridge for Efficient Few-Shot Adaptation of CLIP | arXiv: 2509.26036
- SenseFlow: Scaling Distribution Matching for Flow-based Text-to-Image Distillation | arXiv: 2506.00523
- SERE: Similarity-based Expert Re-routing for Efficient Batch Decoding in MoE Models | arXiv: 2602.07616
- SERUM: Simple, Efficient, Robust, and Unifying Marking for Diffusion-based Image Generation | arXiv: 2603.13396
- SFT Doesn't Always Hurt General Capabilities: Revisiting Domain-Specific Fine-Tuning in LLMs | arXiv: 2509.20758
- Sharing State Between Prompts and Programs | arXiv: 2512.14805
- Sharp Monocular View Synthesis in Less Than a Second | arXiv: 2512.10685
- Sharpness-Aware Machine Unlearning | arXiv: 2506.13715
- SHE-LoRA: Selective Homomorphic Encryption for Federated Tuning with Heterogeneous LoRA | arXiv: 2505.21051
- SHIELD: Suppressing Hallucinations In LVLM Encoders via Bias and Vulnerability Defense | arXiv: 2510.16596
- ShieldedCode: Learning Robust Representations for Virtual Machine Protected Code | arXiv: 2601.20679
- Shoot First, Ask Questions Later? Building Rational Agents that Explore and Act Like People | arXiv: 2510.20886
- Shop-R1: Rewarding LLMs to Simulate Human Behavior in Online Shopping via Reinforcement Learning | arXiv: 2507.17842
- Shuffle-R1: Efficient RL Framework for Multimodal Large Language Models via Data-centric Dynamic Shuffle | arXiv: 2508.05612
- SIGMark: Scalable In-Generation Watermark with Blind Extraction for Video Diffusion | arXiv: 2603.02882
- SiMO: Single-Modality-Operable Multimodal Collaborative Perception | arXiv: 2603.08240
- SimpleToM: Exposing the Gap between Explicit ToM Inference and Implicit ToM Application in LLMs | arXiv: 2410.13648
- SimuHome: A Temporal- and Environment-Aware Benchmark for Smart Home LLM Agents | arXiv: 2509.24282
- SiNGER: A Clearer Voice Distills Vision Transformers Further | arXiv: 2509.20986
- Single Index Bandits: Generalized Linear Contextual Bandits with Unknown Reward Functions | arXiv: 2506.12751
- Skip to the Good Part: Representation Structure & Inference-Time Layer Skipping in Diffusion vs. Autoregressive LLMs | arXiv: 2603.07475
- Skirting Additive Error Barriers for Private Turnstile Streams | arXiv: 2602.10360
- Skywork-Reward-V2: Scaling Preference Data Curation via Human-AI Synergy | arXiv: 2507.01352
- Slow-Fast Policy Optimization: Reposition-Before-Update for LLM Reasoning | arXiv: 2510.04072
- Small Drafts, Big Verdict: Information-Intensive Visual Reasoning via Speculation | arXiv: 2510.20812
- SMART-R1: Advancing Multi-agent Traffic Simulation via R1-Style Reinforcement Fine-Tuning | arXiv: 2509.23993
- SMOTE and Mirrors: Exposing Privacy Leakage from Synthetic Minority Oversampling | arXiv: 2510.15083
- SNAP-UQ: Self-supervised Next-Activation Prediction for Single-Pass Uncertainty in TinyML | arXiv: 2508.12907
- SocialHarmBench: Revealing LLM Vulnerabilities to Socially Harmful Requests | arXiv: 2510.04891
- SoFlow: Solution Flow Models for One-Step Generative Modeling | arXiv: 2512.15657
- Soft Equivariance Regularization for Invariant Self-Supervised Learning | arXiv: 2603.06693
- Soft Quality-Diversity Optimization | arXiv: 2512.00810
- Solving Football by Exploiting Equilibrium Structure of 2p0s Differential Games with One-Sided Information | arXiv: 2502.00560
- Solving Parameter-Robust Avoid Problems with Unknown Feasibility using Reinforcement Learning | arXiv: 2602.15817
- Solving the Granularity Mismatch: Hierarchical Preference Learning for Long-Horizon LLM Agents | arXiv: 2510.03253
- SongEcho: Towards Cover Song Generation via Instance-Adaptive Element-wise Linear Modulation | arXiv: 2602.19976
- SONIC: Spectral Oriented Neural Invariant Convolutions | arXiv: 2601.19884
- SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward | arXiv: 2505.17018
- SPACeR: Self-Play Anchoring with Centralized Reference Models | arXiv: 2510.18060
- Sparse Imagination for Efficient Visual World Model Planning | arXiv: 2506.01392
- Sparsity Forcing: Reinforcing Token Sparsity of MLLMs | arXiv: 2504.18579
- SPARTA: Scalable and Principled Benchmark of Tree-Structured Multi-hop QA over Text and Tables | arXiv: 2602.23286
- Spatial CAPTCHA: Generatively Benchmarking Spatial Reasoning for Human-Machine Differentiation | arXiv: 2510.03863
- Spatial Reasoning is Not a Free Lunch: A Controlled Study on LLaVA | arXiv: 2603.12545
- Spatial-DISE: A Unified Benchmark for Evaluating Spatial Reasoning in Vision-Language Models | arXiv: 2510.13394
- SpatiaLab: Can Vision-Language Models Perform Spatial Reasoning in the Wild? | arXiv: 2602.03916
- Specialization after Generalization: Towards Understanding Test-Time Training in Foundation Models | arXiv: 2509.24510
- Spectral Attention Steering for Prompt Highlighting | arXiv: 2603.01281
- Spectral Bellman Method: Unifying Representation and Exploration in RL | arXiv: 2507.13181
- Spectral Gaps and Spatial Priors: Studying Hyperspectral Downstream Adaptation Using TerraMind | arXiv: 2603.06690
- SpectralGCD: Spectral Concept Selection and Cross-modal Representation Learning for Generalized Category Discovery | arXiv: 2602.17395
- Spectrum Tuning: Post-Training for Distributional Coverage and In-Context Steerability | arXiv: 2510.06084
- Speculative Actions: A Lossless Framework for Faster AI Agents
- SPEED: Scalable, Precise, and Efficient Concept Erasure for Diffusion Models | arXiv: 2503.07392
- SPELL: Self-Play Reinforcement Learning for Evolving Long-Context Language Models | arXiv: 2509.23863
- SpinBench: Perspective and Rotation as a Lens on Spatial Reasoning in VLMs | arXiv: 2509.25390
- SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning | arXiv: 2506.24119
- Splat and Distill: Augmenting Teachers with Feed-Forward 3D Reconstruction For 3D-Aware Distillation | arXiv: 2602.06032
- Splat Feature Solver | arXiv: 2508.12216
- Spotlight on Token Perception for Multimodal Reinforcement Learning | arXiv: 2510.09285
- SPWOOD: Sparse Partial Weakly-Supervised Oriented Object Detection | arXiv: 2602.03634
- SR-Scientist: Scientific Equation Discovery With Agentic AI | arXiv: 2510.11661
- SSCP: Flow-Based Single-Step Completion for Efficient and Expressive Policy Learning | arXiv: 2506.21427
- SSG: Scaled Spatial Guidance for Multi-Scale Visual Autoregressive Generation | arXiv: 2602.05534
- ST-WebAgentBench: A Benchmark for Evaluating Safety and Trustworthiness in Web Agents | arXiv: 2410.06703
- ST4VLA: Spatially Guided Training for Vision-Language-Action Models | arXiv: 2602.10109
- Stackelberg Coupling of Online Representation Learning and Reinforcement Learning | arXiv: 2508.07452
- STAR: Similarity-guided Teacher-Assisted Refinement for Super-Tiny Function Calling Models | arXiv: 2602.03022
- Station2Radar: query conditioned gaussian splatting for precipitation field | arXiv: 2603.00418
- Statistical Advantage of Softmax Attention: Insights from Single-Location Regression | arXiv: 2509.21936
- Statistical Guarantees for Offline Domain Randomization | arXiv: 2506.10133
- Steer Away From Mode Collisions: Improving Composition In Diffusion Models | arXiv: 2509.25940
- Steerable Adversarial Scenario Generation through Test-Time Preference Alignment (SAGE) | arXiv: 2509.20102
- Steering and Rectifying Latent Representation Manifolds in Frozen Multi-modal LLMs for Video Anomaly Detection | arXiv: 2602.24021
- Steering Language Models with Weight Arithmetic | arXiv: 2511.05408
- Steering MoE LLMs via Expert (De)Activation | arXiv: 2509.09660
- Step-Aware Residual-Guided Diffusion for EEG Spatial Super-Resolution | arXiv: 2510.19166
- STITCH: Simultaneous Thinking and Talking with Chunked Reasoning for Spoken Language Models | arXiv: 2507.15375
- Stochastic Self-Guidance for Training-Free Enhancement of Diffusion Models | arXiv: 2508.12880
- Stochastic Self-Organization in Multi-Agent Systems | arXiv: 2510.00685
- Stop Tracking Me! Proactive Defense Against Attribute Inference Attack in LLMs | arXiv: 2602.11528
- Stop Unnecessary Reflection: Training LRMs for Efficient Reasoning with Adaptive Reflection and Length Coordinated Penalty | arXiv: 2602.12113
- Stop Wasting Your Tokens: Towards Efficient Runtime Multi-Agent Systems | arXiv: 2510.26585
- Stopping Computation for Converged Tokens in Masked Diffusion-LM Decoding | arXiv: 2602.06412
- Streaming Autoregressive Video Generation via Diagonal Distillation | arXiv: 2603.09488
- StreamSplat: Towards Online Dynamic 3D Reconstruction from Uncalibrated Video Streams | arXiv: 2506.08862
- Stress-Testing Alignment Audits With Prompt-Level Strategic Deception | arXiv: 2602.08877
- Stretching Beyond the Obvious: A Gradient-Free Framework to Unveil the Hidden Landscape of Visual Invariance | arXiv: 2506.17040
- Strict Subgoal Execution: Reliable Long-Horizon Planning in Hierarchical Reinforcement Learning | arXiv: 2506.21039
- STRIDE: Subset-Free Functional Decomposition for XAI in Tabular Settings | arXiv: 2509.09070
- String Seed of Thought: Prompting LLMs for Distribution-Faithful and Diverse Generation | arXiv: 2510.21150
- Stroke3D: Lifting 2D strokes into rigged 3D model via latent diffusion models | arXiv: 2602.09713
- Structurally Human, Semantically Biased: Detecting LLM-Generated References with Embeddings and GNNs | arXiv: 2601.20704
- Stylos: Multi-View 3D Stylization with Single-Forward Gaussian Splatting | arXiv: 2509.26455
- Subliminal Signals in Preference Labels | arXiv: 2603.01204
- Sublinear Time Quantum Algorithm for Attention Approximation | arXiv: 2602.00874
- Summaries as Centroids for Interpretable and Scalable Text Clustering | arXiv: 2502.09667
- Superficial Safety Alignment Hypothesis | arXiv: 2410.10862
- Supervised Metric Regularization Through Alternating Optimization for Multi-Regime Physics-Informed Neural Networks | arXiv: 2602.09980
- Supervised Reinforcement Learning: From Expert Trajectories to Step-wise Reasoning | arXiv: 2510.25992
- SurfSplat: Conquering Feedforward 2D Gaussian Splatting with Surface Continuity Priors | arXiv: 2602.02000
- SurvHTE-Bench: A Benchmark for Heterogeneous Treatment Effect Estimation in Survival Analysis | arXiv: 2603.05483
- SUSD: Structured Unsupervised Skill Discovery through State Factorization | arXiv: 2602.01619
- Swap-guided Preference Learning for Personalized Reinforcement Learning from Human Feedback | arXiv: 2603.12595
- SwiftTS: A Swift Selection Framework for Time Series Pre-trained Models via Multi-task Meta-Learning | arXiv: 2510.23051
- SwingArena: Adversarial Programming Arena for Long-context GitHub Issue Solving | arXiv: 2505.23932
- SwiReasoning: Switch-Thinking in Latent and Explicit for Pareto-Superior Reasoning LLMs | arXiv: 2510.05069
- SynCoGen: Synthesizable 3D Molecule Generation via Joint Reaction and Coordinate Modeling | arXiv: 2507.11818
- SyncTrack: Rhythmic Stability and Synchronization in Multi-Track Music Generation | arXiv: 2603.01101
- Synthesising Counterfactual Explanations via Label-Conditional Gaussian Mixture Variational Autoencoders | arXiv: 2510.04855
- SynthWorlds: Controlled Parallel Worlds for Disentangling Reasoning and Knowledge in Language Models | arXiv: 2510.24427
- Sysformer: Safeguarding Frozen Large Language Models with Adaptive System Prompts | arXiv: 2506.15751
- t-SNE Exaggerates Clusters, Provably | arXiv: 2510.07746
- T1: One-to-One Channel-Head Binding for Multivariate Time-Series Imputation | arXiv: 2602.21043
- TableDART: Dynamic Adaptive Multi-Modal Routing for Table Understanding | arXiv: 2509.14671
- TabStruct: Measuring Structural Fidelity of Tabular Data | arXiv: 2509.11950
- Talk, Evaluate, Diagnose: User-aware Agent Evaluation with Automated Error Analysis | arXiv: 2603.15483
- Taming Momentum: Rethinking Optimizer States Through Low-Rank Approximation | arXiv: 2602.24283
- TAMMs: Change Understanding and Forecasting in Satellite Image Time Series with Temporal-Aware Multimodal Models | arXiv: 2506.18862
- Target-Aware Video Diffusion Models | arXiv: 2503.18950
- Task-free Adaptive Meta Black-box Optimization | arXiv: 2601.21475
- TASTE: Text-Aligned Speech Tokenization and Embedding for Spoken Language Modeling | arXiv: 2504.07053
- TAVAE: A VAE with Adaptable Priors Explains Contextual Modulation in the Visual Cortex | arXiv: 2602.11956
- Temperature as a Meta-Policy: Adaptive Temperature in LLM Reinforcement Learning | arXiv: 2602.11779
- Temporal Concept Dynamics in Diffusion Models via Prompt-Conditioned Interventions | arXiv: 2512.08486
- Temporal Slowness in Central Vision Drives Semantic Object Learning | arXiv: 2602.04462
- Temporal Sparse Autoencoders: Leveraging the Sequential Nature of Language for Interpretability | arXiv: 2511.05541
- Tensor learning with orthogonal, Lorentz, and symplectic symmetries | arXiv: 2406.01552
- Test-Time Efficient Pretrained Model Portfolios for Time Series Forecasting | arXiv: 2510.06419
- Test-Time Iterative Error Correction for Efficient Diffusion Models | arXiv: 2511.06250
- Test-Time Meta-Adaptation with Self-Synthesis | arXiv: 2603.03524
- Test-Time Mixture of World Models for Embodied Agents in Dynamic Environments | arXiv: 2601.22647
- Text-to-3D by Stitching a Multi-view Reconstruction Network to a Video Generator | arXiv: 2510.13454
- Textual Equilibrium Propagation for Deep Compound AI Systems | arXiv: 2601.21064
- The Affine Divergence: Aligning Activation Updates Beyond Normalisation | arXiv: 2512.22247
- The Controllability Trap: A Governance Framework for Military AI Agents | arXiv: 2603.03515
- The Counting Power of Transformers | arXiv: 2505.11199
- The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs | arXiv: 2507.11097
- The Expressive Limits of Diagonal SSMs for State-Tracking | arXiv: 2603.01959
- The First Impression Problem: Internal Bias Triggers Overthinking in Reasoning Models | arXiv: 2505.16448
- The Geometry of LLM Quantization: GPTQ as Babai's Nearest Plane Algorithm | arXiv: 2507.18553
- The Geometry of Reasoning: Flowing Logics in Representation Space | arXiv: 2510.09782
- The Hot Mess of AI: How Does Misalignment Scale With Model Intelligence and Task Complexity? | arXiv: 2601.23045
- The Illusion of Diminishing Returns: Measuring Long Horizon Execution in LLMs | arXiv: 2509.09677
- The Intricate Dance of Prompt Complexity, Quality, Diversity, and Consistency in T2I Models | arXiv: 2510.19557
- The Invisibility Hypothesis: Promises of AGI and the Future of the Global South | arXiv: 2603.01616
- The Lattice Geometry of Neural Network Quantization -- A Short Equivalence Proof of GPTQ and Babai's Algorithm | arXiv: 2508.01077
- The Lattice Representation Hypothesis of Large Language Models | arXiv: 2603.01227
- The Limits of Long-Context Reasoning in Automated Bug Fixing | arXiv: 2602.16069
- The Path of Least Resistance: Guiding LLM Reasoning Trajectories with Prefix Consensus | arXiv: 2601.21494
- The Price of Robustness: Stable Classifiers Need Overparameterization | arXiv: 2603.02806
- The Reasoning Trap -- Logical Reasoning as a Mechanistic Pathway to Situational Awareness | arXiv: 2603.09200
- The Sample Complexity of Online Reinforcement Learning: A Multi-model Perspective | arXiv: 2501.15910
- The Spacetime of Diffusion Models: An Information Geometry Perspective | arXiv: 2505.17517
- The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution | arXiv: 2510.25726
- The Unseen Frontier: Pushing the Limits of LLM Sparsity with Surrogate-Free ADMM | arXiv: 2510.01650
- Theory of Space: Can Foundation Models Construct Spatial Beliefs through Active Exploration? | arXiv: 2602.07055
- There and Back Again: On the relation between Noise and Image Inversions in Diffusion Models | arXiv: 2410.23530
- There Was Never a Bottleneck in Concept Bottleneck Models | arXiv: 2506.04877
- Thermodynamics of Reinforcement Learning Curricula | arXiv: 2603.12324
- Thicker and Quicker: A Jumbo Token for Fast Plain Vision Transformers | arXiv: 2502.15021
- Think-While-Generating: On-the-Fly Reasoning for Personalized Long-Form Generation | arXiv: 2512.06690
- Thinking in Latents: Adaptive Anchor Refinement for Implicit Reasoning in LLMs | arXiv: 2603.15051
- Thinking on the Fly: Test-Time Reasoning Enhancement via Latent Thought Policy Optimization | arXiv: 2510.04182
- ThinkOmni: Lifting Textual Reasoning to Omni-modal Scenarios via Guidance Decoding | arXiv: 2602.23306
- Thompson Sampling via Fine-Tuning of LLMs | arXiv: 2510.13328
- THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning | arXiv: 2509.13761
- Through the Lens of Contrast: Self-Improving Visual Reasoning in VLMs | arXiv: 2603.02556
- Time Is All It Takes: Spike-Retiming Attacks on Event-Driven Spiking Neural Networks | arXiv: 2602.03284
- TimeOmni-1: Incentivizing Complex Reasoning with Time Series in Large Language Models | arXiv: 2509.24803
- TimeSliver: Symbolic-Linear Decomposition for Explainable Time Series Classification | arXiv: 2601.21289
- TiTok: Transfer Token-level Knowledge via Contrastive Excess to Transplant LoRA | arXiv: 2510.04682
- Token Distillation: Attention-aware Input Embeddings For New Tokens | arXiv: 2505.20133
- Token Taxes: mitigating AGI's economic risks | arXiv: 2603.04555
- Token-Efficient Item Representation via Images for LLM Recommender Systems | arXiv: 2503.06238
- Token-Guard: Towards Token-Level Hallucination Control via Self-Checking Decoding | arXiv: 2601.21969
- Token-Importance Guided Direct Preference Optimization | arXiv: 2505.19653
- Token-level Data Selection for Safe LLM Fine-tuning | arXiv: 2603.01185
- Tokenizing Single-Channel EEG with Time-Frequency Motif Learning | arXiv: 2502.16060
- TokenSeek: Memory Efficient Fine Tuning via Instance-Aware Token Ditching | arXiv: 2601.19739
- TokMem: One-Token Procedural Memory for Large Language Models | arXiv: 2510.00444
- ToolTree: Efficient LLM Agent Tool Planning via Dual-Feedback Monte Carlo Tree Search and Bidirectional Pruning | arXiv: 2603.12740
- ToolWeaver: Weaving Collaborative Semantics for Scalable Tool Use in Large Language Models | arXiv: 2601.21947
- TopoBench: Benchmarking LLMs on Hard Topological Reasoning | arXiv: 2603.12133
- Topology and Geometry of the Learning Space of ReLU Networks: Connectivity and Singularities | arXiv: 2602.00693
- Topology-Preserved Auto-regressive Mesh Generation in the Manner of Weaving Silk | arXiv: 2507.02477
- ToProVAR: Efficient Visual Autoregressive Modeling via Tri-Dimensional Entropy-Aware Semantic Analysis and Sparsity Optimization | arXiv: 2602.22948
- Toward a Dynamic Stackelberg Game-Theoretic Framework for Agentic AI Defense Against LLM Jailbreaking | arXiv: 2507.08207
- Toward Complex-Valued Neural Networks for Waveform Generation | arXiv: 2603.11589
- Toward Enhancing Representation Learning in Federated Multi-Task Settings | arXiv: 2602.01626
- Toward Faithful Retrieval-Augmented Generation with Sparse Autoencoders | arXiv: 2512.08892
- Toward Safer Diffusion Language Models: Discovery and Mitigation of Priming Vulnerabilities | arXiv: 2510.00565
- Toward Universal and Transferable Jailbreak Attacks on Vision-Language Models | arXiv: 2602.01025
- Towards Anomaly-Aware Pre-Training and Fine-Tuning for Graph Anomaly Detection | arXiv: 2504.14250
- Towards Bridging the Gap between Large-Scale Pretraining and Efficient Finetuning for Humanoid Control | arXiv: 2601.21363
- Towards Efficient Constraint Handling in Neural Solvers for Routing Problems | arXiv: 2602.16012
- Towards Generalizable PDE Dynamics Forecasting via Physics-Guided Invariant Learning | arXiv: 2509.24332
- Towards Improved Sentence Representations using Token Graphs | arXiv: 2603.03389
- Towards Interpretable Visual Decoding with Attention to Brain Representations | arXiv: 2509.23566
- Towards Reliable Benchmarking: A Contamination Free, Controllable Evaluation Framework for Multi-step LLM Function Calling | arXiv: 2509.26553
- Towards Robust Real-World Multivariate Time Series Forecasting: A Unified Framework for Dependency, Asynchrony, and Missingness | arXiv: 2506.08660
- Towards Safe Reasoning in Large Reasoning Models via Corrective Intervention | arXiv: 2509.24393
- Towards Scalable Oversight via Partitioned Human Supervision | arXiv: 2510.22500
- Towards Strategic Persuasion with Language Models | arXiv: 2509.22989
- Towards Sustainable Investment Policies Informed by Opponent Shaping | arXiv: 2602.11829
- Towards Understanding Subliminal Learning: When and How Hidden Biases Transfer | arXiv: 2509.23886
- Towards Understanding Valuable Preference Data for Large Language Model Alignment | arXiv: 2510.13212
- TPRU: Advancing Temporal and Procedural Understanding in Large Multimodal Models | arXiv: 2602.18884
- TRACE: Your Diffusion Model is Secretly an Instance Edge Detector | arXiv: 2503.07982
- Traceable Black-box Watermarks for Federated Learning | arXiv: 2505.13651
- Traceable Evidence Enhanced Visual Grounded Reasoning: Evaluation and Methodology | arXiv: 2507.07999
- TRACED: Transition-aware Regret Approximation with Co-learnability for Environment Design | arXiv: 2506.19997
- Tracing and Reversing Edits in LLMs | arXiv: 2505.20819
- Tracing Pharmacological Knowledge In Large Language Models | arXiv: 2603.03407
- Train Once, Answer All: Many Pretraining Experiments for the Cost of One | arXiv: 2509.23383
- Training Deep Normalization-Free Spiking Neural Networks with Lateral Inhibition | arXiv: 2509.23253
- Training Large Language Models To Reason In Parallel With Global Forking Tokens | arXiv: 2510.05132
- Training Large Reasoning Models Efficiently via Progressive Thought Encoding | arXiv: 2602.16839
- Training-Free Reward-Guided Image Editing via Trajectory Optimal Control | arXiv: 2509.25845
- Transitive RL: Value Learning via Divide and Conquer | arXiv: 2510.22512
- Translate Policy to Language: Flow Matching Generated Rewards for LLM Explanations | arXiv: 2502.12530
- Trapped by simplicity: When Transformers fail to learn from noisy features | arXiv: 2602.08695
- Tree-based Dialogue Reinforced Policy Optimization for Red-Teaming Attacks | arXiv: 2510.02286
- TRINITY: An Evolved LLM Coordinator | arXiv: 2512.04695
- TripleSumm: Adaptive Triple-Modality Fusion for Video Summarization | arXiv: 2603.01169
- TROLL: Trust Regions improve Reinforcement Learning for Large Language Models | arXiv: 2510.03817
- Trust but Verify: Adaptive Conditioning for Reference-Based Diffusion Super-Resolution via Implicit Reference Correlation Modeling | arXiv: 2602.01864
- Truthfulness Despite Weak Supervision: Evaluating and Training LLMs Using Peer Prediction | arXiv: 2601.20299
- TSPulse: Tiny Pre-Trained Models with Disentangled Representations for Rapid Time-Series Analysis | arXiv: 2505.13033
- TTOM: Test-Time Optimization and Memorization for Compositional Video Generation | arXiv: 2510.07940
- TumorChain: Interleaved Multimodal Chain-of-Thought Reasoning for Traceable Clinical Tumor Analysis | arXiv: 2603.05867
- Tuning the burn-in phase in training recurrent neural networks improves their performance | arXiv: 2602.10911
- TurboBoA: Faster and Exact Attention-aware Quantization without Backpropagation | arXiv: 2602.04929
- TwinFlow: Realizing One-step Generation on Large Models with Self-adversarial Flows | arXiv: 2512.05150
- TwinVLA: Data-Efficient Bimanual Manipulation with Twin Single-Arm Vision-Language-Action Models | arXiv: 2511.05275
- U-MARVEL: Unveiling Key Factors for Universal Multimodal Retrieval via Embedding Learning with MLLMs | arXiv: 2507.14902
- UFO-4D: Unposed Feedforward 4D Reconstruction from Two Images | arXiv: 2602.24290
- UIS-Digger: Towards Comprehensive Research Agent Systems for Real-world Unindexed Information Seeking | arXiv: 2603.08117
- Ultra-Fast Language Generation via Discrete Diffusion Divergence Instruct | arXiv: 2509.25035
- UME-R1: Exploring Reasoning-Driven Generative Multimodal Embeddings | arXiv: 2511.00405
- Uncertainty Matters in Dynamic Gaussian Splatting for Monocular 4D Reconstruction | arXiv: 2510.12768
- Uncovering Grounding IDs: How External Cues Shape Multimodal Binding | arXiv: 2509.24072
- Understanding and Improving Hyperbolic Deep Reinforcement Learning | arXiv: 2512.14202
- Understanding and Improving Length Generalization in Hierarchical Sparse Attention Models | arXiv: 2510.17196
- Understanding and Improving Shampoo and SOAP via Kullback-Leibler Minimization | arXiv: 2509.03378
- Understanding Dataset Distillation via Spectral Filtering | arXiv: 2503.01212
- Understanding Language Prior of LVLMs by Contrasting Chain-of-Embedding | arXiv: 2509.23050
- Understanding Sensitivity of Differential Attention through the Lens of Adversarial Robustness | arXiv: 2510.00517
- Understanding the Emergence of Seemingly Useless Features in Next-Token Predictors | arXiv: 2603.14087
- Understanding the Role of Training Data in Test-Time Scaling | arXiv: 2510.03605
- Uni-cot: Towards Unified Chain-of-Thought Reasoning Across Text and Vision | arXiv: 2508.05606
- Uni-DPO: A Unified Paradigm for Dynamic Preference Optimization of LLMs | arXiv: 2506.10054
- Uni-NTFM: A Unified Foundation Model for EEG Signal Representation Learning | arXiv: 2509.24222
- Uni-X: Mitigating Modality Conflict with a Two-End-Separated Architecture for Unified Multimodal Models | arXiv: 2509.24365
- Unified Biomolecular Trajectory Generation via Pretrained Variational Bridge | arXiv: 2602.07588
- Unified Multi-Modal Interactive & Reactive 3D Motion Generation via Rectified Flow | arXiv: 2509.24099
- Unified Privacy Guarantees for Decentralized Learning via Matrix Factorization | arXiv: 2510.17480
- Unified Vision-Language Modeling via Concept Space Alignment | arXiv: 2603.01096
- UniFlow: A Unified Pixel Flow Tokenizer for Visual Understanding and Generation | arXiv: 2510.10575
- Unifying Formal Explanations: A Complexity-Theoretic Perspective | arXiv: 2602.18160
- Unifying Stable Optimization and Reference Regularization in RLHF | arXiv: 2602.11523
- UniHM: Unified Dexterous Hand Manipulation with Vision Language Model | arXiv: 2603.00732
- Universal Beta Splatting | arXiv: 2510.03312
- Universal Multi-Domain Translation via Diffusion Routers | arXiv: 2510.03252
- Universal Properties of Activation Sparsity in Modern Large Language Models | arXiv: 2509.00454
- Universe Routing: Why Self-Evolving Agents Need Epistemic Control | arXiv: 2603.14799
- Unlearning Evaluation through Subset Statistical Independence | arXiv: 2603.00587
- Unlocking the Value of Text: Event-Driven Reasoning and Multi-Level Alignment for Time Series Forecasting | arXiv: 2603.15452
- Unmasking Backdoors: An Explainable Defense via Gradient-Attention Anomaly Scoring for Pre-trained Language Models | arXiv: 2510.04347
- Unpacking Human Preference for LLMs: Demographically Aware Evaluation with the HUMAINE Framework | arXiv: 2603.04409
- Unsupervised Conformal Inference: Bootstrapping and Alignment to Control LLM Uncertainty | arXiv: 2509.23002
- Unsupervised Evaluation of Multi-Turn Objective-Driven Interactions | arXiv: 2511.03047
- Unsupervised Learning of Efficient Exploration: Pre-training Adaptive Policies via Self-Imposed Goals | arXiv: 2601.19810
- Unveiling Super Experts in Mixture-of-Experts Large Language Models | arXiv: 2507.23279
- Unveiling the Cognitive Compass: Theory-of-Mind-Guided Multimodal Emotion Reasoning | arXiv: 2602.00971
- UrbanGS: A Scalable and Efficient Architecture for Geometrically Accurate Large-Scene Reconstruction | arXiv: 2602.02089
- UrbanVerse: Scaling Urban Simulation by Watching City-Tour Videos | arXiv: 2510.15018
- Validating Interpretability in siRNA Efficacy Prediction: A Perturbation-Based, Dataset-Aware Protocol | arXiv: 2602.10152
- Value Flows | arXiv: 2510.07650
- vCache: Verified Semantic Prompt Caching | arXiv: 2502.03771
- VCWorld: A Biological World Model for Virtual Cell Simulation | arXiv: 2512.00306
- Verification of the Implicit World Model in a Generative Model via Adversarial Sequences | arXiv: 2602.05903
- Verifier-Constrained Flow Expansion for Discovery Beyond the Data | arXiv: 2602.15984
- VerifyBench: Benchmarking Reference-based Reward Systems for Large Language Models | arXiv: 2505.15801
- Verifying Chain-of-Thought Reasoning via Its Computational Graph | arXiv: 2510.09312
- Veritas: Generalizable Deepfake Detection via Pattern-Aware Reasoning | arXiv: 2508.21048
- VeriTrail: Closed-Domain Hallucination Detection with Traceability | arXiv: 2505.21786
- VFScale: Intrinsic Reasoning through Verifier-Free Test-time Scalable Diffusion Model | arXiv: 2502.01989
- Video-KTR: Reinforcing Video Reasoning via Key Token Attribution | arXiv: 2601.19686
- VideoMind: A Chain-of-LoRA Agent for Temporal-Grounded Video Reasoning | arXiv: 2503.13444
- VideoNSA: Native Sparse Attention Scales Video Understanding | arXiv: 2510.02295
- VidGuard-R1: AI-Generated Video Detection and Explanation via Reasoning MLLMs and RL | arXiv: 2510.02282
- VINCIE: Unlocking In-context Image Editing from Video | arXiv: 2506.10941
- Virne: A Comprehensive Benchmark for RL-based Network Resource Allocation in NFV | arXiv: 2507.19234
- VIRTUE: Visual-Interactive Text-Image Universal Embedder | arXiv: 2510.00523
- VisioMath: Benchmarking Figure-based Mathematical Reasoning in LMMs | arXiv: 2506.06727
- Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models | arXiv: 2503.06749
- Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play | arXiv: 2509.25541
- VisJudge-Bench: Aesthetics and Quality Assessment of Visualizations | arXiv: 2510.22373
- Visual Autoregressive Modeling for Instruction-Guided Image Editing | arXiv: 2508.15772
- Visual Planning: Let's Think Only with Images | arXiv: 2505.11409
- Visual Prompt-Agnostic Evolution | arXiv: 2601.20232
- Visual Symbolic Mechanisms: Emergent Symbol Processing in Vision Language Models | arXiv: 2506.15871
- VLBiMan: Vision-Language Anchored One-Shot Demonstration Enables Generalizable Bimanual Robotic Manipulation | arXiv: 2509.21723
- VLM-SubtleBench: How Far Are VLMs from Human-Level Subtle Comparative Reasoning? | arXiv: 2603.07888
- VowelPrompt: Hearing Speech Emotions from Text via Vowel-level Prosodic Augmentation | arXiv: 2602.06270
- VPI-Bench: Visual Prompt Injection Attacks for Computer-Use Agents | arXiv: 2506.02456
- VTool-R1: VLMs Learn to Think with Images via Reinforcement Learning on Multimodal Tool Use | arXiv: 2505.19255
- Watermark Robustness and Radioactivity May Be at Odds in Federated Learning | arXiv: 2510.17033
- Watermark-based Attribution of AI-Generated Content | arXiv: 2404.04254
- wd1: Weighted Policy Optimization for Reasoning in Diffusion Language Models | arXiv: 2507.08838
- Weak-SIGReg: Covariance Regularization for Stable Deep Learning | arXiv: 2603.05924
- Web-CogReasoner: Towards Knowledge-Induced Cognitive Reasoning for Web Agents | arXiv: 2508.01858
- WebArbiter: A Principle-Guided Reasoning Process Reward Model for Web Agents | arXiv: 2601.21872
- WebDevJudge: Evaluating (M)LLMs as Critiques for Web Development Quality | arXiv: 2510.18560
- WebDS: An End-to-End Benchmark for Web-based Data Science | arXiv: 2508.01222
- WebOperator: Action-Aware Tree Search for Autonomous Agents in Web Environment | arXiv: 2512.12692
- Weight Decay may matter more than μP for Learning Rate Transfer in Practice | arXiv: 2510.19093
- Weight Space Representation Learning on Diverse NeRF Architectures | arXiv: 2502.09623
- Weight-Space Linear Recurrent Neural Networks | arXiv: 2506.01153
- What Layers When: Learning to Skip Compute in LLMs with Residual Gates | arXiv: 2510.13876
- What's the plan? Metrics for implicit planning in LLMs and their application to rhyme generation and question answering | arXiv: 2601.20164
- Whatever Remains Must Be True: Filtering Drives Reasoning in LLMs, Shaping Diversity | arXiv: 2512.05962
- When Agents "Misremember" Collectively: Exploring the Mandela Effect in LLM-based Multi-Agent Systems | arXiv: 2602.00428
- When Agents Persuade: Propaganda Generation and Mitigation in LLMs | arXiv: 2603.04636
- When and Where to Reset Matters for Long-Term Test-Time Adaptation | arXiv: 2603.03796
- When Does Divide and Conquer Work for Long Context LLM? A Noise Decomposition Framework | arXiv: 2506.16411
- When Large Multimodal Models Confront Evolving Knowledge: Challenges and Explorations | arXiv: 2505.24449
- When Machine Learning Gets Personal: Evaluating Prediction and Explanation | arXiv: 2502.02786
- When One Modality Rules Them All: Backdoor Modality Collapse in Multimodal Diffusion Models | arXiv: 2603.06508
- When Priors Backfire: On the Vulnerability of Unlearnable Examples to Pretraining | arXiv: 2603.04731
- When Reasoning Meets Compression: Understanding the Effects of LLMs Compression on Large Reasoning Models | arXiv: 2504.02010
- When Scores Learn Geometry: Rate Separations under the Manifold Hypothesis | arXiv: 2509.24912
- When Sensors Fail: Temporal Sequence Models for Robust PPO under Sensor Drift | arXiv: 2603.04648
- When Shallow Wins: Silent Failures and the Depth-Accuracy Paradox in Latent Reasoning | arXiv: 2603.03475
- When Stability Fails: Hidden Failure Modes Of LLMs in Data-Constrained Scientific Decision-Making | arXiv: 2603.15840
- When Style Breaks Safety: Defending LLMs Against Superficial Style Alignment | arXiv: 2506.07452
- When Thinking Backfires: Mechanistic Insights Into Reasoning-Induced Misalignment | arXiv: 2509.00544
- When to Ensemble: Identifying Token-Level Points for Stable and Fast LLM Ensembling | arXiv: 2510.15346
- When to restart? Exploring escalating restarts on convergence | arXiv: 2603.04117
- When to Retrain after Drift: A Data-Only Test of Post-Drift Data Size Sufficiency | arXiv: 2603.09024
- When would Vision-Proprioception Policies Fail in Robotic Manipulation? | arXiv: 2602.12032
- Which LLM Multi-Agent Protocol to Choose? | arXiv: 2510.17149
- Why Attention Patterns Exist: A Unifying Temporal Perspective Analysis | arXiv: 2601.21709
- Why Do Unlearnable Examples Work: A Novel Perspective of Mutual Information | arXiv: 2603.03725
- Why DPO is a Misspecified Estimator and How to Fix It | arXiv: 2510.20413
- Why is Your Language Model a Poor Implicit Reward Model? | arXiv: 2507.07981
- Why Keep Your Doubts to Yourself? Trading Visual Uncertainties in Multi-Agent Bandit Systems | arXiv: 2601.18735
- Why Prototypes Collapse: Diagnosing and Preventing Partial Collapse in Prototypical Self-Supervised Learning | arXiv: 2510.20108
- Why Reinforcement Fine-Tuning Enables MLLMs Preserve Prior Knowledge Better: A Data Perspective | arXiv: 2506.23508
- WIMLE: Uncertainty-Aware World Models with IMLE for Sample-Efficient Continuous Control | arXiv: 2602.14351
- xLSTM Scaling Laws: Competitive Performance with Linear Time-Complexity | arXiv: 2510.02228
- Your Agent May Misevolve: Emergent Risks in Self-evolving LLM Agents | arXiv: 2509.26354
- Your Language Model Secretly Contains Personality Subnetworks | arXiv: 2602.07164
- Zatom-1: A Multimodal Flow Foundation Model for 3D Molecules and Materials | arXiv: 2602.22251
- Zero-shot HOI Detection with MLLM-based Detector-agnostic Interaction Recognition | arXiv: 2602.15124
- ZeroDayBench: Evaluating LLM Agents on Unseen Zero-Day Vulnerabilities for Cyberdefense | arXiv: 2603.02297
- ZeroTuning: Unlocking the Initial Token's Power to Enhance Large Language Models Without Training | arXiv: 2505.11739
- ∇-Reasoner: LLM Reasoning via Test-Time Gradient Descent in Latent Space | arXiv: 2603.04948
- into the rabbit hull from task-relevant concepts | arXiv: 2510.08638
- one2scene geometric consistent explorable 3d scene generation from a single ima | arXiv: 2602.19766
- radiogs radiometric gaussian surfels | arXiv: 2603.01491
- sam membership privacy risks | arXiv: 2310.00488
- secp-tuning efficient privacy-preserving prompt tuning for large language mode | arXiv: 2506.15307
- skirting additive error barriers for private turnstile streaming | arXiv: 2602.10360
- single pixel image classification using an ultrafast digital light projector | arXiv: 2603.12036
- spectral-geometric neural fields for pose-free lidar view synthesis | arXiv: 2603.12903
- x2-fusion cross-modality and cross-dimension flow estimation in event edge space | arXiv: 2603.16671
- rfeval benchmarking reasoning faithfulness under counterfactual perturbations | arXiv: 2602.17053
- ambig-swe interactive agents to overcome underspecificity in software engineerin | arXiv: 2502.13069
- breaking the sft plateau multimodal structured reinforcement learning for chart- | arXiv: 2508.13587
- card towards conditional design of multi-agent topological structures | arXiv: 2603.01089
- diablo diagonal blocks are sufficient for finetuning | arXiv: 2506.03230
- dro-instructzero distributionally robust prompt optimization for instruction fol | arXiv: 2510.15260
- dro-instructzero distributionally robust prompt optimization for large language | arXiv: 2510.15260
- execution-grounded credit assignment for grpo in code generation | arXiv: 2603.16158
- improving code localization with repository memory | arXiv: 2510.01003
- imse intrinsic mixture of spectral experts fine-tuning for test-time adaptation | arXiv: 2603.07926
- inference-time safety for code llms via retrieval-augmented revision | arXiv: 2603.01494
- innogym benchmarking the innovation potential of ai agents | arXiv: 2512.01822
- kv cache transform coding for compact storage in llm inference | arXiv: 2511.01815
- learning to reason without external rewards | arXiv: 2505.19590
- mathfimer enhancing mathematical reasoning by expanding reasoning steps through | arXiv: 2502.11684
- paper2code automating code generation from scientific papers in machine learning | arXiv: 2504.17192
- sharing state between prompts and programs | arXiv: 2512.14805
- shieldedcode learning robust representations for virtual machine protected code | arXiv: 2601.20679
- supervised reinforcement learning from expert trajectories to step-wise reasonin | arXiv: 2510.25992
- training large language models to reason in parallel with global reflection | arXiv: 2510.05132
- aqua toward strategic response generation for ambiguous visual questions | arXiv: 2603.07394
- non-collaborative user simulators for tool agents | arXiv: 2509.23124
- rein conversational error recovery with reasoning inception | arXiv: 2602.17022
- explore-on-graph incentivizing autonomous exploration of large language models o | arXiv: 2602.21728
- quamo quaternion motion kinematics | arXiv: 2601.19580
- doflow flow-based generative models for interventional and counterfactual foreca | arXiv: 2511.02137
- scdfm distributional flow matching model for robust single-cell perturbation pre | arXiv: 2602.07103
- attributing response to context a jensen-shannon divergence driven mechanistic s | arXiv: 2505.16415
- bayesian attention mechanism a probabilistic framework for positional encoding a | arXiv: 2505.22842
- beyond rag vs long-context learning distraction-aware retrieval for efficient kn | arXiv: 2509.21865
- btzsc a benchmark for zero-shot text classification across cross-encoders embedd | arXiv: 2603.11991
- digging deeper learning multi-level concept hierarchies | arXiv: 2603.10084
- efficient discriminative joint encoders for large scale vision-language rerankin | arXiv: 2510.06820
- embedding-based context-aware reranker | arXiv: 2510.13329
- fine-tuning with rag for improving llm learning of new skills | arXiv: 2510.01375
- flow of spans generalizing language models to dynamic span-vocabulary via gflown | arXiv: 2602.10583
- futuremind equipping small language models with strategic thinking-pattern prior | arXiv: 2602.01222
- g-reasoner foundation models for unified reasoning over graph-structured knowled | arXiv: 2509.24276
- hierarchical concept-based interpretable models | arXiv: 2602.23947
- hume measuring the human-model performance gap in text embedding tasks | arXiv: 2510.10062
- hybrid deep searcher scalable parallel and sequential search reasoning | arXiv: 2508.19113
- judges verdict a comprehensive analysis of llm judge capability through human ag | arXiv: 2510.09738
- leveraging data to say no memory augmented plug-and-play selective prediction | arXiv: 2601.22570
- lightretriever a llm-based text retrieval architecture with extremely faster que | arXiv: 2505.12260
- mapping semantic syntactic relationships with geometric rotation | arXiv: 2510.09790
- multimodal dataset distillation made simple by prototype-guided data synthesis | arXiv: 2602.19756
- on the wings of imagination conflicting script-based multi-role framework for hu | arXiv: 2602.06423
- query-level uncertainty in large language models | arXiv: 2506.09669
- raee a robust retrieval-augmented early exit framework for efficient inference | arXiv: 2405.15198
- ravenea a benchmark for multimodal retrieval-augmented visual culture understand | arXiv: 2505.14462
- reftool reference-guided tool creation for knowledge-intensive reasoning | arXiv: 2505.21413
- retrieval-augmented generation for predicting cellular responses to gene perturb | arXiv: 2603.07233
- revela dense retriever learning via language modeling | arXiv: 2506.16552
- summaries as centroids for interpretable and scalable text clustering | arXiv: 2502.09667
- activationreasoning logical reasoning in latent activation spaces | arXiv: 2510.18184
- auditing cascading risks in multi-agent systems via semanti-geometric co-evolut | arXiv: 2603.13325
- behavior learning bl learning hierarchical optimization structures from data | arXiv: 2602.20152
- beyond linear probes dynamic safety monitoring for language models | arXiv: 2509.26238
- closing the curvature gap full transformer hessians and their implications for s | arXiv: 2510.16927
- concepts information bottleneck models | arXiv: 2602.14626
- cross-modal redundancy and the geometry of vision-language embeddings | arXiv: 2602.06218
- decomposing representation space into interpretable subspaces with unsupervised | arXiv: 2508.01916
- decoupling dynamical richness from representation learning towards practical mea | arXiv: 2410.04264
- dynamic reflections probing video representations with text alignment | arXiv: 2511.02767
- dynamic reflections probing video representations with text driven reasoning | arXiv: 2511.02767
- evolution of concepts in language model pre-training | arXiv: 2509.17196
- exploring interpretability for visual prompt tuning with cross-layer concepts | arXiv: 2503.06084
- expo-hm learning to explain-then-detect for hateful meme detection | arXiv: 2510.08630
- formal mechanistic interpretability automated circuit discovery with provable gu | arXiv: 2602.16823
- gavel towards rule-based safety through activation monitoring | arXiv: 2601.19768
- gepa reflective prompt evolution can outperform reinforcement learning | arXiv: 2507.19457
- grokking in llm pretraining monitor memorization-to-generalization without test | arXiv: 2506.21551
- hallucination begins where saliency drops | arXiv: 2601.20279
- hidden breakthroughs in language model training | arXiv: 2506.15872
- how do transformers learn to associate tokens gradient leading terms bring mecha | arXiv: 2601.19208
- implicit statistical inference in transformers approximating likelihood-ratio te | arXiv: 2603.10573
- Information Shapes Koopman Representation | arXiv: 2510.13025
- initialization schemes for kolmogorov-arnold networks an empirical study | arXiv: 2509.03417
- internal planning in language models characterizing horizon and branch awareness | arXiv: 2509.25260
- layer by layer module by module choose both for optimal ood probing of vit | arXiv: 2603.05280
- lore jointly learning the intrinsic dimensionality and relative similarity struc | arXiv: 2602.04192
- mata a trainable hierarchical automaton system for multi-agent visual reasoning | arXiv: 2601.19204
- modal logical neural networks for financial ai | arXiv: 2603.12487
- narrow finetuning leaves clearly readable traces in activation differences | arXiv: 2510.13900
- nimo a nonlinear interpretable model | arXiv: 2506.05059
- noise stability of transformer models | arXiv: 2602.08287
- polyshap extending kernelshap with interaction-informed polynomial regression | arXiv: 2601.18608
- posh using scene graphs to guide llms-as-a-judge for detailed image descriptions | arXiv: 2510.19060
- provably explaining neural additive models | arXiv: 2602.17530
- radar reasoning-ability and difficulty-aware routing for reasoning llms | arXiv: 2509.25426
- salve sparse autoencoder-latent vector editing for mechanistic control of neural | arXiv: 2512.15938
- seed-set scalable evolving experimental design for system-level ethical testing | arXiv: 2603.01630
- semantic regexes auto-interpreting llm features with a structured language | arXiv: 2510.06378
- semantic regexes auto-interpreting llm features with a structured language of re | arXiv: 2510.06378
- stretching beyond the obvious a gradient-free framework to unveil the hidden lan | arXiv: 2506.17040
- temporal sparse autoencoders leveraging the sequential nature of language for in | arXiv: 2511.05541
- the reasoning trap -- logical reasoning as a mechanistic pathway to advanced jai | arXiv: 2603.09200
- the reasoning trap -- logical reasoning as a mechanistic pathway to situational | arXiv: 2603.09200
- there was never a bottleneck in concept bottleneck models | arXiv: 2506.04877
- tokenizing single-channel eeg with time-frequency motif learning | arXiv: 2502.16060
- tokenseek memory efficient fine tuning via instance-aware token ditching | arXiv: 2601.19739
- towards understanding subliminal learning when and how hidden biases transfer | arXiv: 2509.23886
- uncovering grounding ids how external cues shape multimodal binding | arXiv: 2509.24072
- uni-ntfm a unified foundation model for eeg signal representation learning | arXiv: 2509.24222
- universal properties of activation sparsity in modern large language models | arXiv: 2509.00454
- vcworld a biological world model for virtual cell simulation | arXiv: 2512.00306
- when machine learning gets personal evaluating prediction and explanation | arXiv: 2502.02786
- when thinking backfires mechanistic insights into reasoning-induced misalignment | arXiv: 2509.00544
- zerotuning unlocking the initial tokens power to enhance large language models w | arXiv: 2505.11739
- bilinear representation mitigates reversal curse and enables consistent model ed | arXiv: 2509.21993
- eamet robust massive model editing via embedding alignment optimization | arXiv: 2505.11876
- energy-regularized sequential model editing on hyperspheres | arXiv: 2510.01172
- fine-tuning done right in model editing | arXiv: 2509.22072
- got-edit geometry-aware generic object tracking via online model editing | arXiv: 2602.08550
- pics pairwise image compositing with spatial interactions | arXiv: 2603.06873
- rote learning considered useful generalizing over memorized data in llms | arXiv: 2507.21914
- rote learning considered useful generalizing over memorized training examples | arXiv: 2507.21914
- when large multimodal models confront evolving knowledge challenges and explorat | arXiv: 2505.24449
- livenewsbench evaluating llm web search capabilities with fresh news | arXiv: 2602.13543
- m2-miner multi-agent enhanced mcts for mobile gui agent | arXiv: 2602.05429
- mc-search evaluating and enhancing multimodal agentic search | arXiv: 2603.00873
- membership privacy risks of sharpness aware minimization | arXiv: 2602.10975
- radiometrically consistent gaussian surfels for inverse rendering | arXiv: 2601.22571
- reducing belief deviation in reinforcement learning for active reasoning | arXiv: 2510.12264
- the controllability trap a governance framework for military ai systems | arXiv: 2603.03515
- videomind a chain-of-lora agent for temporal-grounded video understanding | arXiv: 2503.13444
- web-cogreasoner towards knowledge-induced cognitive reasoning in web agents | arXiv: 2508.01858
- lycheedecode accelerating long-context llm inference via hybrid speculative deco | arXiv: 2602.04541
- rethinking benign relearning syntax as the hidden driver of the safety tax | arXiv: 2602.03379
- tokenseek memory efficient fine tuning via instance-aware token selection | arXiv: 2601.19739
- accessible realistic and fair evaluation of positive-unlabeled learning algorith | arXiv: 2509.24228
- anessuite a comprehensive benchmark and dataset suite for anesthesiology reasoni | arXiv: 2504.02404
- aside architectural separation of instructions and data in language models | arXiv: 2503.10566
- astabench benchmarking ai agents | arXiv: 2510.21652
- benchmarking overton pluralism in llms | arXiv: 2512.01351
- breaking the correlation plateau on the optimization and capacity limits of atte | arXiv: 2602.17898
- can vision language models assess graphic design aesthetics a benchmark evaluati | arXiv: 2603.01083
- can you hear me now a benchmark for long-range graph propagation and beyond | arXiv: 2512.17762
- conformal prediction adaptive to unknown subpopulation shifts | arXiv: 2506.05583
- counselbench llm mental health qa | arXiv: 2506.08584
- dare-bench evaluating modeling and instruction fidelity of llms in data science | arXiv: 2602.24288
- disentangling shared and private neural dynamics with spire a latent modeling fr | arXiv: 2510.25023
- do we really need permutations impact of model width on linear mode connectivity | arXiv: 2510.08023
- enabling fine-grained operating points for black-box llms | arXiv: 2510.17727
- how reliable is language model micro-benchmarking | arXiv: 2510.08730
- improving set function approximation with quasi-arithmetic neural networks | arXiv: 2602.04941
- in-context learning of temporal point processes with foundation inference models | arXiv: 2509.24762
- lca local classifier alignment for continual learning | arXiv: 2603.09888
- measuring uncertainty calibration | arXiv: 2512.13872
- mitigating spurious correlation via distributionally robust learning with hierar | arXiv: 2510.02818
- mosiv multi-object system identification from videos | arXiv: 2603.06022
- multi-llm adaptive conformal inference for reliable llm responses | arXiv: 2602.01285
- noise-aware generalization robustness to in-domain noise and out-of-domain gener | arXiv: 2504.02996
- non-clashing teaching in graphs algorithms complexity and bounds | arXiv: 2602.00657
- optimal transport-induced samples against out-of-distribution overconfidence | arXiv: 2601.21320
- planetalign a comprehensive python library for benchmarking network alignment | arXiv: 2505.21366
- predicting llm reasoning performance with small proxy model | arXiv: 2509.21013
- preference leakage a contamination problem in llm-as-a-judge | arXiv: 2502.01534
- prompt and parameter co-optimization for large language model task adaptation | arXiv: 2509.24245
- prompt and parameter co-optimization for large language models | arXiv: 2509.24245
- rankllm weighted ranking of llms by quantifying question difficulty | arXiv: 2602.12424
- revisiting the past data unlearning with model state history | arXiv: 2506.20941
- same content different representations a controlled study for t | arXiv: 2509.22983
- simpletom exposing the gap between explicit tom inference and implicit tom appli | arXiv: 2410.13648
- simuhome a temporal- and environment-aware benchmark for smart home agents | arXiv: 2509.24282
- soft quality-diversity optimization | arXiv: 2512.00810
- spectral attention steering for prompt highlighting | arXiv: 2603.01281
- subliminal signals in preference labels | arXiv: 2603.01204
- tabstruct measuring structural fidelity of tabular data | arXiv: 2509.11950
- talk evaluate diagnose user-aware agent evaluation with automated error analysis | arXiv: 2603.15483
- towards anomaly-aware pre-training and fine-tuning for graph anomaly detection | arXiv: 2504.14250
- truthfulness despite weak supervision evaluating and training llms using peer pr | arXiv: 2601.20299
- uis-digger towards comprehensive research agent systems for real-world unindexed | arXiv: 2603.08117
- unpacking human preference for llms demographically aware evaluation of long-fo | arXiv: 2603.04409
- unpacking human preference for llms demographically aware evaluation with the hu | arXiv: 2603.04409
- vcache verified semantic prompt caching | arXiv: 2502.03771
- when and where to reset matters for long-term test-time adaptation | arXiv: 2603.03796
- when priors backfire on the vulnerability of unlearnable examples to data augmen | arXiv: 2603.04731
- when priors backfire on the vulnerability of unlearnable examples to pretraining | arXiv: 2603.04731
- when to ensemble identifying token-level points for stable and fast llm ensembli | arXiv: 2510.15346
- which llm multi-agent protocol to choose | arXiv: 2510.17149
- assetformer modular 3d | arXiv: 2602.12100
- d2cache accelerating diffusion-based llms via dual adaptive caching | arXiv: 2509.23094
- ellmob event-driven human mobility generation with self-aligned language models | arXiv: 2603.07946
- enhancing persona following at decoding time via dynamic importance-guided token | arXiv: 2603.01438
- enhancing persona following at decoding time via dynamic importance estimation | arXiv: 2603.01438
- from assumptions to actions turning llm reasoning into uncertainty-aware plannin | arXiv: 2602.04326
- function induction and task generalization an interpretability study with off-by | arXiv: 2507.09875
- llema evolutionary search with llms for multi-objective material design | arXiv: 2510.22503
- predicting llm reasoning performance with small proxy models | arXiv: 2509.21013
- pt2-llm post-training ternarization for large language models | arXiv: 2510.03267
- quamo quaternion motions for vision-based 3d human kinematics capture | arXiv: 2509.25369
- the path of least resistance guiding llm reasoning trajectories for efficient co | arXiv: 2601.21494
- toward safer diffusion language models discovery and mitigation of priming vulne | arXiv: 2510.00565
- when stability fails hidden failure modes of llms in data-critical statistical | arXiv: 2603.15840
- a law of data reconstruction for random features and beyond | arXiv: 2509.22214
- block-sample mac-bayes generalization bounds | arXiv: 2602.12605
- chammi-75 pre-training multi-channel models with heterogeneous microscopy images | arXiv: 2512.20833
- common corpus ethical data for llm pretraining | arXiv: 2506.01732
- deconstructing positional information from attention logits to training biases | arXiv: 2505.13027
- emergent misalignment is easy narrow misalignment is hard | arXiv: 2602.07852
- explaining grokking and information bottleneck through neural collapse emergence | arXiv: 2509.20829
- fictionalqa a dataset for studying memorization and knowledge acquisition | arXiv: 2506.05639
- identifying and evaluating inactive heads in pretrained llms | arXiv: 2504.03889
- imagine how to change explicit procedure modeling for change captioning | arXiv: 2603.05969
- implicit bias and loss of plasticity in matrix completion depth promotes low-ran | arXiv: 2603.04703
- intrinsic training dynamics of deep neural networks | arXiv: 2508.07370
- lossless vocabulary reduction for auto-regressive language models | arXiv: 2510.08102
- moma a modular deep learning framework for material property prediction | arXiv: 2502.15483
- mt-dao multi-timescale distributed adaptive optimizers with local updates | arXiv: 2510.05361
- pre-training llm without learning rate decay enhances supervised fine-tuning | arXiv: 2603.16127
- predicting training re-evaluation curves enables effective data curriculums for | arXiv: 2509.25380
- recon robust symmetry discovery via explicit canonical orientation normalization | arXiv: 2505.13289
- reducing class-wise performance disparity via margin regularization | arXiv: 2602.00205
- semhitok a unified image tokenizer via semantic-guided hierarchical codebook for | arXiv: 2503.06764
- stochastic self-organization in multi-agent systems | arXiv: 2510.00685
- taste text-aligned speech tokenization and embedding for spoken language modelin | arXiv: 2504.07053
- understanding and improving shampoo and soap via kullback-leibler minimization | arXiv: 2509.03378
- understanding the emergence of seemingly useless features in deep learning | arXiv: 2603.14087
- understanding the emergence of seemingly useless features in next-token predicto | arXiv: 2603.14087
- doxing via the lens revealing location-related privacy leakage in vlms | arXiv: 2504.19373
- from abstract to contextual what llms still cannot do in math word problem solvi | arXiv: 2601.23048
- rain-merging a gradient-free method to enhance instruction following through mod | arXiv: 2602.22538
- towards safe reasoning in large reasoning models via correct-by-construction gu | arXiv: 2509.24393
- training large reasoning models efficiently via progressive solution complexity | arXiv: 2602.16839
- when reasoning meets compression understanding the effects of pruning and quant | arXiv: 2504.02010
- enhancing hallucination detection through noise injection | arXiv: 2502.03799
- gaussian certified unlearning | arXiv: 2510.13094
- how far are llms from professional poker players revisiting game-theoretic reaso | arXiv: 2602.00528
- lh-deception simulating and understanding llm deceptive behaviors in long-horizo | arXiv: 2510.03999
- lifelong learning with behavior consolidation for vehicle routing | arXiv: 2509.21765
- perturbation-induced linearization constructing unlearnable data with solely lin | arXiv: 2601.19967
- redirection for erasing memory rem towards a universal unlearning method for cor | arXiv: 2505.17730
- self-destructive language model | arXiv: 2505.12186
- understanding sensitivity of differential attention through the lens of adversar | arXiv: 2510.00517
- understanding sensitivity of differential attention through the lens of softmax | arXiv: 2510.00517
- unlearning evaluation through subset statistical independence | arXiv: 2603.00587
- veritrail closed-domain hallucination detection with traceability | arXiv: 2505.21786
- veritrail closed-domain hallucination detection with traceable evidence synthes | arXiv: 2505.21786
- flyprompt brain-inspired random-expanded routing | arXiv: 2602.01976
- pi-flow policy-based few-step generation via imitation distillation | arXiv: 2510.14974
- assess a semantic and structural evaluation framework for statement similarity | arXiv: 2509.22246
- assess autoformalization eval | arXiv: 2509.22246
- atlas adaptive transfer scaling laws for multilingual pretraining finetuning and | arXiv: 2510.22037
- multilingual routing in mixture-of-experts | arXiv: 2510.04694
- prior-based noisy text data filtering fast and strong alternative for perplexity | arXiv: 2509.18577
- prior-based noisy text data filtering fast and strong alternative to perplexity | arXiv: 2509.18577
- sasft sparse autoencoder-guided supervised finetuning to mitigate unexpected cod | arXiv: 2507.14894
- why keep your doubts to yourself trading visual uncertainty | arXiv: 2601.18735
- fs-dfm fast and accurate long text generation with few-step diffusion language m | arXiv: 2509.20624
- spectralgcd spectral concept selection and cross-modal representation learni | arXiv: 2602.17395
- autoqd diverse behaviors | arXiv: 2506.05634
- autotool scaling tool use | arXiv: 2603.13348
- infom intention flow occupancy | arXiv: 2506.08902
- remix reinforcement routing lora | arXiv: 2603.10160
- remot reinforcement learning with motion contrast triplets | arXiv: 2603.00461
- toward a dynamic stackelberg game-theoretic framework for agent-based conversat | arXiv: 2507.08207
- snap-uq self-supervised next-activation prediction for single-pass uncertainty i | arXiv: 2508.12907
- adaptive debiasing tsallis entropy for test-time adaptation | arXiv: 2602.11743
- biasfreebench a benchmark for mitigating bias in large language model responses | arXiv: 2510.00232
- functional embeddings enable aggregation of multi-area seeg data for robust bci | arXiv: 2510.27090
- functional embeddings enable aggregation of multi-area seeg recordings over subj | arXiv: 2510.27090
- gradiend feature learning within neural networks exemplified through biases | arXiv: 2502.01406
- human or machine a preliminary turing test for speech-to-speech interaction | arXiv: 2602.24080
- propaganda ai an analysis of semantic divergence in large language models | arXiv: 2504.12344
- scalable multi-task low-rank model adaptation | arXiv: 2603.01526
- stop wasting your tokens towards efficient runtime multi-agent systems | arXiv: 2510.26585
- cpiri channel permutation-invariant relational interaction for multivariate time se | arXiv: 2601.20318
- gtm a general time-series model for enhanced representation learning of time-series | arXiv: 2502.03264
- scits scientific time series llm | arXiv: 2510.03255
- tsrating time series quality llm | arXiv: 2506.01290
- arbitrary generative video interpolation | arXiv: 2510.00578
- bindweave subject-consistent video generation via cross-modal integration | arXiv: 2510.00438
- frame guidance training-free guidance for frame-level control in video diffusion | arXiv: 2506.07177
- geometry-aware 4d video generation for robot manipulation | arXiv: 2507.01099
- javisdit joint audio-video diffusion transformer with hierarchical spatio-tempor | arXiv: 2503.23377
- javisdit unified modeling and optimization for joint audio-video generation | arXiv: 2602.19163
- language-guided open-world video anomaly detection under weak supervision | arXiv: 2503.13160
- learning video generation for robotic manipulation with collaborative trajectory | arXiv: 2506.01943
- lora-edit controllable first-frame-guided video editing via mask-aware lora fine | arXiv: 2506.10082
- lumos-1 on autoregressive video generation with discrete diffusion from a unifie | arXiv: 2507.08801
- mosa motion-coherent human video generation via structure-appearance decoupling | arXiv: 2508.17404
- motionstream real-time video generation with interactive motion controls | arXiv: 2511.01266
- precisecache precise feature caching for efficient and high-fidelity video gener | arXiv: 2603.00976
- quantsparse comprehensively compressing video diffusion transformer with model q | arXiv: 2509.23681
- sigmark scalable in-generation watermark with blind extraction for video diffusi | arXiv: 2603.02882
- streaming autoregressive video generation via diagonal distillation | arXiv: 2603.09488
- target-aware video diffusion models | arXiv: 2503.18950
- ttom test-time optimization and memorization for compositional video generation | arXiv: 2510.07940