NeurIPS 2025 Paper Notes TODO

Total: 2901 papers | Completed: 2901 | Pending update: 0

3-Model Speculative Decoding (PyramidSD) | arXiv: 2510.12966
3D Visual Illusion Depth Estimation | arXiv: 2505.13061
3D-Agent: Tri-Modal Multi-Agent Collaboration for Scalable 3D Object Annotation | arXiv: 2601.04404
3D-RAD: A Comprehensive 3D Radiology Med-VQA Dataset with Multi-Temporal Analysis and Diverse Diagnostic Tasks | arXiv: 2506.11147
3DID: Direct 3D Inverse Design for Aerodynamics with Physics-Aware Optimization | arXiv: 2512.08987
3EED: Ground Everything Everywhere in 3D | arXiv: 2511.01755
4DGCPro: Efficient Hierarchical 4D Gaussian Compression for Progressive Volumetric Video Streaming | arXiv: 2509.17513
4DGT: Learning a 4D Gaussian Transformer Using Real-World Monocular Videos | arXiv: 2506.08015
70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float (DFloat11) | arXiv: 2504.11651
A Closer Look at Model Collapse: From a Generalization-to-Memorization Perspective | arXiv: 2509.16499
A Connection Between Score Matching and Local Intrinsic Dimension | arXiv: 2510.12975
A Controllable Examination for Long-Context Language Models | arXiv: 2506.02921
A Cramér–von Mises Approach to Incentivizing Truthful Data Sharing | arXiv: 2506.07272
A Data-Driven Prism: Multi-View Source Separation with Diffusion Model Priors | arXiv: 2510.05205
a differentiable model of supply-chain shocks | arXiv: 2511.05231
A Differential and Pointwise Control Approach to Reinforcement Learning | arXiv: 2404.15617
A Diffusion Model for Regular Time Series Generation from Irregular Data with Completion and Masking | arXiv: 2510.06699
A Frustratingly Simple Yet Highly Effective Attack Baseline: Over 90% Success Rate Against the Strong Black-box Models of GPT-4.5/4o/o1 | arXiv: 2503.10635
A Generalized Bisimulation Metric of State Similarity between Markov Decision Processes: From Theoretical Propositions to Applications | arXiv: 2509.18714
A Generalized Label Shift Perspective for Cross-Domain Gaze Estimation | arXiv: 2505.13043
A Gradient Flow Approach to Solving Inverse Problems with Latent Diffusion Models | arXiv: 2509.19276
A Granular Study of Safety Pretraining under Model Abliteration | arXiv: 2510.02768
A Graph Neural Network Approach for Localized and High-Resolution Temperature Forecasting | arXiv: 2512.00546
A High-Dimensional Statistical Method for Optimizing Transfer Quantities in Multi-Source Transfer Learning | arXiv: 2502.04242
A is for Absorption: Studying Feature Splitting and Absorption in Sparse Autoencoders | arXiv: 2409.14507
a joint learning approach to hardware caching and prefetching | arXiv: 2510.10862
A Little Depth Goes a Long Way: The Expressive Power of Log-Depth Transformers | arXiv: 2503.03961
A Multi-Task Benchmark for Abusive Language Detection in Low-Resource Settings | arXiv: 2505.12116
A Multimodal Benchmark for Framing of Oil & Gas Advertising and Potential Greenwashing Detection | arXiv: 2510.21679
A Near-optimal, Scalable and Parallelizable Framework for Stochastic Bandits Robust to Adversarial Corruptions and Beyond | arXiv: 2502.07514
A Novel Approach to Classification of ECG Arrhythmia Types with Latent ODEs | arXiv: 2511.16933
A Partition Cover Approach for Tokenization | arXiv: 2501.06246
A Practical Guide for Incorporating Symmetry in Diffusion Policy | arXiv: 2505.13431
A Principle of Targeted Intervention for Multi-Agent Reinforcement Learning | arXiv: 2510.17697
A Probabilistic U-Net Approach to Downscaling Climate Simulations | arXiv: 2511.03197
A Regularized Newton Method for Nonconvex Optimization with Global and Local Complexity Guarantees | arXiv: 2502.04799
A Reliable Cryptographic Framework for Empirical Machine Unlearning Evaluation | arXiv: 2404.11577
A Self-Improving Coding Agent | arXiv: 2504.15228
A Set of Generalized Components to Achieve Effective Poison-only Clean-label Backdoor Attacks with Collaborative Sample Selection and Triggers | arXiv: 2509.19947
A Simple Linear Patch Revives Layer-Pruned Large Language Models | arXiv: 2505.24680
A Single-Loop First-Order Algorithm for Linearly Constrained Bilevel Optimization | arXiv: 2510.24710
A Snapshot of Influence: A Local Data Attribution Framework for Online Reinforcement Learning | arXiv: 2505.19281
A Standardized Benchmark for Multilabel Antimicrobial Peptide Classification | arXiv: 2511.04814
A Stochastic Differential Equation Framework for Multi-Objective LLM Interactions | arXiv: 2510.10739
A Sustainable AI Economy Needs Data Deals That Work for Generators | arXiv: 2601.09966
A Systematic Evaluation of Preference Aggregation in Federated RLHF for Pluralistic Alignment of LLMs | arXiv: 2512.08786
A Theoretical Framework for Grokking: Interpolation followed by Riemannian Norm Minimisation | arXiv: 2505.20172
A Theoretical Study on Bridging Internal Probability and Self-Consistency for LLM Reasoning | arXiv: 2510.15444
A Theory of Multi-Agent Generative Flow Networks | arXiv: 2509.20408
A Token is Worth over 1,000 Tokens: Efficient Knowledge Distillation through Low-Rank Clone | arXiv: 2505.12781
A TRIANGLE Enables Multimodal Alignment Beyond Cosine Similarity | arXiv: 2509.24734
A Unified Approach to Submodular Maximization Under Noise | arXiv: 2510.21128
A Unified Framework for Establishing the Universal Approximation of Transformer-Type Architectures | arXiv: 2506.23551
A Unified Framework for Provably Efficient Algorithms to Estimate Shapley Values | arXiv: 2506.05216
A Unified Framework for Variable Selection in Model-Based Clustering with Missing Not at Random | arXiv: 2505.19093
A Unified Reasoning Framework for Holistic Zero-Shot Video Anomaly Analysis | arXiv: 2511.00962
A Unified Solution to Video Fusion: From Multi-Frame Learning to Benchmarking | arXiv: 2505.19858
A Unified Stability Analysis of SAM vs SGD: Role of Data Coherence and Emergence of Simplicity Bias | arXiv: 2511.17378
A Unifying View of Linear Function Approximation in Off-Policy RL Through Matrix Splitting and Preconditioning | arXiv: 2501.01774
A Variational Manifold Embedding Framework for Nonlinear Dimensionality Reduction | arXiv: 2511.22128
A-MEM: Agentic Memory for LLM Agents | arXiv: 2502.12110
a-thought efficient reasoning via bidirectional compression for low-resource set | arXiv: 2505.24550
AANet: Virtual Screening under Structural Uncertainty via Alignment and Aggregation | arXiv: 2506.05768
AbbIE: Autoregressive Block-Based Iterative Encoder for Efficient Sequence Modeling | arXiv: 2507.08567
Abstain Mask Retain Core: Time Series Prediction by Adaptive Masking Loss with Representation Consistency | arXiv: 2510.19980
AC-LoRA: (Almost) Training-Free Access Control-Aware Multi-Modal LLMs | arXiv: 2505.11557
Accelerate Creation of Product Claims Using Generative AI | arXiv: 2509.20652
Accelerating Parallel Diffusion Model Serving with Residual Compression | arXiv: 2507.17511
AccuQuant: Simulating Multiple Denoising Steps for Quantizing Diffusion Models | arXiv: 2510.20348
Accurate and Efficient Low-Rank Model Merging in Core Space | arXiv: 2509.17786
AceSearcher: Bootstrapping Reasoning and Search for LLMs via Reinforced Self-Play | arXiv: 2509.24193
ACT as Human: Multimodal Large Language Model Data Annotation with Critical Thinking | arXiv: 2511.09833
Act to See, See to Act: Diffusion-Driven Perception-Action Interplay for Adaptive Policies | arXiv: 2509.25822
Active Measurement: Efficient Estimation at Scale | arXiv: 2507.01372
Active Slice Discovery in Large Language Models | arXiv: 2511.20713
Active Target Discovery under Uninformative Prior: The Power of Permanent and Transient Memory | arXiv: 2510.16676
Actor-Free Continuous Control via Structurally Maximizable Q-Functions | arXiv: 2510.18828
AcuRank: Uncertainty-Aware Adaptive Computation for Reranking | arXiv: 2505.18512
Ada-KV: Optimizing KV Cache Eviction by Adaptive Budget Allocation for Efficient LLM Inference | arXiv: 2407.11550
AdaLRS: Loss-Guided Adaptive Learning Rate Search for Efficient Foundation Model Pretraining | arXiv: 2506.13274
AdaptDel: Adaptable Deletion Rate Randomized Smoothing for Certified Robustness | arXiv: 2511.09316
AdaptGrad: Adaptive Sampling to Reduce Noise | arXiv: 2410.07711
Adapting Speech Language Model to Singing Voice Synthesis | arXiv: 2512.14657
Adapting Vision-Language Models for Evaluating World Models | arXiv: 2506.17967
Adaptive Algorithms with Sharp Convergence Rates for Stochastic Hierarchical Optimization | arXiv: 2509.15399
adaptive cooperative transmission design for ultra-reliable low-latency communic | arXiv: 2511.02216
adaptive coopetition leveraging coarse verifier signals for resilient multi-agen | arXiv: 2510.18179
Adaptive Data Analysis for Growing Data | arXiv: 2405.13375
Adaptive Discretization for Consistency Models | arXiv: 2510.17266
Adaptive Dual Reasoner: Large Reasoning Models Can Think Efficiently by Hybrid Reasoning | arXiv: 2510.10207
Adaptive Frontier Exploration on Graphs with Applications to Network-Based Disease Testing | arXiv: 2505.21671
Adaptive Kernel Design for Bayesian Optimization Is a Piece of CAKE with LLMs | arXiv: 2509.17998
Adaptive LoRA Experts Allocation and Selection for Federated Fine-Tuning | arXiv: 2509.15087
Adaptive Neighborhood-Constrained Q Learning for Offline Reinforcement Learning | arXiv: 2511.02567
adaptive online emulation for accelerating complex physical simulations | arXiv: 2508.08012
Adaptive Originality Filtering: Rejection-Based Prompting and RiddleScore for Culturally Grounded Multilingual Riddle Generation | arXiv: 2508.18709
Adaptive Prediction-Powered AutoEval with Reliability and Efficiency Guarantees | arXiv: 2505.18659
Adaptive Stochastic Coefficients for Accelerating Diffusion Sampling | arXiv: 2510.23285
Adaptively Coordinating with Novel Partners via Learned Latent Strategies | arXiv: 2511.12754
AdaSTaR: Adaptive Data Sampling for Training Self-Taught Reasoners | arXiv: 2505.16322
AdaVideoRAG: Omni-Contextual Adaptive Retrieval-Augmented Efficient Long Video Understanding | arXiv: 2506.13589
additive models explained a computational complexity approach | arXiv: 2510.21292
addressing mark imbalance in integrationfree neural marked t | arXiv: 2510.20414
adjacent words divergent intents jailbreaking large language models via task con | arXiv: 2510.21189
Adjoint Schrödinger Bridge Sampler | arXiv: 2506.22565
adjusted count quantification learning on graphs | arXiv: 2503.09395
ADMN: A Layer-Wise Adaptive Multimodal Network for Dynamic Input Noise and Compute Resources | arXiv: 2502.07862
AdmTree: Compressing Lengthy Context with Adaptive Semantic Trees | arXiv: 2512.04550
ADPretrain: Advancing Industrial Anomaly Detection via Anomaly Representation Pretraining | arXiv: 2511.05245
Adv-SSL: Adversarial Self-Supervised Representation Learning with Theoretical Guarantees | arXiv: 2408.08533
Advancing Compositional Awareness in CLIP with Efficient Fine-Tuning | arXiv: 2505.24424
Advancing Expert Specialization for Better MoE | arXiv: 2505.22323
adversarial locomotion and motion imitation for humanoid policy learning | arXiv: 2504.14305
Adversarial Paraphrasing: A Universal Attack for Humanizing AI-Generated Text | arXiv: 2506.07001
aero a redirection-based optimization framework inspired by judo for robust prob | arXiv: 2506.02415
AffordBot: 3D Fine-grained Embodied Reasoning via Multimodal Large Language Models | arXiv: 2511.10017
AgentAuditor: Human-Level Safety and Security Evaluation for LLM Agents | arXiv: 2506.00641
agentchangebench a multi-dimensional evaluation framework for goal-shift robustn | arXiv: 2510.18170
AgentDAM: Privacy Leakage Evaluation for Autonomous Web Agents | arXiv: 2503.09780
Agentic NL2SQL to Reduce Computational Costs | arXiv: 2510.14808
agentic persona control and task state tracking for realistic user simulation in | arXiv: 2601.15290
Agentic Plan Caching: Test-Time Memory for Fast and Cost-Efficient LLM Agents | arXiv: 2506.14852
AgentiQL: An Agent-Inspired Multi-Expert Framework for Text-to-SQL Generation | arXiv: 2510.10661
AgentMisalignment: Measuring the Propensity for Misaligned Behaviour in LLM-Based Agents | arXiv: 2506.04018
agentstealth reinforcing large language model for anonymizing user-generated tex | arXiv: 2506.22508
AgentTTS: Large Language Model Agent for Test-time Compute-optimal Scaling Strategy in Complex Tasks | arXiv: 2508.00890
aggregation hides out-of-distribution generalization failures from spurious corr | arXiv: 2510.24884
Agint: Agentic Graph Compilation for Software Engineering Agents | arXiv: 2511.19635
AHA -- Predicting What Matters Next: Online Highlight Detection Without Looking Ahead | arXiv: 2509.16421
AI Progress Should Be Measured by Capability-Per-Resource, Not Scale Alone: A Framework for Gradient-Guided Resource Allocation in LLMs | arXiv: 2511.01077
AI Should Sense Better, Not Just Scale Bigger: Adaptive Sensing as a Paradigm Shift | arXiv: 2507.07820
AI-Generated Video Detection via Perceptual Straightening | arXiv: 2507.00583
ALE-Bench: A Benchmark for Long-Horizon Objective-Driven Algorithm Engineering | arXiv: 2506.09050
alias-free vit fractional shift invariance via linear attention | arXiv: 2510.22673
Aligning by Misaligning: Boundary-aware Curriculum Learning for Multimodal Alignment | arXiv: 2511.08399
Aligning Compound AI Systems via System-level DPO | arXiv: 2502.17721
Aligning Text to Image in Diffusion Models is Easier Than You Think | arXiv: 2503.08250
Alignment of Large Language Models with Constrained Learning | arXiv: 2505.19387
aline joint amortization for bayesian inference and active data acquisition | arXiv: 2506.07259
All You Need is One: Capsule Prompt Tuning with a Single Vector | arXiv: 2510.16670
Alligat0R: Pre-Training through Covisibility Segmentation for Relative Camera Pose Regression | arXiv: 2503.07561
almguard safety shortcuts and where to find them as guardrails for audio-languag | arXiv: 2510.26096
alternating gradient flows a theory of feature learning in two-layer neural netw | arXiv: 2506.06489
Amortized Active Generation of Pareto Sets | arXiv: 2510.21052
Amortized Sampling with Transferable Normalizing Flows | arXiv: 2508.18175
an adaptive algorithm for bilevel optimization on riemannian manifolds | arXiv: 2504.06042
an analysis of causal effect estimation using outcome invariant data augmentatio | arXiv: 2510.25128
an analysis of concept bottleneck models measuring understanding and mitigating | arXiv: 2505.16705
an empirical investigation of neural odes and symbolic regression for dynamical | arXiv: 2601.20637
an evidence-based post-hoc adjustment framework for anomaly detection under data | arXiv: 2510.21296
Angular Constraint Embedding via SpherePair Loss for Constrained Clustering | arXiv: 2510.06907
angular steering behavior control via rotation in activation space | arXiv: 2510.26243
Anti-Aliased 2D Gaussian Splatting | arXiv: 2506.11252
AntiGrounding: Lifting Robotic Actions into VLM Representation Space for Decision Making | arXiv: 2506.12374
Any Large Language Model Can Be a Reliable Judge: Debiasing with a Reasoning-based Bias Detector | arXiv: 2505.17100
Approximate Domain Unlearning for Vision-Language Models | arXiv: 2510.08132
Approximately Aligned Decoding | arXiv: 2410.01103
approximating shapley explanations in reinforcement learning | arXiv: 2511.06094
aquamam an autoregressive quaternion manifold model for rapidly estimating compl | arXiv: 2301.08838
are greedy task orderings better than random in continual linear regression | arXiv: 2510.19941
Are Language Models Efficient Reasoners? A Perspective from Logic Programming | arXiv: 2510.25626
are large language models sensitive to the motives behind communication | arXiv: 2510.19687
are large reasoning models good translation evaluators analysis and performance | arXiv: 2510.20780
are pixel-wise metrics reliable for sparse-view computed tomography reconstructi | arXiv: 2506.02093
are vision language models ready for clinical diagnosis a 3d medical benchmark f | arXiv: 2505.18915
arecho autoregressive evaluation via chain-based hypothesis optimization for spe | arXiv: 2505.24518
ARGenSeg: Image Segmentation with Autoregressive Image Generation Model | arXiv: 2510.20803
ARM: Adaptive Reasoning Model | arXiv: 2505.20258
ARMesh: Autoregressive Mesh Generation via Next-Level-of-Detail Prediction | arXiv: 2509.20824
artificial hivemind the open-ended homogeneity of language models and beyond | arXiv: 2510.22954
asap an agentic solution to auto-optimize performance of large-scale llm trainin | arXiv: 2511.03844
Ascent Fails to Forget | arXiv: 2509.26427
asciibench evaluating language-model-based understanding of visually-oriented te | arXiv: 2512.04125
ask a strong llm judge when your reward model is uncertain | arXiv: 2510.20369
associative syntax and maximal repetitions reveal context-dependent complexity i | arXiv: 2512.01033
astroco self-supervised conformer-style transformers for light-curve embeddings | arXiv: 2509.24134
astrovisbench a code benchmark for scientific computing and visualization in ast | arXiv: 2505.20538
asymmetric duos sidekicks improve uncertainty | arXiv: 2505.18636
asymptotic and finite-time guarantees for langevin-based temperature annealing i | arXiv: 2603.12552
asymptotically stable quaternion-valued hopfield-structured neural network with | arXiv: 2510.16607
atlas autoformalizing theorems through lifting augmentation and synthesis of dat | arXiv: 2502.05567
atlasgs atlanta-world guided surface reconstruction with implicit structured gau | arXiv: 2510.25129
Atom of Thoughts for Markov LLM Test-Time Scaling | arXiv: 2502.12018
Atomic Diffusion Models for Small Molecule Structure Elucidation from NMR Spectra | arXiv: 2512.03127
attack via overfitting 10-shot benign fine-tuning to jailbreak llms | arXiv: 2510.02833
attention as discrete-time markov chains | arXiv: 2507.17657
attention your vision language model could be maliciously manipulated | arXiv: 2505.19911
AttentionPredictor: Temporal Patterns Matter for KV Cache Compression
attractive metadata attack inducing llm agents to invoke malicious tools | arXiv: 2508.02110
attributing response to context a jensen-shannon divergence driven mechanistic s | arXiv: 2505.16415
audio super-resolution with latent bridge models | arXiv: 2509.17609
Auditing Meta-Cognitive Hallucinations in Reasoning Large Language Models | arXiv: 2505.13143
audsemthinker enhancing audio-language models through reasoning over semantics o | arXiv: 2505.14142
AugGen: Synthetic Augmentation using Diffusion Models Can Improve Recognition | arXiv: 2503.11544
augmenting biological fitness prediction benchmarks with landscapes features fro | arXiv: 2510.24826
auto-compressing networks | arXiv: 2506.09714
auto-search and refinement an automated framework for gender bias mitigation in | arXiv: 2502.11559
autodiscovery open-ended scientific discovery via bayesian surprise | arXiv: 2507.00310
autoencoding random forests | arXiv: 2505.21441
autojudge judge decoding without manual annotation | arXiv: 2504.20039
automated algorithm design via nevanlinna-pick interpolation | arXiv: 2509.21416
Automated Composition of Agents: A Knapsack Approach for Agentic Component Selection | arXiv: 2510.16499
Automated Detection of Visual Attribute Reliance with a Self-Reflective Agent | arXiv: 2510.21704
automated multi-agent workflows for rtl design | arXiv: 2509.20182
automaton constrained q-learning | arXiv: 2510.05061
autoopt a dataset and a unified framework for automating optimization problem so | arXiv: 2510.21436
Autoregressive Adversarial Post-Training for Real-Time Interactive Video Generation | arXiv: 2506.09350
AutoSciDACT: Automated Scientific Discovery through Contrastive Embedding and Hypothesis Testing | arXiv: 2510.21935
autotom scaling model-based mental inference via automated agent modeling | arXiv: 2502.15676
AutoVLA: A Vision-Language-Action Model for End-to-End Autonomous Driving with Adaptive Reasoning and Reinforcement Fine-Tuning | arXiv: 2506.13757
Availability-aware Sensor Fusion via Unified Canonical Space | arXiv: 2503.07029
averimatec a dataset for automatic verification of image-text claims with eviden | arXiv: 2505.17978
badiff bandwidth adaptive diffusion model | arXiv: 2510.21366
Balanced Conic Rectified Flow | arXiv: 2510.25229
Balanced Token Pruning: Accelerating Vision Language Models Beyond Local Optimization | arXiv: 2505.22038
balancing performance and costs in best arm identification | arXiv: 2505.20583
bandit and delayed feedback in online structured prediction | arXiv: 2502.18709
barcodemamba advancing state-space models for fungal biodiversity research | arXiv: 2512.15931
barista brain scale informed spatiotemporal representation of human intracranial | arXiv: 2512.12135
Base Models Know How to Reason, Thinking Models Learn When | arXiv: 2510.07364
bayesian ego-graph inference for networked multi-agent reinforcement learning | arXiv: 2509.16606
bayesian evaluation of large language model behavior | arXiv: 2511.10661
bayesian surrogates for risk-aware pre-assessment of aging bridge portfolios | arXiv: 2509.25031
beast efficient tokenization of b-splines encoded action sequences for imitation | arXiv: 2506.06072
BecomingLit: Relightable Gaussian Avatars with Hybrid Neural Shading | arXiv: 2506.06271
bedlam20 synthetic humans and cameras in motion | arXiv: 2511.14394
behavior injection preparing language models for reinforcement learning | arXiv: 2505.18917
Belief-Calibrated Multi-Agent Consensus Seeking for Complex NLP Tasks | arXiv: 2510.06307
benchmarking agentic systems in automated scientific information extraction with | arXiv: 2510.00795
Benchmarking Egocentric Multimodal Goal Inference for Assistive Wearable Agents | arXiv: 2510.22443
benchmarking is broken -- dont let ai be its own judge | arXiv: 2510.07575
benchmarking large language models for zero-shot and few-shot phishing url detec | arXiv: 2602.02641
benchmarking probabilistic time series forecasting models on neural activity | arXiv: 2510.18037
Benchmarking Retrieval-Augmented Multimodal Generation for Document Question Answering | arXiv: 2505.16470
benfords curse tracing digit bias to numerical hallucination in llms | arXiv: 2506.01734
better estimation of the kullback--leibler divergence between language models | arXiv: 2504.10637
better ntk conditioning a free lunch from relu nonlinear activation in wide neur | arXiv: 2305.08813
Better Tokens for Better 3D: Advancing Vision-Language Modeling in 3D Medical Imaging | arXiv: 2510.20639
Beyond Accuracy: Dissecting Mathematical Reasoning for LLMs Under Reinforcement Learning | arXiv: 2506.04723
beyond benign overfitting in nadaraya-watson interpolators | arXiv: 2502.07480
Beyond Chemical QA: Evaluating LLM's Chemical Reasoning with Modular Chemical Operations | arXiv: 2505.21318
Beyond Components: Singular Vector-Based Interpretability of Transformer Circuits | arXiv: 2511.20273
beyond greedy exits improved early exit decisions for risk control and reliabili | arXiv: 2509.23666
beyond higher rank token-wise input-output projections for efficient low-rank ad | arXiv: 2510.23123
beyond last-click an optimal mechanism for ad attribution | arXiv: 2511.22918
Beyond Masked and Unmasked: Discrete Diffusion Models via Partial Masking | arXiv: 2505.18495
beyond parallelism synergistic computational graph effects in multi-head attenti | arXiv: 2507.02944
beyond random automatic inner-loop optimization in dataset distillation | arXiv: 2510.04838
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning
Beyond the Singular: Value of Multiple Generations in Benchmark Evaluation | arXiv: 2502.08943
beyond the surface enhancing llm-as-a-judge alignment with human via internal re | arXiv: 2508.03550
beyond tildeosqrtt constraint violation for online convex optimization with adve | arXiv: 2505.06709
beyond token probes hallucination detection via activation tensors with act-vit | arXiv: 2510.00296
bezier splatting for fast and differentiable vector graphics rendering | arXiv: 2503.16424
Bi-Level Decision-Focused Causal Learning for Large-Scale Marketing Optimization | arXiv: 2510.19517
bias in the picture benchmarking vlms with social-cue news images and llm-as-jud | arXiv: 2509.19659
bidirectional representations augmented autoregressive biological sequence gener | arXiv: 2510.08169
bigram subnetworks mapping to next tokens in transformer language models | arXiv: 2504.15471
binary quadratic quantization beyond first-order quantization for real-valued ma | arXiv: 2510.18650
biobench a blueprint to move beyond imagenet for scientific ml benchmarks | arXiv: 2511.16315
bioclip 2 emergent properties from scaling hierarchical contrastive learning | arXiv: 2505.23883
Bispectral OT: Dataset Comparison using Symmetry-Aware Optimal Transport | arXiv: 2509.20678
bitmark watermarking bitwise autoregressive image generative models | arXiv: 2506.21209
Bits Leaked per Query: Information-Theoretic Bounds on Adversarial Attacks against LLMs | arXiv: 2510.17000
blameless users in a clean room defining copyright protection for generative mod | arXiv: 2506.19881
Blending Complementary Memory Systems in Hybrid Quadratic-Linear Transformers | arXiv: 2506.00744
blind strong gravitational lensing inversion joint inference of source and lens | arXiv: 2511.04792
blink-twice you see but do you observe a reasoning benchmark on visual perceptio | arXiv: 2510.09361
bliss bandit layer importance sampling strategy for efficient training of graph | arXiv: 2512.22388
BlurDM: A Blur Diffusion Model for Image Deblurring | arXiv: 2512.03979
blurguard a simple approach for robustifying image protection against ai-powered | arXiv: 2511.00143
BNMusic: Blending Environmental Noises into Personalized Music | arXiv: 2506.10754
boltznce learning likelihoods for boltzmann generation with stochastic interpola | arXiv: 2507.00846
boosting adversarial transferability with spatial adversarial alignment | arXiv: 2501.01015
Boosting Generative Image Modeling via Joint Image-Feature Synthesis | arXiv: 2504.16064
bootstrap off-policy with world model | arXiv: 2511.00423
born a transformer -- always a transformer on the effect of pretraining on archi | arXiv: 2505.21785
boundary-to-region supervision for offline safe reinforcement learning | arXiv: 2509.25727
Brain Harmony: A Multimodal Foundation Model Unifying Morphology and Function into 1D Tokens | arXiv: 2509.24693
brain-like processing pathways form in models with heterogeneous experts | arXiv: 2506.02813
brain-like variational inference | arXiv: 2410.19315
brain-tuning improves generalizability and efficiency of brain alignment in spee | arXiv: 2510.21520
brainomni a brain foundation model for unified eeg and meg signals | arXiv: 2505.18185
Breaking AR's Sampling Bottleneck: Provable Acceleration via Diffusion Language Models | arXiv: 2505.21400
breaking the compression ceiling data-free pipeline for ultra-efficient delta co | arXiv: 2505.13563
breaking the frozen subspace importance sampling for low-rank optimization in ll | arXiv: 2502.05790
breaking the gradient barrier unveiling large language models for strategic clas | arXiv: 2511.06979
bridgevla input-output alignment for efficient 3d manipulation learning with vis | arXiv: 2506.07961
bridging embodiment gaps deploying vision-language-action models on soft robots | arXiv: 2510.17369
bridging graph and state-space modeling for intensive care unit length of stay p | arXiv: 2508.17554
bridging human and llm judgments understanding and narrowing the gap | arXiv: 2508.12792
bridging symmetry and robustness on the role of equivariance in enhancing advers | arXiv: 2510.16171
broken tokens your language model can secretly handle non-canonical tokenization | arXiv: 2506.19004
BTL-UI: Blink-Think-Link Reasoning Model for GUI Agent | arXiv: 2509.15566
bubbleformer forecasting boiling with transformers | arXiv: 2507.21244
buffer layers for test-time adaptation | arXiv: 2510.21271
burstdeflicker a benchmark dataset for flicker removal in dynamic scenes | arXiv: 2510.09996
c-lora contextual low-rank adaptation for uncertainty estimation in large langua | arXiv: 2505.17773
c-nav towards self-evolving continual object navigation in open world | arXiv: 2510.20685
c2prompt class-aware client knowledge interaction for federated continual learni | arXiv: 2509.19674
c3po cross-view cross-modality correspondence by pointmap prediction | arXiv: 2511.18559
cadmorph geometry-driven parametric cad editing via a plan-generate-verify loop | arXiv: 2512.11480
cam a constructivist view of agentic memory for llm-based reading comprehension | arXiv: 2510.05520
CAMILA: Context-Aware Masking for Image Editing with Language Alignment
camit a time-aware car model dataset for classification and generation | arXiv: 2510.17626
can agents fix agent issues | arXiv: 2505.20749
Can DPO Learn Diverse Human Values? A Theoretical Scaling Law | arXiv: 2408.03459
can knowledge-graph-based retrieval augmented generation really retrieve what yo | arXiv: 2510.16582
can large language models master complex card games | arXiv: 2509.01328
can llms outshine conventional recommenders a comparative evaluation | arXiv: 2503.05493
can llms reason over non-text modalities in a training-free manner a case study | arXiv: 2509.17552
Can LLMs Write Faithfully? An Agent-Based Evaluation of LLM-generated Islamic Content | arXiv: 2510.24438
can multi-modal llms provide live step-by-step task guidance | arXiv: 2511.21998
CAPability: A Comprehensive Visual Caption Benchmark for Evaluating Both Correctness and Thoroughness | arXiv: 2502.14914
capturing individual human preferences with reward features | arXiv: 2503.17338
care-pd a multi-site anonymized clinical dataset for parkinsons disease gait ass | arXiv: 2510.04312
CAS-Spec: Cascade Adaptive Self-Speculative Decoding for On-the-Fly Lossless Inference Acceleration of LLMs | arXiv: 2510.26843
CAT: Circular-Convolutional Attention for Sub-Quadratic Transformers | arXiv: 2504.06704
Causal Head Gating: A Framework for Interpreting Roles of Attention Heads in Transformers | arXiv: 2505.13737
causal masking on spatial data an information-theoretic case for learning spatia | arXiv: 2510.27009
Causal-LLaVA: Causal Disentanglement for Mitigating Hallucination in Multimodal Large Language Models | arXiv: 2505.19474
causaldynamics a large-scale benchmark for structural discovery of dynamical cau | arXiv: 2505.16620
causality meets locality provably generalizable and scalable policy learning for | arXiv: 2510.21427
causality-induced positional encoding for transformer-based representation learn | arXiv: 2509.16629
causally reliable concept bottleneck models | arXiv: 2503.04363
CBMAS: Cognitive Behavioral Modeling via Activation Steering | arXiv: 2601.06109
cdflow building invertible layers with circulant and diagonal matrices | arXiv: 2510.25323
certifying concavity and monotonicity in games via sum-of-squares hierarchies | arXiv: 2512.10292
certifying stability of reinforcement learning policies using generalized lyapun | arXiv: 2505.10947
cgbench benchmarking language model scientific reasoning for clinical genetics r | arXiv: 2510.11985
ChA-MAEViT: Unifying Channel-Aware Masked Autoencoders and Multi-Channel Vision Transformers for Improved Cross-Channel Learning | arXiv: 2503.19331
chain-of-retrieval augmented generation | arXiv: 2501.14342
channel matters estimating channel influence for multivariate time series | arXiv: 2408.14763
characterization and learning of causal graphs from hard interventions | arXiv: 2505.01037
characterizing the expressivity of fixed-precision transformer language models | arXiv: 2505.23623
ChartMuseum: Testing Visual Reasoning Capabilities of Large Vision-Language Models | arXiv: 2505.13444
Checklists Are Better Than Reward Models For Aligning Language Models | arXiv: 2507.18624
chiqpm calibrated hierarchical interpretable image classification | arXiv: 2511.20779
choice benchmarking the remote sensing capabilities of large vision-language mod | arXiv: 2411.18145
chronograph a real-world graph-based multivariate time series dataset | arXiv: 2509.04449
ChunkKV: Semantic-Preserving KV Cache Compression for Efficient Long-Context LLM Inference | arXiv: 2502.00299
classical planning with llm-generated heuristics challenging the state of the ar | arXiv: 2503.18809
claws creativity detection for llm-generated solutions using attention window of | arXiv: 2510.17921
clean first align later benchmarking preference data cleaning for reliable llm a | arXiv: 2509.23564
cleverbirds a multiple-choice benchmark for fine-grained human knowledge tracing | arXiv: 2511.08512
climb class-imbalanced learning benchmark on tabular data | arXiv: 2505.17451
clip-and-verify linear constraint-driven domain clipping for accelerating neural | arXiv: 2512.11087
clipgaussian universal and multimodal style transfer based on gaussian splatting | arXiv: 2505.22854
cloud4d estimating cloud properties at a high spatial and temporal resolution | arXiv: 2511.19431
Co-Evolving LLM Coder and Unit Tester via Reinforcement Learning | arXiv: 2506.03136
Co-Reinforcement Learning for Unified Multimodal Understanding and Generation | arXiv: 2505.17534
CodeAssistBench (CAB): Dataset & Benchmarking for Multi-turn Chat-Based Code Assistance | arXiv: 2507.10646
codecrash exposing llm fragility to misleading natural language in code reasonin | arXiv: 2504.14119
codegemm a codebook-centric approach to efficient gemm in quantized llms | arXiv: 2512.17970
cognitive mirrors exploring the diverse functional roles of attention heads in l | arXiv: 2512.10978
cogvla cognition-aligned vision-language-action model via instruction-driven rou | arXiv: 2508.21046
coido efficient data selection for visual instruction tuning via coupled importa | arXiv: 2510.17847
Collapsing Taylor Mode Automatic Differentiation | arXiv: 2505.13644
collective narrative grounding community-coordinated data contributions to impro | arXiv: 2601.04201
communicating plans not percepts scalable multi-agent coordination with embodied | arXiv: 2508.02912
comparing uniform price and discriminatory multi-unit auctions through regret mi | arXiv: 2510.19591
complexity scaling laws for neural models using combinatorial optimization | arXiv: 2506.12932
compo preference alignment via comparison oracles | arXiv: 2505.05465
composing global solutions to reasoning tasks via algebraic objects in neural ne | arXiv: 2410.01779
Composing Linear Layers from Irreducibles | arXiv: 2507.11688
composite flow matching for reinforcement learning with shifted-dynamics data | arXiv: 2505.23062
Composition and Alignment of Diffusion Models using Constrained Learning | arXiv: 2508.19104
compress gather and recompute reforming long-context processing in transformers | arXiv: 2506.01215
compressing biology evaluating the stable diffusion vae for phenotypic drug disc | arXiv: 2510.19887
Computable Universal Online Learning | arXiv: 2510.18352
computational hardness of reinforcement learning with partial qπ-realizability | arXiv: 2510.21888
concept-level explainability for auditing steering llm responses | arXiv: 2505.07610
conceptscope characterizing dataset bias via disentangled visual concepts | arXiv: 2510.26186
Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations | arXiv: 2510.23607
conditional distribution compression via the kernel conditional mean embedding | arXiv: 2504.10139
Conditional Panoramic Image Generation via Masked Autoregressive Modeling | arXiv: 2505.16862
conformal online learning of deep koopman linear embeddings | arXiv: 2511.12760
conformal prediction for causal effects of continuous treatments | arXiv: 2407.03094
conformal prediction in the loop a feedback-based uncertainty model for trajecto | arXiv: 2510.16376
conformal risk training end-to-end optimization of conformal risk control | arXiv: 2510.08748
confounding robust deep reinforcement learning a causal approach | arXiv: 2510.21110
confrover simultaneous modeling of protein conformation and dynamics via autoreg | arXiv: 2505.17478
conftuner training large language models to express their confidence verbally | arXiv: 2508.18847
Connecting Jensen-Shannon and Kullback-Leibler Divergences: A New Bound for Representation Learning | arXiv: 2510.20644
connecting the dots a machine learning ready dataset for ionospheric forecasting | arXiv: 2511.15743
connectomebench can llms proofread the connectome | arXiv: 2511.05542
consistent sampling and simulation molecular dynamics with energy-based diffusio | arXiv: 2506.17139
consistent supervised-unsupervised alignment for generalized category discovery | arXiv: 2507.04725
Constant Bit-Size Transformers Are Turing Complete | arXiv: 2506.12027
Constrained Discrete Diffusion | arXiv: 2503.09790
Constrained Network Slice Assignment via Large Language Models | arXiv: 2512.00040
Context Informs Pragmatic Interpretation in Vision-Language Models | arXiv: 2511.03908
contextagent context-aware proactive llm agents with open-world sensory percepti | arXiv: 2505.14668
contexttab a semantics-aware tabular in-context learner | arXiv: 2506.10707
Contextual Dynamic Pricing with Heterogeneous Buyers | arXiv: 2512.09513
Contextual Integrity in LLMs via Reasoning and Reinforcement Learning | arXiv: 2506.04245
Contextual Thompson Sampling via Generation of Missing Data | arXiv: 2502.07064
Continual Knowledge Adaptation for Reinforcement Learning | arXiv: 2510.19314
Continual Multimodal Contrastive Learning | arXiv: 2503.14963
Continuous Diffusion Model for Language Modeling | arXiv: 2502.11564
Continuous Simplicial Neural Networks | arXiv: 2503.12919
Continuous Subspace Optimization for Continual Learning | arXiv: 2505.11816
Continuous Thought Machines | arXiv: 2505.05522
continuous uniqueness and novelty metrics for generative modeling of inorganic c | arXiv: 2510.12405
contrastive consolidation of top-down modulations achieves sparsely supervised c | arXiv: 2505.14125
Contrastive Representations for Temporal Reasoning
contribution of task-irrelevant stimuli to drift of neural representations | arXiv: 2510.21588
controlfusion a controllable image fusion framework with language-vision degrada | arXiv: 2503.23356
Controlling Thinking Speed in Reasoning Models | arXiv: 2507.03704
convergence theorems for entropy-regularized and distributional reinforcement le | arXiv: 2510.08526
convis-bench estimating video similarity through semantic concepts | arXiv: 2509.19245
convolutional monge mapping between eeg datasets to support independent componen | arXiv: 2509.01721
coopera continual open-ended human-robot assistance | arXiv: 2510.23495
Cooperative Retrieval-Augmented Generation for Question Answering: Mutual Information Exchange and Ranking by Contrasting Layers | arXiv: 2512.10422
copresheaf topological neural networks a generalized deep learning framework | arXiv: 2505.21251
CORAL: Disentangling Latent Representations in Long-Tailed Diffusion
core benchmarking llms code reasoning capabilities through static analysis tasks | arXiv: 2507.05269
core constraint-aware one-step reinforcement learning for simulation-guided neur | arXiv: 2506.03474
core full-path evaluation of llm agents beyond final state | arXiv: 2509.20998
coreguard safeguarding foundational capabilities of llms against model stealing | arXiv: 2410.13903
coreset for robust geometric median eliminating size dependency on outliers | arXiv: 2510.24621
Coresets for Clustering under Stochastic Noise | arXiv: 2510.23438
Correlation Dimension of Auto-Regressive Large Language Models | arXiv: 2510.21258
COS3D: Collaborative Open-Vocabulary 3D Segmentation | arXiv: 2510.20238
cosmobench a multiscale multiview multitask cosmology benchmark for geometric de | arXiv: 2507.03707
cost efficient fairness audit under partial feedback | arXiv: 2510.03734
cost-sensitive freeze-thaw bayesian optimization for efficient hyperparameter tu | arXiv: 2510.21379
CoT Red-Handed: Stress Testing Chain-of-Thought Monitoring | arXiv: 2505.23575
counteractive rl rethinking core principles for efficient and scalable deep rein | arXiv: 2603.15871
counterfactual identifiability via dynamic optimal transport | arXiv: 2510.08294
counterfactual reasoning for steerable pluralistic value alignment of large lang | arXiv: 2510.18526
coupling generative modeling and an autoencoder with the causal bridge | arXiv: 2509.25599
covariances for free exploiting mean distributions for training-free federated l | arXiv: 2412.14326
CovMatch: Cross-Covariance Guided Multimodal Dataset Distillation with Trainable Text Encoder | arXiv: 2510.18583
cpep contrastive pose-emg pre-training enhances gesture generalization on emg si | arXiv: 2509.04699
cpret a dataset benchmark and model for retrieval in competitive programming | arXiv: 2505.12925
CQ-DINO: Mitigating Gradient Dilution via Category Queries for Vast Vocabulary Object Detection | arXiv: 2503.18430
Creativity or Brute Force? Using Brainteasers as a Window into the Problem-Solving Abilities of Large Language Models | arXiv: 2505.10844
Critical Batch Size Revisited: A Simple Empirical Approach to Large-Batch Language Model Training | arXiv: 2505.23971
cross-fluctuation phase transitions reveal sampling dynamics in diffusion models | arXiv: 2511.00124
Crucible: Quantifying the Potential of Control Algorithms through LLM Agents | arXiv: 2510.18491
cryptomoe privacy-preserving and scalable mixture of experts inference via balan | arXiv: 2511.01197
CTRL-ALT-DECEIT: Sabotage Evaluations for Automated AI R&D | arXiv: 2511.09904
cue3d quantifying the role of image cues in single-image 3d generation | arXiv: 2511.22121
cultural alien sampler open-ended art generation balancing originality and coher | arXiv: 2510.20849
cumolos-mae a masked autoencoder for remote sensing data reconstruction | arXiv: 2508.14957
cureagent a training-free executor-analyst framework for clinical reasoning | arXiv: 2512.05576
curiosity-driven rl for symbolic equation solving | arXiv: 2510.17022
curly flow matching for learning non-gradient field dynamics | arXiv: 2510.26645
Curriculum Abductive Learning | arXiv: 2505.12275
curvature tuning provable training-free model steering from a single parameter | arXiv: 2502.07783
cxreasonbench a benchmark for evaluating structured diagnostic reasoning in ches | arXiv: 2505.18087
cycle-sync robust global camera pose estimation through enhanced cycle-consisten | arXiv: 2511.02329
cyclic counterfactuals under shift-scale interventions | arXiv: 2510.25005
cyin cyclic informative latent space for bridging complete and incomplete multim | arXiv: 2602.04920
cymbadiff structured spatial diffusion for sketch-based 3d semantic urban scene | arXiv: 2510.13245
d2ust3r enhancing 3d reconstruction for dynamic scenes | arXiv: 2504.06264
DanmakuTPPBench: A Multi-modal Benchmark for Temporal Point Process Modeling and Understanding
dartquant efficient rotational distribution calibration for llm quantization | arXiv: 2511.04063
data efficient adaptation in large language models via continuous low-rank fine- | arXiv: 2509.18942
data-juicer 2.0 cloud-scale adaptive data processing for and with foundation mode | arXiv: 2501.14755
datarater meta-learned dataset curation | arXiv: 2505.17895
dataset distillation for pre-trained self-supervised vision models | arXiv: 2511.16674
DATE-LM: Benchmarking Data Attribution Evaluation for Large Language Models | arXiv: 2507.09424
dbloss decomposition-based loss function for time series forecasting | arXiv: 2510.23672
dc4gs directional consistency-driven adaptive density control for 3d gaussian sp | arXiv: 2510.26921
dca graph-guided deep embedding clustering for brain atlases | arXiv: 2509.01426
dcad-2000 a multilingual dataset across 2000 languages with data cleaning as ano | arXiv: 2502.11546
dccluster-opt benchmarking dynamic multi-objective optimization for geo-distribu | arXiv: 2511.00117
de novo generation of functional terpene synthases using tpsgpt | arXiv: 2512.08772
Debate or Vote: Which Yields Better Decisions in Multi-Agent Large Language Models? | arXiv: 2508.17536
decaflow a deconfounding causal generative model | arXiv: 2503.15114
deceptron learned local inverses for fast and stable physics inversion | arXiv: 2511.21076
Decoder-Hybrid-Decoder Architecture for Efficient Reasoning with Long Generation | arXiv: 2507.06607
decomate leveraging generative models for co-creative svg animation | arXiv: 2511.06297
Decomposition of Small Transformer Models | arXiv: 2511.08854
Decoupled Entropy Minimization | arXiv: 2511.03256
deep compositional phase diffusion for long motion sequence generation | arXiv: 2510.14427
deep continuous-time state-space models for marked event sequences | arXiv: 2412.19634
deep learning for continuous-time stochastic control with jumps | arXiv: 2505.15602
Deep Legendre Transform | arXiv: 2512.19649
Deep Modularity Networks with Diversity-Preserving Regularization | arXiv: 2501.13451
Deep Research Brings Deeper Harm | arXiv: 2510.11851
deep rl needs deep behavior analysis exploring implicit planning by model-free a | arXiv: 2506.06981
deep taxonomic networks for unsupervised hierarchical prototype discovery | arXiv: 2509.23602
deep value benchmark measuring whether models generalize deep values or shallow | arXiv: 2511.02109
Deep Video Discovery: Agentic Search with Tool Use for Long-form Video Understanding | arXiv: 2505.18079
deepasa an object-oriented multi-purpose network for auditory scene analysis | arXiv: 2509.17247
deepdiver adaptive search intensity scaling via open-web reinforcement learning | arXiv: 2505.24332
deeppersona a generative engine for scaling deep synthetic personas | arXiv: 2511.07338
deeptraverse a depth-first search inspired network for algorithmic visual unders | arXiv: 2506.10084
DeepVideo-R1: Video Reinforcement Fine-Tuning via Difficulty-aware Regressive GRPO | arXiv: 2506.07464
defenderbench a toolkit for evaluating language agents in cybersecurity environm | arXiv: 2506.00739
DEFT: Decompositional Efficient Fine-Tuning for Text-to-Image Models | arXiv: 2509.22793
deliberation on priors trustworthy reasoning of large language models on knowled | arXiv: 2505.15210
deltaflow an efficient multi-frame scene flow estimation method | arXiv: 2508.17054
deltaphi physical states residual learning for neural operators in data-limited | arXiv: 2406.09795
deltaproduct improving state-tracking in linear rnns via householder products | arXiv: 2502.10297
delving into cascaded instability a lipschitz continuity view on image restorati | arXiv: 2510.24232
demandcast global hourly electricity demand forecasting | arXiv: 2510.08000
demo generative ai helps radiotherapy planning with user preference | arXiv: 2512.08996
demo guide-rag evidence-driven corpus curation for retrieval-augmented generatio | arXiv: 2510.15782
demystifying language model forgetting with low-rank example associations | arXiv: 2406.14026
demystifying spectral feature learning for instrumental variable regression | arXiv: 2506.10899
denoiserotator enhance pruning robustness for llms via importance concentration | arXiv: 2505.23049
denoising weak lensing mass maps with diffusion model and generative adversarial | arXiv: 2511.16415
dense associative memory with epanechnikov energy | arXiv: 2506.10801
dense backpropagation improves training for sparse mixture-of-experts | arXiv: 2504.12463
Dense SAE Latents Are Features, Not Bugs | arXiv: 2506.15679
DenseDPO: Fine-Grained Temporal Preference Optimization for Video Diffusion Models | arXiv: 2506.03517
dependency parsing is more parameter-efficient with normalization | arXiv: 2505.20215
depth-bounds for neural networks via the braid arrangement | arXiv: 2502.09324
depth-supervised fusion network for seamless-free image stitching | arXiv: 2510.21396
dermacon-in a multi-concept annotated dermatological image dataset of indian ski | arXiv: 2506.06099
design encrypted gnn inference via server-side input graph pruning | arXiv: 2507.05649
designx human-competitive algorithm designer for black-box optimization | arXiv: 2505.17866
detecting generated images by fitting natural image distributions | arXiv: 2511.01293
Detecting High-Stakes Interactions with Activation Probes | arXiv: 2506.10805
detection and simulation of urban heat islands using a fine-tuned geospatial fou | arXiv: 2510.18773
detectiumfire a comprehensive multi-modal dataset bridging vision and language f | arXiv: 2511.02495
deterministic continuous replacement fast and stable module replacement in pretr | arXiv: 2511.18670
detree detecting human-ai collaborative texts via tree-structured hierarchical r | arXiv: 2510.17489
devfd developmental face forgery detection by learning shared and orthogonal lor | arXiv: 2509.19230
dexflywheel a scalable and self-improving data generation framework for dexterou | arXiv: 2509.23829
dexter diffusion-guided explanations with textual reasoning for vision models | arXiv: 2510.14741
dgh dynamic gaussian hair | arXiv: 2512.17094
Diagnosing and Addressing Pitfalls in KG-RAG Datasets: Toward More Reliable Benchmarking | arXiv: 2505.23495
dice discrete interpretable comparative evaluation with probabilistic scoring fo | arXiv: 2512.22629
DiCo: Revitalizing ConvNets for Scalable and Efficient Diffusion Modeling | arXiv: 2505.11196
dictpfl efficient and private federated learning on encrypted gradients | arXiv: 2510.21086
diff-icmh harmonizing machine and human vision in image compression with generat | arXiv: 2511.22549
Differentiable Hierarchical Visual Tokenization | arXiv: 2511.02652
differentiable structure learning and causal discovery for general binary data | arXiv: 2509.21658
differential privacy for euclidean jordan algebra with applications to private s | arXiv: 2509.16915
differentially private bilevel optimization efficient algorithms with near-optim | arXiv: 2506.12994
differentially private federated low rank adaptation beyond fixed-matrix | arXiv: 2507.09990
differentially private high-dimensional variable selection via integer programmi | arXiv: 2510.22062
diffeye diffusion-based continuous eye-tracking data generation conditioned on n | arXiv: 2509.16767
Diffusion Adaptive Text Embedding for Text-to-Image Diffusion Models | arXiv: 2510.23974
Diffusion Classifiers Understand Compositionality, but Conditions Apply | arXiv: 2505.17955
diffusion generative modeling on lie group representations | arXiv: 2502.02513
Diffusion Model as a Noise-Aware Latent Reward Model for Step-Level Preference Optimization | arXiv: 2502.01051
diffusion models meet contextual bandits | arXiv: 2402.10028
diffusion transformers as open-world spatiotemporal foundation models | arXiv: 2411.12164
diffusion transformers for imputation statistical efficiency and uncertainty qua | arXiv: 2510.02216
diffusion-based electromagnetic inverse design of scattering structured media | arXiv: 2511.05357
diffusion-classifier synergy reward-aligned learning via mutual boosting loop fo | arXiv: 2510.03608
diffusion-driven progressive target manipulation for source-free domain adaptati | arXiv: 2510.25279
diffusion-driven two-stage active learning for low-budget semantic segmentation | arXiv: 2510.22229
DINO-Foresight: Looking into the Future with DINO | arXiv: 2412.11673
directional non-commutative monoidal structures for compositional embeddings in | arXiv: 2505.15507
disaggregation reveals hidden training dynamics the case of agreement attraction | arXiv: 2510.24934
DISC: Dynamic Decomposition Improves LLM Inference Scaling | arXiv: 2502.16706
DisCO: Reinforcing Large Reasoning Models with Discriminative Constrained Optimization | arXiv: 2505.12366
discover automated curricula for sparse-reward reinforcement learning | arXiv: 2505.19850
discovering transformer circuits via a hybrid attribution and pruning framework | arXiv: 2510.03282
disentangled concepts speak louder than words explainable video action recogniti | arXiv: 2511.03725
disentangling hyperedges through the lens of category theory | arXiv: 2510.16289
disentangling latent shifts of in-context learning with weak supervision | arXiv: 2410.01508
DisMo: Disentangled Motion Representations for Open-World Motion Transfer | arXiv: 2511.23428
dison decentralized isolation networks for out-of-distribution detection in medi | arXiv: 2506.09024
Distillation Robustifies Unlearning | arXiv: 2506.06278
Distilled Decoding 2: One-step Sampling of Image Auto-regressive Models with Conditional Score Distillation
Distilling LLM Agent into Small Models with Retrieval and Code Tools | arXiv: 2505.17612
distribution learning meets graph structure sampling | arXiv: 2405.07914
distributional adversarial attacks and training in deep hedging | arXiv: 2508.14757
Distributional Autoencoders Know the Score | arXiv: 2502.11583
Distributionally Robust Feature Selection | arXiv: 2510.21113
distributive fairness in large language models evaluating alignment with human v | arXiv: 2502.00313
ditch the denoiser emergence of noise robustness in self-supervised learning fro | arXiv: 2505.12191
DitHub: A Modular Framework for Incremental Open-Vocabulary Object Detection | arXiv: 2503.09271
dna-detectllm unveiling ai-generated text via a dna-inspired mutation-repair par | arXiv: 2509.15550
Do Different Prompting Methods Yield a Common Task Representation in Language Models? | arXiv: 2505.12075
Do Language Models Use Their Depth Efficiently? | arXiv: 2505.13898
do neural networks need gradient descent to generalize a theoretical study | arXiv: 2506.03931
do-pfn in-context learning for causal effect estimation | arXiv: 2506.06039
doctor approved generating medically accurate skin disease images through ai-exp | arXiv: 2506.12323
Document Summarization with Conformal Importance Guarantees | arXiv: 2509.20461
does object binding naturally emerge in large pretrained vision transformers | arXiv: 2510.24709
Does Thinking More Always Help? Mirage of Test-Time Scaling in Reasoning Models | arXiv: 2506.04210
domain-adapted granger causality for real-time cross-slice attack attribution in | arXiv: 2510.05165
domain-adaptive transformer for data-efficient glioma segmentation in sub-sahara | arXiv: 2511.02928
Don't Be Lazy: CompleteP Enables Compute-Efficient Deep Transformers | arXiv: 2505.01618
Don't Let It Fade: Preserving Edits in Diffusion Language Models via Token Timestep Allocation
don't just chase highlighted tokens in mllms revisiting visual holistic context r | arXiv: 2510.02912
DOTA: Distributional Test-Time Adaptation of Vision-Language Models | arXiv: 2409.19375
double descent meets out-of-distribution detection theoretical insights and empi | arXiv: 2411.02184
Doubly Robust Alignment for Large Language Models | arXiv: 2506.01183
dove efficient one-step diffusion model for real-world video super-resolution | arXiv: 2505.16239
dp-llm runtime model adaptation with dynamic layer-wise precision assignment | arXiv: 2508.06041
dp2o-sr direct perceptual preference optimization for real-world image super-res | arXiv: 2510.18851
dpa a one-stop metric to measure bias amplification in classification datasets | arXiv: 2412.11060
dragon guard llm unlearning in context via negative detection and reasoning | arXiv: 2511.05784
dreamprm domain-reweighted process reward model for multimodal reasoning | arXiv: 2505.20241
DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agents | arXiv: 2506.12104
DriveDPO: Policy Learning via Safety DPO For End-to-End Autonomous Driving | arXiv: 2509.17940
dsas a universal plug-and-play framework for attention optimization in multi-doc | arXiv: 2510.12251
dual data alignment makes ai-generated image detector easier generalizable | arXiv: 2505.14359
dual mixture-of-experts framework for discrete-time survival analysis | arXiv: 2510.26014
dual-flow transferable multi-target instance-agnostic attacks via in-the-wild ca | arXiv: 2502.02096
dualfocus depth from focus with spatio-focal dual variational constraints | arXiv: 2509.21992
duetgraph coarse-to-fine knowledge graph reasoning with dual-pathway global-loca | arXiv: 2507.11229
duogpt training-free dual sparsity through activation-aware pruning in llms | arXiv: 2506.20194
duolens a framework for robust detection of machine-generated multilingual text | arXiv: 2510.18904
dyg-mamba continuous state space modeling on dynamic graphs | arXiv: 2408.06966
dynaact large language model reasoning with dynamic action spaces | arXiv: 2511.08043
dynaguide steering diffusion polices with active dynamic guidance | arXiv: 2506.13922
dynamic algorithm for explainable k-medians clustering under lp norm | arXiv: 2512.01150
dynamic bundling with large language models for zero-shot inference on text-attr | arXiv: 2505.17599
dynamic causal discovery in alzheimer's disease through latent pseudotime modelli | arXiv: 2511.04619
dynamic diffusion schrödinger bridge in astrophysical observational inversions | arXiv: 2506.08065
dynamic features adaptation in networking toward flexible training and explainab | arXiv: 2510.08303
dynamic gaussian splatting from defocused and motion-blurred monocular videos | arXiv: 2510.10691
Dynamic Regret Reduces to Kernelized Static Regret | arXiv: 2507.05478
dynamics of spontaneous topic changes in next token prediction with self-attenti | arXiv: 2501.06382
dynamics-aligned latent imagination in contextual world models for zero-shot gen | arXiv: 2508.20294
dynamicvl benchmarking multimodal large language models for dynamic city underst | arXiv: 2505.21076
dynanav dynamic feature and layer selection for efficient visual navigation | arXiv: 2509.21930
dynarend learning 3d dynamics via masked future rendering for robotic manipulati | arXiv: 2510.24261
e-bats efficient backpropagation-free test-time adaptation for speech foundation | arXiv: 2506.07078
e-moflow learning egomotion and optical flow from event data via implicit regula | arXiv: 2510.12753
e2e-vguard adversarial prevention for production llm-based end-to-end speech syn | arXiv: 2511.07099
ea3d online open-world 3d object extraction from streaming videos | arXiv: 2510.25146
eag3r event-augmented 3d geometry estimation for dynamic and extreme-lighting sc | arXiv: 2512.00771
echoes of humanity exploring the perceived humanness of ai music | arXiv: 2509.25601
ecocast a spatio-temporal model for continual biodiversity and climate risk fore | arXiv: 2512.02260
edbench large-scale electron density data for molecular modeling | arXiv: 2505.09262
eddyformer accelerated neural simulations of three-dimensional turbulence at sca | arXiv: 2510.24173
edit less achieve more dynamic sparse neuron masking for lifelong knowledge edit | arXiv: 2510.22139
editinfinity image editing with binary-quantized generative models | arXiv: 2510.20217
eegrexfernet a lightweight gen-ai framework for eeg subspace reconstruction via | arXiv: 2511.02848
ef-3dgs event-aided free-trajectory 3d gaussian splatting | arXiv: 2410.15392
effective policy learning for multi-agent online coordination beyond submodular | arXiv: 2509.22596
Efficient Adaptive Experimentation with Noncompliance | arXiv: 2505.17468
Efficient Adaptive Federated Optimization | arXiv: 2410.18117
Efficient Fairness-Performance Pareto Front Computation | arXiv: 2409.17643
efficient federated learning against byzantine attacks and data heterogeneity vi | arXiv: 2408.09539
efficient kernelized learning in polyhedral games beyond full-information from c | arXiv: 2509.20919
efficient multi-modal large language models via progressive consistency distilla | arXiv: 2510.00515
efficient parametric svd of koopman operator for stochastic dynamical systems | arXiv: 2507.07222
efficient pre-training of llms via topology-aware communication alignment on mor | arXiv: 2509.15940
efficient rectified flow for image fusion | arXiv: 2509.16549
efficient semantic uncertainty quantification in language models via diversity-s | arXiv: 2510.21310
efficient speech language modeling via energy distance in continuous latent spac | arXiv: 2505.13181
efficient training-free online routing for high-volume multi-llm serving | arXiv: 2509.02718
efficient verified machine unlearning for distillation | arXiv: 2503.22539
efficient vision-language reasoning via adaptive token pruning | arXiv: 2512.12701
EfficientNav: Towards On-Device Object-Goal Navigation with Navigation Map Caching and Retrieval | arXiv: 2510.18546
egobridge domain adaptation for generalizable imitation from egocentric human da | arXiv: 2509.19626
egoemotion egocentric vision and physiological signals for emotion and personali | arXiv: 2510.22129
EgoThinker: Unveiling Egocentric Reasoning with Spatio-Temporal CoT
elastic vits from pretrained models without retraining | arXiv: 2510.17700
elastic weight consolidation for knowledge graph continual learning an empirical | arXiv: 2512.01890
elasticmm efficient multimodal llms serving with elastic multimodal parallelism | arXiv: 2507.10069
electra a cartesian network for 3d charge density prediction with floating orbit | arXiv: 2503.08305
elucidated rolling diffusion models for probabilistic forecasting of complex dyn | arXiv: 2506.20024
embedding alignment in code generation for audio | arXiv: 2508.05473
Emergence and Evolution of Interpretable Concepts in Diffusion Models
emergence and scaling laws in sgd learning of shallow neural networks | arXiv: 2504.19983
emergence of linear truth encodings in language models | arXiv: 2510.15804
emergency response measures for catastrophic ai risk | arXiv: 2511.05526
emergent world beliefs exploring transformers in stochastic games | arXiv: 2512.23722
emloc emulator-based memory-efficient fine-tuning with lora correction | arXiv: 2506.12015
empathia multi-faceted human-ai collaboration for refugee integration | arXiv: 2508.07671
empirical study on robustness and resilience in cooperative multi-agent reinforc | arXiv: 2510.11824
Empower Words: DualGround for Structured Phrase and Sentence-Level Temporal Grounding
empowering decision trees via shape function branching | arXiv: 2510.19040
enabling differentially private federated learning for speech recognition benchm | arXiv: 2310.00098
encoder-decoder diffusion language models for efficient training and inference | arXiv: 2510.22852
encoding and understanding astrophysical information in large language model-gen | arXiv: 2511.14685
EnCompass: Enhancing Agent Programming with Search Over Program Execution Paths | arXiv: 2512.03571
endobench a comprehensive evaluation of multi-modal large language models for en | arXiv: 2505.23601
Energy Loss Functions for Physical Systems | arXiv: 2511.02087
energy matching unifying flow matching and energy-based models for generative mo | arXiv: 2504.10612
enerverse envisioning embodied future space for robotics manipulation | arXiv: 2501.01895
enforcing governing equation constraints in neural pde solvers via training-free | arXiv: 2511.17258
enginuity building an open multi-domain dataset of complex engineering diagrams | arXiv: 2601.13299
Enhancing CLIP Robustness via Cross-Modality Alignment | arXiv: 2510.24038
Enhancing Compositional Reasoning in CLIP via Reconstruction and Alignment of Text Descriptions | arXiv: 2510.16540
enhancing demand-oriented regionalization with agentic ai and local heterogeneou | arXiv: 2511.10857
enhancing diffusion model guidance through calibration and regularization | arXiv: 2511.05844
enhancing graph classification robustness with singular pooling | arXiv: 2510.22643
Enhancing Infrared Vision: Progressive Prompt Fusion Network and Benchmark | arXiv: 2510.09343
enhancing interpretability in deep reinforcement learning through semantic clust | arXiv: 2409.17411
enhancing multilingual llm pretraining with model-based data selection | arXiv: 2502.10361
enhancing sample selection against label noise by cutting mislabeled easy exampl | arXiv: 2502.08227
enhancing semi-supervised learning with zero-shot pseudolabels | arXiv: 2502.12584
Enhancing Temporal Understanding in Video-LLMs through Stacked Temporal Attention in Vision Encoders
enhancing the outcome reward-based rl training of mllms with self-consistency sa | arXiv: 2511.10648
enhancing training data attribution with representational optimization | arXiv: 2505.18513
Enhancing Vision-Language Model Reliability with Uncertainty-Guided Dropout Decoding | arXiv: 2412.06474
entropy rectifying guidance for diffusion and flow models | arXiv: 2504.13987
environment inference for learning generalizable dynamical system | arXiv: 2510.19784
epistemic uncertainty for generated image detection | arXiv: 2412.05897
equivariance by contrast identifiable equivariant embeddings from unlabeled fini | arXiv: 2510.21706
equivariant flow matching for symmetry-breaking bifurcation problems | arXiv: 2509.03340
esca contextualizing embodied agents via scene-graph generation | arXiv: 2510.15963
escaping saddle points without lipschitz smoothness the power of nonlinear preco | arXiv: 2509.15817
establishing linear surrogate regret bounds for convex smooth losses via convolu | arXiv: 2505.09432
estimating hitting times locally at scale | arXiv: 2511.04343
estimation of stochastic optimal transport maps | arXiv: 2512.09499
ethics statements in ai music papers the effective and the ineffective | arXiv: 2509.25496
eu-agent-bench measuring illegal behavior of llm agents under eu law | arXiv: 2510.21524
eugens efficient unified and general dense layers | arXiv: 2410.09771
eurospeech a multilingual speech corpus | arXiv: 2510.00514
evalearn quantifying the learning capability and efficiency of llms via sequenti | arXiv: 2506.02672
evaluating in silico creativity an expert review of ai chess compositions | arXiv: 2510.23772
evaluating llms for combinatorial optimization one-phase and two-phase heuristic | arXiv: 2509.22255
Evaluating LLMs in Open-Source Games | arXiv: 2512.00371
evaluating multimodal large language models on core music perception tasks | arXiv: 2510.22455
evaluating multiple models using labeled and unlabeled data | arXiv: 2501.11866
evaluating the evaluators metrics for compositional text-to-image generation | arXiv: 2509.21227
evaluating the promise and pitfalls of llms in hiring decisions | arXiv: 2507.02087
evaluation of vision-llms in surveillance video | arXiv: 2510.23190
every camera effect every time all at once 4d gaussian ray tracing for physics-b | arXiv: 2509.10759
evobrain dynamic multi-channel eeg graph modeling for time-evolving brain networ | arXiv: 2509.15857
evodiff entropy-aware variance optimized diffusion inference | arXiv: 2509.26096
evolm in search of lost language model training dynamics | arXiv: 2506.16029
evolutionary learning in spatial agent-based models for physical climate risk as | arXiv: 2509.18633
evolutionary prediction games | arXiv: 2503.03401
evolve to inspire novelty search for diverse image generation | arXiv: 2511.00686
evorefuse evolutionary prompt optimization for evaluation and mitigation of llm | arXiv: 2505.23473
ewc-guided diffusion replay for exemplar-free continual learning in medical imag | arXiv: 2509.23906
exact and linear convergence for federated learning under arbitrary client parti | arXiv: 2503.20117
exact expressive power of transformers with padding | arXiv: 2505.18948
exact learning of arithmetic with differentiable agents | arXiv: 2511.22751
exgra-med extended context graph alignment for medical vision-language models | arXiv: 2410.02615
exoplanet formation inference using conditional invertible neural networks | arXiv: 2512.05751
explaining and mitigating crosslingual tokenizer inequities | arXiv: 2510.21909
explaining similarity in vision-language encoders with weighted banzhaf interact | arXiv: 2508.05430
exploiting task relationships in continual learning via transferability-aware ta | arXiv: 2502.11609
exploiting vocabulary frequency imbalance in language model pre-training | arXiv: 2508.15390
exploration of incremental synthetic non-morphed images for single morphing atta | arXiv: 2510.09836
exploration via feature perturbation in contextual bandits | arXiv: 2510.17390
exploration with foundation models capabilities limitations and hybrid approache | arXiv: 2509.19924
exploring and leveraging class vectors for classifier editing | arXiv: 2510.11268
exploring landscapes for better minima along valleys | arXiv: 2510.27153
exploring neural granger causality with xlstms unveiling temporal dependencies i | arXiv: 2502.09981
exploring semantic-constrained adversarial example with instruction uncertainty | arXiv: 2510.22981
exploring structural degradation in dense representations for self-supervised le | arXiv: 2510.17299
exploring the limits of strong membership inference attacks on large language mo | arXiv: 2505.18773
exploring the translation mechanism of large language models | arXiv: 2502.11806
exploring variational graph autoencoders for distribution grid data generation | arXiv: 2509.02469
expo unlocking hard reasoning with self-explanation-guided reinforcement learnin | arXiv: 2507.02834
extending ngu to multi-agent rl a preliminary study | arXiv: 2512.01321
extragradient method for (l0, l1)-lipschitz root-finding problems | arXiv: 2510.22421
extremely simple multimodal outlier synthesis for out-of-distribution detection | arXiv: 2505.16985
Eyes Wide Open: Ego Proactive Video-LLM for Streaming Video | arXiv: 2510.14560
f-adapter frequency-adaptive parameter-efficient fine-tuning in scientific machi | arXiv: 2509.23173
face a general framework for mapping collaborative filtering embeddings into llm | arXiv: 2510.15729
face faithful automatic concept extraction | arXiv: 2510.11675
face-human-bench a comprehensive benchmark of face and human understanding for m | arXiv: 2501.01243
fact faithful concept traces for explaining neural network decisions | arXiv: 2510.25512
factor decorrelation enhanced data removal from deep predictive models | arXiv: 2509.23443
failure prediction at runtime for generative robot policies | arXiv: 2510.09459
fair minimum labeling efficient temporal network activations for reachability an | arXiv: 2510.03899
fair representation learning with controllable high confidence guarantees via ad | arXiv: 2510.21017
fair universe higgsml uncertainty dataset and competition | arXiv: 2410.02867
faircontrast enhancing fairness through contrastive learning and customized augm | arXiv: 2510.02017
fairgrpo fair reinforcement learning for equitable clinical reasoning | arXiv: 2510.19893
fairimagen post-processing for bias mitigation in text-to-image models | arXiv: 2510.21363
fairness under competition | arXiv: 2505.16291
fairness-regularized online optimization with switching costs | arXiv: 2512.11131
faithful group shapley value | arXiv: 2505.19013
faithful summarization of consumer health queries a cross-lingual framework with | arXiv: 2511.10768
falcon an ml framework for fully automated layout-constrained analog circuit des | arXiv: 2505.21923
falcon few-step accurate likelihoods for continuous flows | arXiv: 2512.09914
falcon fine-grained activation manipulation by contrastive orthogonal unalignmen | arXiv: 2502.01472
falqon accelerating lora fine-tuning with low-bit floating-point arithmetic | arXiv: 2510.24061
fantastic features and where to find them a probing method to combine features f | arXiv: 2512.01405
fapex fractional amplitude-phase expressor for robust cross-subject seizure pred | arXiv: 2511.03263
far from the shallow brain-predictive reasoning embedding through residual disen | arXiv: 2510.22860
fast and fluent diffusion language models via convolutional decoding and rejecti | arXiv: 2509.15188
fast data attribution for text-to-image models | arXiv: 2511.10721
fast foreground-aware diffusion with accelerated sampling trajectory for segment | arXiv: 2509.20295
fast solvers for discrete diffusion models theory and applications of high-order | arXiv: 2502.00234
fastdinov2 frequency based curriculum learning improves robustness and training | arXiv: 2507.03779
faster algorithm for structured john ellipsoid computation | arXiv: 2211.14407
fastjam a fast joint alignment model for images | arXiv: 2510.22842
fastlongspeech enhancing large speech-language models for efficient long-speech | arXiv: 2507.14815
FastVID: Dynamic Density Pruning for Fast Video Large Language Models
feat free energy estimators with adaptive transport | arXiv: 2504.11516
feature-aware modulation for learning from temporal tabular data | arXiv: 2512.03678
fedfact a provable framework for controllable group-fairness calibration in fede | arXiv: 2506.03777
fedqs optimizing gradient and model aggregation for semi-asynchronous federated | arXiv: 2510.07664
fedrain-lite federated reinforcement algorithms for improving idealised numerica | arXiv: 2508.14315
fedrts federated robust pruning via combinatorial thompson sampling | arXiv: 2501.19122
fedrw efficient privacy-preserving data reweighting for enhancing federated lear | arXiv: 2511.07505
fedsvd adaptive orthogonalization for private federated learning with lora | arXiv: 2505.12805
feel-good thompson sampling for contextual bandits a markov chain monte carlo sh | arXiv: 2507.15290
ferretnet efficient synthetic image detection via local pixel dependencies | arXiv: 2509.20890
few-shot knowledge distillation of llms with counterfactual explanations | arXiv: 2510.21631
few-shot learning from gigapixel images via hierarchical vision-language alignme | arXiv: 2505.17982
fgbench a dataset and benchmark for molecular property reasoning at functional g | arXiv: 2508.01055
fin3r fine-tuning feed-forward 3d reconstruction models via monocular knowledge | arXiv: 2511.22429
final-model-only data attribution with a unifying view of gradient-based methods | arXiv: 2412.03906
financial instruction following evaluation fife | arXiv: 2512.08965
find your needle small object image retrieval via multi-object attention optimiz | arXiv: 2503.07038
finding structure in continual learning | arXiv: 2602.04555
finegrain evaluating failure modes of text-to-image models with vision language | arXiv: 2512.02161
FineRS: Fine-grained Reasoning and Segmentation of Small Objects with Reinforcement Learning | arXiv: 2510.21311
finite-sample analysis of policy evaluation for robust average reward reinforcem | arXiv: 2502.16816
finite-time analysis of stochastic nonconvex nonsmooth optimization on the riema | arXiv: 2510.21468
fiper factorized features for robust image super-resolution and compression | arXiv: 2410.18083
fira can we achieve full-rank training of llms under low-rank constraint | arXiv: 2410.01623
firegnn neuro-symbolic graph neural networks with trainable fuzzy rules for inte | arXiv: 2509.10510
First SFT, Second RL, Third UPT: Continual Improving Multi-Modal LLM Reasoning via Unsupervised Post-Training | arXiv: 2505.22453
firstaidqa a synthetic dataset for first aid and emergency response in low-conne | arXiv: 2511.01289
fixed-point rnns interpolating from diagonal to dense | arXiv: 2503.10799
flarex a physics-informed dataset for lens flare removal via 2d synthesis and 3d | arXiv: 2510.09995
flashmd long-stride universal prediction of molecular dynamics | arXiv: 2505.19350
flatness is necessary neural collapse is not rethinking generalization via grokk | arXiv: 2509.17738
flatten graphs as sequences transformers are scalable graph generators | arXiv: 2502.02216
flattening hierarchies with policy bootstrapping | arXiv: 2505.14975
flex-judge text-only reasoning unleashes zero-shot multimodal evaluators | arXiv: 2505.18601
flexac towards flexible control of associative reasoning in multimodal large lan | arXiv: 2510.11190
flexevent towards flexible event-frame object detection at varying operational f | arXiv: 2412.06708
flow density control generative optimization beyond entropy-regularized fine-tun | arXiv: 2511.22640
flow matching neural processes | arXiv: 2512.23853
flow matching-based autonomous driving planning with advanced interactive behavi | arXiv: 2510.11083
FlowCut: Rethinking Redundancy via Information Flow for Efficient Vision-Language Models | arXiv: 2505.19536
FlowMoE: A Scalable Pipeline Scheduling Framework for Distributed MoE Training | arXiv: 2510.00207
flux efficient descriptor-driven clustered federated learning under arbitrary di | arXiv: 2511.22305
flux4d flow-based unsupervised 4d reconstruction | arXiv: 2512.03210
flylora boosting task decoupling and parameter efficiency via implicit rank-wise | arXiv: 2510.08396
flysearch exploring how vision-language models explore | arXiv: 2506.02896
focalcodec low-bitrate speech coding via focal modulation networks | arXiv: 2502.04465
focus internal mllm representations for efficient fine-grained visual question a | arXiv: 2506.21710
force prompting video generation models can learn and generalize physics-based c | arXiv: 2505.19386
forcevla enhancing vla models with a force-aware moe for contact-rich manipulati | arXiv: 2505.22159
forecasting in offline reinforcement learning for non-stationary environments | arXiv: 2512.01987
forensichub a unified benchmark codebase for all-domain fake image detection and | arXiv: 2505.11003
Foresight: Adaptive Layer Reuse for Accelerated and High-Quality Text-to-Video Generation
fostering the ecosystem of ai for social impact requires expanding and strengthe | arXiv: 2510.18238
foundation cures personalization improving personalized models prompt consistenc | arXiv: 2411.15277
foundation models as world models a foundational study in text-based gridworlds | arXiv: 2509.15915
foundation models for scientific discovery from paradigm enhancement to paradigm | arXiv: 2510.15280
foxes a framework for operational x-ray emission synthesis | arXiv: 2510.22801
fractalbench diagnosing visual-mathematical reasoning through recursive program | arXiv: 2511.06522
fractional diffusion bridge models | arXiv: 2511.01795
freqpolicy efficient flow-based visuomotor policy via frequency consistency | arXiv: 2506.08822
frequency matters when time series foundation models fail under spectral shift | arXiv: 2511.05619
frequency-aware token reduction for efficient vision transformer | arXiv: 2511.21477
friren beyond trajectories -- a spectral lens on time | arXiv: 2505.17370
from average-iterate to last-iterate convergence in games a reduction and its ap | arXiv: 2506.03464
from black box to biomarker sparse autoencoders for interpreting speech models o | arXiv: 2507.16836
from black hole to galaxy neural operator framework for accretion and feedback d | arXiv: 2512.01576
from black-box to causal-box towards building more interpretable models | arXiv: 2510.21998
from cradle to cane a two-pass framework for high-fidelity lifespan face aging | arXiv: 2506.20977
from flat to hierarchical extracting sparse representations with matching pursui | arXiv: 2506.03093
from generation to attribution music ai agent architectures for the post-streami | arXiv: 2510.20276
from images to physics probabilistic inference of galaxy parameters and emission | arXiv: 2511.12737
from information to generative exponent learning rate induces phase transitions | arXiv: 2510.21020
from judgment to interference early stopping llm harmful outputs via streaming c | arXiv: 2506.09996
from linear to nonlinear provable weak-to-strong generalization through feature | arXiv: 2510.24812
from objects to anywhere a holistic benchmark for multi-level visual grounding i | arXiv: 2506.04897
from pixels to views learning angular-aware and physics-consistent representatio | arXiv: 2510.22577
from programs to poses factored real-world scene generation via learned program | arXiv: 2510.10292
from sequence to structure uncovering substructure reasoning in transformers | arXiv: 2507.10435
from shortcut to induction head how data diversity shapes algorithm selection in | arXiv: 2512.18634
from simulations to surveys domain adaptation for galaxy observations | arXiv: 2511.18590
fsnet feasibility-seeking neural network for constrained optimization with guara | arXiv: 2506.00362
fully dynamic algorithms for chamfer distance | arXiv: 2512.16639
functional scaling laws in kernel regression loss dynamics and learning rate sch | arXiv: 2509.19189
future-aware end-to-end driving bidirectional modeling of trajectory planning an | arXiv: 2510.11092
FutureSightDrive: Thinking Visually with Spatio-Temporal CoT for Autonomous Driving
g-dpo scalable preference optimization for protein language models | arXiv: 2510.19474
galactification painting galaxies onto dark matter only simulations using a tran | arXiv: 2511.08438
gasp efficient black-box generation of adversarial suffixes for jailbreaking llm | arXiv: 2411.14133
gated integration of low-rank adaptation for continual learning of large languag | arXiv: 2505.15424
gaudp reinventing multi-agent collaboration through gaussian-image synergy in di | arXiv: 2511.00998
gaussian process upper confidence bound achieves nearly-optimal regret in noise- | arXiv: 2502.19006
gaussian-augmented physics simulation and system identification with complex col | arXiv: 2511.06846
gaze beyond the frame forecasting egocentric 3d visual span | arXiv: 2511.18470
gc4nc a benchmark framework for graph condensation on node classification with n | arXiv: 2406.16715
gem empowering mllm for grounded ecg understanding with time series and images | arXiv: 2503.06073
Gemstones: A Model Suite for Multi-Faceted Scaling Laws | arXiv: 2502.06857
geneman generalizable single-image 3d human reconstruction from multi-source hum | arXiv: 2411.18624
Generalizable Domain Adaptation for Sim-and-Real Policy Co-Training | arXiv: 2509.18631
generalizable insights for graph transformers in theory and practice | arXiv: 2511.08028
generalizable real-time neural decoding with hybrid state-space models | arXiv: 2506.05320
generalization bounds for rank-sparse neural networks | arXiv: 2510.21945
Generalization Error Analysis for Selective State-Space Models Through the Lens of Attention | arXiv: 2502.01473
generalization or hallucination understanding out-of-context reasoning in transf | arXiv: 2506.10887
Generalized Contrastive Learning for Universal Multimodal Retrieval | arXiv: 2509.25638
generalized linear bandits almost optimal regret with one-pass update | arXiv: 2507.11847
generalized linear mode connectivity for transformers | arXiv: 2506.22712
generalizing verifiable instruction following | arXiv: 2507.02833
generalizing while preserving monotonicity in comparison-based preference learni | arXiv: 2506.08616
Generate, but Verify: Reducing Hallucination in Vision-Language Models with Retrospective Resampling | arXiv: 2504.13169
generating multi-table time series ehr from latent space with minimal preprocess | arXiv: 2507.06996
generating physically sound designs from text and a set of physical constraints | arXiv: 2602.02213
generative ai agents for controllable and protected content creation | arXiv: 2601.12348
generative distribution embeddings lifting autoencoders to the space of distribu | arXiv: 2505.18150
generative graph pattern machine | arXiv: 2505.16130
generative model inversion through the lens of the manifold hypothesis | arXiv: 2509.20177
generative modeling of full-atom protein conformations using latent diffusion on | arXiv: 2506.17064
genir generative visual feedback for mental image retrieval | arXiv: 2506.06220
geo-sign hyperbolic contrastive regularisation for geometrically aware sign lang | arXiv: 2506.00129
geocad local geometry-controllable cad generation with large language models | arXiv: 2506.10337
geocomplete geometry-aware diffusion for reference-driven image completion | arXiv: 2510.03110
geodynamics a geometric state-space neural network for understanding brain dynam | arXiv: 2601.13570
geolink empowering remote sensing foundation model with openstreetmap data | arXiv: 2509.26016
geometric data valuation via leverage scores | arXiv: 2511.02100
geometric imbalance in semi-supervised node classification | arXiv: 2303.10371
geometric priors for generalizable world models via vector symbolic architecture | arXiv: 2602.21467
geometry of decision making in language models | arXiv: 2511.20315
georanker distance-aware ranking for worldwide image geolocalization | arXiv: 2505.13731
georemover removing objects and their causal visual artifacts | arXiv: 2509.18538
geosvr taming sparse voxels for geometrically accurate surface reconstruction | arXiv: 2509.18090
gflownets for learning better drug-drug interaction representations | arXiv: 2508.06576
gfm-rag graph foundation model for retrieval augmented generation | arXiv: 2502.01113
global convergence for average reward constrained mdps with primal-dual actor cr | arXiv: 2505.15138
global minimizers of ℓp-regularized objectives yield the sparsest relu neural | arXiv: 2505.21791
global minimizers of sigmoid contrastive loss | arXiv: 2509.18552
GLSim: Detecting Object Hallucinations in LVLMs via Global-Local Similarity | arXiv: 2508.19972
gnnxemplar exemplars to explanations -- natural language rules for global gnn in | arXiv: 2509.18376
goalladder incremental goal discovery with vision-language models | arXiv: 2506.16396
goatex geometry occlusion-aware texturing | arXiv: 2511.23051
gora gradient-driven adaptive low rank adaptation | arXiv: 2502.12171
GPO: Learning from Critical Steps to Improve LLM Reasoning | arXiv: 2509.16456
gradient descent as loss landscape navigation a normative framework for deriving | arXiv: 2510.26997
gradient variance reveals failure modes in flow-based generative models | arXiv: 2510.18118
gradient-variation online adaptivity for accelerated optimization with hölder sm | arXiv: 2511.02276
gradient-weight alignment as a train-time proxy for generalization in classifica | arXiv: 2510.25480
gralora granular low-rank adaptation for parameter-efficient fine-tuning | arXiv: 2505.20355
graph alignment via birkhoff relaxation | arXiv: 2503.05323
graph diffusion that can insert and delete | arXiv: 2506.15725
graph distance as surprise free energy minimization in knowledge graph reasoning | arXiv: 2512.01878
graph neural networks for efficient ac power flow prediction in power grids | arXiv: 2502.05702
graph neural networks for interferometer simulations | arXiv: 2512.16051
graph persistence goes spectral | arXiv: 2506.06571
graph your own prompt | arXiv: 2509.23373
graph-based neural space weather forecasting | arXiv: 2509.19605
graphchain large language models for large-scale graph analysis via tool chainin | arXiv: 2511.00457
graphfaas serverless gnn inference for burst-resilient real-time intrusion detec | arXiv: 2511.10554
graphkeeper graph domain-incremental learning via knowledge disentanglement and | arXiv: 2511.00097
graphtop graph topology-oriented prompting for graph neural networks | arXiv: 2510.22451
grasp2grasp vision-based dexterous grasp translation via schrödinger bridges | arXiv: 2506.02489
grass scalable data attribution with gradient sparsification and sparse projecti | arXiv: 2505.18976
graver generative graph vocabularies for robust graph foundation models fine-tun | arXiv: 2511.05592
greedy algorithm for structured bandits a sharp characterization of asymptotic s | arXiv: 2503.04010
greedy sampling is provably efficient for rlhf | arXiv: 2510.24700
greenhyperspectra a multi-source hyperspectral dataset for global vegetation tra | arXiv: 2507.06806
ground-compose-reinforce grounding language in agentic behaviours using limited | arXiv: 2507.10741
grounding foundational vision models with 3d human poses for robust action recog | arXiv: 2511.05622
Group-in-Group Policy Optimization for LLM Agent Training | arXiv: 2505.10978
gsalign geometric and semantic alignment network for aerial-ground person re-ide | arXiv: 2510.22268
gspn-2 efficient parallel sequence modeling | arXiv: 2512.07884
gst-unet a neural framework for spatiotemporal causal inference with time-varyin | arXiv: 2502.05295
gtpbd a fine-grained global terraced parcel and boundary dataset | arXiv: 2507.14697
gui-rise structured reasoning and history summarization for gui navigation | arXiv: 2510.27210
guided diffusion sampling on function spaces with applications to pdes | arXiv: 2505.17004
guideflow3d optimization-guided rectified flow for appearance transfer | arXiv: 2510.16136
guiding cross-modal representations with mllm priors via preference alignment | arXiv: 2506.06970
gvpo group variance policy optimization for large language model post-training | arXiv: 2504.19599
gyroswin 5d surrogates for gyrokinetic plasma turbulence simulations | arXiv: 2510.07314
h-ddx a hierarchical evaluation framework for differential diagnosis | arXiv: 2510.03700
h-splid hsic-based saliency preserving latent information decomposition | arXiv: 2510.20627
haif-gs hierarchical and induced flow-guided gaussian splatting for dynamic scen | arXiv: 2506.09518
hallucination as an upper bound a new perspective on text-to-image evaluation | arXiv: 2509.21257
hamiltonian neural pde solvers through functional approximation | arXiv: 2505.13275
hankel singular value regularization for highly compressible state space models | arXiv: 2510.22951
HAODiff: Human-Aware One-Step Diffusion via Dual-Prompt Guidance | arXiv: 2505.19742
hardware-aligned hierarchical sparse attention for efficient long-term memory ac | arXiv: 2504.16795
harnessing feature resonance under arbitrary target alignment for out-of-distrib | arXiv: 2502.16076
harnessing the computation redundancy in vits to boost adversarial transferabili | arXiv: 2504.10804
hawaii hierarchical visual knowledge transfer for efficient vision-language mode | arXiv: 2506.19072
Head Pursuit: Probing Attention Specialization in Multimodal Transformers | arXiv: 2510.21518
healthslm-bench benchmarking small language models for mobile and wearable healt | arXiv: 2509.07260
helpsteer3-preference open human-annotated preference data across diverse tasks | arXiv: 2505.11475
hephaestus mixture generative modeling with energy guidance for large-scale qos | arXiv: 2510.17036
hermesflow seamlessly closing the gap in multimodal understanding and generation | arXiv: 2502.12148
hessian-guided perturbed wasserstein gradient flows for escaping saddle points | arXiv: 2509.16974
heterogeneous adversarial play in interactive environments | arXiv: 2510.18407
heterogeneous swarms jointly optimizing model roles and weights for multi-llm sy | arXiv: 2502.04510
hierarchical balance packing towards efficient supervised fine-tuning for long-c | arXiv: 2503.07680
hierarchical koopman diffusion fast generation with interpretable diffusion traj | arXiv: 2510.12220
hierarchical retrieval the geometry and a pretrain-finetune recipe | arXiv: 2509.16411
hierarchical self-attention generalizing neural attention mechanics to multi-sca | arXiv: 2509.15448
HiFi-RAG: Hierarchical Content Filtering and Two-Pass Generation for Open-Domain RAG | arXiv: 2512.22442
high resolution udf meshing via iterative networks | arXiv: 2509.17212
high-order equivariant flow matching for density functional theory hamiltonian p | arXiv: 2505.18817
highlighting what matters promptable embeddings for attribute-focused image retr | arXiv: 2505.15877
himacon discovering hierarchical manipulation concepts from unlabeled multi-moda | arXiv: 2510.11321
hogwild inference parallel llm generation via concurrent attention | arXiv: 2504.06261
hoi-dyn learning interaction dynamics for human-object motion diffusion | arXiv: 2507.01737
hollowflow efficient sample likelihood evaluation using hollow message passing | arXiv: 2510.21542
holollm multisensory foundation model for language-grounded human sensing and re | arXiv: 2505.17645
homogeneous keys heterogeneous values exploiting local kv cache asymmetry for lo | arXiv: 2506.05410
hopadiff holistic-partial aware fourier conditioned diffusion for referring huma | arXiv: 2506.09650
HoPE: Hybrid of Position Embedding for Long Context Vision-Language Models | arXiv: 2505.20444
horizon reduction makes rl scalable | arXiv: 2506.04168
houselayout3d a benchmark and training-free baseline for 3d layout estimation in | arXiv: 2512.02450
how data mixing shapes in-context learning asymptotic equivalence for transforme | arXiv: 2510.25753
how different from the past spatio-temporal time series forecasting with self-su | arXiv: 2510.04908
How Do Transformers Learn Implicit Reasoning? | arXiv: 2505.23653
how does sequence modeling architecture influence base capabilities of pre-train | arXiv: 2505.18522
how foundational are foundation models for time series forecasting | arXiv: 2510.00742
how many domains suffice for domain generalization a tight characterization via | arXiv: 2506.16704
how many tokens do 3d point cloud transformer architectures really need | arXiv: 2511.05449
how patterns dictate learnability in sequential data | arXiv: 2510.10744
how should we evaluate data deletion in graph-based ann indexes | arXiv: 2512.06200
how to build a consistency model learning flow maps via self-distillation | arXiv: 2505.18825
human-assisted robotic policy refinement via action preference optimization | arXiv: 2506.07127
human-inspired multi-level reinforcement learning | arXiv: 2501.07502
human-machine ritual synergic performance through real-time motion recognition | arXiv: 2511.02351
humancrafter synergizing generalizable human reconstruction and semantic 3d segm | arXiv: 2511.00468
hybrid autoencoders for tabular data leveraging model-based augmentation in low- | arXiv: 2511.06961
Hybrid Latent Reasoning via Reinforcement Learning | arXiv: 2505.18454
hybrid physical-neural simulator for fast cosmological hydrodynamics | arXiv: 2510.26593
hybrid-balance gflownet for solving vehicle routing problems | arXiv: 2510.04792
hybridnorm towards stable and efficient transformer training via hybrid normaliz | arXiv: 2503.04598
HyGen: Efficient LLM Serving via Elastic Online-Offline Request Co-location | arXiv: 2501.14808
hyperbolic dataset distillation | arXiv: 2505.24623
hyperbolic fine-tuning for large language models | arXiv: 2410.04010
hypergraphrag retrieval-augmented generation via hypergraph-structured knowledge | arXiv: 2503.21322
hyperparameter transfer enables consistent gains of matrix-preconditioned optimi | arXiv: 2512.05620
hyplanehead rethinking tri-plane-like representations in full-head image synthes | arXiv: 2509.16748
hyrf hybrid radiance fields for memory-efficient and high-quality novel view syn | arXiv: 2509.17083
i-raven-x benchmarking generalization and robustness of analogical and mathemati | arXiv: 2510.17496
ibgs image-based gaussian splatting | arXiv: 2511.14357
if-guide influence function-guided detoxification of llms | arXiv: 2506.01790
ifinder structured zero-shot vision-based llm grounding for dash-cam video reaso | arXiv: 2509.19552
image super-resolution with guarantees via conformalized generative models | arXiv: 2502.09664
imagenet-trained cnns are not biased towards texture revisiting feature reliance | arXiv: 2509.20234
imagesentinel protecting visual datasets from unauthorized retrieval-augmented i | arXiv: 2510.12119
impact of dataset properties on membership inference vulnerability of deep trans | arXiv: 2402.06674
impact of layer norm on memorization and generalization in transformers | arXiv: 2511.10566
implicit augmentation from distributional symmetry in turbulence super-resolutio | arXiv: 2509.20683
implicit bias of spectral descent and muon on multiclass separable data | arXiv: 2502.04664
implicit modeling for transferability estimation of vision foundation models | arXiv: 2510.23145
improved approximation algorithms for chromatic and pseudometric-weighted correl | arXiv: 2505.21939
improved balanced classification with theoretically grounded loss functions | arXiv: 2512.23947
improved regret and contextual linear extension for pandoras box and prophet ine | arXiv: 2505.18828
improved regret bounds for gaussian process upper confidence bound in bayesian o | arXiv: 2506.01393
improved training technique for shortcut models | arXiv: 2510.21250
improving consistency in retrieval-augmented systems with group similarity rewar | arXiv: 2510.04392
improving data efficiency for llm reinforcement fine-tuning through difficulty-t | arXiv: 2506.05316
improving decision trees through the lens of parameterized local search | arXiv: 2510.12726
improving diffusion-based inverse algorithms under few-step constraint via learn | arXiv: 2503.10103
improving forecasts of suicide attempts for patients with little data | arXiv: 2511.18199
improving perturbation-based explanations by understanding the role of uncertain | arXiv: 2511.10439
improving planning and mbrl with temporally-extended actions | arXiv: 2505.15754
improving posterior inference of galaxy properties with image-based conditional | arXiv: 2512.05078
improving retrieval-augmented generation through multi-agent reinforcement learn | arXiv: 2501.15228
improving the straight-through estimator with zeroth-order information | arXiv: 2510.23926
improving time series forecasting via instance-aware post-hoc revision | arXiv: 2505.23583
in search of adams secret sauce | arXiv: 2505.21829
in the eye of mllm benchmarking egocentric video intent understanding with gaze- | arXiv: 2509.07447
in-context compositional learning via sparse coding transformer | arXiv: 2511.20194
in-context edit enabling instructional image editing with in-context generation | arXiv: 2504.20690
in-context learning of linear dynamical systems with transformers approximation | arXiv: 2502.08136
in-context learning of stochastic differential equations with foundation inferen | arXiv: 2502.19049
inc an indirect neural corrector for auto-regressive hybrid pde solvers | arXiv: 2511.12764
incentivizing reasoning for advanced instruction-following of large language mod | arXiv: 2506.01413
incentivizing time-aware fairness in data sharing | arXiv: 2510.09240
incomplete multi-view clustering via hierarchical semantic alignment and coopera | arXiv: 2510.13887
increasing the utility of synthetic images through chamfer guidance | arXiv: 2508.10631
incremental sequence classification with temporal consistency | arXiv: 2505.16548
indego a dataset of industrial scenarios and collaborative work for egocentric a | arXiv: 2511.19684
inductive transfer learning for graph-based recommenders | arXiv: 2510.22799
ineq-comp benchmarking human-intuitive compositional reasoning in automated theo | arXiv: 2505.12680
inference-time alignment in continuous space | arXiv: 2505.20081
inference-time chain-of-thought pruning with latent informativeness signals | arXiv: 2511.00699
inference-time hyper-scaling with kv cache compression | arXiv: 2506.05345
inference-time reward hacking in large language models | arXiv: 2506.19248
inference-time scaling for flow models via stochastic generation and rollover bu | arXiv: 2503.19385
inferring stochastic dynamics with growth from cross-sectional data | arXiv: 2505.13197
infinipot-v memory-constrained kv cache compression for streaming video understa | arXiv: 2506.15745
InfinityStar: Unified Spacetime AutoRegressive Modeling for Visual Generation
influence functions for edge edits in non-convex graph neural networks | arXiv: 2506.04694
influx a benchmark for self-calibration of dynamic intrinsics of video cameras | arXiv: 2510.23589
information theoretic learning for diffusion models with warm start | arXiv: 2510.20903
information-computation tradeoffs for noiseless linear regression with oblivious | arXiv: 2510.10665
information-theoretic discrete diffusion | arXiv: 2510.24088
infrequent exploration in linear bandits | arXiv: 2510.26000
inner speech as behavior guides steerable imitation of diverse behaviors for hum | arXiv: 2602.20517
inst-it boosting instance understanding via explicit visual prompt instruction t | arXiv: 2412.03565
instance-level composed image retrieval | arXiv: 2510.25387
instance-specific test-time training for speech editing in the wild | arXiv: 2506.13295
InstanceAssemble: Layout-Aware Image Generation via Instance Assembling Attention | arXiv: 2509.16691
instant video models universal adapters for stabilizing image-based networks | arXiv: 2512.03014
instructsam a training-free framework for instruction-oriented remote sensing ob | arXiv: 2505.15818
Integration Matters for Learning PDEs with Backward SDEs | arXiv: 2505.01078
interaction-centric knowledge infusion and transfer for open-vocabulary scene gr | arXiv: 2511.05935
interactive and hybrid imitation learning provably beating behavior cloning | arXiv: 2412.07057
interpretable next-token prediction via the generalized induction head | arXiv: 2411.00066
interpreting gflownets for drug discovery extracting actionable insights for med | arXiv: 2511.19264
interpreting resnet-based clip via neuron-attention decomposition | arXiv: 2509.19943
intervene-all-paths unified mitigation of lvlm hallucinations across alignment f | arXiv: 2511.17254
inverse optimization latent variable models for learning costs applied to route | arXiv: 2509.15999
invisibleink high-utility and low-cost text generation with differential privacy | arXiv: 2507.02974
ioncast a deep learning framework for forecasting ionospheric dynamics | arXiv: 2511.15004
is artificial intelligence generated image detection a solved problem | arXiv: 2505.12335
is sequence information all you need for bayesian optimization of antibodies | arXiv: 2509.24933
isotropic noise in stochastic and quantum convex optimization | arXiv: 2510.20745
It's LIT! Reliability-Optimized LLMs with Inspectable Tools | arXiv: 2511.14903
itdpdm information-theoretic discrete poisson diffusion model | arXiv: 2505.05082
iterative foundation model fine-tuning on multiple rewards | arXiv: 2511.00220
its complicated the relationship of algorithmic fairness and non-discrimination | arXiv: 2501.12962
its hard to be normal the impact of noise on structure-agnostic estimation | arXiv: 2507.02275
jailbound jailbreaking internal safety boundaries of vision-language models | arXiv: 2505.19610
jailbreak-zero a path to pareto optimal red teaming for large language models | arXiv: 2601.03265
jamun bridging smoothed molecular dynamics and score-based learning for conforma | arXiv: 2410.14621
janus-pro-r1 advancing collaborative visual comprehension and generation via rei | arXiv: 2506.01480
janusdna a powerful bi-directional hybrid dna foundation model | arXiv: 2505.17257
jasmine harnessing diffusion prior for self-supervised depth estimation | arXiv: 2503.15905
jet-nemotron efficient language model with post neural architecture search | arXiv: 2508.15884
johnson-lindenstrauss lemma beyond euclidean geometry | arXiv: 2510.22401
jutters | arXiv: 2601.11532
k-decore facilitating knowledge transfer in continual structured knowledge reaso | arXiv: 2509.16929
keep it on a leash controllable pseudo-label generation towards realistic long-t | arXiv: 2510.03993
keep it real challenges in attacking compression-based adversarial purification | arXiv: 2508.05489
kernel conditional tests from learning-theoretic bounds | arXiv: 2506.03898
kernel learning with adversarial features numerical efficiency and adaptive regu | arXiv: 2510.20883
keydiff key similarity-based kv cache eviction for long-context llm inference in | arXiv: 2504.15364
kimina lean server a high-performance lean server for large-scale verification | arXiv: 2504.21230
kindle knowledge-guided distillation for prior-free gene regulatory network infe | arXiv: 2505.09664
kl penalty control via perturbation for direct preference optimization | arXiv: 2502.13177
klass kl-guided fast inference in masked diffusion models | arXiv: 2511.05664
knolling bot teaching robots the human notion of tidiness | arXiv: 2310.04566
know thyself by knowing others learning neuron identity from population context | arXiv: 2512.01199
Know What You Don't Know: Uncertainty Calibration of Process Reward Models | arXiv: 2506.09338
knowing when to stop efficient context processing via latent sufficiency signals | arXiv: 2502.01025
knowledge distillation detection for open-weights models | arXiv: 2510.02302
knowledge is overrated a zero-knowledge machine learning and cryptographic hashi | arXiv: 2511.12592
knowledge-based visual question answer with multimodal processing retrieval and | arXiv: 2510.14605
KScope: A Framework for Characterizing the Knowledge Status of Language Models | arXiv: 2506.07458
ktae a model-free algorithm to key-tokens advantage estimation in mathematical r | arXiv: 2505.16826
kungfubot physics-based humanoid whole-body control for learning highly-dynamic | arXiv: 2506.12851
kuramoto orientation diffusion models | arXiv: 2509.15328
kvzip query-agnostic kv cache compression with context reconstruction | arXiv: 2505.23416
L-MTP: Leap Multi-Token Prediction Beyond Adjacent Context | arXiv: 2505.17505
l2rsi cross-view lidar-based place recognition for large-scale urban scenes via | arXiv: 2503.11245
labelany3d label any object 3d in the wild | arXiv: 2601.01676
LabUtopia: High-Fidelity Simulation and Hierarchical Benchmark for Scientific Embodied Agents | arXiv: 2505.22634
lagrangian neural odes measuring the existence of a lagrangian with helmholtz me | arXiv: 2510.06367
LangHOPS: Language Grounded Hierarchical Open-Vocabulary Part Segmentation | arXiv: 2510.25263
langsplatv2 high-dimensional 3d language gaussian splatting with 450 fps | arXiv: 2507.07136
Language Model Behavioral Phases are Consistent Across Architecture, Training Data, and Scale
large language bayes | arXiv: 2504.14025
large language models as medical codes selectors a benchmark using the internati | arXiv: 2507.14681
large language models can learn and generalize steganographic chain-of-thought u | arXiv: 2506.01926
Large Language Models Miss the Multi-Agent Mark | arXiv: 2505.21298
large stepsizes accelerate gradient descent for regularized logistic regression | arXiv: 2506.02336
large-scale training data attribution for music generative models via unlearning | arXiv: 2506.18312
LASeR: Learning to Adaptively Select Reward Models with Multi-Armed Bandits | arXiv: 2410.01735
last iterate convergence in monotone mean field games | arXiv: 2410.05127
latent chain-of-thought for visual reasoning | arXiv: 2510.23925
latent harmony synergistic unified uhd image restoration via latent space regula | arXiv: 2510.07961
latent principle discovery for language model self-improvement | arXiv: 2505.16927
latent representation learning in heavy-ion collisions with maskpoint transforme | arXiv: 2510.06691
latent space factorization in lora | arXiv: 2510.19640
latent zoning network a unified principle for generative modeling representation | arXiv: 2509.15591
latentguard controllable latent steering for robust refusal of attacks and relia | arXiv: 2509.19839
lattice boltzmann model for learning real-world pixel dynamicity | arXiv: 2509.16527
layer-wise modality decomposition for interpretable multimodal sensor fusion | arXiv: 2511.00859
layer-wise update aggregation with recycling for communication-efficient federat | arXiv: 2503.11146
layerif estimating layer quality for large language models using influence funct | arXiv: 2505.23811
lc-opt benchmarking reinforcement learning and agentic ai for end-to-end liquid | arXiv: 2511.00116
lcdb 11 a database illustrating learning curves are more ill-behaved than previo | arXiv: 2505.15657
leapfactual reliable visual counterfactual explanation using conditional flow ma | arXiv: 2510.14623
learnable sampler distillation for discrete diffusion models | arXiv: 2509.19962
learning approximately equivariant networks via constrained optimization | arXiv: 2505.13631
learning at the speed of physics equilibrium propagation on oscillator ising mac | arXiv: 2510.12934
Learning Cocoercive Conservative Denoisers via Helmholtz Decomposition for Poisson Inverse Problems | arXiv: 2505.08909
learning conformational ensembles of proteins based on backbone geometry | arXiv: 2503.05738
learning dense hand contact estimation from imbalanced data | arXiv: 2505.11152
learning dynamics of rnns in closed-loop environments | arXiv: 2505.13567
learning efficient fuse-and-refine for feed-forward 3d gaussian splatting | arXiv: 2503.14698
learning from demonstrations via capability-aware goal sampling | arXiv: 2601.08731
learning from design procedure to generate cad programs for data augmentation | arXiv: 2603.06894
learning from interval targets | arXiv: 2510.20925
learning from videos for 3d world enhancing mllms with 3d vision geometry priors | arXiv: 2505.24625
learning generalizable shape completion with sim3 equivariance | arXiv: 2509.26631
learning grouped lattice vector quantizers for low-bit llm compression | arXiv: 2510.20984
learning human-like rl agents through trajectory optimization with action quanti | arXiv: 2511.15055
Learning in Compact Spaces with Approximately Normalized Transformer | arXiv: 2505.22014
learning in stackelberg mean field games a non-asymptotic analysis | arXiv: 2509.15392
Learning Interactive World Model for Object-Centric Reinforcement Learning | arXiv: 2511.02225
learning interestingness in automated mathematical theory formation | arXiv: 2511.14778
learning interpretable features in audio latent spaces via sparse autoencoders | arXiv: 2510.23802
learning intractable multimodal policies with reparameterization and diversity r | arXiv: 2511.01374
learning memory-enhanced improvement heuristics for flexible job shop scheduling | arXiv: 2603.02846
learning neural exposure fields for view synthesis | arXiv: 2510.08279
learning non-equilibrium diffusions with schrödinger bridges from exactly solvab | arXiv: 2505.16644
learning orthogonal multi-index models a fine-grained information exponent analy | arXiv: 2410.09678
learning parameterized skills from demonstrations | arXiv: 2510.24095
learning provably improves the convergence of gradient descent | arXiv: 2501.18092
learning quadratic neural networks in high dimensions sgd dynamics and scaling l | arXiv: 2508.03688
learning reconfigurable representations for multimodal federated learning with m | arXiv: 2510.22880
learning relative gene expression trends from pathology images in spatial transc | arXiv: 2512.06612
learning repetition-invariant representations for polymer informatics | arXiv: 2505.10726
learning shared representations from unpaired data | arXiv: 2505.21524
learning single-index models via harmonic decomposition | arXiv: 2506.09887
learning skill-attributes for transferable assessment in video | arXiv: 2511.13993
learning sparse approximate inverse preconditioners for conjugate gradient solve | arXiv: 2510.27517
learning spatial-aware manipulation ordering | arXiv: 2510.25138
learning task-agnostic representations through multi-teacher distillation | arXiv: 2510.18680
learning temporal 3d semantic scene completion via optical flow guidance | arXiv: 2502.14520
learning the wrong lessons syntactic-domain spurious correlations in language mo | arXiv: 2509.21155
learning theory for kernel bilevel optimization | arXiv: 2502.08457
learning time-scale invariant population-level neural representations | arXiv: 2511.13022
learning to better search with language models via guided reinforced self-traini | arXiv: 2410.02992
learning to clean reinforcement learning for noisy label correction | arXiv: 2511.19808
learning to condition a neural heuristic for scalable mpe inference | arXiv: 2509.25217
learning to factorize and adapt a versatile approach toward universal spatio-tem | arXiv: 2601.12083
learning to flow from generative pretext tasks for neural architecture encoding | arXiv: 2510.18360
learning to focus causal attention distillation via gradient-guided token prunin | arXiv: 2506.07851
learning to focus prioritizing informative histories with structured attention m | arXiv: 2511.06946
learning to insert for constructive neural vehicle routing solver | arXiv: 2505.13904
Learning to Instruct for Visual Instruction Tuning | arXiv: 2503.22215
learning to integrate diffusion odes by averaging the derivatives | arXiv: 2505.14502
Learning to Solve Complex Problems via Dataset Decomposition | arXiv: 2602.20296
learning to steer input-dependent steering for multimodal llms | arXiv: 2508.12815
learning to watermark a selective watermarking framework for large language mode | arXiv: 2510.15976
learning with calibration exploring test-time computing of spatio-temporal forec | arXiv: 2506.00635
learning-augmented facility location mechanisms for envy ratio | arXiv: 2512.11193
learning-augmented online bipartite fractional matching | arXiv: 2505.19252
learning-augmented streaming algorithms for correlation clustering | arXiv: 2510.10705
least squares variational inference | arXiv: 2502.18475
lemica lexicographic minimax path caching for efficient diffusion-based video ge | arXiv: 2511.00090
less is more but where dynamic token compression via llm-guided keyframe prior | arXiv: 2512.06866
less is more local intrinsic dimensions of contextual language models | arXiv: 2506.01034
less is more unlocking specialization of time series foundation models via struc | arXiv: 2505.23195
Lessons Learned: A Multi-Agent Framework for Code LLMs to Learn and Improve | arXiv: 2505.23946
Let LRMs Break Free from Overthinking via Self-Braking Tuning | arXiv: 2505.14604
Let Me Think! A Long Chain-of-Thought Can Be Worth Exponentially Many Short Ones | arXiv: 2505.21825
let the experts speak improving survival prediction calibration via mixture-of-e | arXiv: 2511.09567
leveraging depth and language for open-vocabulary domain-generalized semantic se | arXiv: 2506.09881
leveraging importance sampling to detach alignment modules from large language m | arXiv: 2505.19700
leveraging robust optimization for llm alignment under distribution shifts | arXiv: 2504.05831
levo high-quality song generation with multi-preference alignment | arXiv: 2506.07520
limited preference data learning better reward model with latent space synthesis | arXiv: 2509.26074
limopro reasoning refinement for efficient and effective test-time scaling | arXiv: 2505.19187
linear attention for efficient bidirectional sequence modeling | arXiv: 2502.16249
linear differential vision transformer learning visual contrasts via pairwise di | arXiv: 2511.00833
linear transformers implicitly discover unified numerical algorithms | arXiv: 2509.19702
linearly constrained diffusion implicit models | arXiv: 2411.00359
lineas end-to-end learning of activation steering with a distributional loss | arXiv: 2503.10679
linprim linear primitives for differentiable volumetric rendering | arXiv: 2501.16312
littlebit ultra low-bit quantization via latent factorization | arXiv: 2506.13771
livestar live streaming assistant for real-world online video understanding | arXiv: 2511.05299
llm agent communication protocol lacp requires urgent standardization a telecom- | arXiv: 2510.13821
llm agents for knowledge discovery in atomic layer processing | arXiv: 2509.26201
llm interpretability with identifiable temporal-instantaneous representation | arXiv: 2509.23323
llm meets diffusion a hybrid framework for crystal material generation | arXiv: 2510.23040
llm probing with contrastive eigenproblems improving understanding and applicabi | arXiv: 2511.02089
LLM Safety Alignment is Divergence Estimation in Disguise | arXiv: 2502.00657
LLM Strategic Reasoning: Agentic Study through Behavioral Game Theory
llm world models are mental output layer evidence of brittle world model use in | arXiv: 2507.15521
llm-assisted emergency triage benchmark bridging hospital-rich and mci-like fiel | arXiv: 2509.26351
llmscape | arXiv: 2511.07161
locality-sensitive hashing-based efficient point transformer for charged particl | arXiv: 2510.07594
locally optimal private sampling beyond the global minimax | arXiv: 2510.09485
lodge level-of-detail large-scale gaussian splatting with efficient rendering | arXiv: 2505.23158
logical expressiveness of graph neural networks with hierarchical node individua | arXiv: 2506.13911
lomix learnable weighted multi-scale logits mixing for medical image segmentatio | arXiv: 2510.22995
Long-Context Modeling with Dynamic Hierarchical Sparse Attention for On-Device LLMs | arXiv: 2510.24606
long-tailed recognition via information-preservable two-stage learning | arXiv: 2510.08836
LongVPO: From Anchored Cues to Self-Reasoning for Long-Form Video Preference Optimization | arXiv: 2602.02341
LooGLE v2: Are LLMs Ready for Real World Long Dependency Challenges? | arXiv: 2510.22548
look and tell a dataset for multimodal grounding across egocentric and exocentri | arXiv: 2510.22672
look-ahead reasoning on learning platforms | arXiv: 2511.14745
loquetier a virtualized multi-lora framework for unified llm fine-tuning and ser | arXiv: 2511.00101
lost in transmission when and why llms fail to reason globally | arXiv: 2505.08140
lt-soups bridging head and tail classes via subsampled model soups | arXiv: 2511.10683
ltd-bench evaluating large language models by letting them draw | arXiv: 2511.02347
lumia a handheld vision-to-music system for real-time embodied composition | arXiv: 2512.17228
luminance-aware statistical quantization unsupervised hierarchical learning for | arXiv: 2511.01510
m-grpo stabilizing self-supervised reinforcement learning for large language mod | arXiv: 2512.13070
machine unlearning doesnt do what you think lessons for generative ai policy and | arXiv: 2412.06966
maestro adaptive sparse attention and robust learning for multimodal dynamic tim | arXiv: 2509.25278
MagCache: Fast Video Generation with Magnitude-Aware Cache | arXiv: 2506.09045
magical medical lay language generation via semantic invariance and layperson-ta | arXiv: 2508.08730
MaintainCoder: Maintainable Code Generation Under Dynamic Requirements | arXiv: 2503.24260
making classic gnns strong baselines across varying homophily a smoothness-gener | arXiv: 2412.09805
mamba goes home hierarchical soft mixture-of-experts for 3d medical image segmen | arXiv: 2507.06363
mango - adaptable graph network simulators via meta-learning | arXiv: 2510.05874
manifolds and modules how function develops in a neural foundation model | arXiv: 2512.07869
manipulating 3d molecules in a fixed-dimensional e3-equivariant latent space | arXiv: 2506.00771
manipulating feature visualizations with gradient slingshots | arXiv: 2401.06122
many llms are more utilitarian than one | arXiv: 2507.00814
map estimation with denoisers convergence rates and guarantees | arXiv: 2507.15397
mapping faithful reasoning in language models | arXiv: 2510.22362
mar-fl a communication efficient peer-to-peer federated learning system | arXiv: 2512.05234
mars a malignity-aware backdoor defense in federated learning | arXiv: 2509.20383
mars-bench a benchmark for evaluating foundation models for mars science tasks | arXiv: 2510.24010
martingale score an unsupervised metric for bayesian rationality in llm reasonin | arXiv: 2512.02914
MAS-ZERO: Designing Multi-Agent Systems with Zero Supervision | arXiv: 2505.14996
masfin a multi-agent system for decomposed financial reasoning and forecasting | arXiv: 2512.21878
masked symbol modeling for demodulation of oversampled baseband communication si | arXiv: 2512.01428
masksql safeguarding privacy for llm-based text-to-sql via abstraction | arXiv: 2509.23459
mass conservation on rails -- rethinking physics-informed learning of ice flow v | arXiv: 2510.06286
Massively Parallel Imitation Learning of Mouse Forelimb Musculoskeletal Reaching Dynamics | arXiv: 2511.21848
mat-agent adaptive multi-agent training optimization | arXiv: 2510.17845
match multi-faceted adaptive topo-consistency for semi-supervised histopathology | arXiv: 2510.01532
matchings under biased and correlated evaluations | arXiv: 2510.23628
materialrefgs reflective gaussian splatting with multi-view consistent material | arXiv: 2510.11387
matryoshka pilot learning to drive black-box llms with llms | arXiv: 2410.20749
maxsup overcoming representation collapse in label smoothing | arXiv: 2502.15798
mdns masked diffusion neural sampler via stochastic optimal control | arXiv: 2508.10684
mdreid modality-decoupled learning for any-to-any multi-modal object re-identifi | arXiv: 2510.23301
mean-field sampling for cooperative multi-agent reinforcement learning | arXiv: 2412.00661
measuring what matters construct validity in large language model benchmarks | arXiv: 2511.04703
mecefo enhancing llm training robustness via fault-tolerant optimization | arXiv: 2510.16415
mechanism design for llm fine-tuning with multiple reward models | arXiv: 2405.16276
mechanistic interpretability of rnns emulating hidden markov models | arXiv: 2510.25674
medagentboard benchmarking multi-agent collaboration with conventional methods f | arXiv: 2505.12371
medmkg benchmarking medical knowledge exploitation with multimodal knowledge gra | arXiv: 2505.17214
megadance mixture-of-experts architecture for genre-aware 3d dance generation | arXiv: 2505.17543
megstate phoneme decoding from magnetoencephalography signals | arXiv: 2512.17978
meicoder decoding visual stimuli from neural activity by leveraging most excitin | arXiv: 2510.20762
memeic a step toward continual and compositional knowledge editing | arXiv: 2510.25798
memo training memory-efficient embodied agents with reinforcement learning | arXiv: 2510.19732
memoir lifelong model editing with minimal overwrite and informed retention for | arXiv: 2506.07899
Memory Mosaics at Scale | arXiv: 2507.03285
memory-augmented potential field theory a framework for adaptive control in non- | arXiv: 2509.19672
memory-efficient training with in-place fft implementation | arXiv: 2511.01385
memory-integrated reconfigurable adapters a unified framework for settings with | arXiv: 2512.00940
memtrack evaluating long-term memory and state tracking in multi-platform dynami | arXiv: 2510.01353
mergebench a benchmark for merging domain-specialized llms | arXiv: 2505.10833
merit multilingual semantic retrieval with interleaved multi-condition query | arXiv: 2506.03144
merlin l48 spectrogram dataset | arXiv: 2511.00252
mesatask towards task-driven tabletop scene generation via 3d spatial reasoning | arXiv: 2509.22281
mesh interpolation graph network for dynamic and spatially irregular global weat | arXiv: 2509.20911
mesh-rft enhancing mesh generation via fine-grained reinforcement fine-tuning | arXiv: 2505.16761
MESS+: Dynamically Learned Inference-Time LLM Routing in Model Zoos with Service Level Guarantees | arXiv: 2505.19947
meta-learning an in-context transformer model of human higher visual cortex | arXiv: 2505.15813
meta-learning three-factor plasticity rules for structured credit assignment wit | arXiv: 2512.09366
meta-world an improved standardized rl benchmark | arXiv: 2505.11289
metabox-v2 a unified benchmark platform for meta-black-box optimization | arXiv: 2505.17745
metacognitive sensitivity for test-time dynamic model selection | arXiv: 2512.10451
metadefense defending finetuning-based jailbreak attack before and during genera | arXiv: 2510.07835
metafind scene-aware 3d asset retrieval for coherent metaverse scene generation | arXiv: 2510.04057
metags a meta-learned gaussian-phong model for out-of-distribution 3d scene reli | arXiv: 2405.20791
metamind modeling human social thoughts with metacognitive multi-agent systems | arXiv: 2505.18943
metropolis-hastings sampling for 3d gaussian reconstruction | arXiv: 2506.12945
mge-ldm joint latent diffusion for simultaneous music generation and source extr | arXiv: 2505.23305
micadangelo fine-grained reconstruction of constrained cad models from 3d scans | arXiv: 2510.23429
midas misalignment-based data augmentation strategy for imbalanced multimodal le | arXiv: 2509.25831
military ai needs technically-informed regulation to safeguard ai research and i | arXiv: 2505.18371
mimeqa towards socially-intelligent nonverbal foundation models | arXiv: 2502.16671
mind the data gap evaluating vision systems in small data applications | arXiv: 2504.06486
mind the gap aligning knowledge bases with user needs to enhance mental health r | arXiv: 2509.13626
mind the gap bridging thought leap for improved chain-of-thought tuning | arXiv: 2505.14684
mind the gap removing the discretization gap in differentiable logic gate networ | arXiv: 2506.07500
mind the gap the challenges of scale in pixel-based deep reinforcement learning | arXiv: 2505.17749
mind-the-glitch visual correspondence for detecting inconsistencies in subject-d | arXiv: 2509.21989
mindforge empowering embodied agents with theory of mind for lifelong cultural l | arXiv: 2411.12977
MineAnyBuild: Benchmarking Spatial Planning for Open-world AI Agents | arXiv: 2505.20148
mingle mixture of null-space gated low-rank experts for test-time continual mode | arXiv: 2505.11883
minimal semantic sufficiency meets unsupervised domain generalization | arXiv: 2509.15791
minimizing false-positive attributions in explanations of non-linear models | arXiv: 2505.11210
Mint: A Simple Test-Time Adaptation of Vision-Language Models against Common Corruptions | arXiv: 2510.22127
MIP against Agent: Malicious Image Patches Hijacking Multimodal OS Agents | arXiv: 2503.10809
mir-bench can your llm recognize complicated patterns via many-shot in-context r | arXiv: 2502.09933
mira medical time series foundation model for real-world health data | arXiv: 2506.07584
mirage a benchmark for multimodal information-seeking and reasoning in agricultu | arXiv: 2506.20100
mitigating disparate impact of differentially private learning through bounded a | arXiv: 2506.01396
mitigating hallucination through theory-consistent symmetric multimodal preferen | arXiv: 2506.11712
mitigating intra- and inter-modal forgetting in continual learning of unified mu | arXiv: 2512.03125
mitigating privacy-utility trade-off in decentralized federated learning via f-d | arXiv: 2510.19934
mitigating semantic collapse in partially relevant video retrieval | arXiv: 2510.27432
mitigating sexual content generation via embedding distortion in text-conditione | arXiv: 2501.18877
mitra an ai assistant for knowledge retrieval in physics collaborations | arXiv: 2603.09800
mitra mixed synthetic priors for enhancing tabular foundation models | arXiv: 2510.21204
mixat combining continuous and discrete adversarial training for llms | arXiv: 2505.16947
mixed monotonicity reachability analysis of neural ode a trade-off between tight | arXiv: 2510.17859
mixing expert knowledge bring human thoughts back to the game of go | arXiv: 2601.16447
mixture of noise for pre-trained model-based class-incremental learning | arXiv: 2509.16738
mixture of scope experts at test generalizing deeper graph neural networks with | arXiv: 2409.06998
mlr-bench evaluating ai agents on open-ended machine learning research | arXiv: 2505.19955
mlrc-bench can language agents solve machine learning research challenges | arXiv: 2504.09702
mm-opera benchmarking open-ended association reasoning for large vision-language | arXiv: 2510.26937
mmada multimodal large diffusion language models | arXiv: 2505.15809
mme-videoocr evaluating ocr-based capabilities of multimodal llms in video scena | arXiv: 2505.21333
mmg mutual information estimation via the mmse gap in diffusion | arXiv: 2509.20609
MMLongBench: Benchmarking Long-Context Vision-Language Models Effectively and Thoroughly | arXiv: 2505.10610
mmpb its time for multi-modal personalization | arXiv: 2509.22820
mmperspective do mllms understand perspective a comprehensive benchmark for pers | arXiv: 2505.20426
mmtu a massive multi-task table understanding and reasoning benchmark | arXiv: 2506.05587
mmwalk towards multi-modal multi-view walking assistance | arXiv: 2510.11520
mobo-osd batch multi-objective bayesian optimization via orthogonal search direc | arXiv: 2510.20872
Model Context Protocol for Vision Systems: Audit, Security, and Protocol Extensions | arXiv: 2509.22814
model inversion with layer-specific modeling and alignment for data-free continu | arXiv: 2510.26311
model-based policy adaptation for closed-loop end-to-end autonomous driving | arXiv: 2511.21584
model-behavior alignment under flexible evaluation when the best-fitting model i | arXiv: 2510.23321
model-guided dual-role alignment for high-fidelity open-domain video-to-audio ge | arXiv: 2510.24103
modeling cell dynamics and interactions with unbalanced mean field schrödinger b | arXiv: 2505.11197
modeling microenvironment trajectories on spatial transcriptomics with nicheflow | arXiv: 2511.00977
modeling neural activity with conditionally linear dynamical systems | arXiv: 2502.18347
modeling x-ray photon pile-up with a normalizing flow | arXiv: 2511.11863
models that prove their own correctness | arXiv: 2405.15722
modem a morton-order degradation estimation mechanism for adverse weather image | arXiv: 2505.17581
modhifi identifying high fidelity predictive components for model modification | arXiv: 2511.19566
modulation of temporal decision-making in a deep reinforcement learning agent un | arXiv: 2511.01415
moe-gyro self-supervised over-range reconstruction and denoising for mems gyrosc | arXiv: 2506.06318
moemeta mixture-of-experts meta learning for few-shot relational learning | arXiv: 2510.23013
MoESD: Unveiling the Potential of Speculative Decoding in Sparse MoE Inference | arXiv: 2505.19645
mol-llama towards general understanding of molecules in large molecular language | arXiv: 2502.13449
mome mixture of matryoshka experts for audio-visual speech recognition | arXiv: 2510.04136
moment- and power-spectrum-based gaussianity regularization for text-to-image mo | arXiv: 2509.07027
monarchattention zero-shot conversion to fast hardware-aware structured attentio | arXiv: 2505.18698
monitor exploiting large language models with instruction for online video anoma | arXiv: 2510.21449
monte carlo expected threat mocet scoring | arXiv: 2511.16823
moose-chem2 exploring llm limits in fine-grained scientific hypothesis discovery | arXiv: 2505.19209
mopformer motion-primitive transformer for wearable-sensor activity recognition | arXiv: 2505.20744
more than generation unifying generation and depth estimation via text-to-image | arXiv: 2510.23574
more-brain routed mixture of experts for interpretable and generalizable cross-s | arXiv: 2505.15946
mospa human motion generation driven by spatial audio | arXiv: 2507.11949
motion matters compact gaussian streaming for free-viewpoint video reconstructio | arXiv: 2505.16533
motion4d learning 3d-consistent motion and semantics for 4d scene understanding | arXiv: 2512.03601
mouse-guided gaze semi-supervised learning of intention-aware representations fo | arXiv: 2509.19574
mozart modularized and efficient moe training on 3.5d wafer-scale chiplet archite | arXiv: 2603.07006
mpcache mpc-friendly kv cache eviction for efficient private llm inference | arXiv: 2501.06807
mpmavatar learning 3d gaussian avatars with accurate and robust physics-based dy | arXiv: 2510.01619
mro enhancing reasoning in diffusion language models via multi-reward optimizati | arXiv: 2510.21473
ms-bart unified modeling of mass spectra and molecules for structure elucidation | arXiv: 2510.20615
msf-cnn patch-based multi-stage fusion with convolutional neural networks for ti | arXiv: 2505.11483
mstar box-free multi-query scene text retrieval with attention recycling | arXiv: 2506.10609
mtbbench a multimodal sequential clinical decision-making benchmark in oncology | arXiv: 2511.20490
mtl-kd multi-task learning via knowledge distillation for generalizable neural v | arXiv: 2506.02935
Multi-Agent Collaboration via Evolving Orchestration | arXiv: 2505.19591
multi-class support vector machine with differential privacy | arXiv: 2510.04027
multi-environment pomdps discrete model uncertainty under partial observability | arXiv: 2510.23744
multi-head temporal latent attention | arXiv: 2505.13544
multi-head transformers provably learn symbolic multi-step reasoning via gradien | arXiv: 2508.08222
multi-modal masked autoencoders for learning image-spectrum associations for gal | arXiv: 2510.22527
multi-objective reinforcement learning with max-min criterion a game-theoretic a | arXiv: 2510.20235
multi-scale finetuning for encoder-based time series foundation models | arXiv: 2506.14087
multi-task vehicle routing solver via mixture of specialized experts under state | arXiv: 2510.21453
multi-trajectory physics-informed neural networks for hjb equations with hard-ze | arXiv: 2512.12708
multihuman-testbench benchmarking image generation for multiple humans | arXiv: 2506.20879
multimodal 3d genome pre-training | arXiv: 2504.09060
multimodal bandits regret lower bounds and optimal algorithms | arXiv: 2510.25811
multimodal bayesian network for robust assessment of casualties in autonomous tr | arXiv: 2512.18908
multimodal disease progression modeling via spatiotemporal disentanglement and m | arXiv: 2510.11112
multimodal generative flows for lhc jets | arXiv: 2509.01736
multimodal negative learning | arXiv: 2510.20877
multiplayer federated learning reaching equilibrium with less communication | arXiv: 2501.08263
multiscale guidance of protein structure prediction with heterogeneous cryo-em d | arXiv: 2506.04490
murating a high quality data selecting approach to multilingual large language m | arXiv: 2507.01785
music arena live evaluation for text-to-music | arXiv: 2507.20900
muslr multimodal symbolic logical reasoning | arXiv: 2509.25851
mustafar promoting unstructured sparsity for kv cache pruning in llm inference | arXiv: 2505.22913
mutualvpr a mutual learning framework for resolving supervision inconsistencies | arXiv: 2412.09199
muvr a multi-modal untrimmed video retrieval benchmark with multi-level visual c | arXiv: 2510.21406
mvsmamba multi-view stereo with state space model | arXiv: 2511.01315
natural gradient descent for improving variational inference based classificatio | arXiv: 2511.13224
natural gradient vi guarantees for non-conjugate models | arXiv: 2510.19163
nautilus a large multimodal model for underwater scene understanding | arXiv: 2510.27481
navigating simply aligning deeply winning solutions for mouse vs ai 2025 | arXiv: 2602.00982
navil rethinking scaling properties of native multimodal large language models u | arXiv: 2510.08565
near-exponential savings for mean estimation with active learning | arXiv: 2511.05736
near-optimal quantum algorithms for computing coarse correlated equilibria of ge | arXiv: 2510.16782
nearly-linear time private hypothesis selection with the optimal approximation f | arXiv: 2506.01162
needleinatable exploring long-context capability of large language models toward | arXiv: 2504.06560
negocollab a common representation negotiation approach for heterogeneous collab | arXiv: 2510.27647
nemotron-climb clustering-based iterative data mixture bootstrapping for languag | arXiv: 2504.13161
Nemotron-Flash: Towards Latency-Optimal Hybrid Small Language Models | arXiv: 2511.18890
nerfbaselines consistent and reproducible evaluation of novel view synthesis met | arXiv: 2406.17345
nesypr neurosymbolic proceduralization for efficient embodied reasoning | arXiv: 2510.19429
neural collapse in cumulative link models for ordinal regression an analysis wit | arXiv: 2506.05801
neural collapse under gradient flow on shallow relu networks for orthogonally se | arXiv: 2510.21078
neural deprojection of galaxy stellar mass profiles | arXiv: 2511.20746
neural emulator superiority when machine learning for pdes surpasses its trainin | arXiv: 2510.23111
neural entropy | arXiv: 2409.03817
neural greens functions | arXiv: 2511.01924
neural mjd neural non-stationary merton jump diffusion for time series predictio | arXiv: 2506.04542
neural network for simulating radio emission from extensive air showers | arXiv: 2512.21407
neural stochastic flows solver-free modelling and inference for sde solutions | arXiv: 2510.25769
neural thermodynamics entropic forces in deep and universal representation learn | arXiv: 2505.12387
neurips should lead scientific consensus on ai policy | arXiv: 2510.00075
neuript foundation model for neural interfaces | arXiv: 2510.16548
neuro-spectral architectures for causal physics-informed networks | arXiv: 2509.04966
neuro-symbolic entity alignment via variational inference | arXiv: 2410.04153
neuropath neurobiology-inspired path tracking and reflection for semantically co | arXiv: 2511.14096
neurosymbolic diffusion models | arXiv: 2505.13138
next semantic scale prediction via hierarchical diffusion language models | arXiv: 2510.08632
nnterp a standardized interface for mechanistic interpretability of transformers | arXiv: 2511.14465
node-based editing for multimodal generation of text audio image and video | arXiv: 2511.03227
noise-robustness through noise a framework combining asymmetric lora with poison | arXiv: 2505.23868
NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation | arXiv: 2504.13055
non-asymptotic analysis of data augmentation for precision matrix estimation | arXiv: 2510.02119
non-clairvoyant scheduling with progress bars | arXiv: 2509.19662
non-convex entropic mean-field optimization via best response flow | arXiv: 2505.22760
non-markovian discrete diffusion with causal language models | arXiv: 2502.09767
non-stationary bandit convex optimization a comprehensive study | arXiv: 2506.02980
nonlinear laplacians tunable principal component analysis under directional prio | arXiv: 2505.12528
nonlinearly preconditioned gradient methods momentum and stochastic analysis | arXiv: 2510.11312
normal-abnormal guided generalist anomaly detection | arXiv: 2510.00495
normalization in attention dynamics | arXiv: 2510.22026
not all deepfakes are created equal triaging audio forgeries for robust deepfake | arXiv: 2510.17474
not all splits are equal rethinking attribute generalization across unrelated ca | arXiv: 2509.06998
novel class discovery for point cloud segmentation via joint learning of causal | arXiv: 2510.13307
novel view synthesis from a few glimpses via test-time natural video completion | arXiv: 2511.17932
npn non-linear projections of the null-space for imaging inverse problems | arXiv: 2510.01608
nsw-epnews a news-augmented benchmark for electricity price forecasting with llm | arXiv: 2506.11050
obclip oblivious cloud-device hybrid image generation with privacy preservation | arXiv: 2510.04153
object-centric representation learning for enhanced 3d semantic scene graph pred | arXiv: 2510.04714
obliviator reveals the cost of nonlinear guardedness in concept erasure | arXiv: 2603.07529
ocn effectively utilizing higher-order common neighbors for better link predicti | arXiv: 2505.19719
offline policy evaluation of multi-turn llm health coaching with real users | arXiv: 2510.17173
omni-mol multitask molecular model for any-to-any modalities | arXiv: 2502.01074
omnicast a masked latent diffusion model for weather forecasting across time sca | arXiv: 2510.18707
omnidraft a cross-vocabulary online adaptive drafter for on-device speculative d | arXiv: 2507.02659
omnifc rethinking federated clustering via lossless and secure distance reconstr | arXiv: 2505.13071
omnigaze reward-inspired generalizable gaze estimation in the wild | arXiv: 2510.13660
omnisegmentor a flexible multi-modal learning framework for semantic segmentatio | arXiv: 2509.15096
OmniSync: Towards Universal Lip Synchronization via Diffusion Transformers | arXiv: 2505.21448
omnivcus feedforward subject-driven video customization with multimodal control | arXiv: 2506.23361
on a geometry of interbrain networks | arXiv: 2509.10650
on agnostic pac learning in the small error regime | arXiv: 2502.09496
on evaluating llm alignment by evaluating llms as judges | arXiv: 2511.20604
On Extending Direct Preference Optimization to Accommodate Ties | arXiv: 2409.17431
on geometry-enhanced parameter-efficient fine-tuning for 3d scene segmentation | arXiv: 2505.22444
On Learning Verifiers and Implications to Chain-of-Thought Reasoning | arXiv: 2505.22650
on minimax estimation of parameters in softmax-contaminated mixture of experts | arXiv: 2505.18455
on optimal steering to achieve exact fairness | arXiv: 2509.15759
on the creation of narrow ai hierarchy and nonlocality of neural network skills | arXiv: 2505.15811
on the emergence of linear analogies in word embeddings | arXiv: 2505.18651
on the empirical power of goodness-of-fit tests in watermark detection | arXiv: 2510.03944
on the entropy calibration of language models | arXiv: 2511.11966
On the Expressive Power of Mixture-of-Experts for Structured Complex Tasks | arXiv: 2505.24205
on the global optimality of policy gradient methods in general utility reinforce | arXiv: 2410.04108
on the hardness of approximating distributions with tractable probabilistic mode | arXiv: 2506.01281
on the hardness of conditional independence testing in practice | arXiv: 2512.14000
on the relation between rectified flows and optimal transport | arXiv: 2505.19712
on the robustness of verbal confidence of llms in adversarial attacks | arXiv: 2507.06489
on the role of hidden states of modern hopfield network in transformer | arXiv: 2511.20698
on the sample complexity of differentially private policy optimization | arXiv: 2510.21060
on the surprising effectiveness of large learning rates under standard width sca | arXiv: 2505.22491
on the value of cross-modal misalignment in multimodal representation learning | arXiv: 2504.10143
on topological descriptors for graph products | arXiv: 2511.08846
on universality classes of equivariant networks | arXiv: 2506.02293
once upon an input reasoning via per-instance program synthesis | arXiv: 2510.22849
one filters all a generalist filter for state estimation | arXiv: 2509.20051
one prompt fits all universal graph adaptation for pretrained models | arXiv: 2509.22416
one sample is enough to make conformal prediction robust | arXiv: 2506.16553
one small step with fingerprints one giant leap for de novo molecule generation | arXiv: 2508.04180
one stone with two birds a null-text-null frequency-aware diffusion models for t | arXiv: 2510.08273
one token embedding is enough to deadlock your large reasoning model | arXiv: 2510.15965
one-shot transfer learning for nonlinear pdes with perturbative pinns | arXiv: 2511.11137
one-step diffusion-based image compression with semantic distillation | arXiv: 2505.16687
online feedback efficient active target discovery in partially observable enviro | arXiv: 2505.06535
online mixture of experts no-regret learning for optimal collective decision-mak | arXiv: 2510.21788
online optimization for offline safe reinforcement learning | arXiv: 2510.22027
online segment any 3d thing as instance tracking | arXiv: 2512.07599
Online Two-Stage Submodular Maximization | arXiv: 2510.19480
onlinesplatter pose-free online 3d reconstruction for free-moving objects | arXiv: 2510.20605
open vision reasoner transferring linguistic cognitive behavior for visual reaso | arXiv: 2507.05255
open-insect benchmarking open-set recognition of novel species in biodiversity m | arXiv: 2503.01691
open-world drone active tracking with goal-centered rewards | arXiv: 2412.00744
openbox annotate any bounding boxes in 3d | arXiv: 2512.01352
openhoi open-world hand-object interaction synthesis with multimodal large langu | arXiv: 2505.18947
openlex3d a tiered evaluation benchmark for open-vocabulary 3d scene representat | arXiv: 2503.19764
operation veja fixing fundamental concepts missing from modern roleplaying train | arXiv: 2601.06039
opinion maximization in social networks by modifying internal opinions | arXiv: 2510.17226
opinion towards unified expressive policy optimization for robust robot learning | arXiv: 2511.10087
optimal adjustment sets for nonparametric estimation of weighted controlled dire | arXiv: 2506.09871
optimal online change detection via random fourier features | arXiv: 2505.17789
optimal rates for generalization of gradient descent for deep relu classificatio | arXiv: 2510.02779
optimality and np-hardness of transformers in learning markovian dynamical funct | arXiv: 2510.18638
optimism without regularization constant regret in zero-sum games | arXiv: 2506.16736
optimistic online-to-batch conversions for accelerated convergence and universal | arXiv: 2511.06597
optimized learned count-min sketch | arXiv: 2512.12252
optimizing distributional geometry alignment with optimal transport for generati | arXiv: 2512.00308
optimizing the unknown black box bayesian optimization with energy-based model a | arXiv: 2510.19530
optitree hierarchical thoughts generation with tree search for llm optimization | arXiv: 2510.22192
oracle-efficient combinatorial semi-bandits | arXiv: 2510.21431
orbit -- open recommendation benchmark for reproducible research with hidden tes | arXiv: 2510.26095
orbitzoo real orbital systems challenges for reinforcement learning | arXiv: 2504.04160
orchestration framework for financial agents from algorithmic trading to agentic | arXiv: 2512.02227
order-level attention similarity across language models a latent commonality | arXiv: 2511.05064
ordinal label-distribution learning with constrained asymmetric priors for imbal | arXiv: 2509.26146
ordshap feature position importance for sequential black-box models | arXiv: 2507.11855
orient anything v2 unifying orientation and rotation understanding | arXiv: 2601.05573
orientation matters making 3d generative models orientation-aligned | arXiv: 2506.08640
orientation-anchored hyper-gaussian for 4d reconstruction from casual videos | arXiv: 2509.23492
orochi versatile biomedical image processor | arXiv: 2509.22583
orpo-distill mixed-policy preference optimization for cross-architecture llm dis | arXiv: 2509.25100
orthograd improves neural calibration | arXiv: 2506.04487
ortholoc uav 6-dof localization and calibration using orthographic geodata | arXiv: 2509.18350
oryx a scalable sequence model for many-agent coordination in offline marl | arXiv: 2505.22151
os-harm a benchmark for measuring safety of computer use agents | arXiv: 2506.14866
osmgen highly controllable satellite image synthesis using openstreetmap data | arXiv: 2511.00345
out of control -- why alignment needs formal control theory and an alignment con | arXiv: 2506.17846
out-of-distribution generalisation is hard evidence from arc-like tasks | arXiv: 2505.09716
over-squashing in spatiotemporal graph neural networks | arXiv: 2506.15507
overcoming sparsity artifacts in crosscoders to interpret chat-tuning | arXiv: 2504.02922
overfitting in adaptive robust optimization | arXiv: 2509.16451
overlaybench a benchmark for layout-to-image generation with dense overlaps | arXiv: 2509.19282
overt a benchmark for over-refusal evaluation on text-to-image models | arXiv: 2505.21347
p-drum post-hoc descriptor-based residual uncertainty modeling for machine learn | arXiv: 2509.02927
pac-bayes bounds for multivariate linear regression and linear autoencoders | arXiv: 2512.12905
pairwise optimal transports for training all-to-all flow-based condition transfe | arXiv: 2504.03188
pancakes consistent multi-protocol image segmentation across biomedical domains | arXiv: 2512.13534
panda towards generalist video anomaly detection via agentic ai engineer | arXiv: 2509.26386
pandapose 3d human pose lifting from a single image via propagating 2d pose prio | arXiv: 2602.01095
panel-by-panel souls a performative workflow for expressive faces in ai-assisted | arXiv: 2511.16038
panoptic captioning an equivalence bridge for image and text | arXiv: 2505.16334
parallelization of non-linear state-space models scaling up liquid-resistance li | arXiv: 2505.21717
parallelprompt extracting parallelism from large language model queries | arXiv: 2506.18728
parameter efficient fine-tuning via explained variance adaptation | arXiv: 2410.07170
parameter-free algorithms for the stochastically extended adversarial model | arXiv: 2510.04685
parco parallel autoregressive models for multi-agent combinatorial optimization | arXiv: 2409.03811
paretoq improving scaling laws in extremely low-bit llm quantization | arXiv: 2502.02631
parrot a benchmark for evaluating llms in cross-system sql translation | arXiv: 2509.23338
part-aware bottom-up group reasoning for fine-grained social interaction detecti | arXiv: 2511.03666
partial information decomposition via normalizing flows in latent gaussian distr | arXiv: 2510.04417
partnext a next-generation dataset for fine-grained and hierarchical 3d part und | arXiv: 2510.20155
partonomy large multimodal models with part-level visual understanding | arXiv: 2505.20759
pass path-selective state space model for event-based recognition | arXiv: 2409.16953
path attention position encoding via accumulating householder transformations | arXiv: 2505.16381
patientsim a persona-driven simulator for realistic doctor-patient interactions | arXiv: 2505.17818
perceptually aligning representations of music via noise-augmented autoencoders | arXiv: 2511.05350
performative validity of recourse explanations | arXiv: 2506.15366
periodic skill discovery | arXiv: 2511.03187
permllm learnable channel permutation for n:m sparse large language models | arXiv: 2510.10136
personalized subgraph federated learning with differentiable auxiliary projectio | arXiv: 2505.23864
perturb a model not an image towards robust privacy protection via anti-personal | arXiv: 2511.01307
perturbation bounds for low-rank inverse approximations under noise | arXiv: 2510.25571
pfδ a benchmark dataset for power flow under load generation and topology variat | arXiv: 2510.22048
pharmacophore-guided generative design of novel drug-like molecules | arXiv: 2510.01480
photography perspective composition towards aesthetic perspective recommendation | arXiv: 2505.20655
PhysCtrl: Generative Physics for Controllable and Physics-Grounded Video Generation
physics of language models part 4.1 architecture design and the magic of canon la | arXiv: 2512.17351
physics-constrained flow matching sampling generative models with hard constrain | arXiv: 2506.04171
physics-driven spatiotemporal modeling for ai-generated video detection | arXiv: 2510.08073
physics-guided machine learning for uncertainty quantification in turbulence mod | arXiv: 2511.05633
physics-informed neural networks with fourier features and attention-driven deco | arXiv: 2510.05385
physics-informed reduced order modeling of time-dependent pdes via differentiabl | arXiv: 2505.14595
physiowave a multi-scale wavelet-transformer for physiological signal representa | arXiv: 2506.10351
physvlm-avr active visual reasoning for multimodal large language models in phys | arXiv: 2510.21111
physx-3d physical-grounded 3d asset generation | arXiv: 2507.12465
pid-controlled langevin dynamics for faster sampling of generative models | arXiv: 2511.12603
pixel-perfect depth with semantics-prompted diffusion transformers | arXiv: 2510.07316
pixfoundation 2.0 do video multi-modal llms use motion in visual grounding | arXiv: 2509.02807
pixperfect seamless latent diffusion local editing with discriminative pixel-spa | arXiv: 2512.03247
plana3r zero-shot metric planar 3d reconstruction via feed-forward planar splatt | arXiv: 2510.18714
planargs high-fidelity indoor 3d gaussian splatting guided by vision-language pl | arXiv: 2510.23930
planning without search refining frontier llms with offline goal-conditioned rl | arXiv: 2505.18098
planu large language model reasoning through planning under uncertainty | arXiv: 2510.18442
plasticity as the mirror of empowerment | arXiv: 2505.10361
pluralistic behavior suite stress-testing multi-turn adherence to custom behavio | arXiv: 2511.05018
pointmac meta-learned adaptation for robust test-time point cloud completion | arXiv: 2510.10365
polar sparsity high throughput batched llm inferencing with scalable contextual | arXiv: 2505.14884
polaris a high-contrast polarimetric imaging benchmark dataset for exoplanetary | arXiv: 2506.03511
policy compatible skill incremental learning via lazy learning interface | arXiv: 2509.20612
policy-as-prompt turning ai governance rules into guardrails for ai agents | arXiv: 2509.23994
poly-guard massive multi-domain safety policy-grounded guardrail dataset | arXiv: 2506.19054
polyjuice makes it real black-box universal red teaming for synthetic image dete | arXiv: 2509.15551
polypose deformable 2d/3d registration via polyrigid transformations | arXiv: 2505.19256
posecrafter extreme pose estimation with hybrid video synthesis | arXiv: 2510.19527
position bridge the gaps between machine unlearning and ai regulation | arXiv: 2502.12430
position paper if innovation in ai systematically violates fundamental rights is | arXiv: 2511.00027
position the complexity of perfect ai alignment -- formalizing the rlhf trilemma | arXiv: 2511.19504
position thematic analysis of unstructured clinical transcripts with large langu | arXiv: 2509.14597
position there is no free bayesian uncertainty quantification | arXiv: 2506.03670
position towards bidirectional human-ai alignment | arXiv: 2406.09264
post hoc regression refinement via pairwise rankings | arXiv: 2508.16495
posterior sampling by combining diffusion models with annealed langevin dynamics | arXiv: 2510.26324
power ensemble aggregation for improved extreme event ai prediction | arXiv: 2511.11170
power lines scaling laws for weight decay and batch size in llm pre-training | arXiv: 2505.13738
ppg-distill efficient photoplethysmography signals analysis via foundation model | arXiv: 2509.19215
practical bayes-optimal membership inference attacks | arXiv: 2505.24089
practical do-shapley explanations with estimand-agnostic causal inference | arXiv: 2509.20211
pragmatic heterogeneous collaborative perception via generative communication me | arXiv: 2510.19618
Praxis-VLM: Vision-Grounded Decision Making via Text-Driven Reinforcement Learning | arXiv: 2503.16965
Precise Information Control in Long-Form Text Generation | arXiv: 2506.06589
preconditioned langevin dynamics with score-based generative models for infinite | arXiv: 2505.18276
predict training data quality via its geometry in metric space | arXiv: 2510.15970
predicting public health impacts of electricity usage | arXiv: 2511.22031
predicting the performance of black-box llms through follow-up queries | arXiv: 2501.01558
prediction-powered semi-supervised learning with online power tuning | arXiv: 2510.22586
predictive feature caching for training-free acceleration of molecular geometry | arXiv: 2510.04646
predictive preference learning from human interventions | arXiv: 2510.01545
preference learning with lie detectors can induce honesty or evasion | arXiv: 2505.13787
preference learning with response time robust losses and guarantees | arXiv: 2505.22820
preference optimization by estimating the ratio of the data distribution | arXiv: 2505.19601
preference-based reinforcement learning beyond pairwise comparisons benefits of | arXiv: 2510.18713
preference-driven knowledge distillation for few-shot node classification | arXiv: 2510.10116
PrefixKV: Adaptive Prefix KV Cache is What Vision Instruction-Following Models Need for Efficient Generation | arXiv: 2412.03409
prefm online audio-visual event parsing via predictive future modeling | arXiv: 2505.23155
prescribe predicting single-cell responses with bayesian estimation | arXiv: 2510.07964
preserving llm capabilities through calibration data curation from analysis to o | arXiv: 2510.10618
preserving task-relevant information under linear concept removal | arXiv: 2506.10703
presto preimage-informed instruction optimization for prompting black-box llms | arXiv: 2510.25808
pretraining a unified pddl domain from real-world demonstrations for generalizab | arXiv: 2507.21545
preventing shortcuts in adapter training via providing the shortcuts | arXiv: 2510.20887
principled data augmentation for learning to solve quadratic programming problem | arXiv: 2506.01728
principled fine-tuning of llms from user-edits a medley of preference supervisio | arXiv: 2601.19055
prior-guided flow matching for target-aware molecule design with learnable atom | arXiv: 2509.01486
prioritizing perception-guided self-supervision a new paradigm for causal modeli | arXiv: 2511.08214
private continual counting of unbounded streams | arXiv: 2506.15018
private evolution converges | arXiv: 2506.08312
private zeroth-order optimization with public data | arXiv: 2511.10859
probabilistic reasoning with llms for k-anonymity estimation | arXiv: 2503.09674
Probabilistic Token Alignment for Large Language Model Fusion | arXiv: 2509.17276
probability calibration for precipitation nowcasting | arXiv: 2510.00594
probing neural combinatorial optimization models | arXiv: 2510.22131
problem-parameter-free decentralized bilevel optimization | arXiv: 2510.24288
procurement auctions with predictions improved frugality for facility location | arXiv: 2512.09367
product distribution learning with imperfect advice | arXiv: 2511.10366
profit a specialized optimizer for deep fine tuning | arXiv: 2412.01930
program synthesis via test-time transduction | arXiv: 2509.17393
progressive inference-time annealing of diffusion models for sampling from boltz | arXiv: 2506.16471
projecting assumptions the duality between sparse autoencoders and concept geome | arXiv: 2503.01822
prompt tuning decision transformers with structured and scalable bandits | arXiv: 2502.04979
prompt-based safety guidance is ineffective for unlearned text-to-image diffusio | arXiv: 2511.04834
ProofSketch: Efficient Verified Reasoning for Large Language Models | arXiv: 2510.24811
prospero active learning for robust protein design beyond wild-type neighborhood | arXiv: 2505.22494
protein design with dynamic protein vocabulary | arXiv: 2505.18966
provable ordering and continuity in vision-language pretraining for generalizabl | arXiv: 2502.01218
Provable Scaling Laws for the Test-Time Compute of Large Language Models | arXiv: 2411.19477
provable watermarking for data poisoning attacks | arXiv: 2510.09210
provably efficient online rlhf with one-pass reward modeling | arXiv: 2502.07193
psi-sampler initial particle sampling for smc-based inference-time reward alignm | arXiv: 2506.01320
pubsub-vfl towards efficient two-party split learning in heterogeneous environme | arXiv: 2510.12494
pulse practical evaluation scenarios for large multimodal model unlearning | arXiv: 2507.01271
purifying shampoo investigating shampoos heuristics by decomposing its precondit | arXiv: 2506.03595
put cash on bandits a max k-armed problem for automated machine learning | arXiv: 2505.05226
q-palette fractional-bit quantizers toward optimal bit allocation for efficient | arXiv: 2509.20214
qimeng-neucomback self-evolving translation from ir to assembly code | arXiv: 2511.01183
qimeng-salv signal-aware learning for verilog code generation | arXiv: 2510.19296
qoq-med building multimodal clinical foundation models with domain-aware grpo tr | arXiv: 2506.00711
qsharp provably optimal distributional rl for llm post-training | arXiv: 2502.20548
qsvd efficient low-rank approximation for unified query-key-value weight compres | arXiv: 2510.16292
quadenhancer leveraging quadratic transformations to enhance deep neural network | arXiv: 2510.03276
quantifying and alleviating co-adaptation in sparse-view 3d gaussian splatting | arXiv: 2508.12720
quantifying climate policy action and its links to development outcomes a cross- | arXiv: 2510.17425
quantifying generalisation in imitation learning | arXiv: 2509.24784
quantifying task-relevant representational similarity using decision variable co | arXiv: 2506.02164
quantifying the role of openfold components in protein structure prediction | arXiv: 2511.14781
quantitative convergence of trained single layer neural networks to gaussian pro | arXiv: 2509.24544
quantization error propagation revisiting layer-wise post-training quantization | arXiv: 2504.09629
quantum doubly stochastic transformers | arXiv: 2504.16275
r2ec towards large recommender models with reasoning | arXiv: 2505.16994
rad towards trustworthy retrieval-augmented multi-modal clinical diagnosis | arXiv: 2509.19980
radar benchmarking language models on imperfect tabular data | arXiv: 2506.08249
radial attention o(n log n) sparse attention with energy decay for long video gener | arXiv: 2506.19852
radial neighborhood smoothing recommender system | arXiv: 2507.09952
radzero similarity-based cross-attention for explainable vision-language alignme | arXiv: 2504.07416
rag-igbench innovative evaluation for rag-based interleaved generation in open-d | arXiv: 2512.05119
ram-w600 a multi-task wrist dataset and benchmark for rheumatoid arthritis | arXiv: 2507.05193
random search neural networks for efficient and expressive graph learning | arXiv: 2510.22520
rao-blackwellised reparameterisation gradients | arXiv: 2506.07687
raptr radar-based 3d pose estimation using transformer | arXiv: 2511.08387
rare text semantics were always there in your diffusion transformer | arXiv: 2510.03886
rat bridging rnn efficiency and attention accuracy via chunk-based sequence mode | arXiv: 2507.04416
raw2drive reinforcement learning with aligned world models for end-to-end autono | arXiv: 2505.16394
raxss retrieval-augmented sparse sampling for explainable variable-length medica | arXiv: 2510.02936
rccda adaptive model updates in the presence of concept drift under a constraine | arXiv: 2505.24149
rd-agent-quant a multi-agent framework for data-centric factors and model joint | arXiv: 2505.15155
rdb2g-bench a comprehensive benchmark for automatic graph modeling of relational | arXiv: 2506.01360
rdd retrieval-based demonstration decomposer for planner alignment in long-horiz | arXiv: 2510.14968
re-coding for uncertainties edge-awareness semantic concordance for resilient ev | arXiv: 2511.08269
re-forc adaptive reward prediction for efficient chain-of-thought reasoning | arXiv: 2511.02130
reading recognition in the wild | arXiv: 2505.24848
real-time execution of action chunking flow policies | arXiv: 2506.07339
real-world adverse weather image restoration via dual-level reinforcement learni | arXiv: 2511.05095
Real-World Reinforcement Learning of Active Perception Behaviors | arXiv: 2512.01188
RealMath: A Continuous Benchmark for Evaluating Language Models on Research-Level Mathematics | arXiv: 2505.12575
ReasonFlux-PRM: Trajectory-Aware PRMs for Long Chain-of-Thought Reasoning in LLMs | arXiv: 2506.18896
Reasoning by Superposition: A Theoretical Perspective on Chain of Continuous Thought | arXiv: 2505.12514
reasoning compiler llm-guided optimizations for efficient model serving | arXiv: 2506.01374
Reasoning Gym: Reasoning Environments for Reinforcement Learning with Verifiable Rewards | arXiv: 2505.24760
reasoning meets representation envisioning neuro-symbolic wireless foundation mo | arXiv: 2511.16369
reasoning models better express their confidence | arXiv: 2505.14489
reasoning models hallucinate more factuality-aware reinforcement learning for la | arXiv: 2505.24630
Reasoning With a Star: A Heliophysics Dataset and Benchmark for Agentic Scientific Reasoning | arXiv: 2511.20694
recognition through reasoning reinforcing image geo-localization with large visi | arXiv: 2506.14674
recon region-controllable data augmentation with rectification and alignment for | arXiv: 2510.15783
recon-gs continuum-preserved gaussian streaming for fast and compact reconstruct | arXiv: 2509.24325
reconstruct inpaint test-time finetune dynamic novel-view synthesis from monocul | arXiv: 2507.12646
reconstructing the local density field with combined convolutional and point clo | arXiv: 2510.08573
reconstruction and secrecy under approximate distance queries | arXiv: 2511.06461
rectified point flow generic point cloud pose estimation | arXiv: 2506.05282
rectified-cfg for flow based models | arXiv: 2510.07631
rectifying shortcut behaviors in preference-based reward learning | arXiv: 2510.19050
rectifying soft-label entangled bias in long-tailed dataset distillation | arXiv: 2511.17914
recurrent attention-based token selection for efficient streaming video-llms | arXiv: 2510.17364
recurrent memory for online interdomain gaussian processes | arXiv: 2502.08736
recurrent self-attention dynamics an energy-agnostic perspective from jacobians | arXiv: 2505.19458
redefining experts interpretable decomposition of language models for toxicity m | arXiv: 2509.16660
redundancy-aware test-time graph out-of-distribution detection | arXiv: 2510.14562
reflective translation improving low-resource machine translation via structured | arXiv: 2601.19871
reflora refactored low-rank adaptation for efficient fine-tuning of large models | arXiv: 2505.18877
regression trees know calculus | arXiv: 2405.13846
regret lower bounds for decentralized multi-agent stochastic shortest path probl | arXiv: 2511.04594
reinforcement learning finetunes small subnetworks in large language models | arXiv: 2505.11711
Reinforcement Learning for Long-Horizon Multi-Turn Search Agents | arXiv: 2510.24126
reinforcement learning teachers of test time scaling | arXiv: 2506.08388
reinforcement learning with action chunking | arXiv: 2507.07969
reinforcement learning with backtracking feedback | arXiv: 2602.08377
reinforcing the diffusion chain of lateral thought with diffusion language model | arXiv: 2505.10446
reject only critical tokens pivot-aware speculative decoding | arXiv: 2511.00351
reliabilityrag effective and provably robust defense for rag-based web-search | arXiv: 2509.23519
reliable active learning from unreliable labels via neural collapse geometry | arXiv: 2510.09740
reliable decision making via calibration oriented retrieval augmented generation | arXiv: 2411.08891
reliably detecting model failures in deployment without labels | arXiv: 2506.05047
relieving the over-aggregating effect in graph transformers | arXiv: 2510.21267
remasking discrete diffusion models with inference-time scaling | arXiv: 2503.00307
remindrag low-cost llm-guided knowledge graph traversal for efficient rag | arXiv: 2510.13193
reordering patches improves vision models | arXiv: 2505.23751
rep resource-efficient prompting for rehearsal-free continual learning | arXiv: 2406.04772
reparameterized llm training via orthogonal equivalence transformation | arXiv: 2506.08001
repic reinforced post-training for personalizing multi-modal language models | arXiv: 2506.18369
replaceme network simplification via depth pruning and transformer block lineari | arXiv: 2505.02819
repldm reprogramming pretrained latent diffusion models for high-quality high-ef | arXiv: 2410.06055
representation consistency for accurate and coherent llm answer aggregation | arXiv: 2506.21590
resnets are deeper than you think | arXiv: 2506.14386
resounding acoustic fields with reciprocity | arXiv: 2510.20602
respodiff dual-module bottleneck transformation for responsible faithful t2i gen | arXiv: 2509.15257
responserank data-efficient reward modeling through preference strength learning | arXiv: 2512.25023
restoring pruned large language models via lost component compensation | arXiv: 2510.21834
Rethinking Circuit Completeness in Language Models: AND, OR, and ADDER Gates | arXiv: 2505.10039
rethinking direct preference optimization in diffusion models | arXiv: 2505.18736
rethinking evaluation of infrared small target detection | arXiv: 2509.16888
rethinking losses for diffusion bridge samplers | arXiv: 2506.10982
Rethinking Multimodal Learning from the Perspective of Mitigating Classification Ability Disproportion | arXiv: 2502.20120
rethinking neural combinatorial optimization for vehicle routing problems with d | arXiv: 2505.24627
rethinking nighttime image deraining via learnable color space transformation | arXiv: 2510.17440
Rethinking Optimal Verification Granularity for Compute-Efficient Test-Time Scaling | arXiv: 2505.11730
rethinking pca through duality | arXiv: 2510.18130
rethinking residual distribution in locate-then-edit model editing | arXiv: 2502.03748
rethinking the simulation vs rendering dichotomy no free lunch in spatial world | arXiv: 2510.20835
retrieval is not enough enhancing rag reasoning through test-time critique and o | arXiv: 2504.14858
retrieval-augmented generation for reliable interpretation of radio regulations | arXiv: 2509.09651
Retrospective In-Context Learning for Temporal Credit Assignment with Large Language Models
retrosynthesis planning via worst-path policy optimisation in tree-structured md | arXiv: 2509.10504
retrv-r1 a reasoning-driven mllm framework for universal and efficient multimoda | arXiv: 2510.02745
revealing multimodal causality with large language models | arXiv: 2509.17784
reverse engineering human preferences with reinforcement learning | arXiv: 2505.15795
revisiting agnostic boosting | arXiv: 2503.09384
revisiting bi-linear state transitions in recurrent neural networks | arXiv: 2505.21749
revisiting end-to-end learning with slide-level supervision in computational pat | arXiv: 2506.02408
revisiting generative infrared and visible image fusion based on human cognitive | arXiv: 2510.26268
revisiting logit distributions for reliable out-of-distribution detection | arXiv: 2510.20134
revisiting orbital minimization method for neural operator decomposition | arXiv: 2510.21952
revisiting semi-supervised learning in the era of foundation models | arXiv: 2503.09707
reward-aware proto-representations in reinforcement learning | arXiv: 2505.16217
rewind-to-delete certified machine unlearning for nonconvex functions | arXiv: 2409.09778
rgb-only supervised camera parameter optimization in dynamic scenes | arXiv: 2509.15123
rgb-to-polarization estimation a new task and benchmark study | arXiv: 2505.13050
riemannian consistency model | arXiv: 2510.00983
riemannian flow matching for brain connectivity matrices via pullback geometry | arXiv: 2505.18193
riganyface scaling neural facial mesh auto-rigging with unlabeled data | arXiv: 2511.18601
risk management for mitigating benchmark failure modes benchrisk | arXiv: 2510.21460
risk-averse constrained reinforcement learning with optimized certainty equivale | arXiv: 2510.20199
risk-averse total-reward reinforcement learning | arXiv: 2506.21683
rivermamba a state space model for global river discharge and flood forecasting | arXiv: 2505.22535
RL Tango: Reinforcing Generator and Verifier Together for Language Reasoning | arXiv: 2505.15034
rlgf reinforcement learning with geometric feedback for autonomous driving video | arXiv: 2509.16500
rlvr-world training world models with reinforcement learning | arXiv: 2505.13934
rlzero direct policy inference from language without in-domain supervision | arXiv: 2412.05718
rmit-adms at the mmu-rag neurips 2025 competition | arXiv: 2602.20735
rnns perform task computations by dynamically warping neural representations | arXiv: 2512.04310
robocerebra a large-scale benchmark for long-horizon robotic manipulation evalua | arXiv: 2506.06677
roborefer towards spatial referring with reasoning in vision-language models for | arXiv: 2506.04308
robot-r1 reinforcement learning for enhanced embodied reasoning in robotics | arXiv: 2506.00070
robust adversarial reinforcement learning in stochastic games via sequence model | arXiv: 2510.11877
robust and diverse multi-agent learning via rational policy gradient | arXiv: 2511.09535
robust ego-exo correspondence with long-term memory | arXiv: 2510.11417
robust egocentric referring video object segmentation via dual-modal causal inte | arXiv: 2512.24323
robust estimation under heterogeneous corruption rates | arXiv: 2508.15051
robust federated finetuning of llms via alternating optimization of lora | arXiv: 2502.01755
robust graph condensation via classification complexity mitigation | arXiv: 2510.26451
robust hallucination detection in llms via adaptive token selection | arXiv: 2504.07863
robust llm alignment via distributionally robust direct preference optimization | arXiv: 2502.01930
robust neural rendering in the wild with asymmetric dual 3d gaussian splatting | arXiv: 2506.03538
robust or suggestible exploring non-clinical induction in llm drug-safety decisi | arXiv: 2510.13931
robust sampling for active statistical inference | arXiv: 2511.08991
robustifying learning-augmented caching efficiently without compromising 1-consi | arXiv: 2507.16242
robustmerge parameter-efficient model merging for mllms with direction robustnes | arXiv: 2502.17159
robustness in both domains clip needs a robust text encoder | arXiv: 2506.03355
rogr relightable 3d objects using generative relighting | arXiv: 2510.03163
roirl efficient self-supervised reasoning with offline iterative reinforcement l | arXiv: 2510.02892
roma scaling up mamba-based foundation models for remote sensing | arXiv: 2503.10392
root cause analysis of outliers with missing structural knowledge | arXiv: 2406.05014
rotary masked autoencoders are versatile learners | arXiv: 2505.20535
router-r1 teaching llms multi-round routing and aggregation via reinforcement le | arXiv: 2506.09033
rscc a large-scale remote sensing change caption dataset for disaster events | arXiv: 2509.01907
rtv-bench benchmarking mllm continuous perception understanding and reasoning th | arXiv: 2505.02064
s2m-former spiking symmetric mixing branchformer for brain auditory attention de | arXiv: 2508.05164
s2q-vdit accurate quantized video diffusion transformer with salient data and sp | arXiv: 2508.04016
sad neural networks divergent gradient flows and asymptotic optimality via o-min | arXiv: 2505.09572
SAEMark: Steering Personalized Multilingual LLM Watermarks with Sparse Autoencoders | arXiv: 2508.08211
safe and stable control via lyapunov-guided diffusion models | arXiv: 2509.25375
safe multitask failure detection for vision-language-action models | arXiv: 2506.09937
Safe-Sora: Safe Text-to-Video Generation via Graphical Watermarking | arXiv: 2505.12667
safepath preventing harmful reasoning in chain-of-thought via early alignment | arXiv: 2505.14667
safeptr token-level jailbreak defense in multimodal llms via prune-then-restore | arXiv: 2507.01513
safevla towards safety alignment of vision-language-action model via constrained | arXiv: 2503.03480
safire saccade-fixation reiteration with mamba for referring image segmentation | arXiv: 2510.10160
sam-r1 leveraging sam for reward feedback in multimodal segmentation via reinfor | arXiv: 2505.22596
sama towards multi-turn referential grounded video chat with large language mode | arXiv: 2505.18812
sample complexity of distributionally robust average-reward reinforcement learni | arXiv: 2505.10007
sample-adaptivity tradeoff in on-demand sampling | arXiv: 2511.15507
sample-efficient tabular self-play for offline robust reinforcement learning | arXiv: 2512.00352
Sampling-Efficient Test-Time Scaling: Self-Estimating the Best-of-N Sampling in Early Decoding | arXiv: 2503.01422
sand-math using llms to generate novel difficult and useful mathematics question | arXiv: 2507.20527
sansa unleashing the hidden semantics in sam2 for few-shot segmentation | arXiv: 2505.21795
sao-instruct free-form audio editing using natural language instructions | arXiv: 2510.22795
saying the unsaid revealing the hidden language of multimodal systems through te | arXiv: 2511.10690
scaffold diffusion sparse multi-category voxel structure generation with discret | arXiv: 2509.00062
Scalable Best-of-N Selection for Large Language Models via Self-Certainty | arXiv: 2502.18581
scalable diffusion transformer for conditional 4d fmri synthesis | arXiv: 2511.22870
scalable explainable and provably robust anomaly detection with one-step flow ma | arXiv: 2510.18328
scalable exploration via ensemble | arXiv: 2407.13195
scalable fingerprinting of large language models | arXiv: 2502.07760
scalable gpu-accelerated euler characteristic curves optimization and differenti | arXiv: 2510.20271
scalable inference of functional neural connectivity at submillisecond timescale | arXiv: 2510.20966
scalable neural incentive design with parameterized mean-field approximation | arXiv: 2510.21442
scalable policy-based rl algorithms for pomdps | arXiv: 2510.06540
scalable signature kernel computations for long time series via local neumann se | arXiv: 2502.20392
Scale-invariant Attention | arXiv: 2505.17083
scalediff higher-resolution image synthesis via efficient and model-agnostic dif | arXiv: 2510.25818
scaling can lead to compositional generalization | arXiv: 2507.07207
scaling diffusion transformers efficiently via μp | arXiv: 2505.15270
Scaling Embedding Layers in Language Models | arXiv: 2502.01637
scaling image geo-localization to continent level | arXiv: 2510.26795
scaling language-centric omnimodal representation learning | arXiv: 2510.11693
scaling laws and pathologies of single-layer pinns network width and pde nonline | arXiv: 2603.12556
scaling offline rl via efficient and expressive shortcut models | arXiv: 2505.22866
scaling rl to long videos | arXiv: 2507.07966
scaling up active testing to large language models | arXiv: 2508.09093
scan self-denoising monte carlo annotation for robust process reward learning | arXiv: 2509.16548
scatterad temporal-topological scattering mechanism for time series anomaly dete | arXiv: 2509.24414
scene-aware urban design a human-ai recommendation framework using co-occurrence | arXiv: 2511.06201
scenedecorator towards scene-oriented story generation with scene planning and s | arXiv: 2510.22994
scenedesigner controllable multi-object image generation with 9-dof pose manipul | arXiv: 2511.16666
sceneforge enhancing 3d-text alignment with structured scene compositions | arXiv: 2509.15693
sceneweaver all-in-one 3d scene synthesis with an extensible and self-reflective | arXiv: 2509.20414
schrödinger bridge matching for tree-structured costs and entropic wasserstein b | arXiv: 2506.17197
sciarena an open evaluation platform for non-verifiable scientific literature-gr | arXiv: 2507.01001
scmrdr a scalable and flexible framework for unpaired single-cell multi-omics da | arXiv: 2510.24987
scope saliency-coverage oriented token pruning for efficient multimodel llms | arXiv: 2510.24214
score-informed neural operator for enhancing ordering-based causal discovery | arXiv: 2508.12650
scPilot: Large Language Model Reasoning Toward Automated Single-Cell Analysis and Discovery | arXiv: 2602.11609
scsplit bringing severity cognizance to image decomposition in fluorescence micr | arXiv: 2503.22983
sd-vlm spatial measuring and understanding with depth-encoded vision-language mo | arXiv: 2509.17664
sdtagnet leveraging text-annotated navigation maps for online hd map constructio | arXiv: 2506.08997
seal semantic-aware hierarchical learning for generalized category discovery | arXiv: 2510.18740
searching latent program spaces | arXiv: 2411.08706
seca semantically equivalent and coherent attacks for eliciting llm hallucinatio | arXiv: 2510.04398
secon-rag a two-stage semantic filtering and conflict-free framework for trustwo | arXiv: 2510.09710
second-order optimization under heavy-tailed noise hessian clipping and sample c | arXiv: 2510.10690
securing the language of life inheritable watermarks from dna language models to | arXiv: 2509.18207
seeing beyond the scene analyzing and mitigating background bias in action recog | arXiv: 2512.17953
seeing is believing mitigating ocr hallucinations in multimodal large language m | arXiv: 2506.20168
seeing sound hearing sight uncovering modality bias and conflict of ai models in | arXiv: 2505.11217
seeing the arrow of time in large multimodal models | arXiv: 2506.03340
seeing the wind from a falling leaf | arXiv: 2512.00762
seetrek training-free spatial prompting for multimodal large language model | arXiv: 2509.16087
seg-var image segmentation with visual autoregressive modeling | arXiv: 2511.12594
seg4diff unveiling open-vocabulary segmentation in text-to-image diffusion trans | arXiv: 2509.18096
segmast3r geometry grounded segment matching | arXiv: 2510.05051
Segment Policy Optimization: Effective Segment-Level Credit Assignment in RL for Large Language Models | arXiv: 2505.23564
segment then splat unified 3d open-vocabulary segmentation via gaussian splattin | arXiv: 2503.22204
segment-factorized full-song generation on symbolic piano music | arXiv: 2510.05881
selective learning for deep time series forecasting | arXiv: 2510.25207
self forcing bridging the train-test gap in autoregressive video diffusion | arXiv: 2506.08009
self iterative label refinement via robust unlabeled learning | arXiv: 2502.12565
self-alignment of large video language models with refined regularized preferenc | arXiv: 2504.12083
self-improving embodied foundation models | arXiv: 2509.15155
self-refining language model anonymizers via adversarial distillation | arXiv: 2506.01420
self-supervised contrastive learning is approximately supervised contrastive lea | arXiv: 2506.04411
self-supervised discovery of neural circuits in spatially patterned neural respo | arXiv: 2509.17174
self-supervised learning of echocardiographic video representations via online c | arXiv: 2506.11777
self-supervised learning of graph representations for network intrusion detectio | arXiv: 2509.16625
self-supervised learning via flow-guided neural operator on time-series data | arXiv: 2602.12267
self-supervised synthetic pretraining for inference of stellar mass embedded in | arXiv: 2510.24159
semantic and visual crop-guided diffusion models for heterogeneous tissue synthe | arXiv: 2509.17847
semantic glitch agency and artistry in an autonomous pixel cloud | arXiv: 2511.16048
semantic retrieval augmented contrastive learning for sequential recommendation | arXiv: 2503.04162
semantic surgery zero-shot concept erasure in diffusion models | arXiv: 2510.22851
semi-infinite nonconvex constrained min-max optimization | arXiv: 2510.12007
semi-supervised graph anomaly detection via robust homophily learning | arXiv: 2506.15448
semi-supervised regression with heteroscedastic pseudo-labels | arXiv: 2510.15266
sempo lightweight foundation models for time series forecasting | arXiv: 2510.19710
sensorium arc ai agent system for oceanic data exploration and interactive eco-a | arXiv: 2511.15997
sequential attention-based sampling for histopathological analysis | arXiv: 2507.05077
sequential monte carlo for policy optimization in continuous pomdps | arXiv: 2505.16732
sequential multi-agent dynamic algorithm configuration | arXiv: 2510.23535
sequentially auditing differential privacy | arXiv: 2509.07055
set smoothness unlocks clarke hyper-stationarity in bilevel optimization | arXiv: 2506.04587
shallow diffuse robust and invisible watermarking through low-dimensional subspa | arXiv: 2410.21088
shallow flow matching for coarse-to-fine text-to-speech synthesis | arXiv: 2505.12226
shallow robustness deep vulnerabilities multi-turn evaluation of medical llms | arXiv: 2510.12255
shap meets tensor networks provably tractable explanations with parallelism | arXiv: 2510.21599
shap values via sparse fourier representation | arXiv: 2410.06300
shapecraft llm agents for structured textured and interactive 3d modeling | arXiv: 2510.17603
sharper convergence rates for nonconvex optimisation via reduction mappings | arXiv: 2506.08428
sharpness-aware minimization with z-score gradient filtering | arXiv: 2505.02369
sheaf cohomology of linear predictive coding networks | arXiv: 2511.11092
Sherlock: Self-Correcting Reasoning in Vision-Language Models | arXiv: 2505.22651
shift before you learn enabling low-rank representations in reinforcement learni | arXiv: 2509.05193
Short-length Adversarial Training Helps LLMs Defend Long-length Jailbreak Attacks | arXiv: 2502.04204
shortcutting pre-trained flow matching diffusion models is almost free lunch | arXiv: 2510.17858
show-o2 improved native unified multimodal models | arXiv: 2506.15564
sign-in to the lottery reparameterizing sparse training from scratch | arXiv: 2504.12801
silent tokens loud effects padding in llms | arXiv: 2510.01238
simple and efficient heterogeneous temporal graph neural network | arXiv: 2510.18467
Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning | arXiv: 2410.07163
simu selective influence machine unlearning | arXiv: 2510.07822
simulating society requires simulating thought | arXiv: 2506.06958
simulation-based inference for neutrino interaction model parameter tuning | arXiv: 2510.07454
simulmega moe routers are advanced policy makers for simultaneous speech transla | arXiv: 2509.01200
simultaneous swap regret minimization via kl-calibration | arXiv: 2502.16387
simworld-robotics synthesizing photorealistic and dynamic urban environments for | arXiv: 2512.10046
single-teacher view augmentation boosting knowledge distillation via angular div | arXiv: 2510.22480
singref6d monocular novel object pose estimation with a single rgb reference | arXiv: 2509.21927
sitcom scaling inference-time compute for vlas | arXiv: 2510.04041
situat3dchange situated 3d change understanding dataset for multimodal large lan | arXiv: 2510.11509
sketch-augmented features improve learning long-range dependencies in graph neur | arXiv: 2511.03824
skrull towards efficient long context fine-tuning through dynamic data schedulin | arXiv: 2505.19609
skyladder better and faster pretraining via context window scheduling | arXiv: 2503.15450
slaying towards queer language processing | arXiv: 2509.17449
slimmable nam neural amp models with adjustable runtime computational cost | arXiv: 2511.07470
Sloth: Scaling Laws for LLM Skills to Predict Multi-Benchmark Performance Across Families | arXiv: 2412.06540
small batch size training for language models when vanilla sgd works and why gra | arXiv: 2507.07101
small language models as compiler experts auto-parallelization for heterogeneous | arXiv: 2512.19250
smaller models smarter rewards a two-sided approach to process and outcome rewar | arXiv: 2510.23083
smartwilds multimodal wildlife monitoring dataset | arXiv: 2509.18894
smmile an expert-driven benchmark for multimodal medical in-context learning | arXiv: 2506.21355
smooth regularization for efficient video recognition | arXiv: 2511.20928
smore structural mixture of residual experts for parameter-efficient llm fine-tu | arXiv: 2504.06426
smrs advocating a unified reporting standard for surrogate models in the artific | arXiv: 2502.06753
sofar language-grounded orientation bridges spatial reasoning and object manipul | arXiv: 2502.13143
soft task-aware routing of experts for equivariant representation learning | arXiv: 2510.27222
solar-geco perovskite solar cell property prediction with geometric-aware co-att | arXiv: 2511.19263
solverllm leveraging test-time scaling for optimization problem via llm-guided s | arXiv: 2510.16916
solving continuous mean field games deep reinforcement learning for non-stationa | arXiv: 2510.22158
solving inequality proofs with large language models | arXiv: 2506.07927
solving neural min-max games the role of architecture initialization dynamics | arXiv: 2512.00389
some optimizers are more equal understanding the role of optimizers in group fai | arXiv: 2504.14882
sound logical explanations for mean aggregation graph neural networks | arXiv: 2511.11593
space noise contrastive estimation stabilizes self-play fine-tuning for large la | arXiv: 2512.07175
space spike-aware consistency enhancement for test-time adaptation in spiking ne | arXiv: 2504.02298
spark transformer reactivating sparsity in ffn and attention | arXiv: 2506.06644
Sparse Autoencoders Learn Monosemantic Features in Vision-Language Models | arXiv: 2504.02821
sparse mezo less parameters for better performance in zeroth-order llm fine-tuni | arXiv: 2402.15751
sparsedit token sparsification for efficient diffusion transformer | arXiv: 2412.06028
SPARTA Alignment: Collectively Aligning Multiple Language Models through Combat | arXiv: 2506.04721
spatial understanding from videos structured prompts meet simulation data | arXiv: 2506.03642
spatial-aware decision-making with ring attractors in reinforcement learning sys | arXiv: 2410.03119
spatialthinker reinforcing 3d reasoning in multimodal llms via spatial rewards | arXiv: 2511.07403
spatialtracegen high-fidelity traces for efficient vlm spatial reasoning distill | arXiv: 2511.00054
spatio-temporal directed graph learning for account takeover fraud detection | arXiv: 2509.20339
spatio-temporal graphs beyond grids benchmark for maritime anomaly detection | arXiv: 2512.20086
specattn speculating sparse attention | arXiv: 2510.27641
specialization after generalization towards understanding test-time training in | arXiv: 2509.24510
specmer fast protein generation with k-mer guided speculative decoding | arXiv: 2509.21689
spectral conditioning of attention improves transformer performance | arXiv: 2603.07162
spectral perturbation bounds for low-rank approximation with applications to pri | arXiv: 2510.25670
speculate deep and accurate lossless and training-free acceleration for offloade | arXiv: 2509.18344
spend wisely maximizing post-training gains in iterative synthetic data bootstra | arXiv: 2501.18962
spex a spectral approach to explainable clustering | arXiv: 2511.00885
spiking brain compression post-training second-order compression for spiking neu | arXiv: 2506.03996
spiking meets attention efficient remote sensing image super-resolution with att | arXiv: 2503.04223
spiral semantic-aware progressive lidar scene generation and understanding | arXiv: 2505.22643
split gibbs discrete diffusion posterior sampling | arXiv: 2503.01161
splitflow flow decomposition for inversion-free text-to-image editing | arXiv: 2510.25970
spot-trip dual-preference driven out-of-town trip recommendation | arXiv: 2506.01705
sprint enabling interleaved planning and parallelized execution in reasoning mod | arXiv: 2506.05745
spurious-aware prototype refinement for reliable out-of-distribution detection | arXiv: 2506.23881
sql-of-thought multi-agentic text-to-sql with guided error correction | arXiv: 2509.00581
sql-r1 training natural language to sql reasoning model by reinforcement learnin | arXiv: 2504.08600
sqs enhancing sparse perception models via query-based splatting in autonomous d | arXiv: 2509.16588
SRPO: Enhancing Multimodal LLM Reasoning via Reflection-Aware Reinforcement Learning | arXiv: 2506.01713
srsr enhancing semantic accuracy in real-world image super-resolution with spati | arXiv: 2510.22534
ssr enhancing depth perception in vision-language models via rationale-guided sp | arXiv: 2505.12448
sstag structure-aware self-supervised learning method for text-attributed graphs | arXiv: 2510.01248
stable cinemetrics structured taxonomy and evaluation for professional video gen | arXiv: 2509.26555
stable coresets via posterior sampling aligning induced and full loss landscapes | arXiv: 2511.17399
stable matching with ties approximation ratios and learning | arXiv: 2411.03270
stable minima of relu neural networks suffer from the curse of dimensionality th | arXiv: 2506.20779
stableguard towards unified copyright protection and tamper localization in late | arXiv: 2509.17993
stair addressing stage misalignment through temporal-aligned preference reinforc | arXiv: 2509.23802
stamp spatial-temporal adapter with multi-head pooling | arXiv: 2511.10848
starc-9 a large-scale dataset for multi-class tissue classification for crc hist | arXiv: 2511.00383
starformer semi-supervised task-informed representation learning via dynamic att | arXiv: 2504.10097
state-covering trajectory stitching for diffusion planners | arXiv: 2506.00895
statistical guarantees for high-dimensional stochastic gradient descent | arXiv: 2510.12013
statistical inference for gradient boosting regression | arXiv: 2509.23127
statistical inference under performativity | arXiv: 2505.18493
stead robust provably secure linguistic steganography with diffusion language mo | arXiv: 2601.14778
stealthy yet effective distribution-preserving backdoor attacks on graph classif | arXiv: 2509.26032
steering generative models with experimental data for protein fitness optimizati | arXiv: 2505.15093
steering information utility in key-value memory for language model post-trainin | arXiv: 2507.05158
steering when necessary flexible steering large language models with backtrackin | arXiv: 2508.17621
stella subspace learning in low-rank adaptation using stiefel manifold | arXiv: 2510.01938
step a unified spiking transformer evaluation platform for fair and reproducible | arXiv: 2505.11151
stochastic momentum methods for non-smooth non-convex finite-sum coupled composi | arXiv: 2506.02504
stochastic regret guarantees for online zeroth- and first-order bilevel optimiza | arXiv: 2511.01126
stop ddos attacking the research community with ai-generated survey papers | arXiv: 2510.09686
Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning | arXiv: 2504.15275
strap spatio-temporal pattern retrieval for out-of-distribution generalization | arXiv: 2505.19547
strassen attention split vc dimension and compositionality in transformers | arXiv: 2501.19215
strategic costs of perceived bias in fair selection | arXiv: 2510.20606
strategyproof reinforcement learning from human feedback | arXiv: 2503.09561
streambridge turning your offline video large language model into a proactive st | arXiv: 2505.05467
streamforest efficient online video understanding with persistent event memory | arXiv: 2509.24871
streaming federated learning with markovian data | arXiv: 2503.18807
struct2d a perception-guided framework for spatial reasoning in mllms | arXiv: 2506.04220
structural information-based hierarchical diffusion for offline reinforcement le | arXiv: 2509.21942
structure-aware fusion with progressive injection for multimodal molecular repre | arXiv: 2510.23640
structure-aware spectral sparsification via uniform edge sampling | arXiv: 2510.12669
Structured Reinforcement Learning for Combinatorial Decision-Making | arXiv: 2505.19053
structured sparse transition matrices to enable state tracking in state-space mo | arXiv: 2509.22284
structured temporal causality for interpretable multivariate time series anomaly | arXiv: 2510.16511
styl3r instant 3d stylized reconstruction for arbitrary scenes and styles | arXiv: 2505.21060
succeed or learn slowly sample efficient off-policy reinforcement learning for m | arXiv: 2509.01720
SuffixDecoding: Extreme Speculative Decoding for Emerging AI Applications | arXiv: 2411.04975
superclip clip with simple classification supervision | arXiv: 2512.14480
superposition yields robust neural scaling | arXiv: 2505.10465
surf2ct cascaded 3d flow matching models for torso 3d ct synthesis from skin sur | arXiv: 2505.22511
suturebot a precision framework benchmark for autonomous end-to-end suturing | arXiv: 2510.20965
SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents | arXiv: 2505.20411
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution | arXiv: 2502.18449
swe-sql illuminating llm pathways to solve user sql issues in real-world applica | arXiv: 2506.18951
switchable token-specific codebook quantization for face image compression | arXiv: 2510.22943
symbolic regression is all you need from simulations to scaling laws in binary n | arXiv: 2511.08784
symphony synergistic multi-agent planning with heterogeneous language model asse | arXiv: 2601.22623
symrtlo enhancing rtl code optimization with llms and neuron-inspired symbolic r | arXiv: 2504.10369
synbrain enhancing visual-to-fmri synthesis via probabilistic representation lea | arXiv: 2508.10298
synchuman synchronizing 2d and 3d generative models for single-view human recons | arXiv: 2510.07723
synergy between the strong and the weak spiking neural networks are inherently s | arXiv: 2510.07924
synergy over discrepancy a partition-based approach to multi-domain llm fine-tun | arXiv: 2511.07198
Synthesizing Performance Constraints for Evaluating and Improving Code Efficiency | arXiv: 2505.23471
synthetic series-symbol data generation for time series foundation models | arXiv: 2510.08445
syntsbench rethinking temporal pattern learning in deep learning models for time | arXiv: 2510.20273
system prompt optimization with meta-learning | arXiv: 2505.09666
system-embedded diffusion bridge models | arXiv: 2506.23726
systematic reward gap optimization for mitigating vlm hallucinations | arXiv: 2411.17265
systematizing llm persona design a four-quadrant technical taxonomy for ai compa | arXiv: 2511.02979
t-regs minimum spanning tree regularization for self-supervised learning | arXiv: 2510.23484
t-rex task-adaptive spatial representation extraction for robotic manipulation w | arXiv: 2506.19498
t-shirt token-selective hierarchical data selection for instruction tuning | arXiv: 2506.01317
t1 a tool-oriented conversational dataset for multi-turn agentic planning | arXiv: 2505.16986
t2smark balancing robustness and diversity in noise-as-watermark for diffusion m | arXiv: 2510.22366
tabarena a living benchmark for machine learning on tabular data | arXiv: 2506.16791
table as a modality for large language models | arXiv: 2512.00947
table2latex-rl high-fidelity latex code generation from table images via reinfor | arXiv: 2509.17589
tabrag improving tabular document question answering for retrieval augmented gen | arXiv: 2511.06582
tabstar a tabular foundation model for tabular data with text fields | arXiv: 2505.18125
tai3 testing agent integrity in interpreting user intent | arXiv: 2506.07524
talk2event grounded understanding of dynamic scenes from event cameras | arXiv: 2507.17664
tami taming heterogeneity in temporal interactions for temporal graph link predi | arXiv: 2510.23577
tangledfeatures robust feature selection in highly correlated spaces | arXiv: 2510.15005
tapip3d tracking any point in persistent 3d geometry | arXiv: 2504.14717
tapvid-360 tracking any point in 360 from narrow field of view video | arXiv: 2511.21946
target speaker extraction through comparing noisy positive and negative audio en | arXiv: 2502.16611
task-optimized convolutional recurrent networks align with tactile processing in | arXiv: 2505.18361
taught well learned ill towards distillation-conditional backdoor attack | arXiv: 2509.23871
teaching language models to evolve with users dynamic profile modeling for perso | arXiv: 2505.15456
teaming llms to detect and mitigate hallucinations | arXiv: 2510.19507
temporal smoothness-aware rate-distortion optimized 4d gaussian splatting | arXiv: 2507.17336
temporal-difference variational continual learning | arXiv: 2410.07812
TempSamp-R1: Effective Temporal Sampling with Reinforcement Fine-Tuning for Video LLMs
Tensor Product Attention Is All You Need | arXiv: 2501.06425
tensorrl-qas reinforcement learning with tensor networks for improved quantum ar | arXiv: 2505.09371
test-time adaptation by causal trimming | arXiv: 2510.11133
test-time adaptive object detection with foundation model | arXiv: 2510.25175
test-time spectrum-aware latent steering for zero-shot generalization in vision- | arXiv: 2511.09809
text to robotic assembly of multi component objects using 3d generative ai and v | arXiv: 2511.02162
text to sketch generation with multi-styles | arXiv: 2511.04123
text-to-code generation for modular building layouts in building information mod | arXiv: 2509.23713
text-to-image models leave identifiable signatures implications for leaderboard | arXiv: 2510.06525
avrobustbench benchmarking the robustness of audio-visual recognition mode | arXiv: 2506.00358
The Atlas of In-Context Learning: How Attention Heads Shape In-Context Retrieval Augmentation | arXiv: 2505.15807
the biased oracle assessing llms understandability and empathy in medical diagno | arXiv: 2511.00924
the boundaries of fair ai in medical image prognosis a causal perspective | arXiv: 2510.08840
the burden of interactive alignment with inconsistent preferences | arXiv: 2510.16368
the coming crisis of multi-agent misalignment ai alignment must be a dynamic and | arXiv: 2506.01080
the complexity of finding local optima in contrastive learning | arXiv: 2509.16898
the computational complexity of counting linear regions in relu neural networks | arXiv: 2505.16716
the cost of robustness tighter bounds on parameter complexity for robust memoriz | arXiv: 2510.24643
the curse of depth in large language models | arXiv: 2502.05795
the effect of optimal self-distillation in noisy gaussian mixture model | arXiv: 2501.16226
the emergence of sparse attention impact of data distribution and benefits of re | arXiv: 2505.17863
the geometry of cortical computation manifold disentanglement and predictive dyn | arXiv: 2508.02995
the graphon limit hypothesis understanding neural network pruning via infinite w | arXiv: 2510.17515
the hawthorne effect in reasoning models evaluating and steering test awareness | arXiv: 2505.14617
the human brain as a combinatorial complex | arXiv: 2511.20692
The Illusion of Progress? A Critical Look at Test-Time Adaptation for Vision-Language Models | arXiv: 2506.24000
the illusion of thinking understanding the strengths and limitations of reasonin | arXiv: 2506.06941
the impact of quantization on large reasoning model reinforcement learning | arXiv: 2511.15694
the impact of scaling training data on adversarial robustness | arXiv: 2509.25927
the implicit bias of structured state space models can be poisoned with clean la | arXiv: 2410.10473
the last vote a multi-stakeholder framework for language model governance | arXiv: 2511.13432
the lighthouse of language enhancing llm agents via critique-guided improvement | arXiv: 2503.16024
the more you automate the less you see hidden pitfalls of ai scientist systems | arXiv: 2509.08713
The Narrow Gate: Localized Image-Text Communication in Native Multimodal Models | arXiv: 2412.06646
the non-linear representation dilemma is causal abstraction enough for mechanist | arXiv: 2507.08802
the ouroboros of benchmarking reasoning evaluation in an era of saturation | arXiv: 2511.01365
the parameterized complexity of computing the vc-dimension | arXiv: 2510.17451
the pareto frontier of resilient jet tagging | arXiv: 2509.19431
the path not taken rlvr provably learns off the principals | arXiv: 2511.08567
the persistence of neural collapse despite low-rank bias | arXiv: 2410.23169
the physical basis of prediction world model formation in neural organoids via a | arXiv: 2509.04633
the platonic universe do foundation models see the same sky | arXiv: 2509.19453
the pokeagent challenge competitive and long-context learning at scale | arXiv: 2603.15563
the primacy of magnitude in low-rank adaptation | arXiv: 2507.06558
the rich and the simple on the implicit bias of adam and sgd | arXiv: 2505.24022
the rise of parameter specialization for knowledge storage in large language mod | arXiv: 2505.17260
the structural complexity of matrix-vector multiplication | arXiv: 2502.21240
the structure of relation decoding linear operators in large language models | arXiv: 2510.26543
The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning | arXiv: 2506.01347
the transparent earth a multimodal foundation model for the earths subsurface | arXiv: 2509.02783
the trilemma of truth in large language models | arXiv: 2506.23921
the underappreciated power of vision models for graph structural understanding | arXiv: 2510.24788
the unseen threat residual knowledge in machine unlearning under perturbed sampl | arXiv: 2601.22359
The Virtues of Brevity: Avoid Overthinking in Parallel Test-Time Reasoning | arXiv: 2510.21067
the world is bigger a computationally-embedded perspective on the big world hypo | arXiv: 2512.23419
thermalgen style-disentangled flow-based generative models for rgb-to-thermal im | arXiv: 2509.24878
think before recommendation autonomous reasoning-enhanced recommender | arXiv: 2510.23077
think or not think a study of explicit thinking in rule-based visual reinforceme | arXiv: 2503.16188
think straight stop smart structured reasoning for efficient multi-hop rag | arXiv: 2510.19171
thinkact vision-language-action reasoning via reinforced visual latent planning | arXiv: 2507.16815
thinksound chain-of-thought reasoning in multimodal large language models for au | arXiv: 2506.21448
thompson sampling for multi-objective linear contextual bandit | arXiv: 2512.00930
thompson sampling in function spaces via neural operators | arXiv: 2506.21894
thought communication in multiagent collaboration | arXiv: 2510.20733
through the river understanding the benefit of schedule-free methods for languag | arXiv: 2507.09846
thunder tile-level histopathology image understanding benchmark | arXiv: 2507.07860
tidmad time series dataset for discovering dark matter with ai denoising | arXiv: 2406.04378
tight bounds on the distortion of randomized and deterministic distributed votin | arXiv: 2509.17134
tight lower bounds and improved convergence in performative prediction | arXiv: 2412.03671
tighter cmi-based generalization bounds via stochastic projection and quantizati | arXiv: 2510.23485
tiled flash linear attention more efficient linear rnn and xlstm kernels | arXiv: 2503.14376
time reversal symmetry for efficient robotic manipulations in deep reinforcement | arXiv: 2505.13925
time travel is cheating going live with deepfund for real-time fund investment b | arXiv: 2505.11065
time-evolving dynamical system for learning latent representations of mouse visu | arXiv: 2408.07908
time-imm a dataset and benchmark for irregular multimodal multivariate time seri | arXiv: 2506.10412
time-o1 time-series forecasting needs transformed label alignment | arXiv: 2505.17847
TIME: A Multi-level Benchmark for Temporal Reasoning of LLMs in Real-World Scenarios
timeperceiver an encoder-decoder framework for generalized time-series forecasti | arXiv: 2512.22550
tirex zero-shot forecasting across long and short horizons with enhanced in-cont | arXiv: 2505.23719
titan a trajectory-informed technique for adaptive parameter freezing in large-s | arXiv: 2509.15193
to distill or decide understanding the algorithmic trade-off in partially observ | arXiv: 2510.03207
to see or to read user behavior reasoning in multimodal llms | arXiv: 2511.03845
token bottleneck one token to remember dynamics | arXiv: 2507.06543
token perturbation guidance for diffusion models | arXiv: 2506.10036
tokensqueeze performance-preserving compression for reasoning llms | arXiv: 2511.13223
tomcat test-time comprehensive knowledge accumulation for compositional zero-sho | arXiv: 2510.20162
Tool-Augmented Spatiotemporal Reasoning for Streamlining Video Question Answering Task
topology of reasoning understanding large reasoning models through reasoning gra | arXiv: 2506.05744
torch-uncertainty a deep learning framework for uncertainty quantification | arXiv: 2511.10282
tortoise and hare guidance accelerating diffusion model inference with multirate | arXiv: 2511.04117
toward a unified geometry understanding riemannian diffusion framework for graph | arXiv: 2510.04522
toward a vision-language foundation model for medical data multimodal dataset an | arXiv: 2509.24739
toward complete merger identification at cosmic noon with deep learning | arXiv: 2511.15006
toward efficient inference attacks shadow model sharing via mixture-of-experts | arXiv: 2510.13451
toward engineering agi benchmarking the engineering design capabilities of llms | arXiv: 2509.16204
toward explainable offline rl analyzing representations in intrinsically motivat | arXiv: 2506.13958
toward real-world text image forgery localization structured and interpretable d | arXiv: 2511.12658
towards 3d objectness learning in an open world | arXiv: 2510.17686
towards a golden classifier-free guidance path via foresight fixed point iterati | arXiv: 2510.21512
towards comprehensive scene understanding integrating first and third-person vie | arXiv: 2505.21955
towards effective federated graph foundation model via mitigating knowledge enta | arXiv: 2505.12684
towards evaluating proactive risk awareness of multimodal language models | arXiv: 2505.17455
towards foundational lidar world models with efficient latent flow matching | arXiv: 2506.23434
towards general modality translation with contrastive and predictive latent diff | arXiv: 2510.20819
towards implicit aggregation robust image representation for place recognition i | arXiv: 2511.06024
towards interpretability without sacrifice faithful dense layer decomposition wi | arXiv: 2505.21364
towards multiscale graph-based protein learning with geometric secondary structu | arXiv: 2602.00862
towards physics-informed spatial intelligence with human priors an autonomous dr | arXiv: 2510.21160
towards predicting any human trajectory in context | arXiv: 2506.00871
towards provable emergence of in-context reinforcement learning | arXiv: 2509.18389
towards reliable and holistic visual in-context learning prompt selection | arXiv: 2509.25989
towards reliable code-as-policies a neuro-symbolic framework for embodied task p | arXiv: 2510.21302
towards resilient safety-driven unlearning for diffusion models against downstre | arXiv: 2507.16302
towards robust pseudo-label learning in semantic segmentation an encoding perspe | arXiv: 2512.06870
towards robust zero-shot reinforcement learning | arXiv: 2510.15382
towards scaling laws for symbolic regression | arXiv: 2510.26064
towards self-supervised foundation models for critical care time series | arXiv: 2509.19885
Towards Thinking-Optimal Scaling of Test-Time Compute for LLM Reasoning | arXiv: 2502.18080
towards understanding safety alignment a mechanistic perspective from safety neu | arXiv: 2406.14144
towards unified and lossless latent space for 3d molecular latent diffusion mode | arXiv: 2503.15567
towards universal neural operators through multiphysics pretraining | arXiv: 2511.10829
towards unsupervised domain bridging via image degradation in semantic segmentat | arXiv: 2412.10339
towards unsupervised open-set graph domain adaptation via dual reprogramming | arXiv: 2510.18363
toxictextclip text-based poisoning and backdoor attacks on clip pre-training | arXiv: 2511.00446
tp-mddn task-preferenced multi-demand-driven navigation with autonomous decision | arXiv: 2511.17225
track inpaint resplat subject-driven 3d and 4d generation with progressive textu | arXiv: 2510.23605
tracking and understanding object transformations | arXiv: 2511.04678
trackingworld world-centric monocular 3d tracking of almost all pixels | arXiv: 2512.08358
tractable multinomial logit contextual bandits with non-linear utilities | arXiv: 2601.06913
train with perturbation infer after merging a two-stage framework for continual | arXiv: 2505.22389
Training Language Models to Reason Efficiently | arXiv: 2502.04463
training robust graph neural networks by modeling noise dependencies | arXiv: 2502.19670
training the untrainable introducing inductive bias via representational alignme | arXiv: 2410.20035
training-free bayesianization for low-rank adapters of large language models | arXiv: 2412.05723
training-free constrained generation with stable diffusion models | arXiv: 2502.05625
training-free efficient video generation via dynamic token carving | arXiv: 2505.16864
training-free online video step grounding | arXiv: 2510.16989
training-free safe text embedding guidance for text-to-image diffusion models | arXiv: 2510.24012
traj-coa patient trajectory modeling via chain-of-agents for lung cancer risk pr | arXiv: 2510.10454
TrajAgent: An LLM-Agent Framework for Trajectory Modeling via Large-and-Small Model Collaboration | arXiv: 2410.20445
trajectory balance with asynchrony decoupling exploration and learning for fast | arXiv: 2503.18929
Trajectory Bellman Residual Minimization: A Simple Value-Based Method for LLM Reasoning | arXiv: 2505.15311
trans-env a framework for evaluating the linguistic robustness of llms against e | arXiv: 2505.20875
transfer learning beyond the standard model | arXiv: 2510.19168
transfer learning for benign overfitting in high-dimensional linear regression | arXiv: 2510.15337
transferable black-box one-shot forging of watermarks via image preference model | arXiv: 2510.20468
transferring causal effects using proxies | arXiv: 2510.25924
transformer copilot learning from the mistake log in llm fine-tuning | arXiv: 2505.16270
transformer embeddings for fast microlensing inference | arXiv: 2512.11687
transformer key-value memories are nearly as interpretable as sparse autoencoder | arXiv: 2510.22332
transformers provably learn chain-of-thought reasoning with length generalizatio | arXiv: 2511.07378
transun a preemptive paradigm to eradicate retransformation bias intrinsically f | arXiv: 2505.13881
trap targeted redirecting of agentic preferences | arXiv: 2505.23518
traversal verification for speculative tree decoding | arXiv: 2505.12398
tree-guided diffusion planner | arXiv: 2508.21800
trico triadic game-theoretic co-training for robust semi-supervised learning | arXiv: 2509.21526
trident tri-modal molecular representation learning with taxonomic annotations a | arXiv: 2506.21028
trim scalable 3d gaussian diffusion inference with temporal and spatial trimming | arXiv: 2511.16642
triplets better than pairs towards stable and effective self-play fine-tuning fo | arXiv: 2601.08198
tropical attention neural algorithmic reasoning for combinatorial algorithms | arXiv: 2505.17190
TRoVe: Discovering Error-Inducing Static Feature Biases in Temporal Vision-Language Models
trust -- transformer-driven u-net for sparse target recovery | arXiv: 2506.01112
trust region reward optimization and proximal inverse reward optimization algori | arXiv: 2509.23135
tts-var a test-time scaling framework for visual auto-regressive generation | arXiv: 2507.18537
turbocharging gaussian process inference with approximate sketch-and-project | arXiv: 2505.13723
tv-rec time-variant convolutional filter for sequential recommendation | arXiv: 2510.25259
twilight adaptive attention sparsity with hierarchical top-p pruning | arXiv: 2502.02770
Two Causally Related Needles in a Video Haystack | arXiv: 2505.19853
two-stage learning of stabilizing neural controllers via zubov sampling and iter | arXiv: 2506.01356
two-steps diffusion policy for robotic manipulation via genetic denoising | arXiv: 2510.21991
u-can unsupervised point cloud denoising with consistency-aware noise2noise matc | arXiv: 2510.25210
ugm2n an unsupervised and generalizable mesh movement network via m-uniform loss | arXiv: 2508.08615
ultrahr-100k enhancing uhr image synthesis with a large-scale high-quality datas | arXiv: 2510.20661
ultrametric cluster hierarchies i want em all | arXiv: 2502.14018
umami unifying masked autoregressive models and deterministic rendering for view | arXiv: 2512.20107
UMoE: Unifying Attention and FFN with Shared Experts | arXiv: 2505.07260
uncertain knowledge graph completion via semi-supervised confidence distribution | arXiv: 2510.16601
uncertainty estimation by flexible evidential deep learning | arXiv: 2510.18322
uncertainty quantification for reduced-order surrogate models applied to cloud m | arXiv: 2511.04534
uncertainty-aware multi-objective reinforcement learning-guided diffusion models | arXiv: 2510.21153
uncertainty-guided model selection for tabular foundation models in biomolecule | arXiv: 2510.02476
uncle towards scalable dynamic causal discovery in non-linear temporal systems | arXiv: 2511.03168
uncovering graph reasoning in decoder-only transformers with circuit tracing | arXiv: 2509.20336
uncovering strategic egoism behaviors in large language models | arXiv: 2511.09920
understand before you generate self-guided training for autoregressive image gen | arXiv: 2509.15185
understanding adam requires better rotation dependent assumptions | arXiv: 2410.19964
understanding and enhancing mask-based pretraining towards universal representat | arXiv: 2509.21650
understanding and improving adversarial robustness of neural probabilistic circu | arXiv: 2509.20549
understanding challenges to the interpretation of disaggregated evaluations of a | arXiv: 2506.04193
understanding differential transformer unchains pretrained self-attentions | arXiv: 2505.16333
understanding ice crystal habit diversity with self-supervised learning | arXiv: 2509.07688
understanding prompt tuning and in-context learning via meta-learning | arXiv: 2505.17010
understanding representation dynamics of diffusion models via low-dimensional mo | arXiv: 2502.05743
understanding the generalization of stochastic gradient adam in learning neural | arXiv: 2510.11354
uni-lora one vector is all you need | arXiv: 2506.00799
uni-mumer unified multi-task fine-tuning of vision-language model for handwritte | arXiv: 2505.23566
uniedit a unified knowledge editing benchmark for large language models | arXiv: 2505.12345
unified all-atom molecule generation with neural fields | arXiv: 2511.15906
unified reinforcement and imitation learning for vision-language models | arXiv: 2510.19307
uniformer unified and efficient transformer for reasoning across general and cus | arXiv: 2511.08135
unifying and enhancing graph transformers via a hierarchical mask framework | arXiv: 2510.18825
unifying appearance codes and bilateral grids for driving scene gaussian splatti | arXiv: 2506.05280
Unifying Attention Heads and Task Vectors via Hidden State Geometry in In-Context Learning | arXiv: 2505.18752
unifying proportional fairness in centroid and non-centroid clustering | arXiv: 2601.00447
unifying re-identification attribute inference and data reconstruction risks in | arXiv: 2507.06969
unifying symbolic music arrangement track-aware reconstruction and structured to | arXiv: 2408.15176
unifying text semantics and graph structures for temporal text-attributed graphs | arXiv: 2503.14411
unifying vision-language latents for zero-label image caption enhancement | arXiv: 2510.12931
unilumos fast and unified image and video relighting with physics-plausible feed | arXiv: 2511.01678
unimotion a unified motion framework for simulation prediction and planning | arXiv: 2602.00566
unimrseg unified modality-relax segmentation via hierarchical self-supervised co | arXiv: 2509.16170
unipixel unified object referring and segmentation for pixel-level visual reason | arXiv: 2509.18094
unisite the first cross-structure dataset and learning framework for end-to-end | arXiv: 2506.03237
unitok a unified tokenizer for visual generation and understanding | arXiv: 2502.20321
universal cross-tokenizer distillation via approximate likelihood matching | arXiv: 2503.20083
universal spectral tokenization via self-supervised panchromatic representation | arXiv: 2510.17959
Unlabeled Data Can Provably Enhance In-Context Learning of Transformers | arXiv: 2601.10058
unlearned but not forgotten data extraction after exact unlearning in llm | arXiv: 2505.24379
unlearning as ablation toward a falsifiable benchmark for generative scientific | arXiv: 2508.17681
unleashing diffusion transformers for visual correspondence by modulating massiv | arXiv: 2505.18584
unleashing hour-scale video training for long video-language understanding | arXiv: 2506.05332
unlocking multimodal mathematical reasoning via process reward model | arXiv: 2501.04686
unlocking transfer learning for open-world few-shot recognition | arXiv: 2411.09986
unmasking covid-19 vulnerability in nigeria mapping risks beyond urban hotspots | arXiv: 2509.05398
unpaired image-to-image translation for segmentation and signal unmixing | arXiv: 2505.20746
unsupervised discovery of high-redshift galaxy populations with variational auto | arXiv: 2511.05439
Unveiling Chain of Step Reasoning for Vision-Language Models with Fine-grained Rewards | arXiv: 2509.19003
unveiling m-sharpness through the structure of stochastic gradient noise | arXiv: 2509.18001
unveiling the power of multiple gossip steps a stability-based generalization an | arXiv: 2510.07980
unveiling the spatial-temporal effective receptive fields of spiking neural netw | arXiv: 2510.21403
urb -- urban routing benchmark for rl-equipped connected autonomous vehicles | arXiv: 2505.17734
urbaning-v2x a large-scale multi-vehicle multi-infrastructure dataset across mul | arXiv: 2510.23478
urdf-anything constructing articulated objects with 3d multimodal language model | arXiv: 2511.00940
urls help topics guide understanding metadata utility in llm training | arXiv: 2505.16570
utilgen utility-centric generative data augmentation with dual-level task adapta | arXiv: 2510.24262
v-cece visual counterfactual explanations via conceptual edits | arXiv: 2509.16567
v2x-radar a multi-modal dataset with 4d radar for cooperative perception | arXiv: 2411.10962
va-gs enhancing the geometric representation of gaussian splatting via view alig | arXiv: 2510.11473
vadtree explainable training-free video anomaly detection via hierarchical granu | arXiv: 2510.22693
vagen reinforcing world model reasoning for multi-turn vlm agents | arXiv: 2510.16907
valid inference with imperfect synthetic data | arXiv: 2508.06635
validating llm-as-a-judge systems under rating indeterminacy | arXiv: 2503.05965
value gradient guidance for flow matching alignment | arXiv: 2512.05116
valuepilot a two-phase framework for value-driven decision-making | arXiv: 2512.13716
vamp variational multi-modal prompt learning for vision-language models | arXiv: 2511.22664
vanish into thin air cross-prompt universal adversarial attacks for sam2 | arXiv: 2510.24195
variance-aware feel-good thompson sampling for contextual bandits | arXiv: 2511.02123
variational autoencoder with normalizing flow for x-ray spectral fitting | arXiv: 2601.07440
variational regularized unbalanced optimal transport single network least action | arXiv: 2505.11823
vasa-3d lifelike audio-driven gaussian head avatars from a single image | arXiv: 2512.14677
VERA: Variational Inference Framework for Jailbreaking Large Language Models | arXiv: 2506.22666
verbalized algorithms | arXiv: 2509.08150
vessa video-based object-centric self-supervised adaptation for visual foundatio | arXiv: 2510.20994
vgent graph-based retrieval-reasoning-augmented generation for long video unders | arXiv: 2510.14032
vicinity-guided discriminative latent diffusion for privacy-preserving domain ad | arXiv: 2510.00478
video diffusion models excel at tracking similar-looking objects without supervi | arXiv: 2512.02339
video finetuning improves reasoning between frames | arXiv: 2511.12868
video killed the energy budget characterizing the latency and power regimes of o | arXiv: 2509.19222
video-r1 reinforcing video reasoning in mllms | arXiv: 2503.21776
video-rag visually-aligned retrieval-augmented long video comprehension | arXiv: 2411.13093
video-safetybench a benchmark for safety evaluation of video lvlms | arXiv: 2505.11842
videolucy deep memory backtracking for long video understanding | arXiv: 2510.12422
videorft incentivizing video reasoning capability in mllms via reinforced fine-t | arXiv: 2505.12434
viki-r coordinating embodied multi-agent cooperation via reinforcement learning | arXiv: 2506.09049
viking deep variational inference with stochastic projections | arXiv: 2510.23684
vimorag video-based retrieval-augmented 3d motion generation for motion language | arXiv: 2508.12081
vipamin visual prompt initialization via embedding selection and subspace expans | arXiv: 2510.16446
virus infection attack on llms your poisoning can spread via synthetic data | arXiv: 2509.23041
vision function layer in multimodal llms | arXiv: 2509.24791
vision transformers for cosmological fields application to weak lensing mass map | arXiv: 2512.07125
vision transformers with self-distilled registers | arXiv: 2505.21501
vision-centric token compression in large language model | arXiv: 2502.00791
vispec accelerating vision-language models with vision-aware speculative decodin | arXiv: 2509.15235
visual diversity and region-aware prompt learning for zero-shot hoi detection | arXiv: 2510.25094
visual instruction bottleneck tuning | arXiv: 2505.13946
visual structures helps visual reasoning addressing the binding problem in vlms | arXiv: 2506.22146
visual sync multi-camera synchronization via cross-view object motion | arXiv: 2512.02017
Visual Thoughts: A Unified Perspective of Understanding Multimodal Chain-of-Thought | arXiv: 2505.15510
visuallens personalization through task-agnostic visual history | arXiv: 2411.16034
vita-1.5 towards gpt-4o level real-time vision and speech interaction | arXiv: 2501.01957
vitrix-clipin enhancing fine-grained visual understanding in clip via instructio | arXiv: 2508.02329
VL-SAE: Interpreting and Enhancing Vision-Language Alignment with a Unified Concept Set | arXiv: 2510.21323
vla-cache efficient vision-language-action manipulation via adaptive token cachi | arXiv: 2502.02175
vmdt decoding the trustworthiness of video foundation models | arXiv: 2511.05682
vocabulary customization for efficient domain-specific llm deployment | arXiv: 2509.26124
VolleyBots: A Testbed for Multi-Drone Volleyball Game Combining Motion Control and Strategic Play | arXiv: 2502.01932
vorta efficient video diffusion via routing sparse attention | arXiv: 2505.18809
vq-seg vector-quantized token perturbation for semi-supervised medical image seg | arXiv: 2601.10124
vqtoken neural discrete token representation learning for extreme token reductio | arXiv: 2503.16980
vsa faster video diffusion with trainable sparse attention | arXiv: 2505.13389
vt-fsl bridging vision and text with llms for few-shot learning | arXiv: 2509.25033
walking the schrödinger bridge a direct trajectory for text-to-3d generation | arXiv: 2511.05609
walrus wavelets for long-range representation using ssms | arXiv: 2505.12161
wasserstein transfer learning | arXiv: 2505.17404
watch and listen understanding audio-visual-speech moments with multimodal llm | arXiv: 2505.18110
watermarking autoregressive image generation | arXiv: 2506.16349
wavelet canonical coherence for nonstationary signals | arXiv: 2505.14253
wavy transformer | arXiv: 2508.12787
weak-to-strong generalization under distribution shifts | arXiv: 2510.21332
wearvqa a visual question answering benchmark for wearables in egocentric authen | arXiv: 2511.22154
web-scale collection of video data for 4d animal reconstruction | arXiv: 2511.01169
web-shepherd advancing prms for reinforcing web agents | arXiv: 2505.15277
weight weaving parameter pooling for data-free model merging | arXiv: 2510.13921
wham towards a translative model of sperm whale vocalization | arXiv: 2512.02206
what ai speaks for your community polling ai agents for public opinion on data c | arXiv: 2511.22037
what can rl bring to vla generalization an empirical study | arXiv: 2505.19789
what does it take to build a performant selective classifier | arXiv: 2510.20242
what expressivity theory misses message passing complexity for gnns | arXiv: 2509.01254
what happens during the loss plateau understanding abrupt learning in transforme | arXiv: 2506.13688
what makes a reward model a good teacher an optimization perspective | arXiv: 2503.15477
what one cannot two can two-layer transformers provably represent induction head | arXiv: 2508.07208
what we don't c manifold disentanglement for structured discovery | arXiv: 2511.09433
when ai democratizes exploitation llm-assisted strategic manipulation of fair di | arXiv: 2511.14722
when are concepts erased from diffusion models | arXiv: 2505.17013
when can model-free reinforcement learning be enough for thinking | arXiv: 2506.17124
when less language is more language-reasoning disentanglement makes llms better | arXiv: 2505.15257
when no paths lead to rome benchmarking systematic neural relational reasoning | arXiv: 2510.23532
when one modality sabotages the others a diagnostic lens on multimodal reasoning | arXiv: 2511.02794
when one moment isn't enough multi-moment retrieval with cross-moment interaction | arXiv: 2510.17218
when semantics mislead vision mitigating large multimodal models hallucinations | arXiv: 2506.05551
when thinking drifts evidential grounding for robust video reasoning | arXiv: 2510.06077
when worse is better navigating the compression-generation tradeoff in visual to | arXiv: 2412.16326
where and how to perturb on the design of perturbation guidance in diffusion and | arXiv: 2506.10978
who you are matters bridging topics and social roles via llm-enhanced logical re | arXiv: 2505.10940
why diffusion models don't memorize the role of implicit dynamical regularization | arXiv: 2505.17638
why is attention sparse in particle transformer | arXiv: 2512.00210
why knowledge distillation works in generative models a minimal working explanat | arXiv: 2505.13111
why masking diffusion works condition on the jump schedule for improved discrete | arXiv: 2506.08316
Wide-Horizon Thinking and Simulation-Based Evaluation for Real-World LLM Planning with Multifaceted Constraints | arXiv: 2506.12421
wider or deeper scaling llm inference-time compute with adaptive branching tree | arXiv: 2503.04412
wildcat3d appearance-aware multi-view diffusion in the wild | arXiv: 2506.13030
windsock is dancing adaptive multimodal retrieval-augmented generation | arXiv: 2510.22694
with limited data for multimodal alignment let the structure guide you | arXiv: 2506.16895
wmcopier forging invisible image watermarks on arbitrary images | arXiv: 2503.22330
words that unite the world a unified framework for deciphering central bank comm | arXiv: 2505.17048
worse than zero-shot a fact-checking dataset for evaluating the robustness of ra | arXiv: 2502.16101
writing in symbiosis mapping human creative agency in the ai era | arXiv: 2512.13697
x-scene large-scale driving scene generation with high fidelity and flexible con | arXiv: 2506.13558
xifbench evaluating large language models on multilingual instruction following | arXiv: 2503.07539
xlstm-mixer multivariate time series forecasting by mixing via scalar memories | arXiv: 2410.16928
Yggdrasil: Bridging Dynamic Speculation and Static Runtime for Latency-Optimal Tree-Based LLM Decoding | arXiv: 2512.23858
you can trust your clustering model a parameter-free self-boosting plug-in for d | arXiv: 2511.21193
your pre-trained llm is secretly an unsupervised confidence calibrator | arXiv: 2505.16690
zebra towards zero-shot cross-subject generalization for universal brain visual | arXiv: 2510.27128
zero-shot context generalization in reinforcement learning from few training con | arXiv: 2507.07348
zero-shot embedding drift detection a lightweight defense against prompt injecti | arXiv: 2601.12359
zero-shot large language model agents for fully automated radiotherapy treatment | arXiv: 2510.11754
zero-shot performance prediction for probabilistic scaling laws | arXiv: 2510.16743
zero-shot robustness of vision language models via confidence-aware weighting | arXiv: 2510.02913
ZeroS: Zero-Sum Linear Attention for Efficient Transformers | arXiv: 2602.05230
zeroth-order optimization finds flat minima | arXiv: 2506.05454
zeus zero-shot embeddings for unsupervised separation of tabular data | arXiv: 2505.10704
zip2zip inference-time adaptive tokenization via online compression | arXiv: 2506.01084
zpressor bottleneck-aware compression for scalable feed-forward 3dgs | arXiv: 2505.23734
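Technical Debt in In-Context Learning: Diminishing Efficiency in Long Sequences | arXiv: 2502.04580
Note 1: Is CoT an Illusion? A Data Distribution Perspective | arXiv: 2508.01191
Note 2: Is PRM Necessary? RL Implicitly Induces PRM Capability | arXiv: 2505.11227
Note 4: WebThinker - Empowering Reasoning Models with Deep Research Capability | arXiv: 2504.21776
Note 5: ReSearch - Learning to Reason with Search | arXiv: 2503.19470
Note 6: Self-Evaluating LLMs - Step-Level Confidence Estimation for Multi-Step Tasks | arXiv: 2505.17373
Note 7: Value-Guided Search - Efficient Chain-of-Thought Reasoning | arXiv: 2504.18428
Note 8: PolyMath - Evaluating Mathematical Reasoning in Multilingual Contexts | arXiv: 2511.07364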
impact of dataset properties on membership inference | arXiv: 2402.06674
claws creativity detection for llm-generated solutions using attention window of | arXiv: 2510.17921
levo high-quality song generation with multi-processing refined supervision | arXiv: 2506.07520
a self-improving coding agent | arXiv: 2504.15228
a stochastic differential equation framework for multi-objective llm interaction | arXiv: 2510.10739
astrovisbench a code benchmark for scientific computing and visualization in ast | arXiv: 2505.20538
automated multi-agent workflows for rtl design | arXiv: 2509.20182
co-evolving llm coder and unit tester via reinforcement learning | arXiv: 2506.03136
core benchmarking llms code reasoning capabilities through static analysis tasks | arXiv: 2507.05269
embedding alignment in code generation for audio | arXiv: 2508.05473
flylora boosting task decoupling and parameter efficiency via implicit rank-wise | arXiv: 2510.08396
fractalbench diagnosing visual-mathematical reasoning through recursive program | arXiv: 2511.06522
learning to solve complex problems via dataset decomposition | arXiv: 2602.20296
maintaincoder maintainable code generation under dynamic requirements | arXiv: 2503.24260
mlr-bench evaluating ai agents on open-ended machine learning research | arXiv: 2505.19955
once upon an input reasoning via per-instance program synthesis | arXiv: 2510.22849
preserving llm capabilities through calibration data curation from analysis to o | arXiv: 2510.10618
principled fine-tuning of llms from user-edits a medley of preference supervisio | arXiv: 2601.19055
program synthesis via test-time transduction | arXiv: 2509.17393
qimeng-salv signal-aware learning for verilog code generation | arXiv: 2510.19296
swe-rebench an automated pipeline for task collection and decontaminated evaluat | arXiv: 2505.20411
table2latex-rl high-fidelity latex code generation from table images via reinfor | arXiv: 2509.17589
text-to-code generation for modular building layouts in building information mod | arXiv: 2509.23713
aclora almost training-free access control-aware multimodal ll | arXiv: 2505.11557
bridging human and llm judgments understanding and narrowing the gap | arXiv: 2508.12792
hygen efficient llm serving via elastic online-offline request co-location | arXiv: 2501.14808
metamind modeling human social thoughts with metacognitive multi-agent systems | arXiv: 2505.18943
sciarena an open evaluation platform for non-verifiable scientific literature-gr | arXiv: 2507.01001
coral long-tail diffusion | arXiv: 2506.15933
dd2 one-step ar distill | arXiv: 2510.21003
why diffusion models don't memorize the role of implicit regularization | arXiv: 2505.17638
latent harmony synergistic unified uhd image restoration with pre-trained diffus
benchmarking retrieval-augmented multimodal generation for do | arXiv: 2505.16470
chain-of-retrieval augmented generation | arXiv: 2501.14342
compress gather and recompute reforming long-context processing in transformers | arXiv: 2506.01215
cooperative retrieval-augmented generation for question answering mutual informa | arXiv: 2512.10422
deep research brings deeper harm | arXiv: 2510.11851
dice discrete interpretable comparative evaluation with probabilistic scoring fo | arXiv: 2512.22629
generalized contrastive learning for universal multimodal re | arXiv: 2509.25638
hierarchical retrieval the geometry and a pretrain-finetune recipe | arXiv: 2509.16411
hifi-rag hierarchical content filtering and two-pass generation for open-domain | arXiv: 2512.22442
how should we evaluate data deletion in graph-based ann indexes | arXiv: 2512.06200
hypergraphrag retrieval-augmented generation via hypergraph-structured knowledge | arXiv: 2503.21322
improving consistency in retrieval-augmented systems with group similarity rewar | arXiv: 2510.04392
is prm necessary problem-solving rl implicitly induces prm capability in llms | arXiv: 2505.11227
learning task-agnostic representations through multi-teacher distillation | arXiv: 2510.18680
mind the gap aligning knowledge bases with user needs to enhance mental health r | arXiv: 2509.13626
mir-bench can your llm recognize complicated patterns via many-shot in-context r | arXiv: 2502.09933
mitra an ai assistant for knowledge retrieval in physics collaborations | arXiv: 2603.09800
murating a high quality data selecting approach to multilingual large language m | arXiv: 2507.01785
rag-igbench innovative evaluation for rag-based interleaved generation in open-d | arXiv: 2512.05119
reliable decision making via calibration oriented retrieval augmented generation | arXiv: 2411.08891
retrieval-augmented generation for reliable interpretation of radio regulations | arXiv: 2509.09651
retrieval is not enough enhancing rag reasoning through test-time critique and o | arXiv: 2504.14858
rmit-adms at the mmu-rag neurips 2025 competition | arXiv: 2602.20735
scale-invariant attention | arXiv: 2505.17083
scaling language-centric omnimodal representation learning | arXiv: 2510.11693
secon-rag a two-stage semantic filtering and conflict-free framework for trustwo | arXiv: 2510.09710
superclip clip with simple classification supervision | arXiv: 2512.14480
the atlas of in-context learning how attention heads shape in-context retrieval | arXiv: 2505.15807
the narrow gate localized image-text communication in native | arXiv: 2412.06646
windsock is dancing adaptive multimodal retrieval-augmented generation | arXiv: 2510.22694
worse than zero-shot a fact-checking dataset for evaluating the robustness of ra | arXiv: 2502.16101
a is for absorption studying feature splitting and absorption in sparse autoenco | arXiv: 2409.14507
a unified reasoning framework for holistic zero-shot video an | arXiv: 2511.00962
adaptgrad adaptive sampling to reduce noise | arXiv: 2410.07711
additive models explained a computational complexity approach | arXiv: 2510.21292
agentiql an agent-inspired multi-expert framework for text-to-sql generation | arXiv: 2510.10661
an analysis of concept bottleneck models measuring understanding and mitigating | arXiv: 2505.16705
are greedy task orderings better than random in continual linear regression | arXiv: 2510.19941
arecho autoregressive evaluation via chain-based hypothesis optimization for spe | arXiv: 2505.24518
attributing response to context a jensen-shannon divergence driven mechanistic s | arXiv: 2505.16415
auditing meta-cognitive hallucinations in reasoning large language models | arXiv: 2505.13143
base models know how to reason thinking models learn when | arXiv: 2510.07364
better estimation of the kullback-leibler divergence between language models | arXiv: 2504.10637
beyond accuracy dissecting mathematical reasoning for llms u | arXiv: 2506.04723
beyond components singular vector-based interpretability of transformer circuits | arXiv: 2511.20273
beyond token probes hallucination detection via activation tensors with act-vit | arXiv: 2510.00296
bigram subnetworks mapping to next tokens in transformer language models | arXiv: 2504.15471
causal head gating a framework for interpreting roles of attention heads in tran | arXiv: 2505.13737
cbmas cognitive behavioral modeling via activation steering | arXiv: 2601.06109
chiqpm calibrated hierarchical interpretable image classification | arXiv: 2511.20779
cognitive mirrors exploring the diverse functional roles of attention heads in l | arXiv: 2512.10978
conceptscope characterizing dataset bias via disentangled visual concepts | arXiv: 2510.26186
conditional distribution compression via the kernel conditional mean embedding | arXiv: 2504.10139
curvature tuning provable training-free model steering from a single parameter | arXiv: 2502.07783
dataset distillation for pre-trained self-supervised vision models | arXiv: 2511.16674
deep modularity networks with diversity-preserving regularization | arXiv: 2501.13451
deep value benchmark measuring whether models generalize deep values or shallow | arXiv: 2511.02109
distributional autoencoders know the score | arXiv: 2502.11583
do different prompting methods yield a common task representation in language mo | arXiv: 2505.12075
dynamic algorithm for explainable k-medians clustering under lp norm | arXiv: 2512.01150
efficient vision-language reasoning via adaptive token pruning | arXiv: 2512.12701
emergence of linear truth encodings in language models | arXiv: 2510.15804
empowering decision trees via shape function branching | arXiv: 2510.19040
encoding and understanding astrophysical information in large language model-gen | arXiv: 2511.14685
evaluating llms in open-source games | arXiv: 2512.00371
explaining similarity in vision-language encoders with weighted banzhaf interact | arXiv: 2508.05430
fact faithful concept traces for explaining neural network decisions | arXiv: 2510.25512
fantastic features and where to find them a probing method to combine features f | arXiv: 2512.01405
fastdinov2 frequency based curriculum learning improves robustness and training | arXiv: 2507.03779
from flat to hierarchical extracting sparse representations with matching pursui | arXiv: 2506.03093
geometric priors for generalizable world models via vector symbolic architecture | arXiv: 2602.21467
h-splid hsic-based saliency preserving latent information decomposition | arXiv: 2510.20627
how do transformers learn implicit reasoning | arXiv: 2505.23653
improving perturbation-based explanations by understanding the role of uncertain | arXiv: 2511.10439
knowing when to stop efficient context processing via latent sufficiency signals | arXiv: 2502.01025
latent principle discovery for language model self-improvement | arXiv: 2505.16927
learning to focus causal attention distillation via gradient-guided token prunin | arXiv: 2506.07851
llm probing with contrastive eigenproblems improving understanding and applicabi | arXiv: 2511.02089
minimizing false-positive attributions in explanations of non-linear models | arXiv: 2505.11210
monte carlo expected threat mocet scoring | arXiv: 2511.16823
mopformer motion-primitive transformer for wearable-sensor activity recognition | arXiv: 2505.20744
ordshap feature position importance for sequential black-box models | arXiv: 2507.11855
out of control -- why alignment needs formal control theory and an alignment con | arXiv: 2506.17846
partial information decomposition via normalizing flows in latent gaussian distr | arXiv: 2510.04417
probabilistic token alignment for large language model fusion | arXiv: 2509.17276
rectifying shortcut behaviors in preference-based reward learning | arXiv: 2510.19050
saying the unsaid revealing the hidden language of multimodal systems through te | arXiv: 2511.10690
scpilot large language model reasoning toward automated single-cell analysis and | arXiv: 2602.11609
self-supervised contrastive learning is approximately supervised contrastive lea | arXiv: 2506.04411
shap values via sparse fourier representation | arXiv: 2410.06300
simulating society requires simulating thought | arXiv: 2506.06958
sloth scaling laws for llm skills to predict multi-benchmark performance across | arXiv: 2412.06540
spex a spectral approach to explainable clustering | arXiv: 2511.00885
steering information utility in key-value memory for language model post-trainin | arXiv: 2507.05158
tangledfeatures robust feature selection in highly correlated spaces | arXiv: 2510.15005
the non-linear representation dilemma is causal abstraction enough for mechanist | arXiv: 2507.08802
the trilemma of truth in large language models | arXiv: 2506.23921
time-evolving dynamical system for learning latent representations of mouse visu | arXiv: 2408.07908
toward explainable offline rl analyzing representations in intrinsically motivat | arXiv: 2506.13958
toward real-world text image forgery localization structured and interpretable d | arXiv: 2511.12658
towards interpretability without sacrifice faithful dense layer decomposition wi | arXiv: 2505.21364
towards scaling laws for symbolic regression | arXiv: 2510.26064
transformer key-value memories are nearly as interpretable as sparse autoencoder | arXiv: 2510.22332
tropical attention neural algorithmic reasoning for combinatorial algorithms | arXiv: 2505.17190
uncovering graph reasoning in decoder-only transformers with circuit tracing | arXiv: 2509.20336
urls help topics guide understanding metadata utility in llm training | arXiv: 2505.16570
vadtree explainable training-free video anomaly detection via hierarchical granu | arXiv: 2510.22693
valuepilot a two-phase framework for value-driven decision-making | arXiv: 2512.13716
vl-sae interpreting and enhancing vision-language alignment wi | arXiv: 2510.21323
what happens during the loss plateau understanding abrupt learning in transforme | arXiv: 2506.13688
why is attention sparse in particle transformer | arXiv: 2512.00210
edit less achieve more dynamic sparse neuron masking for lifelong knowledge edit | arXiv: 2510.22139
kscope a framework for characterizing the knowledge status of language models | arXiv: 2506.07458
memeic a step toward continual and compositional knowledge editing | arXiv: 2510.25798
memoir lifelong model editing with minimal overwrite and informed retention for | arXiv: 2506.07899
rethinking residual distribution in locate-then-edit model editing | arXiv: 2502.03748
uniedit a unified knowledge editing benchmark for large language models | arXiv: 2505.12345
l-mtp leap multi-token prediction beyond adjacent context for large language mod | arXiv: 2505.17505
loogle v2 are llms ready for real world long dependency challenges | arXiv: 2510.22548
omnidraft a cross-vocabulary online adaptive drafter for on-device speculative d | arXiv: 2507.02659
yggdrasil bridging dynamic speculation and static runtime for latency-optimal tr | arXiv: 2512.23858
a high-dimensional statistical method for optimizing transfer | arXiv: 2502.04242
a standardized benchmark for multilabel antimicrobial peptide classification | arXiv: 2511.04814
a unified framework for provably efficient algorithms to estimate shapley values | arXiv: 2506.05216
adastar adaptive data sampling for training self-taught reasoners | arXiv: 2505.16322
aggregation hides out-of-distribution generalization failures from spurious corr | arXiv: 2510.24884
asymmetric duos sidekicks improve uncertainty | arXiv: 2505.18636
bayesian evaluation of large language model behavior | arXiv: 2511.10661
belief-calibrated multi-agent consensus seeking for complex nlp tasks | arXiv: 2510.06307
benchmarking is broken -- don't let ai be its own judge | arXiv: 2510.07575
benchmarking large language models for zero-shot and few-shot phishing url detec | arXiv: 2602.02641
beyond the singular revealing the value of multiple generations in benchmark eva | arXiv: 2502.08943
beyond the surface enhancing llm-as-a-judge alignment with human via internal re | arXiv: 2508.03550
blink-twice you see but do you observe a reasoning benchmark on visual perceptio | arXiv: 2510.09361
can large language models master complex card games | arXiv: 2509.01328
climb class-imbalanced learning benchmark on tabular data | arXiv: 2505.17451
codeassistbench cab dataset benchmarking for multi-turn chat-based code assistan | arXiv: 2507.10646
compo preference alignment via comparison oracles | arXiv: 2505.05465
conformal online learning of deep koopman linear embeddings | arXiv: 2511.12760
conformal prediction in the loop a feedback-based uncertainty model for trajecto | arXiv: 2510.16376
conftuner training large language models to express their confidence verbally | arXiv: 2508.18847
cost-sensitive freeze-thaw bayesian optimization for efficient hyperparameter tu | arXiv: 2510.21379
creativity or brute force using brainteasers as a window into the problem-solvin | arXiv: 2505.10844
decoupled entropy minimization | arXiv: 2511.03256
efficient semantic uncertainty quantification in language models via diversity-s | arXiv: 2510.21310
enhancing sample selection against label noise by cutting mislabeled easy exampl | arXiv: 2502.08227
evalearn quantifying the learning capability and efficiency of llms via sequenti | arXiv: 2506.02672
exploiting task relationships in continual learning via transferability-aware ta | arXiv: 2502.11609
exploiting vocabulary frequency imbalance in language model pre-training | arXiv: 2508.15390
generalization error analysis for selective state-space models through the lens | arXiv: 2502.01473
houselayout3d a benchmark and training-free baseline for 3d layout estimation in | arXiv: 2512.02450
hybridnorm towards stable and efficient transformer training via hybrid normaliz | arXiv: 2503.04598
incomplete multi-view clustering via hierarchical semantic alignment and coopera | arXiv: 2510.13887
ineq-comp benchmarking human-intuitive compositional reasoning in automated theo | arXiv: 2505.12680
keep it on a leash controllable pseudo-label generation towards realistic long-t | arXiv: 2510.03993
lcdb 1.1 a database illustrating learning curves are more ill-behaved than previo | arXiv: 2505.15657
learning generalizable shape completion with sim3 equivariance | arXiv: 2509.26631
let the experts speak improving survival prediction calibration via mixture-of-e | arXiv: 2511.09567
leveraging robust optimization for llm alignment under distribution shifts | arXiv: 2504.05831
ltd-bench evaluating large language models by letting them draw | arXiv: 2511.02347
meicoder decoding visual stimuli from neural activity by leveraging most excitin | arXiv: 2510.20762
merlin l48 spectrogram dataset | arXiv: 2511.00252
mind the gap removing the discretization gap in differentiable logic gate networ | arXiv: 2506.07500
model-behavior alignment under flexible evaluation when the best-fitting model i | arXiv: 2510.23321
model context protocol for vision systems audit security and protocol extensions | arXiv: 2509.22814
mvsmamba multi-view stereo with state space model | arXiv: 2511.01315
normal-abnormal guided generalist anomaly detection | arXiv: 2510.00495
on evaluating llm alignment by evaluating llms as judges | arXiv: 2511.20604
open-insect benchmarking open-set recognition of novel species in biodiversity m | arXiv: 2503.01691
optitree hierarchical thoughts generation with tree search for llm optimization | arXiv: 2510.22192
parrot a benchmark for evaluating llms in cross-system sql translation | arXiv: 2509.23338
path attention position encoding via accumulating householder transformations | arXiv: 2505.16381
pfδ a benchmark dataset for power flow under load generation and topology variat | arXiv: 2510.22048
put cash on bandits a max k-armed problem for automated machine learning | arXiv: 2505.05226
rdb2g-bench a comprehensive benchmark for automatic graph modeling of relational | arXiv: 2506.01360
reliably detecting model failures in deployment without labels | arXiv: 2506.05047
rethinking evaluation of infrared small target detection | arXiv: 2509.16888
rethinking losses for diffusion bridge samplers | arXiv: 2506.10982
rgb-to-polarization estimation a new task and benchmark study | arXiv: 2505.13050
risk management for mitigating benchmark failure modes benchrisk | arXiv: 2510.21460
scmrdr a scalable and flexible framework for unpaired single-cell multi-omics da | arXiv: 2510.24987
semi-supervised regression with heteroscedastic pseudo-labels | arXiv: 2510.15266
small language models as compiler experts auto-parallelization for heterogeneous | arXiv: 2512.19250
test-time adaptation by causal trimming | arXiv: 2510.11133
the geometry of cortical computation manifold disentanglement and predictive dyn | arXiv: 2508.02995
thought communication in multiagent collaboration | arXiv: 2510.20733
tight lower bounds and improved convergence in performative prediction | arXiv: 2412.03671
time travel is cheating going live with deepfund for real-time fund investment b | arXiv: 2505.11065
turbocharging gaussian process inference with approximate sketch-and-project | arXiv: 2505.13723
unlocking transfer learning for open-world few-shot recognition | arXiv: 2411.09986
what does it take to build a performant selective classifier | arXiv: 2510.20242
your pre-trained llm is secretly an unsupervised confidence calibrator | arXiv: 2505.16690
evorefuse evolutionary prompt optimization for evaluation and mitigation of llm | arXiv: 2505.23473
qsharp provably optimal distributional rl for llm post-training | arXiv: 2502.20548
solverllm leveraging test-time scaling for optimization problem via llm-guided s | arXiv: 2510.16916
speculate deep and accurate lossless and training-free acceleration for offloade | arXiv: 2509.18344
streambridge turning your offline video large language model into a proactive st | arXiv: 2505.05467
symphony synergistic multi-agent planning with heterogeneous language model asse | arXiv: 2601.22623
systematizing llm persona design a four-quadrant technical taxonomy for ai compa | arXiv: 2511.02979
wider or deeper scaling llm inference-time compute with adaptive branching tree | arXiv: 2503.04412
ai progress should be measured by capability-per-resource not scale alone a fram | arXiv: 2511.01077
alternating gradient flows a theory of feature learning in two-layer neural netw | arXiv: 2506.06489
an empirical investigation of neural odes and symbolic regression for dynamical | arXiv: 2601.20637
beyond benign overfitting in nadaraya-watson interpolators | arXiv: 2502.07480
born a transformer -- always a transformer on the effect of pretraining on archi | arXiv: 2505.21785
breaking the frozen subspace importance sampling for low-rank optimization in ll | arXiv: 2502.05790
broken tokens your language model can secretly handle non-canonical tokenization | arXiv: 2506.19004
conformal risk training end-to-end optimization of conformal risk control | arXiv: 2510.08748
differentiable hierarchical visual tokenization | arXiv: 2511.02652
disaggregation reveals hidden training dynamics the case of agreement attraction | arXiv: 2510.24934
does object binding naturally emerge in large pretrained vision transformers | arXiv: 2510.24709
efficient pre-training of llms via topology-aware communication alignment on mor | arXiv: 2509.15940
enhancing training data attribution with representational optimization | arXiv: 2505.18513
final-model-only data attribution with a unifying view of gradient-based methods | arXiv: 2412.03906
flatness is necessary neural collapse is not rethinking generalization via grokk | arXiv: 2509.17738
gemstones a model suite for multi-faceted scaling laws | arXiv: 2502.06857
generalization bounds for rank-sparse neural networks | arXiv: 2510.21945
global minimizers of sigmoid contrastive loss | arXiv: 2509.18552
gradient-weight alignment as a train-time proxy for generalization in classifica | arXiv: 2510.25480
how does sequence modeling architecture influence base capabilities of pre-train | arXiv: 2505.18522
language model behavioral phases are consistent across archi | arXiv: 2510.24963
learning the wrong lessons syntactic-domain spurious correlations in language mo | arXiv: 2509.21155
learning to flow from generative pretext tasks for neural architecture encoding | arXiv: 2510.18360
leveraging importance sampling to detach alignment modules from large language m | arXiv: 2505.19700
lm behavioral phases | arXiv: 2510.24963
memory mosaics at scale | arXiv: 2507.03285
nemotron-climb clustering-based iterative data mixture bootstrapping for languag | arXiv: 2504.13161
neural collapse under gradient flow on shallow relu networks for orthogonally se | arXiv: 2510.21078
optimal online change detection via random fourier features | arXiv: 2505.17789
power lines scaling laws for weight decay and batch size in llm pre-training | arXiv: 2505.13738
predict training data quality via its geometry in metric space | arXiv: 2510.15970
prescribe predicting single-cell responses with bayesian estimation | arXiv: 2510.07964
quantifying task-relevant representational similarity using decision variable co | arXiv: 2506.02164
retrospective in-context learning for temporal credit assignm | arXiv: 2602.17497
ricl temporal credit | arXiv: 2602.17497
scalable fingerprinting of large language models | arXiv: 2502.07760
scaling embedding layers in language models | arXiv: 2502.01637
superposition yields robust neural scaling | arXiv: 2505.10465
the curse of depth in large language models | arXiv: 2502.05795
through the river understanding the benefit of schedule-free methods for languag | arXiv: 2507.09846
understanding and enhancing mask-based pretraining towards universal representat | arXiv: 2509.21650
zeus zero-shot embeddings for unsupervised separation of tabular data | arXiv: 2505.10704
time temporal reasoning | arXiv: 2505.12891
a cramér–von mises approach to incentivizing truthful data sha | arXiv: 2506.07272
a reliable cryptographic framework for empirical machine unl | arXiv: 2404.11577
buffer layers for test-time adaptation | arXiv: 2510.21271
demystifying language model forgetting with low-rank example associations | arXiv: 2406.14026
finding structure in continual learning | arXiv: 2602.04555
procurement auctions with predictions improved frugality for facility location | arXiv: 2512.09367
simu selective influence machine unlearning | arXiv: 2510.07822
stop ddos attacking the research community with ai-generated survey papers | arXiv: 2510.09686
teaming llms to detect and mitigate hallucinations | arXiv: 2510.19507
trust -- transformer-driven u-net for sparse target recovery | arXiv: 2506.01112
less is more but where dynamic token compression via llm-guided keyframe prior | arXiv: 2512.06866
qsvd efficient low-rank approximation for unified query-key-value weight compres | arXiv: 2510.16292
scalable exploration via ensemble | arXiv: 2407.13195
when worse is better navigating the compression-generation tradeoff in visual to | arXiv: 2412.16326
adaptive originality filtering rejection based prompting and riddlescore for cul | arXiv: 2508.18709
dcad-2000 a multilingual dataset across 2000 languages with data cleaning as ano | arXiv: 2502.11546
enhancing multilingual llm pretraining with model-based data selection | arXiv: 2502.10361
exploring the translation mechanism of large language models | arXiv: 2502.11806
helpsteer3-preference open human-annotated preference data across diverse tasks | arXiv: 2505.11475
how data mixing shapes in-context learning asymptotic equivalence for transforme | arXiv: 2510.25753
mergebench a benchmark for merging domain-specialized llms | arXiv: 2505.10833
merit multilingual semantic retrieval with interleaved multi-condition query | arXiv: 2506.03144
parallelprompt extracting parallelism from large language model queries | arXiv: 2506.18728
quantifying climate policy action and its links to development outcomes a cross- | arXiv: 2510.17425
zero-shot performance prediction for probabilistic scaling laws | arXiv: 2510.16743
danmaku tpp bench | arXiv: 2505.18411
ifinder structured zero-shot vision-based llm grounding for dash-cam video reaso | arXiv: 2509.19552
rtv bench benchmarking mllm continuous perception through realtime video | arXiv: 2505.02064
lr yolo lipschitz continuity image restoration object detection | arXiv: 2510.24232
m-grpo stabilizing self-supervised reinforcement learning for multimodal underst | arXiv: 2512.13070
beyond $\tilde{O}(\sqrt{T})$ constraint violation for online convex optimization with adve | arXiv: 2505.06709
constrained network slice assignment via llms | arXiv: 2512.00040
contribution of task-irrelevant stimuli to drift of neural representations | arXiv: 2510.21588
a differentiable model of supply-chain shocks | arXiv: 2511.05231
exact learning of arithmetic with differentiable agents | arXiv: 2511.22751
orbitzoo real orbital systems challenges for reinforcement learning | arXiv: 2504.04160
ortholoc uav 6-dof localization and calibration using orthographic geodata | arXiv: 2509.18350
multi-modal masked autoencoders for learning image-spectrum associations for gal | arXiv: 2510.22527
r2ec towards large recommender models with reasoning | arXiv: 2505.16994
adaptive cooperative transmission design for ultra-reliable low-latency communic | arXiv: 2511.02216
boundary to region supervision for offline safe rl | arXiv: 2509.25727
confounding robust deep reinforcement learning a causal approach | arXiv: 2510.21110
continual knowledge adaptation for reinforcement learning | arXiv: 2510.19314
interactive and hybrid imitation learning provably beating behavior cloning | arXiv: 2412.07057
inverse optimization latent variable models for learning costs applied to route | arXiv: 2509.15999
last iterate convergence in monotone mean field games | arXiv: 2410.05127
coopera continual open ended human robot assistance | arXiv: 2510.23495
dexflywheel a scalable and self-improving data generation framework for dexterou | arXiv: 2509.23829
egothinker egocentric reasoning | arXiv: 2510.23569
t-rex task-adaptive spatial representation extraction for robotic manipulation w | arXiv: 2506.19498
oneshot transfer learning nonlinear pdes perturbative pinns | arXiv: 2511.11137
mechanistic interpretability of rnns emulating hidden mar | arXiv: 2510.25674
starformer semi-supervised task-informed representation learning via dynamic att | arXiv: 2504.10097
trident tri-modal molecular representation learning with taxonomic annotations a | arXiv: 2506.21028
a multitask benchmark for abusive language detection in lowr | arXiv: 2505.12116
active slice discovery in large language models | arXiv: 2511.20713
auto-search and refinement an automated framework for gender bias mitigation in | arXiv: 2502.11559
averimatec a dataset for automatic verification of image-text claims with eviden | arXiv: 2505.17978
concept-level explainability for auditing steering llm responses | arXiv: 2505.07610
date-lm benchmarking data attribution evaluation for large language models | arXiv: 2507.09424
deeptraverse a depth-first search inspired network for algorithmic visual unders | arXiv: 2506.10084
don't let it fade preserving edits in diffusion language mode
evaluating multiple models using labeled and unlabeled data | arXiv: 2501.11866
graphkeeper graph domain-incremental learning via knowledge disentanglement and | arXiv: 2511.00097
if-guide influence function-guided detoxification of llms | arXiv: 2506.01790
noise-robustness through noise a framework combining asymmetric lora with poison | arXiv: 2505.23868
os-harm a benchmark for measuring safety of computer use agents | arXiv: 2506.14866
policy-as-prompt turning ai governance rules into guardrails for ai agents | arXiv: 2509.23994
position paper if innovation in ai systematically violates fundamental rights is | arXiv: 2511.00027
precise information control in long-form text generation | arXiv: 2506.06589
slaying towards queer language processing | arXiv: 2509.17449
connecting the dots a machine learning ready dataset for ionospheric forecasting | arXiv: 2511.15743
ioncast a deep learning framework for forecasting ionospheric total electron con | arXiv: 2511.15004
maestro adaptive sparse attention and robust learning for multimodal dynamic tim | arXiv: 2509.25278
autoregressive adversarial posttraining for realtime interac | arXiv: 2506.09350
dismo disentangled motion representations for openworld moti | arXiv: 2511.23428
force prompting video generation models can learn and generalize physics-based c | arXiv: 2505.19386
foresight adaptive layer reuse for accelerated and highquali | arXiv: 2506.00329
lemica lexicographic minimax path caching for efficient diffusion-based video ge | arXiv: 2511.00090
magcache fast video generation with magnitudeaware cache | arXiv: 2506.09045
photography perspective composition towards aesthetic perspective recommendation | arXiv: 2505.20655
physctrl generative physics for controllable and physicsgrou | arXiv: 2509.20358
posecrafter extreme pose estimation with hybrid video synthesis | arXiv: 2510.19527
radial attention $O(n \log n)$ sparse attention with energy decay for long video gener | arXiv: 2506.19852
rlgf reinforcement learning with geometric feedback for autonomous driving video | arXiv: 2509.16500
s2q-vdit accurate quantized video diffusion transformer with salient data and sp | arXiv: 2508.04016
safesora safe texttovideo generation via graphical watermark | arXiv: 2505.12667
scaling rl to long videos | arXiv: 2507.07966
seeing the wind from a falling leaf | arXiv: 2512.00762
self forcing bridging the train-test gap in autoregressive video diffusion | arXiv: 2506.08009
stable cinemetrics structured taxonomy and evaluation for professional video gen | arXiv: 2509.26555
training-free efficient video generation via dynamic token carving | arXiv: 2505.16864
video diffusion models excel at tracking similar-looking objects without supervi | arXiv: 2512.02339
video killed the energy budget characterizing the latency and power regimes of o | arXiv: 2509.19222
vmdt decoding the trustworthiness of video foundation models | arXiv: 2511.05682
vorta efficient video diffusion via routing sparse attention | arXiv: 2505.18809
vsa faster video diffusion with trainable sparse attention | arXiv: 2505.13389
dualground phrase temporal | arXiv: 2510.20244
egogazevqa egocentric gaze guided video question answering | arXiv: 2509.07447
star tool video qa | arXiv: 2512.10359
tempsamp r1 temporal grounding | arXiv: 2509.18056