NeurIPS 2025 Paper Notes TODO
Total: 2901 papers | Completed: 2901 | Pending update: 0
- 3-Model Speculative Decoding (PyramidSD) | arXiv: 2510.12966
- 3D Visual Illusion Depth Estimation | arXiv: 2505.13061
- 3D-Agent: Tri-Modal Multi-Agent Collaboration for Scalable 3D Object Annotation | arXiv: 2601.04404
- 3D-RAD: A Comprehensive 3D Radiology Med-VQA Dataset with Multi-Temporal Analysis and Diverse Diagnostic Tasks | arXiv: 2506.11147
- 3DID: Direct 3D Inverse Design for Aerodynamics with Physics-Aware Optimization | arXiv: 2512.08987
- 3EED: Ground Everything Everywhere in 3D | arXiv: 2511.01755
- 4DGCPro: Efficient Hierarchical 4D Gaussian Compression for Progressive Volumetric Video Streaming | arXiv: 2509.17513
- 4DGT: Learning a 4D Gaussian Transformer Using Real-World Monocular Videos | arXiv: 2506.08015
- 70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float (DFloat11) | arXiv: 2504.11651
- A Closer Look at Model Collapse: From a Generalization-to-Memorization Perspective | arXiv: 2509.16499
- A Connection Between Score Matching and Local Intrinsic Dimension | arXiv: 2510.12975
- A Controllable Examination for Long-Context Language Models | arXiv: 2506.02921
- A Cramér–von Mises Approach to Incentivizing Truthful Data Sharing | arXiv: 2506.07272
- A Data-Driven Prism: Multi-View Source Separation with Diffusion Model Priors | arXiv: 2510.05205
- a differentiable model of supply-chain shocks | arXiv: 2511.05231
- A Differential and Pointwise Control Approach to Reinforcement Learning | arXiv: 2404.15617
- A Diffusion Model for Regular Time Series Generation from Irregular Data with Completion and Masking | arXiv: 2510.06699
- A Frustratingly Simple Yet Highly Effective Attack Baseline: Over 90% Success Rate Against the Strong Black-box Models of GPT-4.5/4o/o1 | arXiv: 2503.10635
- A Generalized Bisimulation Metric of State Similarity between Markov Decision Processes: From Theoretical Propositions to Applications | arXiv: 2509.18714
- A Generalized Label Shift Perspective for Cross-Domain Gaze Estimation | arXiv: 2505.13043
- A Gradient Flow Approach to Solving Inverse Problems with Latent Diffusion Models | arXiv: 2509.19276
- A Granular Study of Safety Pretraining under Model Abliteration | arXiv: 2510.02768
- A Graph Neural Network Approach for Localized and High-Resolution Temperature Forecasting | arXiv: 2512.00546
- A High-Dimensional Statistical Method for Optimizing Transfer Quantities in Multi-Source Transfer Learning | arXiv: 2502.04242
- A is for Absorption: Studying Feature Splitting and Absorption in Sparse Autoencoders | arXiv: 2409.14507
- a joint learning approach to hardware caching and prefetching | arXiv: 2510.10862
- A Little Depth Goes a Long Way: The Expressive Power of Log-Depth Transformers | arXiv: 2503.03961
- A Multi-Task Benchmark for Abusive Language Detection in Low-Resource Settings | arXiv: 2505.12116
- A Multimodal Benchmark for Framing of Oil & Gas Advertising and Potential Greenwashing Detection | arXiv: 2510.21679
- A Near-optimal, Scalable and Parallelizable Framework for Stochastic Bandits Robust to Adversarial Corruptions and Beyond | arXiv: 2502.07514
- A Novel Approach to Classification of ECG Arrhythmia Types with Latent ODEs | arXiv: 2511.16933
- A Partition Cover Approach for Tokenization | arXiv: 2501.06246
- A Practical Guide for Incorporating Symmetry in Diffusion Policy | arXiv: 2505.13431
- A Principle of Targeted Intervention for Multi-Agent Reinforcement Learning | arXiv: 2510.17697
- A Probabilistic U-Net Approach to Downscaling Climate Simulations | arXiv: 2511.03197
- A Regularized Newton Method for Nonconvex Optimization with Global and Local Complexity Guarantees | arXiv: 2502.04799
- A Reliable Cryptographic Framework for Empirical Machine Unlearning Evaluation | arXiv: 2404.11577
- A Self-Improving Coding Agent | arXiv: 2504.15228
- A Set of Generalized Components to Achieve Effective Poison-only Clean-label Backdoor Attacks with Collaborative Sample Selection and Triggers | arXiv: 2509.19947
- A Simple Linear Patch Revives Layer-Pruned Large Language Models | arXiv: 2505.24680
- A Single-Loop First-Order Algorithm for Linearly Constrained Bilevel Optimization | arXiv: 2510.24710
- A Snapshot of Influence: A Local Data Attribution Framework for Online Reinforcement Learning | arXiv: 2505.19281
- A Standardized Benchmark for Multilabel Antimicrobial Peptide Classification | arXiv: 2511.04814
- A Stochastic Differential Equation Framework for Multi-Objective LLM Interactions | arXiv: 2510.10739
- A Sustainable AI Economy Needs Data Deals That Work for Generators | arXiv: 2601.09966
- A Systematic Evaluation of Preference Aggregation in Federated RLHF for Pluralistic Alignment of LLMs | arXiv: 2512.08786
- A Theoretical Framework for Grokking: Interpolation followed by Riemannian Norm Minimisation | arXiv: 2505.20172
- A Theoretical Study on Bridging Internal Probability and Self-Consistency for LLM Reasoning | arXiv: 2510.15444
- A Theory of Multi-Agent Generative Flow Networks | arXiv: 2509.20408
- A Token is Worth over 1,000 Tokens: Efficient Knowledge Distillation through Low-Rank Clone | arXiv: 2505.12781
- A TRIANGLE Enables Multimodal Alignment Beyond Cosine Similarity | arXiv: 2509.24734
- A Unified Approach to Submodular Maximization Under Noise | arXiv: 2510.21128
- A Unified Framework for Establishing the Universal Approximation of Transformer-Type Architectures | arXiv: 2506.23551
- A Unified Framework for Provably Efficient Algorithms to Estimate Shapley Values | arXiv: 2506.05216
- A Unified Framework for Variable Selection in Model-Based Clustering with Missing Not at Random | arXiv: 2505.19093
- A Unified Reasoning Framework for Holistic Zero-Shot Video Anomaly Analysis | arXiv: 2511.00962
- A Unified Solution to Video Fusion: From Multi-Frame Learning to Benchmarking | arXiv: 2505.19858
- A Unified Stability Analysis of SAM vs SGD: Role of Data Coherence and Emergence of Simplicity Bias | arXiv: 2511.17378
- A Unifying View of Linear Function Approximation in Off-Policy RL Through Matrix Splitting and Preconditioning | arXiv: 2501.01774
- A Variational Manifold Embedding Framework for Nonlinear Dimensionality Reduction | arXiv: 2511.22128
- A-MEM: Agentic Memory for LLM Agents | arXiv: 2502.12110
- a-thought efficient reasoning via bidirectional compression for low-resource set | arXiv: 2505.24550
- AANet: Virtual Screening under Structural Uncertainty via Alignment and Aggregation | arXiv: 2506.05768
- AbbIE: Autoregressive Block-Based Iterative Encoder for Efficient Sequence Modeling | arXiv: 2507.08567
- Abstain Mask Retain Core: Time Series Prediction by Adaptive Masking Loss with Representation Consistency | arXiv: 2510.19980
- AC-LoRA: (Almost) Training-Free Access Control-Aware Multi-Modal LLMs | arXiv: 2505.11557
- Accelerate Creation of Product Claims Using Generative AI | arXiv: 2509.20652
- Accelerating Parallel Diffusion Model Serving with Residual Compression | arXiv: 2507.17511
- AccuQuant: Simulating Multiple Denoising Steps for Quantizing Diffusion Models | arXiv: 2510.20348
- Accurate and Efficient Low-Rank Model Merging in Core Space | arXiv: 2509.17786
- AceSearcher: Bootstrapping Reasoning and Search for LLMs via Reinforced Self-Play | arXiv: 2509.24193
- ACT as Human: Multimodal Large Language Model Data Annotation with Critical Thinking | arXiv: 2511.09833
- Act to See, See to Act: Diffusion-Driven Perception-Action Interplay for Adaptive Policies | arXiv: 2509.25822
- Active Measurement: Efficient Estimation at Scale | arXiv: 2507.01372
- Active Slice Discovery in Large Language Models | arXiv: 2511.20713
- Active Target Discovery under Uninformative Prior: The Power of Permanent and Transient Memory | arXiv: 2510.16676
- Actor-Free Continuous Control via Structurally Maximizable Q-Functions | arXiv: 2510.18828
- AcuRank: Uncertainty-Aware Adaptive Computation for Listwise Reranking | arXiv: 2505.18512
- Ada-KV: Optimizing KV Cache Eviction by Adaptive Budget Allocation for Efficient LLM Inference | arXiv: 2407.11550
- AdaLRS: Loss-Guided Adaptive Learning Rate Search for Efficient Foundation Model Pretraining | arXiv: 2506.13274
- AdaptDel: Adaptable Deletion Rate Randomized Smoothing for Certified Robustness | arXiv: 2511.09316
- AdaptGrad: Adaptive Sampling to Reduce Noise | arXiv: 2410.07711
- Adapting Speech Language Model to Singing Voice Synthesis | arXiv: 2512.14657
- Adapting Vision-Language Models for Evaluating World Models | arXiv: 2506.17967
- Adaptive Algorithms with Sharp Convergence Rates for Stochastic Hierarchical Optimization | arXiv: 2509.15399
- adaptive cooperative transmission design for ultra-reliable low-latency communic | arXiv: 2511.02216
- adaptive coopetition leveraging coarse verifier signals for resilient multi-agen | arXiv: 2510.18179
- Adaptive Data Analysis for Growing Data | arXiv: 2405.13375
- Adaptive Discretization for Consistency Models | arXiv: 2510.17266
- Adaptive Dual Reasoner: Large Reasoning Models Can Think Efficiently by Hybrid Reasoning | arXiv: 2510.10207
- Adaptive Frontier Exploration on Graphs with Applications to Network-Based Disease Testing | arXiv: 2505.21671
- Adaptive Kernel Design for Bayesian Optimization Is a Piece of CAKE with LLMs | arXiv: 2509.17998
- Adaptive LoRA Experts Allocation and Selection for Federated Fine-Tuning | arXiv: 2509.15087
- Adaptive Neighborhood-Constrained Q Learning for Offline Reinforcement Learning | arXiv: 2511.02567
- adaptive online emulation for accelerating complex physical simulations | arXiv: 2508.08012
- Adaptive Originality Filtering: Rejection-Based Prompting and RiddleScore for Culturally Grounded Multilingual Riddle Generation | arXiv: 2508.18709
- Adaptive Prediction-Powered AutoEval with Reliability and Efficiency Guarantees | arXiv: 2505.18659
- Adaptive Stochastic Coefficients for Accelerating Diffusion Sampling | arXiv: 2510.23285
- Adaptively Coordinating with Novel Partners via Learned Latent Strategies | arXiv: 2511.12754
- AdaSTaR: Adaptive Data Sampling for Training Self-Taught Reasoners | arXiv: 2505.16322
- AdaVideoRAG: Omni-Contextual Adaptive Retrieval-Augmented Efficient Long Video Understanding | arXiv: 2506.13589
- additive models explained a computational complexity approach | arXiv: 2510.21292
- addressing mark imbalance in integrationfree neural marked t | arXiv: 2510.20414
- adjacent words divergent intents jailbreaking large language models via task con | arXiv: 2510.21189
- Adjoint Schrödinger Bridge Sampler | arXiv: 2506.22565
- adjusted count quantification learning on graphs | arXiv: 2503.09395
- ADMN: A Layer-Wise Adaptive Multimodal Network for Dynamic Input Noise and Compute Resources | arXiv: 2502.07862
- AdmTree: Compressing Lengthy Context with Adaptive Semantic Trees | arXiv: 2512.04550
- ADPretrain: Advancing Industrial Anomaly Detection via Anomaly Representation Pretraining | arXiv: 2511.05245
- Adv-SSL: Adversarial Self-Supervised Representation Learning with Theoretical Guarantees | arXiv: 2408.08533
- Advancing Compositional Awareness in CLIP with Efficient Fine-Tuning | arXiv: 2505.24424
- Advancing Expert Specialization for Better MoE | arXiv: 2505.22323
- adversarial locomotion and motion imitation for humanoid policy learning | arXiv: 2504.14305
- Adversarial Paraphrasing: A Universal Attack for Humanizing AI-Generated Text | arXiv: 2506.07001
- aero a redirection-based optimization framework inspired by judo for robust prob | arXiv: 2506.02415
- AffordBot: 3D Fine-grained Embodied Reasoning via Multimodal Large Language Models | arXiv: 2511.10017
- AgentAuditor: Human-Level Safety and Security Evaluation for LLM Agents | arXiv: 2506.00641
- agentchangebench a multi-dimensional evaluation framework for goal-shift robustn | arXiv: 2510.18170
- AgentDAM: Privacy Leakage Evaluation for Autonomous Web Agents | arXiv: 2503.09780
- Agentic NL2SQL to Reduce Computational Costs | arXiv: 2510.14808
- agentic persona control and task state tracking for realistic user simulation in | arXiv: 2601.15290
- Agentic Plan Caching: Test-Time Memory for Fast and Cost-Efficient LLM Agents | arXiv: 2506.14852
- AgentiQL: An Agent-Inspired Multi-Expert Framework for Text-to-SQL Generation | arXiv: 2510.10661
- AgentMisalignment: Measuring the Propensity for Misaligned Behaviour in LLM-Based Agents | arXiv: 2506.04018
- agentstealth reinforcing large language model for anonymizing user-generated tex | arXiv: 2506.22508
- AgentTTS: Large Language Model Agent for Test-time Compute-optimal Scaling Strategy in Complex Tasks | arXiv: 2508.00890
- aggregation hides out-of-distribution generalization failures from spurious corr | arXiv: 2510.24884
- Agint: Agentic Graph Compilation for Software Engineering Agents | arXiv: 2511.19635
- AHA -- Predicting What Matters Next: Online Highlight Detection Without Looking Ahead | arXiv: 2509.16421
- AI Progress Should Be Measured by Capability-Per-Resource, Not Scale Alone: A Framework for Gradient-Guided Resource Allocation in LLMs | arXiv: 2511.01077
- AI Should Sense Better, Not Just Scale Bigger: Adaptive Sensing as a Paradigm Shift | arXiv: 2507.07820
- AI-Generated Video Detection via Perceptual Straightening | arXiv: 2507.00583
- ALE-Bench: A Benchmark for Long-Horizon Objective-Driven Algorithm Engineering | arXiv: 2506.09050
- alias-free vit fractional shift invariance via linear attention | arXiv: 2510.22673
- Aligning by Misaligning: Boundary-aware Curriculum Learning for Multimodal Alignment | arXiv: 2511.08399
- Aligning Compound AI Systems via System-level DPO | arXiv: 2502.17721
- Aligning Text to Image in Diffusion Models is Easier Than You Think | arXiv: 2503.08250
- Alignment of Large Language Models with Constrained Learning | arXiv: 2505.19387
- aline joint amortization for bayesian inference and active data acquisition | arXiv: 2506.07259
- All You Need is One: Capsule Prompt Tuning with a Single Vector | arXiv: 2510.16670
- Alligat0R: Pre-Training through Covisibility Segmentation for Relative Camera Pose Regression | arXiv: 2503.07561
- almguard safety shortcuts and where to find them as guardrails for audio-languag | arXiv: 2510.26096
- alternating gradient flows a theory of feature learning in two-layer neural netw | arXiv: 2506.06489
- Amortized Active Generation of Pareto Sets | arXiv: 2510.21052
- Amortized Sampling with Transferable Normalizing Flows | arXiv: 2508.18175
- an adaptive algorithm for bilevel optimization on riemannian manifolds | arXiv: 2504.06042
- an analysis of causal effect estimation using outcome invariant data augmentatio | arXiv: 2510.25128
- an analysis of concept bottleneck models measuring understanding and mitigating | arXiv: 2505.16705
- an empirical investigation of neural odes and symbolic regression for dynamical | arXiv: 2601.20637
- an evidence-based post-hoc adjustment framework for anomaly detection under data | arXiv: 2510.21296
- Angular Constraint Embedding via SpherePair Loss for Constrained Clustering | arXiv: 2510.06907
- angular steering behavior control via rotation in activation space | arXiv: 2510.26243
- Anti-Aliased 2D Gaussian Splatting | arXiv: 2506.11252
- AntiGrounding: Lifting Robotic Actions into VLM Representation Space for Decision Making | arXiv: 2506.12374
- Any Large Language Model Can Be a Reliable Judge: Debiasing with a Reasoning-based Bias Detector | arXiv: 2505.17100
- Approximate Domain Unlearning for Vision-Language Models | arXiv: 2510.08132
- Approximately Aligned Decoding | arXiv: 2410.01103
- approximating shapley explanations in reinforcement learning | arXiv: 2511.06094
- aquamam an autoregressive quaternion manifold model for rapidly estimating compl | arXiv: 2301.08838
- are greedy task orderings better than random in continual linear regression | arXiv: 2510.19941
- Are Language Models Efficient Reasoners? A Perspective from Logic Programming | arXiv: 2510.25626
- are large language models sensitive to the motives behind communication | arXiv: 2510.19687
- are large reasoning models good translation evaluators analysis and performance | arXiv: 2510.20780
- are pixel-wise metrics reliable for sparse-view computed tomography reconstructi | arXiv: 2506.02093
- are vision language models ready for clinical diagnosis a 3d medical benchmark f | arXiv: 2505.18915
- arecho autoregressive evaluation via chain-based hypothesis optimization for spe | arXiv: 2505.24518
- ARGenSeg: Image Segmentation with Autoregressive Image Generation Model | arXiv: 2510.20803
- ARM: Adaptive Reasoning Model | arXiv: 2505.20258
- ARMesh: Autoregressive Mesh Generation via Next-Level-of-Detail Prediction | arXiv: 2509.20824
- artificial hivemind the open-ended homogeneity of language models and beyond | arXiv: 2510.22954
- asap an agentic solution to auto-optimize performance of large-scale llm trainin | arXiv: 2511.03844
- Ascent Fails to Forget | arXiv: 2509.26427
- asciibench evaluating language-model-based understanding of visually-oriented te | arXiv: 2512.04125
- ask a strong llm judge when your reward model is uncertain | arXiv: 2510.20369
- associative syntax and maximal repetitions reveal context-dependent complexity i | arXiv: 2512.01033
- astroco self-supervised conformer-style transformers for light-curve embeddings | arXiv: 2509.24134
- astrovisbench a code benchmark for scientific computing and visualization in ast | arXiv: 2505.20538
- asymmetric duos sidekicks improve uncertainty | arXiv: 2505.18636
- asymptotic and finite-time guarantees for langevin-based temperature annealing i | arXiv: 2603.12552
- asymptotically stable quaternion-valued hopfield-structured neural network with | arXiv: 2510.16607
- atlas autoformalizing theorems through lifting augmentation and synthesis of dat | arXiv: 2502.05567
- atlasgs atlanta-world guided surface reconstruction with implicit structured gau | arXiv: 2510.25129
- Atom of Thoughts for Markov LLM Test-Time Scaling | arXiv: 2502.12018
- Atomic Diffusion Models for Small Molecule Structure Elucidation from NMR Spectra | arXiv: 2512.03127
- attack via overfitting 10-shot benign fine-tuning to jailbreak llms | arXiv: 2510.02833
- attention as discrete-time markov chains | arXiv: 2507.17657
- attention your vision language model could be maliciously manipulated | arXiv: 2505.19911
- AttentionPredictor: Temporal Patterns Matter for KV Cache Compression
- attractive metadata attack inducing llm agents to invoke malicious tools | arXiv: 2508.02110
- attributing response to context a jensen-shannon divergence driven mechanistic s | arXiv: 2505.16415
- audio super-resolution with latent bridge models | arXiv: 2509.17609
- Auditing Meta-Cognitive Hallucinations in Reasoning Large Language Models | arXiv: 2505.13143
- audsemthinker enhancing audio-language models through reasoning over semantics o | arXiv: 2505.14142
- AugGen: Synthetic Augmentation using Diffusion Models Can Improve Recognition | arXiv: 2503.11544
- augmenting biological fitness prediction benchmarks with landscapes features fro | arXiv: 2510.24826
- Auto-Compressing Networks | arXiv: 2506.09714
- auto-search and refinement an automated framework for gender bias mitigation in | arXiv: 2502.11559
- autodiscovery open-ended scientific discovery via bayesian surprise | arXiv: 2507.00310
- Autoencoding Random Forests | arXiv: 2505.21441
- autojudge judge decoding without manual annotation | arXiv: 2504.20039
- automated algorithm design via nevanlinna-pick interpolation | arXiv: 2509.21416
- Automated Composition of Agents: A Knapsack Approach for Agentic Component Selection | arXiv: 2510.16499
- Automated Detection of Visual Attribute Reliance with a Self-Reflective Agent | arXiv: 2510.21704
- automated multi-agent workflows for rtl design | arXiv: 2509.20182
- automaton constrained q-learning | arXiv: 2510.05061
- autoopt a dataset and a unified framework for automating optimization problem so | arXiv: 2510.21436
- Autoregressive Adversarial Post-Training for Real-Time Interactive Video Generation | arXiv: 2506.09350
- AutoSciDACT: Automated Scientific Discovery through Contrastive Embedding and Hypothesis Testing | arXiv: 2510.21935
- autotom scaling model-based mental inference via automated agent modeling | arXiv: 2502.15676
- AutoVLA: A Vision-Language-Action Model for End-to-End Autonomous Driving with Adaptive Reasoning and Reinforcement Fine-Tuning | arXiv: 2506.13757
- Availability-aware Sensor Fusion via Unified Canonical Space | arXiv: 2503.07029
- averimatec a dataset for automatic verification of image-text claims with eviden | arXiv: 2505.17978
- badiff bandwidth adaptive diffusion model | arXiv: 2510.21366
- Balanced Conic Rectified Flow | arXiv: 2510.25229
- Balanced Token Pruning: Accelerating Vision Language Models Beyond Local Optimization | arXiv: 2505.22038
- balancing performance and costs in best arm identification | arXiv: 2505.20583
- bandit and delayed feedback in online structured prediction | arXiv: 2502.18709
- barcodemamba advancing state-space models for fungal biodiversity research | arXiv: 2512.15931
- barista brain scale informed spatiotemporal representation of human intracranial | arXiv: 2512.12135
- Base Models Know How to Reason, Thinking Models Learn When | arXiv: 2510.07364
- bayesian ego-graph inference for networked multi-agent reinforcement learning | arXiv: 2509.16606
- bayesian evaluation of large language model behavior | arXiv: 2511.10661
- bayesian surrogates for risk-aware pre-assessment of aging bridge portfolios | arXiv: 2509.25031
- beast efficient tokenization of b-splines encoded action sequences for imitation | arXiv: 2506.06072
- BecomingLit: Relightable Gaussian Avatars with Hybrid Neural Shading | arXiv: 2506.06271
- bedlam20 synthetic humans and cameras in motion | arXiv: 2511.14394
- behavior injection preparing language models for reinforcement learning | arXiv: 2505.18917
- Belief-Calibrated Multi-Agent Consensus Seeking for Complex NLP Tasks | arXiv: 2510.06307
- benchmarking agentic systems in automated scientific information extraction with | arXiv: 2510.00795
- Benchmarking Egocentric Multimodal Goal Inference for Assistive Wearable Agents | arXiv: 2510.22443
- benchmarking is broken -- dont let ai be its own judge | arXiv: 2510.07575
- benchmarking large language models for zero-shot and few-shot phishing url detec | arXiv: 2602.02641
- benchmarking probabilistic time series forecasting models on neural activity | arXiv: 2510.18037
- Benchmarking Retrieval-Augmented Multimodal Generation for Document Question Answering | arXiv: 2505.16470
- benfords curse tracing digit bias to numerical hallucination in llms | arXiv: 2506.01734
- better estimation of the kullback--leibler divergence between language models | arXiv: 2504.10637
- better ntk conditioning a free lunch from relu nonlinear activation in wide neur | arXiv: 2305.08813
- Better Tokens for Better 3D: Advancing Vision-Language Modeling in 3D Medical Imaging | arXiv: 2510.20639
- Beyond Accuracy: Dissecting Mathematical Reasoning for LLMs Under Reinforcement Learning | arXiv: 2506.04723
- beyond benign overfitting in nadaraya-watson interpolators | arXiv: 2502.07480
- Beyond Chemical QA: Evaluating LLM's Chemical Reasoning with Modular Chemical Operations | arXiv: 2505.21318
- Beyond Components: Singular Vector-Based Interpretability of Transformer Circuits | arXiv: 2511.20273
- beyond greedy exits improved early exit decisions for risk control and reliabili | arXiv: 2509.23666
- beyond higher rank token-wise input-output projections for efficient low-rank ad | arXiv: 2510.23123
- beyond last-click an optimal mechanism for ad attribution | arXiv: 2511.22918
- Beyond Masked and Unmasked: Discrete Diffusion Models via Partial Masking | arXiv: 2505.18495
- beyond parallelism synergistic computational graph effects in multi-head attenti | arXiv: 2507.02944
- beyond random automatic inner-loop optimization in dataset distillation | arXiv: 2510.04838
- Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning
- Beyond the Singular: Value of Multiple Generations in Benchmark Evaluation | arXiv: 2502.08943
- beyond the surface enhancing llm-as-a-judge alignment with human via internal re | arXiv: 2508.03550
- beyond tildeosqrtt constraint violation for online convex optimization with adve | arXiv: 2505.06709
- beyond token probes hallucination detection via activation tensors with act-vit | arXiv: 2510.00296
- bezier splatting for fast and differentiable vector graphics rendering | arXiv: 2503.16424
- Bi-Level Decision-Focused Causal Learning for Large-Scale Marketing Optimization | arXiv: 2510.19517
- bias in the picture benchmarking vlms with social-cue news images and llm-as-jud | arXiv: 2509.19659
- bidirectional representations augmented autoregressive biological sequence gener | arXiv: 2510.08169
- bigram subnetworks mapping to next tokens in transformer language models | arXiv: 2504.15471
- binary quadratic quantization beyond first-order quantization for real-valued ma | arXiv: 2510.18650
- biobench a blueprint to move beyond imagenet for scientific ml benchmarks | arXiv: 2511.16315
- bioclip 2 emergent properties from scaling hierarchical contrastive learning | arXiv: 2505.23883
- Bispectral OT: Dataset Comparison using Symmetry-Aware Optimal Transport | arXiv: 2509.20678
- bitmark watermarking bitwise autoregressive image generative models | arXiv: 2506.21209
- Bits Leaked per Query: Information-Theoretic Bounds on Adversarial Attacks against LLMs | arXiv: 2510.17000
- blameless users in a clean room defining copyright protection for generative mod | arXiv: 2506.19881
- Blending Complementary Memory Systems in Hybrid Quadratic-Linear Transformers | arXiv: 2506.00744
- blind strong gravitational lensing inversion joint inference of source and lens | arXiv: 2511.04792
- blink-twice you see but do you observe a reasoning benchmark on visual perceptio | arXiv: 2510.09361
- bliss bandit layer importance sampling strategy for efficient training of graph | arXiv: 2512.22388
- BlurDM: A Blur Diffusion Model for Image Deblurring | arXiv: 2512.03979
- blurguard a simple approach for robustifying image protection against ai-powered | arXiv: 2511.00143
- BNMusic: Blending Environmental Noises into Personalized Music | arXiv: 2506.10754
- boltznce learning likelihoods for boltzmann generation with stochastic interpola | arXiv: 2507.00846
- boosting adversarial transferability with spatial adversarial alignment | arXiv: 2501.01015
- Boosting Generative Image Modeling via Joint Image-Feature Synthesis | arXiv: 2504.16064
- bootstrap off-policy with world model | arXiv: 2511.00423
- born a transformer -- always a transformer on the effect of pretraining on archi | arXiv: 2505.21785
- boundary-to-region supervision for offline safe reinforcement learning | arXiv: 2509.25727
- Brain Harmony: A Multimodal Foundation Model Unifying Morphology and Function into 1D Tokens | arXiv: 2509.24693
- brain-like processing pathways form in models with heterogeneous experts | arXiv: 2506.02813
- brain-like variational inference | arXiv: 2410.19315
- brain-tuning improves generalizability and efficiency of brain alignment in spee | arXiv: 2510.21520
- brainomni a brain foundation model for unified eeg and meg signals | arXiv: 2505.18185
- Breaking AR's Sampling Bottleneck: Provable Acceleration via Diffusion Language Models | arXiv: 2505.21400
- breaking the compression ceiling data-free pipeline for ultra-efficient delta co | arXiv: 2505.13563
- breaking the frozen subspace importance sampling for low-rank optimization in ll | arXiv: 2502.05790
- breaking the gradient barrier unveiling large language models for strategic clas | arXiv: 2511.06979
- bridgevla input-output alignment for efficient 3d manipulation learning with vis | arXiv: 2506.07961
- bridging embodiment gaps deploying vision-language-action models on soft robots | arXiv: 2510.17369
- bridging graph and state-space modeling for intensive care unit length of stay p | arXiv: 2508.17554
- bridging human and llm judgments understanding and narrowing the gap | arXiv: 2508.12792
- bridging symmetry and robustness on the role of equivariance in enhancing advers | arXiv: 2510.16171
- broken tokens your language model can secretly handle non-canonical tokenization | arXiv: 2506.19004
- BTL-UI: Blink-Think-Link Reasoning Model for GUI Agent | arXiv: 2509.15566
- bubbleformer forecasting boiling with transformers | arXiv: 2507.21244
- buffer layers for test-time adaptation | arXiv: 2510.21271
- burstdeflicker a benchmark dataset for flicker removal in dynamic scenes | arXiv: 2510.09996
- c-lora contextual low-rank adaptation for uncertainty estimation in large langua | arXiv: 2505.17773
- c-nav towards self-evolving continual object navigation in open world | arXiv: 2510.20685
- c2prompt class-aware client knowledge interaction for federated continual learni | arXiv: 2509.19674
- c3po cross-view cross-modality correspondence by pointmap prediction | arXiv: 2511.18559
- cadmorph geometry-driven parametric cad editing via a plan-generate-verify loop | arXiv: 2512.11480
- cam a constructivist view of agentic memory for llm-based reading comprehension | arXiv: 2510.05520
- CAMILA: Context-Aware Masking for Image Editing with Language Alignment
- camit a time-aware car model dataset for classification and generation | arXiv: 2510.17626
- Can Agents Fix Agent Issues? | arXiv: 2505.20749
- Can DPO Learn Diverse Human Values? A Theoretical Scaling Law | arXiv: 2408.03459
- can knowledge-graph-based retrieval augmented generation really retrieve what yo | arXiv: 2510.16582
- can large language models master complex card games | arXiv: 2509.01328
- can llms outshine conventional recommenders a comparative evaluation | arXiv: 2503.05493
- can llms reason over non-text modalities in a training-free manner a case study | arXiv: 2509.17552
- Can LLMs Write Faithfully? An Agent-Based Evaluation of LLM-generated Islamic Content | arXiv: 2510.24438
- can multi-modal llms provide live step-by-step task guidance | arXiv: 2511.21998
- CAPability: A Comprehensive Visual Caption Benchmark for Evaluating Both Correctness and Thoroughness | arXiv: 2502.14914
- capturing individual human preferences with reward features | arXiv: 2503.17338
- care-pd a multi-site anonymized clinical dataset for parkinsons disease gait ass | arXiv: 2510.04312
- CAS-Spec: Cascade Adaptive Self-Speculative Decoding for On-the-Fly Lossless Inference Acceleration of LLMs | arXiv: 2510.26843
- CAT: Circular-Convolutional Attention for Sub-Quadratic Transformers | arXiv: 2504.06704
- Causal Head Gating: A Framework for Interpreting Roles of Attention Heads in Transformers | arXiv: 2505.13737
- causal masking on spatial data an information-theoretic case for learning spatia | arXiv: 2510.27009
- Causal-LLaVA: Causal Disentanglement for Mitigating Hallucination in Multimodal Large Language Models | arXiv: 2505.19474
- causaldynamics a large-scale benchmark for structural discovery of dynamical cau | arXiv: 2505.16620
- causality meets locality provably generalizable and scalable policy learning for | arXiv: 2510.21427
- causality-induced positional encoding for transformer-based representation learn | arXiv: 2509.16629
- Causally Reliable Concept Bottleneck Models | arXiv: 2503.04363
- CBMAS: Cognitive Behavioral Modeling via Activation Steering | arXiv: 2601.06109
- cdflow building invertible layers with circulant and diagonal matrices | arXiv: 2510.25323
- certifying concavity and monotonicity in games via sum-of-squares hierarchies | arXiv: 2512.10292
- certifying stability of reinforcement learning policies using generalized lyapun | arXiv: 2505.10947
- cgbench benchmarking language model scientific reasoning for clinical genetics r | arXiv: 2510.11985
- ChA-MAEViT: Unifying Channel-Aware Masked Autoencoders and Multi-Channel Vision Transformers for Improved Cross-Channel Learning | arXiv: 2503.19331
- Chain-of-Retrieval Augmented Generation | arXiv: 2501.14342
- channel matters estimating channel influence for multivariate time series | arXiv: 2408.14763
- characterization and learning of causal graphs from hard interventions | arXiv: 2505.01037
- characterizing the expressivity of fixed-precision transformer language models | arXiv: 2505.23623
- ChartMuseum: Testing Visual Reasoning Capabilities of Large Vision-Language Models | arXiv: 2505.13444
- Checklists Are Better Than Reward Models For Aligning Language Models | arXiv: 2507.18624
- chiqpm calibrated hierarchical interpretable image classification | arXiv: 2511.20779
- choice benchmarking the remote sensing capabilities of large vision-language mod | arXiv: 2411.18145
- chronograph a real-world graph-based multivariate time series dataset | arXiv: 2509.04449
- ChunkKV: Semantic-Preserving KV Cache Compression for Efficient Long-Context LLM Inference | arXiv: 2502.00299
- classical planning with llm-generated heuristics challenging the state of the ar | arXiv: 2503.18809
- clawscreativity detection for llm-generated solutions using attention window of | arXiv: 2510.17921
- clean first align later benchmarking preference data cleaning for reliable llm a | arXiv: 2509.23564
- cleverbirds a multiple-choice benchmark for fine-grained human knowledge tracing | arXiv: 2511.08512
- climb class-imbalanced learning benchmark on tabular data | arXiv: 2505.17451
- clip-and-verify linear constraint-driven domain clipping for accelerating neural | arXiv: 2512.11087
- clipgaussian universal and multimodal style transfer based on gaussian splatting | arXiv: 2505.22854
- cloud4d estimating cloud properties at a high spatial and temporal resolution | arXiv: 2511.19431
- Co-Evolving LLM Coder and Unit Tester via Reinforcement Learning | arXiv: 2506.03136
- Co-Reinforcement Learning for Unified Multimodal Understanding and Generation | arXiv: 2505.17534
- CodeAssistBench (CAB): Dataset & Benchmarking for Multi-turn Chat-Based Code Assistance | arXiv: 2507.10646
- codecrash exposing llm fragility to misleading natural language in code reasonin | arXiv: 2504.14119
- codegemm a codebook-centric approach to efficient gemm in quantized llms | arXiv: 2512.17970
- cognitive mirrors exploring the diverse functional roles of attention heads in l | arXiv: 2512.10978
- cogvla cognition-aligned vision-language-action model via instruction-driven rou | arXiv: 2508.21046
- coido efficient data selection for visual instruction tuning via coupled importa | arXiv: 2510.17847
- Collapsing Taylor Mode Automatic Differentiation | arXiv: 2505.13644
- collective narrative grounding community-coordinated data contributions to impro | arXiv: 2601.04201
- communicating plans not percepts scalable multi-agent coordination with embodied | arXiv: 2508.02912
- comparing uniform price and discriminatory multi-unit auctions through regret mi | arXiv: 2510.19591
- complexity scaling laws for neural models using combinatorial optimization | arXiv: 2506.12932
- compo preference alignment via comparison oracles | arXiv: 2505.05465
- composing global solutions to reasoning tasks via algebraic objects in neural ne | arXiv: 2410.01779
- Composing Linear Layers from Irreducibles | arXiv: 2507.11688
- composite flow matching for reinforcement learning with shifted-dynamics data | arXiv: 2505.23062
- Composition and Alignment of Diffusion Models using Constrained Learning | arXiv: 2508.19104
- compress gather and recompute reforming long-context processing in transformers | arXiv: 2506.01215
- compressing biology evaluating the stable diffusion vae for phenotypic drug disc | arXiv: 2510.19887
- Computable Universal Online Learning | arXiv: 2510.18352
- computational hardness of reinforcement learning with partial qπ-realizability | arXiv: 2510.21888
- concept-level explainability for auditing steering llm responses | arXiv: 2505.07610
- conceptscope characterizing dataset bias via disentangled visual concepts | arXiv: 2510.26186
- Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations | arXiv: 2510.23607
- conditional distribution compression via the kernel conditional mean embedding | arXiv: 2504.10139
- Conditional Panoramic Image Generation via Masked Autoregressive Modeling | arXiv: 2505.16862
- conformal online learning of deep koopman linear embeddings | arXiv: 2511.12760
- conformal prediction for causal effects of continuous treatments | arXiv: 2407.03094
- conformal prediction in the loop a feedback-based uncertainty model for trajecto | arXiv: 2510.16376
- conformal risk training end-to-end optimization of conformal risk control | arXiv: 2510.08748
- confounding robust deep reinforcement learning a causal approach | arXiv: 2510.21110
- confrover simultaneous modeling of protein conformation and dynamics via autoreg | arXiv: 2505.17478
- conftuner training large language models to express their confidence verbally | arXiv: 2508.18847
- Connecting Jensen-Shannon and Kullback-Leibler Divergences: A New Bound for Representation Learning | arXiv: 2510.20644
- connecting the dots a machine learning ready dataset for ionospheric forecasting | arXiv: 2511.15743
- connectomebench can llms proofread the connectome | arXiv: 2511.05542
- consistent sampling and simulation molecular dynamics with energy-based diffusio | arXiv: 2506.17139
- consistent supervised-unsupervised alignment for generalized category discovery | arXiv: 2507.04725
- Constant Bit-Size Transformers Are Turing Complete | arXiv: 2506.12027
- Constrained Discrete Diffusion | arXiv: 2503.09790
- constrained network slice assignment via large language models | arXiv: 2512.00040
- context informs pragmatic interpretation in vision-language models | arXiv: 2511.03908
- contextagent context-aware proactive llm agents with open-world sensory percepti | arXiv: 2505.14668
- contexttab a semantics-aware tabular in-context learner | arXiv: 2506.10707
- contextual dynamic pricing with heterogeneous buyers | arXiv: 2512.09513
- contextual integrity in llms via reasoning and reinforcement learning | arXiv: 2506.04245
- contextual thompson sampling via generation of missing data | arXiv: 2502.07064
- continual knowledge adaptation for reinforcement learning | arXiv: 2510.19314
- Continual Multimodal Contrastive Learning | arXiv: 2503.14963
- Continuous Diffusion Model for Language Modeling | arXiv: 2502.11564
- continuous simplicial neural networks | arXiv: 2503.12919
- continuous subspace optimization for continual learning | arXiv: 2505.11816
- Continuous Thought Machines | arXiv: 2505.05522
- continuous uniqueness and novelty metrics for generative modeling of inorganic c | arXiv: 2510.12405
- contrastive consolidation of top-down modulations achieves sparsely supervised c | arXiv: 2505.14125
- Contrastive Representations for Temporal Reasoning
- contribution of task-irrelevant stimuli to drift of neural representations | arXiv: 2510.21588
- controlfusion a controllable image fusion framework with language-vision degrada | arXiv: 2503.23356
- Controlling Thinking Speed in Reasoning Models | arXiv: 2507.03704
- convergence theorems for entropy-regularized and distributional reinforcement le | arXiv: 2510.08526
- convis-bench estimating video similarity through semantic concepts | arXiv: 2509.19245
- convolutional monge mapping between eeg datasets to support independent componen | arXiv: 2509.01721
- coopera continual open-ended human-robot assistance | arXiv: 2510.23495
- Cooperative Retrieval-Augmented Generation for Question Answering: Mutual Information Exchange and Ranking by Contrasting Layers | arXiv: 2512.10422
- copresheaf topological neural networks a generalized deep learning framework | arXiv: 2505.21251
- CORAL: Disentangling Latent Representations in Long-Tailed Diffusion
- core benchmarking llms code reasoning capabilities through static analysis tasks | arXiv: 2507.05269
- core constraint-aware one-step reinforcement learning for simulation-guided neur | arXiv: 2506.03474
- core full-path evaluation of llm agents beyond final state | arXiv: 2509.20998
- coreguard safeguarding foundational capabilities of llms against model stealing | arXiv: 2410.13903
- coreset for robust geometric median eliminating size dependency on outliers | arXiv: 2510.24621
- coresets for clustering under stochastic noise | arXiv: 2510.23438
- correlation dimension of auto-regressive large language models | arXiv: 2510.21258
- COS3D: Collaborative Open-Vocabulary 3D Segmentation | arXiv: 2510.20238
- cosmobench a multiscale multiview multitask cosmology benchmark for geometric de | arXiv: 2507.03707
- cost efficient fairness audit under partial feedback | arXiv: 2510.03734
- cost-sensitive freeze-thaw bayesian optimization for efficient hyperparameter tu | arXiv: 2510.21379
- CoT Red-Handed: Stress Testing Chain-of-Thought Monitoring | arXiv: 2505.23575
- counteractive rl rethinking core principles for efficient and scalable deep rein | arXiv: 2603.15871
- counterfactual identifiability via dynamic optimal transport | arXiv: 2510.08294
- counterfactual reasoning for steerable pluralistic value alignment of large lang | arXiv: 2510.18526
- coupling generative modeling and an autoencoder with the causal bridge | arXiv: 2509.25599
- covariances for free exploiting mean distributions for training-free federated l | arXiv: 2412.14326
- CovMatch: Cross-Covariance Guided Multimodal Dataset Distillation with Trainable Text Encoder | arXiv: 2510.18583
- cpep contrastive pose-emg pre-training enhances gesture generalization on emg si | arXiv: 2509.04699
- cpret a dataset benchmark and model for retrieval in competitive programming | arXiv: 2505.12925
- CQ-DINO: Mitigating Gradient Dilution via Category Queries for Vast Vocabulary Object Detection | arXiv: 2503.18430
- Creativity or Brute Force? Using Brainteasers as a Window into the Problem-Solving Abilities of Large Language Models | arXiv: 2505.10844
- Critical Batch Size Revisited: A Simple Empirical Approach to Large-Batch Language Model Training | arXiv: 2505.23971
- cross-fluctuation phase transitions reveal sampling dynamics in diffusion models | arXiv: 2511.00124
- Crucible: Quantifying the Potential of Control Algorithms through LLM Agents | arXiv: 2510.18491
- cryptomoe privacy-preserving and scalable mixture of experts inference via balan | arXiv: 2511.01197
- ctrl-alt-deceit sabotage evaluations for automated ai rd | arXiv: 2511.09904
- cue3d quantifying the role of image cues in single-image 3d generation | arXiv: 2511.22121
- cultural alien sampler open-ended art generation balancing originality and coher | arXiv: 2510.20849
- cumolos-mae a masked autoencoder for remote sensing data reconstruction | arXiv: 2508.14957
- cureagent a training-free executor-analyst framework for clinical reasoning | arXiv: 2512.05576
- curiosity-driven rl for symbolic equation solving | arXiv: 2510.17022
- curly flow matching for learning non-gradient field dynamics | arXiv: 2510.26645
- Curriculum Abductive Learning | arXiv: 2505.12275
- curvature tuning provable training-free model steering from a single parameter | arXiv: 2502.07783
- cxreasonbench a benchmark for evaluating structured diagnostic reasoning in ches | arXiv: 2505.18087
- cycle-sync robust global camera pose estimation through enhanced cycle-consisten | arXiv: 2511.02329
- cyclic counterfactuals under shift-scale interventions | arXiv: 2510.25005
- cyin cyclic informative latent space for bridging complete and incomplete multim | arXiv: 2602.04920
- cymbadiff structured spatial diffusion for sketch-based 3d semantic urban scene | arXiv: 2510.13245
- d2ust3r enhancing 3d reconstruction for dynamic scenes | arXiv: 2504.06264
- DanmakuTPPBench: A Multi-modal Benchmark for Temporal Point Process Modeling and Understanding
- dartquant efficient rotational distribution calibration for llm quantization | arXiv: 2511.04063
- data efficient adaptation in large language models via continuous low-rank fine- | arXiv: 2509.18942
- data-juicer 20 cloud-scale adaptive data processing for and with foundation mode | arXiv: 2501.14755
- datarater meta-learned dataset curation | arXiv: 2505.17895
- dataset distillation for pre-trained self-supervised vision models | arXiv: 2511.16674
- DATE-LM: Benchmarking Data Attribution Evaluation for Large Language Models | arXiv: 2507.09424
- dbloss decomposition-based loss function for time series forecasting | arXiv: 2510.23672
- dc4gs directional consistency-driven adaptive density control for 3d gaussian sp | arXiv: 2510.26921
- dca graph-guided deep embedding clustering for brain atlases | arXiv: 2509.01426
- dcad-2000 a multilingual dataset across 2000 languages with data cleaning as ano | arXiv: 2502.11546
- dccluster-opt benchmarking dynamic multi-objective optimization for geo-distribu | arXiv: 2511.00117
- de novo generation of functional terpene synthases using tpsgpt | arXiv: 2512.08772
- Debate or Vote: Which Yields Better Decisions in Multi-Agent Large Language Models? | arXiv: 2508.17536
- decaflow a deconfounding causal generative model | arXiv: 2503.15114
- deceptron learned local inverses for fast and stable physics inversion | arXiv: 2511.21076
- Decoder-Hybrid-Decoder Architecture for Efficient Reasoning with Long Generation | arXiv: 2507.06607
- decomate leveraging generative models for co-creative svg animation | arXiv: 2511.06297
- decomposition of small transformer models | arXiv: 2511.08854
- Decoupled Entropy Minimization | arXiv: 2511.03256
- deep compositional phase diffusion for long motion sequence generation | arXiv: 2510.14427
- deep continuous-time state-space models for marked event sequences | arXiv: 2412.19634
- deep learning for continuous-time stochastic control with jumps | arXiv: 2505.15602
- Deep Legendre Transform | arXiv: 2512.19649
- deep modularity networks with diversity-preserving regularization | arXiv: 2501.13451
- Deep Research Brings Deeper Harm | arXiv: 2510.11851
- deep rl needs deep behavior analysis exploring implicit planning by model-free a | arXiv: 2506.06981
- deep taxonomic networks for unsupervised hierarchical prototype discovery | arXiv: 2509.23602
- deep value benchmark measuring whether models generalize deep values or shallow | arXiv: 2511.02109
- Deep Video Discovery: Agentic Search with Tool Use for Long-form Video Understanding | arXiv: 2505.18079
- deepasa an object-oriented multi-purpose network for auditory scene analysis | arXiv: 2509.17247
- deepdiver adaptive search intensity scaling via open-web reinforcement learning | arXiv: 2505.24332
- deeppersona a generative engine for scaling deep synthetic personas | arXiv: 2511.07338
- deeptraverse a depth-first search inspired network for algorithmic visual unders | arXiv: 2506.10084
- DeepVideo-R1: Video Reinforcement Fine-Tuning via Difficulty-aware Regressive GRPO | arXiv: 2506.07464
- defenderbench a toolkit for evaluating language agents in cybersecurity environm | arXiv: 2506.00739
- DEFT: Decompositional Efficient Fine-Tuning for Text-to-Image Models | arXiv: 2509.22793
- deliberation on priors trustworthy reasoning of large language models on knowled | arXiv: 2505.15210
- deltaflow an efficient multi-frame scene flow estimation method | arXiv: 2508.17054
- deltaphi physical states residual learning for neural operators in data-limited | arXiv: 2406.09795
- deltaproduct improving state-tracking in linear rnns via householder products | arXiv: 2502.10297
- delving into cascaded instability a lipschitz continuity view on image restorati | arXiv: 2510.24232
- demandcast global hourly electricity demand forecasting | arXiv: 2510.08000
- demo generative ai helps radiotherapy planning with user preference | arXiv: 2512.08996
- demo guide-rag evidence-driven corpus curation for retrieval-augmented generatio | arXiv: 2510.15782
- demystifying language model forgetting with low-rank example associations | arXiv: 2406.14026
- demystifying spectral feature learning for instrumental variable regression | arXiv: 2506.10899
- denoiserotator enhance pruning robustness for llms via importance concentration | arXiv: 2505.23049
- denoising weak lensing mass maps with diffusion model and generative adversarial | arXiv: 2511.16415
- dense associative memory with epanechnikov energy | arXiv: 2506.10801
- dense backpropagation improves training for sparse mixture-of-experts | arXiv: 2504.12463
- Dense SAE Latents Are Features, Not Bugs | arXiv: 2506.15679
- DenseDPO: Fine-Grained Temporal Preference Optimization for Video Diffusion Models | arXiv: 2506.03517
- dependency parsing is more parameter-efficient with normalization | arXiv: 2505.20215
- depth-bounds for neural networks via the braid arrangement | arXiv: 2502.09324
- depth-supervised fusion network for seamless-free image stitching | arXiv: 2510.21396
- dermacon-in a multi-concept annotated dermatological image dataset of indian ski | arXiv: 2506.06099
- design encrypted gnn inference via server-side input graph pruning | arXiv: 2507.05649
- designx human-competitive algorithm designer for black-box optimization | arXiv: 2505.17866
- detecting generated images by fitting natural image distributions | arXiv: 2511.01293
- Detecting High-Stakes Interactions with Activation Probes | arXiv: 2506.10805
- detection and simulation of urban heat islands using a fine-tuned geospatial fou | arXiv: 2510.18773
- detectiumfire a comprehensive multi-modal dataset bridging vision and language f | arXiv: 2511.02495
- deterministic continuous replacement fast and stable module replacement in pretr | arXiv: 2511.18670
- detree detecting human-ai collaborative texts via tree-structured hierarchical r | arXiv: 2510.17489
- devfd developmental face forgery detection by learning shared and orthogonal lor | arXiv: 2509.19230
- dexflywheel a scalable and self-improving data generation framework for dexterou | arXiv: 2509.23829
- dexter diffusion-guided explanations with textual reasoning for vision models | arXiv: 2510.14741
- dgh dynamic gaussian hair | arXiv: 2512.17094
- Diagnosing and Addressing Pitfalls in KG-RAG Datasets: Toward More Reliable Benchmarking | arXiv: 2505.23495
- dice discrete interpretable comparative evaluation with probabilistic scoring fo | arXiv: 2512.22629
- DiCo: Revitalizing ConvNets for Scalable and Efficient Diffusion Modeling | arXiv: 2505.11196
- dictpfl efficient and private federated learning on encrypted gradients | arXiv: 2510.21086
- diff-icmh harmonizing machine and human vision in image compression with generat | arXiv: 2511.22549
- differentiable hierarchical visual tokenization | arXiv: 2511.02652
- differentiable structure learning and causal discovery for general binary data | arXiv: 2509.21658
- differential privacy for euclidean jordan algebra with applications to private s | arXiv: 2509.16915
- differentially private bilevel optimization efficient algorithms with near-optim | arXiv: 2506.12994
- differentially private federated low rank adaptation beyond fixed-matrix | arXiv: 2507.09990
- differentially private high-dimensional variable selection via integer programmi | arXiv: 2510.22062
- diffeye diffusion-based continuous eye-tracking data generation conditioned on n | arXiv: 2509.16767
- Diffusion Adaptive Text Embedding for Text-to-Image Diffusion Models | arXiv: 2510.23974
- Diffusion Classifiers Understand Compositionality, but Conditions Apply | arXiv: 2505.17955
- diffusion generative modeling on lie group representations | arXiv: 2502.02513
- Diffusion Model as a Noise-Aware Latent Reward Model for Step-Level Preference Optimization | arXiv: 2502.01051
- Diffusion Models Meet Contextual Bandits | arXiv: 2402.10028
- diffusion transformers as open-world spatiotemporal foundation models | arXiv: 2411.12164
- diffusion transformers for imputation statistical efficiency and uncertainty qua | arXiv: 2510.02216
- diffusion-based electromagnetic inverse design of scattering structured media | arXiv: 2511.05357
- diffusion-classifier synergy reward-aligned learning via mutual boosting loop fo | arXiv: 2510.03608
- diffusion-driven progressive target manipulation for source-free domain adaptati | arXiv: 2510.25279
- diffusion-driven two-stage active learning for low-budget semantic segmentation | arXiv: 2510.22229
- DINO-Foresight: Looking into the Future with DINO | arXiv: 2412.11673
- directional non-commutative monoidal structures for compositional embeddings in | arXiv: 2505.15507
- disaggregation reveals hidden training dynamics the case of agreement attraction | arXiv: 2510.24934
- DISC: Dynamic Decomposition Improves LLM Inference Scaling | arXiv: 2502.16706
- DisCO: Reinforcing Large Reasoning Models with Discriminative Constrained Optimization | arXiv: 2505.12366
- discover automated curricula for sparse-reward reinforcement learning | arXiv: 2505.19850
- discovering transformer circuits via a hybrid attribution and pruning framework | arXiv: 2510.03282
- disentangled concepts speak louder than words explainable video action recogniti | arXiv: 2511.03725
- disentangling hyperedges through the lens of category theory | arXiv: 2510.16289
- disentangling latent shifts of in-context learning with weak supervision | arXiv: 2410.01508
- DisMo: Disentangled Motion Representations for Open-World Motion Transfer | arXiv: 2511.23428
- dison decentralized isolation networks for out-of-distribution detection in medi | arXiv: 2506.09024
- Distillation Robustifies Unlearning | arXiv: 2506.06278
- Distilled Decoding 2: One-step Sampling of Image Auto-regressive Models with Conditional Score Distillation
- Distilling LLM Agent into Small Models with Retrieval and Code Tools | arXiv: 2505.17612
- distribution learning meets graph structure sampling | arXiv: 2405.07914
- distributional adversarial attacks and training in deep hedging | arXiv: 2508.14757
- Distributional Autoencoders Know the Score | arXiv: 2502.11583
- Distributionally Robust Feature Selection | arXiv: 2510.21113
- distributive fairness in large language models evaluating alignment with human v | arXiv: 2502.00313
- ditch the denoiser emergence of noise robustness in self-supervised learning fro | arXiv: 2505.12191
- DitHub: A Modular Framework for Incremental Open-Vocabulary Object Detection | arXiv: 2503.09271
- dna-detectllm unveiling ai-generated text via a dna-inspired mutation-repair par | arXiv: 2509.15550
- Do Different Prompting Methods Yield a Common Task Representation in Language Models? | arXiv: 2505.12075
- Do Language Models Use Their Depth Efficiently? | arXiv: 2505.13898
- do neural networks need gradient descent to generalize a theoretical study | arXiv: 2506.03931
- do-pfn in-context learning for causal effect estimation | arXiv: 2506.06039
- doctor approved generating medically accurate skin disease images through ai-exp | arXiv: 2506.12323
- Document Summarization with Conformal Importance Guarantees | arXiv: 2509.20461
- Does Object Binding Naturally Emerge in Large Pretrained Vision Transformers? | arXiv: 2510.24709
- Does Thinking More Always Help? Mirage of Test-Time Scaling in Reasoning Models | arXiv: 2506.04210
- domain-adapted granger causality for real-time cross-slice attack attribution in | arXiv: 2510.05165
- domain-adaptive transformer for data-efficient glioma segmentation in sub-sahara | arXiv: 2511.02928
- Don't Be Lazy: CompleteP Enables Compute-Efficient Deep Transformers | arXiv: 2505.01618
- Don't Let It Fade: Preserving Edits in Diffusion Language Models via Token Timestep Allocation
- dont just chase highlighted tokens in mllms revisiting visual holistic context r | arXiv: 2510.02912
- DOTA: Distributional Test-Time Adaptation of Vision-Language Models | arXiv: 2409.19375
- double descent meets out-of-distribution detection theoretical insights and empi | arXiv: 2411.02184
- doubly robust alignment for large language models | arXiv: 2506.01183
- dove efficient one-step diffusion model for real-world video super-resolution | arXiv: 2505.16239
- dp-llm runtime model adaptation with dynamic layer-wise precision assignment | arXiv: 2508.06041
- dp2o-sr direct perceptual preference optimization for real-world image super-res | arXiv: 2510.18851
- dpa a one-stop metric to measure bias amplification in classification datasets | arXiv: 2412.11060
- dragon guard llm unlearning in context via negative detection and reasoning | arXiv: 2511.05784
- dreamprm domain-reweighted process reward model for multimodal reasoning | arXiv: 2505.20241
- DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agents | arXiv: 2506.12104
- DriveDPO: Policy Learning via Safety DPO For End-to-End Autonomous Driving | arXiv: 2509.17940
- dsas a universal plug-and-play framework for attention optimization in multi-doc | arXiv: 2510.12251
- dual data alignment makes ai-generated image detector easier generalizable | arXiv: 2505.14359
- dual mixture-of-experts framework for discrete-time survival analysis | arXiv: 2510.26014
- dual-flow transferable multi-target instance-agnostic attacks via in-the-wild ca | arXiv: 2502.02096
- dualfocus depth from focus with spatio-focal dual variational constraints | arXiv: 2509.21992
- duetgraph coarse-to-fine knowledge graph reasoning with dual-pathway global-loca | arXiv: 2507.11229
- duogpt training-free dual sparsity through activation-aware pruning in llms | arXiv: 2506.20194
- duolens a framework for robust detection of machine-generated multilingual text | arXiv: 2510.18904
- dyg-mamba continuous state space modeling on dynamic graphs | arXiv: 2408.06966
- dynaact large language model reasoning with dynamic action spaces | arXiv: 2511.08043
- dynaguide steering diffusion polices with active dynamic guidance | arXiv: 2506.13922
- dynamic algorithm for explainable k-medians clustering under lp norm | arXiv: 2512.01150
- dynamic bundling with large language models for zero-shot inference on text-attr | arXiv: 2505.17599
- dynamic causal discovery in alzheimers disease through latent pseudotime modelli | arXiv: 2511.04619
- Dynamic Diffusion Schrödinger Bridge in Astrophysical Observational Inversions | arXiv: 2506.08065
- dynamic features adaptation in networking toward flexible training and explainab | arXiv: 2510.08303
- dynamic gaussian splatting from defocused and motion-blurred monocular videos | arXiv: 2510.10691
- Dynamic Regret Reduces to Kernelized Static Regret | arXiv: 2507.05478
- dynamics of spontaneous topic changes in next token prediction with self-attenti | arXiv: 2501.06382
- dynamics-aligned latent imagination in contextual world models for zero-shot gen | arXiv: 2508.20294
- dynamicvl benchmarking multimodal large language models for dynamic city underst | arXiv: 2505.21076
- dynanav dynamic feature and layer selection for efficient visual navigation | arXiv: 2509.21930
- dynarend learning 3d dynamics via masked future rendering for robotic manipulati | arXiv: 2510.24261
- e-bats efficient backpropagation-free test-time adaptation for speech foundation | arXiv: 2506.07078
- e-moflow learning egomotion and optical flow from event data via implicit regula | arXiv: 2510.12753
- e2e-vguard adversarial prevention for production llm-based end-to-end speech syn | arXiv: 2511.07099
- ea3d online open-world 3d object extraction from streaming videos | arXiv: 2510.25146
- eag3r event-augmented 3d geometry estimation for dynamic and extreme-lighting sc | arXiv: 2512.00771
- echoes of humanity exploring the perceived humanness of ai music | arXiv: 2509.25601
- ecocast a spatio-temporal model for continual biodiversity and climate risk fore | arXiv: 2512.02260
- edbench large-scale electron density data for molecular modeling | arXiv: 2505.09262
- eddyformer accelerated neural simulations of three-dimensional turbulence at sca | arXiv: 2510.24173
- edit less achieve more dynamic sparse neuron masking for lifelong knowledge edit | arXiv: 2510.22139
- editinfinity image editing with binary-quantized generative models | arXiv: 2510.20217
- eegrexfernet a lightweight gen-ai framework for eeg subspace reconstruction via | arXiv: 2511.02848
- ef-3dgs event-aided free-trajectory 3d gaussian splatting | arXiv: 2410.15392
- effective policy learning for multi-agent online coordination beyond submodular | arXiv: 2509.22596
- efficient adaptive experimentation with noncompliance | arXiv: 2505.17468
- Efficient Adaptive Federated Optimization | arXiv: 2410.18117
- efficient fairness-performance pareto front computation | arXiv: 2409.17643
- efficient federated learning against byzantine attacks and data heterogeneity vi | arXiv: 2408.09539
- efficient kernelized learning in polyhedral games beyond full-information from c | arXiv: 2509.20919
- efficient multi-modal large language models via progressive consistency distilla | arXiv: 2510.00515
- efficient parametric svd of koopman operator for stochastic dynamical systems | arXiv: 2507.07222
- efficient pre-training of llms via topology-aware communication alignment on mor | arXiv: 2509.15940
- Efficient Rectified Flow for Image Fusion | arXiv: 2509.16549
- efficient semantic uncertainty quantification in language models via diversity-s | arXiv: 2510.21310
- efficient speech language modeling via energy distance in continuous latent spac | arXiv: 2505.13181
- efficient training-free online routing for high-volume multi-llm serving | arXiv: 2509.02718
- efficient verified machine unlearning for distillation | arXiv: 2503.22539
- efficient vision-language reasoning via adaptive token pruning | arXiv: 2512.12701
- EfficientNav: Towards On-Device Object-Goal Navigation with Navigation Map Caching and Retrieval | arXiv: 2510.18546
- egobridge domain adaptation for generalizable imitation from egocentric human da | arXiv: 2509.19626
- egoemotion egocentric vision and physiological signals for emotion and personali | arXiv: 2510.22129
- EgoThinker: Unveiling Egocentric Reasoning with Spatio-Temporal CoT
- elastic vits from pretrained models without retraining | arXiv: 2510.17700
- elastic weight consolidation for knowledge graph continual learning an empirical | arXiv: 2512.01890
- elasticmm efficient multimodal llms serving with elastic multimodal parallelism | arXiv: 2507.10069
- electra a cartesian network for 3d charge density prediction with floating orbit | arXiv: 2503.08305
- elucidated rolling diffusion models for probabilistic forecasting of complex dyn | arXiv: 2506.20024
- embedding alignment in code generation for audio | arXiv: 2508.05473
- Emergence and Evolution of Interpretable Concepts in Diffusion Models
- emergence and scaling laws in sgd learning of shallow neural networks | arXiv: 2504.19983
- emergence of linear truth encodings in language models | arXiv: 2510.15804
- emergency response measures for catastrophic ai risk | arXiv: 2511.05526
- emergent world beliefs exploring transformers in stochastic games | arXiv: 2512.23722
- emloc emulator-based memory-efficient fine-tuning with lora correction | arXiv: 2506.12015
- empathia multi-faceted human-ai collaboration for refugee integration | arXiv: 2508.07671
- empirical study on robustness and resilience in cooperative multi-agent reinforc | arXiv: 2510.11824
- Empower Words: DualGround for Structured Phrase and Sentence-Level Temporal Grounding
- empowering decision trees via shape function branching | arXiv: 2510.19040
- enabling differentially private federated learning for speech recognition benchm | arXiv: 2310.00098
- encoder-decoder diffusion language models for efficient training and inference | arXiv: 2510.22852
- encoding and understanding astrophysical information in large language model-gen | arXiv: 2511.14685
- EnCompass: Enhancing Agent Programming with Search Over Program Execution Paths | arXiv: 2512.03571
- endobench a comprehensive evaluation of multi-modal large language models for en | arXiv: 2505.23601
- Energy Loss Functions for Physical Systems | arXiv: 2511.02087
- energy matching unifying flow matching and energy-based models for generative mo | arXiv: 2504.10612
- enerverse envisioning embodied future space for robotics manipulation | arXiv: 2501.01895
- enforcing governing equation constraints in neural pde solvers via training-free | arXiv: 2511.17258
- enginuity building an open multi-domain dataset of complex engineering diagrams | arXiv: 2601.13299
- Enhancing CLIP Robustness via Cross-Modality Alignment | arXiv: 2510.24038
- Enhancing Compositional Reasoning in CLIP via Reconstruction and Alignment of Text Descriptions | arXiv: 2510.16540
- enhancing demand-oriented regionalization with agentic ai and local heterogeneou | arXiv: 2511.10857
- enhancing diffusion model guidance through calibration and regularization | arXiv: 2511.05844
- enhancing graph classification robustness with singular pooling | arXiv: 2510.22643
- Enhancing Infrared Vision: Progressive Prompt Fusion Network and Benchmark | arXiv: 2510.09343
- enhancing interpretability in deep reinforcement learning through semantic clust | arXiv: 2409.17411
- enhancing multilingual llm pretraining with model-based data selection | arXiv: 2502.10361
- enhancing sample selection against label noise by cutting mislabeled easy exampl | arXiv: 2502.08227
- enhancing semi-supervised learning with zero-shot pseudolabels | arXiv: 2502.12584
- Enhancing Temporal Understanding in Video-LLMs through Stacked Temporal Attention in Vision Encoders
- enhancing the outcome reward-based rl training of mllms with self-consistency sa | arXiv: 2511.10648
- enhancing training data attribution with representational optimization | arXiv: 2505.18513
- Enhancing Vision-Language Model Reliability with Uncertainty-Guided Dropout Decoding | arXiv: 2412.06474
- entropy rectifying guidance for diffusion and flow models | arXiv: 2504.13987
- environment inference for learning generalizable dynamical system | arXiv: 2510.19784
- epistemic uncertainty for generated image detection | arXiv: 2412.05897
- equivariance by contrast identifiable equivariant embeddings from unlabeled fini | arXiv: 2510.21706
- equivariant flow matching for symmetry-breaking bifurcation problems | arXiv: 2509.03340
- esca contextualizing embodied agents via scene-graph generation | arXiv: 2510.15963
- escaping saddle points without lipschitz smoothness the power of nonlinear preco | arXiv: 2509.15817
- establishing linear surrogate regret bounds for convex smooth losses via convolu | arXiv: 2505.09432
- estimating hitting times locally at scale | arXiv: 2511.04343
- estimation of stochastic optimal transport maps | arXiv: 2512.09499
- ethics statements in ai music papers the effective and the ineffective | arXiv: 2509.25496
- eu-agent-bench measuring illegal behavior of llm agents under eu law | arXiv: 2510.21524
- eugens efficient unified and general dense layers | arXiv: 2410.09771
- eurospeech a multilingual speech corpus | arXiv: 2510.00514
- evalearn quantifying the learning capability and efficiency of llms via sequenti | arXiv: 2506.02672
- evaluating in silico creativity an expert review of ai chess compositions | arXiv: 2510.23772
- evaluating llms for combinatorial optimization one-phase and two-phase heuristic | arXiv: 2509.22255
- Evaluating LLMs in Open-Source Games | arXiv: 2512.00371
- evaluating multimodal large language models on core music perception tasks | arXiv: 2510.22455
- evaluating multiple models using labeled and unlabeled data | arXiv: 2501.11866
- evaluating the evaluators metrics for compositional text-to-image generation | arXiv: 2509.21227
- evaluating the promise and pitfalls of llms in hiring decisions | arXiv: 2507.02087
- evaluation of vision-llms in surveillance video | arXiv: 2510.23190
- every camera effect every time all at once 4d gaussian ray tracing for physics-b | arXiv: 2509.10759
- evobrain dynamic multi-channel eeg graph modeling for time-evolving brain networ | arXiv: 2509.15857
- evodiff entropy-aware variance optimized diffusion inference | arXiv: 2509.26096
- evolm in search of lost language model training dynamics | arXiv: 2506.16029
- evolutionary learning in spatial agent-based models for physical climate risk as | arXiv: 2509.18633
- evolutionary prediction games | arXiv: 2503.03401
- evolve to inspire novelty search for diverse image generation | arXiv: 2511.00686
- evorefuse evolutionary prompt optimization for evaluation and mitigation of llm | arXiv: 2505.23473
- ewc-guided diffusion replay for exemplar-free continual learning in medical imag | arXiv: 2509.23906
- exact and linear convergence for federated learning under arbitrary client parti | arXiv: 2503.20117
- exact expressive power of transformers with padding | arXiv: 2505.18948
- exact learning of arithmetic with differentiable agents | arXiv: 2511.22751
- exgra-med extended context graph alignment for medical vision-language models | arXiv: 2410.02615
- exoplanet formation inference using conditional invertible neural networks | arXiv: 2512.05751
- explaining and mitigating crosslingual tokenizer inequities | arXiv: 2510.21909
- explaining similarity in vision-language encoders with weighted banzhaf interact | arXiv: 2508.05430
- exploiting task relationships in continual learning via transferability-aware ta | arXiv: 2502.11609
- exploiting vocabulary frequency imbalance in language model pre-training | arXiv: 2508.15390
- exploration of incremental synthetic non-morphed images for single morphing atta | arXiv: 2510.09836
- exploration via feature perturbation in contextual bandits | arXiv: 2510.17390
- exploration with foundation models capabilities limitations and hybrid approache | arXiv: 2509.19924
- exploring and leveraging class vectors for classifier editing | arXiv: 2510.11268
- exploring landscapes for better minima along valleys | arXiv: 2510.27153
- exploring neural granger causality with xlstms unveiling temporal dependencies i | arXiv: 2502.09981
- exploring semantic-constrained adversarial example with instruction uncertainty | arXiv: 2510.22981
- exploring structural degradation in dense representations for self-supervised le | arXiv: 2510.17299
- exploring the limits of strong membership inference attacks on large language mo | arXiv: 2505.18773
- exploring the translation mechanism of large language models | arXiv: 2502.11806
- exploring variational graph autoencoders for distribution grid data generation | arXiv: 2509.02469
- expo unlocking hard reasoning with self-explanation-guided reinforcement learnin | arXiv: 2507.02834
- extending ngu to multi-agent rl a preliminary study | arXiv: 2512.01321
- extragradient method for (l0, l1)-lipschitz root-finding problems | arXiv: 2510.22421
- extremely simple multimodal outlier synthesis for out-of-distribution detection | arXiv: 2505.16985
- Eyes Wide Open: Ego Proactive Video-LLM for Streaming Video | arXiv: 2510.14560
- f-adapter frequency-adaptive parameter-efficient fine-tuning in scientific machi | arXiv: 2509.23173
- face a general framework for mapping collaborative filtering embeddings into llm | arXiv: 2510.15729
- face faithful automatic concept extraction | arXiv: 2510.11675
- face-human-bench a comprehensive benchmark of face and human understanding for m | arXiv: 2501.01243
- fact faithful concept traces for explaining neural network decisions | arXiv: 2510.25512
- factor decorrelation enhanced data removal from deep predictive models | arXiv: 2509.23443
- failure prediction at runtime for generative robot policies | arXiv: 2510.09459
- fair minimum labeling efficient temporal network activations for reachability an | arXiv: 2510.03899
- fair representation learning with controllable high confidence guarantees via ad | arXiv: 2510.21017
- fair universe higgsml uncertainty dataset and competition | arXiv: 2410.02867
- faircontrast enhancing fairness through contrastive learning and customized augm | arXiv: 2510.02017
- fairgrpo fair reinforcement learning for equitable clinical reasoning | arXiv: 2510.19893
- fairimagen post-processing for bias mitigation in text-to-image models | arXiv: 2510.21363
- fairness under competition | arXiv: 2505.16291
- fairness-regularized online optimization with switching costs | arXiv: 2512.11131
- faithful group shapley value | arXiv: 2505.19013
- faithful summarization of consumer health queries a cross-lingual framework with | arXiv: 2511.10768
- falcon an ml framework for fully automated layout-constrained analog circuit des | arXiv: 2505.21923
- falcon few-step accurate likelihoods for continuous flows | arXiv: 2512.09914
- falcon fine-grained activation manipulation by contrastive orthogonal unalignmen | arXiv: 2502.01472
- falqon accelerating lora fine-tuning with low-bit floating-point arithmetic | arXiv: 2510.24061
- fantastic features and where to find them a probing method to combine features f | arXiv: 2512.01405
- fapex fractional amplitude-phase expressor for robust cross-subject seizure pred | arXiv: 2511.03263
- far from the shallow brain-predictive reasoning embedding through residual disen | arXiv: 2510.22860
- fast and fluent diffusion language models via convolutional decoding and rejecti | arXiv: 2509.15188
- fast data attribution for text-to-image models | arXiv: 2511.10721
- fast foreground-aware diffusion with accelerated sampling trajectory for segment | arXiv: 2509.20295
- fast solvers for discrete diffusion models theory and applications of high-order | arXiv: 2502.00234
- fastdinov2 frequency based curriculum learning improves robustness and training | arXiv: 2507.03779
- faster algorithm for structured john ellipsoid computation | arXiv: 2211.14407
- fastjam a fast joint alignment model for images | arXiv: 2510.22842
- fastlongspeech enhancing large speech-language models for efficient long-speech | arXiv: 2507.14815
- FastVID: Dynamic Density Pruning for Fast Video Large Language Models
- feat free energy estimators with adaptive transport | arXiv: 2504.11516
- feature-aware modulation for learning from temporal tabular data | arXiv: 2512.03678
- fedfact a provable framework for controllable group-fairness calibration in fede | arXiv: 2506.03777
- fedqs optimizing gradient and model aggregation for semi-asynchronous federated | arXiv: 2510.07664
- fedrain-lite federated reinforcement algorithms for improving idealised numerica | arXiv: 2508.14315
- fedrts federated robust pruning via combinatorial thompson sampling | arXiv: 2501.19122
- fedrw efficient privacy-preserving data reweighting for enhancing federated lear | arXiv: 2511.07505
- fedsvd adaptive orthogonalization for private federated learning with lora | arXiv: 2505.12805
- feel-good thompson sampling for contextual bandits a markov chain monte carlo sh | arXiv: 2507.15290
- ferretnet efficient synthetic image detection via local pixel dependencies | arXiv: 2509.20890
- few-shot knowledge distillation of llms with counterfactual explanations | arXiv: 2510.21631
- few-shot learning from gigapixel images via hierarchical vision-language alignme | arXiv: 2505.17982
- fgbench a dataset and benchmark for molecular property reasoning at functional g | arXiv: 2508.01055
- fin3r fine-tuning feed-forward 3d reconstruction models via monocular knowledge | arXiv: 2511.22429
- final-model-only data attribution with a unifying view of gradient-based methods | arXiv: 2412.03906
- financial instruction following evaluation fife | arXiv: 2512.08965
- find your needle small object image retrieval via multi-object attention optimiz | arXiv: 2503.07038
- finding structure in continual learning | arXiv: 2602.04555
- finegrain evaluating failure modes of text-to-image models with vision language | arXiv: 2512.02161
- FineRS: Fine-grained Reasoning and Segmentation of Small Objects with Reinforcement Learning | arXiv: 2510.21311
- finite-sample analysis of policy evaluation for robust average reward reinforcem | arXiv: 2502.16816
- finite-time analysis of stochastic nonconvex nonsmooth optimization on the riema | arXiv: 2510.21468
- fiper factorized features for robust image super-resolution and compression | arXiv: 2410.18083
- fira can we achieve full-rank training of llms under low-rank constraint | arXiv: 2410.01623
- firegnn neuro-symbolic graph neural networks with trainable fuzzy rules for inte | arXiv: 2509.10510
- First SFT, Second RL, Third UPT: Continual Improving Multi-Modal LLM Reasoning via Unsupervised Post-Training | arXiv: 2505.22453
- firstaidqa a synthetic dataset for first aid and emergency response in low-conne | arXiv: 2511.01289
- fixed-point rnns interpolating from diagonal to dense | arXiv: 2503.10799
- flarex a physics-informed dataset for lens flare removal via 2d synthesis and 3d | arXiv: 2510.09995
- flashmd long-stride universal prediction of molecular dynamics | arXiv: 2505.19350
- flatness is necessary neural collapse is not rethinking generalization via grokk | arXiv: 2509.17738
- flatten graphs as sequences transformers are scalable graph generators | arXiv: 2502.02216
- flattening hierarchies with policy bootstrapping | arXiv: 2505.14975
- flex-judge text-only reasoning unleashes zero-shot multimodal evaluators | arXiv: 2505.18601
- flexac towards flexible control of associative reasoning in multimodal large lan | arXiv: 2510.11190
- flexevent towards flexible event-frame object detection at varying operational f | arXiv: 2412.06708
- flow density control generative optimization beyond entropy-regularized fine-tun | arXiv: 2511.22640
- flow matching neural processes | arXiv: 2512.23853
- flow matching-based autonomous driving planning with advanced interactive behavi | arXiv: 2510.11083
- FlowCut: Rethinking Redundancy via Information Flow for Efficient Vision-Language Models | arXiv: 2505.19536
- FlowMoE: A Scalable Pipeline Scheduling Framework for Distributed MoE Training | arXiv: 2510.00207
- flux efficient descriptor-driven clustered federated learning under arbitrary di | arXiv: 2511.22305
- flux4d flow-based unsupervised 4d reconstruction | arXiv: 2512.03210
- flylora boosting task decoupling and parameter efficiency via implicit rank-wise | arXiv: 2510.08396
- flysearch exploring how vision-language models explore | arXiv: 2506.02896
- focalcodec low-bitrate speech coding via focal modulation networks | arXiv: 2502.04465
- focus internal mllm representations for efficient fine-grained visual question a | arXiv: 2506.21710
- force prompting video generation models can learn and generalize physics-based c | arXiv: 2505.19386
- forcevla enhancing vla models with a force-aware moe for contact-rich manipulati | arXiv: 2505.22159
- forecasting in offline reinforcement learning for non-stationary environments | arXiv: 2512.01987
- forensichub a unified benchmark codebase for all-domain fake image detection and | arXiv: 2505.11003
- Foresight: Adaptive Layer Reuse for Accelerated and High-Quality Text-to-Video Generation
- fostering the ecosystem of ai for social impact requires expanding and strengthe | arXiv: 2510.18238
- foundation cures personalization improving personalized models prompt consistenc | arXiv: 2411.15277
- foundation models as world models a foundational study in text-based gridworlds | arXiv: 2509.15915
- foundation models for scientific discovery from paradigm enhancement to paradigm | arXiv: 2510.15280
- foxes a framework for operational x-ray emission synthesis | arXiv: 2510.22801
- fractalbench diagnosing visual-mathematical reasoning through recursive program | arXiv: 2511.06522
- fractional diffusion bridge models | arXiv: 2511.01795
- freqpolicy efficient flow-based visuomotor policy via frequency consistency | arXiv: 2506.08822
- frequency matters when time series foundation models fail under spectral shift | arXiv: 2511.05619
- frequency-aware token reduction for efficient vision transformer | arXiv: 2511.21477
- friren beyond trajectories -- a spectral lens on time | arXiv: 2505.17370
- from average-iterate to last-iterate convergence in games a reduction and its ap | arXiv: 2506.03464
- from black box to biomarker sparse autoencoders for interpreting speech models o | arXiv: 2507.16836
- from black hole to galaxy neural operator framework for accretion and feedback d | arXiv: 2512.01576
- from black-box to causal-box towards building more interpretable models | arXiv: 2510.21998
- from cradle to cane a two-pass framework for high-fidelity lifespan face aging | arXiv: 2506.20977
- from flat to hierarchical extracting sparse representations with matching pursui | arXiv: 2506.03093
- from generation to attribution music ai agent architectures for the post-streami | arXiv: 2510.20276
- from images to physics probabilistic inference of galaxy parameters and emission | arXiv: 2511.12737
- from information to generative exponent learning rate induces phase transitions | arXiv: 2510.21020
- from judgment to interference early stopping llm harmful outputs via streaming c | arXiv: 2506.09996
- from linear to nonlinear provable weak-to-strong generalization through feature | arXiv: 2510.24812
- from objects to anywhere a holistic benchmark for multi-level visual grounding i | arXiv: 2506.04897
- from pixels to views learning angular-aware and physics-consistent representatio | arXiv: 2510.22577
- from programs to poses factored real-world scene generation via learned program | arXiv: 2510.10292
- from sequence to structure uncovering substructure reasoning in transformers | arXiv: 2507.10435
- from shortcut to induction head how data diversity shapes algorithm selection in | arXiv: 2512.18634
- from simulations to surveys domain adaptation for galaxy observations | arXiv: 2511.18590
- fsnet feasibility-seeking neural network for constrained optimization with guara | arXiv: 2506.00362
- fully dynamic algorithms for chamfer distance | arXiv: 2512.16639
- functional scaling laws in kernel regression loss dynamics and learning rate sch | arXiv: 2509.19189
- future-aware end-to-end driving bidirectional modeling of trajectory planning an | arXiv: 2510.11092
- FutureSightDrive: Thinking Visually with Spatio-Temporal CoT for Autonomous Driving
- g-dpo scalable preference optimization for protein language models | arXiv: 2510.19474
- galactification painting galaxies onto dark matter only simulations using a tran | arXiv: 2511.08438
- gasp efficient black-box generation of adversarial suffixes for jailbreaking llm | arXiv: 2411.14133
- gated integration of low-rank adaptation for continual learning of large languag | arXiv: 2505.15424
- gaudp reinventing multi-agent collaboration through gaussian-image synergy in di | arXiv: 2511.00998
- gaussian process upper confidence bound achieves nearly-optimal regret in noise- | arXiv: 2502.19006
- gaussian-augmented physics simulation and system identification with complex col | arXiv: 2511.06846
- gaze beyond the frame forecasting egocentric 3d visual span | arXiv: 2511.18470
- gc4nc a benchmark framework for graph condensation on node classification with n | arXiv: 2406.16715
- gem empowering mllm for grounded ecg understanding with time series and images | arXiv: 2503.06073
- Gemstones: A Model Suite for Multi-Faceted Scaling Laws | arXiv: 2502.06857
- geneman generalizable single-image 3d human reconstruction from multi-source hum | arXiv: 2411.18624
- Generalizable Domain Adaptation for Sim-and-Real Policy Co-Training | arXiv: 2509.18631
- generalizable insights for graph transformers in theory and practice | arXiv: 2511.08028
- generalizable real-time neural decoding with hybrid state-space models | arXiv: 2506.05320
- generalization bounds for rank-sparse neural networks | arXiv: 2510.21945
- Generalization Error Analysis for Selective State-Space Models Through the Lens of Attention | arXiv: 2502.01473
- generalization or hallucination understanding out-of-context reasoning in transf | arXiv: 2506.10887
- Generalized Contrastive Learning for Universal Multimodal Retrieval | arXiv: 2509.25638
- generalized linear bandits almost optimal regret with one-pass update | arXiv: 2507.11847
- generalized linear mode connectivity for transformers | arXiv: 2506.22712
- generalizing verifiable instruction following | arXiv: 2507.02833
- generalizing while preserving monotonicity in comparison-based preference learni | arXiv: 2506.08616
- Generate, but Verify: Reducing Hallucination in Vision-Language Models with Retrospective Resampling | arXiv: 2504.13169
- generating multi-table time series ehr from latent space with minimal preprocess | arXiv: 2507.06996
- generating physically sound designs from text and a set of physical constraints | arXiv: 2602.02213
- generative ai agents for controllable and protected content creation | arXiv: 2601.12348
- generative distribution embeddings lifting autoencoders to the space of distribu | arXiv: 2505.18150
- generative graph pattern machine | arXiv: 2505.16130
- generative model inversion through the lens of the manifold hypothesis | arXiv: 2509.20177
- generative modeling of full-atom protein conformations using latent diffusion on | arXiv: 2506.17064
- genir generative visual feedback for mental image retrieval | arXiv: 2506.06220
- geo-sign hyperbolic contrastive regularisation for geometrically aware sign lang | arXiv: 2506.00129
- geocad local geometry-controllable cad generation with large language models | arXiv: 2506.10337
- geocomplete geometry-aware diffusion for reference-driven image completion | arXiv: 2510.03110
- geodynamics a geometric state-space neural network for understanding brain dynam | arXiv: 2601.13570
- geolink empowering remote sensing foundation model with openstreetmap data | arXiv: 2509.26016
- geometric data valuation via leverage scores | arXiv: 2511.02100
- geometric imbalance in semi-supervised node classification | arXiv: 2303.10371
- geometric priors for generalizable world models via vector symbolic architecture | arXiv: 2602.21467
- geometry of decision making in language models | arXiv: 2511.20315
- georanker distance-aware ranking for worldwide image geolocalization | arXiv: 2505.13731
- georemover removing objects and their causal visual artifacts | arXiv: 2509.18538
- geosvr taming sparse voxels for geometrically accurate surface reconstruction | arXiv: 2509.18090
- gflownets for learning better drug-drug interaction representations | arXiv: 2508.06576
- gfm-rag graph foundation model for retrieval augmented generation | arXiv: 2502.01113
- global convergence for average reward constrained mdps with primal-dual actor cr | arXiv: 2505.15138
- global minimizers of ℓp-regularized objectives yield the sparsest relu neural | arXiv: 2505.21791
- global minimizers of sigmoid contrastive loss | arXiv: 2509.18552
- GLSim: Detecting Object Hallucinations in LVLMs via Global-Local Similarity | arXiv: 2508.19972
- gnnxemplar exemplars to explanations -- natural language rules for global gnn in | arXiv: 2509.18376
- goalladder incremental goal discovery with vision-language models | arXiv: 2506.16396
- goatex geometry occlusion-aware texturing | arXiv: 2511.23051
- gora gradient-driven adaptive low rank adaptation | arXiv: 2502.12171
- GPO: Learning from Critical Steps to Improve LLM Reasoning | arXiv: 2509.16456
- gradient descent as loss landscape navigation a normative framework for deriving | arXiv: 2510.26997
- gradient variance reveals failure modes in flow-based generative models | arXiv: 2510.18118
- gradient-variation online adaptivity for accelerated optimization with hölder sm | arXiv: 2511.02276
- gradient-weight alignment as a train-time proxy for generalization in classifica | arXiv: 2510.25480
- gralora granular low-rank adaptation for parameter-efficient fine-tuning | arXiv: 2505.20355
- graph alignment via birkhoff relaxation | arXiv: 2503.05323
- graph diffusion that can insert and delete | arXiv: 2506.15725
- graph distance as surprise free energy minimization in knowledge graph reasoning | arXiv: 2512.01878
- graph neural networks for efficient ac power flow prediction in power grids | arXiv: 2502.05702
- graph neural networks for interferometer simulations | arXiv: 2512.16051
- graph persistence goes spectral | arXiv: 2506.06571
- graph your own prompt | arXiv: 2509.23373
- graph-based neural space weather forecasting | arXiv: 2509.19605
- graphchain large language models for large-scale graph analysis via tool chainin | arXiv: 2511.00457
- graphfaas serverless gnn inference for burst-resilient real-time intrusion detec | arXiv: 2511.10554
- graphkeeper graph domain-incremental learning via knowledge disentanglement and | arXiv: 2511.00097
- graphtop graph topology-oriented prompting for graph neural networks | arXiv: 2510.22451
- grasp2grasp vision-based dexterous grasp translation via schrödinger bridges | arXiv: 2506.02489
- grass scalable data attribution with gradient sparsification and sparse projecti | arXiv: 2505.18976
- graver generative graph vocabularies for robust graph foundation models fine-tun | arXiv: 2511.05592
- greedy algorithm for structured bandits a sharp characterization of asymptotic s | arXiv: 2503.04010
- greedy sampling is provably efficient for rlhf | arXiv: 2510.24700
- greenhyperspectra a multi-source hyperspectral dataset for global vegetation tra | arXiv: 2507.06806
- ground-compose-reinforce grounding language in agentic behaviours using limited | arXiv: 2507.10741
- grounding foundational vision models with 3d human poses for robust action recog | arXiv: 2511.05622
- Group-in-Group Policy Optimization for LLM Agent Training | arXiv: 2505.10978
- gsalign geometric and semantic alignment network for aerial-ground person re-ide | arXiv: 2510.22268
- gspn-2 efficient parallel sequence modeling | arXiv: 2512.07884
- gst-unet a neural framework for spatiotemporal causal inference with time-varyin | arXiv: 2502.05295
- gtpbd a fine-grained global terraced parcel and boundary dataset | arXiv: 2507.14697
- gui-rise structured reasoning and history summarization for gui navigation | arXiv: 2510.27210
- guided diffusion sampling on function spaces with applications to pdes | arXiv: 2505.17004
- guideflow3d optimization-guided rectified flow for appearance transfer | arXiv: 2510.16136
- guiding cross-modal representations with mllm priors via preference alignment | arXiv: 2506.06970
- gvpo group variance policy optimization for large language model post-training | arXiv: 2504.19599
- gyroswin 5d surrogates for gyrokinetic plasma turbulence simulations | arXiv: 2510.07314
- h-ddx a hierarchical evaluation framework for differential diagnosis | arXiv: 2510.03700
- h-splid hsic-based saliency preserving latent information decomposition | arXiv: 2510.20627
- haif-gs hierarchical and induced flow-guided gaussian splatting for dynamic scen | arXiv: 2506.09518
- hallucination as an upper bound a new perspective on text-to-image evaluation | arXiv: 2509.21257
- hamiltonian neural pde solvers through functional approximation | arXiv: 2505.13275
- hankel singular value regularization for highly compressible state space models | arXiv: 2510.22951
- HAODiff: Human-Aware One-Step Diffusion via Dual-Prompt Guidance | arXiv: 2505.19742
- hardware-aligned hierarchical sparse attention for efficient long-term memory ac | arXiv: 2504.16795
- harnessing feature resonance under arbitrary target alignment for out-of-distrib | arXiv: 2502.16076
- harnessing the computation redundancy in vits to boost adversarial transferabili | arXiv: 2504.10804
- hawaii hierarchical visual knowledge transfer for efficient vision-language mode | arXiv: 2506.19072
- Head Pursuit: Probing Attention Specialization in Multimodal Transformers | arXiv: 2510.21518
- healthslm-bench benchmarking small language models for mobile and wearable healt | arXiv: 2509.07260
- helpsteer3-preference open human-annotated preference data across diverse tasks | arXiv: 2505.11475
- hephaestus mixture generative modeling with energy guidance for large-scale qos | arXiv: 2510.17036
- hermesflow seamlessly closing the gap in multimodal understanding and generation | arXiv: 2502.12148
- hessian-guided perturbed wasserstein gradient flows for escaping saddle points | arXiv: 2509.16974
- heterogeneous adversarial play in interactive environments | arXiv: 2510.18407
- heterogeneous swarms jointly optimizing model roles and weights for multi-llm sy | arXiv: 2502.04510
- hierarchical balance packing towards efficient supervised fine-tuning for long-c | arXiv: 2503.07680
- hierarchical koopman diffusion fast generation with interpretable diffusion traj | arXiv: 2510.12220
- hierarchical retrieval the geometry and a pretrain-finetune recipe | arXiv: 2509.16411
- hierarchical self-attention generalizing neural attention mechanics to multi-sca | arXiv: 2509.15448
- HiFi-RAG: Hierarchical Content Filtering and Two-Pass Generation for Open-Domain RAG | arXiv: 2512.22442
- high resolution udf meshing via iterative networks | arXiv: 2509.17212
- high-order equivariant flow matching for density functional theory hamiltonian p | arXiv: 2505.18817
- highlighting what matters promptable embeddings for attribute-focused image retr | arXiv: 2505.15877
- himacon discovering hierarchical manipulation concepts from unlabeled multi-moda | arXiv: 2510.11321
- hogwild inference parallel llm generation via concurrent attention | arXiv: 2504.06261
- hoi-dyn learning interaction dynamics for human-object motion diffusion | arXiv: 2507.01737
- hollowflow efficient sample likelihood evaluation using hollow message passing | arXiv: 2510.21542
- holollm multisensory foundation model for language-grounded human sensing and re | arXiv: 2505.17645
- homogeneous keys heterogeneous values exploiting local kv cache asymmetry for lo | arXiv: 2506.05410
- hopadiff holistic-partial aware fourier conditioned diffusion for referring huma | arXiv: 2506.09650
- HoPE: Hybrid of Position Embedding for Long Context Vision-Language Models | arXiv: 2505.20444
- horizon reduction makes rl scalable | arXiv: 2506.04168
- houselayout3d a benchmark and training-free baseline for 3d layout estimation in | arXiv: 2512.02450
- how data mixing shapes in-context learning asymptotic equivalence for transforme | arXiv: 2510.25753
- how different from the past spatio-temporal time series forecasting with self-su | arXiv: 2510.04908
- How Do Transformers Learn Implicit Reasoning? | arXiv: 2505.23653
- how does sequence modeling architecture influence base capabilities of pre-train | arXiv: 2505.18522
- how foundational are foundation models for time series forecasting | arXiv: 2510.00742
- how many domains suffice for domain generalization a tight characterization via | arXiv: 2506.16704
- how many tokens do 3d point cloud transformer architectures really need | arXiv: 2511.05449
- how patterns dictate learnability in sequential data | arXiv: 2510.10744
- how should we evaluate data deletion in graph-based ann indexes | arXiv: 2512.06200
- how to build a consistency model learning flow maps via self-distillation | arXiv: 2505.18825
- human-assisted robotic policy refinement via action preference optimization | arXiv: 2506.07127
- human-inspired multi-level reinforcement learning | arXiv: 2501.07502
- human-machine ritual synergic performance through real-time motion recognition | arXiv: 2511.02351
- humancrafter synergizing generalizable human reconstruction and semantic 3d segm | arXiv: 2511.00468
- hybrid autoencoders for tabular data leveraging model-based augmentation in low- | arXiv: 2511.06961
- Hybrid Latent Reasoning via Reinforcement Learning | arXiv: 2505.18454
- hybrid physical-neural simulator for fast cosmological hydrodynamics | arXiv: 2510.26593
- hybrid-balance gflownet for solving vehicle routing problems | arXiv: 2510.04792
- hybridnorm towards stable and efficient transformer training via hybrid normaliz | arXiv: 2503.04598
- HyGen: Efficient LLM Serving via Elastic Online-Offline Request Co-location | arXiv: 2501.14808
- hyperbolic dataset distillation | arXiv: 2505.24623
- hyperbolic fine-tuning for large language models | arXiv: 2410.04010
- hypergraphrag retrieval-augmented generation via hypergraph-structured knowledge | arXiv: 2503.21322
- hyperparameter transfer enables consistent gains of matrix-preconditioned optimi | arXiv: 2512.05620
- hyplanehead rethinking tri-plane-like representations in full-head image synthes | arXiv: 2509.16748
- hyrf hybrid radiance fields for memory-efficient and high-quality novel view syn | arXiv: 2509.17083
- i-raven-x benchmarking generalization and robustness of analogical and mathemati | arXiv: 2510.17496
- ibgs image-based gaussian splatting | arXiv: 2511.14357
- if-guide influence function-guided detoxification of llms | arXiv: 2506.01790
- ifinder structured zero-shot vision-based llm grounding for dash-cam video reaso | arXiv: 2509.19552
- image super-resolution with guarantees via conformalized generative models | arXiv: 2502.09664
- imagenet-trained cnns are not biased towards texture revisiting feature reliance | arXiv: 2509.20234
- imagesentinel protecting visual datasets from unauthorized retrieval-augmented i | arXiv: 2510.12119
- impact of dataset properties on membership inference vulnerability of deep trans | arXiv: 2402.06674
- impact of layer norm on memorization and generalization in transformers | arXiv: 2511.10566
- implicit augmentation from distributional symmetry in turbulence super-resolutio | arXiv: 2509.20683
- implicit bias of spectral descent and muon on multiclass separable data | arXiv: 2502.04664
- implicit modeling for transferability estimation of vision foundation models | arXiv: 2510.23145
- improved approximation algorithms for chromatic and pseudometric-weighted correl | arXiv: 2505.21939
- improved balanced classification with theoretically grounded loss functions | arXiv: 2512.23947
- improved regret and contextual linear extension for pandora's box and prophet ine | arXiv: 2505.18828
- improved regret bounds for gaussian process upper confidence bound in bayesian o | arXiv: 2506.01393
- improved training technique for shortcut models | arXiv: 2510.21250
- improving consistency in retrieval-augmented systems with group similarity rewar | arXiv: 2510.04392
- improving data efficiency for llm reinforcement fine-tuning through difficulty-t | arXiv: 2506.05316
- improving decision trees through the lens of parameterized local search | arXiv: 2510.12726
- improving diffusion-based inverse algorithms under few-step constraint via learn | arXiv: 2503.10103
- improving forecasts of suicide attempts for patients with little data | arXiv: 2511.18199
- improving perturbation-based explanations by understanding the role of uncertain | arXiv: 2511.10439
- improving planning and mbrl with temporally-extended actions | arXiv: 2505.15754
- improving posterior inference of galaxy properties with image-based conditional | arXiv: 2512.05078
- improving retrieval-augmented generation through multi-agent reinforcement learn | arXiv: 2501.15228
- improving the straight-through estimator with zeroth-order information | arXiv: 2510.23926
- improving time series forecasting via instance-aware post-hoc revision | arXiv: 2505.23583
- in search of adam's secret sauce | arXiv: 2505.21829
- in the eye of mllm benchmarking egocentric video intent understanding with gaze- | arXiv: 2509.07447
- in-context compositional learning via sparse coding transformer | arXiv: 2511.20194
- in-context edit enabling instructional image editing with in-context generation | arXiv: 2504.20690
- in-context learning of linear dynamical systems with transformers approximation | arXiv: 2502.08136
- in-context learning of stochastic differential equations with foundation inferen | arXiv: 2502.19049
- inc an indirect neural corrector for auto-regressive hybrid pde solvers | arXiv: 2511.12764
- incentivizing reasoning for advanced instruction-following of large language mod | arXiv: 2506.01413
- incentivizing time-aware fairness in data sharing | arXiv: 2510.09240
- incomplete multi-view clustering via hierarchical semantic alignment and coopera | arXiv: 2510.13887
- increasing the utility of synthetic images through chamfer guidance | arXiv: 2508.10631
- incremental sequence classification with temporal consistency | arXiv: 2505.16548
- indego a dataset of industrial scenarios and collaborative work for egocentric a | arXiv: 2511.19684
- inductive transfer learning for graph-based recommenders | arXiv: 2510.22799
- ineq-comp benchmarking human-intuitive compositional reasoning in automated theo | arXiv: 2505.12680
- inference-time alignment in continuous space | arXiv: 2505.20081
- inference-time chain-of-thought pruning with latent informativeness signals | arXiv: 2511.00699
- inference-time hyper-scaling with kv cache compression | arXiv: 2506.05345
- inference-time reward hacking in large language models | arXiv: 2506.19248
- inference-time scaling for flow models via stochastic generation and rollover bu | arXiv: 2503.19385
- inferring stochastic dynamics with growth from cross-sectional data | arXiv: 2505.13197
- infinipot-v memory-constrained kv cache compression for streaming video understa | arXiv: 2506.15745
- InfinityStar: Unified Spacetime AutoRegressive Modeling for Visual Generation
- influence functions for edge edits in non-convex graph neural networks | arXiv: 2506.04694
- influx a benchmark for self-calibration of dynamic intrinsics of video cameras | arXiv: 2510.23589
- information theoretic learning for diffusion models with warm start | arXiv: 2510.20903
- information-computation tradeoffs for noiseless linear regression with oblivious | arXiv: 2510.10665
- information-theoretic discrete diffusion | arXiv: 2510.24088
- infrequent exploration in linear bandits | arXiv: 2510.26000
- inner speech as behavior guides steerable imitation of diverse behaviors for hum | arXiv: 2602.20517
- inst-it boosting instance understanding via explicit visual prompt instruction t | arXiv: 2412.03565
- instance-level composed image retrieval | arXiv: 2510.25387
- instance-specific test-time training for speech editing in the wild | arXiv: 2506.13295
- InstanceAssemble: Layout-Aware Image Generation via Instance Assembling Attention | arXiv: 2509.16691
- instant video models universal adapters for stabilizing image-based networks | arXiv: 2512.03014
- instructsam a training-free framework for instruction-oriented remote sensing ob | arXiv: 2505.15818
- Integration Matters for Learning PDEs with Backward SDEs | arXiv: 2505.01078
- interaction-centric knowledge infusion and transfer for open-vocabulary scene gr | arXiv: 2511.05935
- interactive and hybrid imitation learning provably beating behavior cloning | arXiv: 2412.07057
- interpretable next-token prediction via the generalized induction head | arXiv: 2411.00066
- interpreting gflownets for drug discovery extracting actionable insights for med | arXiv: 2511.19264
- interpreting resnet-based clip via neuron-attention decomposition | arXiv: 2509.19943
- intervene-all-paths unified mitigation of lvlm hallucinations across alignment f | arXiv: 2511.17254
- inverse optimization latent variable models for learning costs applied to route | arXiv: 2509.15999
- invisibleink high-utility and low-cost text generation with differential privacy | arXiv: 2507.02974
- ioncast a deep learning framework for forecasting ionospheric dynamics | arXiv: 2511.15004
- is artificial intelligence generated image detection a solved problem | arXiv: 2505.12335
- is sequence information all you need for bayesian optimization of antibodies | arXiv: 2509.24933
- isotropic noise in stochastic and quantum convex optimization | arXiv: 2510.20745
- It's LIT! Reliability-Optimized LLMs with Inspectable Tools | arXiv: 2511.14903
- itdpdm information-theoretic discrete poisson diffusion model | arXiv: 2505.05082
- iterative foundation model fine-tuning on multiple rewards | arXiv: 2511.00220
- it's complicated the relationship of algorithmic fairness and non-discrimination | arXiv: 2501.12962
- it's hard to be normal the impact of noise on structure-agnostic estimation | arXiv: 2507.02275
- jailbound jailbreaking internal safety boundaries of vision-language models | arXiv: 2505.19610
- jailbreak-zero a path to pareto optimal red teaming for large language models | arXiv: 2601.03265
- jamun bridging smoothed molecular dynamics and score-based learning for conforma | arXiv: 2410.14621
- janus-pro-r1 advancing collaborative visual comprehension and generation via rei | arXiv: 2506.01480
- janusdna a powerful bi-directional hybrid dna foundation model | arXiv: 2505.17257
- jasmine harnessing diffusion prior for self-supervised depth estimation | arXiv: 2503.15905
- jet-nemotron efficient language model with post neural architecture search | arXiv: 2508.15884
- johnson-lindenstrauss lemma beyond euclidean geometry | arXiv: 2510.22401
- jutters | arXiv: 2601.11532
- k-decore facilitating knowledge transfer in continual structured knowledge reaso | arXiv: 2509.16929
- keep it on a leash controllable pseudo-label generation towards realistic long-t | arXiv: 2510.03993
- keep it real challenges in attacking compression-based adversarial purification | arXiv: 2508.05489
- kernel conditional tests from learning-theoretic bounds | arXiv: 2506.03898
- kernel learning with adversarial features numerical efficiency and adaptive regu | arXiv: 2510.20883
- keydiff key similarity-based kv cache eviction for long-context llm inference in | arXiv: 2504.15364
- kimina lean server a high-performance lean server for large-scale verification | arXiv: 2504.21230
- kindle knowledge-guided distillation for prior-free gene regulatory network infe | arXiv: 2505.09664
- kl penalty control via perturbation for direct preference optimization | arXiv: 2502.13177
- klass kl-guided fast inference in masked diffusion models | arXiv: 2511.05664
- knolling bot teaching robots the human notion of tidiness | arXiv: 2310.04566
- know thyself by knowing others learning neuron identity from population context | arXiv: 2512.01199
- Know What You Don't Know: Uncertainty Calibration of Process Reward Models | arXiv: 2506.09338
- knowing when to stop efficient context processing via latent sufficiency signals | arXiv: 2502.01025
- knowledge distillation detection for open-weights models | arXiv: 2510.02302
- knowledge is overrated a zero-knowledge machine learning and cryptographic hashi | arXiv: 2511.12592
- knowledge-based visual question answer with multimodal processing retrieval and | arXiv: 2510.14605
- KScope: A Framework for Characterizing the Knowledge Status of Language Models | arXiv: 2506.07458
- ktae a model-free algorithm to key-tokens advantage estimation in mathematical r | arXiv: 2505.16826
- kungfubot physics-based humanoid whole-body control for learning highly-dynamic | arXiv: 2506.12851
- kuramoto orientation diffusion models | arXiv: 2509.15328
- kvzip query-agnostic kv cache compression with context reconstruction | arXiv: 2505.23416
- L-MTP: Leap Multi-Token Prediction Beyond Adjacent Context | arXiv: 2505.17505
- l2rsi cross-view lidar-based place recognition for large-scale urban scenes via | arXiv: 2503.11245
- labelany3d label any object 3d in the wild | arXiv: 2601.01676
- LabUtopia: High-Fidelity Simulation and Hierarchical Benchmark for Scientific Embodied Agents | arXiv: 2505.22634
- lagrangian neural odes measuring the existence of a lagrangian with helmholtz me | arXiv: 2510.06367
- LangHOPS: Language Grounded Hierarchical Open-Vocabulary Part Segmentation | arXiv: 2510.25263
- langsplatv2 high-dimensional 3d language gaussian splatting with 450 fps | arXiv: 2507.07136
- Language Model Behavioral Phases are Consistent Across Architecture, Training Data, and Scale
- large language bayes | arXiv: 2504.14025
- large language models as medical codes selectors a benchmark using the internati | arXiv: 2507.14681
- large language models can learn and generalize steganographic chain-of-thought u | arXiv: 2506.01926
- Large Language Models Miss the Multi-Agent Mark | arXiv: 2505.21298
- large stepsizes accelerate gradient descent for regularized logistic regression | arXiv: 2506.02336
- large-scale training data attribution for music generative models via unlearning | arXiv: 2506.18312
- LASeR: Learning to Adaptively Select Reward Models with Multi-Armed Bandits | arXiv: 2410.01735
- last iterate convergence in monotone mean field games | arXiv: 2410.05127
- latent chain-of-thought for visual reasoning | arXiv: 2510.23925
- latent harmony synergistic unified uhd image restoration via latent space regula | arXiv: 2510.07961
- latent principle discovery for language model self-improvement | arXiv: 2505.16927
- latent representation learning in heavy-ion collisions with maskpoint transforme | arXiv: 2510.06691
- latent space factorization in lora | arXiv: 2510.19640
- latent zoning network a unified principle for generative modeling representation | arXiv: 2509.15591
- latentguard controllable latent steering for robust refusal of attacks and relia | arXiv: 2509.19839
- lattice boltzmann model for learning real-world pixel dynamicity | arXiv: 2509.16527
- layer-wise modality decomposition for interpretable multimodal sensor fusion | arXiv: 2511.00859
- layer-wise update aggregation with recycling for communication-efficient federat | arXiv: 2503.11146
- layerif estimating layer quality for large language models using influence funct | arXiv: 2505.23811
- lc-opt benchmarking reinforcement learning and agentic ai for end-to-end liquid | arXiv: 2511.00116
- lcdb 1.1 a database illustrating learning curves are more ill-behaved than previo | arXiv: 2505.15657
- leapfactual reliable visual counterfactual explanation using conditional flow ma | arXiv: 2510.14623
- learnable sampler distillation for discrete diffusion models | arXiv: 2509.19962
- learning approximately equivariant networks via constrained optimization | arXiv: 2505.13631
- learning at the speed of physics equilibrium propagation on oscillator ising mac | arXiv: 2510.12934
- Learning Cocoercive Conservative Denoisers via Helmholtz Decomposition for Poisson Inverse Problems | arXiv: 2505.08909
- learning conformational ensembles of proteins based on backbone geometry | arXiv: 2503.05738
- learning dense hand contact estimation from imbalanced data | arXiv: 2505.11152
- learning dynamics of rnns in closed-loop environments | arXiv: 2505.13567
- learning efficient fuse-and-refine for feed-forward 3d gaussian splatting | arXiv: 2503.14698
- learning from demonstrations via capability-aware goal sampling | arXiv: 2601.08731
- learning from design procedure to generate cad programs for data augmentation | arXiv: 2603.06894
- learning from interval targets | arXiv: 2510.20925
- learning from videos for 3d world enhancing mllms with 3d vision geometry priors | arXiv: 2505.24625
- learning generalizable shape completion with sim3 equivariance | arXiv: 2509.26631
- learning grouped lattice vector quantizers for low-bit llm compression | arXiv: 2510.20984
- learning human-like rl agents through trajectory optimization with action quanti | arXiv: 2511.15055
- Learning in Compact Spaces with Approximately Normalized Transformer | arXiv: 2505.22014
- learning in stackelberg mean field games a non-asymptotic analysis | arXiv: 2509.15392
- Learning Interactive World Model for Object-Centric Reinforcement Learning | arXiv: 2511.02225
- learning interestingness in automated mathematical theory formation | arXiv: 2511.14778
- learning interpretable features in audio latent spaces via sparse autoencoders | arXiv: 2510.23802
- learning intractable multimodal policies with reparameterization and diversity r | arXiv: 2511.01374
- learning memory-enhanced improvement heuristics for flexible job shop scheduling | arXiv: 2603.02846
- learning neural exposure fields for view synthesis | arXiv: 2510.08279
- learning non-equilibrium diffusions with schrödinger bridges from exactly solvab | arXiv: 2505.16644
- learning orthogonal multi-index models a fine-grained information exponent analy | arXiv: 2410.09678
- learning parameterized skills from demonstrations | arXiv: 2510.24095
- learning provably improves the convergence of gradient descent | arXiv: 2501.18092
- learning quadratic neural networks in high dimensions sgd dynamics and scaling l | arXiv: 2508.03688
- learning reconfigurable representations for multimodal federated learning with m | arXiv: 2510.22880
- learning relative gene expression trends from pathology images in spatial transc | arXiv: 2512.06612
- learning repetition-invariant representations for polymer informatics | arXiv: 2505.10726
- learning shared representations from unpaired data | arXiv: 2505.21524
- learning single-index models via harmonic decomposition | arXiv: 2506.09887
- learning skill-attributes for transferable assessment in video | arXiv: 2511.13993
- learning sparse approximate inverse preconditioners for conjugate gradient solve | arXiv: 2510.27517
- learning spatial-aware manipulation ordering | arXiv: 2510.25138
- learning task-agnostic representations through multi-teacher distillation | arXiv: 2510.18680
- learning temporal 3d semantic scene completion via optical flow guidance | arXiv: 2502.14520
- learning the wrong lessons syntactic-domain spurious correlations in language mo | arXiv: 2509.21155
- learning theory for kernel bilevel optimization | arXiv: 2502.08457
- learning time-scale invariant population-level neural representations | arXiv: 2511.13022
- learning to better search with language models via guided reinforced self-traini | arXiv: 2410.02992
- learning to clean reinforcement learning for noisy label correction | arXiv: 2511.19808
- learning to condition a neural heuristic for scalable mpe inference | arXiv: 2509.25217
- learning to factorize and adapt a versatile approach toward universal spatio-tem | arXiv: 2601.12083
- learning to flow from generative pretext tasks for neural architecture encoding | arXiv: 2510.18360
- learning to focus causal attention distillation via gradient-guided token prunin | arXiv: 2506.07851
- learning to focus prioritizing informative histories with structured attention m | arXiv: 2511.06946
- learning to insert for constructive neural vehicle routing solver | arXiv: 2505.13904
- Learning to Instruct for Visual Instruction Tuning | arXiv: 2503.22215
- learning to integrate diffusion odes by averaging the derivatives | arXiv: 2505.14502
- Learning to Solve Complex Problems via Dataset Decomposition | arXiv: 2602.20296
- learning to steer input-dependent steering for multimodal llms | arXiv: 2508.12815
- learning to watermark a selective watermarking framework for large language mode | arXiv: 2510.15976
- learning with calibration exploring test-time computing of spatio-temporal forec | arXiv: 2506.00635
- learning-augmented facility location mechanisms for envy ratio | arXiv: 2512.11193
- learning-augmented online bipartite fractional matching | arXiv: 2505.19252
- learning-augmented streaming algorithms for correlation clustering | arXiv: 2510.10705
- least squares variational inference | arXiv: 2502.18475
- lemica lexicographic minimax path caching for efficient diffusion-based video ge | arXiv: 2511.00090
- less is more but where dynamic token compression via llm-guided keyframe prior | arXiv: 2512.06866
- less is more local intrinsic dimensions of contextual language models | arXiv: 2506.01034
- less is more unlocking specialization of time series foundation models via struc | arXiv: 2505.23195
- Lessons Learned: A Multi-Agent Framework for Code LLMs to Learn and Improve | arXiv: 2505.23946
- Let LRMs Break Free from Overthinking via Self-Braking Tuning | arXiv: 2505.14604
- Let Me Think! A Long Chain-of-Thought Can Be Worth Exponentially Many Short Ones | arXiv: 2505.21825
- let the experts speak improving survival prediction calibration via mixture-of-e | arXiv: 2511.09567
- leveraging depth and language for open-vocabulary domain-generalized semantic se | arXiv: 2506.09881
- leveraging importance sampling to detach alignment modules from large language m | arXiv: 2505.19700
- leveraging robust optimization for llm alignment under distribution shifts | arXiv: 2504.05831
- levo high-quality song generation with multi-preference alignment | arXiv: 2506.07520
- limited preference data learning better reward model with latent space synthesis | arXiv: 2509.26074
- limopro reasoning refinement for efficient and effective test-time scaling | arXiv: 2505.19187
- linear attention for efficient bidirectional sequence modeling | arXiv: 2502.16249
- linear differential vision transformer learning visual contrasts via pairwise di | arXiv: 2511.00833
- linear transformers implicitly discover unified numerical algorithms | arXiv: 2509.19702
- linearly constrained diffusion implicit models | arXiv: 2411.00359
- lineas end-to-end learning of activation steering with a distributional loss | arXiv: 2503.10679
- linprim linear primitives for differentiable volumetric rendering | arXiv: 2501.16312
- littlebit ultra low-bit quantization via latent factorization | arXiv: 2506.13771
- livestar live streaming assistant for real-world online video understanding | arXiv: 2511.05299
- llm agent communication protocol lacp requires urgent standardization a telecom- | arXiv: 2510.13821
- llm agents for knowledge discovery in atomic layer processing | arXiv: 2509.26201
- llm interpretability with identifiable temporal-instantaneous representation | arXiv: 2509.23323
- llm meets diffusion a hybrid framework for crystal material generation | arXiv: 2510.23040
- llm probing with contrastive eigenproblems improving understanding and applicabi | arXiv: 2511.02089
- LLM Safety Alignment is Divergence Estimation in Disguise | arXiv: 2502.00657
- LLM Strategic Reasoning: Agentic Study through Behavioral Game Theory
- llm world models are mental output layer evidence of brittle world model use in | arXiv: 2507.15521
- llm-assisted emergency triage benchmark bridging hospital-rich and mci-like fiel | arXiv: 2509.26351
- llmscape | arXiv: 2511.07161
- locality-sensitive hashing-based efficient point transformer for charged particl | arXiv: 2510.07594
- locally optimal private sampling beyond the global minimax | arXiv: 2510.09485
- lodge level-of-detail large-scale gaussian splatting with efficient rendering | arXiv: 2505.23158
- logical expressiveness of graph neural networks with hierarchical node individua | arXiv: 2506.13911
- lomix learnable weighted multi-scale logits mixing for medical image segmentatio | arXiv: 2510.22995
- Long-Context Modeling with Dynamic Hierarchical Sparse Attention for On-Device LLMs | arXiv: 2510.24606
- long-tailed recognition via information-preservable two-stage learning | arXiv: 2510.08836
- LongVPO: From Anchored Cues to Self-Reasoning for Long-Form Video Preference Optimization | arXiv: 2602.02341
- LooGLE v2: Are LLMs Ready for Real-World Long-Dependency Challenges? | arXiv: 2510.22548
- look and tell a dataset for multimodal grounding across egocentric and exocentri | arXiv: 2510.22672
- look-ahead reasoning on learning platforms | arXiv: 2511.14745
- loquetier a virtualized multi-lora framework for unified llm fine-tuning and ser | arXiv: 2511.00101
- lost in transmission when and why llms fail to reason globally | arXiv: 2505.08140
- lt-soups bridging head and tail classes via subsampled model soups | arXiv: 2511.10683
- ltd-bench evaluating large language models by letting them draw | arXiv: 2511.02347
- lumia a handheld vision-to-music system for real-time embodied composition | arXiv: 2512.17228
- luminance-aware statistical quantization unsupervised hierarchical learning for | arXiv: 2511.01510
- m-grpo stabilizing self-supervised reinforcement learning for large language mod | arXiv: 2512.13070
- machine unlearning doesn't do what you think lessons for generative ai policy and | arXiv: 2412.06966
- maestro adaptive sparse attention and robust learning for multimodal dynamic tim | arXiv: 2509.25278
- MagCache: Fast Video Generation with Magnitude-Aware Cache | arXiv: 2506.09045
- magical medical lay language generation via semantic invariance and layperson-ta | arXiv: 2508.08730
- MaintainCoder: Maintainable Code Generation Under Dynamic Requirements | arXiv: 2503.24260
- making classic gnns strong baselines across varying homophily a smoothness-gener | arXiv: 2412.09805
- mamba goes home hierarchical soft mixture-of-experts for 3d medical image segmen | arXiv: 2507.06363
- mango - adaptable graph network simulators via meta-learning | arXiv: 2510.05874
- manifolds and modules how function develops in a neural foundation model | arXiv: 2512.07869
- manipulating 3d molecules in a fixed-dimensional e3-equivariant latent space | arXiv: 2506.00771
- manipulating feature visualizations with gradient slingshots | arXiv: 2401.06122
- many llms are more utilitarian than one | arXiv: 2507.00814
- map estimation with denoisers convergence rates and guarantees | arXiv: 2507.15397
- mapping faithful reasoning in language models | arXiv: 2510.22362
- mar-fl a communication efficient peer-to-peer federated learning system | arXiv: 2512.05234
- mars a malignity-aware backdoor defense in federated learning | arXiv: 2509.20383
- mars-bench a benchmark for evaluating foundation models for mars science tasks | arXiv: 2510.24010
- martingale score an unsupervised metric for bayesian rationality in llm reasonin | arXiv: 2512.02914
- MAS-ZERO: Designing Multi-Agent Systems with Zero Supervision | arXiv: 2505.14996
- masfin a multi-agent system for decomposed financial reasoning and forecasting | arXiv: 2512.21878
- masked symbol modeling for demodulation of oversampled baseband communication si | arXiv: 2512.01428
- masksql safeguarding privacy for llm-based text-to-sql via abstraction | arXiv: 2509.23459
- mass conservation on rails -- rethinking physics-informed learning of ice flow v | arXiv: 2510.06286
- Massively Parallel Imitation Learning of Mouse Forelimb Musculoskeletal Reaching Dynamics | arXiv: 2511.21848
- mat-agent adaptive multi-agent training optimization | arXiv: 2510.17845
- match multi-faceted adaptive topo-consistency for semi-supervised histopathology | arXiv: 2510.01532
- matchings under biased and correlated evaluations | arXiv: 2510.23628
- materialrefgs reflective gaussian splatting with multi-view consistent material | arXiv: 2510.11387
- matryoshka pilot learning to drive black-box llms with llms | arXiv: 2410.20749
- maxsup overcoming representation collapse in label smoothing | arXiv: 2502.15798
- mdns masked diffusion neural sampler via stochastic optimal control | arXiv: 2508.10684
- mdreid modality-decoupled learning for any-to-any multi-modal object re-identifi | arXiv: 2510.23301
- mean-field sampling for cooperative multi-agent reinforcement learning | arXiv: 2412.00661
- measuring what matters construct validity in large language model benchmarks | arXiv: 2511.04703
- mecefo enhancing llm training robustness via fault-tolerant optimization | arXiv: 2510.16415
- mechanism design for llm fine-tuning with multiple reward models | arXiv: 2405.16276
- mechanistic interpretability of rnns emulating hidden markov models | arXiv: 2510.25674
- medagentboard benchmarking multi-agent collaboration with conventional methods f | arXiv: 2505.12371
- medmkg benchmarking medical knowledge exploitation with multimodal knowledge gra | arXiv: 2505.17214
- megadance mixture-of-experts architecture for genre-aware 3d dance generation | arXiv: 2505.17543
- megstate phoneme decoding from magnetoencephalography signals | arXiv: 2512.17978
- meicoder decoding visual stimuli from neural activity by leveraging most excitin | arXiv: 2510.20762
- memeic a step toward continual and compositional knowledge editing | arXiv: 2510.25798
- memo training memory-efficient embodied agents with reinforcement learning | arXiv: 2510.19732
- memoir lifelong model editing with minimal overwrite and informed retention for | arXiv: 2506.07899
- Memory Mosaics at Scale | arXiv: 2507.03285
- memory-augmented potential field theory a framework for adaptive control in non- | arXiv: 2509.19672
- memory-efficient training with in-place fft implementation | arXiv: 2511.01385
- memory-integrated reconfigurable adapters a unified framework for settings with | arXiv: 2512.00940
- memtrack evaluating long-term memory and state tracking in multi-platform dynami | arXiv: 2510.01353
- mergebench a benchmark for merging domain-specialized llms | arXiv: 2505.10833
- merit multilingual semantic retrieval with interleaved multi-condition query | arXiv: 2506.03144
- merlin l48 spectrogram dataset | arXiv: 2511.00252
- mesatask towards task-driven tabletop scene generation via 3d spatial reasoning | arXiv: 2509.22281
- mesh interpolation graph network for dynamic and spatially irregular global weat | arXiv: 2509.20911
- mesh-rft enhancing mesh generation via fine-grained reinforcement fine-tuning | arXiv: 2505.16761
- MESS+: Dynamically Learned Inference-Time LLM Routing in Model Zoos with Service Level Guarantees | arXiv: 2505.19947
- meta-learning an in-context transformer model of human higher visual cortex | arXiv: 2505.15813
- meta-learning three-factor plasticity rules for structured credit assignment wit | arXiv: 2512.09366
- meta-world an improved standardized rl benchmark | arXiv: 2505.11289
- metabox-v2 a unified benchmark platform for meta-black-box optimization | arXiv: 2505.17745
- metacognitive sensitivity for test-time dynamic model selection | arXiv: 2512.10451
- metadefense defending finetuning-based jailbreak attack before and during genera | arXiv: 2510.07835
- metafind scene-aware 3d asset retrieval for coherent metaverse scene generation | arXiv: 2510.04057
- metags a meta-learned gaussian-phong model for out-of-distribution 3d scene reli | arXiv: 2405.20791
- metamind modeling human social thoughts with metacognitive multi-agent systems | arXiv: 2505.18943
- metropolis-hastings sampling for 3d gaussian reconstruction | arXiv: 2506.12945
- mge-ldm joint latent diffusion for simultaneous music generation and source extr | arXiv: 2505.23305
- micadangelo fine-grained reconstruction of constrained cad models from 3d scans | arXiv: 2510.23429
- midas misalignment-based data augmentation strategy for imbalanced multimodal le | arXiv: 2509.25831
- military ai needs technically-informed regulation to safeguard ai research and i | arXiv: 2505.18371
- mimeqa towards socially-intelligent nonverbal foundation models | arXiv: 2502.16671
- mind the data gap evaluating vision systems in small data applications | arXiv: 2504.06486
- mind the gap aligning knowledge bases with user needs to enhance mental health r | arXiv: 2509.13626
- mind the gap bridging thought leap for improved chain-of-thought tuning | arXiv: 2505.14684
- mind the gap removing the discretization gap in differentiable logic gate networ | arXiv: 2506.07500
- mind the gap the challenges of scale in pixel-based deep reinforcement learning | arXiv: 2505.17749
- mind-the-glitch visual correspondence for detecting inconsistencies in subject-d | arXiv: 2509.21989
- mindforge empowering embodied agents with theory of mind for lifelong cultural l | arXiv: 2411.12977
- MineAnyBuild: Benchmarking Spatial Planning for Open-world AI Agents | arXiv: 2505.20148
- mingle mixture of null-space gated low-rank experts for test-time continual mode | arXiv: 2505.11883
- minimal semantic sufficiency meets unsupervised domain generalization | arXiv: 2509.15791
- minimizing false-positive attributions in explanations of non-linear models | arXiv: 2505.11210
- Mint: A Simple Test-Time Adaptation of Vision-Language Models against Common Corruptions | arXiv: 2510.22127
- MIP against Agent: Malicious Image Patches Hijacking Multimodal OS Agents | arXiv: 2503.10809
- mir-bench can your llm recognize complicated patterns via many-shot in-context r | arXiv: 2502.09933
- mira medical time series foundation model for real-world health data | arXiv: 2506.07584
- mirage a benchmark for multimodal information-seeking and reasoning in agricultu | arXiv: 2506.20100
- mitigating disparate impact of differentially private learning through bounded a | arXiv: 2506.01396
- mitigating hallucination through theory-consistent symmetric multimodal preferen | arXiv: 2506.11712
- mitigating intra- and inter-modal forgetting in continual learning of unified mu | arXiv: 2512.03125
- mitigating privacy-utility trade-off in decentralized federated learning via f-d | arXiv: 2510.19934
- mitigating semantic collapse in partially relevant video retrieval | arXiv: 2510.27432
- mitigating sexual content generation via embedding distortion in text-conditione | arXiv: 2501.18877
- mitra an ai assistant for knowledge retrieval in physics collaborations | arXiv: 2603.09800
- mitra mixed synthetic priors for enhancing tabular foundation models | arXiv: 2510.21204
- mixat combining continuous and discrete adversarial training for llms | arXiv: 2505.16947
- mixed monotonicity reachability analysis of neural ode a trade-off between tight | arXiv: 2510.17859
- mixing expert knowledge bring human thoughts back to the game of go | arXiv: 2601.16447
- mixture of noise for pre-trained model-based class-incremental learning | arXiv: 2509.16738
- mixture of scope experts at test generalizing deeper graph neural networks with | arXiv: 2409.06998
- mlr-bench evaluating ai agents on open-ended machine learning research | arXiv: 2505.19955
- mlrc-bench can language agents solve machine learning research challenges | arXiv: 2504.09702
- mm-opera benchmarking open-ended association reasoning for large vision-language | arXiv: 2510.26937
- mmada multimodal large diffusion language models | arXiv: 2505.15809
- mme-videoocr evaluating ocr-based capabilities of multimodal llms in video scena | arXiv: 2505.21333
- mmg mutual information estimation via the mmse gap in diffusion | arXiv: 2509.20609
- MMLongBench: Benchmarking Long-Context Vision-Language Models Effectively and Thoroughly | arXiv: 2505.10610
- mmpb its time for multi-modal personalization | arXiv: 2509.22820
- mmperspective do mllms understand perspective a comprehensive benchmark for pers | arXiv: 2505.20426
- mmtu a massive multi-task table understanding and reasoning benchmark | arXiv: 2506.05587
- mmwalk towards multi-modal multi-view walking assistance | arXiv: 2510.11520
- mobo-osd batch multi-objective bayesian optimization via orthogonal search direc | arXiv: 2510.20872
- Model Context Protocol for Vision Systems: Audit, Security, and Protocol Extensions | arXiv: 2509.22814
- model inversion with layer-specific modeling and alignment for data-free continu | arXiv: 2510.26311
- model-based policy adaptation for closed-loop end-to-end autonomous driving | arXiv: 2511.21584
- model-behavior alignment under flexible evaluation when the best-fitting model i | arXiv: 2510.23321
- model-guided dual-role alignment for high-fidelity open-domain video-to-audio ge | arXiv: 2510.24103
- modeling cell dynamics and interactions with unbalanced mean field schrödinger b | arXiv: 2505.11197
- modeling microenvironment trajectories on spatial transcriptomics with nicheflow | arXiv: 2511.00977
- modeling neural activity with conditionally linear dynamical systems | arXiv: 2502.18347
- modeling x-ray photon pile-up with a normalizing flow | arXiv: 2511.11863
- models that prove their own correctness | arXiv: 2405.15722
- modem a morton-order degradation estimation mechanism for adverse weather image | arXiv: 2505.17581
- modhifi identifying high fidelity predictive components for model modification | arXiv: 2511.19566
- modulation of temporal decision-making in a deep reinforcement learning agent un | arXiv: 2511.01415
- moe-gyro self-supervised over-range reconstruction and denoising for mems gyrosc | arXiv: 2506.06318
- moemeta mixture-of-experts meta learning for few-shot relational learning | arXiv: 2510.23013
- MoESD: Unveiling the Potential of Speculative Decoding in Sparse MoE Inference | arXiv: 2505.19645
- mol-llama towards general understanding of molecules in large molecular language | arXiv: 2502.13449
- mome mixture of matryoshka experts for audio-visual speech recognition | arXiv: 2510.04136
- moment- and power-spectrum-based gaussianity regularization for text-to-image mo | arXiv: 2509.07027
- monarchattention zero-shot conversion to fast hardware-aware structured attentio | arXiv: 2505.18698
- monitor exploiting large language models with instruction for online video anoma | arXiv: 2510.21449
- monte carlo expected threat mocet scoring | arXiv: 2511.16823
- moose-chem2 exploring llm limits in fine-grained scientific hypothesis discovery | arXiv: 2505.19209
- mopformer motion-primitive transformer for wearable-sensor activity recognition | arXiv: 2505.20744
- more than generation unifying generation and depth estimation via text-to-image | arXiv: 2510.23574
- more-brain routed mixture of experts for interpretable and generalizable cross-s | arXiv: 2505.15946
- mospa human motion generation driven by spatial audio | arXiv: 2507.11949
- motion matters compact gaussian streaming for free-viewpoint video reconstructio | arXiv: 2505.16533
- motion4d learning 3d-consistent motion and semantics for 4d scene understanding | arXiv: 2512.03601
- mouse-guided gaze semi-supervised learning of intention-aware representations fo | arXiv: 2509.19574
- mozart modularized and efficient moe training on 3.5d wafer-scale chiplet archite | arXiv: 2603.07006
- mpcache mpc-friendly kv cache eviction for efficient private llm inference | arXiv: 2501.06807
- mpmavatar learning 3d gaussian avatars with accurate and robust physics-based dy | arXiv: 2510.01619
- mro enhancing reasoning in diffusion language models via multi-reward optimizati | arXiv: 2510.21473
- ms-bart unified modeling of mass spectra and molecules for structure elucidation | arXiv: 2510.20615
- msf-cnn patch-based multi-stage fusion with convolutional neural networks for ti | arXiv: 2505.11483
- mstar box-free multi-query scene text retrieval with attention recycling | arXiv: 2506.10609
- mtbbench a multimodal sequential clinical decision-making benchmark in oncology | arXiv: 2511.20490
- mtl-kd multi-task learning via knowledge distillation for generalizable neural v | arXiv: 2506.02935
- Multi-Agent Collaboration via Evolving Orchestration | arXiv: 2505.19591
- multi-class support vector machine with differential privacy | arXiv: 2510.04027
- multi-environment pomdps discrete model uncertainty under partial observability | arXiv: 2510.23744
- multi-head temporal latent attention | arXiv: 2505.13544
- multi-head transformers provably learn symbolic multi-step reasoning via gradien | arXiv: 2508.08222
- multi-modal masked autoencoders for learning image-spectrum associations for gal | arXiv: 2510.22527
- multi-objective reinforcement learning with max-min criterion a game-theoretic a | arXiv: 2510.20235
- multi-scale finetuning for encoder-based time series foundation models | arXiv: 2506.14087
- multi-task vehicle routing solver via mixture of specialized experts under state | arXiv: 2510.21453
- multi-trajectory physics-informed neural networks for hjb equations with hard-ze | arXiv: 2512.12708
- multihuman-testbench benchmarking image generation for multiple humans | arXiv: 2506.20879
- multimodal 3d genome pre-training | arXiv: 2504.09060
- multimodal bandits regret lower bounds and optimal algorithms | arXiv: 2510.25811
- multimodal bayesian network for robust assessment of casualties in autonomous tr | arXiv: 2512.18908
- multimodal disease progression modeling via spatiotemporal disentanglement and m | arXiv: 2510.11112
- multimodal generative flows for lhc jets | arXiv: 2509.01736
- multimodal negative learning | arXiv: 2510.20877
- multiplayer federated learning reaching equilibrium with less communication | arXiv: 2501.08263
- multiscale guidance of protein structure prediction with heterogeneous cryo-em d | arXiv: 2506.04490
- murating a high quality data selecting approach to multilingual large language m | arXiv: 2507.01785
- music arena live evaluation for text-to-music | arXiv: 2507.20900
- muslr multimodal symbolic logical reasoning | arXiv: 2509.25851
- mustafar promoting unstructured sparsity for kv cache pruning in llm inference | arXiv: 2505.22913
- mutualvpr a mutual learning framework for resolving supervision inconsistencies | arXiv: 2412.09199
- muvr a multi-modal untrimmed video retrieval benchmark with multi-level visual c | arXiv: 2510.21406
- mvsmamba multi-view stereo with state space model | arXiv: 2511.01315
- natural gradient descent for improving variational inference based classificatio | arXiv: 2511.13224
- natural gradient vi guarantees for non-conjugate models | arXiv: 2510.19163
- nautilus a large multimodal model for underwater scene understanding | arXiv: 2510.27481
- navigating simply aligning deeply winning solutions for mouse vs ai 2025 | arXiv: 2602.00982
- navil rethinking scaling properties of native multimodal large language models u | arXiv: 2510.08565
- near-exponential savings for mean estimation with active learning | arXiv: 2511.05736
- near-optimal quantum algorithms for computing coarse correlated equilibria of ge | arXiv: 2510.16782
- nearly-linear time private hypothesis selection with the optimal approximation f | arXiv: 2506.01162
- needleinatable exploring long-context capability of large language models toward | arXiv: 2504.06560
- negocollab a common representation negotiation approach for heterogeneous collab | arXiv: 2510.27647
- nemotron-climb clustering-based iterative data mixture bootstrapping for languag | arXiv: 2504.13161
- Nemotron-Flash: Towards Latency-Optimal Hybrid Small Language Models | arXiv: 2511.18890
- nerfbaselines consistent and reproducible evaluation of novel view synthesis met | arXiv: 2406.17345
- nesypr neurosymbolic proceduralization for efficient embodied reasoning | arXiv: 2510.19429
- neural collapse in cumulative link models for ordinal regression an analysis wit | arXiv: 2506.05801
- neural collapse under gradient flow on shallow relu networks for orthogonally se | arXiv: 2510.21078
- neural deprojection of galaxy stellar mass profiles | arXiv: 2511.20746
- neural emulator superiority when machine learning for pdes surpasses its trainin | arXiv: 2510.23111
- neural entropy | arXiv: 2409.03817
- neural greens functions | arXiv: 2511.01924
- neural mjd neural non-stationary merton jump diffusion for time series predictio | arXiv: 2506.04542
- neural network for simulating radio emission from extensive air showers | arXiv: 2512.21407
- neural stochastic flows solver-free modelling and inference for sde solutions | arXiv: 2510.25769
- neural thermodynamics entropic forces in deep and universal representation learn | arXiv: 2505.12387
- neurips should lead scientific consensus on ai policy | arXiv: 2510.00075
- neuript foundation model for neural interfaces | arXiv: 2510.16548
- neuro-spectral architectures for causal physics-informed networks | arXiv: 2509.04966
- neuro-symbolic entity alignment via variational inference | arXiv: 2410.04153
- neuropath neurobiology-inspired path tracking and reflection for semantically co | arXiv: 2511.14096
- neurosymbolic diffusion models | arXiv: 2505.13138
- next semantic scale prediction via hierarchical diffusion language models | arXiv: 2510.08632
- nnterp a standardized interface for mechanistic interpretability of transformers | arXiv: 2511.14465
- node-based editing for multimodal generation of text audio image and video | arXiv: 2511.03227
- noise-robustness through noise a framework combining asymmetric lora with poison | arXiv: 2505.23868
- NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation | arXiv: 2504.13055
- non-asymptotic analysis of data augmentation for precision matrix estimation | arXiv: 2510.02119
- non-clairvoyant scheduling with progress bars | arXiv: 2509.19662
- non-convex entropic mean-field optimization via best response flow | arXiv: 2505.22760
- non-markovian discrete diffusion with causal language models | arXiv: 2502.09767
- non-stationary bandit convex optimization a comprehensive study | arXiv: 2506.02980
- nonlinear laplacians tunable principal component analysis under directional prio | arXiv: 2505.12528
- nonlinearly preconditioned gradient methods momentum and stochastic analysis | arXiv: 2510.11312
- normal-abnormal guided generalist anomaly detection | arXiv: 2510.00495
- normalization in attention dynamics | arXiv: 2510.22026
- not all deepfakes are created equal triaging audio forgeries for robust deepfake | arXiv: 2510.17474
- not all splits are equal rethinking attribute generalization across unrelated ca | arXiv: 2509.06998
- novel class discovery for point cloud segmentation via joint learning of causal | arXiv: 2510.13307
- novel view synthesis from a few glimpses via test-time natural video completion | arXiv: 2511.17932
- npn non-linear projections of the null-space for imaging inverse problems | arXiv: 2510.01608
- nsw-epnews a news-augmented benchmark for electricity price forecasting with llm | arXiv: 2506.11050
- obclip oblivious cloud-device hybrid image generation with privacy preservation | arXiv: 2510.04153
- object-centric representation learning for enhanced 3d semantic scene graph pred | arXiv: 2510.04714
- obliviator reveals the cost of nonlinear guardedness in concept erasure | arXiv: 2603.07529
- ocn effectively utilizing higher-order common neighbors for better link predicti | arXiv: 2505.19719
- offline policy evaluation of multi-turn llm health coaching with real users | arXiv: 2510.17173
- omni-mol multitask molecular model for any-to-any modalities | arXiv: 2502.01074
- omnicast a masked latent diffusion model for weather forecasting across time sca | arXiv: 2510.18707
- omnidraft a cross-vocabulary online adaptive drafter for on-device speculative d | arXiv: 2507.02659
- omnifc rethinking federated clustering via lossless and secure distance reconstr | arXiv: 2505.13071
- omnigaze reward-inspired generalizable gaze estimation in the wild | arXiv: 2510.13660
- omnisegmentor a flexible multi-modal learning framework for semantic segmentatio | arXiv: 2509.15096
- OmniSync: Towards Universal Lip Synchronization via Diffusion Transformers | arXiv: 2505.21448
- omnivcus feedforward subject-driven video customization with multimodal control | arXiv: 2506.23361
- on a geometry of interbrain networks | arXiv: 2509.10650
- on agnostic pac learning in the small error regime | arXiv: 2502.09496
- on evaluating llm alignment by evaluating llms as judges | arXiv: 2511.20604
- On Extending Direct Preference Optimization to Accommodate Ties | arXiv: 2409.17431
- on geometry-enhanced parameter-efficient fine-tuning for 3d scene segmentation | arXiv: 2505.22444
- On Learning Verifiers and Implications to Chain-of-Thought Reasoning | arXiv: 2505.22650
- on minimax estimation of parameters in softmax-contaminated mixture of experts | arXiv: 2505.18455
- on optimal steering to achieve exact fairness | arXiv: 2509.15759
- on the creation of narrow ai hierarchy and nonlocality of neural network skills | arXiv: 2505.15811
- on the emergence of linear analogies in word embeddings | arXiv: 2505.18651
- on the empirical power of goodness-of-fit tests in watermark detection | arXiv: 2510.03944
- on the entropy calibration of language models | arXiv: 2511.11966
- On the Expressive Power of Mixture-of-Experts for Structured Complex Tasks | arXiv: 2505.24205
- on the global optimality of policy gradient methods in general utility reinforce | arXiv: 2410.04108
- on the hardness of approximating distributions with tractable probabilistic mode | arXiv: 2506.01281
- on the hardness of conditional independence testing in practice | arXiv: 2512.14000
- on the relation between rectified flows and optimal transport | arXiv: 2505.19712
- on the robustness of verbal confidence of llms in adversarial attacks | arXiv: 2507.06489
- on the role of hidden states of modern hopfield network in transformer | arXiv: 2511.20698
- on the sample complexity of differentially private policy optimization | arXiv: 2510.21060
- on the surprising effectiveness of large learning rates under standard width sca | arXiv: 2505.22491
- on the value of cross-modal misalignment in multimodal representation learning | arXiv: 2504.10143
- on topological descriptors for graph products | arXiv: 2511.08846
- on universality classes of equivariant networks | arXiv: 2506.02293
- once upon an input reasoning via per-instance program synthesis | arXiv: 2510.22849
- one filters all a generalist filter for state estimation | arXiv: 2509.20051
- one prompt fits all universal graph adaptation for pretrained models | arXiv: 2509.22416
- one sample is enough to make conformal prediction robust | arXiv: 2506.16553
- one small step with fingerprints one giant leap for de novo molecule generation | arXiv: 2508.04180
- one stone with two birds a null-text-null frequency-aware diffusion models for t | arXiv: 2510.08273
- one token embedding is enough to deadlock your large reasoning model | arXiv: 2510.15965
- one-shot transfer learning for nonlinear pdes with perturbative pinns | arXiv: 2511.11137
- one-step diffusion-based image compression with semantic distillation | arXiv: 2505.16687
- online feedback efficient active target discovery in partially observable enviro | arXiv: 2505.06535
- online mixture of experts no-regret learning for optimal collective decision-mak | arXiv: 2510.21788
- online optimization for offline safe reinforcement learning | arXiv: 2510.22027
- online segment any 3d thing as instance tracking | arXiv: 2512.07599
- Online Two-Stage Submodular Maximization | arXiv: 2510.19480
- onlinesplatter pose-free online 3d reconstruction for free-moving objects | arXiv: 2510.20605
- open vision reasoner transferring linguistic cognitive behavior for visual reaso | arXiv: 2507.05255
- open-insect benchmarking open-set recognition of novel species in biodiversity m | arXiv: 2503.01691
- open-world drone active tracking with goal-centered rewards | arXiv: 2412.00744
- openbox annotate any bounding boxes in 3d | arXiv: 2512.01352
- openhoi open-world hand-object interaction synthesis with multimodal large langu | arXiv: 2505.18947
- openlex3d a tiered evaluation benchmark for open-vocabulary 3d scene representat | arXiv: 2503.19764
- operation veja fixing fundamental concepts missing from modern roleplaying train | arXiv: 2601.06039
- opinion maximization in social networks by modifying internal opinions | arXiv: 2510.17226
- opinion towards unified expressive policy optimization for robust robot learning | arXiv: 2511.10087
- optimal adjustment sets for nonparametric estimation of weighted controlled dire | arXiv: 2506.09871
- optimal online change detection via random fourier features | arXiv: 2505.17789
- optimal rates for generalization of gradient descent for deep relu classificatio | arXiv: 2510.02779
- optimality and np-hardness of transformers in learning markovian dynamical funct | arXiv: 2510.18638
- optimism without regularization constant regret in zero-sum games | arXiv: 2506.16736
- optimistic online-to-batch conversions for accelerated convergence and universal | arXiv: 2511.06597
- optimized learned count-min sketch | arXiv: 2512.12252
- optimizing distributional geometry alignment with optimal transport for generati | arXiv: 2512.00308
- optimizing the unknown black box bayesian optimization with energy-based model a | arXiv: 2510.19530
- optitree hierarchical thoughts generation with tree search for llm optimization | arXiv: 2510.22192
- oracle-efficient combinatorial semi-bandits | arXiv: 2510.21431
- orbit -- open recommendation benchmark for reproducible research with hidden tes | arXiv: 2510.26095
- orbitzoo real orbital systems challenges for reinforcement learning | arXiv: 2504.04160
- orchestration framework for financial agents from algorithmic trading to agentic | arXiv: 2512.02227
- order-level attention similarity across language models a latent commonality | arXiv: 2511.05064
- ordinal label-distribution learning with constrained asymmetric priors for imbal | arXiv: 2509.26146
- ordshap feature position importance for sequential black-box models | arXiv: 2507.11855
- orient anything v2 unifying orientation and rotation understanding | arXiv: 2601.05573
- orientation matters making 3d generative models orientation-aligned | arXiv: 2506.08640
- orientation-anchored hyper-gaussian for 4d reconstruction from casual videos | arXiv: 2509.23492
- orochi versatile biomedical image processor | arXiv: 2509.22583
- orpo-distill mixed-policy preference optimization for cross-architecture llm dis | arXiv: 2509.25100
- orthograd improves neural calibration | arXiv: 2506.04487
- ortholoc uav 6-dof localization and calibration using orthographic geodata | arXiv: 2509.18350
- oryx a scalable sequence model for many-agent coordination in offline marl | arXiv: 2505.22151
- os-harm a benchmark for measuring safety of computer use agents | arXiv: 2506.14866
- osmgen highly controllable satellite image synthesis using openstreetmap data | arXiv: 2511.00345
- out of control -- why alignment needs formal control theory and an alignment con | arXiv: 2506.17846
- out-of-distribution generalisation is hard evidence from arc-like tasks | arXiv: 2505.09716
- over-squashing in spatiotemporal graph neural networks | arXiv: 2506.15507
- overcoming sparsity artifacts in crosscoders to interpret chat-tuning | arXiv: 2504.02922
- overfitting in adaptive robust optimization | arXiv: 2509.16451
- overlaybench a benchmark for layout-to-image generation with dense overlaps | arXiv: 2509.19282
- overt a benchmark for over-refusal evaluation on text-to-image models | arXiv: 2505.21347
- p-drum post-hoc descriptor-based residual uncertainty modeling for machine learn | arXiv: 2509.02927
- pac-bayes bounds for multivariate linear regression and linear autoencoders | arXiv: 2512.12905
- pairwise optimal transports for training all-to-all flow-based condition transfe | arXiv: 2504.03188
- pancakes consistent multi-protocol image segmentation across biomedical domains | arXiv: 2512.13534
- panda towards generalist video anomaly detection via agentic ai engineer | arXiv: 2509.26386
- pandapose 3d human pose lifting from a single image via propagating 2d pose prio | arXiv: 2602.01095
- panel-by-panel souls a performative workflow for expressive faces in ai-assisted | arXiv: 2511.16038
- panoptic captioning an equivalence bridge for image and text | arXiv: 2505.16334
- parallelization of non-linear state-space models scaling up liquid-resistance li | arXiv: 2505.21717
- parallelprompt extracting parallelism from large language model queries | arXiv: 2506.18728
- parameter efficient fine-tuning via explained variance adaptation | arXiv: 2410.07170
- parameter-free algorithms for the stochastically extended adversarial model | arXiv: 2510.04685
- parco parallel autoregressive models for multi-agent combinatorial optimization | arXiv: 2409.03811
- paretoq improving scaling laws in extremely low-bit llm quantization | arXiv: 2502.02631
- parrot a benchmark for evaluating llms in cross-system sql translation | arXiv: 2509.23338
- part-aware bottom-up group reasoning for fine-grained social interaction detecti | arXiv: 2511.03666
- partial information decomposition via normalizing flows in latent gaussian distr | arXiv: 2510.04417
- partnext a next-generation dataset for fine-grained and hierarchical 3d part und | arXiv: 2510.20155
- partonomy large multimodal models with part-level visual understanding | arXiv: 2505.20759
- pass path-selective state space model for event-based recognition | arXiv: 2409.16953
- path attention position encoding via accumulating householder transformations | arXiv: 2505.16381
- patientsim a persona-driven simulator for realistic doctor-patient interactions | arXiv: 2505.17818
- perceptually aligning representations of music via noise-augmented autoencoders | arXiv: 2511.05350
- performative validity of recourse explanations | arXiv: 2506.15366
- periodic skill discovery | arXiv: 2511.03187
- permllm learnable channel permutation for n:m sparse large language models | arXiv: 2510.10136
- personalized subgraph federated learning with differentiable auxiliary projectio | arXiv: 2505.23864
- perturb a model not an image towards robust privacy protection via anti-personal | arXiv: 2511.01307
- perturbation bounds for low-rank inverse approximations under noise | arXiv: 2510.25571
- pfδ a benchmark dataset for power flow under load generation and topology variat | arXiv: 2510.22048
- pharmacophore-guided generative design of novel drug-like molecules | arXiv: 2510.01480
- photography perspective composition towards aesthetic perspective recommendation | arXiv: 2505.20655
- PhysCtrl: Generative Physics for Controllable and Physics-Grounded Video Generation
- physics of language models part 4.1 architecture design and the magic of canon la | arXiv: 2512.17351
- physics-constrained flow matching sampling generative models with hard constrain | arXiv: 2506.04171
- physics-driven spatiotemporal modeling for ai-generated video detection | arXiv: 2510.08073
- physics-guided machine learning for uncertainty quantification in turbulence mod | arXiv: 2511.05633
- physics-informed neural networks with fourier features and attention-driven deco | arXiv: 2510.05385
- physics-informed reduced order modeling of time-dependent pdes via differentiabl | arXiv: 2505.14595
- physiowave a multi-scale wavelet-transformer for physiological signal representa | arXiv: 2506.10351
- physvlm-avr active visual reasoning for multimodal large language models in phys | arXiv: 2510.21111
- physx-3d physical-grounded 3d asset generation | arXiv: 2507.12465
- pid-controlled langevin dynamics for faster sampling of generative models | arXiv: 2511.12603
- pixel-perfect depth with semantics-prompted diffusion transformers | arXiv: 2510.07316
- pixfoundation 2.0 do video multi-modal llms use motion in visual grounding | arXiv: 2509.02807
- pixperfect seamless latent diffusion local editing with discriminative pixel-spa | arXiv: 2512.03247
- plana3r zero-shot metric planar 3d reconstruction via feed-forward planar splatt | arXiv: 2510.18714
- planargs high-fidelity indoor 3d gaussian splatting guided by vision-language pl | arXiv: 2510.23930
- planning without search refining frontier llms with offline goal-conditioned rl | arXiv: 2505.18098
- planu large language model reasoning through planning under uncertainty | arXiv: 2510.18442
- plasticity as the mirror of empowerment | arXiv: 2505.10361
- pluralistic behavior suite stress-testing multi-turn adherence to custom behavio | arXiv: 2511.05018
- pointmac meta-learned adaptation for robust test-time point cloud completion | arXiv: 2510.10365
- polar sparsity high throughput batched llm inferencing with scalable contextual | arXiv: 2505.14884
- polaris a high-contrast polarimetric imaging benchmark dataset for exoplanetary | arXiv: 2506.03511
- policy compatible skill incremental learning via lazy learning interface | arXiv: 2509.20612
- policy-as-prompt turning ai governance rules into guardrails for ai agents | arXiv: 2509.23994
- poly-guard massive multi-domain safety policy-grounded guardrail dataset | arXiv: 2506.19054
- polyjuice makes it real black-box universal red teaming for synthetic image dete | arXiv: 2509.15551
- polypose deformable 2d/3d registration via polyrigid transformations | arXiv: 2505.19256
- posecrafter extreme pose estimation with hybrid video synthesis | arXiv: 2510.19527
- position bridge the gaps between machine unlearning and ai regulation | arXiv: 2502.12430
- position paper if innovation in ai systematically violates fundamental rights is | arXiv: 2511.00027
- position the complexity of perfect ai alignment -- formalizing the rlhf trilemma | arXiv: 2511.19504
- position thematic analysis of unstructured clinical transcripts with large langu | arXiv: 2509.14597
- position there is no free bayesian uncertainty quantification | arXiv: 2506.03670
- position towards bidirectional human-ai alignment | arXiv: 2406.09264
- post hoc regression refinement via pairwise rankings | arXiv: 2508.16495
- posterior sampling by combining diffusion models with annealed langevin dynamics | arXiv: 2510.26324
- power ensemble aggregation for improved extreme event ai prediction | arXiv: 2511.11170
- power lines scaling laws for weight decay and batch size in llm pre-training | arXiv: 2505.13738
- ppg-distill efficient photoplethysmography signals analysis via foundation model | arXiv: 2509.19215
- practical bayes-optimal membership inference attacks | arXiv: 2505.24089
- practical do-shapley explanations with estimand-agnostic causal inference | arXiv: 2509.20211
- pragmatic heterogeneous collaborative perception via generative communication me | arXiv: 2510.19618
- Praxis-VLM: Vision-Grounded Decision Making via Text-Driven Reinforcement Learning | arXiv: 2503.16965
- Precise Information Control in Long-Form Text Generation | arXiv: 2506.06589
- preconditioned langevin dynamics with score-based generative models for infinite | arXiv: 2505.18276
- predict training data quality via its geometry in metric space | arXiv: 2510.15970
- predicting public health impacts of electricity usage | arXiv: 2511.22031
- predicting the performance of black-box llms through follow-up queries | arXiv: 2501.01558
- prediction-powered semi-supervised learning with online power tuning | arXiv: 2510.22586
- predictive feature caching for training-free acceleration of molecular geometry | arXiv: 2510.04646
- predictive preference learning from human interventions | arXiv: 2510.01545
- preference learning with lie detectors can induce honesty or evasion | arXiv: 2505.13787
- preference learning with response time robust losses and guarantees | arXiv: 2505.22820
- preference optimization by estimating the ratio of the data distribution | arXiv: 2505.19601
- preference-based reinforcement learning beyond pairwise comparisons benefits of | arXiv: 2510.18713
- preference-driven knowledge distillation for few-shot node classification | arXiv: 2510.10116
- PrefixKV: Adaptive Prefix KV Cache is What Vision Instruction-Following Models Need for Efficient Generation | arXiv: 2412.03409
- prefm online audio-visual event parsing via predictive future modeling | arXiv: 2505.23155
- prescribe predicting single-cell responses with bayesian estimation | arXiv: 2510.07964
- preserving llm capabilities through calibration data curation from analysis to o | arXiv: 2510.10618
- preserving task-relevant information under linear concept removal | arXiv: 2506.10703
- presto preimage-informed instruction optimization for prompting black-box llms | arXiv: 2510.25808
- pretraining a unified pddl domain from real-world demonstrations for generalizab | arXiv: 2507.21545
- preventing shortcuts in adapter training via providing the shortcuts | arXiv: 2510.20887
- principled data augmentation for learning to solve quadratic programming problem | arXiv: 2506.01728
- principled fine-tuning of llms from user-edits a medley of preference supervisio | arXiv: 2601.19055
- prior-guided flow matching for target-aware molecule design with learnable atom | arXiv: 2509.01486
- prioritizing perception-guided self-supervision a new paradigm for causal modeli | arXiv: 2511.08214
- private continual counting of unbounded streams | arXiv: 2506.15018
- private evolution converges | arXiv: 2506.08312
- private zeroth-order optimization with public data | arXiv: 2511.10859
- probabilistic reasoning with llms for k-anonymity estimation | arXiv: 2503.09674
- Probabilistic Token Alignment for Large Language Model Fusion | arXiv: 2509.17276
- probability calibration for precipitation nowcasting | arXiv: 2510.00594
- probing neural combinatorial optimization models | arXiv: 2510.22131
- problem-parameter-free decentralized bilevel optimization | arXiv: 2510.24288
- procurement auctions with predictions improved frugality for facility location | arXiv: 2512.09367
- product distribution learning with imperfect advice | arXiv: 2511.10366
- profit a specialized optimizer for deep fine tuning | arXiv: 2412.01930
- program synthesis via test-time transduction | arXiv: 2509.17393
- progressive inference-time annealing of diffusion models for sampling from boltz | arXiv: 2506.16471
- projecting assumptions the duality between sparse autoencoders and concept geome | arXiv: 2503.01822
- prompt tuning decision transformers with structured and scalable bandits | arXiv: 2502.04979
- prompt-based safety guidance is ineffective for unlearned text-to-image diffusio | arXiv: 2511.04834
- ProofSketch: Efficient Verified Reasoning for Large Language Models | arXiv: 2510.24811
- prospero active learning for robust protein design beyond wild-type neighborhood | arXiv: 2505.22494
- protein design with dynamic protein vocabulary | arXiv: 2505.18966
- provable ordering and continuity in vision-language pretraining for generalizabl | arXiv: 2502.01218
- Provable Scaling Laws for the Test-Time Compute of Large Language Models | arXiv: 2411.19477
- provable watermarking for data poisoning attacks | arXiv: 2510.09210
- provably efficient online rlhf with one-pass reward modeling | arXiv: 2502.07193
- psi-sampler initial particle sampling for smc-based inference-time reward alignm | arXiv: 2506.01320
- pubsub-vfl towards efficient two-party split learning in heterogeneous environme | arXiv: 2510.12494
- pulse practical evaluation scenarios for large multimodal model unlearning | arXiv: 2507.01271
- purifying shampoo investigating shampoo's heuristics by decomposing its precondit | arXiv: 2506.03595
- put cash on bandits a max k-armed problem for automated machine learning | arXiv: 2505.05226
- q-palette fractional-bit quantizers toward optimal bit allocation for efficient | arXiv: 2509.20214
- qimeng-neucomback self-evolving translation from ir to assembly code | arXiv: 2511.01183
- qimeng-salv signal-aware learning for verilog code generation | arXiv: 2510.19296
- qoq-med building multimodal clinical foundation models with domain-aware grpo tr | arXiv: 2506.00711
- qsharp provably optimal distributional rl for llm post-training | arXiv: 2502.20548
- qsvd efficient low-rank approximation for unified query-key-value weight compres | arXiv: 2510.16292
- quadenhancer leveraging quadratic transformations to enhance deep neural network | arXiv: 2510.03276
- quantifying and alleviating co-adaptation in sparse-view 3d gaussian splatting | arXiv: 2508.12720
- quantifying climate policy action and its links to development outcomes a cross- | arXiv: 2510.17425
- quantifying generalisation in imitation learning | arXiv: 2509.24784
- quantifying task-relevant representational similarity using decision variable co | arXiv: 2506.02164
- quantifying the role of openfold components in protein structure prediction | arXiv: 2511.14781
- quantitative convergence of trained single layer neural networks to gaussian pro | arXiv: 2509.24544
- quantization error propagation revisiting layer-wise post-training quantization | arXiv: 2504.09629
- quantum doubly stochastic transformers | arXiv: 2504.16275
- r2ec towards large recommender models with reasoning | arXiv: 2505.16994
- rad towards trustworthy retrieval-augmented multi-modal clinical diagnosis | arXiv: 2509.19980
- radar benchmarking language models on imperfect tabular data | arXiv: 2506.08249
- radial attention O(n log n) sparse attention with energy decay for long video gener | arXiv: 2506.19852
- radial neighborhood smoothing recommender system | arXiv: 2507.09952
- radzero similarity-based cross-attention for explainable vision-language alignme | arXiv: 2504.07416
- rag-igbench innovative evaluation for rag-based interleaved generation in open-d | arXiv: 2512.05119
- ram-w600 a multi-task wrist dataset and benchmark for rheumatoid arthritis | arXiv: 2507.05193
- random search neural networks for efficient and expressive graph learning | arXiv: 2510.22520
- rao-blackwellised reparameterisation gradients | arXiv: 2506.07687
- raptr radar-based 3d pose estimation using transformer | arXiv: 2511.08387
- rare text semantics were always there in your diffusion transformer | arXiv: 2510.03886
- rat bridging rnn efficiency and attention accuracy via chunk-based sequence mode | arXiv: 2507.04416
- raw2drive reinforcement learning with aligned world models for end-to-end autono | arXiv: 2505.16394
- raxss retrieval-augmented sparse sampling for explainable variable-length medica | arXiv: 2510.02936
- rccda adaptive model updates in the presence of concept drift under a constraine | arXiv: 2505.24149
- rd-agent-quant a multi-agent framework for data-centric factors and model joint | arXiv: 2505.15155
- rdb2g-bench a comprehensive benchmark for automatic graph modeling of relational | arXiv: 2506.01360
- rdd retrieval-based demonstration decomposer for planner alignment in long-horiz | arXiv: 2510.14968
- re-coding for uncertainties edge-awareness semantic concordance for resilient ev | arXiv: 2511.08269
- re-forc adaptive reward prediction for efficient chain-of-thought reasoning | arXiv: 2511.02130
- reading recognition in the wild | arXiv: 2505.24848
- real-time execution of action chunking flow policies | arXiv: 2506.07339
- real-world adverse weather image restoration via dual-level reinforcement learni | arXiv: 2511.05095
- Real-World Reinforcement Learning of Active Perception Behaviors | arXiv: 2512.01188
- RealMath: A Continuous Benchmark for Evaluating Language Models on Research-Level Mathematics | arXiv: 2505.12575
- ReasonFlux-PRM: Trajectory-Aware PRMs for Long Chain-of-Thought Reasoning in LLMs | arXiv: 2506.18896
- Reasoning by Superposition: A Theoretical Perspective on Chain of Continuous Thought | arXiv: 2505.12514
- reasoning compiler llm-guided optimizations for efficient model serving | arXiv: 2506.01374
- Reasoning Gym: Reasoning Environments for Reinforcement Learning with Verifiable Rewards | arXiv: 2505.24760
- reasoning meets representation envisioning neuro-symbolic wireless foundation mo | arXiv: 2511.16369
- reasoning models better express their confidence | arXiv: 2505.14489
- reasoning models hallucinate more factuality-aware reinforcement learning for la | arXiv: 2505.24630
- Reasoning With a Star: A Heliophysics Dataset and Benchmark for Agentic Scientific Reasoning | arXiv: 2511.20694
- recognition through reasoning reinforcing image geo-localization with large visi | arXiv: 2506.14674
- recon region-controllable data augmentation with rectification and alignment for | arXiv: 2510.15783
- recon-gs continuum-preserved gaussian streaming for fast and compact reconstruct | arXiv: 2509.24325
- reconstruct inpaint test-time finetune dynamic novel-view synthesis from monocul | arXiv: 2507.12646
- reconstructing the local density field with combined convolutional and point clo | arXiv: 2510.08573
- reconstruction and secrecy under approximate distance queries | arXiv: 2511.06461
- rectified point flow generic point cloud pose estimation | arXiv: 2506.05282
- rectified-cfg for flow based models | arXiv: 2510.07631
- rectifying shortcut behaviors in preference-based reward learning | arXiv: 2510.19050
- rectifying soft-label entangled bias in long-tailed dataset distillation | arXiv: 2511.17914
- recurrent attention-based token selection for efficient streaming video-llms | arXiv: 2510.17364
- recurrent memory for online interdomain gaussian processes | arXiv: 2502.08736
- recurrent self-attention dynamics an energy-agnostic perspective from jacobians | arXiv: 2505.19458
- redefining experts interpretable decomposition of language models for toxicity m | arXiv: 2509.16660
- redundancy-aware test-time graph out-of-distribution detection | arXiv: 2510.14562
- reflective translation improving low-resource machine translation via structured | arXiv: 2601.19871
- reflora refactored low-rank adaptation for efficient fine-tuning of large models | arXiv: 2505.18877
- regression trees know calculus | arXiv: 2405.13846
- regret lower bounds for decentralized multi-agent stochastic shortest path probl | arXiv: 2511.04594
- reinforcement learning finetunes small subnetworks in large language models | arXiv: 2505.11711
- Reinforcement Learning for Long-Horizon Multi-Turn Search Agents | arXiv: 2510.24126
- reinforcement learning teachers of test time scaling | arXiv: 2506.08388
- reinforcement learning with action chunking | arXiv: 2507.07969
- reinforcement learning with backtracking feedback | arXiv: 2602.08377
- reinforcing the diffusion chain of lateral thought with diffusion language model | arXiv: 2505.10446
- reject only critical tokens pivot-aware speculative decoding | arXiv: 2511.00351
- reliabilityrag effective and provably robust defense for rag-based web-search | arXiv: 2509.23519
- reliable active learning from unreliable labels via neural collapse geometry | arXiv: 2510.09740
- reliable decision making via calibration oriented retrieval augmented generation | arXiv: 2411.08891
- reliably detecting model failures in deployment without labels | arXiv: 2506.05047
- relieving the over-aggregating effect in graph transformers | arXiv: 2510.21267
- remasking discrete diffusion models with inference-time scaling | arXiv: 2503.00307
- remindrag low-cost llm-guided knowledge graph traversal for efficient rag | arXiv: 2510.13193
- reordering patches improves vision models | arXiv: 2505.23751
- rep resource-efficient prompting for rehearsal-free continual learning | arXiv: 2406.04772
- reparameterized llm training via orthogonal equivalence transformation | arXiv: 2506.08001
- repic reinforced post-training for personalizing multi-modal language models | arXiv: 2506.18369
- replaceme network simplification via depth pruning and transformer block lineari | arXiv: 2505.02819
- repldm reprogramming pretrained latent diffusion models for high-quality high-ef | arXiv: 2410.06055
- representation consistency for accurate and coherent llm answer aggregation | arXiv: 2506.21590
- resnets are deeper than you think | arXiv: 2506.14386
- resounding acoustic fields with reciprocity | arXiv: 2510.20602
- respodiff dual-module bottleneck transformation for responsible faithful t2i gen | arXiv: 2509.15257
- responserank data-efficient reward modeling through preference strength learning | arXiv: 2512.25023
- restoring pruned large language models via lost component compensation | arXiv: 2510.21834
- Rethinking Circuit Completeness in Language Models: AND, OR, and ADDER Gates | arXiv: 2505.10039
- rethinking direct preference optimization in diffusion models | arXiv: 2505.18736
- rethinking evaluation of infrared small target detection | arXiv: 2509.16888
- rethinking losses for diffusion bridge samplers | arXiv: 2506.10982
- Rethinking Multimodal Learning from the Perspective of Mitigating Classification Ability Disproportion | arXiv: 2502.20120
- rethinking neural combinatorial optimization for vehicle routing problems with d | arXiv: 2505.24627
- rethinking nighttime image deraining via learnable color space transformation | arXiv: 2510.17440
- Rethinking Optimal Verification Granularity for Compute-Efficient Test-Time Scaling | arXiv: 2505.11730
- rethinking pca through duality | arXiv: 2510.18130
- rethinking residual distribution in locate-then-edit model editing | arXiv: 2502.03748
- rethinking the simulation vs rendering dichotomy no free lunch in spatial world | arXiv: 2510.20835
- retrieval is not enough enhancing rag reasoning through test-time critique and o | arXiv: 2504.14858
- retrieval-augmented generation for reliable interpretation of radio regulations | arXiv: 2509.09651
- Retrospective In-Context Learning for Temporal Credit Assignment with Large Language Models
- retrosynthesis planning via worst-path policy optimisation in tree-structured md | arXiv: 2509.10504
- retrv-r1 a reasoning-driven mllm framework for universal and efficient multimoda | arXiv: 2510.02745
- revealing multimodal causality with large language models | arXiv: 2509.17784
- reverse engineering human preferences with reinforcement learning | arXiv: 2505.15795
- revisiting agnostic boosting | arXiv: 2503.09384
- revisiting bi-linear state transitions in recurrent neural networks | arXiv: 2505.21749
- revisiting end-to-end learning with slide-level supervision in computational pat | arXiv: 2506.02408
- revisiting generative infrared and visible image fusion based on human cognitive | arXiv: 2510.26268
- revisiting logit distributions for reliable out-of-distribution detection | arXiv: 2510.20134
- revisiting orbital minimization method for neural operator decomposition | arXiv: 2510.21952
- revisiting semi-supervised learning in the era of foundation models | arXiv: 2503.09707
- reward-aware proto-representations in reinforcement learning | arXiv: 2505.16217
- rewind-to-delete certified machine unlearning for nonconvex functions | arXiv: 2409.09778
- rgb-only supervised camera parameter optimization in dynamic scenes | arXiv: 2509.15123
- rgb-to-polarization estimation a new task and benchmark study | arXiv: 2505.13050
- riemannian consistency model | arXiv: 2510.00983
- riemannian flow matching for brain connectivity matrices via pullback geometry | arXiv: 2505.18193
- riganyface scaling neural facial mesh auto-rigging with unlabeled data | arXiv: 2511.18601
- risk management for mitigating benchmark failure modes benchrisk | arXiv: 2510.21460
- risk-averse constrained reinforcement learning with optimized certainty equivale | arXiv: 2510.20199
- risk-averse total-reward reinforcement learning | arXiv: 2506.21683
- rivermamba a state space model for global river discharge and flood forecasting | arXiv: 2505.22535
- RL Tango: Reinforcing Generator and Verifier Together for Language Reasoning | arXiv: 2505.15034
- rlgf reinforcement learning with geometric feedback for autonomous driving video | arXiv: 2509.16500
- rlvr-world training world models with reinforcement learning | arXiv: 2505.13934
- rlzero direct policy inference from language without in-domain supervision | arXiv: 2412.05718
- rmit-adms at the mmu-rag neurips 2025 competition | arXiv: 2602.20735
- rnns perform task computations by dynamically warping neural representations | arXiv: 2512.04310
- robocerebra a large-scale benchmark for long-horizon robotic manipulation evalua | arXiv: 2506.06677
- roborefer towards spatial referring with reasoning in vision-language models for | arXiv: 2506.04308
- robot-r1 reinforcement learning for enhanced embodied reasoning in robotics | arXiv: 2506.00070
- robust adversarial reinforcement learning in stochastic games via sequence model | arXiv: 2510.11877
- robust and diverse multi-agent learning via rational policy gradient | arXiv: 2511.09535
- robust ego-exo correspondence with long-term memory | arXiv: 2510.11417
- robust egocentric referring video object segmentation via dual-modal causal inte | arXiv: 2512.24323
- robust estimation under heterogeneous corruption rates | arXiv: 2508.15051
- robust federated finetuning of llms via alternating optimization of lora | arXiv: 2502.01755
- robust graph condensation via classification complexity mitigation | arXiv: 2510.26451
- robust hallucination detection in llms via adaptive token selection | arXiv: 2504.07863
- robust llm alignment via distributionally robust direct preference optimization | arXiv: 2502.01930
- robust neural rendering in the wild with asymmetric dual 3d gaussian splatting | arXiv: 2506.03538
- robust or suggestible exploring non-clinical induction in llm drug-safety decisi | arXiv: 2510.13931
- robust sampling for active statistical inference | arXiv: 2511.08991
- robustifying learning-augmented caching efficiently without compromising 1-consi | arXiv: 2507.16242
- robustmerge parameter-efficient model merging for mllms with direction robustnes | arXiv: 2502.17159
- robustness in both domains clip needs a robust text encoder | arXiv: 2506.03355
- rogr relightable 3d objects using generative relighting | arXiv: 2510.03163
- roirl efficient self-supervised reasoning with offline iterative reinforcement l | arXiv: 2510.02892
- roma scaling up mamba-based foundation models for remote sensing | arXiv: 2503.10392
- root cause analysis of outliers with missing structural knowledge | arXiv: 2406.05014
- rotary masked autoencoders are versatile learners | arXiv: 2505.20535
- router-r1 teaching llms multi-round routing and aggregation via reinforcement le | arXiv: 2506.09033
- rscc a large-scale remote sensing change caption dataset for disaster events | arXiv: 2509.01907
- rtv-bench benchmarking mllm continuous perception understanding and reasoning th | arXiv: 2505.02064
- s2m-former spiking symmetric mixing branchformer for brain auditory attention de | arXiv: 2508.05164
- s2q-vdit accurate quantized video diffusion transformer with salient data and sp | arXiv: 2508.04016
- sad neural networks divergent gradient flows and asymptotic optimality via o-min | arXiv: 2505.09572
- SAEMark: Steering Personalized Multilingual LLM Watermarks with Sparse Autoencoders | arXiv: 2508.08211
- safe and stable control via lyapunov-guided diffusion models | arXiv: 2509.25375
- safe multitask failure detection for vision-language-action models | arXiv: 2506.09937
- Safe-Sora: Safe Text-to-Video Generation via Graphical Watermarking | arXiv: 2505.12667
- safepath preventing harmful reasoning in chain-of-thought via early alignment | arXiv: 2505.14667
- safeptr token-level jailbreak defense in multimodal llms via prune-then-restore | arXiv: 2507.01513
- safevla towards safety alignment of vision-language-action model via constrained | arXiv: 2503.03480
- safire saccade-fixation reiteration with mamba for referring image segmentation | arXiv: 2510.10160
- sam-r1 leveraging sam for reward feedback in multimodal segmentation via reinfor | arXiv: 2505.22596
- sama towards multi-turn referential grounded video chat with large language mode | arXiv: 2505.18812
- sample complexity of distributionally robust average-reward reinforcement learni | arXiv: 2505.10007
- sample-adaptivity tradeoff in on-demand sampling | arXiv: 2511.15507
- sample-efficient tabular self-play for offline robust reinforcement learning | arXiv: 2512.00352
- Sampling-Efficient Test-Time Scaling: Self-Estimating the Best-of-N Sampling in Early Decoding | arXiv: 2503.01422
- sand-math using llms to generate novel difficult and useful mathematics question | arXiv: 2507.20527
- sansa unleashing the hidden semantics in sam2 for few-shot segmentation | arXiv: 2505.21795
- sao-instruct free-form audio editing using natural language instructions | arXiv: 2510.22795
- saying the unsaid revealing the hidden language of multimodal systems through te | arXiv: 2511.10690
- scaffold diffusion sparse multi-category voxel structure generation with discret | arXiv: 2509.00062
- Scalable Best-of-N Selection for Large Language Models via Self-Certainty | arXiv: 2502.18581
- scalable diffusion transformer for conditional 4d fmri synthesis | arXiv: 2511.22870
- scalable explainable and provably robust anomaly detection with one-step flow ma | arXiv: 2510.18328
- scalable exploration via ensemble | arXiv: 2407.13195
- scalable fingerprinting of large language models | arXiv: 2502.07760
- scalable gpu-accelerated euler characteristic curves optimization and differenti | arXiv: 2510.20271
- scalable inference of functional neural connectivity at submillisecond timescale | arXiv: 2510.20966
- scalable neural incentive design with parameterized mean-field approximation | arXiv: 2510.21442
- scalable policy-based rl algorithms for pomdps | arXiv: 2510.06540
- scalable signature kernel computations for long time series via local neumann se | arXiv: 2502.20392
- Scale-invariant Attention | arXiv: 2505.17083
- scalediff higher-resolution image synthesis via efficient and model-agnostic dif | arXiv: 2510.25818
- scaling can lead to compositional generalization | arXiv: 2507.07207
- scaling diffusion transformers efficiently via μp | arXiv: 2505.15270
- Scaling Embedding Layers in Language Models | arXiv: 2502.01637
- scaling image geo-localization to continent level | arXiv: 2510.26795
- scaling language-centric omnimodal representation learning | arXiv: 2510.11693
- scaling laws and pathologies of single-layer pinns network width and pde nonline | arXiv: 2603.12556
- scaling offline rl via efficient and expressive shortcut models | arXiv: 2505.22866
- scaling rl to long videos | arXiv: 2507.07966
- scaling up active testing to large language models | arXiv: 2508.09093
- scan self-denoising monte carlo annotation for robust process reward learning | arXiv: 2509.16548
- scatterad temporal-topological scattering mechanism for time series anomaly dete | arXiv: 2509.24414
- scene-aware urban design a human-ai recommendation framework using co-occurrence | arXiv: 2511.06201
- scenedecorator towards scene-oriented story generation with scene planning and s | arXiv: 2510.22994
- scenedesigner controllable multi-object image generation with 9-dof pose manipul | arXiv: 2511.16666
- sceneforge enhancing 3d-text alignment with structured scene compositions | arXiv: 2509.15693
- sceneweaver all-in-one 3d scene synthesis with an extensible and self-reflective | arXiv: 2509.20414
- schrödinger bridge matching for tree-structured costs and entropic wasserstein b | arXiv: 2506.17197
- sciarena an open evaluation platform for non-verifiable scientific literature-gr | arXiv: 2507.01001
- scmrdr a scalable and flexible framework for unpaired single-cell multi-omics da | arXiv: 2510.24987
- scope saliency-coverage oriented token pruning for efficient multimodel llms | arXiv: 2510.24214
- score-informed neural operator for enhancing ordering-based causal discovery | arXiv: 2508.12650
- scPilot: Large Language Model Reasoning Toward Automated Single-Cell Analysis and Discovery | arXiv: 2602.11609
- scsplit bringing severity cognizance to image decomposition in fluorescence micr | arXiv: 2503.22983
- sd-vlm spatial measuring and understanding with depth-encoded vision-language mo | arXiv: 2509.17664
- sdtagnet leveraging text-annotated navigation maps for online hd map constructio | arXiv: 2506.08997
- seal semantic-aware hierarchical learning for generalized category discovery | arXiv: 2510.18740
- searching latent program spaces | arXiv: 2411.08706
- seca semantically equivalent and coherent attacks for eliciting llm hallucinatio | arXiv: 2510.04398
- secon-rag a two-stage semantic filtering and conflict-free framework for trustwo | arXiv: 2510.09710
- second-order optimization under heavy-tailed noise hessian clipping and sample c | arXiv: 2510.10690
- securing the language of life inheritable watermarks from dna language models to | arXiv: 2509.18207
- seeing beyond the scene analyzing and mitigating background bias in action recog | arXiv: 2512.17953
- seeing is believing mitigating ocr hallucinations in multimodal large language m | arXiv: 2506.20168
- seeing sound hearing sight uncovering modality bias and conflict of ai models in | arXiv: 2505.11217
- seeing the arrow of time in large multimodal models | arXiv: 2506.03340
- seeing the wind from a falling leaf | arXiv: 2512.00762
- seetrek training-free spatial prompting for multimodal large language model | arXiv: 2509.16087
- seg-var image segmentation with visual autoregressive modeling | arXiv: 2511.12594
- seg4diff unveiling open-vocabulary segmentation in text-to-image diffusion trans | arXiv: 2509.18096
- segmast3r geometry grounded segment matching | arXiv: 2510.05051
- Segment Policy Optimization: Effective Segment-Level Credit Assignment in RL for Large Language Models | arXiv: 2505.23564
- segment then splat unified 3d open-vocabulary segmentation via gaussian splattin | arXiv: 2503.22204
- segment-factorized full-song generation on symbolic piano music | arXiv: 2510.05881
- selective learning for deep time series forecasting | arXiv: 2510.25207
- self forcing bridging the train-test gap in autoregressive video diffusion | arXiv: 2506.08009
- self iterative label refinement via robust unlabeled learning | arXiv: 2502.12565
- self-alignment of large video language models with refined regularized preferenc | arXiv: 2504.12083
- self-improving embodied foundation models | arXiv: 2509.15155
- self-refining language model anonymizers via adversarial distillation | arXiv: 2506.01420
- self-supervised contrastive learning is approximately supervised contrastive lea | arXiv: 2506.04411
- self-supervised discovery of neural circuits in spatially patterned neural respo | arXiv: 2509.17174
- self-supervised learning of echocardiographic video representations via online c | arXiv: 2506.11777
- self-supervised learning of graph representations for network intrusion detectio | arXiv: 2509.16625
- self-supervised learning via flow-guided neural operator on time-series data | arXiv: 2602.12267
- self-supervised synthetic pretraining for inference of stellar mass embedded in | arXiv: 2510.24159
- semantic and visual crop-guided diffusion models for heterogeneous tissue synthe | arXiv: 2509.17847
- semantic glitch agency and artistry in an autonomous pixel cloud | arXiv: 2511.16048
- semantic retrieval augmented contrastive learning for sequential recommendation | arXiv: 2503.04162
- semantic surgery zero-shot concept erasure in diffusion models | arXiv: 2510.22851
- semi-infinite nonconvex constrained min-max optimization | arXiv: 2510.12007
- semi-supervised graph anomaly detection via robust homophily learning | arXiv: 2506.15448
- semi-supervised regression with heteroscedastic pseudo-labels | arXiv: 2510.15266
- sempo lightweight foundation models for time series forecasting | arXiv: 2510.19710
- sensorium arc ai agent system for oceanic data exploration and interactive eco-a | arXiv: 2511.15997
- sequential attention-based sampling for histopathological analysis | arXiv: 2507.05077
- sequential monte carlo for policy optimization in continuous pomdps | arXiv: 2505.16732
- sequential multi-agent dynamic algorithm configuration | arXiv: 2510.23535
- sequentially auditing differential privacy | arXiv: 2509.07055
- set smoothness unlocks clarke hyper-stationarity in bilevel optimization | arXiv: 2506.04587
- shallow diffuse robust and invisible watermarking through low-dimensional subspa | arXiv: 2410.21088
- shallow flow matching for coarse-to-fine text-to-speech synthesis | arXiv: 2505.12226
- shallow robustness deep vulnerabilities multi-turn evaluation of medical llms | arXiv: 2510.12255
- shap meets tensor networks provably tractable explanations with parallelism | arXiv: 2510.21599
- shap values via sparse fourier representation | arXiv: 2410.06300
- shapecraft llm agents for structured textured and interactive 3d modeling | arXiv: 2510.17603
- sharper convergence rates for nonconvex optimisation via reduction mappings | arXiv: 2506.08428
- sharpness-aware minimization with z-score gradient filtering | arXiv: 2505.02369
- sheaf cohomology of linear predictive coding networks | arXiv: 2511.11092
- Sherlock: Self-Correcting Reasoning in Vision-Language Models | arXiv: 2505.22651
- shift before you learn enabling low-rank representations in reinforcement learni | arXiv: 2509.05193
- Short-length Adversarial Training Helps LLMs Defend Long-length Jailbreak Attacks | arXiv: 2502.04204
- shortcutting pre-trained flow matching diffusion models is almost free lunch | arXiv: 2510.17858
- show-o2 improved native unified multimodal models | arXiv: 2506.15564
- sign-in to the lottery reparameterizing sparse training from scratch | arXiv: 2504.12801
- silent tokens loud effects padding in llms | arXiv: 2510.01238
- simple and efficient heterogeneous temporal graph neural network | arXiv: 2510.18467
- Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning | arXiv: 2410.07163
- simu selective influence machine unlearning | arXiv: 2510.07822
- simulating society requires simulating thought | arXiv: 2506.06958
- simulation-based inference for neutrino interaction model parameter tuning | arXiv: 2510.07454
- simulmega moe routers are advanced policy makers for simultaneous speech transla | arXiv: 2509.01200
- simultaneous swap regret minimization via kl-calibration | arXiv: 2502.16387
- simworld-robotics synthesizing photorealistic and dynamic urban environments for | arXiv: 2512.10046
- single-teacher view augmentation boosting knowledge distillation via angular div | arXiv: 2510.22480
- singref6d monocular novel object pose estimation with a single rgb reference | arXiv: 2509.21927
- sitcom scaling inference-time compute for vlas | arXiv: 2510.04041
- situat3dchange situated 3d change understanding dataset for multimodal large lan | arXiv: 2510.11509
- sketch-augmented features improve learning long-range dependencies in graph neur | arXiv: 2511.03824
- skrull towards efficient long context fine-tuning through dynamic data schedulin | arXiv: 2505.19609
- skyladder better and faster pretraining via context window scheduling | arXiv: 2503.15450
- slaying towards queer language processing | arXiv: 2509.17449
- slimmable nam neural amp models with adjustable runtime computational cost | arXiv: 2511.07470
- Sloth: Scaling Laws for LLM Skills to Predict Multi-Benchmark Performance Across Families | arXiv: 2412.06540
- small batch size training for language models when vanilla sgd works and why gra | arXiv: 2507.07101
- small language models as compiler experts auto-parallelization for heterogeneous | arXiv: 2512.19250
- smaller models smarter rewards a two-sided approach to process and outcome rewar | arXiv: 2510.23083
- smartwilds multimodal wildlife monitoring dataset | arXiv: 2509.18894
- smmile an expert-driven benchmark for multimodal medical in-context learning | arXiv: 2506.21355
- smooth regularization for efficient video recognition | arXiv: 2511.20928
- smore structural mixture of residual experts for parameter-efficient llm fine-tu | arXiv: 2504.06426
- smrs advocating a unified reporting standard for surrogate models in the artific | arXiv: 2502.06753
- sofar language-grounded orientation bridges spatial reasoning and object manipul | arXiv: 2502.13143
- soft task-aware routing of experts for equivariant representation learning | arXiv: 2510.27222
- solar-geco perovskite solar cell property prediction with geometric-aware co-att | arXiv: 2511.19263
- solverllm leveraging test-time scaling for optimization problem via llm-guided s | arXiv: 2510.16916
- solving continuous mean field games deep reinforcement learning for non-stationa | arXiv: 2510.22158
- solving inequality proofs with large language models | arXiv: 2506.07927
- solving neural min-max games the role of architecture initialization dynamics | arXiv: 2512.00389
- some optimizers are more equal understanding the role of optimizers in group fai | arXiv: 2504.14882
- sound logical explanations for mean aggregation graph neural networks | arXiv: 2511.11593
- space noise contrastive estimation stabilizes self-play fine-tuning for large la | arXiv: 2512.07175
- space spike-aware consistency enhancement for test-time adaptation in spiking ne | arXiv: 2504.02298
- spark transformer reactivating sparsity in ffn and attention | arXiv: 2506.06644
- Sparse Autoencoders Learn Monosemantic Features in Vision-Language Models | arXiv: 2504.02821
- sparse mezo less parameters for better performance in zeroth-order llm fine-tuni | arXiv: 2402.15751
- sparsedit token sparsification for efficient diffusion transformer | arXiv: 2412.06028
- SPARTA Alignment: Collectively Aligning Multiple Language Models through Combat | arXiv: 2506.04721
- spatial understanding from videos structured prompts meet simulation data | arXiv: 2506.03642
- spatial-aware decision-making with ring attractors in reinforcement learning sys | arXiv: 2410.03119
- spatialthinker reinforcing 3d reasoning in multimodal llms via spatial rewards | arXiv: 2511.07403
- spatialtracegen high-fidelity traces for efficient vlm spatial reasoning distill | arXiv: 2511.00054
- spatio-temporal directed graph learning for account takeover fraud detection | arXiv: 2509.20339
- spatio-temporal graphs beyond grids benchmark for maritime anomaly detection | arXiv: 2512.20086
- specattn speculating sparse attention | arXiv: 2510.27641
- specialization after generalization towards understanding test-time training in | arXiv: 2509.24510
- specmer fast protein generation with k-mer guided speculative decoding | arXiv: 2509.21689
- spectral conditioning of attention improves transformer performance | arXiv: 2603.07162
- spectral perturbation bounds for low-rank approximation with applications to pri | arXiv: 2510.25670
- speculate deep and accurate lossless and training-free acceleration for offloade | arXiv: 2509.18344
- spend wisely maximizing post-training gains in iterative synthetic data bootstra | arXiv: 2501.18962
- spex a spectral approach to explainable clustering | arXiv: 2511.00885
- spiking brain compression post-training second-order compression for spiking neu | arXiv: 2506.03996
- spiking meets attention efficient remote sensing image super-resolution with att | arXiv: 2503.04223
- spiral semantic-aware progressive lidar scene generation and understanding | arXiv: 2505.22643
- split gibbs discrete diffusion posterior sampling | arXiv: 2503.01161
- splitflow flow decomposition for inversion-free text-to-image editing | arXiv: 2510.25970
- spot-trip dual-preference driven out-of-town trip recommendation | arXiv: 2506.01705
- sprint enabling interleaved planning and parallelized execution in reasoning mod | arXiv: 2506.05745
- spurious-aware prototype refinement for reliable out-of-distribution detection | arXiv: 2506.23881
- sql-of-thought multi-agentic text-to-sql with guided error correction | arXiv: 2509.00581
- sql-r1 training natural language to sql reasoning model by reinforcement learnin | arXiv: 2504.08600
- sqs enhancing sparse perception models via query-based splatting in autonomous d | arXiv: 2509.16588
- SRPO: Enhancing Multimodal LLM Reasoning via Reflection-Aware Reinforcement Learning | arXiv: 2506.01713
- srsr enhancing semantic accuracy in real-world image super-resolution with spati | arXiv: 2510.22534
- ssr enhancing depth perception in vision-language models via rationale-guided sp | arXiv: 2505.12448
- sstag structure-aware self-supervised learning method for text-attributed graphs | arXiv: 2510.01248
- stable cinemetrics structured taxonomy and evaluation for professional video gen | arXiv: 2509.26555
- stable coresets via posterior sampling aligning induced and full loss landscapes | arXiv: 2511.17399
- stable matching with ties approximation ratios and learning | arXiv: 2411.03270
- stable minima of relu neural networks suffer from the curse of dimensionality th | arXiv: 2506.20779
- stableguard towards unified copyright protection and tamper localization in late | arXiv: 2509.17993
- stair addressing stage misalignment through temporal-aligned preference reinforc | arXiv: 2509.23802
- stamp spatial-temporal adapter with multi-head pooling | arXiv: 2511.10848
- starc-9 a large-scale dataset for multi-class tissue classification for crc hist | arXiv: 2511.00383
- starformer semi-supervised task-informed representation learning via dynamic att | arXiv: 2504.10097
- state-covering trajectory stitching for diffusion planners | arXiv: 2506.00895
- statistical guarantees for high-dimensional stochastic gradient descent | arXiv: 2510.12013
- statistical inference for gradient boosting regression | arXiv: 2509.23127
- statistical inference under performativity | arXiv: 2505.18493
- stead robust provably secure linguistic steganography with diffusion language mo | arXiv: 2601.14778
- stealthy yet effective distribution-preserving backdoor attacks on graph classif | arXiv: 2509.26032
- steering generative models with experimental data for protein fitness optimizati | arXiv: 2505.15093
- steering information utility in key-value memory for language model post-trainin | arXiv: 2507.05158
- steering when necessary flexible steering large language models with backtrackin | arXiv: 2508.17621
- stella subspace learning in low-rank adaptation using stiefel manifold | arXiv: 2510.01938
- step a unified spiking transformer evaluation platform for fair and reproducible | arXiv: 2505.11151
- stochastic momentum methods for non-smooth non-convex finite-sum coupled composi | arXiv: 2506.02504
- stochastic regret guarantees for online zeroth- and first-order bilevel optimiza | arXiv: 2511.01126
- stop ddos attacking the research community with ai-generated survey papers | arXiv: 2510.09686
- Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning | arXiv: 2504.15275
- strap spatio-temporal pattern retrieval for out-of-distribution generalization | arXiv: 2505.19547
- strassen attention split vc dimension and compositionality in transformers | arXiv: 2501.19215
- strategic costs of perceived bias in fair selection | arXiv: 2510.20606
- strategyproof reinforcement learning from human feedback | arXiv: 2503.09561
- streambridge turning your offline video large language model into a proactive st | arXiv: 2505.05467
- streamforest efficient online video understanding with persistent event memory | arXiv: 2509.24871
- streaming federated learning with markovian data | arXiv: 2503.18807
- struct2d a perception-guided framework for spatial reasoning in mllms | arXiv: 2506.04220
- structural information-based hierarchical diffusion for offline reinforcement le | arXiv: 2509.21942
- structure-aware fusion with progressive injection for multimodal molecular repre | arXiv: 2510.23640
- structure-aware spectral sparsification via uniform edge sampling | arXiv: 2510.12669
- Structured Reinforcement Learning for Combinatorial Decision-Making | arXiv: 2505.19053
- structured sparse transition matrices to enable state tracking in state-space mo | arXiv: 2509.22284
- structured temporal causality for interpretable multivariate time series anomaly | arXiv: 2510.16511
- styl3r instant 3d stylized reconstruction for arbitrary scenes and styles | arXiv: 2505.21060
- succeed or learn slowly sample efficient off-policy reinforcement learning for m | arXiv: 2509.01720
- SuffixDecoding: Extreme Speculative Decoding for Emerging AI Applications | arXiv: 2411.04975
- superclip clip with simple classification supervision | arXiv: 2512.14480
- superposition yields robust neural scaling | arXiv: 2505.10465
- surf2ct cascaded 3d flow matching models for torso 3d ct synthesis from skin sur | arXiv: 2505.22511
- suturebot a precision framework benchmark for autonomous end-to-end suturing | arXiv: 2510.20965
- SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents | arXiv: 2505.20411
- SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution | arXiv: 2502.18449
- swe-sql illuminating llm pathways to solve user sql issues in real-world applica | arXiv: 2506.18951
- switchable token-specific codebook quantization for face image compression | arXiv: 2510.22943
- symbolic regression is all you need from simulations to scaling laws in binary n | arXiv: 2511.08784
- symphony synergistic multi-agent planning with heterogeneous language model asse | arXiv: 2601.22623
- symrtlo enhancing rtl code optimization with llms and neuron-inspired symbolic r | arXiv: 2504.10369
- synbrain enhancing visual-to-fmri synthesis via probabilistic representation lea | arXiv: 2508.10298
- synchuman synchronizing 2d and 3d generative models for single-view human recons | arXiv: 2510.07723
- synergy between the strong and the weak spiking neural networks are inherently s | arXiv: 2510.07924
- synergy over discrepancy a partition-based approach to multi-domain llm fine-tun | arXiv: 2511.07198
- Synthesizing Performance Constraints for Evaluating and Improving Code Efficiency | arXiv: 2505.23471
- synthetic series-symbol data generation for time series foundation models | arXiv: 2510.08445
- syntsbench rethinking temporal pattern learning in deep learning models for time | arXiv: 2510.20273
- system prompt optimization with meta-learning | arXiv: 2505.09666
- system-embedded diffusion bridge models | arXiv: 2506.23726
- systematic reward gap optimization for mitigating vlm hallucinations | arXiv: 2411.17265
- systematizing llm persona design a four-quadrant technical taxonomy for ai compa | arXiv: 2511.02979
- t-regs minimum spanning tree regularization for self-supervised learning | arXiv: 2510.23484
- t-rex task-adaptive spatial representation extraction for robotic manipulation w | arXiv: 2506.19498
- t-shirt token-selective hierarchical data selection for instruction tuning | arXiv: 2506.01317
- t1 a tool-oriented conversational dataset for multi-turn agentic planning | arXiv: 2505.16986
- t2smark balancing robustness and diversity in noise-as-watermark for diffusion m | arXiv: 2510.22366
- tabarena a living benchmark for machine learning on tabular data | arXiv: 2506.16791
- table as a modality for large language models | arXiv: 2512.00947
- table2latex-rl high-fidelity latex code generation from table images via reinfor | arXiv: 2509.17589
- tabrag improving tabular document question answering for retrieval augmented gen | arXiv: 2511.06582
- tabstar a tabular foundation model for tabular data with text fields | arXiv: 2505.18125
- tai3 testing agent integrity in interpreting user intent | arXiv: 2506.07524
- talk2event grounded understanding of dynamic scenes from event cameras | arXiv: 2507.17664
- tami taming heterogeneity in temporal interactions for temporal graph link predi | arXiv: 2510.23577
- tangledfeatures robust feature selection in highly correlated spaces | arXiv: 2510.15005
- tapip3d tracking any point in persistent 3d geometry | arXiv: 2504.14717
- tapvid-360 tracking any point in 360 from narrow field of view video | arXiv: 2511.21946
- target speaker extraction through comparing noisy positive and negative audio en | arXiv: 2502.16611
- task-optimized convolutional recurrent networks align with tactile processing in | arXiv: 2505.18361
- taught well learned ill towards distillation-conditional backdoor attack | arXiv: 2509.23871
- teaching language models to evolve with users dynamic profile modeling for perso | arXiv: 2505.15456
- teaming llms to detect and mitigate hallucinations | arXiv: 2510.19507
- temporal smoothness-aware rate-distortion optimized 4d gaussian splatting | arXiv: 2507.17336
- temporal-difference variational continual learning | arXiv: 2410.07812
- TempSamp-R1: Effective Temporal Sampling with Reinforcement Fine-Tuning for Video LLMs
- Tensor Product Attention Is All You Need | arXiv: 2501.06425
- tensorrl-qas reinforcement learning with tensor networks for improved quantum ar | arXiv: 2505.09371
- test-time adaptation by causal trimming | arXiv: 2510.11133
- test-time adaptive object detection with foundation model | arXiv: 2510.25175
- test-time spectrum-aware latent steering for zero-shot generalization in vision- | arXiv: 2511.09809
- text to robotic assembly of multi component objects using 3d generative ai and v | arXiv: 2511.02162
- text to sketch generation with multi-styles | arXiv: 2511.04123
- text-to-code generation for modular building layouts in building information mod | arXiv: 2509.23713
- text-to-image models leave identifiable signatures implications for leaderboard | arXiv: 2510.06525
- avrobustbench benchmarking the robustness of audio-visual recognition mode | arXiv: 2506.00358
- The Atlas of In-Context Learning: How Attention Heads Shape In-Context Retrieval Augmentation | arXiv: 2505.15807
- the biased oracle assessing llms understandability and empathy in medical diagno | arXiv: 2511.00924
- the boundaries of fair ai in medical image prognosis a causal perspective | arXiv: 2510.08840
- the burden of interactive alignment with inconsistent preferences | arXiv: 2510.16368
- the coming crisis of multi-agent misalignment ai alignment must be a dynamic and | arXiv: 2506.01080
- the complexity of finding local optima in contrastive learning | arXiv: 2509.16898
- the computational complexity of counting linear regions in relu neural networks | arXiv: 2505.16716
- the cost of robustness tighter bounds on parameter complexity for robust memoriz | arXiv: 2510.24643
- the curse of depth in large language models | arXiv: 2502.05795
- the effect of optimal self-distillation in noisy gaussian mixture model | arXiv: 2501.16226
- the emergence of sparse attention impact of data distribution and benefits of re | arXiv: 2505.17863
- the geometry of cortical computation manifold disentanglement and predictive dyn | arXiv: 2508.02995
- the graphon limit hypothesis understanding neural network pruning via infinite w | arXiv: 2510.17515
- the hawthorne effect in reasoning models evaluating and steering test awareness | arXiv: 2505.14617
- the human brain as a combinatorial complex | arXiv: 2511.20692
- The Illusion of Progress? A Critical Look at Test-Time Adaptation for Vision-Language Models | arXiv: 2506.24000
- the illusion of thinking understanding the strengths and limitations of reasonin | arXiv: 2506.06941
- the impact of quantization on large reasoning model reinforcement learning | arXiv: 2511.15694
- the impact of scaling training data on adversarial robustness | arXiv: 2509.25927
- the implicit bias of structured state space models can be poisoned with clean la | arXiv: 2410.10473
- the last vote a multi-stakeholder framework for language model governance | arXiv: 2511.13432
- the lighthouse of language enhancing llm agents via critique-guided improvement | arXiv: 2503.16024
- the more you automate the less you see hidden pitfalls of ai scientist systems | arXiv: 2509.08713
- The Narrow Gate: Localized Image-Text Communication in Native Multimodal Models | arXiv: 2412.06646
- the non-linear representation dilemma is causal abstraction enough for mechanist | arXiv: 2507.08802
- the ouroboros of benchmarking reasoning evaluation in an era of saturation | arXiv: 2511.01365
- the parameterized complexity of computing the vc-dimension | arXiv: 2510.17451
- the pareto frontier of resilient jet tagging | arXiv: 2509.19431
- the path not taken rlvr provably learns off the principals | arXiv: 2511.08567
- the persistence of neural collapse despite low-rank bias | arXiv: 2410.23169
- the physical basis of prediction world model formation in neural organoids via a | arXiv: 2509.04633
- the platonic universe do foundation models see the same sky | arXiv: 2509.19453
- the pokeagent challenge competitive and long-context learning at scale | arXiv: 2603.15563
- the primacy of magnitude in low-rank adaptation | arXiv: 2507.06558
- the rich and the simple on the implicit bias of adam and sgd | arXiv: 2505.24022
- the rise of parameter specialization for knowledge storage in large language mod | arXiv: 2505.17260
- the structural complexity of matrix-vector multiplication | arXiv: 2502.21240
- the structure of relation decoding linear operators in large language models | arXiv: 2510.26543
- The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning | arXiv: 2506.01347
- the transparent earth a multimodal foundation model for the earths subsurface | arXiv: 2509.02783
- the trilemma of truth in large language models | arXiv: 2506.23921
- the underappreciated power of vision models for graph structural understanding | arXiv: 2510.24788
- the unseen threat residual knowledge in machine unlearning under perturbed sampl | arXiv: 2601.22359
- The Virtues of Brevity: Avoid Overthinking in Parallel Test-Time Reasoning | arXiv: 2510.21067
- the world is bigger a computationally-embedded perspective on the big world hypo | arXiv: 2512.23419
- thermalgen style-disentangled flow-based generative models for rgb-to-thermal im | arXiv: 2509.24878
- think before recommendation autonomous reasoning-enhanced recommender | arXiv: 2510.23077
- think or not think a study of explicit thinking in rule-based visual reinforceme | arXiv: 2503.16188
- think straight stop smart structured reasoning for efficient multi-hop rag | arXiv: 2510.19171
- thinkact vision-language-action reasoning via reinforced visual latent planning | arXiv: 2507.16815
- thinksound chain-of-thought reasoning in multimodal large language models for au | arXiv: 2506.21448
- thompson sampling for multi-objective linear contextual bandit | arXiv: 2512.00930
- thompson sampling in function spaces via neural operators | arXiv: 2506.21894
- thought communication in multiagent collaboration | arXiv: 2510.20733
- through the river understanding the benefit of schedule-free methods for languag | arXiv: 2507.09846
- thunder tile-level histopathology image understanding benchmark | arXiv: 2507.07860
- tidmad time series dataset for discovering dark matter with ai denoising | arXiv: 2406.04378
- tight bounds on the distortion of randomized and deterministic distributed votin | arXiv: 2509.17134
- tight lower bounds and improved convergence in performative prediction | arXiv: 2412.03671
- tighter cmi-based generalization bounds via stochastic projection and quantizati | arXiv: 2510.23485
- tiled flash linear attention more efficient linear rnn and xlstm kernels | arXiv: 2503.14376
- time reversal symmetry for efficient robotic manipulations in deep reinforcement | arXiv: 2505.13925
- time travel is cheating going live with deepfund for real-time fund investment b | arXiv: 2505.11065
- time-evolving dynamical system for learning latent representations of mouse visu | arXiv: 2408.07908
- time-imm a dataset and benchmark for irregular multimodal multivariate time seri | arXiv: 2506.10412
- time-o1 time-series forecasting needs transformed label alignment | arXiv: 2505.17847
- TIME: A Multi-level Benchmark for Temporal Reasoning of LLMs in Real-World Scenarios
- timeperceiver an encoder-decoder framework for generalized time-series forecasti | arXiv: 2512.22550
- tirex zero-shot forecasting across long and short horizons with enhanced in-cont | arXiv: 2505.23719
- titan a trajectory-informed technique for adaptive parameter freezing in large-s | arXiv: 2509.15193
- to distill or decide understanding the algorithmic trade-off in partially observ | arXiv: 2510.03207
- to see or to read user behavior reasoning in multimodal llms | arXiv: 2511.03845
- token bottleneck one token to remember dynamics | arXiv: 2507.06543
- token perturbation guidance for diffusion models | arXiv: 2506.10036
- tokensqueeze performance-preserving compression for reasoning llms | arXiv: 2511.13223
- tomcat test-time comprehensive knowledge accumulation for compositional zero-sho | arXiv: 2510.20162
- Tool-Augmented Spatiotemporal Reasoning for Streamlining Video Question Answering Task
- topology of reasoning understanding large reasoning models through reasoning gra | arXiv: 2506.05744
- torch-uncertainty a deep learning framework for uncertainty quantification | arXiv: 2511.10282
- tortoise and hare guidance accelerating diffusion model inference with multirate | arXiv: 2511.04117
- toward a unified geometry understanding riemannian diffusion framework for graph | arXiv: 2510.04522
- toward a vision-language foundation model for medical data multimodal dataset an | arXiv: 2509.24739
- toward complete merger identification at cosmic noon with deep learning | arXiv: 2511.15006
- toward efficient inference attacks shadow model sharing via mixture-of-experts | arXiv: 2510.13451
- toward engineering agi benchmarking the engineering design capabilities of llms | arXiv: 2509.16204
- toward explainable offline rl analyzing representations in intrinsically motivat | arXiv: 2506.13958
- toward real-world text image forgery localization structured and interpretable d | arXiv: 2511.12658
- towards 3d objectness learning in an open world | arXiv: 2510.17686
- towards a golden classifier-free guidance path via foresight fixed point iterati | arXiv: 2510.21512
- towards comprehensive scene understanding integrating first and third-person vie | arXiv: 2505.21955
- towards effective federated graph foundation model via mitigating knowledge enta | arXiv: 2505.12684
- towards evaluating proactive risk awareness of multimodal language models | arXiv: 2505.17455
- towards foundational lidar world models with efficient latent flow matching | arXiv: 2506.23434
- towards general modality translation with contrastive and predictive latent diff | arXiv: 2510.20819
- towards implicit aggregation robust image representation for place recognition i | arXiv: 2511.06024
- towards interpretability without sacrifice faithful dense layer decomposition wi | arXiv: 2505.21364
- towards multiscale graph-based protein learning with geometric secondary structu | arXiv: 2602.00862
- towards physics-informed spatial intelligence with human priors an autonomous dr | arXiv: 2510.21160
- towards predicting any human trajectory in context | arXiv: 2506.00871
- towards provable emergence of in-context reinforcement learning | arXiv: 2509.18389
- towards reliable and holistic visual in-context learning prompt selection | arXiv: 2509.25989
- towards reliable code-as-policies a neuro-symbolic framework for embodied task p | arXiv: 2510.21302
- towards resilient safety-driven unlearning for diffusion models against downstre | arXiv: 2507.16302
- towards robust pseudo-label learning in semantic segmentation an encoding perspe | arXiv: 2512.06870
- towards robust zero-shot reinforcement learning | arXiv: 2510.15382
- towards scaling laws for symbolic regression | arXiv: 2510.26064
- towards self-supervised foundation models for critical care time series | arXiv: 2509.19885
- Towards Thinking-Optimal Scaling of Test-Time Compute for LLM Reasoning | arXiv: 2502.18080
- towards understanding safety alignment a mechanistic perspective from safety neu | arXiv: 2406.14144
- towards unified and lossless latent space for 3d molecular latent diffusion mode | arXiv: 2503.15567
- towards universal neural operators through multiphysics pretraining | arXiv: 2511.10829
- towards unsupervised domain bridging via image degradation in semantic segmentat | arXiv: 2412.10339
- towards unsupervised open-set graph domain adaptation via dual reprogramming | arXiv: 2510.18363
- toxictextclip text-based poisoning and backdoor attacks on clip pre-training | arXiv: 2511.00446
- tp-mddn task-preferenced multi-demand-driven navigation with autonomous decision | arXiv: 2511.17225
- track inpaint resplat subject-driven 3d and 4d generation with progressive textu | arXiv: 2510.23605
- tracking and understanding object transformations | arXiv: 2511.04678
- trackingworld world-centric monocular 3d tracking of almost all pixels | arXiv: 2512.08358
- tractable multinomial logit contextual bandits with non-linear utilities | arXiv: 2601.06913
- train with perturbation infer after merging a two-stage framework for continual | arXiv: 2505.22389
- Training Language Models to Reason Efficiently | arXiv: 2502.04463
- training robust graph neural networks by modeling noise dependencies | arXiv: 2502.19670
- training the untrainable introducing inductive bias via representational alignme | arXiv: 2410.20035
- training-free bayesianization for low-rank adapters of large language models | arXiv: 2412.05723
- training-free constrained generation with stable diffusion models | arXiv: 2502.05625
- training-free efficient video generation via dynamic token carving | arXiv: 2505.16864
- training-free online video step grounding | arXiv: 2510.16989
- training-free safe text embedding guidance for text-to-image diffusion models | arXiv: 2510.24012
- traj-coa patient trajectory modeling via chain-of-agents for lung cancer risk pr | arXiv: 2510.10454
- TrajAgent: An LLM-Agent Framework for Trajectory Modeling via Large-and-Small Model Collaboration | arXiv: 2410.20445
- trajectory balance with asynchrony decoupling exploration and learning for fast | arXiv: 2503.18929
- Trajectory Bellman Residual Minimization: A Simple Value-Based Method for LLM Reasoning | arXiv: 2505.15311
- trans-env a framework for evaluating the linguistic robustness of llms against e | arXiv: 2505.20875
- transfer learning beyond the standard model | arXiv: 2510.19168
- transfer learning for benign overfitting in high-dimensional linear regression | arXiv: 2510.15337
- transferable black-box one-shot forging of watermarks via image preference model | arXiv: 2510.20468
- transferring causal effects using proxies | arXiv: 2510.25924
- transformer copilot learning from the mistake log in llm fine-tuning | arXiv: 2505.16270
- transformer embeddings for fast microlensing inference | arXiv: 2512.11687
- transformer key-value memories are nearly as interpretable as sparse autoencoder | arXiv: 2510.22332
- transformers provably learn chain-of-thought reasoning with length generalizatio | arXiv: 2511.07378
- transun a preemptive paradigm to eradicate retransformation bias intrinsically f | arXiv: 2505.13881
- trap targeted redirecting of agentic preferences | arXiv: 2505.23518
- traversal verification for speculative tree decoding | arXiv: 2505.12398
- tree-guided diffusion planner | arXiv: 2508.21800
- trico triadic game-theoretic co-training for robust semi-supervised learning | arXiv: 2509.21526
- trident tri-modal molecular representation learning with taxonomic annotations a | arXiv: 2506.21028
- trim scalable 3d gaussian diffusion inference with temporal and spatial trimming | arXiv: 2511.16642
- triplets better than pairs towards stable and effective self-play fine-tuning fo | arXiv: 2601.08198
- tropical attention neural algorithmic reasoning for combinatorial algorithms | arXiv: 2505.17190
- TRoVe: Discovering Error-Inducing Static Feature Biases in Temporal Vision-Language Models
- trust -- transformer-driven u-net for sparse target recovery | arXiv: 2506.01112
- trust region reward optimization and proximal inverse reward optimization algori | arXiv: 2509.23135
- tts-var a test-time scaling framework for visual auto-regressive generation | arXiv: 2507.18537
- turbocharging gaussian process inference with approximate sketch-and-project | arXiv: 2505.13723
- tv-rec time-variant convolutional filter for sequential recommendation | arXiv: 2510.25259
- twilight adaptive attention sparsity with hierarchical top-p pruning | arXiv: 2502.02770
- Two Causally Related Needles in a Video Haystack | arXiv: 2505.19853
- two-stage learning of stabilizing neural controllers via zubov sampling and iter | arXiv: 2506.01356
- two-steps diffusion policy for robotic manipulation via genetic denoising | arXiv: 2510.21991
- u-can unsupervised point cloud denoising with consistency-aware noise2noise matc | arXiv: 2510.25210
- ugm2n an unsupervised and generalizable mesh movement network via m-uniform loss | arXiv: 2508.08615
- ultrahr-100k enhancing uhr image synthesis with a large-scale high-quality datas | arXiv: 2510.20661
- ultrametric cluster hierarchies i want em all | arXiv: 2502.14018
- umami unifying masked autoregressive models and deterministic rendering for view | arXiv: 2512.20107
- UMoE: Unifying Attention and FFN with Shared Experts | arXiv: 2505.07260
- uncertain knowledge graph completion via semi-supervised confidence distribution | arXiv: 2510.16601
- uncertainty estimation by flexible evidential deep learning | arXiv: 2510.18322
- uncertainty quantification for reduced-order surrogate models applied to cloud m | arXiv: 2511.04534
- uncertainty-aware multi-objective reinforcement learning-guided diffusion models | arXiv: 2510.21153
- uncertainty-guided model selection for tabular foundation models in biomolecule | arXiv: 2510.02476
- uncle towards scalable dynamic causal discovery in non-linear temporal systems | arXiv: 2511.03168
- uncovering graph reasoning in decoder-only transformers with circuit tracing | arXiv: 2509.20336
- uncovering strategic egoism behaviors in large language models | arXiv: 2511.09920
- understand before you generate self-guided training for autoregressive image gen | arXiv: 2509.15185
- understanding adam requires better rotation dependent assumptions | arXiv: 2410.19964
- understanding and enhancing mask-based pretraining towards universal representat | arXiv: 2509.21650
- understanding and improving adversarial robustness of neural probabilistic circu | arXiv: 2509.20549
- understanding challenges to the interpretation of disaggregated evaluations of a | arXiv: 2506.04193
- understanding differential transformer unchains pretrained self-attentions | arXiv: 2505.16333
- understanding ice crystal habit diversity with self-supervised learning | arXiv: 2509.07688
- understanding prompt tuning and in-context learning via meta-learning | arXiv: 2505.17010
- understanding representation dynamics of diffusion models via low-dimensional mo | arXiv: 2502.05743
- understanding the generalization of stochastic gradient adam in learning neural | arXiv: 2510.11354
- uni-lora one vector is all you need | arXiv: 2506.00799
- uni-mumer unified multi-task fine-tuning of vision-language model for handwritte | arXiv: 2505.23566
- uniedit a unified knowledge editing benchmark for large language models | arXiv: 2505.12345
- unified all-atom molecule generation with neural fields | arXiv: 2511.15906
- unified reinforcement and imitation learning for vision-language models | arXiv: 2510.19307
- uniformer unified and efficient transformer for reasoning across general and cus | arXiv: 2511.08135
- unifying and enhancing graph transformers via a hierarchical mask framework | arXiv: 2510.18825
- unifying appearance codes and bilateral grids for driving scene gaussian splatti | arXiv: 2506.05280
- Unifying Attention Heads and Task Vectors via Hidden State Geometry in In-Context Learning | arXiv: 2505.18752
- unifying proportional fairness in centroid and non-centroid clustering | arXiv: 2601.00447
- unifying re-identification attribute inference and data reconstruction risks in | arXiv: 2507.06969
- unifying symbolic music arrangement track-aware reconstruction and structured to | arXiv: 2408.15176
- unifying text semantics and graph structures for temporal text-attributed graphs | arXiv: 2503.14411
- unifying vision-language latents for zero-label image caption enhancement | arXiv: 2510.12931
- unilumos fast and unified image and video relighting with physics-plausible feed | arXiv: 2511.01678
- unimotion a unified motion framework for simulation prediction and planning | arXiv: 2602.00566
- unimrseg unified modality-relax segmentation via hierarchical self-supervised co | arXiv: 2509.16170
- unipixel unified object referring and segmentation for pixel-level visual reason | arXiv: 2509.18094
- unisite the first cross-structure dataset and learning framework for end-to-end | arXiv: 2506.03237
- unitok a unified tokenizer for visual generation and understanding | arXiv: 2502.20321
- universal cross-tokenizer distillation via approximate likelihood matching | arXiv: 2503.20083
- universal spectral tokenization via self-supervised panchromatic representation | arXiv: 2510.17959
- Unlabeled Data Can Provably Enhance In-Context Learning of Transformers | arXiv: 2601.10058
- unlearned but not forgotten data extraction after exact unlearning in llm | arXiv: 2505.24379
- unlearning as ablation toward a falsifiable benchmark for generative scientific | arXiv: 2508.17681
- unleashing diffusion transformers for visual correspondence by modulating massiv | arXiv: 2505.18584
- unleashing hour-scale video training for long video-language understanding | arXiv: 2506.05332
- unlocking multimodal mathematical reasoning via process reward model | arXiv: 2501.04686
- unlocking transfer learning for open-world few-shot recognition | arXiv: 2411.09986
- unmasking covid-19 vulnerability in nigeria mapping risks beyond urban hotspots | arXiv: 2509.05398
- unpaired image-to-image translation for segmentation and signal unmixing | arXiv: 2505.20746
- unsupervised discovery of high-redshift galaxy populations with variational auto | arXiv: 2511.05439
- Unveiling Chain of Step Reasoning for Vision-Language Models with Fine-grained Rewards | arXiv: 2509.19003
- unveiling m-sharpness through the structure of stochastic gradient noise | arXiv: 2509.18001
- unveiling the power of multiple gossip steps a stability-based generalization an | arXiv: 2510.07980
- unveiling the spatial-temporal effective receptive fields of spiking neural netw | arXiv: 2510.21403
- urb -- urban routing benchmark for rl-equipped connected autonomous vehicles | arXiv: 2505.17734
- urbaning-v2x a large-scale multi-vehicle multi-infrastructure dataset across mul | arXiv: 2510.23478
- urdf-anything constructing articulated objects with 3d multimodal language model | arXiv: 2511.00940
- urls help topics guide understanding metadata utility in llm training | arXiv: 2505.16570
- utilgen utility-centric generative data augmentation with dual-level task adapta | arXiv: 2510.24262
- v-cece visual counterfactual explanations via conceptual edits | arXiv: 2509.16567
- v2x-radar a multi-modal dataset with 4d radar for cooperative perception | arXiv: 2411.10962
- va-gs enhancing the geometric representation of gaussian splatting via view alig | arXiv: 2510.11473
- vadtree explainable training-free video anomaly detection via hierarchical granu | arXiv: 2510.22693
- vagen reinforcing world model reasoning for multi-turn vlm agents | arXiv: 2510.16907
- valid inference with imperfect synthetic data | arXiv: 2508.06635
- validating llm-as-a-judge systems under rating indeterminacy | arXiv: 2503.05965
- value gradient guidance for flow matching alignment | arXiv: 2512.05116
- valuepilot a two-phase framework for value-driven decision-making | arXiv: 2512.13716
- vamp variational multi-modal prompt learning for vision-language models | arXiv: 2511.22664
- vanish into thin air cross-prompt universal adversarial attacks for sam2 | arXiv: 2510.24195
- variance-aware feel-good thompson sampling for contextual bandits | arXiv: 2511.02123
- variational autoencoder with normalizing flow for x-ray spectral fitting | arXiv: 2601.07440
- variational regularized unbalanced optimal transport single network least action | arXiv: 2505.11823
- vasa-3d lifelike audio-driven gaussian head avatars from a single image | arXiv: 2512.14677
- VERA: Variational Inference Framework for Jailbreaking Large Language Models | arXiv: 2506.22666
- verbalized algorithms | arXiv: 2509.08150
- vessa video-based object-centric self-supervised adaptation for visual foundatio | arXiv: 2510.20994
- vgent graph-based retrieval-reasoning-augmented generation for long video unders | arXiv: 2510.14032
- vicinity-guided discriminative latent diffusion for privacy-preserving domain ad | arXiv: 2510.00478
- video diffusion models excel at tracking similar-looking objects without supervi | arXiv: 2512.02339
- video finetuning improves reasoning between frames | arXiv: 2511.12868
- video killed the energy budget characterizing the latency and power regimes of o | arXiv: 2509.19222
- video-r1 reinforcing video reasoning in mllms | arXiv: 2503.21776
- video-rag visually-aligned retrieval-augmented long video comprehension | arXiv: 2411.13093
- video-safetybench a benchmark for safety evaluation of video lvlms | arXiv: 2505.11842
- videolucy deep memory backtracking for long video understanding | arXiv: 2510.12422
- videorft incentivizing video reasoning capability in mllms via reinforced fine-t | arXiv: 2505.12434
- viki-r coordinating embodied multi-agent cooperation via reinforcement learning | arXiv: 2506.09049
- viking deep variational inference with stochastic projections | arXiv: 2510.23684
- vimorag video-based retrieval-augmented 3d motion generation for motion language | arXiv: 2508.12081
- vipamin visual prompt initialization via embedding selection and subspace expans | arXiv: 2510.16446
- virus infection attack on llms your poisoning can spread via synthetic data | arXiv: 2509.23041
- vision function layer in multimodal llms | arXiv: 2509.24791
- vision transformers for cosmological fields application to weak lensing mass map | arXiv: 2512.07125
- vision transformers with self-distilled registers | arXiv: 2505.21501
- vision-centric token compression in large language model | arXiv: 2502.00791
- vispec accelerating vision-language models with vision-aware speculative decodin | arXiv: 2509.15235
- visual diversity and region-aware prompt learning for zero-shot hoi detection | arXiv: 2510.25094
- visual instruction bottleneck tuning | arXiv: 2505.13946
- visual structures helps visual reasoning addressing the binding problem in vlms | arXiv: 2506.22146
- visual sync multi-camera synchronization via cross-view object motion | arXiv: 2512.02017
- Visual Thoughts: A Unified Perspective of Understanding Multimodal Chain-of-Thought | arXiv: 2505.15510
- visuallens personalization through task-agnostic visual history | arXiv: 2411.16034
- vita-1.5 towards gpt-4o level real-time vision and speech interaction | arXiv: 2501.01957
- vitrix-clipin enhancing fine-grained visual understanding in clip via instructio | arXiv: 2508.02329
- VL-SAE: Interpreting and Enhancing Vision-Language Alignment with a Unified Concept Set | arXiv: 2510.21323
- vla-cache efficient vision-language-action manipulation via adaptive token cachi | arXiv: 2502.02175
- vmdt decoding the trustworthiness of video foundation models | arXiv: 2511.05682
- vocabulary customization for efficient domain-specific llm deployment | arXiv: 2509.26124
- VolleyBots: A Testbed for Multi-Drone Volleyball Game Combining Motion Control and Strategic Play | arXiv: 2502.01932
- vorta efficient video diffusion via routing sparse attention | arXiv: 2505.18809
- vq-seg vector-quantized token perturbation for semi-supervised medical image seg | arXiv: 2601.10124
- vqtoken neural discrete token representation learning for extreme token reductio | arXiv: 2503.16980
- vsa faster video diffusion with trainable sparse attention | arXiv: 2505.13389
- vt-fsl bridging vision and text with llms for few-shot learning | arXiv: 2509.25033
- walking the schrödinger bridge a direct trajectory for text-to-3d generation | arXiv: 2511.05609
- walrus wavelets for long-range representation using ssms | arXiv: 2505.12161
- wasserstein transfer learning | arXiv: 2505.17404
- watch and listen understanding audio-visual-speech moments with multimodal llm | arXiv: 2505.18110
- watermarking autoregressive image generation | arXiv: 2506.16349
- wavelet canonical coherence for nonstationary signals | arXiv: 2505.14253
- wavy transformer | arXiv: 2508.12787
- weak-to-strong generalization under distribution shifts | arXiv: 2510.21332
- wearvqa a visual question answering benchmark for wearables in egocentric authen | arXiv: 2511.22154
- web-scale collection of video data for 4d animal reconstruction | arXiv: 2511.01169
- web-shepherd advancing prms for reinforcing web agents | arXiv: 2505.15277
- weight weaving parameter pooling for data-free model merging | arXiv: 2510.13921
- wham towards a translative model of sperm whale vocalization | arXiv: 2512.02206
- what ai speaks for your community polling ai agents for public opinion on data c | arXiv: 2511.22037
- what can rl bring to vla generalization an empirical study | arXiv: 2505.19789
- what does it take to build a performant selective classifier | arXiv: 2510.20242
- what expressivity theory misses message passing complexity for gnns | arXiv: 2509.01254
- what happens during the loss plateau understanding abrupt learning in transforme | arXiv: 2506.13688
- what makes a reward model a good teacher an optimization perspective | arXiv: 2503.15477
- what one cannot two can two-layer transformers provably represent induction head | arXiv: 2508.07208
- what we dont c manifold disentanglement for structured discovery | arXiv: 2511.09433
- when ai democratizes exploitation llm-assisted strategic manipulation of fair di | arXiv: 2511.14722
- when are concepts erased from diffusion models | arXiv: 2505.17013
- when can model-free reinforcement learning be enough for thinking | arXiv: 2506.17124
- when less language is more language-reasoning disentanglement makes llms better | arXiv: 2505.15257
- when no paths lead to rome benchmarking systematic neural relational reasoning | arXiv: 2510.23532
- when one modality sabotages the others a diagnostic lens on multimodal reasoning | arXiv: 2511.02794
- when one moment isnt enough multi-moment retrieval with cross-moment interaction | arXiv: 2510.17218
- when semantics mislead vision mitigating large multimodal models hallucinations | arXiv: 2506.05551
- when thinking drifts evidential grounding for robust video reasoning | arXiv: 2510.06077
- when worse is better navigating the compression-generation tradeoff in visual to | arXiv: 2412.16326
- where and how to perturb on the design of perturbation guidance in diffusion and | arXiv: 2506.10978
- who you are matters bridging topics and social roles via llm-enhanced logical re | arXiv: 2505.10940
- why diffusion models dont memorize the role of implicit dynamical regularization | arXiv: 2505.17638
- why is attention sparse in particle transformer | arXiv: 2512.00210
- why knowledge distillation works in generative models a minimal working explanat | arXiv: 2505.13111
- why masking diffusion works condition on the jump schedule for improved discrete | arXiv: 2506.08316
- Wide-Horizon Thinking and Simulation-Based Evaluation for Real-World LLM Planning with Multifaceted Constraints | arXiv: 2506.12421
- wider or deeper scaling llm inference-time compute with adaptive branching tree | arXiv: 2503.04412
- wildcat3d appearance-aware multi-view diffusion in the wild | arXiv: 2506.13030
- windsock is dancing adaptive multimodal retrieval-augmented generation | arXiv: 2510.22694
- with limited data for multimodal alignment let the structure guide you | arXiv: 2506.16895
- wmcopier forging invisible image watermarks on arbitrary images | arXiv: 2503.22330
- words that unite the world a unified framework for deciphering central bank comm | arXiv: 2505.17048
- worse than zero-shot a fact-checking dataset for evaluating the robustness of ra | arXiv: 2502.16101
- writing in symbiosis mapping human creative agency in the ai era | arXiv: 2512.13697
- x-scene large-scale driving scene generation with high fidelity and flexible con | arXiv: 2506.13558
- xifbench evaluating large language models on multilingual instruction following | arXiv: 2503.07539
- xlstm-mixer multivariate time series forecasting by mixing via scalar memories | arXiv: 2410.16928
- Yggdrasil: Bridging Dynamic Speculation and Static Runtime for Latency-Optimal Tree-Based LLM Decoding | arXiv: 2512.23858
- you can trust your clustering model a parameter-free self-boosting plug-in for d | arXiv: 2511.21193
- your pre-trained llm is secretly an unsupervised confidence calibrator | arXiv: 2505.16690
- zebra towards zero-shot cross-subject generalization for universal brain visual | arXiv: 2510.27128
- zero-shot context generalization in reinforcement learning from few training con | arXiv: 2507.07348
- zero-shot embedding drift detection a lightweight defense against prompt injecti | arXiv: 2601.12359
- zero-shot large language model agents for fully automated radiotherapy treatment | arXiv: 2510.11754
- zero-shot performance prediction for probabilistic scaling laws | arXiv: 2510.16743
- zero-shot robustness of vision language models via confidence-aware weighting | arXiv: 2510.02913
- ZeroS: Zero-Sum Linear Attention for Efficient Transformers | arXiv: 2602.05230
- zeroth-order optimization finds flat minima | arXiv: 2506.05454
- zeus zero-shot embeddings for unsupervised separation of tabular data | arXiv: 2505.10704
- zip2zip inference-time adaptive tokenization via online compression | arXiv: 2506.01084
- zpressor bottleneck-aware compression for scalable feed-forward 3dgs | arXiv: 2505.23734
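- Technical Debt in In-Context Learning: Diminishing Efficiency in Long Sequences | arXiv: 2502.04580
- Note 1: Is CoT an Illusion? A Data Distribution Perspective | arXiv: 2508.01191
- Note 2: Is PRM Necessary? RL Implicitly Induces PRM Capability | arXiv: 2505.11227
- Note 4: WebThinker - Empowering Reasoning Models with Deep Research Capability | arXiv: 2504.21776
- Note 5: ReSearch - Learning to Reason with Search | arXiv: 2503.19470
- Note 6: Self-Evaluating LLMs - Step-Level Confidence Estimation for Multi-Step Tasks | arXiv: 2505.17373
- Note 7: Value-Guided Search - Efficient Chain-of-Thought Reasoning | arXiv: 2504.18428
- Note 8: PolyMath - Evaluating Mathematical Reasoning in Multilingual Contexts | arXiv: 2511.07364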
- impact of dataset properties on membership inference | arXiv: 2402.06674
- claws creativity detection for llm-generated solutions using attention window of | arXiv: 2510.17921
- levo high-quality song generation with multi-processing refined supervision | arXiv: 2506.07520
- a self-improving coding agent | arXiv: 2504.15228
- a stochastic differential equation framework for multi-objective llm interaction | arXiv: 2510.10739
- astrovisbench a code benchmark for scientific computing and visualization in ast | arXiv: 2505.20538
- automated multi-agent workflows for rtl design | arXiv: 2509.20182
- co-evolving llm coder and unit tester via reinforcement learning | arXiv: 2506.03136
- core benchmarking llms code reasoning capabilities through static analysis tasks | arXiv: 2507.05269
- embedding alignment in code generation for audio | arXiv: 2508.05473
- flylora boosting task decoupling and parameter efficiency via implicit rank-wise | arXiv: 2510.08396
- fractalbench diagnosing visual-mathematical reasoning through recursive program | arXiv: 2511.06522
- learning to solve complex problems via dataset decomposition | arXiv: 2602.20296
- maintaincoder maintainable code generation under dynamic requirements | arXiv: 2503.24260
- mlr-bench evaluating ai agents on open-ended machine learning research | arXiv: 2505.19955
- once upon an input reasoning via per-instance program synthesis | arXiv: 2510.22849
- preserving llm capabilities through calibration data curation from analysis to o | arXiv: 2510.10618
- principled fine-tuning of llms from user-edits a medley of preference supervisio | arXiv: 2601.19055
- program synthesis via test-time transduction | arXiv: 2509.17393
- qimeng-salv signal-aware learning for verilog code generation | arXiv: 2510.19296
- swe-rebench an automated pipeline for task collection and decontaminated evaluat | arXiv: 2505.20411
- table2latex-rl high-fidelity latex code generation from table images via reinfor | arXiv: 2509.17589
- text-to-code generation for modular building layouts in building information mod | arXiv: 2509.23713
- aclora almost training-free access control-aware multimodal ll | arXiv: 2505.11557
- bridging human and llm judgments understanding and narrowing the gap | arXiv: 2508.12792
- hygen efficient llm serving via elastic online-offline request co-location | arXiv: 2501.14808
- metamind modeling human social thoughts with metacognitive multi-agent systems | arXiv: 2505.18943
- sciarena an open evaluation platform for non-verifiable scientific literature-gr | arXiv: 2507.01001
- coral longtail diffusion | arXiv: 2506.15933
- dd2 onestep ar distill | arXiv: 2510.21003
- why diffusion models dont memorize the role of implicit regularization | arXiv: 2505.17638
- latent harmony synergistic unified uhd image restoration with pre-trained diffus
- benchmarking retrieval-augmented multimodal generation for do | arXiv: 2505.16470
- chain-of-retrieval augmented generation | arXiv: 2501.14342
- compress gather and recompute reforming long-context processing in transformers | arXiv: 2506.01215
- cooperative retrieval-augmented generation for question answering mutual informa | arXiv: 2512.10422
- deep research brings deeper harm | arXiv: 2510.11851
- dice discrete interpretable comparative evaluation with probabilistic scoring fo | arXiv: 2512.22629
- generalized contrastive learning for universal multimodal re | arXiv: 2509.25638
- hierarchical retrieval the geometry and a pretrain-finetune recipe | arXiv: 2509.16411
- hifi-rag hierarchical content filtering and two-pass generation for open-domain | arXiv: 2512.22442
- how should we evaluate data deletion in graph-based ann indexes | arXiv: 2512.06200
- hypergraphrag retrieval-augmented generation via hypergraph-structured knowledge | arXiv: 2503.21322
- improving consistency in retrieval-augmented systems with group similarity rewar | arXiv: 2510.04392
- is prm necessary problem-solving rl implicitly induces prm capability in llms | arXiv: 2505.11227
- learning task-agnostic representations through multi-teacher distillation | arXiv: 2510.18680
- mind the gap aligning knowledge bases with user needs to enhance mental health r | arXiv: 2509.13626
- mir-bench can your llm recognize complicated patterns via many-shot in-context r | arXiv: 2502.09933
- mitra an ai assistant for knowledge retrieval in physics collaborations | arXiv: 2603.09800
- murating a high quality data selecting approach to multilingual large language m | arXiv: 2507.01785
- rag-igbench innovative evaluation for rag-based interleaved generation in open-d | arXiv: 2512.05119
- reliable decision making via calibration oriented retrieval augmented generation | arXiv: 2411.08891
- retrieval-augmented generation for reliable interpretation of radio regulations | arXiv: 2509.09651
- retrieval is not enough enhancing rag reasoning through test-time critique and o | arXiv: 2504.14858
- rmit-adms at the mmu-rag neurips 2025 competition | arXiv: 2602.20735
- scale-invariant attention | arXiv: 2505.17083
- scaling language-centric omnimodal representation learning | arXiv: 2510.11693
- secon-rag a two-stage semantic filtering and conflict-free framework for trustwo | arXiv: 2510.09710
- superclip clip with simple classification supervision | arXiv: 2512.14480
- the atlas of in-context learning how attention heads shape in-context retrieval | arXiv: 2505.15807
- the narrow gate localized image-text communication in native | arXiv: 2412.06646
- windsock is dancing adaptive multimodal retrieval-augmented generation | arXiv: 2510.22694
- worse than zero-shot a fact-checking dataset for evaluating the robustness of ra | arXiv: 2502.16101
- a is for absorption studying feature splitting and absorption in sparse autoenco | arXiv: 2409.14507
- a unified reasoning framework for holistic zero-shot video an | arXiv: 2511.00962
- adaptgrad adaptive sampling to reduce noise | arXiv: 2410.07711
- additive models explained a computational complexity approach | arXiv: 2510.21292
- agentiql an agent-inspired multi-expert framework for text-to-sql generation | arXiv: 2510.10661
- an analysis of concept bottleneck models measuring understanding and mitigating | arXiv: 2505.16705
- are greedy task orderings better than random in continual linear regression | arXiv: 2510.19941
- arecho autoregressive evaluation via chain-based hypothesis optimization for spe | arXiv: 2505.24518
- attributing response to context a jensen-shannon divergence driven mechanistic s | arXiv: 2505.16415
- auditing meta-cognitive hallucinations in reasoning large language models | arXiv: 2505.13143
- base models know how to reason thinking models learn when | arXiv: 2510.07364
- better estimation of the kullback--leibler divergence between language models | arXiv: 2504.10637
- beyond accuracy dissecting mathematical reasoning for llms u | arXiv: 2506.04723
- beyond components singular vector-based interpretability of transformer circuits | arXiv: 2511.20273
- beyond token probes hallucination detection via activation tensors with act-vit | arXiv: 2510.00296
- bigram subnetworks mapping to next tokens in transformer language models | arXiv: 2504.15471
- causal head gating a framework for interpreting roles of attention heads in tran | arXiv: 2505.13737
- cbmas cognitive behavioral modeling via activation steering | arXiv: 2601.06109
- chiqpm calibrated hierarchical interpretable image classification | arXiv: 2511.20779
- cognitive mirrors exploring the diverse functional roles of attention heads in l | arXiv: 2512.10978
- conceptscope characterizing dataset bias via disentangled visual concepts | arXiv: 2510.26186
- conditional distribution compression via the kernel conditional mean embedding | arXiv: 2504.10139
- curvature tuning provable training-free model steering from a single parameter | arXiv: 2502.07783
- dataset distillation for pre-trained self-supervised vision models | arXiv: 2511.16674
- deep modularity networks with diversity-preserving regularization | arXiv: 2501.13451
- deep value benchmark measuring whether models generalize deep values or shallow | arXiv: 2511.02109
- distributional autoencoders know the score | arXiv: 2502.11583
- do different prompting methods yield a common task representation in language mo | arXiv: 2505.12075
- dynamic algorithm for explainable k-medians clustering under lp norm | arXiv: 2512.01150
- efficient vision-language reasoning via adaptive token pruning | arXiv: 2512.12701
- emergence of linear truth encodings in language models | arXiv: 2510.15804
- empowering decision trees via shape function branching | arXiv: 2510.19040
- encoding and understanding astrophysical information in large language model-gen | arXiv: 2511.14685
- evaluating llms in open-source games | arXiv: 2512.00371
- explaining similarity in vision-language encoders with weighted banzhaf interact | arXiv: 2508.05430
- fact faithful concept traces for explaining neural network decisions | arXiv: 2510.25512
- fantastic features and where to find them a probing method to combine features f | arXiv: 2512.01405
- fastdinov2 frequency based curriculum learning improves robustness and training | arXiv: 2507.03779
- from flat to hierarchical extracting sparse representations with matching pursui | arXiv: 2506.03093
- geometric priors for generalizable world models via vector symbolic architecture | arXiv: 2602.21467
- h-splid hsic-based saliency preserving latent information decomposition | arXiv: 2510.20627
- how do transformers learn implicit reasoning | arXiv: 2505.23653
- improving perturbation-based explanations by understanding the role of uncertain | arXiv: 2511.10439
- knowing when to stop efficient context processing via latent sufficiency signals | arXiv: 2502.01025
- latent principle discovery for language model self-improvement | arXiv: 2505.16927
- learning to focus causal attention distillation via gradient-guided token prunin | arXiv: 2506.07851
- llm probing with contrastive eigenproblems improving understanding and applicabi | arXiv: 2511.02089
- minimizing false-positive attributions in explanations of non-linear models | arXiv: 2505.11210
- monte carlo expected threat mocet scoring | arXiv: 2511.16823
- mopformer motion-primitive transformer for wearable-sensor activity recognition | arXiv: 2505.20744
- ordshap feature position importance for sequential black-box models | arXiv: 2507.11855
- out of control -- why alignment needs formal control theory and an alignment con | arXiv: 2506.17846
- partial information decomposition via normalizing flows in latent gaussian distr | arXiv: 2510.04417
- probabilistic token alignment for large language model fusion | arXiv: 2509.17276
- rectifying shortcut behaviors in preference-based reward learning | arXiv: 2510.19050
- saying the unsaid revealing the hidden language of multimodal systems through te | arXiv: 2511.10690
- scpilot large language model reasoning toward automated single-cell analysis and | arXiv: 2602.11609
- self-supervised contrastive learning is approximately supervised contrastive lea | arXiv: 2506.04411
- shap values via sparse fourier representation | arXiv: 2410.06300
- simulating society requires simulating thought | arXiv: 2506.06958
- sloth scaling laws for llm skills to predict multi-benchmark performance across | arXiv: 2412.06540
- spex a spectral approach to explainable clustering | arXiv: 2511.00885
- steering information utility in key-value memory for language model post-trainin | arXiv: 2507.05158
- tangledfeatures robust feature selection in highly correlated spaces | arXiv: 2510.15005
- the non-linear representation dilemma is causal abstraction enough for mechanist | arXiv: 2507.08802
- the trilemma of truth in large language models | arXiv: 2506.23921
- time-evolving dynamical system for learning latent representations of mouse visu | arXiv: 2408.07908
- toward explainable offline rl analyzing representations in intrinsically motivat | arXiv: 2506.13958
- toward real-world text image forgery localization structured and interpretable d | arXiv: 2511.12658
- towards interpretability without sacrifice faithful dense layer decomposition wi | arXiv: 2505.21364
- towards scaling laws for symbolic regression | arXiv: 2510.26064
- transformer key-value memories are nearly as interpretable as sparse autoencoder | arXiv: 2510.22332
- tropical attention neural algorithmic reasoning for combinatorial algorithms | arXiv: 2505.17190
- uncovering graph reasoning in decoder-only transformers with circuit tracing | arXiv: 2509.20336
- urls help topics guide understanding metadata utility in llm training | arXiv: 2505.16570
- vadtree explainable training-free video anomaly detection via hierarchical granu | arXiv: 2510.22693
- valuepilot a two-phase framework for value-driven decision-making | arXiv: 2512.13716
- vl-sae interpreting and enhancing vision-language alignment wi | arXiv: 2510.21323
- what happens during the loss plateau understanding abrupt learning in transforme | arXiv: 2506.13688
- why is attention sparse in particle transformer | arXiv: 2512.00210
- edit less achieve more dynamic sparse neuron masking for lifelong knowledge edit | arXiv: 2510.22139
- kscope a framework for characterizing the knowledge status of language models | arXiv: 2506.07458
- memeic a step toward continual and compositional knowledge editing | arXiv: 2510.25798
- memoir lifelong model editing with minimal overwrite and informed retention for | arXiv: 2506.07899
- rethinking residual distribution in locate-then-edit model editing | arXiv: 2502.03748
- uniedit a unified knowledge editing benchmark for large language models | arXiv: 2505.12345
- l-mtp leap multi-token prediction beyond adjacent context for large language mod | arXiv: 2505.17505
- loogle v2 are llms ready for real world long dependency challenges | arXiv: 2510.22548
- omnidraft a cross-vocabulary online adaptive drafter for on-device speculative d | arXiv: 2507.02659
- yggdrasil bridging dynamic speculation and static runtime for latency-optimal tr | arXiv: 2512.23858
- a high-dimensional statistical method for optimizing transfer | arXiv: 2502.04242
- a standardized benchmark for multilabel antimicrobial peptide classification | arXiv: 2511.04814
- a unified framework for provably efficient algorithms to estimate shapley values | arXiv: 2506.05216
- adastar adaptive data sampling for training self-taught reasoners | arXiv: 2505.16322
- aggregation hides out-of-distribution generalization failures from spurious corr | arXiv: 2510.24884
- asymmetric duos sidekicks improve uncertainty | arXiv: 2505.18636
- bayesian evaluation of large language model behavior | arXiv: 2511.10661
- belief-calibrated multi-agent consensus seeking for complex nlp tasks | arXiv: 2510.06307
- benchmarking is broken -- dont let ai be its own judge | arXiv: 2510.07575
- benchmarking large language models for zero-shot and few-shot phishing url detec | arXiv: 2602.02641
- beyond the singular revealing the value of multiple generations in benchmark eva | arXiv: 2502.08943
- beyond the surface enhancing llm-as-a-judge alignment with human via internal re | arXiv: 2508.03550
- blink-twice you see but do you observe a reasoning benchmark on visual perceptio | arXiv: 2510.09361
- can large language models master complex card games | arXiv: 2509.01328
- climb class-imbalanced learning benchmark on tabular data | arXiv: 2505.17451
- codeassistbench cab dataset benchmarking for multi-turn chat-based code assistan | arXiv: 2507.10646
- compo preference alignment via comparison oracles | arXiv: 2505.05465
- conformal online learning of deep koopman linear embeddings | arXiv: 2511.12760
- conformal prediction in the loop a feedback-based uncertainty model for trajecto | arXiv: 2510.16376
- conftuner training large language models to express their confidence verbally | arXiv: 2508.18847
- cost-sensitive freeze-thaw bayesian optimization for efficient hyperparameter tu | arXiv: 2510.21379
- creativity or brute force using brainteasers as a window into the problem-solvin | arXiv: 2505.10844
- decoupled entropy minimization | arXiv: 2511.03256
- efficient semantic uncertainty quantification in language models via diversity-s | arXiv: 2510.21310
- enhancing sample selection against label noise by cutting mislabeled easy exampl | arXiv: 2502.08227
- evalearn quantifying the learning capability and efficiency of llms via sequenti | arXiv: 2506.02672
- exploiting task relationships in continual learning via transferability-aware ta | arXiv: 2502.11609
- exploiting vocabulary frequency imbalance in language model pre-training | arXiv: 2508.15390
- generalization error analysis for selective state-space models through the lens | arXiv: 2502.01473
- houselayout3d a benchmark and training-free baseline for 3d layout estimation in | arXiv: 2512.02450
- hybridnorm towards stable and efficient transformer training via hybrid normaliz | arXiv: 2503.04598
- incomplete multi-view clustering via hierarchical semantic alignment and coopera | arXiv: 2510.13887
- ineq-comp benchmarking human-intuitive compositional reasoning in automated theo | arXiv: 2505.12680
- keep it on a leash controllable pseudo-label generation towards realistic long-t | arXiv: 2510.03993
- lcdb 1.1 a database illustrating learning curves are more ill-behaved than previo | arXiv: 2505.15657
- learning generalizable shape completion with sim3 equivariance | arXiv: 2509.26631
- let the experts speak improving survival prediction calibration via mixture-of-e | arXiv: 2511.09567
- leveraging robust optimization for llm alignment under distribution shifts | arXiv: 2504.05831
- ltd-bench evaluating large language models by letting them draw | arXiv: 2511.02347
- meicoder decoding visual stimuli from neural activity by leveraging most excitin | arXiv: 2510.20762
- merlin l48 spectrogram dataset | arXiv: 2511.00252
- mind the gap removing the discretization gap in differentiable logic gate networ | arXiv: 2506.07500
- model-behavior alignment under flexible evaluation when the best-fitting model i | arXiv: 2510.23321
- model context protocol for vision systems audit security and protocol extensions | arXiv: 2509.22814
- mvsmamba multi-view stereo with state space model | arXiv: 2511.01315
- normal-abnormal guided generalist anomaly detection | arXiv: 2510.00495
- on evaluating llm alignment by evaluating llms as judges | arXiv: 2511.20604
- open-insect benchmarking open-set recognition of novel species in biodiversity m | arXiv: 2503.01691
- optitree hierarchical thoughts generation with tree search for llm optimization | arXiv: 2510.22192
- parrot a benchmark for evaluating llms in cross-system sql translation | arXiv: 2509.23338
- path attention position encoding via accumulating householder transformations | arXiv: 2505.16381
- pfδ a benchmark dataset for power flow under load generation and topology variat | arXiv: 2510.22048
- put cash on bandits a max k-armed problem for automated machine learning | arXiv: 2505.05226
- rdb2g-bench a comprehensive benchmark for automatic graph modeling of relational | arXiv: 2506.01360
- reliably detecting model failures in deployment without labels | arXiv: 2506.05047
- rethinking evaluation of infrared small target detection | arXiv: 2509.16888
- rethinking losses for diffusion bridge samplers | arXiv: 2506.10982
- rgb-to-polarization estimation a new task and benchmark study | arXiv: 2505.13050
- risk management for mitigating benchmark failure modes benchrisk | arXiv: 2510.21460
- scmrdr a scalable and flexible framework for unpaired single-cell multi-omics da | arXiv: 2510.24987
- semi-supervised regression with heteroscedastic pseudo-labels | arXiv: 2510.15266
- small language models as compiler experts auto-parallelization for heterogeneous | arXiv: 2512.19250
- test-time adaptation by causal trimming | arXiv: 2510.11133
- the geometry of cortical computation manifold disentanglement and predictive dyn | arXiv: 2508.02995
- thought communication in multiagent collaboration | arXiv: 2510.20733
- tight lower bounds and improved convergence in performative prediction | arXiv: 2412.03671
- time travel is cheating going live with deepfund for real-time fund investment b | arXiv: 2505.11065
- turbocharging gaussian process inference with approximate sketch-and-project | arXiv: 2505.13723
- unlocking transfer learning for open-world few-shot recognition | arXiv: 2411.09986
- what does it take to build a performant selective classifier | arXiv: 2510.20242
- your pre-trained llm is secretly an unsupervised confidence calibrator | arXiv: 2505.16690
- evorefuse evolutionary prompt optimization for evaluation and mitigation of llm | arXiv: 2505.23473
- qsharp provably optimal distributional rl for llm post-training | arXiv: 2502.20548
- solverllm leveraging test-time scaling for optimization problem via llm-guided s | arXiv: 2510.16916
- speculate deep and accurate lossless and training-free acceleration for offloade | arXiv: 2509.18344
- streambridge turning your offline video large language model into a proactive st | arXiv: 2505.05467
- symphony synergistic multi-agent planning with heterogeneous language model asse | arXiv: 2601.22623
- systematizing llm persona design a four-quadrant technical taxonomy for ai compa | arXiv: 2511.02979
- wider or deeper scaling llm inference-time compute with adaptive branching tree | arXiv: 2503.04412
- ai progress should be measured by capability-per-resource not scale alone a fram | arXiv: 2511.01077
- alternating gradient flows a theory of feature learning in two-layer neural netw | arXiv: 2506.06489
- an empirical investigation of neural odes and symbolic regression for dynamical | arXiv: 2601.20637
- beyond benign overfitting in nadaraya-watson interpolators | arXiv: 2502.07480
- born a transformer -- always a transformer on the effect of pretraining on archi | arXiv: 2505.21785
- breaking the frozen subspace importance sampling for low-rank optimization in ll | arXiv: 2502.05790
- broken tokens your language model can secretly handle non-canonical tokenization | arXiv: 2506.19004
- conformal risk training end-to-end optimization of conformal risk control | arXiv: 2510.08748
- differentiable hierarchical visual tokenization | arXiv: 2511.02652
- disaggregation reveals hidden training dynamics the case of agreement attraction | arXiv: 2510.24934
- does object binding naturally emerge in large pretrained vision transformers | arXiv: 2510.24709
- efficient pre-training of llms via topology-aware communication alignment on mor | arXiv: 2509.15940
- enhancing training data attribution with representational optimization | arXiv: 2505.18513
- final-model-only data attribution with a unifying view of gradient-based methods | arXiv: 2412.03906
- flatness is necessary neural collapse is not rethinking generalization via grokk | arXiv: 2509.17738
- gemstones a model suite for multi-faceted scaling laws | arXiv: 2502.06857
- generalization bounds for rank-sparse neural networks | arXiv: 2510.21945
- global minimizers of sigmoid contrastive loss | arXiv: 2509.18552
- gradient-weight alignment as a train-time proxy for generalization in classifica | arXiv: 2510.25480
- how does sequence modeling architecture influence base capabilities of pre-train | arXiv: 2505.18522
- language model behavioral phases are consistent across archi | arXiv: 2510.24963
- learning the wrong lessons syntactic-domain spurious correlations in language mo | arXiv: 2509.21155
- learning to flow from generative pretext tasks for neural architecture encoding | arXiv: 2510.18360
- leveraging importance sampling to detach alignment modules from large language m | arXiv: 2505.19700
- lm behavioral phases | arXiv: 2510.24963
- memory mosaics at scale | arXiv: 2507.03285
- nemotron-climb clustering-based iterative data mixture bootstrapping for languag | arXiv: 2504.13161
- neural collapse under gradient flow on shallow relu networks for orthogonally se | arXiv: 2510.21078
- optimal online change detection via random fourier features | arXiv: 2505.17789
- power lines scaling laws for weight decay and batch size in llm pre-training | arXiv: 2505.13738
- predict training data quality via its geometry in metric space | arXiv: 2510.15970
- prescribe predicting single-cell responses with bayesian estimation | arXiv: 2510.07964
- quantifying task-relevant representational similarity using decision variable co | arXiv: 2506.02164
- retrospective in-context learning for temporal credit assignm | arXiv: 2602.17497
- ricl temporal credit | arXiv: 2602.17497
- scalable fingerprinting of large language models | arXiv: 2502.07760
- scaling embedding layers in language models | arXiv: 2502.01637
- superposition yields robust neural scaling | arXiv: 2505.10465
- the curse of depth in large language models | arXiv: 2502.05795
- through the river understanding the benefit of schedule-free methods for languag | arXiv: 2507.09846
- understanding and enhancing mask-based pretraining towards universal representat | arXiv: 2509.21650
- zeus zero-shot embeddings for unsupervised separation of tabular data | arXiv: 2505.10704
- time temporal reasoning | arXiv: 2505.12891
- a cramér–von mises approach to incentivizing truthful data sha | arXiv: 2506.07272
- a reliable cryptographic framework for empirical machine unl | arXiv: 2404.11577
- buffer layers for test-time adaptation | arXiv: 2510.21271
- demystifying language model forgetting with low-rank example associations | arXiv: 2406.14026
- finding structure in continual learning | arXiv: 2602.04555
- procurement auctions with predictions improved frugality for facility location | arXiv: 2512.09367
- simu selective influence machine unlearning | arXiv: 2510.07822
- stop ddos attacking the research community with ai-generated survey papers | arXiv: 2510.09686
- teaming llms to detect and mitigate hallucinations | arXiv: 2510.19507
- trust -- transformer-driven u-net for sparse target recovery | arXiv: 2506.01112
- less is more but where dynamic token compression via llm-guided keyframe prior | arXiv: 2512.06866
- qsvd efficient low-rank approximation for unified query-key-value weight compres | arXiv: 2510.16292
- scalable exploration via ensemble | arXiv: 2407.13195
- when worse is better navigating the compression-generation tradeoff in visual to | arXiv: 2412.16326
- adaptive originality filtering rejection based prompting and riddlescore for cul | arXiv: 2508.18709
- dcad-2000 a multilingual dataset across 2000 languages with data cleaning as ano | arXiv: 2502.11546
- enhancing multilingual llm pretraining with model-based data selection | arXiv: 2502.10361
- exploring the translation mechanism of large language models | arXiv: 2502.11806
- helpsteer3-preference open human-annotated preference data across diverse tasks | arXiv: 2505.11475
- how data mixing shapes in-context learning asymptotic equivalence for transforme | arXiv: 2510.25753
- mergebench a benchmark for merging domain-specialized llms | arXiv: 2505.10833
- merit multilingual semantic retrieval with interleaved multi-condition query | arXiv: 2506.03144
- parallelprompt extracting parallelism from large language model queries | arXiv: 2506.18728
- quantifying climate policy action and its links to development outcomes a cross- | arXiv: 2510.17425
- zero-shot performance prediction for probabilistic scaling laws | arXiv: 2510.16743
- danmaku tpp bench | arXiv: 2505.18411
- ifinder structured zero-shot vision-based llm grounding for dash-cam video reaso | arXiv: 2509.19552
- rtv bench benchmarking mllm continuous perception through realtime video | arXiv: 2505.02064
- lr yolo lipschitz continuity image restoration object detection | arXiv: 2510.24232
- m-grpo stabilizing self-supervised reinforcement learning for multimodal underst | arXiv: 2512.13070
- beyond $\tilde{O}(\sqrt{T})$ constraint violation for online convex optimization with adve | arXiv: 2505.06709
- constrained network slice assignment via llms | arXiv: 2512.00040
- contribution of task-irrelevant stimuli to drift of neural representations | arXiv: 2510.21588
- a differentiable model of supply-chain shocks | arXiv: 2511.05231
- exact learning of arithmetic with differentiable agents | arXiv: 2511.22751
- orbitzoo real orbital systems challenges for reinforcement learning | arXiv: 2504.04160
- ortholoc uav 6-dof localization and calibration using orthographic geodata | arXiv: 2509.18350
- multi-modal masked autoencoders for learning image-spectrum associations for gal | arXiv: 2510.22527
- r2ec towards large recommender models with reasoning | arXiv: 2505.16994
- adaptive cooperative transmission design for ultra-reliable low-latency communic | arXiv: 2511.02216
- boundary to region supervision for offline safe rl | arXiv: 2509.25727
- confounding robust deep reinforcement learning a causal approach | arXiv: 2510.21110
- continual knowledge adaptation for reinforcement learning | arXiv: 2510.19314
- interactive and hybrid imitation learning provably beating behavior cloning | arXiv: 2412.07057
- inverse optimization latent variable models for learning costs applied to route | arXiv: 2509.15999
- last iterate convergence in monotone mean field games | arXiv: 2410.05127
- coopera continual open ended human robot assistance | arXiv: 2510.23495
- dexflywheel a scalable and self-improving data generation framework for dexterou | arXiv: 2509.23829
- egothinker egocentric reasoning | arXiv: 2510.23569
- t-rex task-adaptive spatial representation extraction for robotic manipulation w | arXiv: 2506.19498
- one-shot transfer learning nonlinear pdes perturbative pinns | arXiv: 2511.11137
- mechanistic interpretability of rnns emulating hidden mar | arXiv: 2510.25674
- starformer semi-supervised task-informed representation learning via dynamic att | arXiv: 2504.10097
- trident tri-modal molecular representation learning with taxonomic annotations a | arXiv: 2506.21028
- a multitask benchmark for abusive language detection in lowr | arXiv: 2505.12116
- active slice discovery in large language models | arXiv: 2511.20713
- auto-search and refinement an automated framework for gender bias mitigation in | arXiv: 2502.11559
- averimatec a dataset for automatic verification of image-text claims with eviden | arXiv: 2505.17978
- concept-level explainability for auditing steering llm responses | arXiv: 2505.07610
- date-lm benchmarking data attribution evaluation for large language models | arXiv: 2507.09424
- deeptraverse a depth-first search inspired network for algorithmic visual unders | arXiv: 2506.10084
- don't let it fade preserving edits in diffusion language mode
- evaluating multiple models using labeled and unlabeled data | arXiv: 2501.11866
- graphkeeper graph domain-incremental learning via knowledge disentanglement and | arXiv: 2511.00097
- if-guide influence function-guided detoxification of llms | arXiv: 2506.01790
- noise-robustness through noise a framework combining asymmetric lora with poison | arXiv: 2505.23868
- os-harm a benchmark for measuring safety of computer use agents | arXiv: 2506.14866
- policy-as-prompt turning ai governance rules into guardrails for ai agents | arXiv: 2509.23994
- position paper if innovation in ai systematically violates fundamental rights is | arXiv: 2511.00027
- precise information control in long-form text generation | arXiv: 2506.06589
- slaying towards queer language processing | arXiv: 2509.17449
- connecting the dots a machine learning ready dataset for ionospheric forecasting | arXiv: 2511.15743
- ioncast a deep learning framework for forecasting ionospheric total electron con | arXiv: 2511.15004
- maestro adaptive sparse attention and robust learning for multimodal dynamic tim | arXiv: 2509.25278
- autoregressive adversarial post-training for real-time interac | arXiv: 2506.09350
- dismo disentangled motion representations for open-world moti | arXiv: 2511.23428
- force prompting video generation models can learn and generalize physics-based c | arXiv: 2505.19386
- foresight adaptive layer reuse for accelerated and highquali | arXiv: 2506.00329
- lemica lexicographic minimax path caching for efficient diffusion-based video ge | arXiv: 2511.00090
- magcache fast video generation with magnitude-aware cache | arXiv: 2506.09045
- photography perspective composition towards aesthetic perspective recommendation | arXiv: 2505.20655
- physctrl generative physics for controllable and physicsgrou | arXiv: 2509.20358
- posecrafter extreme pose estimation with hybrid video synthesis | arXiv: 2510.19527
- radial attention $O(n \log n)$ sparse attention with energy decay for long video gener | arXiv: 2506.19852
- rlgf reinforcement learning with geometric feedback for autonomous driving video | arXiv: 2509.16500
- s2q-vdit accurate quantized video diffusion transformer with salient data and sp | arXiv: 2508.04016
- safesora safe text-to-video generation via graphical watermark | arXiv: 2505.12667
- scaling rl to long videos | arXiv: 2507.07966
- seeing the wind from a falling leaf | arXiv: 2512.00762
- self forcing bridging the train-test gap in autoregressive video diffusion | arXiv: 2506.08009
- stable cinemetrics structured taxonomy and evaluation for professional video gen | arXiv: 2509.26555
- training-free efficient video generation via dynamic token carving | arXiv: 2505.16864
- video diffusion models excel at tracking similar-looking objects without supervi | arXiv: 2512.02339
- video killed the energy budget characterizing the latency and power regimes of o | arXiv: 2509.19222
- vmdt decoding the trustworthiness of video foundation models | arXiv: 2511.05682
- vorta efficient video diffusion via routing sparse attention | arXiv: 2505.18809
- vsa faster video diffusion with trainable sparse attention | arXiv: 2505.13389
- dualground phrase temporal | arXiv: 2510.20244
- egogazevqa egocentric gaze guided video question answering | arXiv: 2509.07447
- star tool video qa | arXiv: 2512.10359
- tempsamp r1 temporal grounding | arXiv: 2509.18056