CVPR 2026 · Student Paper

Interactive Episodic Memory
with User Feedback

The first interactive EM-NLQ framework — bringing human feedback into visual memory search

Nikesh Subedi · Loris Bazzani · Ziad Al-Halah

University of Utah

Paper PDF · Code (coming soon) · arXiv · Dataset (coming soon)

Abstract

Bringing Interactivity to Visual Memory Search

Current Episodic Memory with Natural Language Query (EM-NLQ) models ignore a critical real-world factor: interactivity. Users naturally refine queries and provide feedback when results are off — yet no existing method can leverage this. We introduce ReFocus, the first interactive EM-NLQ framework, built around a plug-and-play Feedback ALignment Module (FALM) that enables any base model to incorporate user feedback iteratively. We also present the EM-QnF task and dataset for feedback-driven interaction, with a lightweight training scheme requiring no sequential optimization. ReFocus achieves state-of-the-art results on three benchmarks and strong gains in human-based evaluation.

Citation

BibTeX

If you find this work useful in your research, please consider citing:

```bibtex
@inproceedings{subedi2026refocus,
  title     = {Interactive Episodic Memory with User Feedback},
  author    = {Subedi, Nikesh and Bazzani, Loris and Al-Halah, Ziad},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer
               Vision and Pattern Recognition (CVPR)},
  year      = {2026}
}
```