Search results (32) found.
Detect, Describe, Discriminate: Moving Beyond VQA for MLLM Evaluation
Workshop on Emergent Visual Abilities and Limits of Foundation Models, EVAL-FoMo W, 2024
Localizing Auditory Concepts in CNNs
ICML Mechanistic Interpretability Workshop, ICMLMI-W, 2024
System and method for identifying soundtrack for a digital book using a movie adaptation technique
United States Patent, US patent, 2024
Core Rank : - Google Rank :-
Major Entity Identification: A Generalizable Alternative to Coreference Resolution
Conference on Empirical Methods in Natural Language Processing, EMNLP, 2024
Core Rank : A* Google Rank :193
MICap: A Unified Model for Identity-aware Movie Descriptions
Computer Vision and Pattern Recognition, CVPR, 2024
Core Rank : A* Google Rank :440
Previously On ... From Recaps to Story Summarization
Computer Vision and Pattern Recognition, CVPR, 2024
Core Rank : A* Google Rank :440
How you feelin'? Learning Emotions and Mental States in Movie Scenes
Computer Vision and Pattern Recognition, CVPR, 2023
Core Rank : A* Google Rank :440
GrapeQA: GRaph Augmentation and Pruning to Enhance Question-Answering
WWW Workshop on Natural Language Processing for Knowledge Graph Construction, NLP4KGc, 2023
Core Rank : - Google Rank :-
Do Video-Language Foundation Models Have a Sense of Time?
Workshop at the International Conference on Learning Representations, ICLR-W, 2023
Core Rank : - Google Rank :-
Test of Time: Instilling Video-Language Models with a Sense of Time
Computer Vision and Pattern Recognition, CVPR, 2023
Core Rank : A* Google Rank :440