Vision Language Models are Biased
Large language models (LLMs) memorize a vast amount of prior knowledge from the Internet that helps them on downstream tasks...
Multi-head self-attention (MHSA) is a key component of Transformers, a widely popular architecture in both language and vision. Multiple heads...
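As a refresher on the mechanism this work analyzes, the sketch below is a minimal NumPy forward pass of multi-head self-attention (random weights, a single sequence, no masking or dropout); it follows the standard Transformer formulation and is not code from the paper.

```python
import numpy as np

def multi_head_self_attention(X, Wq, Wk, Wv, Wo, num_heads):
    """Minimal multi-head self-attention forward pass (no masking, no dropout).

    X:  (seq_len, d_model) input token embeddings.
    Wq, Wk, Wv: (d_model, d_model) projection matrices, split across heads.
    Wo: (d_model, d_model) output projection.
    """
    seq_len, d_model = X.shape
    d_head = d_model // num_heads

    # Project inputs to queries, keys, values and split them into heads.
    Q = (X @ Wq).reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)
    K = (X @ Wk).reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)
    V = (X @ Wv).reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    # Scaled dot-product attention, computed independently per head.
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_head)       # (heads, seq, seq)
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)                  # softmax over keys
    heads = weights @ V                                         # (heads, seq, d_head)

    # Concatenate the heads and apply the output projection.
    concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return concat @ Wo

# Tiny usage example with random weights.
rng = np.random.default_rng(0)
d_model, seq_len, num_heads = 16, 5, 4
X = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv, Wo = (rng.normal(size=(d_model, d_model)) for _ in range(4))
print(multi_head_self_attention(X, Wq, Wk, Wv, Wo, num_heads).shape)  # (5, 16)
```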
Detecting object-level changes between two images across possibly different views is a core task in many applications that involve visual...
Generative AI (GenAI) holds significant promise for automating everyday image editing tasks, especially following the recent release of GPT-4o on...
Most face identification approaches employ a Siamese neural network to compare two images at the image embedding level. Yet, this...
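For illustration, here is a minimal sketch of the image-embedding comparison that a Siamese pipeline performs; the `embed` function and the 0.5 threshold are hypothetical stand-ins, not the system studied in the paper.

```python
import numpy as np

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def same_person(embed, img_a, img_b, threshold=0.5):
    """Decide whether two face crops show the same identity by comparing
    their image-level embeddings, as a typical Siamese pipeline does.

    `embed` is any function mapping an image array to a feature vector
    (e.g., the shared backbone of a Siamese network); the threshold is a
    placeholder that would normally be tuned on a validation set.
    """
    return cosine_similarity(embed(img_a), embed(img_b)) >= threshold

# Toy "embedder" standing in for a trained backbone.
embed = lambda img: img.reshape(-1)[:128].astype(float)
a = np.random.rand(112, 112, 3)
print(same_person(embed, a, a))  # True: identical images give similarity 1.0
```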
Nearest neighbors (NN) are traditionally used to compute final decisions, e.g., in Support Vector Machines or k-NN classifiers, and to...
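A bare-bones example of the k-NN decision rule mentioned here, on toy 2-D data; it is only meant to make the "nearest neighbors compute the final decision" step concrete, not to reproduce any experiment from the paper.

```python
import numpy as np

def knn_predict(query, train_X, train_y, k=3):
    """Classify `query` by majority vote over its k nearest training
    examples in Euclidean distance -- the classic k-NN decision rule."""
    dists = np.linalg.norm(train_X - query, axis=1)
    nearest = np.argsort(dists)[:k]
    labels, counts = np.unique(train_y[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Toy 2-D example: two clusters, one per class.
train_X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
train_y = np.array([0, 0, 1, 1])
print(knn_predict(np.array([0.05, 0.1]), train_X, train_y, k=3))  # 0
```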
While large language models with vision capabilities (VLMs), e.g., GPT-4o and Gemini-1.5 Pro, are powering various image-text applications and scoring...
CLIP-based classifiers rely on the prompt containing a {class name} that is known to the text encoder...
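To make this prompt-based setup concrete, here is a minimal zero-shot CLIP classification sketch using the Hugging Face transformers wrappers; the checkpoint, image path, and class names are placeholders, and this shows the standard CLIP recipe rather than the method proposed in the paper.

```python
# pip install transformers torch pillow
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# The prompt embeds the {class name}; the text encoder must "know" these names.
class_names = ["tabby cat", "golden retriever"]
prompts = [f"a photo of a {name}" for name in class_names]

image = Image.open("example.jpg")  # any local test image
inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)
outputs = model(**inputs)

# Image-text similarity scores, softmaxed into per-class probabilities.
probs = outputs.logits_per_image.softmax(dim=-1)
for name, p in zip(class_names, probs[0].tolist()):
    print(f"{name}: {p:.3f}")
```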
Large multimodal models (LMMs) have evolved from large language models (LLMs) to integrate multiple input modalities, such as visual inputs....
Image classifiers are information-discarding machines, by design. Yet, how these models discard information remains mysterious. We hypothesize that one way...
Face identification (FI) is ubiquitous and drives many high-stakes decisions made by law enforcement. State-of-the-art FI approaches compare two images...
Large-scale, multimodal models trained on web data such as OpenAI’s CLIP are becoming the foundation of many applications. Yet, they...
Explaining artificial intelligence (AI) predictions is increasingly important and even imperative in many high-stakes applications where humans are the ultimate...
Three important criteria of existing convolutional neural networks (CNNs) are (1) test-set accuracy; (2) out-of-distribution accuracy; and (3) explainability. While...
Recent research in adversarially robust classifiers suggests their representations tend to be aligned with human perception, which makes them attractive...
Explaining the decisions of an Artificial Intelligence (AI) model is increasingly critical in many real-world, high-stakes applications. Hundreds of papers...
Interpretability methods often measure the contribution of an input feature to an image classifier’s decisions by heuristically removing it via...
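As a concrete instance of such heuristic removal, the sketch below grays out one patch at a time and records the drop in the classifier's confidence (occlusion-style attribution); the toy classifier and patch size are placeholders, not the setup evaluated in the paper.

```python
import numpy as np

def occlusion_importance(predict, image, patch=16, fill=0.5):
    """Heuristic feature-removal attribution: gray out one patch at a time
    and record how much the classifier's top-class confidence drops.

    `predict` maps an (H, W, C) float image in [0, 1] to class probabilities;
    here it is any callable standing in for a trained classifier.
    """
    base = predict(image)
    target = int(np.argmax(base))
    h, w = image.shape[:2]
    heatmap = np.zeros((h // patch, w // patch))
    for i in range(0, h - h % patch, patch):
        for j in range(0, w - w % patch, patch):
            occluded = image.copy()
            occluded[i:i + patch, j:j + patch] = fill  # "remove" the patch
            heatmap[i // patch, j // patch] = base[target] - predict(occluded)[target]
    return heatmap  # higher value = larger confidence drop when the patch is removed

# Toy classifier: "confidence" proportional to the mean brightness of the top-left corner.
toy_predict = lambda img: np.array([img[:16, :16].mean(), 1 - img[:16, :16].mean()])
img = np.random.rand(64, 64, 3)
print(occlusion_importance(toy_predict, img).round(2))
```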
Adversarial training has been the topic of dozens of studies and a leading method for...
Attribution methods can provide powerful insights into the reasons for a classifier’s decision. We argue that a...
Despite excellent performance on stationary test sets, deep neural networks (DNNs) can fail to generalize to out-of-distribution (OoD) inputs, including...
Large, pre-trained generative models have been increasingly popular and useful to both the research and wider communities. Specifically, BigGANs a...
Motion-sensor cameras in natural habitats offer the opportunity to inexpensively and unobtrusively gather vast amounts of data on animals in...
Training deep neural networks on images represented as grids of pixels has brought to light an interesting phenomenon known as...
Generating high-resolution, photo-realistic images has been a long-standing goal in machine learning. Recently, Nguyen et al. (2016) showed one interesting...
Deep neural networks (DNNs) have demonstrated state-of-the-art results on many pattern recognition tasks, especially vision classification problems. Understanding the inner...
We can better understand deep neural networks by identifying which features each of their neurons have learned to detect. To...
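One standard way to identify what a neuron detects is activation maximization: gradient ascent on the input to maximize that unit's activation. The sketch below is a bare-bones PyTorch version (the torchvision ResNet-18, unit index, learning rate, and step count are arbitrary choices, and the image priors and regularizers discussed in this line of work are omitted).

```python
import torch
import torchvision

# Activation maximization: synthesize an input that maximally activates one unit.
model = torchvision.models.resnet18(weights="IMAGENET1K_V1").eval()
unit = 123                              # arbitrary output unit (ImageNet class logit)

x = torch.zeros(1, 3, 224, 224, requires_grad=True)
optimizer = torch.optim.Adam([x], lr=0.05)
for step in range(200):
    optimizer.zero_grad()
    loss = -model(x)[0, unit]           # negative activation -> gradient ascent on x
    loss.backward()
    optimizer.step()
    x.data.clamp_(0, 1)                 # keep pixels in a valid image range

print(f"final activation: {model(x)[0, unit].item():.3f}")
```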
Deep neural networks (DNNs) have recently been achieving state-of-the-art performance on a variety of pattern-recognition tasks, most notably visual classification...
The Achilles Heel of stochastic optimization algorithms is getting trapped on local optima. Novelty Search avoids this problem by encouraging...
Recent years have produced great advances in training large, deep neural networks (DNNs), including notable successes in training convolutional neural...