Kartik Narayan
I am a second-year Ph.D. student in the Computer Science department
at Johns Hopkins University, where I am a member of the VIU Lab, advised by Dr. Vishal Patel. My research focuses
on computer vision and its applications in face analysis, understanding, and recognition, with a
particular emphasis on multimodal LLMs and generation.
Prior to my doctoral studies, I worked as an undergraduate researcher under
Prof. Richa Singh and
Prof. Mayank Vatsa at the
Image Analysis and Biometrics (IAB) Lab,
IIT Jodhpur, where I worked on deepfake video generation. During my time at IIT Jodhpur,
I also collaborated with
Dr. Suman Kundu and
Dr. Suchetana Chakraborty.
Additionally, I interned at the University of Texas, San Antonio, where I worked with
Dr. Heena Rathore and
Dr. Faycal Znidi.
Email / CV / Google Scholar / LinkedIn / Twitter / GitHub
Open to internship opportunities for Summer 2025
|
|
Research
My research primarily focuses on computer vision and its applications in face analysis,
understanding, and recognition,
with the goal of developing robust open-world algorithms that can be deployed for real-world impact.
My specific
research interests include multimodal LLMs, video generation, face forensics,
parameter-efficient
fine-tuning,
face segmentation,
and representation learning. While most of my work has been in the face domain, I am
open to exploring general vision research problems, particularly in the field of multimodal LLMs. I have
authored multiple first-author papers at top conferences such as CVPR and AAAI.
|
News
- December 2024: One paper accepted at AAAI 2025.
- October 2024: One paper accepted at WACV 2025.
- October 2024: One paper accepted at IEEE TBIOM.
- January 2024: One paper accepted at FG 2024.
- August 2023: Joined VIU Lab, Johns Hopkins University, as a Ph.D. student.
- May 2023: Received the CVPR Student Travel Award.
- February 2023: One paper accepted at CVPR 2023.
- September 2022: One paper accepted at IEEE Access.
- August 2022: One paper accepted at IJCB 2022.
- April 2022: One paper accepted at CVPRW TCV 2022.
- January 2022: One paper accepted at IEEE SysCon 2022.
|
Publications
See my Google Scholar profile for the complete and most recent list of publications.
|
|
SegFace: Face Segmentation of Long-Tail Classes
Kartik Narayan,
Vibashan VS,
Vishal M. Patel
AAAI, 2025
arXiv /
project /
code
We propose SegFace, a simple and efficient transformer-based model which utilizes
learnable class-specific tokens, allowing each token to
focus on its corresponding class, thereby enabling independent modeling of each class. It improves
the performance of long-tail classes and outperforms previous state-of-the-art models, achieving a
mean F1 score of 88.96 (+2.82) on the
CelebAMask-HQ dataset and 93.03 (+0.65) on the LaPa dataset.
|
|
|
DF-Platter: Multi-subject Heterogeneous Deepfake Dataset
Kartik Narayan,
Harsh Agarwal,
Kartik Thakral,
Surbhi Mittal,
Mayank Vatsa,
Richa Singh
CVPR, 2023
paper
/
poster
In this research, we emulate the real-world scenario of deepfake generation and spreading, and
propose the DF-Platter dataset which contains (i) both low-resolution and high-resolution deepfakes
generated using multiple generation techniques, (ii) single-subject and multiple-subject deepfakes.
The results demonstrate a significant performance reduction in the deepfake detection task on
low-resolution deepfakes and show that the existing techniques fail drastically on multiple-subject
deepfakes.
|
|
|
FaceXBench: Evaluating Multimodal LLMs on Face Understanding
Kartik Narayan,
Vibashan VS,
Vishal M. Patel
Under Review
paper
We introduce FaceXBench, a comprehensive benchmark designed to evaluate MLLMs on complex face
understanding tasks. FaceXBench includes 5,000 multimodal multiple-choice questions derived from 25
public datasets
and a newly created dataset, FaceXAPI. These questions cover 14 tasks across 6 broad categories,
assessing MLLMs'
face understanding abilities in bias and fairness, face authentication, recognition, analysis,
localization and tool
retrieval.
Using FaceXBench, we conduct an extensive evaluation of 26 open-source MLLMs alongside 2
proprietary models,
revealing the unique challenges in complex face understanding tasks.
|
|
|
FaceXFormer: A Unified Transformer for Facial Analysis
Kartik Narayan,
Vibashan VS,
Rama Chellappa,
Vishal M. Patel
arXiv /
project /
code (198 ⭐)
FaceXFormer is an end-to-end unified model capable of handling a comprehensive range of facial
analysis tasks such as face parsing, landmark detection, head pose estimation, attribute recognition,
and estimation of age, gender, race, and landmark visibility. It leverages a transformer-based
encoder-decoder architecture where each task is treated as a learnable token, enabling
the integration of multiple tasks within a single framework.
|
|
|
PETALface: Parameter Efficient Transfer Learning for Low-resolution Face Recognition
Kartik Narayan,
Nithin Gopalkrishnan Nair,
Jennifer Xu,
Rama Chellappa,
Vishal M. Patel
WACV, 2025
arXiv /
project /
code
PETALface is a parameter-efficient transfer learning approach that adapts to low-resolution
datasets, outperforming pre-trained models with a negligible drop in performance on high-resolution and
mixed-quality datasets. It enables the development of generalized models that achieve competitive
performance on high-resolution (LFW, CFP-FP, CPLFW, AgeDB, CALFW, CFP-FF) and mixed-quality
(IJB-B, IJB-C) datasets, with large improvements on low-quality surveillance datasets
(TinyFace, BRIAR, IJB-S).
|
|
|
Hyp-OC: Hyperbolic One Class Classification for Face Anti-Spoofing
Kartik Narayan,
Vishal M. Patel
IEEE International Conference on Automatic Face and Gesture Recognition (FG), 2024
paper /
project /
code
Most prior research in face anti-spoofing (FAS) approaches it as a two-class classification task
where models are
trained on real samples and known spoof attacks and tested for detection performance on unknown
spoof attacks. However,
in practice, FAS should be treated as a one-class classification task where, while training, one
cannot assume any
knowledge regarding the spoof samples a priori. Hyp-OC is the first work exploring hyperbolic
embeddings for one-class
face anti-spoofing (OC-FAS).
|
|
|
Improved Representation Learning for Unconstrained Face Recognition
Nithin Gopalkrishnan Nair,
Kartik Narayan,
Maitreya Suin,
Ram Prabhakar,
Jennifer Xu,
Soraya Stevens,
Joshua Gleason,
Nathan Shnidman,
Rama Chellappa,
Vishal M. Patel
Under Review
In this work, we tackle the problem of low-quality face recognition. We take a closer look at the
nature of data in low-resolution datasets and redefine paradigms in terms of model choice, data input
pipeline, and fine-tuning schemes.
With the accumulated effect of all our design choices, we achieve state-of-the-art results on
mixed-quality benchmarks (IJB-B, IJB-C) as well as multiple challenging benchmarks for unconstrained
face recognition (TinyFace, IJB-S, and BRIAR), thereby opening up a new avenue of research in the area.
|
|
|
DeePhyNet: Towards Detecting Phylogeny in Deepfakes
Kartik Thakral,
Harsh Agarwal,
Kartik Narayan,
Surbhi Mittal,
Mayank Vatsa,
Richa Singh
IEEE Transactions on Biometrics, Behavior, and Identity Science (T-BIOM)
paper
We propose DeePhyNet, which performs three tasks: it first differentiates between real and fake
content; it next identifies the signature of the generative algorithm used for deepfake creation;
and finally, it predicts the phylogeny of the algorithms used for generation.
To the best of our knowledge, this is the first algorithm that performs all three tasks together
for deepfake media analysis.
|
|
|
DeePhy: On Deepfake Phylogeny
Kartik Narayan,
Harsh Agarwal,
Kartik Thakral,
Surbhi Mittal,
Mayank Vatsa,
Richa Singh
International Joint Conference on Biometrics (IJCB), 2022
paper
We propose the idea of DeepFake Phylogeny and a complementary dataset, DeePhy. The paper shows the
need to evolve research on model attribution of deepfakes and facilitates advancements in real-life
scenarios of plagiarism detection, forgery detection, and reverse engineering of deepfakes.
|
|
|
DeSI: Deepfake Source Identifier for Social Media
Kartik Narayan,
Harsh Agarwal,
Surbhi Mittal,
Kartik Thakral,
Suman Kundu,
Mayank Vatsa,
Richa Singh
CVPR Workshops, 2022
paper
We develop an algorithm to find the source/propagator of tweets with deepfake/manipulated
images/videos relevant to a given text query. The result is shown in the form of a force-directed graph,
which gives temporal insight into the spread pattern and also identifies the volatile nodes in the
network by predicting the virality of tweets.
|
|
|
Using Epidemic Modelling, Machine Learning and Control Feedback Strategy for Policy Management of COVID-19
Kartik Narayan,
Heena Rathore,
Faycal Znidi
IEEE Access, 2022
paper
/
code
We propose a threshold mechanism for policy control by analyzing the SIR model and estimating the
optimal parameters. Our work helps keep the economic impact of a pandemic under control and also
helps in predicting the approximate duration of the lockdown.
|
|
|
Leveraging ambient sensing for the estimation of curiosity-driven human crowd
Anirban Das,
Kartik Narayan,
Suchetana Chakraborty
IEEE Systems Conference (SysCon), 2022
paper
We predict the curious crowd attracted to an object by measuring its spatiotemporal significance.
The work utilizes a set of passive sensors and wireless signal properties for the estimation.
|
|