PV-VTT: A Privacy-Centric Dataset for Mission-Specific Anomaly Detection and Natural Language Interpretation

Masukawa, Ryozo; Yun, Sanggeon; Yamaguchi, Yoshiki; Imani, Mohsen

arXiv:2410.22623 (cs)

[Submitted on 30 Oct 2024 (v1), last revised 4 Dec 2024 (this version, v2)]

Abstract: Video crime detection is a significant application of computer vision and artificial intelligence. However, existing datasets primarily focus on detecting severe crimes by analyzing entire video clips, often neglecting the precursor activities (i.e., privacy violations) whose detection could help prevent these crimes. To address this limitation, we present PV-VTT (Privacy Violation Video To Text), a unique multimodal dataset aimed at identifying privacy violations. PV-VTT provides detailed video and text annotations for each scenario. To ensure the privacy of individuals in the videos, we provide only video feature vectors and do not release any raw video data. This privacy-focused approach allows researchers to use the dataset while protecting participant confidentiality. Recognizing that privacy violations are often ambiguous and context-dependent, we propose a Graph Neural Network (GNN)-based video description model. Our model generates a GNN-based prompt, paired with an image, for a Large Language Model (LLM), which delivers cost-effective, high-quality video descriptions. By using a single video frame along with relevant text, our method reduces the number of input tokens required, maintaining descriptive quality while optimizing LLM API usage. Extensive experiments validate the effectiveness and interpretability of our approach in video description tasks and the flexibility of the PV-VTT dataset.
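The abstract does not specify the GNN architecture, so the following is only a minimal sketch of the general idea behind graph-based feature aggregation, not the authors' model. It assumes each graph node carries a small feature vector and performs one round of mean-aggregation message passing. The function name `mean_aggregate`, the node names, and the toy vectors are all hypothetical and chosen for illustration.

```python
from typing import Dict, List

Vector = List[float]


def mean_aggregate(
    features: Dict[str, Vector],
    neighbors: Dict[str, List[str]],
) -> Dict[str, Vector]:
    """One round of mean-aggregation message passing (illustrative only).

    Each node's new vector is the element-wise mean of its own vector
    and the vectors of its listed neighbors.
    """
    updated: Dict[str, Vector] = {}
    for node, vec in features.items():
        # Include the node itself so isolated nodes keep their features.
        group = [vec] + [features[n] for n in neighbors.get(node, [])]
        updated[node] = [
            sum(v[i] for v in group) / len(group) for i in range(len(vec))
        ]
    return updated


# Hypothetical toy graph: three nodes connected in a chain a - b - c.
toy_features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
toy_neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
toy_output = mean_aggregate(toy_features, toy_neighbors)
```

In this toy graph, node `a` ends up with the average of its own vector and that of `b`. A real GNN would add learned weights and nonlinearities on top of this aggregation step.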

Comments:	Accepted to WACV 2025. Dataset available here: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2410.22623 [cs.CV]
	(or arXiv:2410.22623v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2410.22623

Submission history

From: Ryozo Masukawa
[v1] Wed, 30 Oct 2024 01:02:20 UTC (1,429 KB)
[v2] Wed, 4 Dec 2024 23:15:45 UTC (1,429 KB)
