Referring Video Object Clustering (RVOC)
Sep 5, 2024
Steven Hany
Mina Samir
Moamen Zaher
Abstract
System surveillance involves the continuous monitoring and analysis of organizational data to ensure security and operational efficiency. This proactive approach helps detect anomalies, threats, and performance issues, thereby safeguarding sensitive information and maintaining system integrity. This paper proposes a framework for querying videos, named Referring Video Object Clustering (RVOC). Traditional methods require the number of clusters to be fixed before object identification, which is itself based on pre-trained datasets, and their components, such as Principal Component Analysis (PCA), rely on fixed numerical values. In contrast, RVOC performs dynamic clustering without prior training, allowing the number of classes to adjust to the retrieved video objects. The method uses a natural language processing (NLP) query system that supports detailed searches such as “person skating on a red skateboard.” The system searches and analyzes video content for instances matching the query, then segments the detected individuals and groups them with clustering algorithms. Each cluster represents a unique person or object, which lets users navigate the results and select the desired individual easily. In tests on multilabel car and color datasets, RVOC achieved a Normalized Mutual Information (NMI) score of 51%.
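To make two ideas from the abstract concrete, the sketch below shows (1) a clustering step in which the number of clusters is not fixed in advance and (2) how an NMI score can be computed against ground-truth labels. It is an illustration under stated assumptions, not the RVOC implementation: the abstract does not specify the clustering algorithm, so a simple greedy threshold ("leader") clustering is swapped in, and the embeddings, the `threshold` value, and all function names are hypothetical. The NMI here uses arithmetic-mean normalization of the two entropies, which is one common convention; the paper may use another.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def threshold_cluster(embeddings, threshold=0.9):
    """Greedy leader clustering (illustrative stand-in, not RVOC's method).

    Each embedding joins the most similar existing cluster leader if the
    similarity reaches `threshold`; otherwise it starts a new cluster.
    The number of clusters therefore emerges from the data.
    """
    leaders, labels = [], []
    for e in embeddings:
        best, best_sim = -1, threshold
        for i, leader in enumerate(leaders):
            s = cosine(e, leader)
            if s >= best_sim:
                best, best_sim = i, s
        if best == -1:  # no leader is similar enough: open a new cluster
            leaders.append(e)
            best = len(leaders) - 1
        labels.append(best)
    return labels


def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def nmi(true_labels, pred_labels):
    """Normalized Mutual Information with arithmetic-mean normalization."""
    n = len(true_labels)
    ct, cp = Counter(true_labels), Counter(pred_labels)
    joint = Counter(zip(true_labels, pred_labels))
    mi = sum(
        (c / n) * math.log((c * n) / (ct[t] * cp[p]))
        for (t, p), c in joint.items()
    )
    denom = (entropy(true_labels) + entropy(pred_labels)) / 2
    return mi / denom if denom > 0 else 1.0


# Hypothetical 2-D embeddings of four detected objects and their true identities.
embeddings = [[1.0, 0.0], [0.99, 0.05], [0.0, 1.0], [0.05, 0.99]]
truth = ["car_a", "car_a", "car_b", "car_b"]
predicted = threshold_cluster(embeddings, threshold=0.9)
score = nmi(truth, predicted)
```

An NMI of 1 means the predicted clusters match the ground-truth grouping exactly (up to relabeling), while 0 means the two assignments share no information; the 51% reported above sits between those extremes.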
Type
Publication
2024 Intelligent Methods, Systems, and Applications (IMSA)