In this work we present FaSTIP, a novel space-time interest point (STIP) detection method based on a three-dimensional facet model. The proposed approach uses the facet model to detect interest points in video data efficiently. We then describe each interest point by a three-dimensional Haar wavelet transform together with time derivatives of different orders obtained from the same facet model. Each video clip is represented following the bag-of-words approach by learning a feature-specific dictionary. Finally, classification is performed using a non-linear SVM with a chi-square kernel. We evaluate the performance of our system on the standard Weizmann, KTH, UCF Sports, ICD, UCF YouTube, and UCF50 datasets and obtain results that are better than, or at least comparable to, those of other state-of-the-art systems.
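
The bag-of-words representation and the chi-square kernel mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `bow_histogram` and `chi2_kernel`, the toy two-word codebook, the nearest-codeword assignment, and the parameter `gamma` are all assumptions made for illustration, and the kernel is written in one common form, exp(-gamma * sum((a - b)^2 / (a + b))), which may differ from the form used in the paper.

```python
import math

def bow_histogram(descriptors, codebook):
    """Return the L1-normalised histogram of nearest-codeword assignments.

    Hypothetical helper: each descriptor votes for its closest codeword
    (squared Euclidean distance). The paper's dictionary learning step
    is not reproduced here.
    """
    counts = [0.0] * len(codebook)
    for d in descriptors:
        dists = [sum((a - b) ** 2 for a, b in zip(d, c)) for c in codebook]
        counts[dists.index(min(dists))] += 1.0
    total = sum(counts)
    return [c / total for c in counts] if total > 0 else counts

def chi2_kernel(h1, h2, gamma=1.0):
    """Chi-square kernel between two histograms (assumed form, see lead-in).

    Bins that are empty in both histograms are skipped to avoid 0/0.
    """
    dist = sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)
    return math.exp(-gamma * dist)

# Toy data: a two-word codebook and the descriptors of two video clips.
codebook = [[0.0, 0.0], [1.0, 1.0]]
clip_a = [[0.1, 0.0], [0.9, 1.0], [1.1, 0.9]]
clip_b = [[0.0, 0.1], [0.2, 0.0]]

hist_a = bow_histogram(clip_a, codebook)
hist_b = bow_histogram(clip_b, codebook)
similarity = chi2_kernel(hist_a, hist_b)
```

Kernel values computed this way for every pair of clips form a Gram matrix, which is the kind of input a kernel SVM with a precomputed kernel accepts.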


  • Soumitra Samanta and Bhabatosh Chanda, Space-time Facet Model for Human Activity Classification, IEEE Trans. on Multimedia, 2014 [pdf] [bibtex]

  • Soumitra Samanta and Bhabatosh Chanda, FaSTIP: A New Method for Detection and Description of Space-Time Interest Points for Human Activity Classification, The 8th Indian Conference on Vision, Graphics and Image Processing (ICVGIP), 2012 [pdf] [bibtex]


    Space-time interest points detected by Laptev [1] and by the proposed method on five frames from five different datasets (rows 1-2: walking from Weizmann, rows 3-4: hand waving from KTH, rows 5-6: running from UCF Sports, rows 7-8: Manipuri from ICD, and rows 9-10: trampoline jumping from UCF YouTube). The odd rows (1st, 3rd, 5th, 7th, and 9th) show STIPs detected by Laptev [1], and the even rows (2nd, 4th, 6th, 8th, and 10th) show STIPs detected by the proposed FaSTIP method. The 1st column shows the interest points obtained with the default threshold (see paper); the 2nd, 3rd, and 4th columns show the top ten, top twenty, and top thirty interest points, respectively, for the two methods.


    1. Ivan Laptev. On space-time interest points. International Journal of Computer Vision, 64(2):107-123, 2005.
    2. Piotr Dollar, Vincent Rabaud, Garrison Cottrell, and Serge Belongie. Behavior recognition via sparse spatio-temporal features. In VS-PETS, October 2005.