Journal of Engineering and Applied Sciences

Year: 2019

Volume: 14

Issue: 19

Page No. 7223 - 7233

DOI: 10.36478/jeasci.2019.7223.7233

Computer Vision Methods for Looking at People Interacting with Objects: A Taxonomy and Survey

References

Al-Akam, R. and D. Paulus, 2018. Dense 3D optical flow co-occurrence matrices for human activity recognition. Proceedings of the 5th International Workshop on Sensor-based Activity Recognition and Interaction (iWOAR'18), September 20-21, 2018, ACM, Berlin, Germany, pp: 1-8.

Delaitre, V., I. Laptev and J. Sivic, 2010. Recognizing human actions in still images: A study of bag-of-features and part-based representations. Proceedings of the 21st British Conference on Machine Vision (BMVC 2010), August 31-September 3, 2010, Aberystwyth, UK., pp: 1-11.

Delaitre, V., J. Sivic and I. Laptev, 2011. Learning Person-Object Interactions for Action Recognition in Still Images. In: Advances in Neural Information Processing Systems 24, Shawe-Taylor, J., R.S. Zemel, P.L. Bartlett, F. Pereira and K.Q. Weinberger (Eds.). Curran Associates Inc., New York, USA., pp: 1503-1511.

Desai, C., D. Ramanan and C. Fowlkes, 2010. Discriminative models for static human-object interactions. Proceedings of the 2010 IEEE International Computer Society Conference on Computer Vision and Pattern Recognition-Workshops, June 13-18, 2010, IEEE, San Francisco, California, USA., pp: 9-16.

Dutta, V. and T. Zielinska, 2017. Action prediction based on physically grounded object affordances in human-object interactions. Proceedings of the 2017 11th International Workshop on Robot Motion and Control (RoMoCo), July 3-5, 2017, IEEE, Wasowo, Poland, ISBN:978-1-5386-3927-6, pp: 47-52.

Dutta, V. and T. Zielinska, 2019. Predicting human actions taking into account object affordances. J. Intell. Rob. Syst., 93: 745-761.

Fathi, A., X. Ren and J.M. Rehg, 2011. Learning to recognize objects in egocentric activities. Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR 2011), June 20-25, 2011, IEEE, Colorado, USA., pp: 3281-3288.

Filipovych, R. and E. Ribeiro, 2008. Recognizing primitive interactions by exploring actor-object states. Proceedings of the 2008 IEEE International Conference on Computer Vision and Pattern Recognition, June 23-28, 2008, IEEE, Anchorage, Alaska, USA., pp: 1-7.

Filipovych, R. and E. Ribeiro, 2011. Robust sequence alignment for actor-object interaction recognition: Discovering actor-object states. Comput. Vision Image Understanding, 115: 177-193.

Gall, J., A. Fossati and L. Van Gool, 2011. Functional categorization of objects using real-time markerless motion capture. Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR 2011), June 20-25, 2011, IEEE, Colorado, USA., pp: 1969-1976.

Gupta, A. and L.S. Davis, 2007. Objects in action: An approach for combining action understanding and object perception. Proceedings of the 2007 IEEE International Conference on Computer Vision and Pattern Recognition, June 17-22, 2007, Minneapolis, Minnesota, USA., pp: 1-8.

Gupta, A., A. Kembhavi and L.S. Davis, 2009. Observing human-object interactions: Using spatial and functional compatibility for recognition. IEEE. Trans. Pattern Anal. Mach. Intell., 31: 1775-1789.

Hamer, H., J. Gall, T. Weise and L. Van Gool, 2010. An object-dependent hand pose prior from sparse training data. Proceedings of the 2010 IEEE International Computer Society Conference on Computer Vision and Pattern Recognition, June 13-18, 2010, IEEE, San Francisco, California, USA., pp: 671-678.

Hamer, H., K. Schindler, E. Koller-Meier and L. Van Gool, 2009. Tracking a hand manipulating an object. Proceedings of the 2009 IEEE 12th International Conference on Computer Vision, September 29-October 2, 2009, IEEE, Kyoto, Japan, pp: 1475-1482.

Han, J.H., S.J. Lee and J.H. Kim, 2016. Behavior hierarchy-based affordance map for recognition of human intention and its application to human-robot interaction. IEEE. Trans. Hum. Mach. Syst., 46: 708-722.

Ikizler-Cinbis, N. and S. Sclaroff, 2010. Object, scene and actions: Combining multiple features for human action recognition. Proceedings of the 11th European Conference on Computer Vision (ECCV 2010), September 5-11, 2010, Greece, pp: 494-507.

Kjellstrom, H., D. Kragic and M.J. Black, 2010. Tracking people interacting with objects. Proceedings of the 2010 IEEE International Computer Society Conference on Computer Vision and Pattern Recognition, June 13-18, 2010, IEEE, San Francisco, California, USA., pp: 747-754.

Kjellstrom, H., J. Romero and D. Kragic, 2011. Visual object-action recognition: Inferring object affordances from human demonstration. Comput. Vision Image Understanding, 115: 81-90.

Koppula, H.S., R. Gupta and A. Saxena, 2012. Human activity learning using object affordances from RGB-D videos. Comput. Vision Pattern Recogn., 1: 1-10.

Koppula, H.S., R. Gupta and A. Saxena, 2013. Learning human activities and object affordances from RGB-D videos. Intl. J. Rob. Res., 32: 951-970.

Laptev, I. and P. Perez, 2007. Retrieving actions in movies. Proceedings of the 2007 IEEE 11th International Conference on Computer Vision, October 14-21, 2007, IEEE, Rio de Janeiro, Brazil, pp: 1-8.

Liu, H., M. Liu and Q. Sun, 2014. Learning directional co-occurrence for human action classification. Proceedings of the 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 4-9, 2014, IEEE, Florence, Italy, pp: 1235-1239.

Marszalek, M., I. Laptev and C. Schmid, 2009. Actions in context. Proceedings of the CVPR 2009-IEEE International Conference on Computer Vision & Pattern Recognition, June 20-25, 2009, IEEE, Miami, Florida, USA., pp: 2929-2936.

Peursum, P., G. West and S. Venkatesh, 2005. Combining image regions and human activity for indirect object recognition in indoor wide-angle views. Proceedings of the 10th IEEE International Conference on Computer Vision (ICCV'05) Vol. 1, October 17-21, 2005, IEEE, Beijing, China, pp: 82-89.

Prest, A., C. Schmid and V. Ferrari, 2011. Weakly supervised learning of interactions between humans and objects. IEEE. Trans. Pattern Anal. Mach. Intell., 34: 601-614.

Prest, A., V. Ferrari and C. Schmid, 2012. Explicit modeling of human-object interactions in realistic videos. IEEE. Trans. Pattern Anal. Mach. Intell., 35: 835-848.

Qiu, Z., T. Yao and T. Mei, 2017. Learning spatio-temporal representation with pseudo-3D residual networks. Proceedings of the IEEE International Conference on Computer Vision (ICCV), October 22-29, 2017, IEEE, Venice, Italy, pp: 5533-5541.

Qu, S. and T. Li, 2017. Human action recognition based on improved CoHOG-LQC. Proceedings of the 2017 29th Chinese Conference on Control and Decision (CCDC), May 28-30, 2017, IEEE, Chongqing, China, pp: 1928-1933.

Rabinovich, A., A. Vedaldi, C. Galleguillos, E. Wiewiora and S. Belongie, 2007. Objects in context. Proceedings of the International Conference on Computer Vision (ICCV) Vol. 1, October 14-21, 2007, IEEE, Rio de Janeiro, Brazil, pp: 1-8.

Ren, X. and M. Philipose, 2009. Egocentric recognition of handled objects: Benchmark and analysis. Proceedings of the 2009 IEEE International Computer Society Conference on Computer Vision and Pattern Recognition Workshops, June 20-25, 2009, IEEE, Miami, Florida, USA., pp: 1-8.

Romero, J., H. Kjellstrom and D. Kragic, 2010. Hands in action: Real-time 3D reconstruction of hands in interaction with objects. Proceedings of the 2010 IEEE International Conference on Robotics and Automation, May 3-7, 2010, IEEE, Anchorage, Alaska, USA., pp: 458-463.

Sabri, A.Q.M., J. Boonaert, E.R.M.F. Abdullah and A.M. Mansoor, 2016. Spatio-temporal co-occurrence characterizations for human action classification. Malaysian J. Comput. Sci., 30: 154-173.

Santhanam, T., C.P. Sumathi and S. Gomathi, 2012. A survey of techniques for human detection in static images. Proceedings of the 2nd International Conference on Computational Science, Engineering and Information Technology (CCSEIT '12), October 26-28, 2012, ACM, India, pp: 328-336.

Shu, T., Y. Peng, L. Fan, H. Lu and S.C. Zhu, 2018. Perception of human interaction based on motion trajectories: From aerial videos to decontextualized animations. Top. Cognit. Sci., 10: 225-241.

Singh, V.K., F.M. Khan and R. Nevatia, 2010. Multiple pose context trees for estimating human pose in object context. Proceedings of the 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition-Workshops, June 13-18, 2010, IEEE, San Francisco, California, USA., pp: 17-24.

Singha, J., A. Roy and R.H. Laskar, 2018. Dynamic hand gesture recognition using vision-based approach for human-computer interaction. Neural Comput. Appl., 29: 1129-1141.

Slimani, K.N.H., Y. Benezeth and F. Souami, 2014. Human interaction recognition based on the co-occurence of visual words. Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition Workshops, June 23-28, 2014, IEEE, Columbus, Ohio, USA., pp: 461-466.

Soomro, K., A.R. Zamir and M. Shah, 2012. UCF101: A dataset of 101 human actions classes from videos in the wild. Comput. Vision Pattern Recogn., 1: 1-7.

Stark, M., P. Lies, M. Zillich, J. Wyatt and B. Schiele, 2008. Functional object class detection based on learned affordance cues. Proceedings of the International Conference on Computer Vision Systems (ICVS2008), May 12-15, 2008, Springer, Berlin, Germany, pp: 435-444.

Ueng, S.K. and G.Z. Chen, 2016. Vision based multi-user human computer interaction. Multimedia Tools Appl., 75: 10059-10076.

Wu, C., J. Zhang, O. Sener, B. Selman and S. Savarese et al., 2017. Watch-n-patch: Unsupervised learning of actions and relations. IEEE. Trans. Pattern Anal. Mach. Intell., 40: 467-481.

Wu, P., J.W. Hsieh, J.C. Cheng, S.C. Cheng and S.Y. Tseng, 2010. Human smoking event detection using visual interaction clues. Proceedings of the 2010 20th International Conference on Pattern Recognition, August 23-26, 2010, IEEE, Istanbul, Turkey, pp: 4344-4347.

Xian-Jie, Q., W. Zhao-Qi, X. Shi-Hong and L. Jin-Tao, 2005. Estimating articulated human pose from video using shape context. Proceedings of the 5th IEEE International Symposium on Signal Processing and Information Technology, December 21, 2005, IEEE, Athens, Greece, pp: 583-588.

Yao, B. and L. Fei-Fei, 2010. Grouplet: A structured image representation for recognizing human and object interactions. Proceedings of the 2010 IEEE International Computer Society Conference on Computer Vision and Pattern Recognition, June 13-18, 2010, IEEE, San Francisco, California, USA., pp: 9-16.

Yao, B. and L. Fei-Fei, 2010. Modeling mutual context of object and human pose in human-object interaction activities. Proceedings of the 2010 IEEE International Computer Society Conference on Computer Vision and Pattern Recognition, June 13-18, 2010, IEEE, San Francisco, California, USA., pp: 17-24.

Yao, B. and L. Fei-Fei, 2012. Recognizing human-object interactions in still images by modeling the mutual context of objects and human poses. IEEE. Trans. Pattern Anal. Mach. Intell., 34: 1691-1703.

Yao, B., A. Khosla and L. Fei-Fei, 2011. Classifying actions and measuring action similarity by modeling the mutual context of objects and human poses. Proceedings of the 28th International Conference on Machine Learning (ICML), June 28-July 2, 2011, Bellevue, Washington, USA., pp: 1-8.

Zhao, C., J. Wang and H. Lu, 2017. Learning discriminative context models for concurrent collective activity recognition. Multimedia Tools Appl., 76: 7401-7420.

Zhu, W., J. Hu, G. Sun, X. Cao and Y. Qiao, 2016. A key volume mining deep framework for action recognition. Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), June 27-30, 2016, IEEE, Las Vegas, Nevada, USA., pp: 1991-1999.
