DisCaaS: Micro Behavior Analysis on Discussion by Camera as a Sensor.
Watanabe, Ko; Soneda, Yusuke; Matsuda, Yuki; Nakamura, Yugo; Arakawa, Yutaka; Dengel, Andreas; Ishimaru, Shoya.
  • Watanabe K; Department of Computer Science, University of Kaiserslautern & DFKI GmbH, 67663 Kaiserslautern, Germany.
  • Soneda Y; Graduate School of Science and Technology, Nara Institute of Science and Technology, Nara 630-0192, Japan.
  • Matsuda Y; Graduate School of Science and Technology, Nara Institute of Science and Technology, Nara 630-0192, Japan.
  • Nakamura Y; Department of Information Science and Technology, Graduate School and Faculty of Information Science and Electrical Engineering, Kyushu University, Fukuoka 819-0395, Japan.
  • Arakawa Y; Department of Information Science and Technology, Graduate School and Faculty of Information Science and Electrical Engineering, Kyushu University, Fukuoka 819-0395, Japan.
  • Dengel A; Department of Computer Science, University of Kaiserslautern & DFKI GmbH, 67663 Kaiserslautern, Germany.
  • Ishimaru S; Department of Computer Science, University of Kaiserslautern & DFKI GmbH, 67663 Kaiserslautern, Germany.
Sensors (Basel); 21(17); 2021 Aug 25.
Article in English | MEDLINE | ID: covidwho-1379985
ABSTRACT
The emergence of various types of commercial cameras (compact, high resolution, high angle of view, high speed, high dynamic range, etc.) has contributed significantly to the understanding of human activities. Taking advantage of a high angle of view, this paper demonstrates a system that recognizes micro-behaviors in a small group discussion with a single 360-degree camera, towards quantified meeting analysis. We propose a method that recognizes speaking and nodding, which have often been overlooked in existing research, from a video stream of face images using a random forest classifier. The proposed approach was evaluated on three datasets. To create the first and second datasets, we asked participants to meet in person, collecting 16 sets of five-minute meeting data from 21 unique participants and seven sets of 10-minute meeting data from 12 unique participants. The experimental results showed that our approach detects speaking and nodding with a macro-average F1-score of 67.9% in a 10-fold random-split cross-validation and 62.5% in a leave-one-participant-out cross-validation. Considering the increased demand for online meetings due to the COVID-19 pandemic, we also recorded faces on a screen captured by web cameras as the third dataset and discuss the potential and challenges of applying our ideas to virtual video conferences.
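The abstract does not publish the implementation, but the evaluation protocol it names (a random forest classifier scored by macro-average F1 under both 10-fold random-split and leave-one-participant-out cross-validation) can be illustrated with a minimal sketch. The sketch below assumes scikit-learn and uses randomly generated placeholder features, labels, and participant IDs in place of the paper's face-image features; none of these names or values come from the article.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, LeaveOneGroupOut, cross_val_score

# Placeholder data: one feature vector per video window, a micro-behavior label
# (0 = none, 1 = speaking, 2 = nodding), and the participant it came from.
# In the paper the features would be derived from 360-degree camera face crops;
# here they are random numbers purely to make the evaluation loop runnable.
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 32))          # 600 windows x 32 placeholder features
y = rng.integers(0, 3, size=600)        # micro-behavior labels
groups = rng.integers(0, 21, size=600)  # participant IDs (21 participants)

clf = RandomForestClassifier(n_estimators=100, random_state=0)

# 10-fold random-split cross-validation, scored with macro-averaged F1.
kfold_scores = cross_val_score(
    clf, X, y,
    cv=StratifiedKFold(n_splits=10, shuffle=True, random_state=0),
    scoring="f1_macro")
print(f"10-fold macro F1: {kfold_scores.mean():.3f}")

# Leave-one-participant-out cross-validation: each fold holds out every window
# from one participant, testing generalization to people unseen during training.
lopo_scores = cross_val_score(
    clf, X, y, groups=groups, cv=LeaveOneGroupOut(), scoring="f1_macro")
print(f"Leave-one-participant-out macro F1: {lopo_scores.mean():.3f}")
```

The leave-one-participant-out split is the stricter of the two, since all data from the held-out person is excluded from training, which is why the reported score (62.5%) is lower than the 10-fold random-split score (67.9%).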

Full text: Available
Collection: International databases
Database: MEDLINE
Main subject: Photography / Human Activities
Type of study: Experimental Studies / Prognostic study / Randomized controlled trials
Limits: Humans
Language: English
Year: 2021
Document Type: Article
Affiliation country: S21175719
