Search | VHL Regional Portal (test)

Ego4D: Around the World in 3,000 Hours of Egocentric Video.

Grauman, Kristen; Westbury, Andrew; Byrne, Eugene; Cartillier, Vincent; Chavis, Zachary; Furnari, Antonino; Girdhar, Rohit; Hamburger, Jackson; Jiang, Hao; Kukreja, Devansh; Liu, Miao; Liu, Xingyu; Martin, Miguel; Nagarajan, Tushar; Radosavovic, Ilija; Ramakrishnan, Santhosh Kumar; Ryan, Fiona; Sharma, Jayant; Wray, Michael; Xu, Mengmeng; Xu, Eric Zhongcong; Zhao, Chen; Bansal, Siddhant; Batra, Dhruv; Crane, Sean; Do, Tien; Doulaty, Morrie; Erapalli, Akshay; Feichtenhofer, Christoph; Fragomeni, Adriano; Fu, Qichen; Gebreselasie, Abrham; Gonzalez, Cristina; Hillis, James; Huang, Xuhua; Huang, Yifei; Jia, Wenqi; Khoo, Weslie; Kolar, Jachym; Kottur, Satwik; Kumar, Anurag; Landini, Federico; Li, Chao; Li, Yanghao; Li, Zhenqiang; Mangalam, Karttikeya; Modhugu, Raghava; Munro, Jonathan; Murrell, Tullie; Nishiyasu, Takumi.

IEEE Trans Pattern Anal Mach Intell ; PP, 2024 Jul 26.

Article in English | MEDLINE | ID: mdl-39058617

ABSTRACT

We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite. It offers 3,670 hours of daily-life activity video spanning hundreds of scenarios (household, outdoor, workplace, leisure, etc.) captured by 931 unique camera wearers from 74 worldwide locations and 9 different countries. The approach to collection is designed to uphold rigorous privacy and ethics standards, with consenting participants and robust de-identification procedures where relevant. Ego4D dramatically expands the volume of diverse egocentric video footage publicly available to the research community. Portions of the video are accompanied by audio, 3D meshes of the environment, eye gaze, stereo, and/or synchronized videos from multiple egocentric cameras at the same event. Furthermore, we present a host of new benchmark challenges centered around understanding the first-person visual experience in the past (querying an episodic memory), present (analyzing hand-object manipulation, audio-visual conversation, and social interactions), and future (forecasting activities). By publicly sharing this massive annotated dataset and benchmark suite, we aim to push the frontier of first-person perception. Project page: https://ego4d-data.org/.

Real-world humanoid locomotion with reinforcement learning.

Radosavovic, Ilija; Xiao, Tete; Zhang, Bike; Darrell, Trevor; Malik, Jitendra; Sreenath, Koushil.

Sci Robot ; 9(89): eadi9579, 2024 Apr 17.

Article in English | MEDLINE | ID: mdl-38630806

ABSTRACT

Humanoid robots that can autonomously operate in diverse environments have the potential to help address labor shortages in factories, assist elderly at home, and colonize new planets. Although classical controllers for humanoid robots have shown impressive results in a number of settings, they are challenging to generalize and adapt to new environments. Here, we present a fully learning-based approach for real-world humanoid locomotion. Our controller is a causal transformer that takes the history of proprioceptive observations and actions as input and predicts the next action. We hypothesized that the observation-action history contains useful information about the world that a powerful transformer model can use to adapt its behavior in context, without updating its weights. We trained our model with large-scale model-free reinforcement learning on an ensemble of randomized environments in simulation and deployed it to the real world zero-shot. Our controller could walk over various outdoor terrains, was robust to external disturbances, and could adapt in context.

Subjects

Robotics; Humans; Aged; Robotics/methods; Locomotion; Walking; Learning; Reinforcement, Psychology
