Search | VHL Regional Portal

1.

A domain-agnostic approach for characterization of lifelong learning systems.

Baker, Megan M; New, Alexander; Aguilar-Simon, Mario; Al-Halah, Ziad; Arnold, Sébastien M R; Ben-Iwhiwhu, Ese; Brna, Andrew P; Brooks, Ethan; Brown, Ryan C; Daniels, Zachary; Daram, Anurag; Delattre, Fabien; Dellana, Ryan; Eaton, Eric; Fu, Haotian; Grauman, Kristen; Hostetler, Jesse; Iqbal, Shariq; Kent, Cassandra; Ketz, Nicholas; Kolouri, Soheil; Konidaris, George; Kudithipudi, Dhireesha; Learned-Miller, Erik; Lee, Seungwon; Littman, Michael L; Madireddy, Sandeep; Mendez, Jorge A; Nguyen, Eric Q; Piatko, Christine; Pilly, Praveen K; Raghavan, Aswin; Rahman, Abrar; Ramakrishnan, Santhosh Kumar; Ratzlaff, Neale; Soltoggio, Andrea; Stone, Peter; Sur, Indranil; Tang, Zhipeng; Tiwari, Saket; Vedder, Kyle; Wang, Felix; Xu, Zifan; Yanguas-Gil, Angel; Yedidsion, Harel; Yu, Shangqun; Vallabha, Gautam K.

Neural Netw ; 160: 274-296, 2023 Mar.

Article in English | MEDLINE | ID: mdl-36709531

ABSTRACT

Despite the advancement of machine learning techniques in recent years, state-of-the-art systems lack robustness to "real world" events, where the input distributions and tasks encountered by the deployed systems will not be limited to the original training context, and systems will instead need to adapt to novel distributions and tasks while deployed. This critical gap may be addressed through the development of "Lifelong Learning" systems that are capable of (1) Continuous Learning, (2) Transfer and Adaptation, and (3) Scalability. Unfortunately, efforts to improve these capabilities are typically treated as distinct areas of research that are assessed independently, without regard to the impact of each separate capability on other aspects of the system. We instead propose a holistic approach, using a suite of metrics and an evaluation framework to assess Lifelong Learning in a principled way that is agnostic to specific domains or system techniques. Through five case studies, we show that this suite of metrics can inform the development of varied and complex Lifelong Learning systems. We highlight how the proposed suite of metrics quantifies performance trade-offs present during Lifelong Learning system development: both the widely discussed Stability-Plasticity dilemma and the newly proposed relationship between Sample Efficient and Robust Learning. Further, we make recommendations for the formulation and use of metrics to guide the continuing development of Lifelong Learning systems and assess their progress in the future.

Subject(s)

Education, Continuing , Machine Learning

2.

People construct simplified mental representations to plan.

Ho, Mark K; Abel, David; Correa, Carlos G; Littman, Michael L; Cohen, Jonathan D; Griffiths, Thomas L.

Nature ; 606(7912): 129-136, 2022 Jun.

Article in English | MEDLINE | ID: mdl-35589843

ABSTRACT

One of the most striking features of human cognition is the ability to plan. Two aspects of human planning stand out: its efficiency and flexibility. Efficiency is especially impressive because plans must often be made in complex environments, and yet people successfully plan solutions to many everyday problems despite having limited cognitive resources [1-3]. Standard accounts in psychology, economics and artificial intelligence have suggested that human planning succeeds because people have a complete representation of a task and then use heuristics to plan future actions in that representation [4-11]. However, this approach generally assumes that task representations are fixed. Here we propose that task representations can be controlled and that such control provides opportunities to quickly simplify problems and more easily reason about them. We propose a computational account of this simplification process and, in a series of preregistered behavioural experiments, show that it is subject to online cognitive control [12-14] and that people optimally balance the complexity of a task representation and its utility for planning and acting. These results demonstrate how strategically perceiving and conceiving problems facilitates the effective use of limited cognitive resources.

Subject(s)

Cognition , Executive Function , Efficiency , Heuristics , Humans , Models, Psychological

3.

Communication in action: Planning and interpreting communicative demonstrations.

Ho, Mark K; Cushman, Fiery; Littman, Michael L; Austerweil, Joseph L.

J Exp Psychol Gen ; 150(11): 2246-2272, 2021 Nov.

Article in English | MEDLINE | ID: mdl-34498911

ABSTRACT

Theory of mind enables an observer to interpret others' behavior in terms of unobservable beliefs, desires, intentions, feelings, and expectations about the world. This also empowers the person whose behavior is being observed: By intelligently modifying her actions, she can influence the mental representations that an observer ascribes to her, and by extension, what the observer comes to believe about the world. That is, she can engage in intentionally communicative demonstrations. Here, we develop a computational account of generating and interpreting communicative demonstrations by explicitly distinguishing between two interacting types of planning. Typically, instrumental planning aims to control states of the environment, whereas belief-directed planning aims to influence an observer's mental representations. Our framework extends existing formal models of pragmatics and pedagogy to the setting of value-guided decision-making, captures how people modify their intentional behavior to show what they know about the reward or causal structure of an environment, and helps explain data on infant and child imitation in terms of literal versus pragmatic interpretation of adult demonstrators' actions. Additionally, our analysis of belief-directed intentionality and mentalizing sheds light on the sociocognitive mechanisms that underlie distinctly human forms of communication, culture, and sociality. (PsycInfo Database Record (c) 2022 APA, all rights reserved).

Subject(s)

Communication , Intention , Adult , Child , Emotions , Female , Humans , Infant , Social Behavior

4.

Reward-predictive representations generalize across tasks in reinforcement learning.

Lehnert, Lucas; Littman, Michael L; Frank, Michael J.

PLoS Comput Biol ; 16(10): e1008317, 2020 Oct.

Article in English | MEDLINE | ID: mdl-33057329

ABSTRACT

In computer science, reinforcement learning is a powerful framework with which artificial agents can learn to maximize their performance for any given Markov decision process (MDP). Advances over the last decade, in combination with deep neural networks, have enjoyed performance advantages over humans in many difficult task settings. However, such frameworks perform far less favorably when evaluated in their ability to generalize or transfer representations across different tasks. Existing algorithms that facilitate transfer typically are limited to cases in which the transition function or the optimal policy is portable to new contexts, but achieving "deep transfer" characteristic of human behavior has been elusive. Such transfer typically requires discovery of abstractions that permit analogical reuse of previously learned representations to superficially distinct tasks. Here, we demonstrate that abstractions that minimize error in predictions of reward outcomes generalize across tasks with different transition and reward functions. Such reward-predictive representations compress the state space of a task into a lower dimensional representation by combining states that are equivalent in terms of both the transition and reward functions. Because only state equivalences are considered, the resulting state representation is not tied to the transition and reward functions themselves and thus generalizes across tasks with different reward and transition functions. These results contrast with those using abstractions that myopically maximize reward in any given MDP and motivate further experiments in humans and animals to investigate if neural and cognitive systems involved in state representation perform abstractions that facilitate such equivalence relations.

Subject(s)

Models, Psychological , Neural Networks, Computer , Reinforcement, Psychology , Algorithms , Animals , Brain/physiology , Humans , Markov Chains , Reward , Task Performance and Analysis

5.

People teach with rewards and punishments as communication, not reinforcements.

Ho, Mark K; Cushman, Fiery; Littman, Michael L; Austerweil, Joseph L.

J Exp Psychol Gen ; 148(3): 520-549, 2019 Mar.

Article in English | MEDLINE | ID: mdl-30802127

ABSTRACT

Carrots and sticks motivate behavior, and people can teach new behaviors to other organisms, such as children or nonhuman animals, by tapping into their reward learning mechanisms. But how people teach with reward and punishment depends on their expectations about the learner. We examine how people teach using reward and punishment by contrasting two hypotheses. The first is evaluative feedback as reinforcement, where rewards and punishments are used to shape learner behavior through reinforcement learning mechanisms. The second is evaluative feedback as communication, where rewards and punishments are used to signal target behavior to a learning agent reasoning about a teacher's pedagogical goals. We present formalizations of learning from these 2 teaching strategies based on computational frameworks for reinforcement learning. Our analysis based on these models motivates a simple interactive teaching paradigm that distinguishes between the two teaching hypotheses. Across 3 sets of experiments, we find that people are strongly biased to use evaluative feedback communicatively rather than as reinforcement. (PsycINFO Database Record (c) 2019 APA, all rights reserved).

Subject(s)

Communication , Motivation , Punishment , Reward , Adult , Female , Humans , Male , Reinforcement, Psychology

6.

Evolution of flexibility and rigidity in retaliatory punishment.

Morris, Adam; MacGlashan, James; Littman, Michael L; Cushman, Fiery.

Proc Natl Acad Sci U S A ; 114(39): 10396-10401, 2017 Sep 26.

Article in English | MEDLINE | ID: mdl-28893996

ABSTRACT

Natural selection designs some social behaviors to depend on flexible learning processes, whereas others are relatively rigid or reflexive. What determines the balance between these two approaches? We offer a detailed case study in the context of a two-player game with antisocial behavior and retaliatory punishment. We show that each player in this game (a "thief" and a "victim") must balance two competing strategic interests. Flexibility is valuable because it allows adaptive differentiation in the face of diverse opponents. However, it is also risky because, in competitive games, it can produce systematically suboptimal behaviors. Using a combination of evolutionary analysis, reinforcement learning simulations, and behavioral experimentation, we show that the resolution to this tension, and the adaptation of social behavior in this game, hinges on the game's learning dynamics. Our findings clarify punishment's adaptive basis, offer a case study of the evolution of social preferences, and highlight an important connection between natural selection and learning in the resolution of social conflicts.

Subject(s)

Punishment/psychology , Social Behavior , Social Control, Formal , Aggression/psychology , Cooperative Behavior , Humans , Learning/physiology , Reward

7.

Initial Progress Toward Development of a Voice-Based Computer-Delivered Motivational Intervention for Heavy Drinking College Students: An Experimental Study.

Kahler, Christopher W; Lechner, William J; MacGlashan, James; Wray, Tyler B; Littman, Michael L.

JMIR Ment Health ; 4(2): e25, 2017 Jun 28.

Article in English | MEDLINE | ID: mdl-28659259

ABSTRACT

BACKGROUND: Computer-delivered interventions have been shown to be effective in reducing alcohol consumption in heavy drinking college students. However, these computer-delivered interventions rely on mouse, keyboard, or touchscreen responses for interactions between the users and the computer-delivered intervention. The principles of motivational interviewing suggest that in-person interventions may be effective, in part, because they encourage individuals to think through and speak aloud their motivations for changing a health behavior, which current computer-delivered interventions do not allow.
OBJECTIVE: The objective of this study was to take the initial steps toward development of a voice-based computer-delivered intervention that can ask open-ended questions and respond appropriately to users' verbal responses, more closely mirroring a human-delivered motivational intervention.
METHODS: We developed (1) a voice-based computer-delivered intervention that was run by a human controller and that allowed participants to speak their responses to scripted prompts delivered by speech generation software and (2) a text-based computer-delivered intervention that relied on the mouse, keyboard, and computer screen for all interactions. We randomized 60 heavy drinking college students to interact with the voice-based computer-delivered intervention and 30 to interact with the text-based computer-delivered intervention and compared their ratings of the systems as well as their motivation to change drinking and their drinking behavior at 1-month follow-up.
RESULTS: Participants reported that the voice-based computer-delivered intervention engaged positively with them in the session and delivered content in a manner consistent with motivational interviewing principles. At 1-month follow-up, participants in the voice-based computer-delivered intervention condition reported significant decreases in quantity, frequency, and problems associated with drinking, and increased perceived importance of changing drinking behaviors. In comparison to the text-based computer-delivered intervention condition, those assigned to voice-based computer-delivered intervention reported significantly fewer alcohol-related problems at the 1-month follow-up (incident rate ratio 0.60, 95% CI 0.44-0.83, P=.002). The conditions did not differ significantly on perceived importance of changing drinking or on measures of drinking quantity and frequency of heavy drinking.
CONCLUSIONS: Results indicate that it is feasible to construct a series of open-ended questions and a bank of responses and follow-up prompts that can be used in a future fully automated voice-based computer-delivered intervention that may mirror more closely human-delivered motivational interventions to reduce drinking. Such efforts will require using advanced speech recognition capabilities and machine-learning approaches to train a program to mirror the decisions made by human controllers in the voice-based computer-delivered intervention used in this study. In addition, future studies should examine enhancements that can increase the perceived warmth and empathy of voice-based computer-delivered intervention, possibly through greater personalization, improvements in the speech generation software, and embodying the computer-delivered intervention in a physical form.

8.

Social is special: A normative framework for teaching with and learning from evaluative feedback.

Ho, Mark K; MacGlashan, James; Littman, Michael L; Cushman, Fiery.

Cognition ; 167: 91-106, 2017 Oct.

Article in English | MEDLINE | ID: mdl-28341268

ABSTRACT

Humans often attempt to influence one another's behavior using rewards and punishments. How does this work? Psychologists have often assumed that "evaluative feedback" influences behavior via standard learning mechanisms that learn from environmental contingencies. On this view, teaching with evaluative feedback involves leveraging learning systems designed to maximize an organism's positive outcomes. Yet, despite its parsimony, programs of research predicated on this assumption, such as ones in developmental psychology, animal behavior, and human-robot interaction, have had limited success. We offer an explanation by analyzing the logic of evaluative feedback and show that specialized learning mechanisms are uniquely favored in the case of evaluative feedback from a social partner. Specifically, evaluative feedback works best when it is treated as communicating information about the value of an action rather than as a form of reward to be maximized. This account suggests that human learning from evaluative feedback depends on inferences about communicative intent, goals and other mental states, much like learning from other sources, such as demonstration, observation and instruction. Because these abilities are especially developed in humans, the present account also explains why evaluative feedback is far more widespread in humans than non-human animals.

Subject(s)

Feedback, Psychological , Punishment , Reward , Social Behavior , Communication , Humans , Models, Psychological , Reinforcement, Psychology

9.

Reinforcement learning improves behaviour from evaluative feedback.

Littman, Michael L.

Nature ; 521(7553): 445-51, 2015 May 28.

Article in English | MEDLINE | ID: mdl-26017443

ABSTRACT

Reinforcement learning is a branch of machine learning concerned with using experience gained through interacting with the world and evaluative feedback to improve a system's ability to make behavioural decisions. It has been called the artificial intelligence problem in a microcosm because learning algorithms must act autonomously to perform well and achieve their goals. Partly driven by the increasing availability of rich data, recent years have seen exciting advances in the theory and practice of reinforcement learning, including developments in fundamental technical areas such as generalization, planning, exploration and empirical methodology, leading to increasing applicability to real-life problems.

Subject(s)

Artificial Intelligence , Feedback , Algorithms , Empirical Research , Markov Chains , Monte Carlo Method , Reward , Time Factors

10.

Optimal one-dimensional apodizations and shaped pupils for planet finding coronagraphy.

Kasdin, N Jeremy; Vanderbei, Robert J; Littman, Michael G; Spergel, David N.

Appl Opt ; 44(7): 1117-28, 2005 Mar 01.

Article in English | MEDLINE | ID: mdl-15765689

ABSTRACT

The realization that direct imaging of extrasolar planets could be technologically feasible within the next decade or so has inspired a great deal of recent research into high-contrast imaging. We have contributed several design ideas, all of which can be described as shaped pupil coronagraphs. We offer a complete and unified survey of one-dimensional shaped pupil designs, some of which have been published in our previous papers. We also introduce a promising new design, which we call bar-code masks. With these masks we can achieve the required contrast with a fairly large discovery zone and throughput, but most importantly they are perhaps the easiest to manufacture and might therefore stand up best to refined analyses.

11.

Role of the Legionella pneumophila rtxA gene in amoebae.

Cirillo, Suat L G; Yan, Ling; Littman, Michael; Samrakandi, Mustapha M; Cirillo, Jeffrey D.

Microbiology (Reading) ; 148(Pt 6): 1667-1677, 2002 Jun.

Article in English | MEDLINE | ID: mdl-12055287

ABSTRACT

Legionella pneumophila infects humans, causing Legionnaires' disease, from aerosols generated by domestic and environmental water sources. In aquatic environments L. pneumophila is thought to replicate primarily in protozoa. A 'repeats in structural toxin' (RTX) gene, rtxA, from L. pneumophila was identified recently that plays a role in entry and replication in human macrophages and also has the ability to infect mice. However, the role of this gene in the interaction of L. pneumophila with environmental protozoa and its distribution in different Legionella species has not been examined. Southern analyses demonstrated that rtxA is present in all L. pneumophila isolates tested and correlates with species that have been shown to cause disease in humans. To evaluate the importance of rtxA in the interaction with protozoa a series of studies was carried out in an environmental host for L. pneumophila, Acanthamoeba castellanii. The L. pneumophila rtxA gene plays a role in both adherence and entry into A. castellanii similar to that observed in human monocytic cells. Furthermore, it was found that rtxA is involved in intracellular survival and trafficking. In addition to demonstrating involvement of rtxA in the interaction of L. pneumophila with host cells, these data support a role for this gene both during disease in humans and in environmental reservoirs.

Subject(s)

Acanthamoeba/microbiology , Bacterial Toxins/metabolism , Genes, Bacterial/genetics , Legionella pneumophila/genetics , Legionella pneumophila/physiology , Legionnaires' Disease/microbiology , Legionnaires' Disease/parasitology , Acanthamoeba/cytology , Acanthamoeba/ultrastructure , Animals , Bacterial Adhesion , Bacterial Toxins/genetics , Blotting, Southern , Disease Reservoirs , Environment , Legionella pneumophila/pathogenicity , Legionnaires' Disease/transmission , Lysosomes/microbiology , Lysosomes/ultrastructure , Movement , Vacuoles/microbiology , Vacuoles/ultrastructure
