    RSS'24 LRL Incremental Learning From Interaction and Large Language Models

    Introduction

    The long-term research goal of our labs is to engineer humanoid robots from scratch, with an emphasis on grasping, manipulation, and learning from human observation and experience. My own research centers on natural interaction and communication, which overlaps substantially with work in natural language processing (NLP).

    The Arma Humanoid Robot Family

    At HT, we have a range of Arma humanoid robots, the earliest of which were developed over 20 years ago. The robots have evolved in both appearance and function through iterations such as Arma3, Arma6, ArmaDE, and the latest, Arma7. Arma7 has 32 degrees of freedom and over 100 sensors.

    Functional Cognitive Architecture

    Our research covers software as well as hardware, including the functional cognitive architecture used in Arma7. This memory-centric architecture has three layers: hardware abstraction, high-level planning and reasoning, and a central memory that mediates between the other two.

    Memory System Design

    The memory system in our architecture is designed to be distributed to avoid bottlenecks, model various types of memory, represent different data types, and be easily extendable. It includes sensory memory for input from sensors, working memory for information derived from sensory input, and long-term memory for encoding and consolidating content.
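
    The three memory types described above can be sketched as follows. This is only an illustration under stated assumptions: the class and method names (SensoryMemory, WorkingMemory, LongTermMemory, push, update, consolidate) are hypothetical and do not come from the Arma software, and the sketch runs in a single process, so it does not model the distributed aspect.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SensoryMemory:
    """Short buffer of raw sensor readings (hypothetical design)."""
    capacity: int = 3
    readings: deque = field(default_factory=deque)

    def push(self, sensor: str, value: Any) -> None:
        self.readings.append((sensor, value))
        while len(self.readings) > self.capacity:
            self.readings.popleft()  # the oldest reading is dropped first


@dataclass
class WorkingMemory:
    """Information derived from sensory input, keyed by entity name."""
    entities: dict = field(default_factory=dict)

    def update(self, name: str, **attributes: Any) -> None:
        self.entities.setdefault(name, {}).update(attributes)


@dataclass
class LongTermMemory:
    """Consolidated content that persists across interactions."""
    records: list = field(default_factory=list)

    def consolidate(self, working: WorkingMemory) -> None:
        # Store a snapshot so later working-memory updates do not alter it.
        self.records.append({k: dict(v) for k, v in working.entities.items()})


sensory = SensoryMemory()
working = WorkingMemory()
long_term = LongTermMemory()

# A made-up camera detection flows from sensory to working to long-term memory.
sensory.push("camera", {"label": "cup", "position": (0.4, 0.1, 0.8)})
for sensor, value in sensory.readings:
    working.update(value["label"], position=value["position"], source=sensor)
long_term.consolidate(working)
```

    Keeping each memory type in its own object is one way to make the system easy to extend: a new memory type can be added without touching the existing ones.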

    Intuitive Human-Robot Interaction

    Natural language interaction is complicated by ambiguity and context dependence. Our dialog system employs an interaction manager that uses language models to interpret what the user says and to control the robot. Memory serves as the mediator: the language model can query and update the information stored there.

    LLM as an Agent

    We have deployed a large language model (LLM) as an agent that controls the robot's behavior through a prompting-based approach. The prompt contains an API specification, examples, and the current input. The LLM generates Python commands, which are executed in a Python console environment to trigger physical actions or invoke perception components.
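
    The prompting loop described above might look roughly like the sketch below. Everything named here is an assumption made for illustration: the robot functions say and find_object, the prompt wording, the RobotConsole class, and stub_llm, which stands in for a real model call and always returns the same command.

```python
from typing import Callable

# Hypothetical API specification shown to the model.
API_SPEC = '''\
say(text: str) -> None         # speak to the user
find_object(name: str) -> str  # return the location of an object
'''

# Hypothetical in-prompt example of input and generated code.
EXAMPLES = '''\
User: where is the cup?
Robot code: say("The cup is on the " + find_object("cup"))
'''


def build_prompt(user_input: str) -> str:
    """Concatenate API specification, examples, and the current input."""
    return (
        "You control a robot through this Python API:\n" + API_SPEC
        + "\nExamples:\n" + EXAMPLES
        + "\nUser: " + user_input + "\nRobot code: "
    )


def stub_llm(prompt: str) -> str:
    """Stand-in for a real LLM call; always returns the same command."""
    return 'say("The sponge is on the " + find_object("sponge"))'


class RobotConsole:
    """Executes generated commands against a namespace of robot functions."""

    def __init__(self) -> None:
        self.spoken: list[str] = []
        self.namespace = {
            "say": self.spoken.append,
            "find_object": lambda name: "table",  # fake perception result
        }

    def run(self, command: str) -> None:
        # exec on model output is unsafe outside a sandbox; sketch only.
        exec(command, self.namespace)


def handle(user_input: str, llm: Callable[[str], str], console: RobotConsole) -> str:
    command = llm(build_prompt(user_input))
    console.run(command)
    return command


console = RobotConsole()
handle("where is the sponge?", stub_llm, console)
print(console.spoken)  # ['The sponge is on the table']
```

    Passing the model in as a plain callable keeps the loop independent of any particular LLM provider, which fits the observation that different language models were evaluated.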

    Incremental Learning from Interaction

    We introduced a system in which the robot learns and improves from interactions. When a human gives feedback, an LLM reflects on the interaction, identifies what could be improved, and revises the action code. The revised code is stored in memory for future use.
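
    A minimal sketch of this feedback loop, assuming hypothetical names throughout (Interaction, InteractionMemory, stub_reflect, learn_from_feedback, and the robot calls ask and grasp). The reflection step is a hard-coded stand-in; a real system would prompt an LLM with the request, the executed code, and the feedback.

```python
from dataclasses import dataclass, field


@dataclass
class Interaction:
    request: str
    code: str
    feedback: str = ""


@dataclass
class InteractionMemory:
    """Stores request/code pairs so later prompts can reuse them."""
    examples: list = field(default_factory=list)

    def store(self, request: str, code: str) -> None:
        self.examples.append((request, code))


def stub_reflect(interaction: Interaction) -> str:
    """Stand-in for the LLM reflection step: returns revised code."""
    if "ask first" in interaction.feedback:
        return 'ask("Which cup do you mean?")\n' + interaction.code
    return interaction.code


def learn_from_feedback(interaction, memory, reflect=stub_reflect) -> str:
    revised = reflect(interaction)
    if revised != interaction.code:
        # Only code that actually changed is written back to memory.
        memory.store(interaction.request, revised)
    return revised


memory = InteractionMemory()
episode = Interaction(
    request="bring me the cup",
    code='grasp("cup")',
    feedback="please ask first which cup I mean",
)
revised = learn_from_feedback(episode, memory)
print(revised)
```

    In this sketch the revision is stored immediately. A confirmation step before the write would be the natural place to address the challenge of verifying an improvement before memory is updated.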

    Experimentation and Results

    We validated our system using various language models and interaction scenarios, observing that:

    • Interactive feedback significantly improved performance.
    • Incremental learning reduced unnecessary interactions.
    • The retrieval of demonstrations was beneficial.
    • The performance of the system scaled with better language models.
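
    To make the retrieval of demonstrations concrete, the sketch below ranks stored request/code pairs by word overlap with a new request. The similarity measure (Jaccard overlap of lowercase tokens) and the function names are assumptions chosen for simplicity; the source does not say how demonstrations are actually retrieved.

```python
def tokens(text: str) -> set:
    """Lowercase whitespace tokenization; deliberately simplistic."""
    return set(text.lower().split())


def jaccard(a: str, b: str) -> float:
    """Size of the token intersection divided by the size of the union."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def retrieve(query: str, demonstrations: list, k: int = 2) -> list:
    """Return the k stored (request, code) pairs most similar to the query."""
    ranked = sorted(
        demonstrations,
        key=lambda demo: jaccard(query, demo[0]),
        reverse=True,  # sorted() is stable, so ties keep their stored order
    )
    return ranked[:k]


# Made-up demonstrations; the robot calls are hypothetical.
demos = [
    ("bring me the cup", 'grasp("cup")'),
    ("wipe the table", 'wipe("table")'),
    ("bring me the sponge", 'grasp("sponge")'),
]
top = retrieve("please bring me the bottle", demos, k=2)
print([request for request, _ in top])
# ['bring me the cup', 'bring me the sponge']
```

    The retrieved pairs would then be placed in the prompt as examples, so that only demonstrations relevant to the current request take up space.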

    Open Challenges

    Our research faces multiple challenges, including:

    • Designing effective APIs for the language model.
    • Managing system latency and data privacy.
    • Confirming the LLM's improvements before updating the memory.
    • Personalizing user interactions.

    Personalization in Human-Robot Interaction

    We explored two approaches for personalization:

    1. Explicit attribute storage using getter and setter functions.
    2. User-specific interaction memories.

    Both approaches have their advantages and constraints. Combining them might yield better results in future implementations.
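
    The two approaches can be contrasted in a short sketch. The class names, method names, and example data below are hypothetical and serve only to illustrate the difference between explicit attributes and per-user interaction memories.

```python
from collections import defaultdict


class UserProfiles:
    """Approach 1: explicit attributes behind getter and setter functions."""

    def __init__(self) -> None:
        self._attributes = defaultdict(dict)

    def set_attribute(self, user: str, key: str, value: str) -> None:
        self._attributes[user][key] = value

    def get_attribute(self, user: str, key: str, default=None):
        # .get on the outer dict avoids creating entries for unknown users.
        return self._attributes.get(user, {}).get(key, default)


class UserInteractionMemory:
    """Approach 2: past interactions stored separately for each user."""

    def __init__(self) -> None:
        self._episodes = defaultdict(list)

    def remember(self, user: str, request: str, code: str) -> None:
        self._episodes[user].append((request, code))

    def recall(self, user: str) -> list:
        # Return a copy so callers cannot modify the stored episodes.
        return list(self._episodes.get(user, []))


profiles = UserProfiles()
profiles.set_attribute("alice", "favorite_drink", "tea")

episodes = UserInteractionMemory()
episodes.remember("alice", "bring me a drink", 'grasp("tea")')

print(profiles.get_attribute("alice", "favorite_drink"))  # tea
print(episodes.recall("bob"))  # []
```

    Explicit attributes are easy to inspect and correct but only cover what has been named in advance, whereas interaction memories capture arbitrary preferences at the cost of being harder to audit.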

    Conclusion

    Our research focuses on combining humanoid robot design with NLP to enable intuitive interaction and incremental learning. Future work aims at improving API design, managing data security, and enhancing personalization.

    FAQ

    Q1: What are the main goals of your research?
    A1: Our main goals are to engineer humanoid robots with an emphasis on grasping, manipulation, and learning from human observation and experience, and to enable natural interaction and communication with them.

    Q2: How have the Arma humanoid robots evolved over the years?
    A2: Our Arma humanoid robots have evolved significantly in both design and functionality, with the latest model, Arma7, featuring 32 degrees of freedom and over 100 sensors.

    Q3: What does your functional cognitive architecture entail?
    A3: It is a three-layer memory-centric architecture that includes hardware abstraction, high-level planning and reasoning, and a central memory for mediating between these layers.

    Q4: How does the LLM agent control the robot's behavior?
    A4: The LLM agent uses a prompting-based approach to generate Python commands, which are then executed to perform physical actions or invoke perception components.

    Q5: What is incremental learning in your context?
    A5: Incremental learning means the robot improves its future actions based on feedback from human interactions, with the revised code stored in memory for future tasks.

    Q6: What challenges are you currently facing?
    A6: Challenges include designing effective APIs, managing system latency and data privacy, confirming improvements before updating memory, and personalizing user interactions.

    Q7: How are you addressing personalization in human-robot interaction?
    A7: We use explicit attribute storage with getter and setter functions and user-specific interaction memories to tailor interactions to individual users.
