Cognitive Robotics

 

Embodied Learning Environment

Page history last edited by Nicholas Davis 1 yr ago

 

 

Nick Davis

Individual Research Objectives

Research Proposals

 

Goal: Create an interactive virtual knowledge store, to be implemented on the Wii console, that takes input from a human participant through Wii Remotes, a camera-based gesture system, or a gyroscopic mouse.

 

Concept: A human participant speaks to and interacts with a Wikipedia-like virtual environment. The speech is processed by a speech recognizer, after which the system stemmatizes the text and conducts a frame analysis to store the input and check the history of other inputs for relational meaning. The system will couple the gesture information recorded through the Wii console with the stemmatized text and the perceived meaning. The coupled information will be grouped into like categories that are activated later for more intricate interpretation. In essence, this process would be like creating a perceptual symbol system: input from multiple modalities is coupled together. This would allow novel concepts to be understood through symbols that are activated by any of gesture, intonation, rhythm, semantic domain, or syntax.
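The coupling described above can be sketched as a small data structure. This is only an illustrative sketch, not the proposal's design: the names `Observation`, `SymbolStore`, and `activate`, and the coarse string labels for prosody and gesture, are assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One coupled record: what was said, how it was said, and the gesture."""
    stemma: tuple   # stemmatized tokens, e.g. ("open", "article")
    prosody: str    # coarse intonation label (assumed), e.g. "rising"
    gesture: str    # coarse gesture label (assumed), e.g. "point"
    domain: str     # semantic domain assigned by the frame analysis


class SymbolStore:
    """Indexes each observation under every modality so any cue can activate it."""

    def __init__(self):
        self._index = defaultdict(list)

    def add(self, obs):
        # File the same record under each of its modalities.
        for token in obs.stemma:
            self._index[("stemma", token)].append(obs)
        self._index[("prosody", obs.prosody)].append(obs)
        self._index[("gesture", obs.gesture)].append(obs)
        self._index[("domain", obs.domain)].append(obs)

    def activate(self, modality, value):
        # Return every stored record that shares this cue (empty list if none).
        return list(self._index.get((modality, value), []))
```

Because every record is filed under each modality, a gesture alone (or an intonation alone) is enough to retrieve the records it was previously coupled with.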

 

The information gathered by the system would not only train the system but also provide empirical data for research on the semantics of gesture, prosody, semiotics, and other communicative functions.

Implementation:

1)      Use the current model for the parsing agent, but add sensory input from the human domain.

a.       Figure 1: Human component of the Parsing Agent

 

3)      Incorporate this into the parsing agent model

 


4)      The system will have a few different functions:

a.       A human says something; the speech recognizer then transcribes the speech and marks prosodic and intonation changes.

b.      The text will be stemmatized and coupled with the prosodic information.

c.       The gesture will also be recorded, using one of several candidate techniques:

                                                               i.      Wii apparatus:

1.       Pros: Provides motion, velocity, and posture data.

2.       Cons: The compatibility of this data with Java or other languages is unknown.

                                                             ii.      Gyroscopic mouse:

1.       Pros: Most likely easier to program.

2.       Cons: No postural data and only limited gesture data.

                                                            iii.      Tri-camera system:

1.       Cons: Requires computer image processing, which is very complex.

d.      The gesture data will be coupled with the stemma to look for patterns in that domain.

e.      The stemma will then be passed to the frame analysis, which looks up domain knowledge for the stemma and the relationships between the words.

                                                               i.      This step will also couple the gesture data with the domain content.

f.        The agent will then take the information and act on it according to which kind of speech act was used.
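Steps b through f above can be sketched as a short pipeline. This is a minimal sketch under stated assumptions: the utterance is taken as already transcribed (step a), `stemmatize` is a placeholder that only lowercases and splits words, the `FRAMES` lexicon is made-up data, and the rule that rising intonation signals a question is an assumed simplification. None of these names come from the proposal.

```python
from collections import Counter

# Toy frame lexicon: word -> semantic domain. Hypothetical data for illustration.
FRAMES = {"open": "navigation", "article": "navigation", "play": "media"}


def stemmatize(text):
    """Placeholder for step b: strip punctuation, lowercase, split on whitespace."""
    return [w.strip(".,?!").lower() for w in text.split()]


def frame_analysis(stemma):
    """Step e: look up the domain of each known word; return the most common one."""
    domains = Counter(FRAMES[w] for w in stemma if w in FRAMES)
    return domains.most_common(1)[0][0] if domains else None


def speech_act(prosody):
    """Step f: crude speech-act guess from intonation (an assumed rule)."""
    return "question" if prosody == "rising" else "statement"


def process(text, prosody, gesture):
    """Couple stemma, prosody, gesture, and domain for one transcribed utterance."""
    stemma = stemmatize(text)
    return {
        "stemma": stemma,            # step b
        "prosody": prosody,          # coupled in step b
        "gesture": gesture,          # steps c and d
        "domain": frame_analysis(stemma),  # step e
        "act": speech_act(prosody),  # step f
    }
```

For example, `process("Open the article?", "rising", "point")` yields a record in the "navigation" domain tagged as a question, which the agent could then dispatch on.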

5)      Coupling gesture with semantic content and stemma will build a memory bank of patterns, from which the agent will derive prototypical gestures that let it quickly discern the user's objective.

a.       This will in essence be like a perceptual symbol: a multimodal, partial representation. It will hold information from speech, text, semantics, and gesture, combined according to the words and domains used in the human's act.

b.      The prototypes will train the agent to interact in a more efficient manner and also help us determine how a system is able to produce a perceptual symbol system of knowledge.
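One simple way to picture prototype formation is to average the gesture data stored under each domain and match new gestures to the nearest average. The sketch below assumes gestures are already encoded as fixed-length numeric vectors and uses a nearest-mean rule with Euclidean distance; both are assumptions chosen for illustration, not techniques named in the proposal, and `build_prototypes` and `nearest_domain` are hypothetical names.

```python
import math
from collections import defaultdict


def build_prototypes(memory):
    """Average the gesture vectors stored under each domain.

    memory: list of (domain, gesture_vector) pairs with equal-length vectors.
    """
    grouped = defaultdict(list)
    for domain, vector in memory:
        grouped[domain].append(vector)
    # Component-wise mean of all vectors recorded for a domain.
    return {
        domain: [sum(col) / len(vectors) for col in zip(*vectors)]
        for domain, vectors in grouped.items()
    }


def nearest_domain(prototypes, vector):
    """Return the domain whose prototype is closest (Euclidean) to a new gesture."""
    return min(prototypes, key=lambda d: math.dist(prototypes[d], vector))
```

As more coupled observations accumulate, the prototypes shift toward the gestures users actually make in each domain, which is one reading of how the memory bank would train the agent.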

6)      We can seed the program with certain gesture schemas. This would in essence be like creating a body image of the user: certain hand and foot positions indicate that the body is in a given gesture. The system creates a virtual representation of the body from the limited input available in the gestural domain.

a.       With the body image partially created, it will be the agent's job to construct a body schema for each individual user and for humans in general.

                                                               i.      This would capture the small habits a person tends to show when explaining a concept: the posture and movements associated with given domains.

                                                             ii.      This would change with different users, so the agent could take note of who is using the system and update the base body schema accordingly.

b.      New concepts or words will be understood by looking at the gesture pattern, intonation, and prosody associated with them, then cross-referencing that data with the other relevant frames in order to attach the new word to pre-stored meanings.

                                                               i.      In essence, updating and constructing a perceptual symbol system.
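The cross-referencing in 6b can be illustrated with a simple voting scheme: each stored record votes for its domain once per cue it shares with the unknown word. This is only a sketch under assumptions; the dictionary layout, the cue keys "gesture" and "prosody", and the function name `infer_domain` are hypothetical, and equal-weight voting is one choice among many.

```python
from collections import Counter


def infer_domain(new_cues, memory):
    """Guess a domain for an unknown word from the cues that accompanied it.

    new_cues: dict such as {"gesture": "point", "prosody": "rising"}.
    memory: list of dicts with the same cue keys plus a "domain" key.
    """
    votes = Counter()
    for record in memory:
        # Count how many cues this stored record shares with the new word.
        shared = sum(1 for key, value in new_cues.items() if record.get(key) == value)
        if shared:
            votes[record["domain"]] += shared
    # No shared cues at all means no guess can be made.
    return votes.most_common(1)[0][0] if votes else None
```

A word that arrives with a pointing gesture and rising intonation would be attached to whichever stored domain has most often co-occurred with those cues.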
