In class on 10/10, we discussed three main topics:
- Mixture Models
- The nature of representations
- Collaborative attention
First, I'll try to summarize the mixture model discussion. Matt introduced the model by describing states, which are discrete things: 'on' or 'off' in a two-state model, or, in the windshield wiper example with multiple states, 'off,' 'slow,' 'medium,' and 'fast.' The mixture model serves to separate input into a preset number of states. For example, if the computer is given an array of numbers like 1, 5, 6, 8, 21, 23, 25, and is told that there are two states producing these numbers, it will, at first, randomly select two points and assign to each a bell curve centered on that point. Next, it will gauge whether or not each bell curve covers an adequate share of the numbers. If it does, the curve stays; if not, the model moves the center of the curve around to fit the number groupings. This works best if the numbers are somewhat chunked; in the example there are two groups, low numbers and high numbers, and these should end up corresponding to the two curves. Next, the machine will assign, with a 50/50 chance, what state each curve represents. The example used in class was 'shininess.' Those numbers with a higher value will, hopefully, be determined to be shiny, and from there we could code in that whatever is shiny should be attended to.
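The procedure described above can be sketched in code. This is only a rough illustration, not the model Matt built: it assumes the standard expectation-maximization (EM) fitting procedure for a one-dimensional Gaussian mixture, and the names `fit_mixture`, `shiny_flags`, `init_means`, and `seed` are made up for this example. The rule that the curve with the higher center is the 'shiny' one is also an assumption added here for illustration.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Height at x of a bell curve with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_mixture(data, k=2, iterations=100, seed=None, init_means=None):
    """Fit k bell curves to 1-D data with EM (hypothetical helper)."""
    n = len(data)
    rng = random.Random(seed)
    # Start each curve on a randomly chosen data point unless centers are given.
    means = list(init_means) if init_means is not None else rng.sample(list(data), k)
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n or 1.0
    variances = [overall_var] * k   # every curve starts as wide as the whole data set
    weights = [1.0 / k] * k         # every curve starts equally likely
    resp = []
    for _ in range(iterations):
        # E-step: how strongly does each curve claim each number?
        resp = []
        for x in data:
            scores = [w * gaussian_pdf(x, m, v)
                      for w, m, v in zip(weights, means, variances)]
            total = sum(scores)
            resp.append([s / total for s in scores] if total > 0 else [1.0 / k] * k)
        # M-step: move and resize each curve to fit the numbers it claims.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # this curve claims nothing; leave it where it is
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-3)  # floor keeps a curve from collapsing to a point
            weights[j] = nj / n
    return means, variances, weights, resp


def shiny_flags(means, resp):
    """Assumption for illustration: the curve with the highest center means 'shiny'."""
    shiny = means.index(max(means))
    return [r.index(max(r)) == shiny for r in resp]


if __name__ == "__main__":
    readings = [1, 5, 6, 8, 21, 23, 25]
    means, variances, weights, resp = fit_mixture(readings, seed=0)
    for value, flag in zip(readings, shiny_flags(means, resp)):
        print(value, "shiny" if flag else "not shiny")
```

With a random start, the curves can occasionally settle on a poorer grouping, which is one reason the method works best when the numbers are clearly chunked. Passing `init_means` (for example, the smallest and largest readings) makes a run repeatable.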
With regard to attention, we talked about having peripheral sensors all around an agent, plus a concentrated grouping of sensors in the front for fine-grained analysis. Once something is deemed worthy of attention, the agent will face it and conduct a fine-grained analysis. In this way, the mixture model could help us program in attention.
As for the nature of representations, Leland made an interesting observation about our terminology of first person representations and third person representations. He was uncomfortable with our use of the term third person representation for the agent's map. He thought it was just that: a first person map. A third person representation, in order to be objective, would have to be agreed upon by another agent, and this is where he feels that second person representations are crucial. A second person representation in this context is basically one agent taking another's perspective and trying to represent what the other is seeing from its first person view.
So we potentially have this terminology:
- First person view: sensory input of the robot
- First person map: summary of points from the first person view, recorded in memory, used to know where things are in the environment relative to the robot's current position
- Second person view: shifting perspectives to represent another agent's first person view
- Third person view: an agreed-upon environment between the agents in a setting
Furthermore, we discussed how we would go about sharing the subjective map with another agent. This requires that both agents know the other is trying to communicate. For this, we would have a blinking light on each agent that indicates a desire to communicate. One agent starts blinking; the other may see this in its periphery, and the blinking is then, through the mixture model, deemed worthy of attending to. The second agent faces the first and realizes it is blinking. Here, there will have to be a hard-wired aspect that makes an agent blink if it is attending to another blinking agent. Once agent two starts blinking, they both know that the other is 'listening,' and some form of communication can follow. This is a kind of collaborative attention because the agents are attending to each other. It is not quite joint or shared attention yet, because those situations require a third thing that both agents are attending to.
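The blinking handshake can be sketched as a small piece of code. This is a minimal illustration under assumptions, not something we built in class: the `Agent` class, its method names, and `collaboratively_attending` are all hypothetical, and the step where the mixture model decides that a blink is worth attending to is replaced here by a direct check of the other agent's light.

```python
class Agent:
    """Hypothetical agent with a blinking light and a single focus of attention."""

    def __init__(self, name):
        self.name = name
        self.blinking = False
        self.attending_to = None

    def start_blinking(self):
        """Signal a desire to communicate."""
        self.blinking = True

    def notice(self, other):
        """Face another agent if its blinking is seen (stand-in for the mixture model)."""
        if other.blinking:
            self.attending_to = other
            # Hard-wired rule: blink when attending to another blinking agent.
            self.blinking = True


def collaboratively_attending(a, b):
    """True once both agents are blinking and attending to each other."""
    return (a.blinking and b.blinking
            and a.attending_to is b and b.attending_to is a)


if __name__ == "__main__":
    one, two = Agent("one"), Agent("two")
    one.start_blinking()   # agent one wants to communicate
    two.notice(one)        # agent two sees the blink, faces agent one, blinks back
    one.notice(two)        # agent one sees the answering blink
    print(collaboratively_attending(one, two))
```

Nothing in this sketch covers joint attention; that would need a third object that both agents attend to.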
We also discussed something to the effect of having a pointer, basically a nose, that one agent can aim at an object to signify 'look where I am pointing.' The rule would be something like this: if there is a blinking agent that you are 'in-blink-with' (collaboratively attending to) and it extends its pointer, create a second person representation to see what exactly it is pointing at, and use this information to create an action that orients you towards that object. Matt started creating a mixture model to computationally explain how this could happen, and he is optimistic that these mixture models will allow us to do things of this nature. However, we need to explore these models further, and hopefully next week Matthew M. will have some sort of mixture model running to demonstrate what this algorithm can do.
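To make the pointing rule concrete, here is a rough geometric sketch. It is not the mixture model Matt started on; it swaps in plain trigonometry and assumes that every agent and object has a known (x, y) position and that headings are angles in radians. The function names and the example objects are invented for illustration.

```python
import math


def bearing(from_xy, to_xy):
    """Direction (radians) from one point to another."""
    return math.atan2(to_xy[1] - from_xy[1], to_xy[0] - from_xy[0])


def angle_diff(a, b):
    """Signed difference a - b, wrapped into the range [-pi, pi]."""
    return math.atan2(math.sin(a - b), math.cos(a - b))


def pointed_object(pointer_xy, pointer_heading, objects):
    """Second person step: from the pointer's position, pick the object
    whose direction best matches where its pointer is aimed."""
    return min(
        objects,
        key=lambda name: abs(angle_diff(bearing(pointer_xy, objects[name]),
                                        pointer_heading)),
    )


def turn_towards(my_xy, my_heading, target_xy):
    """First person step: how far I must turn to face the target."""
    return angle_diff(bearing(my_xy, target_xy), my_heading)


if __name__ == "__main__":
    objects = {"ball": (5.0, 0.5), "box": (0.0, 5.0)}   # made-up scene
    target = pointed_object((0.0, 0.0), 0.0, objects)    # other agent points along +x
    turn = turn_towards((5.0, -5.0), 0.0, objects[target])
    print(target, round(math.degrees(turn), 1))
```

The split into two functions mirrors the two steps in the discussion: first take the other agent's perspective, then turn that into an orienting action of your own.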
Further Reading:
The Wikipedia page on mixture models does not explain them very well, but it does show the math behind the model: http://en.wikipedia.org/wiki/Mixture_model