First Objective: Defining the Symbol Grounding Problem
The symbol grounding problem asks how symbols (i.e., words, objects) get their meanings, and in doing so, questions meaning itself. Harnad’s Chinese/Chinese Dictionary example suggests that obtaining meaning from symbols, and vice versa, is a redundant and circuitous process that achieves nothing. We addressed a similar example in class by noting how, semantically, a chair is equivalent to a stool, which is equivalent to a table, which is equivalent to a desk. In the end, they all have some kind of flat surface that rests on legs. But if any random person were asked to identify a desk and a chair, he or she would certainly know the difference. In his article, Harnad examines this phenomenon by introducing extrinsic and intrinsic meaning. Semantic definitions of a chair, stool, table, and desk are examples of extrinsic meaning. However, Harnad says “cognition cannot be just symbol manipulation.” Humans know the difference between these objects because we associate intrinsic “meanings in our heads” with each. It is this “intrinsic” meaning that cognitive robotics strives to achieve.
There are two approaches to addressing the symbol grounding problem, each with significant and distinct capabilities and limitations. The first is the symbolic approach. Its forte is that it encompasses formal and language-like tasks, which reflect the behavioral capacities of humans. Symbolists attribute the success of symbolic AI to implementation independence, the idea that concepts are independent of their physical forms. Its limitation, however, is that symbols remain ungrounded (which is the root of the problem). The second approach is connectionism, which covers sensory, motor, and learning tasks. As opposed to symbol manipulation, connectionism uses dynamic patterns of activity in a multilayered network. With connectionism, a system can learn and solve problems by recognizing patterns. The problem is that it does everything “non symbolically.” As Harnad claims, “many of our behavioral capacities appear to be symbolic.” Connectionism alone would therefore fail to achieve the human model of intrinsic meaning.
In a theoretical sense, I suppose that a combination of symbolism and connectionism would be key to mimicking the thought process of a human brain, but I’m not sure how this would be carried out in practice. Feedback and memory would also be essential, as humans often associate experiences and emotion with the objects around them. I feel I should do a bit more research before proposing a solution to this problem...
Nick's Response:
It does seem that the combination of symbolism and connectionism would be the ideal approach, something like a best-of-both-worlds situation. Then again, when two approaches combine, the flaws of both are brought along as well. Harnad tries to use visual icons to discriminate between different elements of the world. The problem with this approach is: which icon does one compare the current sensory projection to? I brought this up in class and I am still thinking about it. I suppose the connectionist network could be used in this circumstance. For example, the contours and features of the icon could activate nodes that have similar contours. This would generate a bunch of icons, maybe of completely different objects, but some with similar features. Then the icon overlay could happen with these activations. However, this requires setting certain standards for measuring contours, and some set of categorical distinctions would have to be in place for identifying what contour pattern the system is currently dealing with. I suppose this would be better than having to program in actual categories for objects in the world and suffering from the grounding problem.
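A minimal sketch of that retrieval step, under assumptions that are not in Harnad's article: each stored icon is reduced to a short vector of hypothetical contour features, the current sensory projection is reduced the same way, and a stored icon is "activated" when its cosine similarity to the projection passes a threshold. The feature values, the icon names, and the 0.8 threshold are all made up for illustration.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector activates nothing
    return dot / (norm_a * norm_b)


def activate_candidates(projection, stored_icons, threshold=0.8):
    """Return (name, similarity) pairs for icons that pass the threshold,
    most similar first. These are the candidates for the icon overlay."""
    scored = [
        (name, cosine_similarity(projection, features))
        for name, features in stored_icons.items()
    ]
    passed = [pair for pair in scored if pair[1] >= threshold]
    return sorted(passed, key=lambda pair: pair[1], reverse=True)


# Hypothetical contour features: (straight edges, curvature, flat top).
stored_icons = {
    "chair": [0.9, 0.1, 0.8],
    "stool": [0.8, 0.2, 0.7],
    "ball": [0.0, 1.0, 0.1],
}
projection = [0.85, 0.15, 0.75]

candidates = activate_candidates(projection, stored_icons)
candidate_names = [name for name, _ in candidates]
print(candidate_names)
```

Here the chair and stool icons are activated and the ball is not, so the overlay only has to be tried against two icons. The hard part raised above is untouched by the sketch: someone still has to decide which contour features to measure.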
So, if we thought about doing the above proposal with the clustering technique that the computation people are working on, what would it look like? Their approach graphs sensory activation in a Euclidean space. I think Matt was only demonstrating a binary type of activation, either on or off, where the x-y components gave the location in the visual field at which the activation happened. (I am curious how intensity and varying activation would come into play here, but hmm... maybe we will ask about that today.) And so, the system runs some algorithms and comes up with a blob-like group. This is an iconic representation because it corresponds to what is happening in the environment. It is not an arbitrary label, a symbol, for the state of affairs; each point is there because that's what the sensor recorded. So this makes some blob. To evaluate this blob, what could we do? There are most likely some distinguishing features, like: 1) Size, 2) Local Maximum, 3) Local Minimum, 4) Point Density. I suppose if we did the superimposing that Harnad was discussing, we would begin to see what kinds of things distinguish one blob from another, in order to create a systematic structuring system, not yet at the categorical level, but still at the discrimination level. The point of this thought experiment was just to try to solve the 'prototypical icon' problem that came up in Harnad's article. It seems that he has a good theory for integrating connectionism and symbolism, but we may have to flesh out the subtleties in order to make use of it.
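The four blob features listed above can be sketched roughly as follows. This is not the computation group's clustering code; it assumes a blob is already given as a set of (x, y) points where a binary sensor was on. Two readings are my own assumptions: "local maximum" and "local minimum" are taken as the highest and lowest active y coordinate, and "point density" as active points per cell of the blob's bounding box.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class BlobFeatures:
    size: int       # number of active points in the blob
    y_max: int      # highest active row (assumed reading of "local maximum")
    y_min: int      # lowest active row (assumed reading of "local minimum")
    density: float  # active points per cell of the bounding box


def describe_blob(points):
    """Summarize a blob given as an iterable of (x, y) integer points."""
    pts = set(points)
    if not pts:
        raise ValueError("a blob needs at least one active point")
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    width = max(xs) - min(xs) + 1
    height = max(ys) - min(ys) + 1
    return BlobFeatures(
        size=len(pts),
        y_max=max(ys),
        y_min=min(ys),
        density=len(pts) / (width * height),
    )


def feature_distance(a, b):
    """Unweighted Euclidean distance between two feature summaries."""
    return sqrt(
        (a.size - b.size) ** 2
        + (a.y_max - b.y_max) ** 2
        + (a.y_min - b.y_min) ** 2
        + (a.density - b.density) ** 2
    )


# A filled 3x3 square and the same square with its center switched off.
square = describe_blob((x, y) for x in range(3) for y in range(3))
ring = describe_blob(
    (x, y) for x in range(3) for y in range(3) if (x, y) != (1, 1)
)
print(square, ring, feature_distance(square, ring))
```

The two blobs share the same extent and differ only in size and density, which is the kind of discrimination-level difference meant above. The distance is unweighted, so size dominates density; choosing sensible weights is part of the open question.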
Comments (3)
Leland McCleary said
at 6:28 pm on Oct 8, 2008
Nick, I suggest another look at Barsalou's concept of perceptual symbol systems as an alternative to Harnad's "visual icon" / connectionist mix. Barsalou's proposal for perceptual symbols seems to be doing the same work as Harnad's "visual icons", with the difference that the perceptual symbol is directly derived from the sensory, and doesn't have to be "compared" to the sensory. Something like that.
Leland
Nicholas Davis said
at 12:36 pm on Oct 9, 2008
I completely agree that Barsalou's notion of a perceptual symbol is of utmost importance in this context. But this approach would run into one crucial problem when implemented in an agent: what features would it schematically store? By definition, a perceptual symbol is a schematic neural activation, stored so that it can be reactivated upon later experiences with similar objects in order to comprehend the scene at hand in terms of previous experiences. But the question of how to link these symbols together, actually creating the 'system', would most likely have something to do with connectionism and weighting the links between sensory activations. Basically, we would again run into the problem of distinguishing the 'critical features' that would be good to store about the object at hand. These features, of course, would just be sensory information, strings of numbers coming from the robot's input, but which ones are important to store, and what should they be linked to? Which ones will really bring about a sense of understanding in a future scenario? Barsalou proposes that these systems are somewhat linguistically organized, such that all the neural activations for 'chair' will be activated upon hearing the word, and different things relating to chairs are stored in that system. But would we have linguistic tags like this? Hmm... maybe we could give the robots the tags and then let them experience the object, record sensory information, and relate it to the tagged word. Then again, this would not necessarily work, because the relationships between words would have to be encoded as well: that chairs can be used to 'sit', and that 'sit' is a thing with x, y, and z sensory information associated with it. Don't get me wrong, I like perceptual symbols, I really do, but working out the details of how a robot would utilize them has been an ongoing project in my mind. Maybe you have some thoughts?
Leland McCleary said
at 5:37 pm on Oct 9, 2008
Seems to me like all the objections you make to perceptual symbols for robots hold just as well for humans! No guarantee for us what the 'critical features' are either, or what would prepare us for a future scenario. As far as 'giving them tags', why don't we take a cue from some of the research that's being done (like Vogt's thesis) and let them figure it out among themselves and invent their own concepts. We're in a much better position to figure out what their concepts 'mean' in our language and grounded experience than they are to figure out ours. They develop their language, we learn it, and then we're on our way. Anthropologists do it all the time, and the gain is that we learn about different ways of being in and experiencing the world.