Are there any NLP studies that try to understand the meaning of sentences by combining images?
I'm sorry about my English; I don't think my title describes my question accurately.
What I mean is this: I know there are many studies that use NLP methods for image-related tasks such as tagging and image understanding. So I wonder whether we can reverse the direction and assign an image to each character or word. A sentence is made of words, and since each word has an image, the sentence can be described as a view with many images on it. We could then try to let a machine understand that view.
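To make the idea more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration only: the `WORD_IMAGES` table, the 3x3 pixel grids, and the `sentence_to_view` function are made up, and a real system would use actual pictures and some model that reads the resulting view.

```python
# Hypothetical word-to-image table: each "image" is a tiny 3x3 grid of 0/1 pixels.
WORD_IMAGES = {
    "sun":  [[0, 1, 0], [1, 1, 1], [0, 1, 0]],
    "tree": [[1, 1, 1], [0, 1, 0], [0, 1, 0]],
}
# Words without an image get an empty tile (an assumption, not a design decision).
BLANK = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]


def sentence_to_view(sentence):
    """Place the image of each word side by side to form one wide 'view'."""
    tiles = [WORD_IMAGES.get(word, BLANK) for word in sentence.lower().split()]
    if not tiles:
        return []
    # Concatenate the tiles row by row, left to right.
    return [[pixel for tile in tiles for pixel in tile[row]] for row in range(3)]


if __name__ == "__main__":
    for row in sentence_to_view("sun tree"):
        print("".join("#" if pixel else "." for pixel in row))
```

The resulting view is just a pixel grid, so in principle it could be passed to any image model; whether that helps with understanding the sentence is exactly what I am asking about.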
Two things led me to this idea. First, we humans form pictures in our minds when we read sentences in a novel. Second, I am Chinese, and many Chinese characters were derived from images of the real world. For instance, the character 日 means "sun". How would we draw a sun? Perhaps we would draw a circle with many lines around it to represent the sun and its light. And if we simplify that drawing? That is how our ancestors created the character 日.
I think there must be many studies along these lines, but I can't find them. If you know of any, please share them with me. Thanks!
I would also like to start an open-source project based on this idea. Do you have any suggestions? Please let me know. Thanks!