I have some small experience in object recognition, and I appreciate the level of difficulty in this research topic. I am slightly confused, however, by how the skeleton-based object recognition relates to the rest of the video, poster, and abstract. Is this a separate project from the point cloud project?
Also, related to the skeleton recognition, did you ever show the subjects shapes that were created with erroneous branches?
Diane Cook
Faculty: Project PI
The description of how a skeleton is created for a shape appears somewhat similar to creating a Voronoi diagram. Can the Voronoi techniques be used to perform this task?
John Wilder
Graduate Student
The Voronoi diagram is our first step. It is used in our method as a discrete approximation of the Medial Axis Transform of Blum (1967). These shape skeletons are extremely sensitive to noise in the contour; a small bump added to the contour adds an entire new skeletal branch. We take the medial axis and prune branches to find the MAP skeleton (Feldman & Singh, 2006), which yields shape skeletons with one axis for each perceptually salient shape part. We would like a system that classifies objects by their shape to give the same classification even when the boundary extracted in one image is noisier than the boundary of the same object extracted in another image, which makes the MAP skeleton a good representation for us to use.
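To make that first step concrete, here is a minimal sketch (not our pruning code) of how a discrete medial-axis approximation can be read off the Voronoi diagram of densely sampled boundary points. The function name and the SciPy/Matplotlib helpers are illustrative choices, not part of our implementation.

    import numpy as np
    from scipy.spatial import Voronoi
    from matplotlib.path import Path

    def approximate_medial_axis(boundary_pts):
        """Discrete medial-axis approximation from a Voronoi diagram.
        boundary_pts: (N, 2) array of densely sampled, ordered contour points."""
        vor = Voronoi(boundary_pts)
        inside = Path(boundary_pts).contains_points(vor.vertices)
        # Voronoi vertices interior to the shape approximate the medial axis
        # (Blum, 1967); ridges with both endpoints interior are its edges.
        edges = [(a, b) for a, b in vor.ridge_vertices
                 if a != -1 and b != -1 and inside[a] and inside[b]]
        return vor.vertices, inside, edges

    # Toy usage: an ellipse-like contour. The raw result still contains the
    # spurious branches caused by discretization noise, which is exactly what
    # the MAP pruning step is meant to remove.
    t = np.linspace(0, 2 * np.pi, 400, endpoint=False)
    contour = np.column_stack([2.0 * np.cos(t), np.sin(t)])
    vertices, inside, edges = approximate_medial_axis(contour)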
Specifically, the MAP skeleton is the skeleton that maximizes the posterior p(skeleton | shape), which by Bayes' rule is proportional to the product of p(skeleton) and p(shape | skeleton).
p(skeleton) gives higher probability to skeletons that have fewer branches and straighter axes.
p(shape | skeleton) gives the shape a high probability if it is well fit by the skeleton, which favors more complex skeletons.
The posterior then balances the complexity of the skeleton against its explanatory power.
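The following toy sketch shows the structure of that trade-off. The branch and turning-angle penalties in the prior, and the rib-length smoothness term standing in for p(shape | skeleton), are simplifications chosen for illustration; they are not the actual Feldman & Singh (2006) likelihood model, and the weights are arbitrary.

    import numpy as np

    def log_prior(axes, branch_cost=5.0, turn_cost=2.0):
        """p(skeleton): fewer branches and straighter axes score higher.
        branch_cost and turn_cost are illustrative weights, not fitted values."""
        lp = -branch_cost * len(axes)
        for axis in axes:                         # axis: (K, 2) polyline
            d = np.diff(axis, axis=0)
            headings = np.arctan2(d[:, 1], d[:, 0])
            lp -= turn_cost * np.abs(np.diff(headings)).sum()   # total turning
        return lp

    def log_likelihood(contour, axes, sigma=1.0):
        """Crude stand-in for p(shape | skeleton): treat the distance from each
        contour point to its nearest skeleton point as a 'rib length' and reward
        rib lengths that vary smoothly along the contour, so a bump left
        unexplained by a branch is penalized."""
        skel = np.vstack(axes)
        ribs = np.linalg.norm(contour[:, None] - skel[None, :], axis=2).min(axis=1)
        jumps = np.diff(np.append(ribs, ribs[0]))    # closed contour
        return float(-0.5 * np.sum((jumps / sigma) ** 2))

    def log_posterior(contour, axes):
        """Score a candidate pruning; the MAP skeleton maximizes this."""
        return log_prior(axes) + log_likelihood(contour, axes)

Scoring each candidate pruning of the medial axis with log_posterior and keeping the best one reproduces the intended behavior: an extra branch is kept only when the improvement in how well the contour is explained outweighs the complexity penalty.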
Julia Hirschberg
Faculty: Project PI
What evidence do you have that your model will improve over other approaches to object recognition?
Edinah Gnang
Thank you for your great question. The evidence that the Tensor model [Gnang, Elgammal, Retakh 2011] improves over other approaches comes from the fact that it captures many alternative discrete shape models based on matrix analysis while providing a natural generalization to higher-order correlation settings. Finally, the tensor framework allows us to express explicitly the running time of the induced algorithms in terms of the symmetries present in the shape.
Mostafa Bassiouni
Faculty: Project Co-PI
You have adequately described the theoretical basis of your model, including Theorems 2.1 and 2.2. The practical utility of this model has not been fully substantiated. In your video, you discussed the practical application of recognizing whether a simple shape is an animal or a leaf, and you mentioned using a naïve Bayes classifier in your experiments. How do you compare your model with the state-of-the-art computer vision methods for object recognition, shape recognition, or feature recognition? Also, how would you characterize the computational overhead of a shape recognition method based on your Tensor model?
Edinah Gnang
Thank you very much for your questions. Allow me to address the computational aspects of the Tensor model. As formulated, the shape recognition problem reduces to instances of hypergraph sub-isomorphism, and unless P = NP there is no polynomial-time algorithm for solving the problem in complete generality. However, we have not yet tried our algorithm on computer vision datasets, so it remains to be determined how it will compare to other approaches. Our overall approach improves on the generic Gröbner basis approach for solving hypergraph sub-isomorphism, and we express the running time of our algorithms in terms of the size of the automorphism group of the hypergraph.
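For readers unfamiliar with the reduction, here is a brute-force illustration of the hypergraph sub-isomorphism decision problem. It is exponential in the number of vertices and is not our algorithm, which exploits the automorphism structure; it only shows the problem being reduced to.

    from itertools import permutations

    def sub_isomorphic(small, big):
        """Is the hypergraph `small` isomorphic to a sub-hypergraph of `big`?
        Hypergraphs are (vertices, edges) pairs with edges given as frozensets.
        Tries every injective vertex map, hence exponential running time."""
        (v1, e1), (v2, e2) = small, big
        e2 = set(e2)
        for image in permutations(v2, len(v1)):      # injective vertex maps
            f = dict(zip(v1, image))
            if all(frozenset(f[u] for u in edge) in e2 for edge in e1):
                return True
        return False

    # Toy usage: a single 3-edge embedded in a larger 3-uniform hypergraph.
    small = ([0, 1, 2], [frozenset({0, 1, 2})])
    big = (['a', 'b', 'c', 'd'],
           [frozenset({'a', 'b', 'c'}), frozenset({'b', 'c', 'd'})])
    print(sub_isomorphic(small, big))   # True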
John Wilder
Graduate Student
The practical application of recognizing whether a silhouette is an animal or a leaf was added to show that our IGERT is working on a breadth of approaches in both computer and human vision. We have not directly compared the skeletal model to the state of the art in computer vision because the questions in that project are about the human visual system (whether or not a skeletal representation contains psychologically relevant information above and beyond what would be easily accessible from a contour representation). Since we use only a couple of properties/summaries of the shape skeleton, an approach that includes additional information, such as color or texture, would perform better. Also, the test we performed measured how consistent the skeletal model’s classifications were with human classifications. There is no “correct” answer in this test; we are simply comparing to human data. We did use alternative representations that have been suggested to be psychologically relevant, such as various contour representations, aspect ratio, compactness, perimeter/area^2, contour complexity, and combinations of those representations. These all failed to predict the human classifications.
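For concreteness, this is roughly what such global contour summaries look like for a polygonal silhouette. The exact definitions below (bounding-box aspect ratio, isoperimetric compactness) are generic stand-ins, not necessarily the precise baseline features used in our experiments.

    import numpy as np

    def contour_features(pts):
        """Global shape summaries for a closed polygon given as an (N, 2)
        array of ordered boundary points."""
        x, y = pts[:, 0], pts[:, 1]
        # Perimeter: sum of edge lengths around the closed contour.
        closed = np.vstack([pts, pts[:1]])
        perimeter = np.sum(np.linalg.norm(np.diff(closed, axis=0), axis=1))
        # Area via the shoelace formula.
        area = 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))
        # Compactness: perimeter^2 / (4*pi*area) = 1 for a circle, larger otherwise.
        compactness = perimeter ** 2 / (4 * np.pi * area)
        # Aspect ratio of the axis-aligned bounding box.
        aspect_ratio = (x.max() - x.min()) / (y.max() - y.min())
        return {"perimeter": perimeter, "area": area,
                "compactness": compactness, "aspect_ratio": aspect_ratio}

Feeding vectors of such features into a simple classifier and comparing its labels to the human animal/leaf judgments is the kind of baseline comparison described above.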
Mary Kathryn Cowles
Faculty: Project PI
You have described your successful effort to create a mathematical model for the shape properties in point clouds. What will the next step of your research be? How do you go from the mathematical model to developing useful code libraries for object recognition?
Edinah Gnang
Thank you for your remark. The next step will consist of trying our model and proposed algorithm on various datasets and implementing efficient libraries for performing spectral analysis of tensors.
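As one possible starting point for such a library, the sketch below computes a higher-order SVD of a third-order tensor via its mode unfoldings. This is a generic decomposition shown for illustration only, not the spectral framework of Gnang, Elgammal & Retakh (2011).

    import numpy as np

    def mode_unfold(T, mode):
        """Unfold a 3rd-order tensor along `mode` into a matrix."""
        return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

    def mode_multiply(T, A, mode):
        """Multiply tensor T along `mode` by matrix A (A.shape[1] == T.shape[mode])."""
        return np.moveaxis(np.tensordot(A, np.moveaxis(T, mode, 0), axes=1), 0, mode)

    def hosvd(T):
        """Higher-order SVD: one orthogonal factor per mode (from the SVD of each
        unfolding) plus the core tensor."""
        factors = [np.linalg.svd(mode_unfold(T, m), full_matrices=False)[0]
                   for m in range(3)]
        core = T
        for m, U in enumerate(factors):
            core = mode_multiply(core, U.T, m)
        return core, factors

    # Sanity check on a small random tensor: the factors reconstruct T exactly.
    T = np.random.rand(4, 5, 6)
    core, factors = hosvd(T)
    R = core
    for m, U in enumerate(factors):
        R = mode_multiply(R, U, m)
    print(np.allclose(R, T))   # True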