Object recognition

For object recognition, the Fezzik project researched and implemented several methods:

  1. Object recognition using ros_markers (used in the final demo)
  2. Object recognition using table_top (not used, but looks promising)
  3. Object recognition using ar_track_alvar (not used, but looks promising)
  4. Object detection using find_object_2d
  5. Object detection using Visp_auto_tracker
  6. Custom object detection using OpenCV with Haar Cascade

Some methods were much more successful than others. Below, each method is explained and guidance for implementation is provided.

1. Object recognition using ros_markers

Summary

ros_markers is a lighter-weight version of ar_track_alvar. It can tell markers apart and, given correct calibration, provide their coordinates.

For Fezzik, this was used to recognize objects and provide coordinates, allowing DE NIRO to grasp the given object.

Pointers

The package is not easily found online, so here is the link: https://github.com/chili-epfl/ros_markers/blob/master/README.md

Hardware Requirements

  • Item 1
  • Item 2

Software Requirements

  • Pre-installed package 1
  • Pre-installed package 2

Installation

Note

TODO

Setup Test

$ roslaunch kinect2_bridge kinect2_bridge.launch
$ rosrun rviz rviz
$ roslaunch ros_markers detect.launch image_topic:=/kinect2/hd/image_color camera_frame_id:=kinect2_rgb_optical_frame

Implementation

  • The MAIN section. Explain the working version of your software.
  • Use directives (attention, note, caution, tip, important, warning, hint, error, and danger) to highlight points appropriately.
  • Screenshots of the working version of your feature tend to be particularly useful

Attention

An attention, example

Limitations

What we did not achieve and how people in future projects could improve

What Didn’t Work

List all things that you have tried, but that did not work in the end. In particular, explain why, and give any details (e.g. links, screenshots) required to reproduce your failure. Signposting failures is as important as trumpeting successes, so do err on the side of more detail in your explanations here!

  • Software that didn’t work 1 and explanation, links, …

2. Object recognition using table_top

Summary

This single module was the one we spent the most time on; still, we never managed to get it to work. Our advice for future groups is therefore to be cautious about how much time you are willing to invest in a package that looks very promising but is, for some reason, difficult to get running.

Note

The above lines are a quote from someone who surrendered. But we never surrender, NEVER!

Hardware requirements

  • Item 1
  • Item 2

Software requirements

  • Pre-installed package 1
  • Pre-installed package 2

Installation

  1. pip install ...
  2. sudo apt-get ...

For further information, visit the object recognition core installation page.

Attention

When visiting the installation page, make sure to select the ROS buttons so that the correct installation commands are displayed (this is not selected by default).

Setup tests

  • Test step 1
  • Test step 2

Implementation

Attention

An attention, example

Limitations

What didn’t work

  • Software that didn’t work 1 and explanation, links, …

3. Object recognition using ar_track_alvar

Overview

This is a broader (and heavier) version of the ros_markers package. Fezzik did not spend much time implementing this package, but while researching it to decide between this package and ros_markers, it became clear that future groups might find it relevant.

Details

The first use case for this package is to identify and track the poses of (possibly) multiple AR tags that are each considered individually. The node individualMarkers takes the following command line arguments:

  1. marker_size (double) – The width in centimeters of one side of the black square marker border
  2. max_new_marker_error (double) – A threshold determining when new markers can be detected under uncertainty
  3. max_track_error (double) – A threshold determining how much tracking error can be observed before a tag is considered to have disappeared
  4. camera_image (string) – The name of the topic that provides camera frames for detecting the AR tags. This can be mono or color, but should be an UNrectified image, since rectification takes place in this package
  5. camera_info (string) – The name of the topic that provides the camera calibration parameters so that the image can be rectified
  6. output_frame (string) – The name of the frame that the published Cartesian locations of the AR tags will be relative to

individualMarkers assumes that a Kinect is being used as the camera, so that depth data can be integrated for better pose estimates. If you are not using a Kinect or do not wish to use the depth-data improvements, use individualMarkersNoKinect instead.
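These arguments are usually passed via a launch file. A minimal sketch, modelled on the example launch files shipped with ar_track_alvar (the marker size and topic names are placeholders to adapt to your camera setup; the Kinect topics here follow the ones used in the ros_markers setup test above):

```xml
<launch>
  <arg name="marker_size"          default="4.4" />
  <arg name="max_new_marker_error" default="0.08" />
  <arg name="max_track_error"      default="0.2" />
  <arg name="cam_image_topic"      default="/kinect2/hd/image_color" />
  <arg name="cam_info_topic"       default="/kinect2/hd/camera_info" />
  <arg name="output_frame"         default="/kinect2_rgb_optical_frame" />

  <node name="ar_track_alvar" pkg="ar_track_alvar" type="individualMarkers"
        respawn="false" output="screen"
        args="$(arg marker_size) $(arg max_new_marker_error) $(arg max_track_error) $(arg cam_image_topic) $(arg cam_info_topic) $(arg output_frame)" />
</launch>
```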

4. Object detection using find_object_2d

Overview

This package works, but was not sufficient for our task. In general, a great lesson from Fezzik is to thoroughly understand a package before trying to implement it. This package only performs detection, and although that sounds quite similar to recognition, the two are quite different tasks. The advantage of this package, however, is that it can identify objects in real time: it works via feature detection and therefore does not need to be trained, identifying an object almost immediately after an image of that object is provided.

Implementation

  1. Download the package into catkin_ws/src: git clone https://github.com/introlab/find-object.git src/find_object_2d
  2. Go inside catkin_ws
  3. Run catkin_make once to initialize the package
  4. Run the Kinect with: roslaunch kinect2_bridge kinect2_bridge.launch
  5. Go to the root folder of catkin_ws and run source devel/setup.bash
  6. roslaunch find_object_2d find_object_3d_kinect2.launch
  7. In the GUI, go to Edit and take a picture. Make sure it is a good one, then add objects from the scene. Note: it has a hard time (seemingly at random) recognizing some objects

5. Object detection using Visp_auto_tracker

Overview

This marker detector works and is quite easy to implement. However, it does not seem to allow for recognizing different markers, i.e. it can only be used to detect that there is a marker in the image/video input and not to tell two different markers apart.

Implementation

  1. Run catkin_make
  2. Run source devel/setup.bash
  3. Run the Kinect: roslaunch kinect2_bridge kinect2_bridge.launch
  4. roslaunch marker_recognition marker_recognition.launch

6. Custom object detection using OpenCV with Haar Cascade

Overview

The Haar cascade methodology for detecting faces worked very well for our project, so we tried to use the same methodology to detect custom objects. The process is quite well documented (see below), but for some reason our implementation never worked particularly well when tested. Pre-trained models (e.g. those for face and/or eye detection) worked fine, but when trained on custom objects the detector performed quite poorly.

Remarks

The process for training a custom model is documented very well and is quite interesting, but it is also relatively time consuming. It involves downloading >1000 pictures using a Python script, resizing the images, building your own positive samples (images that include the object) and finally training the model (which takes time if you don't have access to a GPU).

Testimony

The pictures below show the results when the model was trained to detect a cup and tested on a randomly chosen video.

_images/cup1.png
_images/cup2.png

As evident from the pictures, the custom-trained model is much too sensitive.

Resources

Very useful video guidance to make it work: https://www.youtube.com/watch?v=88HdqNDQsEk