Over the last few months, I've been working on a school project with two other students in the course "Image processing and pattern recognition". The goal was to steer a small robot through an obstacle course using nothing but the images taken by a camera mounted in the ceiling.
The project was done entirely in MATLAB, with some code supplied by the school to control the robot, and some proof-of-concept code written by me in Python. Now, the project is over, and I thought I'd go through the solution we came up with here.
Detecting the robot and obstacles
To find the robot and the obstacles on the field, we decided to use a "base" image of the field when it's empty as a way to discern what's supposed to be there and what isn't. We then take a picture of the field with the robot in the starting position, move the robot forward a little bit, then take another picture.
Both pictures are then subtracted from the base picture and converted to binary images, where every pixel with a value above 2 becomes white and the rest become black. Then we apply a filtering step (morphological opening followed by closing, for those in the know) that removes whatever noise specks are left and connects certain disconnected pieces of the obstacles. Now we have enough information to discern what is an obstacle and what is the robot, as well as which direction the robot is facing.

Binary image

Morphologically filtered image
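To make the thresholding and filtering steps a bit more concrete, here is a minimal pure-Python sketch. It is not the project's MATLAB code: the function names, the use of an absolute difference, and the 3x3 structuring element are all assumptions made for illustration. Only the threshold of 2 and the opening-then-closing order come from the description above.

```python
def threshold_difference(image, base, level=2):
    """Binary mask: 1 where |image - base| is above `level`, else 0.

    Using the absolute difference is an assumption of this sketch.
    """
    return [[1 if abs(p - b) > level else 0 for p, b in zip(row, base_row)]
            for row, base_row in zip(image, base)]


def _morph(mask, need_all):
    """3x3 erosion (need_all=True) or dilation (need_all=False)."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Neighbours outside the image count as background (0).
            window = [mask[j][i] if 0 <= j < h and 0 <= i < w else 0
                      for j in range(y - 1, y + 2)
                      for i in range(x - 1, x + 2)]
            out[y][x] = int(all(window) if need_all else any(window))
    return out


def erode(mask):
    return _morph(mask, True)


def dilate(mask):
    return _morph(mask, False)


def open_then_close(mask):
    """Opening removes small specks, closing bridges small gaps."""
    opened = dilate(erode(mask))
    return erode(dilate(opened))
```

On a real camera image you would want a library routine rather than nested loops, but the logic is the same: erode then dilate to drop the noise, dilate then erode to join nearby pieces.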
To find the obstacles, we just look at which pixels stay white in both pictures (a simple AND operation between the two images); the object that has moved is then the robot. To find the position and angle of the robot, all you need to do is see how far, and in which direction, the center of the moving object has shifted between the two pictures.
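Sketched in Python, the AND trick and the center comparison could look roughly like this. The names `centroid` and `split_scene` are made up for this example, and it assumes the robot has moved far enough that the leftover (non-overlapping) parts of its two silhouettes are non-empty. Angles are in image coordinates, so y points down.

```python
import math


def centroid(mask):
    """Mean (x, y) of all white pixels; raises ValueError if there are none."""
    points = [(x, y) for y, row in enumerate(mask)
              for x, value in enumerate(row) if value]
    if not points:
        raise ValueError("mask contains no white pixels")
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def split_scene(mask_before, mask_after):
    """Return (obstacle mask, robot position, robot heading in radians)."""
    # Pixels that are white in both pictures did not move: obstacles.
    obstacles = [[a & b for a, b in zip(row_a, row_b)]
                 for row_a, row_b in zip(mask_before, mask_after)]
    # Whatever is left in each picture belongs to the moving robot.
    robot_before = [[a & (1 - o) for a, o in zip(row_a, row_o)]
                    for row_a, row_o in zip(mask_before, obstacles)]
    robot_after = [[b & (1 - o) for b, o in zip(row_b, row_o)]
                   for row_b, row_o in zip(mask_after, obstacles)]
    x0, y0 = centroid(robot_before)
    x1, y1 = centroid(robot_after)
    # The robot drove forward, so the shift of the center gives the heading.
    heading = math.atan2(y1 - y0, x1 - x0)
    return obstacles, (x1, y1), heading
```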
Path-finding
What remains before we can even start moving the robot is to figure out which way it should go, so that it reaches the goal and avoids the obstacles. To do this, we decided to lean on the juggernaut of path-finding known as A*, favourite of game devs.
Before we could apply this, though, we had to convert the field image to a more game-like "tile-set". So we basically go over the obstacles-only image, split it into tiles of a given size, and count how many white pixels each tile contains. If the count reaches a threshold (we got the filtering down so nicely we could use a threshold of 1 pixel), the tile is marked as uncrossable (a "wall") in the resulting tile matrix; otherwise it's walkable.

Resulting tile matrix
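The tile conversion is simple enough to show in a few lines. This is a rough Python sketch, not the project code; `to_tiles` and its parameters are names picked for the example, and treating "reaches the threshold" as `>=` is an assumption.

```python
def to_tiles(mask, tile_size, threshold=1):
    """Collapse a binary obstacle mask into a coarse wall/walkable grid.

    Returns a matrix with 1 for wall tiles and 0 for walkable tiles.
    """
    height, width = len(mask), len(mask[0])
    tiles = []
    for top in range(0, height, tile_size):
        row = []
        for left in range(0, width, tile_size):
            # Count white pixels in this tile (edge tiles may be smaller).
            white = sum(mask[y][x]
                        for y in range(top, min(top + tile_size, height))
                        for x in range(left, min(left + tile_size, width)))
            row.append(1 if white >= threshold else 0)
        tiles.append(row)
    return tiles
```

With a threshold of 1, a single stray white pixel turns a whole tile into a wall, which is why the noise filtering has to be solid before this step.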
We then run the A* algorithm over the tile matrix, with the current robot position and the goal position, to get the path as a sequence of nodes, one per tile to cross. As this isn't exactly the best path to follow (way too many stops and turns), we then apply a further algorithm to smooth the path.
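For reference, a compact A* over such a tile matrix might look like the sketch below. It assumes 4-connected movement with a cost of 1 per step and a Manhattan-distance heuristic; those choices, and the function name `astar`, are assumptions of this example and not necessarily what the project used.

```python
import heapq


def astar(tiles, start, goal):
    """Shortest 4-connected path from start to goal, both (row, col).

    `tiles` holds 1 for walls and 0 for walkable tiles. Returns the path
    as a list of (row, col) nodes, or None if the goal is unreachable.
    """
    def heuristic(node):
        return abs(node[0] - goal[0]) + abs(node[1] - goal[1])

    rows, cols = len(tiles), len(tiles[0])
    best_cost = {start: 0}
    came_from = {start: None}
    frontier = [(heuristic(start), 0, start)]
    while frontier:
        _, cost, node = heapq.heappop(frontier)
        if node == goal:
            # Walk the parent links back to the start.
            path = []
            while node is not None:
                path.append(node)
                node = came_from[node]
            return path[::-1]
        if cost > best_cost[node]:
            continue  # stale queue entry, a cheaper route was found already
        r, c = node
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and tiles[nr][nc] == 0:
                new_cost = cost + 1
                if new_cost < best_cost.get(nxt, float("inf")):
                    best_cost[nxt] = new_cost
                    came_from[nxt] = node
                    heapq.heappush(
                        frontier, (new_cost + heuristic(nxt), new_cost, nxt))
    return None
```

Because movement is restricted to neighbouring tiles, the resulting path is full of short hops and right-angle turns.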
Path-smoothing
To do this, we chose to use a method we call "ray-casting" to weed out unnecessary path nodes. Imagine we have a path as detailed in this picture, with the start and end nodes marked as "valid" (green):

We start with the first node, draw a line (ray) to the next node in the path, then draw two more lines parallel to it at a distance representing the robot's width, like so:

We keep doing this for each following node of the path, until any of these lines collides with an obstacle. When this happens, we know that this node cannot be reached directly from the node we're checking from, and as such we need to mark the previous node as valid. The check then continues from that newly marked node.

After we have done this for all the nodes, and reached the end node, we are left with a path that has significantly fewer nodes, as well as more direct paths where there was a stepladder effect previously.
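Here is a rough Python sketch of the ray-casting idea described above. It works in tile units, approximates each ray by sampling points along it (so it can miss a very thin corner clip), and treats anything outside the grid as blocked. The names `ray_clear`, `corridor_clear` and `smooth_path`, the `half_width` parameter, and the sampling step are all assumptions of this example.

```python
import math


def ray_clear(p, q, tiles, step=0.1):
    """True if sampled points on the line p -> q hit no wall tile.

    p and q are (row, col) positions in tile units (floats).
    """
    dist = math.hypot(q[0] - p[0], q[1] - p[1])
    n = max(1, int(dist / step))
    for i in range(n + 1):
        t = i / n
        r = math.floor(p[0] + t * (q[0] - p[0]))
        c = math.floor(p[1] + t * (q[1] - p[1]))
        inside = 0 <= r < len(tiles) and 0 <= c < len(tiles[0])
        if not inside or tiles[r][c]:
            return False
    return True


def corridor_clear(a, b, tiles, half_width):
    """Check the center ray plus two parallel rays offset by half_width."""
    p = (a[0] + 0.5, a[1] + 0.5)  # tile centers
    q = (b[0] + 0.5, b[1] + 0.5)
    dr, dc = q[0] - p[0], q[1] - p[1]
    length = math.hypot(dr, dc)
    if length == 0:
        return True
    # Normal to the ray, scaled to the offset distance.
    nr, nc = -dc / length * half_width, dr / length * half_width
    return all(
        ray_clear((p[0] + s * nr, p[1] + s * nc),
                  (q[0] + s * nr, q[1] + s * nc), tiles)
        for s in (-1, 0, 1))


def smooth_path(path, tiles, half_width):
    """Drop every node that can be skipped without the rays hitting a wall."""
    if len(path) < 2:
        return list(path)
    kept = [path[0]]
    anchor = path[0]
    for previous, node in zip(path[1:], path[2:]):
        if not corridor_clear(anchor, node, tiles, half_width):
            # `node` is not directly reachable, so `previous` must stay.
            kept.append(previous)
            anchor = previous
    kept.append(path[-1])
    return kept
```

On an empty field, a stepladder path between two opposite corners collapses into just its start and end nodes.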

The last mile
Now, finally, we can move the robot through the path we have created. This is done by sending commands to the robot to rotate, and move a certain distance towards the next node in the path. As the robot isn't entirely precise, we decided to not let it move too far per movement command, so it wouldn't veer off course too much, and would be given a chance to "compensate" as it moves towards the goal.
The movement is tracked by taking a new picture with the camera and running the same filtering operations as before to find the new robot position and angle, so that we can guide it closer to the node as it approaches. If all goes as planned, it should eventually reach the goal position without colliding with any of the obstacles!
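A single step of this rotate-then-move loop can be sketched as follows. The real robot was driven through code supplied by the school, which isn't shown here, so `next_command` and its parameters are purely hypothetical. The sketch only computes how far to turn and how far to drive, with the distance capped so the robot gets re-measured often.

```python
import math


def next_command(position, heading, target, max_step):
    """Return (turn, distance) for one movement command.

    position and target are (x, y); heading and turn are in radians.
    The turn is wrapped into [-pi, pi) so the robot rotates the short way.
    """
    dx = target[0] - position[0]
    dy = target[1] - position[1]
    bearing = math.atan2(dy, dx)
    turn = (bearing - heading + math.pi) % (2 * math.pi) - math.pi
    # Never drive further than max_step before taking a new picture.
    distance = min(math.hypot(dx, dy), max_step)
    return turn, distance
```

After each command, a new picture gives a fresh position and heading, and the function is called again with the same target until the robot is close enough to move on to the next node.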
