# D2: Travelling Salesman Problem

After passing your D2 examinations, you decided to become a Travelling Salesman. This involves starting from home and visiting all the towns in your manor and then returning home. Here is a map of your manor (you live at E).

The Travelling Salesman Problem is then... what is the minimum distance you must travel in order to visit each town and return home?

Note that this is different to the Chinese Postman Problem where the task
is to go down every *arc* once. You probably can find the best route for this map but for a larger map with loads
of towns there is no simple algorithm. I say *simple* because obviously we could find every possible way of visiting
each town and select the shortest overall route; but unfortunately this method quickly grows to zillions of possible routes (it
grows like *n!*).

## Upper & Lower Bounds

Because there is no known algorithm that is simple enough to use, we try to find lower and upper bounds. The lower
bound is the minimum that we have to walk and the upper bound is the maximum. If we find a lower bound of 23 and an upper bound
of 25, the the answer must be between these two numbers. Hopefully it is clear that we want the lower bound to be as *high*
as we can find and the upper bound to be as *low* as we can find. Even better if they are the same number as that *must*
be the answer to the problem.

### Upper Bound

There are two ways to quickly find a good upper bound. The upper bound is the most that we have to walk. The first way is to find a minimum spanning tree.

Now start at home (E) and go to each town before returning home *but for now I insist that you use only the arcs
on the minimum spanning tree.* What do you notice?

If we draw the minimum spanning tree on a straight line, it sometimes helps to see what *shortcuts* we could take.

If we went from E to A to B to C to D, we could return straight home to E and save time.

Sometimes it's not so obvious where the shortcut is. There might be a few to consider and this causes problems. We'll see another method for finding an upper bound later.

### Lower Bound

Now we shall seek a lower bound. Here is our town map with the shortest route marked.

Imagine now that town A didn't exist. The red route for the new map without A is just the minimum spanning tree for the remaining vertices. Now imagine that B doesn't exist instead. What can you see? Try for the other vertices.

If we now reconnect the vertex A using the two shortest lengths, we get the best (red) route. However if we did the same
thing for the vertex B, the minimun spanning tree for what's left is CD DE EA (= 43 in total). When we connect B to this by the
two shortest arcs (BC BD), the total is *smaller* than the red route.

This gives us a method for finding a *lower* bound.

- Delete a vertex and find the minimum spanning tree for what remains
- Reconnect the vertex you deleted using the two smallest arcs
- Repeat this process for all vertices and select the
*highest*as the best lower bound

But we can shorten this whole process. If in part 1 we find a minimum spanning tree that is just a straight line
(like when we delete A, the minimum spanning tree is BCDE) **and** when we reconnect the deleted vertex (A) it joins to the
two ends of the line to make a circuit (ie B and E), then the result is a lower bound that actually is a proper route. Think about it -
we have found a route that is equal to the lowest that the best route can be. We have found the best route!

## The Nearest Neighbour Algorithm

Another way of finding an upper bound is to use the Nearest Neighbour Algorithm. This is a basic, common sense algorithm. You start at home and travel to the closest town that you have not yet visited. When you have visited all of the towns, you return home directly.

It is the last bit that can prove to be a long journey - the trip home. For this reason, this algorithm is not perfect but it is straightforward and can be done on a matrix - like Prim's Algorithm but you only consider the latest column of numbers.

## Practical vs Classical Problem

In the matrix in the video clip, every town was connected directly to every other. This is called a *complete* graph.
In reality, this is not always the case (as in the example in the diagrams above). This is a difference between a *classical*
TSP and a *practical* one. The other difference is that in reality it might not always be shorter to go directly from one
town to another - sometimes it's shorter to go through another town to get there. This last bit is called the *triangle
inequality* and always holds for the classical problem but not always for the practical one.

In the diagram above, it is quicker to go from A to C to B (4+2=6) than it is to go from A to B (7). You can't draw a triangle with these lengths (try it if you don't beleive me!) so the triangle inequality does not hold for this bit of graph. This must be a practical rather than classical problem.

### Converting Practical to Classical

In reality, not all town are connected directly and the triangle inequality might not work (this is because even direct roads might curve around a bit). We can convert such a problem into a classical problem by doing 2 things...

- If there is not a direct route between 2 towns, pretend there is by using the shortest indirect route
- If the direct route is shorter than an indirect route, replace it with the value of the indirect route. This is because you'd never go direct if the indirect route was shorter.

The following graph is not a classical one as both rules are broken

In matrix form this is...

A | B | C | D | |
---|---|---|---|---|

A | - | 7 | 4 | 5 |

B | 7 | - | 2 | 4 |

C | 4 | 2 | - | - |

D | 5 | 4 | - | - |

- There is no route from C to D. The shortest indirect route is 6 so add that to the matrix
- It is quicker to go from A to C to B than it is to go directly, so replace the 7 in the matrix with the shorter route (6).

A | B | C | D | |
---|---|---|---|---|

A | - | 6 | 4 | 5 |

B | 6 | - | 2 | 4 |

C | 4 | 2 | - | 6 |

D | 5 | 4 | 6 | - |

Now we solve the problem using the usual methods for the classical problem (ie find upper bound, lower bound &c.)

When we have solved the problem, remember to say what the *actual* route is and not the *pretend* route we've
just made up. In the practical problem you often revisit towns - something that cannot happen in the classical problem.