Best First Search (Informed Search)

Best First Search is a heuristic search algorithm that selects the most promising node for expansion based on an evaluation function. It prioritizes nodes in the search space using a heuristic to estimate their potential. By iteratively choosing the most promising node, it aims to efficiently navigate towards the goal state, making it particularly effective for optimization problems.

Best First Search Algorithm

Best First Search is a heuristic search algorithm that explores a graph by expanding the most promising node first, according to a specified evaluation function. It continuously selects nodes based on their estimated cost to reach the goal, typically using a priority queue. BFS is suitable for finding paths in graphs and optimization problems where informed decision-making is crucial.

Here's a concise explanation of the Best First Search algorithm in step-by-step form:

Start with an initial node and add it to the open list.
While the open list is not empty.
- Select the node with the lowest estimated cost (based on a heuristic function) from the open list.
- If the selected node is the goal, return the solution.
- Otherwise, expand the selected node by generating its successors.
- Evaluate each successor node using the heuristic function and add them to the open list.
If the open list becomes empty without reaching the goal, the search fails.

What is a Heuristic Function?

Heuristic function (or simply heuristic) is nothing but a shortcut to solve a given problem (by making approximations) when either the exact solution is non-existing or it takes huge time to find the same. You will always see the terms admissibility and consistency associated with heuristics, let us understand what they are -

Admissibility - A heuristic function is said to be admissible if it never overestimates the cost/price of reaching the target. In other words, it can be said that the estimated cost should be less than or equal to the lowest possible cost.
Consistency - In searching related problems, a heuristic function is said to be consistent if its estimate to reach the target is always less than or equal to the sum of estimated cost from any of its neighboring points and the cost of reaching that neighbor from current position.

Majorly we have two types of best first search algorithms -

Greedy Best First Search
A* Search Algorithm

Approach 1: Greedy Best First Search Algorithm

In the greedy best first algorithm, we select the path that appears to be the most promising at any moment. Here, by the term most promising we mean the path from which the estimated cost of reaching the destination node is the minimum.

To decide which path is the most promising we will use the heuristics function (say $h(n)$ ) where $h(i)$ denotes the estimated cost of reaching the destination node from the $i^{th}$ node. Note that, the value of $h(\text{dest})$ will be 0 because no cost is incurred to reach the destination if we are already there.

Algorithm

The algorithm of the greedy best first search algorithm is as follows -

Define two empty lists (let them be openList and closeList).
Insert src in the openList.
Iterate while the open list is not empty, and do the following -
- Find the node with the least h value that is present in openList (let it be node).
- Remove the node from the openList, and insert it in the closeList.
- Check for all the adjacents nodes of the node, if anyone of them equals the dest. If yes, then terminate the search process as we've reached the destination.
- Otherwise, find all the adjacents of the node that are neither present in openList nor closeList and insert them in the openList.
If we have reached here, it means there is no possible path between src and dest.

Dry Run

Let's say we have the following graph, in which src and dest nodes are highlighted, and heuristic values for each of the nodes are written against them in the form of the table.

Step 1 - Initialize two empty lists $i.e.$ openList and closeList.
Step 2- Insert src in the openList, after which we have - openList = [src] closeList = []
Step 3 - (Iterating while the open list is not empty) -
- Iteration 1 -
  - After this operation we will have - openList = [1, 2, 3] closeList = [src]
- Iteration 2 -
  - After this operation we will have - openList = [1, 2, 7, 8] closeList = [src, 3]
- Iteration 3 -
  - Upon exploring the adjacents of node 7 we found that dest is one of them. Hence, we return that search result in success, and print the path $\text{src}\rightarrow 3 \rightarrow 7 \rightarrow\text{dest}$ .

Java Implementation

Output:

C++ Implementation

Output:

Python Implementation

Output:

Complexity Analysis

Time Complexity - In the worst situation, we may require to traverse through all the edges to reach the destination. Hence in the worst case, the time complexity will be $O(E\log{V})$ .
Space Complexity - In the worst scenario, we may have all the vertices (nodes) in openList which will make the worst case space complexity $O(V)$ .

Approach 2: A* Search Algorithm

What is A* Search Algorithm?

A* Search algorithm is a searching algorithm that searches for the path with the least cost between the initial and final state. It is one of the best and most popular methods out there, that are used in path-finding in graphs.

Consider a 2-dimensional board containing a source cell (colored green), a destination cell (colored blue), and some obstacles (colored red).

How Does the A* Algorithm Work?

Before continuing with the actual working of the A* algorithm let's have a look at some key terms:

$g$ -- It is the cost incurred in moving from the initial cell (src) to the current cell.
$h$ -- It is known as the heuristic value, it is the estimated cost of reaching the final cell (dest cell) from the current cell.
$f$ -- It is the sum of the values of $g$ and $h$ . Hence, $f=g+h$

To make the decision (where to go in the next step), the algorithm takes the $f's$ value into account. Greedily, it selects the minimum possible $f$ value to process the current cell.

Example of A* Algorithm

Let's assume we have the following board for the example with the green-colored cell as the source node and the blue-colored cell as the destination node, given that it is prohibited to pass any red-filled cell.

The heuristic values when calculated will be like -

$h=\Delta x + \Delta y - min(\Delta x,\Delta y)$

So steps involved in finding the path with minimum cost from the source node to the destination node will be -

Step 1 - From the cell $(0,0)$ we search for the cell with the least heuristic value. We have three choices $i.e.$ $(0,1),$ $(1,1),$ and $(2,1)$ with heuristic values as $4$ . We can pick any one of them, let's say we picked $(1,1)$ . Since we are moving at each step, therefore we will increase the value of $g$ after each step.
Step 2 - From cell $(1,1)$ , we jump to $(0,2)$ .
Step 3 - Again at this cell, we have only one choice $i.e.$ to go to cell $(1,3)$ .
Step 4 - From cell $(1,3)$ , we jump to $(2,4)$ .
Step 5 - Finally we reach at the cell $(2, 5)$ .

Implementation

Steps to implement the code -

Initialize two lists of Node type (say openList and closedList)
Add src node in openList.
Till openList is not empty do the following
- Find the node in the openList with the least f value (say x)
- Remove x from openList
- Find and add all the 8 successors of x and add them to openList.
- For each of the successors (say y) do the following -
  - If y is the destination then stop the search.
  - Otherwise, calculate g and h for y. y.g = x.g + distance between y and x (it is 1 in our case) y.h = distance between destination and y (Here we will be using Euclidean distance for the same). y.f = y.g + y.h
  - If a node with the same position as y and lower f is present in openList then skip this successor.
  - Else If a node with the same position as y exists in closedList with smaller f than y.f skip this successor.i6
  - Otherwise, add the node (y) to the openList.
- Push x is the closedList.

Implementing A* Algorithm in Java

Output:

Implementing A* Algorithm in C++

Output:

Implementing A* Algorithm in Python

Output:

Complexity Analysis

Time Complexity - The time it takes for the A* search algorithm to find a solution depends on how many options each step has (the branching factor) and how deep the solution is. We express this time complexity as $O(b^d)$ , where b is the branching factor and d is the depth of the solution.
Space Complexity - In the worst scenario, we may have all the vertices (nodes) in openList which will make the worst case space complexity be $O(b^d)$ .

Conclusion

Best First Search (BFS) efficiently navigates search spaces using heuristic evaluation.
Greedy Best First Search prioritizes paths based solely on heuristic estimates, sacrificing optimality for speed.
A Algorithm* combines actual and estimated costs, ensuring optimality in pathfinding tasks.
Greedy BFS is simpler and faster but may not always find the optimal solution.
A* guarantees optimality by considering both current and estimated costs.
The choice between Greedy BFS and A* depends on the balance between solution quality and computational efficiency.