Duplicate Subtree in Binary Tree

Duplicate Subtree in Binary Tree


So, in this question “Duplicate Subtree in Binary Tree”, we have to tell whether there is a repeated subtree in a tree or not. If there is, then the output should be yes or else no.


Find duplicate subtrees solution:

#include <bits/stdc++.h>
using namespace std;
// This is a structure for a binary tree node
struct Node {
    char key;
    Node *left, *right;
// function to create a new node
Node* createNewNode(char key)
    Node* node = new Node;
    node->key = key;
    node->left = node->right = NULL;
    return node;
unordered_set<string> newSubtrees;
// This function will returns true if tree contains a duplicate subtree

bool findDuplicate(Node* root)
    // A new var to store subtrees
    set<string> newSubtrees;

    // Used to traverse tree
    queue<Node*> bfs;
    while (!bfs.empty()) {
        Node* n = bfs.front();
        // To store the left and the right
        // children of the current node
        char l = ' ', r = ' ';
        // If the node has a left child
        if (n->left != NULL) {
            l = n->left->key;
            // Push left node's data
        // If the node has a right child
        if (n->right != NULL) {
            r = n->right->key;
            // Push right node's data
        string subt;
        subt += n->key;
        subt += l;
        subt += r;
        if (l != ' ' || r != ' ') {
            // If this subtree count is greater than 0
            // that means duplicate exists
            if (!newSubtrees.insert(subt).second) {
                return true;
    return false;
// Driver code
int main()
    Node* root = createNewNode('A');
    root->left = createNewNode('B');
    root->right = createNewNode('C');
    root->left->left = createNewNode('D');
    root->left->right = createNewNode('E');
    root->right->right = createNewNode('D');
    root->right->right->right = createNewNode('E');
    root->right->right->left = createNewNode('D');
    cout << (findDuplicate(root) ? "Yes,there is a duplicate subtree" : "No,there is no duplicate subtree");
    return 0;

Here, this is the solution to finding a Duplicate subtree in Binary Tree, which is a very famous problem statement and everyone has already seen this problem statement in many coding competitions, not exactly but somehow related problems like this.

Explanation of the “Duplicate Subtree in Binary Tree” Code:

  1. So, the code starts by including the necessary headers.
  2. After this, we have to define a structure Node to represent a node in the binary tree. And each node has a value of type character( or any other primitive data type) and pointers to its left and right children nodes.
  3. Now, we will create a createNewNode function that creates a new node with the given key value. After that, it will allocate memory for the node, sets its value, and initializes its left and right pointers to the “NULL” value. And this function will return the newly created node.
  4. Now, we will declare an unordered_set newSubtrees(name can be any) to store the serialized representations of subtrees encountered during the traversal. This set will help in detecting duplicate subtrees. So, unordered_set<string>(type can be any) is a data structure in C++ that represents an unordered set of unique elements. 
  5. There is a findDuplicate function which is the main function to check whether there is any duplicate subtree or not. This function takes the root of the binary tree as input and returns true if it finds a duplicate subtree. If there is no duplicate subtree, it returns false.
  6. So, inside the findDuplicate function, we will make a set of type strings named subtrees. This field will store the serialized representations of subtrees encountered during the breadth-first search (BFS) traversal.
  7. After this, we will make a queue bfs and will initialize it with the root node and used that field for the BFS traversal of the tree.
  8. The BFS traversal is performed using a while loop and the transverse will continue until the queue is empty.
  9. In each iteration of the loop, we dequeue the front node of the queue and assign it to the new variable “n”.
  10. Now, we will initialize new characters named as l and r with space (‘ ‘) to represent the keys of the left and right children node of the current node. Likewise, if the current node has a left child, its key is assigned to l variable, and the left child is enqueued in the BFS queue.
  11. If the current node has a right child, we assign its key to the variable ‘r’ and enqueue the right child in the BFS queue.
  12. After this, we construct the serialized representation of the subtree rooted at the current node by concatenating the key of the current node with ‘l’ and ‘r’.
  13. If the current node has only one child, this indicates a non-leaf node. So, after that, we will check the serialized subtree representation for duplication.
  14. If inserting the representation into the set returns false, it means a duplicate subtree exists, and true is returned from the function.
  15. Finally, if we don’t find any duplicate subtree during the traversal, we return false.
  16. Furthermore, In the main function, we create a sample binary tree by manually constructing the nodes and connecting them.
  17. The dupSubUtil function will call the root of the tree one by one, and the return node value will be printed.
  18. If a duplicate subtree is found during the traversal, “Yes, there is a duplicate subtree” is printed; otherwise, “No, there is no duplicate subtree” is printed.


In summary, we can say that the overall purpose of the code is to check whether a binary tree contains a duplicate subtree or not. It uses a BFS traversal to serialize each subtree and checks for duplicates by storing the serialized representations in a set.

About Us

Myself Bharath Choudhary, software developer at Oracle.
2021 NIT Warangal graduate.


Recent Blogs

Quick Contact


Saturday – Sunday
10 AM – 5 PM

Follow Us :