As discussed in the previous blog post, the fundamental matrix approach does not meet the requirements. Below are the proposals for the final evaluation.
1. Track the camera with pose transformations.
2. Estimate the height and relate it to the distance at which the camera is standing.
3. Correlate the terrain patches with the position on which the object is standing.
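
To make the first proposal concrete, here is a minimal sketch of tracking a camera by chaining pose transformations. It is only an illustration, not the project's implementation. The helper names `make_transform` and `accumulate_poses` are hypothetical, and the sketch assumes each frame-to-frame transform is a 4x4 homogeneous matrix that expresses the new camera frame in the previous camera frame. In practice those relative transforms would come from an odometry source rather than being hand-written.

```python
import numpy as np

def make_transform(yaw_rad, translation):
    """Build a 4x4 homogeneous transform from a yaw (rotation about z) and a 3D translation.

    Hypothetical helper for illustration only.
    """
    c, s = np.cos(yaw_rad), np.sin(yaw_rad)
    T = np.eye(4)
    T[:3, :3] = np.array([[c, -s, 0.0],
                          [s,  c, 0.0],
                          [0.0, 0.0, 1.0]])
    T[:3, 3] = translation
    return T

def accumulate_poses(relative_transforms):
    """Chain frame-to-frame transforms into poses expressed in the first frame.

    Assumption: each T_rel maps the new camera frame into the previous one,
    so the running pose is updated by right-multiplication.
    """
    pose = np.eye(4)
    trajectory = [pose.copy()]
    for T_rel in relative_transforms:
        pose = pose @ T_rel
        trajectory.append(pose.copy())
    return trajectory

# Four identical steps: move 1 unit forward along the local x axis and turn 90 degrees.
steps = [make_transform(np.pi / 2, [1.0, 0.0, 0.0])] * 4
trajectory = accumulate_poses(steps)

# Print the camera position after each step.
for i, pose in enumerate(trajectory):
    print(i, np.round(pose[:3, 3], 3))
```

The four steps trace a unit square, so the last pose should coincide with the first. A closed loop like this is a quick sanity check that the multiplication order matches the chosen pose convention.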
The plan is to achieve the above on a recorded video using rtabmap_ros.
I will wait until this is implemented and then publish a longer blog post.