Monocular Depth Estimation (MDE) aims to infer 3D structure from a single 2D image and is critical for applications such as autonomous navigation and augmented reality. Despite recent advances, comprehensive evaluations of model accuracy and efficiency remain limited. This thesis addresses this gap by systematically evaluating state-of-the-art metric and relative MDE models across diverse datasets, measuring error rates, computational performance, and the effects of fine-tuning. All models evaluated in this thesis predict depth from RGB images alone, without additional inputs such as camera intrinsics. The results reveal significant trade-offs between accuracy and speed. Metric models achieved high precision after fine-tuning, but their pre-trained configurations generalized poorly to challenging domains. Relative models demonstrated superior zero-shot accuracy across all datasets, indicating their suitability for a broad range of domains. Fine-tuning consistently improved metric model accuracy and often enhanced performance. Statistical analysis confirmed that the observed improvements were statistically significant.