Research Overview
Our research focuses on developing novel computer vision and machine learning techniques for geographic location inference from visual data.
NaviSense is our proprietary transformer-based architecture, currently in development, that combines visual embeddings with geospatial priors. The model is being trained to predict location by learning hierarchical representations of architectural styles, environmental features, and urban patterns, with the goal of reaching state-of-the-art accuracy.
Captures both fine-grained architectural details and broad environmental context
Learns spatial relationships between visual features and geographic coordinates
Pre-trained on millions of geotagged images for robust generalization
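As a rough illustration of what combining visual evidence with a geospatial prior can look like, the sketch below reweights per-cell visual probabilities by a prior over the same cells and renormalizes. It is not the NaviSense implementation. The cell names, the probabilities, and the `fuse_with_prior` helper with its `prior_weight` parameter are hypothetical, chosen only to show the idea.

```python
def fuse_with_prior(visual_probs, prior, prior_weight=1.0):
    """Combine visual cell probabilities with a geospatial prior.

    Computes posterior[c] proportional to visual_probs[c] * prior[c] ** prior_weight.
    Cells missing from the prior are treated as having prior mass 0.
    """
    unnormalized = {
        cell: p * prior.get(cell, 0.0) ** prior_weight
        for cell, p in visual_probs.items()
    }
    total = sum(unnormalized.values())
    if total == 0:
        # The prior rules out every candidate; fall back to the visual scores.
        return dict(visual_probs)
    return {cell: value / total for cell, value in unnormalized.items()}


# Hypothetical numbers: the visual model is unsure, the prior disfavors open ocean.
visual_probs = {"cell_paris": 0.5, "cell_lyon": 0.3, "cell_ocean": 0.2}
prior = {"cell_paris": 0.6, "cell_lyon": 0.39, "cell_ocean": 0.01}

posterior = fuse_with_prior(visual_probs, prior)
best_cell = max(posterior, key=posterior.get)
```

Setting `prior_weight` to 0 ignores the prior entirely, while larger values let it dominate the visual evidence.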
Developing algorithms that predict geographic coordinates from single images using deep learning and computer vision techniques.
Building large-scale, diverse datasets of geotagged imagery for training and evaluating location recognition models.
Improving model efficiency, accuracy, and robustness through novel architectures and training strategies.
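Location predictions are commonly scored by the great-circle distance between the predicted and true coordinates, reported as the share of predictions that fall within a distance threshold. The sketch below shows that calculation using the haversine formula. The function names and the 25 km threshold are illustrative assumptions, not a description of our evaluation protocol.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def accuracy_within(predictions, targets, threshold_km):
    """Fraction of predictions within threshold_km of their target."""
    hits = sum(
        haversine_km(*pred, *true) <= threshold_km
        for pred, true in zip(predictions, targets)
    )
    return hits / len(targets)


# Hypothetical predictions: one close to its target, one far away.
predictions = [(48.8566, 2.3522), (0.0, 0.0)]
targets = [(48.8600, 2.3500), (10.0, 10.0)]
city_level = accuracy_within(predictions, targets, threshold_km=25)
```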
A novel architecture combining vision transformers with geospatial attention mechanisms for improved location prediction accuracy.
Introducing a comprehensive dataset of 10M+ geotagged images spanning 195 countries for training location recognition models.
Enabling zero-shot landmark identification, with no landmark-specific training, through contrastive vision-language models.
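With a contrastive vision-language model, a landmark can be identified by comparing an image embedding against text embeddings of candidate landmark descriptions and picking the closest match. The sketch below shows only that scoring step. The three-dimensional vectors are made-up placeholders standing in for real encoder outputs, and the prompts, the `zero_shot_scores` helper, and the temperature value are assumptions for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def zero_shot_scores(image_vec, text_vecs, temperature=0.07):
    """Softmax over temperature-scaled cosine similarities to each text prompt."""
    logits = {label: cosine(image_vec, vec) / temperature for label, vec in text_vecs.items()}
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {label: math.exp(x - peak) for label, x in logits.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


# Placeholder embeddings; a real system would obtain these from trained encoders.
text_vecs = {
    "a photo of the Eiffel Tower": [0.9, 0.1, 0.0],
    "a photo of the Colosseum": [0.1, 0.9, 0.1],
    "a photo of the Golden Gate Bridge": [0.0, 0.2, 0.9],
}
image_vec = [0.8, 0.2, 0.1]

probs = zero_shot_scores(image_vec, text_vecs)
best = max(probs, key=probs.get)
```

Because the candidate set is defined only by text prompts, new landmarks can be added by writing new descriptions rather than by retraining.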
Interested in collaborating on research or accessing our datasets? Get in touch with our team.