Research Overview
Our research focuses on developing novel computer vision and machine learning techniques for geographic location inference from visual data.
NaviSense is our proprietary transformer-based architecture that combines visual embeddings with geospatial priors. The model achieves state-of-the-art performance in location prediction by learning hierarchical representations of architectural styles, environmental features, and urban patterns. Recent infrastructure improvements have enabled continuous training on live user data.
Captures both fine-grained architectural details and broad environmental context
Learns spatial relationships between visual features and geographic coordinates
Updates the model in real time from user interactions and location recognition results
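One possible reading of "combines visual embeddings with geospatial priors" is a Bayesian-style fusion over discrete geographic cells. The sketch below is not the NaviSense implementation, which this page does not describe. It assumes a hypothetical setup in which the vision model emits one score per cell and the prior is a probability per cell. The names `fuse_with_prior`, `predict_cell`, and `cell_centers` are illustrative only.

```python
import math


def softmax(scores):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_with_prior(visual_logits, prior):
    """Multiply the visual likelihood by a geospatial prior and renormalize.

    Assumption: both inputs are indexed by the same list of geographic cells.
    """
    if len(visual_logits) != len(prior):
        raise ValueError("visual_logits and prior must have the same length")
    likelihood = softmax(visual_logits)
    unnormalized = [l * p for l, p in zip(likelihood, prior)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("prior assigns zero mass to every cell")
    return [u / total for u in unnormalized]


def predict_cell(visual_logits, prior, cell_centers):
    """Return the (lat, lon) center of the most probable cell and its probability."""
    posterior = fuse_with_prior(visual_logits, prior)
    best = max(range(len(posterior)), key=posterior.__getitem__)
    return cell_centers[best], posterior[best]


if __name__ == "__main__":
    # Hypothetical example: two cells look equally plausible visually,
    # and the prior breaks the tie in favor of the second one.
    centers = [(48.86, 2.35), (45.76, 4.84), (43.30, 5.37)]
    print(predict_cell([2.0, 2.0, 0.0], [0.1, 0.8, 0.1], centers))
```

In this toy setup the prior only matters when the visual evidence is ambiguous. A strongly peaked visual score still dominates the product.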
Developing algorithms that predict geographic coordinates from single images using deep learning and computer vision techniques.
Building large-scale, diverse datasets of geotagged imagery for training and evaluating location recognition models.
Improving model efficiency, accuracy, and robustness through novel architectures and training strategies.
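Evaluating a location recognition model usually comes down to measuring how far each predicted coordinate lies from the true one. A minimal sketch follows, assuming predictions and ground truth are (latitude, longitude) pairs in degrees and that accuracy is reported as the fraction of predictions within a distance threshold. The function names and the 25 km threshold are illustrative and are not taken from our evaluation pipeline.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def accuracy_within(predictions, truths, threshold_km):
    """Fraction of predictions that fall within threshold_km of the ground truth."""
    if len(predictions) != len(truths):
        raise ValueError("predictions and truths must have the same length")
    if not predictions:
        raise ValueError("no predictions to evaluate")
    hits = sum(
        1
        for (plat, plon), (tlat, tlon) in zip(predictions, truths)
        if haversine_km(plat, plon, tlat, tlon) <= threshold_km
    )
    return hits / len(predictions)


if __name__ == "__main__":
    preds = [(0.0, 0.0), (10.0, 10.0)]
    truth = [(0.0, 0.1), (50.0, 50.0)]
    print(accuracy_within(preds, truth, threshold_km=25.0))
```

Reporting accuracy at several thresholds (for example street, city, and country scale) gives a fuller picture than a single mean error, which a few very distant misses can dominate.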
A novel architecture combining vision transformers with geospatial attention mechanisms for improved location prediction accuracy.
Introducing a comprehensive dataset of 10M+ geotagged images spanning 195 countries for training location recognition models.
Enabling landmark identification without explicit training through contrastive vision-language models.
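The zero-shot idea in the last item can be illustrated without any trained model. In a contrastive vision-language setup, an image and a set of candidate text labels are embedded into a shared vector space, and the label whose embedding is closest to the image embedding wins. The sketch below assumes those embeddings already exist. The vectors are hand-written placeholders, not output from a real encoder, and `rank_labels` is a hypothetical helper name.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)


def rank_labels(image_embedding, label_embeddings):
    """Rank candidate labels by similarity to the image embedding, best first.

    label_embeddings maps a label string to its text embedding.
    """
    scored = [
        (label, cosine_similarity(image_embedding, emb))
        for label, emb in label_embeddings.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Placeholder embeddings; a real system would obtain these from its encoders.
    image = [1.0, 0.0]
    labels = {
        "Eiffel Tower": [0.9, 0.1],
        "Big Ben": [0.0, 1.0],
    }
    for label, score in rank_labels(image, labels):
        print(f"{label}: {score:.3f}")
```

Because the candidate labels are supplied only at inference time, new landmarks can be added by writing a new text label, with no landmark-specific retraining.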
Interested in research collaboration or accessing our datasets? Get in touch with our team.