Earth observation data such as satellite imagery offer great potential for assessing and planning vital infrastructure systems and for improving our understanding of the flow of resources in our global society. Analysis of such data provides its greatest benefit when it is automated and deployed at scale, enabling assessments at regional, national, or even global extents. However, attempts to apply these techniques at scale have repeatedly met with technical challenges, since deep learning models trained on one geography often do not generalize well to new geographies. One solution is to provide a sufficiently diverse training dataset. The goal of this project is to develop a tool that enables the continuous development and expansion of effective training data for Earth observation, making use of existing infrastructure labels available on public mapping platforms while simultaneously providing dozens of Earth observation sensor modalities for the area of interest. This tool will enable the creation of a new benchmark dataset for the Earth observation computer vision community, to be shared publicly and grown continuously. This project will also use this dataset to explore fundamental questions about the amount of training data required for effective geographic domain adaptation and the degree of diversity needed in those data.