We aim to develop self-supervised foundation models for health data, along with other modern AI approaches, to improve the prediction of phenotypes for which training data are scarce, such as rare cancers or disease progression and recurrence. Such models are known to benefit from large, heterogeneous training data, which makes the Finnish registries ideal for their development. With our European collaborators, we will also study how these models perform across different European healthcare systems.