How far should we trust models?