Hacker Timesnew | past | comments | ask | show | jobs | submitlogin

My understanding is that in multimodal models, both text and image vectors align to the same semantic space, this alignment seems to be the main difference from text-only models."


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: