Thanks for the article! Great read, but I was a bit disappointed that you did not explain further how the two-tower architecture alleviates the cold-start issue. Honestly, I really don't see why this would be the case out of the box. Does it require extra steps at inference, e.g. if the user is new, taking the average of the user-tower representations of existing users with the same characteristics?
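To make the question concrete, here is a rough sketch of the kind of fallback I have in mind. It is only my guess, not something taken from the article: the function name `cold_start_user_embedding`, the trait encoding, and the toy data are all hypothetical, and it assumes the user-tower embeddings of existing users are already available as an array.

```python
import numpy as np

def cold_start_user_embedding(new_user_traits, known_traits, known_embeddings):
    """Average the user-tower embeddings of existing users whose traits
    exactly match the new user's; fall back to the global mean if none do."""
    known_traits = np.asarray(known_traits)
    known_embeddings = np.asarray(known_embeddings, dtype=float)
    # Boolean mask: True for existing users that match on every trait.
    matches = np.all(known_traits == np.asarray(new_user_traits), axis=1)
    if not matches.any():
        return known_embeddings.mean(axis=0)
    return known_embeddings[matches].mean(axis=0)

# Hypothetical data: traits are (age_bucket, country_id); embeddings are 2-d.
known_traits = [(1, 0), (1, 0), (2, 1)]
known_embeddings = [[1.0, 0.0], [0.0, 1.0], [4.0, 4.0]]

new_user_vec = cold_start_user_embedding((1, 0), known_traits, known_embeddings)

# Score items with a dot product against (hypothetical) item-tower vectors.
item_vecs = np.array([[1.0, 1.0], [-1.0, 0.0]])
scores = item_vecs @ new_user_vec
```

If the user tower already takes those characteristics as input features, I suppose the tower could embed a new user directly and this averaging would be unnecessary, which is really the part I would like clarified.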