Adrien Biarnes
Aug 29, 2024

We perform, in-batch negative sampling. There isn't any real "sampling" per say.

What we need to compute the loss are logits of the positives and negatives along with the indices of the positives.

The positive logits are on the diagonal of the logits matrix and the negatives off-diagonal.

The positive indices are simply [0, 1, 2, ..., batch_size]

Sign up to discover human stories that deepen your understanding of the world.

Free

Distraction-free reading. No ads.

Organize your knowledge with lists and highlights.

Tell your story. Find your audience.

Membership

Read member-only stories

Support writers you read most

Earn money for your writing

Listen to audio narrations

Read offline with the Medium app

Adrien Biarnes
Adrien Biarnes

No responses yet

Write a response