模型蒸馏
模型蒸馏指的是训练一个(通常较小的)学生模型来模仿一个(通常较大的)教师模型或教师模型集合的行为。这通常用于使模型**更快、更便宜和更轻量**。
Cross Encoder 知识蒸馏
目标是最小化学生 logits(也称为原始模型输出)和教师 logits 在相同输入对(通常是查询-答案对)上的差异。
这里有两个训练脚本,使用了 Hostätter 等人 预先计算的 logits,他们为 MS MARCO 数据集训练了一个 3 个(大型)模型的集成,并预测了各种(查询,段落)对的分数(50% 正例,50% 负例)。
-
在这个例子中,我们使用知识蒸馏和一个小型快速模型,并从教师集成学习 logits 分数。这产生了与大型模型相当的性能,同时速度快了 18 倍。
它使用
MSELoss
来最小化预测的学生 logits 和预计算的教师 logits 之间对于(查询,答案)对的距离。 train_cross_encoder_kd_margin_mse.py
这与之前的脚本设置相同,但现在使用了
MarginMSELoss
,如前面提到的 Hostätter 等人 中所用。MarginMSELoss
不适用于(查询,答案)对和预计算的 logit,而是适用于(查询,正确答案,不正确答案)三元组和一个预计算的 logit,该 logit 对应于teacher.predict([query, correct_answer]) - teacher.predict([query, incorrect_answer])
。简而言之,这个预计算的 logit 是(查询,正确答案)和(查询,不正确答案)之间的差异。
推理
tomaarsen/reranker-MiniLM-L12-H384-margin-mse 模型是使用第二个脚本训练的。如果您想在自己蒸馏模型之前试用该模型,请随意使用此脚本
from sentence_transformers import CrossEncoder
# Download from the 🤗 Hub
model = CrossEncoder("tomaarsen/reranker-modernbert-base-msmarco-margin-mse")
# Get scores for pairs of texts
pairs = [
["where is joplin airport", "Scott Joplin is important both as a composer for bringing ragtime to the concert hall, setting the stage (literally) for the rise of jazz; and as an early advocate for civil rights and education among American blacks. Joplin is a hero, and a national treasure of the United States."],
["where is joplin airport", "Flights from Jos to Abuja will get you to this shimmering Nigerian capital within approximately 19 hours. Flights depart from Yakubu Gowon Airport/ Jos Airport (JOS) and arrive at Nnamdi Azikiwe International Airport (ABV). Arik Air is the main airline flying the Jos to Abuja route."],
["where is joplin airport", "Janis Joplin returned to the music scene, knowing it was her destiny, in 1966. A friend, Travis Rivers, recruited her to audition for the psychedelic band, Big Brother and the Holding Company, based in San Francisco. The band was quite big in San Francisco at the time, and Joplin landed the gig."],
["where is joplin airport", "Joplin Regional Airport. Joplin Regional Airport (IATA: JLN, ICAO: KJLN, FAA LID: JLN) is a city-owned airport four miles north of Joplin, in Jasper County, Missouri. It has airline service subsidized by the Essential Air Service program. Airline flights and general aviation are in separate terminals."],
["where is joplin airport", 'Trolley and rail lines made Joplin the hub of southwest Missouri. As the center of the "Tri-state district", it soon became the lead- and zinc-mining capital of the world. As a result of extensive surface and deep mining, Joplin is dotted with open-pit mines and mine shafts.'],
]
scores = model.predict(pairs)
print(scores)
# [0.00410349 0.03430534 0.5108879 0.999984 0.91639173]
# Or rank different texts based on similarity to a single text
ranks = model.rank(
"where is joplin airport",
[
"Scott Joplin is important both as a composer for bringing ragtime to the concert hall, setting the stage (literally) for the rise of jazz; and as an early advocate for civil rights and education among American blacks. Joplin is a hero, and a national treasure of the United States.",
"Flights from Jos to Abuja will get you to this shimmering Nigerian capital within approximately 19 hours. Flights depart from Yakubu Gowon Airport/ Jos Airport (JOS) and arrive at Nnamdi Azikiwe International Airport (ABV). Arik Air is the main airline flying the Jos to Abuja route.",
"Janis Joplin returned to the music scene, knowing it was her destiny, in 1966. A friend, Travis Rivers, recruited her to audition for the psychedelic band, Big Brother and the Holding Company, based in San Francisco. The band was quite big in San Francisco at the time, and Joplin landed the gig.",
"Joplin Regional Airport. Joplin Regional Airport (IATA: JLN, ICAO: KJLN, FAA LID: JLN) is a city-owned airport four miles north of Joplin, in Jasper County, Missouri. It has airline service subsidized by the Essential Air Service program. Airline flights and general aviation are in separate terminals.",
'Trolley and rail lines made Joplin the hub of southwest Missouri. As the center of the "Tri-state district", it soon became the lead- and zinc-mining capital of the world. As a result of extensive surface and deep mining, Joplin is dotted with open-pit mines and mine shafts.',
],
)
print(ranks)
# [
# {"corpus_id": 3, "score": 0.999984},
# {"corpus_id": 4, "score": 0.91639173},
# {"corpus_id": 2, "score": 0.5108879},
# {"corpus_id": 1, "score": 0.03430534},
# {"corpus_id": 0, "score": 0.004103488},
# ]