Item type | Documents (1)
Release date | 2022-07-14
Access rights | open access
Resource type identifier | http://purl.org/coar/resource_type/c_6501
Resource type | journal article
Publisher's version DOI (identifier type: DOI) | https://doi.org/10.3390/electronics11010037
Related name | 10.3390/electronics11010037 (language: ja)
Publication type | VoR (Version of Record)
Publication type resource | http://purl.org/coar/version/c_970fb48d4fbd8a85
Title | Diagnostic Evaluation of Policy-Gradient-Based Ranking (language: en)
Authors | Yu, Hai-Tao; Huang, Degen; 任, 福継; Li, Lishuang
Abstract (description type: Abstract; language: en) | Learning-to-rank has been intensively studied and has shown significantly increasing values in a wide range of domains, such as web search, recommender systems, dialogue systems, machine translation, and even computational biology, to name a few. In light of recent advances in neural networks, there has been a strong and continuing interest in exploring how to deploy popular techniques, such as reinforcement learning and adversarial learning, to solve ranking problems. However, armed with the aforesaid popular techniques, most studies tend to show how effective a new method is. A comprehensive comparison between techniques and an in-depth analysis of their deficiencies are somehow overlooked. This paper is motivated by the observation that recent ranking methods based on either reinforcement learning or adversarial learning boil down to policy-gradient-based optimization. Based on the widely used benchmark collections with complete information (where relevance labels are known for all items), such as MSLR-WEB30K and Yahoo-Set1, we thoroughly investigate the extent to which policy-gradient-based ranking methods are effective. On one hand, we analytically identify the pitfalls of policy-gradient-based ranking. On the other hand, we experimentally compare a wide range of representative methods. The experimental results echo our analysis and show that policy-gradient-based ranking methods are, by a large margin, inferior to many conventional ranking methods. Regardless of whether we use reinforcement learning or adversarial learning, the failures are largely attributable to the gradient estimation based on sampled rankings, which significantly diverge from ideal rankings. In particular, the larger the number of documents per query and the more fine-grained the ground-truth labels, the greater the impact policy-gradient-based ranking suffers. Careful examination of this weakness is highly recommended for developing enhanced methods based on policy gradient.
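The following sketch (Python with NumPy) illustrates the general idea the abstract refers to, namely gradient estimation based on sampled rankings: a linear scoring function induces a Plackett-Luce distribution over rankings, one ranking is sampled per query, and a REINFORCE-style gradient weighted by the NDCG of that sampled ranking updates the scorer. It is a minimal illustration under assumed toy data (random features, graded labels, learning rate), not the implementation evaluated in the paper.

# Minimal sketch of policy-gradient-based ranking (not the paper's code):
# a linear scorer defines a Plackett-Luce distribution over rankings,
# a ranking is sampled, and a REINFORCE gradient weighted by NDCG updates
# the scorer. Toy data and hyperparameters below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def dcg(rels):
    # Discounted cumulative gain of relevance values taken in ranked order.
    gains = (2.0 ** rels - 1.0) / np.log2(np.arange(2, len(rels) + 2))
    return gains.sum()

def ndcg(ranking, labels):
    # NDCG of a permutation `ranking` given per-document relevance labels.
    ideal = dcg(np.sort(labels)[::-1])
    return dcg(labels[ranking]) / ideal if ideal > 0 else 0.0

def sample_ranking(scores, rng):
    # Sample a permutation from the Plackett-Luce model and return it together
    # with the gradient of its log-probability with respect to the scores.
    remaining = list(range(len(scores)))
    ranking, grad_s = [], np.zeros_like(scores)
    for _ in range(len(scores)):
        s = scores[remaining]
        p = np.exp(s - s.max())
        p /= p.sum()
        idx = rng.choice(len(remaining), p=p)
        chosen = remaining[idx]
        # d log P / d s: +1 for the chosen item, minus the softmax over
        # the items still available at this position.
        grad_s[remaining] -= p
        grad_s[chosen] += 1.0
        ranking.append(chosen)
        remaining.pop(idx)
    return np.array(ranking), grad_s

# Toy query: 20 documents, 5 features, graded relevance labels in {0, 1, 2}.
X = rng.normal(size=(20, 5))
labels = rng.integers(0, 3, size=20)
w = np.zeros(5)                      # linear scoring function f(x) = x @ w
baseline, lr = 0.0, 0.1

for step in range(200):
    scores = X @ w
    ranking, grad_s = sample_ranking(scores, rng)
    reward = ndcg(ranking, labels)
    # REINFORCE estimate with a running-mean baseline to reduce variance.
    w += lr * (reward - baseline) * (X.T @ grad_s)
    baseline = 0.9 * baseline + 0.1 * reward

print("NDCG of the final greedy (sorted-by-score) ranking:",
      round(ndcg(np.argsort(-(X @ w)), labels), 3))

Because the update relies on rankings sampled from the current policy, the gradient estimate can diverge sharply from the gradient at the ideal ranking, which is the weakness the abstract highlights.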
Keywords (subject scheme: Other; language: en) | learning-to-rank; policy gradient; reinforcement learning; adversarial learning; ranking sampling
Bibliographic information | Electronics (en), Vol. 11, No. 1, p. 37, issue date 2021-12-23
Source identifier (ISSN) | 2079-9292
Publisher | MDPI (language: en)
Rights (language: en) | This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
EID | 382584 (identifier type: URI)
Language | eng