{"id":1885,"date":"2021-09-29T23:32:20","date_gmt":"2021-09-29T23:32:20","guid":{"rendered":"https:\/\/giuliad10.sg-host.com\/?post_type=news-&#038;p=1885"},"modified":"2024-07-12T23:33:43","modified_gmt":"2024-07-12T23:33:43","slug":"ellis-ams-team-showcasing-results-in-the-big-bench-challenge","status":"publish","type":"news-","link":"https:\/\/ivi.fnwi.uva.nl\/ellis\/news-\/ellis-ams-team-showcasing-results-in-the-big-bench-challenge\/","title":{"rendered":"ELLIS AMS team showcasing results in the BIG-bench challenge"},"content":{"rendered":"<p>We are very glad to announce that the first ELLIS AMS team will showcase their results from the Beyond the Imitation Game Benchmark (BIG-bench) collaborative task. The talk will take place on Tuesday 5th October at 16:00. The seminar will be held online; the Zoom meeting link will be sent out later.<\/p>\n<p>&nbsp;<\/p>\n<h3>The Beyond the Imitation Game Benchmark (BIG-bench) challenge<\/h3>\n<p>Abstract:<\/p>\n<p>The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to probe large language models and provide concrete evidence of their capabilities and limitations. In the community-building spirit of ELLIS-Amsterdam, we have formed three teams mixing Bachelor\u2019s, Master\u2019s, and PhD students and have contributed three tasks to the benchmark. In the seminar, we will briefly introduce the BIG-bench challenge, and then the three teams will present their benchmarking tasks.<\/p>\n<p>The Metaphor Understanding task tests the capability of language models to understand English metaphors. It consists of two subtasks: in the first one, a language model is asked to map a metaphorical expression to its correct literal paraphrases; in the second one, the model needs to map a literal paraphrase to the corresponding metaphorical expression. 
The two subtasks form a new dataset that takes into account the lessons learned from existing models and benchmarks.<\/p>\n<p>The Implicit Relations task evaluates a model\u2019s ability to infer relations between characters from short passages of English narratives, where the relations are left implicit. In each example, a passage and a question of the form \u201cWhat is X to Y?\u201d are presented, and the model must select the correct relation. Our new dataset makes use of 25 labels ranging from familial relations to professional relations.<\/p>\n<p>Finally, the Fantasy Reasoning task assesses a language model\u2019s ability to reason within situations that go against common sense or in some way violate the rules of the real world; humans do this easily, e.g., when reading a science fiction book. We collect a corpus of contexts that language models are extremely unlikely to be familiar with, paired with yes-no questions.<\/p>\n<p>References:<\/p>\n<p>Metaphor Understanding: <a href=\"https:\/\/github.com\/google\/BIG-bench\/tree\/main\/bigbench\/benchmark_tasks\/metaphor_understanding\" target=\"_blank\" rel=\"noopener\">https:\/\/github.com\/google\/BIG-bench\/tree\/main\/bigbench\/benchmark_tasks\/metaphor_understanding<\/a><\/p>\n<p>Implicit Relations: <a href=\"https:\/\/github.com\/google\/BIG-bench\/tree\/main\/bigbench\/benchmark_tasks\/implicit_relations\" target=\"_blank\" rel=\"noopener\">https:\/\/github.com\/google\/BIG-bench\/tree\/main\/bigbench\/benchmark_tasks\/implicit_relations<\/a><\/p>\n<p>Fantasy Reasoning: <a href=\"https:\/\/github.com\/google\/BIG-bench\/tree\/main\/bigbench\/benchmark_tasks\/fantasy_reasoning\" target=\"_blank\" 
rel=\"noopener\">https:\/\/github.com\/google\/BIG-bench\/tree\/main\/bigbench\/benchmark_tasks\/fantasy_reasoning<\/a><\/p>\n","protected":false},"featured_media":0,"template":"","news-or-blog":[15],"class_list":["post-1885","news-","type-news-","status-publish","hentry","news-or-blog-news"],"_links":{"self":[{"href":"https:\/\/ivi.fnwi.uva.nl\/ellis\/wp-json\/wp\/v2\/news-\/1885","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ivi.fnwi.uva.nl\/ellis\/wp-json\/wp\/v2\/news-"}],"about":[{"href":"https:\/\/ivi.fnwi.uva.nl\/ellis\/wp-json\/wp\/v2\/types\/news-"}],"wp:attachment":[{"href":"https:\/\/ivi.fnwi.uva.nl\/ellis\/wp-json\/wp\/v2\/media?parent=1885"}],"wp:term":[{"taxonomy":"news-or-blog","embeddable":true,"href":"https:\/\/ivi.fnwi.uva.nl\/ellis\/wp-json\/wp\/v2\/news-or-blog?post=1885"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}