Link to original content: https://api.crossref.org/works/10.1145/3596490
{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,9,2]],"date-time":"2024-09-02T05:50:56Z","timestamp":1725256256873},"reference-count":52,"publisher":"Association for Computing Machinery (ACM)","issue":"1","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Commun. ACM"],"published-print":{"date-parts":[[2024,1]]},"abstract":"<jats:p>Shortcuts often hinder the robustness of large language models.<\/jats:p>","DOI":"10.1145\/3596490","type":"journal-article","created":{"date-parts":[[2023,12,21]],"date-time":"2023-12-21T18:10:41Z","timestamp":1703182241000},"page":"110-120","update-policy":"http:\/\/dx.doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":10,"title":["Shortcut Learning of Large Language Models in Natural Language Understanding"],"prefix":"10.1145","volume":"67","author":[{"given":"Mengnan","family":"Du","sequence":"first","affiliation":[{"name":"New Jersey Institute of Technology, University Heights, Newark, NJ, USA"}]},{"given":"Fengxiang","family":"He","sequence":"additional","affiliation":[{"name":"University of Edinburgh, Scotland"}]},{"given":"Na","family":"Zou","sequence":"additional","affiliation":[{"name":"Texas A&M University, College Station, TX, USA"}]},{"given":"Dacheng","family":"Tao","sequence":"additional","affiliation":[{"name":"University of Sydney, Australia"}]},{"given":"Xia","family":"Hu","sequence":"additional","affiliation":[{"name":"Rice University, Houston, TX, USA"}]}],"member":"320","published-online":{"date-parts":[[2023,12,21]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Invariant risk minimization","author":"Arjovsky M.","year":"2019","unstructured":"Arjovsky, M., Bottou, L., Gulrajani, I., and Lopez-Paz, D. Invariant risk minimization, 2019; arXiv:1907.02893."},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.insights-1.18"},{"key":"e_1_2_1_3_1","volume-title":"Proceedings of the 2021 Conf. Empirical Methods in Natural Language Processing.","author":"Branco R.","unstructured":"Branco, R., Branco, A., Silva, J., and Rodrigues, J. Shortcutted commonsense: Data spuriousness in deep learning of commonsense reasoning. In Proceedings of the 2021 Conf. Empirical Methods in Natural Language Processing."},{"key":"e_1_2_1_4_1","volume-title":"Advances in Neural Information Processing Systems","author":"Brown T.B.","year":"2020","unstructured":"Brown, T.B. et al. Language models are few-shot learners. Advances in Neural Information Processing Systems, 2020."},{"key":"e_1_2_1_5_1","volume-title":"Advances in Neural Information Processing Systems","author":"Bubeck S.","year":"2021","unstructured":"Bubeck, S. and Sellke, M. A universal law of robustness via isoperimetry. Advances in Neural Information Processing Systems, 2021."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v36i10.21296"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-1418"},{"key":"e_1_2_1_8_1","volume-title":"Bert: Pre-training of deep bidirectional transformers for language understanding. North American","author":"Devlin J.","year":"2019","unstructured":"Devlin, J., Chang, M., Lee, K., and Toutanova, K. Bert: Pre-training of deep bidirectional transformers for language understanding. North American Chapter of the Assoc. Computational Linguistics, 2019."},{"key":"e_1_2_1_9_1","volume-title":"et al. Towards interpreting and mitigating shortcut learning behavior of NLU models. North American","author":"Du M.","year":"2021","unstructured":"Du, M. et al. Towards interpreting and mitigating shortcut learning behavior of NLU models. North American Chapter of the Assoc. Computational Linguistics, 2021."},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2023.eacl-main.129"},{"key":"e_1_2_1_11_1","volume-title":"Proceedings of the 2019 Conf. Empirical Methods in Natural Language Processing and the 9th Intern. Joint Conf. Natural Language Processing, 55--65","author":"Ethayarajh K.","unstructured":"Ethayarajh, K. How contextual are contextualized word representations? Comparing the geometry of BERT, ELMo, and GPT-2 embeddings. In Proceedings of the 2019 Conf. Empirical Methods in Natural Language Processing and the 9th Intern. Joint Conf. Natural Language Processing, 55--65."},{"key":"e_1_2_1_12_1","volume-title":"Annotation artifacts in natural language inference data. North American","author":"Gururangan S.","year":"2018","unstructured":"Gururangan, S., Swayamdipta, S., Levy, O., Schwartz, R., Bowman, S.R., and Smith, N.A. Annotation artifacts in natural language inference data. North American Chapter of the Assoc. Computational Linguistics, 2018."},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.492"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i05.6311"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.emnlp-main.84"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.findings-acl.85"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-86383-8_36"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P19-1631"},{"key":"e_1_2_1_19_1","volume-title":"et al. RoBERTa: A robustly optimized BERT pretraining approach","author":"Liu Y.","year":"2019","unstructured":"Liu, Y. et al. RoBERTa: A robustly optimized BERT pretraining approach, 2019; arXiv:1907.11692."},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P19-1334"},{"key":"e_1_2_1_21_1","volume-title":"Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing.","author":"Mendelson M.","unstructured":"Mendelson, M. and Belinkov, Y. Debiasing Methods in Natural Language Understanding Make Bias More Accessible. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing."},{"key":"e_1_2_1_22_1","volume-title":"Simplicity Bias in 1-Hidden Layer Neural Networks","author":"Morwani D.","year":"2023","unstructured":"Morwani, D., Batra, J., Jain, P., and Netrapalli, P. Simplicity Bias in 1-Hidden Layer Neural Networks, 2023; arXiv:2302.00457."},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P19-1459"},{"key":"e_1_2_1_24_1","volume-title":"Deep Learning for Code Workshop","author":"Nye M.","year":"2022","unstructured":"Nye, M. et al. Show your work: Scratchpads for intermediate computation with language models. Deep Learning for Code Workshop, 2022."},{"key":"e_1_2_1_25_1","volume-title":"Combining feature and instance attribution to detect artifacts","author":"Pezeshkpour P.","year":"2021","unstructured":"Pezeshkpour, P., Jain, S., Singh, S., and Wallace, B.C. Combining feature and instance attribution to detect artifacts, 2021; arXiv:2107.00323."},{"key":"e_1_2_1_26_1","volume-title":"Out of Order: How important is the sequential order of words in a sentence in natural language understanding tasks? 2020","author":"Pham T.M.","year":"2020","unstructured":"Pham, T.M., Bui, T., Mai, L., and Nguyen, A. Out of Order: How important is the sequential order of words in a sentence in natural language understanding tasks? 2020; arXiv:2012.15180."},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.blackboxnlp-1.1"},{"key":"e_1_2_1_28_1","volume-title":"Proceedings of the 2021 Conf. Empirical Methods in Natural Language Processing.","author":"Qi F.","unstructured":"Qi, F., Chen, Y., Zhang, X., Li, M., Liu, Z., and Sun, M. Mind the style of text! Adversarial and backdoor attacks based on text style transfer. In Proceedings of the 2021 Conf. Empirical Methods in Natural Language Processing."},{"key":"e_1_2_1_29_1","volume-title":"et al. Exploring the limits of transfer learning with a unified text-to-text transformer. J. Machine Learning Research","author":"Raffel C.","year":"2020","unstructured":"Raffel, C. et al. Exploring the limits of transfer learning with a unified text-to-text transformer. J. Machine Learning Research (2020)."},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.acl-long.86"},{"key":"e_1_2_1_31_1","volume-title":"Advances in Neural Information Processing Systems","author":"Robinson J.","year":"2021","unstructured":"Robinson, J., Sun, L., Yu, K., Batmanghelich, K., Jegelka, S., and Sra, S. Can contrastive learning avoid shortcut solutions? Advances in Neural Information Processing Systems, 2021."},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-1341"},{"key":"e_1_2_1_33_1","volume-title":"Proceedings of the 2020 Conf. Empirical Methods in Natural Language Processing.","author":"Sen P.","unstructured":"Sen, P. and Saffari, A. What do models learn from question answering datasets? In Proceedings of the 2020 Conf. Empirical Methods in Natural Language Processing."},{"key":"e_1_2_1_34_1","volume-title":"Advances in Neural Information Processing Systems","author":"Shah H.","year":"2020","unstructured":"Shah, H., Tamuly, K., Raghunathan, A., Jain, P., and Netrapalli, P. The pitfalls of simplicity bias in neural networks. Advances in Neural Information Processing Systems, 2020."},{"key":"e_1_2_1_35_1","volume-title":"Proceedings of the 2022 Intern. Conf. Learning Representations.","author":"Shi Y.","unstructured":"Shi, Y. et al. Gradient matching for domain generalization. In Proceedings of the 2022 Intern. Conf. Learning Representations."},{"key":"e_1_2_1_36_1","volume-title":"et al. Prompting gpt-3 to be reliable","author":"Si C.","year":"2022","unstructured":"Si, C. et al. Prompting gpt-3 to be reliable, 2022; arXiv:2210.09150."},{"key":"e_1_2_1_37_1","volume-title":"What does BERT learn from multiple-choice reading comprehension datasets? (2019)","author":"Si C.","year":"2019","unstructured":"Si, C., Wang, S., Kan, M., and Jiang, J. What does BERT learn from multiple-choice reading comprehension datasets? (2019); arXiv:1910.12391."},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.emnlp-main.230"},{"key":"e_1_2_1_39_1","volume-title":"Proceedings of the 2022 AAAI Conf. Artificial Intelligence.","author":"Stacey J.","unstructured":"Stacey, J., Belinkov, Y., and Rei, M. Supervising Model Attention with Human Explanations for Robust Natural Language Inference. In Proceedings of the 2022 AAAI Conf. Artificial Intelligence."},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.emnlp-main.665"},{"key":"e_1_2_1_41_1","volume-title":"Proceedings of the 2017 Intern. Conf. Machine Learning.","author":"Sundararajan M.","unstructured":"Sundararajan, M., Taly, A., and Yan, Q. Axiomatic attribution for deep networks. In Proceedings of the 2017 Intern. Conf. Machine Learning."},{"key":"e_1_2_1_42_1","volume-title":"Unshuffling data for improved generalization, (2020)","author":"Teney D.","year":"2020","unstructured":"Teney, D., Abbasnejad, E., and van den Hengel, A. Unshuffling data for improved generalization, (2020); arXiv:2002.11894."},{"key":"e_1_2_1_43_1","volume-title":"An empirical study on robustness to spurious correlations using pre-trained language models. Trans. Assoc. Computational Linguistics","author":"Tu L.","year":"2020","unstructured":"Tu, L., Lalwani, G., Gella, S., and He, H. An empirical study on robustness to spurious correlations using pre-trained language models. Trans. Assoc. Computational Linguistics (2020)."},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.emnlp-main.613"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.emnlp-main.713"},{"key":"e_1_2_1_46_1","volume-title":"Gradient methods provably converge to non-robust networks","author":"Vardi G.","year":"2022","unstructured":"Vardi, G., Yehudai, G., and Shamir, O. Gradient methods provably converge to non-robust networks, 2022; arXiv:2202.04347."},{"key":"e_1_2_1_47_1","volume-title":"Proceedings for the 35th Conf. Neural Information Processing Systems Datasets and Benchmarks Track (Round 2)","author":"Wang B.","year":"2021","unstructured":"Wang, B. et al. Adversarial GLUE: A multi-task benchmark for robustness evaluation of language models. In Proceedings for the 35th Conf. Neural Information Processing Systems Datasets and Benchmarks Track (Round 2), 2021."},{"key":"e_1_2_1_48_1","volume-title":"Proceedings of the 2022 Conf. North American Chapter of the Assoc. Computational Linguistics: Human Language Technologies.","author":"Webson A.","unstructured":"Webson, A. and Pavlick, E. Do prompt-based models really understand the meaning of their prompts? In Proceedings of the 2022 Conf. North American Chapter of the Assoc. Computational Linguistics: Human Language Technologies."},{"key":"e_1_2_1_49_1","volume-title":"Advances in Neural Information Processing Systems","author":"Wei J.","year":"2022","unstructured":"Wei, J. et al. Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems, 2022."},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.acl-long.40"},{"key":"e_1_2_1_51_1","volume-title":"Proceedings of the 2018 Conf. Empirical Methods in Natural Language Processing.","author":"Zellers R.","unstructured":"Zellers, R., Bisk, Y., Schwartz, R., and Choi, Y. Swag: A large-scale adversarial dataset for grounded commonsense inference. In Proceedings of the 2018 Conf. Empirical Methods in Natural Language Processing."},{"key":"e_1_2_1_52_1","volume-title":"Proceedings of the 2021 Intern. Conf. Machine Learning, 12697--12706","author":"Zhao Z.","unstructured":"Zhao, Z., Wallace, E., Feng, S., Klein, D., and Singh, S. Calibrate before use: Improving few-shot performance of language models. In Proceedings of the 2021 Intern. Conf. Machine Learning, 12697--12706."}],"container-title":["Communications of the ACM"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3596490","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,1,5]],"date-time":"2024-01-05T23:38:01Z","timestamp":1704497881000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3596490"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,12,21]]},"references-count":52,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2024,1]]}},"alternative-id":["10.1145\/3596490"],"URL":"http:\/\/dx.doi.org\/10.1145\/3596490","relation":{},"ISSN":["0001-0782","1557-7317"],"issn-type":[{"value":"0001-0782","type":"print"},{"value":"1557-7317","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,12,21]]},"assertion":[{"value":"2023-12-21","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}
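A minimal sketch of how a Crossref `/works` record of the shape above can be consumed: the `raw` literal below is a trimmed excerpt of this record (only the fields the sketch uses), and `format_citation` is a hypothetical helper, not part of any library. The full live response from https://api.crossref.org/works/10.1145/3596490 nests the same fields under the `"message"` key.

```python
import json

# Trimmed excerpt of the Crossref work record shown above.
raw = """
{"status": "ok",
 "message-type": "work",
 "message": {
   "DOI": "10.1145/3596490",
   "title": ["Shortcut Learning of Large Language Models in Natural Language Understanding"],
   "container-title": ["Communications of the ACM"],
   "volume": "67", "issue": "1", "page": "110-120",
   "author": [
     {"given": "Mengnan", "family": "Du", "sequence": "first"},
     {"given": "Fengxiang", "family": "He", "sequence": "additional"},
     {"given": "Na", "family": "Zou", "sequence": "additional"},
     {"given": "Dacheng", "family": "Tao", "sequence": "additional"},
     {"given": "Xia", "family": "Hu", "sequence": "additional"}
   ],
   "published-print": {"date-parts": [[2024, 1]]},
   "references-count": 52
 }}
"""

def format_citation(record: dict) -> str:
    """Build a short human-readable citation from a Crossref work record."""
    msg = record["message"]
    authors = ", ".join(f"{a['given']} {a['family']}" for a in msg["author"])
    # Crossref encodes dates as nested "date-parts" arrays: [[year, month, day]].
    year = msg["published-print"]["date-parts"][0][0]
    # Titles and container titles are single-element lists.
    return (f"{authors}. {msg['title'][0]}. "
            f"{msg['container-title'][0]} {msg['volume']}, {msg['issue']} "
            f"({year}), {msg['page']}.")

record = json.loads(raw)
print(format_citation(record))
```

Fetching the DOI from the live endpoint instead of the inline excerpt returns the same structure, so the helper works unchanged on the full record.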