The ambiguity of BERTology: what do large language models represent?



  • Adger, D. (2022). What are linguistic representations? Mind & Language, 37(2), 248–260.


  • Behme, C. (2015). Is the ontology of biolinguistics coherent? Language Sciences, 47, 32–42.


  • Belinkov, Y., & Glass, J. (2019). Analysis methods in neural language processing: A survey. Transactions of the Association for Computational Linguistics, 7, 49–72.


  • Benacerraf, P. (1973). Mathematical truth. Journal of Philosophy, 70(19), 661–679.


  • Blaho, S. (2007). The syntax of phonology: A radically substance-free approach (PhD Thesis). University of Tromsø.

  • Bloomfield, L. (1933). Language. Henry Holt.



  • Bloomfield, L. (1936). Language or ideas. Language, 12(2), 89–95.


  • Boone, W., & Piccinini, G. (2016). Mechanistic abstraction. Philosophy of Science, 83(5), 686–697.


  • Brentano, F. (1874/1911). Psychology from an empirical standpoint. Routledge and Kegan Paul.

  • Brunila, M., & LaViolette, J. (2022). What company do words keep? Revisiting the distributional semantics of J.R. Firth & Zellig Harris. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (pp. 4403–4417).

  • Buckner, C. (2018). Empiricism without magic: Transformational abstraction in deep convolutional neural networks. Synthese, 195(12), 5339–5372.


  • Burge, T. (1986). Individualism and psychology. The Philosophical Review, 95(1), 3–45.


  • Cappelen, H., & Dever, J. (2021). Making AI intelligible: Philosophical foundations. Oxford University Press.


  • Chalmers, D. J. (1995). On implementing a computation. Minds and Machines, 4, 391–402.


  • Chi, E.A., Hewitt, J. & Manning, C.D. (2020). Finding universal grammatical relations in multilingual BERT. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (pp. 5564–5577).

  • Chomsky, N. (1957). Syntactic structures. Mouton.


  • Chomsky, N. (1965). Aspects of the theory of syntax. MIT Press.



  • Chomsky, N. (1975). The logical structure of linguistic theory. Plenum Press.



  • Chomsky, N. (1980). Rules and representations. Columbia University Press.


  • Chomsky, N. (1986). Knowledge of language. Praeger Publications.



  • Chomsky, N. (1995). The minimalist program. MIT Press.



  • Chomsky, N. (2012). The science of language. Cambridge University Press.


  • Chomsky, N., & Halle, M. (1968). The sound pattern of English. Harper & Row.



  • Coenen, A., Reif, E., Yuan, A., Kim, B., Pearce, A., Viégas, F. & Wattenberg, M. (2019). Visualizing and measuring the geometry of BERT. In Proceedings of the 33rd Conference on Neural Information Processing Systems (pp. 8592–8600).

  • Collins, J. (2014). Representations without representa: Content and illusion in linguistic theory. In P. Stalmaszczyk (Ed.), Semantics and beyond: Philosophical and linguistic inquiries (pp. 27–64). De Gruyter.



  • Collins, J. (2023). Internalist priorities in a philosophy of words. Synthese, 201(3), 110.


  • Collins, J., & Rey, G. (2021). Chomsky and intentionality. In N. Allott, T. Lohndal, & G. Rey (Eds.), A companion to Chomsky (pp. 488–502). Wiley.


  • Croft, W. A. (2001). Radical construction grammar: Syntactic theory in typological perspective. Oxford University Press.


  • Danilevsky, M., Qian, K., Aharonov, R., Katsis, Y., Kawas, B. & Sen, P. (2020). A survey of the state of explainable AI for natural language processing. In Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing (pp. 447–459).

  • Dennett, D. C. (1991). Consciousness explained. Little Brown and Company.



  • Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (pp. 4171–4186).

  • Dretske, F. I. (1981). Knowledge and the flow of information. The MIT Press.



  • Dunbar, E. (2019). Generative grammar, neural networks, and the implementational mapping problem: Response to Pater. Language, 95(1), e87–e98.


  • Dupre, G. (2022). Georges Rey’s representation of language. BJPS Review of Books. Retrieved from https://www.thebsps.org/reviewofbooks/dupre-on-rey/

  • Egan, F. (2010). Computational models: A modest role for content. Studies in History and Philosophy of Science, 41(3), 253–259.


  • Egan, F. (2014). How to think about mental content. Philosophical Studies, 170(1), 115–135.


  • Egan, F. (2017). Function-theoretic explanation and the search for neural mechanisms. In D. Kaplan (Ed.), Explanation and integration in mind and brain science (pp. 145–163). Oxford University Press.



  • Egan, F. (2018). The nature and function of content in computational models. In M. Sprevak & M. Colombo (Eds.), The Routledge handbook of the computational mind (pp. 247–258). Routledge.


  • Facchin, M. (2022). Troubles with mathematical contents. Philosophical Psychology, 5, 1–24.


  • Favela, L. H., & Machery, E. (2023). Investigating the concept of representation in the neural and psychological sciences. Frontiers in Psychology, 5, 14.



  • Fodor, J. A. (1981). Some notes on what linguistics is about. In N. Block (Ed.), Readings in philosophy of psychology, vol. II (pp. 197–207).

  • Fodor, J. A. (1990). A theory of content and other essays. MIT Press.



  • Gastaldi, J. L., & Pellissier, L. (2021). The calculus of language: Explicit representation of emergent linguistic structure through type-theoretical paradigms. Interdisciplinary Science Reviews, 46(4), 569–590.


  • Gleitman, L. (2021). Language as a branch of psychology: Chomsky and cognitive science. In N. Allott, T. Lohndal, & G. Rey (Eds.), A companion to Chomsky (pp. 109–122). Wiley.

  • Goldberg, A. E. (2006). Constructions at work: The nature of generalization in language. Oxford University Press.



  • Harris, Z. S. (1951). Methods in structural linguistics. The University of Chicago Press.



  • Haspelmath, M. (2010). Comparative concepts and descriptive categories in crosslinguistic studies. Language, 86(3), 663–687.


  • Haspelmath, M. (2020). Human linguisticality and the building blocks of languages. Frontiers in Psychology, 10, 3056.


  • Hewitt, J., & Manning, C.D. (2019). A structural probe for finding syntax in word representations. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) (pp. 4129–4138).

  • Immer, A., Hennigen, L.T., Fortuin, V. & Cotterell, R. (2022). Probing as quantifying inductive bias. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (pp. 1839–1851).

  • Iosad, P. (2017). A substance-free framework for phonology: An analysis of the Breton dialect of Bothoa. Edinburgh University Press.


  • Jackson, F. (1977). Perception: A representative theory. Cambridge University Press.



  • Jawahar, G., Sagot, B. & Seddah, D. (2019). What does BERT learn about the structure of language? In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (pp. 3651–3657).

  • Jelinek, F. (2005). Some of my best friends are linguists. Language Resources and Evaluation, 39(1), 25–34.


  • Kaplan, D. (2011). Explanation and description in computational neuroscience. Synthese, 183(3), 339–373.


  • Karlsson, F. (2006). Recursion in natural languages. In Advances in Natural Language Processing, 5th International Conference on NLP, FinTAL 2006 (p. 1).

  • Katz, J. (1981). Language and other abstract objects. Rowman and Littlefield.



  • Kovaleva, O., Romanov, A., Rogers, A. & Rumshisky, A. (2019). Revealing the dark secrets of BERT. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (pp. 4365–4374).

  • Kripke, S. (1980). Naming and necessity. Harvard University Press.



  • Kulmizev, A., & Nivre, J. (2022). Schrödinger’s tree: On syntax and neural language models. Frontiers in Artificial Intelligence, 5, 85.


  • Kulmizev, A., Ravishankar, V., Abdou, M. & Nivre, J. (2020). Do neural language models show preferences for syntactic formalisms? In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (pp. 4077–4091).

  • Kuokkanen, J. (2022). Vertical-horizontal distinction in resolving the abstraction, hierarchy, and generality problems of the mechanistic account of physical computation. Synthese, 200(3), 247.


  • Kuznetsov, I., & Gurevych, I. (2020). A matter of framing: The impact of linguistic formalism on probing results. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (pp. 171–182).

  • Lakoff, G. (1990). The invariance hypothesis: Is abstract reason based on image-schemas? Cognitive Linguistics, 1(1), 39–74.


  • Langacker, R. W. (1987). Foundations of cognitive grammar, volume 1, theoretical prerequisites. Stanford University Press.



  • Lasri, K., Pimentel, T., Lenci, A., Poibeau, T. & Cotterell, R. (2022). Probing for the usage of grammatical number. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 8818–8831).

  • Laurence, S. (2003). Is linguistics a branch of psychology? In A. Barber (Ed.), Epistemology of language (pp. 69–106). Oxford University Press.


  • Levine, R. (2018). ‘Biolinguistics’: Some foundational problems. In C. Behme & M. Neef (Eds.), Essays on linguistic realism (pp. 21–60). John Benjamins Publishing Company.


  • Levy, A. (2013). Three kinds of new mechanism. Biology and Philosophy, 28(1), 99–114.


  • Lewis, D. (1970). How to define theoretical terms. Journal of Philosophy, 67(13), 426–446.


  • Li, J., Cotterell, R. & Sachan, M. (2022). Probing via prompting. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (pp. 1144–1157).

  • Li, L., Ma, R., Guo, Q., Xue, X. & Qiu, X. (2020). BERT-ATTACK: Adversarial attack against BERT using BERT. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 6193–6202).

  • Linzen, T., & Baroni, M. (2021). Syntactic structure from deep learning. Annual Review of Linguistics, 7, 195–212.


  • Madabushi, H.T., Romain, L., Divjak, D. & Milin, P. (2020). CXGBERT: BERT meets construction grammar. In Proceedings of the 28th International Conference on Computational Linguistics (pp. 4020–4032).

  • Manning, C. D., Clark, K., & Hewitt, J. (2020). Emergent linguistic structure in artificial neural networks trained by self-supervision. PNAS, 117(48), 30046–30054.


  • Marcus, G. F. (1998). Rethinking eliminative connectionism. Cognitive Psychology, 37(3), 243–282.


  • Marr, D. (1982). Vision. W.H. Freeman and Company.



  • Matthews, R. J. (2007). The measure of mind: Propositional attitudes and their attribution. Oxford University Press.


  • McCoy, T., Frank, R., & Linzen, T. (2020). Does syntax need to grow on trees? Sources of hierarchical inductive bias in sequence-to-sequence networks. Transactions of the Association for Computational Linguistics, 8, 125–140.


  • McCoy, T., Pavlick, E. & Linzen, T. (2019). Right for the wrong reasons: Diagnosing syntactic heuristics in natural language inference. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (pp. 3428–3448).

  • Mickus, T., Paperno, D., Constant, M. & van Deemter, K. (2020). What do you mean, BERT? Assessing BERT as a distributional semantics model. In Proceedings of the Society for Computation in Linguistics (pp. 350–361).

  • Miller, P. H. (1999). Strong generative capacity: The semantics of linguistic formalism. CSLI Publications.



  • Millikan, R. G. (1993). Content and vehicle. In N. Eilan, R. McCarthy, & B. Brewer (Eds.), Spatial representation (pp. 256–268). Blackwell.



  • Millikan, R. G. (2017). Beyond concepts: Unicepts, language, and natural information. Oxford University Press.


  • Mueller, A., Frank, R., Linzen, T., Wang, L. & Schuster, S. (2022). Coloring the blank slate: Pre-training imparts a hierarchical inductive bias to sequence-to-sequence models. In Findings of the Association for Computational Linguistics: ACL 2022 (pp. 1352–1368).

  • Nadeem, M., Bethke, A. & Reddy, S. (2020). StereoSet: Measuring stereotypical bias in pretrained language models. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (pp. 5356–5371).

  • Neander, K. (2017). A mark of the mental: A defence of informational teleosemantics. MIT Press.


  • Nefdt, R. M. (2023). Language, science, and structure: A journey into the philosophy of linguistics. Oxford University Press.


  • Newmeyer, F. (2010). On comparative concepts and descriptive categories: A reply to Haspelmath. Language, 86(3), 688–695.


  • Odden, D. (2013). Formal phonology. Nordlyd, 40(1), 249–273.


  • OpenAI (2023). GPT-4 technical report (Tech. Rep.).

  • Ott, D. (2017). Strong generative capacity and the empirical base of linguistic theory. Frontiers in Psychology, 7, 8.



  • Pater, J. (2019). Generative linguistics and neural networks at 60: Foundation, friction, and fusion. Language, 95(1), e41–e74.


  • Pennington, J., Socher, R. & Manning, C.D. (2014). GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 1532–1543).

  • Piccinini, G. (2015). Physical computation: A mechanistic account. Oxford University Press.


  • Pinker, S., & Price, A. (1988). On language and connectionism: Analysis of a parallel distributed processing model of language acquisition. Cognition, 28(1–2), 73–193.


  • Poeppel, D., & Embick, D. (2005). Defining the relation between linguistics and neuroscience. In A. Cutler (Ed.), Twenty-first century psycholinguistics: Four cornerstones (pp. 1–16). Lawrence Erlbaum Associates.



  • Postal, P. (2003). Remarks on the foundations of linguistics. The Philosophical Forum, 34(3–4), 233–252.


  • Postal, P. (2009). The incoherence of Chomsky’s ‘biolinguistic’ ontology. Biolinguistics, 3(1), 104–123.


  • Putnam, H. (1988). Representation and reality. MIT Press.



  • Quine, W. V. O. (1970). Methodological reflections on current linguistic theory. Synthese, 21, 386–398.


  • Rey, G. (2020). Representation of language: Philosophical issues in a Chomskyan linguistics. Oxford University Press.


  • Rogers, A., Kovaleva, O., & Rumshisky, A. (2020). A primer in BERTology: What we know about how BERT works. Transactions of the Association for Computational Linguistics, 8, 842–866.


  • Rumelhart, D. E., & McClelland, J. L. (1986). On learning the past tenses of English verbs. In J. L. McClelland, D. E. Rumelhart, & the PDP Research Group (Eds.), Parallel distributed processing: Explorations in the microstructure of cognition: Vol. 2. Psychological and biological models (pp. 216–271). MIT Press.



  • Sennrich, R., Haddow, B. & Birch, A. (2016). Neural machine translation of rare words with subword units. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 1715–1725).

  • Smith, B. C. (2006). Why we still need knowledge of language. Croatian Journal of Philosophy, 6(3), 431–456.



  • Soler, A.G., & Apidianaki, M. (2020). BERT knows Punta Cana is not just beautiful, it’s gorgeous: Ranking scalar adjectives with contextualized representations. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (pp. 7371–7385).

  • Sprevak, M. (2018). Triviality arguments about computational implementation. In M. Sprevak & M. Colombo (Eds.), Routledge handbook of the computational mind (pp. 175–191). Routledge.


  • Swoyer, C. (1991). Structural representation and surrogative reasoning. Synthese, 87(3), 449–508.


  • Tenney, I., Das, D. & Pavlick, E. (2019). BERT rediscovers the classical NLP pipeline. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (pp. 4593–4601).

  • Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Polosukhin, I. (2017). Attention is all you need. In Proceedings of the 31st International Conference on Neural Information Processing Systems (pp. 6000–6010).

  • Weiss, G., Goldberg, Y. & Yahav, E. (2021). Thinking like transformers. In Proceedings of the 38th International Conference on Machine Learning (pp. 11080–11090).


