Adger, D. (2022). What are linguistic representations? Mind & Language, 37(2), 248–260.
Behme, C. (2015). Is the ontology of biolinguistics coherent? Language Sciences, 47, 32–42.
Belinkov, Y., & Glass, J. (2019). Analysis methods in neural language processing: A survey. Transactions of the Association for Computational Linguistics, 7, 49–72.
Benacerraf, P. (1973). Mathematical truth. Journal of Philosophy, 70(19), 661–679.
Blaho, S. (2007). The syntax of phonology: A radically substance-free approach (PhD Thesis). University of Tromsø.
Bloomfield, L. (1933). Language. Henry Holt.
Bloomfield, L. (1936). Language or ideas? Language, 12(2), 89–95.
Boone, W., & Piccinini, G. (2016). Mechanistic abstraction. Philosophy of Science, 83(5), 686–697.
Brentano, F. (1874/1911). Psychology from an empirical standpoint. Routledge and Kegan Paul.
Brunila, M., & LaViolette, J. (2022). What company do words keep? Revisiting the distributional semantics of J.R. Firth & Zellig Harris. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (pp. 4403–4417).
Buckner, C. (2018). Empiricism without magic: Transformational abstraction in deep convolutional neural networks. Synthese, 195(12), 5339–5372.
Burge, T. (1986). Individualism and psychology. The Philosophical Review, 95(1), 3–45.
Cappelen, H., & Dever, J. (2021). Making AI intelligible: Philosophical foundations. Oxford University Press.
Chalmers, D. J. (1995). On implementing a computation. Minds and Machines, 4, 391–402.
Chi, E.A., Hewitt, J. & Manning, C.D. (2020). Finding universal grammatical relations in multilingual BERT. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (pp. 5564–5577).
Chomsky, N. (1957). Syntactic structures. Mouton.
Chomsky, N. (1965). Aspects of the theory of syntax. MIT Press.
Chomsky, N. (1975). The logical structure of linguistic theory. Plenum Press.
Chomsky, N. (1980). Rules and representations. Columbia University Press.
Chomsky, N. (1986). Knowledge of language. Praeger.
Chomsky, N. (1995). The minimalist program. MIT Press.
Chomsky, N. (2012). The science of language. Cambridge University Press.
Chomsky, N., & Halle, M. (1968). The sound pattern of English. Harper & Row.
Coenen, A., Reif, E., Yuan, A., Kim, B., Pearce, A., Viégas, F. & Wattenberg, M. (2019). Visualizing and measuring the geometry of BERT. In Proceedings of the 33rd Conference on Neural Information Processing Systems (pp. 8592–8600).
Collins, J. (2014). Representations without representa: Content and illusion in linguistic theory. In P. Stalmaszczyk (Ed.), Semantics and beyond: Philosophical and linguistic inquiries (pp. 27–64). De Gruyter.
Collins, J. (2023). Internalist priorities in a philosophy of words. Synthese, 201(3), 110.
Collins, J., & Rey, G. (2021). Chomsky and intentionality. In N. Allott, T. Lohndal, & G. Rey (Eds.), A companion to Chomsky (pp. 488–502). Wiley.
Croft, W. A. (2001). Radical construction grammar: Syntactic theory in typological perspective. Oxford University Press.
Danilevsky, M., Qian, K., Aharonov, R., Katsis, Y., Kawas, B. & Sen, P. (2020). A survey of the state of explainable AI for natural language processing. In Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing (pp. 447–459).
Dennett, D. C. (1991). Consciousness explained. Little, Brown and Company.
Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (pp. 4171–4186).
Dretske, F. I. (1981). Knowledge and the flow of information. MIT Press.
Dunbar, E. (2019). Generative grammar, neural networks, and the implementational mapping problem: Response to Pater. Language, 95(1), e87–e98.
Dupre, G. (2022). Georges Rey’s representation of language. BJPS Review of Books. Retrieved from https://www.thebsps.org/reviewofbooks/dupre-on-rey/
Egan, F. (2010). Computational models: A modest role for content. Studies in History and Philosophy of Science, 41(3), 253–259.
Egan, F. (2014). How to think about mental content. Philosophical Studies, 170(1), 115–135.
Egan, F. (2017). Function-theoretic explanation and the search for neural mechanisms. In D. Kaplan (Ed.), Explanation and integration in mind and brain science (pp. 145–163). Oxford University Press.
Egan, F. (2018). The nature and function of content in computational models. In M. Sprevak & M. Colombo (Eds.), The Routledge handbook of the computational mind (pp. 247–258). Routledge.
Facchin, M. (2022). Troubles with mathematical contents. Philosophical Psychology, 5, 1–24.
Favela, L. H., & Machery, E. (2023). Investigating the concept of representation in the neural and psychological sciences. Frontiers in Psychology, 5, 14.
Fodor, J. A. (1981). Some notes on what linguistics is about. In N. Block (Ed.), Readings in philosophy of psychology, Vol. 2 (pp. 197–207). Harvard University Press.
Fodor, J. A. (1990). A theory of content and other essays. MIT Press.
Gastaldi, J. L., & Pellissier, L. (2021). The calculus of language: Explicit representation of emergent linguistic structure through type-theoretical paradigms. Interdisciplinary Science Reviews, 46(4), 569–590.
Gleitman, L. (2021). Language as a branch of psychology: Chomsky and cognitive science. In N. Allott, T. Lohndal, & G. Rey (Eds.), A companion to Chomsky (pp. 109–122). Wiley.
Goldberg, A. E. (2006). Constructions at work: The nature of generalization in language. Oxford University Press.
Harris, Z. S. (1951). Methods in structural linguistics. The University of Chicago Press.
Haspelmath, M. (2010). Comparative concepts and descriptive categories in crosslinguistic studies. Language, 86(3), 663–687.
Haspelmath, M. (2020). Human linguisticality and the building blocks of languages. Frontiers in Psychology, 10, 3056.
Hewitt, J., & Manning, C.D. (2019). A structural probe for finding syntax in word representations. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) (pp. 4129–4138).
Immer, A., Hennigen, L.T., Fortuin, V. & Cotterell, R. (2022). Probing as quantifying inductive bias. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (pp. 1839–1851).
Iosad, P. (2017). A substance-free framework for phonology: An analysis of the Breton dialect of Bothoa. Edinburgh University Press.
Jackson, F. (1977). Perception: A representative theory. Cambridge University Press.
Jawahar, G., Sagot, B. & Seddah, D. (2019). What does BERT learn about the structure of language? In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (pp. 3651–3657).
Jelinek, F. (2005). Some of my best friends are linguists. Language Resources and Evaluation, 39(1), 25–34.
Kaplan, D. (2011). Explanation and description in computational neuroscience. Synthese, 183(3), 339–373.
Karlsson, F. (2006). Recursion in natural languages. In Advances in Natural Language Processing, 5th International Conference on NLP, FinTAL 2006 (p. 1).
Katz, J. (1981). Language and other abstract objects. Rowman and Littlefield.
Kovaleva, O., Romanov, A., Rogers, A. & Rumshisky, A. (2019). Revealing the dark secrets of BERT. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (pp. 4365–4374).
Kripke, S. (1980). Naming and necessity. Harvard University Press.
Kulmizev, A., & Nivre, J. (2022). Schrödinger’s tree – on syntax and neural language models. Frontiers in Artificial Intelligence, 5, 85.
Kulmizev, A., Ravishankar, V., Abdou, M. & Nivre, J. (2020). Do neural language models show preferences for syntactic formalisms? In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (pp. 4077–4091).
Kuokkanen, J. (2022). Vertical-horizontal distinction in resolving the abstraction, hierarchy, and generality problems of the mechanistic account of physical computation. Synthese, 200(3), 247.
Kuznetsov, I., & Gurevych, I. (2020). A matter of framing: The impact of linguistic formalism on probing results. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (pp. 171–182).
Lakoff, G. (1990). The invariance hypothesis: Is abstract reason based on image-schemas? Cognitive Linguistics, 1(1), 39–74.
Langacker, R. W. (1987). Foundations of cognitive grammar, Vol. 1: Theoretical prerequisites. Stanford University Press.
Lasri, K., Pimentel, T., Lenci, A., Poibeau, T. & Cotterell, R. (2022). Probing for the usage of grammatical number. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics Volume 1: Long Papers (pp. 8818–8831).
Laurence, S. (2003). Is linguistics a branch of psychology? In A. Barber (Ed.), Epistemology of language (pp. 69–106). Oxford University Press.
Levine, R. (2018). ‘Biolinguistics’: Some foundational problems. In C. Behme & M. Neef (Eds.), Essays on linguistic realism (pp. 21–60). John Benjamins Publishing Company.
Levy, A. (2013). Three kinds of new mechanism. Biology and Philosophy, 28(1), 99–114.
Lewis, D. (1970). How to define theoretical terms. Journal of Philosophy, 67(13), 426–446.
Li, J., Cotterell, R. & Sachan, M. (2022). Probing via prompting. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (pp. 1144–1157).
Li, L., Ma, R., Guo, Q., Xue, X. & Qiu, X. (2020). BERT-ATTACK: Adversarial attack against BERT using BERT. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 6193–6202).
Linzen, T., & Baroni, M. (2021). Syntactic structure from deep learning. Annual Review of Linguistics, 7, 195–212.
Madabushi, H.T., Romain, L., Divjak, D. & Milin, P. (2020). CxGBERT: BERT meets construction grammar. In Proceedings of the 28th International Conference on Computational Linguistics (pp. 4020–4032).
Manning, C. D., Clark, K., Hewitt, J., Khandelwal, U., & Levy, O. (2020). Emergent linguistic structure in artificial neural networks trained by self-supervision. Proceedings of the National Academy of Sciences, 117(48), 30046–30054.
Marcus, G. F. (1998). Rethinking eliminative connectionism. Cognitive Psychology, 37(3), 243–282.
Marr, D. (1982). Vision. W.H. Freeman and Company.
Matthews, R. J. (2007). The measure of mind: Propositional attitudes and their attribution. Oxford University Press.
McCoy, T., Frank, R., & Linzen, T. (2020). Does syntax need to grow on trees? Sources of hierarchical inductive bias in sequence-to-sequence networks. Transactions of the Association for Computational Linguistics, 8, 125–140.
McCoy, T., Pavlick, E. & Linzen, T. (2019). Right for the wrong reasons: Diagnosing syntactic heuristics in natural language inference. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (pp. 3428–3448).
Mickus, T., Paperno, D., Constant, M. & van Deemter, K. (2020). What do you mean, BERT? Assessing BERT as a distributional semantics model. In Proceedings of the Society for Computation in Linguistics (pp. 350–361).
Miller, P. H. (1999). Strong generative capacity: The semantics of linguistic formalism. CSLI Publications.
Millikan, R. G. (1993). Content and vehicle. In N. Eilan, R. McCarthy, & B. Brewer (Eds.), Spatial representation (pp. 256–268). Blackwell.
Millikan, R. G. (2017). Beyond concepts: Unicepts, language, and natural information. Oxford University Press.
Mueller, A., Frank, R., Linzen, T., Wang, L. & Schuster, S. (2022). Coloring the blank slate: Pre-training imparts a hierarchical inductive bias to sequence-to-sequence models. In Findings of the Association for Computational Linguistics: ACL 2022 (pp. 1352–1368).
Nadeem, M., Bethke, A. & Reddy, S. (2020). StereoSet: Measuring stereotypical bias in pretrained language models. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (pp. 5356–5371).
Neander, K. (2017). A mark of the mental: In defense of informational teleosemantics. MIT Press.
Nefdt, R. M. (2023). Language, science, and structure: A journey into the philosophy of linguistics. Oxford University Press.
Newmeyer, F. (2010). On comparative concepts and descriptive categories: A reply to Haspelmath. Language, 86(3), 688–695.
Odden, D. (2013). Formal phonology. Nordlyd, 40(1), 249–273.
OpenAI (2023). GPT-4 technical report (Tech. Rep.).
Ott, D. (2017). Strong generative capacity and the empirical base of linguistic theory. Frontiers in Psychology, 7, 8.
Pater, J. (2019). Generative linguistics and neural networks at 60: Foundation, friction, and fusion. Language, 95(1), e41–e74.
Pennington, J., Socher, R. & Manning, C.D. (2014). GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 1532–1543).
Piccinini, G. (2015). Physical computation: A mechanistic account. Oxford University Press.
Pinker, S., & Prince, A. (1988). On language and connectionism: Analysis of a parallel distributed processing model of language acquisition. Cognition, 28(1–2), 73–193.
Poeppel, D., & Embick, D. (2005). Defining the relation between linguistics and neuroscience. In A. Cutler (Ed.), Twenty-first century psycholinguistics: Four cornerstones (pp. 1–16). Lawrence Erlbaum Associates.
Postal, P. (2003). Remarks on the foundations of linguistics. The Philosophical Forum, 34(3–4), 233–252.
Postal, P. (2009). The incoherence of Chomsky’s ‘biolinguistic’ ontology. Biolinguistics, 3(1), 104–123.
Putnam, H. (1988). Representation and reality. MIT Press.
Quine, W. V. O. (1970). Methodological reflections on current linguistic theory. Synthese, 21, 386–398.
Rey, G. (2020). Representation of language: Philosophical issues in a Chomskyan linguistics. Oxford University Press.
Rogers, A., Kovaleva, O., & Rumshisky, A. (2020). A primer in BERTology: What we know about how BERT works. Transactions of the Association for Computational Linguistics, 8, 842–866.
Rumelhart, D. E., & McClelland, J. L. (1986). On learning the past tenses of English verbs. In J. L. McClelland, D. E. Rumelhart, & the PDP Research Group (Eds.), Parallel distributed processing: Explorations in the microstructure of cognition: Vol. 2. Psychological and biological models (pp. 216–271). MIT Press.
Sennrich, R., Haddow, B. & Birch, A. (2016). Neural machine translation of rare words with subword units. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 1715–1725).
Smith, B. C. (2006). Why we still need knowledge of language. Croatian Journal of Philosophy, 6(3), 431–456.
Soler, A.G., & Apidianaki, M. (2020). BERT knows Punta Cana is not just beautiful, it’s gorgeous: Ranking scalar adjectives with contextualized representations. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (pp. 7371–7385).
Sprevak, M. (2018). Triviality arguments about computational implementation. In M. Sprevak & M. Colombo (Eds.), The Routledge handbook of the computational mind (pp. 175–191). Routledge.
Swoyer, C. (1991). Structural representation and surrogative reasoning. Synthese, 87(3), 449–508.
Tenney, I., Das, D. & Pavlick, E. (2019). BERT rediscovers the classical NLP pipeline. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (pp. 4593–4601).
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł. & Polosukhin, I. (2017). Attention is all you need. In Proceedings of the 31st International Conference on Neural Information Processing Systems (pp. 6000–6010).
Weiss, G., Goldberg, Y. & Yahav, E. (2021). Thinking like transformers. In Proceedings of the 38th International Conference on Machine Learning (pp. 11080–11090).