REFLECTIONS ON THE EVOLUTION OF LANGUAGE PROCESSING IN AI: FROM COGNITIVE THEORIES TO PRACTICAL APPLICATIONS

Giacomo Ferrari

Università del Piemonte Orientale, Vercelli, Italy

Abstract

The evolution of Artificial Intelligence (AI) techniques applied to natural language processing (NLP) reflects a shift from models of the human mind inspired by cognitive science to purely engineering-driven systems such as transformer-based architectures. Early AI approaches, grounded in cognitive science, not only produced notable computational results but also stimulated innovative research in linguistics, particularly in semantics and pragmatics. In contrast, the rise of data-driven, machine-learning-based methods made extensive linguistic datasets available, enabling new insights into underexplored aspects of language. The convergence of quantitative linguistics and machine learning has led to complex statistical models capable of predicting linguistic structures with high accuracy. Recent advances in transformer technology have enabled systems such as ChatGPT to respond effectively to user inputs in natural language. While these tools are remarkably efficient, they do not contribute original theoretical insights into the nature of language. This article traces the historical trajectory of AI in NLP, assesses its linguistic implications, and highlights the limitations of current models in advancing our understanding of language.

Keywords: Artificial intelligence; Natural language processing; Large language models; Cognitive approach; Transformer technology.

How to cite this article: Ferrari, G. (2023). Reflections on the evolution of language processing in AI: From cognitive theories to practical applications. Journal of Linguistic and Intercultural Education - JoLIE, 16(1), 21–36. https://doi.org/10.29302/jolie.2023.16.1.2
