LIS 590SOS / CS 598SOS: Self-Organizing Information Systems

This ongoing research seminar investigates the intersection of self-organization and interacting computer-based information systems. It focuses on the emergence and evolution of communication and language as the prototype or "model organism" for self-organizing information systems. We examine numerous computational models of language evolution, using them as a basis for thinking about information and self-organization in distributed systems of many kinds. This is a rewarding and fun process, because:

  1. Language is the principal human information system, and a foundation of every computational information system.

  2. Language is an inherently collective, distributed, and social phenomenon: there is no single-agent language.

  3. All languages evolve: language is a complex adaptive system.

  4. Successful languages must be stable enough for reliable communication, yet constantly adaptive to new agents and tasks; they solve the problem of maintaining "flexible coherence." Insights from studies of language self-organization can generalize to other "flexibly coherent" systems.

  5. Language evolution provides a compelling arena for grappling with thorny issues of grounded meaning and distributed semantics, which impact all distributed information systems.

  6. Language evolution is currently a very active research topic. Researchers are working to discover general underlying mathematical, computational, and implementation principles that explain where human language came from, how it changes, and how artificial agents (robots, software agents, dynamic databases, web services, ontology managers) might develop, reconcile, and sustain their own sophisticated representation and communication regimes from the ground up.

  7. Computational language evolution models provide useful accounts and theories for many other critical adaptive information problems, including adaptive information organizations (shared subject indices, ontologies, folksonomies, collaborative tagging); database interoperability; resource description/discovery regimes like "ad-words," web service brokering, and digital library metadata; P2P information retrieval (IR) systems; schema reconciliation for databases; adaptive genomics; and cellular and bacterial signaling systems and gene regulatory networks.

Over several past semesters, participants have developed conference papers, journal papers, and both M.S. and Ph.D. thesis topics from their work in this course.

Subject Matter

Subject matter is organized under the core and special topics listed in the Calendar section below.

Starred (*) items represent core knowledge of the field. Other topics are woven in, and readings from these areas are chosen as the interests of participants dictate.

Because this is a research seminar, we give significant attention not only to the subject matter of the course material but also to the research methods, history, and scientific context of the ideas treated.

Almost all the papers we will read are contained in the UIUC Language Evolution and Computation web repository. Exploring the repository gives a glimpse of the course content and style.

The course is a continuation of an ongoing seminar, and may be repeated for credit since the content differs from semester to semester.

Format

The format of the course is reading, analysis, presentation, and discussion of research papers. This is coupled throughout the semester with joint project work and writing in areas of individual interest. Students join groups covering each of the core subject areas. Each group leads the class in developing, presenting, and discussing new knowledge in its area. This allows students to build a deep understanding of their chosen areas, as well as a broader grasp of the areas led by other groups. Typically, two or three different papers are covered thoroughly each week. In some weeks we may treat several papers on the same thread or theme from the same researchers; in that case more papers are covered, though their content overlaps.

Prerequisite Knowledge

Virtually all papers treated in the course include mathematical and/or computational models. Prospective students should have a general understanding of principles of computation, computer programming, computer systems, and probability. Remedial discussions are held on advanced topics in these areas as needed. Participants should also be comfortable with exploring and learning about new mathematical theories, models, and expressions.

Relevance to other UIUC courses and programs

To my knowledge, there is no other course at UIUC that treats either self-organizing information systems or language emergence and evolution. This course is complementary to existing UIUC courses in the following areas, but doesn't overlap with any of them:

This course will also fit as part of the new Certificate of Advanced Study in Language and Speech Processing at UIUC (http://lsp.lang.uiuc.edu/). Cross-listing with Linguistics is also being pursued.

Calendar

Core topics

The course covers each of the following topics for the number of weeks shown in a rotating schedule, for a total of 15 weeks. Representative sample papers for each topic are given below.

Overviews, surveys, and background* (1 week)

Fitch, W. T. (2005) The Evolution of Language: A Comparative Review. Biology and Philosophy, 20(2-3):193--203.

Wang, W. S-Y. and Minett, J. W. (2005) The invasion of language: emergence, change and death. Trends in Ecology and Evolution, 20(5):263--269.

Wagner, K., Reggia, J. A., Uriagereka, J., and Wilkinson, G. S. (2003) Progress in the simulation of emergent communication and language. Adaptive Behavior, 11(1):37--69.

Nowak, M. A., Komarova, N. L., and Niyogi, P. (2002) Computational and evolutionary aspects of language. Nature, 417:611--617.

Emergence of Symbols and Symbolization* (3 weeks)

Allen, M., Goldman, C. V., and Zilberstein, S. (2005) Learning to Communicate in Decentralized Systems. In Proceedings of the Workshop on Multiagent Learning, AAAI-05, pages 1--8. Pittsburgh, PA.

Taddeo, M. and Floridi, L. Solving the symbol grounding problem: a critical overview of fifteen years of research. To appear in Journal of Theoretical and Experimental Artificial Intelligence.

Vogt, P. Language Evolution and Robotics: Issues on Symbol Grounding and Language Acquisition. In A. Loula, R. Gudwin, and J. Queiroz, editors, Artificial Cognitive Systems.

Hutchins, E. and Hazlehurst, B. (1995) How to Invent a Lexicon: The Development of Shared Symbols in Interaction. In G. N. Gilbert and R. Conte, editors, Artificial Societies: The Computer Simulation of Social Life. London: UCL Press.

Emergence of Structure* (3 weeks)

Solé, R. V. (2005) Syntax for free? Nature, 434:289.

Vogt, P. (2005) The emergence of compositional structures in perceptually grounded language games. Artificial Intelligence, 167(1-2):206--242.

Steels, L. (2004) Constructivist Development of Grounded Construction Grammars. In Daelemans, W., editor, Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Pereira, F. (2000) Formal grammar and information theory: Together again? Philosophical Transactions of the Royal Society, 358(1769):1239--1253.

Nowak, M. A., Plotkin, J. B., and Jansen, V. A. A. (2000) The evolution of syntactic communication. Nature, 404:495--498.

Hashimoto, T. and Ikegami, T. (1996) Emergence of net-grammar in communicating agents. Biosystems, 38(1):1--14.

Flexible Coherence, Convergence and Collectivity* (3 weeks)

Nakamura, M., Hashimoto, T., and Tojo, S. (2005) Language Change in Modified Language Dynamics Equation by Memoryless Learners. In Second International Symposium on the Emergence and Evolution of Linguistic Communication.

Baronchelli, A., Felici, M., Caglioti, E., Loreto, V., and Steels, L. (2005) Sharp Transition towards Shared Vocabularies in Multi-Agent Systems. ArXiv, 2005.

Matsen, F. A. and Nowak, M. A. (2004) Win-stay, lose-shift in language learning from peers. PNAS, 101(52):18053--18057.

Cucker, F., Smale, S., and Zhou, D-X. (2004) Modeling Language Evolution. Foundations of Computational Mathematics, 4(3):315--343.

Delgado, J. (2002) Emergence of social conventions in complex networks. Artificial Intelligence, 141(1).

Language Evolution as a General Model for Adaptive Infosystems* (3 weeks)

Collier, T. C. and Taylor, C. E. (2004) Self-Organization in Sensor Networks. Journal of Parallel and Distributed Computing, 64(7):866--873.

Steels, L. (2004) Analogies between Genome and Language Evolution. In Pollack, J. et al., editors, ALife 9. Cambridge, MA: The MIT Press.

Jacob, E. B., Becker, I., Shapira, Y., and Levine, H. (2004) Bacterial linguistic communication and social intelligence. Trends in Microbiology, 12(8).

Staab, S., Santini, S., Nack, F., Steels, L., and Maedche, A. (2002) Emergent semantics. IEEE Intelligent Systems, 17(1):78--86.

Searls, D. B. (2002) The language of genes. Nature, 420:211--217.

Aberer, K. et al. (2002) A Framework for Semantic Gossiping. SIGMOD Record.

Buckland, M. (1999) Vocabulary as a Central Concept in Library and Information Science. In T. Arpanac et al., editors, Digital Libraries: Interdisciplinary Concepts, Challenges, and Opportunities. Proceedings of the Third International Conference on Conceptions of Library and Information Science (CoLIS3), Dubrovnik, Croatia, 23-26 May 1999, pages 3--12. Zagreb: Lokve.

Greenberg, J. H. (1992) Preliminaries to a Systematic Comparison Between Biological and Linguistic Evolution. In Hawkins, John A. and Murray Gell-Mann, editors, The Evolution of Human Languages. Reading, MA: Addison-Wesley.


Special topics, selected (approximately 1 week)

These topics are covered and discussed as they occur in papers in the core sections. In addition, depending on the interests of participants, some papers in these areas may be analyzed, presented, and discussed directly.

Emergence of Signals and Signalling

Hauser, M. D., Chomsky, N., and Fitch, W. T. (2002) The Faculty of Language: What Is It, Who Has It, and How Did It Evolve? Science, 298:1569--1579.

Krakauer, D. and Pagel, M. (1995) Spatial structure and the evolution of honest cost-free signalling. Proceedings of The Royal Society of London. Series B, Biological Sciences, 260:365--372.

Emergence of Ontological Categories

Roesner, D. and Kunze, M. (2002) Exploiting Sublanguage and Domain Characteristics in a Bootstrapping Approach to Lexicon and Ontology Creation. In Proceedings of the OntoLex 2002 - Ontologies and Lexical Knowledge Bases, pages 68--73.

Steels, L. (1998) The Origins of Ontologies and Communication Conventions in Multi-Agent Systems. Autonomous Agents and Multi-Agent Systems, 1(2):169--194.

Cohen, P. R. (1998) Growing Ontologies. Technical report, Computer Science Department, University of Massachusetts at Amherst.

Initial Conditions and Origins of Language

Fitch, W. T., Hauser, M. D., and Chomsky, N. (2005) The evolution of the language faculty: Clarifications and implications. Cognition.

Jackendoff, R. and Pinker, S. (2005) The nature of the language faculty and its implications for evolution of language (Reply to Fitch, Hauser, and Chomsky). Cognition.

Wiles, J., Watson, J., Tonkes, B., and Deacon, T. (2002) Strange Loops in Learning and Evolution. Presented at ICCS02, submitted to InterJournal (May 2002).

Nowak, M. A., Komarova, N. L., and Niyogi, P. (2001) Evolution of Universal Grammar. Science, 291:114--118.

Linking Language and Action

Cohen, P. R. (2000) Learning Concepts by Interaction. Technical report, Computer Science Department, University of Massachusetts at Amherst.

Werner, G. and Dyer, M. (1992) Evolution of Communication in Artificial Organisms. In C. Langton and C. Taylor and D. Farmer and S. Rasmussen, editors, Artificial Life II, pages 659--687. Redwood City, CA: Addison-Wesley Pub.