(Re)discovering music theory: AI algorithm learns the rules of musical composition and provides a framework for knowledge discovery


Michael O'Boyle

Professor Lav Varshney

Can an AI algorithm learn the rules of Western music theory without any prior knowledge of music? And can it express them in a simple form that humans can understand?

According to Lav Varshney, professor of electrical & computer engineering at the University of Illinois Urbana-Champaign, the answer is “Yes!” His work with former graduate student Haizi Yu introduces a new machine learning paradigm called information lattice learning. As they discuss in their article published in IEEE BITS magazine, the algorithm analyzes datasets by looking for simple, easy-to-understand rules, mimicking how humans make sense of unfamiliar ideas or data. When applied to 370 works by the late-Baroque composer Johann Sebastian Bach, it correctly identified the principles of music composition found in standard textbooks, and it expressed them in a form that music students could understand.

Information lattice learning has the potential to enable knowledge discovery in disciplines beyond music by determining simple principles governing datasets. “It has been used to learn fundamental laws in neuroscience, chemistry, and quantum optics, and it does this with no prior knowledge built in and very little data,” Varshney said.

A student learning from a teacher

The algorithm identifies the principles underlying a musical style with an architecture inspired by the interaction between a student and a teacher. A “student” produces the most random composition consistent with a provided set of rules. A “teacher” then compares the result to a body of work in the target style and creates a new rule to bring the student closer to that style. The teacher strives to create rules that are as simple as possible to facilitate human understanding.
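The loop above can be sketched in a few lines of Python. This is a deliberately minimal illustration, not the authors' implementation: the names, the toy corpus, and the single "restrict to these pitch classes" rule are all my own simplifications. The target "style" here is melodies confined to the C major scale, and the teacher issues one simple rule once it sees the student can stray outside it.

```python
import random

# Toy corpus in the target style: pitch classes 0-11, with 0 = C.
CORPUS = [[0, 2, 4, 5, 7, 9, 11, 7, 4, 2]]

def student_support(rules):
    """Pitch classes the student may use: everything the rules allow.
    With no rules, the student is maximally random (all 12 classes)."""
    allowed = set(range(12))
    for rule in rules:
        allowed &= rule
    return allowed

def teacher(corpus, support):
    """Compare the student's range to the corpus; if the student can
    stray outside the style, hand back one simple restricting rule."""
    corpus_pcs = {pc for melody in corpus for pc in melody}
    return corpus_pcs if not support <= corpus_pcs else None

rules = []
while True:
    new_rule = teacher(CORPUS, student_support(rules))
    if new_rule is None:
        break  # the student's compositions now stay within the style
    rules.append(new_rule)

# The student composes at random, but within the learned rules.
melody = [random.choice(sorted(student_support(rules))) for _ in range(8)]
```

After one round, the only learned rule is "use the pitch classes of the C major scale," and every melody the student produces respects it. The real algorithm compares distributions over many candidate abstractions rather than a single support set, but the alternating sample-then-constrain structure is the same.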

Varshney notes that this emulates an idea from psychology where people develop a stronger understanding of new concepts by talking about them with others. “There is work showing that communication between agents about novel phenomena leads to better abstraction and clearer knowledge discovery, and we seem to have rediscovered that,” he said.

Rules from symmetry

The generated rules are mathematical constraints between characteristics of the data. Without prior knowledge, these characteristics can only be patterns, or symmetries, observed in the data. “This is the foundation of information lattice learning,” according to Varshney. “When considering data with no prior knowledge, the only thing one can really see is symmetry.”

Before the student-teacher cycle begins, the algorithm must define a set of possible characteristics from which the teacher draws when constructing rules. One way is to draw from Core Knowledge, a set of abilities that cognitive scientists believe are innate to all humans, including counting, sorting, and elementary pattern recognition, to create a set of symmetries. Varshney compares this set, called the partition lattice, to a universe of concepts.

For example, the C and D major scales are constructed by assembling notes based on the same pattern. The only difference is the starting note. This fact can be represented as a mathematical symmetry by noting that scales which differ only in their starting note belong to the same class. This is one concept that could be included in the partition lattice.
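This transposition symmetry is easy to make concrete. The sketch below (function names are mine, for illustration) builds major scales from the standard whole/half-step pattern and shows that C major and D major share the same interval pattern, so they fall into the same equivalence class:

```python
def major_scale(root):
    """Pitch classes of a major scale built from the whole/half-step
    pattern W-W-H-W-W-W-H, starting at `root` (0 = C, 2 = D, ...)."""
    steps = [2, 2, 1, 2, 2, 2, 1]
    pcs, pc = [], root
    for step in steps[:-1]:   # the last step returns to the octave
        pcs.append(pc)
        pc = (pc + step) % 12
    pcs.append(pc)
    return pcs

def interval_pattern(scale):
    """The transposition-invariant feature: successive intervals mod 12.
    Scales with the same pattern belong to the same class."""
    return [(b - a) % 12 for a, b in zip(scale, scale[1:])]

c_major = major_scale(0)   # [0, 2, 4, 5, 7, 9, 11]
d_major = major_scale(2)   # [2, 4, 6, 7, 9, 11, 1]

# Different notes, identical interval pattern: one equivalence class.
assert interval_pattern(c_major) == interval_pattern(d_major)
```

Grouping all twelve major scales into a single class in this way is exactly the kind of partition, induced by a symmetry, that the lattice is built from.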

An unexpected connection

As Varshney and Yu were developing these ideas, they learned that Claude Shannon, the engineering theorist who founded the field of information theory, proposed a similar construct in 1950. The partition lattice generalizes what Shannon called the information lattice, a framework for comparing information content between sources. The difference is that the partition lattice does not impose quantitative measures of correlation, so only relative structure and hierarchy remain. Correlations can then be learned from the data.
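The "relative structure and hierarchy" of a partition lattice comes from the refinement ordering: one partition sits below another when each of its blocks fits inside a block of the coarser one. A minimal sketch (the partitions and names here are illustrative, not drawn from the paper):

```python
def refines(p, q):
    """True if partition p refines partition q, i.e. every block of p
    lies inside some block of q. Refinement orders the lattice."""
    return all(any(block <= b for b in q) for block in p)

# Toy partitions of the 12 pitch classes, from finest to coarsest:
by_pitch_class = [{pc} for pc in range(12)]               # finest
even_vs_odd = [set(range(0, 12, 2)), set(range(1, 12, 2))]
everything = [set(range(12))]                             # coarsest

# The finest partition refines all others; nothing refines it back.
assert refines(by_pitch_class, even_vs_odd)
assert refines(even_vs_odd, everything)
assert not refines(everything, by_pitch_class)
```

Because refinement is a partial order rather than a numerical measure, the lattice records only which abstractions are coarser or finer than which, and the quantitative relationships are left to be learned from data.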

Varshney explained that this has proven to be a powerful idea, saying, “An important breakthrough was the realization that clusters form a hierarchical structure, so the knowledge that is discovered will also naturally have a hierarchical structure. This is how human knowledge builds on top of itself.”

AI co-creativity

To evaluate the success of information lattice learning in identifying the underlying laws of music theory, Varshney and Yu collaborated with Illinois professor of music Heinrich Taube. They gave students in the CS + Music program the algorithm-generated rules for Bach’s compositions and asked the students to interpret them. In the vast majority of cases, the students correctly identified the human-language equivalents of the rules. The program even identified rules that had not previously been codified.

In addition, Yu, who was later a postdoctoral scholar at the University of Chicago’s Knowledge Lab, leads the company Kocree, Inc., which is using features from information lattice learning to develop an environment where users can collaboratively create music. This co-creativity environment is used in outreach that engages youth in STEM through musical composition, in collaboration with Hip Hop Xpress at the University of Illinois as well as the Musical Arts Institute in Chicago, the House of Miles in East St. Louis, Illinois, and My Music Ed in Dayton, Ohio.