How big data brings graph theory to a new dimension


Graph theory is not enough.

The mathematical language for talking about connections, which usually depends on networks of vertices (dots) and edges (lines connecting them), has been an invaluable way to model real-world phenomena since at least the 18th century. But a few decades ago, the emergence of giant data sets forced researchers to expand their toolboxes and, at the same time, gave them sprawling sandboxes in which to apply new mathematical insights. Since then, said Josh Grochow, a computer scientist at the University of Colorado, Boulder, there has been an exciting period of rapid growth as researchers have developed new kinds of network models that can find complex structures and signals in the noise of big data.

Grochow is one of a growing number of researchers who point out that graph theory has its limits when it comes to finding connections in big data. A graph represents every relationship as a dyad, or pairwise interaction. However, many complex systems cannot be represented by binary connections alone. Recent progress in the field shows how to move forward.

Consider trying to build a network model of parenting. Clearly, each parent has a connection to the child, but the parenting relationship is not just the sum of those two links, as graph theory might model it. The same goes for trying to model a phenomenon like peer pressure.

“There are many intuitive models. The effect of peer pressure on social dynamics is captured only if you already have groups in your data,” said Leonie Neuhäuser of RWTH Aachen University in Germany. But binary networks do not capture group influences.

Mathematicians and computer scientists use the term “higher-order interactions” to describe these complex ways in which group dynamics, rather than binary links, can influence individual behavior. These mathematical phenomena appear everywhere, from entanglement interactions in quantum mechanics to the trajectory of a disease spreading through a population. If a pharmacologist wants to model drug interactions, for example, graph theory might show how two drugs respond to each other. But what about three drugs? Or four?

Although the tools for exploring these interactions are not new, it is only in recent years that high-dimensional data sets have become an engine for discovery, giving mathematicians and network theorists new ideas. These efforts have yielded interesting results about the limits of graphs and the possibilities for extending them.

“Now we know that the network is just the shadow of the thing,” Grochow said. If a data set has a complex underlying structure, then modeling it as a graph may reveal only a limited projection of the whole story.

Emilie Purvine of the Pacific Northwest National Laboratory is excited about the power of tools such as hypergraphs for mapping out subtler connections between data points.

Photo: Andrea Starr/Pacific Northwest National Laboratory

“We have realized that the data structures we use to study things, from a mathematical point of view, do not quite match what we see in the data,” said the mathematician Emilie Purvine of the Pacific Northwest National Laboratory.

This is why mathematicians, computer scientists, and other researchers are paying more and more attention to how graph theory, in its many forms, can be generalized to explore higher-order phenomena. The past few years have brought a flood of proposed ways to characterize these interactions and to verify them mathematically in high-dimensional data sets.

For Purvine, the mathematical exploration of higher-order interactions is like the mapping of new dimensions. “Think about a graph as a foundation on a two-dimensional plot of land,” she said. The three-dimensional buildings that could go on top of it may vary greatly. “When you are on the ground, they look the same, but the things you build on them are different.”

Enter the hypergraph

The search for those higher-dimensional structures is where the mathematics becomes especially murky and interesting. For example, the higher-order analog of a graph is called a hypergraph, which has “hyperedges” instead of edges. A hyperedge can connect more than two nodes, which means it can represent a multi-way (or multilinear) relationship. Instead of a line, a hyperedge might be seen as a surface, like a tarp staked down in three or more places.
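One way to make the definition concrete is a minimal sketch in Python, assuming a hypergraph is stored simply as a list of node sets. The node labels and the helper names `incidence` and `is_ordinary_graph` are hypothetical, chosen only for illustration; they do not come from any particular library or from the researchers quoted here.

```python
# Hypothetical toy data: each (hyper)edge is just the set of nodes it touches.
# An ordinary graph is the special case where every edge has exactly two nodes.
graph_edges = [frozenset("AB"), frozenset("BC"), frozenset("AC")]

# A hypergraph may mix sizes: one "tarp" pinned to A, B and C, plus a plain link C-D.
hyperedges = [frozenset("ABC"), frozenset("CD")]


def incidence(edges):
    """Map each node to the indices of the (hyper)edges that contain it."""
    table = {}
    for index, edge in enumerate(edges):
        for node in sorted(edge):
            table.setdefault(node, []).append(index)
    return table


def is_ordinary_graph(edges):
    """Return True when every edge joins exactly two nodes."""
    return all(len(edge) == 2 for edge in edges)


if __name__ == "__main__":
    print(is_ordinary_graph(graph_edges))  # every edge is a pair
    print(is_ordinary_graph(hyperedges))   # one edge joins three nodes
    print(incidence(hyperedges))           # node C sits on both hyperedges
```

Storing edges as sets rather than pairs is the only change needed to go from a graph to a hypergraph in this representation, which is why the same `incidence` helper works for both.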

That is all well and good, but much remains unknown about how these structures relate to conventional graphs. Mathematicians are currently learning which rules of graph theory also apply to higher-order interactions, which points to new areas of exploration.

To illustrate the kinds of relationships that a hypergraph can tease out of a large data set (and an ordinary graph cannot), Purvine gave a simple example close to home: the world of scientific publishing. Imagine two data sets, each containing papers co-authored by up to three mathematicians; for simplicity, call them A, B, and C. One data set contains six papers, two by each of the three distinct pairs (AB, AC, and BC). The other contains only two papers in total, each co-authored by all three mathematicians (ABC).
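The two co-authorship data sets can be written out directly: six papers, two per pair of authors, versus two papers by all three. The following is a minimal sketch, not Purvine's own analysis, and the function names `pairwise_projection` and `hyperedge_counts` are hypothetical. It reduces each data set to a weighted graph of author pairs and, separately, keeps each paper as a single hyperedge.

```python
from collections import Counter
from itertools import combinations

# Data set 1: six papers, two by each pair of authors.
pairwise_papers = [("A", "B"), ("A", "B"), ("A", "C"),
                   ("A", "C"), ("B", "C"), ("B", "C")]

# Data set 2: two papers, each written by all three authors.
joint_papers = [("A", "B", "C"), ("A", "B", "C")]


def pairwise_projection(papers):
    """Graph view: count how many papers each pair of authors shares."""
    weights = Counter()
    for authors in papers:
        for pair in combinations(sorted(authors), 2):
            weights[pair] += 1
    return weights


def hyperedge_counts(papers):
    """Hypergraph view: keep each paper's full author set as one hyperedge."""
    return Counter(frozenset(authors) for authors in papers)


if __name__ == "__main__":
    # Both data sets collapse to the same triangle, even with edge weights.
    print(pairwise_projection(pairwise_papers) == pairwise_projection(joint_papers))
    # The hypergraph view keeps them apart: three pair edges versus one triple.
    print(hyperedge_counts(pairwise_papers) == hyperedge_counts(joint_papers))
```

In this sketch, each pair of authors shares exactly two papers in both data sets, so the pairwise view cannot tell them apart; only the hyperedges record whether the collaboration happened in pairs or as a trio.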

