BIGDATA: F: DKA: CSD: Iterative Crowdsourced Hypothesis Generation

PI: James Bagrow
Co-PIs: Joshua Bongard, Christopher Danforth, Peter Dodds, Paul Hines

Establishing causal relationships -- for example, that cigarette smoking causes lung cancer -- is one of the most challenging aspects of scientific research. Computers excel at calculation, but are unable to separate cause-and-effect from mere correlation. Humans, on the other hand, can draw logical conclusions based on their experiences but, in the modern era of Big Data, there are far too many potential relationships for humans to examine manually. This research aims to build a crowdsourcing web platform that uses the knowledge of interested non-experts (Hunch) and the algorithmic power of computers (Crunch) to discover and test causal relationships in large-scale data. Algorithms identify potential relationships and users are asked to validate them. Further, users are able to propose their own hypotheses that can subsequently be validated, creating an accelerating feedback loop of scientific discovery. The goal of systematically discovering causal relationships has the potential for broad societal impact, and virtually anyone with web access can participate directly in this scientific research.

To support this goal, the researchers are developing novel statistical methods that determine the data types of crowd-suggested observables on the fly. For example, are 'wages' and 'gender' real-valued or binary variables? Finally, the crowd is a relatively limited resource. To use it efficiently, machine learning algorithms will identify which substructures in the correlational network are most likely to be causal, and then focus the crowd's efforts on them. These efficient, adaptive methods allow causal relationships to be combined into larger chains that explain growing numbers of causes and effects.
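As a rough illustration of what determining a data type "on the fly" could mean, the sketch below guesses whether a crowd-suggested variable is binary, categorical, or real-valued from a handful of raw responses. It is a simple heuristic written for this page, not the statistical method developed in the project; the function name `infer_variable_type` and the three type labels are assumptions.

```python
def infer_variable_type(values):
    """Guess the type of a crowd-suggested variable from raw responses.

    Hypothetical heuristic for illustration only:
      - at most two distinct answers  -> "binary"
      - every answer parses as a number -> "real-valued"
      - anything else                  -> "categorical"
    """
    # Normalize responses and drop blanks.
    cleaned = [str(v).strip().lower() for v in values if str(v).strip()]
    if not cleaned:
        return "unknown"

    # Two or fewer distinct answers (e.g. yes/no) are treated as binary.
    if len(set(cleaned)) <= 2:
        return "binary"

    # If every response converts to a float, treat the variable as numeric.
    try:
        for v in cleaned:
            float(v)
    except ValueError:
        return "categorical"
    return "real-valued"
```

For instance, responses such as "male" and "female" would be labeled binary, while a list of distinct wage figures would be labeled real-valued. A real system would also need to handle noisy or mixed responses, which this sketch ignores.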

Funded by the US National Science Foundation

CAREER: Harnessing Smart Grid Data to Enable Resilient and Efficient Electricity

PI: Paul Hines

The objective of this research is to harness Smart Grid data (Big Data) to enable more resilient and efficient electricity. Three research sub-projects contribute to this goal. Project 1 combines a new "Random Chemistry" computational algorithm with complex networks methods to find patterns of vulnerability in power systems, and uses the results to reduce cascading failure blackout risk. Project 2 transforms smart grid data into actionable information about the health of a power grid by looking at statistical properties (structured noise) in data from grid sensors. Projects 1 and 2 seek to make power grids more resilient to fluctuations from renewable generation or weather events. Project 3 uses crowdsourcing to identify trends affecting residential energy consumption through a web-based energy efficiency social network.

Intellectual Merit: This project integrates research ideas from diverse scientific disciplines, including complex systems, graph theory, data science, computational intelligence and crowdsourcing. Projects 1 and 2 use abstract complex systems approaches, while retaining critical information about the physics of power systems. By using data from real power systems the project will contribute to the emerging field of data science. The third project combines computational intelligence with crowdsourcing in a way that could open new ways to improve energy efficiency.

Broader Impacts: This project tests new educational approaches, including a unique LEGO-based grid simulator, and integrates smart grid data into new courses. New curriculum and a hands-on "smart grid road show" will be leveraged to attract students from diverse educational and demographic backgrounds to study electric energy.

Funded by the US National Science Foundation

IGERT: Smart Grids - Technology, Human Behavior and Policy

PI: Jeffrey Marshall
Co-PIs: Margaret Eppstein, Stephen Higgins, Paul Hines, Christopher Koliba

This Integrative Graduate Education and Research Traineeship (IGERT) award supports the interdisciplinary training of Ph.D. scientists and engineers working in the development of modern, smart-grid power systems that operate with extensive interaction and data exchange between the utility company and the consumer. The educational program integrates technology, human behavior and policy within a complex systems framework in order to understand the coupled dynamics of the integrated power system.

Intellectual Merit: This training program integrates education and research in complex systems modeling, human behavior, engineering, and policy at the University of Vermont (UVM) with expertise in modern electric power systems, renewable energy and high-performance computing at Sandia National Laboratories to train students to effectively deal with integrated technological-human-policy systems typical of the smart grid. Trainees will complete a summer internship at Sandia National Laboratories, engage in outreach activities with public museums and energy festivals in Vermont, work collaboratively with undergraduate students in game design at Champlain College, and visit the consortium of Vermont utility companies involved in developing the first state-wide smart grid in the nation.

Broader Impacts: Development of the smart grid entails a transformation from the current centralized control structure for power delivery to a decentralized structure in which the consumer plays an integral role. Design, control and optimization of smart grids to provide reliable, inexpensive power will require utility companies to understand, model and shape human behavioral responses and governments to design policies to enhance energy efficiency and smart grid adoption. This training program lays the educational foundations for a workforce trained in an integrated approach to complex human-technical-policy systems.

IGERT is an NSF-wide program intended to meet the challenges of educating U.S. Ph.D. scientists and engineers with the interdisciplinary background, deep knowledge in a chosen discipline, and the technical, professional, and personal skills needed for the career demands of the future. The program is intended to establish new models for graduate education and training in a fertile environment for collaborative research that transcends traditional disciplinary boundaries, and to engage students in understanding the processes by which research is translated to innovations for societal benefit.

Funded by the US National Science Foundation

Robustness, resilience and emergent properties of interdependent networks

PI: Raissa D’Souza (UC Davis)
Co-PIs: Pierre-Andre Noel (UC Davis), Paul Hines

Collections of networks are at the core of military and civilian life, spanning technological, biological and social systems. All of these networks interact, leading to new emergent properties, unanticipated phase transitions, new vulnerabilities and, moreover, potential benefits. A WMD attack may severely compromise essential critical infrastructures such as power grids, telecommunication and water distribution networks, which have expansive geographic reach and many known and unknown interactions with other critical infrastructures and lifeline systems such as transportation networks, supply chains, emergency response networks and even societal response. In the first three years of this project, we have developed fundamental mathematical models to characterize some of these effects, ranging from cascading failures in interdependent networks (establishing the concept of optimal interdependence), to methods to mitigate the abruptness of explosive percolation phase transitions, to methods to control cascading failures. With funding for two additional option years we will be able to build upon this foundational work and make stronger connections to real-world systems. Our objective is to combine complex networks approaches and engineering simulators to understand the impact of WMD attacks on cascading failures. From a theoretical foundations perspective, we will develop a mathematical framework for treating flows and cascades in networks with long-range loops (present in critical infrastructure). From a more practical perspective, we will use data from detailed engineering simulators to build abstracted network models, which have high fidelity to real-world networks but are amenable to statistical analysis. We will focus initially on cascading failures within power systems, and then extend the methods to cascades that propagate across interdependent networks.
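To give a feel for what a cascading failure in an abstracted network model looks like, here is a toy load-redistribution sketch. It is not one of the project's models and ignores power-flow physics entirely: it assumes that a failed node's load is split equally among its surviving neighbors, and that a neighbor fails once its load exceeds its capacity. The function name `simulate_cascade` and all parameters are hypothetical.

```python
from collections import deque


def simulate_cascade(neighbors, load, capacity, initial_failures):
    """Toy cascade: return the set of nodes that have failed when it stops.

    neighbors: dict mapping each node to a list of adjacent nodes
    load, capacity: dicts mapping each node to a number
    initial_failures: iterable of nodes that fail at the start
    """
    load = dict(load)  # work on a copy so the caller's data is untouched
    failed = set(initial_failures)
    queue = deque(failed)

    while queue:
        node = queue.popleft()
        alive = [n for n in neighbors[node] if n not in failed]
        if not alive:
            continue  # no surviving neighbors: the load is simply shed

        # Assumed rule: split the failed node's load equally.
        share = load[node] / len(alive)
        for n in alive:
            load[n] += share

        # Any neighbor pushed past its capacity fails in turn.
        for n in alive:
            if load[n] > capacity[n]:
                failed.add(n)
                queue.append(n)

    return failed
```

On a four-node chain a-b-c-d with unit loads, failing node a overloads b if b's capacity is below 2, and the cascade then continues or stops depending on how much headroom c has. Varying the capacities in this way is a quick means of seeing how small changes in margin alter the final extent of a cascade.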

Funded by the Defense Threat Reduction Agency