Parallel Sessions

The parallel sessions include 20-minute presentations in the morning and afternoon and one 50-minute panel discussion featuring invited Northwestern faculty, postdocs, and students.

Morning Sessions: 11:10 - 11:35 a.m.

Understanding and addressing gun violence with a data-driven and community-engaged approach - Arch Room

Su Burtner, Scientific Director, Center for Neighborhood Engaged Research & Science (CORNERS)

Each year in Chicago, thousands of individuals are victims of gun violence, and hundreds of those shootings are fatal. These victimizations and fatalities are mostly concentrated in a small set of neighborhoods, and data and research are urgently needed to understand the concentrated but complex dynamics of these gun violence incidents across social networks and geographic space. Partnering with community-based organizations that do the vital work of street outreach and community violence intervention (CVI) in Chicago, the Center for Neighborhood Engaged Research and Science (CORNERS) at Northwestern University provides technical assistance and research support toward the evaluation of CVI programs in communities that need them most. To build partner trust and prioritize the security of highly sensitive data, our research center has built out a cloud-based data platform with capacities for secure big data storage, data cleaning (ETL) pipelines, structured querying, and advanced computational modeling. This platform further allows our researchers to quickly merge and clean large, small, and messy data from multiple sources, including Chicago Police Department administrative records, publicly available demographic and spatial datasets, and CVI participant program information. Our research not only contributes to the science of neighborhoods and gun violence prevention, but it illustrates the power of data-driven and engaged research to shift the research paradigm, elevate community voices, and help Chicago communities become healthier, safer, and more equitable.

Following Data from Telescopes to Publication: Observing our Colorful and Transient Universe - Big Ten Room

Wen-fai Fong, Associate Professor, Physics and Astronomy, Weinberg College of Arts & Sciences

Astronomical surveys that repeatedly image the sky have revealed that our universe is full of cosmic transients, or sources that change in brightness over time. Discoveries of transients are enabled by advances in detector technology and data processing techniques. Consequently, our knowledge of the transient universe grows richer with each passing decade. However, this presents statistical and data-related challenges, especially as we expand to all wavelengths of light on the electromagnetic spectrum. In this talk, I will describe the overarching research questions my group at Northwestern is tackling, and the data and analysis flow from telescope to publication. I will also highlight areas in which Northwestern's computing resources have been particularly beneficial, and areas in which we could use more guidance, possibly from other scientific fields.

Paradigms and Hierarchies: Linking Knowledge Structures and Social Order in Science - Lake Room

Silvan Baier, Research Fellow, Kellogg School of Management

The processes by which scientific knowledge is created, disseminated, evaluated, and organized are permeated by inequalities across social, institutional, and geographic scales. This results in science being a highly stratified, constantly evolving, layered, and global multiscale network. However, not all scientific fields are created equal. While inequalities in science have been widely studied, differences in the conceptual organization of scientific fields have been overlooked. Fields with strong paradigms – where research is evaluated based on a mostly agreed-upon framework – ought to be fairer in their distribution of resources, attention, and rewards. This includes fields in the physical sciences like physics, chemistry, and math. Fields with weaker paradigms ought to be more unequal and less meritocratic by contrast. This includes fields in the social sciences like sociology, anthropology, and political science. These are fields with far less agreement as to how to understand and evaluate work in the field.

We explore the link between the paradigmatic cohesion of fields and the hierarchical ordering of participating organizations and individuals via citations, as well as through faculty hiring flows. Citation networks and faculty hiring allow us to capture the social structure of entities in science, while we apply natural language processing techniques to paper abstracts to gauge the knowledge structures of different fields.

For 11 fields in the social and natural sciences, from 1990 to 2012, and across a plethora of levels and measures, we find robust support for the positive association between scientific paradigms and status hierarchies. Across organizational citation networks, individual citation networks, and hiring networks, different measures of scientific paradigms are positively correlated with the level of status hierarchy. Put differently, high paradigm fields exhibit more status inequality in attention, reward, and influence than low paradigm fields.

Morning Sessions: 11:40 a.m. - 12:05 p.m.

Widespread misidentification of SEM instruments in the peer-reviewed materials science and engineering literature - Arch Room

Reese Richardson, PhD Student, McCormick School of Engineering and Applied Science

Materials science and engineering (MSE) research has, for the most part, escaped the doubts raised about the reliability of the scientific literature by recent large-scale replication studies in psychology and cancer biology. However, users on post-publication peer review sites have recently identified dozens of articles where the make and model of the scanning electron microscope (SEM) listed in the text of the paper does not match the instrument's metadata visible in the images in the published article. In order to systematically investigate this potential risk to the MSE literature, we develop a semi-automated approach to scan published figures for this metadata and check it against the SEM instrument identified in the text. Starting from an exhaustive set of 1,067,102 articles published since 2010 in 50 journals with impact factors ranging from 2 to 24, we identify 11,314 articles for which SEM make and model can be identified in an image's metadata. For 21.2% of those articles, the image metadata does not match the SEM manufacturer or model listed in the text and, for another 24.7%, at least some of the instruments used in the study are not reported. Unexplained patterns common to many of these articles suggest the involvement of paper mills, organizations that mass-produce, sell authorship on, and publish fraudulent scientific manuscripts at scale.
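As a rough illustration of the comparison step described in this abstract, the sketch below checks an instrument read from image metadata against the instruments named in an article's text. It is not the speaker's pipeline: the function names `normalize` and `classify`, the three category labels, and the example instrument strings are all assumptions chosen for illustration, and the real approach also has to extract the metadata from figures, which is not shown here.

```python
import re


def normalize(name: str) -> str:
    """Lowercase a make or model string and drop everything but letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def classify(reported, metadata):
    """Compare instruments reported in the text with one read from image metadata.

    reported: list of (make, model) tuples found in the article text (hypothetical input).
    metadata: a single (make, model) tuple read from the image (hypothetical input).
    Returns "not reported", "match", or "mismatch".
    """
    if not reported:
        return "not reported"
    meta_make, meta_model = normalize(metadata[0]), normalize(metadata[1])
    for make, model in reported:
        if normalize(make) == meta_make and normalize(model) == meta_model:
            return "match"
    return "mismatch"


# Illustrative inputs only; these are not records from the study.
same = classify([("JEOL", "JSM-6700F")], ("Jeol", "JSM 6700F"))
different = classify([("Hitachi", "S-4800")], ("Zeiss", "Sigma"))
missing = classify([], ("FEI", "Quanta 200"))
```

Normalizing before comparing keeps differences in spacing, hyphenation, and capitalization from being counted as mismatches.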

Detecting Gender Embodiment Using 3D Video Analysis - Big Ten Room

Doron Shiffer-Sebba, Postdoctoral Fellow, Weinberg College of Arts & Sciences

We use 3D video analysis to model gendered body language in public space. Specifically, we seek to better understand the components of body language that comprise gender performance by applying machine learning techniques to 3D videos of individuals walking in public spaces. We use OpenPose, a skeleton-tracking library, to extract individuals' body part positions from 3D videos - a process that requires heavy GPU computing on video clips that are several gigabytes each. We then use machine learning on annotated videos, analyzing which body motions are perceived as more "male" and which more "female". This project has the potential to clarify a core sociological question - what exactly “doing gender” means – i.e., what physical manifestations individuals are perceiving that encourage them to categorize individuals into gender categories.

Computational and Data-Driven Exploration of the Impact of Metal-Organic Framework Topology on Cryogenic Hydrogen Storage - Lake Room

Kunhuan Liu, PhD Student, McCormick School of Engineering and Applied Science

Hydrogen is considered a crucial clean energy vector to mitigate climate change, but due to the low volumetric energy density of gaseous hydrogen, it is difficult to store hydrogen for many practical applications. Cryogenic sorption-based methods, particularly using nanoporous metal–organic frameworks (MOFs), have been considered as viable solutions to enhance the deliverable capacity of stored hydrogen. MOFs are a class of porous crystalline material synthesized from well-defined molecular building blocks (organic ligands and metal clusters) that assemble into a range of networks. The networks denote how building blocks are connected and are often referred to as the underlying topology of the MOFs. In this study, we constructed in silico 105,764 MOF structures using 534 topologies and performed high-throughput screening of their hydrogen deliverable capacities using molecular simulations and surrogate machine learning models. Based on the analysis of over 100,000 MOFs, we explored the less-known effect of underlying topologies of MOFs on hydrogen deliverable capacity. We discover that while most topologies can generate high-performing MOFs, the best performing MOFs are generated with a set of topologies that have a low average deliverable capacity. We further find that two mathematical descriptors of the topologies, net density and td10, best explain this pattern and subsequently apply these principles to identify promising topologies that are also experimentally favorable.

Afternoon Sessions: 1 - 1:25 p.m.

A digital archive enables quantitative studies about how scientific communities engage with a funding agency - Arch Room

Spencer Hong, PhD Student, McCormick School of Engineering and Applied Science

The successful institutionalization of genome biology after the Human Genome Project (HGP) has hitherto not been investigated quantitatively due to technical challenges in extracting and organizing information from heterogeneous physical documents of research funding agencies, while also accommodating ethical and legal ramifications for data distribution and analysis. Here we address these points for 22,843 historic documents assembled over 20 years by the National Human Genome Research Institute (NHGRI), an institute of the National Institutes of Health. We find that NHGRI develops large research programs by engaging with novel groups of external scientists and show that support for specific non-human organisms can be recapitulated through 13 identifiable biological, organizational and semantic properties contained in whitepapers. We publish the archive with both automatically generated and expert-curated metadata that enables scholars to study a first digital archive of a major funding agency. Similarly, our open AI and machine learning tools will be usable by other archives.

Predicting impact of malaria interventions to inform resource prioritization in Guinea - Big Ten Room

Jaline Gerardin, Assistant Professor of Preventive Medicine (Epidemiology), Feinberg School of Medicine

In the context of high malaria burden yet limited resources, Guinea’s national malaria program used a subnational tailoring approach to prioritize intervention allocation, including engagement of stakeholders, data review, and data analytics, to identify the most appropriate mix of interventions in each district. Mathematical modeling was then used to predict the impact of different intervention mix scenarios proposed by the malaria program. This talk focuses on the methods and approaches used to develop, calibrate, and apply mathematical models to inform decision-making in the context of diverse data sources and qualities.

Afternoon Panel Session: 1 - 1:55 p.m.

Panel: The Future of AI-aided Research: The Confluence of Human, Artificial, and Collective Intelligence - McCormick Auditorium

Ken Forbus, Walter P. Murphy Professor of Computer Science and Professor of Education, Director of the Qualitative Reasoning Group

Elizabeth Gerber, Professor of Mechanical Engineering and (by courtesy) Computer Science, McCormick School of Engineering and Applied Science; Professor of Communication Studies, School of Communication; co-Director of the Center for Human Computer Interaction + Design

Kristian Hammond, Bill and Cathy Osborn Professor of Computer Science, McCormick School of Engineering and Applied Science; Director, Center for Advancing Safety for Machine Intelligence; Director, Master of Science in Artificial Intelligence

For this panel, we bring together three distinguished Northwestern researchers specializing in distinct realms of intelligent systems: artificial, cognitive, and collective. Leveraging the confluence among these systems, the panelists will explore AI’s capabilities and constraints to transform established research processes, mechanisms of collaboration and discovery, and the overall landscape of knowledge creation. Join us for what is sure to be a lively and thought-provoking discussion of the future of AI-aided research! The panel will answer questions posed by the moderators as well as questions from the audience.

Afternoon Sessions: 1:30 - 1:55 p.m.

Beyond the Lenses: The AI-Driven Transformation of Electron Microscopy in Multi-User Facilities - Arch Room

Roberto dos Reis, Research Assistant Professor, Materials Science and Engineering, McCormick School of Engineering and Applied Science

Electron Microscopy (EM) has evolved into a highly versatile analytical tool, capable of providing in-depth insights into material structures at an atomic level. This evolution has been supported by significant hardware innovations, including the integration of direct electron detectors and the tool's expanding capabilities for in-situ experiments. Such multimodal functionalities have not only broadened the scope of the applications but also introduced considerable complexity into the data it generates. The volume, variety, and intricacy of data from these multifaceted experiments present unprecedented challenges in data management and analysis.

To navigate these complexities, the adoption of artificial intelligence (AI) must become a part of the data management strategy. AI can dynamically adjust to the needs of increasing data volumes, utilizing machine learning to unravel the complexities of multimodal datasets and automate quality control, thereby ensuring data integrity. Embracing AI-driven methodologies within EM facilities offers a promising path to effectively manage the diverse data challenges posed by the latest technological advancements, especially in contexts where multiple users interact with the system. Integrating AI into EM data management becomes crucial not only for leveraging the equipment’s analytical power but also for facilitating a seamless operational workflow among various researchers. This approach promises to enhance research efficiency, data quality, and collaborative discovery, laying the groundwork for the next generation of scientific exploration and innovation in multi-user environments.

Non-Negative Tensor Factorization for Multi-omics Integration - Big Ten Room

David Katz, Postdoctoral Scholar, Feinberg School of Medicine

Advancements in single-cell multi-omics technologies enable simultaneous measurement of different omic modalities within individual cells. Yet, the challenge remains to integrate multi-omics data without losing the interaction information between the different modalities. Information about the interactions between different omic modalities is lost when a matrix (n=2) is organized as a collection of vectors (e.g., eigenvectors) or a tensor (n=3) is organized as a collection of matrices. The interaction information between modalities is an important part of characterizing the nuanced gene-regulatory programs and cellular states from single-cell multi-omics data. For example, the DNA methylation level near promoters is negatively correlated with gene expression, while the methylation level at gene bodies often shows a positive correlation with gene expression. How to integrate these modalities by considering their interactions in the context of different genomic locations is one of the major challenges in single-cell multi-omics integration. Here, we developed a Non-Negative Tensor Factorization model to derive latent factors from high-dimensional multi-omics data directly without significant loss of interaction information between different omics modalities.
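For readers unfamiliar with the technique named in this abstract, the sketch below shows one generic form of non-negative tensor factorization: a CP-style decomposition of a 3-way tensor fitted with multiplicative updates in NumPy. It is not the speaker's model. The function names `ntf_cp` and `reconstruct`, the cells x genes x modalities layout, the rank, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def reconstruct(A, B, C):
    """Rebuild the 3-way tensor from its three non-negative factor matrices."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)


def ntf_cp(X, rank, n_iter=200, eps=1e-9, seed=0):
    """Non-negative CP factorization of a 3-way tensor via multiplicative updates.

    X is assumed to be non-negative with shape (cells, genes, modalities).
    Returns factor matrices A, B, C, each with `rank` columns.
    """
    rng = np.random.default_rng(seed)
    n_cells, n_genes, n_mods = X.shape
    A = rng.random((n_cells, rank))
    B = rng.random((n_genes, rank))
    C = rng.random((n_mods, rank))
    for _ in range(n_iter):
        # Each factor is scaled by a ratio of non-negative terms, so it stays non-negative.
        A *= np.einsum("ijk,jr,kr->ir", X, B, C) / (A @ ((B.T @ B) * (C.T @ C)) + eps)
        B *= np.einsum("ijk,ir,kr->jr", X, A, C) / (B @ ((A.T @ A) * (C.T @ C)) + eps)
        C *= np.einsum("ijk,ir,jr->kr", X, A, B) / (C @ ((A.T @ A) * (B.T @ B)) + eps)
    return A, B, C


# Synthetic non-negative tensor: 20 cells x 15 genes x 2 modalities (made-up data).
data_rng = np.random.default_rng(1)
true_factors = [data_rng.random((n, 3)) for n in (20, 15, 2)]
X = reconstruct(*true_factors)

A, B, C = ntf_cp(X, rank=3)
rel_err = np.linalg.norm(X - reconstruct(A, B, C)) / np.linalg.norm(X)
```

Because each latent factor carries a loading for cells, genes, and modalities at once, the modality dimension is kept in the model instead of being flattened into separate matrices.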