Poster Session
Wildcat Room: Noon - 2 p.m.
FIRST PLACE WINNER: Understudied genes are lost in a leaky pipeline between genome-wide assays and reporting of results
Reese Richardson, PhD Student, McCormick School of Engineering and Applied Science
Additional Authors: Heliodoro Tejedor Navarro, Luis AN Amaral, and Thomas Stoeger
Abstract: Present-day publications on human genes primarily feature genes that already appeared in many publications prior to completion of the Human Genome Project in 2003. These patterns persist despite the subsequent adoption of high-throughput technologies, which routinely identify novel genes associated with biological processes and disease. Although several hypotheses for bias in the selection of genes as research targets have been proposed, their explanatory powers have not yet been compared. Our analysis suggests that understudied genes are systematically abandoned in favor of better-studied genes between the completion of -omics experiments and the reporting of results. Understudied genes are similarly abandoned by studies that cite these -omics experiments. Conversely, we find that publications on understudied genes may even accrue a greater number of citations. Among 45 biological and experimental factors previously proposed to affect which genes are studied, we find that 35 are significantly associated with the choice of hit genes presented in titles and abstracts of -omics studies. To promote the investigation of understudied genes, we condense our insights into a tool, Find My Understudied Genes (FMUG), that allows scientists to engage with potential bias during the selection of hits. We demonstrate the utility of FMUG through the identification of genes that remain understudied in vertebrate aging. FMUG is developed in Flutter and is available for download at fmug.amaral.northwestern.edu as a macOS/Windows app.
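As an illustration of the kind of bias test this abstract describes (not the authors' actual analysis), the sketch below compares pre-2003 publication counts of reported hit genes against all genes measured by an assay; gene names and counts are invented placeholders:

```python
# Hypothetical sketch: are the genes reported as hits in an -omics study
# already better-studied than the full set of assayed genes?
# Publication counts and gene lists below are placeholders, not study data.
from scipy.stats import mannwhitneyu

pre2003_pubs = {"TP53": 18000, "EGFR": 9000, "C9orf72": 3, "TMEM263": 1}
assayed_genes = list(pre2003_pubs)          # every gene measured by the assay
reported_hits = ["TP53", "EGFR"]            # genes highlighted in the abstract

hit_counts = [pre2003_pubs[g] for g in reported_hits]
all_counts = [pre2003_pubs[g] for g in assayed_genes]

# One-sided test: are reported hits drawn from the better-studied end?
stat, p = mannwhitneyu(hit_counts, all_counts, alternative="greater")
print(f"U = {stat:.1f}, p = {p:.3f}")
```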
SECOND PLACE WINNER: Enhancing Lung Tumor Segmentation: A Scalable, Data-Driven Approach Integrating Local and Global Contexture Learning
Additional Authors: Yaqi Miao, Troy Teo Peng, and Mohamed Abazeed
Abstract: Accurate mapping of lung gross tumor volumes (GTVs) is crucial for developing effective radiation oncology treatments. The 3D UNet architecture, a deep learning model, offers promising avenues for automating this intricate and often inaccurate process. Our study, leveraging one of the largest datasets to date from two academic sites (n = 941), explores the efficacy of 3D UNet models trained and tested to improve the accuracy and context-sensitivity of lung tumor segmentation. We investigate the impact of varying region-of-interest (ROI) sizes on learning and inference, assessing the balance between local and global insights for enhanced precision and contextual understanding.
Methods: We implemented a 3D UNet architecture for lung GTV segmentation, drawing on datasets from two institutions and using data sampling to achieve balanced representation across full CT scans (512x512x256) and context windows (128x128x128) centered on tumor isocenters. Training and testing phases addressed both the entire CT volume and the ROI centered on the tumor isocenter. To enhance the model's contextual understanding, we recorded the location of each tumor within the thoracic region. Leveraging two Nvidia Tesla A100 40GB GPUs, we used distributed computation to train the model in approximately 12 hours for 20 epochs at a batch size of 12.
Results: When trained and tested exclusively on a local ROI, our model achieved a Dice score of 0.5758, indicating substantial segmentation accuracy within that confined area. However, transitioning to the full CT context resulted in a diminished Dice score of 0.3002, implying a trade-off between precision and contextuality. This transition highlighted the challenge of maintaining segmentation accuracy across broader contexts. Furthermore, testing revealed a performance difference between the two contributing institutions: on the training institution's data, the Dice score reached 0.6604, while on the other institution's data it dropped to 0.4884, a relative decline of about 26%. This discrepancy underscores the importance of considering institution-specific characteristics during model training and testing. Conversely, the context window approach yielded exceptional precision in local segmentation but compromised capture of the broader tumor environment. This observation emphasizes the need for a balanced approach that optimizes both precision and contextuality, especially when dealing with heterogeneous datasets from multiple institutions.
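For reference, the Dice score reported above measures voxel overlap between predicted and ground-truth masks; a minimal sketch on synthetic masks (not the study's data):

```python
# Minimal sketch of the Dice overlap used to report segmentation accuracy;
# arrays are binary masks (True = tumor voxel), not the study's pipeline.
import numpy as np

def dice(pred: np.ndarray, truth: np.ndarray, eps: float = 1e-8) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|) over boolean voxel masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    return 2.0 * intersection / (pred.sum() + truth.sum() + eps)

pred = np.zeros((128, 128, 128), dtype=bool)
pred[40:60, 40:60, 40:60] = True
truth = np.zeros_like(pred)
truth[45:65, 45:65, 45:65] = True
print(f"Dice = {dice(pred, truth):.4f}")
```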
Conclusion: Our study underscores the importance of balancing precision and context in lung tumor segmentation. While training solely on the ROI yields reasonable results, it falls short of capturing the holistic tumor landscape. Conversely, utilizing the full CT provides valuable context but sacrifices precision. By integrating both approaches, we propose a hybrid model that harnesses the strengths of each—utilizing the full CT model to identify regions of interest and employing the context window for precise segmentation. This synergistic approach not only surpasses human precision but also equips clinicians with a powerful tool for accurate and comprehensive tumor delineation. This advancement represents a significant leap in scalable data learning, offering great potential to enhance treatment planning and patient outcomes in clinical settings.
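A rough sketch of how such two-stage hybrid inference could be wired together, assuming hypothetical `coarse_model` and `fine_model` callables standing in for the trained full-CT and context-window UNets (the study does not describe the implementation at this level of detail):

```python
# Sketch of the proposed hybrid inference under assumed interfaces:
# `coarse_model` and `fine_model` are stand-ins for the two trained 3D UNets,
# each mapping a CT volume to a tumor probability map of the same shape.
import numpy as np
from scipy.ndimage import center_of_mass

def hybrid_segment(ct, coarse_model, fine_model, window=128):
    """Localize with the full-CT model, then refine inside a context window."""
    coarse_mask = coarse_model(ct) > 0.5            # rough tumor localization
    cz, cy, cx = (int(round(c)) for c in center_of_mass(coarse_mask))
    h = window // 2
    # Clamp the window so it stays inside the volume.
    lo = [max(0, min(c - h, s - window)) for c, s in zip((cz, cy, cx), ct.shape)]
    sl = tuple(slice(l, l + window) for l in lo)
    fine_mask = fine_model(ct[sl]) > 0.5            # precise local segmentation
    out = np.zeros(ct.shape, dtype=bool)
    out[sl] = fine_mask
    return out
```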
THIRD PLACE WINNER: Neural Network Models Towards Space Group Determination Using Dynamically Simulated EBSD Patterns
Alfred Yan, PhD Student, McCormick School of Engineering and Applied Science
Additional Authors: Muhammed Nur Talha Kilic, Ankit Agrawal, Roberto dos Reis, Vinayak Dravid
Abstract: Determining the crystal structure of a novel material from its electron backscatter diffraction (EBSD) pattern conventionally requires prior knowledge of candidate phases. To alleviate this problem, this study develops neural network models that can quickly classify the crystal structure of a novel material given its EBSD pattern. More specifically, models are trained to classify the space group of cubic materials in the point group m-3m. Dynamical simulation software is used to create a large training dataset of EBSD patterns from different cubic materials, and neural network models are trained on the simulated patterns to classify the space group. The models are evaluated in a cross-validation setting in which each validation set contains materials completely distinct from those in the other validation sets. A high accuracy is achieved, suggesting that EBSD patterns contain space group information that a neural network can access. By employing advanced machine learning techniques and high-throughput simulation, this method may eventually offer a robust, high-throughput tool for automated crystallographic analysis without the prerequisite of prior material phase knowledge.
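A minimal sketch of the material-disjoint cross-validation described above, using scikit-learn's GroupKFold on synthetic placeholder data:

```python
# Sketch of material-disjoint cross-validation: `X` stands in for simulated
# EBSD patterns, `y` for space-group labels, and `materials` for the source
# material of each pattern (all placeholders).
import numpy as np
from sklearn.model_selection import GroupKFold

X = np.random.rand(1000, 60, 60)                  # simulated patterns
y = np.random.randint(0, 5, size=1000)            # space-group class labels
materials = np.random.randint(0, 50, size=1000)   # material identity per pattern

for train_idx, val_idx in GroupKFold(n_splits=5).split(X, y, groups=materials):
    # No material appears in both training and validation folds, so
    # validation accuracy reflects generalization to unseen materials.
    assert set(materials[train_idx]).isdisjoint(materials[val_idx])
```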
Multiscale Molecular Dynamics using GROMACS on Quest for understanding the interactome in Peptide Amphiphiles
Dhwanit Dave, Postdoctoral Scholar, Simpson Querrey Institute, Office for Research
Additional Authors: Liam C. Palmer and Samuel I. Stupp
Abstract: The design of sustainable materials composed of biomolecular scaffolds such as peptides and peptide amphiphiles (PAs) requires computational modeling of their supramolecular interactome. We use molecular dynamics (MD) simulations on Quest to understand the self-assembly of these PAs and the interactions that dictate their material and biological properties. Using coarse-grained (CG) models, whose reduced representation gives access to longer timescale trajectories and larger simulation volumes, we seek to elucidate how the amino acid sequence affects nanoscale morphologies, motion, and interaction strengths. Augmenting these CG models with all-atom (AA) simulations and experimental spectroscopic data, we are now exploring how to enhance the functionality of these materials for regenerative medicine, energy harvesting, and other challenges.
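As a hedged illustration of how such GROMACS trajectories might be post-processed in Python (the file names and atom selection below are placeholders; the authors' analysis tools are not specified):

```python
# Minimal MDAnalysis sketch for post-processing a GROMACS trajectory;
# file names and the selection are placeholders for an actual PA system.
import MDAnalysis as mda

u = mda.Universe("topol.tpr", "traj.xtc")      # GROMACS topology + trajectory
pa = u.select_atoms("not resname W WF ION")    # e.g., exclude CG water/ions

for ts in u.trajectory[::10]:                  # every 10th frame
    rg = pa.radius_of_gyration()               # one measure of assembly size
    print(f"t = {ts.time:8.1f} ps   Rg = {rg:6.2f} Å")
```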
Investigating the Effects of Water and Crosslinking Content on the Mechanical Properties of Mussel Foot Protein 5
Alex Gerber, Undergraduate Student, McCormick School of Engineering and Applied Science
Additional Authors: Jacob Graham, Sinan Keten
Abstract: Mytilus galloprovincialis (the Mediterranean mussel) produces mussel foot protein 5 (fp5), which has extraordinary strength, adhesion, and cohesion properties that allow it to adhere to a variety of underwater surfaces. A better understanding of fp5 and its properties aids its use in biocompatible and biodegradable synthetic materials, such as hydrogels and adhesives, to promote wound healing at surgical sites. Several chemical and physical features of the protein have been proposed to underlie fp5's unique properties; however, the full extent of sequence and molecular contributions to fp5's function is not yet understood.
This project uses Molecular Dynamics (MD) simulations to investigate the effects of tyrosine crosslinking and water content on the mechanical properties of fp5. Simulations were run at 10%, 30%, and 50% water by weight with 10% to 50% crosslinking densities (the proportion of crosslinked tyrosine residues relative to the total number of tyrosine residues in the system). Analysis of the simulated tensile tests revealed that plateau stress and elastic modulus decrease with increasing water content, while crosslinking density does not impact these properties. Furthermore, the protein's toughness and maximum stress decrease with increasing water content and increase with higher crosslinking densities. Finally, strain hardening increases at higher crosslinking densities and higher water proportions.
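An illustrative sketch of how such metrics can be extracted from a stress-strain curve; the curve here is synthetic, not simulation output:

```python
# Sketch of pulling tensile-test metrics from a stress-strain curve;
# `strain` and `stress` are synthetic placeholders for simulation output.
import numpy as np

strain = np.linspace(0.0, 1.0, 200)
stress = 80 * strain / (1 + 5 * strain) + 2 * np.random.rand(200)  # toy curve

elastic_region = strain < 0.02                       # small-strain linear fit
modulus = np.polyfit(strain[elastic_region], stress[elastic_region], 1)[0]
toughness = np.trapz(stress, strain)                 # area under the curve
max_stress = stress.max()
print(f"E ≈ {modulus:.1f}, toughness ≈ {toughness:.1f}, σ_max ≈ {max_stress:.1f}")
```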
The dependence of fp5’s mechanical properties on water content and crosslinking density is a discovery that will influence how fp5 (and other similarly crosslinked protein systems) are modeled to enhance the development of synthetic materials. This improved understanding of these parameters also works towards creating a model of fp5 that can predict mechanical behaviors of the bulk protein from an amino acid sequence, which would aid in the discovery of protein sequences with enhanced mechanical properties.
A Computational Method Discerning the Relationship Between Order and Mechanical Properties of Spider Silk Fibers
Jacob Graham, PhD Student, McCormick School of Engineering and Applied Science
Additional Authors: Xinyan Yang, Timothy Russell, Sinan Keten
Abstract: Spider silks are studied widely for their exceptional mechanical properties and biocompatibility. Currently, spider silk farms are prohibitively expensive to operate, restricted by spiders’ cannibalistic tendencies in confinement. Recombinant silk production, on the other hand, is limited by metabolic constraints associated with recruiting bacteria to express foreign proteins and by an inability to accurately mimic the natural spinning process in vitro. In the silk fiber-producing ducts of spiders, the degree of alignment of long proteins makes the mechanical properties of silk tunable, with the ability to achieve high tensile strengths and high breaking strains. Here, we use a molecular dynamics technique called dissipative particle dynamics to demonstrate how this tunability is achieved through chain extension and order along the fiber axis. For low molecular weight peptides, a larger degree of order is required to achieve the same tensile strength as high molecular weight peptides, which also exhibit greater extensibility, toughness, and elastic modulus. We also demonstrate that our data, collected using parallel computing architectures on Quest, are consistent with the behavior of wet-spun fibers drawn to varying degrees of order in a typical synthetic spider silk manufacturing setup. Our work presents a framework for predicting spider silk mechanical properties based on the degree of drawing and peptide alignment for the design of mechanically tunable fibers.
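As an illustration, chain alignment along a fiber axis is often quantified with the nematic order parameter P2; a minimal numpy sketch on synthetic segment vectors (the authors' exact order metric is not specified):

```python
# Sketch of a nematic order parameter along the fiber axis, a common way to
# quantify chain alignment; bond vectors here are synthetic placeholders.
import numpy as np

bonds = np.random.normal(size=(5000, 3))             # chain segment vectors
bonds /= np.linalg.norm(bonds, axis=1, keepdims=True)

fiber_axis = np.array([0.0, 0.0, 1.0])
cos_theta = bonds @ fiber_axis
# P2 = <(3 cos^2(theta) - 1) / 2>: 1 = perfect alignment, 0 = isotropic.
P2 = np.mean(1.5 * cos_theta**2 - 0.5)
print(f"P2 = {P2:.3f}")
```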
Imaging technology and machine learning algorithms to investigate circulating tumor cells (CTCs) in predicting patient prognosis
Andrew Hoffman, Research Specialist, Feinberg School of Medicine
Additional Authors: Josh Squires, Yuanfei Sun, Aoran Sun, Youbin Zhang, and Huiping Liu
Abstract: Utilizing advanced imaging technology and machine learning algorithms, this study investigates the significance of circulating tumor cells (CTCs), immune cells, and CTC clusters in predicting patient prognosis. By analyzing FDA-approved CellSearch images obtained from the blood of breast cancer patients, we aim to elucidate the clinical relevance of various cell populations and their spatial organization. Our analysis focuses on identifying CTCs, immune cells, and CTC clusters, differentiated by homogeneity or heterogeneity in cellular composition. Using artificial intelligence (AI) algorithms, we employ feature extraction and classification to identify different cell types and cell clusters. Specifically, we examine the expression patterns of biomarkers such as cytokeratin, CD45, and DAPI to discern their spatial distribution and intensity within samples. The study aims to explore the potential prognostic value of CTCs, immune cells, and CTC clusters by correlating their presence, abundance, and spatial information with clinical outcomes, including patient survival, disease progression, and treatment response. By combining multidimensional data derived from cell morphology, biomarker expression, and spatial relationships, we aim to develop predictive models capable of stratifying patients into distinct risk groups.
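A hedged sketch of the feature-based classification idea described above, on synthetic stand-in features (not CellSearch data or the study's actual model):

```python
# Sketch of feature-based cell classification; channel intensities and labels
# are synthetic stand-ins for per-cell measurements from CellSearch images.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
# Per-cell mean intensities in three channels: cytokeratin, CD45, DAPI.
features = rng.random((200, 3))
labels = (features[:, 0] > features[:, 1]).astype(int)  # toy rule: CTC vs immune

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(features[:150], labels[:150])
print("held-out accuracy:", clf.score(features[150:], labels[150:]))
```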
Multi-Objective Bayesian Optimization for Agent-Based Model Calibration: Efficiently Selecting Parameters to Fit Diverse Malaria Field Datasets
Tobias Holden, MD-PhD Student, Feinberg School of Medicine
Additional Authors: Ricky Richter, Anne Stahlfeld, Aurelien Cavelan, Josh Suresh, Prashanth Selvaraj, Caitlin Bever, Melissa Penny, and Jaline Gerardin
Abstract: Mathematical models of malaria transmission are used to study disease dynamics and predict the impact of interventions where clinical trials may not be feasible, and modeling has a growing role in public health decision-making. To make accurate predictions, models should reproduce key quantitative relationships between parasites, symptoms, gametocytes, and infectiousness within human and mosquito hosts, all of which can vary between malaria-endemic settings. As such, calibrating a complex model requires sampling many input parameters while targeting multiple objectives. We were motivated to recalibrate EMOD's within-host model to address unexpected model behaviors and to include datasets made available since the last calibration in 2018.
We used a Gaussian process emulator with Thompson sampling to simultaneously calibrate 13 parameters of the agent-based malaria transmission model EMOD to data from malaria clinical trials. The parameters under calibration determine modeled immunity and parasite dynamics, while the calibration targets describe observed malaria incidence, prevalence, parasite densities, and infectiousness from eight malaria-endemic sites in Sub-Saharan Africa. We designed simulation scenarios that matched the seasonality and intensity of malaria transmission at each site and included historical anti-malaria interventions (e.g., curative treatment, bednets). The Gaussian process emulator was trained on 1,000 randomly selected parameter sets before emulating the simulated goodness-of-fit at 5,000 new parameter sets. A Thompson sampling algorithm with trust-region Bayesian optimization (TuRBO) was then used to select 200 new parameter sets for simulation. The process continued until 5,000 parameter sets had been simulated.
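A compact sketch of the emulate-then-Thompson-sample loop, using scikit-learn's GaussianProcessRegressor in place of the study's emulator and TuRBO implementation; the objective function, dimensions, and batch sizes are placeholders:

```python
# Sketch of Gaussian-process emulation with Thompson sampling; scikit-learn
# stands in for the study's emulator/TuRBO stack, and `goodness_of_fit`
# stands in for running EMOD simulations.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def goodness_of_fit(theta):                 # placeholder objective
    return np.sum((theta - 0.3) ** 2, axis=1)

rng = np.random.default_rng(0)
X = rng.random((100, 13))                   # initial simulated parameter sets
y = goodness_of_fit(X)

for _ in range(5):                          # a few acquisition rounds
    gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)
    candidates = rng.random((1000, 13))     # emulate at new parameter sets
    # Thompson sampling: draw posterior functions, take each draw's argmin.
    draws = gp.sample_y(candidates, n_samples=20, random_state=0)
    picks = np.unique(np.argmin(draws, axis=0))
    X = np.vstack([X, candidates[picks]])
    y = np.concatenate([y, goodness_of_fit(candidates[picks])])

print("best fit:", y.min())
```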
We found that Gaussian process emulation with Thompson sampling converged on optimal model parameters more quickly and improved fit to reference datasets, compared to prior approaches. The new approach has been applied to the vector habitat model of EMOD, and is well-suited to any resource-intensive modeling project with a large number of unknown input parameters.
A network science approach to the world of painters
Eliseu Antonio Kloster Filho, Undergraduate Student, Weinberg College of Arts & Sciences
Abstract: We analyze a dynamic network generated by linking painters according to whom they influenced and for how long. We find that a simple mathematical model can describe the evolution of an individual’s influence during their lifetime; for example, it predicts how many artists they will have influenced, or how many pupils they had, in any given year of activity. Analogously to the concept of ultimate impact in citation networks, we propose a way of quantifying how influential an artist was based on various network properties, and a way of predicting rising stars according to this criterion. We also find evidence of scale-freeness and of the formation of communities that correlate with linguistic and geographic boundaries. The data used in this work are curated by specialists and made publicly available by various organizations.
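As an illustration of the underlying data structure (the edges and dates below are invented, not the curated dataset used in this work):

```python
# Sketch of the influence network as a directed graph with dated edges.
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

G = nx.DiGraph()
G.add_edge("Giotto", "Masaccio", start=1420, end=1428)   # influencer -> influenced
G.add_edge("Masaccio", "Michelangelo", start=1490, end=1505)
G.add_edge("Michelangelo", "Raphael", start=1504, end=1510)

# Out-degree as a first proxy for how many artists someone influenced.
influence = sorted(G.out_degree, key=lambda kv: kv[1], reverse=True)
print(influence)

# Community detection on the undirected projection of the network.
print(list(greedy_modularity_communities(G.to_undirected())))
```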
Robust Self-Calibrated Probabilistic Fibrosis Signature technique for cardiac 3D LGE MRI
Mehri Mehrnia, PhD Student, McCormick School of Engineering and Applied Science
Additional Authors: Mohammed S.M. Elbaz
Abstract: Atrial fibrillation (AF) is a prevalent cardiac arrhythmia that puts 50 million individuals worldwide at risk of heart attack. Left atrial (LA) fibrosis evaluation from 3D LGE-MRI can prognosticate muscle injury (fibrosis/scar) of the LA wall in AF. However, current assessment methods are not standardized: they are threshold-based and sensitive to noise and LA wall variability, which impedes the reproducibility and clinical utility of 3D LGE MRI [1]. Thus, we propose a novel threshold-free, robust, and standardized technique for quantifying fibrosis burden by deriving a comprehensive probabilistic signature from 3D LGE MRI. The signature [2] probabilistically encodes billions of LGE intensity co-disparities (comparisons) per patient from the entire LA volume. Our method yields a standardized Fibrosis Signature Index (FSI), which quantifies fibrosis burden per patient, with higher FSI values indicating a higher fibrosis burden.
We evaluated the method extensively on 143 3D LGE-MRI scans of AF patients [3] against two widely used quantification methods (μ+2.3σ and IIR1.2) [4,5] in terms of: 1) feasibility for quantifying LA fibrosis burden, 2) stability and reproducibility in the presence of noise, and 3) repeatability in post-ablation follow-up.
To evaluate the reproducibility and agreement of each method (our proposed FSI, μ+2.3σ, and IIR1.2), we used ICC, CoV, Spearman correlation, and Bland-Altman analysis. Our proposed FSI is highly correlated with the current methods (rho > 0.7, p < 0.001).
We simulated three incremental levels of scan-specific Rician noise added to the 3D LGE MRI image data. At the highest augmented noise power, the proposed FSI method was roughly three times more robust, showing the lowest variability (CoV = 7.77%) of the three methods while maintaining the highest agreement with the original scan values (ICC(FSI) = 0.85, ICC(IIR1.2) = 0.79, ICC(μ+2.3σ) = 0.48).
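For reference, Rician noise on MRI magnitude images is conventionally simulated as the magnitude of a complex Gaussian perturbation; a minimal sketch (the noise level and volume are placeholders for the scan-specific powers tested):

```python
# Sketch of the Rician noise model commonly used for MRI magnitude images.
import numpy as np

def add_rician_noise(image: np.ndarray, sigma: float, rng=None) -> np.ndarray:
    """Rician-corrupted magnitude: sqrt((I + n1)^2 + n2^2), n1, n2 ~ N(0, sigma)."""
    rng = rng or np.random.default_rng()
    n1 = rng.normal(0.0, sigma, image.shape)
    n2 = rng.normal(0.0, sigma, image.shape)
    return np.sqrt((image + n1) ** 2 + n2 ** 2)

lge = np.random.rand(64, 64, 44)            # placeholder 3D LGE volume
noisy = add_rician_noise(lge, sigma=0.05)   # one simulated noise level
```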
In two-scan follow-ups, the FSI method outperformed the existing methods, with three-fold lower variability (CoV), two-fold higher correlation (rho), lower bias, and excellent agreement with the original values.
Epidermal Langerhans Cells as disease-driving immune cells in Painful Diabetic Neuropathy
Paola Pacifico, Postdoctoral Scholar, Feinberg School of Medicine
Additional Authors: Nirupa D. Jayaraj, Dongjun Ren, Dale George, James S. Coy-Dibley, Mirna Andelic, Giuseppe Lauria, Richard J. Miller, and Daniela M. Menichella.
Abstract: The interplay between non-neuronal cells and nerve afferents in the epidermis plays a prominent role in health and disease. Painful Diabetic Neuropathy (PDN), one of the most common and intractable complications of diabetes, is characterized by the remodeling of cutaneous innervation and neuropathic pain. Even though increasing evidence suggests the importance of epidermal non-neuronal cells, such as resident immune cells, in the development of PDN, the mechanisms underlying this neuropathy remain largely unknown. To investigate how epidermal cells communicate with cutaneous afferents and how this communication affects PDN, we have adopted an experimental approach that combines a mouse model of PDN and skin biopsies from PDN patients. We performed single-cell RNA sequencing of the epidermis in mice fed a high-fat diet (HFD, 42% fat) and a regular diet (RD, 11% fat) for 10 weeks, including both male and female subjects. We examined epidermal anatomy in skin sections from clinically well-characterized PDN patients and controls. Unsupervised clustering of scRNA-seq data from HFD and RD mouse epidermis revealed several distinct clusters, including keratinocytes at different stages of differentiation and Langerhans Cells (LCs), a population of resident antigen-presenting cells. We observed a significant increase in LCs in the epidermis of HFD mice, suggesting a key role in promoting neuronal excitability. We demonstrated that LCs are crucial players in neuro-immune communication and may be involved in axonal degeneration/regeneration in PDN through semaphorin-plexin pathways. We also validated key targets in patients with PDN through in-situ hybridization and immunohistochemistry. Our findings highlight the pivotal role of Langerhans cells in the development of PDN, revealing their functional association with sensory afferent neurons in the epidermis. The disrupted neuron-immune communication between LCs and cutaneous afferents may be responsible for the neuropathic pain in PDN and the remodeling of cutaneous innervation in both mice and PDN patients.
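A minimal sketch of the unsupervised clustering step using Scanpy, a standard scRNA-seq toolkit (the input file and parameters are placeholders; the authors' actual pipeline is not specified):

```python
# Sketch of a standard scRNA-seq clustering workflow with Scanpy;
# "epidermis.h5ad" is a placeholder for the HFD/RD epidermis dataset.
import scanpy as sc

adata = sc.read_h5ad("epidermis.h5ad")      # cells x genes count matrix
sc.pp.normalize_total(adata, target_sum=1e4)
sc.pp.log1p(adata)
sc.pp.highly_variable_genes(adata, n_top_genes=2000)
sc.pp.pca(adata, n_comps=50)
sc.pp.neighbors(adata)
sc.tl.leiden(adata)                         # clusters: keratinocyte states, LCs, ...
sc.tl.umap(adata)
sc.pl.umap(adata, color="leiden")
```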
How to find exploding stars in your sleep
Nabeel Rehemtulla, Postdoctoral Scholar, Simpson Querrey Institute, Office for Research
Additional Authors: Adam A. Miller, Theophile Jegou du Laz, and Michael W. Coughlin
Abstract: Supernovae (SNe), the explosive deaths of stars, are the engines through which most heavy elements in the Universe are created. They occur in a variety of types and provide unique insights into stellar lifecycles and the expansion of the Universe. Teams like the Bright Transient Survey (BTS) must identify these events in real time to facilitate follow-up observations, which are necessary to best classify an event's type and extract maximal information from it. BTS in particular is conducting a large statistical study, aiming to classify all supernovae passing our criteria (>1,300 SNe per year). Traditionally, SN identification and the triggering of follow-up observations have been done via visual inspection of candidate SNe, typically called "scanning". Over the 5+ years of BTS operations, scanning has taken an estimated 2,000+ hours of expert effort. We have developed BTSbot, a multi-modal convolutional neural network that automates scanning for BTS. Given images and metadata features of a candidate, BTSbot produces a unit-interval score indicating whether the candidate is a SN fit for follow-up observation. Near-perfect completeness of SNe passing our criteria is necessary to maintain the quality of the BTS sample, and BTSbot achieves >99% completeness of relevant SNe. In addition, the purity of the selected sample is crucial to avoid wasting valuable observational resources on uninteresting sources, and the sample selected by BTSbot is >93% pure. BTSbot has now been deployed into real-time BTS operations, where it regularly identifies and triggers follow-up on new SNe before any human sees the data. Less than a week after being deployed, BTSbot contributed to the world's first SN to be fully automatically discovered, identified, classified, and shared with the community.
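A hedged PyTorch sketch of a multi-modal design in this spirit; the layer sizes, input shapes, and metadata dimension are illustrative, not the published BTSbot architecture:

```python
# Sketch of a multi-modal network: a CNN branch for image cutouts fused with
# a dense branch for metadata, ending in a unit-interval follow-up score.
import torch
import torch.nn as nn

class MultiModalNet(nn.Module):
    def __init__(self, n_meta: int = 14):
        super().__init__()
        self.conv = nn.Sequential(                       # image branch
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.meta = nn.Sequential(nn.Linear(n_meta, 32), nn.ReLU())  # metadata branch
        self.head = nn.Sequential(nn.Linear(32 + 32, 32), nn.ReLU(),
                                  nn.Linear(32, 1), nn.Sigmoid())

    def forward(self, image, metadata):
        return self.head(torch.cat([self.conv(image), self.meta(metadata)], dim=1))

model = MultiModalNet()
scores = model(torch.rand(8, 3, 63, 63), torch.rand(8, 14))  # unit-interval scores
```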
Automating Multiple Variable Extraction from Echocardiogram Reports Utilizing ChatGPT-4: A Non-Inferiority Study Comparing Manual Reviewers with ChatGPT-4
Alan Soetikno, MD-MBA Student, Feinberg School of Medicine
Additional Authors: Daniel Acciani, Lynna Yang, Scott Wu, Preeti Kansal, MD, and David Liebovitz, MD
Abstract: The Electronic Medical Record (EMR) is a rich source of data for clinical research across all disciplines of medicine. However, much of this critical data is inaccessible to physicians and researchers because it is buried within vast amounts of unstructured medical text. Converting this text into an analyzable format currently requires significant manual labor and is prone to human error. Echocardiograms, a commonly used diagnostic imaging modality for visualizing the heart's structure and function, illustrate this problem well: many pertinent clinical findings are documented only in echocardiogram reports, rendering them a rich source of unstructured quantitative and qualitative data.
We hypothesized that Large Language Models (LLMs), such as OpenAI’s GPT series, can extract key clinical data elements from echocardiogram reports with accuracy similar to that of manual reviewers, but in significantly less time. We conducted a non-inferiority study comparing OpenAI’s GPT-4 model with manual reviewers in extracting 37 unique variables from echocardiogram reports. We randomly selected 500 unique echocardiograms performed at Northwestern Memorial Hospital from 2020 to 2022 and compared the accuracy of GPT-4 and manual reviewers at extracting these values. Our analysis showed that GPT-4 and manual reviewers agreed on 18,141 of 18,500 possible values (0.9805, 95% CI: [0.9785, 0.9825]). Analysis of the discrepancies showed that manual reviewers correctly labeled 18,336 values (0.9911, 95% CI: [0.9897, 0.9924]) while GPT-4 correctly labeled 18,325 values (0.9905, 95% CI: [0.9890, 0.9918]). GPT-4 cost on average $0.041 per echocardiogram and processed all 500 reports in under 40 minutes. This study demonstrates the efficacy of GPT-4 at extracting data from echocardiogram reports, offering a promising solution to streamline clinical research.
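An illustrative sketch of LLM-based extraction with the OpenAI Python client; the model name, variable list, prompt, and report text below are placeholders, and the study's actual prompts and post-processing may differ:

```python
# Sketch of structured extraction from a report via the OpenAI API;
# assumes OPENAI_API_KEY is set and the model returns JSON as instructed
# (real pipelines should validate and handle malformed responses).
import json
from openai import OpenAI

client = OpenAI()

report = "... de-identified echocardiogram report text ..."
prompt = (
    "Extract these variables from the echocardiogram report as JSON "
    '(use null if absent): ["lvef_percent", "aortic_stenosis_severity"].\n\n'
    + report
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": prompt}],
    temperature=0,
)
values = json.loads(response.choices[0].message.content)
```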
Automated Extraction of Mineralized Tissue Metrics in Murine Incisors
Ethan Suwandi, PhD Student, McCormick School of Engineering and Applied Science
Additional Authors: Victoria Cooley, Sarah Boyer, Denis Keane, William Guise, Viktor Nikitin, Pavel Shevchenko, Adya Verma, Tomas Wald, Jeffrey O. Bush, Ophir Klein, Stuart Stock, and Derk Joester
Abstract: Congenital dental developmental defects carry significant costs to society and to patients’ physiological and psychological health. Enamel formation (amelogenesis) occurs in multiple stages (pre-secretory, secretory, and maturation) and is influenced by a number of associated matrix proteins and proteases. The continuously growing mouse incisor is an ideal model, as all stages of tooth development can be observed along its axis. To gain insight into the mechanisms and interactions of amelogenesis-associated proteins, recent collaborative efforts have created an extensive set of mouse models exhibiting hypoplasia, hypomineralization, and ectopic mineral formation from a series of conditional knock-outs and knock-in mutants. As part of this project, we have collected over 250 synchrotron and lab-source micro-computed tomography (μCT) reconstructions of mouse hemimandibles from 12 distinct genotypes in males and 16 distinct genotypes in females.
Given the large number of samples, an automated analysis pipeline was developed to extract and compare metrics for different mineralized tissues between different genotypes, sexes, and developmental stages. Building on automatic segmentation using a previously developed convolutional neural network (CNN), we report our methods for the algorithmic identification of landmarks, orientation and fitting of the arc of the incisor, and the extraction of metrics both along the arc and within the interior of the developing enamel. In addition to quantifying morphological and densification metrics between mineralized tissues, we demonstrate the ability to automatically distinguish between different stages of amelogenesis, compare shifts in stage onset and length, and determine extents, means, and rates of growth and mineralization across different mutant phenotypes.
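As an illustration of the arc-fitting step, a least-squares circle fit to landmark coordinates (the landmarks below are synthetic stand-ins for pipeline output):

```python
# Sketch of fitting a circular arc to incisor centroid landmarks with SciPy.
import numpy as np
from scipy.optimize import least_squares

theta = np.linspace(0.2, 1.4, 30)
x = 10 + 8 * np.cos(theta) + 0.05 * np.random.randn(30)   # synthetic landmarks
y = -2 + 8 * np.sin(theta) + 0.05 * np.random.randn(30)

def residuals(p):
    cx, cy, r = p
    return np.hypot(x - cx, y - cy) - r     # distance from circle of radius r

fit = least_squares(residuals, x0=[x.mean(), y.mean(), 5.0])
cx, cy, r = fit.x
print(f"arc center = ({cx:.2f}, {cy:.2f}), radius = {r:.2f}")
# Metrics can then be sampled at fixed arc-length positions along the fit.
```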
Taken together, the combination of high-resolution μCT reconstructions, CNN-based segmentation, and an automated analysis pipeline allows us to study the process of amelogenesis on an extensive set of unique phenotypes in greater detail than before.
Detecting small-magnitude earthquakes in a noisy urban environment
Ann Mariam Thomas, PhD Student, Weinberg College of Arts & Sciences
Additional Authors: Omkar Ranadive, and Suzan van der Lee
Abstract: On November 4, 2013, a magnitude 3.2 earthquake occurred near a limestone quarry and reservoir about 18 km southwest of downtown Chicago. The earthquake was covered extensively in the media because of its connection with the local quarry, where dynamite was blasted about seven seconds before the earthquake. The unique circumstances of the event raised the possibility that the quake was induced by local industrial operations. However, due to the sparse seismicity of the Chicago area, we cannot statistically investigate this claim of induced seismicity. Additional earthquake detections are needed to constrain the background seismicity rate and local fault lines. Therefore, to better assess seismic hazards in the Chicago area and expand its earthquake catalog, our study aims to detect small-magnitude earthquakes that have gone undetected in the past decade.
Since seismometers are typically installed in rural areas, traditional earthquake detection methods (e.g., the STA/LTA ratio method) are ill-suited to the noisy urban environment of the Chicago area. To detect small earthquake signals that may be buried in background noise, we have developed a Random Forest model that classifies seismograms as a signal of interest (earthquakes and blasts) or noise. We trained our model on global earthquake seismograms and Chicago noise, including transient noise signals that other detection methods have misclassified as earthquakes. Our model uses a novel set of 120 waveform features designed using our domain knowledge of the characteristic frequencies of local earthquakes and anthropogenic noise. We will present the development of our model and its performance on 5+ years of seismic data from a broadband station in the Chicago area. In this period, our model detected several potential earthquakes that are missing from existing earthquake catalogs. We will present these detections and discuss their notable time-frequency features.
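A hedged sketch of feature-based classification in this spirit; the two features shown stand in for the study's 120 domain-informed waveform features, and the waveforms and labels are synthetic:

```python
# Sketch of a feature-based Random Forest seismogram classifier.
import numpy as np
from scipy.signal import welch
from sklearn.ensemble import RandomForestClassifier

def waveform_features(trace, fs=100.0):
    freqs, psd = welch(trace, fs=fs)
    dominant_freq = freqs[np.argmax(psd)]                 # spectral peak
    peak_to_rms = np.abs(trace).max() / np.sqrt(np.mean(trace**2))
    return [dominant_freq, peak_to_rms]

rng = np.random.default_rng(0)
X = np.array([waveform_features(rng.normal(size=3000)) for _ in range(200)])
y = rng.integers(0, 2, size=200)        # 1 = signal of interest, 0 = noise

clf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X, y)
```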
Advancing Thermoelectric Cooling: A Physics-Informed Computational Workflow for Efficient Materials Discovery
Michael Toriyama, PhD Student, McCormick School of Engineering and Applied Science
Additional Authors: Prashun Gorai and G. Jeffrey Snyder
Abstract: Cooling technologies, such as air conditioning and refrigeration, account for nearly 10% of global electricity consumption today, putting enormous pressure on our ever-growing demand for energy. They also act as major sources of greenhouse gas emissions, necessitating environmentally friendly alternatives in our endeavor towards net-zero. Solid-state thermoelectric cooling offers a viable option, but new materials are needed to advance this technology. Narrow-gap semiconductors have historically demonstrated high thermoelectric performance; yet, existing computational search methods either do not fully capture the relevant physics of such materials or rely on resource-intensive calculations. Here, we design a physics-informed workflow that employs descriptors computable using relatively inexpensive first-principles calculations. The predictive capabilities of the workflow are validated on known high-performing thermoelectrics, which are ultimately “re-discovered” by the methodology as intended. By rapidly screening hundreds of compounds, we identify three yet-to-be-considered narrow-gap semiconductors (SrSb2, Zn3As2, and NaCdSb) that are intriguing candidates for next-generation thermoelectric cooling devices. Our high-throughput screening also uncovers an entire category of materials that are promising for thermoelectrics, known as topological insulators. Further analysis shows that band inversion, a characteristic feature of all topological insulators, is the key property that unlocks high thermoelectric performance in this particular set of materials. The data-driven strategy enables the identification of individual candidates as well as broader material classes suitable for thermoelectrics, bringing us one step closer towards realizing thermoelectric cooling to its fullest potential.
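As a toy illustration of descriptor-based screening (the column names, values, and thresholds are invented, not the study's actual descriptors or criteria):

```python
# Toy pandas screen in the spirit of the workflow: filter computed
# descriptors to flag candidate narrow-gap semiconductors.
import pandas as pd

df = pd.DataFrame({
    "compound":       ["SrSb2", "Zn3As2", "NaCdSb", "SiO2"],
    "band_gap_eV":    [0.15, 0.30, 0.25, 8.9],     # from first-principles calcs
    "quality_factor": [1.2, 0.9, 1.1, 0.01],       # descriptor-based metric
})

candidates = df[(df.band_gap_eV > 0.05) & (df.band_gap_eV < 0.7)
                & (df.quality_factor > 0.5)]
print(candidates.compound.tolist())
```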
Global Improvements in the Representation of Women in Science have Stalled
Samvardhan Vishnoi, PhD Student, Weinberg College of Arts & Sciences
Abstract: This study challenges the commonly-held view of slow but consistent progress in gender equality by assessing recent trends in representation of women among published scholars. We analyzed one of the most prominent abstract and citation databases, Scopus, which includes over 33 million publications from 1996 to 2020. We estimated that the Gender Parity Index (i.e., the number of female scholars per male scholar) increased significantly, on a global scale, until around 2011. However, since then the trend towards higher representation of women has stagnated across the large majority of countries worldwide. Our projections indicate that, if current trends persist, gender gaps are likely to increase or stabilize over the next decade.