Jana Diesner

Welcome!

I recently joined the University of Illinois at Urbana-Champaign, Graduate School of Library and Information Science as an assistant professor. Prior to that, I wrote my PhD thesis at Carnegie Mellon University, School of Computer Science, where I am a PhD candidate in the Computation, Organizations and Society Program. My thesis is about the computational integration of text data and network data to construct and analyze socio-technical networks. My committee members are Kathleen M. Carley, William W. Cohen, Carolyn P. Rose and Jeff Johnson.

Overview of Research Experience:

My research mission is to contribute to the computational analysis and better understanding of the co-evolution and interplay of information and the structure and functioning of socio-technical networks. To put this goal into action, I conduct highly interdisciplinary research at the nexus of network science, natural language processing and machine learning. More specifically, I develop, adapt, analyze and apply methods and technologies that facilitate the extraction of information about networks from large-scale text corpora and the consideration of the substance of information for network analysis. This work promotes the collection and management of rich network data, and meaningful and actionable data analysis.

From an application domain perspective, I am passionate about studying socio-technical factors that impede the sustainable development of networks and their wider context. Driven by this interest, I focus on networks that involve the production and processing of covert information, and on covert networks.

Technology:
At CMU, I have been a developer for AutoMap. This tool supports users in performing a wide range of NLP routines, content analysis, and relational text data analysis. As part of my thesis, I have been focusing on adding machine-learning-based text mining methods, e.g. entity extraction, to AutoMap and evaluating them.

Methods:
Robustness: The goal of the technologies I work on is to facilitate informed, accurate and efficient information extraction from natural language text data. The computational steps and interdependent subroutines involved in highly accurate technologies for this purpose impact the structure, properties and interpretation of the resulting network data. These impacts are not sufficiently understood. I address this shortcoming by determining the amount and boundaries of variations in network structure that are due to engineering decisions made when building tools and end-user decisions made when applying tools (example).

Integration of text data and network data: Prior research from different fields has shown that without considering the substance of information and communication data, we are limited in our ability to understand the effects of language use and information flow in communities, including the transformative role that language can play in communities. To address this limitation, I combine theory and models from the social sciences with computational methods, especially from natural language processing and machine learning, to develop methods that support the utilization of the content of text data for network analysis (example).

Empirical Work:
I deploy the tools and methods I work on to study dynamic networks from the domains of business (e.g. Enron) and science (e.g. collaboration on research proposals), and groups from various geopolitical regions (e.g. the Sudan). For many of these projects, I had the chance to collaborate with and learn from (subject matter) experts from a wide range of domains and backgrounds - thank you all.

photo by cirosec