A Deep Learning Approach to Industry Classification
by
Xiao Fang
Professor of Management Information Systems
JPMorgan Chase Senior Fellow
Lerner College of Business and Economics &
Institute for Financial Services Analytics
University of Delaware
Friday, Feb 18
11:00 am – 12:30 pm
In-person: 1810 Liacouras Walk 420
Abstract:
Industry classification systems (ICSs), which identify economically related firms as peer firms, play a fundamental role in business research and practice. Traditional expert-driven approaches manually design ICSs and thus have limitations, including high maintenance costs and coarse granularity of the identified firm relatedness. To circumvent these limitations, recent research takes an algorithm-driven approach, employing a bag-of-words method to represent firms’ 10-K reports and leveraging these representations to identify economically related firms. While firms’ 10-K reports are highly informative for this purpose, the bag-of-words method is inadequate for representing them, as it ignores the rich semantic information encoded in word contexts and order, resulting in a less effective ICS. Recent developments in deep-learning-based document embedding provide powerful tools for document representation. However, existing document embedding models (DEMs) are not well suited to capturing the rich semantics of 10-K reports, because these reports are long documents featuring heterogeneous and shifting concepts. We propose a novel DEM that addresses these challenges through the design of an adaptive gating mechanism and its associated gating function. In addition, we develop a new ICS that takes firms’ 10-K reports as input, employs the proposed DEM to represent the semantics of these reports, and identifies economically related firms based on similarities between their 10-K representations. We demonstrate through extensive empirical evaluations that our proposed ICS is superior to representative existing ICSs as well as ICSs constructed using state-of-the-art DEMs. This study contributes to business research and practice with a novel ICS that can effectively identify economically related firms.
It also contributes to the field of deep-learning-based document embedding with an innovative DEM that can capture the semantics of a broad variety of long documents with shifting concepts, such as 10-K reports, legal documents, and patent documents.
Bio:
Xiao Fang is Professor of MIS and JPMorgan Chase Senior Fellow at the Lerner College of Business and Economics and the Institute for Financial Services Analytics, University of Delaware. He also holds appointments in the Department of Computer Science and the Department of Electrical and Computer Engineering, University of Delaware. His current research focuses on financial technology, social network analytics, and health care analytics, with methods and tools drawn from reference disciplines including Computer Science (e.g., Machine Learning) and Management Science (e.g., Optimization). He has published in business journals including Management Science, Operations Research, MIS Quarterly, and Information Systems Research, as well as computer science outlets such as ACM Transactions on Information Systems and IEEE Transactions on Knowledge and Data Engineering. Professor Fang co-founded the INFORMS Workshop on Data Science in 2017. He served as an Associate Editor for MIS Quarterly and currently serves on the editorial board of Service Science (INFORMS).