Through the work of the Integrative Data Science Lab (IDSL), we are pioneering new ways to rapidly improve early-stage drug discovery, using integrative knowledge graphs and advanced machine learning approaches to profile and predict the biological effects of potential new drugs. We developed Chem2Bio2RDF, the first large-scale linked public data graph for preclinical drug discovery, along with novel link prediction and data mining algorithms for finding hidden insights in large heterogeneous data graphs. In our 2012 Drug Discovery Today paper, we laid out a strategy for using linked data and graph analytics to expand beyond the current single-target drug discovery model. Current projects include researching knowledge graphs that encode computable networks for multi-mechanism complex diseases, integrating patient medical records with molecular data to help identify potential targeted therapies, and developing new ways to apply machine learning to heterogeneous linked data graphs. We are thankful to NIH NCATS, Indiana CTSI, the OpenPHACTS foundation, Eli Lilly, and Pfizer for funding this work. Applications in this area are being commercialized through our company, Data2Discovery Inc.
This is a selection of our papers that we think makes a good starting point, with links to the PDFs of the articles. For a full list of publications relating to knowledge graphs, see David’s Google Scholar page. Please contact David if you have trouble accessing any papers of interest.
Wild, D.J., Ding, Y., Sheth, A.P., Harland, L., Gifford, E.M., Lajiness, M.S. Systems Chemical Biology and the Semantic Web: what they mean for the future of drug discovery research, Drug Discovery Today, 2012, 17, 469-474.
Chen, B., Dong, X., Jiao, D., Wang, H., Zhu, Q., Ding, Y., Wild, D.J. Chem2Bio2RDF: a semantic framework for linking and data mining chemogenomic and systems chemical biology data, BMC Bioinformatics, 2010, 11, 255.