Haris Smajlovic

  • MSc (University of Sarajevo, 2017)

  • BSc (University of Sarajevo, 2015)

Notice of the Final Oral Examination for the Degree of Doctor of Philosophy

Topic

Secure Computational Genomics

Department of Computer Science

Date & location

  • Wednesday, November 27, 2024

  • 10:00 A.M.

  • Engineering & Computer Science Building

  • Room 467

Reviewers

Supervisory Committee

  • Dr. Ibrahim Numanagić, Department of Computer Science, University of Victoria (Supervisor)

  • Dr. Sean Chester, Department of Computer Science, UVic (Member)

  • Dr. Riham AlTawy, Department of Electrical and Computer Engineering, UVic (Outside Member) 

External Examiner

  • Dr. Yun William Yu, School of Computer Science, Carnegie Mellon University 

Chair of Oral Examination

  • Dr. Jay Cullen, School of Earth and Ocean Sciences, UVic 

Abstract

Biomedical data, scattered across biobanks and healthcare providers in multiple countries, is used extensively for research. Collaboration and data sharing between institutions often provide access to more diverse datasets and an opportunity to conduct more comprehensive studies. However, these collaborative efforts are usually hindered by privacy concerns that make pooling such data in a centralized database impossible. To enable collaborative studies on such datasets, we present an easy-to-use programming framework with two domain-specific languages, Sequre and Shechi, for secure high-performance computing on private, distributed datasets. Our framework automatically converts Pythonic code into a secure distributed equivalent using secure multiparty computation (SMC) in Sequre and, for the first time, multiparty homomorphic encryption (MHE) in Shechi to enable efficient distributed computation. It hides the private and distributed nature of the input data from end users behind a familiar Pythonic syntax, and it introduces new data types for the efficient handling of distributed data as well as systematic compiler optimizations for cryptographic and distributed computation. We evaluate our framework on a wide range of applications, including complex genomic analysis tasks and statistical analysis of private electronic health records (EHRs). Our results demonstrate Sequre’s and Shechi’s ability to uncover optimizations missed even by expert developers, achieving up to 15× runtime improvements over prior state-of-the-art solutions and a 40-fold improvement in code expressiveness compared to code manually optimized by experts.
Finally, our solution allows mutually distrusting data owners to use their distributed datasets as a whole in joint studies and, as a result, facilitates data sharing and collaboration in privacy-sensitive fields such as biomedicine.
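
For readers unfamiliar with secure multiparty computation, the following minimal Python sketch illustrates additive secret sharing, a common building block of SMC. It is not Sequre or Shechi code and does not reflect the protocols used in the thesis; the modulus, the function names (`share`, `add_shared`, `reconstruct`), and the example counts are assumptions chosen only for illustration.

```python
import secrets

# Illustrative modulus (an assumption; real protocols choose this carefully).
MODULUS = 2**61 - 1


def share(value, n_parties):
    """Split `value` into n additive shares that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def add_shared(a_shares, b_shares):
    """Each party adds its own two shares locally; no data is exchanged."""
    return [(a + b) % MODULUS for a, b in zip(a_shares, b_shares)]


def reconstruct(shares):
    """Combine all shares to reveal the underlying value."""
    return sum(shares) % MODULUS


if __name__ == "__main__":
    # Two hypothetical institutions hold private patient counts.
    count_a, count_b = 120, 85
    shares_a = share(count_a, n_parties=3)
    shares_b = share(count_b, n_parties=3)
    # Only the sum is revealed, never the individual counts.
    print(reconstruct(add_shared(shares_a, shares_b)))  # prints 205
```

Any single share is a uniformly random number and reveals nothing about the input on its own; only the combination of all shares recovers the result.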