Research
Selected Publications & Preprints
He, Z., Chu, B., Yang, J., Gu, J., Chen, Z., Liu, L., Morrison, T., Belloy, M.E., Qi, X., Hejazi, N. and Mathur, M., 2024. Beyond guilty by association at scale: searching for causal variants on the basis of genome-wide summary statistics. bioRxiv, pp.2024-02.
Chen, Z.*, He, Z.*, Chu, B.B., Gu, J., Morrison, T., Sabatti, C. and Candès, E., 2024. Controlled Variable Selection from Summary Statistics Only? A Solution via GhostKnockoffs and Penalized Regression. arXiv preprint arXiv:2402.12724. *Equal contribution.
Chu, B.B., Gu, J., Chen, Z., Morrison, T., Candès, E.*, He, Z.* and Sabatti, C.*, 2023. Second-order group knockoffs with applications to GWAS. arXiv preprint arXiv:2310.15069. *Equal contribution.
Yu, C.X., Gu, J., Chen, Z. and He, Z., 2023. Summary Statistics Knockoffs Inference with Family-wise Error Rate Control. arXiv preprint arXiv:2310.09493.
Qi, X., Belloy, M.E., Gu, J., Liu, X., Tang, H.* and He, Z.*, 2023. Robust inference with GhostKnockoffs in genome-wide association studies. arXiv preprint arXiv:2310.04030. *Equal contribution.
Oh, H.S.-H., Rutledge, J., Nachun, D., ..., He, Z., ..., Wyss-Coray, T. (2023). Organ aging signatures in the plasma proteome track health and disease. Nature, 624, 164-172. (featured on the cover of Nature)
Gyawali, P.K., Le Guen, Y., Liu, X., Tang, H., Zou, J. and He, Z. (2023). Improving genetic risk prediction across diverse population by disentangling ancestry representations. Communications Biology, 6(1), p.964.
Le Guen, Y., Raulin, A., Logue, M. W., Sherva, R., Belloy, M. E., Eger, S. J., Chen, A., Kennedy, G., Kuchenbecker, L., O'Leary, J. P., Zhang, R., Merritt, V. C., Panizzon, M. S., Hauger, R. L., Gaziano, J. M., Bu, G., Thornton, T. A., Farrer, L. A., Napolioni, V., He, Z. and Greicius, M. D. (2023). Association of African Ancestry-Specific APOE Missense Variant R145C With Risk of Alzheimer Disease. JAMA, 329 (7): 551-560.
Gyawali, P.K., Liu, X., Zou, J. and He, Z. (2022). Ensembling improves stability and power of feature selection for deep learning models. Machine Learning in Computational Biology, 33-45.
He, Z., Liu, L., Belloy, M.E., Le Guen, Y., Sossin, A., Liu, X., Qi, X., Ma, S., Gyawali, P.K., Wyss-Coray, T., Tang, H., Sabatti, C., Candès, E., Greicius, M.D., Ionita-Laza, I. (2022). GhostKnockoff inference empowers identification of putative causal variants in genome-wide association studies. Nature Communications, 13(1), pp.1-16.
Lu, F., Sossin, A., Abell, N., Montgomery, S. B., He, Z. (2022). Deep learning-assisted genome-wide characterization of massively parallel reporter assays. Nucleic Acids Research, 50(20), pp.11442-11454.
Kassani, P.H., Lu, F., Le Guen, Y. and He, Z. (2022). Deep neural networks with controlled variable selection for the identification of putative causal genetic variants. Nature Machine Intelligence, 4(9), pp.761-771.
Abell, N.S., DeGorter, M.K., Gloudemans, M., Greenwald, E., Smith, K.S., He, Z., Montgomery, S.B. (2022). Multiple Causal Variants Underlie Genetic Associations in Humans. Science, 375 (6586), pp.1247-1254.
Yang, Y., Wang, C., Liu, L., Buxbaum, J., He, Z. and Ionita-Laza, I. (2022). KnockoffTrio: A knockoff framework for the identification of putative causal variants in genome-wide association studies with trio design. The American Journal of Human Genetics, 109(10), pp.1761-1776.
Ma, S., Dalgleish, J., Lee, J., Wang, C., Liu, L., Gill, R., Buxbaum, J.D., Chung, W.K., Aschard, H., Silverman, E.K., Cho, M.H., He, Z., Ionita-Laza, I. (2021). Powerful gene-based testing by integrating long-range chromatin interactions and knockoff genotypes. Proceedings of the National Academy of Sciences, 118(47).
He, Z., Liu, L., Wang, C., Le Guen, Y., Lee, J., Gogarten, S., Lu, F., Montgomery, S., Tang, H., Silverman, E., Cho, M.H., Greicius, M.D., Ionita-Laza, I. (2021). Identification of putative causal loci in whole-genome sequencing data via knockoff statistics. Nature Communications, 12(1), pp.1-18.
He, Z., Le Guen, Y., Liu, L., Lee, J., Ma, S., Yang, A.C., Liu, X., Rutledge, J., Losada, P.M., Song, B., Belloy, M.E., Butler III, R.R., Longo, F.M., Tang, H., Mormino, E.C., Wyss-Coray, T., Greicius, M.D., Ionita-Laza, I. (2021). Genome-wide analysis of common and rare variants via multiple knockoffs at biobank scale, with an application to Alzheimer disease genetics. The American Journal of Human Genetics, 108(12), pp.2336-2353.
Le Guen, Y., Belloy, M.E., Napolioni, V., Eger, S.J., Kennedy, G., Tao, R., He, Z., Greicius, M. (2021). A novel age-informed approach for genetic association analysis in Alzheimer's disease. Alzheimer's Research & Therapy, 13(1), pp.1-14.
He, Z., Xu, B., Buxbaum, J., Ionita-Laza, I. (2019). A genome-wide scan statistic framework for whole-genome sequence data analysis. Nature Communications, 10(1), 3018.
He, Z., Liu, L., Wang, K., Ionita-Laza, I. (2018). A semi-supervised approach for predicting cell type specific functional consequences of non-coding variation using MPRAs. Nature Communications, 9(1), 5199.
He, Z., Xu, B., Lee, S., Ionita-Laza, I. (2017). Unified sequence-based association tests allowing for multiple functional annotations, and meta-analysis of noncoding variation in Metabochip data. The American Journal of Human Genetics, 101(3), 340-352.
He, Z., Zhang, M., Lee, S., Smith, J.A., Kardia, S.L.R., Diez Roux, A.V. and Mukherjee, B. (2017). Set-based tests for gene-environment interaction in longitudinal studies. Journal of the American Statistical Association, 112(519), 966-978.
He, Z., Zhang, M., Lee, S., Smith, J.A., Guo, X., Palmas, W., Kardia, S.L.R., Diez Roux, A.V., and Mukherjee, B. (2015). Set-based tests for genetic association in longitudinal studies. Biometrics, 71(3), 606-615.
He, Z., Zhang, M., Zhan, X., and Lu, Q. (2014). Modeling and testing for joint association using a genetic random field model. Biometrics, 70(3), 471-479.