FastImpute: Development and Validation of a Workflow for Open-source, Reference-Free Genotype Imputation Methods - An Example in Breast Cancer (PRS313_BC)

Ge, Aaron; Balasubramanian, Jeya; Wu, Xueyao; Kraft, Peter; Almeida, Jonas S.

Aaron Ge¹,²,*, Jeya Balasubramanian¹, Xueyao Wu¹, Peter Kraft¹, Jonas S. Almeida¹

The Open Bioinformatics Journal • 16 Feb 2026 • RESEARCH ARTICLE • DOI: 10.2174/0118750362421210250929110508

Introduction

Genotype imputation improves the resolution of genetic data, but traditional methods are either computationally intensive or compromise privacy, and deep learning alternatives are often too large for client-side deployment. This study developed FastImpute, a workflow for creating lightweight, reference-free imputation models that enable real-time, accessible genetic risk assessment on edge devices.

Methods

Using whole-genome sequencing data from 2,504 individuals in the 1000 Genomes Project, linear and logistic regression models were trained to impute the single-nucleotide polymorphisms (SNPs) used in the breast cancer polygenic risk score PRS313_BC. The models took SNPs available on commercial genotyping arrays as input. Performance was evaluated against the sequencing data and benchmarked against Beagle.
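
As a rough illustration of the linear variant described above, the sketch below fits a least-squares model that predicts the dosage (0, 1, or 2) of one unobserved target SNP from the dosages of nearby array SNPs. It is not the FastImpute implementation: the function names, the small ridge term, the clamping of predictions to the 0 to 2 range, and the toy genotypes are all assumptions made for this example, and the logistic variant is not shown.

```python
# Hypothetical sketch: impute one target SNP from array SNPs with a
# per-SNP linear model. Names and toy data are illustrative only.

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_linear_imputer(X, y, ridge=1e-6):
    """Fit intercept + one weight per array SNP via the normal equations.

    X: list of samples, each a list of array-SNP dosages.
    y: target-SNP dosages from sequencing data (training labels).
    ridge: small diagonal term (an assumption) to keep the system solvable.
    """
    rows = [[1.0] + [float(v) for v in r] for r in X]
    p = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    for i in range(p):
        XtX[i][i] += ridge
    Xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    return solve(XtX, Xty)


def impute(weights, x):
    """Predict a target dosage, clamped to the valid range [0, 2]."""
    d = weights[0] + sum(w * v for w, v in zip(weights[1:], x))
    return min(2.0, max(0.0, d))


# Toy training set: the target SNP is in perfect LD with the first array SNP.
X_train = [[0, 1], [1, 0], [2, 2], [1, 1], [0, 2], [2, 0]]
y_train = [0, 1, 2, 1, 0, 2]
weights = fit_linear_imputer(X_train, y_train)
print(impute(weights, [2, 1]))
```

A fitted model of this kind is just a short vector of weights per target SNP, which is what makes client-side use plausible.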

Results

The polygenic risk score (PRS) calculated with our linear model correlated strongly with the PRS from true sequencing data (R² = 0.86), significantly outperforming no imputation and minor allele frequency imputation (R² = 0.38). Our logistic model correctly identified 4 of 6 individuals in the top 1% of breast cancer risk, matching Beagle’s performance.
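
The two evaluation ideas reported above, agreement between imputed and sequencing-based scores and recovery of the highest-risk individuals, can be sketched as follows. This is an illustration only: it assumes the PRS is a weighted sum of dosages and that R² is the squared Pearson correlation, neither of which is spelled out in this abstract, and the function names and the rounding rule for the size of the top group are hypothetical.

```python
# Hypothetical sketch of PRS evaluation; not the authors' code.

def prs(dosages, effect_weights):
    """Polygenic risk score as a weighted sum of SNP dosages."""
    return sum(w * d for w, d in zip(effect_weights, dosages))


def r_squared(a, b):
    """Squared Pearson correlation between two score lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov * cov / (var_a * var_b)


def top_fraction(scores, fraction):
    """Indices of the highest-scoring individuals (at least one)."""
    k = max(1, round(len(scores) * fraction))
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(order[:k])


def top_overlap(true_scores, imputed_scores, fraction=0.01):
    """Return (correctly identified, total) for the top-risk group."""
    truth = top_fraction(true_scores, fraction)
    found = top_fraction(imputed_scores, fraction)
    return len(truth & found), len(truth)


# Toy scores for four individuals (made-up numbers).
true_scores = [1.0, 2.0, 3.0, 4.0]
imputed_scores = [1.1, 1.9, 3.2, 3.9]
print(r_squared(true_scores, imputed_scores))
print(top_overlap(true_scores, imputed_scores, fraction=0.5))
```

With a helper like `top_overlap`, a statement such as "4 of 6 individuals in the top 1%" corresponds to the returned pair of counts.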

Discussion

Our approach balances performance and efficiency, enabling deployment on personal devices and preserving user privacy through local data processing. It thereby democratizes access to genetic risk assessment using direct-to-consumer data. However, this proof of concept requires validation across other genomic contexts before clinical use.

Conclusion

The FastImpute pipeline demonstrates that lightweight models can enable real-time genetic risk assessment on edge devices.

Keywords: Genotype imputation, Reference-free methods, FastImpute, Breast cancer, PRS313, Client-side imputation, Web technologies, Polygenic risk score, Direct-to-consumer test.

© 2026 The Author(s). Published by Bentham Open.
