Biostatistician - Machine Learning & Bioinformatics
Job Description:
We are seeking a highly skilled and motivated Biostatistician to support the development and validation of novel molecular diagnostic tests by integrating genomic and proteomic data. The successful candidate will apply advanced statistical and machine learning techniques to analyze high-dimensional data, perform quality assurance of next-generation sequencing (NGS) pipelines, and collaborate closely with cross-functional teams in assay development, bioinformatics, and clinical research.
ESSENTIAL RESPONSIBILITIES:
· Design and implement statistical models and machine learning algorithms to support discovery, validation, and performance evaluation of proteomic- and genomic-based diagnostics.
· Perform bioinformatic analysis of NGS datasets, including variant calling, annotation, and filtering.
· Develop and execute quality assurance protocols for NGS data pipelines and result reproducibility.
· Collaborate with bioinformaticians, lab scientists, and software engineers to integrate multi-omic datasets and refine analytical workflows.
· Support study design, statistical analysis plans, and interpretation of findings for both internal R&D and regulatory submissions.
· Write comprehensive documentation, including analysis reports, methods validation, and pipeline specifications.
· Develop reproducible and well-documented code in R, with SAS proficiency preferred for regulatory or clinical data support.
· Contribute to publications and presentations of research findings in scientific and clinical settings.
The above list represents the general duties considered essential functions of the job and is not to be considered an exhaustive description of all the work requirements that may be inherent in the position.
Requirements:
EXPERIENCE & QUALIFICATIONS:
· Strong experience analyzing NGS data, including pipeline QC, normalization, and interpretation (e.g., WES, RNA-Seq, targeted panels).
· Proficiency in R for statistical computing, data visualization, and reproducible research.
· Experience with machine learning techniques (e.g., random forests, support vector machines, regularization, cross-validation) applied to biomedical data.
· Working knowledge of bioinformatics tools and formats (e.g., FASTQ, VCF, BAM, GTF, Bioconductor).
· Experience in data integration from proteomic and genomic platforms is highly desirable.
· Familiarity with SAS for clinical or regulatory submissions is preferred.
· Excellent problem-solving skills, strong attention to detail, and the ability to work both independently and in teams.
· Experience in regulated environments (e.g., CLIA, CAP, FDA submissions).
· Familiarity with cloud-based computing (e.g., AWS, GCP) and version control (e.g., Git).
· Background in molecular diagnostics or translational research projects.
· Strong written and verbal communication skills for both technical and nontechnical audiences.
EDUCATION:
PhD or Master’s degree in Biostatistics, Bioinformatics, Computational Biology, or related discipline.