Invention:
This invention improves the base calling accuracy of nanopore sequencing through an iterative approach that combines existing machine learning algorithms with alignment to ground-truth reference sequences. Because base calling accuracy improves as the system is trained, the method can benefit sequencing in a variety of applications.
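The iterative idea (call bases, align the calls to a known reference, retrain on the corrected labels, repeat) can be illustrated with a toy sketch. This is not the inventors' implementation. Everything in it is an assumption made for illustration: the per-base current levels in `TRUE_LEVEL`, the one-sample-per-base signal, the nearest-centroid "base caller" standing in for a real machine learning model, and the simple Needleman-Wunsch aligner. The sketch also trains and scores on the same simulated read, which a real system would avoid.

```python
import random

BASES = "ACGT"
# Hypothetical mean current level (pA) per base, for illustration only.
# Real pores respond to several neighboring bases at once, not single bases.
TRUE_LEVEL = {"A": 80.0, "C": 95.0, "G": 110.0, "T": 125.0}


def simulate_read(reference, rng, noise=4.0):
    """One noisy current sample per reference base (pre-segmented toy signal)."""
    return [rng.gauss(TRUE_LEVEL[b], noise) for b in reference]


def call_bases(signal, centroids):
    """Toy base caller: pick the base whose learned level is closest."""
    return "".join(min(BASES, key=lambda b: abs(x - centroids[b])) for x in signal)


def align(called, reference, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment; returns (called_idx, ref_idx) pairs."""
    n, m = len(called), len(reference)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if called[i - 1] == reference[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back, keeping only columns where a call is paired with a reference base.
    pairs, i, j = [], n, m
    while i > 0 and j > 0:
        sub = match if called[i - 1] == reference[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return pairs[::-1]


def retrain(signal, reference, pairs, centroids):
    """Re-estimate each base's level from samples labeled by the aligned reference."""
    updated = dict(centroids)
    for b in BASES:
        samples = [signal[i] for i, j in pairs if reference[j] == b]
        if samples:
            updated[b] = sum(samples) / len(samples)
    return updated


def iterate(reference, signal, centroids, rounds=3):
    """Call, align, score, retrain; returns final levels and per-round accuracy."""
    history = []
    for _ in range(rounds):
        called = call_bases(signal, centroids)
        pairs = align(called, reference)
        matches = sum(called[i] == reference[j] for i, j in pairs)
        history.append(matches / len(reference))
        centroids = retrain(signal, reference, pairs, centroids)
    return centroids, history


rng = random.Random(0)
reference = "".join(rng.choice(BASES) for _ in range(300))
signal = simulate_read(reference, rng)
# Deliberately miscalibrated starting levels, so the first round makes many errors.
start = {"A": 70.0, "C": 90.0, "G": 115.0, "T": 140.0}
centroids, history = iterate(reference, signal, start)
print([round(a, 3) for a in history])
```

Each entry in `history` is the fraction of reference bases called correctly in that round, measured before that round's retraining step, so the first entry reflects the miscalibrated starting model and later entries reflect models corrected by alignment to the reference.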
Background:
Nanopore sequencing is a method in which single strands of DNA or RNA pass through a nanopore, an extremely small biological or synthetic pore embedded in a membrane. As nucleotides move through the pore, they disrupt an electrical current in distinct ways. These current fluctuations are recorded and interpreted to determine the nucleotide sequence. This enables real-time, long-read sequencing without the need for amplification. However, the technique faces significant limitations in accurately identifying chemically modified nucleotides. These modifications often produce irregular or overlapping signal patterns that standard base calling algorithms cannot easily resolve, reducing overall sequencing accuracy and hindering its utility in applications such as RNA therapeutics.
Applications:
- Training of nanopore sequencing base calling software via machine learning
Advantages:
- Accuracy improves over time as the system is trained
- Improved sequencing of DNA/RNA
- Uses common, existing ML technology
- Enables quality assessment of therapeutic RNA agents