We propose an adaptation of DiscoSnp for RadSeq data.
- Paper: biorxiv.org/content/early/
- Tool: github.com/GATB/DiscoSnp/
- Novelties:
- Clustering per locus
- Predictions of variants close to sequence extremities
- One or more orders of magnitude faster than Stacks or ipyrad.
- Predictions have high precision (up to 99.3% on simulated data) with good recall (82.2% when using high-precision parameters).
- Applied to real biological data (RAD data from 259 specimens of Chiastocheta flies, morphologically assigned to 7 species), all individuals were successfully assigned to their species using both STRUCTURE and maximum-likelihood phylogenetic reconstruction. Moreover, the identified variants revealed a within-species structure and the existence of two populations linked to their geographic distributions.
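
For readers less familiar with the precision and recall figures quoted above, here is a minimal sketch of how such metrics are usually computed when predicted variants are compared against a simulated truth set. This is the standard definition, not code taken from DiscoSnp; the function name and the counts in the usage line are hypothetical and chosen only for illustration.

```python
def precision_recall(true_positives: int,
                     false_positives: int,
                     false_negatives: int) -> tuple[float, float]:
    """Return (precision, recall) from variant-call counts.

    precision = TP / (TP + FP): fraction of predicted variants that are real.
    recall    = TP / (TP + FN): fraction of real variants that were predicted.
    """
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    # Guard against empty prediction or truth sets.
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Hypothetical counts, NOT results from the paper.
precision, recall = precision_recall(true_positives=90,
                                     false_positives=10,
                                     false_negatives=30)
print(f"precision={precision:.1%} recall={recall:.1%}")
```

Stricter filtering parameters typically lower the false-positive count (raising precision) at the cost of more missed variants (lowering recall), which is the trade-off behind the "high-precision parameters" setting mentioned above.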