We propose an adaptation of DiscoSnp for RAD-Seq data.
- Clustering per locus
- Predictions of variants close to sequence extremities
- One or more orders of magnitude faster than Stacks or ipyrad.
- Predictions have high precision (up to 99.3% on simulated data) with good recall (82.2% when using the high-precision parameters).
- Applied to real biological data (RAD data from 259 specimens of Chiastocheta flies, morphologically assigned to 7 species), the method allowed all individuals to be assigned to their species using both STRUCTURE and maximum-likelihood phylogenetic reconstruction. Moreover, the identified variants revealed within-species structure and the existence of two populations linked to their geographic distributions.
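
For readers unfamiliar with the two metrics quoted above, the following minimal Python sketch shows how precision and recall are conventionally computed from counts of true positive, false positive, and false negative variant predictions. The function name `precision_recall` and the counts are hypothetical and chosen only for illustration; they are not results from this work, and the sketch does not describe how the evaluation was actually carried out.

```python
def precision_recall(true_positives: int, false_positives: int, false_negatives: int) -> tuple[float, float]:
    """Return (precision, recall) for a set of variant predictions.

    precision = TP / (TP + FP): fraction of predicted variants that are real.
    recall    = TP / (TP + FN): fraction of real variants that were predicted.
    """
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    # Guard against empty prediction or truth sets.
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Hypothetical counts for illustration only (not data from the study).
p, r = precision_recall(true_positives=90, false_positives=10, false_negatives=30)
print(f"precision={p:.3f} recall={r:.3f}")
```

Stricter filtering parameters typically lower the number of false positives at the cost of more false negatives, which is why a high-precision setting is reported together with its recall.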