Contig cluster assignments obtained by varying the cd-hit sequence identity cutoff from 0.95 to 1.00 and the length difference cutoff from 0 to 0.8. Take care: the cluster ids are not unique! Only the combination of cluster id, len_diff_cutoff and ident_cutoff is unique. See the code for how this is managed in practice, and the manuscript for further details.

btESBL_contigclusters_sensax

Format

A tidy data frame with 6426 rows and 8 variables:

cluster

Cluster id from cd-hit

id

Sequence id within cluster from cd-hit

length

Length of sequence in bases

contig

Contig unique identifier

lane

Sample ID

len_diff_cutoff

Length difference cutoff (-s in cd-hit)

ident_cutoff

Sequence identity cutoff (-c in cd-hit)
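
Because a cluster id is only meaningful together with len_diff_cutoff and ident_cutoff, any grouping of contigs needs to use all three columns as the key. The following is a minimal sketch of that idea in Python, not the package's own code: the rows are invented toy values, the function name group_by_cluster_key is hypothetical, and only the column names are taken from the table described above.

```python
from collections import defaultdict

# Toy rows with invented values; only the column names come from the
# documented data frame. Note that cluster id 0 appears under two
# different cutoff settings and refers to two different clusters.
rows = [
    {"cluster": 0, "contig": "contig_a", "len_diff_cutoff": 0.0, "ident_cutoff": 0.95},
    {"cluster": 0, "contig": "contig_b", "len_diff_cutoff": 0.0, "ident_cutoff": 0.95},
    {"cluster": 0, "contig": "contig_a", "len_diff_cutoff": 0.8, "ident_cutoff": 1.00},
    {"cluster": 1, "contig": "contig_b", "len_diff_cutoff": 0.8, "ident_cutoff": 1.00},
]


def group_by_cluster_key(rows):
    """Group contigs by the composite (cluster, len_diff_cutoff, ident_cutoff) key."""
    groups = defaultdict(list)
    for row in rows:
        key = (row["cluster"], row["len_diff_cutoff"], row["ident_cutoff"])
        groups[key].append(row["contig"])
    return dict(groups)


clusters = group_by_cluster_key(rows)
for key, contigs in sorted(clusters.items()):
    print(key, contigs)
```

Grouping on cluster alone would merge the first three toy rows into one group, even though they come from two separate cd-hit runs; the composite key keeps them apart.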