Escherichia_coli_SRR38074175_example

NS Bio Bacterial Analysis · Generated 2026-05-11
General Summary

Overall status: PASS

Metric                 Value
Top species (Kraken2)  Escherichia coli (71.5%)
Carbapenemase          Absent
BUSCO                  C 99.5% · S 99.4% · D 0.1% · F 0.3% · M 0.1% · n=874 (ENTEROBACTERIACEAE)
MLST                   ST73 (ecoli_achtman_4)

QC Check              Value         Status
BUSCO Complete ≥ 90%  99.5%         PASS
Total contigs         5             -
Largest contig        5,131,858 bp  -

MLST Typing

Scheme: ecoli_achtman_4
Sequence Type: ST73

Allele calls

Locus   adk  fumC  gyrB  icd  mdh  purA  recA
Allele  36   24    9     13   17   11    25
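
To illustrate how an allele profile maps to a sequence type, here is a minimal sketch. The lookup table holds only the single profile reported above; a real scheme such as ecoli_achtman_4 is distributed as a much larger profile table. The names `LOCI`, `PROFILES`, and `call_st` are hypothetical and not part of any MLST tool.

```python
from typing import Dict, Optional, Tuple

# Locus order used in the allele table of this report.
LOCI = ("adk", "fumC", "gyrB", "icd", "mdh", "purA", "recA")

# Hypothetical one-row profile table: only the profile from this report.
PROFILES: Dict[Tuple[int, ...], str] = {
    (36, 24, 9, 13, 17, 11, 25): "ST73",
}


def call_st(alleles: Dict[str, int]) -> Optional[str]:
    """Return the sequence type for an allele profile, or None if unknown."""
    key = tuple(alleles[locus] for locus in LOCI)
    return PROFILES.get(key)


sample = dict(zip(LOCI, (36, 24, 9, 13, 17, 11, 25)))
print(call_st(sample))
```

A profile that differs at any locus is absent from this toy table, so `call_st` returns `None` for it rather than guessing a type.
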
Serotyping (E. coli)

Name     Species           Serotype  O type  H type  QC  Warnings
contigs  Escherichia coli  O6:H1     O6      H1      -   -

AMR Genes (ABRicate)

Gene     Accession    Class          Coverage  Plasmid
blaEC-5  NG_049085.1  CEPHALOSPORIN  100.0%

Quality Filtering (fastp)

Filtering Summary

Passed: 91,007
Low quality: 5,978
Too short: 1,368
Too long: 0

Read Statistics

Metric            Before filtering  After filtering
Total reads       98,331            91,007
Total bases       366.6 Mbp         357.6 Mbp
Q30 rate          72.5%             73.9%
Q20 rate          85.4%             86.6%
Mean read length  3,728 bp          3,929 bp
GC content        50.5%             50.5%
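
The read statistics can be cross-checked with simple arithmetic. The sketch below derives the fraction of reads retained and the mean read length from the reported totals. The function and variable names are illustrative, and because the report rounds total bases to 0.1 Mbp, the derived means are approximate.

```python
# Values copied from the Read Statistics table above.
reads_before, reads_after = 98_331, 91_007
bases_before, bases_after = 366.6e6, 357.6e6  # Mbp values, rounded in the report


def retained_fraction(before: int, after: int) -> float:
    """Fraction of reads that survived filtering."""
    return after / before


def mean_read_length(total_bases: float, total_reads: int) -> float:
    """Mean read length in bp, derived from total bases and total reads."""
    return total_bases / total_reads


print(f"Reads retained: {retained_fraction(reads_before, reads_after):.1%}")
print(f"Mean length before: {mean_read_length(bases_before, reads_before):.0f} bp")
print(f"Mean length after: {mean_read_length(bases_after, reads_after):.0f} bp")
```

With these inputs the derived mean lengths agree with the reported 3,728 bp and 3,929 bp after rounding to whole bases.
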
Taxonomic Classification (Kraken2)

What is Kraken2? Kraken2 classifies sequencing reads by matching k-mers against a reference database. The percentage shown is the proportion of all reads assigned to each species and its sub-taxa (clade abundance).
Top 10 species-level hits:

Species                 Abundance  Reads
Escherichia coli        71.46%     65,031
Klebsiella pneumoniae   1.01%      917
Escherichia albertii    0.18%      163
Shigella dysenteriae    0.12%      106
Escherichia fergusonii  0.09%      78
Shigella flexneri       0.09%      85
Escherichia marmotae    0.08%      74
Salmonella enterica     0.07%      68
Klebsiella grimontii    0.06%      51
Escherichia sp. E4742   0.05%      42
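
As a rough illustration of the clade abundance described above, the sketch below computes each percentage as clade reads divided by total reads. The report does not state which read set was passed to Kraken2; the denominator of 91,007 (the reads that passed fastp filtering) is an assumption, and `clade_percent` is a hypothetical helper name.

```python
# Assumption: Kraken2 was run on the 91,007 reads that passed fastp filtering.
TOTAL_READS = 91_007

# Clade read counts copied from the top hits above.
clade_reads = {
    "Escherichia coli": 65_031,
    "Klebsiella pneumoniae": 917,
    "Escherichia albertii": 163,
}


def clade_percent(reads: int, total: int = TOTAL_READS) -> float:
    """Percentage of all reads assigned to a clade (the taxon plus its sub-taxa)."""
    return 100.0 * reads / total


for species, reads in clade_reads.items():
    print(f"{species}: {clade_percent(reads):.2f}%")
```

Under that assumption the computed values reproduce the reported 71.46%, 1.01%, and 0.18% for these three species.
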

Genome Completeness (BUSCO)

What is BUSCO? BUSCO (Benchmarking Universal Single-Copy Orthologs) assesses genome completeness by searching for conserved genes expected in the organism's lineage. A high "Complete" score indicates a near-complete assembly; "Fragmented" and "Missing" suggest gaps.
Complete (single-copy): 99.4%
Complete (duplicated): 0.1%
Complete (total): 99.5%
Fragmented: 0.3%
Missing: 0.1%
Total markers (n): 874
Lineage: ENTEROBACTERIACEAE
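
The BUSCO categories relate in a simple way: the total Complete score is the sum of the single-copy and duplicated scores, and the General Summary QC check requires that total to be at least 90%. A minimal sketch of that check follows. The function names and the "FAIL" label are assumptions, since the report only shows a passing case.

```python
# Percentages copied from the BUSCO table above.
single, duplicated, fragmented, missing = 99.4, 0.1, 0.3, 0.1

# Threshold taken from the QC check in the General Summary (BUSCO Complete >= 90%).
COMPLETE_THRESHOLD = 90.0


def complete_total(single_pct: float, duplicated_pct: float) -> float:
    """Complete = single-copy + duplicated, rounded to one decimal like the report."""
    return round(single_pct + duplicated_pct, 1)


def busco_status(complete_pct: float, threshold: float = COMPLETE_THRESHOLD) -> str:
    """Return "PASS" when completeness meets the threshold ("FAIL" is an assumed label)."""
    return "PASS" if complete_pct >= threshold else "FAIL"


complete = complete_total(single, duplicated)
print(complete, busco_status(complete))
```

The four rounded percentages sum to 99.9% rather than 100%, which is expected when each category is rounded independently.
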
Gene Annotation (Bakta)

Genome Statistics

Genome size: 5,208,221 bp
GC content: 50.42%
N50: 5,132,192 bp
Coding ratio: 88.9%

Annotated Features

Feature type                Count
CDS (protein-coding genes)  4,824
ncRNA                       195
tRNA                        88
sORF                        75
ncRNA region                55
rRNA                        22
oriC                        3
tmRNA                       1
oriT                        1