Pangenome

Statistics from vg stats, computed after chopping the graph into nodes of at most 32 bp.

The degree distribution looks like this:

Note: the degree is winsorized at 100 (nodes with degree > 100 are grouped and shown at x = 100).
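As a rough illustration of the winsorizing described in the note, the sketch below caps each node degree at 100 before counting. The function name and the example degrees are hypothetical and are not taken from the report's own code or data.

```python
from collections import Counter


def winsorized_degree_counts(degrees, cap=100):
    """Count nodes per degree, grouping every degree above `cap` at `cap`."""
    return Counter(min(d, cap) for d in degrees)


# Hypothetical node degrees, for illustration only.
example_degrees = [1, 2, 2, 3, 150, 4000]
counts = winsorized_degree_counts(example_degrees)

for degree in sorted(counts):
    print(degree, counts[degree])
```

Capping rather than dropping the high-degree nodes keeps the total node count unchanged, so the bar at x = 100 represents all nodes with degree 100 or more.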

Variation

Snarls

Statistics extracted from the distance index with vg view -B. As an estimate of the variation at each snarl, we look at the difference between the maximum and minimum lengths of the paths traversing it. The depth represents how deeply a snarl is nested within other snarls.

## Not available
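The per-snarl summary described in this section could be computed along the following lines. This is only a sketch: the record fields (`depth`, `min_length`, `max_length`) are assumed names for illustration and do not reflect the actual layout of the distance index output.

```python
def snarl_length_ranges(snarls):
    """Return (depth, max_length - min_length) for each snarl record."""
    return [(s["depth"], s["max_length"] - s["min_length"]) for s in snarls]


# Hypothetical snarl records, for illustration only.
example_snarls = [
    {"depth": 0, "min_length": 1, "max_length": 1},   # all traversals have equal length
    {"depth": 0, "min_length": 0, "max_length": 52},  # traversals differ by 52 bp
    {"depth": 1, "min_length": 3, "max_length": 10},  # nested one level down
]
ranges = snarl_length_ranges(example_snarls)

for depth, length_range in ranges:
    print(depth, length_range)
```

A difference of 0 means every traversal has the same length (as at a substitution-like site), while larger values point to insertion or deletion variation of roughly that size.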

Relative to hg38

Variants extracted with vg deconstruct on the pangenome, relative to the paths whose names start with hg38_.

## Not available

Variant genotyping

The variation present in the graph is genotyped with vg call, using short reads mapped with vg giraffe.

Short-read Mapping stats

  • Reads from HG002.

The graph below shows the cumulative proportion of mapped reads at different mapping quality thresholds, i.e. the proportion of mapped reads with MAPQ >= x.
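For reference, a minimal sketch of how such a curve can be computed from a list of per-read MAPQ values. It assumes the input contains mapped reads only and that MAPQ is capped at 60; both are assumptions made for the example, as are the function name and the sample values.

```python
def mapq_cumulative_proportion(mapqs, max_mapq=60):
    """Map each threshold x in 0..max_mapq to the proportion of reads with MAPQ >= x."""
    n = len(mapqs)
    if n == 0:
        return {}
    # Histogram of MAPQ values; anything above max_mapq is counted at max_mapq.
    counts = [0] * (max_mapq + 1)
    for q in mapqs:
        counts[min(q, max_mapq)] += 1
    # Accumulate from the highest threshold down to 0.
    proportions = {}
    running = 0
    for x in range(max_mapq, -1, -1):
        running += counts[x]
        proportions[x] = running / n
    return proportions


# Hypothetical MAPQ values of four mapped reads, for illustration only.
example_mapqs = [60, 60, 30, 0]
curve = mapq_cumulative_proportion(example_mapqs)
print(curve[0], curve[30], curve[60])
```

The curve starts at 1 for x = 0 by construction and can only decrease as the threshold increases.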

Long-read Mapping stats

  • CCS reads from HG002.
  • ~10x depth.

## Not available

No MAPQ curve is shown for now because nearly all reads map with the maximum MAPQ of 255.