Fraxinus pennsylvanica genome assembly [accession PE_48]

This is the preliminary genome assembly of Fraxinus pennsylvanica [accession PE_48], assembled by Laura Kelly. These data are as yet unpublished. If you want to publish any analysis of these data, you must either wait until we have published them in a journal or contact Richard Buggs to negotiate a co-authored paper.

Released on 4th April 2017.

The genome of Fraxinus pennsylvanica was sequenced to a depth of approximately 49x on the Illumina NextSeq and HiSeq platforms. Paired reads from a library made from total genomic DNA, with an approximate average insert size of 500 bp, were adapter trimmed and filtered by length and quality. De novo assembly of the filtered read pairs, with a minimum read length of 50 nt, was conducted in the CLC Genomics Workbench with the following parameter settings: automatic optimization of word (k-mer) size; maximum size of bubble to try to resolve = 5000; minimum contig length = 200 bp.

Because total genomic DNA was sequenced and assembled, the assembly includes contigs that originate from the organellar genomes as well as contigs from the nuclear genome. The assembly contained a single contig representing the Illumina PhiX control library; this contig was removed. No scaffolding or gap filling has been performed.


Assembly statistics
Number of contigs 715,871
Assembly size (Mbp) 761
Estimated genome size (Mbp) 868
Number of contigs > 1,000 bp 199,534
Number of contigs > 10,000 bp 3,317
Largest contig (bp) 104,414
Smallest contig (bp) 200
GC (%) 37.7
N50 (bp) 2,079
L50 88,850
Ns (%) 0
Complete BUSCOs [% searched] 915 [63.5%]
Complete and single-copy BUSCOs 791 [54.9%]
Complete and duplicated BUSCOs 124 [8.6%]
Fragmented BUSCOs 214 [14.9%]
Missing BUSCOs 311 [21.6%]
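
For readers unfamiliar with the N50 and L50 figures above, the following is a minimal sketch of how such values are commonly computed from a list of contig lengths. It is not the tool that produced the statistics in this table; the function name `n50_l50` and the toy lengths are illustrative assumptions only.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a collection of contig lengths.

    N50 is the length of the contig at which the running total, taken
    from the longest contig downwards, first reaches half of the total
    assembly size. L50 is the number of contigs needed to get there.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    raise ValueError("no contig lengths given")


# Toy contig lengths (hypothetical, not taken from the PE_48 assembly).
toy_lengths = [200, 300, 500, 1000, 1000, 2000]
n50, l50 = n50_l50(toy_lengths)
print(n50, l50)
```

Applied to the contig lengths of a real assembly (for example, lengths read from its FASTA file), the same function gives the two values in the form reported in the table.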

As a public service, preliminary sequences of this genome are being made available before scientific publication. The purpose of this policy is to balance the desire that the ash genomes be made available to the scientific community as soon as possible with the reasonable expectation that the group responsible for the sequencing will publish their results in peer-reviewed journals without concerns about potential pre-emption by other groups that did not directly participate in the effort.

These pre-publication data are preliminary and may contain errors. The goal of our policy is that early release should enable the progress of science. By accessing these data, you agree not to submit to scientific journals any articles containing analyses of these data prior to peer-reviewed journal publication by us and our collaborators of a comprehensive genome analysis.

Any analyses involving these data are covered by this data usage policy, including annotation of genes; identification of sets of genomic features such as genes, gene families, regulatory elements, repeat structures, GC content, etc.; and whole-genome comparisons of regions conserved among species. Also included are uses of the genome data as a reference for transcriptomic analyses (RNA-seq, bisulfite-seq, ChIP-seq or similar). Interested parties are encouraged to contact the principal investigator if they wish to discuss the possibility of collaborative publication of such analyses.

The data may be freely downloaded and used by all who respect the restrictions in the previous paragraphs. In the period before the peer-reviewed journal publication, the assembly and raw sequence reads should not be redistributed or repackaged without the permission of Richard Buggs.

Once moved to unreserved status, the data will be freely available for any subsequent use.

By downloading these data you are agreeing to the terms outlined above.

Proceed to data download