Fraxinus goodingii genome assembly

We have made available here the preliminary genome assembly of Fraxinus goodingii, assembled by Laura Kelly and Endymion Cooper. These data are as yet unpublished. If you want to publish any analysis of these data you must either wait until we have published them in a journal, or contact Richard Buggs to negotiate a co-authored paper.

Released on 23rd February 2017.

The genome of Fraxinus goodingii was sequenced to a depth of approximately 54x on the Illumina NextSeq and HiSeq platforms. Paired reads from libraries made from total genomic DNA, with approximate average insert sizes of 300 bp, 500 bp and 800 bp, were adapter trimmed and then filtered by length and quality.

De novo assembly of the filtered read pairs, with a minimum read length of 50 nt, was conducted in the CLC Genomics Workbench under the following parameter settings: automatic optimization of word (k-mer) size; maximum size of bubble to try to resolve = 5000; minimum contig length = 200 bp. As total genomic DNA was sequenced and assembled, contigs in the assembly include those that originate from the organellar genomes as well as those from the nuclear genome. The assembly contained a single contig representing the Illumina PhiX control library; this contig was removed from the assembly.

Assembled contigs were joined to form scaffolds using SSPACE (version 3.0) with default parameters; library insert lengths were specified with a broad error range (i.e. ±40%). Gaps in the SSPACE scaffolds were filled using GapCloser (version 1.12) with default parameters; the average library insert lengths were specified using the estimates produced by SSPACE during scaffolding.
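To give a sense of how broad a ±40% error range is, the short Python sketch below computes the window of insert sizes it implies for each of the three approximate library sizes quoted above. It assumes the error is applied symmetrically around the mean; the function name and the exact way the scaffolder interprets the error term are illustrative assumptions, not details taken from the assembly run.

```python
# Illustrative only: approximate mean insert sizes quoted in the text (bp).
MEAN_INSERT_SIZES_BP = [300, 500, 800]
ERROR_PERCENT = 40  # the "±40%" error range quoted in the text


def insert_size_window(mean_bp, error_percent):
    """Return the (min, max) insert size implied by a symmetric error range.

    Hypothetical helper: assumes the window is mean ± mean * error.
    Integer arithmetic is used to avoid floating-point rounding.
    """
    lower = mean_bp * (100 - error_percent) // 100
    upper = mean_bp * (100 + error_percent) // 100
    return lower, upper


for mean_bp in MEAN_INSERT_SIZES_BP:
    lower, upper = insert_size_window(mean_bp, ERROR_PERCENT)
    print(f"{mean_bp} bp library: {lower}-{upper} bp")
```

Under this assumption the windows of adjacent libraries overlap, which is consistent with describing the range as broad.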


Assembly statistics
Number of scaffolds 452,616
Assembly size (Mbp) 904.2
Estimated genome size (Mbp) 969
# scaffolds > 1000 bp 66,509
# scaffolds > 10000 bp 19,606
Largest scaffold (bp) 464,370
Smallest scaffold (bp) 200
GC (%) 33.17
N50 (bp) 26,688
L50 8,711
Ns (%) 8.4
Complete BUSCOs [% searched] 1,264 [87.8%]
Complete and single-copy BUSCOs 1,043 [72.4%]
Complete and duplicated BUSCOs 221 [15.3%]
Fragmented BUSCOs 48 [3.3%]
Missing BUSCOs 128 [8.9%]
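
For readers unfamiliar with the N50 and L50 statistics in the table above, the following Python sketch shows one common way to compute them from a list of scaffold lengths. It uses the usual definition (N50 is the length of the scaffold at which the cumulative length of scaffolds, sorted from longest to shortest, first reaches half of the total; L50 is the number of scaffolds needed to get there). The function name and the toy lengths are made up for illustration, and this is not necessarily the tool that produced the figures in the table.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of scaffold lengths in bp."""
    if not lengths:
        raise ValueError("need at least one scaffold length")
    # Walk through scaffolds from longest to shortest.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # The first scaffold that takes the running sum to half the
        # total assembly length defines N50; its rank is L50.
        if running >= half_total:
            return length, count


# Toy values for illustration only, not taken from the assembly.
toy_lengths = [100, 80, 60, 40, 20]
n50, l50 = n50_l50(toy_lengths)
print(f"N50 = {n50} bp, L50 = {l50}")
```

Applied to the real scaffold lengths, a calculation of this kind would be expected to give values comparable to the N50 and L50 reported in the table, although the exact result depends on the conventions of the tool used.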

As a public service, preliminary sequences of this genome are being made available before scientific publication. The purpose of this policy is to balance the desire that the ash genomes be made available to the scientific community as soon as possible with the reasonable expectation that the group responsible for the sequencing will publish their results in peer-reviewed journals without concerns about potential pre-emption by other groups that did not directly participate in the effort.

These pre-publication data are preliminary and may contain errors. The goal of our policy is that early release should enable the progress of science. By accessing these data, you agree not to submit to scientific journals any articles containing analyses of these data before we and our collaborators have published a comprehensive genome analysis in a peer-reviewed journal.

All analyses involving these data are covered by this data usage policy, including annotation of genes, identification of sets of genomic features such as genes, gene families, regulatory elements, repeat structures, GC content, etc., and whole-genome comparisons of regions of among-species conservation. Also covered are uses of the genome data as a reference for transcriptomic and related analyses (RNA-seq, bisulfite-seq, ChIP-seq or similar). Interested parties are encouraged to contact the principal investigator if they wish to discuss the possibility of collaborative publication of such analyses.

The data may be freely downloaded and used by all who respect the restrictions in the previous paragraphs. In the period before the peer-reviewed journal publication, the assembly and raw sequence reads should not be redistributed or repackaged without the permission of Richard Buggs.

Once moved to unreserved status, the data will be freely available for any subsequent use.

By downloading these data you are agreeing to the terms outlined above.

Proceed to data download