Woodland Strawberry Genome Published (For Real This Time)

Written by James Schnable

Hi all, hope you’re enjoying the holiday break. I’m back with news of a new plant genome publication!

Today’s plant is the woodland strawberry (Fragaria vesca). These aren’t the strawberries you probably see at your local grocery store; those are garden strawberries (Fragaria x ananassa). Woodland strawberries were the predominant strawberries grown throughout Europe until around 250 years ago, when they were displaced by the new garden strawberries, which were created when a strawberry species brought from North America crossed with another species from Chile while the two were grown next to each other in France. The new hybrid species bore larger fruit than the woodland strawberry.

Wild strawberry (left) and domesticated strawberry (right). I'm not sure which species these are; that's the downside of having to hunt down public domain photos.

Sequencing the genome of the garden strawberry directly would be a real mess, as the genome of that species is made up of four closely related genome-copies*. With modern DNA-sequencing technology, generating the raw sequence data that makes up a genome is relatively cheap and easy, but afterwards you are left with a lot of small pieces of DNA sequence, and putting those pieces together (like putting together a puzzle with millions of pieces) remains challenging. Mix together pieces from four closely related puzzles with no way to tell them apart and the project becomes even more challenging.

Fortunately the woodland strawberry sidesteps that problem, being a normal diploid plant without any of the whole genome duplications that would make sequencing garden strawberries such a terrible mess. It also has a pleasingly small genome of 206 million base pairs spread over seven chromosomes, making it only slightly larger than the genome of the first plant to be sequenced (Arabidopsis, at 157 million base pairs and five chromosomes). Small genomes are easier to put together, with fewer total pieces to go around.

The research consortium that sequenced and assembled the strawberry genome first assembled overlapping pieces of sequenced DNA into larger pieces called contigs, and then used genetic map data to line those contigs up into seven pseudomolecules, each of which represents a whole strawberry chromosome. The strawberry genome itself wasn’t released prior to the publication of the paper, so I haven’t had a chance to look at it myself, but both the fact that they’ve been able to assemble all the way to the chromosome level and the fact that they developed and used genetic map data argue for a well-done assembly.
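To make the "overlapping pieces into contigs" idea concrete, here is a toy sketch in Python. It is not the consortium's method (real assemblers deal with sequencing errors, repeats, reverse complements, and millions of reads); it just greedily merges whichever two sequences share the longest exact suffix-to-prefix overlap. The function names, the example reads, and the `min_overlap` threshold are all made up for illustration.

```python
def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`.

    Returns 0 if no overlap of at least `min_overlap` bases exists.
    """
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(reads: list[str], min_overlap: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the largest overlap.

    Stops when no remaining pair overlaps by at least `min_overlap` bases;
    whatever is left is the list of contigs.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_size, best_i, best_j = 0, -1, -1
        for i, left in enumerate(contigs):
            for j, right in enumerate(contigs):
                if i == j:
                    continue
                size = overlap_length(left, right, min_overlap)
                if size > best_size:
                    best_size, best_i, best_j = size, i, j
        if best_size == 0:
            return contigs
        # Join the two sequences, writing the shared bases only once.
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Three hypothetical overlapping reads plus one that overlaps nothing.
toy_reads = ["ATGGCGT", "GCGTACC", "TACCTGA", "TTTTTT"]
print(greedy_assemble(toy_reads))
```

The three overlapping reads collapse into a single contig, while the unrelated read is left on its own. Lining contigs up into chromosome-scale pseudomolecules is a separate step that needs outside information, such as the genetic map data mentioned above.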

Speaking of assembly, here are all the vital genome stats that I normally would have to hunt around for after reading a “new genome sequenced!” story in the popular press (some of these I’ve already mentioned above):

  • Strawberries have a haploid number of 7 and a genome size of 206 Mb.
  • The average base pair in the strawberry genome was sequenced 39 times using second-generation technology (a label that includes Illumina, 454, and SOLiD sequencers; in this case a mixture of all three technologies was employed).
  • 34,809 predicted genes were identified across the strawberry genome.
  • The authors found no evidence of the whole genome duplications found in other rosids (I’m assuming this means the most recent whole genome duplication in the ancestors of strawberries was the pre-rosid hexaploidy.)
  • The paper describing the genome will shortly be available from Nature Genetics. The title is “The genome of the woodland strawberry (Fragaria vesca)” and the last name of the first author is Shulaev. UPDATE: Here’s the link to the genome paper.
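For a sense of scale, the genome size and average coverage in the list above imply a rough total for the raw sequence data. The sketch below is back-of-the-envelope arithmetic only: it uses the 206 Mb and 39x figures from this post, the function names are mine, and the result is an estimate, not a number reported in the paper.

```python
GENOME_SIZE_BP = 206_000_000  # 206 Mb, from the stats above
MEAN_COVERAGE = 39            # each base sequenced 39 times on average


def total_bases_sequenced(genome_size_bp: int, coverage: float) -> float:
    """Raw sequence implied by a given average coverage."""
    return genome_size_bp * coverage


def mean_coverage(total_bases: float, genome_size_bp: int) -> float:
    """Average coverage: total sequenced bases divided by genome size."""
    return total_bases / genome_size_bp


raw_bases = total_bases_sequenced(GENOME_SIZE_BP, MEAN_COVERAGE)
print(f"Roughly {raw_bases / 1e9:.1f} billion bases of raw sequence")
```

That works out to about 8 billion bases of raw data for a genome of about 0.2 billion bases, which is why the assembly step, not the sequencing itself, is the hard part.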

Strawberry genome browser.

Aside from the enjoyment I always feel when a new genome goes live, I’m particularly happy to see the strawberry genome come out for two reasons.

The first is that there was no “strawberry genome” grant. Funding for sequencing the genome came from a number of sources. I take this as a sign that, in addition to the rapidly declining cost of sequencing itself, the cost and difficulty of assembling and annotating the genome of a new plant species are also continuing to decline at a rapid pace.

The second reason is that I once before announced the sequencing of the strawberry genome on this site. It was almost a year ago, after a reporter misunderstood a presentation at PAG and posted a “new genome sequenced” story online that was rapidly picked up by a number of websites, including my own. It was a bad break for the folks working hard to sequence, assemble, and annotate the real strawberry genome, and I’m very glad to see them get the moment in the spotlight they so richly deserve. The people who sequence genomes make the work of so many other researchers, including me, possible.

*Either the result of duplicated copies of a single genome that have since evolved independently (autotetraploidy), or hybrids that merged the genomes of closely related strawberry species together in a single plant (allotetraploidy).

Written by Guest Expert

James Schnable is an assistant professor and the co-founder of two start ups. His academic lab works on comparative and functional genomics as well as high throughput phenotyping of maize, sorghum, and related orphan grain crops and wild grass species. He’s interested in plants, farming, and saving the world through agriculture, the usual. James blogs at James and the Giant Corn.


  1. This particular piece of research is good news, especially since many of the genes involved can be run ‘cross-platform’ on almond, apple, peach, cherry, raspberry and rose.
    The best news of all would perhaps be applicable to the strawberry itself. The most intense, sweet strawberry flavor comes from wild strawberries. If you’ve never eaten wild strawberries, you have no idea how tasty a strawberry can be — none at all. Some ‘domesticated’ strawberries bear only a visual resemblance to the crop, with flavor being similar to a very disappointing green apple.

  2. As more and more rosid genomes are published, it becomes a lot more feasible to identify adaptations involved in the domestication of one crop that can be “ported” to others. And I’ve had a few chances to try wild strawberries of some North American species or another. I agree there is no comparison!
    Strawberries are one of the youngest domesticated species, and they are often propagated vegetatively rather than sexually, which has limited the potential for selection outside of intentional breeding programs. Hopefully this genomic information will help them catch up!

  3. Nice recap James. The other excellent thing about strawberry is that it is so agile for transformation and functional genomics. It is the Arabidopsis of perennials.
    Thanks for the coverage. kf

  4. Kevin,
    I hadn’t known that the wild strawberry was a perennial. Over the years I’ve heard about efforts to ‘perennialize’ various crops but there’s been very little about it lately.
    Any chance the strawberry work could be used to carry that project forward?

  5. Yes, strawberry is a perennial. The confusion comes from the fact that it is usually grown as an annual to limit disease pressure carried forward over the off-season.
    The issue of perennialization is one that is on our radar screen, mostly by accident. We have an EMS population and can look for plants that behave as annuals. We won’t lose them because we can save seed, we have the original M1 seed families and can regenerate plants in tissue culture. We’re watching for these.
    We also have an interesting line obtained through our previous NSF genome grant. The goal was to study novel genes of the unknown-ome. One of these genes when suppressed with RNAi turns the perennial into an annual. It grows just fine, then flowers, and dies.
    The gene has similarity to a class of genes in animals and fungi that control the cell cycle. Could it be that perennials have two programs that contribute to this process, one for the first year and one for continued seasons? Could this gene play a role in that “second year” phenomenon?
    Heck, it is just wild speculation, but it is a fun plant nonetheless. We call it “kick-the-bucket” and should publish on the KTB RNAi lines in 2011. We are going full throttle on this in Q1 of this year and even I’m going to be getting my hands dirty.
    We’re also trying to overexpress this one in Arabidopsis to see if we could make it live longer. Wouldn’t that be fun? Arabidopsis T-DNA knockouts have no obvious phenotypes and Ox-ers are on the way.
    Wow, what a long answer to a good question. You threw one right into my wheelhouse!
    Also, if anyone is interested in the somewhat boring story behind the genome story, please check here.
