Comparative transcriptome sequencing for identification of Australian pest fruit flies
A project undertaken at the School of Biological, Earth and Environmental Sciences, University of New South Wales, and supervised by Kathryn Raphael
The Queensland fruit flies, Bactrocera tryoni (TRY) and B. neohumeralis (NEO), originally native to the rainforests of Queensland and Northern NSW, are now Australia’s worst horticultural pests, and a quarantine risk throughout the Asia-Pacific region. The overlapping distribution of the species on the east coast of Australia has been well characterised over nearly 100 years. However, TRY is an invasive pest while NEO, although a serious pest throughout its range, is not invasive. Accurate species identification is essential to the control of fruit fly pests. Whilst adults are usually easily identified (Figures 1 and 2), the larvae, which may be present in fruit, are very difficult if not impossible to identify from morphology alone. In addition, TRY and NEO are so similar genetically that no DNA sequence difference useful for field identification has so far been found.
Our aim in this project was to focus on gene regulatory differences as a route to finding DNA sequence differences. To achieve this, the transcriptomes (the full set of expressed protein-coding genes) of the two species were assembled for two tissues (brain and antennae) at two time points (AM and PM), and the levels of gene (transcript) expression were compared.
Brain. In the brain, differential expression analysis reveals that very few genes show a significant difference in expression between the two time points, but the well-known orthologues of the Drosophila circadian genes are among those that show significant transcript cycling between the AM and PM samples. From the TRY vs NEO comparison, a list of 211 significantly differentially expressed (DE) transcripts has been annotated in detail (Figure 3). Classifying these according to functional groups reveals many genes that are related to sensory function, especially chemosensory function. For example, several transcripts encode odour and taste receptors, while others encode detoxification enzymes that have a role in processing odours in the model organism Drosophila melanogaster.
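As an illustration of what a between-species expression comparison involves, the sketch below computes a log2 fold change per transcript from normalised counts and keeps the transcripts that exceed a threshold. It is a minimal sketch only: the transcript names, count values, pseudocount and threshold are all invented for the example, and the method actually used in the project is not described here. A fold change on its own is also not a test of significance; a real analysis would rely on replicate-aware statistics.

```python
import math


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def log2_fold_change(mean_a, mean_b, pseudocount=1.0):
    """log2 ratio of mean expression in group A over group B.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for transcripts that are not detected in one group.
    """
    return math.log2((mean_a + pseudocount) / (mean_b + pseudocount))


def candidate_de_transcripts(counts_a, counts_b, min_abs_lfc=1.0):
    """Return {transcript: log2FC} for transcripts with |log2FC| >= min_abs_lfc.

    counts_a and counts_b map a transcript name to a list of normalised
    counts, one per replicate. Only transcripts present in both are compared.
    """
    result = {}
    for transcript in counts_a.keys() & counts_b.keys():
        lfc = log2_fold_change(mean(counts_a[transcript]), mean(counts_b[transcript]))
        if abs(lfc) >= min_abs_lfc:
            result[transcript] = lfc
    return result


# Hypothetical normalised counts, invented for illustration only.
try_counts = {"tx_odorant_receptor": [30.0, 34.0], "tx_housekeeping": [100.0, 96.0]}
neo_counts = {"tx_odorant_receptor": [3.0, 1.0], "tx_housekeeping": [98.0, 102.0]}

candidates = candidate_de_transcripts(try_counts, neo_counts)
```

With these invented values only the hypothetical receptor transcript passes the threshold, while the transcript expressed at similar levels in both species is filtered out.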
Antennae. The same analysis of differential expression for the antennal samples shows a similar pattern. Very few genes show a significant difference in expression between the two time points, but many transcripts differ in expression between the two species. Annotation of these reveals similar functional groups to the brain set of DE genes, with some overlap of specific transcripts. These shared transcripts will be of particular interest in follow-up studies. Mapping of all the DE transcripts onto the B. tryoni genome scaffolds reveals that a number are clustered together on the genome map, possibly indicating coordinated gene regulation. Another striking aspect of the data is that some of the genes are associated in D. melanogaster with male aggression and courtship behaviour.
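The idea of DE transcripts being clustered on the genome map can be made concrete with a small sketch that groups mapped transcripts by scaffold and joins neighbours lying within a chosen distance of each other. This is one possible way to flag clusters, not a description of the project's procedure: the scaffold names, coordinates, the 50 kb gap and the minimum cluster size are all hypothetical values chosen for the example.

```python
from collections import defaultdict


def cluster_by_position(hits, max_gap=50_000, min_size=2):
    """Group mapped transcripts into positional clusters.

    hits: iterable of (transcript_name, scaffold_name, start_position).
    Transcripts on the same scaffold whose start positions are at most
    max_gap apart from their nearest sorted neighbour are joined into one
    cluster. Only clusters with at least min_size members are returned,
    as (scaffold, [transcript names in positional order]).
    """
    by_scaffold = defaultdict(list)
    for name, scaffold, start in hits:
        by_scaffold[scaffold].append((start, name))

    clusters = []
    for scaffold in sorted(by_scaffold):
        positions = sorted(by_scaffold[scaffold])
        current = [positions[0]]
        for entry in positions[1:]:
            if entry[0] - current[-1][0] <= max_gap:
                current.append(entry)
            else:
                if len(current) >= min_size:
                    clusters.append((scaffold, [n for _, n in current]))
                current = [entry]
        if len(current) >= min_size:
            clusters.append((scaffold, [n for _, n in current]))
    return clusters


# Hypothetical mapping results, invented for illustration only.
hits = [
    ("tx1", "scaffold_1", 10_000),
    ("tx2", "scaffold_1", 40_000),
    ("tx3", "scaffold_1", 900_000),
    ("tx4", "scaffold_2", 5_000),
]

clusters = cluster_by_position(hits)
```

In this toy input, tx1 and tx2 form a cluster on scaffold_1, while tx3 and tx4 are too isolated to be reported. The choice of gap size strongly affects the result and would need to be justified against gene density in the real genome.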
What about gene sequence differences between TRY and NEO?
Knowledge of gene expression differences between TRY and NEO is important because it gives an indication of which specific gene pathways are controlling the differences in the biology of the two species. Ultimately, DNA sequence differences must exist, either in protein-coding regions or, more likely, in gene regulatory regions. Comparison of the DNA sequence of selected transcripts reveals some sequence differences in coding regions leading to minor amino acid differences in the encoded protein. After mapping the DE transcripts to the B. tryoni genome and comparing the TRY and NEO genome sequences, we have identified many sequence differences, particularly insertions/deletions, in the non-coding, potentially regulatory, regions surrounding coding sequences. Most of these are likely to be non-significant, polymorphic (i.e. not fixed) differences. It will be challenging to identify which sequences are important for differentiating the species, because the fly samples that we work on are from laboratory stocks and are therefore quite inbred compared to wild flies. Resequencing of wild-caught flies of both species will be needed to show which sequence differences are fixed in wild flies. Future work will use genetic crosses, together with data from the transcriptomes developed in this project and from the genome, to localise genome regions that likely differentiate the species. Together, these approaches will eventually enable the identification of the fundamental genetics of the difference in biology and pest status, and the development of genetic tests that can differentiate the species.
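The distinction between fixed and polymorphic differences can be illustrated with a short sketch that classifies a single site from the alleles observed in sampled flies of each species. The definition used is an assumption made for the example: a site is called "fixed" only when every sampled fly of one species carries one allele and every sampled fly of the other species carries a different one. The function name and the allele data are hypothetical, and a small or inbred sample can make a polymorphic site look fixed, which is why resequencing of wild-caught flies is needed.

```python
def classify_site(try_alleles, neo_alleles):
    """Classify one site from the alleles observed in each species.

    try_alleles / neo_alleles: lists of allele strings, one per sampled
    chromosome (for example "A", "G", or "-" for a deletion).
    Returns one of "missing", "identical", "fixed" or "polymorphic".
    """
    try_set, neo_set = set(try_alleles), set(neo_alleles)
    if not try_set or not neo_set:
        return "missing"          # no data for at least one species
    if len(try_set) == 1 and len(neo_set) == 1:
        # Both species are monomorphic in this sample.
        return "identical" if try_set == neo_set else "fixed"
    return "polymorphic"          # at least one species carries several alleles


# Hypothetical genotype data, invented for illustration only.
sites = {
    "site_1": (["A", "A", "A", "A"], ["G", "G", "G", "G"]),
    "site_2": (["A", "A", "G", "A"], ["G", "G", "G", "G"]),
    "site_3": (["-", "-", "-", "-"], ["-", "-", "-", "-"]),
}

calls = {name: classify_site(t, n) for name, (t, n) in sites.items()}
```

Here only site_1 would be a candidate for a diagnostic test, since site_2 still carries the NEO allele in one TRY sample and site_3 shows no difference at all.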