Independent thesis Advanced level (degree of Master (One Year)), 20 credits / 30 HE credits
Technological advances have led to faster and more cost-effective sequencing platforms, making it quicker and more affordable to generate genomic sequence data. Two main methods can be used to study bacterial genomes: whole-genome sequencing and metagenomic shotgun sequencing, of which the former has been the more widely used in recent years.
As a consequence of these advances, a vast amount of data is now available, and the need for bioinformatics tools to analyse and interpret it efficiently has increased dramatically. At present, many tools are available for each step of bacterial genome characterization: (1) pre-processing, (2) de novo assembly, (3) annotation, and (4) taxonomic and functional comparisons. It is therefore difficult to decide which tools to use, and the analysis is slowed down by switching from one tool to another. To tackle this, the pipeline BACTpipe was developed. The pipeline chains together bioinformatics tools, selected on the basis of prior testing, with additional scripts so that the whole bacterial genome analysis can be performed in a single run. The most relevant outputs generated by BACTpipe are the annotated de novo assembled genomes, the Newick file containing the phylogenetic relationships between species, and the gene presence-absence matrix, which users can then filter according to their interests.
When BACTpipe was tested on a set of bacterial whole-genome sequence data, 60 of the 18,195 genes found across all the Lactobacillus species analysed were classified as core genes, i.e. genes shared by all of these species. Housekeeping genes or genes involved in the replication, transcription, or translation processes were identified.
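As an illustration of how core genes can be extracted from a gene presence-absence matrix, the following is a minimal sketch in Python. It does not reproduce BACTpipe's own scripts or output format: the matrix layout (one row per gene, one column per genome, 1 for present and 0 for absent), the gene and genome names, and the function `core_genes` are all assumptions made for this example.

```python
import csv
import io

# Hypothetical toy presence-absence matrix (assumed layout, not BACTpipe's
# actual output): rows are genes, columns are genomes, 1 = present, 0 = absent.
TOY_MATRIX = """gene,genome_A,genome_B,genome_C
gene_1,1,1,1
gene_2,1,0,1
gene_3,1,1,1
gene_4,0,0,1
"""


def core_genes(handle):
    """Return the genes that are present in every genome column."""
    reader = csv.DictReader(handle)
    # Every column except the gene identifier is treated as a genome.
    genomes = [name for name in reader.fieldnames if name != "gene"]
    core = []
    for row in reader:
        # A gene is "core" only if it is marked present in all genomes.
        if all(row[genome] == "1" for genome in genomes):
            core.append(row["gene"])
    return core


core = core_genes(io.StringIO(TOY_MATRIX))
print(core)
```

In this toy matrix only the genes marked present in all three genomes pass the filter. The same test could be relaxed, for example to genes present in a chosen fraction of genomes, if a stricter or looser definition of the core genome is of interest.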
2016. p. 94