P. astreoides RNAseq bioinformatic pipeline

This electronic notebook provides all the scripts used to analyze gene expression dynamics across developmental stages (planulae and adults) of the coral Porites astreoides inhabiting shallow (10 m) and mesophotic (45 m) reefs in Bermuda.

1) RNA-Seq reads quality filtering and mapping

Total RNA was extracted from shallow and mesophotic P. astreoides adults and larvae collected in Bermuda. RNA-Seq libraries were prepared using an in-house protocol at the Weizmann Institute of Science (Israel).

Raw sequence data (NCBI SRA) - Raw Illumina sequence data (FASTQ format) for the 6 adult and 6 larval samples sequenced in this study were deposited in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) under BioProject PRJNA842853.

Quality filtering and mapping - RNA-Seq read processing included adapter trimming with Cutadapt v2.6 (Martin, 2011) and quality filtering with Trimmomatic v0.39 (Bolger et al., 2014). Reads were aligned to the P. astreoides genome assembly using HISAT2 v2.2.1 (Kim et al., 2019). Transcript assembly and quantification were performed using StringTie v2.2.5 (Pertea et al., 2015).
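
As a rough illustration of how the four tools chain together, the sketch below only builds the command lines for one sample and does not run anything. It is not the project's actual script. Paired-end reads, the `trimmomatic` wrapper name, the `samtools sort` step, the file naming scheme, and the Trimmomatic thresholds (`SLIDINGWINDOW:4:20`, `MINLEN:36`) are all assumptions made for the example.

```python
from pathlib import Path


def build_commands(sample, r1, r2, index, annotation, adapter, outdir="out"):
    """Return per-sample commands as argument lists; nothing is executed here."""
    out = Path(outdir)
    # Hypothetical file naming scheme, chosen only for this sketch
    cut = [str(out / f"{sample}_R{i}.cut.fq.gz") for i in (1, 2)]
    paired = [str(out / f"{sample}_R{i}.paired.fq.gz") for i in (1, 2)]
    unpaired = [str(out / f"{sample}_R{i}.unpaired.fq.gz") for i in (1, 2)]
    sam = str(out / f"{sample}.sam")
    bam = str(out / f"{sample}.sorted.bam")
    gtf = str(out / f"{sample}.gtf")
    return [
        # Adapter trimming (same adapter assumed on both mates)
        ["cutadapt", "-a", adapter, "-A", adapter,
         "-o", cut[0], "-p", cut[1], r1, r2],
        # Quality filtering; window and length thresholds are placeholders
        ["trimmomatic", "PE", cut[0], cut[1],
         paired[0], unpaired[0], paired[1], unpaired[1],
         "SLIDINGWINDOW:4:20", "MINLEN:36"],
        # Spliced alignment; --dta tailors output for transcript assemblers
        ["hisat2", "--dta", "-x", index,
         "-1", paired[0], "-2", paired[1], "-S", sam],
        # Coordinate sorting (assumed step; StringTie expects sorted input)
        ["samtools", "sort", "-o", bam, sam],
        # Quantification against the reference annotation (-e needs -G)
        ["stringtie", bam, "-G", annotation, "-e", "-o", gtf],
    ]
```

Each returned list could be passed to `subprocess.run(cmd, check=True)` once the paths and thresholds have been replaced with the values used in the notebook.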

2) Species identification

Symbiont species identification - High-quality reads were aligned using DIAMOND v2.0.11 (Buchfink et al., 2021) against proteome databases from NCBI, Reefgenomics, Marinegenomics and UQ eSpace for the Symbiodiniaceae species Symbiodinium microadriaticum, Symbiodinium tridacnidorum, Symbiodinium necroappetens, Symbiodinium natans, Symbiodinium linucheae, Cladocopium goreaui, Cladocopium C15, Fugacium kawagutii, and Durusdinium trenchii (formerly Symbiodinium spp. clades A, C, F, and D; LaJeunesse et al., 2018).
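
One way to summarize such a search is to keep the best-scoring hit per read and count hits per species. The sketch below assumes DIAMOND's default 12-column tabular output (bit score in the last column) and a hypothetical `subject_to_species` lookup from protein identifiers to species names. Neither the function name nor the lookup comes from the project scripts.

```python
from collections import Counter


def tally_best_hits(lines, subject_to_species):
    """Count reads per species, using each read's highest bit-score hit."""
    best = {}  # query id -> (bit score, subject id)
    for line in lines:
        if not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        if query not in best or bitscore > best[query][0]:
            best[query] = (bitscore, subject)
    # Subjects missing from the lookup are reported as "unassigned"
    return Counter(
        subject_to_species.get(subject, "unassigned")
        for _, subject in best.values()
    )
```

The function accepts any iterable of lines, so an open file handle on the DIAMOND output can be passed directly.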

3) Differential expression

Coral host differential expression - Differential expression (DE) analysis was conducted using the Bioconductor package DESeq2 v1.26.0 (Love et al., 2014) in the R environment (v3.6.3). Planulae and adult samples were analyzed considering a single factor (depth) with two levels (shallow, mesophotic).
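
Before testing for differences between depths, DESeq2 normalizes raw counts by default with median-of-ratios size factors. The sketch below reimplements only that normalization step in plain Python to show the idea. It is an illustration rather than DESeq2 itself, and the function name and input layout (a list of genes, each holding one count per sample) are assumptions.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors; counts[g][s] is gene g in sample s."""
    n_samples = len(counts[0])
    # Genes with a zero in any sample have no finite log geometric mean
    kept = [gene for gene in counts if all(c > 0 for c in gene)]
    log_geo_means = [sum(math.log(c) for c in gene) / n_samples for gene in kept]
    factors = []
    for s in range(n_samples):
        # Log ratio of each gene's count to its geometric mean across samples
        log_ratios = [math.log(gene[s]) - lg for gene, lg in zip(kept, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors
```

Dividing each sample's counts by its size factor puts libraries of different sequencing depth on a common scale, which is what makes a shallow versus mesophotic comparison meaningful.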

4) Functional enrichment

Coral host gene ontology enrichment analysis - GO annotation of the P. astreoides genome was retrieved from the Past_Genome Project. GO enrichment analysis was performed for both the DE and WGCNA data using the package goseq (v1.42.0; Young et al., 2010) in the R environment.
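
The core of an over-representation test can be sketched with a plain hypergeometric tail probability. This is a simplification: by default goseq uses a Wallenius approximation that corrects for transcript-length bias, which the example below ignores. The function and argument names are hypothetical.

```python
from math import comb


def enrichment_pvalue(n_genes, n_in_term, n_selected, n_selected_in_term):
    """P(X >= k) when drawing n_selected genes without replacement.

    n_genes: all annotated genes; n_in_term: genes carrying the GO term;
    n_selected: DE (or module) genes; n_selected_in_term: their overlap.
    """
    upper = min(n_in_term, n_selected)
    favourable = sum(
        comb(n_in_term, i) * comb(n_genes - n_in_term, n_selected - i)
        for i in range(n_selected_in_term, upper + 1)
    )
    return favourable / comb(n_genes, n_selected)
```

In practice the resulting p-values would still need multiple-testing correction across all GO terms tested.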

5) Weighted correlation network analysis

Coral host WGCNA - WGCNA (Langfelder and Horvath, 2008) was performed using the R package WGCNA (v1.70.3), with a soft-thresholding power and an adjacency of type “signed”.
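
For a signed network, the adjacency between two genes is ((1 + cor) / 2) raised to the soft-thresholding power, so strongly negative correlations map close to 0 instead of being treated as connections. The sketch below shows this with Pearson correlation in plain Python. The power is left as a parameter because its value is not stated here, and the function names are illustrative only.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    # Clamp to [-1, 1] to absorb floating-point rounding
    return max(-1.0, min(1.0, cov / (sx * sy)))


def signed_adjacency(expr, power):
    """expr maps gene name -> list of expression values across samples."""
    genes = list(expr)
    return {
        (g, h): ((1 + pearson(expr[g], expr[h])) / 2) ** power
        for g in genes
        for h in genes
    }
```

Raising to a higher power suppresses weak correlations more strongly, which is why the power is usually chosen by inspecting how well the resulting network approximates a scale-free topology.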

6) SNP characterization

Coral host SNP characterization - Single nucleotide polymorphism (SNP) analysis was conducted using the Genome Analysis Toolkit (GATK v4.2.0; McKenna et al., 2010), following the Broad Institute's recommended practices for RNA-Seq SNP calling (Van der Auwera et al., 2013), with the adjustments needed for genotype calling in non-model organisms, where variant sites are not known beforehand. HISAT2-aligned reads were sorted and duplicates were marked. Variants were called with the GATK HaplotypeCaller tool (McKenna et al., 2010), and genotypes were then jointly called with the GATK GenotypeGVCFs tool. The GATK SelectVariants and VariantFiltration tools were used to filter the joint variant-calling matrices by quality by depth. Filtering for linkage disequilibrium was carried out using PLINK v2.0 (Purcell et al., 2007). To assess genetic differentiation among age-depth groups, the fixation index (Fst; Weir & Cockerham, 1984) was estimated using the R package hierfstat v0.5.10.
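
To make the quality-by-depth filter concrete, the sketch below keeps VCF records whose `QD` annotation in the INFO column meets a minimum value. It is a plain-Python stand-in for what VariantFiltration and SelectVariants do, not the GATK commands used in the notebook. The 2.0 cutoff mirrors the commonly cited GATK hard-filter suggestion and is an assumption, since the threshold is not given here.

```python
def passes_qd(vcf_line, min_qd=2.0):
    """True if the record's INFO column carries QD >= min_qd."""
    info = vcf_line.rstrip("\n").split("\t")[7]  # INFO is the 8th VCF column
    for entry in info.split(";"):
        key, _, value = entry.partition("=")
        if key == "QD":
            return float(value) >= min_qd
    return False  # records without a QD annotation are dropped in this sketch


def filter_vcf(lines, min_qd=2.0):
    """Keep header lines and records that pass the QD threshold."""
    return [
        line for line in lines
        if line.startswith("#") or passes_qd(line, min_qd)
    ]
```

Header lines are passed through unchanged so that the filtered output remains a valid VCF.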

Written on November 16, 2022