Data Importing

Unlike importing STAR output into DESeq2, loading Cuffdiff data into CummeRbund is straightforward. CummeRbund provides a single command (readCufflinks) that reads the Cuffdiff output files directly and creates an SQLite database of the dataset for R to query. This produces a CuffSet object, which can then be subset further.

## CuffSet instance with:
##   2 samples
##   36174 genes
##   97676 isoforms
##   44110 TSS
##   0 CDS
##   36174 promoters
##   44110 splicing
##   0 relCDS

   

Initial Data Exploration and Quality Control

Before looking at differentially expressed genes (DEGs), a brief quality control analysis was done to confirm that the dataset behaves as expected. Since the dataset is already known to be quite robust, a quick dispersion plot was made to check how well the Cufflinks statistical model fits the data.

Both conditions quite closely follow the same curve and exhibit rather minimal dispersion, indicating low sample-to-sample variability and general similarity of conditions.    

   

The squared coefficient of variation (CV2) is another normalized measure of variability across replicates. Differences in CV2 between conditions reflect a higher degree of variability among replicate FPKM estimates in one condition, which can lower the number of genes called as differentially expressed in the dataset.

   

From the above plot, it seems that both conditions yielded very similar squared coefficients of variation. Thus, replicate variability is unlikely to obscure DEGs between treatment types, and we should expect a clear delineation of them.
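The squared coefficient of variation discussed above can be illustrated with a minimal sketch. It uses the textbook definition (sample variance divided by the squared mean), which is not necessarily the exact computation CummeRbund performs from the Cuffdiff model. The function name squared_cv and the FPKM values are hypothetical and are not taken from this dataset.

```python
from statistics import mean, variance


def squared_cv(fpkms):
    """Return the squared coefficient of variation: sample variance / mean^2."""
    m = mean(fpkms)
    if m == 0:
        raise ValueError("CV^2 is undefined when the mean FPKM is zero")
    return variance(fpkms) / (m * m)


# Hypothetical replicate FPKM estimates for one gene (illustrative values only).
tight_replicates = [10.0, 12.0, 14.0]   # replicates agree closely
noisy_replicates = [4.0, 12.0, 20.0]    # same mean, much larger spread

cv2_tight = squared_cv(tight_replicates)
cv2_noisy = squared_cv(noisy_replicates)

print(f"tight CV^2 = {cv2_tight:.4f}")
print(f"noisy CV^2 = {cv2_noisy:.4f}")
```

Both hypothetical genes have the same mean expression, so the larger CV2 of the second one comes from replicate spread alone. That spread is what makes a true expression difference harder to detect.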

Differential Expression Analysis

To make the results as comparable as possible to the output of DESeq2, the default Cufflinks significance thresholds and log2 fold change were used to identify the most significant differentially expressed genes. Heatmaps were also generated to standardize results and aid one-to-one comparison of the workflows.

   

From the above heatmap it appears that WRKY22, CPL3, AT4G31805, AT1G12845, AT1G51405, and AT5G15190 are all significantly upregulated in the Estradiol treatment group in comparison to the DMSO control group. Thus, these genes are likely implicated in the induced overexpression of MUTE.

   

   

In contrast, the second heatmap above appears to indicate that FLA6, PDF1.2, PDF1.2b, AT3G61920, AT3G28270, AT3G10185, AT2G20721, AT1G60590, and AT1G34330 were all significantly downregulated by induced MUTE overexpression.
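The split into upregulated and downregulated genes described in this section can be sketched as a simple filter on adjusted p-value and the sign of the log2 fold change. This is a sketch under stated assumptions, not the analysis code used here: the 0.05 cutoff is assumed to match the Cuffdiff default false discovery rate, and the record layout, the function name split_degs, and the gene identifiers are hypothetical placeholders.

```python
ALPHA = 0.05  # assumed FDR cutoff; not confirmed by the analysis above


def split_degs(records, alpha=ALPHA):
    """Split significant genes into (upregulated, downregulated) lists.

    Each record is a dict with hypothetical keys:
    'gene', 'q_value', and 'log2_fold_change' (treatment vs. control).
    """
    significant = [r for r in records if r["q_value"] < alpha]
    # Order by effect size so the strongest changes come first.
    significant.sort(key=lambda r: abs(r["log2_fold_change"]), reverse=True)
    up = [r["gene"] for r in significant if r["log2_fold_change"] > 0]
    down = [r["gene"] for r in significant if r["log2_fold_change"] < 0]
    return up, down


# Placeholder records for illustration only; not results from this dataset.
example = [
    {"gene": "geneA", "q_value": 0.001, "log2_fold_change": 3.2},
    {"gene": "geneB", "q_value": 0.030, "log2_fold_change": -1.5},
    {"gene": "geneC", "q_value": 0.400, "log2_fold_change": 2.0},
    {"gene": "geneD", "q_value": 0.010, "log2_fold_change": -4.1},
    {"gene": "geneE", "q_value": 0.020, "log2_fold_change": 1.1},
]

up_genes, down_genes = split_degs(example)
print("up:", up_genes)
print("down:", down_genes)
```

Sorting by absolute log2 fold change before splitting keeps the strongest changes at the front of each list, which is convenient when only the top few genes are carried into a heatmap.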