Summary
Feature tracks used in characterization of DNA methylation in the Pacific oyster genome [version oyster.v9_90 (fasta)]
This version of the genome comprises the longest genomic scaffolds (1670; 14%), which together cover over 90% of the genome.
Derived from the genome build available at:
Zhang, G; Fang, X; Guo, X; Li, L; Luo, R; Xu, F; Yang, P; Zhang, L; Wang, X; Qi, H; Zhu, Y; Yang, L; Huang, Z (2012) Genomic data from the Pacific oyster (Crassostrea gigas). GigaScience. http://dx.doi.org/10.5524/100030


[Track] oyster.v9_90 all CGs
http://eagle.fish.washington.edu/Mollusk/174gm_analysis/Bedtools_Intersect/oyster.v9_90_allCGs
Preview:
scaffold22    fuzznuc    misc_feature    69    70    2.000    +    .    Sequence "scaffold22.1" ; note "*pat pattern1"
scaffold22    fuzznuc    misc_feature    73    74    2.000    +    .    Sequence "scaffold22.2" ; note "*pat pattern1"
scaffold22    fuzznuc    misc_feature    93    94    2.000    +    .    Sequence "scaffold22.3" ; note "*pat pattern1"
scaffold22    fuzznuc    misc_feature    156    157    2.000    +    .    Sequence "scaffold22.4" ; note "*pat pattern1"
scaffold22    fuzznuc    misc_feature    191    192    2.000    +    .    Sequence "scaffold22.5" ; note "*pat pattern1"
scaffold22    fuzznuc    misc_feature    240    241    2.000    +    .    Sequence "scaffold22.6" ; note "*pat pattern1"

Description:
fuzznuc on oyster.v9_90 fasta file.

[Track] Methylated CpGs
http://eagle.fish.washington.edu/Mollusk/174gm_analysis/MethylatedCG_BED.bed
Preview:
scaffold1    263    263    CG    0.300    +
scaffold1    267    267    CG    0.100    +
scaffold1    9470    9470    CG    0.188    +
scaffold1    18706    18706    CG    0.071    +
scaffold1    20215    20215    CG    0.077    +
Description:
BSMAP used to map PE Bisulfite Illumina Reads from sperm sample
c1= scaffold, c2= start, c3= end, c4= motif, c5= percent methylation, c6= strand
./bsmap -a /Volumes/web/whale/ce_bs/filtered_174gm_A_NoIndex_L006_R1.fastq.gz -b /Volumes/web/whale/ce_bs/filtered_174gm_A_NoIndex_L006_R2.fastq.gz -d /Volumes/web/whale/ce_bs/oyster.v9_90.fa -o /Volumes/web/Mollusk/BSMAPoutput_174gm_v9_90.sam -p 8
python methratio.py -d /Volumes/web/whale/ce_bs/oyster.v9_90.fa -o /Volumes/web/Mollusk/methratiopython_174gm_v9_90.txt -s /Users/Shared/Apps/bsmap-2.73/samtools -z -u /Volumes/web/Mollusk/174gm_analysis/BSMAPoutput_174gm_v9_90.sa
Total number of aligned reads:
total 145949462 valid mappings, 123681367 covered cytosines, average coverage: 11.86 fold


[Track] Unmethylated CpGs
http://eagle.fish.washington.edu/Mollusk/174gm_analysis/NoMethCG_BED.bed
Preview
scaffold1    64    64    CG    0.000    +
scaffold1    128    128    CG    0.000    +
scaffold1    10530    10530    CG    0.000    +
scaffold1    10569    10569    CG    0.000    +
scaffold1    11745    11745    CG    0.000    +
Description:



[Track] Methylated CpGs
http://eagle.fish.washington.edu/cnidarian/TJGR_GonadPE_BS_v9_90_CG_10x_METHbed.txt
Preview
scaffold1    9470    9470    CG    0.162    +
scaffold1    16825    16825    CG    0.067    +
scaffold1    18706    18706    CG    0.077    +
scaffold1    20215    20215    CG    0.071    +
scaffold1    20756    20756    CG    0.600    +
Description:
BSMAP used to map PE Bisulfite Illumina Reads from sperm sample
./bsmap -a /Volumes/web/whale/ce_bs/filtered_174gm_A_NoIndex_L006_R1.fastq.gz -b /Volumes/web/whale/ce_bs/filtered_174gm_A_NoIndex_L006_R2.fastq.gz -d /Volumes/web/whale/ce_bs/oyster.v9_90.fa -o /Volumes/web/whale/ce_bs/BSMAP_output_PE_v9_90.sam -p 8
Total number of aligned reads:
pairs: 85147571 (50%)
single a: 16704916 (9.7%)
single b: 15703005 (9.2%)
python methratio.py -d /Volumes/web/whale/ce_bs/oyster.v9_90.fa -u -p -q -z -o /Volumes/web/whale/ce_bs/OUT_methratio_gonadPE_v9_90_B.txt -s /Users/Shared/Apps/bsmap-2.73/samtools /Volumes/web/whale/ce_bs/BSMAP_output_PE_v9_90.sam
Trimmed the methratio output (in Galaxy) to keep only CGs on the positive strand with 10x coverage.
Specifically, the methratio output was run through this Galaxy workflow:
http://eagle.fish.washington.edu/cnidarian/Galaxy-Workflow-methratio_processing_BED.ga
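
The trimming step above can be sketched in Python. This is only an illustration, not the Galaxy workflow itself. The row layout (chrom, pos, strand, 5-base context, ratio, coverage) is an assumption, "10x coverage" is read as coverage >= 10, and the coverage values in the example rows are invented.

```python
def split_methratio_rows(rows, min_coverage=10):
    """Split methratio-style rows into methylated and unmethylated CG BED lines.

    Assumed (hypothetical) row layout:
        (chrom, pos, strand, context, ratio, coverage)
    where context is a 5-base string centred on the cytosine.
    """
    methylated, unmethylated = [], []
    for chrom, pos, strand, context, ratio, coverage in rows:
        is_cg = context[2:4].upper() == "CG"
        if not is_cg or strand != "+" or coverage < min_coverage:
            continue
        # Same six columns as the track previews: start and end are both pos.
        bed = f"{chrom}\t{pos}\t{pos}\tCG\t{ratio:.3f}\t{strand}"
        (methylated if ratio > 0 else unmethylated).append(bed)
    return methylated, unmethylated


# Coverage values below are invented for illustration.
example_rows = [
    ("scaffold1", 9470, "+", "AACGT", 0.162, 37),   # kept, methylated
    ("scaffold1", 13612, "+", "TTCGA", 0.0, 12),    # kept, unmethylated
    ("scaffold1", 13700, "-", "GACGT", 0.5, 20),    # dropped: minus strand
    ("scaffold1", 13800, "+", "AACAT", 0.4, 25),    # dropped: not a CG
    ("scaffold1", 13900, "+", "AACGT", 0.3, 4),     # dropped: coverage < 10
]
meth, nometh = split_methratio_rows(example_rows)
```

The two returned lists correspond to the methylated and unmethylated CpG tracks.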


[Track] Unmethylated CpGs
http://eagle.fish.washington.edu/cnidarian/TJGR_GonadPE_BS_v9_90_CG_10x_NOmethbed.txt
Preview
scaffold1    13612    13612    CG    0.000    +
scaffold1    13822    13822    CG    0.000    +
scaffold1    13936    13936    CG    0.000    +
scaffold1    13967    13967    CG    0.000    +
scaffold1    14032    14032    CG    0.000    +
Description:
Second product of Galaxy Workflow above.




[Track] Methylation Clusters 10-4
http://eagle.fish.washington.edu/cnidarian/CpGMeth_clusters_10_4_v9_90_bed.bed
Preview
scaffold43964    14454    14480
scaffold43964    49922    49938
scaffold43964    52400    52420
scaffold43964    58274    58294
scaffold43964    128548    128573
Description:
Based on above track. Intervals where the maximum distance between mCpG is 10bp, and the minimum # of mCpG is 4. Includes 9974 features.


[Track] Methylation Clusters 50-5
http://eagle.fish.washington.edu/cnidarian/CpGMeth_clusters_50_5_v9_90_bed.bed
Preview
scaffold43964    3034    3111
scaffold43964    14417    14480
scaffold43964    25418    25493
scaffold43964    39824    39905
scaffold43964    49292    49349
Description:
Based on above track. Intervals where the maximum distance between mCpG is 50bp, and the minimum # of mCpG is 5. Includes 40890 features.


[Track] Methylation Clusters 100-4
http://eagle.fish.washington.edu/cnidarian/CpGMeth_clusters_100_4_v9_90_bed.bed
Preview
scaffold43964    3034    3111
scaffold43964    14221    14291
scaffold43964    14417    14689
scaffold43964    19154    19187
scaffold43964    24969    25108
Description:
Based on above track. Intervals where the maximum distance between mCpG is 100bp, and the minimum # of mCpG is 4. Includes 79161 features.
Corresponding fasta: http://eagle.fish.washington.edu/cnidarian/CpGMeth_clusters_100_4.fa
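
The cluster definition used for the three tracks above (maximum distance between mCpG, minimum # of mCpG) can be sketched as follows. This is a minimal illustration, not the tool that produced the tracks. It assumes "distance" means the difference between consecutive mCpG positions on one scaffold, and the positions are invented.

```python
def cluster_positions(positions, max_gap, min_count):
    """Group positions into (start, end) clusters.

    Consecutive positions no more than max_gap apart join the same cluster;
    clusters with fewer than min_count positions are discarded.
    """
    clusters = []
    current = []
    for pos in sorted(positions):
        if current and pos - current[-1] > max_gap:
            # Gap too large: close the running cluster and start a new one.
            if len(current) >= min_count:
                clusters.append((current[0], current[-1]))
            current = []
        current.append(pos)
    if len(current) >= min_count:
        clusters.append((current[0], current[-1]))
    return clusters


# Invented mCpG positions on a single scaffold.
mcpg = [100, 108, 115, 121, 300, 305, 400, 409, 418, 426, 430]
clusters_10_4 = cluster_positions(mcpg, max_gap=10, min_count=4)
clusters_100_4 = cluster_positions(mcpg, max_gap=100, min_count=4)
```

Loosening the maximum distance merges neighbouring groups, which is why the 100-4 track has more and longer features than the 10-4 track.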




[Track] Repeats
http://eagle.fish.washington.edu/cnidarian/rm_020713/oysterv9_90.fa.out.gff
Preview
scaffold1    RepeatMasker    similarity    9873    9897     0.0    +    .    Target "Motif:AT_rich" 1 25
scaffold1    RepeatMasker    similarity    12513    12553     0.0    +    .    Target "Motif:(GA)n" 1 41
scaffold1    RepeatMasker    similarity    16199    16242    18.2    +    .    Target "Motif:AT_rich" 1 44
scaffold1    RepeatMasker    similarity    16261    16334    21.6    +    .    Target "Motif:AT_rich" 1 74
scaffold1    RepeatMasker    similarity    16494    16522     3.5    +    .    Target "Motif:AT_rich" 1 29
Description
RepeatMasker with Repbase.
Summary table @ http://eagle.fish.washington.edu/cnidarian/rm_020713/oyster.v9.fa.tbl

[Track] Transposable Elements
http://eagle.fish.washington.edu/cnidarian/TJGR_TE_oysterv9_90.gff
Preview
scaffold999    TRF    Tandem_Repeat    166754    166792    69    +    .    .
scaffold1    TRF    Tandem_Repeat    12513    12553    82    +    .    .
scaffold1259    WUBlastX    MuDR1x_AP    15516    15635    50    -    .    DNA
scaffold1327    WUBlastX    Zator-3_AAe    333539    334297    105    -    .    DNA
scaffold1627    WUBlastX    Zator-3_AAe    151603    151785    32    +    .    DNA
Description
RepeatProteinMask










[Track] CDS
http://aquacul4.fish.washington.edu/~steven/armina/oyster.v9.glean.final.rename.CDS.gff
Preview
scaffold980    GLEAN    CDS    134604    134778    .    +    2    Parent=CGI_10019211;
scaffold980    GLEAN    CDS    141499    141593    .    +    1    Parent=CGI_10019211;
scaffold980    GLEAN    CDS    142711    142811    .    +    2    Parent=CGI_10019211;
scaffold980    GLEAN    CDS    143780    143896    .    +    0    Parent=CGI_10019211;
scaffold980    GLEAN    CDS    144887    145029    .    +    0    Parent=CGI_10019211;

[Track] Introns
http://eagle.fish.washington.edu/Mollusk/174gm_analysis/oysterv9_90_Introns.bed
Preview

scaffold22    8845    13192
scaffold22    13237    14157
scaffold22    14229    15108
scaffold22    15180    15773
scaffold22    19018    19239
Description

[Track] Introns divided into 50bp windows
http://eagle.fish.washington.edu/cnidarian/oysterv9_90_Intron_50pbWindows.bed
Preview
scaffold22    8845    8895
scaffold22    8895    8945
scaffold22    8945    8995
scaffold22    8995    9045
scaffold22    9045    9095


[Track] mRNA
http://aquacul4.fish.washington.edu/~steven/armina/oyster.v9.glean.final.rename.mRNA.gff
Preview
scaffold6    GLEAN    mRNA    684420    688461    0.811719    +    .    ID=CGI_10022332;
scaffold6    GLEAN    mRNA    694464    700813    0.235103    +    .    ID=CGI_10022333;
scaffold6    GLEAN    mRNA    701995    741494    0.270237    +    .    ID=CGI_10022334;
scaffold1710    GLEAN    mRNA    22769    26100    0.999946    +    .    ID=CGI_10022335;
scaffold1710    GLEAN    mRNA    66509    80594    0.877603    +    .    ID=CGI_10022336;



[Track] Promoter Region:
http://eagle.fish.washington.edu/cnidarian/TJGR_genes_v9_promoter_5p1000.gff
Preview
scaffold40150    GLEAN    promoter    53687    54687    0.999676    -    .    ID=CGI_10003906;
scaffold40150    GLEAN    promoter    61510    62510    0.998077    -    .    ID=CGI_10003907;
scaffold40150    GLEAN    promoter    82433    83433    1    -    .    ID=CGI_10003910;
scaffold1177    GLEAN    promoter    70856    71856    0.889891    -    .    ID=CGI_10003913;
scaffold40178    GLEAN    promoter    50250    51250    0.999219    -    .    ID=CGI_10003915;


[Track] NonCDS 50bp windows
http://eagle.fish.washington.edu/cnidarian/TJGR_NonCDS_50window.bed
Preview
scaffold22    0    50
scaffold22    50    100
scaffold22    100    150
scaffold22    150    200
scaffold22    200    250
Description: Complement of CDS interval in 50bp windows
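
A minimal sketch of the "complement, then 50bp windows" idea. It assumes 0-based, half-open BED coordinates and that the last window of each gap is truncated at the gap end. The scaffold length and CDS intervals are invented, and this is not the command that built the track.

```python
def complement(length, intervals):
    """Return the parts of [0, length) not covered by the (start, end) intervals."""
    gaps, cursor = [], 0
    for start, end in sorted(intervals):
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < length:
        gaps.append((cursor, length))
    return gaps


def tile_interval(chrom, start, end, size=50):
    """Cut one interval into consecutive windows of at most `size` bp."""
    return [
        (chrom, w_start, min(w_start + size, end))
        for w_start in range(start, end, size)
    ]


# Invented example: a 400 bp scaffold with two CDS intervals.
cds = [(120, 180), (300, 340)]
non_cds = complement(400, cds)
windows = [
    window
    for start, end in non_cds
    for window in tile_interval("scaffold22", start, end)
]
```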


[Track] +100x Mgo Expression - NonCDS 50bp windows
http://eagle.fish.washington.edu/cnidarian/TJGR_NonCDS_50window_100xMgo.bed
Preview
scaffold1    100112    100162    121
scaffold1    100162    100212    118
scaffold1    100212    100262    106
scaffold100    80833    80883    4279
scaffold100    82089    82139    555
Description: coverageBed (bedtools) run with the TopHat BAM output (-a) and the NonCDS 50bp window BED (-b), using the split option.


[Track] +20x Mgo Expression - NonCDS 50bp windows
http://eagle.fish.washington.edu/cnidarian/TJGR_NonCDS_50window_20xMgo.bed
Preview
scaffold1    21144    21194    20
scaffold1    23024    23074    27
scaffold1    23074    23124    30
scaffold1    23124    23174    26
scaffold1    23174    23224    23
Description: coverageBed (bedtools) run with the TopHat BAM output (-a) and the NonCDS 50bp window BED (-b), using the split option.



[Track] SNPs Mgo RNA-seq Tophat
http://eagle.fish.washington.edu/cnidarian/TJGR_MgoSNP_vcf_to_gff.gff
Preview
scaffold1    SAMTools    SNP    18600    18600    33.8    .    .    REF=C;ALT=T;FILTER=.;INFO=DP%3D2%3BVDB%3D0.0160%3BAF1%3D1%3BAC1%3D2%3BDP4%3D0%2C0%2C2%2C0%3BMQ%3D50%3BFQ%3D-33;FORMAT=GT:PL:GQ;SAMPLE=1/1:65%2C6%2C0:10
scaffold1    SAMTools    SNP    18913    18913    4.77    .    .    REF=A;ALT=C;FILTER=.;INFO=DP%3D1%3BAF1%3D1%3BAC1%3D2%3BDP4%3D0%2C0%2C1%2C0%3BMQ%3D50%3BFQ%3D-30;FORMAT=GT:PL:GQ;SAMPLE=0/1:33%2C3%2C0:3
scaffold1    SAMTools    SNP    21342    21342    117    .    .    REF=T;ALT=A;FILTER=.;INFO=DP%3D31%3BVDB%3D0.0445%3BAF1%3D1%3BAC1%3D2%3BDP4%3D0%2C0%2C23%2C5%3BMQ%3D50%3BFQ%3D-111;FORMAT=GT:PL:GQ;SAMPLE=1/1:150%2C84%2C0:99
scaffold1    SAMTools    SNP    21381    21381    222    .    .    REF=G;ALT=A;FILTER=.;INFO=DP%3D37%3BVDB%3D0.0394%3BAF1%3D1%3BAC1%3D2%3BDP4%3D0%2C0%2C32%2C5%3BMQ%3D50%3BFQ%3D-138;FORMAT=GT:PL:GQ;SAMPLE=1/1:255%2C111%2C0:99
scaffold1    SAMTools    SNP    23620    23620    165    .    .    REF=A;ALT=T;FILTER=.;INFO=DP%3D16%3BVDB%3D0.0440%3BAF1%3D0.5%3BAC1%3D1%3BDP4%3D2%2C6%2C0%2C8%3BMQ%3D50%3BFQ%3D168%3BPV4%3D0.47%2C0.42%2C1%2C1;FORMAT=GT:PL:GQ;SAMPLE=0/1:195%2C0%2C254:99
 
Description:
SNPs identified in bam file from Mgo Tophat alignment
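
The INFO and SAMPLE values in the last GFF column are percent-encoded (%3D is "=", %3B is ";", %2C is ","). A short sketch of how they might be decoded, using the first preview row. The function name is hypothetical.

```python
from urllib.parse import unquote


def parse_snp_attributes(attributes):
    """Parse the ';'-separated key=value attribute column of one SNP row."""
    fields = dict(item.split("=", 1) for item in attributes.split(";") if item)
    # INFO holds a percent-encoded VCF INFO string; decode it into its own dict.
    fields["INFO"] = dict(
        pair.split("=", 1)
        for pair in unquote(fields["INFO"]).split(";")
        if "=" in pair
    )
    fields["SAMPLE"] = unquote(fields["SAMPLE"])
    return fields


# Attribute column of the first preview row.
attrs = (
    "REF=C;ALT=T;FILTER=.;INFO=DP%3D2%3BVDB%3D0.0160%3BAF1%3D1%3BAC1%3D2"
    "%3BDP4%3D0%2C0%2C2%2C0%3BMQ%3D50%3BFQ%3D-33;FORMAT=GT:PL:GQ;"
    "SAMPLE=1/1:65%2C6%2C0:10"
)
snp = parse_snp_attributes(attrs)
```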






Other files
GO Analyses:
http://eagle.fish.washington.edu/cnidarian/TJGR_Gene_GO_GOslim.txt

MBD-bisulfite seq data (gill tissue):
*data used in multivariate class*
http://128.95.149.81/bivalvia/All%20data%201059%20genes.xls
http://eagle.fish.washington.edu/cnidarian/MG_alldata1059.txt
.    CG in exon    CG intron    CG total    %CG in MBD exon    %CG in MBD intron    %CG in MBD total    .    Dgl    Fgo    Gil    Amu    Hem    Lpa    Mgo    overall abundance    abundCV    .    cell adhesion    cell cycle and proliferation    cell organization and biogenesis    cell-cell signaling    death    developmental processes    DNA metabolism    other biological processes    other metabolic processes    protein metabolism    RNA metabolism    signal transduction    stress response    transport    .    cell adhesion    cell cycle and proliferation    cell organization and biogenesis    cell-cell signaling    death    developmental processes    DNA metabolism    other biological processes    other metabolic processes    protein metabolism    RNA metabolism    signal transduction    stress response    transport
CGI_10026228    16    11    27    0.025316456    0    0.022222222    CGI_10026228    0    0    0    0    0    0    0.071608608    0.010229801    264.5751311    CGI_10026228                                                        1    CGI_10026228    N    N    N    N    N    N    N    N    N    N    N    N    N    Y
CGI_10026611    22    0    22    0    0.035294118    0.03    CGI_10026611    0    0    0    0.099655867    0    0    0    0.014236552    264.5751311    CGI_10026611                                1                            CGI_10026611    N    N    N    N    N    N    N    Y    N    N    N    N    N    N
CGI_10027943    42    11    53    0    0.126213592    0.087837838    CGI_10027943    0    0    0    0.0751982    0    0    0.048001375    0.017599939    176.5122486    CGI_10027943                            1    1                            CGI_10027943    N    N    N    N    N    N    Y    Y    N    N    N    N    N    N
 
Methods: DNA methylation data
Analyses are based on the results of high-resolution methylation analysis of genomic DNA from pooled oyster gill tissue (n=8). Briefly, genomic DNA was isolated and methylation enrichment was performed using the MethylMiner Kit (Invitrogen) following the manufacturer’s instructions. A bisulfite-treated DNA library of the methylation-enriched fraction was prepared for Illumina sequencing at the University of Washington high throughput sequencing facility (Seattle, WA). High-throughput reads were mapped back to a subset of the oyster genome that included scaffolds longer than 1 million bp (Zhang et al., 2012). Mapping of the bisulfite-treated reads was performed using BSMAP software (version 2.73). Cytosines in a CG dinucleotide context with greater than 5x coverage in the MBD library were considered to be methylated if at least one of the reads remained unconverted by the bisulfite treatment. One thousand fifty-five oyster genes were evaluated for further analysis of methylation and other gene attributes. Genes were selected if at least 1 CG dinucleotide had 5x coverage in the MBD library and were further limited to genes that were expressed in at least 1 of 6 oyster tissues based on the dataset of Zhang et al. (2012). The proportion of methylation for a given gene was calculated by dividing the number of methylated cytosines by the total number of CG dinucleotides in the sequence. The proportion of methylation was also calculated separately for the exonic and intronic regions of each gene.
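
The methylation call and the per-gene proportion described in the Methods can be written out as a small sketch. The function names and the example numbers are hypothetical; only the two rules (coverage greater than 5x with at least one unconverted read, and methylated cytosines divided by total CG dinucleotides) come from the text.

```python
def is_methylated(coverage, unconverted_reads, min_coverage=5):
    """A CG is called methylated if coverage > 5x and >= 1 read is unconverted."""
    return coverage > min_coverage and unconverted_reads >= 1


def methylation_proportion(cg_sites, total_cg):
    """Methylated cytosines divided by the total number of CGs in the sequence.

    cg_sites: iterable of (coverage, unconverted_reads) for CGs with read data.
    total_cg: count of all CG dinucleotides in the gene (or exon/intron) sequence.
    """
    methylated = sum(1 for cov, unconv in cg_sites if is_methylated(cov, unconv))
    return methylated / total_cg if total_cg else 0.0


# Invented gene: 20 CGs in total, 4 of them with read data.
sites = [(12, 3), (8, 0), (5, 2), (20, 1)]
proportion = methylation_proportion(sites, total_cg=20)
```

In this example only the first and last sites pass both rules, so 2 of 20 CGs are counted as methylated. The same function can be applied to exonic and intronic CGs separately.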


[Track] methylKit analysis results - sperm methylation compared to gill, by individual CG
http://eagle.fish.washington.edu/bivalvia/files%20for%20methylKit/diffmeth_bytissue_allCG_v9_90.txt
Preview
"","id","chr","start","end","strand","pvalue","qvalue","meth.diff"
"1","scaffold1.105280","scaffold1",105280,105280,"+",1,0.72307975717804,8.33333333333333
"2","scaffold1.105289","scaffold1",105289,105289,"+",1,0.72307975717804,7.14285714285714
"3","scaffold1.154709","scaffold1",154709,154709,"+",0.0019663626474772,0.00796012720063931,-46.1538461538462
"4","scaffold1.154924","scaffold1",154924,154924,"+",1,0.72307975717804,5.95238095238095
Description: Intervals are all CGs with 10x coverage that were analyzed in both gill and sperm. Both tissues were mapped to oyster_v9_90. The methylKit analysis gives the p-value, q-value, and % difference in methylation for each CG. The sperm is considered the 'control' in this analysis, so positive values in the meth.diff column indicate higher methylation in the sperm.
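
A sketch of reading this file and pulling out sites by q-value and meth.diff, using the preview rows as input. The thresholds (q <= 0.01, absolute difference >= 25) are arbitrary examples, not values used in the analysis.

```python
import csv
import io

preview = '''"","id","chr","start","end","strand","pvalue","qvalue","meth.diff"
"1","scaffold1.105280","scaffold1",105280,105280,"+",1,0.72307975717804,8.33333333333333
"2","scaffold1.105289","scaffold1",105289,105289,"+",1,0.72307975717804,7.14285714285714
"3","scaffold1.154709","scaffold1",154709,154709,"+",0.0019663626474772,0.00796012720063931,-46.1538461538462
"4","scaffold1.154924","scaffold1",154924,154924,"+",1,0.72307975717804,5.95238095238095
'''


def significant_sites(handle, max_q=0.01, min_abs_diff=25.0):
    """Return (chr, start, meth.diff) for rows passing both example thresholds."""
    hits = []
    for row in csv.DictReader(handle):
        q = float(row["qvalue"])
        diff = float(row["meth.diff"])
        if q <= max_q and abs(diff) >= min_abs_diff:
            hits.append((row["chr"], int(row["start"]), diff))
    return hits


hits = significant_sites(io.StringIO(preview))
```

To run it on the full file, pass an open file handle instead of the `io.StringIO` wrapper.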


[Track] methylKit analysis results - sperm methylation compared to gill, by 100bp tile
http://eagle.fish.washington.edu/bivalvia/diffmeth_bytissue_100bptile_v9_90.txt
Preview:
    id    chr    start    end    strand    pvalue    qvalue    meth.diff
1    scaffold1.105201.105300    scaffold1    105201    105300    *    0.544685352    0.57355062    7.317073171
2    scaffold1.130301.130400    scaffold1    130301    130400    *    4.99E-08    3.13E-07    -100
3    scaffold1.154701.154800    scaffold1    154701    154800    *    0.00013902    0.000525279    -44.69026549
4    scaffold1.154901.155000    scaffold1    154901    155000    *    0.647937411    0.631058554    9.523809524
5    scaffold1.155601.155700    scaffold1    155601    155700    *    2.39E-175    6.58E-172    -91.27868169
Description:
Intervals are all 100bp tiles that were analyzed in both gill and sperm. Both tissues were mapped to oyster_v9_90. The methylKit analysis gives the p-value, q-value, and % difference in methylation for each tile. The sperm is considered the 'control' in this analysis, so positive values in the meth.diff column indicate higher methylation in the sperm.







Summary Statistics for Gill RNAseq coverage on CDS, grouped by gene
http://eagle.fish.washington.edu/cnidarian/TJGR_Gil_cov_CDS_stats_cv.txt
Preview
CGI_10011974 3.676751918 12 0.222693761 0.306395993 0.471904398 0 1.47826087 154.0178099
CGI_10014715 12.11476909 17 1.205523862 0.712633476 1.097963507 0 2.951219512 154.0712784
CGI_10021734 0.050157776 7 0.000121899 0.007165397 0.011040788 0 0.03125 154.0848123
CGI_10015964 0.028011204 5 7.45E-05 0.005602241 0.008633633 0 0.019607843 154.1103512
CGI_10004322 65.58826056 6 283.8299465 10.93137676 16.84725338 0 41.6056338 154.1183124
Description:
Columns: ID sum CDScount var avg stdev min max cv
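
The preview rows appear consistent with avg = sum / CDScount, stdev = sqrt(var), and cv = 100 * stdev / avg. These relationships are inferred from the numbers shown, not from a documented method, and the check below is only a sketch using two of the preview rows.

```python
import math

# Columns: ID, sum, CDScount, var, avg, stdev, min, max, cv
rows = [
    ("CGI_10011974", 3.676751918, 12, 0.222693761, 0.306395993,
     0.471904398, 0, 1.47826087, 154.0178099),
    ("CGI_10004322", 65.58826056, 6, 283.8299465, 10.93137676,
     16.84725338, 0, 41.6056338, 154.1183124),
]


def row_is_consistent(row, rel_tol=1e-4):
    """Check the inferred relationships between the summary columns of one row."""
    _id, total, count, var, avg, stdev, _min, _max, cv = row
    return (
        math.isclose(total / count, avg, rel_tol=rel_tol)
        and math.isclose(math.sqrt(var), stdev, rel_tol=rel_tol)
        and math.isclose(100 * stdev / avg, cv, rel_tol=rel_tol)
    )


consistent = [row_is_consistent(row) for row in rows]
```

Under this reading, cv is the coefficient of variation expressed as a percentage.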