# Intersect bed on features

Based on notebook 04, but with hypo- and hypermethylated features separated out to see whether they differ.

<img src="http://eagle.fish.washington.edu/cnidarian/skitch/half-shell___Lab_notebook_of_Steven_Roberts_1ACECD10.png" alt="half-shell___Lab_notebook_of_Steven_Roberts_1ACECD10.png"/>

In [2]:
!date
Fri Apr  3 06:28:54 PDT 2015
In [20]:
%pylab inline
import scipy.stats as stats
Populating the interactive namespace from numpy and matplotlib
 
Feature (from nb -03)    
**tldr** 4 "new" tracks
<img src="http://eagle.fish.washington.edu/cnidarian/skitch/IGV_and_Directory_Listing_of__halfshell_2015-02-hs-bedgraph__1AA51F1B.png" width="100%" alt="IGV_and_Directory_Listing_of__halfshell_2015-02-hs-bedgraph__1AA51F1B.png"/>
```
/Users/sr320/data-genomic/tentacle/Cuffdiff_geneexp.sig.gtf
/Users/sr320/data-genomic/tentacle/rebuilt.gtf
/Users/sr320/data-genomic/tentacle/Cgigas_v9_gene-housekeeping.gff
/Users/sr320/data-genomic/tentacle/Cgigas_v9_gene-env-response.gff
```

# DEGs

`-wb    Write the original entry in B for each overlap. Useful for knowing what A overlaps. Restricted by -f and -r.`
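
Because `-wb` appends the original B entry after the A columns, the column that `cut -f` has to pick depends on how many columns A has. A minimal sketch of that bookkeeping, assuming a 4-column bedGraph for A and a 9-column GTF/GFF for B. The GTF record below is made up for illustration and is not taken from `Cuffdiff_geneexp.sig.gtf`; the helper names are also illustrative.

```python
# Sketch of the column layout of one `intersectbed -wb` output row.
# The A columns come first (the real tool trims A's coordinates to the
# overlap; here they are left untouched), followed by the full B entry.

def wb_row(a_fields, b_fields):
    """Concatenate A and B fields the way a -wb output row is laid out."""
    return list(a_fields) + list(b_fields)

def b_source_column(n_a_columns):
    """1-based column (as used by `cut -f`) of B's source field (GTF/GFF column 2)."""
    return n_a_columns + 2

# A: first data line of the 2M bedGraph shown in this notebook (4 columns).
a = ["scaffold1", "163391", "163444", "-1.19635354862016"]
# B: hypothetical 9-column GTF record, for illustration only.
b = ["scaffold1", "Cufflinks", "exon", "163000", "164000", ".", "+", ".",
     'gene_id "XLOC_hypothetical";']

row = wb_row(a, b)
col = b_source_column(len(a))
print(col, row[col - 1])   # bedGraph as A: the source field is column 6
print(b_source_column(9))  # 9-column GFF as A (the probe file): column 11
```

This is why the bedGraph intersections in this notebook use `cut -f 6` while the probe GFF intersections use `cut -f 11`.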

tldr
<img src="http://eagle.fish.washington.edu/cnidarian/skitch/Screenshot_4_3_15__7_22_AM_1ACED994.png" alt="Screenshot_4_3_15__7_22_AM_1ACED994.png"/>


 
## Separating HYPO and HYPER

In [1]:
!head ./data/2014.07.02.colson/genomeBrowserTracks/logFC_HS-preHS/2014.07.02.2M_sig.bedGraph
track type=bedGraph name="2M_sig" description="2M_sig" visibility=full color=100,100,0 altColor=0,100,200 priority=20
scaffold1	163391	163444	-1.19635354862016
scaffold1	167390	167448	-1.34858424227208
scaffold1	177036	177092	-1.32513261026528
scaffold1	180263	180313	-1.59644601437398
scaffold1	184151	184202	-1.36802539236446
scaffold1	207852	207911	-1.4489540693628
scaffold1	221645	221697	-1.19168816975966
scaffold100	20261	20311	-1.38705592724581
scaffold100	43707	43766	-1.94554287545546
In [1]:
!fgrep -c "-" ./data/2014.07.02.colson/genomeBrowserTracks/logFC_HS-preHS/2014.07.02.2M_sig.bedGraph
7224
In [3]:
!fgrep "-" ./data/2014.07.02.colson/genomeBrowserTracks/logFC_HS-preHS/2014.07.02.2M_sig.bedGraph | head
scaffold1	163391	163444	-1.19635354862016
scaffold1	167390	167448	-1.34858424227208
scaffold1	177036	177092	-1.32513261026528
scaffold1	180263	180313	-1.59644601437398
scaffold1	184151	184202	-1.36802539236446
scaffold1	207852	207911	-1.4489540693628
scaffold1	221645	221697	-1.19168816975966
scaffold100	20261	20311	-1.38705592724581
scaffold100	43707	43766	-1.94554287545546
scaffold100	46611	46670	-1.2435587162076
In [4]:
!fgrep "-" \
./data/2014.07.02.colson/genomeBrowserTracks/logFC_HS-preHS/2014.07.02.2M_sig.bedGraph \
> /Users/sr320/data-genomic/tentacle/2014.07.02.2M_sig.hypo.bedGraph
!head /Users/sr320/data-genomic/tentacle/2014.07.02.2M_sig.hypo.bedGraph
scaffold1	163391	163444	-1.19635354862016
scaffold1	167390	167448	-1.34858424227208
scaffold1	177036	177092	-1.32513261026528
scaffold1	180263	180313	-1.59644601437398
scaffold1	184151	184202	-1.36802539236446
scaffold1	207852	207911	-1.4489540693628
scaffold1	221645	221697	-1.19168816975966
scaffold100	20261	20311	-1.38705592724581
scaffold100	43707	43766	-1.94554287545546
scaffold100	46611	46670	-1.2435587162076
In [6]:
!fgrep "-" \
./data/2014.07.02.colson/genomeBrowserTracks/logFC_HS-preHS/2014.07.02.4M_sig.bedGraph \
> /Users/sr320/data-genomic/tentacle/2014.07.02.4M_sig.hypo.bedGraph
!head /Users/sr320/data-genomic/tentacle/2014.07.02.4M_sig.hypo.bedGraph
scaffold1	55723	55780	-1.14983078196614
scaffold1	165162	165215	-1.24601772855566
scaffold1	171392	171453	-1.22260744814979
scaffold1	174287	174343	-1.69319890151177
scaffold1	176273	176334	-1.72785163633438
scaffold1	183256	183318	-1.30551922539134
scaffold1	184661	184715	-1.4004518443988
scaffold1	214736	214786	-1.21921626270337
scaffold1	215096	215156	-1.24410534350034
scaffold1	218534	218584	-1.13230161854171
In [7]:
!fgrep "-" \
./data/2014.07.02.colson/genomeBrowserTracks/logFC_HS-preHS/2014.07.02.6M_sig.bedGraph \
> /Users/sr320/data-genomic/tentacle/2014.07.02.6M_sig.hypo.bedGraph
!head /Users/sr320/data-genomic/tentacle/2014.07.02.6M_sig.hypo.bedGraph
scaffold1	54599	54654	-1.38187662416007
scaffold1	163536	163586	-1.15032035523765
scaffold1	174287	174343	-1.62903936976887
scaffold1	184271	184330	-1.20699853451878
scaffold1	184661	184715	-1.61107459826899
scaffold1	185141	185192	-1.19168730137504
scaffold1	210863	210918	-1.74282743323306
scaffold1	215839	215890	-1.34660189199927
scaffold1	224010	224070	-1.29699353038817
scaffold1	227414	227469	-1.17931294986337
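
`fgrep "-"` keys on any hyphen in the line, which works here because the only hyphens in the data lines are minus signs in the logFC column. A minimal sketch of a numeric split instead, assuming the value sits in column 4 and that `track` header lines should be dropped from both outputs. The function name `split_by_sign` is illustrative, and the last sample line is invented to show a case where a text match would misfire.

```python
def split_by_sign(lines):
    """Split bedGraph lines into (hypo, hyper) lists by the sign of column 4."""
    hypo, hyper = [], []
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and the bedGraph track header.
        if not line or line.startswith("track"):
            continue
        value = float(line.split("\t")[3])
        if value < 0:
            hypo.append(line)
        elif value > 0:
            hyper.append(line)
        # A value of exactly zero is neither hypo nor hyper and is dropped.
    return hypo, hyper

sample = [
    'track type=bedGraph name="2M_sig" description="2M_sig"',
    "scaffold1\t163391\t163444\t-1.19635354862016",
    "scaffold100\t250533\t250586\t1.72713841645018",
    # Invented line: a positive value in scientific notation contains a hyphen.
    "scaffold1\t1000\t1050\t2.5e-05",
]

hypo, hyper = split_by_sign(sample)
print(len(hypo), len(hyper))
```

With this split neither output carries the header line, so line counts equal feature counts; `fgrep -v "-"` keeps the `track` line in the hyper file.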
### HYPO

In [5]:
!intersectbed \
-wb \
-a /Users/sr320/data-genomic/tentacle/2014.07.02.2M_sig.hypo.bedGraph \
-b /Users/sr320/data-genomic/tentacle/Cuffdiff_geneexp.sig.gtf \
| cut -f 6 \
| sort | uniq -c 
 726 Cufflinks
In [9]:
!intersectbed \
-wb \
-a /Users/sr320/data-genomic/tentacle/2014.07.02.4M_sig.hypo.bedGraph \
-b /Users/sr320/data-genomic/tentacle/Cuffdiff_geneexp.sig.gtf \
| cut -f 6 \
| sort | uniq -c 
 426 Cufflinks
In [10]:
!intersectbed \
-wb \
-a /Users/sr320/data-genomic/tentacle/2014.07.02.6M_sig.hypo.bedGraph \
-b /Users/sr320/data-genomic/tentacle/Cuffdiff_geneexp.sig.gtf \
| cut -f 6 \
| sort | uniq -c 
 372 Cufflinks
In [10]:
!intersectbed \
-wb \
-a /Users/sr320/git-repos/paper-Temp-stress/ipynb/data/array-design/OID40453_probe_locations.gff \
-b /Users/sr320/data-genomic/tentacle/Cuffdiff_geneexp.sig.gtf \
| cut -f 11 \
| sort | uniq -c 
117460 Cufflinks
In [21]:
# Enter the data comparing Oyster 2 then Probes
obs = array([[726, 7224], [117460, 697753]])
# Calculate the chi-square test
chi2_corrected = stats.chi2_contingency(obs, correction=True)
chi2_uncorrected = stats.chi2_contingency(obs, correction=False)
# Print the result
print('CHI SQUARE')
print('The corrected chi2 value is {0:5.3f}, with p={1:5.3f}'.format(chi2_corrected[0], chi2_corrected[1]))
print('The uncorrected chi2 value is {0:5.3f}, with p={1:5.3f}'.format(chi2_uncorrected[0], chi2_uncorrected[1]))
CHI SQUARE
The corrected chi2 value is 177.835, with p=0.000
The uncorrected chi2 value is 178.264, with p=0.000
In [22]:
# Enter the data comparing Oyster 4 then Probes
obs = array([[426, 6560], [117460, 697753]])
# Calculate the chi-square test
chi2_corrected = stats.chi2_contingency(obs, correction=True)
chi2_uncorrected = stats.chi2_contingency(obs, correction=False)
# Print the result
print('CHI SQUARE')
print('The corrected chi2 value is {0:5.3f}, with p={1:5.3f}'.format(chi2_corrected[0], chi2_corrected[1]))
print('The uncorrected chi2 value is {0:5.3f}, with p={1:5.3f}'.format(chi2_uncorrected[0], chi2_uncorrected[1]))
CHI SQUARE
The corrected chi2 value is 388.828, with p=0.000
The uncorrected chi2 value is 389.505, with p=0.000
In [23]:
# Enter the data comparing Oyster 6 then Probes
obs = array([[372, 7645], [117460, 697753]])
# Calculate the chi-square test
chi2_corrected = stats.chi2_contingency(obs, correction=True)
chi2_uncorrected = stats.chi2_contingency(obs, correction=False)
# Print the result
print('CHI SQUARE')
print('The corrected chi2 value is {0:5.3f}, with p={1:5.3f}'.format(chi2_corrected[0], chi2_corrected[1]))
print('The uncorrected chi2 value is {0:5.3f}, with p={1:5.3f}'.format(chi2_uncorrected[0], chi2_uncorrected[1]))
CHI SQUARE
The corrected chi2 value is 616.865, with p=0.000
The uncorrected chi2 value is 617.661, with p=0.000
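
With the `{1:5.3f}` format every p-value above prints as 0.000. A minimal standard-library sketch of the same 2x2 test that reports p in scientific notation. It uses the textbook 2x2 formula with an optional Yates correction and the one-degree-of-freedom identity p = erfc(sqrt(chi2/2)); it is meant as a cross-check on `stats.chi2_contingency`, not a replacement, and the function name `chi2_2x2` is illustrative.

```python
from math import erfc, sqrt

def chi2_2x2(obs, correction=True):
    """Chi-square statistic and p-value (1 d.f.) for a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = obs
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if correction:
        # Yates continuity correction, never pushing the difference below zero.
        diff = max(0.0, diff - n / 2.0)
    chi2 = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p = erfc(sqrt(chi2 / 2.0))
    return chi2, p

# Oyster 2 hypo table, as entered in the first chi-square cell of this section.
for label, corr in (("corrected", True), ("uncorrected", False)):
    chi2, p = chi2_2x2([[726, 7224], [117460, 697753]], correction=corr)
    print("The {0} chi2 value is {1:.3f}, with p={2:.3g}".format(label, chi2, p))
```

The statistics should match the values printed for oyster 2, and the p-value then shows up as a very small number rather than 0.000.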
 
### HYPER

In [13]:
!fgrep -v "-" \
./data/2014.07.02.colson/genomeBrowserTracks/logFC_HS-preHS/2014.07.02.2M_sig.bedGraph \
> /Users/sr320/data-genomic/tentacle/2014.07.02.2M_sig.hyper.bedGraph
!head /Users/sr320/data-genomic/tentacle/2014.07.02.2M_sig.hyper.bedGraph
!wc -l /Users/sr320/data-genomic/tentacle/2014.07.02.2M_sig.hyper.bedGraph
track type=bedGraph name="2M_sig" description="2M_sig" visibility=full color=100,100,0 altColor=0,100,200 priority=20
scaffold100	250533	250586	1.72713841645018
scaffold100	362779	362836	1.24792813025432
scaffold100	437627	437684	1.26968497656438
scaffold100	439363	439415	1.8535900440036
scaffold100	458442	458498	1.33779652899652
scaffold100	636974	637034	1.47573175617257
scaffold100	637084	637143	1.24656795811596
scaffold100	642110	642170	1.32784939242625
scaffold100	676066	676117	2.0744756115782
    2804 /Users/sr320/data-genomic/tentacle/2014.07.02.2M_sig.hyper.bedGraph
In [12]:
!fgrep -v "-" \
./data/2014.07.02.colson/genomeBrowserTracks/logFC_HS-preHS/2014.07.02.4M_sig.bedGraph \
> /Users/sr320/data-genomic/tentacle/2014.07.02.4M_sig.hyper.bedGraph
!head /Users/sr320/data-genomic/tentacle/2014.07.02.4M_sig.hyper.bedGraph
!wc -l /Users/sr320/data-genomic/tentacle/2014.07.02.4M_sig.hyper.bedGraph
track type=bedGraph name="4M_sig" description="4M_sig" visibility=full color=100,100,0 altColor=0,100,200 priority=20
scaffold1	162896	162952	1.31051906307266
scaffold1	174020	174073	1.13065801555915
scaffold1	178210	178267	1.2199265031441
scaffold1	208737	208792	1.31462945133609
scaffold100	91713	91767	1.16773934117713
scaffold100	250282	250335	1.35652322667099
scaffold100	300103	300158	1.23146709929105
scaffold100	303374	303434	1.44751323196346
scaffold100	306375	306430	1.14267878234681
    3588 /Users/sr320/data-genomic/tentacle/2014.07.02.4M_sig.hyper.bedGraph
In [14]:
!fgrep -v "-" \
./data/2014.07.02.colson/genomeBrowserTracks/logFC_HS-preHS/2014.07.02.6M_sig.bedGraph \
> /Users/sr320/data-genomic/tentacle/2014.07.02.6M_sig.hyper.bedGraph
!head /Users/sr320/data-genomic/tentacle/2014.07.02.6M_sig.hyper.bedGraph
!wc -l /Users/sr320/data-genomic/tentacle/2014.07.02.6M_sig.hyper.bedGraph
track type=bedGraph name="6M_sig" description="6M_sig" visibility=full color=100,100,0 altColor=0,100,200 priority=20
scaffold1	162129	162191	1.85685479189849
scaffold1	172654	172714	1.33561271440876
scaffold1	178075	178128	1.42323539316231
scaffold1	178685	178740	1.30886296151914
scaffold1	214231	214288	1.23355990867606
scaffold1	219034	219092	1.34001786676384
scaffold1	223041	223094	1.32669837521425
scaffold1	230131	230189	1.41307400393928
scaffold100	244541	244592	2.500239239607
    4045 /Users/sr320/data-genomic/tentacle/2014.07.02.6M_sig.hyper.bedGraph
In [15]:
!intersectbed \
-wb \
-a /Users/sr320/data-genomic/tentacle/2014.07.02.2M_sig.hyper.bedGraph \
-b /Users/sr320/data-genomic/tentacle/Cuffdiff_geneexp.sig.gtf \
| cut -f 6 \
| sort | uniq -c 
 154 Cufflinks
In [16]:
!intersectbed \
-wb \
-a /Users/sr320/data-genomic/tentacle/2014.07.02.4M_sig.hyper.bedGraph \
-b /Users/sr320/data-genomic/tentacle/Cuffdiff_geneexp.sig.gtf \
| cut -f 6 \
| sort | uniq -c 
 278 Cufflinks
In [17]:
!intersectbed \
-wb \
-a /Users/sr320/data-genomic/tentacle/2014.07.02.6M_sig.hyper.bedGraph \
-b /Users/sr320/data-genomic/tentacle/Cuffdiff_geneexp.sig.gtf \
| cut -f 6 \
| sort | uniq -c 
 260 Cufflinks
In [18]:
 
!intersectbed \
-wb \
-a /Users/sr320/git-repos/paper-Temp-stress/ipynb/data/array-design/OID40453_probe_locations.gff \
-b /Users/sr320/data-genomic/tentacle/Cuffdiff_geneexp.sig.gtf \
| cut -f 11 \
| sort | uniq -c 
117460 Cufflinks
In [24]:
# Enter the data comparing Oyster 2 then Probes
obs = array([[154, 2803], [117460, 697753]])
# Calculate the chi-square test
chi2_corrected = stats.chi2_contingency(obs, correction=True)
chi2_uncorrected = stats.chi2_contingency(obs, correction=False)
# Print the result
print('CHI SQUARE')
print('The corrected chi2 value is {0:5.3f}, with p={1:5.3f}'.format(chi2_corrected[0], chi2_corrected[1]))
print('The uncorrected chi2 value is {0:5.3f}, with p={1:5.3f}'.format(chi2_uncorrected[0], chi2_uncorrected[1]))
CHI SQUARE
The corrected chi2 value is 201.876, with p=0.000
The uncorrected chi2 value is 202.623, with p=0.000
In [25]:
# Enter the data comparing Oyster 4 then Probes
obs = array([[278, 3587], [117460, 697753]])
# Calculate the chi-square test
chi2_corrected = stats.chi2_contingency(obs, correction=True)
chi2_uncorrected = stats.chi2_contingency(obs, correction=False)
# Print the result
print('CHI SQUARE')
print('The corrected chi2 value is {0:5.3f}, with p={1:5.3f}'.format(chi2_corrected[0], chi2_corrected[1]))
print('The uncorrected chi2 value is {0:5.3f}, with p={1:5.3f}'.format(chi2_uncorrected[0], chi2_uncorrected[1]))
CHI SQUARE
The corrected chi2 value is 162.143, with p=0.000
The uncorrected chi2 value is 162.728, with p=0.000
In [26]:
# Enter the data comparing Oyster 6 then Probes
obs = array([[260, 4044], [117460, 697753]])
# Calculate the chi-square test
chi2_corrected = stats.chi2_contingency(obs, correction=True)
chi2_uncorrected = stats.chi2_contingency(obs, correction=False)
# Print the result
print('CHI SQUARE')
print('The corrected chi2 value is {0:5.3f}, with p={1:5.3f}'.format(chi2_corrected[0], chi2_corrected[1]))
print('The uncorrected chi2 value is {0:5.3f}, with p={1:5.3f}'.format(chi2_uncorrected[0], chi2_uncorrected[1]))
CHI SQUARE
The corrected chi2 value is 243.013, with p=0.000
The uncorrected chi2 value is 243.693, with p=0.000
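
The chi-square values say the tables differ from the probe background but not in which direction. A minimal sketch that turns the counts entered above into overlaps per feature. All numbers are copied from the cells in this section; the dictionary layout is only for illustration. Since `intersectbed -wb` reports one row per overlap, these are overlap counts per feature rather than strict fractions of features.

```python
# (overlaps with Cuffdiff_geneexp.sig.gtf, number of features) per track,
# copied from the intersectbed outputs and chi-square cells above.
deg_counts = {
    "2M hypo": (726, 7224),
    "4M hypo": (426, 6560),
    "6M hypo": (372, 7645),
    "2M hyper": (154, 2803),
    "4M hyper": (278, 3587),
    "6M hyper": (260, 4044),
    "probes": (117460, 697753),
}

# Python 3 true division; under Python 2 use float(total).
rates = {name: hits / total for name, (hits, total) in deg_counts.items()}

for name, rate in rates.items():
    print("{0:9s} {1:.3f} overlaps per feature".format(name, rate))
```

On these counts each hypo and hyper track comes out below the probe background rate.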
# Rebuilt (new gtf based on RNAseq data)

In [15]:
 
!intersectbed \
-wb \
-a ./data/2014.07.02.colson/genomeBrowserTracks/logFC_HS-preHS/2014.07.02.2M_sig.bedGraph \
-b /Users/sr320/data-genomic/tentacle/rebuilt.gtf \
| cut -f 6 \
| sort | uniq -c 
8768 Cufflinks
In [16]:
 
!intersectbed \
-wb \
-a ./data/2014.07.02.colson/genomeBrowserTracks/logFC_HS-preHS/2014.07.02.4M_sig.bedGraph \
-b /Users/sr320/data-genomic/tentacle/rebuilt.gtf \
| cut -f 6 \
| sort | uniq -c 
7694 Cufflinks
In [17]:
 
!intersectbed \
-wb \
-a ./data/2014.07.02.colson/genomeBrowserTracks/logFC_HS-preHS/2014.07.02.6M_sig.bedGraph \
-b /Users/sr320/data-genomic/tentacle/rebuilt.gtf \
| cut -f 6 \
| sort | uniq -c 
6160 Cufflinks
In [18]:
 
!intersectbed \
-wb \
-a /Users/sr320/git-repos/paper-Temp-stress/ipynb/data/array-design/OID40453_probe_locations.gff \
-b /Users/sr320/data-genomic/tentacle/rebuilt.gtf \
| cut -f 11 \
| sort | uniq -c 
1197818 Cufflinks
In [39]:
# Enter the data comparing Oyster 2 then Probes
obs = array([[8768, 10028], [1197818, 697753]])
# Calculate the chi-square test
chi2_corrected = stats.chi2_contingency(obs, correction=True)
chi2_uncorrected = stats.chi2_contingency(obs, correction=False)
# Print the result
print('CHI SQUARE')
print('The corrected chi2 value is {0:5.3f}, with p={1:5.3f}'.format(chi2_corrected[0], chi2_corrected[1]))
print('The uncorrected chi2 value is {0:5.3f}, with p={1:5.3f}'.format(chi2_uncorrected[0], chi2_uncorrected[1]))
CHI SQUARE
The corrected chi2 value is 2184.818, with p=0.000
The uncorrected chi2 value is 2185.528, with p=0.000
In [40]:
 
# Enter the data comparing Oyster 4 then Probes
obs = array([[7694, 10148], [1197818, 697753]])
# Calculate the chi-square test
chi2_corrected = stats.chi2_contingency(obs, correction=True)
chi2_uncorrected = stats.chi2_contingency(obs, correction=False)
# Print the result
print('CHI SQUARE')
print('The corrected chi2 value is {0:5.3f}, with p={1:5.3f}'.format(chi2_corrected[0], chi2_corrected[1]))
print('The uncorrected chi2 value is {0:5.3f}, with p={1:5.3f}'.format(chi2_uncorrected[0], chi2_uncorrected[1]))
CHI SQUARE
The corrected chi2 value is 3052.863, with p=0.000
The uncorrected chi2 value is 3053.724, with p=0.000
In [41]:
 
# Enter the data comparing Oyster 6 then Probes
obs = array([[6160, 11690], [1197818, 697753]])
# Calculate the chi-square test
chi2_corrected = stats.chi2_contingency(obs, correction=True)
chi2_uncorrected = stats.chi2_contingency(obs, correction=False)
# Print the result
print('CHI SQUARE')
print('The corrected chi2 value is {0:5.3f}, with p={1:5.3f}'.format(chi2_corrected[0], chi2_corrected[1]))
print('The uncorrected chi2 value is {0:5.3f}, with p={1:5.3f}'.format(chi2_uncorrected[0], chi2_uncorrected[1]))
CHI SQUARE
The corrected chi2 value is 6233.645, with p=0.000
The uncorrected chi2 value is 6234.874, with p=0.000
 
# Housekeeping Genes

Separating out hypo and hyper

<img src="http://eagle.fish.washington.edu/cnidarian/skitch/Screenshot_4_3_15__7_23_AM_1ACED9DD.png" alt="Screenshot_4_3_15__7_23_AM_1ACED9DD.png"/>

In [29]:
!intersectbed \
-wb \
-a /Users/sr320/data-genomic/tentacle/2014.07.02.2M_sig.hypo.bedGraph \
-b /Users/sr320/data-genomic/tentacle/Cgigas_v9_gene-housekeeping.gff \
| cut -f 6 \
| sort | uniq -c 
!intersectbed \
-wb \
-a /Users/sr320/data-genomic/tentacle/2014.07.02.2M_sig.hyper.bedGraph \
-b /Users/sr320/data-genomic/tentacle/Cgigas_v9_gene-housekeeping.gff \
| cut -f 6 \
| sort | uniq -c 
2172 GLEAN
1038 GLEAN
In [30]:
!intersectbed \
-wb \
-a /Users/sr320/data-genomic/tentacle/2014.07.02.4M_sig.hypo.bedGraph \
-b /Users/sr320/data-genomic/tentacle/Cgigas_v9_gene-housekeeping.gff \
| cut -f 6 \
| sort | uniq -c 
!intersectbed \
-wb \
-a /Users/sr320/data-genomic/tentacle/2014.07.02.4M_sig.hyper.bedGraph \
-b /Users/sr320/data-genomic/tentacle/Cgigas_v9_gene-housekeeping.gff \
| cut -f 6 \
| sort | uniq -c 
1988 GLEAN
1381 GLEAN
In [31]:
!intersectbed \
-wb \
-a /Users/sr320/data-genomic/tentacle/2014.07.02.6M_sig.hypo.bedGraph \
-b /Users/sr320/data-genomic/tentacle/Cgigas_v9_gene-housekeeping.gff \
| cut -f 6 \
| sort | uniq -c 
!intersectbed \
-wb \
-a /Users/sr320/data-genomic/tentacle/2014.07.02.6M_sig.hyper.bedGraph \
-b /Users/sr320/data-genomic/tentacle/Cgigas_v9_gene-housekeeping.gff \
| cut -f 6 \
| sort | uniq -c 
2367 GLEAN
1452 GLEAN
In [26]:
 
!intersectbed \
-wb \
-a /Users/sr320/git-repos/paper-Temp-stress/ipynb/data/array-design/OID40453_probe_locations.gff \
-b /Users/sr320/data-genomic/tentacle/Cgigas_v9_gene-housekeeping.gff \
| cut -f 11 \
| sort | uniq -c 
251970 GLEAN
In [42]:
 
# Enter the data comparing Oyster 2 then Probes
obs = array([[3210, 10028], [251970, 697753]])
# Calculate the chi-square test
chi2_corrected = stats.chi2_contingency(obs, correction=True)
chi2_uncorrected = stats.chi2_contingency(obs, correction=False)
# Print the result
print('CHI SQUARE')
print('The corrected chi2 value is {0:5.3f}, with p={1:5.3f}'.format(chi2_corrected[0], chi2_corrected[1]))
print('The uncorrected chi2 value is {0:5.3f}, with p={1:5.3f}'.format(chi2_uncorrected[0], chi2_uncorrected[1]))
CHI SQUARE
The corrected chi2 value is 34.806, with p=0.000
The uncorrected chi2 value is 34.923, with p=0.000
In [43]:
 
# Enter the data comparing Oyster 4 then Probes
obs = array([[3369, 10148], [251970, 697753]])
# Calculate the chi-square test
chi2_corrected = stats.chi2_contingency(obs, correction=True)
chi2_uncorrected = stats.chi2_contingency(obs, correction=False)
# Print the result
print('CHI SQUARE')
print('The corrected chi2 value is {0:5.3f}, with p={1:5.3f}'.format(chi2_corrected[0], chi2_corrected[1]))
print('The uncorrected chi2 value is {0:5.3f}, with p={1:5.3f}'.format(chi2_uncorrected[0], chi2_uncorrected[1]))
CHI SQUARE
The corrected chi2 value is 17.578, with p=0.000
The uncorrected chi2 value is 17.661, with p=0.000
In [47]:
 
# Enter the data comparing Oyster 6 then Probes
obs = array([[3819, 11690], [251970, 697753]])
# Calculate the chi-square test
chi2_corrected = stats.chi2_contingency(obs, correction=True)
chi2_uncorrected = stats.chi2_contingency(obs, correction=False)
# Print the result
print('CHI SQUARE')
print('The corrected chi2 value is {0:5.3f}, with p={1:5.3f}'.format(chi2_corrected[0], chi2_corrected[1]))
print('The uncorrected chi2 value is {0:5.3f}, with p={1:5.3f}'.format(chi2_uncorrected[0], chi2_uncorrected[1]))
CHI SQUARE
The corrected chi2 value is 28.378, with p=0.000
The uncorrected chi2 value is 28.476, with p=0.000
# Environmental Response Genes

separating Hypo and Hyper

tldr
<img src="http://eagle.fish.washington.edu/cnidarian/skitch/Screenshot_4_3_15__7_28_AM_1ACEDB27.png" alt="Screenshot_4_3_15__7_28_AM_1ACEDB27.png"/>

In [33]:
!intersectbed \
-wb \
-a /Users/sr320/data-genomic/tentacle/2014.07.02.2M_sig.hypo.bedGraph \
-b /Users/sr320/data-genomic/tentacle/Cgigas_v9_gene-env-response.gff \
| cut -f 6 \
| sort | uniq -c 
!intersectbed \
-wb \
-a /Users/sr320/data-genomic/tentacle/2014.07.02.2M_sig.hyper.bedGraph \
-b /Users/sr320/data-genomic/tentacle/Cgigas_v9_gene-env-response.gff \
| cut -f 6 \
| sort | uniq -c 
2063 GLEAN
 746 GLEAN
In [34]:
!intersectbed \
-wb \
-a /Users/sr320/data-genomic/tentacle/2014.07.02.4M_sig.hypo.bedGraph \
-b /Users/sr320/data-genomic/tentacle/Cgigas_v9_gene-env-response.gff \
| cut -f 6 \
| sort | uniq -c 
!intersectbed \
-wb \
-a /Users/sr320/data-genomic/tentacle/2014.07.02.4M_sig.hyper.bedGraph \
-b /Users/sr320/data-genomic/tentacle/Cgigas_v9_gene-env-response.gff \
| cut -f 6 \
| sort | uniq -c 
1873 GLEAN
 865 GLEAN
In [35]:
!intersectbed \
-wb \
-a /Users/sr320/data-genomic/tentacle/2014.07.02.6M_sig.hypo.bedGraph \
-b /Users/sr320/data-genomic/tentacle/Cgigas_v9_gene-env-response.gff \
| cut -f 6 \
| sort | uniq -c 
!intersectbed \
-wb \
-a /Users/sr320/data-genomic/tentacle/2014.07.02.6M_sig.hyper.bedGraph \
-b /Users/sr320/data-genomic/tentacle/Cgigas_v9_gene-env-response.gff \
| cut -f 6 \
| sort | uniq -c 
2175 GLEAN
1041 GLEAN
In [27]:
 
!intersectbed \
-wb \
-a /Users/sr320/git-repos/paper-Temp-stress/ipynb/data/array-design/OID40453_probe_locations.gff \
-b /Users/sr320/data-genomic/tentacle/Cgigas_v9_gene-env-response.gff \
| cut -f 11 \
| sort | uniq -c 
190475 GLEAN
In [45]:
 
# Enter the data comparing Oyster 2 then Probes
obs = array([[2809, 10028], [190475, 697753]])
# Calculate the chi-square test
chi2_corrected = stats.chi2_contingency(obs, correction=True)
chi2_uncorrected = stats.chi2_contingency(obs, correction=False)
# Print the result
print('CHI SQUARE')
print('The corrected chi2 value is {0:5.3f}, with p={1:5.3f}'.format(chi2_corrected[0], chi2_corrected[1]))
print('The uncorrected chi2 value is {0:5.3f}, with p={1:5.3f}'.format(chi2_uncorrected[0], chi2_uncorrected[1]))
CHI SQUARE
The corrected chi2 value is 1.413, with p=0.235
The uncorrected chi2 value is 1.439, with p=0.230
In [48]:
 
# Enter the data comparing Oyster 4 then Probes
obs = array([[2738, 10148], [190475, 697753]])
# Calculate the chi-square test
chi2_corrected = stats.chi2_contingency(obs, correction=True)
chi2_uncorrected = stats.chi2_contingency(obs, correction=False)
# Print the result
print('CHI SQUARE')
print('The corrected chi2 value is {0:5.3f}, with p={1:5.3f}'.format(chi2_corrected[0], chi2_corrected[1]))
print('The uncorrected chi2 value is {0:5.3f}, with p={1:5.3f}'.format(chi2_uncorrected[0], chi2_uncorrected[1]))
CHI SQUARE
The corrected chi2 value is 0.280, with p=0.597
The uncorrected chi2 value is 0.291, with p=0.589
In [49]:
 
# Enter the data comparing Oyster 6 then Probes
obs = array([[3216, 11690], [190475, 697753]])
# Calculate the chi-square test
chi2_corrected = stats.chi2_contingency(obs, correction=True)
chi2_uncorrected = stats.chi2_contingency(obs, correction=False)
# Print the result
print('CHI SQUARE')
print('The corrected chi2 value is {0:5.3f}, with p={1:5.3f}'.format(chi2_corrected[0], chi2_corrected[1]))
print('The uncorrected chi2 value is {0:5.3f}, with p={1:5.3f}'.format(chi2_uncorrected[0], chi2_uncorrected[1]))
CHI SQUARE
The corrected chi2 value is 0.141, with p=0.707
The uncorrected chi2 value is 0.149, with p=0.700
 
# TE-Blast

<img src="http://eagle.fish.washington.edu/cnidarian/skitch/Screenshot_4_3_15__7_42_AM_1ACEDE69.png" alt="Screenshot_4_3_15__7_42_AM_1ACEDE69.png"/>

In [37]:
!intersectbed \
-wb \
-a /Users/sr320/data-genomic/tentacle/2014.07.02.2M_sig.hypo.bedGraph \
-b /Volumes/web-1/trilobite/Crassostrea_gigas_v9_tracks/Cgigas_v9_TE-WUBLASTX.gff \
| cut -f 6 \
| sort | uniq -c | sed '/#/d'
!intersectbed \
-wb \
-a /Users/sr320/data-genomic/tentacle/2014.07.02.2M_sig.hyper.bedGraph \
-b /Volumes/web-1/trilobite/Crassostrea_gigas_v9_tracks/Cgigas_v9_TE-WUBLASTX.gff \
| cut -f 6 \
| sort | uniq -c | sed '/#/d'
 368 WUBlastX
  15 WUBlastX
In [38]:
 
!intersectbed \
-wb \
-a /Users/sr320/data-genomic/tentacle/2014.07.02.4M_sig.hypo.bedGraph \
-b /Volumes/web-1/trilobite/Crassostrea_gigas_v9_tracks/Cgigas_v9_TE-WUBLASTX.gff \
| cut -f 6 \
| sort | uniq -c | sed '/#/d'
!intersectbed \
-wb \
-a /Users/sr320/data-genomic/tentacle/2014.07.02.4M_sig.hyper.bedGraph \
-b /Volumes/web-1/trilobite/Crassostrea_gigas_v9_tracks/Cgigas_v9_TE-WUBLASTX.gff \
| cut -f 6 \
| sort | uniq -c | sed '/#/d'
 251 WUBlastX
   3 WUBlastX
In [39]:
!intersectbed \
-wb \
-a /Users/sr320/data-genomic/tentacle/2014.07.02.6M_sig.hypo.bedGraph \
-b /Volumes/web-1/trilobite/Crassostrea_gigas_v9_tracks/Cgigas_v9_TE-WUBLASTX.gff \
| cut -f 6 \
| sort | uniq -c | sed '/#/d'
!intersectbed \
-wb \
-a /Users/sr320/data-genomic/tentacle/2014.07.02.6M_sig.hyper.bedGraph \
-b /Volumes/web-1/trilobite/Crassostrea_gigas_v9_tracks/Cgigas_v9_TE-WUBLASTX.gff \
| cut -f 6 \
| sort | uniq -c | sed '/#/d'
 141 WUBlastX
  27 WUBlastX
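
The cells in this notebook repeat one pipeline for each oyster and direction. A minimal sketch that generates the same command strings in a loop, assuming the file naming used above. It only builds the strings and does not run bedtools; `build_command` and the constant names are illustrative.

```python
TRACK_DIR = "/Users/sr320/data-genomic/tentacle"
TE_GFF = ("/Volumes/web-1/trilobite/Crassostrea_gigas_v9_tracks/"
          "Cgigas_v9_TE-WUBLASTX.gff")

def build_command(oyster, direction, features, columns="6"):
    """Return the intersectbed pipeline used above for one oyster and direction."""
    bedgraph = "{0}/2014.07.02.{1}M_sig.{2}.bedGraph".format(
        TRACK_DIR, oyster, direction)
    return ("intersectbed -wb -a {0} -b {1} | cut -f {2} | sort | uniq -c "
            "| sed '/#/d'").format(bedgraph, features, columns)

# One command per oyster (2, 4, 6) and direction (hypo, hyper).
commands = [build_command(oyster, direction, TE_GFF)
            for oyster in (2, 4, 6)
            for direction in ("hypo", "hyper")]

for command in commands:
    print(command)
```

Each string could then be handed to the shell (a `!` cell or `subprocess`) once the paths are confirmed on the machine in question.
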
# Promoter

<img src="http://eagle.fish.washington.edu/cnidarian/skitch/Screenshot_4_3_15__7_48_AM_1ACEDFD9.png" alt="Screenshot_4_3_15__7_48_AM_1ACEDFD9.png"/>

In [40]:
!intersectbed \
-wb \
-a /Users/sr320/data-genomic/tentacle/2014.07.02.2M_sig.hypo.bedGraph \
-b /Volumes/web-1/trilobite/Crassostrea_gigas_v9_tracks/Cgigas_v9_1k5p_gene_promoter.gff \
| cut -f 6,7 \
| sort | uniq -c | sed '/#/d'
!intersectbed \
-wb \
-a /Users/sr320/data-genomic/tentacle/2014.07.02.2M_sig.hyper.bedGraph \
-b /Volumes/web-1/trilobite/Crassostrea_gigas_v9_tracks/Cgigas_v9_1k5p_gene_promoter.gff \
| cut -f 6,7 \
| sort | uniq -c | sed '/#/d'
 720 flankbed	promoter
 256 flankbed	promoter
In [41]:
!intersectbed \
-wb \
-a /Users/sr320/data-genomic/tentacle/2014.07.02.4M_sig.hypo.bedGraph \
-b /Volumes/web-1/trilobite/Crassostrea_gigas_v9_tracks/Cgigas_v9_1k5p_gene_promoter.gff \
| cut -f 6,7 \
| sort | uniq -c | sed '/#/d'
!intersectbed \
-wb \
-a /Users/sr320/data-genomic/tentacle/2014.07.02.4M_sig.hyper.bedGraph \
-b /Volumes/web-1/trilobite/Crassostrea_gigas_v9_tracks/Cgigas_v9_1k5p_gene_promoter.gff \
| cut -f 6,7 \
| sort | uniq -c | sed '/#/d'
 684 flankbed	promoter
 308 flankbed	promoter
 
!intersectbed \
-wb \
-a /Users/sr320/data-genomic/tentacle/2014.07.02.6M_sig.hypo.bedGraph \
-b /Volumes/web-1/trilobite/Crassostrea_gigas_v9_tracks/Cgigas_v9_1k5p_gene_promoter.gff \
| cut -f 6,7 \
| sort | uniq -c | sed '/#/d'
!intersectbed \
-wb \
-a /Users/sr320/data-genomic/tentacle/2014.07.02.6M_sig.hyper.bedGraph \
-b /Volumes/web-1/trilobite/Crassostrea_gigas_v9_tracks/Cgigas_v9_1k5p_gene_promoter.gff \
| cut -f 6,7 \
| sort | uniq -c | sed '/#/d'

# Chi2 test to compare hypo v hyper?

In [43]:
# Enter the data comparing Oyster ALL hypo versus hyper -HOUSEKEEPING
obs = array([[6527, 21429], [3871, 10434]])
# Calculate the chi-square test
chi2_corrected = stats.chi2_contingency(obs, correction=True)
chi2_uncorrected = stats.chi2_contingency(obs, correction=False)
# Print the result
print('CHI SQUARE')
print('The corrected chi2 value is {0:5.3f}, with p={1:5.3f}'.format(chi2_corrected[0], chi2_corrected[1]))
print('The uncorrected chi2 value is {0:5.3f}, with p={1:5.3f}'.format(chi2_uncorrected[0], chi2_uncorrected[1]))
CHI SQUARE
The corrected chi2 value is 70.128, with p=0.000
The uncorrected chi2 value is 70.329, with p=0.000
In [44]:
# Enter the data comparing Oyster ALL hypo versus hyper -DEGS
obs = array([[1524, 21429], [692, 10434]])
# Calculate the chi-square test
chi2_corrected = stats.chi2_contingency(obs, correction=True)
chi2_uncorrected = stats.chi2_contingency(obs, correction=False)
# Print the result
print('CHI SQUARE')
print('The corrected chi2 value is {0:5.3f}, with p={1:5.3f}'.format(chi2_corrected[0], chi2_corrected[1]))
print('The uncorrected chi2 value is {0:5.3f}, with p={1:5.3f}'.format(chi2_uncorrected[0], chi2_uncorrected[1]))
CHI SQUARE
The corrected chi2 value is 2.106, with p=0.147
The uncorrected chi2 value is 2.174, with p=0.140
In [45]:
# Enter the data comparing Oyster ALL hypo versus hyper -TE blast
obs = array([[760, 21429], [45, 10434]])
# Calculate the chi-square test
chi2_corrected = stats.chi2_contingency(obs, correction=True)
chi2_uncorrected = stats.chi2_contingency(obs, correction=False)
# Print the result
print('CHI SQUARE')
print('The corrected chi2 value is {0:5.3f}, with p={1:5.3f}'.format(chi2_corrected[0], chi2_corrected[1]))
print('The uncorrected chi2 value is {0:5.3f}, with p={1:5.3f}'.format(chi2_uncorrected[0], chi2_uncorrected[1]))
CHI SQUARE
The corrected chi2 value is 264.516, with p=0.000
The uncorrected chi2 value is 265.761, with p=0.000
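
The pooled tables above are the per-oyster counts from the earlier sections added up across oysters 2, 4 and 6. A minimal sketch of that bookkeeping, with all numbers copied from this notebook and variable names chosen for illustration:

```python
# Per-oyster overlap counts in the order 2M, 4M, 6M, copied from the cells above.
hypo = {
    "housekeeping": [2172, 1988, 2367],
    "DEGs": [726, 426, 372],
    "TE blast": [368, 251, 141],
}
hyper = {
    "housekeeping": [1038, 1381, 1452],
    "DEGs": [154, 278, 260],
    "TE blast": [15, 3, 27],
}
# Feature totals per oyster as entered in the per-oyster chi-square cells.
hypo_total = sum([7224, 6560, 7645])
hyper_total = sum([2803, 3587, 4044])

# Rebuild the pooled [[hypo overlaps, hypo total], [hyper overlaps, hyper total]] tables.
for feature in hypo:
    obs = [[sum(hypo[feature]), hypo_total],
           [sum(hyper[feature]), hyper_total]]
    print(feature, obs)
```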
 
<img src="http://eagle.fish.washington.edu/cnidarian/skitch/Screenshot_4_3_15__8_07_AM_1ACEE442.png" alt="Screenshot_4_3_15__8_07_AM_1ACEE442.png"/>

# Hypo v Hyper on Ensembl gff - gives data on gene body, and repeats

In [48]:
!echo "hypo"
!intersectbed \
-wb \
-a /Users/sr320/data-genomic/tentacle/2014.07.02.2M_sig.hypo.bedGraph \
-b /Volumes/web-1/trilobite/Crassostrea_gigas_ensembl_tracks/Crassostrea_gigas.GCA_000297895.1.25.sorted.gff3 \
| cut -f 6,7 \
| sort | uniq -c | sed '/#/d'
!echo "hyper"
!intersectbed \
-wb \
-a /Users/sr320/data-genomic/tentacle/2014.07.02.2M_sig.hyper.bedGraph \
-b /Volumes/web-1/trilobite/Crassostrea_gigas_ensembl_tracks/Crassostrea_gigas.GCA_000297895.1.25.sorted.gff3 \
| cut -f 6,7 \
| sort | uniq -c | sed '/#/d'
hypo
1065 GigaDB	CDS
1065 GigaDB	exon
6094 GigaDB	gene
6094 GigaDB	transcript
 918 dust	repeat_region
 819 trf	repeat_region
hyper
 308 GigaDB	CDS
 308 GigaDB	exon
2374 GigaDB	gene
2374 GigaDB	transcript
 322 dust	repeat_region
 156 trf	repeat_region
In [50]:
!echo "hypo"
!intersectbed \
-wb \
-a /Users/sr320/data-genomic/tentacle/2014.07.02.4M_sig.hypo.bedGraph \
-b /Volumes/web-1/trilobite/Crassostrea_gigas_ensembl_tracks/Crassostrea_gigas.GCA_000297895.1.25.sorted.gff3 \
| cut -f 6,7 \
| sort | uniq -c | sed '/#/d'
!echo "hyper"
!intersectbed \
-wb \
-a /Users/sr320/data-genomic/tentacle/2014.07.02.4M_sig.hyper.bedGraph \
-b /Volumes/web-1/trilobite/Crassostrea_gigas_ensembl_tracks/Crassostrea_gigas.GCA_000297895.1.25.sorted.gff3 \
| cut -f 6,7 \
| sort | uniq -c | sed '/#/d'
hypo
   1 EnsemblGenomes	exon
   1 EnsemblGenomes	pseudogenic_tRNA
   1 EnsemblGenomes	transcript
 715 GigaDB	CDS
 715 GigaDB	exon
5389 GigaDB	gene
5389 GigaDB	transcript
 907 dust	repeat_region
 653 trf	repeat_region
hyper
   1 EnsemblGenomes	exon
   1 EnsemblGenomes	tRNA_gene
   1 EnsemblGenomes	transcript
 462 GigaDB	CDS
 462 GigaDB	exon
3102 GigaDB	gene
3102 GigaDB	transcript
 413 dust	repeat_region
 220 trf	repeat_region
In [49]:
 
!echo "hypo"
!intersectbed \
-wb \
-a /Users/sr320/data-genomic/tentacle/2014.07.02.6M_sig.hypo.bedGraph \
-b /Volumes/web-1/trilobite/Crassostrea_gigas_ensembl_tracks/Crassostrea_gigas.GCA_000297895.1.25.sorted.gff3 \
| cut -f 6,7 \
| sort | uniq -c | sed '/#/d'
!echo "hyper"
!intersectbed \
-wb \
-a /Users/sr320/data-genomic/tentacle/2014.07.02.6M_sig.hyper.bedGraph \
-b /Volumes/web-1/trilobite/Crassostrea_gigas_ensembl_tracks/Crassostrea_gigas.GCA_000297895.1.25.sorted.gff3 \
| cut -f 6,7 \
| sort | uniq -c | sed '/#/d'
hypo
   1 EnsemblGenomes	exon
   1 EnsemblGenomes	snRNA
   1 EnsemblGenomes	snRNA_gene
 568 GigaDB	CDS
 568 GigaDB	exon
6295 GigaDB	gene
6295 GigaDB	transcript
1052 dust	repeat_region
 550 trf	repeat_region
hyper
 379 GigaDB	CDS
 380 GigaDB	exon
3394 GigaDB	gene
3394 GigaDB	transcript
 539 dust	repeat_region
 314 trf	repeat_region
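
To compare hypo and hyper side by side, the `uniq -c` listings can be parsed into dictionaries. A minimal sketch using the oyster 2 output from this section, pasted in as strings; `parse_uniq_c` is an illustrative helper, not part of bedtools.

```python
def parse_uniq_c(text):
    """Parse `sort | uniq -c` lines of the form '<count> <source><TAB><type>'."""
    counts = {}
    for line in text.strip().splitlines():
        # Split the leading count from the tab-separated key.
        count, key = line.strip().split(None, 1)
        counts[tuple(key.split("\t"))] = int(count)
    return counts

# Oyster 2 (2M) counts against the Ensembl gff3, copied from the output above.
hypo_2m = parse_uniq_c(
    "1065 GigaDB\tCDS\n1065 GigaDB\texon\n6094 GigaDB\tgene\n"
    "6094 GigaDB\ttranscript\n 918 dust\trepeat_region\n 819 trf\trepeat_region"
)
hyper_2m = parse_uniq_c(
    " 308 GigaDB\tCDS\n 308 GigaDB\texon\n2374 GigaDB\tgene\n"
    "2374 GigaDB\ttranscript\n 322 dust\trepeat_region\n 156 trf\trepeat_region"
)

for key in sorted(hypo_2m):
    print("{0:24s} hypo={1:5d} hyper={2:5d}".format(
        " ".join(key), hypo_2m[key], hyper_2m.get(key, 0)))
```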