SUMMARISING RUN PARAMETERS
==========================
Input filename: EPI-209_S29_L004_R2_001.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.4.4_dev
Cutadapt version: 1.9.1
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
File was specified to be an MspI-digested RRBS sample. Read 1 sequences with adapter contamination will be trimmed a further 2 bp from their 3' end, and Read 2 sequences will be trimmed by 2 bp from their 5' end to remove potential methylation-biased bases from the end-repair reaction
File was specified to be a non-directional MspI-digested RRBS sample. Sequences starting with either 'CAA' or 'CGA' will have the first 2 bp trimmed off to remove potential methylation-biased bases from the end-repair reaction
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: --outdir /home/sam/data/geoduck_EPI/20180514_geoduck_trimgalore_rrbs/20180514_geoduck_trimmed_fastqc --threads 16
Output file will be GZIP compressed


This is cutadapt 1.9.1 with Python 2.7.12
Command line parameters: -f fastq -e 0.1 -O 1 -a AGATCGGAAGAGC /home/sam/data/geoduck_EPI/20180514_geoduck_trimgalore_rrbs/EPI-209_S29_L004_R2_001.fastq.gz_qual_trimmed.fastq
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 592.11 s (19 us/read; 3.18 M reads/minute).

=== Summary ===

Total reads processed:              31,419,769
Reads with adapters:                21,614,014 (68.8%)
Reads written (passing filters):    31,419,769 (100.0%)

Total basepairs processed: 3,120,550,477 bp
Total written (filtered):  2,628,782,596 bp (84.2%)

=== Adapter 1 ===

Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 21614014 times.

No. of allowed errors:
0-9 bp: 0; 10-13 bp: 1

Bases preceding removed adapters:
  A: 35.0%
  C: 22.0%
  G: 11.3%
  T: 30.9%
  none/other: 0.7%

Overview of removed sequences
length	count	expect	max.err	error counts
1	8734774	7854942.2	0	8734774
2	264168	1963735.6	0	264168
3	224610	490933.9	0	224610
4	192791	122733.5	0	192791
5	190150	30683.4	0	190150
6	194913	7670.8	0	194913
7	196982	1917.7	0	196982
8	209000	479.4	0	209000
9	196000	119.9	0	195173 827
10	203141	30.0	1	198412 4729
11	182865	7.5	1	177790 5075
12	191450	1.9	1	185766 5684
13	186738	0.5	1	181604 5134
14	205369	0.5	1	199330 6039
15	195656	0.5	1	190271 5385
16	196181	0.5	1	190746 5435
17	202165	0.5	1	196403 5762
18	181166	0.5	1	176280 4886
19	193370	0.5	1	187654 5716
20	190622	0.5	1	185238 5384
21	193328	0.5	1	187765 5563
22	198860	0.5	1	192907 5953
23	189427	0.5	1	183694 5733
24	199822	0.5	1	193557 6265
25	183895	0.5	1	178470 5425
26	182511	0.5	1	176453 6058
27	184136	0.5	1	177440 6696
28	194739	0.5	1	188979 5760
29	184983	0.5	1	179036 5947
30	204058	0.5	1	198106 5952
31	174604	0.5	1	169099 5505
32	189523	0.5	1	184209 5314
33	192826	0.5	1	186716 6110
34	187744	0.5	1	181492 6252
35	187013	0.5	1	182001 5012
36	179254	0.5	1	173687 5567
37	178941	0.5	1	173539 5402
38	164059	0.5	1	159187 4872
39	166796	0.5	1	161285 5511
40	164687	0.5	1	159412 5275
41	169686	0.5	1	164725 4961
42	168623	0.5	1	163921 4702
43	148078	0.5	1	143407 4671
44	153809	0.5	1	148976 4833
45	186123	0.5	1	180924 5199
46	146858	0.5	1	142559 4299
47	116142	0.5	1	112386 3756
48	155224	0.5	1	150892 4332
49	121186	0.5	1	117716 3470
50	121068	0.5	1	117322 3746
51	166414	0.5	1	162381 4033
52	107754	0.5	1	104617 3137
53	110077	0.5	1	107077 3000
54	97287	0.5	1	94592 2695
55	117805	0.5	1	114708 3097
56	112515	0.5	1	109320 3195
57	105922	0.5	1	102845 3077
58	102228	0.5	1	99390 2838
59	98462	0.5	1	95603 2859
60	93993	0.5	1	91182 2811
61	92745	0.5	1	89961 2784
62	94564	0.5	1	91720 2844
63	94361	0.5	1	91394 2967
64	92501	0.5	1	89409 3092
65	95404	0.5	1	92128 3276
66	99458	0.5	1	95616 3842
67	149612	0.5	1	136937 12675
68	755065	0.5	1	735966 19099
69	336790	0.5	1	326080 10710
70	177166	0.5	1	171002 6164
71	89780	0.5	1	86251 3529
72	56865	0.5	1	54449 2416
73	38055	0.5	1	36303 1752
74	28628	0.5	1	27218 1410
75	22781	0.5	1	21660 1121
76	19324	0.5	1	18326 998
77	17129	0.5	1	16193 936
78	14770	0.5	1	13913 857
79	12720	0.5	1	11998 722
80	11018	0.5	1	10351 667
81	9430	0.5	1	8848 582
82	7978	0.5	1	7469 509
83	6881	0.5	1	6440 441
84	5905	0.5	1	5462 443
85	4791	0.5	1	4440 351
86	4295	0.5	1	3927 368
87	4050	0.5	1	3700 350
88	4120	0.5	1	3775 345
89	4646	0.5	1	4212 434
90	5934	0.5	1	5409 525
91	8230	0.5	1	7513 717
92	12210	0.5	1	11133 1077
93	26572	0.5	1	24349 2223
94	77954	0.5	1	72243 5711
95	131539	0.5	1	122540 8999
96	55073	0.5	1	51211 3862
97	35038	0.5	1	32570 2468
98	14394	0.5	1	13380 1014
99	14041	0.5	1	13023 1018
100	16128	0.5	1	14975 1153
101	33498	0.5	1	30322 3176


RUN STATISTICS FOR INPUT FILE: EPI-209_S29_L004_R2_001.fastq.gz
=============================================
31419769 sequences processed in total
Sequences were truncated to a varying degree because of deteriorating qualities (Phred score quality cutoff: 20): 3859534 (12.3%)
RRBS reads trimmed by additional 2 bp when adapter contamination was detected: 0 (0.0%)
RRBS reads trimmed by 2 bp at the start when read started with CAA (950674) or CGA (329245) in total: 1279919 (4.1%)

Total number of sequences analysed for the sequence pair length validation: 31419769

Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 1779807 (5.66%)