SUMMARISING RUN PARAMETERS
==========================
Input filename: EPI-227_S35_L004_R2_001.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.4.4_dev
Cutadapt version: 1.9.1
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
File was specified to be an MspI-digested RRBS sample. Read 1 sequences with adapter contamination will be trimmed a further 2 bp from their 3' end, and Read 2 sequences will be trimmed by 2 bp from their 5' end to remove potential methylation-biased bases from the end-repair reaction
File was specified to be a non-directional MspI-digested RRBS sample. Sequences starting with either 'CAA' or 'CGA' will have the first 2 bp trimmed off to remove potential methylation-biased bases from the end-repair reaction
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: --outdir /home/sam/data/geoduck_EPI/20180514_geoduck_trimgalore_rrbs/20180514_geoduck_trimmed_fastqc --threads 16
Output file will be GZIP compressed


This is cutadapt 1.9.1 with Python 2.7.12
Command line parameters: -f fastq -e 0.1 -O 1 -a AGATCGGAAGAGC /home/sam/data/geoduck_EPI/20180514_geoduck_trimgalore_rrbs/EPI-227_S35_L004_R2_001.fastq.gz_qual_trimmed.fastq
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 294.47 s (19 us/read; 3.24 M reads/minute).

=== Summary ===

Total reads processed:              15,898,223
Reads with adapters:                11,357,796 (71.4%)
Reads written (passing filters):    15,898,223 (100.0%)

Total basepairs processed: 1,549,620,960 bp
Total written (filtered):  1,203,933,001 bp (77.7%)

=== Adapter 1 ===

Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 11357796 times.

No. of allowed errors:
0-9 bp: 0; 10-13 bp: 1

Bases preceding removed adapters:
  A: 31.8%
  C: 20.0%
  G: 16.1%
  T: 30.9%
  none/other: 1.3%

Overview of removed sequences
length	count	expect	max.err	error counts
1	3955798	3974555.8	0	3955798
2	136333	993638.9	0	136333
3	98864	248409.7	0	98864
4	82368	62102.4	0	82368
5	80534	15525.6	0	80534
6	82671	3881.4	0	82671
7	81418	970.4	0	81418
8	85537	242.6	0	85537
9	81127	60.6	0	80760 367
10	83368	15.2	1	81278 2090
11	77785	3.8	1	75443 2342
12	81136	0.9	1	78467 2669
13	78103	0.2	1	75738 2365
14	86577	0.2	1	83838 2739
15	81527	0.2	1	79046 2481
16	82612	0.2	1	80248 2364
17	87108	0.2	1	84354 2754
18	78145	0.2	1	75861 2284
19	83138	0.2	1	80524 2614
20	81133	0.2	1	78554 2579
21	82309	0.2	1	79742 2567
22	84765	0.2	1	82055 2710
23	84062	0.2	1	81396 2666
24	89131	0.2	1	86156 2975
25	80742	0.2	1	78281 2461
26	81313	0.2	1	78349 2964
27	82983	0.2	1	79734 3249
28	87089	0.2	1	84151 2938
29	82274	0.2	1	79428 2846
30	92313	0.2	1	89442 2871
31	79650	0.2	1	76971 2679
32	84986	0.2	1	82345 2641
33	88102	0.2	1	84949 3153
34	86728	0.2	1	83460 3268
35	86747	0.2	1	84254 2493
36	83900	0.2	1	81181 2719
37	84181	0.2	1	81456 2725
38	77996	0.2	1	75485 2511
39	79417	0.2	1	76653 2764
40	79107	0.2	1	76323 2784
41	81242	0.2	1	78611 2631
42	80669	0.2	1	78166 2503
43	72353	0.2	1	69986 2367
44	74687	0.2	1	72179 2508
45	93341	0.2	1	90486 2855
46	72356	0.2	1	70015 2341
47	57231	0.2	1	55269 1962
48	77101	0.2	1	74747 2354
49	58571	0.2	1	56788 1783
50	60296	0.2	1	58366 1930
51	82607	0.2	1	80461 2146
52	54498	0.2	1	52824 1674
53	55618	0.2	1	54032 1586
54	49767	0.2	1	48244 1523
55	59499	0.2	1	57748 1751
56	55828	0.2	1	54186 1642
57	53261	0.2	1	51623 1638
58	52509	0.2	1	50901 1608
59	49705	0.2	1	48178 1527
60	47800	0.2	1	46295 1505
61	48351	0.2	1	46806 1545
62	48640	0.2	1	47024 1616
63	49084	0.2	1	47337 1747
64	48949	0.2	1	47129 1820
65	53475	0.2	1	51312 2163
66	62659	0.2	1	59841 2818
67	119621	0.2	1	107144 12477
68	807328	0.2	1	785922 21406
69	378101	0.2	1	365571 12530
70	209465	0.2	1	202186 7279
71	103265	0.2	1	99212 4053
72	61903	0.2	1	59393 2510
73	37412	0.2	1	35677 1735
74	27118	0.2	1	25757 1361
75	20836	0.2	1	19785 1051
76	16713	0.2	1	15762 951
77	14084	0.2	1	13311 773
78	12135	0.2	1	11390 745
79	10765	0.2	1	10087 678
80	9445	0.2	1	8834 611
81	8111	0.2	1	7558 553
82	7292	0.2	1	6749 543
83	6614	0.2	1	6128 486
84	5752	0.2	1	5319 433
85	5440	0.2	1	4983 457
86	5014	0.2	1	4550 464
87	5220	0.2	1	4733 487
88	5618	0.2	1	5085 533
89	6397	0.2	1	5876 521
90	8184	0.2	1	7415 769
91	11679	0.2	1	10678 1001
92	17547	0.2	1	16049 1498
93	38226	0.2	1	35145 3081
94	116093	0.2	1	108311 7782
95	198434	0.2	1	185734 12700
96	82887	0.2	1	77554 5333
97	50087	0.2	1	46854 3233
98	19379	0.2	1	18136 1243
99	18430	0.2	1	17168 1262
100	17979	0.2	1	16757 1222
101	32048	0.2	1	29556 2492


RUN STATISTICS FOR INPUT FILE: EPI-227_S35_L004_R2_001.fastq.gz
=============================================
15898223 sequences processed in total
Sequences were truncated to a varying degree because of deteriorating qualities (Phred score quality cutoff: 20):	3065015 (19.3%)
RRBS reads trimmed by additional 2 bp when adapter contamination was detected:	0 (0.0%)
RRBS reads trimmed by 2 bp at the start when read started with CAA (413863) or CGA (150699) in total:	564562 (3.6%)

Total number of sequences analysed for the sequence pair length validation: 15898223

Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 2366050 (14.88%)