No quality encoding type selected. Assuming that the data provided uses Sanger encoded Phred scores (default)
Path to Cutadapt set as: 'cutadapt' (default)
1.16
Cutadapt seems to be working fine (tested command 'cutadapt --version')

AUTO-DETECTING ADAPTER TYPE
===========================
Attempting to auto-detect adapter type from the first 1 million sequences of the first file (>> Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001.fastq.gz <<)

Found perfect matches for the following adapter sequences:
Adapter type    Count    Sequence         Sequences analysed    Percentage
Illumina        2009     AGATCGGAAGAGC    1000000               0.20
Nextera         1234     CTGTCTCTTATA     1000000               0.12
smallRNA        2        TGGAATTCTCGG     1000000               0.00
Using Illumina adapter for trimming (count: 2009). Second best hit was Nextera (count: 1234)

Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001.fastq.gz_trimming_report.txt'

SUMMARISING RUN PARAMETERS
==========================
Input filename: Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.4.4_dev
Cutadapt version: 1.16
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28'
Output file(s) will be GZIP compressed

Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001_trimmed.fq.gz

>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter
sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001.fastq.gz <<< 10000000 sequences processed 20000000 sequences processed 30000000 sequences processed 40000000 sequences processed 50000000 sequences processed 60000000 sequences processed 70000000 sequences processed 80000000 sequences processed 90000000 sequences processed 100000000 sequences processed 110000000 sequences processed 120000000 sequences processed 130000000 sequences processed 140000000 sequences processed 150000000 sequences processed 160000000 sequences processed 170000000 sequences processed 180000000 sequences processed 190000000 sequences processed 200000000 sequences processed 210000000 sequences processed 220000000 sequences processed 230000000 sequences processed 240000000 sequences processed 250000000 sequences processed 260000000 sequences processed 270000000 sequences processed 280000000 sequences processed 290000000 sequences processed 300000000 sequences processed 310000000 sequences processed 320000000 sequences processed 330000000 sequences processed 340000000 sequences processed 350000000 sequences processed 360000000 sequences processed This is cutadapt 1.16 with Python 2.7.14 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001.fastq.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 11149.52 s (31 us/read; 1.94 M reads/minute). === Summary === Total reads processed: 360,317,079 Reads with adapters: 80,220,248 (22.3%) Reads written (passing filters): 360,317,079 (100.0%) Total basepairs processed: 43,967,642,352 bp Quality-trimmed: 1,560,685,314 bp (3.5%) Total written (filtered): 42,238,637,269 bp (96.1%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 80220248 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 32.7% C: 23.8% G: 13.9% T: 29.5% none/other: 0.1% Overview of removed sequences length count expect max.err error counts 1 60338788 90079269.8 0 60338788 2 13605218 22519817.4 0 13605218 3 3509248 5629954.4 0 3509248 4 1388109 1407488.6 0 1388109 5 269107 351872.1 0 269107 6 38516 87968.0 0 38516 7 24597 21992.0 0 24597 8 13084 5498.0 0 13084 9 3788 1374.5 0 1722 2066 10 15707 343.6 1 5942 9765 11 4631 85.9 1 1305 3326 12 9306 21.5 1 5226 4080 13 10308 5.4 1 6011 4297 14 7214 5.4 1 4454 2760 15 9487 5.4 1 5666 3821 16 7360 5.4 1 4595 2765 17 5695 5.4 1 2558 3137 18 17902 5.4 1 11159 6743 19 16909 5.4 1 12269 4640 20 3927 5.4 1 1688 2239 21 2448 5.4 1 1009 1439 22 11394 5.4 1 7038 4356 23 8390 5.4 1 5384 3006 24 11059 5.4 1 7864 3195 25 5527 5.4 1 2843 2684 26 7450 5.4 1 4873 2577 27 9300 5.4 1 5970 3330 28 16610 5.4 1 11225 5385 29 3060 5.4 1 1401 1659 30 8133 5.4 1 4772 3361 31 11251 5.4 1 7831 3420 32 11086 5.4 1 7279 3807 33 7409 5.4 1 4966 2443 34 18112 5.4 1 12473 5639 35 3598 5.4 1 1708 1890 36 10948 5.4 1 6916 4032 37 9972 5.4 1 6855 3117 38 12068 5.4 1 8008 4060 39 10475 5.4 1 7070 3405 40 7730 5.4 1 5298 2432 41 15577 5.4 1 10426 5151 42 14206 5.4 1 9693 4513 43 13143 5.4 1 9768 3375 44 9748 5.4 1 6353 3395 45 3463 5.4 1 1943 1520 46 5989 5.4 1 4121 1868 47 3489 5.4 1 1874 1615 48 12721 5.4 1 8917 3804 49 12876 5.4 1 9453 3423 50 4344 5.4 1 2587 1757 51 8241 5.4 1 5807 2434 52 14678 5.4 1 10804 3874 53 10691 5.4 1 7559 3132 54 6585 5.4 1 4001 2584 55 22691 5.4 1 17087 5604 56 12588 5.4 1 9530 3058 57 8760 5.4 1 6001 2759 58 7707 5.4 1 5513 2194 59 16258 5.4 1 12946 3312 60 4532 5.4 1 2802 1730 61 2863 5.4 1 1719 1144 62 14085 5.4 1 10944 3141 63 7437 5.4 1 4926 2511 64 6375 5.4 1 4354 2021 65 8460 5.4 1 6271 2189 66 22175 5.4 1 17281 4894 67 10442 5.4 1 7941 2501 68 9244 5.4 1 6990 2254 69 10026 5.4 1 7687 2339 70 9896 5.4 1 7644 2252 71 10434 5.4 1 8167 2267 72 11523 5.4 1 8923 
2600 73 12491 5.4 1 9955 2536 74 13421 5.4 1 10746 2675 75 23025 5.4 1 15119 7906 76 58275 5.4 1 51215 7060 77 34068 5.4 1 29418 4650 78 19389 5.4 1 15709 3680 79 12783 5.4 1 9957 2826 80 9794 5.4 1 7282 2512 81 7950 5.4 1 5779 2171 82 7225 5.4 1 5102 2123 83 6698 5.4 1 4721 1977 84 6438 5.4 1 4485 1953 85 6426 5.4 1 4408 2018 86 6221 5.4 1 4277 1944 87 6176 5.4 1 4261 1915 88 6119 5.4 1 4198 1921 89 5828 5.4 1 3925 1903 90 5873 5.4 1 4054 1819 91 5786 5.4 1 3959 1827 92 5637 5.4 1 3802 1835 93 5482 5.4 1 3764 1718 94 5299 5.4 1 3668 1631 95 5376 5.4 1 3657 1719 96 5158 5.4 1 3560 1598 97 5088 5.4 1 3510 1578 98 5111 5.4 1 3454 1657 99 4842 5.4 1 3330 1512 100 4810 5.4 1 3333 1477 101 4645 5.4 1 3175 1470 102 4645 5.4 1 3210 1435 103 4531 5.4 1 3152 1379 104 4251 5.4 1 2924 1327 105 4186 5.4 1 2897 1289 106 4246 5.4 1 3014 1232 107 4096 5.4 1 2835 1261 108 3968 5.4 1 2782 1186 109 3865 5.4 1 2738 1127 110 3704 5.4 1 2601 1103 111 3388 5.4 1 2329 1059 112 3440 5.4 1 2398 1042 113 3310 5.4 1 2321 989 114 3078 5.4 1 2196 882 115 3117 5.4 1 2226 891 116 2949 5.4 1 2077 872 117 2757 5.4 1 1966 791 118 2541 5.4 1 1804 737 119 2416 5.4 1 1735 681 120 2306 5.4 1 1624 682 121 2288 5.4 1 1583 705 122 2058 5.4 1 1438 620 123 1907 5.4 1 1317 590 124 1907 5.4 1 1342 565 125 1725 5.4 1 1237 488 126 1458 5.4 1 1056 402 127 1372 5.4 1 1025 347 128 1171 5.4 1 916 255 129 937 5.4 1 713 224 130 782 5.4 1 614 168 131 646 5.4 1 506 140 132 565 5.4 1 430 135 133 517 5.4 1 413 104 134 485 5.4 1 356 129 135 428 5.4 1 307 121 136 362 5.4 1 247 115 137 298 5.4 1 193 105 138 277 5.4 1 187 90 139 274 5.4 1 167 107 140 271 5.4 1 147 124 141 283 5.4 1 136 147 142 200 5.4 1 95 105 143 223 5.4 1 126 97 144 255 5.4 1 128 127 145 269 5.4 1 130 139 146 341 5.4 1 178 163 147 424 5.4 1 224 200 148 731 5.4 1 408 323 149 1575 5.4 1 862 713 150 5215 5.4 1 2875 2340 151 2608 5.4 1 1396 1212 RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001.fastq.gz 
=============================================
360317079 sequences processed in total
The length threshold of paired-end sequences gets evaluated later on (in the validation step)

Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001.fastq.gz_trimming_report.txt'

SUMMARISING RUN PARAMETERS
==========================
Input filename: Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.4.4_dev
Cutadapt version: 1.16
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28'
Output file(s) will be GZIP compressed

Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001_trimmed.fq.gz

>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001.fastq.gz <<<
10000000 sequences processed
20000000 sequences processed
30000000 sequences processed
40000000 sequences processed
50000000 sequences processed
60000000 sequences processed
70000000 sequences processed
80000000 sequences processed
90000000 sequences processed
100000000 sequences processed
110000000 sequences processed
120000000 sequences processed
130000000 sequences processed
140000000 sequences processed
150000000 sequences processed
160000000 sequences processed
170000000 sequences processed
180000000 sequences processed 190000000 sequences processed 200000000 sequences processed 210000000 sequences processed 220000000 sequences processed 230000000 sequences processed 240000000 sequences processed 250000000 sequences processed 260000000 sequences processed 270000000 sequences processed 280000000 sequences processed 290000000 sequences processed 300000000 sequences processed 310000000 sequences processed 320000000 sequences processed 330000000 sequences processed 340000000 sequences processed 350000000 sequences processed 360000000 sequences processed This is cutadapt 1.16 with Python 2.7.14 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001.fastq.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 12989.08 s (36 us/read; 1.66 M reads/minute). === Summary === Total reads processed: 360,317,079 Reads with adapters: 88,689,096 (24.6%) Reads written (passing filters): 360,317,079 (100.0%) Total basepairs processed: 45,099,303,736 bp Quality-trimmed: 3,640,004,759 bp (8.1%) Total written (filtered): 41,159,151,426 bp (91.3%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 88689096 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 31.7% C: 25.7% G: 14.1% T: 28.4% none/other: 0.1% Overview of removed sequences length count expect max.err error counts 1 64753352 90079269.8 0 64753352 2 15118719 22519817.4 0 15118719 3 3729572 5629954.4 0 3729572 4 1305726 1407488.6 0 1305726 5 282692 351872.1 0 282692 6 65920 87968.0 0 65920 7 74364 21992.0 0 74364 8 39672 5498.0 0 39672 9 10697 1374.5 0 7957 2740 10 45048 343.6 1 22457 22591 11 16931 85.9 1 6797 10134 12 33507 21.5 1 18660 14847 13 18755 5.4 1 9059 9696 14 77663 5.4 1 49412 28251 15 19148 5.4 1 11210 7938 16 13449 5.4 1 7018 6431 17 46580 5.4 1 30044 16536 18 7407 5.4 1 3684 3723 19 36440 5.4 1 24063 12377 20 23547 5.4 1 15404 8143 21 3906 5.4 1 1601 2305 22 6888 5.4 1 3092 3796 23 31056 5.4 1 19155 11901 24 88045 5.4 1 57413 30632 25 30799 5.4 1 19866 10933 26 36196 5.4 1 25396 10800 27 5129 5.4 1 2416 2713 28 34937 5.4 1 23009 11928 29 7029 5.4 1 3389 3640 30 42551 5.4 1 28247 14304 31 8699 5.4 1 4768 3931 32 47855 5.4 1 32717 15138 33 61631 5.4 1 46017 15614 34 6085 5.4 1 2982 3103 35 21581 5.4 1 12096 9485 36 21455 5.4 1 12945 8510 37 34351 5.4 1 24705 9646 38 20076 5.4 1 12167 7909 39 19987 5.4 1 13873 6114 40 9385 5.4 1 5173 4212 41 36717 5.4 1 24830 11887 42 71632 5.4 1 49925 21707 43 9887 5.4 1 6005 3882 44 38952 5.4 1 26928 12024 45 79259 5.4 1 56070 23189 46 26512 5.4 1 18489 8023 47 9088 5.4 1 5336 3752 48 74256 5.4 1 53627 20629 49 39821 5.4 1 28381 11440 50 20324 5.4 1 12864 7460 51 162789 5.4 1 120770 42019 52 31359 5.4 1 20923 10436 53 23724 5.4 1 16238 7486 54 23206 5.4 1 17066 6140 55 45128 5.4 1 32754 12374 56 27964 5.4 1 19730 8234 57 37238 5.4 1 27604 9634 58 45726 5.4 1 34990 10736 59 32778 5.4 1 24460 8318 60 32782 5.4 1 24959 7823 61 34414 5.4 1 26281 8133 62 37938 5.4 1 28887 9051 63 48599 5.4 1 37903 10696 64 54110 5.4 1 42463 11647 65 59526 5.4 1 47451 12075 66 64166 5.4 1 51851 12315 67 97138 5.4 1 75275 21863 68 336948 5.4 1 303077 33871 
69 153767 5.4 1 137728 16039 70 53931 5.4 1 44491 9440 71 32760 5.4 1 26004 6756 72 21295 5.4 1 15869 5426 73 18768 5.4 1 13719 5049 74 15872 5.4 1 11155 4717 75 15804 5.4 1 11156 4648 76 14814 5.4 1 10205 4609 77 14079 5.4 1 9578 4501 78 13655 5.4 1 9317 4338 79 13972 5.4 1 9433 4539 80 13906 5.4 1 9318 4588 81 14585 5.4 1 9682 4903 82 14091 5.4 1 9379 4712 83 13670 5.4 1 9016 4654 84 13368 5.4 1 8829 4539 85 13338 5.4 1 8815 4523 86 12781 5.4 1 8370 4411 87 12074 5.4 1 7866 4208 88 12093 5.4 1 7932 4161 89 11792 5.4 1 7575 4217 90 11746 5.4 1 7667 4079 91 11488 5.4 1 7466 4022 92 10807 5.4 1 6964 3843 93 10319 5.4 1 6692 3627 94 10417 5.4 1 6799 3618 95 9889 5.4 1 6369 3520 96 10264 5.4 1 6712 3552 97 10210 5.4 1 6675 3535 98 9677 5.4 1 6295 3382 99 9855 5.4 1 6448 3407 100 10309 5.4 1 6852 3457 101 10187 5.4 1 6734 3453 102 10110 5.4 1 6782 3328 103 10134 5.4 1 6763 3371 104 9856 5.4 1 6548 3308 105 9690 5.4 1 6432 3258 106 9644 5.4 1 6483 3161 107 9175 5.4 1 6221 2954 108 9012 5.4 1 6138 2874 109 8617 5.4 1 5849 2768 110 8348 5.4 1 5665 2683 111 8148 5.4 1 5526 2622 112 8056 5.4 1 5465 2591 113 7615 5.4 1 5133 2482 114 7471 5.4 1 5029 2442 115 7212 5.4 1 4856 2356 116 6823 5.4 1 4570 2253 117 6711 5.4 1 4491 2220 118 6527 5.4 1 4378 2149 119 6212 5.4 1 4141 2071 120 5847 5.4 1 3907 1940 121 5372 5.4 1 3547 1825 122 5203 5.4 1 3453 1750 123 4806 5.4 1 3214 1592 124 4515 5.4 1 2955 1560 125 4058 5.4 1 2678 1380 126 3676 5.4 1 2455 1221 127 3085 5.4 1 1972 1113 128 2674 5.4 1 1709 965 129 2058 5.4 1 1250 808 130 1711 5.4 1 1012 699 131 1369 5.4 1 825 544 132 1121 5.4 1 640 481 133 980 5.4 1 525 455 134 760 5.4 1 415 345 135 627 5.4 1 326 301 136 513 5.4 1 265 248 137 402 5.4 1 189 213 138 269 5.4 1 123 146 139 216 5.4 1 95 121 140 183 5.4 1 74 109 141 151 5.4 1 53 98 142 152 5.4 1 54 98 143 130 5.4 1 40 90 144 110 5.4 1 29 81 145 131 5.4 1 40 91 146 110 5.4 1 34 76 147 136 5.4 1 50 86 148 288 5.4 1 120 168 149 844 5.4 1 443 401 150 3002 5.4 1 1628 1374 151 842 5.4 
1 425 417

RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001.fastq.gz
=============================================
360317079 sequences processed in total
The length threshold of paired-end sequences gets evaluated later on (in the validation step)

Validate paired-end files Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001_trimmed.fq.gz
file_1: Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001_trimmed.fq.gz, file_2: Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001_trimmed.fq.gz

>>>>> Now validing the length of the 2 paired-end infiles: Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001_trimmed.fq.gz <<<<<
Writing validated paired-end read 1 reads to Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001_val_1.fq.gz
Writing validated paired-end read 2 reads to Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001_val_2.fq.gz

Total number of sequences analysed: 360317079

Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 72308877 (20.07%)

>>> Now running FastQC on the validated data Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001_val_1.fq.gz<<<

Started analysis of Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001_val_1.fq.gz
Approx 5% complete for Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001_val_1.fq.gz
Approx 10% complete for Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001_val_1.fq.gz
Approx 15% complete for Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001_val_1.fq.gz
Approx 20% complete for Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001_val_1.fq.gz
Approx 25% complete for Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001_val_1.fq.gz
Approx 30% complete for Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001_val_1.fq.gz
Approx 35% complete for Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001_val_1.fq.gz
Approx 40% complete for Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001_val_1.fq.gz
Approx 45% complete for Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001_val_1.fq.gz
Approx 50% complete for
Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001_val_1.fq.gz Approx 55% complete for Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001_val_1.fq.gz Approx 60% complete for Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001_val_1.fq.gz Approx 65% complete for Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001_val_1.fq.gz Approx 70% complete for Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001_val_1.fq.gz Approx 75% complete for Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001_val_1.fq.gz Approx 80% complete for Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001_val_1.fq.gz Approx 85% complete for Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001_val_1.fq.gz Approx 90% complete for Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001_val_1.fq.gz Approx 95% complete for Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001_val_1.fq.gz Analysis complete for Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001_val_1.fq.gz >>> Now running FastQC on the validated data Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001_val_2.fq.gz<<< Started analysis of Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001_val_2.fq.gz Approx 5% complete for Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001_val_2.fq.gz Approx 10% complete for Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001_val_2.fq.gz Approx 15% complete for Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001_val_2.fq.gz Approx 20% complete for Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001_val_2.fq.gz Approx 25% complete for Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001_val_2.fq.gz Approx 30% complete for Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001_val_2.fq.gz Approx 35% complete for Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001_val_2.fq.gz Approx 40% complete for Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001_val_2.fq.gz Approx 45% complete for Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001_val_2.fq.gz Approx 50% complete for Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001_val_2.fq.gz Approx 55% complete for Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001_val_2.fq.gz Approx 60% complete for Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001_val_2.fq.gz Approx 65% complete for 
Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001_val_2.fq.gz Approx 70% complete for Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001_val_2.fq.gz Approx 75% complete for Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001_val_2.fq.gz Approx 80% complete for Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001_val_2.fq.gz Approx 85% complete for Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001_val_2.fq.gz Approx 90% complete for Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001_val_2.fq.gz Approx 95% complete for Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001_val_2.fq.gz Analysis complete for Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001_val_2.fq.gz Deleting both intermediate output files Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001_trimmed.fq.gz ==================================================================================================== Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R1_001.fastq.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R1_001.fastq.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R1_001_trimmed.fq.gz >>> Now 
performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R1_001.fastq.gz <<< This is cutadapt 1.16 with Python 2.7.14 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R1_001.fastq.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 9.84 s (37 us/read; 1.63 M reads/minute). === Summary === Total reads processed: 266,864 Reads with adapters: 66,657 (25.0%) Reads written (passing filters): 266,864 (100.0%) Total basepairs processed: 33,661,250 bp Quality-trimmed: 1,161,529 bp (3.5%) Total written (filtered): 32,363,429 bp (96.1%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 66657 times. No. of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 35.2% C: 22.7% G: 13.5% T: 28.1% none/other: 0.6% Overview of removed sequences length count expect max.err error counts 1 51635 66716.0 0 51635 2 10308 16679.0 0 10308 3 2918 4169.8 0 2918 4 992 1042.4 0 992 5 157 260.6 0 157 6 25 65.2 0 25 7 6 16.3 0 6 8 4 4.1 0 4 9 1 1.0 0 0 1 10 8 0.3 1 6 2 11 2 0.1 1 0 2 12 1 0.0 1 1 13 1 0.0 1 0 1 14 3 0.0 1 1 2 15 3 0.0 1 3 17 1 0.0 1 0 1 18 3 0.0 1 1 2 19 3 0.0 1 2 1 22 4 0.0 1 1 3 23 3 0.0 1 1 2 24 5 0.0 1 4 1 25 2 0.0 1 1 1 26 3 0.0 1 1 2 27 2 0.0 1 2 28 2 0.0 1 2 30 4 0.0 1 2 2 31 3 0.0 1 2 1 32 2 0.0 1 2 33 2 0.0 1 2 34 5 0.0 1 3 2 35 3 0.0 1 3 36 6 0.0 1 1 5 37 3 0.0 1 3 38 1 0.0 1 1 39 2 0.0 1 2 41 10 0.0 1 6 4 42 4 0.0 1 4 43 4 0.0 1 4 44 3 0.0 1 2 1 45 1 0.0 1 1 47 1 0.0 1 0 1 48 2 0.0 1 2 49 1 0.0 1 1 50 1 0.0 1 1 51 3 0.0 1 3 52 4 0.0 1 1 3 54 1 0.0 1 1 55 7 0.0 1 7 56 2 0.0 1 2 57 1 0.0 1 0 1 59 2 0.0 1 0 2 61 2 0.0 1 1 1 62 3 0.0 1 3 63 1 0.0 1 0 1 64 2 0.0 1 0 2 65 1 0.0 1 1 66 3 0.0 1 3 67 2 0.0 1 1 1 68 2 0.0 1 1 1 69 4 0.0 1 4 70 4 0.0 1 3 1 71 4 0.0 1 4 72 2 0.0 1 1 1 73 6 0.0 1 5 1 74 10 0.0 1 5 5 75 73 0.0 1 
7 66 76 82 0.0 1 20 62 77 43 0.0 1 13 30 78 38 0.0 1 14 24 79 20 0.0 1 5 15 80 11 0.0 1 4 7 81 5 0.0 1 1 4 82 4 0.0 1 2 2 83 2 0.0 1 0 2 84 3 0.0 1 1 2 85 1 0.0 1 0 1 86 2 0.0 1 1 1 87 2 0.0 1 1 1 88 2 0.0 1 1 1 89 1 0.0 1 0 1 90 2 0.0 1 1 1 91 1 0.0 1 1 92 2 0.0 1 1 1 93 2 0.0 1 0 2 95 4 0.0 1 0 4 96 1 0.0 1 1 98 3 0.0 1 2 1 99 3 0.0 1 1 2 100 2 0.0 1 2 103 2 0.0 1 1 1 104 2 0.0 1 2 105 2 0.0 1 2 111 2 0.0 1 1 1 112 1 0.0 1 1 113 1 0.0 1 0 1 115 2 0.0 1 2 117 1 0.0 1 1 118 1 0.0 1 0 1 119 1 0.0 1 1 121 1 0.0 1 1 122 3 0.0 1 2 1 124 1 0.0 1 0 1 126 2 0.0 1 1 1 128 2 0.0 1 1 1 129 1 0.0 1 1 130 4 0.0 1 2 2 131 1 0.0 1 1 134 1 0.0 1 1 135 2 0.0 1 2 138 1 0.0 1 0 1 139 1 0.0 1 0 1 140 1 0.0 1 0 1 141 1 0.0 1 1 142 1 0.0 1 1 143 1 0.0 1 1 145 1 0.0 1 0 1 146 1 0.0 1 1 147 3 0.0 1 2 1 148 5 0.0 1 0 5 149 11 0.0 1 2 9 150 48 0.0 1 7 41 151 24 0.0 1 5 19 RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R1_001.fastq.gz ============================================= 266864 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R2_001.fastq.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R2_001.fastq.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir 
/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R2_001_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R2_001.fastq.gz <<< This is cutadapt 1.16 with Python 2.7.14 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R2_001.fastq.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 8.47 s (32 us/read; 1.89 M reads/minute). === Summary === Total reads processed: 266,864 Reads with adapters: 56,715 (21.3%) Reads written (passing filters): 266,864 (100.0%) Total basepairs processed: 31,541,791 bp Quality-trimmed: 5,198,294 bp (16.5%) Total written (filtered): 26,213,777 bp (83.1%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 56715 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 29.8% C: 25.6% G: 13.3% T: 30.4% none/other: 0.9% Overview of removed sequences length count expect max.err error counts 1 42033 66716.0 0 42033 2 10351 16679.0 0 10351 3 2329 4169.8 0 2329 4 905 1042.4 0 905 5 171 260.6 0 171 6 25 65.2 0 25 7 13 16.3 0 13 8 4 4.1 0 4 9 6 1.0 0 4 2 10 16 0.3 1 5 11 11 6 0.1 1 3 3 12 8 0.0 1 5 3 13 1 0.0 1 1 14 13 0.0 1 6 7 15 4 0.0 1 3 1 16 5 0.0 1 3 2 17 5 0.0 1 4 1 18 3 0.0 1 2 1 19 7 0.0 1 4 3 20 6 0.0 1 3 3 21 1 0.0 1 1 22 2 0.0 1 2 23 2 0.0 1 2 24 15 0.0 1 6 9 25 5 0.0 1 2 3 26 5 0.0 1 4 1 27 1 0.0 1 1 28 10 0.0 1 5 5 29 2 0.0 1 0 2 30 7 0.0 1 5 2 32 3 0.0 1 1 2 33 11 0.0 1 9 2 34 1 0.0 1 0 1 35 5 0.0 1 4 1 36 3 0.0 1 2 1 37 4 0.0 1 4 38 1 0.0 1 1 39 3 0.0 1 2 1 40 3 0.0 1 0 3 41 8 0.0 1 6 2 42 10 0.0 1 6 4 43 1 0.0 1 0 1 44 6 0.0 1 6 45 9 0.0 1 7 2 46 9 0.0 1 7 2 47 3 0.0 1 1 2 48 14 0.0 1 11 3 49 4 0.0 1 3 1 50 3 0.0 1 3 51 13 0.0 1 9 4 52 7 0.0 1 3 4 53 1 0.0 1 1 54 6 0.0 1 4 2 55 7 0.0 1 4 3 56 1 0.0 1 1 57 6 0.0 1 3 3 58 12 0.0 1 4 8 59 8 0.0 1 5 3 60 5 0.0 1 4 1 61 6 0.0 1 4 2 62 11 0.0 1 9 2 63 10 0.0 1 10 64 11 0.0 1 9 2 65 7 0.0 1 5 2 66 10 0.0 1 8 2 67 129 0.0 1 5 124 68 181 0.0 1 72 109 69 49 0.0 1 27 22 70 35 0.0 1 4 31 71 14 0.0 1 8 6 72 3 0.0 1 2 1 73 6 0.0 1 4 2 74 3 0.0 1 1 2 75 5 0.0 1 1 4 76 1 0.0 1 1 77 2 0.0 1 0 2 78 1 0.0 1 1 79 1 0.0 1 1 81 3 0.0 1 3 82 2 0.0 1 2 83 2 0.0 1 2 84 2 0.0 1 0 2 85 2 0.0 1 0 2 86 2 0.0 1 0 2 87 1 0.0 1 1 88 3 0.0 1 2 1 89 1 0.0 1 0 1 90 4 0.0 1 3 1 91 4 0.0 1 2 2 94 2 0.0 1 1 1 95 1 0.0 1 1 96 3 0.0 1 1 2 97 2 0.0 1 0 2 98 2 0.0 1 0 2 99 1 0.0 1 0 1 100 2 0.0 1 0 2 101 1 0.0 1 0 1 102 1 0.0 1 0 1 103 1 0.0 1 0 1 105 2 0.0 1 0 2 106 1 0.0 1 0 1 107 1 0.0 1 1 109 1 0.0 1 1 110 1 0.0 1 1 111 2 0.0 1 1 1 116 1 0.0 1 0 1 117 1 0.0 1 1 118 2 0.0 1 2 119 1 0.0 1 0 1 122 1 0.0 1 1 123 1 0.0 1 0 1 125 1 0.0 1 0 1 126 1 0.0 1 1 127 1 0.0 1 0 1 128 1 0.0 1 0 1 129 3 0.0 1 2 1 132 1 0.0 1 1 133 1 0.0 1 0 1 140 
1 0.0 1 0 1 143 1 0.0 1 0 1 147 1 0.0 1 0 1 148 3 0.0 1 1 2 149 2 0.0 1 1 1 150 20 0.0 1 3 17 151 8 0.0 1 2 6 RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R2_001.fastq.gz ============================================= 266864 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Validate paired-end files Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R2_001_trimmed.fq.gz file_1: Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R1_001_trimmed.fq.gz, file_2: Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R2_001_trimmed.fq.gz >>>>> Now validing the length of the 2 paired-end infiles: Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R2_001_trimmed.fq.gz <<<<< Writing validated paired-end read 1 reads to Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R1_001_val_1.fq.gz Writing validated paired-end read 2 reads to Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R2_001_val_2.fq.gz Total number of sequences analysed: 266864 Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 62879 (23.56%) >>> Now running FastQC on the validated data Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R1_001_val_1.fq.gz<<< Started analysis of Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R1_001_val_1.fq.gz Approx 5% complete for Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R1_001_val_1.fq.gz Approx 10% complete for Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R1_001_val_1.fq.gz Approx 15% complete for Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R1_001_val_1.fq.gz Approx 20% complete for Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R1_001_val_1.fq.gz Approx 25% complete for Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R1_001_val_1.fq.gz Approx 30% complete for Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R1_001_val_1.fq.gz Approx 35% complete for Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R1_001_val_1.fq.gz Approx 40% complete for Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R1_001_val_1.fq.gz Approx 45% complete 
for Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R1_001_val_1.fq.gz Approx 50% complete for Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R1_001_val_1.fq.gz Approx 55% complete for Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R1_001_val_1.fq.gz Approx 60% complete for Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R1_001_val_1.fq.gz Approx 65% complete for Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R1_001_val_1.fq.gz Approx 70% complete for Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R1_001_val_1.fq.gz Approx 75% complete for Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R1_001_val_1.fq.gz Approx 80% complete for Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R1_001_val_1.fq.gz Approx 85% complete for Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R1_001_val_1.fq.gz Approx 90% complete for Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R1_001_val_1.fq.gz Approx 95% complete for Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R1_001_val_1.fq.gz Analysis complete for Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R1_001_val_1.fq.gz >>> Now running FastQC on the validated data Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R2_001_val_2.fq.gz<<< Started analysis of Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R2_001_val_2.fq.gz Approx 5% complete for Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R2_001_val_2.fq.gz Approx 10% complete for Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R2_001_val_2.fq.gz Approx 15% complete for Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R2_001_val_2.fq.gz Approx 20% complete for Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R2_001_val_2.fq.gz Approx 25% complete for Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R2_001_val_2.fq.gz Approx 30% complete for Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R2_001_val_2.fq.gz Approx 35% complete for Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R2_001_val_2.fq.gz Approx 40% complete for Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R2_001_val_2.fq.gz Approx 45% complete for Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R2_001_val_2.fq.gz Approx 50% complete for Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R2_001_val_2.fq.gz Approx 55% complete for Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R2_001_val_2.fq.gz Approx 60% complete for 
Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R2_001_val_2.fq.gz Approx 65% complete for Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R2_001_val_2.fq.gz Approx 70% complete for Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R2_001_val_2.fq.gz Approx 75% complete for Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R2_001_val_2.fq.gz Approx 80% complete for Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R2_001_val_2.fq.gz Approx 85% complete for Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R2_001_val_2.fq.gz Approx 90% complete for Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R2_001_val_2.fq.gz Approx 95% complete for Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R2_001_val_2.fq.gz Analysis complete for Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R2_001_val_2.fq.gz Deleting both intermediate output files Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-2to4kb-5_S18_L005_R2_001_trimmed.fq.gz ==================================================================================================== Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R1_001.fastq.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R1_001.fastq.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28' Output file(s) will be GZIP compressed Writing final adapter and quality 
trimmed output to Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R1_001_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R1_001.fastq.gz <<< This is cutadapt 1.16 with Python 2.7.14 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R1_001.fastq.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 9.20 s (36 us/read; 1.67 M reads/minute). === Summary === Total reads processed: 256,911 Reads with adapters: 64,054 (24.9%) Reads written (passing filters): 256,911 (100.0%) Total basepairs processed: 32,102,933 bp Quality-trimmed: 1,174,253 bp (3.7%) Total written (filtered): 30,796,322 bp (95.9%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 64054 times. No. of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 35.0% C: 22.9% G: 13.3% T: 28.2% none/other: 0.6% Overview of removed sequences length count expect max.err error counts 1 49428 64227.8 0 49428 2 10009 16056.9 0 10009 3 2799 4014.2 0 2799 4 949 1003.6 0 949 5 204 250.9 0 204 6 20 62.7 0 20 7 10 15.7 0 10 8 3 3.9 0 3 9 3 1.0 0 3 10 10 0.2 1 3 7 11 2 0.1 1 0 2 12 2 0.0 1 1 1 13 4 0.0 1 3 1 14 3 0.0 1 2 1 15 4 0.0 1 3 1 16 2 0.0 1 2 18 4 0.0 1 3 1 19 5 0.0 1 3 2 20 1 0.0 1 0 1 22 6 0.0 1 1 5 23 5 0.0 1 3 2 24 6 0.0 1 5 1 25 1 0.0 1 0 1 26 1 0.0 1 1 27 2 0.0 1 0 2 28 4 0.0 1 2 2 30 3 0.0 1 1 2 31 3 0.0 1 2 1 32 4 0.0 1 3 1 33 2 0.0 1 2 34 1 0.0 1 0 1 35 3 0.0 1 1 2 36 2 0.0 1 1 1 38 3 0.0 1 2 1 39 3 0.0 1 3 41 7 0.0 1 3 4 42 3 0.0 1 3 43 2 0.0 1 1 1 44 3 0.0 1 1 2 45 4 0.0 1 2 2 46 1 0.0 1 1 49 1 0.0 1 1 50 3 0.0 1 2 1 51 1 0.0 1 1 52 1 0.0 1 0 1 53 2 0.0 1 2 54 2 0.0 1 0 2 55 5 0.0 1 4 1 56 7 0.0 1 6 1 57 5 0.0 1 4 1 58 1 0.0 1 1 59 4 0.0 1 4 60 1 0.0 1 1 61 1 0.0 1 1 62 1 0.0 1 1 63 2 0.0 1 2 65 1 0.0 1 1 66 4 0.0 1 4 67 4 0.0 1 4 68 1 
0.0 1 0 1 69 2 0.0 1 2 70 4 0.0 1 3 1 71 2 0.0 1 2 72 7 0.0 1 6 1 73 4 0.0 1 3 1 74 9 0.0 1 5 4 75 77 0.0 1 8 69 76 87 0.0 1 37 50 77 58 0.0 1 25 33 78 30 0.0 1 15 15 79 18 0.0 1 3 15 80 12 0.0 1 4 8 81 9 0.0 1 3 6 82 7 0.0 1 2 5 83 6 0.0 1 5 1 84 4 0.0 1 2 2 85 3 0.0 1 1 2 86 3 0.0 1 1 2 87 4 0.0 1 2 2 88 1 0.0 1 1 89 3 0.0 1 0 3 90 5 0.0 1 2 3 93 2 0.0 1 1 1 94 2 0.0 1 2 95 1 0.0 1 0 1 97 1 0.0 1 0 1 98 1 0.0 1 0 1 101 2 0.0 1 1 1 102 3 0.0 1 1 2 103 2 0.0 1 0 2 104 1 0.0 1 1 106 3 0.0 1 3 107 1 0.0 1 0 1 108 1 0.0 1 0 1 109 2 0.0 1 0 2 110 1 0.0 1 0 1 114 1 0.0 1 0 1 115 1 0.0 1 1 117 1 0.0 1 0 1 118 2 0.0 1 2 119 1 0.0 1 1 121 1 0.0 1 0 1 122 2 0.0 1 0 2 124 1 0.0 1 1 129 1 0.0 1 0 1 130 2 0.0 1 1 1 131 1 0.0 1 0 1 132 1 0.0 1 0 1 134 1 0.0 1 1 135 2 0.0 1 2 137 1 0.0 1 0 1 138 1 0.0 1 0 1 141 1 0.0 1 1 142 1 0.0 1 0 1 145 1 0.0 1 0 1 147 4 0.0 1 0 4 148 5 0.0 1 0 5 149 14 0.0 1 1 13 150 39 0.0 1 7 32 151 13 0.0 1 2 11 RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R1_001.fastq.gz ============================================= 256911 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R2_001.fastq.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R2_001.fastq.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp Running FastQC on the data once trimming has completed Running FastQC with the 
following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R2_001_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R2_001.fastq.gz <<< This is cutadapt 1.16 with Python 2.7.14 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R2_001.fastq.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 8.19 s (32 us/read; 1.88 M reads/minute). === Summary === Total reads processed: 256,911 Reads with adapters: 55,558 (21.6%) Reads written (passing filters): 256,911 (100.0%) Total basepairs processed: 30,598,372 bp Quality-trimmed: 4,787,179 bp (15.6%) Total written (filtered): 25,683,113 bp (83.9%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 55558 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 30.0% C: 25.3% G: 13.3% T: 30.5% none/other: 0.9% Overview of removed sequences length count expect max.err error counts 1 41307 64227.8 0 41307 2 10041 16056.9 0 10041 3 2188 4014.2 0 2188 4 915 1003.6 0 915 5 179 250.9 0 179 6 24 62.7 0 24 7 15 15.7 0 15 8 11 3.9 0 11 9 3 1.0 0 1 2 10 14 0.2 1 4 10 11 6 0.1 1 2 4 12 4 0.0 1 2 2 13 1 0.0 1 0 1 14 10 0.0 1 7 3 15 6 0.0 1 4 2 16 1 0.0 1 1 17 8 0.0 1 4 4 18 1 0.0 1 0 1 19 8 0.0 1 5 3 20 4 0.0 1 3 1 21 3 0.0 1 1 2 22 1 0.0 1 1 23 2 0.0 1 1 1 24 16 0.0 1 13 3 25 5 0.0 1 2 3 26 5 0.0 1 1 4 28 4 0.0 1 4 30 7 0.0 1 6 1 31 2 0.0 1 1 1 32 7 0.0 1 5 2 33 10 0.0 1 8 2 34 2 0.0 1 1 1 35 1 0.0 1 1 36 1 0.0 1 1 37 1 0.0 1 1 38 2 0.0 1 1 1 39 6 0.0 1 2 4 40 3 0.0 1 2 1 41 9 0.0 1 5 4 42 9 0.0 1 4 5 43 3 0.0 1 2 1 44 9 0.0 1 8 1 45 13 0.0 1 10 3 46 7 0.0 1 4 3 47 3 0.0 1 1 2 48 7 0.0 1 3 4 49 6 0.0 1 5 1 50 7 0.0 1 5 2 51 18 0.0 1 12 6 52 9 0.0 1 7 2 53 3 0.0 1 2 1 54 2 0.0 1 2 55 5 0.0 1 3 2 56 2 0.0 1 2 57 7 0.0 1 6 1 58 16 0.0 1 15 1 59 10 0.0 1 8 2 60 2 0.0 1 2 61 2 0.0 1 2 62 6 0.0 1 5 1 63 11 0.0 1 10 1 64 2 0.0 1 2 65 14 0.0 1 10 4 66 12 0.0 1 8 4 67 99 0.0 1 12 87 68 180 0.0 1 70 110 69 82 0.0 1 42 40 70 45 0.0 1 12 33 71 14 0.0 1 7 7 72 4 0.0 1 0 4 73 7 0.0 1 3 4 74 3 0.0 1 1 2 75 3 0.0 1 2 1 77 3 0.0 1 2 1 79 4 0.0 1 3 1 80 6 0.0 1 4 2 81 3 0.0 1 1 2 82 4 0.0 1 2 2 83 1 0.0 1 1 84 1 0.0 1 0 1 85 2 0.0 1 0 2 86 2 0.0 1 1 1 87 1 0.0 1 1 88 1 0.0 1 1 90 2 0.0 1 1 1 91 1 0.0 1 0 1 93 2 0.0 1 1 1 94 2 0.0 1 0 2 95 3 0.0 1 2 1 97 1 0.0 1 1 98 2 0.0 1 1 1 100 1 0.0 1 0 1 101 1 0.0 1 1 103 1 0.0 1 1 105 1 0.0 1 1 106 3 0.0 1 2 1 108 2 0.0 1 1 1 109 1 0.0 1 1 110 1 0.0 1 1 111 2 0.0 1 1 1 113 4 0.0 1 3 1 114 1 0.0 1 0 1 118 3 0.0 1 2 1 119 1 0.0 1 0 1 120 2 0.0 1 1 1 121 1 0.0 1 1 123 2 0.0 1 1 1 124 1 0.0 1 0 1 129 1 0.0 1 1 131 2 0.0 1 1 1 134 1 0.0 1 1 146 1 0.0 1 1 148 2 0.0 1 0 2 149 2 0.0 1 0 2 150 19 0.0 1 4 15 151 6 0.0 1 1 5 RUN STATISTICS FOR 
INPUT FILE: Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R2_001.fastq.gz ============================================= 256911 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Validate paired-end files Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R2_001_trimmed.fq.gz file_1: Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R1_001_trimmed.fq.gz, file_2: Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R2_001_trimmed.fq.gz >>>>> Now validing the length of the 2 paired-end infiles: Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R2_001_trimmed.fq.gz <<<<< Writing validated paired-end read 1 reads to Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R1_001_val_1.fq.gz Writing validated paired-end read 2 reads to Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R2_001_val_2.fq.gz Total number of sequences analysed: 256911 Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 59359 (23.10%) >>> Now running FastQC on the validated data Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R1_001_val_1.fq.gz<<< Started analysis of Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R1_001_val_1.fq.gz Approx 5% complete for Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R1_001_val_1.fq.gz Approx 10% complete for Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R1_001_val_1.fq.gz Approx 15% complete for Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R1_001_val_1.fq.gz Approx 20% complete for Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R1_001_val_1.fq.gz Approx 25% complete for Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R1_001_val_1.fq.gz Approx 30% complete for Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R1_001_val_1.fq.gz Approx 35% complete for Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R1_001_val_1.fq.gz Approx 40% complete for Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R1_001_val_1.fq.gz Approx 45% complete for Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R1_001_val_1.fq.gz Approx 50% complete for 
Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R1_001_val_1.fq.gz Approx 55% complete for Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R1_001_val_1.fq.gz Approx 60% complete for Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R1_001_val_1.fq.gz Approx 65% complete for Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R1_001_val_1.fq.gz Approx 70% complete for Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R1_001_val_1.fq.gz Approx 75% complete for Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R1_001_val_1.fq.gz Approx 80% complete for Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R1_001_val_1.fq.gz Approx 85% complete for Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R1_001_val_1.fq.gz Approx 90% complete for Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R1_001_val_1.fq.gz Approx 95% complete for Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R1_001_val_1.fq.gz Analysis complete for Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R1_001_val_1.fq.gz >>> Now running FastQC on the validated data Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R2_001_val_2.fq.gz<<< Started analysis of Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R2_001_val_2.fq.gz Approx 5% complete for Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R2_001_val_2.fq.gz Approx 10% complete for Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R2_001_val_2.fq.gz Approx 15% complete for Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R2_001_val_2.fq.gz Approx 20% complete for Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R2_001_val_2.fq.gz Approx 25% complete for Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R2_001_val_2.fq.gz Approx 30% complete for Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R2_001_val_2.fq.gz Approx 35% complete for Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R2_001_val_2.fq.gz Approx 40% complete for Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R2_001_val_2.fq.gz Approx 45% complete for Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R2_001_val_2.fq.gz Approx 50% complete for Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R2_001_val_2.fq.gz Approx 55% complete for Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R2_001_val_2.fq.gz Approx 60% complete for Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R2_001_val_2.fq.gz Approx 65% complete for 
Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R2_001_val_2.fq.gz Approx 70% complete for Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R2_001_val_2.fq.gz Approx 75% complete for Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R2_001_val_2.fq.gz Approx 80% complete for Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R2_001_val_2.fq.gz Approx 85% complete for Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R2_001_val_2.fq.gz Approx 90% complete for Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R2_001_val_2.fq.gz Approx 95% complete for Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R2_001_val_2.fq.gz Analysis complete for Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R2_001_val_2.fq.gz Deleting both intermediate output files Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-2to4kb-6_S22_L006_R2_001_trimmed.fq.gz ==================================================================================================== Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R1_001.fastq.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R1_001.fastq.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R1_001_trimmed.fq.gz >>> Now 
performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R1_001.fastq.gz <<< This is cutadapt 1.16 with Python 2.7.14 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R1_001.fastq.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 15.01 s (30 us/read; 1.99 M reads/minute). === Summary === Total reads processed: 498,123 Reads with adapters: 109,256 (21.9%) Reads written (passing filters): 498,123 (100.0%) Total basepairs processed: 58,840,808 bp Quality-trimmed: 2,336,141 bp (4.0%) Total written (filtered): 56,251,464 bp (95.6%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 109256 times. No. of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 32.4% C: 23.3% G: 14.3% T: 29.5% none/other: 0.6% Overview of removed sequences length count expect max.err error counts 1 80908 124530.8 0 80908 2 19555 31132.7 0 19555 3 4911 7783.2 0 4911 4 1960 1945.8 0 1960 5 402 486.4 0 402 6 45 121.6 0 45 7 19 30.4 0 19 8 19 7.6 0 19 9 3 1.9 0 0 3 10 14 0.5 1 1 13 12 14 0.0 1 10 4 13 6 0.0 1 4 2 14 9 0.0 1 5 4 15 7 0.0 1 5 2 16 5 0.0 1 4 1 17 2 0.0 1 1 1 18 17 0.0 1 13 4 19 21 0.0 1 21 20 3 0.0 1 2 1 21 4 0.0 1 1 3 22 8 0.0 1 5 3 23 4 0.0 1 3 1 24 9 0.0 1 7 2 25 3 0.0 1 2 1 26 6 0.0 1 1 5 27 10 0.0 1 8 2 28 8 0.0 1 7 1 29 1 0.0 1 1 30 4 0.0 1 2 2 31 10 0.0 1 8 2 32 5 0.0 1 3 2 33 5 0.0 1 5 34 5 0.0 1 3 2 35 3 0.0 1 1 2 36 9 0.0 1 7 2 37 3 0.0 1 3 38 12 0.0 1 10 2 39 5 0.0 1 3 2 40 10 0.0 1 8 2 41 8 0.0 1 8 42 10 0.0 1 8 2 43 1 0.0 1 0 1 44 1 0.0 1 1 45 3 0.0 1 2 1 46 2 0.0 1 2 47 10 0.0 1 9 1 48 5 0.0 1 4 1 49 3 0.0 1 2 1 50 7 0.0 1 6 1 51 12 0.0 1 8 4 52 7 0.0 1 4 3 53 13 0.0 1 5 8 54 8 0.0 1 6 2 55 15 0.0 1 14 1 56 3 0.0 1 3 57 7 0.0 1 6 1 58 6 0.0 1 6 59 3 0.0 1 3 60 5 0.0 1 5 61 4 0.0 1 4 62 4 0.0 1 2 2 63 5 0.0 1 4 1 64 
7 0.0 1 5 2 65 21 0.0 1 19 2 66 7 0.0 1 7 67 6 0.0 1 5 1 68 12 0.0 1 10 2 69 9 0.0 1 6 3 70 7 0.0 1 7 71 15 0.0 1 11 4 72 31 0.0 1 21 10 73 34 0.0 1 16 18 74 106 0.0 1 32 74 75 174 0.0 1 100 74 76 118 0.0 1 81 37 77 85 0.0 1 54 31 78 61 0.0 1 37 24 79 35 0.0 1 16 19 80 14 0.0 1 11 3 81 9 0.0 1 4 5 82 7 0.0 1 4 3 83 8 0.0 1 5 3 84 5 0.0 1 3 2 85 7 0.0 1 4 3 86 9 0.0 1 4 5 87 6 0.0 1 4 2 88 6 0.0 1 2 4 89 3 0.0 1 1 2 90 5 0.0 1 4 1 91 7 0.0 1 4 3 92 6 0.0 1 4 2 93 3 0.0 1 2 1 94 2 0.0 1 2 95 5 0.0 1 2 3 96 4 0.0 1 2 2 97 3 0.0 1 1 2 98 6 0.0 1 4 2 99 7 0.0 1 6 1 100 3 0.0 1 1 2 101 3 0.0 1 3 102 4 0.0 1 2 2 103 5 0.0 1 4 1 104 5 0.0 1 4 1 105 2 0.0 1 2 106 3 0.0 1 3 107 1 0.0 1 1 108 3 0.0 1 3 109 4 0.0 1 3 1 110 4 0.0 1 3 1 111 2 0.0 1 2 112 6 0.0 1 4 2 114 1 0.0 1 1 115 2 0.0 1 0 2 116 2 0.0 1 1 1 118 2 0.0 1 2 119 2 0.0 1 1 1 122 1 0.0 1 1 124 3 0.0 1 2 1 128 1 0.0 1 1 131 4 0.0 1 3 1 132 2 0.0 1 1 1 134 1 0.0 1 1 135 2 0.0 1 1 1 136 1 0.0 1 0 1 137 4 0.0 1 4 138 1 0.0 1 1 139 4 0.0 1 4 141 1 0.0 1 0 1 142 1 0.0 1 1 143 2 0.0 1 1 1 144 1 0.0 1 0 1 145 1 0.0 1 0 1 146 1 0.0 1 0 1 147 2 0.0 1 0 2 148 8 0.0 1 4 4 149 23 0.0 1 14 9 150 75 0.0 1 24 51 151 27 0.0 1 8 19 RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R1_001.fastq.gz ============================================= 498123 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R2_001.fastq.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R2_001.fastq.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming 
error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R2_001_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R2_001.fastq.gz <<< This is cutadapt 1.16 with Python 2.7.14 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R2_001.fastq.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 15.61 s (31 us/read; 1.91 M reads/minute). === Summary === Total reads processed: 498,123 Reads with adapters: 114,354 (23.0%) Reads written (passing filters): 498,123 (100.0%) Total basepairs processed: 59,164,906 bp Quality-trimmed: 5,701,729 bp (9.6%) Total written (filtered): 53,160,578 bp (89.9%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 114354 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 30.8% C: 25.8% G: 14.0% T: 28.8% none/other: 0.7% Overview of removed sequences length count expect max.err error counts 1 83738 124530.8 0 83738 2 21124 31132.7 0 21124 3 4777 7783.2 0 4777 4 1789 1945.8 0 1789 5 349 486.4 0 349 6 58 121.6 0 58 7 49 30.4 0 49 8 30 7.6 0 30 9 7 1.9 0 5 2 10 20 0.5 1 8 12 11 6 0.1 1 2 4 12 20 0.0 1 10 10 13 7 0.0 1 7 14 35 0.0 1 18 17 15 16 0.0 1 14 2 16 13 0.0 1 8 5 17 38 0.0 1 25 13 18 4 0.0 1 3 1 19 22 0.0 1 14 8 20 19 0.0 1 16 3 21 1 0.0 1 1 22 4 0.0 1 0 4 23 18 0.0 1 10 8 24 37 0.0 1 29 8 25 15 0.0 1 8 7 26 27 0.0 1 17 10 27 5 0.0 1 2 3 28 9 0.0 1 5 4 29 3 0.0 1 3 30 24 0.0 1 17 7 31 1 0.0 1 1 32 16 0.0 1 11 5 33 33 0.0 1 24 9 34 5 0.0 1 2 3 35 14 0.0 1 7 7 36 8 0.0 1 8 37 19 0.0 1 11 8 38 5 0.0 1 5 39 14 0.0 1 13 1 40 3 0.0 1 0 3 41 25 0.0 1 18 7 42 34 0.0 1 27 7 43 6 0.0 1 5 1 44 23 0.0 1 12 11 45 43 0.0 1 35 8 46 14 0.0 1 10 4 47 4 0.0 1 3 1 48 28 0.0 1 21 7 49 16 0.0 1 12 4 50 4 0.0 1 3 1 51 95 0.0 1 71 24 52 10 0.0 1 7 3 53 4 0.0 1 4 54 22 0.0 1 19 3 55 20 0.0 1 17 3 56 11 0.0 1 10 1 57 19 0.0 1 17 2 58 30 0.0 1 23 7 59 24 0.0 1 17 7 60 19 0.0 1 14 5 61 16 0.0 1 13 3 62 23 0.0 1 16 7 63 31 0.0 1 25 6 64 25 0.0 1 21 4 65 19 0.0 1 16 3 66 43 0.0 1 29 14 67 182 0.0 1 32 150 68 407 0.0 1 308 99 69 182 0.0 1 129 53 70 84 0.0 1 55 29 71 34 0.0 1 22 12 72 33 0.0 1 19 14 73 25 0.0 1 19 6 74 16 0.0 1 9 7 75 17 0.0 1 16 1 76 6 0.0 1 1 5 77 10 0.0 1 8 2 78 8 0.0 1 5 3 79 14 0.0 1 10 4 80 11 0.0 1 10 1 81 8 0.0 1 3 5 82 7 0.0 1 6 1 83 6 0.0 1 6 84 10 0.0 1 5 5 85 7 0.0 1 3 4 86 8 0.0 1 8 87 7 0.0 1 4 3 88 6 0.0 1 4 2 89 7 0.0 1 5 2 90 7 0.0 1 4 3 91 12 0.0 1 8 4 92 9 0.0 1 5 4 93 2 0.0 1 2 94 5 0.0 1 2 3 95 4 0.0 1 3 1 96 6 0.0 1 3 3 97 9 0.0 1 6 3 98 8 0.0 1 6 2 99 6 0.0 1 4 2 100 5 0.0 1 4 1 101 3 0.0 1 2 1 102 9 0.0 1 6 3 103 6 0.0 1 4 2 104 4 0.0 1 3 1 105 2 0.0 1 1 1 106 10 0.0 1 8 2 107 5 0.0 1 1 4 108 2 0.0 1 1 1 109 8 0.0 1 4 4 110 7 0.0 1 4 3 111 5 
0.0 1 4 1 112 6 0.0 1 6 113 1 0.0 1 1 114 9 0.0 1 6 3 115 5 0.0 1 4 1 116 5 0.0 1 3 2 119 2 0.0 1 1 1 120 1 0.0 1 0 1 121 3 0.0 1 2 1 122 1 0.0 1 0 1 124 5 0.0 1 3 2 125 2 0.0 1 1 1 126 7 0.0 1 6 1 127 2 0.0 1 1 1 128 1 0.0 1 1 129 1 0.0 1 1 130 1 0.0 1 1 138 1 0.0 1 0 1 144 2 0.0 1 0 2 145 2 0.0 1 0 2 146 2 0.0 1 0 2 148 2 0.0 1 0 2 149 12 0.0 1 4 8 150 54 0.0 1 18 36 151 8 0.0 1 1 7 RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R2_001.fastq.gz ============================================= 498123 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Validate paired-end files Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R2_001_trimmed.fq.gz file_1: Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R1_001_trimmed.fq.gz, file_2: Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R2_001_trimmed.fq.gz >>>>> Now validing the length of the 2 paired-end infiles: Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R2_001_trimmed.fq.gz <<<<< Writing validated paired-end read 1 reads to Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R1_001_val_1.fq.gz Writing validated paired-end read 2 reads to Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R2_001_val_2.fq.gz Total number of sequences analysed: 498123 Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 119990 (24.09%) >>> Now running FastQC on the validated data Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R1_001_val_1.fq.gz<<< Started analysis of Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R1_001_val_1.fq.gz Approx 5% complete for Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R1_001_val_1.fq.gz Approx 10% complete for Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R1_001_val_1.fq.gz Approx 15% complete for Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R1_001_val_1.fq.gz Approx 20% complete for Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R1_001_val_1.fq.gz Approx 25% complete for 
Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R1_001_val_1.fq.gz Approx 30% complete for Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R1_001_val_1.fq.gz Approx 35% complete for Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R1_001_val_1.fq.gz Approx 40% complete for Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R1_001_val_1.fq.gz Approx 45% complete for Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R1_001_val_1.fq.gz Approx 50% complete for Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R1_001_val_1.fq.gz Approx 55% complete for Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R1_001_val_1.fq.gz Approx 60% complete for Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R1_001_val_1.fq.gz Approx 65% complete for Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R1_001_val_1.fq.gz Approx 70% complete for Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R1_001_val_1.fq.gz Approx 75% complete for Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R1_001_val_1.fq.gz Approx 80% complete for Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R1_001_val_1.fq.gz Approx 85% complete for Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R1_001_val_1.fq.gz Approx 90% complete for Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R1_001_val_1.fq.gz Approx 95% complete for Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R1_001_val_1.fq.gz Analysis complete for Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R1_001_val_1.fq.gz >>> Now running FastQC on the validated data Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R2_001_val_2.fq.gz<<< Started analysis of Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R2_001_val_2.fq.gz Approx 5% complete for Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R2_001_val_2.fq.gz Approx 10% complete for Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R2_001_val_2.fq.gz Approx 15% complete for Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R2_001_val_2.fq.gz Approx 20% complete for Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R2_001_val_2.fq.gz Approx 25% complete for Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R2_001_val_2.fq.gz Approx 30% complete for Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R2_001_val_2.fq.gz Approx 35% complete for Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R2_001_val_2.fq.gz Approx 40% complete for 
Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R2_001_val_2.fq.gz Approx 45% complete for Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R2_001_val_2.fq.gz Approx 50% complete for Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R2_001_val_2.fq.gz Approx 55% complete for Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R2_001_val_2.fq.gz Approx 60% complete for Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R2_001_val_2.fq.gz Approx 65% complete for Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R2_001_val_2.fq.gz Approx 70% complete for Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R2_001_val_2.fq.gz Approx 75% complete for Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R2_001_val_2.fq.gz Approx 80% complete for Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R2_001_val_2.fq.gz Approx 85% complete for Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R2_001_val_2.fq.gz Approx 90% complete for Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R2_001_val_2.fq.gz Approx 95% complete for Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R2_001_val_2.fq.gz Analysis complete for Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R2_001_val_2.fq.gz Deleting both intermediate output files Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-2to4kb-7_S26_L007_R2_001_trimmed.fq.gz ==================================================================================================== Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R1_001.fastq.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R1_001.fastq.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp Running FastQC on the data 
once trimming has completed Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R1_001_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R1_001.fastq.gz <<< This is cutadapt 1.16 with Python 2.7.14 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R1_001.fastq.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 15.37 s (30 us/read; 1.99 M reads/minute). === Summary === Total reads processed: 510,999 Reads with adapters: 111,846 (21.9%) Reads written (passing filters): 510,999 (100.0%) Total basepairs processed: 60,454,446 bp Quality-trimmed: 2,368,086 bp (3.9%) Total written (filtered): 57,827,278 bp (95.7%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 111846 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 32.6% C: 23.5% G: 14.1% T: 29.3% none/other: 0.5% Overview of removed sequences length count expect max.err error counts 1 82925 127749.8 0 82925 2 19968 31937.4 0 19968 3 4929 7984.4 0 4929 4 2047 1996.1 0 2047 5 420 499.0 0 420 6 57 124.8 0 57 7 28 31.2 0 28 8 17 7.8 0 17 9 2 1.9 0 1 1 10 17 0.5 1 5 12 11 5 0.1 1 3 2 12 6 0.0 1 4 2 13 3 0.0 1 2 1 14 8 0.0 1 6 2 15 7 0.0 1 6 1 16 4 0.0 1 3 1 17 4 0.0 1 0 4 18 12 0.0 1 8 4 19 9 0.0 1 7 2 20 7 0.0 1 2 5 21 1 0.0 1 1 22 12 0.0 1 9 3 23 5 0.0 1 2 3 24 10 0.0 1 10 25 4 0.0 1 3 1 26 4 0.0 1 3 1 27 8 0.0 1 6 2 28 13 0.0 1 10 3 29 1 0.0 1 1 30 6 0.0 1 3 3 31 12 0.0 1 7 5 32 7 0.0 1 5 2 33 6 0.0 1 5 1 34 14 0.0 1 10 4 35 5 0.0 1 5 36 9 0.0 1 0 9 37 6 0.0 1 4 2 38 15 0.0 1 13 2 39 4 0.0 1 3 1 40 2 0.0 1 0 2 41 8 0.0 1 7 1 42 6 0.0 1 4 2 43 4 0.0 1 4 44 3 0.0 1 3 45 3 0.0 1 2 1 46 5 0.0 1 3 2 47 9 0.0 1 8 1 48 7 0.0 1 4 3 49 4 0.0 1 1 3 50 2 0.0 1 2 51 10 0.0 1 7 3 52 11 0.0 1 9 2 53 6 0.0 1 1 5 54 8 0.0 1 6 2 55 20 0.0 1 16 4 56 8 0.0 1 6 2 57 7 0.0 1 2 5 58 8 0.0 1 4 4 59 2 0.0 1 1 1 60 5 0.0 1 3 2 61 13 0.0 1 12 1 62 7 0.0 1 5 2 63 3 0.0 1 3 64 6 0.0 1 6 65 14 0.0 1 10 4 66 11 0.0 1 8 3 67 4 0.0 1 3 1 68 9 0.0 1 8 1 69 9 0.0 1 7 2 70 7 0.0 1 6 1 71 13 0.0 1 10 3 72 20 0.0 1 10 10 73 31 0.0 1 14 17 74 107 0.0 1 35 72 75 226 0.0 1 149 77 76 99 0.0 1 74 25 77 72 0.0 1 48 24 78 45 0.0 1 31 14 79 33 0.0 1 22 11 80 16 0.0 1 5 11 81 18 0.0 1 12 6 82 11 0.0 1 8 3 83 6 0.0 1 4 2 84 7 0.0 1 3 4 85 6 0.0 1 3 3 86 2 0.0 1 1 1 87 4 0.0 1 2 2 88 1 0.0 1 0 1 89 6 0.0 1 3 3 90 7 0.0 1 4 3 91 6 0.0 1 3 3 92 9 0.0 1 5 4 93 4 0.0 1 3 1 94 5 0.0 1 3 2 95 6 0.0 1 4 2 96 5 0.0 1 4 1 97 1 0.0 1 0 1 98 1 0.0 1 0 1 99 5 0.0 1 5 100 3 0.0 1 2 1 101 2 0.0 1 2 102 4 0.0 1 3 1 103 6 0.0 1 3 3 104 1 0.0 1 1 105 5 0.0 1 4 1 106 1 0.0 1 1 107 3 0.0 1 2 1 108 5 0.0 1 3 2 110 1 0.0 1 1 111 3 0.0 1 2 1 112 4 0.0 1 3 1 114 2 0.0 1 2 115 2 0.0 1 1 1 117 3 0.0 1 3 118 1 0.0 1 1 121 
3 0.0 1 1 2 123 2 0.0 1 2 124 2 0.0 1 2 125 1 0.0 1 1 126 1 0.0 1 1 127 1 0.0 1 1 128 2 0.0 1 2 130 1 0.0 1 1 131 1 0.0 1 1 132 2 0.0 1 2 135 2 0.0 1 2 136 2 0.0 1 2 137 3 0.0 1 1 2 138 6 0.0 1 6 139 1 0.0 1 0 1 140 1 0.0 1 1 141 1 0.0 1 1 142 1 0.0 1 1 144 3 0.0 1 2 1 145 3 0.0 1 1 2 146 4 0.0 1 4 147 4 0.0 1 4 148 12 0.0 1 5 7 149 21 0.0 1 11 10 150 78 0.0 1 31 47 151 32 0.0 1 12 20 RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R1_001.fastq.gz ============================================= 510999 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R2_001.fastq.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R2_001.fastq.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R2_001_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R2_001.fastq.gz <<< This is cutadapt 1.16 with Python 
2.7.14
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R2_001.fastq.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 15.95 s (31 us/read; 1.92 M reads/minute).
=== Summary ===
Total reads processed: 510,999
Reads with adapters: 116,745 (22.8%)
Reads written (passing filters): 510,999 (100.0%)
Total basepairs processed: 60,905,950 bp
Quality-trimmed: 6,222,214 bp (10.2%)
Total written (filtered): 54,387,931 bp (89.3%)
=== Adapter 1 ===
Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 116745 times.
No. of allowed errors:
0-9 bp: 0; 10-13 bp: 1
Bases preceding removed adapters:
  A: 31.0%
  C: 25.7%
  G: 13.8%
  T: 28.9%
  none/other: 0.6%
Overview of removed sequences
length count expect max.err error counts
1 85527 127749.8 0 85527
2 21610 31937.4 0 21610
3 4908 7984.4 0 4908
4 1850 1996.1 0 1850
5 366 499.0 0 366
6 68 124.8 0 68
7 62 31.2 0 62
8 33 7.8 0 33
9 3 1.9 0 1 2
10 34 0.5 1 15 19
11 13 0.1 1 4 9
12 19 0.0 1 10 9
13 7 0.0 1 2 5
14 48 0.0 1 26 22
15 13 0.0 1 10 3
16 5 0.0 1 2 3
17 24 0.0 1 14 10
18 2 0.0 1 0 2
19 15 0.0 1 6 9
20 14 0.0 1 10 4
22 7 0.0 1 1 6
23 15 0.0 1 10 5
24 47 0.0 1 35 12
25 18 0.0 1 12 6
26 12 0.0 1 10 2
27 5 0.0 1 2 3
28 11 0.0 1 9 2
29 1 0.0 1 0 1
30 16 0.0 1 14 2
31 3 0.0 1 1 2
32 29 0.0 1 21 8
33 37 0.0 1 31 6
34 1 0.0 1 0 1
35 7 0.0 1 3 4
36 6 0.0 1 3 3
37 17 0.0 1 10 7
38 7 0.0 1 5 2
39 15 0.0 1 12 3
40 3 0.0 1 2 1
41 15 0.0 1 13 2
42 36 0.0 1 21 15
43 3 0.0 1 1 2
44 20 0.0 1 11 9
45 38 0.0 1 27 11
46 15 0.0 1 13 2
47 6 0.0 1 4 2
48 34 0.0 1 27 7
49 20 0.0 1 16 4
50 11 0.0 1 6 5
51 72 0.0 1 52 20
52 16 0.0 1 12 4
53 10 0.0 1 8 2
54 17 0.0 1 13 4
55 23 0.0 1 16 7
56 15 0.0 1 10 5
57 16 0.0 1 15 1
58 20 0.0 1 15 5
59 19 0.0 1 19
60 18 0.0 1 15 3
61 23 0.0 1 16 7
62 29 0.0 1 24 5
63 18 0.0 1 12 6
64 16 0.0 1 12 4
65 23 0.0 1 18 5
66 31 0.0 1 25 6
67 196 0.0 1 36 160
68 426 0.0 1 338 88
69 177 0.0 1 112 65
70 96 0.0 1 70 26
71 41 0.0 1 32 9
72 19 0.0 1 9 10
73 19 0.0 1 14 5
74 9 0.0 1 4 5
75 7 0.0 1 3 4
76 6 0.0 1 5 1
77 7 0.0 1 4 3
78 12 0.0 1 7 5
79 4 0.0 1 1 3
80 8 0.0 1 7 1
81 7 0.0 1 3 4
82 10 0.0 1 5 5
83 6 0.0 1 3 3
84 4 0.0 1 3 1
85 11 0.0 1 6 5
86 6 0.0 1 4 2
87 3 0.0 1 3
88 5 0.0 1 3 2
89 4 0.0 1 0 4
90 5 0.0 1 1 4
91 11 0.0 1 4 7
92 7 0.0 1 6 1
93 3 0.0 1 0 3
94 8 0.0 1 6 2
95 9 0.0 1 4 5
96 6 0.0 1 4 2
97 6 0.0 1 4 2
98 8 0.0 1 4 4
99 3 0.0 1 2 1
100 6 0.0 1 3 3
101 3 0.0 1 1 2
102 6 0.0 1 3 3
103 2 0.0 1 2
104 5 0.0 1 3 2
105 4 0.0 1 1 3
106 3 0.0 1 2 1
107 4 0.0 1 4
108 3 0.0 1 3
109 2 0.0 1 1 1
110 2 0.0 1 0 2
111 4 0.0 1 2 2
112 5 0.0 1 3 2
113 5 0.0 1 5
114 1 0.0 1 1
115 1 0.0 1 1
116 2 0.0 1 1 1
117 3 0.0 1 1 2
118 3 0.0 1 2 1
119 5 0.0 1 2 3
120 1 0.0 1 1
121 2 0.0 1 2
122 4 0.0 1 3 1
123 4 0.0 1 3 1
125 3 0.0 1 3
126 2 0.0 1 1 1
128 1 0.0 1 1
129 1 0.0 1 1
130 1 0.0 1 0 1
131 2 0.0 1 2
132 1 0.0 1 0 1
133 2 0.0 1 1 1
134 1 0.0 1 1
135 1 0.0 1 1
140 1 0.0 1 1
141 1 0.0 1 0 1
143 1 0.0 1 1
145 1 0.0 1 1
147 1 0.0 1 0 1
148 3 0.0 1 2 1
149 7 0.0 1 4 3
150 32 0.0 1 10 22
151 12 0.0 1 4 8
RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R2_001.fastq.gz
=============================================
510999 sequences processed in total
The length threshold of paired-end sequences gets evaluated later on (in the validation step)
Validate paired-end files Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R2_001_trimmed.fq.gz
file_1: Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R1_001_trimmed.fq.gz, file_2: Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R2_001_trimmed.fq.gz
>>>>> Now validing the length of the 2 paired-end infiles: Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R2_001_trimmed.fq.gz <<<<<
Writing validated paired-end read 1 reads to Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R1_001_val_1.fq.gz
Writing validated paired-end read 2 reads to
Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R2_001_val_2.fq.gz
Total number of sequences analysed: 510999
Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 123040 (24.08%)
>>> Now running FastQC on the validated data Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R1_001_val_1.fq.gz<<<
Started analysis of Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R1_001_val_1.fq.gz
Approx 5% complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R1_001_val_1.fq.gz
Approx 10% complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R1_001_val_1.fq.gz
Approx 15% complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R1_001_val_1.fq.gz
Approx 20% complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R1_001_val_1.fq.gz
Approx 25% complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R1_001_val_1.fq.gz
Approx 30% complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R1_001_val_1.fq.gz
Approx 35% complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R1_001_val_1.fq.gz
Approx 40% complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R1_001_val_1.fq.gz
Approx 45% complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R1_001_val_1.fq.gz
Approx 50% complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R1_001_val_1.fq.gz
Approx 55% complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R1_001_val_1.fq.gz
Approx 60% complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R1_001_val_1.fq.gz
Approx 65% complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R1_001_val_1.fq.gz
Approx 70% complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R1_001_val_1.fq.gz
Approx 75% complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R1_001_val_1.fq.gz
Approx 80% complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R1_001_val_1.fq.gz
Approx 85% complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R1_001_val_1.fq.gz
Approx 90% complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R1_001_val_1.fq.gz
Approx 95% complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R1_001_val_1.fq.gz
Analysis complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R1_001_val_1.fq.gz
>>> Now running FastQC on the validated data
Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R2_001_val_2.fq.gz<<<
Started analysis of Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R2_001_val_2.fq.gz
Approx 5% complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R2_001_val_2.fq.gz
Approx 10% complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R2_001_val_2.fq.gz
Approx 15% complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R2_001_val_2.fq.gz
Approx 20% complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R2_001_val_2.fq.gz
Approx 25% complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R2_001_val_2.fq.gz
Approx 30% complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R2_001_val_2.fq.gz
Approx 35% complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R2_001_val_2.fq.gz
Approx 40% complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R2_001_val_2.fq.gz
Approx 45% complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R2_001_val_2.fq.gz
Approx 50% complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R2_001_val_2.fq.gz
Approx 55% complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R2_001_val_2.fq.gz
Approx 60% complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R2_001_val_2.fq.gz
Approx 65% complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R2_001_val_2.fq.gz
Approx 70% complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R2_001_val_2.fq.gz
Approx 75% complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R2_001_val_2.fq.gz
Approx 80% complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R2_001_val_2.fq.gz
Approx 85% complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R2_001_val_2.fq.gz
Approx 90% complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R2_001_val_2.fq.gz
Approx 95% complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R2_001_val_2.fq.gz
Analysis complete for Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R2_001_val_2.fq.gz
Deleting both intermediate output files Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-2to4kb-8_S30_L008_R2_001_trimmed.fq.gz
====================================================================================================
Writing report to
'/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-3_S9_L003_R1_001.fastq.gz_trimming_report.txt'
SUMMARISING RUN PARAMETERS
==========================
Input filename: Geoduck-NMP-gDNA-3_S9_L003_R1_001.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.4.4_dev
Cutadapt version: 1.16
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28'
Output file(s) will be GZIP compressed
Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-3_S9_L003_R1_001_trimmed.fq.gz
>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-3_S9_L003_R1_001.fastq.gz <<<
This is cutadapt 1.16 with Python 2.7.14
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-3_S9_L003_R1_001.fastq.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 2.41 s (45 us/read; 1.33 M reads/minute).
=== Summary ===
Total reads processed: 53,313
Reads with adapters: 12,793 (24.0%)
Reads written (passing filters): 53,313 (100.0%)
Total basepairs processed: 7,048,549 bp
Quality-trimmed: 228,714 bp (3.2%)
Total written (filtered): 6,802,743 bp (96.5%)
=== Adapter 1 ===
Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 12793 times.
No.
of allowed errors:
0-9 bp: 0; 10-13 bp: 1
Bases preceding removed adapters:
  A: 34.0%
  C: 24.7%
  G: 13.4%
  T: 27.9%
  none/other: 0.0%
Overview of removed sequences
length count expect max.err error counts
1 9937 13328.2 0 9937
2 2102 3332.1 0 2102
3 511 833.0 0 511
4 179 208.3 0 179
5 40 52.1 0 40
6 4 13.0 0 4
7 5 3.3 0 5
11 2 0.0 1 0 2
12 1 0.0 1 0 1
13 1 0.0 1 1
14 1 0.0 1 1
17 1 0.0 1 0 1
18 1 0.0 1 1
24 1 0.0 1 1
36 2 0.0 1 1 1
41 1 0.0 1 1
42 2 0.0 1 2
60 1 0.0 1 1
66 1 0.0 1 0 1
RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-3_S9_L003_R1_001.fastq.gz
=============================================
53313 sequences processed in total
The length threshold of paired-end sequences gets evaluated later on (in the validation step)
Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-3_S9_L003_R2_001.fastq.gz_trimming_report.txt'
SUMMARISING RUN PARAMETERS
==========================
Input filename: Geoduck-NMP-gDNA-3_S9_L003_R2_001.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.4.4_dev
Cutadapt version: 1.16
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28'
Output file(s) will be GZIP compressed
Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-3_S9_L003_R2_001_trimmed.fq.gz
>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file
Geoduck-NMP-gDNA-3_S9_L003_R2_001.fastq.gz <<<
This is cutadapt 1.16 with Python 2.7.14
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-3_S9_L003_R2_001.fastq.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 1.76 s (33 us/read; 1.82 M reads/minute).
=== Summary ===
Total reads processed: 53,313
Reads with adapters: 11,291 (21.2%)
Reads written (passing filters): 53,313 (100.0%)
Total basepairs processed: 6,567,162 bp
Quality-trimmed: 1,650,357 bp (25.1%)
Total written (filtered): 4,897,123 bp (74.6%)
=== Adapter 1 ===
Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 11291 times.
No. of allowed errors:
0-9 bp: 0; 10-13 bp: 1
Bases preceding removed adapters:
  A: 31.2%
  C: 26.3%
  G: 13.1%
  T: 29.0%
  none/other: 0.5%
Overview of removed sequences
length count expect max.err error counts
1 8487 13328.2 0 8487
2 2076 3332.1 0 2076
3 445 833.0 0 445
4 147 208.3 0 147
5 29 52.1 0 29
6 3 13.0 0 3
7 4 3.3 0 4
8 1 0.8 0 1
10 4 0.1 1 1 3
11 1 0.0 1 0 1
12 2 0.0 1 0 2
14 4 0.0 1 3 1
16 2 0.0 1 2
17 3 0.0 1 2 1
20 1 0.0 1 0 1
21 1 0.0 1 0 1
23 1 0.0 1 1
24 2 0.0 1 2
25 1 0.0 1 0 1
26 2 0.0 1 2
27 1 0.0 1 1
28 1 0.0 1 0 1
30 2 0.0 1 2
32 1 0.0 1 1
33 2 0.0 1 2
34 1 0.0 1 0 1
35 1 0.0 1 1
36 2 0.0 1 1 1
40 2 0.0 1 2
41 2 0.0 1 2
42 3 0.0 1 2 1
45 2 0.0 1 1 1
47 1 0.0 1 1
48 4 0.0 1 3 1
51 3 0.0 1 1 2
52 2 0.0 1 2
53 1 0.0 1 0 1
54 1 0.0 1 1
56 2 0.0 1 2
57 2 0.0 1 1 1
59 1 0.0 1 1
60 1 0.0 1 1
63 2 0.0 1 1 1
64 2 0.0 1 2
65 3 0.0 1 1 2
66 1 0.0 1 0 1
67 2 0.0 1 1 1
68 8 0.0 1 7 1
69 5 0.0 1 5
71 2 0.0 1 2
72 2 0.0 1 1 1
78 1 0.0 1 1
85 1 0.0 1 1
87 1 0.0 1 1
95 1 0.0 1 0 1
98 1 0.0 1 0 1
103 1 0.0 1 1
104 1 0.0 1 1
116 2 0.0 1 0 2
132 1 0.0 1 1
RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-3_S9_L003_R2_001.fastq.gz
=============================================
53313 sequences processed in total
The length threshold of paired-end sequences gets evaluated later on (in the
validation step)
Validate paired-end files Geoduck-NMP-gDNA-3_S9_L003_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-3_S9_L003_R2_001_trimmed.fq.gz
file_1: Geoduck-NMP-gDNA-3_S9_L003_R1_001_trimmed.fq.gz, file_2: Geoduck-NMP-gDNA-3_S9_L003_R2_001_trimmed.fq.gz
>>>>> Now validing the length of the 2 paired-end infiles: Geoduck-NMP-gDNA-3_S9_L003_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-3_S9_L003_R2_001_trimmed.fq.gz <<<<<
Writing validated paired-end read 1 reads to Geoduck-NMP-gDNA-3_S9_L003_R1_001_val_1.fq.gz
Writing validated paired-end read 2 reads to Geoduck-NMP-gDNA-3_S9_L003_R2_001_val_2.fq.gz
Total number of sequences analysed: 53313
Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 12116 (22.73%)
>>> Now running FastQC on the validated data Geoduck-NMP-gDNA-3_S9_L003_R1_001_val_1.fq.gz<<<
Started analysis of Geoduck-NMP-gDNA-3_S9_L003_R1_001_val_1.fq.gz
Approx 5% complete for Geoduck-NMP-gDNA-3_S9_L003_R1_001_val_1.fq.gz
Approx 10% complete for Geoduck-NMP-gDNA-3_S9_L003_R1_001_val_1.fq.gz
Approx 15% complete for Geoduck-NMP-gDNA-3_S9_L003_R1_001_val_1.fq.gz
Approx 20% complete for Geoduck-NMP-gDNA-3_S9_L003_R1_001_val_1.fq.gz
Approx 25% complete for Geoduck-NMP-gDNA-3_S9_L003_R1_001_val_1.fq.gz
Approx 30% complete for Geoduck-NMP-gDNA-3_S9_L003_R1_001_val_1.fq.gz
Approx 35% complete for Geoduck-NMP-gDNA-3_S9_L003_R1_001_val_1.fq.gz
Approx 40% complete for Geoduck-NMP-gDNA-3_S9_L003_R1_001_val_1.fq.gz
Approx 45% complete for Geoduck-NMP-gDNA-3_S9_L003_R1_001_val_1.fq.gz
Approx 50% complete for Geoduck-NMP-gDNA-3_S9_L003_R1_001_val_1.fq.gz
Approx 55% complete for Geoduck-NMP-gDNA-3_S9_L003_R1_001_val_1.fq.gz
Approx 60% complete for Geoduck-NMP-gDNA-3_S9_L003_R1_001_val_1.fq.gz
Approx 65% complete for Geoduck-NMP-gDNA-3_S9_L003_R1_001_val_1.fq.gz
Approx 70% complete for Geoduck-NMP-gDNA-3_S9_L003_R1_001_val_1.fq.gz
Approx 75% complete for Geoduck-NMP-gDNA-3_S9_L003_R1_001_val_1.fq.gz
Approx 80% complete for
Geoduck-NMP-gDNA-3_S9_L003_R1_001_val_1.fq.gz
Approx 85% complete for Geoduck-NMP-gDNA-3_S9_L003_R1_001_val_1.fq.gz
Approx 90% complete for Geoduck-NMP-gDNA-3_S9_L003_R1_001_val_1.fq.gz
Approx 95% complete for Geoduck-NMP-gDNA-3_S9_L003_R1_001_val_1.fq.gz
Analysis complete for Geoduck-NMP-gDNA-3_S9_L003_R1_001_val_1.fq.gz
>>> Now running FastQC on the validated data Geoduck-NMP-gDNA-3_S9_L003_R2_001_val_2.fq.gz<<<
Started analysis of Geoduck-NMP-gDNA-3_S9_L003_R2_001_val_2.fq.gz
Approx 5% complete for Geoduck-NMP-gDNA-3_S9_L003_R2_001_val_2.fq.gz
Approx 10% complete for Geoduck-NMP-gDNA-3_S9_L003_R2_001_val_2.fq.gz
Approx 15% complete for Geoduck-NMP-gDNA-3_S9_L003_R2_001_val_2.fq.gz
Approx 20% complete for Geoduck-NMP-gDNA-3_S9_L003_R2_001_val_2.fq.gz
Approx 25% complete for Geoduck-NMP-gDNA-3_S9_L003_R2_001_val_2.fq.gz
Approx 30% complete for Geoduck-NMP-gDNA-3_S9_L003_R2_001_val_2.fq.gz
Approx 35% complete for Geoduck-NMP-gDNA-3_S9_L003_R2_001_val_2.fq.gz
Approx 40% complete for Geoduck-NMP-gDNA-3_S9_L003_R2_001_val_2.fq.gz
Approx 45% complete for Geoduck-NMP-gDNA-3_S9_L003_R2_001_val_2.fq.gz
Approx 50% complete for Geoduck-NMP-gDNA-3_S9_L003_R2_001_val_2.fq.gz
Approx 55% complete for Geoduck-NMP-gDNA-3_S9_L003_R2_001_val_2.fq.gz
Approx 60% complete for Geoduck-NMP-gDNA-3_S9_L003_R2_001_val_2.fq.gz
Approx 65% complete for Geoduck-NMP-gDNA-3_S9_L003_R2_001_val_2.fq.gz
Approx 70% complete for Geoduck-NMP-gDNA-3_S9_L003_R2_001_val_2.fq.gz
Approx 75% complete for Geoduck-NMP-gDNA-3_S9_L003_R2_001_val_2.fq.gz
Approx 80% complete for Geoduck-NMP-gDNA-3_S9_L003_R2_001_val_2.fq.gz
Approx 85% complete for Geoduck-NMP-gDNA-3_S9_L003_R2_001_val_2.fq.gz
Approx 90% complete for Geoduck-NMP-gDNA-3_S9_L003_R2_001_val_2.fq.gz
Approx 95% complete for Geoduck-NMP-gDNA-3_S9_L003_R2_001_val_2.fq.gz
Analysis complete for Geoduck-NMP-gDNA-3_S9_L003_R2_001_val_2.fq.gz
Deleting both intermediate output files Geoduck-NMP-gDNA-3_S9_L003_R1_001_trimmed.fq.gz and
Geoduck-NMP-gDNA-3_S9_L003_R2_001_trimmed.fq.gz
====================================================================================================
Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-4_S13_L004_R1_001.fastq.gz_trimming_report.txt'
SUMMARISING RUN PARAMETERS
==========================
Input filename: Geoduck-NMP-gDNA-4_S13_L004_R1_001.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.4.4_dev
Cutadapt version: 1.16
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28'
Output file(s) will be GZIP compressed
Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-4_S13_L004_R1_001_trimmed.fq.gz
>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-4_S13_L004_R1_001.fastq.gz <<<
This is cutadapt 1.16 with Python 2.7.14
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-4_S13_L004_R1_001.fastq.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 2.44 s (45 us/read; 1.34 M reads/minute).
=== Summary ===
Total reads processed: 54,441
Reads with adapters: 13,028 (23.9%)
Reads written (passing filters): 54,441 (100.0%)
Total basepairs processed: 7,139,942 bp
Quality-trimmed: 243,519 bp (3.4%)
Total written (filtered): 6,878,887 bp (96.3%)
=== Adapter 1 ===
Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 13028 times.
No. of allowed errors:
0-9 bp: 0; 10-13 bp: 1
Bases preceding removed adapters:
  A: 34.4%
  C: 23.5%
  G: 14.0%
  T: 28.1%
  none/other: 0.0%
Overview of removed sequences
length count expect max.err error counts
1 10104 13610.2 0 10104
2 2184 3402.6 0 2184
3 511 850.6 0 511
4 178 212.7 0 178
5 32 53.2 0 32
6 3 13.3 0 3
7 2 3.3 0 2
8 2 0.8 0 2
9 1 0.2 0 1
14 1 0.0 1 0 1
17 1 0.0 1 0 1
36 1 0.0 1 1
42 1 0.0 1 1
43 1 0.0 1 0 1
48 1 0.0 1 0 1
55 1 0.0 1 0 1
56 1 0.0 1 0 1
78 1 0.0 1 1
96 1 0.0 1 0 1
117 1 0.0 1 1
RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-4_S13_L004_R1_001.fastq.gz
=============================================
54441 sequences processed in total
The length threshold of paired-end sequences gets evaluated later on (in the validation step)
Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-4_S13_L004_R2_001.fastq.gz_trimming_report.txt'
SUMMARISING RUN PARAMETERS
==========================
Input filename: Geoduck-NMP-gDNA-4_S13_L004_R2_001.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.4.4_dev
Cutadapt version: 1.16
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: '--outdir
/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28'
Output file(s) will be GZIP compressed
Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-4_S13_L004_R2_001_trimmed.fq.gz
>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-4_S13_L004_R2_001.fastq.gz <<<
This is cutadapt 1.16 with Python 2.7.14
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-4_S13_L004_R2_001.fastq.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 1.87 s (34 us/read; 1.75 M reads/minute).
=== Summary ===
Total reads processed: 54,441
Reads with adapters: 11,434 (21.0%)
Reads written (passing filters): 54,441 (100.0%)
Total basepairs processed: 6,768,265 bp
Quality-trimmed: 1,642,531 bp (24.3%)
Total written (filtered): 5,106,511 bp (75.4%)
=== Adapter 1 ===
Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 11434 times.
No.
of allowed errors:
0-9 bp: 0; 10-13 bp: 1
Bases preceding removed adapters:
  A: 31.1%
  C: 25.7%
  G: 13.4%
  T: 29.4%
  none/other: 0.5%
Overview of removed sequences
length count expect max.err error counts
1 8766 13610.2 0 8766
2 1961 3402.6 0 1961
3 456 850.6 0 456
4 134 212.7 0 134
5 23 53.2 0 23
6 6 13.3 0 6
7 1 3.3 0 1
8 2 0.8 0 2
10 2 0.1 1 1 1
11 1 0.0 1 1
14 2 0.0 1 2
15 1 0.0 1 1
19 2 0.0 1 1 1
21 1 0.0 1 1
23 1 0.0 1 0 1
24 3 0.0 1 2 1
25 1 0.0 1 0 1
26 1 0.0 1 1
27 1 0.0 1 0 1
28 1 0.0 1 1
30 1 0.0 1 1
33 1 0.0 1 1
35 1 0.0 1 0 1
40 1 0.0 1 0 1
41 1 0.0 1 1
42 2 0.0 1 1 1
43 1 0.0 1 1
44 5 0.0 1 2 3
45 3 0.0 1 2 1
46 1 0.0 1 0 1
47 1 0.0 1 1
48 1 0.0 1 1
49 1 0.0 1 1
51 8 0.0 1 4 4
52 2 0.0 1 2
53 1 0.0 1 1
55 1 0.0 1 1
56 2 0.0 1 1 1
58 1 0.0 1 0 1
59 1 0.0 1 1
63 1 0.0 1 0 1
64 2 0.0 1 2
65 2 0.0 1 2
66 3 0.0 1 3
67 3 0.0 1 3
68 4 0.0 1 4
69 2 0.0 1 2
70 2 0.0 1 1 1
71 1 0.0 1 1
72 1 0.0 1 0 1
75 2 0.0 1 1 1
76 1 0.0 1 1
86 1 0.0 1 1
88 1 0.0 1 1
89 1 0.0 1 1
90 2 0.0 1 1 1
98 1 0.0 1 1
117 1 0.0 1 1
126 1 0.0 1 1
RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-4_S13_L004_R2_001.fastq.gz
=============================================
54441 sequences processed in total
The length threshold of paired-end sequences gets evaluated later on (in the validation step)
Validate paired-end files Geoduck-NMP-gDNA-4_S13_L004_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-4_S13_L004_R2_001_trimmed.fq.gz
file_1: Geoduck-NMP-gDNA-4_S13_L004_R1_001_trimmed.fq.gz, file_2: Geoduck-NMP-gDNA-4_S13_L004_R2_001_trimmed.fq.gz
>>>>> Now validing the length of the 2 paired-end infiles: Geoduck-NMP-gDNA-4_S13_L004_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-4_S13_L004_R2_001_trimmed.fq.gz <<<<<
Writing validated paired-end read 1 reads to Geoduck-NMP-gDNA-4_S13_L004_R1_001_val_1.fq.gz
Writing validated paired-end read 2 reads to Geoduck-NMP-gDNA-4_S13_L004_R2_001_val_2.fq.gz
Total number of sequences analysed: 54441
Number of sequence pairs removed because at least one read was shorter than
the length cutoff (20 bp): 11971 (21.99%)
>>> Now running FastQC on the validated data Geoduck-NMP-gDNA-4_S13_L004_R1_001_val_1.fq.gz<<<
Started analysis of Geoduck-NMP-gDNA-4_S13_L004_R1_001_val_1.fq.gz
Approx 5% complete for Geoduck-NMP-gDNA-4_S13_L004_R1_001_val_1.fq.gz
Approx 10% complete for Geoduck-NMP-gDNA-4_S13_L004_R1_001_val_1.fq.gz
Approx 15% complete for Geoduck-NMP-gDNA-4_S13_L004_R1_001_val_1.fq.gz
Approx 20% complete for Geoduck-NMP-gDNA-4_S13_L004_R1_001_val_1.fq.gz
Approx 25% complete for Geoduck-NMP-gDNA-4_S13_L004_R1_001_val_1.fq.gz
Approx 30% complete for Geoduck-NMP-gDNA-4_S13_L004_R1_001_val_1.fq.gz
Approx 35% complete for Geoduck-NMP-gDNA-4_S13_L004_R1_001_val_1.fq.gz
Approx 40% complete for Geoduck-NMP-gDNA-4_S13_L004_R1_001_val_1.fq.gz
Approx 45% complete for Geoduck-NMP-gDNA-4_S13_L004_R1_001_val_1.fq.gz
Approx 50% complete for Geoduck-NMP-gDNA-4_S13_L004_R1_001_val_1.fq.gz
Approx 55% complete for Geoduck-NMP-gDNA-4_S13_L004_R1_001_val_1.fq.gz
Approx 60% complete for Geoduck-NMP-gDNA-4_S13_L004_R1_001_val_1.fq.gz
Approx 65% complete for Geoduck-NMP-gDNA-4_S13_L004_R1_001_val_1.fq.gz
Approx 70% complete for Geoduck-NMP-gDNA-4_S13_L004_R1_001_val_1.fq.gz
Approx 75% complete for Geoduck-NMP-gDNA-4_S13_L004_R1_001_val_1.fq.gz
Approx 80% complete for Geoduck-NMP-gDNA-4_S13_L004_R1_001_val_1.fq.gz
Approx 85% complete for Geoduck-NMP-gDNA-4_S13_L004_R1_001_val_1.fq.gz
Approx 90% complete for Geoduck-NMP-gDNA-4_S13_L004_R1_001_val_1.fq.gz
Approx 95% complete for Geoduck-NMP-gDNA-4_S13_L004_R1_001_val_1.fq.gz
Analysis complete for Geoduck-NMP-gDNA-4_S13_L004_R1_001_val_1.fq.gz
>>> Now running FastQC on the validated data Geoduck-NMP-gDNA-4_S13_L004_R2_001_val_2.fq.gz<<<
Started analysis of Geoduck-NMP-gDNA-4_S13_L004_R2_001_val_2.fq.gz
Approx 5% complete for Geoduck-NMP-gDNA-4_S13_L004_R2_001_val_2.fq.gz
Approx 10% complete for Geoduck-NMP-gDNA-4_S13_L004_R2_001_val_2.fq.gz
Approx 15% complete for Geoduck-NMP-gDNA-4_S13_L004_R2_001_val_2.fq.gz
Approx 20% complete for Geoduck-NMP-gDNA-4_S13_L004_R2_001_val_2.fq.gz
Approx 25% complete for Geoduck-NMP-gDNA-4_S13_L004_R2_001_val_2.fq.gz
Approx 30% complete for Geoduck-NMP-gDNA-4_S13_L004_R2_001_val_2.fq.gz
Approx 35% complete for Geoduck-NMP-gDNA-4_S13_L004_R2_001_val_2.fq.gz
Approx 40% complete for Geoduck-NMP-gDNA-4_S13_L004_R2_001_val_2.fq.gz
Approx 45% complete for Geoduck-NMP-gDNA-4_S13_L004_R2_001_val_2.fq.gz
Approx 50% complete for Geoduck-NMP-gDNA-4_S13_L004_R2_001_val_2.fq.gz
Approx 55% complete for Geoduck-NMP-gDNA-4_S13_L004_R2_001_val_2.fq.gz
Approx 60% complete for Geoduck-NMP-gDNA-4_S13_L004_R2_001_val_2.fq.gz
Approx 65% complete for Geoduck-NMP-gDNA-4_S13_L004_R2_001_val_2.fq.gz
Approx 70% complete for Geoduck-NMP-gDNA-4_S13_L004_R2_001_val_2.fq.gz
Approx 75% complete for Geoduck-NMP-gDNA-4_S13_L004_R2_001_val_2.fq.gz
Approx 80% complete for Geoduck-NMP-gDNA-4_S13_L004_R2_001_val_2.fq.gz
Approx 85% complete for Geoduck-NMP-gDNA-4_S13_L004_R2_001_val_2.fq.gz
Approx 90% complete for Geoduck-NMP-gDNA-4_S13_L004_R2_001_val_2.fq.gz
Approx 95% complete for Geoduck-NMP-gDNA-4_S13_L004_R2_001_val_2.fq.gz
Analysis complete for Geoduck-NMP-gDNA-4_S13_L004_R2_001_val_2.fq.gz
Deleting both intermediate output files Geoduck-NMP-gDNA-4_S13_L004_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-4_S13_L004_R2_001_trimmed.fq.gz
====================================================================================================
Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-5_S17_L005_R1_001.fastq.gz_trimming_report.txt'
SUMMARISING RUN PARAMETERS
==========================
Input filename: Geoduck-NMP-gDNA-5_S17_L005_R1_001.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.4.4_dev
Cutadapt version: 1.16
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error
rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28'
Output file(s) will be GZIP compressed
Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-5_S17_L005_R1_001_trimmed.fq.gz
>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-5_S17_L005_R1_001.fastq.gz <<<
This is cutadapt 1.16 with Python 2.7.14
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-5_S17_L005_R1_001.fastq.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 1.35 s (52 us/read; 1.15 M reads/minute).
=== Summary ===
Total reads processed: 25,880
Reads with adapters: 5,985 (23.1%)
Reads written (passing filters): 25,880 (100.0%)
Total basepairs processed: 3,517,457 bp
Quality-trimmed: 92,838 bp (2.6%)
Total written (filtered): 3,415,401 bp (97.1%)
=== Adapter 1 ===
Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 5985 times.
No.
of allowed errors:
0-9 bp: 0; 10-13 bp: 1
Bases preceding removed adapters:
  A: 33.5%
  C: 25.0%
  G: 13.7%
  T: 27.7%
  none/other: 0.1%
Overview of removed sequences
length count expect max.err error counts
1 4640 6470.0 0 4640
2 988 1617.5 0 988
3 241 404.4 0 241
4 82 101.1 0 82
5 10 25.3 0 10
6 1 6.3 0 1
7 1 1.6 0 1
10 2 0.0 1 0 2
11 1 0.0 1 1
34 1 0.0 1 1
43 1 0.0 1 0 1
53 1 0.0 1 1
62 1 0.0 1 1
67 1 0.0 1 1
70 1 0.0 1 0 1
73 3 0.0 1 0 3
75 1 0.0 1 0 1
76 4 0.0 1 4
77 1 0.0 1 1
80 1 0.0 1 0 1
94 1 0.0 1 0 1
128 1 0.0 1 0 1
151 1 0.0 1 0 1
RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-5_S17_L005_R1_001.fastq.gz
=============================================
25880 sequences processed in total
The length threshold of paired-end sequences gets evaluated later on (in the validation step)
Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-5_S17_L005_R2_001.fastq.gz_trimming_report.txt'
SUMMARISING RUN PARAMETERS
==========================
Input filename: Geoduck-NMP-gDNA-5_S17_L005_R2_001.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.4.4_dev
Cutadapt version: 1.16
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28'
Output file(s) will be GZIP compressed
Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-5_S17_L005_R2_001_trimmed.fq.gz
>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter
sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-5_S17_L005_R2_001.fastq.gz <<<
This is cutadapt 1.16 with Python 2.7.14
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-5_S17_L005_R2_001.fastq.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 0.91 s (35 us/read; 1.71 M reads/minute).
=== Summary ===
Total reads processed: 25,880
Reads with adapters: 5,305 (20.5%)
Reads written (passing filters): 25,880 (100.0%)
Total basepairs processed: 2,796,540 bp
Quality-trimmed: 750,455 bp (26.8%)
Total written (filtered): 2,035,872 bp (72.8%)
=== Adapter 1 ===
Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 5305 times.
No. of allowed errors:
0-9 bp: 0; 10-13 bp: 1
Bases preceding removed adapters:
  A: 27.9%
  C: 26.9%
  G: 15.1%
  T: 29.4%
  none/other: 0.7%
Overview of removed sequences
length count expect max.err error counts
1 3767 6470.0 0 3767
2 1155 1617.5 0 1155
3 224 404.4 0 224
4 87 101.1 0 87
5 15 25.3 0 15
6 3 6.3 0 3
7 2 1.6 0 2
11 1 0.0 1 0 1
13 1 0.0 1 0 1
14 1 0.0 1 0 1
15 1 0.0 1 1
22 1 0.0 1 1
23 1 0.0 1 0 1
24 1 0.0 1 1
30 1 0.0 1 1
44 1 0.0 1 1
46 3 0.0 1 3
48 3 0.0 1 2 1
50 1 0.0 1 1
51 4 0.0 1 3 1
53 1 0.0 1 0 1
57 1 0.0 1 1
58 1 0.0 1 1
59 1 0.0 1 1
60 1 0.0 1 1
63 1 0.0 1 0 1
65 2 0.0 1 2
66 3 0.0 1 1 2
67 5 0.0 1 0 5
68 6 0.0 1 5 1
69 5 0.0 1 3 2
70 2 0.0 1 0 2
104 1 0.0 1 1
118 1 0.0 1 0 1
149 1 0.0 1 0 1
RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-5_S17_L005_R2_001.fastq.gz
=============================================
25880 sequences processed in total
The length threshold of paired-end sequences gets evaluated later on (in the validation step)
Validate paired-end files Geoduck-NMP-gDNA-5_S17_L005_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-5_S17_L005_R2_001_trimmed.fq.gz
file_1: Geoduck-NMP-gDNA-5_S17_L005_R1_001_trimmed.fq.gz, file_2: Geoduck-NMP-gDNA-5_S17_L005_R2_001_trimmed.fq.gz
>>>>> Now validing the length of the 2 paired-end infiles:
Geoduck-NMP-gDNA-5_S17_L005_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-5_S17_L005_R2_001_trimmed.fq.gz <<<<< Writing validated paired-end read 1 reads to Geoduck-NMP-gDNA-5_S17_L005_R1_001_val_1.fq.gz Writing validated paired-end read 2 reads to Geoduck-NMP-gDNA-5_S17_L005_R2_001_val_2.fq.gz Total number of sequences analysed: 25880 Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 7597 (29.35%) >>> Now running FastQC on the validated data Geoduck-NMP-gDNA-5_S17_L005_R1_001_val_1.fq.gz<<< Started analysis of Geoduck-NMP-gDNA-5_S17_L005_R1_001_val_1.fq.gz Approx 5% complete for Geoduck-NMP-gDNA-5_S17_L005_R1_001_val_1.fq.gz Approx 10% complete for Geoduck-NMP-gDNA-5_S17_L005_R1_001_val_1.fq.gz Approx 15% complete for Geoduck-NMP-gDNA-5_S17_L005_R1_001_val_1.fq.gz Approx 20% complete for Geoduck-NMP-gDNA-5_S17_L005_R1_001_val_1.fq.gz Approx 25% complete for Geoduck-NMP-gDNA-5_S17_L005_R1_001_val_1.fq.gz Approx 30% complete for Geoduck-NMP-gDNA-5_S17_L005_R1_001_val_1.fq.gz Approx 35% complete for Geoduck-NMP-gDNA-5_S17_L005_R1_001_val_1.fq.gz Approx 40% complete for Geoduck-NMP-gDNA-5_S17_L005_R1_001_val_1.fq.gz Approx 45% complete for Geoduck-NMP-gDNA-5_S17_L005_R1_001_val_1.fq.gz Approx 50% complete for Geoduck-NMP-gDNA-5_S17_L005_R1_001_val_1.fq.gz Approx 60% complete for Geoduck-NMP-gDNA-5_S17_L005_R1_001_val_1.fq.gz Approx 65% complete for Geoduck-NMP-gDNA-5_S17_L005_R1_001_val_1.fq.gz Approx 70% complete for Geoduck-NMP-gDNA-5_S17_L005_R1_001_val_1.fq.gz Approx 75% complete for Geoduck-NMP-gDNA-5_S17_L005_R1_001_val_1.fq.gz Approx 80% complete for Geoduck-NMP-gDNA-5_S17_L005_R1_001_val_1.fq.gz Approx 85% complete for Geoduck-NMP-gDNA-5_S17_L005_R1_001_val_1.fq.gz Approx 90% complete for Geoduck-NMP-gDNA-5_S17_L005_R1_001_val_1.fq.gz Approx 95% complete for Geoduck-NMP-gDNA-5_S17_L005_R1_001_val_1.fq.gz Analysis complete for Geoduck-NMP-gDNA-5_S17_L005_R1_001_val_1.fq.gz >>> Now running FastQC on the validated 
data Geoduck-NMP-gDNA-5_S17_L005_R2_001_val_2.fq.gz<<< Started analysis of Geoduck-NMP-gDNA-5_S17_L005_R2_001_val_2.fq.gz Approx 5% complete for Geoduck-NMP-gDNA-5_S17_L005_R2_001_val_2.fq.gz Approx 10% complete for Geoduck-NMP-gDNA-5_S17_L005_R2_001_val_2.fq.gz Approx 15% complete for Geoduck-NMP-gDNA-5_S17_L005_R2_001_val_2.fq.gz Approx 20% complete for Geoduck-NMP-gDNA-5_S17_L005_R2_001_val_2.fq.gz Approx 25% complete for Geoduck-NMP-gDNA-5_S17_L005_R2_001_val_2.fq.gz Approx 30% complete for Geoduck-NMP-gDNA-5_S17_L005_R2_001_val_2.fq.gz Approx 35% complete for Geoduck-NMP-gDNA-5_S17_L005_R2_001_val_2.fq.gz Approx 40% complete for Geoduck-NMP-gDNA-5_S17_L005_R2_001_val_2.fq.gz Approx 45% complete for Geoduck-NMP-gDNA-5_S17_L005_R2_001_val_2.fq.gz Approx 50% complete for Geoduck-NMP-gDNA-5_S17_L005_R2_001_val_2.fq.gz Approx 60% complete for Geoduck-NMP-gDNA-5_S17_L005_R2_001_val_2.fq.gz Approx 65% complete for Geoduck-NMP-gDNA-5_S17_L005_R2_001_val_2.fq.gz Approx 70% complete for Geoduck-NMP-gDNA-5_S17_L005_R2_001_val_2.fq.gz Approx 75% complete for Geoduck-NMP-gDNA-5_S17_L005_R2_001_val_2.fq.gz Approx 80% complete for Geoduck-NMP-gDNA-5_S17_L005_R2_001_val_2.fq.gz Approx 85% complete for Geoduck-NMP-gDNA-5_S17_L005_R2_001_val_2.fq.gz Approx 90% complete for Geoduck-NMP-gDNA-5_S17_L005_R2_001_val_2.fq.gz Approx 95% complete for Geoduck-NMP-gDNA-5_S17_L005_R2_001_val_2.fq.gz Analysis complete for Geoduck-NMP-gDNA-5_S17_L005_R2_001_val_2.fq.gz Deleting both intermediate output files Geoduck-NMP-gDNA-5_S17_L005_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-5_S17_L005_R2_001_trimmed.fq.gz ==================================================================================================== Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R1_001.fastq.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: 
Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R1_001.fastq.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R1_001_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R1_001.fastq.gz <<< This is cutadapt 1.16 with Python 2.7.14 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R1_001.fastq.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 2.93 s (46 us/read; 1.30 M reads/minute). === Summary === Total reads processed: 63,365 Reads with adapters: 15,099 (23.8%) Reads written (passing filters): 63,365 (100.0%) Total basepairs processed: 8,424,415 bp Quality-trimmed: 237,122 bp (2.8%) Total written (filtered): 8,166,503 bp (96.9%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 15099 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 34.9% C: 24.3% G: 13.6% T: 27.2% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 1 11998 15841.2 0 11998 2 2190 3960.3 0 2190 3 621 990.1 0 621 4 205 247.5 0 205 5 45 61.9 0 45 6 9 15.5 0 9 7 3 3.9 0 3 8 1 1.0 0 1 10 1 0.1 1 0 1 12 2 0.0 1 1 1 15 3 0.0 1 1 2 17 1 0.0 1 0 1 22 1 0.0 1 0 1 23 1 0.0 1 1 26 1 0.0 1 0 1 34 1 0.0 1 0 1 36 1 0.0 1 1 41 1 0.0 1 0 1 47 1 0.0 1 1 54 2 0.0 1 1 1 55 1 0.0 1 1 63 1 0.0 1 1 69 1 0.0 1 1 70 1 0.0 1 1 71 1 0.0 1 1 89 1 0.0 1 0 1 98 1 0.0 1 0 1 102 2 0.0 1 1 1 122 1 0.0 1 0 1 147 1 0.0 1 0 1 RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R1_001.fastq.gz ============================================= 63365 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R2_001.fastq.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R2_001.fastq.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to 
Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R2_001_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R2_001.fastq.gz <<< This is cutadapt 1.16 with Python 2.7.14 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R2_001.fastq.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 2.00 s (32 us/read; 1.90 M reads/minute). === Summary === Total reads processed: 63,365 Reads with adapters: 12,831 (20.2%) Reads written (passing filters): 63,365 (100.0%) Total basepairs processed: 7,558,375 bp Quality-trimmed: 2,143,772 bp (28.4%) Total written (filtered): 5,393,560 bp (71.4%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 12831 times. No. of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 30.6% C: 27.6% G: 13.4% T: 27.8% none/other: 0.5% Overview of removed sequences length count expect max.err error counts 1 9589 15841.2 0 9589 2 2369 3960.3 0 2369 3 542 990.1 0 542 4 203 247.5 0 203 5 36 61.9 0 36 6 6 15.5 0 6 7 3 3.9 0 3 8 1 1.0 0 1 9 1 0.2 0 1 10 2 0.1 1 0 2 11 1 0.0 1 1 12 1 0.0 1 0 1 13 1 0.0 1 0 1 14 4 0.0 1 2 2 15 1 0.0 1 1 17 1 0.0 1 1 18 1 0.0 1 1 20 1 0.0 1 1 23 3 0.0 1 1 2 24 2 0.0 1 0 2 33 1 0.0 1 0 1 34 1 0.0 1 0 1 38 2 0.0 1 2 39 1 0.0 1 1 41 2 0.0 1 2 42 1 0.0 1 1 44 1 0.0 1 0 1 45 2 0.0 1 2 46 1 0.0 1 0 1 48 1 0.0 1 1 49 3 0.0 1 3 50 1 0.0 1 1 51 8 0.0 1 6 2 52 1 0.0 1 0 1 53 2 0.0 1 1 1 55 2 0.0 1 1 1 56 1 0.0 1 1 57 5 0.0 1 1 4 58 1 0.0 1 1 61 1 0.0 1 1 62 1 0.0 1 1 63 1 0.0 1 0 1 64 1 0.0 1 1 66 7 0.0 1 5 2 67 2 0.0 1 2 68 7 0.0 1 6 1 69 1 0.0 1 1 77 1 0.0 1 1 78 1 0.0 1 1 121 1 0.0 1 0 1 122 1 0.0 1 1 130 1 0.0 1 1 RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R2_001.fastq.gz ============================================= 63365 sequences processed in total 
The length threshold of paired-end sequences gets evaluated later on (in the validation step) Validate paired-end files Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R2_001_trimmed.fq.gz file_1: Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R1_001_trimmed.fq.gz, file_2: Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R2_001_trimmed.fq.gz >>>>> Now validing the length of the 2 paired-end infiles: Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R2_001_trimmed.fq.gz <<<<< Writing validated paired-end read 1 reads to Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R1_001_val_1.fq.gz Writing validated paired-end read 2 reads to Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R2_001_val_2.fq.gz Total number of sequences analysed: 63365 Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 17074 (26.95%) >>> Now running FastQC on the validated data Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R1_001_val_1.fq.gz<<< Started analysis of Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R1_001_val_1.fq.gz Approx 5% complete for Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R1_001_val_1.fq.gz Approx 10% complete for Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R1_001_val_1.fq.gz Approx 15% complete for Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R1_001_val_1.fq.gz Approx 20% complete for Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R1_001_val_1.fq.gz Approx 25% complete for Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R1_001_val_1.fq.gz Approx 30% complete for Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R1_001_val_1.fq.gz Approx 35% complete for Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R1_001_val_1.fq.gz Approx 40% complete for Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R1_001_val_1.fq.gz Approx 45% complete for Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R1_001_val_1.fq.gz Approx 50% complete for Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R1_001_val_1.fq.gz Approx 55% complete for Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R1_001_val_1.fq.gz Approx 60% complete for Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R1_001_val_1.fq.gz Approx 
65% complete for Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R1_001_val_1.fq.gz Approx 70% complete for Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R1_001_val_1.fq.gz Approx 75% complete for Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R1_001_val_1.fq.gz Approx 80% complete for Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R1_001_val_1.fq.gz Approx 85% complete for Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R1_001_val_1.fq.gz Approx 90% complete for Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R1_001_val_1.fq.gz Approx 95% complete for Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R1_001_val_1.fq.gz Analysis complete for Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R1_001_val_1.fq.gz >>> Now running FastQC on the validated data Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R2_001_val_2.fq.gz<<< Started analysis of Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R2_001_val_2.fq.gz Approx 5% complete for Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R2_001_val_2.fq.gz Approx 10% complete for Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R2_001_val_2.fq.gz Approx 15% complete for Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R2_001_val_2.fq.gz Approx 20% complete for Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R2_001_val_2.fq.gz Approx 25% complete for Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R2_001_val_2.fq.gz Approx 30% complete for Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R2_001_val_2.fq.gz Approx 35% complete for Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R2_001_val_2.fq.gz Approx 40% complete for Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R2_001_val_2.fq.gz Approx 45% complete for Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R2_001_val_2.fq.gz Approx 50% complete for Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R2_001_val_2.fq.gz Approx 55% complete for Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R2_001_val_2.fq.gz Approx 60% complete for Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R2_001_val_2.fq.gz Approx 65% complete for Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R2_001_val_2.fq.gz Approx 70% complete for Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R2_001_val_2.fq.gz Approx 75% complete for Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R2_001_val_2.fq.gz Approx 80% complete for 
Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R2_001_val_2.fq.gz Approx 85% complete for Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R2_001_val_2.fq.gz Approx 90% complete for Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R2_001_val_2.fq.gz Approx 95% complete for Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R2_001_val_2.fq.gz Analysis complete for Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R2_001_val_2.fq.gz Deleting both intermediate output files Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-5to7kb-1_S3_L001_R2_001_trimmed.fq.gz ==================================================================================================== Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R1_001.fastq.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R1_001.fastq.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R1_001_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R1_001.fastq.gz <<< This is cutadapt 1.16 with Python 2.7.14 Command line parameters: -f 
fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R1_001.fastq.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 3.03 s (44 us/read; 1.36 M reads/minute). === Summary === Total reads processed: 68,454 Reads with adapters: 16,462 (24.0%) Reads written (passing filters): 68,454 (100.0%) Total basepairs processed: 9,072,717 bp Quality-trimmed: 256,864 bp (2.8%) Total written (filtered): 8,793,704 bp (96.9%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 16462 times. No. of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 34.6% C: 24.3% G: 13.5% T: 27.5% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 1 13052 17113.5 0 13052 2 2407 4278.4 0 2407 3 685 1069.6 0 685 4 238 267.4 0 238 5 46 66.8 0 46 6 11 16.7 0 11 7 2 4.2 0 2 8 1 1.0 0 1 11 2 0.0 1 1 1 15 1 0.0 1 0 1 24 1 0.0 1 1 25 1 0.0 1 0 1 34 2 0.0 1 2 36 1 0.0 1 0 1 38 1 0.0 1 1 53 2 0.0 1 1 1 57 1 0.0 1 1 59 1 0.0 1 0 1 60 1 0.0 1 0 1 62 1 0.0 1 1 66 1 0.0 1 1 68 1 0.0 1 0 1 74 1 0.0 1 0 1 81 1 0.0 1 0 1 97 1 0.0 1 1 RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R1_001.fastq.gz ============================================= 68454 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R2_001.fastq.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R2_001.fastq.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) 
Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R2_001_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R2_001.fastq.gz <<< This is cutadapt 1.16 with Python 2.7.14 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R2_001.fastq.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 2.17 s (32 us/read; 1.89 M reads/minute). === Summary === Total reads processed: 68,454 Reads with adapters: 14,054 (20.5%) Reads written (passing filters): 68,454 (100.0%) Total basepairs processed: 8,215,195 bp Quality-trimmed: 2,193,580 bp (26.7%) Total written (filtered): 5,997,962 bp (73.0%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 14054 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 30.9% C: 26.8% G: 13.4% T: 28.5% none/other: 0.5% Overview of removed sequences length count expect max.err error counts 1 10462 17113.5 0 10462 2 2650 4278.4 0 2650 3 574 1069.6 0 574 4 208 267.4 0 208 5 43 66.8 0 43 6 8 16.7 0 8 7 7 4.2 0 7 8 1 1.0 0 1 10 4 0.1 1 3 1 12 2 0.0 1 0 2 14 1 0.0 1 0 1 15 2 0.0 1 1 1 17 2 0.0 1 2 20 2 0.0 1 1 1 23 1 0.0 1 0 1 24 5 0.0 1 5 25 2 0.0 1 1 1 26 2 0.0 1 1 1 28 2 0.0 1 1 1 29 2 0.0 1 1 1 30 1 0.0 1 1 31 2 0.0 1 1 1 32 2 0.0 1 0 2 33 1 0.0 1 1 36 2 0.0 1 2 37 1 0.0 1 1 38 1 0.0 1 1 39 1 0.0 1 1 42 3 0.0 1 1 2 45 2 0.0 1 1 1 48 2 0.0 1 2 49 3 0.0 1 3 51 11 0.0 1 6 5 53 1 0.0 1 0 1 54 2 0.0 1 1 1 55 4 0.0 1 4 57 1 0.0 1 1 58 1 0.0 1 0 1 62 1 0.0 1 1 63 1 0.0 1 1 65 1 0.0 1 1 66 2 0.0 1 2 67 2 0.0 1 1 1 68 8 0.0 1 7 1 69 2 0.0 1 1 1 70 1 0.0 1 1 74 1 0.0 1 1 75 1 0.0 1 1 76 1 0.0 1 1 78 1 0.0 1 1 80 1 0.0 1 1 85 1 0.0 1 0 1 86 1 0.0 1 1 88 1 0.0 1 1 94 1 0.0 1 1 98 1 0.0 1 0 1 100 1 0.0 1 0 1 102 1 0.0 1 0 1 111 1 0.0 1 1 112 1 0.0 1 0 1 118 1 0.0 1 0 1 RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R2_001.fastq.gz ============================================= 68454 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Validate paired-end files Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R2_001_trimmed.fq.gz file_1: Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R1_001_trimmed.fq.gz, file_2: Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R2_001_trimmed.fq.gz >>>>> Now validing the length of the 2 paired-end infiles: Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R2_001_trimmed.fq.gz <<<<< Writing validated paired-end read 1 reads to Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R1_001_val_1.fq.gz Writing validated paired-end read 2 reads to Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R2_001_val_2.fq.gz Total number of 
sequences analysed: 68454 Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 17698 (25.85%) >>> Now running FastQC on the validated data Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R1_001_val_1.fq.gz<<< Started analysis of Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R1_001_val_1.fq.gz Approx 5% complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R1_001_val_1.fq.gz Approx 10% complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R1_001_val_1.fq.gz Approx 15% complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R1_001_val_1.fq.gz Approx 20% complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R1_001_val_1.fq.gz Approx 25% complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R1_001_val_1.fq.gz Approx 30% complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R1_001_val_1.fq.gz Approx 35% complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R1_001_val_1.fq.gz Approx 40% complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R1_001_val_1.fq.gz Approx 45% complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R1_001_val_1.fq.gz Approx 50% complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R1_001_val_1.fq.gz Approx 55% complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R1_001_val_1.fq.gz Approx 60% complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R1_001_val_1.fq.gz Approx 65% complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R1_001_val_1.fq.gz Approx 70% complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R1_001_val_1.fq.gz Approx 75% complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R1_001_val_1.fq.gz Approx 80% complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R1_001_val_1.fq.gz Approx 85% complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R1_001_val_1.fq.gz Approx 90% complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R1_001_val_1.fq.gz Approx 95% complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R1_001_val_1.fq.gz Analysis complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R1_001_val_1.fq.gz >>> Now running FastQC on the validated data Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R2_001_val_2.fq.gz<<< Started analysis of 
Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R2_001_val_2.fq.gz Approx 5% complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R2_001_val_2.fq.gz Approx 10% complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R2_001_val_2.fq.gz Approx 15% complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R2_001_val_2.fq.gz Approx 20% complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R2_001_val_2.fq.gz Approx 25% complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R2_001_val_2.fq.gz Approx 30% complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R2_001_val_2.fq.gz Approx 35% complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R2_001_val_2.fq.gz Approx 40% complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R2_001_val_2.fq.gz Approx 45% complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R2_001_val_2.fq.gz Approx 50% complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R2_001_val_2.fq.gz Approx 55% complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R2_001_val_2.fq.gz Approx 60% complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R2_001_val_2.fq.gz Approx 65% complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R2_001_val_2.fq.gz Approx 70% complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R2_001_val_2.fq.gz Approx 75% complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R2_001_val_2.fq.gz Approx 80% complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R2_001_val_2.fq.gz Approx 85% complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R2_001_val_2.fq.gz Approx 90% complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R2_001_val_2.fq.gz Approx 95% complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R2_001_val_2.fq.gz Analysis complete for Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R2_001_val_2.fq.gz Deleting both intermediate output files Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-5to7kb-2_S7_L002_R2_001_trimmed.fq.gz ==================================================================================================== Writing report to 
'/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R1_001.fastq.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R1_001.fastq.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R1_001_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R1_001.fastq.gz <<< This is cutadapt 1.16 with Python 2.7.14 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R1_001.fastq.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 40.71 s (41 us/read; 1.46 M reads/minute). === Summary === Total reads processed: 991,000 Reads with adapters: 224,438 (22.6%) Reads written (passing filters): 991,000 (100.0%) Total basepairs processed: 132,025,856 bp Quality-trimmed: 3,692,130 bp (2.8%) Total written (filtered): 128,002,352 bp (97.0%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 224438 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 32.6% C: 25.4% G: 13.7% T: 28.3% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 1 174859 247750.0 0 174859 2 35877 61937.5 0 35877 3 9006 15484.4 0 9006 4 3130 3871.1 0 3130 5 617 967.8 0 617 6 132 241.9 0 132 7 33 60.5 0 33 8 13 15.1 0 13 9 8 3.8 0 3 5 10 26 0.9 1 3 23 11 7 0.2 1 2 5 12 13 0.1 1 8 5 13 15 0.0 1 10 5 14 7 0.0 1 4 3 15 8 0.0 1 5 3 16 4 0.0 1 1 3 17 4 0.0 1 1 3 18 20 0.0 1 9 11 19 13 0.0 1 5 8 20 3 0.0 1 2 1 21 4 0.0 1 2 2 22 9 0.0 1 4 5 23 10 0.0 1 4 6 24 8 0.0 1 5 3 25 8 0.0 1 4 4 26 3 0.0 1 1 2 27 10 0.0 1 8 2 28 20 0.0 1 6 14 29 9 0.0 1 3 6 30 11 0.0 1 3 8 31 8 0.0 1 2 6 32 10 0.0 1 6 4 33 4 0.0 1 2 2 34 17 0.0 1 10 7 35 6 0.0 1 4 2 36 6 0.0 1 1 5 37 4 0.0 1 1 3 38 13 0.0 1 9 4 39 12 0.0 1 10 2 40 10 0.0 1 4 6 41 14 0.0 1 8 6 42 13 0.0 1 8 5 43 4 0.0 1 3 1 44 2 0.0 1 2 45 5 0.0 1 0 5 46 5 0.0 1 2 3 47 2 0.0 1 2 48 10 0.0 1 8 2 49 9 0.0 1 7 2 50 2 0.0 1 1 1 51 3 0.0 1 2 1 52 5 0.0 1 4 1 53 7 0.0 1 3 4 54 6 0.0 1 4 2 55 19 0.0 1 13 6 56 6 0.0 1 1 5 57 9 0.0 1 5 4 58 5 0.0 1 5 59 6 0.0 1 4 2 60 8 0.0 1 6 2 61 1 0.0 1 1 62 9 0.0 1 3 6 63 7 0.0 1 6 1 64 3 0.0 1 1 2 65 9 0.0 1 7 2 66 16 0.0 1 8 8 67 12 0.0 1 8 4 68 5 0.0 1 4 1 69 11 0.0 1 9 2 70 7 0.0 1 5 2 71 8 0.0 1 7 1 72 10 0.0 1 6 4 73 13 0.0 1 10 3 74 12 0.0 1 8 4 75 31 0.0 1 12 19 76 37 0.0 1 29 8 77 16 0.0 1 9 7 78 9 0.0 1 4 5 79 9 0.0 1 3 6 80 7 0.0 1 5 2 81 3 0.0 1 1 2 82 2 0.0 1 1 1 83 2 0.0 1 2 85 3 0.0 1 2 1 86 3 0.0 1 0 3 87 2 0.0 1 1 1 88 2 0.0 1 0 2 90 5 0.0 1 1 4 91 2 0.0 1 0 2 92 1 0.0 1 1 93 1 0.0 1 0 1 94 2 0.0 1 2 95 2 0.0 1 0 2 96 1 0.0 1 0 1 97 2 0.0 1 2 99 2 0.0 1 0 2 100 1 0.0 1 1 101 4 0.0 1 0 4 102 3 0.0 1 2 1 105 1 0.0 1 1 106 1 0.0 1 1 107 1 0.0 1 0 1 109 1 0.0 1 1 111 1 0.0 1 0 1 112 2 0.0 1 0 2 113 2 0.0 1 0 2 114 2 0.0 1 2 115 1 0.0 1 1 116 2 0.0 1 0 2 117 1 0.0 1 0 1 119 1 0.0 1 0 1 121 2 0.0 1 0 2 123 4 0.0 1 2 2 124 1 0.0 1 1 125 2 0.0 1 2 126 2 0.0 
1 1 1 130 2 0.0 1 1 1 131 2 0.0 1 2 134 1 0.0 1 1 137 1 0.0 1 0 1 139 1 0.0 1 0 1 141 1 0.0 1 0 1 142 1 0.0 1 1 143 1 0.0 1 0 1 145 1 0.0 1 1 148 2 0.0 1 1 1 149 1 0.0 1 0 1 150 3 0.0 1 2 1 151 2 0.0 1 1 1 RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R1_001.fastq.gz ============================================= 991000 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R2_001.fastq.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R2_001.fastq.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R2_001_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R2_001.fastq.gz <<< This is cutadapt 1.16 with Python 2.7.14 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R2_001.fastq.gz Running on 1 core Trimming 1 adapter with at most 10.0% 
errors in single-end mode ... Finished in 31.06 s (31 us/read; 1.91 M reads/minute). === Summary === Total reads processed: 991,000 Reads with adapters: 208,477 (21.0%) Reads written (passing filters): 991,000 (100.0%) Total basepairs processed: 116,898,739 bp Quality-trimmed: 22,202,813 bp (19.0%) Total written (filtered): 94,271,187 bp (80.6%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 208477 times. No. of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 29.5% C: 28.3% G: 14.0% T: 27.8% none/other: 0.3% Overview of removed sequences length count expect max.err error counts 1 152478 247750.0 0 152478 2 40714 61937.5 0 40714 3 8464 15484.4 0 8464 4 3191 3871.1 0 3191 5 653 967.8 0 653 6 108 241.9 0 108 7 66 60.5 0 66 8 41 15.1 0 41 9 18 3.8 0 10 8 10 44 0.9 1 20 24 11 13 0.2 1 3 10 12 32 0.1 1 17 15 13 19 0.0 1 9 10 14 72 0.0 1 38 34 15 22 0.0 1 10 12 16 12 0.0 1 6 6 17 40 0.0 1 22 18 18 11 0.0 1 5 6 19 31 0.0 1 19 12 20 19 0.0 1 13 6 21 1 0.0 1 1 22 6 0.0 1 5 1 23 20 0.0 1 12 8 24 78 0.0 1 43 35 25 40 0.0 1 16 24 26 31 0.0 1 19 12 27 2 0.0 1 1 1 28 30 0.0 1 20 10 29 5 0.0 1 2 3 30 32 0.0 1 19 13 31 5 0.0 1 2 3 32 42 0.0 1 28 14 33 48 0.0 1 33 15 34 7 0.0 1 2 5 35 20 0.0 1 5 15 36 19 0.0 1 15 4 37 36 0.0 1 27 9 38 10 0.0 1 7 3 39 18 0.0 1 9 9 40 11 0.0 1 5 6 41 33 0.0 1 18 15 42 69 0.0 1 47 22 43 15 0.0 1 8 7 44 25 0.0 1 14 11 45 58 0.0 1 37 21 46 18 0.0 1 10 8 47 14 0.0 1 8 6 48 57 0.0 1 37 20 49 42 0.0 1 26 16 50 24 0.0 1 15 9 51 152 0.0 1 107 45 52 38 0.0 1 21 17 53 23 0.0 1 18 5 54 14 0.0 1 12 2 55 46 0.0 1 34 12 56 16 0.0 1 12 4 57 37 0.0 1 25 12 58 38 0.0 1 28 10 59 22 0.0 1 15 7 60 41 0.0 1 27 14 61 34 0.0 1 20 14 62 33 0.0 1 19 14 63 39 0.0 1 26 13 64 52 0.0 1 43 9 65 51 0.0 1 42 9 66 49 0.0 1 38 11 67 88 0.0 1 65 23 68 274 0.0 1 237 37 69 121 0.0 1 108 13 70 50 0.0 1 42 8 71 27 0.0 1 22 5 72 9 0.0 1 5 4 73 16 0.0 1 7 9 74 17 0.0 1 13 4 75 9 0.0 1 6 3 76 7 0.0 1 2 5 77 7 0.0 1 6 1 78 9 0.0 1 
7 2 79 14 0.0 1 7 7 80 9 0.0 1 7 2 81 8 0.0 1 4 4 82 5 0.0 1 3 2 83 11 0.0 1 7 4 84 7 0.0 1 7 85 7 0.0 1 4 3 86 10 0.0 1 6 4 87 12 0.0 1 6 6 88 4 0.0 1 2 2 89 5 0.0 1 3 2 90 7 0.0 1 3 4 91 5 0.0 1 2 3 92 4 0.0 1 3 1 93 3 0.0 1 1 2 94 6 0.0 1 5 1 95 4 0.0 1 3 1 96 7 0.0 1 5 2 97 9 0.0 1 5 4 98 9 0.0 1 6 3 99 3 0.0 1 2 1 100 8 0.0 1 1 7 101 10 0.0 1 2 8 102 5 0.0 1 3 2 103 4 0.0 1 2 2 104 6 0.0 1 3 3 105 5 0.0 1 4 1 106 3 0.0 1 1 2 107 4 0.0 1 0 4 108 5 0.0 1 3 2 109 3 0.0 1 2 1 110 5 0.0 1 3 2 111 4 0.0 1 1 3 112 3 0.0 1 2 1 113 8 0.0 1 4 4 114 4 0.0 1 2 2 115 3 0.0 1 3 116 6 0.0 1 0 6 117 5 0.0 1 3 2 118 3 0.0 1 1 2 119 2 0.0 1 1 1 120 6 0.0 1 4 2 121 2 0.0 1 2 122 6 0.0 1 2 4 123 4 0.0 1 2 2 124 3 0.0 1 1 2 125 1 0.0 1 0 1 126 2 0.0 1 1 1 127 5 0.0 1 1 4 128 2 0.0 1 1 1 129 1 0.0 1 1 130 3 0.0 1 2 1 131 1 0.0 1 0 1 132 1 0.0 1 0 1 133 1 0.0 1 1 134 2 0.0 1 0 2 138 1 0.0 1 1 142 1 0.0 1 0 1 145 1 0.0 1 0 1 146 1 0.0 1 0 1 150 2 0.0 1 0 2 151 3 0.0 1 1 2 RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R2_001.fastq.gz ============================================= 991000 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Validate paired-end files Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R2_001_trimmed.fq.gz file_1: Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R1_001_trimmed.fq.gz, file_2: Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R2_001_trimmed.fq.gz >>>>> Now validing the length of the 2 paired-end infiles: Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R2_001_trimmed.fq.gz <<<<< Writing validated paired-end read 1 reads to Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R1_001_val_1.fq.gz Writing validated paired-end read 2 reads to Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R2_001_val_2.fq.gz Total number of sequences analysed: 991000 Number of sequence pairs removed because at least one read was shorter than the 
length cutoff (20 bp): 222638 (22.47%)
>>> Now running FastQC on the validated data Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R1_001_val_1.fq.gz<<<
Started analysis of Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R1_001_val_1.fq.gz
Approx 5% complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R1_001_val_1.fq.gz
Approx 10% complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R1_001_val_1.fq.gz
Approx 15% complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R1_001_val_1.fq.gz
Approx 20% complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R1_001_val_1.fq.gz
Approx 25% complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R1_001_val_1.fq.gz
Approx 30% complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R1_001_val_1.fq.gz
Approx 35% complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R1_001_val_1.fq.gz
Approx 40% complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R1_001_val_1.fq.gz
Approx 45% complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R1_001_val_1.fq.gz
Approx 50% complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R1_001_val_1.fq.gz
Approx 55% complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R1_001_val_1.fq.gz
Approx 60% complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R1_001_val_1.fq.gz
Approx 65% complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R1_001_val_1.fq.gz
Approx 70% complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R1_001_val_1.fq.gz
Approx 75% complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R1_001_val_1.fq.gz
Approx 80% complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R1_001_val_1.fq.gz
Approx 85% complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R1_001_val_1.fq.gz
Approx 90% complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R1_001_val_1.fq.gz
Approx 95% complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R1_001_val_1.fq.gz
Analysis complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R1_001_val_1.fq.gz
>>> Now running FastQC on the validated data Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R2_001_val_2.fq.gz<<<
Started analysis of Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R2_001_val_2.fq.gz
Approx 5% complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R2_001_val_2.fq.gz
Approx 10% complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R2_001_val_2.fq.gz
Approx 15% complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R2_001_val_2.fq.gz
Approx 20% complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R2_001_val_2.fq.gz
Approx 25% complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R2_001_val_2.fq.gz
Approx 30% complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R2_001_val_2.fq.gz
Approx 35% complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R2_001_val_2.fq.gz
Approx 40% complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R2_001_val_2.fq.gz
Approx 45% complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R2_001_val_2.fq.gz
Approx 50% complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R2_001_val_2.fq.gz
Approx 55% complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R2_001_val_2.fq.gz
Approx 60% complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R2_001_val_2.fq.gz
Approx 65% complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R2_001_val_2.fq.gz
Approx 70% complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R2_001_val_2.fq.gz
Approx 75% complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R2_001_val_2.fq.gz
Approx 80% complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R2_001_val_2.fq.gz
Approx 85% complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R2_001_val_2.fq.gz
Approx 90% complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R2_001_val_2.fq.gz
Approx 95% complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R2_001_val_2.fq.gz
Analysis complete for Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R2_001_val_2.fq.gz
Deleting both intermediate output files Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-5to7kb-3_S11_L003_R2_001_trimmed.fq.gz
====================================================================================================
Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R1_001.fastq.gz_trimming_report.txt'
SUMMARISING RUN PARAMETERS
==========================
Input filename: Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R1_001.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.4.4_dev
Cutadapt version: 1.16
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28'
Output file(s) will be GZIP compressed
Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R1_001_trimmed.fq.gz
>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R1_001.fastq.gz <<<
This is cutadapt 1.16 with Python 2.7.14
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R1_001.fastq.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 33.99 s (41 us/read; 1.47 M reads/minute).
=== Summary ===
Total reads processed: 833,817
Reads with adapters: 187,573 (22.5%)
Reads written (passing filters): 833,817 (100.0%)
Total basepairs processed: 110,663,660 bp
Quality-trimmed: 3,368,816 bp (3.0%)
Total written (filtered): 107,020,491 bp (96.7%)
=== Adapter 1 ===
Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 187573 times.
No.
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 32.5% C: 25.6% G: 13.6% T: 28.3% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 1 146090 208454.2 0 146090 2 30165 52113.6 0 30165 3 7425 13028.4 0 7425 4 2678 3257.1 0 2678 5 528 814.3 0 528 6 85 203.6 0 85 7 35 50.9 0 35 8 13 12.7 0 13 9 8 3.2 0 2 6 10 20 0.8 1 0 20 11 2 0.2 1 1 1 12 3 0.0 1 1 2 13 7 0.0 1 4 3 14 5 0.0 1 1 4 15 6 0.0 1 2 4 16 1 0.0 1 1 17 2 0.0 1 1 1 18 17 0.0 1 8 9 19 10 0.0 1 4 6 20 6 0.0 1 1 5 21 2 0.0 1 0 2 22 4 0.0 1 2 2 23 1 0.0 1 1 24 4 0.0 1 2 2 25 2 0.0 1 2 26 1 0.0 1 1 27 2 0.0 1 0 2 28 9 0.0 1 4 5 29 1 0.0 1 0 1 30 5 0.0 1 4 1 31 4 0.0 1 1 3 32 6 0.0 1 5 1 33 5 0.0 1 4 1 34 5 0.0 1 3 2 35 3 0.0 1 1 2 36 9 0.0 1 4 5 37 6 0.0 1 4 2 38 5 0.0 1 4 1 39 11 0.0 1 5 6 40 3 0.0 1 1 2 41 11 0.0 1 7 4 42 4 0.0 1 4 43 13 0.0 1 9 4 44 4 0.0 1 2 2 45 2 0.0 1 0 2 46 3 0.0 1 2 1 47 3 0.0 1 1 2 48 8 0.0 1 5 3 49 8 0.0 1 7 1 50 3 0.0 1 1 2 51 4 0.0 1 2 2 52 3 0.0 1 0 3 53 4 0.0 1 3 1 54 4 0.0 1 2 2 55 15 0.0 1 11 4 56 4 0.0 1 2 2 57 6 0.0 1 2 4 58 2 0.0 1 2 59 11 0.0 1 9 2 60 9 0.0 1 3 6 61 2 0.0 1 2 62 9 0.0 1 7 2 63 13 0.0 1 6 7 64 7 0.0 1 7 65 5 0.0 1 5 66 12 0.0 1 9 3 67 10 0.0 1 4 6 68 5 0.0 1 5 69 4 0.0 1 3 1 70 2 0.0 1 2 71 6 0.0 1 5 1 72 5 0.0 1 3 2 73 13 0.0 1 9 4 74 10 0.0 1 9 1 75 16 0.0 1 6 10 76 18 0.0 1 10 8 77 18 0.0 1 13 5 78 14 0.0 1 10 4 79 11 0.0 1 6 5 80 4 0.0 1 3 1 81 4 0.0 1 3 1 82 2 0.0 1 0 2 83 1 0.0 1 1 84 1 0.0 1 1 85 1 0.0 1 1 86 3 0.0 1 3 87 1 0.0 1 0 1 88 2 0.0 1 2 89 4 0.0 1 4 91 2 0.0 1 1 1 92 1 0.0 1 1 93 1 0.0 1 0 1 94 2 0.0 1 1 1 95 1 0.0 1 0 1 96 2 0.0 1 1 1 98 1 0.0 1 0 1 99 2 0.0 1 0 2 102 3 0.0 1 1 2 103 2 0.0 1 2 104 2 0.0 1 2 105 1 0.0 1 0 1 106 2 0.0 1 1 1 109 4 0.0 1 2 2 110 2 0.0 1 0 2 111 1 0.0 1 1 112 2 0.0 1 0 2 114 1 0.0 1 0 1 116 1 0.0 1 0 1 118 1 0.0 1 0 1 120 2 0.0 1 1 1 121 1 0.0 1 0 1 123 1 0.0 1 0 1 124 1 0.0 1 0 1 127 1 0.0 1 1 128 2 0.0 1 0 2 129 1 0.0 1 1 132 1 0.0 1 1 137 1 0.0 1 
0 1
144 1 0.0 1 0 1
148 1 0.0 1 0 1
150 4 0.0 1 1 3
151 4 0.0 1 3 1
RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R1_001.fastq.gz
=============================================
833817 sequences processed in total
The length threshold of paired-end sequences gets evaluated later on (in the validation step)
Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R2_001.fastq.gz_trimming_report.txt'
SUMMARISING RUN PARAMETERS
==========================
Input filename: Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R2_001.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.4.4_dev
Cutadapt version: 1.16
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28'
Output file(s) will be GZIP compressed
Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R2_001_trimmed.fq.gz
>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R2_001.fastq.gz <<<
This is cutadapt 1.16 with Python 2.7.14
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R2_001.fastq.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 27.36 s (33 us/read; 1.83 M reads/minute).
=== Summary === Total reads processed: 833,817 Reads with adapters: 177,526 (21.3%) Reads written (passing filters): 833,817 (100.0%) Total basepairs processed: 99,706,719 bp Quality-trimmed: 18,778,989 bp (18.8%) Total written (filtered): 80,583,817 bp (80.8%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 177526 times. No. of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 29.5% C: 28.3% G: 14.0% T: 27.8% none/other: 0.3% Overview of removed sequences length count expect max.err error counts 1 130392 208454.2 0 130392 2 34656 52113.6 0 34656 3 7270 13028.4 0 7270 4 2532 3257.1 0 2532 5 511 814.3 0 511 6 85 203.6 0 85 7 74 50.9 0 74 8 30 12.7 0 30 9 13 3.2 0 8 5 10 30 0.8 1 10 20 11 20 0.2 1 7 13 12 20 0.0 1 8 12 13 17 0.0 1 8 9 14 48 0.0 1 32 16 15 9 0.0 1 5 4 16 6 0.0 1 3 3 17 22 0.0 1 14 8 18 6 0.0 1 2 4 19 20 0.0 1 8 12 20 13 0.0 1 8 5 21 5 0.0 1 2 3 22 4 0.0 1 1 3 23 25 0.0 1 15 10 24 38 0.0 1 21 17 25 24 0.0 1 16 8 26 16 0.0 1 9 7 27 3 0.0 1 2 1 28 21 0.0 1 13 8 29 2 0.0 1 2 30 37 0.0 1 17 20 31 7 0.0 1 3 4 32 24 0.0 1 14 10 33 19 0.0 1 11 8 34 4 0.0 1 2 2 35 22 0.0 1 9 13 36 19 0.0 1 10 9 37 9 0.0 1 6 3 38 10 0.0 1 3 7 39 16 0.0 1 9 7 40 8 0.0 1 3 5 41 22 0.0 1 12 10 42 44 0.0 1 28 16 43 6 0.0 1 3 3 44 16 0.0 1 9 7 45 53 0.0 1 35 18 46 17 0.0 1 12 5 47 6 0.0 1 4 2 48 43 0.0 1 30 13 49 23 0.0 1 15 8 50 17 0.0 1 9 8 51 111 0.0 1 82 29 52 21 0.0 1 14 7 53 17 0.0 1 13 4 54 14 0.0 1 10 4 55 30 0.0 1 19 11 56 29 0.0 1 17 12 57 20 0.0 1 14 6 58 29 0.0 1 23 6 59 21 0.0 1 14 7 60 23 0.0 1 15 8 61 21 0.0 1 13 8 62 21 0.0 1 14 7 63 30 0.0 1 21 9 64 28 0.0 1 22 6 65 31 0.0 1 25 6 66 50 0.0 1 38 12 67 73 0.0 1 45 28 68 211 0.0 1 185 26 69 71 0.0 1 62 9 70 32 0.0 1 27 5 71 25 0.0 1 21 4 72 7 0.0 1 5 2 73 8 0.0 1 6 2 74 13 0.0 1 7 6 75 5 0.0 1 2 3 76 5 0.0 1 1 4 77 10 0.0 1 4 6 78 6 0.0 1 4 2 79 2 0.0 1 2 80 5 0.0 1 2 3 81 6 0.0 1 3 3 82 4 0.0 1 3 1 83 9 0.0 1 5 4 84 7 0.0 1 2 5 85 6 0.0 1 6 86 6 0.0 1 2 4 87 
3 0.0 1 1 2 88 8 0.0 1 6 2 89 6 0.0 1 3 3 90 7 0.0 1 6 1 91 8 0.0 1 5 3 92 3 0.0 1 2 1 93 4 0.0 1 2 2 94 3 0.0 1 2 1 95 6 0.0 1 3 3 96 7 0.0 1 4 3 97 1 0.0 1 1 98 2 0.0 1 1 1 99 2 0.0 1 0 2 100 5 0.0 1 1 4 101 4 0.0 1 3 1 102 7 0.0 1 5 2 103 2 0.0 1 0 2 104 2 0.0 1 0 2 105 3 0.0 1 2 1 106 5 0.0 1 3 2 107 8 0.0 1 5 3 108 4 0.0 1 2 2 109 6 0.0 1 5 1 110 7 0.0 1 2 5 111 2 0.0 1 2 112 3 0.0 1 3 113 1 0.0 1 0 1 114 7 0.0 1 3 4 115 1 0.0 1 1 116 5 0.0 1 5 117 5 0.0 1 1 4 118 1 0.0 1 1 119 2 0.0 1 2 120 4 0.0 1 1 3 121 3 0.0 1 2 1 122 3 0.0 1 3 123 5 0.0 1 3 2 124 3 0.0 1 2 1 126 3 0.0 1 2 1 127 1 0.0 1 0 1 128 4 0.0 1 1 3 129 1 0.0 1 0 1 130 2 0.0 1 1 1 131 3 0.0 1 1 2 133 1 0.0 1 0 1 137 1 0.0 1 0 1 139 1 0.0 1 0 1 142 1 0.0 1 0 1 143 2 0.0 1 0 2 149 1 0.0 1 1 150 4 0.0 1 2 2 151 2 0.0 1 1 1 RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R2_001.fastq.gz ============================================= 833817 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Validate paired-end files Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R2_001_trimmed.fq.gz file_1: Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R1_001_trimmed.fq.gz, file_2: Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R2_001_trimmed.fq.gz >>>>> Now validing the length of the 2 paired-end infiles: Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R2_001_trimmed.fq.gz <<<<< Writing validated paired-end read 1 reads to Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R1_001_val_1.fq.gz Writing validated paired-end read 2 reads to Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R2_001_val_2.fq.gz Total number of sequences analysed: 833817 Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 182438 (21.88%) >>> Now running FastQC on the validated data Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R1_001_val_1.fq.gz<<< Started analysis of 
Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R1_001_val_1.fq.gz
Approx 5% complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R1_001_val_1.fq.gz
Approx 10% complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R1_001_val_1.fq.gz
Approx 15% complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R1_001_val_1.fq.gz
Approx 20% complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R1_001_val_1.fq.gz
Approx 25% complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R1_001_val_1.fq.gz
Approx 30% complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R1_001_val_1.fq.gz
Approx 35% complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R1_001_val_1.fq.gz
Approx 40% complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R1_001_val_1.fq.gz
Approx 45% complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R1_001_val_1.fq.gz
Approx 50% complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R1_001_val_1.fq.gz
Approx 55% complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R1_001_val_1.fq.gz
Approx 60% complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R1_001_val_1.fq.gz
Approx 65% complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R1_001_val_1.fq.gz
Approx 70% complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R1_001_val_1.fq.gz
Approx 75% complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R1_001_val_1.fq.gz
Approx 80% complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R1_001_val_1.fq.gz
Approx 85% complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R1_001_val_1.fq.gz
Approx 90% complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R1_001_val_1.fq.gz
Approx 95% complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R1_001_val_1.fq.gz
Analysis complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R1_001_val_1.fq.gz
>>> Now running FastQC on the validated data Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R2_001_val_2.fq.gz<<<
Started analysis of Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R2_001_val_2.fq.gz
Approx 5% complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R2_001_val_2.fq.gz
Approx 10% complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R2_001_val_2.fq.gz
Approx 15% complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R2_001_val_2.fq.gz
Approx 20% complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R2_001_val_2.fq.gz
Approx 25% complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R2_001_val_2.fq.gz
Approx 30% complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R2_001_val_2.fq.gz
Approx 35% complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R2_001_val_2.fq.gz
Approx 40% complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R2_001_val_2.fq.gz
Approx 45% complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R2_001_val_2.fq.gz
Approx 50% complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R2_001_val_2.fq.gz
Approx 55% complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R2_001_val_2.fq.gz
Approx 60% complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R2_001_val_2.fq.gz
Approx 65% complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R2_001_val_2.fq.gz
Approx 70% complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R2_001_val_2.fq.gz
Approx 75% complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R2_001_val_2.fq.gz
Approx 80% complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R2_001_val_2.fq.gz
Approx 85% complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R2_001_val_2.fq.gz
Approx 90% complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R2_001_val_2.fq.gz
Approx 95% complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R2_001_val_2.fq.gz
Analysis complete for Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R2_001_val_2.fq.gz
Deleting both intermediate output files Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-5to7kb-4_S15_L004_R2_001_trimmed.fq.gz
====================================================================================================
Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R1_001.fastq.gz_trimming_report.txt'
SUMMARISING RUN PARAMETERS
==========================
Input filename: Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R1_001.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.4.4_dev
Cutadapt version: 1.16
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28'
Output file(s) will be GZIP compressed
Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R1_001_trimmed.fq.gz
>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R1_001.fastq.gz <<<
10000000 sequences processed
20000000 sequences processed
30000000 sequences processed
40000000 sequences processed
50000000 sequences processed
60000000 sequences processed
70000000 sequences processed
80000000 sequences processed
90000000 sequences processed
100000000 sequences processed
110000000 sequences processed
120000000 sequences processed
130000000 sequences processed
140000000 sequences processed
150000000 sequences processed
160000000 sequences processed
170000000 sequences processed
180000000 sequences processed
190000000 sequences processed
200000000 sequences processed
210000000 sequences processed
220000000 sequences processed
230000000 sequences processed
240000000 sequences processed
250000000 sequences processed
260000000 sequences processed
270000000 sequences processed
280000000 sequences processed
290000000 sequences processed
300000000 sequences processed
310000000 sequences processed
320000000 sequences processed
330000000 sequences processed
This is cutadapt 1.16 with Python 2.7.14
Command line
parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R1_001.fastq.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 9196.27 s (27 us/read; 2.18 M reads/minute). === Summary === Total reads processed: 334,676,537 Reads with adapters: 70,851,982 (21.2%) Reads written (passing filters): 334,676,537 (100.0%) Total basepairs processed: 37,061,256,798 bp Quality-trimmed: 1,992,513,435 bp (5.4%) Total written (filtered): 34,771,499,884 bp (93.8%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 70851982 times. No. of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 31.4% C: 23.1% G: 14.5% T: 29.4% none/other: 1.6% Overview of removed sequences length count expect max.err error counts 1 49946073 83669134.2 0 49946073 2 13486014 20917283.6 0 13486014 3 3132015 5229320.9 0 3132015 4 1363475 1307330.2 0 1363475 5 270073 326832.6 0 270073 6 31497 81708.1 0 31497 7 23758 20427.0 0 23758 8 18763 5106.8 0 18763 9 3275 1276.7 0 1335 1940 10 15838 319.2 1 7501 8337 11 3000 79.8 1 865 2135 12 9557 19.9 1 7005 2552 13 10482 5.0 1 7857 2625 14 8386 5.0 1 6592 1794 15 10113 5.0 1 7639 2474 16 8477 5.0 1 6752 1725 17 3712 5.0 1 1822 1890 18 16599 5.0 1 12447 4152 19 24728 5.0 1 21590 3138 20 2260 5.0 1 1068 1192 21 1909 5.0 1 922 987 22 12086 5.0 1 9142 2944 23 9466 5.0 1 7559 1907 24 14821 5.0 1 12727 2094 25 4114 5.0 1 2449 1665 26 8728 5.0 1 7096 1632 27 10130 5.0 1 8112 2018 28 18534 5.0 1 15347 3187 29 2268 5.0 1 1128 1140 30 6142 5.0 1 4044 2098 31 15510 5.0 1 13236 2274 32 11619 5.0 1 9215 2404 33 8746 5.0 1 7302 1444 34 19935 5.0 1 16588 3347 35 2621 5.0 1 1443 1178 36 6309 5.0 1 4129 2180 37 9645 5.0 1 7860 1785 38 17019 5.0 1 14649 2370 39 10585 5.0 1 8657 1928 40 4544 5.0 1 2872 1672 41 21327 5.0 1 17382 3945 42 17137 5.0 1 14529 2608 43 13511 5.0 1 11960 1551 44 10930 5.0 1 9372 1558 45 2105 5.0 1 1329 776 46 7728 5.0 1 6683 
1045 47 2475 5.0 1 1622 853 48 12364 5.0 1 10349 2015 49 17014 5.0 1 15061 1953 50 2786 5.0 1 1856 930 51 10377 5.0 1 8927 1450 52 18494 5.0 1 16226 2268 53 11957 5.0 1 10082 1875 54 4967 5.0 1 3456 1511 55 22493 5.0 1 19324 3169 56 18617 5.0 1 16731 1886 57 6333 5.0 1 4818 1515 58 6637 5.0 1 5333 1304 59 27504 5.0 1 25337 2167 60 3308 5.0 1 2317 991 61 2317 5.0 1 1549 768 62 17981 5.0 1 16048 1933 63 5202 5.0 1 3849 1353 64 4788 5.0 1 3685 1103 65 11243 5.0 1 9620 1623 66 25893 5.0 1 22887 3006 67 12846 5.0 1 11236 1610 68 11694 5.0 1 10221 1473 69 12806 5.0 1 11220 1586 70 13130 5.0 1 11424 1706 71 14298 5.0 1 12328 1970 72 15831 5.0 1 13331 2500 73 18925 5.0 1 15389 3536 74 28030 5.0 1 19581 8449 75 215898 5.0 1 31797 184101 76 390056 5.0 1 244021 146035 77 271644 5.0 1 188562 83082 78 153127 5.0 1 97122 56005 79 85688 5.0 1 54909 30779 80 49362 5.0 1 30479 18883 81 31386 5.0 1 18805 12581 82 22096 5.0 1 13156 8940 83 16737 5.0 1 9751 6986 84 14012 5.0 1 8113 5899 85 12706 5.0 1 7329 5377 86 11724 5.0 1 6971 4753 87 10632 5.0 1 6390 4242 88 10005 5.0 1 6061 3944 89 9466 5.0 1 5765 3701 90 8822 5.0 1 5435 3387 91 8407 5.0 1 5174 3233 92 8240 5.0 1 5304 2936 93 7834 5.0 1 5000 2834 94 7275 5.0 1 4754 2521 95 7152 5.0 1 4756 2396 96 7058 5.0 1 4668 2390 97 6718 5.0 1 4392 2326 98 6487 5.0 1 4375 2112 99 6404 5.0 1 4315 2089 100 6272 5.0 1 4301 1971 101 6079 5.0 1 4152 1927 102 6017 5.0 1 4071 1946 103 6008 5.0 1 4107 1901 104 5748 5.0 1 3961 1787 105 5580 5.0 1 3877 1703 106 5544 5.0 1 3811 1733 107 5379 5.0 1 3753 1626 108 5359 5.0 1 3710 1649 109 5106 5.0 1 3624 1482 110 5025 5.0 1 3547 1478 111 4884 5.0 1 3465 1419 112 4964 5.0 1 3507 1457 113 4739 5.0 1 3414 1325 114 4651 5.0 1 3336 1315 115 4622 5.0 1 3322 1300 116 4717 5.0 1 3466 1251 117 4530 5.0 1 3280 1250 118 4450 5.0 1 3252 1198 119 4489 5.0 1 3263 1226 120 4267 5.0 1 3158 1109 121 4293 5.0 1 3151 1142 122 4084 5.0 1 3020 1064 123 4173 5.0 1 3080 1093 124 4179 5.0 1 3100 1079 125 4236 5.0 1 3210 1026 126 
4337 5.0 1 3325 1012 127 4447 5.0 1 3416 1031 128 5221 5.0 1 4342 879 129 4503 5.0 1 3599 904 130 4628 5.0 1 3674 954 131 3817 5.0 1 2987 830 132 3796 5.0 1 2889 907 133 3600 5.0 1 2737 863 134 3725 5.0 1 2783 942 135 3616 5.0 1 2650 966 136 3597 5.0 1 2659 938 137 3478 5.0 1 2501 977 138 3422 5.0 1 2435 987 139 3344 5.0 1 2225 1119 140 3464 5.0 1 2247 1217 141 3462 5.0 1 2165 1297 142 3662 5.0 1 2284 1378 143 3697 5.0 1 2171 1526 144 4148 5.0 1 2326 1822 145 4627 5.0 1 2259 2368 146 5795 5.0 1 2497 3298 147 8517 5.0 1 3140 5377 148 16270 5.0 1 5191 11079 149 39151 5.0 1 11349 27802 150 119547 5.0 1 33070 86477 151 59906 5.0 1 16229 43677 RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R1_001.fastq.gz ============================================= 334676537 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R2_001.fastq.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R2_001.fastq.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to 
Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R2_001_trimmed.fq.gz
>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R2_001.fastq.gz <<<
10000000 sequences processed
20000000 sequences processed
30000000 sequences processed
40000000 sequences processed
50000000 sequences processed
60000000 sequences processed
70000000 sequences processed
80000000 sequences processed
90000000 sequences processed
100000000 sequences processed
110000000 sequences processed
120000000 sequences processed
130000000 sequences processed
140000000 sequences processed
150000000 sequences processed
160000000 sequences processed
170000000 sequences processed
180000000 sequences processed
190000000 sequences processed
200000000 sequences processed
210000000 sequences processed
220000000 sequences processed
230000000 sequences processed
240000000 sequences processed
250000000 sequences processed
260000000 sequences processed
270000000 sequences processed
280000000 sequences processed
290000000 sequences processed
300000000 sequences processed
310000000 sequences processed
320000000 sequences processed
330000000 sequences processed
This is cutadapt 1.16 with Python 2.7.14
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R2_001.fastq.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 9609.88 s (29 us/read; 2.09 M reads/minute).
=== Summary ===
Total reads processed: 334,676,537
Reads with adapters: 75,529,864 (22.6%)
Reads written (passing filters): 334,676,537 (100.0%)
Total basepairs processed: 37,650,893,595 bp
Quality-trimmed: 3,667,776,006 bp (9.7%)
Total written (filtered): 33,645,292,032 bp (89.4%)
=== Adapter 1 ===
Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 75529864 times.
No.
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 30.3% C: 25.1% G: 14.2% T: 28.9% none/other: 1.5% Overview of removed sequences length count expect max.err error counts 1 52899788 83669134.2 0 52899788 2 14026357 20917283.6 0 14026357 3 3149311 5229320.9 0 3149311 4 1299949 1307330.2 0 1299949 5 274100 326832.6 0 274100 6 47719 81708.1 0 47719 7 51314 20427.0 0 51314 8 32241 5106.8 0 32241 9 7160 1276.7 0 4983 2177 10 29839 319.2 1 15654 14185 11 9518 79.8 1 3980 5538 12 21007 19.9 1 13129 7878 13 12246 5.0 1 6364 5882 14 49771 5.0 1 34512 15259 15 14776 5.0 1 10000 4776 16 10524 5.0 1 6082 4442 17 31867 5.0 1 22443 9424 18 5650 5.0 1 3093 2557 19 27602 5.0 1 20046 7556 20 22139 5.0 1 16372 5767 21 2688 5.0 1 1094 1594 22 4562 5.0 1 2116 2446 23 22033 5.0 1 15019 7014 24 54775 5.0 1 38293 16482 25 21035 5.0 1 14912 6123 26 30011 5.0 1 23700 6311 27 3259 5.0 1 1609 1650 28 22829 5.0 1 16216 6613 29 4058 5.0 1 2053 2005 30 23914 5.0 1 16939 6975 31 6039 5.0 1 3654 2385 32 30384 5.0 1 22413 7971 33 56837 5.0 1 46299 10538 34 3908 5.0 1 2020 1888 35 10892 5.0 1 6038 4854 36 10338 5.0 1 6259 4079 37 28511 5.0 1 22626 5885 38 11097 5.0 1 6857 4240 39 16072 5.0 1 12306 3766 40 7287 5.0 1 4229 3058 41 26370 5.0 1 19241 7129 42 44150 5.0 1 33043 11107 43 8048 5.0 1 5498 2550 44 27440 5.0 1 20243 7197 45 43337 5.0 1 32849 10488 46 20454 5.0 1 15751 4703 47 7316 5.0 1 4599 2717 48 49782 5.0 1 38054 11728 49 27998 5.0 1 21117 6881 50 15085 5.0 1 9846 5239 51 93014 5.0 1 71821 21193 52 20416 5.0 1 14651 5765 53 14291 5.0 1 10322 3969 54 20910 5.0 1 16929 3981 55 31429 5.0 1 23949 7480 56 16314 5.0 1 11622 4692 57 26790 5.0 1 21077 5713 58 37293 5.0 1 30323 6970 59 23880 5.0 1 18851 5029 60 23305 5.0 1 18743 4562 61 23531 5.0 1 18932 4599 62 25868 5.0 1 20687 5181 63 31520 5.0 1 25411 6109 64 35337 5.0 1 28547 6790 65 38804 5.0 1 31084 7720 66 44830 5.0 1 34313 10517 67 340055 5.0 1 50118 289937 68 756436 5.0 1 486254 270182 69 399520 5.0 1 310962 
88558 70 151584 5.0 1 85909 65675 71 88846 5.0 1 59718 29128 72 39750 5.0 1 24713 15037 73 24746 5.0 1 15748 8998 74 16698 5.0 1 10465 6233 75 14585 5.0 1 8916 5669 76 13690 5.0 1 8615 5075 77 12248 5.0 1 7410 4838 78 12124 5.0 1 7386 4738 79 11799 5.0 1 7261 4538 80 11412 5.0 1 6931 4481 81 11400 5.0 1 7045 4355 82 11240 5.0 1 7080 4160 83 10426 5.0 1 6498 3928 84 9978 5.0 1 6294 3684 85 9812 5.0 1 6166 3646 86 9378 5.0 1 5998 3380 87 9004 5.0 1 5646 3358 88 9461 5.0 1 5828 3633 89 9434 5.0 1 5758 3676 90 9378 5.0 1 5877 3501 91 9007 5.0 1 5500 3507 92 8650 5.0 1 5451 3199 93 8259 5.0 1 5097 3162 94 8348 5.0 1 5260 3088 95 7563 5.0 1 4609 2954 96 7765 5.0 1 4812 2953 97 7584 5.0 1 4769 2815 98 6956 5.0 1 4348 2608 99 7376 5.0 1 4598 2778 100 7722 5.0 1 5001 2721 101 7521 5.0 1 4884 2637 102 7357 5.0 1 4778 2579 103 7420 5.0 1 4856 2564 104 7401 5.0 1 4928 2473 105 7162 5.0 1 4835 2327 106 7074 5.0 1 4747 2327 107 7153 5.0 1 4939 2214 108 6957 5.0 1 4752 2205 109 6708 5.0 1 4681 2027 110 6480 5.0 1 4485 1995 111 6317 5.0 1 4396 1921 112 6048 5.0 1 4239 1809 113 5740 5.0 1 3931 1809 114 5521 5.0 1 3837 1684 115 5148 5.0 1 3501 1647 116 5161 5.0 1 3589 1572 117 4981 5.0 1 3433 1548 118 4965 5.0 1 3431 1534 119 4590 5.0 1 3127 1463 120 4481 5.0 1 3079 1402 121 4150 5.0 1 2890 1260 122 4103 5.0 1 2805 1298 123 3851 5.0 1 2630 1221 124 3739 5.0 1 2569 1170 125 3457 5.0 1 2426 1031 126 3343 5.0 1 2378 965 127 3067 5.0 1 2182 885 128 3541 5.0 1 2699 842 129 2605 5.0 1 1875 730 130 2604 5.0 1 1939 665 131 1923 5.0 1 1419 504 132 1899 5.0 1 1366 533 133 1661 5.0 1 1184 477 134 1604 5.0 1 1147 457 135 1419 5.0 1 1005 414 136 1368 5.0 1 946 422 137 1230 5.0 1 837 393 138 1165 5.0 1 801 364 139 1120 5.0 1 726 394 140 1093 5.0 1 678 415 141 1012 5.0 1 565 447 142 1120 5.0 1 606 514 143 1017 5.0 1 499 518 144 1057 5.0 1 489 568 145 1063 5.0 1 412 651 146 1210 5.0 1 424 786 147 1728 5.0 1 509 1219 148 5293 5.0 1 1487 3806 149 15622 5.0 1 4285 11337 150 66322 5.0 1 17669 48653 151 
18140 5.0 1 4703 13437 RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R2_001.fastq.gz ============================================= 334676537 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Validate paired-end files Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R2_001_trimmed.fq.gz file_1: Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R1_001_trimmed.fq.gz, file_2: Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R2_001_trimmed.fq.gz >>>>> Now validing the length of the 2 paired-end infiles: Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R2_001_trimmed.fq.gz <<<<< Writing validated paired-end read 1 reads to Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R1_001_val_1.fq.gz Writing validated paired-end read 2 reads to Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R2_001_val_2.fq.gz Total number of sequences analysed: 334676537 Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 98853907 (29.54%) >>> Now running FastQC on the validated data Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R1_001_val_1.fq.gz<<< Started analysis of Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R1_001_val_1.fq.gz Approx 5% complete for Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R1_001_val_1.fq.gz Approx 10% complete for Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R1_001_val_1.fq.gz Approx 15% complete for Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R1_001_val_1.fq.gz Approx 20% complete for Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R1_001_val_1.fq.gz Approx 25% complete for Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R1_001_val_1.fq.gz Approx 30% complete for Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R1_001_val_1.fq.gz Approx 35% complete for Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R1_001_val_1.fq.gz Approx 40% complete for Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R1_001_val_1.fq.gz Approx 45% complete for Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R1_001_val_1.fq.gz Approx 50% complete 
for Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R1_001_val_1.fq.gz Approx 55% complete for Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R1_001_val_1.fq.gz Approx 60% complete for Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R1_001_val_1.fq.gz Approx 65% complete for Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R1_001_val_1.fq.gz Approx 70% complete for Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R1_001_val_1.fq.gz Approx 75% complete for Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R1_001_val_1.fq.gz Approx 80% complete for Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R1_001_val_1.fq.gz Approx 85% complete for Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R1_001_val_1.fq.gz Approx 90% complete for Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R1_001_val_1.fq.gz Approx 95% complete for Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R1_001_val_1.fq.gz Analysis complete for Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R1_001_val_1.fq.gz >>> Now running FastQC on the validated data Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R2_001_val_2.fq.gz<<< Started analysis of Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R2_001_val_2.fq.gz Approx 5% complete for Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R2_001_val_2.fq.gz Approx 10% complete for Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R2_001_val_2.fq.gz Approx 15% complete for Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R2_001_val_2.fq.gz Approx 20% complete for Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R2_001_val_2.fq.gz Approx 25% complete for Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R2_001_val_2.fq.gz Approx 30% complete for Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R2_001_val_2.fq.gz Approx 35% complete for Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R2_001_val_2.fq.gz Approx 40% complete for Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R2_001_val_2.fq.gz Approx 45% complete for Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R2_001_val_2.fq.gz Approx 50% complete for Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R2_001_val_2.fq.gz Approx 55% complete for Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R2_001_val_2.fq.gz Approx 60% complete for Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R2_001_val_2.fq.gz Approx 65% complete for 
Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R2_001_val_2.fq.gz Approx 70% complete for Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R2_001_val_2.fq.gz Approx 75% complete for Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R2_001_val_2.fq.gz Approx 80% complete for Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R2_001_val_2.fq.gz Approx 85% complete for Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R2_001_val_2.fq.gz Approx 90% complete for Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R2_001_val_2.fq.gz Approx 95% complete for Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R2_001_val_2.fq.gz Analysis complete for Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R2_001_val_2.fq.gz Deleting both intermediate output files Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-5to7kb-5_S19_L005_R2_001_trimmed.fq.gz ==================================================================================================== Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R1_001.fastq.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R1_001.fastq.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R1_001_trimmed.fq.gz >>> Now 
performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R1_001.fastq.gz <<< 10000000 sequences processed 20000000 sequences processed 30000000 sequences processed 40000000 sequences processed 50000000 sequences processed 60000000 sequences processed 70000000 sequences processed 80000000 sequences processed 90000000 sequences processed 100000000 sequences processed 110000000 sequences processed 120000000 sequences processed 130000000 sequences processed 140000000 sequences processed 150000000 sequences processed 160000000 sequences processed 170000000 sequences processed 180000000 sequences processed 190000000 sequences processed 200000000 sequences processed 210000000 sequences processed 220000000 sequences processed 230000000 sequences processed 240000000 sequences processed 250000000 sequences processed 260000000 sequences processed 270000000 sequences processed 280000000 sequences processed 290000000 sequences processed 300000000 sequences processed 310000000 sequences processed 320000000 sequences processed 330000000 sequences processed This is cutadapt 1.16 with Python 2.7.14 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R1_001.fastq.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 9240.12 s (27 us/read; 2.19 M reads/minute). === Summary === Total reads processed: 337,457,885 Reads with adapters: 71,397,322 (21.2%) Reads written (passing filters): 337,457,885 (100.0%) Total basepairs processed: 37,394,877,637 bp Quality-trimmed: 2,006,887,418 bp (5.4%) Total written (filtered): 35,091,462,361 bp (93.8%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 71397322 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 31.4% C: 23.2% G: 14.5% T: 29.4% none/other: 1.5% Overview of removed sequences length count expect max.err error counts 1 50372425 84364471.2 0 50372425 2 13584853 21091117.8 0 13584853 3 3154523 5272779.5 0 3154523 4 1374791 1318194.9 0 1374791 5 272259 329548.7 0 272259 6 31825 82387.2 0 31825 7 24024 20596.8 0 24024 8 18858 5149.2 0 18858 9 3032 1287.3 0 1206 1826 10 15571 321.8 1 7469 8102 11 3065 80.5 1 879 2186 12 9553 20.1 1 7037 2516 13 10410 5.0 1 7747 2663 14 8338 5.0 1 6593 1745 15 9680 5.0 1 7405 2275 16 8358 5.0 1 6640 1718 17 3583 5.0 1 1809 1774 18 16405 5.0 1 12252 4153 19 24541 5.0 1 21446 3095 20 2103 5.0 1 967 1136 21 1853 5.0 1 903 950 22 11643 5.0 1 8914 2729 23 9197 5.0 1 7354 1843 24 14757 5.0 1 12570 2187 25 3864 5.0 1 2198 1666 26 8368 5.0 1 6795 1573 27 9918 5.0 1 7855 2063 28 18024 5.0 1 14658 3366 29 2008 5.0 1 990 1018 30 5869 5.0 1 3805 2064 31 14760 5.0 1 12450 2310 32 11364 5.0 1 8983 2381 33 8417 5.0 1 6979 1438 34 19159 5.0 1 15780 3379 35 2660 5.0 1 1451 1209 36 6276 5.0 1 4070 2206 37 9581 5.0 1 7720 1861 38 16692 5.0 1 14426 2266 39 10375 5.0 1 8565 1810 40 4521 5.0 1 2833 1688 41 20801 5.0 1 16874 3927 42 17328 5.0 1 14864 2464 43 13574 5.0 1 11992 1582 44 10984 5.0 1 9357 1627 45 2050 5.0 1 1280 770 46 7818 5.0 1 6789 1029 47 2477 5.0 1 1609 868 48 12210 5.0 1 10200 2010 49 17200 5.0 1 15235 1965 50 2768 5.0 1 1895 873 51 10155 5.0 1 8742 1413 52 18132 5.0 1 16063 2069 53 11645 5.0 1 9914 1731 54 4866 5.0 1 3491 1375 55 22403 5.0 1 19543 2860 56 18662 5.0 1 16980 1682 57 6274 5.0 1 4783 1491 58 6505 5.0 1 5308 1197 59 27488 5.0 1 25620 1868 60 3204 5.0 1 2313 891 61 2257 5.0 1 1538 719 62 18009 5.0 1 16301 1708 63 5136 5.0 1 3715 1421 64 4605 5.0 1 3545 1060 65 11099 5.0 1 9595 1504 66 25713 5.0 1 22879 2834 67 12917 5.0 1 11357 1560 68 11537 5.0 1 10168 1369 69 12762 5.0 1 11175 1587 70 12807 5.0 1 11177 1630 71 14207 5.0 1 12255 1952 72 15852 5.0 
1 13352 2500 73 18613 5.0 1 15069 3544 74 27414 5.0 1 19496 7918 75 208761 5.0 1 32182 176579 76 388363 5.0 1 243362 145001 77 279502 5.0 1 191222 88280 78 156937 5.0 1 98688 58249 79 87613 5.0 1 55899 31714 80 51253 5.0 1 31486 19767 81 32175 5.0 1 19227 12948 82 22211 5.0 1 13193 9018 83 16963 5.0 1 9807 7156 84 13895 5.0 1 7977 5918 85 12277 5.0 1 7092 5185 86 11354 5.0 1 6699 4655 87 10571 5.0 1 6398 4173 88 9848 5.0 1 5855 3993 89 9269 5.0 1 5673 3596 90 8774 5.0 1 5496 3278 91 8214 5.0 1 5074 3140 92 7750 5.0 1 4908 2842 93 7432 5.0 1 4714 2718 94 7097 5.0 1 4580 2517 95 7047 5.0 1 4580 2467 96 6736 5.0 1 4378 2358 97 6583 5.0 1 4328 2255 98 6494 5.0 1 4353 2141 99 6228 5.0 1 4200 2028 100 6160 5.0 1 4129 2031 101 6036 5.0 1 4139 1897 102 5746 5.0 1 3938 1808 103 5788 5.0 1 4017 1771 104 5593 5.0 1 3841 1752 105 5473 5.0 1 3765 1708 106 5515 5.0 1 3843 1672 107 5266 5.0 1 3630 1636 108 5260 5.0 1 3691 1569 109 5156 5.0 1 3657 1499 110 4927 5.0 1 3490 1437 111 4704 5.0 1 3283 1421 112 4682 5.0 1 3295 1387 113 4604 5.0 1 3329 1275 114 4613 5.0 1 3324 1289 115 4472 5.0 1 3219 1253 116 4379 5.0 1 3152 1227 117 4357 5.0 1 3137 1220 118 4319 5.0 1 3174 1145 119 4275 5.0 1 3115 1160 120 4169 5.0 1 3056 1113 121 4104 5.0 1 3017 1087 122 4102 5.0 1 3044 1058 123 3923 5.0 1 2875 1048 124 4153 5.0 1 3110 1043 125 4058 5.0 1 3077 981 126 4273 5.0 1 3318 955 127 4216 5.0 1 3316 900 128 5336 5.0 1 4427 909 129 4378 5.0 1 3451 927 130 4406 5.0 1 3535 871 131 3803 5.0 1 2928 875 132 3743 5.0 1 2897 846 133 3600 5.0 1 2757 843 134 3669 5.0 1 2795 874 135 3682 5.0 1 2774 908 136 3630 5.0 1 2653 977 137 3526 5.0 1 2510 1016 138 3517 5.0 1 2457 1060 139 3363 5.0 1 2268 1095 140 3414 5.0 1 2198 1216 141 3458 5.0 1 2152 1306 142 3634 5.0 1 2228 1406 143 3793 5.0 1 2172 1621 144 4126 5.0 1 2250 1876 145 4619 5.0 1 2252 2367 146 5777 5.0 1 2465 3312 147 8323 5.0 1 3063 5260 148 16149 5.0 1 5150 10999 149 38079 5.0 1 11114 26965 150 117228 5.0 1 32800 84428 151 57424 5.0 1 15597 
41827 RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R1_001.fastq.gz ============================================= 337457885 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R2_001.fastq.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R2_001.fastq.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R2_001_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R2_001.fastq.gz <<< 10000000 sequences processed 20000000 sequences processed 30000000 sequences processed 40000000 sequences processed 50000000 sequences processed 60000000 sequences processed 70000000 sequences processed 80000000 sequences processed 90000000 sequences processed 100000000 sequences processed 110000000 sequences processed 120000000 sequences processed 130000000 sequences processed 140000000 sequences processed 
150000000 sequences processed 160000000 sequences processed 170000000 sequences processed 180000000 sequences processed 190000000 sequences processed 200000000 sequences processed 210000000 sequences processed 220000000 sequences processed 230000000 sequences processed 240000000 sequences processed 250000000 sequences processed 260000000 sequences processed 270000000 sequences processed 280000000 sequences processed 290000000 sequences processed 300000000 sequences processed 310000000 sequences processed 320000000 sequences processed 330000000 sequences processed This is cutadapt 1.16 with Python 2.7.14 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R2_001.fastq.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 9759.48 s (29 us/read; 2.07 M reads/minute). === Summary === Total reads processed: 337,457,885 Reads with adapters: 75,968,497 (22.5%) Reads written (passing filters): 337,457,885 (100.0%) Total basepairs processed: 37,962,561,744 bp Quality-trimmed: 3,715,982,260 bp (9.8%) Total written (filtered): 33,913,014,949 bp (89.3%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 75968497 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 30.2% C: 25.2% G: 14.2% T: 28.9% none/other: 1.5% Overview of removed sequences length count expect max.err error counts 1 53251749 84364471.2 0 53251749 2 14161587 21091117.8 0 14161587 3 3179820 5272779.5 0 3179820 4 1308712 1318194.9 0 1308712 5 276711 329548.7 0 276711 6 47612 82387.2 0 47612 7 50159 20596.8 0 50159 8 31236 5149.2 0 31236 9 7152 1287.3 0 4833 2319 10 29137 321.8 1 14994 14143 11 9597 80.5 1 3893 5704 12 20461 20.1 1 12760 7701 13 12152 5.0 1 6201 5951 14 48001 5.0 1 33146 14855 15 14011 5.0 1 9416 4595 16 10339 5.0 1 5973 4366 17 30716 5.0 1 21418 9298 18 5537 5.0 1 3060 2477 19 26499 5.0 1 19207 7292 20 21365 5.0 1 15652 5713 21 2611 5.0 1 1064 1547 22 4464 5.0 1 2049 2415 23 20803 5.0 1 13909 6894 24 52787 5.0 1 37075 15712 25 20236 5.0 1 14323 5913 26 28710 5.0 1 22712 5998 27 3016 5.0 1 1487 1529 28 21615 5.0 1 15433 6182 29 3924 5.0 1 2008 1916 30 23411 5.0 1 16521 6890 31 5858 5.0 1 3606 2252 32 29414 5.0 1 21792 7622 33 54062 5.0 1 44170 9892 34 3908 5.0 1 2000 1908 35 10710 5.0 1 5873 4837 36 10168 5.0 1 6091 4077 37 27495 5.0 1 21809 5686 38 11004 5.0 1 6714 4290 39 15794 5.0 1 12033 3761 40 6979 5.0 1 4028 2951 41 25050 5.0 1 18098 6952 42 43309 5.0 1 32363 10946 43 7876 5.0 1 5266 2610 44 26271 5.0 1 19250 7021 45 42065 5.0 1 31775 10290 46 19945 5.0 1 15226 4719 47 7061 5.0 1 4422 2639 48 48106 5.0 1 36674 11432 49 27182 5.0 1 20418 6764 50 14509 5.0 1 9484 5025 51 89691 5.0 1 68589 21102 52 19465 5.0 1 13784 5681 53 13751 5.0 1 9746 4005 54 20076 5.0 1 16168 3908 55 30874 5.0 1 23545 7329 56 15533 5.0 1 11006 4527 57 25782 5.0 1 20205 5577 58 35697 5.0 1 29237 6460 59 22596 5.0 1 17784 4812 60 22217 5.0 1 17697 4520 61 22491 5.0 1 18065 4426 62 24733 5.0 1 19684 5049 63 30241 5.0 1 24303 5938 64 33275 5.0 1 26611 6664 65 36902 5.0 1 29376 7526 66 43287 5.0 1 32157 11130 67 331963 5.0 1 47956 284007 68 745425 5.0 1 476908 268517 69 398389 5.0 1 307966 
90423 70 153354 5.0 1 85773 67581 71 85512 5.0 1 57848 27664 72 37770 5.0 1 23517 14253 73 24470 5.0 1 15464 9006 74 16010 5.0 1 9974 6036 75 14434 5.0 1 8652 5782 76 12957 5.0 1 8042 4915 77 11765 5.0 1 7049 4716 78 11632 5.0 1 6945 4687 79 11427 5.0 1 6937 4490 80 11082 5.0 1 6653 4429 81 11034 5.0 1 6729 4305 82 10713 5.0 1 6557 4156 83 10088 5.0 1 6223 3865 84 9550 5.0 1 5904 3646 85 9398 5.0 1 5814 3584 86 9162 5.0 1 5702 3460 87 8671 5.0 1 5258 3413 88 9147 5.0 1 5508 3639 89 8911 5.0 1 5425 3486 90 9110 5.0 1 5487 3623 91 9038 5.0 1 5551 3487 92 8167 5.0 1 4909 3258 93 7995 5.0 1 4873 3122 94 7900 5.0 1 4988 2912 95 7276 5.0 1 4334 2942 96 7328 5.0 1 4499 2829 97 7443 5.0 1 4702 2741 98 6685 5.0 1 4128 2557 99 6935 5.0 1 4292 2643 100 7462 5.0 1 4767 2695 101 7130 5.0 1 4606 2524 102 7202 5.0 1 4595 2607 103 7260 5.0 1 4724 2536 104 7176 5.0 1 4687 2489 105 6893 5.0 1 4602 2291 106 6915 5.0 1 4662 2253 107 6727 5.0 1 4610 2117 108 6563 5.0 1 4522 2041 109 6441 5.0 1 4481 1960 110 6217 5.0 1 4378 1839 111 6006 5.0 1 4113 1893 112 5795 5.0 1 4022 1773 113 5504 5.0 1 3787 1717 114 5530 5.0 1 3745 1785 115 5024 5.0 1 3416 1608 116 5078 5.0 1 3466 1612 117 4816 5.0 1 3328 1488 118 4569 5.0 1 3148 1421 119 4359 5.0 1 2991 1368 120 4355 5.0 1 2988 1367 121 4076 5.0 1 2852 1224 122 3904 5.0 1 2713 1191 123 3548 5.0 1 2496 1052 124 3595 5.0 1 2559 1036 125 3438 5.0 1 2400 1038 126 3352 5.0 1 2401 951 127 3064 5.0 1 2244 820 128 3581 5.0 1 2749 832 129 2520 5.0 1 1830 690 130 2482 5.0 1 1841 641 131 1918 5.0 1 1383 535 132 1859 5.0 1 1380 479 133 1573 5.0 1 1122 451 134 1618 5.0 1 1188 430 135 1448 5.0 1 1071 377 136 1463 5.0 1 1031 432 137 1287 5.0 1 919 368 138 1164 5.0 1 805 359 139 1100 5.0 1 733 367 140 1100 5.0 1 661 439 141 1026 5.0 1 585 441 142 1029 5.0 1 544 485 143 979 5.0 1 498 481 144 1017 5.0 1 502 515 145 1081 5.0 1 416 665 146 1227 5.0 1 426 801 147 1605 5.0 1 509 1096 148 5468 5.0 1 1477 3991 149 15278 5.0 1 4203 11075 150 68349 5.0 1 18398 49951 151 
18078 5.0 1 4811 13267 RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R2_001.fastq.gz ============================================= 337457885 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Validate paired-end files Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R2_001_trimmed.fq.gz file_1: Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R1_001_trimmed.fq.gz, file_2: Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R2_001_trimmed.fq.gz >>>>> Now validing the length of the 2 paired-end infiles: Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R2_001_trimmed.fq.gz <<<<< Writing validated paired-end read 1 reads to Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R1_001_val_1.fq.gz Writing validated paired-end read 2 reads to Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R2_001_val_2.fq.gz Total number of sequences analysed: 337457885 Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 99581376 (29.51%) >>> Now running FastQC on the validated data Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R1_001_val_1.fq.gz<<< Started analysis of Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R1_001_val_1.fq.gz Approx 5% complete for Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R1_001_val_1.fq.gz Approx 10% complete for Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R1_001_val_1.fq.gz Approx 15% complete for Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R1_001_val_1.fq.gz Approx 20% complete for Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R1_001_val_1.fq.gz Approx 25% complete for Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R1_001_val_1.fq.gz Approx 30% complete for Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R1_001_val_1.fq.gz Approx 35% complete for Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R1_001_val_1.fq.gz Approx 40% complete for Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R1_001_val_1.fq.gz Approx 45% complete for Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R1_001_val_1.fq.gz Approx 50% complete 
for Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R1_001_val_1.fq.gz Approx 55% complete for Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R1_001_val_1.fq.gz Approx 60% complete for Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R1_001_val_1.fq.gz Approx 65% complete for Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R1_001_val_1.fq.gz Approx 70% complete for Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R1_001_val_1.fq.gz Approx 75% complete for Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R1_001_val_1.fq.gz Approx 80% complete for Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R1_001_val_1.fq.gz Approx 85% complete for Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R1_001_val_1.fq.gz Approx 90% complete for Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R1_001_val_1.fq.gz Approx 95% complete for Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R1_001_val_1.fq.gz Analysis complete for Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R1_001_val_1.fq.gz >>> Now running FastQC on the validated data Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R2_001_val_2.fq.gz<<< Started analysis of Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R2_001_val_2.fq.gz Approx 5% complete for Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R2_001_val_2.fq.gz Approx 10% complete for Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R2_001_val_2.fq.gz Approx 15% complete for Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R2_001_val_2.fq.gz Approx 20% complete for Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R2_001_val_2.fq.gz Approx 25% complete for Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R2_001_val_2.fq.gz Approx 30% complete for Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R2_001_val_2.fq.gz Approx 35% complete for Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R2_001_val_2.fq.gz Approx 40% complete for Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R2_001_val_2.fq.gz Approx 45% complete for Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R2_001_val_2.fq.gz Approx 50% complete for Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R2_001_val_2.fq.gz Approx 55% complete for Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R2_001_val_2.fq.gz Approx 60% complete for Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R2_001_val_2.fq.gz Approx 65% complete for 
Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R2_001_val_2.fq.gz Approx 70% complete for Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R2_001_val_2.fq.gz Approx 75% complete for Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R2_001_val_2.fq.gz Approx 80% complete for Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R2_001_val_2.fq.gz Approx 85% complete for Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R2_001_val_2.fq.gz Approx 90% complete for Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R2_001_val_2.fq.gz Approx 95% complete for Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R2_001_val_2.fq.gz Analysis complete for Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R2_001_val_2.fq.gz Deleting both intermediate output files Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-5to7kb-6_S23_L006_R2_001_trimmed.fq.gz ==================================================================================================== Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R1_001.fastq.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R1_001.fastq.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R1_001_trimmed.fq.gz >>> Now 
performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R1_001.fastq.gz <<< This is cutadapt 1.16 with Python 2.7.14 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R1_001.fastq.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 12.05 s (39 us/read; 1.55 M reads/minute). === Summary === Total reads processed: 311,953 Reads with adapters: 72,500 (23.2%) Reads written (passing filters): 311,953 (100.0%) Total basepairs processed: 40,441,546 bp Quality-trimmed: 1,163,411 bp (2.9%) Total written (filtered): 39,112,230 bp (96.7%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 72500 times. No. of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 33.2% C: 23.8% G: 13.5% T: 28.6% none/other: 1.0% Overview of removed sequences length count expect max.err error counts 1 56003 77988.2 0 56003 2 11411 19497.1 0 11411 3 2963 4874.3 0 2963 4 1045 1218.6 0 1045 5 186 304.6 0 186 6 25 76.2 0 25 7 16 19.0 0 16 8 2 4.8 0 2 10 6 0.3 1 1 5 11 3 0.1 1 1 2 12 1 0.0 1 1 13 6 0.0 1 2 4 14 2 0.0 1 1 1 16 2 0.0 1 1 1 18 2 0.0 1 2 19 6 0.0 1 5 1 20 1 0.0 1 1 22 2 0.0 1 1 1 23 1 0.0 1 1 24 2 0.0 1 1 1 25 1 0.0 1 0 1 26 3 0.0 1 2 1 27 3 0.0 1 2 1 28 1 0.0 1 1 29 1 0.0 1 0 1 30 2 0.0 1 2 31 2 0.0 1 0 2 33 1 0.0 1 1 34 2 0.0 1 0 2 35 1 0.0 1 1 36 3 0.0 1 1 2 37 2 0.0 1 1 1 38 2 0.0 1 1 1 39 2 0.0 1 2 42 2 0.0 1 2 43 2 0.0 1 1 1 44 3 0.0 1 2 1 46 1 0.0 1 1 47 1 0.0 1 1 52 1 0.0 1 1 53 2 0.0 1 1 1 54 1 0.0 1 1 55 5 0.0 1 2 3 56 2 0.0 1 2 58 1 0.0 1 1 59 3 0.0 1 2 1 62 3 0.0 1 2 1 63 2 0.0 1 2 64 2 0.0 1 2 65 3 0.0 1 2 1 66 4 0.0 1 3 1 67 3 0.0 1 1 2 69 2 0.0 1 1 1 70 4 0.0 1 1 3 71 2 0.0 1 2 72 7 0.0 1 5 2 73 2 0.0 1 2 74 10 0.0 1 2 8 75 151 0.0 1 5 146 76 142 0.0 1 25 117 77 85 0.0 1 7 78 78 69 0.0 1 14 55 79 25 0.0 1 5 20 80 18 0.0 1 
4 14 81 13 0.0 1 6 7 82 13 0.0 1 2 11 83 6 0.0 1 0 6 84 1 0.0 1 0 1 85 5 0.0 1 2 3 86 5 0.0 1 0 5 87 3 0.0 1 2 1 88 2 0.0 1 1 1 89 2 0.0 1 1 1 90 2 0.0 1 0 2 92 1 0.0 1 1 96 2 0.0 1 0 2 97 2 0.0 1 0 2 99 2 0.0 1 2 102 2 0.0 1 1 1 103 1 0.0 1 0 1 104 1 0.0 1 0 1 105 1 0.0 1 0 1 106 1 0.0 1 1 107 2 0.0 1 0 2 108 1 0.0 1 0 1 110 2 0.0 1 0 2 112 1 0.0 1 1 114 1 0.0 1 1 115 1 0.0 1 0 1 116 2 0.0 1 0 2 120 2 0.0 1 0 2 121 2 0.0 1 0 2 127 1 0.0 1 1 130 5 0.0 1 3 2 131 1 0.0 1 0 1 132 1 0.0 1 0 1 134 1 0.0 1 0 1 135 3 0.0 1 1 2 137 2 0.0 1 0 2 139 1 0.0 1 1 141 1 0.0 1 0 1 142 1 0.0 1 1 143 2 0.0 1 1 1 144 1 0.0 1 0 1 145 4 0.0 1 0 4 146 1 0.0 1 0 1 147 5 0.0 1 1 4 148 6 0.0 1 2 4 149 31 0.0 1 3 28 150 52 0.0 1 2 50 151 33 0.0 1 5 28 RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R1_001.fastq.gz ============================================= 311953 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R2_001.fastq.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R2_001.fastq.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28' Output file(s) 
will be GZIP compressed Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R2_001_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R2_001.fastq.gz <<< This is cutadapt 1.16 with Python 2.7.14 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R2_001.fastq.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 10.26 s (33 us/read; 1.82 M reads/minute). === Summary === Total reads processed: 311,953 Reads with adapters: 67,501 (21.6%) Reads written (passing filters): 311,953 (100.0%) Total basepairs processed: 36,501,373 bp Quality-trimmed: 6,452,319 bp (17.7%) Total written (filtered): 29,887,216 bp (81.9%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 67501 times. No. of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 30.3% C: 26.4% G: 13.7% T: 28.3% none/other: 1.3% Overview of removed sequences length count expect max.err error counts 1 49440 77988.2 0 49440 2 12901 19497.1 0 12901 3 2780 4874.3 0 2780 4 1060 1218.6 0 1060 5 201 304.6 0 201 6 26 76.2 0 26 7 15 19.0 0 15 8 10 4.8 0 10 9 4 1.2 0 0 4 10 9 0.3 1 3 6 11 3 0.1 1 2 1 12 6 0.0 1 0 6 13 5 0.0 1 4 1 14 12 0.0 1 10 2 15 3 0.0 1 2 1 16 5 0.0 1 2 3 17 4 0.0 1 3 1 18 3 0.0 1 3 19 4 0.0 1 3 1 20 6 0.0 1 3 3 23 2 0.0 1 2 24 11 0.0 1 5 6 25 5 0.0 1 4 1 26 11 0.0 1 9 2 28 4 0.0 1 2 2 30 4 0.0 1 3 1 31 1 0.0 1 1 32 5 0.0 1 3 2 33 9 0.0 1 7 2 34 1 0.0 1 0 1 35 2 0.0 1 1 1 36 1 0.0 1 0 1 37 2 0.0 1 1 1 38 1 0.0 1 1 39 1 0.0 1 1 40 1 0.0 1 1 41 3 0.0 1 2 1 42 8 0.0 1 5 3 44 7 0.0 1 6 1 45 12 0.0 1 9 3 46 1 0.0 1 0 1 47 1 0.0 1 1 48 6 0.0 1 4 2 49 1 0.0 1 0 1 50 3 0.0 1 3 51 13 0.0 1 9 4 52 10 0.0 1 7 3 53 2 0.0 1 2 54 3 0.0 1 2 1 55 6 0.0 1 3 3 56 6 0.0 1 4 2 57 7 0.0 1 5 2 58 5 0.0 1 3 2 59 7 
0.0 1 6 1 60 2 0.0 1 2 61 2 0.0 1 2 62 5 0.0 1 4 1 63 3 0.0 1 2 1 64 7 0.0 1 5 2 65 8 0.0 1 4 4 66 9 0.0 1 7 2 67 256 0.0 1 11 245 68 255 0.0 1 66 189 69 73 0.0 1 28 45 70 41 0.0 1 6 35 71 17 0.0 1 3 14 72 16 0.0 1 1 15 73 6 0.0 1 2 4 74 6 0.0 1 2 4 76 1 0.0 1 0 1 77 3 0.0 1 1 2 78 1 0.0 1 1 79 3 0.0 1 1 2 80 2 0.0 1 0 2 81 1 0.0 1 0 1 82 2 0.0 1 1 1 83 1 0.0 1 1 85 2 0.0 1 2 87 2 0.0 1 0 2 88 2 0.0 1 1 1 89 2 0.0 1 0 2 90 2 0.0 1 0 2 91 5 0.0 1 1 4 92 1 0.0 1 1 93 1 0.0 1 1 95 2 0.0 1 0 2 96 2 0.0 1 0 2 97 1 0.0 1 1 98 1 0.0 1 0 1 99 3 0.0 1 1 2 101 4 0.0 1 3 1 103 1 0.0 1 1 104 1 0.0 1 0 1 105 3 0.0 1 1 2 107 2 0.0 1 2 108 1 0.0 1 0 1 111 1 0.0 1 0 1 112 2 0.0 1 1 1 116 1 0.0 1 1 122 1 0.0 1 0 1 123 1 0.0 1 1 124 1 0.0 1 0 1 127 2 0.0 1 2 130 2 0.0 1 2 131 1 0.0 1 1 133 1 0.0 1 0 1 135 1 0.0 1 1 137 1 0.0 1 0 1 139 1 0.0 1 0 1 142 1 0.0 1 1 144 1 0.0 1 1 147 2 0.0 1 0 2 149 7 0.0 1 1 6 150 47 0.0 1 9 38 151 10 0.0 1 1 9 RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R2_001.fastq.gz ============================================= 311953 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Validate paired-end files Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R2_001_trimmed.fq.gz file_1: Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R1_001_trimmed.fq.gz, file_2: Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R2_001_trimmed.fq.gz >>>>> Now validing the length of the 2 paired-end infiles: Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R2_001_trimmed.fq.gz <<<<< Writing validated paired-end read 1 reads to Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R1_001_val_1.fq.gz Writing validated paired-end read 2 reads to Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R2_001_val_2.fq.gz Total number of sequences analysed: 311953 Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 73249 
(23.48%)
>>> Now running FastQC on the validated data Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R1_001_val_1.fq.gz<<<
Started analysis of Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R1_001_val_1.fq.gz
Approx 5% complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R1_001_val_1.fq.gz
Approx 10% complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R1_001_val_1.fq.gz
Approx 15% complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R1_001_val_1.fq.gz
Approx 20% complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R1_001_val_1.fq.gz
Approx 25% complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R1_001_val_1.fq.gz
Approx 30% complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R1_001_val_1.fq.gz
Approx 35% complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R1_001_val_1.fq.gz
Approx 40% complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R1_001_val_1.fq.gz
Approx 45% complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R1_001_val_1.fq.gz
Approx 50% complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R1_001_val_1.fq.gz
Approx 55% complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R1_001_val_1.fq.gz
Approx 60% complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R1_001_val_1.fq.gz
Approx 65% complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R1_001_val_1.fq.gz
Approx 70% complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R1_001_val_1.fq.gz
Approx 75% complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R1_001_val_1.fq.gz
Approx 80% complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R1_001_val_1.fq.gz
Approx 85% complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R1_001_val_1.fq.gz
Approx 90% complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R1_001_val_1.fq.gz
Approx 95% complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R1_001_val_1.fq.gz
Analysis complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R1_001_val_1.fq.gz
>>> Now running FastQC on the validated data Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R2_001_val_2.fq.gz<<<
Started analysis of Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R2_001_val_2.fq.gz
Approx 5% complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R2_001_val_2.fq.gz
Approx 10% complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R2_001_val_2.fq.gz
Approx 15% complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R2_001_val_2.fq.gz
Approx 20% complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R2_001_val_2.fq.gz
Approx 25% complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R2_001_val_2.fq.gz
Approx 30% complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R2_001_val_2.fq.gz
Approx 35% complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R2_001_val_2.fq.gz
Approx 40% complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R2_001_val_2.fq.gz
Approx 45% complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R2_001_val_2.fq.gz
Approx 50% complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R2_001_val_2.fq.gz
Approx 55% complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R2_001_val_2.fq.gz
Approx 60% complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R2_001_val_2.fq.gz
Approx 65% complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R2_001_val_2.fq.gz
Approx 70% complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R2_001_val_2.fq.gz
Approx 75% complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R2_001_val_2.fq.gz
Approx 80% complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R2_001_val_2.fq.gz
Approx 85% complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R2_001_val_2.fq.gz
Approx 90% complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R2_001_val_2.fq.gz
Approx 95% complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R2_001_val_2.fq.gz
Analysis complete for Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R2_001_val_2.fq.gz
Deleting both intermediate output files Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-5to7kb-7_S27_L007_R2_001_trimmed.fq.gz
====================================================================================================
Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R1_001.fastq.gz_trimming_report.txt'
SUMMARISING RUN PARAMETERS
==========================
Input filename:
Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R1_001.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.4.4_dev
Cutadapt version: 1.16
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28'
Output file(s) will be GZIP compressed
Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R1_001_trimmed.fq.gz
>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R1_001.fastq.gz <<<
This is cutadapt 1.16 with Python 2.7.14
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R1_001.fastq.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 12.98 s (38 us/read; 1.58 M reads/minute).
=== Summary ===
Total reads processed: 341,141
Reads with adapters: 79,138 (23.2%)
Reads written (passing filters): 341,141 (100.0%)
Total basepairs processed: 44,015,343 bp
Quality-trimmed: 1,267,590 bp (2.9%)
Total written (filtered): 42,567,682 bp (96.7%)
=== Adapter 1 ===
Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 79138 times.
No.
of allowed errors: 0-9 bp: 0; 10-13 bp: 1
Bases preceding removed adapters:
  A: 33.3%
  C: 23.7%
  G: 13.5%
  T: 28.5%
  none/other: 0.9%
Overview of removed sequences
length count expect max.err error counts
1 61157 85285.2 0 61157
2 12269 21321.3 0 12269
3 3268 5330.3 0 3268
4 1226 1332.6 0 1226
5 241 333.1 0 241
6 40 83.3 0 40
7 6 20.8 0 6
8 3 5.2 0 3
9 3 1.3 0 0 3
10 11 0.3 1 3 8
11 1 0.1 1 1
12 2 0.0 1 1 1
13 5 0.0 1 2 3
14 1 0.0 1 1
15 4 0.0 1 2 2
16 1 0.0 1 0 1
18 3 0.0 1 2 1
19 2 0.0 1 2
20 2 0.0 1 0 2
22 5 0.0 1 4 1
23 2 0.0 1 0 2
24 2 0.0 1 2
25 1 0.0 1 1
26 2 0.0 1 1 1
27 1 0.0 1 0 1
28 4 0.0 1 3 1
30 1 0.0 1 1
31 3 0.0 1 2 1
32 1 0.0 1 0 1
33 2 0.0 1 2
34 1 0.0 1 1
35 1 0.0 1 1
36 2 0.0 1 2
37 5 0.0 1 4 1
38 2 0.0 1 1 1
39 5 0.0 1 4 1
40 1 0.0 1 0 1
41 6 0.0 1 6
42 2 0.0 1 1 1
43 3 0.0 1 2 1
44 1 0.0 1 1
46 2 0.0 1 2
47 1 0.0 1 1
49 2 0.0 1 1 1
51 2 0.0 1 2
52 1 0.0 1 0 1
53 4 0.0 1 3 1
55 6 0.0 1 6
56 4 0.0 1 4
57 3 0.0 1 1 2
58 1 0.0 1 1
59 6 0.0 1 6
61 1 0.0 1 1
63 3 0.0 1 3
64 2 0.0 1 2
66 6 0.0 1 4 2
67 3 0.0 1 2 1
68 1 0.0 1 1
69 5 0.0 1 3 2
70 1 0.0 1 1
71 3 0.0 1 3
72 2 0.0 1 1 1
73 9 0.0 1 4 5
74 9 0.0 1 3 6
75 178 0.0 1 10 168
76 149 0.0 1 30 119
77 77 0.0 1 19 58
78 63 0.0 1 17 46
79 22 0.0 1 6 16
80 18 0.0 1 5 13
81 13 0.0 1 6 7
82 6 0.0 1 0 6
83 7 0.0 1 0 7
84 4 0.0 1 2 2
85 6 0.0 1 3 3
86 6 0.0 1 1 5
87 3 0.0 1 2 1
88 6 0.0 1 0 6
89 3 0.0 1 2 1
90 5 0.0 1 1 4
91 1 0.0 1 1
92 4 0.0 1 1 3
93 4 0.0 1 0 4
94 3 0.0 1 0 3
95 1 0.0 1 1
96 1 0.0 1 0 1
97 1 0.0 1 0 1
98 1 0.0 1 0 1
99 2 0.0 1 0 2
100 2 0.0 1 0 2
101 1 0.0 1 1
102 1 0.0 1 0 1
103 1 0.0 1 0 1
104 1 0.0 1 0 1
105 1 0.0 1 0 1
106 1 0.0 1 0 1
107 3 0.0 1 0 3
109 2 0.0 1 0 2
110 4 0.0 1 0 4
111 1 0.0 1 0 1
114 1 0.0 1 1
119 1 0.0 1 1
122 1 0.0 1 1
126 1 0.0 1 1
127 2 0.0 1 1 1
130 2 0.0 1 1 1
132 1 0.0 1 0 1
133 1 0.0 1 0 1
134 1 0.0 1 1
135 1 0.0 1 0 1
136 2 0.0 1 1 1
140 1 0.0 1 0 1
141 2 0.0 1 0 2
142 2 0.0 1 0 2
143 1 0.0 1 0 1
144 2 0.0 1 1 1
145 1 0.0 1 0 1
146 4 0.0 1 1 3
147 2 0.0 1 0 2
148 11 0.0 1 1 10
149 27 0.0 1 2 25
150 64 0.0 1 6 58
151 30 0.0 1 5 25
RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R1_001.fastq.gz
=============================================
341141 sequences processed in total
The length threshold of paired-end sequences gets evaluated later on (in the validation step)
Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R2_001.fastq.gz_trimming_report.txt'
SUMMARISING RUN PARAMETERS
==========================
Input filename: Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R2_001.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.4.4_dev
Cutadapt version: 1.16
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28'
Output file(s) will be GZIP compressed
Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R2_001_trimmed.fq.gz
>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R2_001.fastq.gz <<<
This is cutadapt 1.16 with Python 2.7.14
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R2_001.fastq.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 11.04 s (32 us/read; 1.85 M reads/minute).
=== Summary ===
Total reads processed: 341,141
Reads with adapters: 73,998 (21.7%)
Reads written (passing filters): 341,141 (100.0%)
Total basepairs processed: 39,919,622 bp
Quality-trimmed: 7,277,289 bp (18.2%)
Total written (filtered): 32,468,411 bp (81.3%)
=== Adapter 1 ===
Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 73998 times.
No. of allowed errors: 0-9 bp: 0; 10-13 bp: 1
Bases preceding removed adapters:
  A: 30.1%
  C: 26.5%
  G: 13.6%
  T: 28.5%
  none/other: 1.2%
Overview of removed sequences
length count expect max.err error counts
1 54202 85285.2 0 54202
2 14131 21321.3 0 14131
3 3120 5330.3 0 3120
4 1154 1332.6 0 1154
5 206 333.1 0 206
6 29 83.3 0 29
7 18 20.8 0 18
8 6 5.2 0 6
9 3 1.3 0 1 2
10 9 0.3 1 5 4
11 4 0.1 1 2 2
12 2 0.0 1 1 1
13 3 0.0 1 2 1
14 10 0.0 1 5 5
15 5 0.0 1 1 4
16 1 0.0 1 0 1
17 6 0.0 1 4 2
18 4 0.0 1 1 3
19 5 0.0 1 1 4
20 3 0.0 1 2 1
22 5 0.0 1 2 3
23 5 0.0 1 3 2
24 13 0.0 1 8 5
25 6 0.0 1 4 2
26 6 0.0 1 3 3
28 9 0.0 1 5 4
29 1 0.0 1 0 1
30 7 0.0 1 4 3
32 7 0.0 1 5 2
33 12 0.0 1 8 4
35 6 0.0 1 4 2
36 3 0.0 1 1 2
37 2 0.0 1 2
38 3 0.0 1 3
39 5 0.0 1 4 1
41 7 0.0 1 3 4
42 14 0.0 1 6 8
43 3 0.0 1 1 2
44 3 0.0 1 3
45 9 0.0 1 6 3
46 7 0.0 1 6 1
47 3 0.0 1 1 2
48 10 0.0 1 9 1
49 5 0.0 1 3 2
50 3 0.0 1 1 2
51 21 0.0 1 11 10
52 4 0.0 1 2 2
53 4 0.0 1 3 1
54 2 0.0 1 1 1
55 13 0.0 1 10 3
56 2 0.0 1 1 1
57 3 0.0 1 2 1
58 5 0.0 1 3 2
59 8 0.0 1 6 2
60 5 0.0 1 5
61 7 0.0 1 5 2
62 3 0.0 1 1 2
63 4 0.0 1 3 1
64 7 0.0 1 7
65 10 0.0 1 6 4
66 13 0.0 1 4 9
67 251 0.0 1 8 243
68 252 0.0 1 78 174
69 82 0.0 1 39 43
70 38 0.0 1 13 25
71 26 0.0 1 12 14
72 9 0.0 1 0 9
73 7 0.0 1 1 6
74 4 0.0 1 2 2
75 1 0.0 1 0 1
76 2 0.0 1 0 2
77 2 0.0 1 0 2
78 2 0.0 1 0 2
79 3 0.0 1 2 1
80 2 0.0 1 1 1
81 3 0.0 1 1 2
83 3 0.0 1 1 2
84 1 0.0 1 0 1
85 4 0.0 1 1 3
86 2 0.0 1 0 2
88 2 0.0 1 2
89 3 0.0 1 1 2
90 2 0.0 1 0 2
93 1 0.0 1 1
94 1 0.0 1 0 1
96 3 0.0 1 2 1
97 1 0.0 1 0 1
98 1 0.0 1 1
99 3 0.0 1 1 2
100 1 0.0 1 0 1
101 1 0.0 1 0 1
103 2 0.0 1 0 2
105 1 0.0 1 1
106 1 0.0 1 1
107 4 0.0 1 1 3
108 1 0.0 1 1
109 2 0.0 1 0 2
111 1 0.0 1 0 1
113 1 0.0 1 0 1
114 1 0.0 1 1
115 1 0.0 1 1
116 1 0.0 1 0 1
117 1 0.0 1 0 1
118 1 0.0 1 1
120 1 0.0 1 0 1
122 1 0.0 1 1
125 2 0.0 1 1 1
128 1 0.0 1 0 1
129 1 0.0 1 1
130 1 0.0 1 0 1
133 1 0.0 1 0 1
141 1 0.0 1 0 1
144 1 0.0 1 0 1
148 2 0.0 1 0 2
149 8 0.0 1 1 7
150 50 0.0 1 6 44
151 11 0.0 1 3 8
RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R2_001.fastq.gz
=============================================
341141 sequences processed in total
The length threshold of paired-end sequences gets evaluated later on (in the validation step)
Validate paired-end files Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R2_001_trimmed.fq.gz
file_1: Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R1_001_trimmed.fq.gz, file_2: Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R2_001_trimmed.fq.gz
>>>>> Now validing the length of the 2 paired-end infiles: Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R2_001_trimmed.fq.gz <<<<<
Writing validated paired-end read 1 reads to Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R1_001_val_1.fq.gz
Writing validated paired-end read 2 reads to Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R2_001_val_2.fq.gz
Total number of sequences analysed: 341141
Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 81477 (23.88%)
>>> Now running FastQC on the validated data Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R1_001_val_1.fq.gz<<<
Started analysis of Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R1_001_val_1.fq.gz
Approx 5% complete for Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R1_001_val_1.fq.gz
Approx 10% complete for Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R1_001_val_1.fq.gz
Approx 15% complete for Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R1_001_val_1.fq.gz
Approx 20% complete for Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R1_001_val_1.fq.gz
Approx 25% complete for
Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R1_001_val_1.fq.gz
Approx 30% complete for Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R1_001_val_1.fq.gz
Approx 35% complete for Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R1_001_val_1.fq.gz
Approx 40% complete for Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R1_001_val_1.fq.gz
Approx 45% complete for Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R1_001_val_1.fq.gz
Approx 50% complete for Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R1_001_val_1.fq.gz
Approx 55% complete for Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R1_001_val_1.fq.gz
Approx 60% complete for Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R1_001_val_1.fq.gz
Approx 65% complete for Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R1_001_val_1.fq.gz
Approx 70% complete for Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R1_001_val_1.fq.gz
Approx 75% complete for Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R1_001_val_1.fq.gz
Approx 80% complete for Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R1_001_val_1.fq.gz
Approx 85% complete for Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R1_001_val_1.fq.gz
Approx 90% complete for Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R1_001_val_1.fq.gz
Approx 95% complete for Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R1_001_val_1.fq.gz
Analysis complete for Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R1_001_val_1.fq.gz
>>> Now running FastQC on the validated data Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R2_001_val_2.fq.gz<<<
Started analysis of Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R2_001_val_2.fq.gz
Approx 5% complete for Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R2_001_val_2.fq.gz
Approx 10% complete for Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R2_001_val_2.fq.gz
Approx 15% complete for Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R2_001_val_2.fq.gz
Approx 20% complete for Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R2_001_val_2.fq.gz
Approx 25% complete for Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R2_001_val_2.fq.gz
Approx 30% complete for Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R2_001_val_2.fq.gz
Approx 35% complete for Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R2_001_val_2.fq.gz
Approx 40% complete for
Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R2_001_val_2.fq.gz
Approx 45% complete for Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R2_001_val_2.fq.gz
Approx 50% complete for Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R2_001_val_2.fq.gz
Approx 55% complete for Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R2_001_val_2.fq.gz
Approx 60% complete for Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R2_001_val_2.fq.gz
Approx 65% complete for Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R2_001_val_2.fq.gz
Approx 70% complete for Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R2_001_val_2.fq.gz
Approx 75% complete for Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R2_001_val_2.fq.gz
Approx 80% complete for Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R2_001_val_2.fq.gz
Approx 85% complete for Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R2_001_val_2.fq.gz
Approx 90% complete for Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R2_001_val_2.fq.gz
Approx 95% complete for Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R2_001_val_2.fq.gz
Analysis complete for Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R2_001_val_2.fq.gz
Deleting both intermediate output files Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-5to7kb-8_S31_L008_R2_001_trimmed.fq.gz
====================================================================================================
Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-6_S21_L006_R1_001.fastq.gz_trimming_report.txt'
SUMMARISING RUN PARAMETERS
==========================
Input filename: Geoduck-NMP-gDNA-6_S21_L006_R1_001.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.4.4_dev
Cutadapt version: 1.16
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Running FastQC on the data once trimming
has completed
Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28'
Output file(s) will be GZIP compressed
Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-6_S21_L006_R1_001_trimmed.fq.gz
>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-6_S21_L006_R1_001.fastq.gz <<<
This is cutadapt 1.16 with Python 2.7.14
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-6_S21_L006_R1_001.fastq.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 1.40 s (48 us/read; 1.25 M reads/minute).
=== Summary ===
Total reads processed: 29,238
Reads with adapters: 6,900 (23.6%)
Reads written (passing filters): 29,238 (100.0%)
Total basepairs processed: 3,979,508 bp
Quality-trimmed: 101,141 bp (2.5%)
Total written (filtered): 3,868,128 bp (97.2%)
=== Adapter 1 ===
Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 6900 times.
No.
of allowed errors: 0-9 bp: 0; 10-13 bp: 1
Bases preceding removed adapters:
  A: 32.3%
  C: 24.0%
  G: 14.8%
  T: 28.7%
  none/other: 0.1%
Overview of removed sequences
length count expect max.err error counts
1 5416 7309.5 0 5416
2 1049 1827.4 0 1049
3 295 456.8 0 295
4 96 114.2 0 96
5 22 28.6 0 22
6 2 7.1 0 2
7 1 1.8 0 1
10 1 0.0 1 0 1
28 1 0.0 1 1
30 1 0.0 1 1
41 1 0.0 1 1
53 1 0.0 1 0 1
62 1 0.0 1 0 1
66 1 0.0 1 1
69 1 0.0 1 1
70 1 0.0 1 1
74 1 0.0 1 0 1
76 3 0.0 1 3
77 1 0.0 1 1
79 1 0.0 1 0 1
81 1 0.0 1 1
82 1 0.0 1 0 1
134 1 0.0 1 0 1
143 1 0.0 1 1
RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-6_S21_L006_R1_001.fastq.gz
=============================================
29238 sequences processed in total
The length threshold of paired-end sequences gets evaluated later on (in the validation step)
Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-6_S21_L006_R2_001.fastq.gz_trimming_report.txt'
SUMMARISING RUN PARAMETERS
==========================
Input filename: Geoduck-NMP-gDNA-6_S21_L006_R2_001.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.4.4_dev
Cutadapt version: 1.16
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28'
Output file(s) will be GZIP compressed
Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-6_S21_L006_R2_001_trimmed.fq.gz
>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for
the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-6_S21_L006_R2_001.fastq.gz <<<
This is cutadapt 1.16 with Python 2.7.14
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-6_S21_L006_R2_001.fastq.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 0.97 s (33 us/read; 1.81 M reads/minute).
=== Summary ===
Total reads processed: 29,238
Reads with adapters: 6,008 (20.5%)
Reads written (passing filters): 29,238 (100.0%)
Total basepairs processed: 3,157,431 bp
Quality-trimmed: 833,914 bp (26.4%)
Total written (filtered): 2,312,523 bp (73.2%)
=== Adapter 1 ===
Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 6008 times.
No. of allowed errors: 0-9 bp: 0; 10-13 bp: 1
Bases preceding removed adapters:
  A: 27.5%
  C: 26.9%
  G: 15.8%
  T: 29.2%
  none/other: 0.6%
Overview of removed sequences
length count expect max.err error counts
1 4275 7309.5 0 4275
2 1295 1827.4 0 1295
3 260 456.8 0 260
4 101 114.2 0 101
5 25 28.6 0 25
6 3 7.1 0 3
8 1 0.4 0 1
10 1 0.0 1 0 1
11 1 0.0 1 0 1
13 1 0.0 1 0 1
24 2 0.0 1 1 1
26 1 0.0 1 0 1
28 2 0.0 1 0 2
30 3 0.0 1 1 2
33 1 0.0 1 0 1
39 1 0.0 1 0 1
43 1 0.0 1 0 1
45 1 0.0 1 0 1
46 1 0.0 1 1
48 1 0.0 1 1
50 2 0.0 1 0 2
51 1 0.0 1 1
54 2 0.0 1 1 1
55 1 0.0 1 1
58 1 0.0 1 0 1
60 1 0.0 1 1
63 1 0.0 1 1
67 4 0.0 1 3 1
68 7 0.0 1 7
69 3 0.0 1 2 1
71 1 0.0 1 0 1
72 1 0.0 1 1
81 1 0.0 1 0 1
92 1 0.0 1 0 1
112 1 0.0 1 0 1
113 1 0.0 1 0 1
149 1 0.0 1 0 1
150 1 0.0 1 0 1
RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-6_S21_L006_R2_001.fastq.gz
=============================================
29238 sequences processed in total
The length threshold of paired-end sequences gets evaluated later on (in the validation step)
Validate paired-end files Geoduck-NMP-gDNA-6_S21_L006_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-6_S21_L006_R2_001_trimmed.fq.gz
file_1: Geoduck-NMP-gDNA-6_S21_L006_R1_001_trimmed.fq.gz, file_2:
Geoduck-NMP-gDNA-6_S21_L006_R2_001_trimmed.fq.gz
>>>>> Now validing the length of the 2 paired-end infiles: Geoduck-NMP-gDNA-6_S21_L006_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-6_S21_L006_R2_001_trimmed.fq.gz <<<<<
Writing validated paired-end read 1 reads to Geoduck-NMP-gDNA-6_S21_L006_R1_001_val_1.fq.gz
Writing validated paired-end read 2 reads to Geoduck-NMP-gDNA-6_S21_L006_R2_001_val_2.fq.gz
Total number of sequences analysed: 29238
Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 8435 (28.85%)
>>> Now running FastQC on the validated data Geoduck-NMP-gDNA-6_S21_L006_R1_001_val_1.fq.gz<<<
Started analysis of Geoduck-NMP-gDNA-6_S21_L006_R1_001_val_1.fq.gz
Approx 5% complete for Geoduck-NMP-gDNA-6_S21_L006_R1_001_val_1.fq.gz
Approx 10% complete for Geoduck-NMP-gDNA-6_S21_L006_R1_001_val_1.fq.gz
Approx 15% complete for Geoduck-NMP-gDNA-6_S21_L006_R1_001_val_1.fq.gz
Approx 20% complete for Geoduck-NMP-gDNA-6_S21_L006_R1_001_val_1.fq.gz
Approx 25% complete for Geoduck-NMP-gDNA-6_S21_L006_R1_001_val_1.fq.gz
Approx 30% complete for Geoduck-NMP-gDNA-6_S21_L006_R1_001_val_1.fq.gz
Approx 35% complete for Geoduck-NMP-gDNA-6_S21_L006_R1_001_val_1.fq.gz
Approx 40% complete for Geoduck-NMP-gDNA-6_S21_L006_R1_001_val_1.fq.gz
Approx 45% complete for Geoduck-NMP-gDNA-6_S21_L006_R1_001_val_1.fq.gz
Approx 50% complete for Geoduck-NMP-gDNA-6_S21_L006_R1_001_val_1.fq.gz
Approx 55% complete for Geoduck-NMP-gDNA-6_S21_L006_R1_001_val_1.fq.gz
Approx 60% complete for Geoduck-NMP-gDNA-6_S21_L006_R1_001_val_1.fq.gz
Approx 65% complete for Geoduck-NMP-gDNA-6_S21_L006_R1_001_val_1.fq.gz
Approx 70% complete for Geoduck-NMP-gDNA-6_S21_L006_R1_001_val_1.fq.gz
Approx 75% complete for Geoduck-NMP-gDNA-6_S21_L006_R1_001_val_1.fq.gz
Approx 80% complete for Geoduck-NMP-gDNA-6_S21_L006_R1_001_val_1.fq.gz
Approx 85% complete for Geoduck-NMP-gDNA-6_S21_L006_R1_001_val_1.fq.gz
Approx 90% complete for Geoduck-NMP-gDNA-6_S21_L006_R1_001_val_1.fq.gz
Approx 95% complete for Geoduck-NMP-gDNA-6_S21_L006_R1_001_val_1.fq.gz
Analysis complete for Geoduck-NMP-gDNA-6_S21_L006_R1_001_val_1.fq.gz
>>> Now running FastQC on the validated data Geoduck-NMP-gDNA-6_S21_L006_R2_001_val_2.fq.gz<<<
Started analysis of Geoduck-NMP-gDNA-6_S21_L006_R2_001_val_2.fq.gz
Approx 5% complete for Geoduck-NMP-gDNA-6_S21_L006_R2_001_val_2.fq.gz
Approx 10% complete for Geoduck-NMP-gDNA-6_S21_L006_R2_001_val_2.fq.gz
Approx 15% complete for Geoduck-NMP-gDNA-6_S21_L006_R2_001_val_2.fq.gz
Approx 20% complete for Geoduck-NMP-gDNA-6_S21_L006_R2_001_val_2.fq.gz
Approx 25% complete for Geoduck-NMP-gDNA-6_S21_L006_R2_001_val_2.fq.gz
Approx 30% complete for Geoduck-NMP-gDNA-6_S21_L006_R2_001_val_2.fq.gz
Approx 35% complete for Geoduck-NMP-gDNA-6_S21_L006_R2_001_val_2.fq.gz
Approx 40% complete for Geoduck-NMP-gDNA-6_S21_L006_R2_001_val_2.fq.gz
Approx 45% complete for Geoduck-NMP-gDNA-6_S21_L006_R2_001_val_2.fq.gz
Approx 50% complete for Geoduck-NMP-gDNA-6_S21_L006_R2_001_val_2.fq.gz
Approx 55% complete for Geoduck-NMP-gDNA-6_S21_L006_R2_001_val_2.fq.gz
Approx 60% complete for Geoduck-NMP-gDNA-6_S21_L006_R2_001_val_2.fq.gz
Approx 65% complete for Geoduck-NMP-gDNA-6_S21_L006_R2_001_val_2.fq.gz
Approx 70% complete for Geoduck-NMP-gDNA-6_S21_L006_R2_001_val_2.fq.gz
Approx 75% complete for Geoduck-NMP-gDNA-6_S21_L006_R2_001_val_2.fq.gz
Approx 80% complete for Geoduck-NMP-gDNA-6_S21_L006_R2_001_val_2.fq.gz
Approx 85% complete for Geoduck-NMP-gDNA-6_S21_L006_R2_001_val_2.fq.gz
Approx 90% complete for Geoduck-NMP-gDNA-6_S21_L006_R2_001_val_2.fq.gz
Approx 95% complete for Geoduck-NMP-gDNA-6_S21_L006_R2_001_val_2.fq.gz
Analysis complete for Geoduck-NMP-gDNA-6_S21_L006_R2_001_val_2.fq.gz
Deleting both intermediate output files Geoduck-NMP-gDNA-6_S21_L006_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-6_S21_L006_R2_001_trimmed.fq.gz
====================================================================================================
Writing report to
'/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-7_S25_L007_R1_001.fastq.gz_trimming_report.txt'
SUMMARISING RUN PARAMETERS
==========================
Input filename: Geoduck-NMP-gDNA-7_S25_L007_R1_001.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.4.4_dev
Cutadapt version: 1.16
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28'
Output file(s) will be GZIP compressed
Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-7_S25_L007_R1_001_trimmed.fq.gz
>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-7_S25_L007_R1_001.fastq.gz <<<
This is cutadapt 1.16 with Python 2.7.14
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-7_S25_L007_R1_001.fastq.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 0.69 s (55 us/read; 1.09 M reads/minute).
=== Summary ===
Total reads processed: 12,504
Reads with adapters: 2,989 (23.9%)
Reads written (passing filters): 12,504 (100.0%)
Total basepairs processed: 1,640,714 bp
Quality-trimmed: 89,348 bp (5.4%)
Total written (filtered): 1,546,301 bp (94.2%)
=== Adapter 1 ===
Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 2989 times.
No.
of allowed errors: 0-9 bp: 0; 10-13 bp: 1
Bases preceding removed adapters:
  A: 32.6%
  C: 24.8%
  G: 15.1%
  T: 27.4%
  none/other: 0.2%
Overview of removed sequences
length count expect max.err error counts
1 2234 3126.0 0 2234
2 568 781.5 0 568
3 117 195.4 0 117
4 44 48.8 0 44
5 8 12.2 0 8
6 1 3.1 0 1
9 1 0.0 0 0 1
10 1 0.0 1 0 1
22 1 0.0 1 0 1
24 1 0.0 1 1
50 1 0.0 1 0 1
63 1 0.0 1 1
65 1 0.0 1 1
73 1 0.0 1 0 1
75 3 0.0 1 2 1
76 4 0.0 1 4
128 1 0.0 1 1
149 1 0.0 1 1
RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-7_S25_L007_R1_001.fastq.gz
=============================================
12504 sequences processed in total
The length threshold of paired-end sequences gets evaluated later on (in the validation step)
Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-7_S25_L007_R2_001.fastq.gz_trimming_report.txt'
SUMMARISING RUN PARAMETERS
==========================
Input filename: Geoduck-NMP-gDNA-7_S25_L007_R2_001.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.4.4_dev
Cutadapt version: 1.16
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28'
Output file(s) will be GZIP compressed
Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-7_S25_L007_R2_001_trimmed.fq.gz
>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file
Geoduck-NMP-gDNA-7_S25_L007_R2_001.fastq.gz <<<
This is cutadapt 1.16 with Python 2.7.14
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-7_S25_L007_R2_001.fastq.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 0.45 s (36 us/read; 1.66 M reads/minute).
=== Summary ===
Total reads processed: 12,504
Reads with adapters: 2,708 (21.7%)
Reads written (passing filters): 12,504 (100.0%)
Total basepairs processed: 1,395,904 bp
Quality-trimmed: 425,569 bp (30.5%)
Total written (filtered): 964,713 bp (69.1%)
=== Adapter 1 ===
Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 2708 times.
No. of allowed errors: 0-9 bp: 0; 10-13 bp: 1
Bases preceding removed adapters:
  A: 29.5%
  C: 27.4%
  G: 13.7%
  T: 27.4%
  none/other: 2.0%
Overview of removed sequences
length count expect max.err error counts
1 1964 3126.0 0 1964
2 532 781.5 0 532
3 128 195.4 0 128
4 47 48.8 0 47
5 7 12.2 0 7
6 1 3.1 0 1
7 1 0.8 0 1
12 1 0.0 1 0 1
18 1 0.0 1 0 1
42 1 0.0 1 0 1
45 1 0.0 1 1
51 2 0.0 1 0 2
52 1 0.0 1 0 1
54 1 0.0 1 0 1
55 1 0.0 1 1
64 1 0.0 1 1
65 2 0.0 1 2
67 1 0.0 1 0 1
68 6 0.0 1 5 1
69 2 0.0 1 2
70 1 0.0 1 1
96 1 0.0 1 0 1
100 1 0.0 1 1
105 1 0.0 1 0 1
116 1 0.0 1 0 1
150 2 0.0 1 1 1
RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-7_S25_L007_R2_001.fastq.gz
=============================================
12504 sequences processed in total
The length threshold of paired-end sequences gets evaluated later on (in the validation step)
Validate paired-end files Geoduck-NMP-gDNA-7_S25_L007_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-7_S25_L007_R2_001_trimmed.fq.gz
file_1: Geoduck-NMP-gDNA-7_S25_L007_R1_001_trimmed.fq.gz, file_2: Geoduck-NMP-gDNA-7_S25_L007_R2_001_trimmed.fq.gz
>>>>> Now validing the length of the 2 paired-end infiles: Geoduck-NMP-gDNA-7_S25_L007_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-7_S25_L007_R2_001_trimmed.fq.gz <<<<<
Writing validated paired-end read 1 reads to
Geoduck-NMP-gDNA-7_S25_L007_R1_001_val_1.fq.gz Writing validated paired-end read 2 reads to Geoduck-NMP-gDNA-7_S25_L007_R2_001_val_2.fq.gz Total number of sequences analysed: 12504 Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 4218 (33.73%) >>> Now running FastQC on the validated data Geoduck-NMP-gDNA-7_S25_L007_R1_001_val_1.fq.gz<<< Started analysis of Geoduck-NMP-gDNA-7_S25_L007_R1_001_val_1.fq.gz Approx 10% complete for Geoduck-NMP-gDNA-7_S25_L007_R1_001_val_1.fq.gz Approx 20% complete for Geoduck-NMP-gDNA-7_S25_L007_R1_001_val_1.fq.gz Approx 35% complete for Geoduck-NMP-gDNA-7_S25_L007_R1_001_val_1.fq.gz Approx 45% complete for Geoduck-NMP-gDNA-7_S25_L007_R1_001_val_1.fq.gz Approx 60% complete for Geoduck-NMP-gDNA-7_S25_L007_R1_001_val_1.fq.gz Approx 70% complete for Geoduck-NMP-gDNA-7_S25_L007_R1_001_val_1.fq.gz Approx 80% complete for Geoduck-NMP-gDNA-7_S25_L007_R1_001_val_1.fq.gz Approx 95% complete for Geoduck-NMP-gDNA-7_S25_L007_R1_001_val_1.fq.gz Analysis complete for Geoduck-NMP-gDNA-7_S25_L007_R1_001_val_1.fq.gz >>> Now running FastQC on the validated data Geoduck-NMP-gDNA-7_S25_L007_R2_001_val_2.fq.gz<<< Started analysis of Geoduck-NMP-gDNA-7_S25_L007_R2_001_val_2.fq.gz Approx 10% complete for Geoduck-NMP-gDNA-7_S25_L007_R2_001_val_2.fq.gz Approx 20% complete for Geoduck-NMP-gDNA-7_S25_L007_R2_001_val_2.fq.gz Approx 35% complete for Geoduck-NMP-gDNA-7_S25_L007_R2_001_val_2.fq.gz Approx 45% complete for Geoduck-NMP-gDNA-7_S25_L007_R2_001_val_2.fq.gz Approx 60% complete for Geoduck-NMP-gDNA-7_S25_L007_R2_001_val_2.fq.gz Approx 70% complete for Geoduck-NMP-gDNA-7_S25_L007_R2_001_val_2.fq.gz Approx 80% complete for Geoduck-NMP-gDNA-7_S25_L007_R2_001_val_2.fq.gz Approx 95% complete for Geoduck-NMP-gDNA-7_S25_L007_R2_001_val_2.fq.gz Analysis complete for Geoduck-NMP-gDNA-7_S25_L007_R2_001_val_2.fq.gz Deleting both intermediate output files Geoduck-NMP-gDNA-7_S25_L007_R1_001_trimmed.fq.gz and 
Geoduck-NMP-gDNA-7_S25_L007_R2_001_trimmed.fq.gz ==================================================================================================== Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-8_S29_L008_R1_001.fastq.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: Geoduck-NMP-gDNA-8_S29_L008_R1_001.fastq.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-8_S29_L008_R1_001_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-8_S29_L008_R1_001.fastq.gz <<< This is cutadapt 1.16 with Python 2.7.14 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-8_S29_L008_R1_001.fastq.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 0.56 s (53 us/read; 1.12 M reads/minute). 
=== Summary === Total reads processed: 10,428 Reads with adapters: 2,537 (24.3%) Reads written (passing filters): 10,428 (100.0%) Total basepairs processed: 1,373,621 bp Quality-trimmed: 73,308 bp (5.3%) Total written (filtered): 1,295,710 bp (94.3%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 2537 times. No. of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 31.5% C: 26.6% G: 14.4% T: 27.2% none/other: 0.2% Overview of removed sequences length count expect max.err error counts 1 1910 2607.0 0 1910 2 450 651.8 0 450 3 111 162.9 0 111 4 37 40.7 0 37 5 11 10.2 0 11 6 1 2.5 0 1 11 1 0.0 1 0 1 15 1 0.0 1 1 41 2 0.0 1 2 61 1 0.0 1 1 65 1 0.0 1 1 70 1 0.0 1 1 74 2 0.0 1 1 1 75 2 0.0 1 1 1 76 1 0.0 1 1 77 1 0.0 1 1 82 1 0.0 1 1 113 1 0.0 1 0 1 150 1 0.0 1 0 1 151 1 0.0 1 0 1 RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-8_S29_L008_R1_001.fastq.gz ============================================= 10428 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-8_S29_L008_R2_001.fastq.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: Geoduck-NMP-gDNA-8_S29_L008_R2_001.fastq.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir 
/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-8_S29_L008_R2_001_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-8_S29_L008_R2_001.fastq.gz <<< This is cutadapt 1.16 with Python 2.7.14 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-8_S29_L008_R2_001.fastq.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 0.39 s (37 us/read; 1.62 M reads/minute). === Summary === Total reads processed: 10,428 Reads with adapters: 2,215 (21.2%) Reads written (passing filters): 10,428 (100.0%) Total basepairs processed: 1,167,532 bp Quality-trimmed: 376,873 bp (32.3%) Total written (filtered): 786,368 bp (67.4%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 2215 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 29.0% C: 28.0% G: 14.3% T: 26.5% none/other: 2.1% Overview of removed sequences length count expect max.err error counts 1 1607 2607.0 0 1607 2 464 651.8 0 464 3 92 162.9 0 92 4 24 40.7 0 24 5 4 10.2 0 4 6 1 2.5 0 1 7 2 0.6 0 2 16 1 0.0 1 1 24 1 0.0 1 1 30 1 0.0 1 0 1 42 1 0.0 1 1 51 2 0.0 1 2 58 1 0.0 1 1 62 1 0.0 1 1 63 1 0.0 1 1 64 1 0.0 1 1 67 3 0.0 1 1 2 68 2 0.0 1 2 69 2 0.0 1 1 1 70 1 0.0 1 0 1 72 1 0.0 1 0 1 117 1 0.0 1 0 1 149 1 0.0 1 0 1 RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-8_S29_L008_R2_001.fastq.gz ============================================= 10428 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Validate paired-end files Geoduck-NMP-gDNA-8_S29_L008_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-8_S29_L008_R2_001_trimmed.fq.gz file_1: Geoduck-NMP-gDNA-8_S29_L008_R1_001_trimmed.fq.gz, file_2: Geoduck-NMP-gDNA-8_S29_L008_R2_001_trimmed.fq.gz >>>>> Now validing the length of the 2 paired-end infiles: Geoduck-NMP-gDNA-8_S29_L008_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-8_S29_L008_R2_001_trimmed.fq.gz <<<<< Writing validated paired-end read 1 reads to Geoduck-NMP-gDNA-8_S29_L008_R1_001_val_1.fq.gz Writing validated paired-end read 2 reads to Geoduck-NMP-gDNA-8_S29_L008_R2_001_val_2.fq.gz Total number of sequences analysed: 10428 Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 3616 (34.68%) >>> Now running FastQC on the validated data Geoduck-NMP-gDNA-8_S29_L008_R1_001_val_1.fq.gz<<< Started analysis of Geoduck-NMP-gDNA-8_S29_L008_R1_001_val_1.fq.gz Approx 10% complete for Geoduck-NMP-gDNA-8_S29_L008_R1_001_val_1.fq.gz Approx 25% complete for Geoduck-NMP-gDNA-8_S29_L008_R1_001_val_1.fq.gz Approx 40% complete for Geoduck-NMP-gDNA-8_S29_L008_R1_001_val_1.fq.gz Approx 55% complete for Geoduck-NMP-gDNA-8_S29_L008_R1_001_val_1.fq.gz Approx 70% complete 
for Geoduck-NMP-gDNA-8_S29_L008_R1_001_val_1.fq.gz Approx 85% complete for Geoduck-NMP-gDNA-8_S29_L008_R1_001_val_1.fq.gz Analysis complete for Geoduck-NMP-gDNA-8_S29_L008_R1_001_val_1.fq.gz >>> Now running FastQC on the validated data Geoduck-NMP-gDNA-8_S29_L008_R2_001_val_2.fq.gz<<< Started analysis of Geoduck-NMP-gDNA-8_S29_L008_R2_001_val_2.fq.gz Approx 15% complete for Geoduck-NMP-gDNA-8_S29_L008_R2_001_val_2.fq.gz Approx 30% complete for Geoduck-NMP-gDNA-8_S29_L008_R2_001_val_2.fq.gz Approx 40% complete for Geoduck-NMP-gDNA-8_S29_L008_R2_001_val_2.fq.gz Approx 55% complete for Geoduck-NMP-gDNA-8_S29_L008_R2_001_val_2.fq.gz Approx 70% complete for Geoduck-NMP-gDNA-8_S29_L008_R2_001_val_2.fq.gz Approx 85% complete for Geoduck-NMP-gDNA-8_S29_L008_R2_001_val_2.fq.gz Analysis complete for Geoduck-NMP-gDNA-8_S29_L008_R2_001_val_2.fq.gz Deleting both intermediate output files Geoduck-NMP-gDNA-8_S29_L008_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-8_S29_L008_R2_001_trimmed.fq.gz ==================================================================================================== Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R1_001.fastq.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R1_001.fastq.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir 
/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28'
Output file(s) will be GZIP compressed

Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R1_001_trimmed.fq.gz

>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R1_001.fastq.gz <<<
This is cutadapt 1.16 with Python 2.7.14
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R1_001.fastq.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 1.20 s (56 us/read; 1.07 M reads/minute).

=== Summary ===

Total reads processed: 21,429
Reads with adapters: 5,461 (25.5%)
Reads written (passing filters): 21,429 (100.0%)

Total basepairs processed: 2,970,617 bp
Quality-trimmed: 104,883 bp (3.5%)
Total written (filtered): 2,858,543 bp (96.2%)

=== Adapter 1 ===

Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 5461 times.

No. of allowed errors:
0-9 bp: 0; 10-13 bp: 1

Bases preceding removed adapters:
  A: 34.5%
  C: 24.4%
  G: 13.8%
  T: 27.2%
  none/other: 0.0%

Overview of removed sequences
length  count  expect  max.err  error counts
1       4378   5357.2  0        4378
2       786    1339.3  0        786
3       223    334.8   0        223
4       55     83.7    0        55
5       9      20.9    0        9
6       2      5.2     0        2
10      1      0.0     1        0 1
18      1      0.0     1        0 1
19      1      0.0     1        0 1
28      1      0.0     1        1
38      1      0.0     1        1
52      1      0.0     1        1
60      1      0.0     1        1
70      1      0.0     1        0 1

RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R1_001.fastq.gz
=============================================
21429 sequences processed in total
The length threshold of paired-end sequences gets evaluated later on (in the validation step)

Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R2_001.fastq.gz_trimming_report.txt'

SUMMARISING RUN PARAMETERS
==========================
Input filename: Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R2_001.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.4.4_dev
Cutadapt version: 1.16
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28'
Output file(s) will be GZIP compressed

Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R2_001_trimmed.fq.gz

>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R2_001.fastq.gz <<<
This is cutadapt 1.16 with Python 2.7.14
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R2_001.fastq.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 0.74 s (34 us/read; 1.74 M reads/minute).

=== Summary ===

Total reads processed: 21,429
Reads with adapters: 4,359 (20.3%)
Reads written (passing filters): 21,429 (100.0%)

Total basepairs processed: 2,410,501 bp
Quality-trimmed: 953,802 bp (39.6%)
Total written (filtered): 1,449,722 bp (60.1%)

=== Adapter 1 ===

Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 4359 times.

No. of allowed errors:
0-9 bp: 0; 10-13 bp: 1

Bases preceding removed adapters:
  A: 28.4%
  C: 28.1%
  G: 14.4%
  T: 28.0%
  none/other: 1.0%

Overview of removed sequences
length  count  expect  max.err  error counts
1       3183   5357.2  0        3183
2       891    1339.3  0        891
3       192    334.8   0        192
4       54     83.7    0        54
5       9      20.9    0        9
6       3      5.2     0        3
7       2      1.3     0        2
9       1      0.1     0        1
10      2      0.0     1        1 1
15      1      0.0     1        1
17      2      0.0     1        0 2
20      1      0.0     1        1
32      2      0.0     1        1 1
37      2      0.0     1        1 1
41      1      0.0     1        1
48      1      0.0     1        1
51      1      0.0     1        0 1
64      2      0.0     1        1 1
65      1      0.0     1        1
66      1      0.0     1        1
67      2      0.0     1        2
68      3      0.0     1        2 1
71      1      0.0     1        1
99      1      0.0     1        1

RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R2_001.fastq.gz
=============================================
21429 sequences processed in total
The length threshold of paired-end sequences gets evaluated later on (in the validation step)

Validate paired-end files Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R2_001_trimmed.fq.gz
file_1: Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R1_001_trimmed.fq.gz, file_2: Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R2_001_trimmed.fq.gz

>>>>> Now validing the length of the 2 paired-end infiles: Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R2_001_trimmed.fq.gz <<<<<
Writing validated paired-end read 1 reads to Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R1_001_val_1.fq.gz
Writing validated paired-end read 2 reads to Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R2_001_val_2.fq.gz

Total number of sequences analysed: 21429

Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 7719 (36.02%)

>>> Now running FastQC on the validated data Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R1_001_val_1.fq.gz<<<
Started analysis of Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R1_001_val_1.fq.gz
Approx 5% complete for Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R1_001_val_1.fq.gz
Approx 10% complete for Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R1_001_val_1.fq.gz
Approx 20% complete for Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R1_001_val_1.fq.gz
Approx 25% complete for Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R1_001_val_1.fq.gz
Approx 35% complete for Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R1_001_val_1.fq.gz
Approx 40% complete for Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R1_001_val_1.fq.gz
Approx 50% complete for Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R1_001_val_1.fq.gz
Approx 55% complete for Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R1_001_val_1.fq.gz
Approx 65% complete for Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R1_001_val_1.fq.gz
Approx 70% complete for Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R1_001_val_1.fq.gz
Approx 80% complete for Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R1_001_val_1.fq.gz
Approx 85% complete for Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R1_001_val_1.fq.gz
Approx 90% complete for Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R1_001_val_1.fq.gz
Analysis complete for Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R1_001_val_1.fq.gz

>>> Now running FastQC on the validated data Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R2_001_val_2.fq.gz<<<
Started analysis of Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R2_001_val_2.fq.gz
Approx 5% complete for Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R2_001_val_2.fq.gz
Approx 10% complete for Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R2_001_val_2.fq.gz
Approx 20% complete for Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R2_001_val_2.fq.gz
Approx 25% complete for Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R2_001_val_2.fq.gz
Approx 35% complete for Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R2_001_val_2.fq.gz
Approx 40% complete for Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R2_001_val_2.fq.gz
Approx 50% complete for Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R2_001_val_2.fq.gz
Approx 55% complete for Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R2_001_val_2.fq.gz
Approx 65% complete for Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R2_001_val_2.fq.gz
Approx 70% complete for Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R2_001_val_2.fq.gz
Approx 80% complete for Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R2_001_val_2.fq.gz
Approx 85% complete for Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R2_001_val_2.fq.gz
Approx 90% complete for Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R2_001_val_2.fq.gz
Analysis complete for Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R2_001_val_2.fq.gz
Deleting both intermediate output files Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-8to10kb-1_S4_L001_R2_001_trimmed.fq.gz

====================================================================================================

Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R1_001.fastq.gz_trimming_report.txt'

SUMMARISING RUN PARAMETERS
==========================
Input filename: Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R1_001.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.4.4_dev
Cutadapt version: 1.16
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: '--outdir
/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28'
Output file(s) will be GZIP compressed

Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R1_001_trimmed.fq.gz

>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R1_001.fastq.gz <<<
This is cutadapt 1.16 with Python 2.7.14
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R1_001.fastq.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 1.18 s (54 us/read; 1.11 M reads/minute).

=== Summary ===

Total reads processed: 21,702
Reads with adapters: 5,593 (25.8%)
Reads written (passing filters): 21,702 (100.0%)

Total basepairs processed: 2,994,474 bp
Quality-trimmed: 110,023 bp (3.7%)
Total written (filtered): 2,876,687 bp (96.1%)

=== Adapter 1 ===

Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 5593 times.

No. of allowed errors:
0-9 bp: 0; 10-13 bp: 1

Bases preceding removed adapters:
  A: 35.3%
  C: 23.9%
  G: 13.5%
  T: 27.2%
  none/other: 0.1%

Overview of removed sequences
length  count  expect  max.err  error counts
1       4384   5425.5  0        4384
2       900    1356.4  0        900
3       217    339.1   0        217
4       67     84.8    0        67
5       10     21.2    0        10
6       1      5.3     0        1
8       1      0.3     0        1
10      2      0.0     1        1 1
12      1      0.0     1        0 1
18      1      0.0     1        1
32      1      0.0     1        0 1
33      1      0.0     1        1
40      1      0.0     1        0 1
60      1      0.0     1        1
64      1      0.0     1        1
66      1      0.0     1        0 1
70      1      0.0     1        0 1
73      1      0.0     1        0 1
109     1      0.0     1        0 1

RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R1_001.fastq.gz
=============================================
21702 sequences processed in total
The length threshold of paired-end sequences gets evaluated later on (in the validation step)

Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R2_001.fastq.gz_trimming_report.txt'

SUMMARISING RUN PARAMETERS
==========================
Input filename: Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R2_001.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.4.4_dev
Cutadapt version: 1.16
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28'
Output file(s) will be GZIP compressed

Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R2_001_trimmed.fq.gz

>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R2_001.fastq.gz <<<
This is cutadapt 1.16 with Python 2.7.14
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R2_001.fastq.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 0.71 s (33 us/read; 1.83 M reads/minute).

=== Summary ===

Total reads processed: 21,702
Reads with adapters: 4,540 (20.9%)
Reads written (passing filters): 21,702 (100.0%)

Total basepairs processed: 2,476,067 bp
Quality-trimmed: 944,413 bp (38.1%)
Total written (filtered): 1,522,669 bp (61.5%)

=== Adapter 1 ===

Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 4540 times.

No. of allowed errors:
0-9 bp: 0; 10-13 bp: 1

Bases preceding removed adapters:
  A: 29.3%
  C: 27.8%
  G: 14.0%
  T: 28.1%
  none/other: 0.9%

Overview of removed sequences
length  count  expect  max.err  error counts
1       3291   5425.5  0        3291
2       926    1356.4  0        926
3       183    339.1   0        183
4       75     84.8    0        75
5       12     21.2    0        12
6       1      5.3     0        1
7       1      1.3     0        1
8       1      0.3     0        1
12      2      0.0     1        1 1
14      2      0.0     1        2
17      1      0.0     1        1
19      1      0.0     1        0 1
26      1      0.0     1        1
28      1      0.0     1        0 1
30      1      0.0     1        1
35      1      0.0     1        1
36      1      0.0     1        0 1
41      2      0.0     1        1 1
44      1      0.0     1        0 1
51      3      0.0     1        3
54      1      0.0     1        1
58      1      0.0     1        0 1
59      1      0.0     1        1
60      2      0.0     1        2
61      1      0.0     1        1
62      2      0.0     1        0 2
63      1      0.0     1        1
64      1      0.0     1        0 1
65      1      0.0     1        1
66      2      0.0     1        2
67      4      0.0     1        1 3
68      4      0.0     1        3 1
69      1      0.0     1        1
71      1      0.0     1        0 1
72      2      0.0     1        2
74      2      0.0     1        1 1
78      1      0.0     1        0 1
95      1      0.0     1        1
103     1      0.0     1        0 1
110     1      0.0     1        1
115     1      0.0     1        1
117     1      0.0     1        1

RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R2_001.fastq.gz
=============================================
21702 sequences processed in total
The length threshold of paired-end sequences gets evaluated later on (in the validation step)

Validate paired-end files Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R2_001_trimmed.fq.gz
file_1: Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R1_001_trimmed.fq.gz, file_2: Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R2_001_trimmed.fq.gz

>>>>> Now validing the length of the 2 paired-end infiles: Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R2_001_trimmed.fq.gz <<<<<
Writing validated paired-end read 1 reads to Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R1_001_val_1.fq.gz
Writing validated paired-end read 2 reads to Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R2_001_val_2.fq.gz

Total number of sequences analysed: 21702

Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 7398 (34.09%)

>>> Now running FastQC on the validated data Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R1_001_val_1.fq.gz<<<
Started analysis of Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R1_001_val_1.fq.gz
Approx 5% complete for Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R1_001_val_1.fq.gz
Approx 10% complete for Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R1_001_val_1.fq.gz
Approx 20% complete for Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R1_001_val_1.fq.gz
Approx 25% complete for Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R1_001_val_1.fq.gz
Approx 35% complete for Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R1_001_val_1.fq.gz
Approx 40% complete for Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R1_001_val_1.fq.gz
Approx 45% complete for Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R1_001_val_1.fq.gz
Approx 55% complete for Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R1_001_val_1.fq.gz
Approx 60% complete for Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R1_001_val_1.fq.gz
Approx 70% complete for Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R1_001_val_1.fq.gz
Approx 75% complete for Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R1_001_val_1.fq.gz
Approx 80% complete for Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R1_001_val_1.fq.gz
Approx 90% complete for Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R1_001_val_1.fq.gz
Approx 95% complete for Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R1_001_val_1.fq.gz
Analysis complete for Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R1_001_val_1.fq.gz

>>> Now running FastQC on the validated data Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R2_001_val_2.fq.gz<<<
Started analysis of Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R2_001_val_2.fq.gz
Approx 5% complete for Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R2_001_val_2.fq.gz
Approx 10% complete for Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R2_001_val_2.fq.gz
Approx 20% complete for Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R2_001_val_2.fq.gz
Approx 25% complete for Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R2_001_val_2.fq.gz
Approx 30% complete for Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R2_001_val_2.fq.gz
Approx 40% complete for Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R2_001_val_2.fq.gz
Approx 45% complete for Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R2_001_val_2.fq.gz
Approx 55% complete for Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R2_001_val_2.fq.gz
Approx 60% complete for Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R2_001_val_2.fq.gz
Approx 65% complete for Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R2_001_val_2.fq.gz
Approx 75% complete for Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R2_001_val_2.fq.gz
Approx 80% complete for Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R2_001_val_2.fq.gz
Approx 90% complete for Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R2_001_val_2.fq.gz
Approx 95% complete for Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R2_001_val_2.fq.gz
Analysis complete for Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R2_001_val_2.fq.gz
Deleting both intermediate output files Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-8to10kb-2_S8_L002_R2_001_trimmed.fq.gz

====================================================================================================

Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R1_001.fastq.gz_trimming_report.txt'

SUMMARISING RUN PARAMETERS
==========================
Input filename: Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R1_001.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.4.4_dev
Cutadapt version: 1.16
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28'
Output file(s) will be GZIP compressed

Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R1_001_trimmed.fq.gz

>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R1_001.fastq.gz <<<
This is cutadapt 1.16 with Python 2.7.14
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R1_001.fastq.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 8.75 s (34 us/read; 1.77 M reads/minute).

=== Summary ===

Total reads processed: 258,035
Reads with adapters: 58,250 (22.6%)
Reads written (passing filters): 258,035 (100.0%)

Total basepairs processed: 31,975,215 bp
Quality-trimmed: 1,070,389 bp (3.3%)
Total written (filtered): 30,787,365 bp (96.3%)

=== Adapter 1 ===

Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 58250 times.

No.
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 33.0% C: 23.7% G: 14.0% T: 29.2% none/other: 0.1% Overview of removed sequences length count expect max.err error counts 1 44243 64508.8 0 44243 2 9500 16127.2 0 9500 3 2583 4031.8 0 2583 4 979 1007.9 0 979 5 224 252.0 0 224 6 31 63.0 0 31 7 17 15.7 0 17 8 6 3.9 0 6 9 1 1.0 0 1 10 11 0.2 1 7 4 11 4 0.1 1 1 3 12 3 0.0 1 1 2 13 6 0.0 1 3 3 14 4 0.0 1 1 3 15 2 0.0 1 1 1 16 2 0.0 1 0 2 17 2 0.0 1 2 18 16 0.0 1 11 5 19 8 0.0 1 4 4 20 5 0.0 1 2 3 22 10 0.0 1 7 3 23 5 0.0 1 1 4 24 7 0.0 1 6 1 25 1 0.0 1 1 27 12 0.0 1 11 1 28 10 0.0 1 6 4 29 5 0.0 1 2 3 30 5 0.0 1 3 2 31 11 0.0 1 9 2 32 9 0.0 1 7 2 33 7 0.0 1 4 3 34 8 0.0 1 4 4 35 8 0.0 1 7 1 36 4 0.0 1 3 1 37 7 0.0 1 6 1 38 6 0.0 1 4 2 39 5 0.0 1 3 2 40 8 0.0 1 6 2 41 4 0.0 1 4 42 9 0.0 1 7 2 43 8 0.0 1 4 4 44 6 0.0 1 5 1 45 6 0.0 1 6 46 5 0.0 1 3 2 47 4 0.0 1 4 48 4 0.0 1 3 1 49 11 0.0 1 7 4 50 5 0.0 1 4 1 51 1 0.0 1 1 52 4 0.0 1 4 53 8 0.0 1 6 2 54 7 0.0 1 2 5 55 4 0.0 1 3 1 56 12 0.0 1 7 5 57 6 0.0 1 6 58 6 0.0 1 5 1 59 5 0.0 1 3 2 60 11 0.0 1 6 5 61 6 0.0 1 4 2 62 4 0.0 1 3 1 63 9 0.0 1 6 3 64 3 0.0 1 3 65 6 0.0 1 5 1 66 11 0.0 1 9 2 67 11 0.0 1 9 2 68 6 0.0 1 5 1 69 2 0.0 1 1 1 70 4 0.0 1 0 4 71 6 0.0 1 5 1 72 5 0.0 1 4 1 73 9 0.0 1 7 2 74 9 0.0 1 8 1 75 13 0.0 1 12 1 76 13 0.0 1 11 2 77 38 0.0 1 31 7 78 24 0.0 1 19 5 79 13 0.0 1 11 2 80 10 0.0 1 8 2 81 5 0.0 1 3 2 82 4 0.0 1 0 4 83 5 0.0 1 4 1 84 6 0.0 1 4 2 85 2 0.0 1 2 86 6 0.0 1 2 4 87 3 0.0 1 1 2 88 5 0.0 1 5 89 1 0.0 1 1 90 4 0.0 1 2 2 91 2 0.0 1 2 92 6 0.0 1 4 2 93 6 0.0 1 3 3 95 7 0.0 1 6 1 96 3 0.0 1 1 2 97 5 0.0 1 2 3 98 4 0.0 1 3 1 99 2 0.0 1 1 1 100 2 0.0 1 2 101 4 0.0 1 0 4 102 6 0.0 1 6 103 3 0.0 1 2 1 104 4 0.0 1 2 2 105 3 0.0 1 3 106 5 0.0 1 2 3 107 1 0.0 1 1 108 4 0.0 1 3 1 109 2 0.0 1 1 1 110 1 0.0 1 1 112 2 0.0 1 0 2 113 7 0.0 1 5 2 114 2 0.0 1 2 115 1 0.0 1 1 116 2 0.0 1 2 117 3 0.0 1 2 1 119 1 0.0 1 1 121 1 0.0 1 1 122 1 0.0 1 1 123 1 0.0 1 1 124 2 0.0 1 2 128 1 0.0 1 
1 129 1 0.0 1 1 134 2 0.0 1 2 135 1 0.0 1 1 136 1 0.0 1 1 137 1 0.0 1 1 145 1 0.0 1 0 1 148 2 0.0 1 0 2 150 2 0.0 1 0 2 151 1 0.0 1 0 1 RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R1_001.fastq.gz ============================================= 258035 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R2_001.fastq.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R2_001.fastq.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R2_001_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R2_001.fastq.gz <<< This is cutadapt 1.16 with Python 2.7.14 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R2_001.fastq.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... 
Finished in 8.82 s (34 us/read; 1.76 M reads/minute). === Summary === Total reads processed: 258,035 Reads with adapters: 61,989 (24.0%) Reads written (passing filters): 258,035 (100.0%) Total basepairs processed: 32,021,413 bp Quality-trimmed: 3,282,600 bp (10.3%) Total written (filtered): 28,534,739 bp (89.1%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 61989 times. No. of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 31.5% C: 25.4% G: 14.3% T: 28.6% none/other: 0.2% Overview of removed sequences length count expect max.err error counts 1 45285 64508.8 0 45285 2 10852 16127.2 0 10852 3 2504 4031.8 0 2504 4 870 1007.9 0 870 5 190 252.0 0 190 6 37 63.0 0 37 7 40 15.7 0 40 8 23 3.9 0 23 9 7 1.0 0 5 2 10 21 0.2 1 12 9 11 10 0.1 1 3 7 12 18 0.0 1 10 8 13 12 0.0 1 6 6 14 37 0.0 1 24 13 15 16 0.0 1 7 9 16 9 0.0 1 4 5 17 30 0.0 1 23 7 18 4 0.0 1 1 3 19 22 0.0 1 12 10 20 12 0.0 1 7 5 21 1 0.0 1 0 1 22 1 0.0 1 0 1 23 29 0.0 1 23 6 24 49 0.0 1 34 15 25 18 0.0 1 16 2 26 24 0.0 1 17 7 27 2 0.0 1 1 1 28 24 0.0 1 14 10 29 2 0.0 1 1 1 30 27 0.0 1 22 5 31 4 0.0 1 3 1 32 29 0.0 1 22 7 33 54 0.0 1 39 15 34 2 0.0 1 0 2 35 14 0.0 1 5 9 36 8 0.0 1 2 6 37 28 0.0 1 20 8 38 9 0.0 1 3 6 39 19 0.0 1 14 5 40 5 0.0 1 2 3 41 23 0.0 1 15 8 42 63 0.0 1 44 19 43 6 0.0 1 5 1 44 22 0.0 1 20 2 45 70 0.0 1 48 22 46 16 0.0 1 10 6 47 5 0.0 1 2 3 48 35 0.0 1 23 12 49 23 0.0 1 16 7 50 7 0.0 1 5 2 51 105 0.0 1 78 27 52 22 0.0 1 16 6 53 22 0.0 1 14 8 54 13 0.0 1 10 3 55 34 0.0 1 28 6 56 25 0.0 1 17 8 57 17 0.0 1 15 2 58 33 0.0 1 23 10 59 18 0.0 1 13 5 60 24 0.0 1 22 2 61 17 0.0 1 15 2 62 28 0.0 1 24 4 63 25 0.0 1 18 7 64 37 0.0 1 31 6 65 47 0.0 1 40 7 66 41 0.0 1 35 6 67 73 0.0 1 55 18 68 250 0.0 1 236 14 69 94 0.0 1 75 19 70 31 0.0 1 23 8 71 23 0.0 1 18 5 72 13 0.0 1 10 3 73 18 0.0 1 12 6 74 11 0.0 1 7 4 75 8 0.0 1 5 3 76 19 0.0 1 16 3 77 13 0.0 1 11 2 78 6 0.0 1 3 3 79 9 0.0 1 6 3 80 7 0.0 1 5 2 81 7 0.0 1 3 4 82 16 0.0 1 13 3 83 7 0.0 
1 3 4 84 4 0.0 1 3 1 85 8 0.0 1 4 4 86 10 0.0 1 5 5 87 5 0.0 1 5 88 4 0.0 1 1 3 89 9 0.0 1 6 3 90 9 0.0 1 7 2 91 8 0.0 1 1 7 92 5 0.0 1 2 3 93 2 0.0 1 2 94 9 0.0 1 9 95 8 0.0 1 5 3 96 8 0.0 1 4 4 97 11 0.0 1 9 2 98 10 0.0 1 6 4 99 7 0.0 1 3 4 100 7 0.0 1 5 2 101 6 0.0 1 5 1 102 10 0.0 1 6 4 103 4 0.0 1 3 1 104 5 0.0 1 4 1 105 8 0.0 1 4 4 106 4 0.0 1 2 2 107 4 0.0 1 2 2 108 6 0.0 1 4 2 109 1 0.0 1 0 1 110 6 0.0 1 4 2 111 4 0.0 1 3 1 112 4 0.0 1 3 1 113 4 0.0 1 1 3 114 5 0.0 1 3 2 115 2 0.0 1 2 116 6 0.0 1 6 117 4 0.0 1 1 3 118 4 0.0 1 3 1 119 7 0.0 1 4 3 120 4 0.0 1 3 1 121 4 0.0 1 1 3 122 2 0.0 1 2 123 2 0.0 1 2 124 4 0.0 1 4 125 2 0.0 1 1 1 126 1 0.0 1 1 127 4 0.0 1 4 128 1 0.0 1 0 1 129 1 0.0 1 0 1 130 2 0.0 1 1 1 131 1 0.0 1 1 132 1 0.0 1 0 1 133 1 0.0 1 1 136 2 0.0 1 2 138 1 0.0 1 1 139 1 0.0 1 1 143 1 0.0 1 0 1 146 1 0.0 1 1 150 4 0.0 1 0 4 RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R2_001.fastq.gz ============================================= 258035 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Validate paired-end files Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R2_001_trimmed.fq.gz file_1: Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R1_001_trimmed.fq.gz, file_2: Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R2_001_trimmed.fq.gz >>>>> Now validing the length of the 2 paired-end infiles: Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R2_001_trimmed.fq.gz <<<<< Writing validated paired-end read 1 reads to Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R1_001_val_1.fq.gz Writing validated paired-end read 2 reads to Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R2_001_val_2.fq.gz Total number of sequences analysed: 258035 Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 53285 (20.65%) >>> Now running FastQC on the validated data 
Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R1_001_val_1.fq.gz<<< Started analysis of Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R1_001_val_1.fq.gz Approx 5% complete for Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R1_001_val_1.fq.gz Approx 10% complete for Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R1_001_val_1.fq.gz Approx 15% complete for Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R1_001_val_1.fq.gz Approx 20% complete for Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R1_001_val_1.fq.gz Approx 25% complete for Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R1_001_val_1.fq.gz Approx 30% complete for Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R1_001_val_1.fq.gz Approx 35% complete for Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R1_001_val_1.fq.gz Approx 40% complete for Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R1_001_val_1.fq.gz Approx 45% complete for Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R1_001_val_1.fq.gz Approx 50% complete for Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R1_001_val_1.fq.gz Approx 55% complete for Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R1_001_val_1.fq.gz Approx 60% complete for Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R1_001_val_1.fq.gz Approx 65% complete for Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R1_001_val_1.fq.gz Approx 70% complete for Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R1_001_val_1.fq.gz Approx 75% complete for Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R1_001_val_1.fq.gz Approx 80% complete for Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R1_001_val_1.fq.gz Approx 85% complete for Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R1_001_val_1.fq.gz Approx 90% complete for Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R1_001_val_1.fq.gz Approx 95% complete for Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R1_001_val_1.fq.gz Analysis complete for Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R1_001_val_1.fq.gz >>> Now running FastQC on the validated data Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R2_001_val_2.fq.gz<<< Started analysis of Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R2_001_val_2.fq.gz Approx 5% complete for Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R2_001_val_2.fq.gz Approx 10% complete for 
Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R2_001_val_2.fq.gz Approx 15% complete for Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R2_001_val_2.fq.gz Approx 20% complete for Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R2_001_val_2.fq.gz Approx 25% complete for Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R2_001_val_2.fq.gz Approx 30% complete for Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R2_001_val_2.fq.gz Approx 35% complete for Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R2_001_val_2.fq.gz Approx 40% complete for Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R2_001_val_2.fq.gz Approx 45% complete for Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R2_001_val_2.fq.gz Approx 50% complete for Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R2_001_val_2.fq.gz Approx 55% complete for Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R2_001_val_2.fq.gz Approx 60% complete for Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R2_001_val_2.fq.gz Approx 65% complete for Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R2_001_val_2.fq.gz Approx 70% complete for Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R2_001_val_2.fq.gz Approx 75% complete for Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R2_001_val_2.fq.gz Approx 80% complete for Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R2_001_val_2.fq.gz Approx 85% complete for Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R2_001_val_2.fq.gz Approx 90% complete for Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R2_001_val_2.fq.gz Approx 95% complete for Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R2_001_val_2.fq.gz Analysis complete for Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R2_001_val_2.fq.gz Deleting both intermediate output files Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-8to10kb-3_S12_L003_R2_001_trimmed.fq.gz ==================================================================================================== Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R1_001.fastq.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: 
Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R1_001.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.4.4_dev
Cutadapt version: 1.16
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28'
Output file(s) will be GZIP compressed
Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R1_001_trimmed.fq.gz
>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R1_001.fastq.gz <<<
This is cutadapt 1.16 with Python 2.7.14
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R1_001.fastq.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 8.71 s (33 us/read; 1.82 M reads/minute).
=== Summary ===
Total reads processed: 263,614
Reads with adapters: 59,745 (22.7%)
Reads written (passing filters): 263,614 (100.0%)
Total basepairs processed: 32,701,025 bp
Quality-trimmed: 1,097,320 bp (3.4%)
Total written (filtered): 31,486,211 bp (96.3%)
=== Adapter 1 ===
Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 59745 times.
No.
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 33.2% C: 23.7% G: 13.7% T: 29.3% none/other: 0.1% Overview of removed sequences length count expect max.err error counts 1 45424 65903.5 0 45424 2 9848 16475.9 0 9848 3 2585 4119.0 0 2585 4 983 1029.7 0 983 5 192 257.4 0 192 6 24 64.4 0 24 7 23 16.1 0 23 8 5 4.0 0 5 9 2 1.0 0 1 1 10 12 0.3 1 6 6 11 1 0.1 1 1 12 6 0.0 1 4 2 13 6 0.0 1 3 3 14 6 0.0 1 4 2 15 7 0.0 1 5 2 16 8 0.0 1 6 2 18 11 0.0 1 7 4 19 11 0.0 1 10 1 20 3 0.0 1 2 1 21 2 0.0 1 1 1 22 4 0.0 1 4 23 5 0.0 1 3 2 24 9 0.0 1 5 4 25 4 0.0 1 0 4 26 9 0.0 1 7 2 27 6 0.0 1 4 2 28 6 0.0 1 5 1 29 5 0.0 1 3 2 30 4 0.0 1 1 3 31 1 0.0 1 1 32 6 0.0 1 2 4 33 6 0.0 1 3 3 34 12 0.0 1 11 1 35 6 0.0 1 4 2 36 7 0.0 1 4 3 37 9 0.0 1 7 2 38 6 0.0 1 4 2 39 8 0.0 1 5 3 40 7 0.0 1 3 4 41 6 0.0 1 5 1 42 7 0.0 1 4 3 43 5 0.0 1 4 1 44 7 0.0 1 6 1 45 10 0.0 1 8 2 46 1 0.0 1 1 47 4 0.0 1 2 2 48 6 0.0 1 2 4 49 5 0.0 1 4 1 50 6 0.0 1 5 1 51 4 0.0 1 3 1 52 6 0.0 1 5 1 53 8 0.0 1 8 54 16 0.0 1 12 4 55 4 0.0 1 3 1 56 12 0.0 1 10 2 57 8 0.0 1 8 58 8 0.0 1 6 2 59 7 0.0 1 5 2 60 12 0.0 1 11 1 61 3 0.0 1 1 2 62 4 0.0 1 4 63 9 0.0 1 5 4 64 6 0.0 1 5 1 65 8 0.0 1 5 3 66 6 0.0 1 6 67 17 0.0 1 14 3 68 3 0.0 1 3 69 5 0.0 1 5 70 10 0.0 1 7 3 71 6 0.0 1 3 3 72 5 0.0 1 4 1 73 10 0.0 1 6 4 74 6 0.0 1 6 75 11 0.0 1 7 4 76 9 0.0 1 8 1 77 35 0.0 1 34 1 78 20 0.0 1 12 8 79 17 0.0 1 11 6 80 10 0.0 1 9 1 81 7 0.0 1 6 1 82 4 0.0 1 2 2 83 2 0.0 1 1 1 84 5 0.0 1 2 3 85 3 0.0 1 3 86 5 0.0 1 4 1 87 6 0.0 1 0 6 88 2 0.0 1 1 1 89 2 0.0 1 1 1 90 1 0.0 1 1 91 2 0.0 1 2 92 3 0.0 1 2 1 93 6 0.0 1 4 2 94 1 0.0 1 0 1 95 4 0.0 1 3 1 96 6 0.0 1 4 2 97 4 0.0 1 3 1 99 3 0.0 1 3 100 5 0.0 1 4 1 101 3 0.0 1 3 102 1 0.0 1 0 1 103 2 0.0 1 1 1 104 1 0.0 1 0 1 105 3 0.0 1 2 1 107 2 0.0 1 0 2 108 1 0.0 1 1 110 5 0.0 1 3 2 111 1 0.0 1 0 1 112 2 0.0 1 1 1 113 5 0.0 1 4 1 115 3 0.0 1 2 1 116 1 0.0 1 1 117 1 0.0 1 0 1 118 3 0.0 1 3 119 2 0.0 1 1 1 120 1 0.0 1 0 1 121 1 0.0 1 0 1 122 1 0.0 1 1 123 2 0.0 1 1 
1 126 1 0.0 1 0 1 127 3 0.0 1 2 1 128 1 0.0 1 0 1 130 1 0.0 1 1 131 1 0.0 1 1 138 1 0.0 1 0 1 142 1 0.0 1 1 143 1 0.0 1 0 1 149 1 0.0 1 1 151 1 0.0 1 1 RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R1_001.fastq.gz ============================================= 263614 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R2_001.fastq.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R2_001.fastq.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R2_001_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R2_001.fastq.gz <<< This is cutadapt 1.16 with Python 2.7.14 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R2_001.fastq.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... 
Finished in 8.76 s (33 us/read; 1.81 M reads/minute). === Summary === Total reads processed: 263,614 Reads with adapters: 63,442 (24.1%) Reads written (passing filters): 263,614 (100.0%) Total basepairs processed: 32,676,756 bp Quality-trimmed: 3,323,211 bp (10.2%) Total written (filtered): 29,151,359 bp (89.2%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 63442 times. No. of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 31.8% C: 25.6% G: 13.9% T: 28.5% none/other: 0.1% Overview of removed sequences length count expect max.err error counts 1 46329 65903.5 0 46329 2 10984 16475.9 0 10984 3 2673 4119.0 0 2673 4 938 1029.7 0 938 5 212 257.4 0 212 6 51 64.4 0 51 7 59 16.1 0 59 8 20 4.0 0 20 9 8 1.0 0 7 1 10 38 0.3 1 17 21 11 7 0.1 1 2 5 12 32 0.0 1 14 18 13 16 0.0 1 5 11 14 49 0.0 1 32 17 15 14 0.0 1 11 3 16 7 0.0 1 3 4 17 37 0.0 1 27 10 18 5 0.0 1 3 2 19 24 0.0 1 14 10 20 20 0.0 1 14 6 21 3 0.0 1 1 2 22 1 0.0 1 0 1 23 19 0.0 1 11 8 24 65 0.0 1 43 22 25 20 0.0 1 14 6 26 25 0.0 1 14 11 27 5 0.0 1 3 2 28 24 0.0 1 12 12 29 3 0.0 1 2 1 30 22 0.0 1 14 8 31 2 0.0 1 2 32 32 0.0 1 22 10 33 37 0.0 1 28 9 34 3 0.0 1 1 2 35 15 0.0 1 10 5 36 12 0.0 1 6 6 37 21 0.0 1 14 7 38 14 0.0 1 11 3 39 19 0.0 1 14 5 40 8 0.0 1 4 4 41 29 0.0 1 21 8 42 48 0.0 1 31 17 43 10 0.0 1 7 3 44 23 0.0 1 17 6 45 53 0.0 1 37 16 46 23 0.0 1 13 10 47 4 0.0 1 3 1 48 46 0.0 1 42 4 49 24 0.0 1 17 7 50 17 0.0 1 13 4 51 105 0.0 1 86 19 52 23 0.0 1 17 6 53 15 0.0 1 13 2 54 15 0.0 1 13 2 55 20 0.0 1 15 5 56 13 0.0 1 11 2 57 28 0.0 1 20 8 58 33 0.0 1 23 10 59 21 0.0 1 17 4 60 28 0.0 1 19 9 61 24 0.0 1 20 4 62 27 0.0 1 21 6 63 29 0.0 1 25 4 64 34 0.0 1 24 10 65 48 0.0 1 33 15 66 45 0.0 1 35 10 67 70 0.0 1 58 12 68 160 0.0 1 142 18 69 98 0.0 1 84 14 70 48 0.0 1 38 10 71 21 0.0 1 19 2 72 17 0.0 1 12 5 73 13 0.0 1 10 3 74 7 0.0 1 7 75 7 0.0 1 6 1 76 9 0.0 1 7 2 77 9 0.0 1 4 5 78 21 0.0 1 11 10 79 11 0.0 1 7 4 80 10 0.0 1 5 5 81 8 0.0 1 3 5 82 8 0.0 1 6 
2 83 13 0.0 1 12 1 84 7 0.0 1 4 3 85 10 0.0 1 10 86 4 0.0 1 2 2 87 10 0.0 1 8 2 88 5 0.0 1 5 89 6 0.0 1 5 1 90 8 0.0 1 6 2 91 4 0.0 1 3 1 92 6 0.0 1 6 93 7 0.0 1 5 2 94 9 0.0 1 7 2 95 5 0.0 1 4 1 96 5 0.0 1 4 1 97 7 0.0 1 6 1 98 11 0.0 1 7 4 99 8 0.0 1 7 1 100 3 0.0 1 3 101 4 0.0 1 3 1 102 10 0.0 1 7 3 103 11 0.0 1 6 5 104 2 0.0 1 1 1 105 6 0.0 1 4 2 106 3 0.0 1 1 2 107 4 0.0 1 3 1 108 5 0.0 1 3 2 109 8 0.0 1 6 2 110 3 0.0 1 2 1 111 4 0.0 1 4 112 7 0.0 1 5 2 113 8 0.0 1 7 1 114 5 0.0 1 4 1 115 7 0.0 1 5 2 116 3 0.0 1 2 1 117 6 0.0 1 1 5 118 6 0.0 1 3 3 119 4 0.0 1 3 1 120 3 0.0 1 1 2 121 5 0.0 1 2 3 122 4 0.0 1 3 1 123 1 0.0 1 0 1 124 5 0.0 1 5 125 2 0.0 1 2 126 5 0.0 1 4 1 127 2 0.0 1 0 2 128 1 0.0 1 1 129 1 0.0 1 1 131 1 0.0 1 1 136 1 0.0 1 0 1 140 1 0.0 1 1 150 1 0.0 1 1 RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R2_001.fastq.gz ============================================= 263614 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Validate paired-end files Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R2_001_trimmed.fq.gz file_1: Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R1_001_trimmed.fq.gz, file_2: Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R2_001_trimmed.fq.gz >>>>> Now validing the length of the 2 paired-end infiles: Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R2_001_trimmed.fq.gz <<<<< Writing validated paired-end read 1 reads to Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R1_001_val_1.fq.gz Writing validated paired-end read 2 reads to Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R2_001_val_2.fq.gz Total number of sequences analysed: 263614 Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 54408 (20.64%) >>> Now running FastQC on the validated data Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R1_001_val_1.fq.gz<<< Started analysis of 
Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R1_001_val_1.fq.gz Approx 5% complete for Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R1_001_val_1.fq.gz Approx 10% complete for Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R1_001_val_1.fq.gz Approx 15% complete for Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R1_001_val_1.fq.gz Approx 20% complete for Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R1_001_val_1.fq.gz Approx 25% complete for Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R1_001_val_1.fq.gz Approx 30% complete for Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R1_001_val_1.fq.gz Approx 35% complete for Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R1_001_val_1.fq.gz Approx 40% complete for Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R1_001_val_1.fq.gz Approx 45% complete for Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R1_001_val_1.fq.gz Approx 50% complete for Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R1_001_val_1.fq.gz Approx 55% complete for Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R1_001_val_1.fq.gz Approx 60% complete for Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R1_001_val_1.fq.gz Approx 65% complete for Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R1_001_val_1.fq.gz Approx 70% complete for Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R1_001_val_1.fq.gz Approx 75% complete for Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R1_001_val_1.fq.gz Approx 80% complete for Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R1_001_val_1.fq.gz Approx 85% complete for Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R1_001_val_1.fq.gz Approx 90% complete for Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R1_001_val_1.fq.gz Approx 95% complete for Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R1_001_val_1.fq.gz Analysis complete for Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R1_001_val_1.fq.gz >>> Now running FastQC on the validated data Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R2_001_val_2.fq.gz<<< Started analysis of Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R2_001_val_2.fq.gz Approx 5% complete for Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R2_001_val_2.fq.gz Approx 10% complete for Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R2_001_val_2.fq.gz Approx 15% complete for 
Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R2_001_val_2.fq.gz Approx 20% complete for Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R2_001_val_2.fq.gz Approx 25% complete for Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R2_001_val_2.fq.gz Approx 30% complete for Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R2_001_val_2.fq.gz Approx 35% complete for Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R2_001_val_2.fq.gz Approx 40% complete for Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R2_001_val_2.fq.gz Approx 45% complete for Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R2_001_val_2.fq.gz Approx 50% complete for Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R2_001_val_2.fq.gz Approx 55% complete for Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R2_001_val_2.fq.gz Approx 60% complete for Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R2_001_val_2.fq.gz Approx 65% complete for Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R2_001_val_2.fq.gz Approx 70% complete for Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R2_001_val_2.fq.gz Approx 75% complete for Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R2_001_val_2.fq.gz Approx 80% complete for Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R2_001_val_2.fq.gz Approx 85% complete for Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R2_001_val_2.fq.gz Approx 90% complete for Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R2_001_val_2.fq.gz Approx 95% complete for Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R2_001_val_2.fq.gz Analysis complete for Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R2_001_val_2.fq.gz Deleting both intermediate output files Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-8to10kb-4_S16_L004_R2_001_trimmed.fq.gz ==================================================================================================== Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-8to10kb-5_S20_L005_R1_001.fastq.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: Geoduck-NMP-gDNA-8to10kb-5_S20_L005_R1_001.fastq.gz Trimming mode: paired-end Trim 
Galore version: 0.4.4_dev
Cutadapt version: 1.16
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28'
Output file(s) will be GZIP compressed
Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-8to10kb-5_S20_L005_R1_001_trimmed.fq.gz
>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-8to10kb-5_S20_L005_R1_001.fastq.gz <<<
This is cutadapt 1.16 with Python 2.7.14
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-8to10kb-5_S20_L005_R1_001.fastq.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 3.66 s (42 us/read; 1.43 M reads/minute).
=== Summary ===
Total reads processed: 87,562
Reads with adapters: 20,789 (23.7%)
Reads written (passing filters): 87,562 (100.0%)
Total basepairs processed: 11,436,175 bp
Quality-trimmed: 304,214 bp (2.7%)
Total written (filtered): 11,089,885 bp (97.0%)
=== Adapter 1 ===
Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 20789 times.
No.
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 34.3% C: 22.8% G: 13.7% T: 28.6% none/other: 0.6% Overview of removed sequences length count expect max.err error counts 1 16267 21890.5 0 16267 2 3095 5472.6 0 3095 3 873 1368.2 0 873 4 298 342.0 0 298 5 56 85.5 0 56 6 7 21.4 0 7 7 6 5.3 0 6 8 2 1.3 0 2 9 1 0.3 0 0 1 10 1 0.1 1 0 1 16 3 0.0 1 3 17 1 0.0 1 0 1 18 1 0.0 1 0 1 21 1 0.0 1 1 25 1 0.0 1 1 30 2 0.0 1 1 1 31 2 0.0 1 2 34 1 0.0 1 0 1 36 1 0.0 1 0 1 38 1 0.0 1 1 39 2 0.0 1 2 40 2 0.0 1 1 1 41 1 0.0 1 1 42 1 0.0 1 0 1 43 1 0.0 1 0 1 48 2 0.0 1 2 49 1 0.0 1 1 59 1 0.0 1 1 66 2 0.0 1 1 1 68 2 0.0 1 2 69 1 0.0 1 1 70 1 0.0 1 0 1 71 1 0.0 1 0 1 72 1 0.0 1 1 73 2 0.0 1 1 1 74 4 0.0 1 0 4 75 21 0.0 1 3 18 76 27 0.0 1 9 18 77 17 0.0 1 5 12 78 19 0.0 1 9 10 79 5 0.0 1 3 2 80 6 0.0 1 5 1 81 1 0.0 1 1 82 2 0.0 1 0 2 83 2 0.0 1 0 2 85 2 0.0 1 2 86 1 0.0 1 1 87 2 0.0 1 1 1 88 1 0.0 1 0 1 90 1 0.0 1 0 1 96 1 0.0 1 0 1 102 1 0.0 1 0 1 108 1 0.0 1 1 109 1 0.0 1 1 118 1 0.0 1 0 1 122 1 0.0 1 0 1 126 1 0.0 1 0 1 131 1 0.0 1 1 146 1 0.0 1 0 1 148 1 0.0 1 1 149 3 0.0 1 1 2 150 19 0.0 1 3 16 151 5 0.0 1 0 5 RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-8to10kb-5_S20_L005_R1_001.fastq.gz ============================================= 87562 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-8to10kb-5_S20_L005_R2_001.fastq.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: Geoduck-NMP-gDNA-8to10kb-5_S20_L005_R2_001.fastq.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required 
adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28'
Output file(s) will be GZIP compressed
Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-8to10kb-5_S20_L005_R2_001_trimmed.fq.gz
>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-8to10kb-5_S20_L005_R2_001.fastq.gz <<<
This is cutadapt 1.16 with Python 2.7.14
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-8to10kb-5_S20_L005_R2_001.fastq.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 2.81 s (32 us/read; 1.87 M reads/minute).
=== Summary ===
Total reads processed: 87,562
Reads with adapters: 18,427 (21.0%)
Reads written (passing filters): 87,562 (100.0%)
Total basepairs processed: 10,025,776 bp
Quality-trimmed: 1,943,431 bp (19.4%)
Total written (filtered): 8,039,227 bp (80.2%)
=== Adapter 1 ===
Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 18427 times.
No.
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 29.6% C: 25.9% G: 13.5% T: 30.0% none/other: 0.9% Overview of removed sequences length count expect max.err error counts 1 13325 21890.5 0 13325 2 3622 5472.6 0 3622 3 791 1368.2 0 791 4 329 342.0 0 329 5 61 85.5 0 61 6 8 21.4 0 8 7 10 5.3 0 10 8 1 1.3 0 1 9 1 0.3 0 1 10 3 0.1 1 2 1 13 2 0.0 1 0 2 14 1 0.0 1 0 1 16 2 0.0 1 1 1 17 2 0.0 1 0 2 18 1 0.0 1 0 1 20 1 0.0 1 0 1 22 1 0.0 1 0 1 23 1 0.0 1 0 1 24 2 0.0 1 2 25 2 0.0 1 2 26 1 0.0 1 1 29 2 0.0 1 1 1 30 2 0.0 1 1 1 31 3 0.0 1 2 1 32 3 0.0 1 2 1 33 4 0.0 1 3 1 35 1 0.0 1 1 37 1 0.0 1 1 38 1 0.0 1 1 41 6 0.0 1 6 44 1 0.0 1 1 45 2 0.0 1 2 46 1 0.0 1 1 47 1 0.0 1 0 1 48 2 0.0 1 2 49 3 0.0 1 3 50 2 0.0 1 1 1 51 4 0.0 1 3 1 52 3 0.0 1 2 1 53 2 0.0 1 1 1 54 4 0.0 1 4 55 1 0.0 1 0 1 56 1 0.0 1 1 57 1 0.0 1 0 1 58 4 0.0 1 2 2 59 3 0.0 1 2 1 61 1 0.0 1 0 1 62 4 0.0 1 2 2 63 2 0.0 1 2 64 2 0.0 1 1 1 65 5 0.0 1 4 1 66 3 0.0 1 2 1 67 38 0.0 1 3 35 68 64 0.0 1 29 35 69 25 0.0 1 16 9 70 15 0.0 1 6 9 71 7 0.0 1 2 5 72 2 0.0 1 0 2 73 1 0.0 1 1 74 1 0.0 1 0 1 80 1 0.0 1 0 1 87 1 0.0 1 0 1 90 2 0.0 1 2 92 1 0.0 1 1 93 2 0.0 1 2 96 1 0.0 1 0 1 100 1 0.0 1 0 1 101 1 0.0 1 0 1 108 2 0.0 1 1 1 109 1 0.0 1 1 112 1 0.0 1 0 1 113 2 0.0 1 1 1 117 1 0.0 1 1 135 1 0.0 1 0 1 136 1 0.0 1 0 1 140 1 0.0 1 0 1 143 1 0.0 1 1 148 1 0.0 1 0 1 149 1 0.0 1 0 1 150 6 0.0 1 0 6 151 3 0.0 1 0 3 RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-8to10kb-5_S20_L005_R2_001.fastq.gz ============================================= 87562 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Validate paired-end files Geoduck-NMP-gDNA-8to10kb-5_S20_L005_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-8to10kb-5_S20_L005_R2_001_trimmed.fq.gz file_1: Geoduck-NMP-gDNA-8to10kb-5_S20_L005_R1_001_trimmed.fq.gz, file_2: Geoduck-NMP-gDNA-8to10kb-5_S20_L005_R2_001_trimmed.fq.gz >>>>> Now validing the length of the 2 paired-end infiles: 
Geoduck-NMP-gDNA-8to10kb-5_S20_L005_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-8to10kb-5_S20_L005_R2_001_trimmed.fq.gz <<<<< Writing validated paired-end read 1 reads to Geoduck-NMP-gDNA-8to10kb-5_S20_L005_R1_001_val_1.fq.gz Writing validated paired-end read 2 reads to Geoduck-NMP-gDNA-8to10kb-5_S20_L005_R2_001_val_2.fq.gz Total number of sequences analysed: 87562 Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 21734 (24.82%) >>> Now running FastQC on the validated data Geoduck-NMP-gDNA-8to10kb-5_S20_L005_R1_001_val_1.fq.gz<<< Started analysis of Geoduck-NMP-gDNA-8to10kb-5_S20_L005_R1_001_val_1.fq.gz Analysis complete for Geoduck-NMP-gDNA-8to10kb-5_S20_L005_R1_001_val_1.fq.gz >>> Now running FastQC on the validated data Geoduck-NMP-gDNA-8to10kb-5_S20_L005_R2_001_val_2.fq.gz<<< Started analysis of Geoduck-NMP-gDNA-8to10kb-5_S20_L005_R2_001_val_2.fq.gz Analysis complete for
Geoduck-NMP-gDNA-8to10kb-5_S20_L005_R2_001_val_2.fq.gz Deleting both intermediate output files Geoduck-NMP-gDNA-8to10kb-5_S20_L005_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-8to10kb-5_S20_L005_R2_001_trimmed.fq.gz ==================================================================================================== Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-8to10kb-6_S24_L006_R1_001.fastq.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: Geoduck-NMP-gDNA-8to10kb-6_S24_L006_R1_001.fastq.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-8to10kb-6_S24_L006_R1_001_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-8to10kb-6_S24_L006_R1_001.fastq.gz <<< This is cutadapt 1.16 with Python 2.7.14 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-8to10kb-6_S24_L006_R1_001.fastq.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 4.21 s (42 us/read; 1.41 M reads/minute). 
=== Summary === Total reads processed: 99,111 Reads with adapters: 23,804 (24.0%) Reads written (passing filters): 99,111 (100.0%) Total basepairs processed: 12,944,228 bp Quality-trimmed: 339,740 bp (2.6%) Total written (filtered): 12,558,590 bp (97.0%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 23804 times. No. of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 34.9% C: 22.6% G: 13.1% T: 28.9% none/other: 0.6% Overview of removed sequences length count expect max.err error counts 1 18504 24777.8 0 18504 2 3686 6194.4 0 3686 3 1014 1548.6 0 1014 4 336 387.2 0 336 5 60 96.8 0 60 6 6 24.2 0 6 7 5 6.0 0 5 8 3 1.5 0 3 9 1 0.4 0 1 10 4 0.1 1 0 4 13 1 0.0 1 1 16 1 0.0 1 1 18 2 0.0 1 2 22 1 0.0 1 1 25 2 0.0 1 0 2 26 1 0.0 1 0 1 27 2 0.0 1 1 1 29 1 0.0 1 1 34 3 0.0 1 1 2 37 1 0.0 1 1 38 1 0.0 1 1 41 4 0.0 1 2 2 44 1 0.0 1 0 1 48 1 0.0 1 0 1 53 1 0.0 1 1 55 1 0.0 1 1 57 2 0.0 1 2 58 1 0.0 1 1 62 1 0.0 1 1 64 1 0.0 1 1 65 1 0.0 1 1 66 2 0.0 1 2 69 1 0.0 1 0 1 70 1 0.0 1 1 72 1 0.0 1 0 1 73 1 0.0 1 1 74 2 0.0 1 1 1 75 15 0.0 1 1 14 76 33 0.0 1 7 26 77 21 0.0 1 7 14 78 16 0.0 1 5 11 79 10 0.0 1 4 6 80 6 0.0 1 3 3 81 2 0.0 1 0 2 82 1 0.0 1 0 1 83 2 0.0 1 1 1 84 2 0.0 1 0 2 86 1 0.0 1 1 88 2 0.0 1 0 2 91 1 0.0 1 1 95 1 0.0 1 0 1 97 2 0.0 1 0 2 100 1 0.0 1 0 1 101 1 0.0 1 1 108 1 0.0 1 0 1 113 1 0.0 1 0 1 120 1 0.0 1 0 1 126 1 0.0 1 1 129 1 0.0 1 1 132 1 0.0 1 0 1 135 1 0.0 1 1 140 2 0.0 1 1 1 144 1 0.0 1 1 147 1 0.0 1 1 149 2 0.0 1 0 2 150 10 0.0 1 4 6 151 7 0.0 1 2 5 RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-8to10kb-6_S24_L006_R1_001.fastq.gz ============================================= 99111 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-8to10kb-6_S24_L006_R2_001.fastq.gz_trimming_report.txt' 
SUMMARISING RUN PARAMETERS ========================== Input filename: Geoduck-NMP-gDNA-8to10kb-6_S24_L006_R2_001.fastq.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-8to10kb-6_S24_L006_R2_001_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-8to10kb-6_S24_L006_R2_001.fastq.gz <<< This is cutadapt 1.16 with Python 2.7.14 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-8to10kb-6_S24_L006_R2_001.fastq.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 3.23 s (33 us/read; 1.84 M reads/minute). === Summary === Total reads processed: 99,111 Reads with adapters: 21,130 (21.3%) Reads written (passing filters): 99,111 (100.0%) Total basepairs processed: 11,417,041 bp Quality-trimmed: 2,151,435 bp (18.8%) Total written (filtered): 9,219,206 bp (80.7%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 21130 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 29.8% C: 25.8% G: 13.8% T: 29.8% none/other: 0.9% Overview of removed sequences length count expect max.err error counts 1 15526 24777.8 0 15526 2 4041 6194.4 0 4041 3 787 1548.6 0 787 4 377 387.2 0 377 5 93 96.8 0 93 6 12 24.2 0 12 7 8 6.0 0 8 8 3 1.5 0 3 9 1 0.4 0 1 10 3 0.1 1 0 3 11 1 0.0 1 1 12 4 0.0 1 4 13 2 0.0 1 0 2 14 5 0.0 1 4 1 16 4 0.0 1 1 3 17 3 0.0 1 2 1 22 2 0.0 1 1 1 23 1 0.0 1 1 24 3 0.0 1 3 25 2 0.0 1 2 26 4 0.0 1 4 28 5 0.0 1 3 2 32 2 0.0 1 2 33 5 0.0 1 2 3 36 1 0.0 1 1 37 2 0.0 1 1 1 38 2 0.0 1 1 1 39 1 0.0 1 1 42 5 0.0 1 5 44 4 0.0 1 3 1 45 2 0.0 1 1 1 46 2 0.0 1 0 2 48 1 0.0 1 1 50 1 0.0 1 1 51 8 0.0 1 6 2 52 1 0.0 1 1 55 2 0.0 1 1 1 56 3 0.0 1 2 1 57 1 0.0 1 0 1 58 1 0.0 1 1 59 5 0.0 1 5 61 1 0.0 1 0 1 62 3 0.0 1 3 63 2 0.0 1 2 64 3 0.0 1 2 1 65 2 0.0 1 2 66 2 0.0 1 2 67 39 0.0 1 4 35 68 55 0.0 1 23 32 69 29 0.0 1 14 15 70 12 0.0 1 2 10 71 6 0.0 1 2 4 72 1 0.0 1 0 1 73 1 0.0 1 0 1 74 2 0.0 1 1 1 77 1 0.0 1 0 1 81 1 0.0 1 1 85 1 0.0 1 1 90 1 0.0 1 0 1 92 1 0.0 1 0 1 98 1 0.0 1 0 1 102 1 0.0 1 1 106 2 0.0 1 0 2 110 1 0.0 1 1 111 1 0.0 1 1 114 1 0.0 1 0 1 117 1 0.0 1 0 1 118 1 0.0 1 0 1 121 1 0.0 1 1 123 1 0.0 1 0 1 127 1 0.0 1 1 131 1 0.0 1 0 1 133 2 0.0 1 1 1 137 1 0.0 1 0 1 147 1 0.0 1 1 149 4 0.0 1 1 3 150 9 0.0 1 2 7 151 1 0.0 1 0 1 RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-8to10kb-6_S24_L006_R2_001.fastq.gz ============================================= 99111 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Validate paired-end files Geoduck-NMP-gDNA-8to10kb-6_S24_L006_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-8to10kb-6_S24_L006_R2_001_trimmed.fq.gz file_1: Geoduck-NMP-gDNA-8to10kb-6_S24_L006_R1_001_trimmed.fq.gz, file_2: Geoduck-NMP-gDNA-8to10kb-6_S24_L006_R2_001_trimmed.fq.gz >>>>> Now validing the length of the 2 paired-end infiles: 
Geoduck-NMP-gDNA-8to10kb-6_S24_L006_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-8to10kb-6_S24_L006_R2_001_trimmed.fq.gz <<<<< Writing validated paired-end read 1 reads to Geoduck-NMP-gDNA-8to10kb-6_S24_L006_R1_001_val_1.fq.gz Writing validated paired-end read 2 reads to Geoduck-NMP-gDNA-8to10kb-6_S24_L006_R2_001_val_2.fq.gz Total number of sequences analysed: 99111 Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 23739 (23.95%) >>> Now running FastQC on the validated data Geoduck-NMP-gDNA-8to10kb-6_S24_L006_R1_001_val_1.fq.gz<<< Started analysis of Geoduck-NMP-gDNA-8to10kb-6_S24_L006_R1_001_val_1.fq.gz Analysis complete for Geoduck-NMP-gDNA-8to10kb-6_S24_L006_R1_001_val_1.fq.gz >>> Now running FastQC on the validated data Geoduck-NMP-gDNA-8to10kb-6_S24_L006_R2_001_val_2.fq.gz<<< Started analysis of Geoduck-NMP-gDNA-8to10kb-6_S24_L006_R2_001_val_2.fq.gz Analysis complete for
Geoduck-NMP-gDNA-8to10kb-6_S24_L006_R2_001_val_2.fq.gz Deleting both intermediate output files Geoduck-NMP-gDNA-8to10kb-6_S24_L006_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-8to10kb-6_S24_L006_R2_001_trimmed.fq.gz ==================================================================================================== Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-8to10kb-7_S28_L007_R1_001.fastq.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: Geoduck-NMP-gDNA-8to10kb-7_S28_L007_R1_001.fastq.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-8to10kb-7_S28_L007_R1_001_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-8to10kb-7_S28_L007_R1_001.fastq.gz <<< This is cutadapt 1.16 with Python 2.7.14 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-8to10kb-7_S28_L007_R1_001.fastq.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 10626.56 s (29 us/read; 2.10 M reads/minute). === Summary === Total reads processed: 372,443,998 Reads with adapters: 80,265,508 (21.6%) Reads written (passing filters): 372,443,998 (100.0%) Total basepairs processed: 43,423,870,063 bp Quality-trimmed: 1,813,021,197 bp (4.2%) Total written (filtered): 41,416,634,105 bp (95.4%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 80265508 times. No.
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 32.5% C: 23.3% G: 14.3% T: 29.4% none/other: 0.5% Overview of removed sequences length count expect max.err error counts 1 59245360 93110999.5 0 59245360 2 14414406 23277749.9 0 14414406 3 3607115 5819437.5 0 3607115 4 1486890 1454859.4 0 1486890 5 289552 363714.8 0 289552 6 33411 90928.7 0 33411 7 17904 22732.2 0 17904 8 11002 5683.0 0 11002 9 3111 1420.8 0 904 2207 10 12404 355.2 1 4357 8047 11 2688 88.8 1 584 2104 12 5889 22.2 1 3925 1964 13 6122 5.5 1 4224 1898 14 4855 5.5 1 3576 1279 15 5909 5.5 1 4188 1721 16 4703 5.5 1 3482 1221 17 2831 5.5 1 1321 1510 18 9730 5.5 1 6763 2967 19 14222 5.5 1 11869 2353 20 2075 5.5 1 953 1122 21 1359 5.5 1 628 731 22 7466 5.5 1 5236 2230 23 5444 5.5 1 4059 1385 24 7650 5.5 1 6192 1458 25 3013 5.5 1 1781 1232 26 4510 5.5 1 3425 1085 27 5655 5.5 1 4257 1398 28 10377 5.5 1 8042 2335 29 1449 5.5 1 697 752 30 4372 5.5 1 2965 1407 31 6837 5.5 1 5631 1206 32 6183 5.5 1 4701 1482 33 4569 5.5 1 3638 931 34 6898 5.5 1 5030 1868 35 3456 5.5 1 2090 1366 36 5169 5.5 1 3995 1174 37 7375 5.5 1 5624 1751 38 8775 5.5 1 7203 1572 39 14454 5.5 1 11975 2479 40 1694 5.5 1 942 752 41 5066 5.5 1 3330 1736 42 7061 5.5 1 5536 1525 43 8462 5.5 1 7229 1233 44 5510 5.5 1 4365 1145 45 1514 5.5 1 881 633 46 4120 5.5 1 3375 745 47 1717 5.5 1 1046 671 48 7121 5.5 1 5708 1413 49 8480 5.5 1 7212 1268 50 1906 5.5 1 1200 706 51 5223 5.5 1 4238 985 52 9373 5.5 1 7946 1427 53 6325 5.5 1 5052 1273 54 3170 5.5 1 2117 1053 55 12433 5.5 1 10395 2038 56 9115 5.5 1 7913 1202 57 4068 5.5 1 2942 1126 58 3976 5.5 1 3100 876 59 12760 5.5 1 11575 1185 60 1943 5.5 1 1320 623 61 1528 5.5 1 991 537 62 8979 5.5 1 7799 1180 63 3476 5.5 1 2480 996 64 2768 5.5 1 2016 752 65 5589 5.5 1 4669 920 66 13129 5.5 1 11342 1787 67 6377 5.5 1 5448 929 68 5667 5.5 1 4789 878 69 6342 5.5 1 5412 930 70 6305 5.5 1 5389 916 71 6755 5.5 1 5725 1030 72 7787 5.5 1 6493 1294 73 8902 5.5 1 7237 1665 74 13116 5.5 1 9011 4105 
75 68811 5.5 1 14808 54003 76 141022 5.5 1 94577 46445 77 110493 5.5 1 78201 32292 78 70912 5.5 1 47609 23303 79 43363 5.5 1 29584 13779 80 26403 5.5 1 17825 8578 81 16565 5.5 1 11018 5547 82 11322 5.5 1 7432 3890 83 8285 5.5 1 5361 2924 84 6571 5.5 1 4277 2294 85 5574 5.5 1 3633 1941 86 5068 5.5 1 3272 1796 87 4931 5.5 1 3334 1597 88 4592 5.5 1 3129 1463 89 4595 5.5 1 3167 1428 90 4076 5.5 1 2733 1343 91 3751 5.5 1 2520 1231 92 3593 5.5 1 2447 1146 93 3468 5.5 1 2391 1077 94 3295 5.5 1 2277 1018 95 3329 5.5 1 2277 1052 96 3110 5.5 1 2133 977 97 3148 5.5 1 2148 1000 98 3016 5.5 1 2104 912 99 2946 5.5 1 2071 875 100 2902 5.5 1 2040 862 101 2885 5.5 1 2039 846 102 2707 5.5 1 1910 797 103 2561 5.5 1 1848 713 104 2526 5.5 1 1740 786 105 2620 5.5 1 1896 724 106 2561 5.5 1 1839 722 107 2359 5.5 1 1730 629 108 2459 5.5 1 1800 659 109 2312 5.5 1 1652 660 110 2253 5.5 1 1656 597 111 2215 5.5 1 1618 597 112 2163 5.5 1 1545 618 113 2085 5.5 1 1507 578 114 2158 5.5 1 1590 568 115 2005 5.5 1 1479 526 116 2107 5.5 1 1563 544 117 2052 5.5 1 1532 520 118 2039 5.5 1 1512 527 119 2019 5.5 1 1493 526 120 1942 5.5 1 1475 467 121 1812 5.5 1 1316 496 122 1801 5.5 1 1333 468 123 1736 5.5 1 1305 431 124 1729 5.5 1 1296 433 125 1708 5.5 1 1318 390 126 1689 5.5 1 1314 375 127 1622 5.5 1 1257 365 128 1974 5.5 1 1601 373 129 1616 5.5 1 1294 322 130 1692 5.5 1 1347 345 131 1542 5.5 1 1212 330 132 1467 5.5 1 1163 304 133 1390 5.5 1 1074 316 134 1418 5.5 1 1108 310 135 1543 5.5 1 1163 380 136 1710 5.5 1 1306 404 137 2114 5.5 1 1668 446 138 5226 5.5 1 4790 436 139 3416 5.5 1 3019 397 140 2425 5.5 1 2035 390 141 1720 5.5 1 1298 422 142 1467 5.5 1 1047 420 143 1384 5.5 1 919 465 144 1554 5.5 1 998 556 145 1687 5.5 1 968 719 146 2051 5.5 1 1114 937 147 2872 5.5 1 1385 1487 148 5820 5.5 1 2576 3244 149 14836 5.5 1 6249 8587 150 49326 5.5 1 21035 28291 151 19045 5.5 1 6985 12060 RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-8to10kb-7_S28_L007_R1_001.fastq.gz 
============================================= 372443998 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-8to10kb-7_S28_L007_R2_001.fastq.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: Geoduck-NMP-gDNA-8to10kb-7_S28_L007_R2_001.fastq.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-8to10kb-7_S28_L007_R2_001_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-8to10kb-7_S28_L007_R2_001.fastq.gz <<< This is cutadapt 1.16 with Python 2.7.14 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-8to10kb-7_S28_L007_R2_001.fastq.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 11519.76 s (31 us/read; 1.94 M reads/minute). === Summary === Total reads processed: 372,443,998 Reads with adapters: 86,031,637 (23.1%) Reads written (passing filters): 372,443,998 (100.0%) Total basepairs processed: 44,070,726,638 bp Quality-trimmed: 3,699,118,639 bp (8.4%) Total written (filtered): 40,140,473,061 bp (91.1%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 86031637 times. No.
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 31.1% C: 25.7% G: 14.0% T: 28.7% none/other: 0.5% Overview of removed sequences length count expect max.err error counts 1 62932164 93110999.5 0 62932164 2 15763308 23277749.9 0 15763308 3 3630638 5819437.5 0 3630638 4 1390758 1454859.4 0 1390758 5 285910 363714.8 0 285910 6 44754 90928.7 0 44754 7 38813 22732.2 0 38813 8 20281 5683.0 0 20281 9 5803 1420.8 0 3248 2555 10 22842 355.2 1 10314 12528 11 7097 88.8 1 2587 4510 12 14537 22.2 1 8636 5901 13 8112 5.5 1 4006 4106 14 35258 5.5 1 23275 11983 15 8916 5.5 1 5616 3300 16 6571 5.5 1 3537 3034 17 22095 5.5 1 14704 7391 18 3434 5.5 1 1658 1776 19 17657 5.5 1 12293 5364 20 12174 5.5 1 8567 3607 21 1629 5.5 1 616 1013 22 2961 5.5 1 1350 1611 23 14038 5.5 1 9036 5002 24 39182 5.5 1 26291 12891 25 13711 5.5 1 9272 4439 26 17342 5.5 1 13161 4181 27 2021 5.5 1 943 1078 28 14623 5.5 1 9946 4677 29 2640 5.5 1 1295 1345 30 16676 5.5 1 11583 5093 31 3602 5.5 1 2029 1573 32 20314 5.5 1 14546 5768 33 30288 5.5 1 23871 6417 34 2494 5.5 1 1245 1249 35 8521 5.5 1 4770 3751 36 7839 5.5 1 4752 3087 37 15842 5.5 1 12063 3779 38 7963 5.5 1 4752 3211 39 9215 5.5 1 6785 2430 40 4028 5.5 1 2223 1805 41 15873 5.5 1 11121 4752 42 30172 5.5 1 21809 8363 43 4209 5.5 1 2597 1612 44 16612 5.5 1 11943 4669 45 31129 5.5 1 22916 8213 46 11543 5.5 1 8571 2972 47 4055 5.5 1 2446 1609 48 31668 5.5 1 23729 7939 49 17189 5.5 1 12774 4415 50 9119 5.5 1 5772 3347 51 65138 5.5 1 49833 15305 52 12546 5.5 1 8644 3902 53 9251 5.5 1 6606 2645 54 10466 5.5 1 8114 2352 55 18862 5.5 1 13995 4867 56 10545 5.5 1 7471 3074 57 15133 5.5 1 11674 3459 58 19812 5.5 1 15859 3953 59 13552 5.5 1 10570 2982 60 13377 5.5 1 10567 2810 61 13586 5.5 1 10694 2892 62 14919 5.5 1 11768 3151 63 18785 5.5 1 14930 3855 64 20603 5.5 1 16613 3990 65 22385 5.5 1 17978 4407 66 25310 5.5 1 19734 5576 67 122545 5.5 1 28400 94145 68 306243 5.5 1 220245 85998 69 170620 5.5 1 139154 31466 70 66147 5.5 1 41059 
25088 71 39597 5.5 1 28654 10943 72 18236 5.5 1 12191 6045 73 12676 5.5 1 8334 4342 74 9031 5.5 1 5780 3251 75 8608 5.5 1 5441 3167 76 7710 5.5 1 4997 2713 77 7287 5.5 1 4520 2767 78 6991 5.5 1 4328 2663 79 6914 5.5 1 4285 2629 80 6958 5.5 1 4412 2546 81 6817 5.5 1 4218 2599 82 6801 5.5 1 4263 2538 83 6287 5.5 1 3921 2366 84 6764 5.5 1 4198 2566 85 6262 5.5 1 3900 2362 86 5621 5.5 1 3578 2043 87 5397 5.5 1 3301 2096 88 5517 5.5 1 3420 2097 89 5380 5.5 1 3324 2056 90 5498 5.5 1 3396 2102 91 5416 5.5 1 3352 2064 92 6135 5.5 1 3546 2589 93 5226 5.5 1 3157 2069 94 5214 5.5 1 3170 2044 95 4910 5.5 1 2896 2014 96 5787 5.5 1 3301 2486 97 4658 5.5 1 2861 1797 98 4423 5.5 1 2775 1648 99 4421 5.5 1 2733 1688 100 4728 5.5 1 3005 1723 101 4696 5.5 1 2980 1716 102 4631 5.5 1 2884 1747 103 4627 5.5 1 2978 1649 104 4455 5.5 1 2900 1555 105 4440 5.5 1 2915 1525 106 4348 5.5 1 2809 1539 107 4158 5.5 1 2724 1434 108 4226 5.5 1 2759 1467 109 4061 5.5 1 2764 1297 110 3891 5.5 1 2614 1277 111 3704 5.5 1 2459 1245 112 3625 5.5 1 2402 1223 113 3421 5.5 1 2240 1181 114 3391 5.5 1 2198 1193 115 3243 5.5 1 2162 1081 116 3133 5.5 1 2094 1039 117 3019 5.5 1 1990 1029 118 3016 5.5 1 2046 970 119 2881 5.5 1 1921 960 120 2800 5.5 1 1868 932 121 2480 5.5 1 1659 821 122 2424 5.5 1 1627 797 123 2289 5.5 1 1471 818 124 2268 5.5 1 1465 803 125 2002 5.5 1 1317 685 126 1918 5.5 1 1288 630 127 1741 5.5 1 1176 565 128 1688 5.5 1 1184 504 129 1359 5.5 1 866 493 130 1280 5.5 1 897 383 131 995 5.5 1 651 344 132 906 5.5 1 595 311 133 807 5.5 1 505 302 134 775 5.5 1 513 262 135 707 5.5 1 449 258 136 706 5.5 1 460 246 137 590 5.5 1 364 226 138 545 5.5 1 347 198 139 553 5.5 1 330 223 140 500 5.5 1 294 206 141 463 5.5 1 260 203 142 455 5.5 1 232 223 143 475 5.5 1 214 261 144 497 5.5 1 200 297 145 487 5.5 1 195 292 146 554 5.5 1 218 336 147 759 5.5 1 277 482 148 2280 5.5 1 798 1482 149 6141 5.5 1 2138 4003 150 28266 5.5 1 10082 18184 151 6246 5.5 1 2158 4088 RUN STATISTICS FOR INPUT FILE: 
Geoduck-NMP-gDNA-8to10kb-7_S28_L007_R2_001.fastq.gz ============================================= 372443998 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Validate paired-end files Geoduck-NMP-gDNA-8to10kb-7_S28_L007_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-8to10kb-7_S28_L007_R2_001_trimmed.fq.gz file_1: Geoduck-NMP-gDNA-8to10kb-7_S28_L007_R1_001_trimmed.fq.gz, file_2: Geoduck-NMP-gDNA-8to10kb-7_S28_L007_R2_001_trimmed.fq.gz >>>>> Now validing the length of the 2 paired-end infiles: Geoduck-NMP-gDNA-8to10kb-7_S28_L007_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-8to10kb-7_S28_L007_R2_001_trimmed.fq.gz <<<<< Writing validated paired-end read 1 reads to Geoduck-NMP-gDNA-8to10kb-7_S28_L007_R1_001_val_1.fq.gz Writing validated paired-end read 2 reads to Geoduck-NMP-gDNA-8to10kb-7_S28_L007_R2_001_val_2.fq.gz Total number of sequences analysed: 372443998 Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 90957756 (24.42%) >>> Now running FastQC on the validated data Geoduck-NMP-gDNA-8to10kb-7_S28_L007_R1_001_val_1.fq.gz<<< Started analysis of Geoduck-NMP-gDNA-8to10kb-7_S28_L007_R1_001_val_1.fq.gz Analysis complete for Geoduck-NMP-gDNA-8to10kb-7_S28_L007_R1_001_val_1.fq.gz >>> Now running FastQC on the validated data Geoduck-NMP-gDNA-8to10kb-7_S28_L007_R2_001_val_2.fq.gz<<< Started analysis of Geoduck-NMP-gDNA-8to10kb-7_S28_L007_R2_001_val_2.fq.gz Analysis complete for Geoduck-NMP-gDNA-8to10kb-7_S28_L007_R2_001_val_2.fq.gz Deleting both intermediate output files Geoduck-NMP-gDNA-8to10kb-7_S28_L007_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-8to10kb-7_S28_L007_R2_001_trimmed.fq.gz ==================================================================================================== Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R1_001.fastq.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R1_001.fastq.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to
Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R1_001_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R1_001.fastq.gz <<< 10000000 sequences processed 20000000 sequences processed 30000000 sequences processed 40000000 sequences processed 50000000 sequences processed 60000000 sequences processed 70000000 sequences processed 80000000 sequences processed 90000000 sequences processed 100000000 sequences processed 110000000 sequences processed 120000000 sequences processed 130000000 sequences processed 140000000 sequences processed 150000000 sequences processed 160000000 sequences processed 170000000 sequences processed 180000000 sequences processed 190000000 sequences processed 200000000 sequences processed 210000000 sequences processed 220000000 sequences processed 230000000 sequences processed 240000000 sequences processed 250000000 sequences processed 260000000 sequences processed 270000000 sequences processed 280000000 sequences processed 290000000 sequences processed 300000000 sequences processed 310000000 sequences processed 320000000 sequences processed 330000000 sequences processed 340000000 sequences processed 350000000 sequences processed 360000000 sequences processed 370000000 sequences processed This is cutadapt 1.16 with Python 2.7.14 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R1_001.fastq.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 10716.32 s (28 us/read; 2.11 M reads/minute). 

=== Summary ===

Total reads processed: 377,720,238
Reads with adapters: 81,190,946 (21.5%)
Reads written (passing filters): 377,720,238 (100.0%)

Total basepairs processed: 44,137,321,483 bp
Quality-trimmed: 1,790,708,809 bp (4.1%)
Total written (filtered): 42,153,855,184 bp (95.5%)

=== Adapter 1 ===

Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 81190946 times.

No. of allowed errors:
0-9 bp: 0; 10-13 bp: 1

Bases preceding removed adapters:
A: 32.6%
C: 23.2%
G: 14.3%
T: 29.4%
none/other: 0.5%

Overview of removed sequences
length count expect max.err error counts
1 60062944 94430059.5 0 60062944
2 14458552 23607514.9 0 14458552
3 3668644 5901878.7 0 3668644
4 1514088 1475469.7 0 1514088
5 296428 368867.4 0 296428
6 34325 92216.9 0 34325
7 17506 23054.2 0 17506
8 11030 5763.6 0 11030
9 2952 1440.9 0 847 2105
10 12326 360.2 1 4426 7900
11 2746 90.1 1 570 2176
12 5760 22.5 1 3870 1890
13 6237 5.6 1 4374 1863
14 4689 5.6 1 3488 1201
15 5952 5.6 1 4256 1696
16 4675 5.6 1 3479 1196
17 2519 5.6 1 1132 1387
18 9156 5.6 1 6359 2797
19 14052 5.6 1 11787 2265
20 1959 5.6 1 934 1025
21 1367 5.6 1 621 746
22 7274 5.6 1 5216 2058
23 5488 5.6 1 4117 1371
24 7754 5.6 1 6325 1429
25 2859 5.6 1 1654 1205
26 4354 5.6 1 3345 1009
27 5628 5.6 1 4243 1385
28 10234 5.6 1 7980 2254
29 1344 5.6 1 660 684
30 4079 5.6 1 2740 1339
31 6952 5.6 1 5654 1298
32 6107 5.6 1 4595 1512
33 4687 5.6 1 3686 1001
34 6836 5.6 1 5001 1835
35 3195 5.6 1 1882 1313
36 5117 5.6 1 3952 1165
37 7224 5.6 1 5495 1729
38 8831 5.6 1 7254 1577
39 14332 5.6 1 11919 2413
40 1625 5.6 1 891 734
41 4743 5.6 1 3149 1594
42 6951 5.6 1 5435 1516
43 8013 5.6 1 6814 1199
44 5498 5.6 1 4328 1170
45 1442 5.6 1 828 614
46 4060 5.6 1 3291 769
47 1578 5.6 1 940 638
48 6955 5.6 1 5512 1443
49 8571 5.6 1 7255 1316
50 1772 5.6 1 1103 669
51 5242 5.6 1 4253 989
52 9256 5.6 1 7778 1478
53 6058 5.6 1 4864 1194
54 3057 5.6 1 2080 977
55 12152 5.6 1 10091 2061
56 8887 5.6 1 7741 1146
57 3891 5.6 1 2845 1046
58 3771 5.6 1 2944 827
59 12626 5.6 1 11334 1292
60 1864 5.6 1 1200 664
61 1479 5.6 1 998 481
62 8820 5.6 1 7630 1190
63 3154 5.6 1 2248 906
64 2696 5.6 1 1965 731
65 5457 5.6 1 4557 900
66 12512 5.6 1 10804 1708
67 5927 5.6 1 5025 902
68 5574 5.6 1 4706 868
69 6120 5.6 1 5206 914
70 6130 5.6 1 5187 943
71 6553 5.6 1 5494 1059
72 7316 5.6 1 6051 1265
73 8909 5.6 1 7176 1733
74 12845 5.6 1 8768 4077
75 75976 5.6 1 15028 60948
76 148428 5.6 1 102903 45525
77 107146 5.6 1 78133 29013
78 62019 5.6 1 42416 19603
79 36095 5.6 1 25235 10860
80 22919 5.6 1 15603 7316
81 14348 5.6 1 9634 4714
82 9658 5.6 1 6469 3189
83 7234 5.6 1 4728 2506
84 5798 5.6 1 3768 2030
85 5156 5.6 1 3393 1763
86 4721 5.6 1 3064 1657
87 4576 5.6 1 3054 1522
88 4309 5.6 1 2961 1348
89 4336 5.6 1 3032 1304
90 3815 5.6 1 2568 1247
91 3683 5.6 1 2482 1201
92 3520 5.6 1 2365 1155
93 3172 5.6 1 2169 1003
94 3230 5.6 1 2262 968
95 3028 5.6 1 2090 938
96 3172 5.6 1 2185 987
97 2839 5.6 1 1950 889
98 2858 5.6 1 2041 817
99 2924 5.6 1 2091 833
100 2721 5.6 1 1899 822
101 2683 5.6 1 1927 756
102 2660 5.6 1 1845 815
103 2498 5.6 1 1795 703
104 2571 5.6 1 1853 718
105 2510 5.6 1 1801 709
106 2381 5.6 1 1731 650
107 2401 5.6 1 1713 688
108 2365 5.6 1 1680 685
109 2297 5.6 1 1682 615
110 2280 5.6 1 1660 620
111 2167 5.6 1 1609 558
112 2135 5.6 1 1521 614
113 2021 5.6 1 1450 571
114 2007 5.6 1 1436 571
115 1964 5.6 1 1412 552
116 1975 5.6 1 1444 531
117 1951 5.6 1 1444 507
118 1938 5.6 1 1458 480
119 1810 5.6 1 1396 414
120 1871 5.6 1 1375 496
121 1697 5.6 1 1274 423
122 1702 5.6 1 1288 414
123 1604 5.6 1 1174 430
124 1678 5.6 1 1262 416
125 1610 5.6 1 1242 368
126 1620 5.6 1 1245 375
127 1580 5.6 1 1200 380
128 1859 5.6 1 1522 337
129 1599 5.6 1 1292 307
130 1553 5.6 1 1260 293
131 1494 5.6 1 1217 277
132 1329 5.6 1 1040 289
133 1335 5.6 1 1054 281
134 1416 5.6 1 1103 313
135 1423 5.6 1 1081 342
136 1703 5.6 1 1316 387
137 2071 5.6 1 1630 441
138 5334 5.6 1 4925 409
139 3326 5.6 1 2967 359
140 2196 5.6 1 1826 370
141 1596 5.6 1 1245 351
142 1408 5.6 1 986 422
143 1380 5.6 1 908 472
144 1473 5.6 1 953 520
145 1520 5.6 1 897 623
146 1854 5.6 1 967 887
147 2646 5.6 1 1293 1353
148 5516 5.6 1 2385 3131
149 13966 5.6 1 6048 7918
150 47659 5.6 1 20533 27126
151 19927 5.6 1 7446 12481

RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R1_001.fastq.gz
=============================================
377720238 sequences processed in total
The length threshold of paired-end sequences gets evaluated later on (in the validation step)

Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R2_001.fastq.gz_trimming_report.txt'

SUMMARISING RUN PARAMETERS
==========================
Input filename: Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R2_001.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.4.4_dev
Cutadapt version: 1.16
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28'
Output file(s) will be GZIP compressed

Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R2_001_trimmed.fq.gz

>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R2_001.fastq.gz <<<
10000000 sequences processed
20000000 sequences processed
30000000 sequences processed
40000000 sequences processed
50000000 sequences processed
60000000
sequences processed
70000000 sequences processed
80000000 sequences processed
90000000 sequences processed
100000000 sequences processed
110000000 sequences processed
120000000 sequences processed
130000000 sequences processed
140000000 sequences processed
150000000 sequences processed
160000000 sequences processed
170000000 sequences processed
180000000 sequences processed
190000000 sequences processed
200000000 sequences processed
210000000 sequences processed
220000000 sequences processed
230000000 sequences processed
240000000 sequences processed
250000000 sequences processed
260000000 sequences processed
270000000 sequences processed
280000000 sequences processed
290000000 sequences processed
300000000 sequences processed
310000000 sequences processed
320000000 sequences processed
330000000 sequences processed
340000000 sequences processed
350000000 sequences processed
360000000 sequences processed
370000000 sequences processed
This is cutadapt 1.16 with Python 2.7.14
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R2_001.fastq.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 11671.24 s (31 us/read; 1.94 M reads/minute).

=== Summary ===

Total reads processed: 377,720,238
Reads with adapters: 87,347,023 (23.1%)
Reads written (passing filters): 377,720,238 (100.0%)

Total basepairs processed: 44,857,084,411 bp
Quality-trimmed: 3,938,365,587 bp (8.8%)
Total written (filtered): 40,693,489,573 bp (90.7%)

=== Adapter 1 ===

Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 87347023 times.

No. of allowed errors:
0-9 bp: 0; 10-13 bp: 1

Bases preceding removed adapters:
A: 31.2%
C: 25.6%
G: 14.0%
T: 28.7%
none/other: 0.5%

Overview of removed sequences
length count expect max.err error counts
1 64066312 94430059.5 0 64066312
2 15993355 23607514.9 0 15993355
3 3673052 5901878.7 0 3673052
4 1404374 1475469.7 0 1404374
5 287493 368867.4 0 287493
6 45262 92216.9 0 45262
7 38025 23054.2 0 38025
8 19736 5763.6 0 19736
9 5639 1440.9 0 3062 2577
10 22476 360.2 1 10016 12460
11 6881 90.1 1 2519 4362
12 14167 22.5 1 8245 5922
13 7919 5.6 1 3855 4064
14 34347 5.6 1 22553 11794
15 8822 5.6 1 5533 3289
16 6314 5.6 1 3362 2952
17 20823 5.6 1 13622 7201
18 3401 5.6 1 1669 1732
19 17012 5.6 1 11793 5219
20 11714 5.6 1 7969 3745
21 1483 5.6 1 564 919
22 2815 5.6 1 1226 1589
23 13432 5.6 1 8450 4982
24 38274 5.6 1 25571 12703
25 13332 5.6 1 9022 4310
26 16357 5.6 1 12278 4079
27 1902 5.6 1 877 1025
28 14080 5.6 1 9560 4520
29 2632 5.6 1 1300 1332
30 16074 5.6 1 10910 5164
31 3407 5.6 1 1872 1535
32 19645 5.6 1 13857 5788
33 29129 5.6 1 22791 6338
34 2383 5.6 1 1175 1208
35 8232 5.6 1 4522 3710
36 7624 5.6 1 4717 2907
37 15203 5.6 1 11541 3662
38 7593 5.6 1 4665 2928
39 8831 5.6 1 6456 2375
40 3676 5.6 1 2004 1672
41 15192 5.6 1 10484 4708
42 28932 5.6 1 20805 8127
43 3982 5.6 1 2484 1498
44 15845 5.6 1 11181 4664
45 29799 5.6 1 21618 8181
46 10912 5.6 1 7970 2942
47 3962 5.6 1 2284 1678
48 29788 5.6 1 22199 7589
49 16618 5.6 1 12128 4490
50 8748 5.6 1 5598 3150
51 61798 5.6 1 46815 14983
52 11840 5.6 1 8120 3720
53 8824 5.6 1 6172 2652
54 9801 5.6 1 7490 2311
55 17944 5.6 1 13298 4646
56 9879 5.6 1 7108 2771
57 14218 5.6 1 10907 3311
58 18684 5.6 1 14872 3812
59 12372 5.6 1 9635 2737
60 12356 5.6 1 9673 2683
61 12687 5.6 1 9925 2762
62 14097 5.6 1 11080 3017
63 17831 5.6 1 14094 3737
64 19526 5.6 1 15656 3870
65 21185 5.6 1 17020 4165
66 24063 5.6 1 18432 5631
67 125538 5.6 1 26579 98959
68 305402 5.6 1 222022 83380
69 158741 5.6 1 129787 28954
70 56952 5.6 1 35453 21499
71 33493 5.6 1 24121 9372
72 15689 5.6 1 10366 5323
73 10962 5.6 1 7238 3724
74 8003 5.6 1 5121 2882
75 7723 5.6 1 4808 2915
76 6772 5.6 1 4348 2424
77 6557 5.6 1 3962 2595
78 6159 5.6 1 3817 2342
79 6032 5.6 1 3737 2295
80 6187 5.6 1 3851 2336
81 6071 5.6 1 3713 2358
82 6010 5.6 1 3763 2247
83 5651 5.6 1 3449 2202
84 6136 5.6 1 3717 2419
85 5500 5.6 1 3284 2216
86 4939 5.6 1 3064 1875
87 4840 5.6 1 2969 1871
88 4961 5.6 1 3013 1948
89 4727 5.6 1 2799 1928
90 4805 5.6 1 2923 1882
91 4868 5.6 1 2917 1951
92 5313 5.6 1 3001 2312
93 4731 5.6 1 2787 1944
94 4586 5.6 1 2722 1864
95 4451 5.6 1 2582 1869
96 5016 5.6 1 2800 2216
97 4172 5.6 1 2493 1679
98 3928 5.6 1 2362 1566
99 3927 5.6 1 2368 1559
100 4198 5.6 1 2669 1529
101 4135 5.6 1 2616 1519
102 3880 5.6 1 2448 1432
103 3981 5.6 1 2518 1463
104 3767 5.6 1 2406 1361
105 3776 5.6 1 2419 1357
106 3858 5.6 1 2504 1354
107 3652 5.6 1 2366 1286
108 3489 5.6 1 2318 1171
109 3422 5.6 1 2261 1161
110 3268 5.6 1 2144 1124
111 3067 5.6 1 2005 1062
112 3151 5.6 1 2074 1077
113 2992 5.6 1 1949 1043
114 3003 5.6 1 1926 1077
115 2804 5.6 1 1805 999
116 2673 5.6 1 1772 901
117 2642 5.6 1 1697 945
118 2566 5.6 1 1694 872
119 2557 5.6 1 1671 886
120 2301 5.6 1 1491 810
121 2224 5.6 1 1469 755
122 2110 5.6 1 1383 727
123 1904 5.6 1 1233 671
124 1899 5.6 1 1224 675
125 1727 5.6 1 1133 594
126 1681 5.6 1 1118 563
127 1483 5.6 1 955 528
128 1548 5.6 1 1080 468
129 1241 5.6 1 785 456
130 1127 5.6 1 759 368
131 889 5.6 1 546 343
132 814 5.6 1 530 284
133 696 5.6 1 412 284
134 666 5.6 1 426 240
135 605 5.6 1 391 214
136 575 5.6 1 349 226
137 520 5.6 1 325 195
138 548 5.6 1 353 195
139 447 5.6 1 280 167
140 442 5.6 1 268 174
141 428 5.6 1 227 201
142 417 5.6 1 218 199
143 401 5.6 1 191 210
144 415 5.6 1 193 222
145 464 5.6 1 181 283
146 494 5.6 1 190 304
147 649 5.6 1 226 423
148 2178 5.6 1 800 1378
149 5593 5.6 1 1968 3625
150 25016 5.6 1 9051 15965
151 5687 5.6 1 1989 3698

RUN STATISTICS FOR INPUT FILE:
Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R2_001.fastq.gz
=============================================
377720238 sequences processed in total
The length threshold of paired-end sequences gets evaluated later on (in the validation step)

Validate paired-end files Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R2_001_trimmed.fq.gz
file_1: Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R1_001_trimmed.fq.gz, file_2: Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R2_001_trimmed.fq.gz

>>>>> Now validing the length of the 2 paired-end infiles: Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R2_001_trimmed.fq.gz <<<<<
Writing validated paired-end read 1 reads to Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R1_001_val_1.fq.gz
Writing validated paired-end read 2 reads to Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R2_001_val_2.fq.gz

Total number of sequences analysed: 377720238

Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 91737759 (24.29%)

>>> Now running FastQC on the validated data Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R1_001_val_1.fq.gz<<<

Started analysis of Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R1_001_val_1.fq.gz
Approx 5% complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R1_001_val_1.fq.gz
Approx 10% complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R1_001_val_1.fq.gz
Approx 15% complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R1_001_val_1.fq.gz
Approx 20% complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R1_001_val_1.fq.gz
Approx 25% complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R1_001_val_1.fq.gz
Approx 30% complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R1_001_val_1.fq.gz
Approx 35% complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R1_001_val_1.fq.gz
Approx 40% complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R1_001_val_1.fq.gz
Approx 45% complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R1_001_val_1.fq.gz
Approx 50% complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R1_001_val_1.fq.gz
Approx 55% complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R1_001_val_1.fq.gz
Approx 60% complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R1_001_val_1.fq.gz
Approx 65% complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R1_001_val_1.fq.gz
Approx 70% complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R1_001_val_1.fq.gz
Approx 75% complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R1_001_val_1.fq.gz
Approx 80% complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R1_001_val_1.fq.gz
Approx 85% complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R1_001_val_1.fq.gz
Approx 90% complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R1_001_val_1.fq.gz
Approx 95% complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R1_001_val_1.fq.gz
Analysis complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R1_001_val_1.fq.gz

>>> Now running FastQC on the validated data Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R2_001_val_2.fq.gz<<<

Started analysis of Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R2_001_val_2.fq.gz
Approx 5% complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R2_001_val_2.fq.gz
Approx 10% complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R2_001_val_2.fq.gz
Approx 15% complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R2_001_val_2.fq.gz
Approx 20% complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R2_001_val_2.fq.gz
Approx 25% complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R2_001_val_2.fq.gz
Approx 30% complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R2_001_val_2.fq.gz
Approx 35% complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R2_001_val_2.fq.gz
Approx 40% complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R2_001_val_2.fq.gz
Approx 45% complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R2_001_val_2.fq.gz
Approx 50% complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R2_001_val_2.fq.gz
Approx 55% complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R2_001_val_2.fq.gz
Approx 60% complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R2_001_val_2.fq.gz
Approx 65% complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R2_001_val_2.fq.gz
Approx 70% complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R2_001_val_2.fq.gz
Approx 75% complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R2_001_val_2.fq.gz
Approx 80% complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R2_001_val_2.fq.gz
Approx 85% complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R2_001_val_2.fq.gz
Approx 90% complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R2_001_val_2.fq.gz
Approx 95% complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R2_001_val_2.fq.gz
Analysis complete for Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R2_001_val_2.fq.gz
Deleting both intermediate output files Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-8to10kb-8_S32_L008_R2_001_trimmed.fq.gz

====================================================================================================