No quality encoding type selected. Assuming that the data provided uses Sanger encoded Phred scores (default)
Path to Cutadapt set as: 'cutadapt' (default)
1.16
Cutadapt seems to be working fine (tested command 'cutadapt --version')

AUTO-DETECTING ADAPTER TYPE
===========================
Attempting to auto-detect adapter type from the first 1 million sequences of the first file (>> Geoduck-NMP-gDNA-1_S1_L001_R1_001.fastq.gz <<)

Found perfect matches for the following adapter sequences:
Adapter type    Count    Sequence         Sequences analysed    Percentage
Illumina        1369     AGATCGGAAGAGC    1000000               0.14
Nextera         979      CTGTCTCTTATA     1000000               0.10
smallRNA        3        TGGAATTCTCGG     1000000               0.00
Using Illumina adapter for trimming (count: 1369). Second best hit was Nextera (count: 979)

Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-1_S1_L001_R1_001.fastq.gz_trimming_report.txt'

SUMMARISING RUN PARAMETERS
==========================
Input filename: Geoduck-NMP-gDNA-1_S1_L001_R1_001.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.4.4_dev
Cutadapt version: 1.16
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28'
Output file(s) will be GZIP compressed

Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-1_S1_L001_R1_001_trimmed.fq.gz

>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-1_S1_L001_R1_001.fastq.gz <<<
10000000 sequences processed
20000000 sequences processed
30000000 sequences processed
40000000 sequences processed
50000000 sequences processed
60000000 sequences processed
70000000 sequences processed
80000000 sequences processed
90000000 sequences processed
100000000 sequences processed
110000000 sequences processed
120000000 sequences processed
130000000 sequences processed
140000000 sequences processed
150000000 sequences processed
160000000 sequences processed
170000000 sequences processed
180000000 sequences processed
190000000 sequences processed
200000000 sequences processed
210000000 sequences processed
220000000 sequences processed
230000000 sequences processed
240000000 sequences processed
250000000 sequences processed
260000000 sequences processed
270000000 sequences processed
280000000 sequences processed
290000000 sequences processed
300000000 sequences processed
310000000 sequences processed
320000000 sequences processed
330000000 sequences processed
340000000 sequences processed
350000000 sequences processed
360000000 sequences processed
This is cutadapt 1.16 with Python 2.7.14
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-1_S1_L001_R1_001.fastq.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 11276.36 s (31 us/read; 1.96 M reads/minute).

=== Summary ===

Total reads processed: 368,469,262
Reads with adapters: 79,594,596 (21.6%)
Reads written (passing filters): 368,469,262 (100.0%)

Total basepairs processed: 44,653,718,483 bp
Quality-trimmed: 1,544,810,377 bp (3.5%)
Total written (filtered): 42,952,779,623 bp (96.2%)

=== Adapter 1 ===

Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 79594596 times.

No.
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 34.4% C: 22.3% G: 13.9% T: 29.3% none/other: 0.2% Overview of removed sequences length count expect max.err error counts 1 60105325 92117315.5 0 60105325 2 13304644 23029328.9 0 13304644 3 3563395 5757332.2 0 3563395 4 1494825 1439333.1 0 1494825 5 284071 359833.3 0 284071 6 34958 89958.3 0 34958 7 18635 22489.6 0 18635 8 9808 5622.4 0 9808 9 3174 1405.6 0 1075 2099 10 12903 351.4 1 4177 8726 11 3209 87.8 1 721 2488 12 6310 22.0 1 3722 2588 13 7013 5.5 1 4292 2721 14 4967 5.5 1 3207 1760 15 6446 5.5 1 4002 2444 16 5136 5.5 1 3293 1843 17 3315 5.5 1 1500 1815 18 11099 5.5 1 7038 4061 19 12966 5.5 1 9723 3243 20 2029 5.5 1 852 1177 21 1735 5.5 1 724 1011 22 7984 5.5 1 5026 2958 23 5421 5.5 1 3516 1905 24 8278 5.5 1 6009 2269 25 3185 5.5 1 1582 1603 26 5173 5.5 1 3430 1743 27 6047 5.5 1 3980 2067 28 11376 5.5 1 7753 3623 29 2049 5.5 1 936 1113 30 4589 5.5 1 2764 1825 31 6916 5.5 1 5020 1896 32 6246 5.5 1 4221 2025 33 5580 5.5 1 3803 1777 34 6925 5.5 1 4486 2439 35 5226 5.5 1 2710 2516 36 17183 5.5 1 11702 5481 37 6390 5.5 1 4056 2334 38 3701 5.5 1 2470 1231 39 9744 5.5 1 6961 2783 40 8291 5.5 1 5970 2321 41 6421 5.5 1 4838 1583 42 6431 5.5 1 4351 2080 43 2171 5.5 1 1148 1023 44 4000 5.5 1 2771 1229 45 1994 5.5 1 1042 952 46 8102 5.5 1 5687 2415 47 8569 5.5 1 6379 2190 48 2457 5.5 1 1418 1039 49 5387 5.5 1 3816 1571 50 9522 5.5 1 7173 2349 51 6717 5.5 1 4823 1894 52 3854 5.5 1 2274 1580 53 14118 5.5 1 10703 3415 54 8290 5.5 1 6302 1988 55 5163 5.5 1 3556 1607 56 4759 5.5 1 3321 1438 57 11331 5.5 1 9048 2283 58 2678 5.5 1 1617 1061 59 1823 5.5 1 1028 795 60 9101 5.5 1 7066 2035 61 4342 5.5 1 2831 1511 62 3666 5.5 1 2479 1187 63 5505 5.5 1 4049 1456 64 13068 5.5 1 10103 2965 65 6346 5.5 1 4737 1609 66 5739 5.5 1 4383 1356 67 6105 5.5 1 4632 1473 68 6135 5.5 1 4660 1475 69 6534 5.5 1 4979 1555 70 6979 5.5 1 5288 1691 71 7559 5.5 1 5857 1702 72 8738 5.5 1 6491 2247 73 38520 5.5 1 8918 29602 74 
60833 5.5 1 37982 22851 75 32033 5.5 1 21254 10779 76 16561 5.5 1 10168 6393 77 11258 5.5 1 6829 4429 78 8014 5.5 1 4889 3125 79 6569 5.5 1 3986 2583 80 5625 5.5 1 3520 2105 81 5058 5.5 1 3212 1846 82 4996 5.5 1 3190 1806 83 4794 5.5 1 3036 1758 84 4788 5.5 1 3074 1714 85 4600 5.5 1 2944 1656 86 4341 5.5 1 2830 1511 87 4435 5.5 1 2902 1533 88 4424 5.5 1 2924 1500 89 4240 5.5 1 2758 1482 90 4046 5.5 1 2650 1396 91 4137 5.5 1 2729 1408 92 3844 5.5 1 2517 1327 93 3829 5.5 1 2555 1274 94 3727 5.5 1 2391 1336 95 3800 5.5 1 2500 1300 96 3504 5.5 1 2317 1187 97 3489 5.5 1 2346 1143 98 3480 5.5 1 2346 1134 99 3457 5.5 1 2343 1114 100 3254 5.5 1 2254 1000 101 3294 5.5 1 2219 1075 102 3057 5.5 1 2023 1034 103 3008 5.5 1 2042 966 104 2858 5.5 1 1916 942 105 2823 5.5 1 1928 895 106 2796 5.5 1 1864 932 107 2761 5.5 1 1864 897 108 2755 5.5 1 1836 919 109 2537 5.5 1 1741 796 110 2486 5.5 1 1682 804 111 2380 5.5 1 1573 807 112 2432 5.5 1 1690 742 113 2207 5.5 1 1481 726 114 2128 5.5 1 1408 720 115 2078 5.5 1 1403 675 116 1982 5.5 1 1320 662 117 1875 5.5 1 1244 631 118 1759 5.5 1 1247 512 119 1678 5.5 1 1120 558 120 1655 5.5 1 1152 503 121 1587 5.5 1 1074 513 122 1441 5.5 1 948 493 123 1379 5.5 1 911 468 124 1259 5.5 1 823 436 125 1267 5.5 1 836 431 126 1167 5.5 1 784 383 127 991 5.5 1 676 315 128 887 5.5 1 616 271 129 737 5.5 1 506 231 130 667 5.5 1 455 212 131 580 5.5 1 389 191 132 526 5.5 1 333 193 133 519 5.5 1 344 175 134 494 5.5 1 321 173 135 397 5.5 1 232 165 136 395 5.5 1 202 193 137 387 5.5 1 198 189 138 360 5.5 1 209 151 139 304 5.5 1 126 178 140 338 5.5 1 131 207 141 371 5.5 1 147 224 142 358 5.5 1 139 219 143 407 5.5 1 132 275 144 411 5.5 1 162 249 145 549 5.5 1 173 376 146 694 5.5 1 227 467 147 1002 5.5 1 281 721 148 2028 5.5 1 631 1397 149 5434 5.5 1 1650 3784 150 17673 5.5 1 5166 12507 151 9911 5.5 1 2839 7072 RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-1_S1_L001_R1_001.fastq.gz ============================================= 368469262 sequences processed in total 
The length threshold of paired-end sequences gets evaluated later on (in the validation step)

Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-1_S1_L001_R2_001.fastq.gz_trimming_report.txt'

SUMMARISING RUN PARAMETERS
==========================
Input filename: Geoduck-NMP-gDNA-1_S1_L001_R2_001.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.4.4_dev
Cutadapt version: 1.16
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28'
Output file(s) will be GZIP compressed

Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-1_S1_L001_R2_001_trimmed.fq.gz

>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-1_S1_L001_R2_001.fastq.gz <<<
10000000 sequences processed
20000000 sequences processed
30000000 sequences processed
40000000 sequences processed
50000000 sequences processed
60000000 sequences processed
70000000 sequences processed
80000000 sequences processed
90000000 sequences processed
100000000 sequences processed
110000000 sequences processed
120000000 sequences processed
130000000 sequences processed
140000000 sequences processed
150000000 sequences processed
160000000 sequences processed
170000000 sequences processed
180000000 sequences processed
190000000 sequences processed
200000000 sequences processed
210000000 sequences processed
220000000 sequences processed
230000000 sequences processed
240000000 sequences processed
250000000 sequences processed
260000000 sequences processed
270000000 sequences processed
280000000 sequences processed
290000000 sequences processed
300000000 sequences processed
310000000 sequences processed
320000000 sequences processed
330000000 sequences processed
340000000 sequences processed
350000000 sequences processed
360000000 sequences processed
This is cutadapt 1.16 with Python 2.7.14
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-1_S1_L001_R2_001.fastq.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 12233.40 s (33 us/read; 1.81 M reads/minute).

=== Summary ===

Total reads processed: 368,469,262
Reads with adapters: 88,605,467 (24.0%)
Reads written (passing filters): 368,469,262 (100.0%)

Total basepairs processed: 46,158,097,968 bp
Quality-trimmed: 4,390,806,671 bp (9.5%)
Total written (filtered): 41,536,295,439 bp (90.0%)

=== Adapter 1 ===

Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 88605467 times.

No.
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 32.8% C: 25.1% G: 13.7% T: 28.1% none/other: 0.3% Overview of removed sequences length count expect max.err error counts 1 65740285 92117315.5 0 65740285 2 15310568 23029328.9 0 15310568 3 3699552 5757332.2 0 3699552 4 1349516 1439333.1 0 1349516 5 280042 359833.3 0 280042 6 54063 89958.3 0 54063 7 53552 22489.6 0 53552 8 25896 5622.4 0 25896 9 7592 1405.6 0 4838 2754 10 31511 351.4 1 14585 16926 11 10878 87.8 1 4113 6765 12 21755 22.0 1 12048 9707 13 11806 5.5 1 5665 6141 14 52990 5.5 1 33226 19764 15 11900 5.5 1 6995 4905 16 9003 5.5 1 4601 4402 17 31943 5.5 1 20109 11834 18 4889 5.5 1 2384 2505 19 23859 5.5 1 15464 8395 20 14748 5.5 1 9334 5414 21 2440 5.5 1 923 1517 22 4358 5.5 1 1903 2455 23 19518 5.5 1 11813 7705 24 60756 5.5 1 38897 21859 25 20234 5.5 1 12826 7408 26 22166 5.5 1 15403 6763 27 3223 5.5 1 1544 1679 28 22240 5.5 1 14538 7702 29 4345 5.5 1 2143 2202 30 27352 5.5 1 17936 9416 31 5432 5.5 1 2955 2477 32 30622 5.5 1 20696 9926 33 37132 5.5 1 27497 9635 34 3901 5.5 1 1921 1980 35 14695 5.5 1 8281 6414 36 14479 5.5 1 8893 5586 37 20236 5.5 1 14546 5690 38 12760 5.5 1 7828 4932 39 11769 5.5 1 8146 3623 40 5516 5.5 1 3036 2480 41 21439 5.5 1 14446 6993 42 51416 5.5 1 35373 16043 43 6835 5.5 1 4021 2814 44 24467 5.5 1 16615 7852 45 56425 5.5 1 39901 16524 46 15638 5.5 1 10646 4992 47 5353 5.5 1 3093 2260 48 42292 5.5 1 30662 11630 49 26694 5.5 1 19124 7570 50 12014 5.5 1 7546 4468 51 115500 5.5 1 86303 29197 52 22315 5.5 1 15081 7234 53 13235 5.5 1 9133 4102 54 12421 5.5 1 8936 3485 55 25968 5.5 1 18923 7045 56 17229 5.5 1 12332 4897 57 18075 5.5 1 13486 4589 58 26177 5.5 1 20050 6127 59 18716 5.5 1 14087 4629 60 17611 5.5 1 13199 4412 61 22687 5.5 1 17156 5531 62 24504 5.5 1 18927 5577 63 25300 5.5 1 19549 5751 64 24998 5.5 1 19314 5684 65 29976 5.5 1 23370 6606 66 40155 5.5 1 31143 9012 67 101351 5.5 1 53700 47651 68 219942 5.5 1 170506 49436 69 92747 5.5 1 76218 16529 70 
30894 5.5 1 22763 8131 71 17028 5.5 1 11820 5208 72 11714 5.5 1 7839 3875 73 10940 5.5 1 7191 3749 74 9337 5.5 1 6227 3110 75 8286 5.5 1 5433 2853 76 8030 5.5 1 5259 2771 77 7921 5.5 1 5080 2841 78 8004 5.5 1 5097 2907 79 8339 5.5 1 5380 2959 80 8238 5.5 1 5242 2996 81 7818 5.5 1 4863 2955 82 7817 5.5 1 4939 2878 83 7564 5.5 1 4811 2753 84 7173 5.5 1 4485 2688 85 6685 5.5 1 4116 2569 86 6773 5.5 1 4188 2585 87 6574 5.5 1 4065 2509 88 6452 5.5 1 4004 2448 89 6543 5.5 1 4100 2443 90 6131 5.5 1 3764 2367 91 5872 5.5 1 3624 2248 92 5998 5.5 1 3701 2297 93 5653 5.5 1 3404 2249 94 5755 5.5 1 3487 2268 95 5869 5.5 1 3640 2229 96 5451 5.5 1 3383 2068 97 5651 5.5 1 3492 2159 98 5929 5.5 1 3660 2269 99 5869 5.5 1 3637 2232 100 5789 5.5 1 3639 2150 101 5747 5.5 1 3524 2223 102 5602 5.5 1 3485 2117 103 5588 5.5 1 3540 2048 104 5387 5.5 1 3333 2054 105 5257 5.5 1 3346 1911 106 5133 5.5 1 3191 1942 107 4959 5.5 1 3156 1803 108 4732 5.5 1 3044 1688 109 4602 5.5 1 2903 1699 110 4612 5.5 1 2898 1714 111 4474 5.5 1 2807 1667 112 4252 5.5 1 2742 1510 113 4055 5.5 1 2595 1460 114 3927 5.5 1 2466 1461 115 3855 5.5 1 2462 1393 116 3792 5.5 1 2380 1412 117 3563 5.5 1 2254 1309 118 3451 5.5 1 2146 1305 119 3287 5.5 1 2035 1252 120 3092 5.5 1 1917 1175 121 3009 5.5 1 1917 1092 122 2662 5.5 1 1622 1040 123 2591 5.5 1 1604 987 124 2494 5.5 1 1532 962 125 2158 5.5 1 1332 826 126 1949 5.5 1 1112 837 127 1574 5.5 1 965 609 128 1378 5.5 1 781 597 129 1178 5.5 1 625 553 130 876 5.5 1 467 409 131 759 5.5 1 390 369 132 631 5.5 1 316 315 133 497 5.5 1 230 267 134 452 5.5 1 212 240 135 386 5.5 1 177 209 136 338 5.5 1 159 179 137 249 5.5 1 100 149 138 231 5.5 1 88 143 139 184 5.5 1 53 131 140 180 5.5 1 58 122 141 140 5.5 1 39 101 142 150 5.5 1 41 109 143 140 5.5 1 41 99 144 162 5.5 1 44 118 145 143 5.5 1 36 107 146 201 5.5 1 59 142 147 252 5.5 1 74 178 148 754 5.5 1 223 531 149 2177 5.5 1 623 1554 150 8818 5.5 1 2490 6328 151 2288 5.5 1 650 1638 RUN STATISTICS FOR INPUT FILE: 
Geoduck-NMP-gDNA-1_S1_L001_R2_001.fastq.gz
=============================================
368469262 sequences processed in total
The length threshold of paired-end sequences gets evaluated later on (in the validation step)

Validate paired-end files Geoduck-NMP-gDNA-1_S1_L001_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-1_S1_L001_R2_001_trimmed.fq.gz
file_1: Geoduck-NMP-gDNA-1_S1_L001_R1_001_trimmed.fq.gz, file_2: Geoduck-NMP-gDNA-1_S1_L001_R2_001_trimmed.fq.gz

>>>>> Now validing the length of the 2 paired-end infiles: Geoduck-NMP-gDNA-1_S1_L001_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-1_S1_L001_R2_001_trimmed.fq.gz <<<<<
Writing validated paired-end read 1 reads to Geoduck-NMP-gDNA-1_S1_L001_R1_001_val_1.fq.gz
Writing validated paired-end read 2 reads to Geoduck-NMP-gDNA-1_S1_L001_R2_001_val_2.fq.gz

Total number of sequences analysed: 368469262

Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 78087642 (21.19%)

>>> Now running FastQC on the validated data Geoduck-NMP-gDNA-1_S1_L001_R1_001_val_1.fq.gz<<<
Started analysis of Geoduck-NMP-gDNA-1_S1_L001_R1_001_val_1.fq.gz
Approx 5% complete for Geoduck-NMP-gDNA-1_S1_L001_R1_001_val_1.fq.gz
Approx 10% complete for Geoduck-NMP-gDNA-1_S1_L001_R1_001_val_1.fq.gz
Approx 15% complete for Geoduck-NMP-gDNA-1_S1_L001_R1_001_val_1.fq.gz
Approx 20% complete for Geoduck-NMP-gDNA-1_S1_L001_R1_001_val_1.fq.gz
Approx 25% complete for Geoduck-NMP-gDNA-1_S1_L001_R1_001_val_1.fq.gz
Approx 30% complete for Geoduck-NMP-gDNA-1_S1_L001_R1_001_val_1.fq.gz
Approx 35% complete for Geoduck-NMP-gDNA-1_S1_L001_R1_001_val_1.fq.gz
Approx 40% complete for Geoduck-NMP-gDNA-1_S1_L001_R1_001_val_1.fq.gz
Approx 45% complete for Geoduck-NMP-gDNA-1_S1_L001_R1_001_val_1.fq.gz
Approx 50% complete for Geoduck-NMP-gDNA-1_S1_L001_R1_001_val_1.fq.gz
Approx 55% complete for Geoduck-NMP-gDNA-1_S1_L001_R1_001_val_1.fq.gz
Approx 60% complete for Geoduck-NMP-gDNA-1_S1_L001_R1_001_val_1.fq.gz
Approx 65% complete for Geoduck-NMP-gDNA-1_S1_L001_R1_001_val_1.fq.gz
Approx 70% complete for Geoduck-NMP-gDNA-1_S1_L001_R1_001_val_1.fq.gz
Approx 75% complete for Geoduck-NMP-gDNA-1_S1_L001_R1_001_val_1.fq.gz
Approx 80% complete for Geoduck-NMP-gDNA-1_S1_L001_R1_001_val_1.fq.gz
Approx 85% complete for Geoduck-NMP-gDNA-1_S1_L001_R1_001_val_1.fq.gz
Approx 90% complete for Geoduck-NMP-gDNA-1_S1_L001_R1_001_val_1.fq.gz
Approx 95% complete for Geoduck-NMP-gDNA-1_S1_L001_R1_001_val_1.fq.gz
Analysis complete for Geoduck-NMP-gDNA-1_S1_L001_R1_001_val_1.fq.gz

>>> Now running FastQC on the validated data Geoduck-NMP-gDNA-1_S1_L001_R2_001_val_2.fq.gz<<<
Started analysis of Geoduck-NMP-gDNA-1_S1_L001_R2_001_val_2.fq.gz
Approx 5% complete for Geoduck-NMP-gDNA-1_S1_L001_R2_001_val_2.fq.gz
Approx 10% complete for Geoduck-NMP-gDNA-1_S1_L001_R2_001_val_2.fq.gz
Approx 15% complete for Geoduck-NMP-gDNA-1_S1_L001_R2_001_val_2.fq.gz
Approx 20% complete for Geoduck-NMP-gDNA-1_S1_L001_R2_001_val_2.fq.gz
Approx 25% complete for Geoduck-NMP-gDNA-1_S1_L001_R2_001_val_2.fq.gz
Approx 30% complete for Geoduck-NMP-gDNA-1_S1_L001_R2_001_val_2.fq.gz
Approx 35% complete for Geoduck-NMP-gDNA-1_S1_L001_R2_001_val_2.fq.gz
Approx 40% complete for Geoduck-NMP-gDNA-1_S1_L001_R2_001_val_2.fq.gz
Approx 45% complete for Geoduck-NMP-gDNA-1_S1_L001_R2_001_val_2.fq.gz
Approx 50% complete for Geoduck-NMP-gDNA-1_S1_L001_R2_001_val_2.fq.gz
Approx 55% complete for Geoduck-NMP-gDNA-1_S1_L001_R2_001_val_2.fq.gz
Approx 60% complete for Geoduck-NMP-gDNA-1_S1_L001_R2_001_val_2.fq.gz
Approx 65% complete for Geoduck-NMP-gDNA-1_S1_L001_R2_001_val_2.fq.gz
Approx 70% complete for Geoduck-NMP-gDNA-1_S1_L001_R2_001_val_2.fq.gz
Approx 75% complete for Geoduck-NMP-gDNA-1_S1_L001_R2_001_val_2.fq.gz
Approx 80% complete for Geoduck-NMP-gDNA-1_S1_L001_R2_001_val_2.fq.gz
Approx 85% complete for Geoduck-NMP-gDNA-1_S1_L001_R2_001_val_2.fq.gz
Approx 90% complete for Geoduck-NMP-gDNA-1_S1_L001_R2_001_val_2.fq.gz
Approx 95% complete for Geoduck-NMP-gDNA-1_S1_L001_R2_001_val_2.fq.gz
Analysis complete for Geoduck-NMP-gDNA-1_S1_L001_R2_001_val_2.fq.gz
Deleting both intermediate output files Geoduck-NMP-gDNA-1_S1_L001_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-1_S1_L001_R2_001_trimmed.fq.gz

====================================================================================================

Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-2_S5_L002_R1_001.fastq.gz_trimming_report.txt'

SUMMARISING RUN PARAMETERS
==========================
Input filename: Geoduck-NMP-gDNA-2_S5_L002_R1_001.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.4.4_dev
Cutadapt version: 1.16
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28'
Output file(s) will be GZIP compressed

Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-2_S5_L002_R1_001_trimmed.fq.gz

>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-2_S5_L002_R1_001.fastq.gz <<<
10000000 sequences processed
20000000 sequences processed
30000000 sequences processed
40000000 sequences processed
50000000 sequences processed
60000000 sequences processed
70000000 sequences processed
80000000 sequences processed
90000000 sequences processed
100000000 sequences processed
110000000 sequences processed
120000000 sequences processed
130000000 sequences processed
140000000 sequences processed
150000000 sequences processed
160000000 sequences processed
170000000 sequences processed
180000000 sequences processed
190000000 sequences processed
200000000 sequences processed
210000000 sequences processed
220000000 sequences processed
230000000 sequences processed
240000000 sequences processed
250000000 sequences processed
260000000 sequences processed
270000000 sequences processed
280000000 sequences processed
290000000 sequences processed
300000000 sequences processed
310000000 sequences processed
320000000 sequences processed
330000000 sequences processed
340000000 sequences processed
350000000 sequences processed
360000000 sequences processed
370000000 sequences processed
This is cutadapt 1.16 with Python 2.7.14
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-2_S5_L002_R1_001.fastq.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 11570.25 s (31 us/read; 1.96 M reads/minute).

=== Summary ===

Total reads processed: 377,072,437
Reads with adapters: 81,687,193 (21.7%)
Reads written (passing filters): 377,072,437 (100.0%)

Total basepairs processed: 45,779,442,385 bp
Quality-trimmed: 1,582,148,221 bp (3.5%)
Total written (filtered): 44,038,626,908 bp (96.2%)

=== Adapter 1 ===

Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 81687193 times.

No.
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 34.4% C: 22.3% G: 13.9% T: 29.3% none/other: 0.2% Overview of removed sequences length count expect max.err error counts 1 61748741 94268109.2 0 61748741 2 13647943 23567027.3 0 13647943 3 3649207 5891756.8 0 3649207 4 1523738 1472939.2 0 1523738 5 288184 368234.8 0 288184 6 36201 92058.7 0 36201 7 17928 23014.7 0 17928 8 9237 5753.7 0 9237 9 3201 1438.4 0 1055 2146 10 12776 359.6 1 3950 8826 11 3231 89.9 1 717 2514 12 6049 22.5 1 3462 2587 13 6707 5.6 1 4038 2669 14 4631 5.6 1 2944 1687 15 6219 5.6 1 3847 2372 16 4777 5.6 1 3116 1661 17 3230 5.6 1 1419 1811 18 10287 5.6 1 6511 3776 19 12285 5.6 1 9157 3128 20 2004 5.6 1 829 1175 21 1516 5.6 1 655 861 22 7405 5.6 1 4683 2722 23 5127 5.6 1 3354 1773 24 7828 5.6 1 5655 2173 25 2924 5.6 1 1440 1484 26 5004 5.6 1 3347 1657 27 5857 5.6 1 3865 1992 28 10568 5.6 1 7259 3309 29 2009 5.6 1 886 1123 30 4302 5.6 1 2524 1778 31 6663 5.6 1 4881 1782 32 5841 5.6 1 3866 1975 33 5482 5.6 1 3695 1787 34 6718 5.6 1 4455 2263 35 5039 5.6 1 2658 2381 36 16294 5.6 1 11080 5214 37 5912 5.6 1 3728 2184 38 3485 5.6 1 2305 1180 39 9356 5.6 1 6775 2581 40 7977 5.6 1 5877 2100 41 6049 5.6 1 4558 1491 42 5994 5.6 1 4040 1954 43 2050 5.6 1 1099 951 44 3741 5.6 1 2611 1130 45 1976 5.6 1 1028 948 46 7799 5.6 1 5485 2314 47 8110 5.6 1 6076 2034 48 2264 5.6 1 1287 977 49 5157 5.6 1 3643 1514 50 9233 5.6 1 6953 2280 51 6518 5.6 1 4694 1824 52 3591 5.6 1 2189 1402 53 13144 5.6 1 9964 3180 54 7954 5.6 1 5990 1964 55 5005 5.6 1 3457 1548 56 4581 5.6 1 3177 1404 57 10323 5.6 1 8315 2008 58 2673 5.6 1 1627 1046 59 1771 5.6 1 1034 737 60 8790 5.6 1 6742 2048 61 4077 5.6 1 2665 1412 62 3532 5.6 1 2365 1167 63 5386 5.6 1 3939 1447 64 12584 5.6 1 9818 2766 65 5910 5.6 1 4504 1406 66 5404 5.6 1 4009 1395 67 5878 5.6 1 4453 1425 68 5853 5.6 1 4401 1452 69 6281 5.6 1 4763 1518 70 6802 5.6 1 5241 1561 71 7174 5.6 1 5486 1688 72 8350 5.6 1 6163 2187 73 36176 5.6 1 8496 27680 74 59933 
5.6 1 35492 24441 75 35724 5.6 1 22371 13353 76 19621 5.6 1 11210 8411 77 13076 5.6 1 7680 5396 78 9304 5.6 1 5531 3773 79 7025 5.6 1 4215 2810 80 6051 5.6 1 3736 2315 81 5340 5.6 1 3342 1998 82 4789 5.6 1 3022 1767 83 4765 5.6 1 2946 1819 84 4638 5.6 1 2933 1705 85 4426 5.6 1 2852 1574 86 4428 5.6 1 2863 1565 87 4127 5.6 1 2645 1482 88 4200 5.6 1 2739 1461 89 4029 5.6 1 2609 1420 90 3985 5.6 1 2603 1382 91 3987 5.6 1 2626 1361 92 3741 5.6 1 2468 1273 93 3638 5.6 1 2368 1270 94 3605 5.6 1 2420 1185 95 3546 5.6 1 2378 1168 96 3446 5.6 1 2305 1141 97 3319 5.6 1 2227 1092 98 3342 5.6 1 2193 1149 99 3254 5.6 1 2144 1110 100 3181 5.6 1 2129 1052 101 3274 5.6 1 2166 1108 102 3017 5.6 1 2050 967 103 2999 5.6 1 2037 962 104 2841 5.6 1 1952 889 105 2819 5.6 1 1907 912 106 2722 5.6 1 1866 856 107 2638 5.6 1 1769 869 108 2607 5.6 1 1739 868 109 2485 5.6 1 1690 795 110 2488 5.6 1 1692 796 111 2393 5.6 1 1610 783 112 2358 5.6 1 1599 759 113 2157 5.6 1 1466 691 114 2100 5.6 1 1414 686 115 2037 5.6 1 1351 686 116 1874 5.6 1 1271 603 117 1977 5.6 1 1313 664 118 1758 5.6 1 1227 531 119 1758 5.6 1 1151 607 120 1716 5.6 1 1152 564 121 1536 5.6 1 1046 490 122 1376 5.6 1 907 469 123 1365 5.6 1 904 461 124 1265 5.6 1 845 420 125 1253 5.6 1 856 397 126 1069 5.6 1 697 372 127 984 5.6 1 675 309 128 884 5.6 1 623 261 129 721 5.6 1 489 232 130 629 5.6 1 400 229 131 606 5.6 1 390 216 132 542 5.6 1 357 185 133 537 5.6 1 356 181 134 539 5.6 1 342 197 135 474 5.6 1 280 194 136 400 5.6 1 226 174 137 369 5.6 1 193 176 138 370 5.6 1 194 176 139 359 5.6 1 165 194 140 326 5.6 1 120 206 141 358 5.6 1 137 221 142 361 5.6 1 132 229 143 436 5.6 1 155 281 144 459 5.6 1 190 269 145 537 5.6 1 172 365 146 681 5.6 1 212 469 147 1047 5.6 1 319 728 148 2274 5.6 1 664 1610 149 5755 5.6 1 1708 4047 150 19097 5.6 1 5563 13534 151 10033 5.6 1 2811 7222 RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-2_S5_L002_R1_001.fastq.gz ============================================= 377072437 sequences processed in total The 
length threshold of paired-end sequences gets evaluated later on (in the validation step)

Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-2_S5_L002_R2_001.fastq.gz_trimming_report.txt'

SUMMARISING RUN PARAMETERS
==========================
Input filename: Geoduck-NMP-gDNA-2_S5_L002_R2_001.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.4.4_dev
Cutadapt version: 1.16
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28'
Output file(s) will be GZIP compressed

Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-2_S5_L002_R2_001_trimmed.fq.gz

>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-2_S5_L002_R2_001.fastq.gz <<<
10000000 sequences processed
20000000 sequences processed
30000000 sequences processed
40000000 sequences processed
50000000 sequences processed
60000000 sequences processed
70000000 sequences processed
80000000 sequences processed
90000000 sequences processed
100000000 sequences processed
110000000 sequences processed
120000000 sequences processed
130000000 sequences processed
140000000 sequences processed
150000000 sequences processed
160000000 sequences processed
170000000 sequences processed
180000000 sequences processed
190000000 sequences processed
200000000 sequences processed
210000000 sequences processed
220000000 sequences processed
230000000 sequences processed
240000000 sequences processed
250000000 sequences processed
260000000 sequences processed
270000000 sequences processed
280000000 sequences processed
290000000 sequences processed
300000000 sequences processed
310000000 sequences processed
320000000 sequences processed
330000000 sequences processed
340000000 sequences processed
350000000 sequences processed
360000000 sequences processed
370000000 sequences processed
This is cutadapt 1.16 with Python 2.7.14
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-2_S5_L002_R2_001.fastq.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 12501.20 s (33 us/read; 1.81 M reads/minute).

=== Summary ===

Total reads processed: 377,072,437
Reads with adapters: 90,631,124 (24.0%)
Reads written (passing filters): 377,072,437 (100.0%)

Total basepairs processed: 47,112,865,190 bp
Quality-trimmed: 4,195,593,705 bp (8.9%)
Total written (filtered): 42,680,315,285 bp (90.6%)

=== Adapter 1 ===

Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 90631124 times.

No.
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 32.8% C: 25.1% G: 13.7% T: 28.1% none/other: 0.2% Overview of removed sequences length count expect max.err error counts 1 67152818 94268109.2 0 67152818 2 15748176 23567027.3 0 15748176 3 3788674 5891756.8 0 3788674 4 1384657 1472939.2 0 1384657 5 287354 368234.8 0 287354 6 54417 92058.7 0 54417 7 53824 23014.7 0 53824 8 25265 5753.7 0 25265 9 7882 1438.4 0 4984 2898 10 31538 359.6 1 14504 17034 11 10978 89.9 1 4147 6831 12 22103 22.5 1 12260 9843 13 11948 5.6 1 5748 6200 14 53231 5.6 1 33382 19849 15 12018 5.6 1 7003 5015 16 9117 5.6 1 4686 4431 17 32326 5.6 1 20206 12120 18 4822 5.6 1 2305 2517 19 23821 5.6 1 15385 8436 20 14814 5.6 1 9370 5444 21 2436 5.6 1 940 1496 22 4481 5.6 1 1954 2527 23 19477 5.6 1 11809 7668 24 61787 5.6 1 39034 22753 25 20247 5.6 1 12882 7365 26 21997 5.6 1 15138 6859 27 3295 5.6 1 1601 1694 28 22239 5.6 1 14341 7898 29 4554 5.6 1 2205 2349 30 27879 5.6 1 18279 9600 31 5564 5.6 1 2977 2587 32 30724 5.6 1 20708 10016 33 36426 5.6 1 26925 9501 34 3917 5.6 1 1925 1992 35 14871 5.6 1 8320 6551 36 14804 5.6 1 8924 5880 37 20116 5.6 1 14415 5701 38 12855 5.6 1 7768 5087 39 12166 5.6 1 8356 3810 40 5775 5.6 1 3236 2539 41 21594 5.6 1 14613 6981 42 52025 5.6 1 36005 16020 43 6670 5.6 1 3964 2706 44 24528 5.6 1 16837 7691 45 58551 5.6 1 41318 17233 46 15568 5.6 1 10495 5073 47 5432 5.6 1 3204 2228 48 43088 5.6 1 31161 11927 49 27430 5.6 1 19432 7998 50 12194 5.6 1 7796 4398 51 118875 5.6 1 88524 30351 52 22499 5.6 1 15146 7353 53 13243 5.6 1 8952 4291 54 12204 5.6 1 8664 3540 55 26276 5.6 1 19001 7275 56 17531 5.6 1 12483 5048 57 18297 5.6 1 13543 4754 58 26014 5.6 1 19831 6183 59 18164 5.6 1 13440 4724 60 17905 5.6 1 13569 4336 61 22547 5.6 1 17175 5372 62 24706 5.6 1 18974 5732 63 25196 5.6 1 19514 5682 64 25230 5.6 1 19457 5773 65 30391 5.6 1 23718 6673 66 40115 5.6 1 31365 8750 67 103771 5.6 1 54118 49653 68 220604 5.6 1 169116 51488 69 98530 5.6 1 80632 17898 70 
33662 5.6 1 24339 9323 71 18916 5.6 1 12965 5951 72 12525 5.6 1 8293 4232 73 11480 5.6 1 7594 3886 74 9599 5.6 1 6411 3188 75 8603 5.6 1 5549 3054 76 8199 5.6 1 5256 2943 77 8271 5.6 1 5381 2890 78 8408 5.6 1 5289 3119 79 8640 5.6 1 5466 3174 80 8641 5.6 1 5474 3167 81 8351 5.6 1 5268 3083 82 7973 5.6 1 5108 2865 83 7765 5.6 1 4854 2911 84 7419 5.6 1 4648 2771 85 7071 5.6 1 4399 2672 86 7120 5.6 1 4319 2801 87 6865 5.6 1 4175 2690 88 6837 5.6 1 4233 2604 89 6834 5.6 1 4175 2659 90 6332 5.6 1 3908 2424 91 6085 5.6 1 3779 2306 92 6171 5.6 1 3771 2400 93 5759 5.6 1 3448 2311 94 5959 5.6 1 3670 2289 95 5979 5.6 1 3628 2351 96 5736 5.6 1 3542 2194 97 5757 5.6 1 3483 2274 98 6317 5.6 1 3931 2386 99 6126 5.6 1 3776 2350 100 6122 5.6 1 3849 2273 101 6038 5.6 1 3783 2255 102 5989 5.6 1 3765 2224 103 5786 5.6 1 3634 2152 104 5753 5.6 1 3655 2098 105 5674 5.6 1 3567 2107 106 5536 5.6 1 3495 2041 107 5168 5.6 1 3266 1902 108 5107 5.6 1 3273 1834 109 4885 5.6 1 3082 1803 110 4943 5.6 1 3119 1824 111 4731 5.6 1 2958 1773 112 4603 5.6 1 2934 1669 113 4346 5.6 1 2732 1614 114 4311 5.6 1 2743 1568 115 4202 5.6 1 2677 1525 116 4098 5.6 1 2516 1582 117 3717 5.6 1 2351 1366 118 3626 5.6 1 2263 1363 119 3472 5.6 1 2147 1325 120 3235 5.6 1 2021 1214 121 3090 5.6 1 1956 1134 122 2917 5.6 1 1802 1115 123 2749 5.6 1 1672 1077 124 2550 5.6 1 1539 1011 125 2330 5.6 1 1426 904 126 2036 5.6 1 1224 812 127 1707 5.6 1 994 713 128 1477 5.6 1 846 631 129 1283 5.6 1 690 593 130 1021 5.6 1 569 452 131 872 5.6 1 451 421 132 686 5.6 1 346 340 133 584 5.6 1 282 302 134 488 5.6 1 225 263 135 414 5.6 1 181 233 136 313 5.6 1 137 176 137 261 5.6 1 102 159 138 206 5.6 1 81 125 139 188 5.6 1 70 118 140 167 5.6 1 50 117 141 156 5.6 1 41 115 142 150 5.6 1 36 114 143 122 5.6 1 36 86 144 153 5.6 1 44 109 145 159 5.6 1 42 117 146 188 5.6 1 47 141 147 261 5.6 1 71 190 148 801 5.6 1 243 558 149 2340 5.6 1 630 1710 150 9887 5.6 1 2687 7200 151 2255 5.6 1 638 1617 RUN STATISTICS FOR INPUT FILE: 
Geoduck-NMP-gDNA-2_S5_L002_R2_001.fastq.gz ============================================= 377072437 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Validate paired-end files Geoduck-NMP-gDNA-2_S5_L002_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-2_S5_L002_R2_001_trimmed.fq.gz file_1: Geoduck-NMP-gDNA-2_S5_L002_R1_001_trimmed.fq.gz, file_2: Geoduck-NMP-gDNA-2_S5_L002_R2_001_trimmed.fq.gz >>>>> Now validing the length of the 2 paired-end infiles: Geoduck-NMP-gDNA-2_S5_L002_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-2_S5_L002_R2_001_trimmed.fq.gz <<<<< Writing validated paired-end read 1 reads to Geoduck-NMP-gDNA-2_S5_L002_R1_001_val_1.fq.gz Writing validated paired-end read 2 reads to Geoduck-NMP-gDNA-2_S5_L002_R2_001_val_2.fq.gz Total number of sequences analysed: 377072437 Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 78553966 (20.83%) >>> Now running FastQC on the validated data Geoduck-NMP-gDNA-2_S5_L002_R1_001_val_1.fq.gz<<< Started analysis of Geoduck-NMP-gDNA-2_S5_L002_R1_001_val_1.fq.gz Approx 5% complete for Geoduck-NMP-gDNA-2_S5_L002_R1_001_val_1.fq.gz Approx 10% complete for Geoduck-NMP-gDNA-2_S5_L002_R1_001_val_1.fq.gz Approx 15% complete for Geoduck-NMP-gDNA-2_S5_L002_R1_001_val_1.fq.gz Approx 20% complete for Geoduck-NMP-gDNA-2_S5_L002_R1_001_val_1.fq.gz Approx 25% complete for Geoduck-NMP-gDNA-2_S5_L002_R1_001_val_1.fq.gz Approx 30% complete for Geoduck-NMP-gDNA-2_S5_L002_R1_001_val_1.fq.gz Approx 35% complete for Geoduck-NMP-gDNA-2_S5_L002_R1_001_val_1.fq.gz Approx 40% complete for Geoduck-NMP-gDNA-2_S5_L002_R1_001_val_1.fq.gz Approx 45% complete for Geoduck-NMP-gDNA-2_S5_L002_R1_001_val_1.fq.gz Approx 50% complete for Geoduck-NMP-gDNA-2_S5_L002_R1_001_val_1.fq.gz Approx 55% complete for Geoduck-NMP-gDNA-2_S5_L002_R1_001_val_1.fq.gz Approx 60% complete for Geoduck-NMP-gDNA-2_S5_L002_R1_001_val_1.fq.gz Approx 65% complete for 
Geoduck-NMP-gDNA-2_S5_L002_R1_001_val_1.fq.gz Approx 70% complete for Geoduck-NMP-gDNA-2_S5_L002_R1_001_val_1.fq.gz Approx 75% complete for Geoduck-NMP-gDNA-2_S5_L002_R1_001_val_1.fq.gz Approx 80% complete for Geoduck-NMP-gDNA-2_S5_L002_R1_001_val_1.fq.gz Approx 85% complete for Geoduck-NMP-gDNA-2_S5_L002_R1_001_val_1.fq.gz Approx 90% complete for Geoduck-NMP-gDNA-2_S5_L002_R1_001_val_1.fq.gz Approx 95% complete for Geoduck-NMP-gDNA-2_S5_L002_R1_001_val_1.fq.gz Analysis complete for Geoduck-NMP-gDNA-2_S5_L002_R1_001_val_1.fq.gz >>> Now running FastQC on the validated data Geoduck-NMP-gDNA-2_S5_L002_R2_001_val_2.fq.gz<<< Started analysis of Geoduck-NMP-gDNA-2_S5_L002_R2_001_val_2.fq.gz Approx 5% complete for Geoduck-NMP-gDNA-2_S5_L002_R2_001_val_2.fq.gz Approx 10% complete for Geoduck-NMP-gDNA-2_S5_L002_R2_001_val_2.fq.gz Approx 15% complete for Geoduck-NMP-gDNA-2_S5_L002_R2_001_val_2.fq.gz Approx 20% complete for Geoduck-NMP-gDNA-2_S5_L002_R2_001_val_2.fq.gz Approx 25% complete for Geoduck-NMP-gDNA-2_S5_L002_R2_001_val_2.fq.gz Approx 30% complete for Geoduck-NMP-gDNA-2_S5_L002_R2_001_val_2.fq.gz Approx 35% complete for Geoduck-NMP-gDNA-2_S5_L002_R2_001_val_2.fq.gz Approx 40% complete for Geoduck-NMP-gDNA-2_S5_L002_R2_001_val_2.fq.gz Approx 45% complete for Geoduck-NMP-gDNA-2_S5_L002_R2_001_val_2.fq.gz Approx 50% complete for Geoduck-NMP-gDNA-2_S5_L002_R2_001_val_2.fq.gz Approx 55% complete for Geoduck-NMP-gDNA-2_S5_L002_R2_001_val_2.fq.gz Approx 60% complete for Geoduck-NMP-gDNA-2_S5_L002_R2_001_val_2.fq.gz Approx 65% complete for Geoduck-NMP-gDNA-2_S5_L002_R2_001_val_2.fq.gz Approx 70% complete for Geoduck-NMP-gDNA-2_S5_L002_R2_001_val_2.fq.gz Approx 75% complete for Geoduck-NMP-gDNA-2_S5_L002_R2_001_val_2.fq.gz Approx 80% complete for Geoduck-NMP-gDNA-2_S5_L002_R2_001_val_2.fq.gz Approx 85% complete for Geoduck-NMP-gDNA-2_S5_L002_R2_001_val_2.fq.gz Approx 90% complete for Geoduck-NMP-gDNA-2_S5_L002_R2_001_val_2.fq.gz Approx 95% complete for 
Geoduck-NMP-gDNA-2_S5_L002_R2_001_val_2.fq.gz
Analysis complete for Geoduck-NMP-gDNA-2_S5_L002_R2_001_val_2.fq.gz
Deleting both intermediate output files Geoduck-NMP-gDNA-2_S5_L002_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-2_S5_L002_R2_001_trimmed.fq.gz

====================================================================================================

Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R1_001.fastq.gz_trimming_report.txt'

SUMMARISING RUN PARAMETERS
==========================
Input filename: Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R1_001.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.4.4_dev
Cutadapt version: 1.16
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28'
Output file(s) will be GZIP compressed

Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R1_001_trimmed.fq.gz

>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R1_001.fastq.gz <<<
This is cutadapt 1.16 with Python 2.7.14
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R1_001.fastq.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 2.17 s (48 us/read; 1.26 M reads/minute).

=== Summary ===

Total reads processed: 45,471
Reads with adapters: 12,498 (27.5%)
Reads written (passing filters): 45,471 (100.0%)

Total basepairs processed: 6,281,433 bp
Quality-trimmed: 196,524 bp (3.1%)
Total written (filtered): 6,068,398 bp (96.6%)

=== Adapter 1 ===

Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 12498 times.

No. of allowed errors:
0-9 bp: 0; 10-13 bp: 1

Bases preceding removed adapters:
  A: 36.2%
  C: 24.6%
  G: 12.5%
  T: 26.7%
  none/other: 0.0%

Overview of removed sequences
length	count	expect	max.err	error counts
1	10043	11367.8	0	10043
2	1721	2841.9	0	1721
3	526	710.5	0	526
4	157	177.6	0	157
5	27	44.4	0	27
6	6	11.1	0	6
7	3	2.8	0	3
8	1	0.7	0	1
11	1	0.0	1	0 1
12	1	0.0	1	1
13	1	0.0	1	0 1
19	1	0.0	1	0 1
30	1	0.0	1	1
33	1	0.0	1	0 1
36	1	0.0	1	0 1
37	1	0.0	1	1
39	1	0.0	1	1
40	1	0.0	1	1
53	1	0.0	1	1
65	1	0.0	1	1
82	1	0.0	1	0 1
150	1	0.0	1	0 1

RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R1_001.fastq.gz
=============================================
45471 sequences processed in total
The length threshold of paired-end sequences gets evaluated later on (in the validation step)

Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R2_001.fastq.gz_trimming_report.txt'

SUMMARISING RUN PARAMETERS
==========================
Input filename: Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R2_001.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.4.4_dev
Cutadapt version: 1.16
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28'
Output file(s) will be GZIP compressed

Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R2_001_trimmed.fq.gz

>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R2_001.fastq.gz <<<
This is cutadapt 1.16 with Python 2.7.14
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R2_001.fastq.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 1.28 s (28 us/read; 2.13 M reads/minute).

=== Summary ===

Total reads processed: 45,471
Reads with adapters: 8,828 (19.4%)
Reads written (passing filters): 45,471 (100.0%)

Total basepairs processed: 5,183,367 bp
Quality-trimmed: 2,042,773 bp (39.4%)
Total written (filtered): 3,126,445 bp (60.3%)

=== Adapter 1 ===

Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 8828 times.

No. of allowed errors:
0-9 bp: 0; 10-13 bp: 1

Bases preceding removed adapters:
  A: 29.5%
  C: 26.9%
  G: 13.8%
  T: 28.6%
  none/other: 1.2%

Overview of removed sequences
length	count	expect	max.err	error counts
1	6546	11367.8	0	6546
2	1699	2841.9	0	1699
3	382	710.5	0	382
4	121	177.6	0	121
5	27	44.4	0	27
6	4	11.1	0	4
7	4	2.8	0	4
12	2	0.0	1	1 1
13	1	0.0	1	1
17	1	0.0	1	0 1
20	2	0.0	1	1 1
23	1	0.0	1	1
28	1	0.0	1	0 1
30	2	0.0	1	1 1
34	1	0.0	1	1
35	1	0.0	1	1
36	1	0.0	1	1
38	1	0.0	1	1
44	1	0.0	1	0 1
45	1	0.0	1	0 1
46	1	0.0	1	1
48	2	0.0	1	2
51	2	0.0	1	2
54	1	0.0	1	1
55	3	0.0	1	3
59	1	0.0	1	0 1
62	2	0.0	1	2
67	3	0.0	1	2 1
68	8	0.0	1	7 1
69	1	0.0	1	1
86	1	0.0	1	1
87	1	0.0	1	1
91	1	0.0	1	1
112	1	0.0	1	0 1
115	1	0.0	1	1

RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R2_001.fastq.gz
=============================================
45471 sequences processed in total
The length threshold of paired-end sequences gets evaluated later on (in the validation step)

Validate paired-end files Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R2_001_trimmed.fq.gz
file_1: Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R1_001_trimmed.fq.gz, file_2: Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R2_001_trimmed.fq.gz

>>>>> Now validing the length of the 2 paired-end infiles: Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R2_001_trimmed.fq.gz <<<<<
Writing validated paired-end read 1 reads to Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R1_001_val_1.fq.gz
Writing validated paired-end read 2 reads to Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R2_001_val_2.fq.gz

Total number of sequences analysed: 45471

Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 15918 (35.01%)

>>> Now running FastQC on the validated data Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R1_001_val_1.fq.gz<<<

Started analysis of Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R1_001_val_1.fq.gz
Approx 5% complete for
Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R1_001_val_1.fq.gz Approx 10% complete for Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R1_001_val_1.fq.gz Approx 15% complete for Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R1_001_val_1.fq.gz Approx 20% complete for Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R1_001_val_1.fq.gz Approx 25% complete for Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R1_001_val_1.fq.gz Approx 30% complete for Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R1_001_val_1.fq.gz Approx 35% complete for Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R1_001_val_1.fq.gz Approx 40% complete for Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R1_001_val_1.fq.gz Approx 45% complete for Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R1_001_val_1.fq.gz Approx 50% complete for Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R1_001_val_1.fq.gz Approx 55% complete for Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R1_001_val_1.fq.gz Approx 60% complete for Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R1_001_val_1.fq.gz Approx 65% complete for Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R1_001_val_1.fq.gz Approx 70% complete for Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R1_001_val_1.fq.gz Approx 75% complete for Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R1_001_val_1.fq.gz Approx 80% complete for Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R1_001_val_1.fq.gz Approx 85% complete for Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R1_001_val_1.fq.gz Approx 90% complete for Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R1_001_val_1.fq.gz Approx 95% complete for Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R1_001_val_1.fq.gz Analysis complete for Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R1_001_val_1.fq.gz >>> Now running FastQC on the validated data Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R2_001_val_2.fq.gz<<< Started analysis of Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R2_001_val_2.fq.gz Approx 5% complete for Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R2_001_val_2.fq.gz Approx 10% complete for Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R2_001_val_2.fq.gz Approx 15% complete for Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R2_001_val_2.fq.gz Approx 20% complete for Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R2_001_val_2.fq.gz 
Approx 25% complete for Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R2_001_val_2.fq.gz Approx 30% complete for Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R2_001_val_2.fq.gz Approx 35% complete for Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R2_001_val_2.fq.gz Approx 40% complete for Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R2_001_val_2.fq.gz Approx 45% complete for Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R2_001_val_2.fq.gz Approx 50% complete for Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R2_001_val_2.fq.gz Approx 55% complete for Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R2_001_val_2.fq.gz Approx 60% complete for Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R2_001_val_2.fq.gz Approx 65% complete for Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R2_001_val_2.fq.gz Approx 70% complete for Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R2_001_val_2.fq.gz Approx 75% complete for Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R2_001_val_2.fq.gz Approx 80% complete for Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R2_001_val_2.fq.gz Approx 85% complete for Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R2_001_val_2.fq.gz Approx 90% complete for Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R2_001_val_2.fq.gz Approx 95% complete for Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R2_001_val_2.fq.gz Analysis complete for Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R2_001_val_2.fq.gz Deleting both intermediate output files Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-2to4kb-1_S2_L001_R2_001_trimmed.fq.gz ==================================================================================================== Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R1_001.fastq.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R1_001.fastq.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, 
Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28'
Output file(s) will be GZIP compressed

Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R1_001_trimmed.fq.gz

>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R1_001.fastq.gz <<<
This is cutadapt 1.16 with Python 2.7.14
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R1_001.fastq.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 2.17 s (47 us/read; 1.27 M reads/minute).

=== Summary ===

Total reads processed: 46,064
Reads with adapters: 12,711 (27.6%)
Reads written (passing filters): 46,064 (100.0%)

Total basepairs processed: 6,334,661 bp
Quality-trimmed: 208,056 bp (3.3%)
Total written (filtered): 6,110,027 bp (96.5%)

=== Adapter 1 ===

Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 12711 times.

No. of allowed errors:
0-9 bp: 0; 10-13 bp: 1

Bases preceding removed adapters:
  A: 35.7%
  C: 24.3%
  G: 13.0%
  T: 26.9%
  none/other: 0.0%

Overview of removed sequences
length	count	expect	max.err	error counts
1	10159	11516.0	0	10159
2	1792	2879.0	0	1792
3	557	719.8	0	557
4	163	179.9	0	163
5	22	45.0	0	22
6	6	11.2	0	6
7	3	2.8	0	3
10	1	0.0	1	1
12	1	0.0	1	1
30	1	0.0	1	0 1
31	1	0.0	1	1
34	1	0.0	1	0 1
44	1	0.0	1	1
51	1	0.0	1	1
60	1	0.0	1	1
73	1	0.0	1	0 1

RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R1_001.fastq.gz
=============================================
46064 sequences processed in total
The length threshold of paired-end sequences gets evaluated later on (in the validation step)

Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R2_001.fastq.gz_trimming_report.txt'

SUMMARISING RUN PARAMETERS
==========================
Input filename: Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R2_001.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.4.4_dev
Cutadapt version: 1.16
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28'
Output file(s) will be GZIP compressed

Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R2_001_trimmed.fq.gz

>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R2_001.fastq.gz <<<
This is cutadapt 1.16 with Python 2.7.14
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R2_001.fastq.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 1.41 s (31 us/read; 1.97 M reads/minute).

=== Summary ===

Total reads processed: 46,064
Reads with adapters: 9,262 (20.1%)
Reads written (passing filters): 46,064 (100.0%)

Total basepairs processed: 5,308,036 bp
Quality-trimmed: 2,045,052 bp (38.5%)
Total written (filtered): 3,247,224 bp (61.2%)

=== Adapter 1 ===

Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 9262 times.

No. of allowed errors:
0-9 bp: 0; 10-13 bp: 1

Bases preceding removed adapters:
  A: 29.4%
  C: 27.3%
  G: 13.9%
  T: 28.5%
  none/other: 0.9%

Overview of removed sequences
length	count	expect	max.err	error counts
1	6857	11516.0	0	6857
2	1778	2879.0	0	1778
3	396	719.8	0	396
4	127	179.9	0	127
5	33	45.0	0	33
6	4	11.2	0	4
7	3	2.8	0	3
8	1	0.7	0	1
10	3	0.0	1	2 1
11	2	0.0	1	0 2
12	1	0.0	1	0 1
13	2	0.0	1	1 1
16	1	0.0	1	0 1
19	1	0.0	1	0 1
21	1	0.0	1	0 1
24	1	0.0	1	1
30	1	0.0	1	1
31	1	0.0	1	1
32	1	0.0	1	0 1
33	3	0.0	1	3
36	1	0.0	1	1
38	1	0.0	1	0 1
44	1	0.0	1	1
45	2	0.0	1	1 1
48	1	0.0	1	1
49	3	0.0	1	3
51	3	0.0	1	3
53	1	0.0	1	0 1
56	1	0.0	1	1
57	1	0.0	1	1
60	1	0.0	1	1
61	1	0.0	1	1
64	1	0.0	1	0 1
65	1	0.0	1	0 1
66	1	0.0	1	1
67	5	0.0	1	2 3
68	7	0.0	1	6 1
70	1	0.0	1	0 1
71	1	0.0	1	0 1
77	1	0.0	1	0 1
80	2	0.0	1	1 1
91	1	0.0	1	0 1
96	1	0.0	1	1
97	2	0.0	1	0 2
100	1	0.0	1	1
114	1	0.0	1	0 1
122	1	0.0	1	0 1
127	1	0.0	1	1

RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R2_001.fastq.gz
=============================================
46064 sequences processed in total
The length threshold of paired-end sequences gets evaluated later on (in the validation step)

Validate paired-end files Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R1_001_trimmed.fq.gz and
Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R2_001_trimmed.fq.gz file_1: Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R1_001_trimmed.fq.gz, file_2: Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R2_001_trimmed.fq.gz >>>>> Now validing the length of the 2 paired-end infiles: Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R2_001_trimmed.fq.gz <<<<< Writing validated paired-end read 1 reads to Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R1_001_val_1.fq.gz Writing validated paired-end read 2 reads to Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R2_001_val_2.fq.gz Total number of sequences analysed: 46064 Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 15644 (33.96%) >>> Now running FastQC on the validated data Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R1_001_val_1.fq.gz<<< Started analysis of Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R1_001_val_1.fq.gz Approx 5% complete for Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R1_001_val_1.fq.gz Approx 10% complete for Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R1_001_val_1.fq.gz Approx 15% complete for Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R1_001_val_1.fq.gz Approx 20% complete for Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R1_001_val_1.fq.gz Approx 25% complete for Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R1_001_val_1.fq.gz Approx 30% complete for Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R1_001_val_1.fq.gz Approx 35% complete for Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R1_001_val_1.fq.gz Approx 40% complete for Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R1_001_val_1.fq.gz Approx 45% complete for Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R1_001_val_1.fq.gz Approx 50% complete for Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R1_001_val_1.fq.gz Approx 55% complete for Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R1_001_val_1.fq.gz Approx 60% complete for Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R1_001_val_1.fq.gz Approx 65% complete for Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R1_001_val_1.fq.gz Approx 70% complete for Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R1_001_val_1.fq.gz Approx 75% complete for 
Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R1_001_val_1.fq.gz Approx 80% complete for Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R1_001_val_1.fq.gz Approx 85% complete for Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R1_001_val_1.fq.gz Approx 90% complete for Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R1_001_val_1.fq.gz Approx 95% complete for Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R1_001_val_1.fq.gz Analysis complete for Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R1_001_val_1.fq.gz >>> Now running FastQC on the validated data Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R2_001_val_2.fq.gz<<< Started analysis of Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R2_001_val_2.fq.gz Approx 5% complete for Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R2_001_val_2.fq.gz Approx 10% complete for Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R2_001_val_2.fq.gz Approx 15% complete for Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R2_001_val_2.fq.gz Approx 20% complete for Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R2_001_val_2.fq.gz Approx 25% complete for Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R2_001_val_2.fq.gz Approx 30% complete for Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R2_001_val_2.fq.gz Approx 35% complete for Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R2_001_val_2.fq.gz Approx 40% complete for Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R2_001_val_2.fq.gz Approx 45% complete for Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R2_001_val_2.fq.gz Approx 50% complete for Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R2_001_val_2.fq.gz Approx 55% complete for Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R2_001_val_2.fq.gz Approx 60% complete for Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R2_001_val_2.fq.gz Approx 65% complete for Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R2_001_val_2.fq.gz Approx 70% complete for Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R2_001_val_2.fq.gz Approx 75% complete for Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R2_001_val_2.fq.gz Approx 80% complete for Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R2_001_val_2.fq.gz Approx 85% complete for Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R2_001_val_2.fq.gz Approx 90% complete for Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R2_001_val_2.fq.gz 
Approx 95% complete for Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R2_001_val_2.fq.gz Analysis complete for Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R2_001_val_2.fq.gz Deleting both intermediate output files Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-2to4kb-2_S6_L002_R2_001_trimmed.fq.gz ==================================================================================================== Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R1_001.fastq.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R1_001.fastq.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R1_001_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R1_001.fastq.gz <<< 10000000 sequences processed 20000000 sequences processed 30000000 sequences processed 40000000 sequences processed 50000000 sequences processed 60000000 sequences processed 70000000 sequences processed 80000000 sequences processed 90000000 sequences processed 100000000 
sequences processed 110000000 sequences processed 120000000 sequences processed 130000000 sequences processed 140000000 sequences processed 150000000 sequences processed 160000000 sequences processed 170000000 sequences processed 180000000 sequences processed 190000000 sequences processed 200000000 sequences processed 210000000 sequences processed 220000000 sequences processed 230000000 sequences processed 240000000 sequences processed 250000000 sequences processed 260000000 sequences processed 270000000 sequences processed 280000000 sequences processed 290000000 sequences processed 300000000 sequences processed 310000000 sequences processed 320000000 sequences processed 330000000 sequences processed 340000000 sequences processed 350000000 sequences processed This is cutadapt 1.16 with Python 2.7.14 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R1_001.fastq.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 11088.12 s (31 us/read; 1.92 M reads/minute). === Summary === Total reads processed: 355,466,683 Reads with adapters: 79,048,358 (22.2%) Reads written (passing filters): 355,466,683 (100.0%) Total basepairs processed: 43,353,617,008 bp Quality-trimmed: 1,533,561,622 bp (3.5%) Total written (filtered): 41,652,013,242 bp (96.1%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 79048358 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 32.8% C: 23.6% G: 14.0% T: 29.5% none/other: 0.1% Overview of removed sequences length count expect max.err error counts 1 59477575 88866670.8 0 59477575 2 13319509 22216667.7 0 13319509 3 3475244 5554166.9 0 3475244 4 1371542 1388541.7 0 1371542 5 266086 347135.4 0 266086 6 38136 86783.9 0 38136 7 25616 21696.0 0 25616 8 13806 5424.0 0 13806 9 3936 1356.0 0 1793 2143 10 16268 339.0 1 6331 9937 11 4587 84.7 1 1265 3322 12 9778 21.2 1 5656 4122 13 10514 5.3 1 6181 4333 14 7504 5.3 1 4603 2901 15 9934 5.3 1 6053 3881 16 7736 5.3 1 4812 2924 17 5734 5.3 1 2572 3162 18 18640 5.3 1 11803 6837 19 17764 5.3 1 12873 4891 20 3994 5.3 1 1711 2283 21 2535 5.3 1 1060 1475 22 11839 5.3 1 7484 4355 23 8957 5.3 1 5802 3155 24 11867 5.3 1 8484 3383 25 5775 5.3 1 2906 2869 26 7914 5.3 1 5045 2869 27 9687 5.3 1 6149 3538 28 16958 5.3 1 11370 5588 29 3246 5.3 1 1418 1828 30 8180 5.3 1 4751 3429 31 12091 5.3 1 8381 3710 32 11201 5.3 1 7243 3958 33 7671 5.3 1 5192 2479 34 18699 5.3 1 12737 5962 35 3643 5.3 1 1748 1895 36 11139 5.3 1 6957 4182 37 10719 5.3 1 7479 3240 38 12497 5.3 1 8328 4169 39 10645 5.3 1 7215 3430 40 8114 5.3 1 5583 2531 41 16108 5.3 1 10963 5145 42 14788 5.3 1 10260 4528 43 13791 5.3 1 10349 3442 44 10355 5.3 1 6847 3508 45 3551 5.3 1 1921 1630 46 6208 5.3 1 4287 1921 47 3701 5.3 1 2045 1656 48 13268 5.3 1 9406 3862 49 13331 5.3 1 9870 3461 50 4440 5.3 1 2658 1782 51 8654 5.3 1 6158 2496 52 15386 5.3 1 11467 3919 53 11040 5.3 1 7880 3160 54 6560 5.3 1 4010 2550 55 22978 5.3 1 17353 5625 56 12903 5.3 1 9759 3144 57 9023 5.3 1 6237 2786 58 8126 5.3 1 5786 2340 59 16919 5.3 1 13552 3367 60 4642 5.3 1 2888 1754 61 3054 5.3 1 1809 1245 62 14417 5.3 1 11036 3381 63 7371 5.3 1 4931 2440 64 6195 5.3 1 4260 1935 65 8812 5.3 1 6509 2303 66 22629 5.3 1 17628 5001 67 10835 5.3 1 8230 2605 68 9413 5.3 1 7155 2258 69 10511 5.3 1 8088 2423 70 10354 5.3 1 8026 2328 71 11094 5.3 1 8609 2485 72 11884 5.3 1 
9171 2713 73 12695 5.3 1 10008 2687 74 14231 5.3 1 11391 2840 75 23213 5.3 1 15438 7775 76 58749 5.3 1 51548 7201 77 33510 5.3 1 28987 4523 78 18641 5.3 1 15097 3544 79 12287 5.3 1 9681 2606 80 9316 5.3 1 7000 2316 81 7901 5.3 1 5682 2219 82 7314 5.3 1 5236 2078 83 6820 5.3 1 4777 2043 84 6712 5.3 1 4620 2092 85 6675 5.3 1 4534 2141 86 6560 5.3 1 4494 2066 87 6366 5.3 1 4390 1976 88 6375 5.3 1 4402 1973 89 6209 5.3 1 4319 1890 90 6342 5.3 1 4337 2005 91 6011 5.3 1 4115 1896 92 6016 5.3 1 4080 1936 93 5766 5.3 1 3964 1802 94 5526 5.3 1 3809 1717 95 5530 5.3 1 3799 1731 96 5362 5.3 1 3718 1644 97 5253 5.3 1 3640 1613 98 5155 5.3 1 3514 1641 99 4934 5.3 1 3397 1537 100 4906 5.3 1 3389 1517 101 4862 5.3 1 3328 1534 102 4834 5.3 1 3383 1451 103 4637 5.3 1 3159 1478 104 4322 5.3 1 3013 1309 105 4263 5.3 1 2951 1312 106 4258 5.3 1 2931 1327 107 4195 5.3 1 2927 1268 108 4013 5.3 1 2786 1227 109 3777 5.3 1 2620 1157 110 3737 5.3 1 2626 1111 111 3566 5.3 1 2496 1070 112 3417 5.3 1 2325 1092 113 3281 5.3 1 2355 926 114 3153 5.3 1 2160 993 115 3048 5.3 1 2188 860 116 2945 5.3 1 2041 904 117 2716 5.3 1 1918 798 118 2683 5.3 1 1906 777 119 2457 5.3 1 1732 725 120 2424 5.3 1 1708 716 121 2331 5.3 1 1652 679 122 2025 5.3 1 1406 619 123 1995 5.3 1 1446 549 124 1936 5.3 1 1377 559 125 1711 5.3 1 1178 533 126 1577 5.3 1 1149 428 127 1320 5.3 1 963 357 128 1262 5.3 1 984 278 129 1026 5.3 1 775 251 130 817 5.3 1 630 187 131 695 5.3 1 560 135 132 621 5.3 1 481 140 133 486 5.3 1 370 116 134 466 5.3 1 361 105 135 407 5.3 1 300 107 136 372 5.3 1 271 101 137 329 5.3 1 227 102 138 280 5.3 1 175 105 139 291 5.3 1 184 107 140 220 5.3 1 119 101 141 248 5.3 1 138 110 142 250 5.3 1 142 108 143 234 5.3 1 135 99 144 247 5.3 1 138 109 145 271 5.3 1 142 129 146 331 5.3 1 166 165 147 396 5.3 1 212 184 148 746 5.3 1 398 348 149 1673 5.3 1 884 789 150 4760 5.3 1 2553 2207 151 2451 5.3 1 1262 1189 RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R1_001.fastq.gz 
============================================= 355466683 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R2_001.fastq.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R2_001.fastq.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R2_001_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R2_001.fastq.gz <<< 10000000 sequences processed 20000000 sequences processed 30000000 sequences processed 40000000 sequences processed 50000000 sequences processed 60000000 sequences processed 70000000 sequences processed 80000000 sequences processed 90000000 sequences processed 100000000 sequences processed 110000000 sequences processed 120000000 sequences processed 130000000 sequences processed 140000000 sequences processed 150000000 sequences processed 160000000 sequences processed 170000000 sequences processed 
180000000 sequences processed
190000000 sequences processed
200000000 sequences processed
210000000 sequences processed
220000000 sequences processed
230000000 sequences processed
240000000 sequences processed
250000000 sequences processed
260000000 sequences processed
270000000 sequences processed
280000000 sequences processed
290000000 sequences processed
300000000 sequences processed
310000000 sequences processed
320000000 sequences processed
330000000 sequences processed
340000000 sequences processed
350000000 sequences processed
This is cutadapt 1.16 with Python 2.7.14
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R2_001.fastq.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 11944.54 s (34 us/read; 1.79 M reads/minute).

=== Summary ===

Total reads processed: 355,466,683
Reads with adapters: 87,792,091 (24.7%)
Reads written (passing filters): 355,466,683 (100.0%)

Total basepairs processed: 44,550,696,710 bp
Quality-trimmed: 3,599,479,756 bp (8.1%)
Total written (filtered): 40,651,166,533 bp (91.2%)

=== Adapter 1 ===

Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 87792091 times.
No.
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 31.7% C: 25.5% G: 14.1% T: 28.5% none/other: 0.1% Overview of removed sequences length count expect max.err error counts 1 64078764 88866670.8 0 64078764 2 14923642 22216667.7 0 14923642 3 3682921 5554166.9 0 3682921 4 1291006 1388541.7 0 1291006 5 279128 347135.4 0 279128 6 66199 86783.9 0 66199 7 75229 21696.0 0 75229 8 41159 5424.0 0 41159 9 10758 1356.0 0 7896 2862 10 45248 339.0 1 22852 22396 11 16906 84.7 1 6712 10194 12 34689 21.2 1 19362 15327 13 18824 5.3 1 9127 9697 14 78209 5.3 1 50301 27908 15 19623 5.3 1 11660 7963 16 13639 5.3 1 7164 6475 17 46412 5.3 1 30477 15935 18 7329 5.3 1 3587 3742 19 37076 5.3 1 24716 12360 20 24264 5.3 1 15777 8487 21 3829 5.3 1 1472 2357 22 6866 5.3 1 3066 3800 23 30775 5.3 1 18949 11826 24 90660 5.3 1 59176 31484 25 31622 5.3 1 20548 11074 26 36912 5.3 1 26064 10848 27 5139 5.3 1 2480 2659 28 35810 5.3 1 23572 12238 29 6880 5.3 1 3377 3503 30 43337 5.3 1 28840 14497 31 8677 5.3 1 4820 3857 32 48625 5.3 1 33493 15132 33 63118 5.3 1 47066 16052 34 6125 5.3 1 3004 3121 35 22130 5.3 1 12368 9762 36 21874 5.3 1 13264 8610 37 35345 5.3 1 25768 9577 38 19950 5.3 1 12158 7792 39 20417 5.3 1 14266 6151 40 9236 5.3 1 5165 4071 41 37033 5.3 1 25164 11869 42 73083 5.3 1 51256 21827 43 9847 5.3 1 6041 3806 44 39673 5.3 1 27486 12187 45 80374 5.3 1 56800 23574 46 27604 5.3 1 18970 8634 47 9432 5.3 1 5650 3782 48 76424 5.3 1 55491 20933 49 40499 5.3 1 29089 11410 50 20532 5.3 1 13290 7242 51 164629 5.3 1 123228 41401 52 31635 5.3 1 21269 10366 53 23913 5.3 1 16745 7168 54 23961 5.3 1 17767 6194 55 45427 5.3 1 33255 12172 56 28357 5.3 1 20081 8276 57 37627 5.3 1 28557 9070 58 46873 5.3 1 36119 10754 59 33581 5.3 1 25569 8012 60 33588 5.3 1 25684 7904 61 35038 5.3 1 27102 7936 62 39024 5.3 1 30111 8913 63 49081 5.3 1 38457 10624 64 55419 5.3 1 43950 11469 65 60451 5.3 1 48730 11721 66 65618 5.3 1 53678 11940 67 98489 5.3 1 77551 20938 68 345375 5.3 1 314032 31343 
69 154747 5.3 1 139923 14824 70 53520 5.3 1 44552 8968 71 32944 5.3 1 26511 6433 72 21067 5.3 1 15914 5153 73 18576 5.3 1 13666 4910 74 15816 5.3 1 11395 4421 75 15769 5.3 1 11103 4666 76 15095 5.3 1 10546 4549 77 14371 5.3 1 9853 4518 78 13542 5.3 1 9409 4133 79 13707 5.3 1 9230 4477 80 13900 5.3 1 9516 4384 81 14165 5.3 1 9489 4676 82 13732 5.3 1 9122 4610 83 13559 5.3 1 9059 4500 84 13247 5.3 1 8844 4403 85 13096 5.3 1 8642 4454 86 12572 5.3 1 8384 4188 87 11844 5.3 1 7806 4038 88 11934 5.3 1 7874 4060 89 11495 5.3 1 7585 3910 90 11441 5.3 1 7640 3801 91 11154 5.3 1 7434 3720 92 10485 5.3 1 6806 3679 93 10116 5.3 1 6716 3400 94 10562 5.3 1 6958 3604 95 9612 5.3 1 6354 3258 96 10091 5.3 1 6644 3447 97 10001 5.3 1 6618 3383 98 9704 5.3 1 6423 3281 99 9745 5.3 1 6358 3387 100 10152 5.3 1 6817 3335 101 9939 5.3 1 6720 3219 102 9854 5.3 1 6597 3257 103 10042 5.3 1 6803 3239 104 9737 5.3 1 6574 3163 105 9511 5.3 1 6365 3146 106 9287 5.3 1 6263 3024 107 8914 5.3 1 6068 2846 108 8557 5.3 1 5785 2772 109 8447 5.3 1 5733 2714 110 8098 5.3 1 5504 2594 111 8075 5.3 1 5503 2572 112 7756 5.3 1 5264 2492 113 7572 5.3 1 5208 2364 114 7358 5.3 1 4978 2380 115 7022 5.3 1 4746 2276 116 6628 5.3 1 4533 2095 117 6561 5.3 1 4374 2187 118 6279 5.3 1 4271 2008 119 6078 5.3 1 4145 1933 120 5607 5.3 1 3858 1749 121 5247 5.3 1 3524 1723 122 5119 5.3 1 3459 1660 123 4655 5.3 1 3156 1499 124 4339 5.3 1 2930 1409 125 3917 5.3 1 2649 1268 126 3691 5.3 1 2445 1246 127 2957 5.3 1 1874 1083 128 2546 5.3 1 1606 940 129 2074 5.3 1 1273 801 130 1729 5.3 1 1024 705 131 1292 5.3 1 749 543 132 1126 5.3 1 645 481 133 961 5.3 1 523 438 134 787 5.3 1 437 350 135 618 5.3 1 324 294 136 463 5.3 1 229 234 137 432 5.3 1 199 233 138 281 5.3 1 128 153 139 244 5.3 1 112 132 140 173 5.3 1 60 113 141 139 5.3 1 45 94 142 117 5.3 1 40 77 143 129 5.3 1 36 93 144 132 5.3 1 41 91 145 109 5.3 1 31 78 146 114 5.3 1 36 78 147 142 5.3 1 49 93 148 278 5.3 1 127 151 149 685 5.3 1 334 351 150 2763 5.3 1 1481 1282 151 810 5.3 
1 409 401 RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R2_001.fastq.gz ============================================= 355466683 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Validate paired-end files Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R2_001_trimmed.fq.gz file_1: Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R1_001_trimmed.fq.gz, file_2: Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R2_001_trimmed.fq.gz >>>>> Now validing the length of the 2 paired-end infiles: Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R2_001_trimmed.fq.gz <<<<< Writing validated paired-end read 1 reads to Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R1_001_val_1.fq.gz Writing validated paired-end read 2 reads to Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R2_001_val_2.fq.gz Total number of sequences analysed: 355466683 Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 71185899 (20.03%) >>> Now running FastQC on the validated data Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R1_001_val_1.fq.gz<<< Started analysis of Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R1_001_val_1.fq.gz Approx 5% complete for Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R1_001_val_1.fq.gz Approx 10% complete for Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R1_001_val_1.fq.gz Approx 15% complete for Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R1_001_val_1.fq.gz Approx 20% complete for Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R1_001_val_1.fq.gz Approx 25% complete for Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R1_001_val_1.fq.gz Approx 30% complete for Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R1_001_val_1.fq.gz Approx 35% complete for Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R1_001_val_1.fq.gz Approx 40% complete for Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R1_001_val_1.fq.gz Approx 45% complete for Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R1_001_val_1.fq.gz Approx 50% complete for 
Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R1_001_val_1.fq.gz Approx 55% complete for Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R1_001_val_1.fq.gz Approx 60% complete for Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R1_001_val_1.fq.gz Approx 65% complete for Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R1_001_val_1.fq.gz Approx 70% complete for Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R1_001_val_1.fq.gz Approx 75% complete for Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R1_001_val_1.fq.gz Approx 80% complete for Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R1_001_val_1.fq.gz Approx 85% complete for Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R1_001_val_1.fq.gz Approx 90% complete for Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R1_001_val_1.fq.gz Approx 95% complete for Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R1_001_val_1.fq.gz Analysis complete for Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R1_001_val_1.fq.gz >>> Now running FastQC on the validated data Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R2_001_val_2.fq.gz<<< Started analysis of Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R2_001_val_2.fq.gz Approx 5% complete for Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R2_001_val_2.fq.gz Approx 10% complete for Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R2_001_val_2.fq.gz Approx 15% complete for Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R2_001_val_2.fq.gz Approx 20% complete for Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R2_001_val_2.fq.gz Approx 25% complete for Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R2_001_val_2.fq.gz Approx 30% complete for Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R2_001_val_2.fq.gz Approx 35% complete for Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R2_001_val_2.fq.gz Approx 40% complete for Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R2_001_val_2.fq.gz Approx 45% complete for Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R2_001_val_2.fq.gz Approx 50% complete for Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R2_001_val_2.fq.gz Approx 55% complete for Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R2_001_val_2.fq.gz Approx 60% complete for Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R2_001_val_2.fq.gz Approx 65% complete for 
Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R2_001_val_2.fq.gz Approx 70% complete for Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R2_001_val_2.fq.gz Approx 75% complete for Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R2_001_val_2.fq.gz Approx 80% complete for Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R2_001_val_2.fq.gz Approx 85% complete for Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R2_001_val_2.fq.gz Approx 90% complete for Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R2_001_val_2.fq.gz Approx 95% complete for Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R2_001_val_2.fq.gz Analysis complete for Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R2_001_val_2.fq.gz Deleting both intermediate output files Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-2to4kb-3_S10_L003_R2_001_trimmed.fq.gz ==================================================================================================== Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001.fastq.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001.fastq.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001_trimmed.fq.gz >>> Now 
performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001.fastq.gz <<<
10000000 sequences processed
20000000 sequences processed
30000000 sequences processed
40000000 sequences processed
50000000 sequences processed
60000000 sequences processed
70000000 sequences processed
80000000 sequences processed
90000000 sequences processed
100000000 sequences processed
110000000 sequences processed
120000000 sequences processed
130000000 sequences processed
140000000 sequences processed
150000000 sequences processed
160000000 sequences processed
170000000 sequences processed
180000000 sequences processed
190000000 sequences processed
200000000 sequences processed
210000000 sequences processed
220000000 sequences processed
230000000 sequences processed
240000000 sequences processed
250000000 sequences processed
260000000 sequences processed
270000000 sequences processed
280000000 sequences processed
290000000 sequences processed
300000000 sequences processed
310000000 sequences processed
320000000 sequences processed
330000000 sequences processed
340000000 sequences processed
350000000 sequences processed
360000000 sequences processed
This is cutadapt 1.16 with Python 2.7.14
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001.fastq.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 11293.68 s (31 us/read; 1.91 M reads/minute).

=== Summary ===

Total reads processed: 360,317,079
Reads with adapters: 80,220,248 (22.3%)
Reads written (passing filters): 360,317,079 (100.0%)

Total basepairs processed: 43,967,642,352 bp
Quality-trimmed: 1,560,685,314 bp (3.5%)
Total written (filtered): 42,238,637,269 bp (96.1%)

=== Adapter 1 ===

Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 80220248 times.
No.
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 32.7% C: 23.8% G: 13.9% T: 29.5% none/other: 0.1% Overview of removed sequences length count expect max.err error counts 1 60338788 90079269.8 0 60338788 2 13605218 22519817.4 0 13605218 3 3509248 5629954.4 0 3509248 4 1388109 1407488.6 0 1388109 5 269107 351872.1 0 269107 6 38516 87968.0 0 38516 7 24597 21992.0 0 24597 8 13084 5498.0 0 13084 9 3788 1374.5 0 1722 2066 10 15707 343.6 1 5942 9765 11 4631 85.9 1 1305 3326 12 9306 21.5 1 5226 4080 13 10308 5.4 1 6011 4297 14 7214 5.4 1 4454 2760 15 9487 5.4 1 5666 3821 16 7360 5.4 1 4595 2765 17 5695 5.4 1 2558 3137 18 17902 5.4 1 11159 6743 19 16909 5.4 1 12269 4640 20 3927 5.4 1 1688 2239 21 2448 5.4 1 1009 1439 22 11394 5.4 1 7038 4356 23 8390 5.4 1 5384 3006 24 11059 5.4 1 7864 3195 25 5527 5.4 1 2843 2684 26 7450 5.4 1 4873 2577 27 9300 5.4 1 5970 3330 28 16610 5.4 1 11225 5385 29 3060 5.4 1 1401 1659 30 8133 5.4 1 4772 3361 31 11251 5.4 1 7831 3420 32 11086 5.4 1 7279 3807 33 7409 5.4 1 4966 2443 34 18112 5.4 1 12473 5639 35 3598 5.4 1 1708 1890 36 10948 5.4 1 6916 4032 37 9972 5.4 1 6855 3117 38 12068 5.4 1 8008 4060 39 10475 5.4 1 7070 3405 40 7730 5.4 1 5298 2432 41 15577 5.4 1 10426 5151 42 14206 5.4 1 9693 4513 43 13143 5.4 1 9768 3375 44 9748 5.4 1 6353 3395 45 3463 5.4 1 1943 1520 46 5989 5.4 1 4121 1868 47 3489 5.4 1 1874 1615 48 12721 5.4 1 8917 3804 49 12876 5.4 1 9453 3423 50 4344 5.4 1 2587 1757 51 8241 5.4 1 5807 2434 52 14678 5.4 1 10804 3874 53 10691 5.4 1 7559 3132 54 6585 5.4 1 4001 2584 55 22691 5.4 1 17087 5604 56 12588 5.4 1 9530 3058 57 8760 5.4 1 6001 2759 58 7707 5.4 1 5513 2194 59 16258 5.4 1 12946 3312 60 4532 5.4 1 2802 1730 61 2863 5.4 1 1719 1144 62 14085 5.4 1 10944 3141 63 7437 5.4 1 4926 2511 64 6375 5.4 1 4354 2021 65 8460 5.4 1 6271 2189 66 22175 5.4 1 17281 4894 67 10442 5.4 1 7941 2501 68 9244 5.4 1 6990 2254 69 10026 5.4 1 7687 2339 70 9896 5.4 1 7644 2252 71 10434 5.4 1 8167 2267 72 11523 5.4 1 8923 
2600 73 12491 5.4 1 9955 2536 74 13421 5.4 1 10746 2675 75 23025 5.4 1 15119 7906 76 58275 5.4 1 51215 7060 77 34068 5.4 1 29418 4650 78 19389 5.4 1 15709 3680 79 12783 5.4 1 9957 2826 80 9794 5.4 1 7282 2512 81 7950 5.4 1 5779 2171 82 7225 5.4 1 5102 2123 83 6698 5.4 1 4721 1977 84 6438 5.4 1 4485 1953 85 6426 5.4 1 4408 2018 86 6221 5.4 1 4277 1944 87 6176 5.4 1 4261 1915 88 6119 5.4 1 4198 1921 89 5828 5.4 1 3925 1903 90 5873 5.4 1 4054 1819 91 5786 5.4 1 3959 1827 92 5637 5.4 1 3802 1835 93 5482 5.4 1 3764 1718 94 5299 5.4 1 3668 1631 95 5376 5.4 1 3657 1719 96 5158 5.4 1 3560 1598 97 5088 5.4 1 3510 1578 98 5111 5.4 1 3454 1657 99 4842 5.4 1 3330 1512 100 4810 5.4 1 3333 1477 101 4645 5.4 1 3175 1470 102 4645 5.4 1 3210 1435 103 4531 5.4 1 3152 1379 104 4251 5.4 1 2924 1327 105 4186 5.4 1 2897 1289 106 4246 5.4 1 3014 1232 107 4096 5.4 1 2835 1261 108 3968 5.4 1 2782 1186 109 3865 5.4 1 2738 1127 110 3704 5.4 1 2601 1103 111 3388 5.4 1 2329 1059 112 3440 5.4 1 2398 1042 113 3310 5.4 1 2321 989 114 3078 5.4 1 2196 882 115 3117 5.4 1 2226 891 116 2949 5.4 1 2077 872 117 2757 5.4 1 1966 791 118 2541 5.4 1 1804 737 119 2416 5.4 1 1735 681 120 2306 5.4 1 1624 682 121 2288 5.4 1 1583 705 122 2058 5.4 1 1438 620 123 1907 5.4 1 1317 590 124 1907 5.4 1 1342 565 125 1725 5.4 1 1237 488 126 1458 5.4 1 1056 402 127 1372 5.4 1 1025 347 128 1171 5.4 1 916 255 129 937 5.4 1 713 224 130 782 5.4 1 614 168 131 646 5.4 1 506 140 132 565 5.4 1 430 135 133 517 5.4 1 413 104 134 485 5.4 1 356 129 135 428 5.4 1 307 121 136 362 5.4 1 247 115 137 298 5.4 1 193 105 138 277 5.4 1 187 90 139 274 5.4 1 167 107 140 271 5.4 1 147 124 141 283 5.4 1 136 147 142 200 5.4 1 95 105 143 223 5.4 1 126 97 144 255 5.4 1 128 127 145 269 5.4 1 130 139 146 341 5.4 1 178 163 147 424 5.4 1 224 200 148 731 5.4 1 408 323 149 1575 5.4 1 862 713 150 5215 5.4 1 2875 2340 151 2608 5.4 1 1396 1212 RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001.fastq.gz 
=============================================
360317079 sequences processed in total
The length threshold of paired-end sequences gets evaluated later on (in the validation step)

Writing report to '/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001.fastq.gz_trimming_report.txt'

SUMMARISING RUN PARAMETERS
==========================
Input filename: Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.4.4_dev
Cutadapt version: 1.16
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/20180328_fastqc_trimmed_hiseq_geoduck --threads 28'
Output file(s) will be GZIP compressed

Writing final adapter and quality trimmed output to Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001_trimmed.fq.gz

>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001.fastq.gz <<<
10000000 sequences processed
20000000 sequences processed
30000000 sequences processed
40000000 sequences processed
50000000 sequences processed
60000000 sequences processed
70000000 sequences processed
80000000 sequences processed
90000000 sequences processed
100000000 sequences processed
110000000 sequences processed
120000000 sequences processed
130000000 sequences processed
140000000 sequences processed
150000000 sequences processed
160000000 sequences processed
170000000 sequences processed
180000000 sequences processed
190000000 sequences processed
200000000 sequences processed
210000000 sequences processed
220000000 sequences processed
230000000 sequences processed
240000000 sequences processed
250000000 sequences processed
260000000 sequences processed
270000000 sequences processed
280000000 sequences processed
290000000 sequences processed
300000000 sequences processed
310000000 sequences processed
320000000 sequences processed
330000000 sequences processed
340000000 sequences processed
350000000 sequences processed
360000000 sequences processed
This is cutadapt 1.16 with Python 2.7.14
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001.fastq.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 12203.25 s (34 us/read; 1.77 M reads/minute).

=== Summary ===

Total reads processed: 360,317,079
Reads with adapters: 88,689,096 (24.6%)
Reads written (passing filters): 360,317,079 (100.0%)

Total basepairs processed: 45,099,303,736 bp
Quality-trimmed: 3,640,004,759 bp (8.1%)
Total written (filtered): 41,159,151,426 bp (91.3%)

=== Adapter 1 ===

Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 88689096 times.
No.
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 31.7% C: 25.7% G: 14.1% T: 28.4% none/other: 0.1% Overview of removed sequences length count expect max.err error counts 1 64753352 90079269.8 0 64753352 2 15118719 22519817.4 0 15118719 3 3729572 5629954.4 0 3729572 4 1305726 1407488.6 0 1305726 5 282692 351872.1 0 282692 6 65920 87968.0 0 65920 7 74364 21992.0 0 74364 8 39672 5498.0 0 39672 9 10697 1374.5 0 7957 2740 10 45048 343.6 1 22457 22591 11 16931 85.9 1 6797 10134 12 33507 21.5 1 18660 14847 13 18755 5.4 1 9059 9696 14 77663 5.4 1 49412 28251 15 19148 5.4 1 11210 7938 16 13449 5.4 1 7018 6431 17 46580 5.4 1 30044 16536 18 7407 5.4 1 3684 3723 19 36440 5.4 1 24063 12377 20 23547 5.4 1 15404 8143 21 3906 5.4 1 1601 2305 22 6888 5.4 1 3092 3796 23 31056 5.4 1 19155 11901 24 88045 5.4 1 57413 30632 25 30799 5.4 1 19866 10933 26 36196 5.4 1 25396 10800 27 5129 5.4 1 2416 2713 28 34937 5.4 1 23009 11928 29 7029 5.4 1 3389 3640 30 42551 5.4 1 28247 14304 31 8699 5.4 1 4768 3931 32 47855 5.4 1 32717 15138 33 61631 5.4 1 46017 15614 34 6085 5.4 1 2982 3103 35 21581 5.4 1 12096 9485 36 21455 5.4 1 12945 8510 37 34351 5.4 1 24705 9646 38 20076 5.4 1 12167 7909 39 19987 5.4 1 13873 6114 40 9385 5.4 1 5173 4212 41 36717 5.4 1 24830 11887 42 71632 5.4 1 49925 21707 43 9887 5.4 1 6005 3882 44 38952 5.4 1 26928 12024 45 79259 5.4 1 56070 23189 46 26512 5.4 1 18489 8023 47 9088 5.4 1 5336 3752 48 74256 5.4 1 53627 20629 49 39821 5.4 1 28381 11440 50 20324 5.4 1 12864 7460 51 162789 5.4 1 120770 42019 52 31359 5.4 1 20923 10436 53 23724 5.4 1 16238 7486 54 23206 5.4 1 17066 6140 55 45128 5.4 1 32754 12374 56 27964 5.4 1 19730 8234 57 37238 5.4 1 27604 9634 58 45726 5.4 1 34990 10736 59 32778 5.4 1 24460 8318 60 32782 5.4 1 24959 7823 61 34414 5.4 1 26281 8133 62 37938 5.4 1 28887 9051 63 48599 5.4 1 37903 10696 64 54110 5.4 1 42463 11647 65 59526 5.4 1 47451 12075 66 64166 5.4 1 51851 12315 67 97138 5.4 1 75275 21863 68 336948 5.4 1 303077 33871 
69 153767 5.4 1 137728 16039 70 53931 5.4 1 44491 9440 71 32760 5.4 1 26004 6756 72 21295 5.4 1 15869 5426 73 18768 5.4 1 13719 5049 74 15872 5.4 1 11155 4717 75 15804 5.4 1 11156 4648 76 14814 5.4 1 10205 4609 77 14079 5.4 1 9578 4501 78 13655 5.4 1 9317 4338 79 13972 5.4 1 9433 4539 80 13906 5.4 1 9318 4588 81 14585 5.4 1 9682 4903 82 14091 5.4 1 9379 4712 83 13670 5.4 1 9016 4654 84 13368 5.4 1 8829 4539 85 13338 5.4 1 8815 4523 86 12781 5.4 1 8370 4411 87 12074 5.4 1 7866 4208 88 12093 5.4 1 7932 4161 89 11792 5.4 1 7575 4217 90 11746 5.4 1 7667 4079 91 11488 5.4 1 7466 4022 92 10807 5.4 1 6964 3843 93 10319 5.4 1 6692 3627 94 10417 5.4 1 6799 3618 95 9889 5.4 1 6369 3520 96 10264 5.4 1 6712 3552 97 10210 5.4 1 6675 3535 98 9677 5.4 1 6295 3382 99 9855 5.4 1 6448 3407 100 10309 5.4 1 6852 3457 101 10187 5.4 1 6734 3453 102 10110 5.4 1 6782 3328 103 10134 5.4 1 6763 3371 104 9856 5.4 1 6548 3308 105 9690 5.4 1 6432 3258 106 9644 5.4 1 6483 3161 107 9175 5.4 1 6221 2954 108 9012 5.4 1 6138 2874 109 8617 5.4 1 5849 2768 110 8348 5.4 1 5665 2683 111 8148 5.4 1 5526 2622 112 8056 5.4 1 5465 2591 113 7615 5.4 1 5133 2482 114 7471 5.4 1 5029 2442 115 7212 5.4 1 4856 2356 116 6823 5.4 1 4570 2253 117 6711 5.4 1 4491 2220 118 6527 5.4 1 4378 2149 119 6212 5.4 1 4141 2071 120 5847 5.4 1 3907 1940 121 5372 5.4 1 3547 1825 122 5203 5.4 1 3453 1750 123 4806 5.4 1 3214 1592 124 4515 5.4 1 2955 1560 125 4058 5.4 1 2678 1380 126 3676 5.4 1 2455 1221 127 3085 5.4 1 1972 1113 128 2674 5.4 1 1709 965 129 2058 5.4 1 1250 808 130 1711 5.4 1 1012 699 131 1369 5.4 1 825 544 132 1121 5.4 1 640 481 133 980 5.4 1 525 455 134 760 5.4 1 415 345 135 627 5.4 1 326 301 136 513 5.4 1 265 248 137 402 5.4 1 189 213 138 269 5.4 1 123 146 139 216 5.4 1 95 121 140 183 5.4 1 74 109 141 151 5.4 1 53 98 142 152 5.4 1 54 98 143 130 5.4 1 40 90 144 110 5.4 1 29 81 145 131 5.4 1 40 91 146 110 5.4 1 34 76 147 136 5.4 1 50 86 148 288 5.4 1 120 168 149 844 5.4 1 443 401 150 3002 5.4 1 1628 1374 151 842 5.4 
1 425 417

RUN STATISTICS FOR INPUT FILE: Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001.fastq.gz
=============================================
360317079 sequences processed in total
The length threshold of paired-end sequences gets evaluated later on (in the validation step)

Validate paired-end files Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001_trimmed.fq.gz
file_1: Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001_trimmed.fq.gz, file_2: Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001_trimmed.fq.gz

>>>>> Now validing the length of the 2 paired-end infiles: Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001_trimmed.fq.gz and Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001_trimmed.fq.gz <<<<<
Writing validated paired-end read 1 reads to Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R1_001_val_1.fq.gz
Writing validated paired-end read 2 reads to Geoduck-NMP-gDNA-2to4kb-4_S14_L004_R2_001_val_2.fq.gz