{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Running in Docker container on Swoose\n",
"\n",
"Started Docker container with the following command:\n",
"\n",
"```docker run -p 8888:8888 -v /home/sam/data/pacbio_oly/:/home/data -it bioinformatics/bioinformatics:v0 /bin/bash```\n",
"\n",
"The command allows ```/home/sam/data/pacbio_oly/``` to be accessible to the Docker container.\n",
"\n",
"Once access to Jupyter Notebook over port 8888 and makes my Jupyter Notebook GitHub repo and my data files the container was started, started Jupyter Notebook with the following command inside the Docker container:\n",
"\n",
"```jupyter notebook --allow-root```\n",
"\n",
"This is configured in the Docker container to launch a Jupyter Notebook without a browser on port 8888.\n",
"The Docker container is running on an image created from this [Dockerfile (Git commit 7ee99a4](https://github.com/RobertsLab/code/commit/7ee99a4722180ce89cff4e1e73468764ee440455)"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Mon Sep 18 16:08:15 UTC 2017\n"
]
}
],
"source": [
"%%bash\n",
"date"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"11cac628f321\n"
]
}
],
"source": [
"%%bash\n",
"hostname"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Architecture: x86_64\n",
"CPU op-mode(s): 32-bit, 64-bit\n",
"Byte Order: Little Endian\n",
"CPU(s): 24\n",
"On-line CPU(s) list: 0-23\n",
"Thread(s) per core: 2\n",
"Core(s) per socket: 6\n",
"Socket(s): 2\n",
"NUMA node(s): 1\n",
"Vendor ID: GenuineIntel\n",
"CPU family: 6\n",
"Model: 44\n",
"Model name: Intel(R) Xeon(R) CPU X5670 @ 2.93GHz\n",
"Stepping: 2\n",
"CPU MHz: 2926.129\n",
"BogoMIPS: 5851.98\n",
"Virtualization: VT-x\n",
"L1d cache: 32K\n",
"L1i cache: 32K\n",
"L2 cache: 256K\n",
"L3 cache: 12288K\n",
"NUMA node0 CPU(s): 0-23\n"
]
}
],
"source": [
"%%bash\n",
"lscpu"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" total used free shared buffers cached\n",
"Mem: 70G 55G 15G 252M 544M 47G\n",
"-/+ buffers/cache: 6.9G 63G\n",
"Swap: 4.7G 5.4M 4.7G\n"
]
}
],
"source": [
"%%bash\n",
"free -mh"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"/home/data\n"
]
}
],
"source": [
"%%bash\n",
"pwd"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"total 96720640\n",
"-rw-rw-r-- 1 1000 1000 1463327832 Sep 11 18:51 170210_PCB-CC_MS_EEE_20kb_P6v2_D01_1_filtered_subreads.fasta\n",
"-rwxrwxr-x 1 1000 1000 2852947472 Sep 7 21:26 170210_PCB-CC_MS_EEE_20kb_P6v2_D01_1_filtered_subreads.fastq\n",
"-rw-rw-r-- 1 1000 1000 1601947458 Sep 11 18:49 170228_PCB-CC_AL_20kb_P6v2_C01_1_filtered_subreads.fasta\n",
"-rwxrwxr-x 1 1000 1000 3126996263 Sep 7 21:27 170228_PCB-CC_AL_20kb_P6v2_C01_1_filtered_subreads.fastq\n",
"-rwxrwxr-x 1 1000 1000 2843320527 Sep 7 21:27 170228_PCB-CC_AL_20kb_P6v2_D01_1_filtered_subreads.fastq\n",
"-rwxrwxr-x 1 1000 1000 3114876304 Sep 7 21:28 170228_PCB-CC_AL_20kb_P6v2_E01_1_filtered_subreads.fastq\n",
"-rwxrwxr-x 1 1000 1000 2960438946 Sep 7 21:28 170307_PCB-CC_AL_20kb_P6v2_C01_1_filtered_subreads.fastq\n",
"-rwxrwxr-x 1 1000 1000 2995066419 Sep 7 21:28 170307_PCB-CC_AL_20kb_P6v2_C02_1_filtered_subreads.fastq\n",
"-rwxrwxr-x 1 1000 1000 2092190052 Sep 7 21:29 170314_PCB-CC_20kb_P6v2_A01_1_filtered_subreads.fastq\n",
"-rwxrwxr-x 1 1000 1000 1842836662 Sep 7 21:29 170314_PCB-CC_20kb_P6v2_A02_1_filtered_subreads.fastq\n",
"-rwxrwxr-x 1 1000 1000 1672061431 Sep 7 21:29 170314_PCB-CC_20kb_P6v2_A03_1_filtered_subreads.fastq\n",
"-rwxrwxr-x 1 1000 1000 1831019208 Sep 7 21:30 170314_PCB-CC_20kb_P6v2_A04_1_filtered_subreads.fastq\n",
"-rw-r--r-- 1 root root 0 Sep 7 22:40 20170905_minimap2_pacibio_oly.paf\n",
"-rw-r--r-- 1 root root 44085 Sep 18 16:07 20170907_docker_pacbio_oly_minimap2.ipynb\n",
"-rw-r--r-- 1 root root 42611690934 Sep 11 22:02 20170911_minimap2_pacbio_oly.paf\n",
"-rw-r--r-- 1 root root 1350653569 Sep 11 19:11 20170911_minimap2_pacbio_oly_170210_fq.paf\n",
"-rw-r--r-- 1 root root 1350653569 Sep 11 19:04 20170911_minimap2_pacibio_oly_170210.paf\n",
"-rw-r--r-- 1 root root 0 Sep 11 18:54 20170911_minimap2_pacibio_oly_170210_vs_170228C01.paf\n",
"-rw-r--r-- 1 root root 25331753284 Sep 11 19:31 20170911_oly_pacbio_cat.fastq\n",
"-rw-r--r-- 1 root root 1469 Sep 18 16:07 20170918_docker_pacbio_oly_miniasm0.2.ipynb\n",
"-rw-rw-r-- 1 1000 1000 902 Sep 7 21:30 md5sums.txt\n"
]
}
],
"source": [
"%%bash\n",
"ls -l"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Run [miniasm v0.2](https://github.com/lh3/miniasm)\n",
"\n",
"#### Mimiasm is a fast de-novo assembler that can be used with PaBio data. Typically accepts the output from [minimap2](https://github.com/lh3/minimap2) (which is what will be used in this notebook).\n",
"\n",
"#### Using as part of pipeline: minimap/miniasm/racon"
]
},
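{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of how the minimap2 and miniasm steps of this pipeline fit together, following the usage pattern in the miniasm README. The function name, the file names and the thread count are placeholders rather than the files used in this notebook, and the function is only defined here, not run.\n",
"\n",
"```shell\n",
"# Hypothetical helper (not part of miniasm): overlap reads against themselves, then assemble.\n",
"overlap_and_assemble() {\n",
"  local reads=\"$1\" prefix=\"$2\"\n",
"  # 1. All-vs-all overlap of the raw PacBio reads, written as gzipped PAF.\n",
"  minimap2 -x ava-pb -t 8 \"$reads\" \"$reads\" | gzip -1 > \"${prefix}.paf.gz\"\n",
"  # 2. Layout: miniasm takes the reads (-f) plus the overlaps and writes a GFA graph.\n",
"  miniasm -f \"$reads\" \"${prefix}.paf.gz\" > \"${prefix}.gfa\"\n",
"}\n",
"```\n",
"\n",
"For example, `overlap_and_assemble reads.fq asm` would write `asm.paf.gz` and `asm.gfa`, assuming `minimap2` and `miniasm` are on the `PATH`."
]
},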
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"/usr/local/bioinformatics/miniasm-0.2/miniasm\n"
]
}
],
"source": [
"%%bash\n",
"which miniasm"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Usage: miniasm [options] \n",
"Options:\n",
" Pre-selection:\n",
" -m INT min match length [100]\n",
" -i FLOAT min identity [0.05]\n",
" -s INT min span [2000]\n",
" -c INT min coverage [3]\n",
" Overlap:\n",
" -o INT min overlap [same as -s]\n",
" -h INT max over hang length [1000]\n",
" -I FLOAT min end-to-end match ratio [0.8]\n",
" Layout:\n",
" -g INT max gap differences between reads for trans-reduction [1000]\n",
" -d INT max distance for bubble popping [50000]\n",
" -e INT small unitig threshold [4]\n",
" -f FILE read sequences []\n",
" -n INT rounds of short overlap removal [3]\n",
" -r FLOAT[,FLOAT]\n",
" max and min overlap drop ratio [0.7,0.5]\n",
" -F FLOAT aggressive overlap drop ratio in the end [0.8]\n",
" Miscellaneous:\n",
" -p STR output information: bed, paf, sg or ug [ug]\n",
" -b both directions of an arc are present in input\n",
" -1 skip 1-pass read selection\n",
" -2 skip 2-pass read selection\n",
" -V print version number\n",
"\n",
"See miniasm.1 for detailed description of the command-line options.\n"
]
}
],
"source": [
"%%bash\n",
"/usr/local/bioinformatics/miniasm-0.2/miniasm"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[Errno 20] Not a directory: '/usr/local/bioinformatics/miniasm-0.2/miniasm'\n",
"/home/data\n"
]
}
],
"source": [
"cd /usr/local/bioinformatics/miniasm-0.2/miniasm"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"%%bash\n",
"cd /usr/local/bioinformatics/miniasm-0.2/"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"/home/data\n"
]
}
],
"source": [
"%%bash\n",
"pwd"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"/usr/local/bioinformatics/miniasm-0.2\n"
]
}
],
"source": [
"cd /usr/local/bioinformatics/miniasm-0.2/"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"/usr/local/bioinformatics/miniasm-0.2\n"
]
}
],
"source": [
"%%bash\n",
"pwd"
]
},
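{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `pwd` results above differ because each `%%bash` cell runs in its own shell process, so a `cd` inside one does not carry over to later cells, whereas a bare `cd` line is handled by IPython and changes the working directory of the kernel itself. The same effect can be reproduced in plain shell with a subshell; `/tmp` is only an illustrative directory.\n",
"\n",
"```shell\n",
"start_dir=$(pwd)\n",
"# A subshell, like a %%bash cell, gets its own working directory.\n",
"( cd /tmp && pwd )\n",
"# Back in the parent shell, the directory has not changed.\n",
"after_dir=$(pwd)\n",
"echo \"$after_dir\"\n",
"```\n",
"\n",
"Inside a single `%%bash` cell, a `cd` does apply to the commands that follow it in that same cell."
]
},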
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"ename": "SyntaxError",
"evalue": "invalid syntax (, line 3)",
"output_type": "error",
"traceback": [
"\u001b[0;36m File \u001b[0;32m\"\"\u001b[0;36m, line \u001b[0;32m3\u001b[0m\n\u001b[0;31m %%bash\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m invalid syntax\n"
]
}
],
"source": [
"# Run miniasm following example used in README file.\n",
"# NOTE: There is a difference; the README file uses a gzipped PAF file as one of the inputs. Let's just see how this goes...\n",
"%%bash\n",
"time miniasm \\\n",
"-f \\\n",
"/home/data/20170911_oly_pacbio_cat.fastq /home/data/20170911_minimap2_pacbio_oly.paf > /home/data/20170918_oly_pacbio_miniasm_reads.gfa"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"[M::main] ===> Step 1: reading read mappings <===\n",
"[M::ma_hit_read::340.853*0.58] read 190030638 hits; stored 55874767 hits and 1131723 sequences (9110809463 bp)\n",
"[M::main] ===> Step 2: 1-pass (crude) read selection <===\n",
"[M::ma_hit_sub::350.201*0.59] 926745 query sequences remain after sub\n",
"[M::ma_hit_cut::351.575*0.59] 51099870 hits remain after cut\n",
"[M::ma_hit_flt::352.852*0.60] 36162346 hits remain after filtering; crude coverage after filtering: 25.64\n",
"[M::main] ===> Step 3: 2-pass (fine) read selection <===\n",
"[M::ma_hit_sub::355.159*0.60] 839478 query sequences remain after sub\n",
"[M::ma_hit_cut::356.058*0.60] 33656383 hits remain after cut\n",
"[M::ma_hit_contained::357.343*0.60] 130281 sequences and 463618 hits remain after containment removal\n",
"[M::main] ===> Step 4: graph cleaning <===\n",
"[M::ma_sg_gen] read 295698 arcs\n",
"[M::main] ===> Step 4.1: transitive reduction <===\n",
"[M::asg_arc_del_trans] transitively reduced 100135 arcs\n",
"[M::asg_arc_del_multi] removed 9442 multi-arcs\n",
"[M::asg_arc_del_asymm] removed 4437 asymmetric arcs\n",
"[M::main] ===> Step 4.2: initial tip cutting and bubble popping <===\n",
"[M::asg_cut_tip] cut 69266 tips\n",
"[M::asg_pop_bubble] popped 209 bubbles and trimmed 123 tips\n",
"[M::main] ===> Step 4.3: cutting short overlaps (3 rounds in total) <===\n",
"[M::asg_arc_del_multi] removed 0 multi-arcs\n",
"[M::asg_arc_del_asymm] removed 1029 asymmetric arcs\n",
"[M::asg_arc_del_short] removed 1617 short overlaps\n",
"[M::asg_cut_tip] cut 1992 tips\n",
"[M::asg_pop_bubble] popped 79 bubbles and trimmed 53 tips\n",
"[M::asg_arc_del_multi] removed 0 multi-arcs\n",
"[M::asg_arc_del_asymm] removed 209 asymmetric arcs\n",
"[M::asg_arc_del_short] removed 235 short overlaps\n",
"[M::asg_cut_tip] cut 210 tips\n",
"[M::asg_pop_bubble] popped 35 bubbles and trimmed 8 tips\n",
"[M::asg_arc_del_multi] removed 0 multi-arcs\n",
"[M::asg_arc_del_asymm] removed 136 asymmetric arcs\n",
"[M::asg_arc_del_short] removed 174 short overlaps\n",
"[M::asg_cut_tip] cut 75 tips\n",
"[M::asg_pop_bubble] popped 38 bubbles and trimmed 7 tips\n",
"[M::main] ===> Step 4.4: removing short internal sequences and bi-loops <===\n",
"[M::asg_cut_internal] cut 84 internal sequences\n",
"[M::asg_cut_biloop] cut 101 small bi-loops\n",
"[M::asg_cut_tip] cut 17 tips\n",
"[M::asg_pop_bubble] popped 5 bubbles and trimmed 2 tips\n",
"[M::main] ===> Step 4.5: aggressively cutting short overlaps <===\n",
"[M::asg_arc_del_multi] removed 0 multi-arcs\n",
"[M::asg_arc_del_asymm] removed 54 asymmetric arcs\n",
"[M::asg_arc_del_short] removed 94 short overlaps\n",
"[M::asg_cut_tip] cut 25 tips\n",
"[M::asg_pop_bubble] popped 9 bubbles and trimmed 2 tips\n",
"[M::main] ===> Step 5: generating unitigs <===\n",
"[M::main] Version: 0.2-r128\n",
"[M::main] CMD: miniasm -f /home/data/20170911_oly_pacbio_cat.fastq /home/data/20170911_minimap2_pacbio_oly.paf\n",
"[M::main] Real time: 383.844 sec; CPU: 238.580 sec\n",
"\n",
"real\t6m23.853s\n",
"user\t3m30.504s\n",
"sys\t0m28.084s\n"
]
}
],
"source": [
"%%bash\n",
"# Run miniasm following example used in README file.\n",
"# NOTE: There is a difference; the README file uses a gzipped PAF file as one of the inputs. Let's just see how this goes...\n",
"time miniasm \\\n",
"-f \\\n",
"/home/data/20170911_oly_pacbio_cat.fastq /home/data/20170911_minimap2_pacbio_oly.paf > /home/data/20170918_oly_pacbio_miniasm_reads.gfa"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"-rw-r--r-- 1 root root 36M Sep 18 16:34 /home/data/20170918_oly_pacbio_miniasm_reads.gfa\n"
]
}
],
"source": [
"%%bash\n",
"ls -lh /home/data/20170918_oly_pacbio_miniasm_reads.gfa"
]
},
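{
"cell_type": "markdown",
"metadata": {},
"source": [
"miniasm writes its assembly as a GFA file in which each unitig is an `S` line holding the unitig name, its sequence and an `LN:i:` length tag. Below is a small sketch for counting unitigs and converting them to FASTA with `awk`; the sample file `sample.gfa` and its contents are made up for illustration and are not taken from this assembly.\n",
"\n",
"```shell\n",
"# Made-up two-unitig GFA for illustration (tab-separated fields).\n",
"printf 'S\\tutg000001l\\tACGTAC\\tLN:i:6\\n' > sample.gfa\n",
"printf 'a\\tutg000001l\\t0\\tread1:1-6\\t+\\t6\\n' >> sample.gfa\n",
"printf 'S\\tutg000002l\\tGGCC\\tLN:i:4\\n' >> sample.gfa\n",
"\n",
"# Count unitigs and total assembled bases from the S lines.\n",
"summary=$(awk -F'\\t' '$1 == \"S\" { n++; bp += length($3) } END { print n, bp }' sample.gfa)\n",
"echo \"$summary\"\n",
"\n",
"# Convert S lines to FASTA: the name becomes the header, the sequence the record.\n",
"awk -F'\\t' '$1 == \"S\" { print \">\" $2; print $3 }' sample.gfa > sample.fa\n",
"```\n",
"\n",
"On the real output, replace `sample.gfa` with `/home/data/20170918_oly_pacbio_miniasm_reads.gfa`."
]
},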
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"S\tutg000001l\tGTGTGTGTGGTGTTGGATATTTGGATGGGCATCTATTTTTCATTTTAAAATGCAAAGATTTTCTCCATTGATTTGAAAACAAATATATACACAGCTTCAGCAAATATATCACTAGAAGTGTTCCATTCGGTTTTGTCTGTGTAACTTCGCAGCACAGTTAAGAGCTGAGTAGGGGCGCTCTCCTAATAATCTATTTTTAGAATGTGTATATACATCAAAACACCAAAACGAACAAAATAATCGAACACTAGCAAATATCTGTAATTTAGAATAATTAATGTTTATTTAATGCACGGAAGTATATACACACAAAAAACTTACATTAAACGGCTCTTGTTGGTCGTGAGTGTAGTCTGACGCCCTCAGATATCTAAACAACAACATCGTATCGCGATGCACTTTCATTATCATGACGTCATCAACGATCTAGTGTGCCGCATTTCGCGAAGATCTAGAGTGGCGGATCAATTGTTTTTTTGTCAACGAAAAAATTACTTTAGATGTAATGCAAATCATGATTGGAAATATTAAAAGACAATTAATTTGAGCAACAATGCCACCCGTTATAAAATTCATAACATTATCTATCTTATCCCTAGGAAAGCATTGATGCCATATTTATTTAAAAAAATAAGGGGGGGGATATGTTGAGGCATGGGTGGGTCATGCATATCGTTCATTGCACAAGGCTGCCCTAAATTGCTGTTTAAACGCGAAACCAAATAATATATTAAGCTATTTGAAAAACGTTTGTTACTCCTTGATGAATTTCTTAGAGAAATAATTGTCTATTTGAAATCGAGGTGCAGATGACTGGGTAAGATAGTTTCAACATACGATACAATATATTATTTTAACAATTATAAGTCAAAATCTGTAAAGGAATGCAAACGTAATCATAAACTACAGAGGTTTATAAAAACATTCATGAATCAAAAGTTAGATTTCAATTCCCAGTATGCCCTTAATATTGAAATCTATTGATAATCACAATACCGCTTGGTCAGTTTGGCAAAAACCACGTTGATGCCGAGGTCACGTGATCAATTAGATATTGAAAATTAATCGTCCTGAATATGTATTTTCAGTTTCCTTATTAATAATTTTTATCAATTTTCAATAAATTCCAAAAGATTGATTATATCCATTTCAAATAAATGATTGTTTTTCATCGCCATATTTTCAGCGAATTTTTCCAATTGGCCACCTTATCTGTGTTGCGTTTGCATATCGTTGTGTTACTAAAATATTTCAAATTCAAATCATATAACACGCGCATTTCATAAATTAGGCGGCGACAAGTAAGTCAGACAGGGGATGCTTCCCCTGGCAACAAGTTGAGATCTTACCTCGGATTGGGGTTCATCTGGGGTCCATCAATTGTCTCAGTGTTTAACTCCATGTGAATGCTATTCTCTGTTTTCAGGATATCTTAACTCATTTCCATCCATACATGAATTTTAGACATGCAATAAAAATCTACTTTTTATCGTTTTCACATTATCATATAAAACTAATTAAACACCTTCTCAACCTTTATATAGATTACTCTGTCTAATTTTTGTCTATCTGCAATGCCGCATTTGCGGTCTGTTATGAAAAAAAAAAAACAATTCGTTTATGTAGTATTACTCGATGCTGTTGTTAAAAAGTCTATAGGCCTAATTGTGCCAAAGGCGTAACATGTTATATATGTCTCTGTTTGCGCTTCAGTTTTGTGTGGGACACAACTTTTTTACGCAAAAACATATATGGGTCACACAAATATTTTCTAAAATGCTTGCATGGGGTCACGTTACTTCTTTTAAAACATTTAAAAAAAAAAAAAAAAAAAAAAATTTATTCAAATGCACCCCTGGTCCCATTCCGGTGTGAGAAAATCTTTGCATAGTCCCTAACTGCCAATATATTTTAGACGGCTATTTCTGTTGTCGAGAATTGAAAAATCGTGAAGCTCAAAGATGTTTTTCCTTTGGATCGCAGT
ATTCTATTTTTGTTAAACACTGGTTTTAGCTTTAATTTTGCCCCTTGAAATTCGGAAAGTTATTGCATTTTCCCGCGACATCAGGACAAATGAAATGCAATAAAATACTAGTATTTGTTCTTTGTCATTCATTCCCGAGGCTGTGAGATAAATTTCCAAATAGTTGCACAACTTCCCCTAGTTCTCTTTTTAATTTTCCCTTCTTAACGACGAGAATAGGCGGCGGCTTGAGTTTTCCATTCTGACACCATATAATGCTCTGATCGTTTACTAATGCTTTATTTCAAAACCAGTAATTCTATTACCACATTACAAAAATACAAAGCCATGCATATCGTTAGTATACCTTGCAATGTGCAAAATATTTATTGCTAAGAAAGATTATGTCAGACCCTCTGTTGGAATAACTTGAATTAAAACAAACACAATTATCAGAGCAAGGTGTCAAAGTTAATAAAATTTTAGGTCAATATGATTAACAATCGGAGGGCTCATATATATTTATTGCCTGTTCGACAATACTGCTTTACCTCTCTATATTTTAACATTCCTTAATGTTTTTTGCGTTCAATATCTTTACTACTACTATTTTCTCGTGAATTATTTGCAGGAATATAAACATGGACATATAATTTTTTTGATCATTATAGTCGTTTGTTTCAGCGCCTAGTCTCGAAAGGAAATGAAAATAGAAAGAAACAGGAAAAAATAAAAGTACATGAGGAGTATAACCTTGCATCTTTCACAGCATAAATAAAGTTTATTATAGGTTGTTTAACGCTAAATTTTTTCTAAGATAGTTCCTTGTGTGCTTTATATTTTATTTAAAAACAAAGAACTATGTTTTTTTTAATGTATAGAAATTTGAAGAGTTTAATCTCGAAATCGGTTTTGCTTATTAATTCAATATTTGCAAAAAAAGAAAATAAACTCATTCATTTCTGGTAATTTGTGTTCCATGTCGTATTCCCAACTCCGATGTTTCCCTTCATATCTTGCTGTAACTTAATGGATCCACGTCTTGCTATTTACTTGATATCTATTTCATTTATAGAGCATTTCACAATATCATATGTTACTCACATTATGACTTGTTTATAGCCAATAATTTTTATGTCCTTATATTAAGAGGGAGCATTCGTTTTAGACTCAAATTTTTGGTTCAATTTTTTAGATTCAAGTGTTGTTTGCTTTAAGCTTTAAAGAACCTCTTGCTTGTGATAGTTAGTACATCTTAACCATATTACAATACAATGTTGTCCCTAGAAAGACGACAACCTATATTATCAATATTGAAGTCAACCCGTGAAGGTAAAACCACTACAATATAGAAATATACTGGGCGGGGGGGGTTCAATGTCTTTGAGGGGAGTTTACTGATATATAAACTAAGACAGTTGGTTAAGTAGTTATTTTGATCCTTATTGATTTTCAGGTTACAAGATCGTGGATCTAGTTCTAATTTTCTTAATAACAAACAAAACGCTTTCCACCTATCAGTTTACTCTCAAGTCTTTGATTTTTCAGATGACTATTAATACTAAAACAAACAATTATTATTATTATTTTTTATTTATTTTTTTTTCTTTGACGAAAATAGTAGTGAATTTGAAAAGTTCTTTTAATTTCAAAATTATTGTTTAACATATCCTTTACTTCTCTGGTGTATCAGCAATTTTCCCGTTGTCTGTTAAATAGGAGACACAAAGCGATTTTATTCATTGTGTTCGAGTACGTATTAAAAGAATTTGTTACAGGAATGGACACATTTCCCGTGTATCAGAAGCCCATGTATGTGGGTGTATGTCGTGAAGAGTGAGGGGACACCGACAGAAGTGACATTCAGGAGGGATCTTAAGAGGACACGGCGGAGATGATACAAAAAACCTTCAGAGTTATAAATAATGGAGTGAGAAGTATGCTGAGTGTAGTCGTTAGGGAGGGATTCAGGTTTGAATCATCTGACTACTGATTACCCATGTTTGGTACACTGATCATAA
GGTGATATGTAAGATTGTCTCAGATCCACATTTTATCATCCACCCAGACATACACTTATCTATATGGAGACTCTTCTGATTACTCCCCAATTCGCCAGATTACAACTAACTCACACGTCAAATAACCCAATCTGTCATGTCCTCGCATCGGAGAGAAACAATACTGTATATATTTCAAGTTCACATACTACGAATTGGAAATCAATCAATCATTAAACGGATTGCATCCAGCCTTTGAAGGTGAGAACCCTGCACCGGTCCATGTCACAACTTCACTTTTGTTTGTATAGGAAAGTGATTGCGTTGTGTTTCCATTCACGCCCATTGCCTTACCAGTCCCTTCCCTGGCGTTGATAGATGTTCGACTCGCCACATGGCCTTCGCTAGTGTAACTGCAAAAAATTATTGATGAAGGAATTCATCTCGTTCCTTATCGGCAAGACCAGGACAGACAAATGAATTAAACTCGGACGTTGGAGTGAGGAGGAATTTCTTTTTCACAAGCGGAGCAAAAGCTAGTGTACTCTAGAACCATTAATCAGTTTCTCTGTTACGGGATCATGAAAATAATTCTAGAAGGGACGATATTTTGTGTGATAAGTACTTTGTTCATATTTTATGAAAACCAATATTATTTTTGGATAATTAAAGAGAGTTCTTCTACTCCCTGGTACCATCAACATTGCTTCAACATGTGTGGCTTTGTTTCAAATTCCTTGATTGCAGGTGTACATTACCGGTTACCTTCCAAAAATTTCTTTATTCCAAGGGAACAACATGTTTGTTGGAAAAGTATGTGTAGGGGCGCAAACATAAGGCGTTACATTGTATCACAAAGCTATCAAAAAGCTAGAAACATTGGCCACAAAAATGGTGCGTTGCTTTCCCTGTTACAGCAGTCCCATCTCAGAGAGATTCTGACTCCGGCCCTCTCCAATCCGAGCTTTGTAAATGCAGACAGGGGAAAGTTACGCTCAAAGTTCAGAAATAGAATATTGATATTGCTTTTTATAGTGAAAATCTCAGTGTTAGAAAATACGATAACTGCATCAGATTTCCTTTTTAAACATGCATGCAATTTAAATGTCATAGATCAGTTGAGTCTCAACTCGTTCATAAATGTTAATACTCCATGATGCTCGTTGATAGGTGTGTTATTATCAAACGCCAAGTAATTTAAAGCTCAGAGTTTGTTTGTGTCAGAGAAAAAGAGATCACCAATTGATAGGTCAATAAATTCCACATGTTAGGTTTATTTTAAAGTTAGGTTTTTGTAATCTGAATGGTTCATATTAGCGTTACACTTGTACGATACCGGAAGATATCAGACAGCGCTAAACTGTCACATATTAGAGTGAAGGCAGAGAACTCTCAAGCCCCATATCATGTATGAGGATAACGTAGACAGAGAAAGATAACAGTGAGGCGTGGGTGGTCAGTCTATGTTACCAAGTTAAGGAAGGCTTGCAAAGAAAATTGTGTTTCTATAAAAGACTTTAACATATATTGAAGATTAAGTTGGAACAAAAGTCAGTAAATGAACGGAATGCCAAATTTTATGCTATCTCCCTTTGTGGCCGTGCACATGCTATCTGTCCATGTCCACTTACGCTAGGTTAACAGGCACAATATCTACAAGGCAACTGACCAGACCTACACACCCTACTGCTCTAATGAATGATGGGAGTAACGTCCTTATTTCTATAGGGATCCTGTCCTGGCAGATACTGGGGATCTGTTCACATGTTGTGGGGGACTTAACGGGGCGTTACAGTCATCAAAAGAGTCACTTAGACAGGTACAGGTACCACAAAATACAGATAGGCCACGGTAACTAGAATTGATTTAGTTAACTCACAATTGTAGGCAAGTAAGTAATGGCGACGTAAGCATTATAAAAGGTGTGAAAAAATTAAGATTATCTTTTGTATGTACATGTATTTCCATGGAACGTTTGGACTTTTGATTCTTGTGGGATGTTATGTACAGTAGGTTGTGGT
ATAGATAAACAGTTTAATATGCGAAAAGATCTCTATTGATGTAATAATCATGTCTAATTTCGATTTCCATTGTACATTGAATGCCGAACACATACGCTTCTCTCAAATTGAGTAAATTTTTATTGTAACATGGATATGCAAATGTATTTGACAAATAAATTTATTTTGAATAGCTATCAATTCTAGTTTGAACAATATTTAATCCCATTTTTTTGTTTCTTCAGCCCTGGAATTGTCAGCTTGGGGCATCCATTTAGTGATGCATTCAGCTAAATCAGAGAGCTCATAGTAACTAATCATCTCCACTGTTTCACTTGTATGTAAAGGCAAGATAACGAACAGTGATCAATCTCGTAAAACCCTAAATAGAATACAAAAAACTAAGAGTTGGAACAATCACGGACCATGGACACCACCAGAGCGGGATCAAGGTGTTAGGAACGAGTAAGCATCCCTGTTCACAAGTCACACCCGCCGTGAACCACAGTCTGATCAGGTTAAACTCCTCCCTTAAAACAAGGAAACCAATAAACTTCAGTTTACATATTCAAAGTTTTAACTCTAAAGCAAAGCAAAGACAATGTTATTGCAGATCAAGTAGAGTAATTAACTTGCACGAGTAAAACTAAATGGTGGTTTGGAGTATTTATTGTGTTTTCATAGATCATTTCAATTCAGCTATTGGGAATTTTTAACTTTTTTATTACCATTTATTAAATACAGGATAGATTTCCTGCGTCATACCCTCTGAATTAGTAACAATACATCGATATACTTTTAAAGAATATGGGAGAGTGAGGGGTTGCTCCGCTATGCACTTATAGGAAAAAGTAAACCGCTACCGGCAATGCTGAAACTCGGGGGCGGATTGATTCTTGGGAAATATTAGAAAAGCCAAGATAATGAACAGTGATCAAATCTCGTAAGTCCCTAAAACTGCTACACGAGAATATTTAGCCGGCACCTAATTATTTTAGCCGCAAACGTACGATTTGCGCTAAATTAAGTTGTCGCTAAAAATTGAACTAGATGGTACGCTGACGCCTAAAATAAGTTTGCGCCAAAAGGTTTTCATTCCTGGATGCGCTAAGTATGTTTTGGTTTATCAAAGCTAAATATCAATAAACATCGACTAAAACCTATTCAGACTGTAATTTACGTGTCTTTTGTCACGTCACAGCCCACCGTTGCAAGAGCCCCCAAGCAGTTCGCCACATGGGAAATATTCACCCATCCCTACCAACCCATAATTTATGTGTTTAATATGCATCTTATGCCGTGTTCTTTGCTGTTTGTTTTGTAATTTTTCAGGTAAAAATCCTTAGCAGGGAGCGTCATCCCTCCGTTCACGGCTTGTCAGTTTCCGAGTGGGCGGATTTCAGCTTCCAGTAGAGCTCACCTGGATTTTTCATCAATTAATTGACTCCAAATATAATATGAACAATTCTTTACTCAATTTTACTGGGTTGAACATCTGCGTCCCAGGTTCATTCAGGTTCTTGTTTTTCTCTATTTATCTACACCAATTACGCAGCAAGGCCGTGATCATTATCGCCAATAACCTAAAAACTTCAACCTGTCGGTTTTGCGTTGCTTGCCCCTGGTGAACACTACTAGCTATTTTGTCTTTGATAATTCATAGCAGGACAAAAGTCCGTTCATAGCTCTAAATTCGTAACATACGTTTATACATTTTAAAGAAATATAGGACAGTTGAGGGGTTGTGAGTAATACAAAAAGCGCGAAGTTCAACATTGGGTTCAAAAGTAATTTTAATTACATTTACAGGATCACCATTCTGTTGATTCATATAAATCCATTTCTTCGTTTGGCTCTGCTTCAAATTTATCTCCAGGCTCGCTTCTTTTTTTTTTATATTTTTGAAAATCCTAGGGCTATACACATTCATCATTTCGCCAGCAGTCTTTCGACATTGGAAGGGCAGAATTAGCATAGAGGAATTCCACGGTATCTCGTCTGCGACCGGACCTGATAGAATT
TCTCCGACTTCGAGGAAGATCGGGATGTCCCCAATTGAAATTTGTCTGTGAAAGATAGTTTTCCAGCTGACAGCTCGATCCCGATCGAGTTTTTGCTGGAGCGACGAGGAATATTTTCATCTGAAAAGGTGTCCAGAGTCTGGTGTCAGAAGGAAGACCGAGAACCGGTTTGGTGTTCGCCAGACTTCCACGTCCGTGTCTGAATAATCTTTCCATTCTATTGTATTGATCCGTTAATGTTTTGATTAAATAGGATCCATGGCCTCTTCCTCAAAACGAATTATATTTCCTCATTGCAAAACTCAATTGGTGACAAGTCTTTGGACTACTTCTTTTGTGGTTGGTAATGGTGGTAATAAAACATTGAAGGCGGCGGATTGAATGCTTCCTTTTGCGTGCTCCGCGCTTATGTTTCAGAAGTAGAAGGCAACACATTGTACCCAGGTGGACGCACTGACTTCGTTCTTAGAGGGCATTTGTGAGTATAAAATACATGCTATCGTTGAATCGGTTCATCAACAATGCTGTTGAAAGGCTTCGGAGAAGACTTGACGTAGAAGATGACCGCCCCTGCCGCCTTGTCCGTCTTGCTTCCATTTGTATTCCTTCTGATAGAAAGATCCACCAGGATCCGAAGCACGGGTTTGGCGCTGCTTCTTGGGTCGGCTTTTTATTATCCCTGGGGGCTCGTCGTAGCGAACATCGCGAATCCTCATCGATTTTCATTATCCAACAAGTAAACACGTGTTCTCTACGGGAGATTTTGGAGGAACTTTTTGTGATTCATTGGGGACTTTCCAATATAGTGCGGGTTAGAAAATTATTTGACATACGCAGGGGACAGAAGATGCGCAAGGGATGAGCCGTCATTAGAATCGATTTCTTTGGAGGGAAAAAGGGAAACCATACGTAACTTTCATCACACTTTTCTCTTTAAAACCGATCACTTTTTGACAGAAGCATGTCCACATAGTAGAGGTTCGGTCTGCGCGCGCATCTGAGCTGACTATTGTCGTCTGAGACGATGCAGTTCCAGCGGATATTTGCTTAAAAAATTTGAACGTATTGGCACTGTCTCAATTGCATCTTCCATCGGTAAATCAGATGGTTGCTCGTATGGAGAATCCACTTGTAATCTTGTAGCAGACTTTGATGTTCTCATATAAATTCATCCTTTCTCGTTACAATAACTCCAAAGCTTGCACTGATACGATGGCAATGCATAAAAGGTCATCATTGGACACAACCTACTTAACCTGCAGTTCACAAGAAGTGGGATTTCCAGAAGAATATTTCTAGGTTGAGTAACGTTGCTCCGTACTACCGATTGCCCAGATCTGCTCGTTCCAAACGGAAGCGTTCCGGTTAAATGACAACTGGCGTGTCGGAATACGGCCATAAACAAGGGCGCACAATACAAGGTGAAACATTCTTGTTGATGACTTTCACTCATTTGAAACCGGACGATTAGCGCCCCCGGAAGTAGTAGTTAAGGTGGTAGATAATCCTTTTTGATCGAACGGAAGCAAAAATTTCTCGTCGCGTCACTGTGCACAAACATGTTACTTTTCCGTCACGACTTTCGTATTTTCGAAACACTTACTATTTTCTCCTTTTTTCACCGACCGTAAATCTTCTCTTTTCGCCAAACATCGCCTGGAAATATAGCGTGCGTTGTTGATGAATAATGCTTCCATTTTCCCGACGCACAATTTTGAAGAGCATGCCACCCGACGTTCTCCACCGATAAGAAATTCTTCCGTATCTGTTCCCGCCCAACTCTGAGATTTCTTCATAGTGGGAACAAAAATCGTCAACGGTGACCATTGTTATACATGTAGCTCTTCTTCGAACAATGGAAAAGTATCATGGACAAAGCAAAGCGTTTTATATTGGTGCATATTTTCTGCCCACCCTTGTCAGTAATTATGCGACTTGTCAGATAATTATGTCGACTTGTCAGCATCTTTATGTTCGACTTTGTCAGATCTTTTTGTCGA
CTTGTTCAGATATTCACTGTGTTCAAGAATTCATGATGAAAAAATCGTTCATCCCGTTTGTGCACATCCAGTTGACAACATAATTTTCTGACAAGTTCAACATAATCATTACAAAGTCGACATCATTATCTGACAAGTTCGACATCATTATCTGACAAGTTCAACATTCATTATCTGACAAGTCACATCATAATAATTATGTCTAACTTGTCAGATAAACATGCCTACGGAAACGGTTTACTCGGAACACCTCTTGTTCTTCTATGTCGATATAAAAAGTCGACATTGTCCCAATTTGAACAATTTAACGACTTTTTAAACTAGATAAGTCTTCATTTGTTGCACTACACGACATATTTTTCTTCGCCTTGTTGTTACTGAAATCAACATGACGATCTAGCAAATCGCCAAAAAATAAGTCAATGAAATGAATGTCGCTAAGGCTTTTTACCCACGACTTTTAAAAGAATGATTACGTTTGTACGGAAACGTTTTTACGGAGCACCTCTTTTCTCCATGTCGATAAAAAAAAAGTCGACATAGTCGCCAATTAGCGCAATTTAACGACTTTTTAAAACTCATAAGTCTTTCTTTTGATGCACTACACGACTTATTTGGTCGTCTTGGTTTGTACTTAATCCAAACAAGACGGACTTCTAGCCTTTTAAATTCGCCAAAAATAAAGTCAATGAAATGAAATGTCGCCCTGGTTTTGGCACACGACTTTTTAAAAAGATGATATTTGTGTAAGTCGACGAAAAGATCTCAACTAGATGTCTATTTTCATTGCTAAAAGGAACCATTTTTGACGACATATTTTCATTAAACCAAATAAGCGACTTGATAAAGTCGTTTTGTATTGTCTTTTGACAACAAATTCGCCATGTGTAGGAACACTTTTATTACGGGGCGTTAGGCGCCATACCAATGCACCCCTAAAAGTCCCCAACAAATATTGCAATATGTATATTTATTATACATATTAAACAATTCTCCACCCCCATATTGTTATATTTGCAAAGTTACCGGGGCTATTGAAATATTTCAGGCCTATACCCTAAGAAAATTCATCTCTTTTTTAAACAACCTTTTATAAATAATACTTGTTGTTCCAAATATACGGTCATTTAGATGGAAAAGAACGCATTATATTGTAATTTATGCATTTTCAATAGGTAAAATTACTTATTTATCCATTCATAAAATTCATATTTTTACTATTGAAAATACTAAGTGAAAAAACAACTTTGTTGATATTATGTTTGCATAGCTTTATACTTTTGAGACCATCATGTTCTATACACAAGAATAGTAAATAAATAAAAAGTCAATGCTTTTTAGTTGGATATCGAATTTATTTTCACTTATAAATAGGATTTTCTAAAAATTTTCATTCGCTTCGACTCGTGAAAATATGGAAAAAAAAAACCTGTCTCACTCGTGAATAAATTCAAAATCCAACTACGTGTACAAAGCATGAATGTCCTCTATTTATATATAGAATTTTTTTTATAAACTATCATGAATACCAGGAAGTTTGACAGACTATACTAAAGCTAAACGTTTAAAATTAACGACAATTTTAAAGTACCAAATCCAGTATAGATCTTAAATTACATGGCGGGTTAGAACTTCATAGCTACATTACTAATCAATACAAAAATATTTTCTAACATACTTTATTTACTACACATGCAATATGGACATAAGGTAACAAAACGATAAATGTCAAAAATGAAACAATGTTCACCAAAAAGCAACAATTAATTAATTAGCCAAATTACATTATATCATGAATCATGATATTTTGACATTTGTATACATCATTTACACATGATAATGATAAAACAATCACTCAAATTGGTTGACTTTGATGAGACATATCGTCAAAATATGTTAACCTTTTTGCAAGAAAATTGGATTTTCGAACACTTTATTCGTCGAACTCTTACAACAAATATCAATCTTTTTTAGAAAGTCGCTGCAAAAACCAGGGCGAC
ATTCATTTCATTAACTTCTTTTTGGCGATTAGCTAGAAGTCGTTCATGTTGATTTAAGTACAACCAGGCGAAGAAGATATGTTCGTTTATAGTGCATCAAAATGAAGACTTACGAGTTTAAAAAGTCGTTAATTTTCAAATTAGGCGACAATTTCGACTTTTTTATATCGGACATACAGAACAAGAGATGCTCCGTAAACGTTCCGTACGTTTGTGACACAATTGTATAACTTCCCTTTTGTATTTGTTGTAATCGACAAAAAATGTCGACAAAATGGCCCAAATTTCTTGCAAAAAGATGAAACACTATTTTGACGATATGTTTCATCAAGTCAACCATCGATGATTGTTTATCATTATACATGTGTACATGAGGAGTAGTCAATGTTCAAAATATAATGATATCAATATAATGGCTTAATTAATTAATTGTTGCTTTTGTGAAAAT\tLN:i:12432\n",
"a\tutg000001l\t0\tm170211_224036_42134_c101073082550000001823236402101737_s1_X0/119802/23643_32121:297-8472\t+\t699\n",
"a\tutg000001l\t699\tm170315_063041_42134_c101169382550000001823273008151700_s1_p0/145210/0_9963:167-9905\t-\t3585\n",
"a\tutg000001l\t4284\tm170308_230815_42134_c101174252550000001823269408211743_s1_p0/161776/0_7674:79-6933\t-\t539\n",
"a\tutg000001l\t4823\tm170301_162825_42134_c101174162550000001823269408211762_s1_p0/150940/15520_22005:3-6466\t+\t1946\n",
"a\tutg000001l\t6769\tm170301_162825_42134_c101174162550000001823269408211762_s1_p0/144765/0_10257:4593-10255\t-\t5663\n",
"S\tutg000002l\tGCTTGGATGCTTTGGATGTGTATCTTGTGCAAAAATGGACAGAAAATCAGTTAGGAAATCTGGCAGAGACACATTTTTGAAAAATCACAAAGTTTTCCAATAAAACGTGCTAGGGTAGGGGTGAAAAATAGGGTTGTCGGATGACAGTAAACTGACAGTAAACCACATATTTTTTTTGGCCAAATATTTACACCGGGACAACTTATACCTGGGTAAAGTATAAGTAAATTTTTATTTTACAATAACCCATGTTAACATTTCCAAGAGATTTGTAACTTTTTCTGGGTCACAAGTTTTACAGAAGTCAAATACCAACATTTGAATGTTGGTAAATTTACCAACAGCTAGAATCTGGTAATTGTGACTTATGTAGGATATGGCTCCAAAGACAAATTAATTCTGCAGGTAGAAGGTATCGGCATGTGTTGAAACATGGCCATTTTTGGTAAAATTTTTCTGATAATTTGATTTAGCTTAGTGCAATAGGTAGAGTCTCTACTTTGGACTTGAACAGTGGTCCACCCATAATGCCCTTTCCAGATCAAGAAAAGTAAGGAAAAAAAGCAGAGAACTTGGACAGAACTGCTTATTCATAACAAGTTCATACTCTCCTCTTGTTAGATAGAATCATAGAATTCTCTTTTATTTATTCACTGAATATACAGGAAAGTAGGGCATAATTTTACTCAGAGTACATCGGATTTTTTACTGAATTAACTCTTACTCATTGATTGTAGACCGTCAACTACGTACAAATAATTTATTGAATTTTTTTGTGACAACTATTTGGTGATTGAATTTTCCTAGATTGATGATGTAAGAAGTATAGGGTATAAATCAAAGGATTAATTTATAACTAGCCTACCTAGTAAATTTTACATGTATTACCCGATCCCACTGGAACTTTCGTCGGCTAACTTCAGCCATATTTCCCTGATTTTTTTTTAACAGTCCGGTGTCGACTTTAACTATTCATGTAACGGATGCACAACTACAATTAGACGTTTCAATACCGGTCATAGCGGATTCCGATCAGCCGATGAAAATGGCAAGTGTTGCGACAATTGTAATTAGACGAAGATTCAGCAGCTATAACACACCAGTTCTGCTCATATAAGATACGCCTCGCGTTAATTAAACCAATTCAATAACCTACAGTTAATCTGGTTACAATCTGAGATTTGAAGAAGTGTCTTTAAGTGCATTCGAAACAAACGGAAAACCATCATTTCTTTTTGCAATTTCCAACTTTCGGTCCGAATTGCTCACCAATCATCGTTGTTACGACGCTATCTGAAATCCGGATCCCGGAATGATAATTCGGATACCATTTTTTACATATGGTACGTATATCTCAAGAAAATACAAGCGGGCCAAAAATTCCATTTAACATAAAATCACTTTAGTATCCTTGTTTAGAATAATATAAATATTGAAAAAAGAAGCCGTTGGAAGGGATTTACATAATGTGTTTGTTAAAAGGATAGCCATTCACTTATAACGCAGTTGGAAATGCTTTGTTGTAAACTTGATTCAAAACAAAATAATTTTTCAATACTTCAGTTTCAAACAAAATATATTCTGTTCAAATAAAAATAGGTAGAACTATACTCACAATACAAAAAAGAAGAAAAAAAAATATTTTAAAGCCTAACAAATCCAAATTGCATCCATTTAAATATAGCATGCTGTATAATACCATATAAATAGTATTTTTTTACCTTTGGGGTTAGATAAAGTAAAAATAGTGGCTTTCACATCATGAAAGTACATATTTCCAAAATAACATTGCTTGTGGTTATTTTGTGTTATGTTGAATATCACCATCCATAGCAAAAAATTGGATTTCACATGGTTTCTGCGTAGACTGCCAACTTAATACTTATGTTCTATAATAACCTCACTATATCTCCAGAGTATTAGAGAGTGGTGCAACTTGAACCTGACCGTTTTCGGGTTGCAACTCCACCTACTGTTCAATCAA
ACTCAATTCCTTCTTTTTGCTCGTTTCGAACGACCGAAAGCAACGACTTTTGGACAAACAAGTCAACTAACTCTCGAAACAAAAATTAACTATCACACAATATCTTCATTTTAATTAAAAAATATTTGTCAGATTTAAGTTTTAGGCCTGTTGTATCGCTTTTCACTTCACTGGATGTAATTCAAATAATTGAACATATTCTATTATAAATTACTTTGATGTTTACAACTGTGGCCAATAAATTTTAATTGTAAACGTTCCAACAATTTGAATAAGTTCAAACGACCTCGAACATTAAACTAATTTTTAAAGTTTGAATTTACTAATTTTCAACGTCACAACTATGCAGGTATCAGGCTTACACCAACGGGAAGTCTGAAGTTGGTTTTTTTTTTTCAAATAATGAAGCTCAAACTACATCCCTAACAAAAAGCCCACCAGGATGGCACGCGGTTTTTTCGAGGTGCCAACCTGGAAAATTAACATAAAATATTTTTGTACTCGACCGGGGAGAAATTCTGTAAATTCAGAATGTTTGGGCTGTTTTTACATACTTTATGCGCCCGATTATCGCCCTTTTCCGGCGCGCCGGGTGTTTTTTTTTTTTGGTTTTTTTTTTGTGTGTGTGTTTTTTTTTTTGGTTTTTTTTTTGGTGTTTTTTGGGTTTTTTTTTAGCTTCGATGCTCATCCCTGGTACTATGATGTGAACCGGAAGTGAGTTACGCAGTGTAGTACACATTAAATCACTCGGTTATATGATACAAAAATAAAACTTCAAAAATCAACGAATAATAAAATCAAAAACTATATAAAAGAAAATATGTAAGACTTTTTTTTAAAAAATTCAAAATCCAAAATCAAATCTTATTCATCAGTTGACTTGAAGTTTTATTTTCCATTTAATCGAGTGATTTATCATCTACTAACCGCTGAGGTACATGCACTTCGCGGTTTTTTCCCACCAATGGAATTAGGGATGATCTATTAATCGAAAAAGGGGGGGGGGGGGGGGGCCAAAAAAGGGTCGTAATCGCCGCGGTTGCATGGGTAAGCAGCCCAGAAATTCCGATTACGAATTTCTCAACGAAATTTTCCGCGGCCGAATCCAACACTACAGTGTAATTTTCCAGGTGGGACCTTGAAAGACCCGTTTGCCATTATGTGTTTAAGGTGTTTTTTTTAAAGTAGGGGGTAATATTATTTTTGGCTCTATCATTGAAAAAACATAAAGACTCCTGTGAACTATAATTAACCGAACATTTAACCAATAGACATACGGGAATAAAACAAGAACATTACCAAGCTATATGCAGAATGAAACACGTTGGTGGCTTGGGTAAAATACGTCAGCTATCATCTCCCTCCATCGAACTGCATAGTCTCTCTGGTAGGTCAAGCAGTGAGGTTCCGTTAAAGAAGAATAGAGCCAGGTTGGTGAAAAGTGGCAATATTTGCCCACATTTTTCTGCTTTCTCCACTTCGTTAGACGCCATGGGGCCTGCTGTGACGTCGATGGTGTATTCACTTGTCTTACGTAATGTTGTTATGCTTATAATGTTTGTTGTATAACTGAAAAAATATTGAATCATGATATTCAAATCAAAATAGTGCTACTCTTTCGTATAATATTAAATTTTCACGTGATCAGATCACCGACACTTCCTAGAAACAAATTTATTTTTCGAAAATTGAAATGAAAAACTAATTAAATGTTAATGTAAGTAACATGGATGCAAATTTACCTATCACCGATGGGATATAATGTATCCTTATATGATTATCTGAGGAGGGCTAACGTATTTTAACTTTGAATTATAGAAGAGGAGAGGGAAGGCAGTAAGTTGATAGAACTTGACTCCAATCCGTTCAATAAAATATGTTTCAAAACAAATATAAAAGAATTTAACAAAATTCTCAGTTCCTATGTTTGTAGAAATACACGTAACTCTAATGATATATCTATACATCCTTTTTGTATCAAATTGTCCTAAAAATCATGA
TTTTAATAAATTAAATGTCCTTTGAGCAATATATTATTAGAAAAAAGTGTAAATTGAATACTGCGTATTCAGGGACCAGGAACTTGAACCAGCACGTAAACCTCAATTGGCCTCATCTTAGTCTGTGATAAATGCTCATTGTTGGCGAGGACTATCCCATAAAATTACTATTAAAATCCGTGTATCCATCGGAAATTCATCAGGGTACAATCCCTAATAGACTCTACAATCACTGGCATGGTTCTTAACCGCTGATTTCTGATTAACCATTCAATGAAATTTTTTTTTGTATTCGTAATTGGTGTTCATTTGAATCAAACAACGTACGGAACACGACGCCATTCAACATTTTATAATGAAGCGACCAAACAGCTTTGGTATCGAAAGTAGATTAGTGGCAGAACTAATTATGGTCTGACAATAGGCGAGAAGACGGGTTGATTTTGATTTAATCGGAAACCGAGCGCACTTACATGTCTGCAGAAGTCGGGTGGTTTTTTTCTCCATTAAAGCGGTTTTTGCGTGATATTGCATCTGCGAGGCGGCCTATTCACTTTTCAATTCAATATGTCGAACTAGATATATTCCAGTTGTAACACCAGCAAACGTTTTGATGTCTGTATGTACATGTATCACCCATGTTGTGGTTTCAATTTCCTATTTCATTTCTTTGCTGCCGTACATTTATGTGGGGTTCGCTCTAGTTGTAGCGCCTCTCGCCCCCCCTTTTTTTTTTTTTTTAAATCGGAGAGGGCCTACATATTGCAGCTATCTAGATCATGTTGCAGTTTGATCTTTCGACTATCTTGCTGGAATTTTTCCCCTGTTAATTTGACGATTAGTATGACATGAAGGGTTAACATAATTGTCCGGGTTATTTTCAATCAAACCTAGTAAACGGTCCTACTAAACGGAATATCTCTGGGGTTTTTTGTTTTTGTTTTGGGTTGTTTTTTTTGGGAGGGGGGGTTGTTTGTTGTTGGGTTTTTTTTTTTTTTTCCTTGGTTTTTTTTTAAGGGGGGGGGGGGTAATATGTTATTACAAGGTGTGGTGTGTATATATAAATTAATAACAATGTATATATCAAATCAAATGTTGAATATGTTCACGTCATTGCATCAAATTAAAAACTCTAGCCGTTCCTTTTTAGTAGTACCTTTGTATGAGTGAAAACATTCCAAATTACATAGTGTTTGGATGTGGCATGTGGCAAGGGTTTTTATTTTTCTTCAATTCCTGGGAACAAATTACAGCGTTGTTGAAATTTCATTAGTTTTTTAAGAAATAAATATAAAGTGAGTGCATGGTTTGTTCTTTTTTAAAACTCCTCTCATAGTATTAAAGCCAGGCTAAGGGACTTTGCAAACAACTTGATTGTATTGTTTATTCAAGAACTATAACAGTTCATAGGTTTACGTTTTTCAATAAGTATAAATAAACAGACATATAAAACATGAGTTGAAATTGCGCTTCATTGTTTCAACATATGAGTATCTTGTCGAAGGAGGTTTTAAGGAGGTATGGGAATCCAAGAAATTCATATAAGGAAAGTGGTATATGAAATAGATTTTTTTTTGGCATATAGCTATTTGCAATGTTAAAAAAATTCCAATTCTGGCGAGATTAAACCATGATTGTACAGTTACCAATTACATGTATAGTCGAGTAAGCTTGGCCCCGAACAGATACAAAGAATGGTATGACCAATTTCTAATTTCACTCGGTTTCATCCAGGATTCCAGGTTTATAAAGATGTGTTGTGTGTATTTTTTTTTTTTGTTTTTGTTCAATTAAATATTAGAAAGTGGAGCTTTGTTGCAAACTAATTCAAATGTGTATAAATGTGATGTGTTGATGTGAATGTGGTGTGTGTGTGTGTGTGTGCCGTGTGCTGTTTTGCAGCCAGTTTAATACCAAAACGCTGTTGTACTGTCAATTGATGATGTTATAACTCACCGTTCCAGAAACGTCTACAGCAAAATCATCTTATAAGATTGTAC
TGTATGTGCAAGTTATATTGTTGATATTAGTTATTCGTGGGCCACAGAATTAACTCAAGTACAGTTAGTGAATTGTCTCTTAATCTATTATAAAGTCTGGATCACGCGTATATATTCAATAGAGAAGAGTCGGAGTAAACAGTCTTCATCTGTGAATGGTTGACACTATCGTGTAAGTCACACTTCAGGAGCCGTTTGTTGCGACGAGGATATCGGATATACCGACGCTCTTTGTTTTGTATGATTTTATGAGCTTTCAATATTTTATATAATCTACGGACAAAGAAGTGTAAATTATTGAATAAACATATCTCAACGCCTGTTTGACCAATGCGAGAGCATTAACTGTGAAATAATATGAAACAATCATTACAAAACCCAGGATCGTCAGTCTAACGTTACATCTAGCTTTTGTTGTGATTTCTACGTTTCCATATGAATATTCAACTTAAATGATAATGGCTCAAAATGGCAACCTCAAGGGATATAATTTACTATTAATCGAAATACATATAGTTGTACCAAAATATTGCTATTCAGGTGTTATGATGAAATGTAACAAATGGGAGGTTTAACTTAGCACAATCAGCTGTGCTTATCTGGGAAAAGGCCAAAAAACATTTTTGAAATGAAAATAATTATTGGTAATTTGGTAATCATGTAAGAATAGTGCTCACGATTCTTTATTATGTAAACACACGAGGTTGTTTTGTTTACGTAAATATTGCTTCACAAATAATACCATGTTTAAGACACAATGAAATAATTCGAAACATTACACACTGTTTAAATTGTGTTCAAAATGCATTTGTATCTTTGAAAAAAATCTGCTTAAACGGAATAGTATGATACATGTAGAATATTCGAATATAGGTAAAAGCCGGAAACTACCGGACCACGATCTAGCACGATCCTTGATTTTGCCTAAATGTGATCAGGTCGGGTAACACCGCCACGTTTTTGAAAATCCCCACTGACATTCTACTACGATTCATTTAAATCTTGTAAGGTCGTGAGAGCGTAGTGAAATCGTGGCAGTGTGAGGGGGCTTTAAACTAAAACAATGCCGAGCATGTACAGTTCCGAAGAATTGTAAGTGGTGCAACATATCCCACTTGAAACTCAACAAATGTTTCAAATTTTTACTGACCGTTGAAAAATACTAAGTGGAGCATTTCTAAACATAAATCAAAGTACTTCCAAGTTCTTAAACGTGGGTTCGACACACAAATAAAGATAAGTGTCTAATCTTTTAGAATCCGTCTACGATAAACAAACATTAACATGCTAGGTATCTGCTCATTTTAACTACTTTGTGTTTAAATTTGGAAAATCGTTGGAACTCGGCAGTGAGCCCATTTTGTTTGCTGGACGCTTGACATAACGAGGTGTTCACCTAGTAAAAATGGAAATCTAAAACTGTTCATGCCCCATTAAATTTATAAAGATATCCTGAGAGTGCACTGCAAAACTGTCTAGCACGTAACTTAATAGATCATAACTGGTCCAAATTTCAATTCTCAAGTACAGTTGAAATCTCGTTTTATCTCGAACTCGATGAAGACCAAGAAAAAACTTTCGAGTTACGTAGTCGAGATATCGAGGGTAAAATACTTATGATTTTCTTTCTAGATCAATCTGAAGGATTTTCAAACATGTACAATAATAAAGAGTTTATTCACACACTTTTTATCCAAACATTAAAACAAAAGAATAATTGACCATGCACTCAAGAGCATACAAAATCTTTGGTAAAAAAATTGACATTTATAAAAAATAAATCATACACAAGAAAAAAAGAAGTCCATATTGTTGATTTACTCAGAAAAAGACTGATAGATTCTTCTCTAAGCAGAATGATTAATTTGAATGCATGATACTTGCAAATAGAAAGTGATACCACTGTAAATCAATTAATGAAAGGTGGTTGATTGTGTGATTGAGGGTGAAATTTTGTTTTGCTTTTTTCTAATGTACAATTAGAATTTTATTCTAAGTTT
AAAGAAACACTTATAACATAGCACCAAAAAAAAATATTACTCTTGACAAGAAAGAACATGTTATAAATATATTTTAACAGTCAATATATAAACAAAATATTAAAAATTACTGTGATGTTCCATGATTGTAATTGTTTGTGCAAGAATATTTGTATGAAATGTTTTCTGCTTCTTCCCCAAATAGTACATGTAACCCGATCTCCTATTTGATTTGCGGAAGTGCGGTCCGTGGTCATGTTTACTGAAAACGAAAATTGATATTCTTTTATCAATGTAGCAGTCAAGATTCTTCATGAATCATAATATAGAATGGTAAATATTATTTCCGTACTAGTCCTATCAGATTTTAATGCGAAATTTAAGTAAGCAAAACTATTGGAAATTGCCATATAATCATTTTCTGAATAATGAAAGTTGAAAGTATATGATTATGAAGCATCCACGTGTGATGACATTTTTATTTGTTCTACGAAAAGCGAAGGCACATTTTAGTTATGTTCTGTTTTCTGTCTACCCACATAGAAAACTATTTATGTCCATATTTAAATATGAACTTGCCAGGTAAGCCTTTGCTTGTTACTTGATGAAAGGCAGTGAAACACAGGAACAGCATGTGAATTGAAAAAATGTCATTTATTATGGTAATGACTGAGTTATATAACAGAAAACATGCGGGGATCAGATGGCTCGCAATATAACCATAAATATAAAGAACATCTCTACATGTATGCACAAGTGCTGATTAATGCTCTGAATATCTCAGAACACATTATGCAATATAAAAAATAGACTGCTGGAATGGATGTGGCCTGCATTGATTTGTATAACCACTTATACTTGATGTCCAGGCGCCTATTTTTATATTGCATAATGGTGTTCTGTATCAGATGGCATAATCACACCTATGTGATACTGTAGATTATGTTCTTATATTTTGGTTTTTCGAGCCCATCGTGATCCCCGCAGGTTTTTGTTATATAACTTCGTACATTACATATAAATGACAATTTTTCAATTCACCATTGCTGTCCATGTTTTCACTGCCTTTATCAAGTAACAAGCAAAGGCTACCGGCCGTCATATTACTATGGAACATATAATAATGTAATTATTGCAGTAATGACTTTATGTACCATGTTAATTTGTATTTAATTTTGGTGTATATAAAATTGTGTGAAGAATTTTAAACATATTTAGGAATAAGAAATTGTGTTTTTCATATCAGTCATACGCGGGGAATGGAAACTTGATTTCTGTGGTCCACTCTTCTTTTTGCAGGGCCTCTGGTGGATGCTGTTCCTTCAAACTGTTTGGAAAATTTTTGATCACGCGTTTATACAGGATTGCATAGTTTTATTACGCTTGAACAATCACAAGCGCCGCTTTTAATCGGGACGTGCTCATTGCATAATCACACGAACGCCGCAGTTTTTTGCTGATCGATTTGACCTGCGGTCATTCTTGTTTAATTGCTCAGCTTTGATACTGATCTGAAATTTGTTTAAAGTATATTAATCTGATTACCCACCATTCACCCCTATCCTCTGTGGTATTAGTTGTTGTAATATTTTTCTATGTCCTGCATTATTTGTGATAACCACCTTACTTGATCCAGCAGCTCTATTTTTATATGCATATGTGTTCTTGTGGATCAATGGCATAATCACTATCCTATGTGATAAGTCGTAGGATAGTATCTTATATTTTGGTTTTTCGAGCCCTATGATCCCCGCATGTTTTCTGTTAATATAACTTCGTCATTTAACATAATAAATGACAATGTTTCTCAATTGGAGCGACGGTACACTTGTACTAGGTTATTAAAGATCAATAACTGACGGTTACGGGTAACAAGTTCAACGCTAAATTCTTGAGAGATCGTAGCGGCAGTAAGTGCTAAGAAGCGCTCAGATGCTGGCTAGCCCGCTATACGATCTGAGCGCGACCCTCTCAAGACCTCTTTCCTGCTTGGGGCAATAGTTTGATGCGGCGTAGGAGCG
CTGAGTGTTCGTGTTGTGTTAGAAAAGTGGCTTACCCGGGAAAAACAGAAATCATTTTGGAGGTGTTCCATAGGGGAACTCCTGATAACAATATTATAAATGTTCTGAATAGAAAGATAGATTATGAAAAAAATACTAGTATCGACAAAGGTTTATGAAATAATCTTATCATTATATCGATTTACAATACAGCCCAAAGTGGCAGTGTTAAAATATAGAAACAATTCATTATGGTTTCCAGATATCAATTTATGCTAGTTTGACAATATGCATTTTAGTGTTGTTGGTTATTGTTTTTTAATAATCCTCGCACATGCAAATCGGAGTGTTCAGACGGTATAGCCCAATCTTGCAGTAATATGCATCGAATCACGGGGCATAGATCCGGCGTATGCTGCATTCTGGGTTACATTACATGTAACACATGCTTTTCTGGAATTATGTTTGTTCTGCACTCTTTTGAAACTCAACATCTTATAAATCATATTTCCTTATTGGCATCAGTTAAAATAAATAGTTCCCTCCAGATATGCTTCGGAAAAAGAAACTTACTTCAATCTTATATCTTGTTGGAAATGATTGTATCTACATTTTACAACTGAAAGCTGATTTTTTTTTTTGTTATCCGCCAAAATTTAGCATTGAATACACCAGACAACCTCTTCAGGCGATATGCTGCGAATTCTCAAATATCTCAATCCATAGCGACACTAGCTTTAGAAAAATTTGTGACGGTTTATGATCTAAGGTTGAGAGTATAATCATTTAAATTGTTGCCAAAAATGCCCGTCGATAGTCTCCTATAATGTTGCCTGCCAGTAATCAGACTGTTTGATTCTTAATAGACAACATATACGGTCGGTTTCCCCTACTTCCCGCTCCTCAATAAAATAGTAAGCTATTAATCATTAGGTGGTAGGGTTTGTGAAACATGCCTTTCCACTACATAACTAAGTATTTGCGCTAATGCTAAAACATGTTTTTATTTCATAAAAATGCAGAAAATAATTGAACTACTGTATCATCAACGACTCTAGTAATGAGCCGGTATATTTCTCGGGGGATTAAAAACAAGCGATACATCAATGCCACTATTATGAAACTTGTAGGAAAATAAAGGATATCCTTTTTTTCAATCTCACAACCCATTCTATCGATCTTAAAAAACGCTCAGTGGTAAGCACGGCACGTGGTCAGAGGATCAGACGACGAACATGCTGATCTGAAAGAATTCAATTTGAAATGCAAGCCGGGTACCTGGAGAAACCTCAAATTGAGATATAAGTTTTCGGCCAAGCATTTACACTCCTACCAGCATTGTTTGGTAATTAACAAAATGCATGCGTTCAGACAAATATCGACTAGTTTCTGAAGCAAAAACCAAACATTCATCCTAGATGCGATCGTAAATGTTTTGTTTGTCTTAGAGTATTGCTTTCAATTCGCCCCCCATTAGGGGTCATTTTTTAATGTTTAGTGTTAATGCATTATTCAACGAACAATAGTAACTTAAATAAAATCTAATACTTGACTCATTGTGAATGTTTGCCCGTAATGAATACACAAAGGAATTATTTTCGTGGAATTCCATCGCCAATTAAGATAAGAGTTATGTATGACTATGCCGGGATAGAGACTTGCGCCATCAAGTTCATGAAACACACGGTGGAACAAAAGGGGAGCATACATTTTATAAAATTTGGAACTCTTCCACCTCTTGTGCTGAATGGGTGGGGCAAAAAACTGGTCAAAATTCAAATATCGTCTTCTTCCCTTCCCCACATTTAGGTAGAAAAACCAAATGCATGCTTATAATGTCCATGAATGCTTCTACTTCTAATTGTAAAATATTGATTCATGAGTGCTTCAGCCTCCAGGGTGGGTCATATATGACTACTCAGTGAACATGTATCAAATCTTACAAAATATTCTTCTGCACTTTCATATAATTTTTTAGAATAAACTTAAATGCATGGTTATCTCAGGTAGATGGATATACT
GTTAAGTGCTTAGTGAAGCTGTGGATAATGATTTGTATTCATTAATAAAGATTGTGATATTAACATCTGGTATGACTATATATATCTGCTGTTAATTGTTGAATGGTCTTAGTGCTATCTTATTGTTGTACAAAAAATGATGCTCTGGCTGAATAAAGCTACACTAAACTAAACTATGATGTACCTTGACCAATCCGATTCTTGTAAATTCGGTCTTGATTTTCTGCTGTCCTCCCAGGGCCTCAGGTATTGTTTGTTTCATGTTTGAATCTCGAACTTTGACTTGTACATCTTCTGGTCATAGGCCATGCCTTAATGATATAATGTATTGGTGGCCCTGCCATTTTGTGAGGCATGCCACAATTGTGGAATTTTGATTTGTGATGTCCTTTGGTATTTTTATCAGCCGTGCTTTATAGCCAATATAGATATAGAAACCAAATATCATTCTTTCTGCTTGCAATACTGAGATAACAAACAGTGATCCATCTCATTGTATCAAGAAAATACAGAGTAAAGAGTTAGGACAAACATCGGACCCCTGGGCATACTAGAGTTGGAAACTGTGTGCTAGAAAGCCTTACTTGTTATATATATTATATGTTCATGCTAATATCTGGTGTTTCTAAGGGTCCGTGTTTCCCAATCAATTGTCTATAACGTTCATGGGAACCTGATCATTCGGACTTGTAATCCAACATTAACTGTTCCTACCAACCCACATTCACTAATCTAATGGGAATTCTGTAGAAAAATCCCCAGTACGTAGGGACTTTGGCGTGGGGCAGTTTCAAAATTATATTCATATACGACCAATACATCCTCATAAAATAATTATAGAAAATGGCTAGATATTCTAAAAATGATAATTGGTCAGTAAAAAGAACCTTAATAACAAATAAGATTAGAGAGTCAAAAAAGGTGGACCATATGAACTGTAAAATCTCTTAAATCACCATTGTGAACTTTTGAATCTAACCTTACAACTGAATTGCCTATGATATGAGAGGGAAAACCGTCTGCCAAATCTTGAAATCTTCTGCACTGGTCAATGTCACTCGAACAGGGCTTATACAATAAAATGCATCATATCTTTAAAATCTTAATTTCCTGGTCGTACAACAAATATAGTGAGTGTGTAGTATTTAAAATTAGAGTCTCCACCAGATGATTGTCCAATTCAAACCCTTTTGGCCACGAATTTTGGTCTTACAGCCAACCTGTTTTTAAGTTTACTGTTCATAATTGAATGAAACTTTTTGAATCCTGAAAAATCACACAACTAATGCAAAACTATCATGTGCGTCGATGGTTTGCGTACGTCATAAACTTGGATCAGGTTAATGCTGAGATCAGTCTCAATATAATATACTAGTAATTCGTAAATTAAAATGTCTTTATAGGTATTCCTTTTTTAATTTTCGTGTTGGTTATCATATTGGCAATGAACTAAAAATACTAGAATACCGACAGGAAGACCTTTTGAACGTAAAAGTCTTCACTTAAGGGTTTGGCTGTAGTAAGTAAGTCTTTTGAACTCTTCCTCCTGCAGTATCTGGAAAATAACACTGGCATTGCTATAAAACGAAATTTGTCTAACGATAACCCCATACCGCCGATGAATCCAAATCAAAAGTTCGTCAGTCCAATATAAATATTGTGTACTCCCAATATGTCAAAAACAATAATGCACGTATGTGCGTGCATACAAAGCAGTAACCAGGGTTTGGGAGTAGTGGGTGGGTGTGTGCTACTGGGTGCAAGTGCATCCCCATAAAAACTGCTAAAAACCTCTCATTGGGGGGGGGTCACTCCTCACACACCCCTAGCTTAGACCAGGCATGGTGGAGTGTAGTGGGGGTCGCTAAAGTCCCCCACCCAAGTGAAAATGTTAAACTAGTAGCTTCCGGGACCCCATACTCACAACCGGTCTGGACCAGCTAGGGGAATCAGACCCCTCGCCTTTGTCTCGTGCACTCCGTGCATCTGGAATTCTAGTT
ACGTCCTGATACATACACGTGAAGAACTGATCATTAGTTACATAAGATTGTTGATATATCTGTATTGTAGAGTTTAAGCCTGCACTCGAAGCTTTTATGTAGCAAATGTTTAGAAAGGCGGCATGTCGCTAAGATATTTGATACCAAGACCGATTGGTTTTGTTTCCTAATATTTGACGTGTGGGATAAGTCACTATCGTGTCGTATATCCTTACCTATATAACGTCAGAATCGTCATGCCGGTCACACTCAGTGCTGATATAGATAACACTTTTACCTCAATAATGTTCTTCTCTGGAAGTCTTCGGCATTTTCTTGGGGTAGTTTTAATTACAGGATCATCTACAAAGTAACCTGGTAATGTACATCTTGTTTGGATATTACACTGATTTTGTGGGTTTTTTCCCTTTGTACATACATTTGTTTTTAATATTTGATAGAATTATTTTACATTAACCTAAATAGTCATATACATTATCGTTTATGATACCTGAAATTAAATGTACAGTCATTTATTTTCGAGGAGGTAATATGCAGATAGTCTATACAATGTGATTCTATCGCCCCTTATTTACAGGACGCCACAAGCTTCGCTGTAGCGTTATAATTCGAGTTTCAAACTGATATCTTTTTCCAAAATTATTTACAGGATGCATCAAGAGCGGGTCTGCTCTGATATGGGTTTTCAAAATCTGCAGAATTAAAGTTTTTATTTCTGGCGGAGAATATTTAAAGTACATATATTTAATAAAAATCAAAACTAGAATGTAGATGTATAATGGGTCAGCTTGCCATTATGATGCCAAATTGAATCTTCCCCAATGTTTACTAGTCTCCGAACCTACCAATATTTACCGTTTTGTATATGATTTGTTCAGCTCCTGCGGGCTGACATATTATATGGTGCGTTTATCTGAACTGCGCGTTGCGTTAACTGGGGATTGTGGGAATGAACTACTAAAAAAAATTAGAAATCTGAATTAAAGAGGTTTCTTAGAAGAAGTTCATTGATATCAAGTCACTTAAACTATGATATAAGAAAACCCAATAGAGCAAAATGATAAACAATAAGAGAAAATGATATACGATTTCTTGGAAATGAGGTACTGTTATCAATAAACATTTCTCCAAAATAGACTCGTCATAACAGAGGTATCATTCTATATAAGAGGTGCGTCAACTTCCAAATTTGTTCGCAACGCTAAGGCCTTCGACGGGGAGTAGCACAGGCCCTTTCCGGACAAGTCACACTACTAAATTATCTCAATTAACTGCAGACTAGTAGCTTTCAAACGTAGAAATCTGAGCTAAATTAGTAGTACTAACCCTCAAAATATGATCTTTTAAAAAAATCCATAAGAACAATCGATAGCGATTCTAGGAAATCAAATAAACGCAGTGGTACACTCTATTATTAGATGTGCCAACGGCTTGTGCGTACAGATACTGCATTTTTTTTGGGTATTAACACTTTATTATATAATATATGTCTGTTCCACTGATTAAAAAACTTCTTTTTTTATAATCAGTGTACTCTTTTTAGCATTAAATTCATTCGGCAACCTCGGTTTTGTCCAGGAATTTAAAGTTATCTTTAATGGTGTCCTGCAAAACGACGTTATAATATGATATTTCGTAAAAGACTTCTTAAAATGATACCTTTCATCAAAGAAAGACAATGTATCATGTATTACACGAGGAAAACTTACCTCTTTTTCACATATATTTAATCGATTGATATTTTGTTCATTGGCTTTCAAACTTGGTTTATTATGAAAAACATCTTTTCAAAATTACACACGTGAACTGCTTGTGCAACTTCCTCAAT\tLN:i:15812\n",
"a\tutg000002l\t0\tm170211_224036_42134_c101073082550000001823236402101737_s1_X0/111/0_9545:243-7169\t-\t9\n",
"a\tutg000002l\t9\tm170211_224036_42134_c101073082550000001823236402101737_s1_X0/111/9588_19287:2360-9622\t+\t143\n",
"a\tutg000002l\t152\tm170315_190851_42134_c101169382550000001823273008151702_s1_p0/115246/13012_25767:156-12738\t+\t5682\n"
]
}
],
"source": [
"%%bash\n",
"head /home/data/20170918_oly_pacbio_miniasm_reads.gfa"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Well, it looks like it worked. We now have an assembly in the [graphical fragment assembly format (GFA)](https://github.com/GFA-spec/GFA-spec)\n",
"\n",
"#### Will proceed to the next part of the pipeline (consensus sequence generation): [Racon](https://github.com/isovic/racon)\n",
"\n",
"#### My notebook for this step is here: [https://github.com/sr320/LabDocs/blob/master/jupyter_nbs/sam/20170918_docker_pacbio_oly_racon0.5.0.ipynb](https://github.com/sr320/LabDocs/blob/master/jupyter_nbs/sam/20170918_docker_pacbio_oly_racon0.5.0.ipynb)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 2",
"language": "python",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 2
}