{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "### In this notebook I will run PECAN on the DNR geoduck peptide data files. \n", "\n", "#### Prior to executing, I did the following; all programs installed on lab computer Emu:\n", " * Obtained a [background proteome file](https://raw.githubusercontent.com/sr320/paper-pano-go/52c6b18b5b09e5c3a49250cf47ad4ddc8e9dc004/data-results/Geoduck-transcriptome-v2.transdecoder.pep) from the geoduck gonad transcriptome, a protein fasta file (provided by Steven).\n", " * Digested the proteome file in silico using Protein Digestion Simulator. See notebook for more details. \n", " * Obtained the PRTC protein sequence fasta file, converted it to .tabular and merged with the digested proteome file. Combined file is: 2017-02-19_Geoduck-database4pecan.tabular \n", " * Obtained the isolation scheme file (from Emma): DNR_Geoduck_IsolationScheme.txt\n", " * Converted the .raw files that are produced by Lumos to .mzML using MSConvert. Steven did this for me (need to record process). \n", " * Created a .txt file with list of paths to all mzML files: DNR_Geoduck_mzMLpath.txt \n", " * Created a .txt file with path to the background proteome database: DNR_Geoduck_DatabasePath.txt " ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\r\n", "File does not exist: /Users/yst/PycharmProjects/pecan_repo/test/DB/human_20150911_uniprot_sp_digested_Mass600to4000.txt\r\n", "\r\n", "Please confirm the file path in: /usr/local/lib/python2.7/dist-packages/PECAN-0.9.9.3-py2.7.egg/PECAN/PecanUtil/config\r\n", "\r\n" ] } ], "source": [ "! 
pecanpie -o ~/srlab/Documents/Laura/DNR_geoduck/Pecan_2017-02-21_geoduck \\\n", "-b ~/srlab/Documents/Laura/DNR_geoduck/2017-02-19_Geoduck-database4pecan.tabular \\\n", "-n DNR_geoduck_SpLibrary --isolationSchemeType TARGET -w isolationWindowWidth \\\n", "--pecanMemRequest 16 \\\n", "~/srlab/Documents/Laura/DNR_geoduck/DNR_Geoduck_mzMLpath.txt \\\n", "~/srlab/Documents/Laura/DNR_geoduck/DNR_Geoduck_DatabasePath.txt \\\n", "~/srlab/Documents/Laura/DNR_geoduck/DNR_Geoduck_IsolationScheme.txt \\\n", "--fido --jointPercolator" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### That didn't work. I did some troubleshooting with help from Sean & Emma ([see GitHub issue](https://github.com/sr320/LabDocs/issues/472#issuecomment-281236200)): changed the database format to .txt and changed the Isolation Scheme format from 1 column to 2-column \"windows\", then tried again:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\r\n", "File does not exist: /Users/yst/PycharmProjects/pecan_repo/test/DB/human_20150911_uniprot_sp_digested_Mass600to4000.txt\r\n", "\r\n", "Please confirm the file path in: /usr/local/lib/python2.7/dist-packages/PECAN-0.9.9.3-py2.7.egg/PECAN/PecanUtil/config\r\n", "\r\n" ] } ], "source": [ "! pecanpie -o ~/srlab/Documents/Laura/DNR_geoduck/Pecan_2017-02-21_geoduck \\\n", "-b ~/srlab/Documents/Laura/DNR_geoduck/2017-02-19_Geoduck-database4pecan.txt \\\n", "-n DNR_geoduck_SpLibrary --isolationSchemeType TARGET -w isolationWindowWidth \\\n", "--pecanMemRequest 16 \\\n", "~/srlab/Documents/Laura/DNR_geoduck/DNR_Geoduck_mzMLpath.txt \\\n", "~/srlab/Documents/Laura/DNR_geoduck/DNR_Geoduck_DatabasePath.txt \\\n", "~/srlab/Documents/Laura/DNR_geoduck/DNR_Geoduck_IsolationScheme.txt \\\n", "--fido --jointPercolator" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Didn't work again. 
Let's try converting all the tabs to character spaces in the database file:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "expand: /home/srlab/srlab/Documents/Laura/DNR_geoduck/2017-02-19_Geoduck-database4pecan.txt: No such file or directory\r\n" ] } ], "source": [ "! expand ~/srlab/Documents/Laura/DNR_geoduck/2017-02-19_Geoduck-database4pecan.txt" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "head: cannot open '/home/srlab/srlab/Documents/Laura/DNR_geoduck/2017-02-19_Geoduck-database4pecan.txt' for reading: No such file or directory\r\n" ] } ], "source": [ "! head ~/srlab/Documents/Laura/DNR_geoduck/2017-02-19_Geoduck-database4pecan.txt" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "#### File paths could be the issue:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\r\n", "File does not exist: /Users/yst/PycharmProjects/pecan_repo/test/DB/human_20150911_uniprot_sp_digested_Mass600to4000.txt\r\n", "\r\n", "Please confirm the file path in: /usr/local/lib/python2.7/dist-packages/PECAN-0.9.9.3-py2.7.egg/PECAN/PecanUtil/config\r\n", "\r\n" ] } ], "source": [ "! 
pecanpie -o /home/srlab/Documents/Laura/DNR_geoduck/Pecan_2017-02-21_geoduck \\\n", "-b /home/srlab/Documents/Laura/DNR_geoduck/2017-02-19_Geoduck-database4pecan.txt \\\n", "-n DNR_geoduck_SpLibrary --isolationSchemeType BOARDER \\\n", "--pecanMemRequest 16 \\\n", "/home/srlab/Documents/Laura/DNR_geoduck/DNR_Geoduck_mzMLpath.txt \\\n", "/home/srlab/Documents/Laura/DNR_geoduck/DNR_Geoduck_DatabasePath.txt \\\n", "/home/srlab/Documents/Laura/DNR_geoduck/DNR_Geoduck_IsolationScheme.txt \\\n", "--fido --jointPercolator" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Nope. Ok, actually converting from tab to character spaces this time: " ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": true }, "outputs": [], "source": [ "! expand -t 8 /home/srlab/Documents/Laura/DNR_geoduck/2017-02-19_Geoduck-database4pecan.txt > \\\n", "/home/srlab/Documents/Laura/DNR_geoduck/2017-02-19_Geoduck-database4pecan_char.txt" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\r\n", "File does not exist: /Users/yst/PycharmProjects/pecan_repo/test/DB/human_20150911_uniprot_sp_digested_Mass600to4000.txt\r\n", "\r\n", "Please confirm the file path in: /usr/local/lib/python2.7/dist-packages/PECAN-0.9.9.3-py2.7.egg/PECAN/PecanUtil/config\r\n", "\r\n" ] } ], "source": [ "! 
pecanpie -o /home/srlab/Documents/Laura/DNR_geoduck/Pecan_2017-02-21_geoduck \\\n", "-b /home/srlab/Documents/Laura/DNR_geoduck/2017-02-19_Geoduck-database4pecan_char.txt \\\n", "-n DNR_geoduck_SpLibrary --isolationSchemeType BOARDER \\\n", "--pecanMemRequest 16 \\\n", "/home/srlab/Documents/Laura/DNR_geoduck/DNR_Geoduck_mzMLpath.txt \\\n", "/home/srlab/Documents/Laura/DNR_geoduck/DNR_Geoduck_DatabasePath.txt \\\n", "/home/srlab/Documents/Laura/DNR_geoduck/DNR_Geoduck_IsolationScheme.txt \\\n", "--fido --jointPercolator" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Still no. Emma emailed Sonia and this was her reply: \n", "\n", "The path to the file is incorrect as stated in the last line\n", "\"Please confirm the file path in ....\"\n", "\n", "If you wish to have pecanpie to provide some species db by default, put them under /your/dir/path and change the corresponding address in the config file to /your/dir/path\n", "\n", "Sean added my 2017-02-19_Geoduck-database4pecan.tabular file to the PECAN util folder; this is now saved as a new -s [species] called \"LAURAGEO\". " ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\r\n", "File does not exist: /Users/yst/PycharmProjects/pecan_repo/test/DB/human_20150911_uniprot_sp_digested_Mass600to4000.txt\r\n", "\r\n", "Please confirm the file path in: /usr/local/lib/python2.7/dist-packages/PECAN-0.9.9.3-py2.7.egg/PECAN/PecanUtil/config\r\n", "\r\n" ] } ], "source": [ "! 
pecanpie -o /home/srlab/Documents/Laura/DNR_geoduck/Pecan_2017-02-21_geoduck \\\n", "-s LAURAGEO \\\n", "-n DNR_geoduck_SpLibrary --isolationSchemeType BOARDER \\\n", "--pecanMemRequest 16 \\\n", "/home/srlab/Documents/Laura/DNR_geoduck/DNR_Geoduck_mzMLpath.txt \\\n", "/home/srlab/Documents/Laura/DNR_geoduck/DNR_Geoduck_DatabasePath.txt \\\n", "/home/srlab/Documents/Laura/DNR_geoduck/DNR_Geoduck_IsolationScheme.txt \\\n", "--fido --jointPercolator" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Still no dice. Something is wrong with the pecanpie config path." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\r\n", "File does not exist: /home/srlab/Documents/Laura/Jupyter\r\n", "\r\n", "Please confirm the file path in: /home/srlab/Documents/Laura/DNR_geoduck/DNR_Geoduck_mzMLpath.txt\r\n", "\r\n" ] } ], "source": [ "pecanpie -o /home/srlab/Documents/Laura/DNR_geoduck/Pecan_2017-02-21_geoduck \\\n", "-s LAURAGEO \\\n", "-n DNR_geoduck_SpLibrary --isolationSchemeType BOARDER \\\n", "--pecanMemRequest 16 \\\n", "/home/srlab/Documents/Laura/DNR_geoduck/DNR_Geoduck_mzMLpath.txt \\\n", "/home/srlab/Documents/Laura/DNR_geoduck/DNR_Geoduck_DatabasePath.txt \\\n", "/home/srlab/Documents/Laura/DNR_geoduck/DNR_Geoduck_IsolationScheme.txt \\\n", "--fido --jointPercolator" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Sean figured out the issue: when given the .txt file that lists all the mzML file paths, pecanpie assumes that the mzML files are located in the same parent directory, so the paths in that file cannot include the parent directories. So, all file paths were changed from /home/srlab/Documents/Laura/ .... 
\n", "\n", "Changes made:\n", " * Saved the isolation scheme file to .csv (was .txt)\n", " * " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.12" } }, "nbformat": 4, "nbformat_minor": 2 }