{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "PKtegndeTK4w"
   },
   "source": [
    "# Table of Content <a id='toc'></a>\n",
    "\n",
    "\n",
    "&nbsp;&nbsp;&nbsp;&nbsp;[Module 3 - Biopython](#0)\n",
    "\n",
    "&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[Broad examples of what we can do with Biopython](#1)\n",
    "\n",
    "&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[Main concept of Biopython](#2)\n",
    "\n",
    "&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[Objects](#3)\n",
    "\n",
    "&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[Help - important!](#4)\n",
    "\n",
    "&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[`Seq` Object and (Alphabets)](#5)\n",
    "\n",
    "&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[Indexing and slicing of `Seq`](#6)\n",
    "\n",
    "&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[Methods of `Seq` objects](#7)\n",
    "\n",
    "&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[Mutability of `Seq` objects](#8)\n",
    "\n",
    "&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[deprecated - Alphabets](#9)\n",
    "\n",
    "&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;['Biological' methods of `Seq` objects](#10)\n",
    "\n",
    "&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[Bio.SeqIO Module and SeqRecord Object](#11)\n",
    "\n",
    "&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[Reading sequence records from files](#12)\n",
    "\n",
    "&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[`SeqRecord` objects](#13)\n",
    "\n",
    "&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[Slicing a SeqRecord](#14)\n",
    "\n",
    "&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[`SeqFeature` object](#15)\n",
    "\n",
    "&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[Writing to sequence files](#16)\n",
    "\n",
    "&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[Accessing Online Databases](#17)\n",
    "\n",
    "&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[Now, you can try to solve the exercises.](#18)\n",
    "\n",
    "&nbsp;&nbsp;&nbsp;&nbsp;[Additional Theory](#19)\n",
    "\n",
    "&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[On `SeqRecord` iterators and processing large sequence files with `SeqIO`](#20)\n",
    "\n",
    "&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[More on `SeqFeature` objects](#21)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "TYePQ3v0TK43"
   },
   "source": [
    "\n",
    "[back to the toc](#toc)\n",
    "\n",
    "\n",
    "# Module 3 - Biopython <a id='0'></a>\n",
    "----------------------------------\n",
    "\n",
    "Let's consider a typical task in bioinformatics: we are interested in finding the GC% for some sequences. For this we would need i) open the file, ii) parse and extract the sequence information, and iii) calculate and report their GC%. In a very simple example, we could write something like following:\n"
   ]
  },
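  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The three steps above can be sketched in plain Python, without Biopython. This is only a minimal illustration under stated assumptions: the helper names `read_fasta` and `gc_percent` and the toy records are hypothetical, and the lines are held in a list so that the sketch does not depend on any file (with a real file, the open file handle would be passed to `read_fasta` instead).\n",
    "\n",
    "```python\n",
    "def read_fasta(lines):\n",
    "    # yield (identifier, sequence) pairs from an iterable of FASTA lines\n",
    "    seq_id, chunks = None, []\n",
    "    for line in lines:\n",
    "        line = line.strip()\n",
    "        if not line:\n",
    "            continue  # skip empty lines\n",
    "        if line.startswith('>'):\n",
    "            if seq_id is not None:\n",
    "                yield seq_id, ''.join(chunks)\n",
    "            # the identifier is the first word after '>'\n",
    "            header = line[1:].split()\n",
    "            seq_id, chunks = (header[0] if header else ''), []\n",
    "        else:\n",
    "            chunks.append(line)\n",
    "    if seq_id is not None:\n",
    "        yield seq_id, ''.join(chunks)\n",
    "\n",
    "\n",
    "def gc_percent(seq):\n",
    "    # percentage of G and C letters in seq (0.0 for an empty sequence)\n",
    "    seq = seq.upper()\n",
    "    if not seq:\n",
    "        return 0.0\n",
    "    return 100 * (seq.count('G') + seq.count('C')) / len(seq)\n",
    "\n",
    "\n",
    "# toy input standing in for the lines of a FASTA file (hypothetical data)\n",
    "toy_lines = ['>toy_seq_0 first record', 'ATGC', 'GGCC', '>toy_seq_1', 'ATAT']\n",
    "\n",
    "results = {seq_id: gc_percent(seq) for seq_id, seq in read_fasta(toy_lines)}\n",
    "for seq_id, value in results.items():\n",
    "    print('GC% for {} is {:.2f}'.format(seq_id, value))\n",
    "```\n",
    "\n",
    "Even this small version has to handle headers, multi-line sequences and empty lines by hand, which is the kind of bookkeeping a dedicated parser takes care of."
   ]
  },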
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "executionInfo": {
     "elapsed": 4165,
     "status": "ok",
     "timestamp": 1687847940310,
     "user": {
      "displayName": "Subair Beta",
      "userId": "08001855949161377870"
     },
     "user_tz": -480
    },
    "id": "4GyD-rRxVqbp",
    "outputId": "0d2d07e8-2652-4134-99f0-f3b4d22cdb22"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Requirement already satisfied: biopython in c:\\users\\asus\\anaconda3\\lib\\site-packages (1.81)\n",
      "Requirement already satisfied: numpy in c:\\users\\asus\\anaconda3\\lib\\site-packages (from biopython) (1.24.3)\n"
     ]
    }
   ],
   "source": [
    "!pip install biopython"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "executionInfo": {
     "elapsed": 912,
     "status": "ok",
     "timestamp": 1687847984066,
     "user": {
      "displayName": "Subair Beta",
      "userId": "08001855949161377870"
     },
     "user_tz": -480
    },
    "id": "VDj9wP5iXAvR",
    "outputId": "be9c30b3-e6a4-4590-b76f-69ac769902f1"
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "'wget' is not recognized as an internal or external command,\n",
      "operable program or batch file.\n",
      "'wget' is not recognized as an internal or external command,\n",
      "operable program or batch file.\n",
      "'wget' is not recognized as an internal or external command,\n",
      "operable program or batch file.\n"
     ]
    }
   ],
   "source": [
    "!wget 'https://raw.githubusercontent.com/sib-swiss/first-steps-with-python-training/master/notebooks/data/my_sequences.fa'\n",
    "!wget 'https://raw.githubusercontent.com/sib-swiss/first-steps-with-python-training/master/notebooks/data/example.fa'\n",
    "!wget 'https://raw.githubusercontent.com/sib-swiss/first-steps-with-python-training/master/notebooks/data/example.gp'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Zd7iV7PwTK45"
   },
   "source": [
    "Now let's see how it would look like if we used Biopython:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "executionInfo": {
     "elapsed": 334,
     "status": "ok",
     "timestamp": 1687827584469,
     "user": {
      "displayName": "Faris Izzatur Rahman",
      "userId": "03185179294135315816"
     },
     "user_tz": -420
    },
    "id": "XqBqHM4DTK45",
    "outputId": "b40c60a1-2918-4d42-d97f-67274b061043"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "GC% for my_seq_0 is 49.26\n",
      "GC% for my_seq_1 is 51.32\n",
      "GC% for my_seq_2 is 52.37\n",
      "GC% for my_seq_3 is 48.54\n",
      "GC% for my_seq_4 is 50.00\n",
      "GC% for my_seq_5 is 52.32\n",
      "GC% for my_seq_6 is 56.25\n",
      "GC% for my_seq_7 is 50.25\n",
      "GC% for my_seq_8 is 47.21\n",
      "GC% for my_seq_9 is 53.06\n"
     ]
    }
   ],
   "source": [
    "from Bio import SeqIO\n",
    "from Bio.SeqUtils import GC\n",
    "for rec in SeqIO.parse('my_sequences.fa', \"fasta\"):\n",
    "    print(\"GC% for {} is {:.2f}\".format(rec.id, GC(rec.seq)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "m7S1bJHCTK46"
   },
   "source": [
    "[back to the toc](#toc)\n",
    "\n",
    "## Broad examples of what we can do with Biopython <a id='1'></a>\n",
    "\n",
    "- Sequence analysis\n",
    "  - Motif\n",
    "  - Search: HMMs, Alignments\n",
    "  - Restriction\n",
    "- Structures\n",
    "  - SCOP\n",
    "  - PDB\n",
    "- Database query\n",
    "- Phylogeny\n",
    "- Pathway\n",
    "- And more ...\n",
    "\n",
    "\n",
    "[back to the toc](#toc)\n",
    "\n",
    "<br>\n",
    "\n",
    "## Main concept of Biopython <a id='2'></a>\n",
    "\n",
    "- I/O interface and parsing abilities for bioinformatics files/DBs\n",
    "  - Blast\n",
    "  - FASTA\n",
    "  - PubMed\n",
    "  - SwissProt\n",
    "- Efficient and practical data-structures for bioinformatics data\n",
    " - Sequences\n",
    " - Alignments\n",
    " - Structures\n",
    "- Methods implementing bioinformatics analysis\n",
    " - Translation\n",
    " - Classification\n",
    " - Phylogeny trees\n",
    "- Interface to common bioinformatics programs\n",
    "  - Standalone Blast\n",
    "  - ClustalW\n",
    "  - EMBOSS\n",
    "\n",
    "\n",
    "[back to the toc](#toc)\n",
    "\n",
    "<br>\n",
    "\n",
    "## Objects <a id='3'></a>\n",
    "\n",
    "In Biopython there are many modules and each module contains several major new data types. Objects created with these types serve specific purposes in the above mentioned functionalities. We will focus on sequence analysis; some new objects we will discover are:\n",
    "\n",
    "- `Seq`\n",
    "- `Alphabet`\n",
    "- `SeqRecord`\n",
    "- `SeqFeature`\n",
    "\n",
    "\n",
    "[back to the toc](#toc)\n",
    "\n",
    "<br>\n",
    "\n",
    "## Help - important! <a id='4'></a>\n",
    "\n",
    "- A relatively detailed tutorial: http://biopython.org/DIST/docs/tutorial/Tutorial.html\n",
    "- Help on certain concepts and modules: https://biopython.org/wiki/Category%3AWiki_Documentation\n",
    "\n",
    "In this notebook we will cover basics of the Biopython package:\n",
    "    - Seq object and alphabets\n",
    "    - Bio.SeqIO module and SeqRecord object\n",
    "    - Interacting with external DBs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "nRD5OiMpTK46"
   },
   "source": [
    "\n",
    "[back to the toc](#toc)\n",
    "\n",
    "<br>\n",
    "\n",
    "## `Seq` Object and (Alphabets) <a id='5'></a>\n",
    "\n",
    "Seq object is a flexible encapsulator for _sequence-like_ strings. It has two main components:\n",
    "- A Python string representing a _biological sequence_\n",
    "- An Alphabet object describing the _letters_ used in this sequence.\n",
    "\n",
    "The related module is __Bio.Seq__"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "id": "1_SvKvzoTK47"
   },
   "outputs": [],
   "source": [
    "from Bio.Seq import Seq"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "cnaX2E4dTK47"
   },
   "source": [
    "Let's create our first `Seq` object. The only required positional argument is _'data'_, which is used for the sequence string."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "id": "UkTbbKdwTK47"
   },
   "outputs": [],
   "source": [
    "my_seq = Seq('AGCGCGATTTATATATAGCGAGCGATTCGGAGCGATCGACGGATTCGAC')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "id": "gsxBzZlkTK47",
    "outputId": "76937128-14b9-4cee-9aa8-4a60a5d0d807"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Seq('AGCGCGATTTATATATAGCGAGCGATTCGGAGCGATCGACGGATTCGAC')"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "my_seq"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "nQ0o5DnSTK48"
   },
   "source": [
    "\n",
    "[back to the toc](#toc)\n",
    "\n",
    "<br>\n",
    "\n",
    "### Indexing and slicing of `Seq` <a id='6'></a>\n",
    "\n",
    "A `Seq` object is based on Python's `str` object, so most methods we can use with a `str` object, we can also use with a `Seq` object. Let's try to use indexing and slicing with our sequence:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "id": "qRim3jJPTK48",
    "outputId": "93eba63e-4d8f-4eff-c35b-1c1947ae6e9e"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "5th nucleotide is A\n"
     ]
    }
   ],
   "source": [
    "print(\"5th nucleotide is {}\".format(my_seq[6]))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "id": "645fk-k_TK48",
    "outputId": "a1eefb3c-3b1b-4306-c6b4-b6877e16cfb4"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(Seq('GATTTA'), Seq('AGCGAG'))"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "my_seq[5:11], my_seq[16:22]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "foAdy9VLTK48"
   },
   "source": [
    "> Note that the result of slicing is a **new** `Seq` object!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Xil5N83ITK48"
   },
   "source": [
    "It is important to make a difference between indexing and single-element slicing. Let's have a *closer* look at the results of these two operations:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "id": "htVvf3jnTK49",
    "outputId": "3394e168-0db6-4f2a-a3c9-25b13ce6356f"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'A'"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Using index\n",
    "my_seq[6]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "id": "qNu4eO5xTK49",
    "outputId": "0bb910c6-d587-4e0b-e283-ee574c12b997"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Seq('A')"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# And using slice\n",
    "my_seq[6:7]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "B5tNRYoWTK49"
   },
   "source": [
    "<div class=\"alert alert-block alert-success\">\n",
    "\n",
    "### Question 1.\n",
    "\n",
    "Can you make a new `Seq` object combining the nucleotides 6-11 and 17-22. *Hint:* try to use slices\n",
    "\n",
    "<div>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "wuZh9nHVTK49"
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "ZW7sdJ4PTK49"
   },
   "source": [
    "\n",
    "[back to the toc](#toc)\n",
    "\n",
    "<br>\n",
    "\n",
    "### Methods of `Seq` objects <a id='7'></a>\n",
    "\n",
    "As `Seq` objects inherits many properties from `str`, many of its methods are available to us to use on our sequences."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "id": "XkuWO9kjTK49",
    "outputId": "d778b6a3-e07b-4e89-f41d-75780b12a01c"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Seq('AGCGCGATTTATATATAGCGAGCGATTCGGAGCGATCGACGGATTCGAC')"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "my_seq"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "id": "Y-wXk9viTK4-",
    "outputId": "0015c359-2da2-433e-813b-2af074b6305a"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "my_seq.count('TAT') # counts the occurences of the input"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "id": "gHl6iafmTK4-",
    "outputId": "9fbcb007-6bc8-43c5-8319-426327f099e0"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Seq('agcgcgatttatatatagcgagcgattcggagcgatcgacggattcgac')"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "my_seq.lower()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "id": "lGfxpWpRTK4-",
    "outputId": "6533b177-8194-44d1-f015-7c1cb5cca9d2"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "9"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "my_seq.index('TAT') # gives the index (0-based) of the first occurence of the input"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "74_LnMA4TK4_"
   },
   "source": [
    "Notice that the main difference from calling these methods on a `str` object is that they return new `Seq` objects as a result:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "id": "BHOrr_fOTK4_",
    "outputId": "5b52fd87-ed56-47b5-b602-3a2420a78571"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[Seq('AGCGC'),\n",
       " Seq('TTATATATAGCGAGC'),\n",
       " Seq('TCGGAGC'),\n",
       " Seq('CGACG'),\n",
       " Seq('TCGAC')]"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "my_seq.split('GAT')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "AkmQCJQETK4_"
   },
   "source": [
    "Of course, being a specialized object, it **does not** support all `str` methods. And more importantly, it implements many **new** methods that relate to biological sequences. Let's explore some examples for both cases:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "id": "R9ZU8eyxTK5B",
    "outputId": "b58b8fee-e47d-4b41-896e-953fa6fcb4db",
    "scrolled": true
   },
   "outputs": [
    {
     "ename": "AttributeError",
     "evalue": "'Seq' object has no attribute 'isdigit'",
     "output_type": "error",
     "traceback": [
      "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[1;31mAttributeError\u001b[0m                            Traceback (most recent call last)",
      "Cell \u001b[1;32mIn[16], line 3\u001b[0m\n\u001b[0;32m      1\u001b[0m \u001b[38;5;66;03m# .isdigit() is one of the string methods that is not supported by Seq objects\u001b[39;00m\n\u001b[1;32m----> 3\u001b[0m my_seq\u001b[38;5;241m.\u001b[39misdigit()\n",
      "\u001b[1;31mAttributeError\u001b[0m: 'Seq' object has no attribute 'isdigit'"
     ]
    }
   ],
   "source": [
    "# .isdigit() is one of the string methods that is not supported by Seq objects\n",
    "\n",
    "my_seq.isdigit()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "id": "Weu6OIYHTK5B",
    "outputId": "9ca2063e-39c9-47ec-dc67-c930724cd83b"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "3"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# on the other hand, .count_overlap() is one the new methods that Seq implements\n",
    "\n",
    "my_seq.count_overlap('TAT')"
   ]
  },
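  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the difference between `.count()` and `.count_overlap()` concrete, here is a plain-Python sketch of overlapping counting on an ordinary string. The helper `count_overlapping` is a hypothetical name used only for this illustration; it is not part of Biopython and is not how `Seq.count_overlap()` is implemented.\n",
    "\n",
    "```python\n",
    "def count_overlapping(text, pattern):\n",
    "    # count every start position where pattern matches, overlaps included\n",
    "    count = 0\n",
    "    start = text.find(pattern)\n",
    "    while start != -1:\n",
    "        count += 1\n",
    "        # move only one position forward so overlapping matches are found too\n",
    "        start = text.find(pattern, start + 1)\n",
    "    return count\n",
    "\n",
    "\n",
    "text = 'ATTTATATATAG'\n",
    "non_overlapping = text.count('TAT')           # a match consumes its letters\n",
    "overlapping = count_overlapping(text, 'TAT')  # matches may share letters\n",
    "print(non_overlapping, overlapping)           # 2 3\n",
    "```\n",
    "\n",
    "The stretch `TATATAT` contains `TAT` at three start positions, but only two of them fit side by side without sharing a letter."
   ]
  },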
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "8VR0YUANTK5C"
   },
   "source": [
    "> To see all methods and their descriptions you can always visit the documentation pages or simply use the help function whenever needed: `help(Seq)`."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "jwnVpmqpTK5E"
   },
   "source": [
    "\n",
    "[back to the toc](#toc)\n",
    "\n",
    "<br>\n",
    "\n",
    "### 'Biological' methods of `Seq` objects <a id='10'></a>\n",
    "Let's now focus on `Seq` specific 'biological' methods:\n",
    "\n",
    "- complement()\n",
    "- reverse_complement()\n",
    "- transcribe()\n",
    "- back_transcribe()\n",
    "- translate()\n",
    "\n",
    "These methods do not *require* any other extra arguments and they return a new `Seq` object."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "ebbSMt3kTK5E"
   },
   "source": [
    "#### Complementing and reverse complementing sequences"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "executionInfo": {
     "elapsed": 405,
     "status": "ok",
     "timestamp": 1687848117692,
     "user": {
      "displayName": "Subair Beta",
      "userId": "08001855949161377870"
     },
     "user_tz": -480
    },
    "id": "Bva8TtRcTK5E",
    "outputId": "259fb1e5-d96f-4a8e-e17b-991f6534b069"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "intronless_dna: CACCTCTGGAGCGGACTTATTTACCAAGCATTGGAGGAATATCGTAGGTAAAAATGCCTATAGGATCCAAAGAGAGGCCAACATTTTTTGAAATTTTTAAGACACGCTGCAACAAAGCA\n",
      "\n",
      "compl_seq: GTGGAGACCTCGCCTGAATAAATGGTTCGTAACCTCCTTATAGCATCCATTTTTACGGATATCCTAGGTTTCTCTCCGGTTGTAAAAAACTTTAAAAATTCTGTGCGACGTTGTTTCGT\n",
      "\n",
      "rev_strand: TGCTTTGTTGCAGCGTGTCTTAAAAATTTCAAAAAATGTTGGCCTCTCTTTGGATCCTATAGGCATTTTTACCTACGATATTCCTCCAATGCTTGGTAAATAAGTCCGCTCCAGAGGTG\n",
      "alternatively: TGCTTTGTTGCAGCGTGTCTTAAAAATTTCAAAAAATGTTGGCCTCTCTTTGGATCCTATAGGCATTTTTACCTACGATATTCCTCCAATGCTTGGTAAATAAGTCCGCTCCAGAGGTG\n"
     ]
    }
   ],
   "source": [
    "from Bio.Seq import Seq\n",
    "\n",
    "intronless_dna = Seq(\n",
    "    'CACCTCTGGAGCGGACTTATTTACCAAGCATTGGAGGAATATCGTAGGTAAAAATGCCTA'\n",
    "    'TAGGATCCAAAGAGAGGCCAACATTTTTTGAAATTTTTAAGACACGCTGCAACAAAGCA')\n",
    "print(\"intronless_dna:\", intronless_dna)\n",
    "\n",
    "\n",
    "# If we need to find its complement sequence:\n",
    "\n",
    "compl_seq = intronless_dna.complement()\n",
    "print()\n",
    "print(\"compl_seq:\", compl_seq)\n",
    "\n",
    "# If we would need to find the reverse complement of our sequence, we can simply\n",
    "\n",
    "rev_strand = intronless_dna.reverse_complement()\n",
    "print()\n",
    "print(\"rev_strand:\", rev_strand)\n",
    "print(\"alternatively:\", compl_seq[::-1])"
   ]
  },
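  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Conceptually, complementing is a letter-by-letter lookup, and reverse complementing adds a reversal on top of it. The following plain-Python sketch shows the idea on an ordinary string. The names `COMPLEMENT`, `complement` and `reverse_complement` are defined here for illustration only, and the sketch handles just the four unambiguous letters A, C, G and T, whereas the `Seq` methods also deal with IUPAC ambiguity codes.\n",
    "\n",
    "```python\n",
    "# lookup table mapping each unambiguous DNA letter to its complement\n",
    "COMPLEMENT = str.maketrans('ACGTacgt', 'TGCAtgca')\n",
    "\n",
    "\n",
    "def complement(dna):\n",
    "    # replace every letter by its partner, keeping the original order\n",
    "    return dna.translate(COMPLEMENT)\n",
    "\n",
    "\n",
    "def reverse_complement(dna):\n",
    "    # complement first, then read the result backwards\n",
    "    return complement(dna)[::-1]\n",
    "\n",
    "\n",
    "dna = 'ATGCCG'\n",
    "print(complement(dna))          # TACGGC\n",
    "print(reverse_complement(dna))  # CGGCAT\n",
    "```\n",
    "\n",
    "This is the same relationship as between `compl_seq[::-1]` and `rev_strand` in the cell above."
   ]
  },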
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "9sWRqkaqTK5F"
   },
   "source": [
    "#### Transcription and reverse transcription of sequences"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "executionInfo": {
     "elapsed": 287,
     "status": "ok",
     "timestamp": 1687848123463,
     "user": {
      "displayName": "Subair Beta",
      "userId": "08001855949161377870"
     },
     "user_tz": -480
    },
    "id": "Z_9MF8eQTK5F",
    "outputId": "c299f91a-638b-494a-add0-31d9a6c095b6"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "RNA: CACCUCUGGAGCGGACUUAUUUACCAAGCAUUGGAGGAAUAUCGUAGGUAAAAAUGCCUAUAGGAUCCAAAGAGAGGCCAACAUUUUUUGAAAUUUUUAAGACACGCUGCAACAAAGCA\n"
     ]
    }
   ],
   "source": [
    "# Let's now transcribe this piece of DNA into RNA. This is an intronless DNA sequence, so\n",
    "\n",
    "rna_seq = intronless_dna.transcribe()\n",
    "print(\"RNA:\", rna_seq)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "executionInfo": {
     "elapsed": 496,
     "status": "ok",
     "timestamp": 1687848140283,
     "user": {
      "displayName": "Subair Beta",
      "userId": "08001855949161377870"
     },
     "user_tz": -480
    },
    "id": "jDMXm2UKTK5F",
    "outputId": "5376c8a8-4a9d-4ef1-e1c4-e3d9b74c498c"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Seq('CACCTCTGGAGCGGACTTATTTACCAAGCATTGGAGGAATATCGTAGGTAAAAA...GCA')"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# We can also reverse transcribe an RNA sequence to cDNA...\n",
    "cdna_seq = rna_seq.back_transcribe()\n",
    "cdna_seq"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Mr7eKIVJTK5G"
   },
   "source": [
    "#### Translation of sequences"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "PXXASWaMTK5G"
   },
   "source": [
    "Up to here, we have described an intronless DNA sequence, and 'transcribed' it into an RNA sequence, `rna_seq`. If we know where the CDS starts, we can also translate the CDS into a protein sequence. In this example, the start codon can be found at 53. position."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "id": "Tb_eB4lLTK5G",
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Seq('AUGCCUAUAGGAUCCAAAGAGAGGCCAACAUUUUUUGAAAUUUUUAAGACACGC...GCA')"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "cds_seq = rna_seq[53:]\n",
    "cds_seq"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "id": "LUaHCZCzTK5H"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Seq('MPIGSKERPTFFEIFKTRCNKA')"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "protein_seq = cds_seq.translate()\n",
    "protein_seq"
   ]
  },
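  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The CDS used above contains no stop codon. When a sequence does contain stop codons, `translate()` marks each of them with `*` by default, and the optional `to_stop=True` argument ends the translation at the first in-frame stop codon. A small sketch, assuming Biopython is installed; the toy sequence `toy_cds` is made up for this illustration:\n",
    "\n",
    "```python\n",
    "from Bio.Seq import Seq\n",
    "\n",
    "# toy coding sequence with an internal and a final stop codon (illustrative data)\n",
    "toy_cds = Seq('ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG')\n",
    "\n",
    "full_protein = toy_cds.translate()            # stop codons appear as '*'\n",
    "first_orf = toy_cds.translate(to_stop=True)   # stops before the first stop codon\n",
    "\n",
    "print(full_protein)  # MAIVMGR*KGAR*\n",
    "print(first_orf)     # MAIVMGR\n",
    "```\n",
    "\n",
    "Both calls use the standard genetic code, which is the default translation table."
   ]
  },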
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "5j6-kXdbTK5J"
   },
   "source": [
    "#### Where alphabets were useful"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Nus0KcRiTK5K"
   },
   "source": [
    "A potential source of error may come from the fact that two `Seq` objects may represent different types of biological sequences (protein and cDNA for instance).\n",
    "\n",
    "Indeed, nothing prevents us from erroneously applying `Seq` methods to types of sequences that would 'biologically' not support these actions. For example, if we try to reverse-complement a protein sequence:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {
    "id": "W1CE4zbKTK5K"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Seq('TMNGYAMFIEFFAPYEMSCIPK')"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "protein_seq.reverse_complement() # returns a meaningless sequence by trying to interpret AA as IUPAC nts."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "kZqI1W_0TK5M"
   },
   "source": [
    "Or, we can concatenate two incompatible `Seq` objects together:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {
    "id": "1MhdimX-TK5M"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "AUGCCUAUAGGAUCCAAAGAGAGGCCAACAUUUUUUGAAAUUUUUAAGACACGCUGCAACAAAGCAMPIGSKERPTFFEIFKTRCNKA\n"
     ]
    }
   ],
   "source": [
    "print( cds_seq + protein_seq )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "PhcGC_89TK5N"
   },
   "source": [
    "\n",
    "[back to the toc](#toc)\n",
    "\n",
    "<br>\n",
    "\n",
    "## Bio.SeqIO Module and SeqRecord Object <a id='11'></a>\n",
    "\n",
    "The `Bio.SeqIO` module provides a simple and uniform interface to read and parse from and write to various sequence file formats (including multiple sequence alignments). The `Bio.SeqIO` module supports a large number of sequence file formats, including *fasta*, *fastq*, and *genbank* and many more, which can be found [here](https://biopython.org/wiki/SeqIO). All sequences in this method will be accessed as `SeqRecord` objects.\n",
    "\n",
    "\n",
    "[back to the toc](#toc)\n",
    "\n",
    "<br>\n",
    "\n",
    "### Reading sequence records from files <a id='12'></a>\n",
    "\n",
    "The main function of the module is `.parse()`, which takes a file handle (or filename) and format name, and returns a `SeqRecord` iterator. An iterator is an object that can be iterated upon, meaning that you can traverse through all the values, one by one. In this case `Bio.SeqIO.parse()` method returns an iterator of `SeqRecord` objects extracted and parsed from the input file. We can then iterate over the records and process them in a very efficient manner."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 371
    },
    "executionInfo": {
     "elapsed": 295,
     "status": "error",
     "timestamp": 1687848292520,
     "user": {
      "displayName": "Subair Beta",
      "userId": "08001855949161377870"
     },
     "user_tz": -480
    },
    "id": "FfHw1FHvTK5N",
    "outputId": "4c743126-210b-43a6-91fb-72118ae9a9f7"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "CAX36647.1\n",
      "XP_009312554.1\n",
      "ADY59640.1\n",
      "AEE14926.1\n",
      "ADY55933.1\n",
      "AEV99900.1\n",
      "NP_001315822.1\n"
     ]
    }
   ],
   "source": [
    "# Let's parse an example fasta file and iterate over its records\n",
    "\n",
    "from Bio import SeqIO\n",
    "for record in SeqIO.parse(\"example.fa\", \"fasta\"):   # the for loop iterates over the SeqRecord objects\n",
    "    print(record.id)                                # each SeqRecord object contains an ID"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "E9aBHgzfmTif"
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "K8TmjFXfTK5O"
   },
   "source": [
    "Sometimes, it could be handier to supply a filehandle instead of the filename to `.parse()` method. This also permits the use of different input resources such as streams. We can re-examine the previous example:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {
    "id": "-taJ7Cw7TK5O"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "CAX36647.1\n",
      "XP_009312554.1\n",
      "ADY59640.1\n",
      "AEE14926.1\n",
      "ADY55933.1\n",
      "AEV99900.1\n",
      "NP_001315822.1\n"
     ]
    }
   ],
   "source": [
    "with open(\"example.fa\", \"r\") as filehandle:\n",
    "    for rec in SeqIO.parse(filehandle, \"fasta\"):\n",
    "        print(rec.id)"
   ]
  },
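  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As an example of a stream that is not a file on disk, `SeqIO.parse()` can read from an in-memory text buffer created with `io.StringIO`. A minimal sketch, assuming Biopython is installed; the two toy records and the variable names are made up for this illustration:\n",
    "\n",
    "```python\n",
    "from io import StringIO\n",
    "\n",
    "from Bio import SeqIO\n",
    "\n",
    "# FASTA-formatted text held in memory instead of in a file (illustrative data)\n",
    "fasta_text = '''>toy_1 first toy record\n",
    "ATGCGC\n",
    ">toy_2 second toy record\n",
    "GGGTTTAA\n",
    "'''\n",
    "\n",
    "handle = StringIO(fasta_text)  # behaves like an open text file\n",
    "ids_and_lengths = [(rec.id, len(rec.seq)) for rec in SeqIO.parse(handle, 'fasta')]\n",
    "print(ids_and_lengths)  # [('toy_1', 6), ('toy_2', 8)]\n",
    "```\n",
    "\n",
    "The same pattern applies to text that arrives from elsewhere, for instance data that has already been read into a string."
   ]
  },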
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "-mn7yEuFTK5O"
   },
   "source": [
    "The second argument of the `Bio.SeqIO.parse()` method defines the type of the input file and it is mandatory. A detailed list of possible values for this argument can be found on [the official documentation](https://biopython.org/wiki/SeqIO). Let's open another file, **example.gp**, that contains the same sequences as before but this time in GenBank format (full GenPept format to be more precise)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {
    "id": "F8PDEr-dTK5O"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "CAX36647.1\n",
      "XP_009312554.1\n",
      "ADY59640.1\n",
      "AEE14926.1\n",
      "ADY55933.1\n",
      "AEV99900.1\n",
      "NP_001315822.1\n"
     ]
    }
   ],
   "source": [
    "for rec in SeqIO.parse(\"example.gp\", \"genbank\"):\n",
    "    print(rec.id)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "HMs2daNMTK5P"
   },
   "source": [
    "\n",
    "[back to the toc](#toc)\n",
    "\n",
    "<br>\n",
    "\n",
    "### `SeqRecord` objects <a id='13'></a>\n",
    "We have mentioned that the results of 'Bio.SeqIO' parsing methods are `SeqRecord` objects. We have already seen that these objects contain an `.id` attribute. A `SeqRecord` object holds, beside the sequence (as a `Seq` object), identifiers (ID and name), description and optionally annotation and sub-features. The completeness of a record will depend on its source."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "HsgPHJrcTK5P"
   },
   "source": [
    "`SeqRecord` contains many attributes, some of which can be rather complex. We will further examine most commonly used ones:\n",
    "\n",
    "- seq\n",
    "- id\n",
    "- name\n",
    "- description\n",
    "- annotations\n",
    "- features"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {
    "id": "gldex4PtTK5P"
   },
   "outputs": [],
   "source": [
    "# Let's grab the last record in our example file to use in our following cells\n",
    "records = list(SeqIO.parse(\"example.gp\", \"genbank\"))\n",
    "rec = records[-1]\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {
    "id": "_toFW7tfTK5Q"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Seq('MSNTAGQVIRCRAAVAWEAGKPLVIEEVEVAPPQANEVRIKILFTSLCHTDVYF...MEE')"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# rec.seq is a `Seq` object containing the sequence\n",
    "rec.seq"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {
    "id": "rUEUo5JtTK5Q"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'NP_001315822.1'"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# rec.id is a string containing the sequence identifier\n",
    "rec.id"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {
    "id": "f8-X9FiRTK5R"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'NP_001315822'"
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# rec.name is a string containing the display name of the sequence\n",
    "# it may be identical to rec.id or quite different from it!\n",
    "rec.name"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {
    "id": "ZgIDWacpTK5R"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'alcohol dehydrogenase [Malus domestica]'"
      ]
     },
     "execution_count": 41,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# rec.description is also a simple string\n",
    "rec.description"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {
    "id": "oHcecAjyTK5R"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "record is from Malus domestica (apple)\n",
      "and its full taxonomy is ['Eukaryota', 'Viridiplantae', 'Streptophyta', 'Embryophyta', 'Tracheophyta', 'Spermatophyta', 'Magnoliopsida', 'eudicotyledons', 'Gunneridae', 'Pentapetalae', 'rosids', 'fabids', 'Rosales', 'Rosaceae', 'Amygdaloideae', 'Maleae', 'Malus']\n"
     ]
    }
   ],
   "source": [
    "# rec.annotations is a dictionary that contains different annotations about the sequence record\n",
    "# When created from a GenBank source, certain keys can be expected to be present, such as source and taxonomy\n",
    "print(\"record is from\", rec.annotations[\"source\"])\n",
    "print(\"and its full taxonomy is\", rec.annotations[\"taxonomy\"])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {
    "id": "h9r-XYuuTK5R"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "SeqFeature(SimpleLocation(ExactPosition(0), ExactPosition(380)), type='CDS', qualifiers=...)"
      ]
     },
     "execution_count": 43,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# rec.features is a list\n",
    "# It contains SeqFeature objects which we will discuss briefly later.\n",
    "# But in a nutshell, they hold information about various positional features identified on the sequence\n",
    "\n",
    "rec.features[-1]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "E-m0r4VsTK5R"
   },
   "source": [
    "Let's examine the records contained in the **example.gp** file more closely."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {
    "id": "ka1NLUJ2TK5S"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0 [CAX36647.1]\talcohol dehydrogenase\tlength 170\twith 4 features and 12 annotations\n",
      "1 [XP_009312554.1]\talcohol dehydrogenase\tlength 392\twith 7 features and 13 annotations\n",
      "2 [ADY59640.1]\tAlcohol dehydrogenase\tlength 436\twith 6 features and 13 annotations\n",
      "3 [AEE14926.1]\tAlcohol dehydrogenase\tlength 389\twith 7 features and 13 annotations\n",
      "4 [ADY55933.1]\tAlcohol dehydrogenase\tlength 392\twith 6 features and 13 annotations\n",
      "5 [AEV99900.1]\tAlcohol dehydrogenase\tlength 382\twith 7 features and 13 annotations\n",
      "6 [NP_001315822.1]\talcohol dehydrogenase\tlength 380\twith 8 features and 13 annotations\n"
     ]
    }
   ],
   "source": [
    "for index, rec in enumerate(SeqIO.parse(\"example.gp\", \"genbank\")):\n",
    "    print(\"{} [{}]\\t{}\\tlength {}\\twith {} features and {} annotations\".format(\n",
    "        index, rec.id, rec.description[:21], len(rec.seq), len(rec.features), len(rec.annotations)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "phIgDJv9TK5S"
   },
   "source": [
    "#### Question 4.\n",
    "\n",
    "Similarly, parse the **example.fa** file and compare the resulting `SeqRecord` objects with those from **example.gp** in the cell above. Which attributes are identical? What are the differences? Why?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "ybN67DUjTK5S"
   },
   "outputs": [],
   "source": [
    "# You can use this cell for Question 4.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "iIOvOiV1TK5S"
   },
   "source": [
    "We can also create a `SeqRecord` directly. First we need to import the class from its module, `Bio.SeqRecord`. To create its sequence, we also need `Bio.Seq.Seq` or a similar `Seq` class."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {
    "id": "Z-GLgR7DTK5S"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "ID: my_seq_ID\n",
      "Name: made-up sequence\n",
      "Description: Just some randomly typed ATGCs\n",
      "Number of features: 0\n",
      "Seq('ACGGCTATCTGAGGACTACGAGCATCATCGA')\n"
     ]
    }
   ],
   "source": [
    "from Bio.Seq import Seq\n",
    "from Bio.SeqRecord import SeqRecord\n",
    "\n",
    "record = SeqRecord(Seq(\"ACGGCTATCTGAGGACTACGAGCATCATCGA\"),\n",
    "                   id=\"my_seq_ID\", name=\"made-up sequence\",\n",
    "                   description=\"Just some randomly typed ATGCs\")\n",
    "print(record)"
   ]
  },
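  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A hand-made `SeqRecord` starts out with no annotations and no features, but both can be added afterwards. The cell below is only a minimal sketch: the record, its ID, the `molecule_type` value and the `misc_feature` location are all made up for illustration, and it assumes a recent Biopython release in which `SimpleLocation` is available (as in the output shown earlier)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from Bio.Seq import Seq\n",
    "from Bio.SeqFeature import SeqFeature, SimpleLocation\n",
    "from Bio.SeqRecord import SeqRecord\n",
    "\n",
    "# Hypothetical record, made up for this illustration\n",
    "demo_record = SeqRecord(Seq(\"ATGGCTATCTGAGGACTACGAGCATCATCGA\"),\n",
    "                        id=\"demo_ID\", name=\"demo\",\n",
    "                        description=\"A made-up sequence\")\n",
    "\n",
    "# annotations is a plain dictionary: we can add our own keys\n",
    "demo_record.annotations[\"molecule_type\"] = \"DNA\"\n",
    "\n",
    "# features is a plain list of SeqFeature objects\n",
    "# Here we mark the first 12 bases (0-based, end-exclusive) as a made-up feature\n",
    "demo_feature = SeqFeature(SimpleLocation(0, 12), type=\"misc_feature\")\n",
    "demo_record.features.append(demo_feature)\n",
    "\n",
    "print(len(demo_record.features), \"feature and\", len(demo_record.annotations), \"annotation\")\n",
    "# A feature can extract its own sub-sequence from the parent sequence\n",
    "print(demo_feature.extract(demo_record.seq))"
   ]
  },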
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "EiINUVx9TK5X"
   },
   "source": [
    "\n",
    "[back to the toc](#toc)\n",
    "\n",
    "<br>\n",
    "\n",
    "### Writing to sequence files <a id='16'></a>\n",
    "So far we have seen how to parse sequence files and work with sequence records. In many applications, we also need to write our processed sequence records back to a standard sequence file. For this purpose, the `Bio.SeqIO` module has a `.write()` function. It can be thought of as the reverse of the `.parse()` function we learnt above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {
    "id": "ytkJSmk7TK5X"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "10"
      ]
     },
     "execution_count": 46,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Let's create 10 random sequences each around 400 bases long. For this we can use the 'random' module\n",
    "\n",
    "import random\n",
    "\n",
    "records = []\n",
    "for i in range(10):\n",
    "    random_atgc = random.choices(\"ATGC\", k=random.randint(380, 420))\n",
    "    record = SeqRecord(Seq(\"\".join(random_atgc)),\n",
    "                       id=\"my_seq_{}\".format(i),\n",
    "                       name=\"random sequence {}\".format(i),\n",
    "                       description=\"A randomly generated sequence\")\n",
    "    records.append(record)\n",
    "SeqIO.write(records, \"random_sequences.fa\", \"fasta\")"
   ]
  },
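  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To see what actually ends up in a FASTA file, we can write to an in-memory handle instead of a file and parse the text back. This is only a sketch: the two records below are made up, and `io.StringIO` stands in for a real file. Note that the FASTA format only stores a title line and the sequence, so the `name` we set is not preserved when the records are read back."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from io import StringIO\n",
    "\n",
    "from Bio import SeqIO\n",
    "from Bio.Seq import Seq\n",
    "from Bio.SeqRecord import SeqRecord\n",
    "\n",
    "# Two made-up records, for illustration only\n",
    "demo_records = [\n",
    "    SeqRecord(Seq(\"ATGCATGC\"), id=\"demo_0\", name=\"first\", description=\"made-up sequence\"),\n",
    "    SeqRecord(Seq(\"GGCCTTAA\"), id=\"demo_1\", name=\"second\", description=\"made-up sequence\"),\n",
    "]\n",
    "\n",
    "# SeqIO.write() accepts an open handle as well as a file name\n",
    "# and returns the number of records written\n",
    "handle = StringIO()\n",
    "n_written = SeqIO.write(demo_records, handle, \"fasta\")\n",
    "print(handle.getvalue())\n",
    "\n",
    "# Rewind the handle and parse the text back\n",
    "handle.seek(0)\n",
    "for rec_back in SeqIO.parse(handle, \"fasta\"):\n",
    "    print(rec_back.id, \"|\", rec_back.name, \"|\", rec_back.description)"
   ]
  },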
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "j2OjQvbwTK5X"
   },
   "source": [
    "\n",
    "[back to the toc](#toc)\n",
    "\n",
    "<br>\n",
    "\n",
    "## Accessing Online Databases <a id='17'></a>\n",
    "\n",
    "Biopython provides several ways to interact with commonly used bioinformatics databases over the internet. We will cover the basics of accessing online databases using two examples:\n",
    "\n",
    "1. NCBI's Entrez\n",
    "2. UniProt/SwissProt's ExPASy"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "pGVtk9EpTK5X"
   },
   "source": [
    "Entrez can be queried using the `Bio.Entrez` module. It offers many different functions, but today we will only cover `.efetch()`, which fetches records from Entrez over the internet.\n",
    "\n",
    "Let's discover this functionality with an example."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {
    "id": "ccvDodOkTK5X"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "NP_001315822.1 with 7 features\n"
     ]
    }
   ],
   "source": [
    "from Bio import Entrez\n",
    "from Bio import SeqIO\n",
    "\n",
    "Entrez.email = \"my.address@email.ch\"  # This is obligatory. You should use your own address!\n",
    "\n",
    "with Entrez.efetch( db=\"protein\",                     # which Entrez database to use: nucleotide, protein, ...\n",
    "                    rettype=\"gp\",                     # return-type: fasta, gb (genbank), gp (genpept)\n",
    "                    retmode=\"text\",                   # return-mode: text or xml\n",
    "                    id=\"NP_001315822.1\" ) as handle:  # the accession ID of the record we would like to fetch\n",
    "    # Once we have a handle, we know the rest from SeqIO...\n",
    "    seq_record = SeqIO.read(handle, \"gb\")\n",
    "\n",
    "print(\"{} with {} features\".format(seq_record.id, len(seq_record.features)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {
    "id": "OetsVireTK5Y"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "NP_001315822.1 with 0 features\n"
     ]
    }
   ],
   "source": [
    "with Entrez.efetch( db=\"protein\",\n",
    "                    rettype=\"fasta\",                 # if the return-type is set to 'fasta'\n",
    "                    retmode=\"text\",\n",
    "                    id=\"NP_001315822.1\" ) as handle:\n",
    "    seq_record = SeqIO.read(handle, \"fasta\") # we can only parse it as fasta\n",
    "\n",
    "print(\"{} with {} features\".format(seq_record.id, len(seq_record.features)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "IzHh_knpTK5Y"
   },
   "source": [
    "The `Bio.ExPASy` module provides functions to interact with the UniProt/SwissProt database."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {
    "id": "3-tM3TY_TK5Y"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Retrieved P48977: RecName: Full=Alcohol dehydrogenase; EC=1.1.1.1 {ECO:0000250|UniProtKB:P06525};\n"
     ]
    }
   ],
   "source": [
    "from Bio import ExPASy\n",
    "from Bio import SeqIO\n",
    "\n",
    "with ExPASy.get_sprot_raw(\"P48977\") as handle:   # a simple .get_sprot_raw() function fetches records\n",
    "    record = SeqIO.read(handle, \"swiss\")         # which we can then parse with SeqIO\n",
    "print(\"Retrieved {}: {}\".format(record.id, record.description))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "7QpU7m5iTK5Y"
   },
   "source": [
    "[back to the toc](#toc)\n",
    "\n",
    "<br>\n",
    "\n",
    "# Additional Theory <a id='19'></a>\n",
    "-----------------------------\n",
    "\n",
    "The Biopython package is huge! In this section, we have included some more information on basic functionality that is not needed for the exercises."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "OOyw-w_FTK5Y"
   },
   "source": [
    "\n",
    "[back to the toc](#toc)\n",
    "\n",
    "\n",
    "### On `SeqRecord` iterators and processing large sequence files with `SeqIO` <a id='20'></a>\n",
    "\n",
    "We have learnt how to iterate over the sequence records contained in an input file using the `.parse()` function of `SeqIO`. Iterators come with some important limitations. Below, we will explore some of these limitations and see how to overcome them at the cost of higher memory usage. Finally, we will look at an alternative for files that are too large to hold in memory."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "metadata": {
    "id": "DCOd6YgBTK5Z"
   },
   "outputs": [],
   "source": [
    "from Bio import SeqIO"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {
    "id": "o6j2l001TK5Z"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "First record is CAX36647.1\n",
      "first use MLRAT\n",
      "first use MAVAL\n",
      "first use MSKDK\n",
      "first use MKFNY\n",
      "first use MQDLL\n",
      "first use MSNTA\n",
      "We have reached the end of the iterator\n"
     ]
    }
   ],
   "source": [
    "# A word about iterators. Once we have iterated over all values, an iterator is exhausted.\n",
    "# That means we cannot access its contents any longer.\n",
    "\n",
    "# Let's create an iterator of sequence records:\n",
    "records = SeqIO.parse(\"example.fa\", \"fasta\")\n",
    "\n",
    "# We can access the next item (in this case, the first one) via the built-in next() function:\n",
    "print(\"First record is\", next(records).id)\n",
    "\n",
    "# Now, let's iterate over all remaining values with a for loop:\n",
    "for record in records:\n",
    "    print(\"first use\", record.seq[:5])\n",
    "print(\"We have reached the end of the iterator\")\n",
    "\n",
    "# At this point, we have reached the end of the iterator. We cannot use it again:\n",
    "for record in records:\n",
    "    print(\"second use\", record.seq[:5])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "XNJCnF39TK5Z"
   },
   "source": [
    "Therefore, it is sometimes useful to convert the iterator into a list or a dictionary, provided the data is small enough to fit into memory. An iterator can be converted into a list simply by passing it to the `list()` function. For conversion into a dictionary, `Bio.SeqIO` provides a dedicated function called `.to_dict()`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {
    "id": "wStnhAvMTK5Z"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "first use MNISA\n",
      "first use MLRAT\n",
      "first use MAVAL\n",
      "first use MSKDK\n",
      "first use MKFNY\n",
      "first use MQDLL\n",
      "first use MSNTA\n",
      "second use MNISA\n",
      "second use MLRAT\n",
      "second use MAVAL\n",
      "second use MSKDK\n",
      "second use MKFNY\n",
      "second use MQDLL\n",
      "second use MSNTA\n"
     ]
    }
   ],
   "source": [
    "# Converting the sequence record iterator into a list\n",
    "\n",
    "records = list(SeqIO.parse(\"example.fa\", \"fasta\"))\n",
    "# Now, let's loop over all values with for:\n",
    "for record in records:\n",
    "    print(\"first use\", record.seq[:5])\n",
    "# And again:\n",
    "for record in records:\n",
    "    print(\"second use\", record.seq[:5])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "metadata": {
    "id": "o0u8lFCJTK5Z"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Listing all items:\n",
      "CAX36647.1 MNISA\n",
      "XP_009312554.1 MLRAT\n",
      "ADY59640.1 MAVAL\n",
      "AEE14926.1 MSKDK\n",
      "ADY55933.1 MKFNY\n",
      "AEV99900.1 MQDLL\n",
      "NP_001315822.1 MSNTA\n",
      "\n",
      "Accessing a specific record: ADY55933.1\n",
      "ADY55933.1 MKFNY\n"
     ]
    }
   ],
   "source": [
    "# Converting the sequence record iterator into a dictionary\n",
    "\n",
    "records = SeqIO.parse(\"example.fa\", \"fasta\")\n",
    "rec_dict = SeqIO.to_dict(records)\n",
    "print(\"Listing all items:\")\n",
    "for rec_id, rec in rec_dict.items():\n",
    "    print(rec_id, rec.seq[:5])\n",
    "print()\n",
    "\n",
    "specific_rec_id = \"ADY55933.1\"\n",
    "print(\"Accessing a specific record:\", specific_rec_id)\n",
    "print(specific_rec_id, rec_dict[specific_rec_id].seq[:5])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "q50R2NpLTK5a"
   },
   "source": [
    "For larger files, it may not be possible to hold everything in memory, so `Bio.SeqIO.to_dict()` is not suitable. In these situations we can use the `Bio.SeqIO.index()` function. It does not populate a dictionary; instead, it indexes the input file so that we can access records in an arbitrary order. It is intended for very large files; for small files it will be slower than `.to_dict()`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "metadata": {
    "id": "PYyyejAyTK5a"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "ADY55933.1 MKFNY\n"
     ]
    }
   ],
   "source": [
    "rec_dict = SeqIO.index(\"example.fa\", \"fasta\")\n",
    "print(specific_rec_id, rec_dict[specific_rec_id].seq[:5])"
   ]
  },
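  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The object returned by `Bio.SeqIO.index()` behaves like a read-only dictionary: it supports `len()`, membership tests and key lookup, and a record is only parsed when it is requested. The sketch below writes two made-up records to a temporary file so that it does not depend on **example.fa**; the file name `demo_index.fa` and the record IDs are invented for illustration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import tempfile\n",
    "\n",
    "from Bio import SeqIO\n",
    "from Bio.Seq import Seq\n",
    "from Bio.SeqRecord import SeqRecord\n",
    "\n",
    "# Write two made-up records to a temporary FASTA file\n",
    "tmp_dir = tempfile.mkdtemp()\n",
    "demo_path = os.path.join(tmp_dir, \"demo_index.fa\")\n",
    "demo_records = [SeqRecord(Seq(\"MKFNYAA\"), id=\"demo_A\", description=\"\"),\n",
    "                SeqRecord(Seq(\"MSNTAGQ\"), id=\"demo_B\", description=\"\")]\n",
    "SeqIO.write(demo_records, demo_path, \"fasta\")\n",
    "\n",
    "# Index the file: only the record positions are stored, not the records themselves\n",
    "demo_index = SeqIO.index(demo_path, \"fasta\")\n",
    "print(len(demo_index), \"records indexed\")\n",
    "print(\"demo_B\" in demo_index)\n",
    "print(demo_index[\"demo_B\"].seq)\n",
    "\n",
    "# The index keeps the file open, so we close it when we are done\n",
    "demo_index.close()"
   ]
  },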
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "VLcVYs_YTK5b"
   },
   "source": [
    "We have learnt that when we slice a `SeqRecord` that contains `SeqFeature`s, only those features that fall completely within the slice are kept. Their positions are then offset relative to the new, sliced sequence."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 57,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 389
    },
    "executionInfo": {
     "elapsed": 419,
     "status": "error",
     "timestamp": 1687848748238,
     "user": {
      "displayName": "Subair Beta",
      "userId": "08001855949161377870"
     },
     "user_tz": -480
    },
    "id": "03A9seduTK5b",
    "outputId": "7369631e-7236-44c6-cc03-18df4a9ec8d0"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "ID: NP_001315822.1\n",
      "Name: NP_001315822\n",
      "Description: alcohol dehydrogenase [Malus domestica]\n",
      "Number of features: 1\n",
      "/molecule_type=protein\n",
      "Seq('CHTDVYFWEAKGQNPLFPRIYGHEAGGIVESVGEGVTDLKAGDHVLPVFTGECK...LSC')\n",
      "\n",
      "The feature is of 'Site' type\n",
      "It is from 0 to 131\n",
      "Its location is order{[0:1], [2:3], [22:23], [130:131]}\n",
      "Its qualifiers are: {'site_type': ['other'], 'note': ['catalytic Zn binding site [ion binding]'], 'db_xref': ['CDD:176261']}\n",
      "\n",
      "ID: NP_001315822.1\n",
      "Name: NP_001315822\n",
      "Description: alcohol dehydrogenase [Malus domestica]\n",
      "Number of features: 0\n",
      "/molecule_type=protein\n",
      "Seq('APPQANEVRIKILFTSLCHTDVYFWEAKGQNPLFPRIYGHEAGGIVESVGEGVT...GEC')\n"
     ]
    }
   ],
   "source": [
    "# Let's grab the last record in our example file\n",
    "records = list(SeqIO.parse(\"example.gp\", \"genbank\"))\n",
    "last_record = records[-1]\n",
    "\n",
    "# We have seen above that the zinc binding site is from 47 to 178 (0-based)\n",
    "zinc_binding = last_record[47:178]\n",
    "incomplete_overlap = last_record[30:100]\n",
    "print(zinc_binding)\n",
    "print()\n",
    "\n",
    "# Let's have a closer look at the single feature included within the sliced portion\n",
    "feat = zinc_binding.features[0]\n",
    "print(\"The feature is of '{}' type\".format(feat.type))\n",
    "print(\"It is from {} to {}\".format(feat.location.start, feat.location.end))\n",
    "print(\"Its location is {}\".format(feat.location)) # We have the positions of 4 residues that make contact with Zn\n",
    "print(\"Its qualifiers are: {}\".format(feat.qualifiers))\n",
    "print()\n",
    "\n",
    "# On the other hand, the incompletely overlapping part (look at the features)\n",
    "print(incomplete_overlap)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Phylogenetic Analysis "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 60,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAjMAAAGwCAYAAABcnuQpAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjcuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/bCgiHAAAACXBIWXMAAA9hAAAPYQGoP6dpAABf90lEQVR4nO3de1hVVf4/8PcBuR7kIgkckIBU5CqctJCMIU0LL02OjprgiA5SOThi6NcydDQhGQu7OKQREhhem5JmdBzNrwKOJt6CkcRvKgJ5QalGwXNUQFi/P/xxxhOIHDyHw4b363nW87j3Xnuvzzq4OR/W2heZEEKAiIiISKJMjB0AERER0cNgMkNERESSxmSGiIiIJI3JDBEREUkakxkiIiKSNCYzREREJGlMZoiIiEjSehk7AENramrC5cuX0bt3b8hkMmOHQ0RERO0ghMCNGzfg6uoKE5O2x166fTJz+fJluLu7GzsMIiIi6oALFy6gX79+bdbp9slM7969Adz9MGxtbY0cDREREbVHbW0t3N3dNd/jben2yUzz1JKtrS2TGSIiIolpzyUivACYiIiIJI3JDBEREUkakxkiIiKSNCYzREREJGlMZoiIiEjSmMwQERGRpDGZISIiIkljMkNERESSxmSGiIiIJI3JDBEREUkakxkiIiKSNCYzREREJGlMZoiIiEjSmMwQERGRpDGZISIiIkljMkNERESSxmSGiIiIJI3JDBEREUkakxkiIiKSNCYzREREJGlMZoiIiDogPT0dQUFBkMvlsLe3h1KpxKpVq/TaRnZ2NmQyWaulurpar21JWS9jB0BERCQ1mZmZSEhIwJo1axAeHo66ujqcPHkSpaWlem1n6tSpiIiI0Fo3c+ZM3L59G05OTnptS8o4MkNERKSjHTt2YMqUKYiJicGAAQPg7++PadOmISkpSateVlYWfH19YWlpCR8fH6xdu1Zr+9GjR6FUKmFpaYmhQ4ciNzcXMpkMxcXFAAArKyu4uLhoiqmpKfbv34+YmJjO6qokcGSGOo1arTZ2CEREOpPL5S3Wubi4oKCgAJWVlfDw8Gh1v4yMDCxbtgxpaWlQKpUoKipCbGws5HI5oqOjoVarMX78eIwcORIbN25EeXk54uPj24zls88+g7W1NX7729/qpW/dhujmampqBABRU1Nj7FB6PAAsLCwskiutuXz5shg2bJgAILy9vUV0dLTYtm2baGxs1NRxd3cXmzdv1tovKSlJhIaGCiGESE9PF3369BFqtVqzfd26dQKAKCoqarVdPz8/MWfOnIf8bSwNunx/c5qJDE6tVkMmkxk7DCIivVEoFDh8+DBKSkowb948NDQ0IDo6GhEREWhqasKPP/6ICxcuICYmBjY2NpqSnJyMsrIyAMDp06cRFBQEa2trzXFDQ0Pv2+bhw4dRWlrKKaZWcJqJOtXVq1dbHbIlIpKigIAABAQEIC4uDgcPHkRYWBgKCgrg5+cH4O5UU0hIiNY+pqamAAAhhE5trV+/HsHBwRgyZIh+gu9GmMxQp5LL5UxmiKhbak5g1Go1nJ2d4ebmhvPnzyMqKuq+9XNycnDr1i1YWVkBAAoLC1utq1Kp8PnnnyMlJcUwwUsckxkiIiIdzZkzB66urhg5ciT69euHqqoqJCcno2/fvpqpouXLl2PevHmwtbXFmDFjUFdXh+PHj+PatWtISEhAZGQkEhMTERMTgyVLlqCiogKpqamttrdt2zbcuXPnvolRT8drZoiIiHQ0atQoFBYWYvLkyfD29sakSZNgaWmJffv2wdHREQAwe/ZsrF+/HtnZ2QgMDER4eDiys7Ph5eUFALCxscGOHTtQWloKpVKJxMTE+z50LzMzExMnToSDg0On9VFKZELXSTuJqa2thZ2dHWpqamBra2vscHoktVoNGxsbAHeHSjnNRETUuoqKCnh5eaGoqAjBwcHGDseodPn+5sgMERERSRqTGSIiIpI0XgBM
RETURXh6eup8yzZxZIaIiIgkjskMERERSRqTGSIiIpI0JjNkVPn5+ZDJZLh+/bqxQwEAbN++HUOHDoW9vT3kcjmCg4ORk5Nj7LCIiKgNvACYCEBjYyNkMhn69OmDxMRE+Pj4wNzcHDt37sSsWbPg5OSE559/3thhEhFRKzgyQ11KdnY27O3tsWfPHvj6+sLGxgYRERGoqqrS1Jk5cyYmTJiA1NRUKBQKODo6Ii4uDg0NDZo69fX1WLRoEdzc3CCXyxESEoL8/PwW7ezcuRN+fn6wsLBAZWUlnnnmGfzmN7+Br68v+vfvj/j4eAwePBgHDx7szI+BiIh0wJEZui+1Wm2U49y8eROpqanIycmBiYkJpk+fjoULF2LTpk2aOnl5eVAoFMjLy8O5c+cwdepUBAcHIzY2FgAwa9YsVFRUYOvWrXB1dUVubi4iIiJQUlKCgQMHatpJSUnB+vXr4ejoCCcnJ604hBDYv38/vv/++/s+YpyIiIyPyQzdV/MrCDpbQ0MDPv74Y/Tv3x8AMHfuXKxYsUKrjoODA9LS0mBqagofHx+MGzcO+/btQ2xsLMrKyrBlyxZcvHgRrq6uAICFCxdi9+7dyMrKwsqVKzXtrF27FkFBQVrHrqmpgZubG+rq6mBqaoq1a9di9OjRndBzIiLqCCYz1MK971IyBmtra00iAwAKhQLV1dVadfz9/WFqaqpVp6SkBADw7bffQggBb29vrX3q6uo0L4ADAHNzcwwePLhF+71790ZxcTFUKhX27duHhIQEPPbYY3jmmWf00T0iItIzJjPUpqtXrz70iyHVajWcnZ3bXd/MzExrWSaTtXgiZmt1mpqaAABNTU0wNTXFiRMntBIeQHu0ycrKCjKZrEX7JiYmGDBgAAAgODgYp0+fRkpKCpMZIqIuiskMtUkul0vuLddKpRKNjY2orq5GWFjYQx9PCIG6ujo9REZERIbAZIa6HW9vb0RFRWHGjBlYvXo1lEolfvrpJ+zfvx+BgYEYO3bsffdNSUnB0KFD0b9/f9TX12PXrl347LPPsG7duk7sARER6YLJDHVLWVlZSE5OxoIFC3Dp0iU4OjoiNDS0zUQGuDsl9oc//AEXL16ElZUVfHx8sHHjRkydOrWTIiciIl3JRDd/PWdtbS3s7OxQU1MDW1tbY4cjCfdeAKxSqfRyzYw+j0dERN2fLt/ffGgeERERSRqTGSIiIpI0JjNEREQkaUxmiIiISNKYzBAREZGkMZkhIiIiSWMyQ0RERJLGZIaIiKgD0tPTERQUBLlcDnt7eyiVSqxatcogbWVnZ2Pw4MGwtLSEi4sL5s6da5B2pKpLPwE4JSUF27dvx//93//BysoKTz31FFatWoVBgwYZOzQiIurBMjMzkZCQgDVr1iA8PBx1dXU4efIkSktL9d7We++9h9WrV+Pdd99FSEgIbt++jfPnz+u9HSnr0iMzBQUFiIuLQ2FhIfbu3Ys7d+7gueeeg1qtNnZoRETUg+3YsQNTpkxBTEwMBgwYAH9/f0ybNg1JSUla9bKysuDr6wtLS0v4+Phg7dq1WtuPHj0KpVIJS0tLDB06FLm5uZDJZCguLgYAXLt2DUuWLMFnn32GyMhI9O/fH/7+/njhhRc6q6uS0KVHZnbv3q21nJWVBScnJ5w4cQK/+tWvjBQVPQwmokQkNa29gsXFxQUFBQWorKyEh4dHq/tlZGRg2bJlSEtLg1KpRFFREWJjYyGXyxEdHQ21Wo3x48dj5MiR2LhxI8rLyxEfH691jL1796KpqQmXLl2Cr68vbty4gaeeegqrV6+Gu7u7QforSUJCzp49KwCIkpKS+9a5ffu2qKmp0ZQLFy4IAKKmpqYTI5U2lUolAAgAQqVS6fV4LCwsLFIrrbl8+bIYNmyYACC8vb1FdHS02LZtm2hsbNTUcXd3F5s3b9baLykpSYSGhgohhEhPTxd9+vQRarVas33dunUCgCgqKhJCCJGSkiLMzMzEoEGDxO7du8Xh
w4fFs88+KwYNGiTq6uoe+vdzV1ZTUyOA9n1/SyaZaWpqEi+88IJ4+umn26y3bNmyVv8zMplpP30nM82M/QuJhYWFpSOlLSUlJSItLU1ERkYKS0tLMXr0aNHY2Ciqq6sFAGFlZSXkcrmmWFhYCCcnJyGEEPPnzxcjRozQOl5xcbEA/pvMvP322wKA2LNnj6ZOdXW1MDExEbt379bb7+euSJdkpktPM91r7ty5OHnyJA4ePNhmvcWLFyMhIUGzXFtby6G4LkKlUhk7BCIivQoICEBAQADi4uJw8OBBhIWFoaCgAH5+fgDuTjWFhIRo7WNqagoAEEI88PgKhQIANMcDgL59++KRRx7BDz/8oK9uSJ4kkpk//vGP+Pvf/44DBw6gX79+bda1sLCAhYVFJ0VGumht3pmIqLtoTjjUajWcnZ3h5uaG8+fPIyoq6r71c3JycOvWLVhZWQEACgsLteoMHz4cAPD9999rvv/+85//4KeffrrvtTo9UZdOZoQQ+OMf/4jc3Fzk5+fDy8vL2CERERFhzpw5cHV1xciRI9GvXz9UVVUhOTkZffv2RWhoKABg+fLlmDdvHmxtbTFmzBjU1dXh+PHjuHbtGhISEhAZGYnExETExMRgyZIlqKioQGpqqlY73t7eePHFFxEfH49PPvkEtra2WLx4MXx8fDBixAhjdL1L6tK3ZsfFxWHjxo3YvHkzevfujStXruDKlSu4deuWsUMjIqIebNSoUSgsLMTkyZPh7e2NSZMmwdLSEvv27YOjoyMAYPbs2Vi/fj2ys7MRGBiI8PBwZGdna/4wt7GxwY4dO1BaWgqlUonExMRWH7r32WefISQkBOPGjUN4eDjMzMywe/dumJmZdWqfuzKZaM+knZHIZLJW12dlZWHmzJntOkZtbS3s7OxQU1MDW1tbPUbXfanVatjY2AC4e50Lp4eIiDpHRUUFvLy8UFRUhODgYGOHY1S6fH93+WkmIiIiorZ06WkmIiIiogfp0iMzREREPYmnpydnJTqAIzNEREQkaUxmiIiISNKYzBAREZGkMZkhneTn50Mmk+H69evGDkXj+vXriIuLg0KhgKWlJXx9fbFr1y5jh0VERJ2EFwCTJDU2NkImk+HOnTsYPXo0nJyc8MUXX6Bfv364cOECevfubewQiYiok3Bkhh5KdnY27O3tsWfPHvj6+sLGxgYRERGoqqrS1Jk5cyYmTJiA1NRUKBQKODo6Ii4uDg0NDZo69fX1WLRoEdzc3CCXyxESEoL8/PwW7ezcuRN+fn6wsLBAZWUlPv30U/znP//BV199heHDh8PDwwNPP/00goKCOvNjICIiI+LITBekVqsl1f7NmzeRmpqKnJwcmJiYYPr06Vi4cCE2bdqkqZOXlweFQoG8vDycO3cOU6dORXBwMGJjYwEAs2bNQkVFBbZu3QpXV1fk5uYiIiICJSUlGDhwoKadlJQUrF+/Ho6OjnBycsLf//53hIaGIi4uDn/729/Qt29fREZG4vXXX9e8mZaIiLo3JjNdUPOrBKSioaEBH3/8Mfr37w8AmDt3LlasWKFVx8HBAWlpaTA1NYWPjw/GjRuHffv2ITY2FmVlZdiyZQsuXrwIV1dXAMDChQuxe/duZGVlYeXKlZp21q5dqzXqcv78eezfvx9RUVHYtWsXzp49i7i4ONy5cwd/+tOfOukTICIiY2Iy04Xc+04kKbG2ttYkMgCgUChQXV2tVcff319rpEShUKCkpAQA8O2330IIAW9vb6196urqNC9sAwBzc3MMHjxYq05TUxOcnJzwySefwNTUFEOGDMHly5fx7rvvMpkhIuohmMx0UVevXjXaCx7VajWcnZ3bXf+Xb26VyWQtnmDZWp2mpiYAdxMSU1NTnDhxosXU0L3JnZWVVYuXjyoUCpiZmWnt5+vriytXrqC+vh7m5ubt7gcREUkTk5kuSi6X95i3VSuVSjQ2NqK6uhphYWE67Tt8+HBs3rwZTU1NMDG5ez37mTNn
oFAomMgQEfUQvJuJjM7b2xtRUVGYMWMGtm/fjvLychw7dgyrVq164PNi5syZg59//hnx8fE4c+YM/vGPf2DlypWIi4vrpOiJiMjYODJDXUJWVhaSk5OxYMECXLp0CY6OjggNDcXYsWPb3M/d3R1ff/01XnvtNQwePBhubm6Ij4/H66+/3kmRExGRsclEN389Z21tLezs7FBTUwNbW1tjh9Omey8AVqlURr1mpivEQUREPZcu39+cZiIiIiJJYzJDREREksZkhoiIiCSNyQwRERFJGpMZIiIikjQmM0RERCRpTGaIiIhI0pjMEBERdUB6ejqCgoIgl8thb28PpVKJVatW6b2d+Ph4DBkyBBYWFggODm61TklJCcLDw2FlZQU3NzesWLGixTvyujM+AZiIiEhHmZmZSEhIwJo1axAeHo66ujqcPHkSpaWlem9LCIHf//73OHLkCE6ePNlie21tLUaPHo0RI0bg2LFjOHPmDGbOnAm5XI4FCxboPZ6uiCMzREREOtqxYwemTJmCmJgYDBgwAP7+/pg2bRqSkpK06mVlZcHX1xeWlpbw8fHB2rVrtbYfPXoUSqUSlpaWGDp0KHJzcyGTyVBcXKyps2bNGsTFxeGxxx5rNZZNmzbh9u3byM7ORkBAACZOnIg333wT7733Xo8ZneHIDHUatVpt7BCIiHTW2itdXFxcUFBQgMrKSnh4eLS6X0ZGBpYtW4a0tDQolUoUFRUhNjYWcrkc0dHRUKvVGD9+PEaOHImNGzeivLwc8fHxOsd3+PBhhIeHw8LCQrPu+eefx+LFi1FRUQEvLy+djyk1TGao0zS/74mISEpaG91YtmwZJk6cCE9PT3h7e2tejPvb3/4WJiZ3Jz2SkpKwevVqTJw4EQDg5eWF0tJSpKenIzo6Gps2bUJjYyM+/fRTWFtbw9/fHxcvXsScOXN0iu/KlSvw9PTUWufs7KzZ1hOSGU4zkcGp1WrIZDJjh0FEpDcKhQKHDx9GSUkJ5s2bh4aGBkRHRyMiIgJNTU348ccfceHCBcTExMDGxkZTkpOTUVZWBgA4ffo0goKCYG1trTluaGhoh+L55e/Y5gSsp/zu5cgMdaqrV6/yLdxE1G0EBAQgICAAcXFxOHjwIMLCwlBQUAA/Pz8Ad6eaQkJCtPYxNTUF0PqIT0e4uLjgypUrWuuqq6sB/HeEprtjMkOdSi6XM5khom6pOYFRq9VwdnaGm5sbzp8/j6ioqPvWz8nJwa1bt2BlZQUAKCws1Lnd0NBQvPnmm6ivr4e5uTkA4Ouvv4arq2uL6afuitNMREREOpozZw6SkpJw6NAhVFZWorCwEDNmzEDfvn01U0XLly9HSkoKPvzwQ5w5cwYlJSXIysrCe++9BwCIjIyEiYkJYmJiUFpail27diE1NbVFW+fOnUNxcTGuXLmCW7duobi4GMXFxaivr9ccx8LCAjNnzsR3332H3NxcrFy5EgkJCT1mmgmim6upqREARE1NjbFDeSCVSiUACABCpVJ1mzi6Sr+IiPTliy++EGPHjhUKhUKYm5sLV1dXMWnSJHHy5Emteps2bRLBwcHC3NxcODg4iF/96ldi+/btmu2HDx8WQUFBwtzcXAQHB4svv/xSABBFRUWaOuHh4ZrfofeW8vJyTZ2TJ0+KsLAwYWFhIVxcXMTy5ctFU1OToT8Gg9Ll+1smRPe+Cb22thZ2dnaoqamBra2tscNpk1qt1tzxo1KpjDYdo+84ukq/iIi6uuZbqYuKiu77tN+eQpfvb04zERERkaQxmSEiIiJJ491MREREXYSnp2ePeQWBPnFkhoiIiCSNyQwRERFJGpMZIiIikjQmM2RU+fn5kMlkuH79urFDAXD30eNhYWFwcHCAg4MDRo0ahaNHjxo7LCIiagOTGSIAjY2NaGpqQn5+PqZNm4a8vDwcPnwYjz76KJ577jlcunTJ2CESEdF9MJmhLiU7Oxv29vbYs2cPfH19YWNjg4iICFRVVWnqzJw5
ExMmTEBqaioUCgUcHR0RFxeHhoYGTZ36+nosWrQIbm5ukMvlCAkJQX5+fot2du7cCT8/P1hYWKCyshKbNm3CH/7wBwQHB8PHxwcZGRloamrCvn37OvNjICIiHfDWbGqTWq3u9GPcvHkTqampyMnJgYmJCaZPn46FCxdi06ZNmjp5eXlQKBTIy8vDuXPnMHXqVAQHByM2NhYAMGvWLFRUVGDr1q1wdXVFbm4uIiIiUFJSgoEDB2raSUlJwfr16+Ho6AgnJ6dWY2loaECfPn0e4hMgIiJDYjJDbTLG6+MbGhrw8ccfo3///gCAuXPnYsWKFVp1HBwckJaWBlNTU/j4+GDcuHHYt28fYmNjUVZWhi1btuDixYtwdXUFACxcuBC7d+9GVlYWVq5cqWln7dq1CAoKum8sb7zxBtzc3DBq1CgD9ZaIiB4WkxlqQS6XQwhhtLetWltbaxIZAFAoFKiurtaq4+/vD1NTU606JSUlAIBvv/0WQgh4e3tr7VNXVwdHR0fNsrm5OQYPHnzfON555x1s2bIF+fn5sLS0fKg+ERGR4TCZoftSqVR6OY5ardZphMfMzExrWSaTtXgiZmt1mpqaAABNTU0wNTXFiRMntBIeAJoXXgKAlZXVfRO21NRUrFy5Ev/7v//bZsJDRETGx2SG7kuqb7dWKpVobGxEdXU1wsLCdN7/3XffRXJyMvbs2YOhQ4caIEIiItInJjPU7Xh7eyMqKgozZszA6tWroVQq8dNPP2H//v0IDAzE2LFj77vvO++8g6VLl2Lz5s3w9PTElStXANwd0bl3VIeIiLoO3ppN3VJWVhZmzJiBBQsWYNCgQfj1r3+NI0eOwN3dvc391q5di/r6evz2t7+FQqHQlNTU1E6KnIiIdCUT3fz1nLW1tbCzs0NNTQ1sbW2NHU6b1Gq15q9/lUol2WmeX+qu/SIiIsPR5fubIzNEREQkaUxmiIiISNKYzBAREZGkMZkhIiIiSWMyQ0RERJLGZIaIiIgkjckMERFRB6SnpyMoKAhyuRz29vZQKpVYtWqVXtv4+eefERERAVdXV1hYWMDd3R1z585FbW2tXtuROj4BmIiISEeZmZlISEjAmjVrEB4ejrq6Opw8eRKlpaV6bcfExAQvvvgikpOT0bdvX5w7dw5xcXH4z3/+g82bN+u1LSmT1MhMSkoKZDIZ5s+fb+xQiIioB9uxYwemTJmCmJgYDBgwAP7+/pg2bRqSkpK06mVlZcHX1xeWlpbw8fHB2rVrtbYfPXoUSqUSlpaWGDp0KHJzcyGTyVBcXAwAcHBwwJw5czB06FB4eHjg2WefxR/+8Af861//6qyuSoJkRmaOHTuGTz75hG8wJiIio3NxcUFBQQEqKyvh4eHRap2MjAwsW7YMaWlpUCqVKCoqQmxsLORyOaKjo6FWqzF+/HiMHDkSGzduRHl5OeLj49ts9/Lly9i+fTvCw8MN0S3JkkQyo1KpEBUVhYyMDCQnJxs7HHoIarXa2CEQEemktVewLFu2DBMnToSnpye8vb0RGhqKsWPH4re//S1MTO5OeiQlJWH16tWYOHEiAMDLywulpaVIT09HdHQ0Nm3ahMbGRnz66aewtraGv78/Ll68iDlz5rRob9q0afjb3/6GW7du4YUXXsD69esN22mpERIwY8YMMX/+fCGEEOHh4SI+Pv6+dW/fvi1qamo05cKFCwKAqKmp6aRoO06lUgkAAoBQqVRa2/Ly8gQAce3aNeME14r3339feHt7C0tLS9GvXz8xf/58cevWrRb17u0XCwsLi9RKW0pKSkRaWpqIjIwUlpaWYvTo0aKxsVFUV1cLAMLKykrI5XJNsbCwEE5OTkIIIebPny9GjBihdbzi4mIBQBQVFWmtr6qqEqdPnxZfffWV8PPzE3PmzHm4X+ASUFNTI4D2fX93+ZGZrVu34ttvv8WxY8faVT8lJQVvvfWWgaPquRobGyGTybBlyxa88cYb+PTTT/HUU0/hzJkzmDlz
JgDg/fff19pHLpdDCAGZTGaEiImIDCcgIAABAQGIi4vDwYMHERYWhoKCAvj5+QG4O9UUEhKitY+pqSkAQOjwnmcXFxe4uLjAx8cHjo6OCAsLw9KlS6FQKPTXGQnr0snMhQsXEB8fj6+//hqWlpbt2mfx4sVISEjQLNfW1sLd3d1QIRpNdnY25s+fj23btmH+/Pm4cOECnn76aWRlZWn+c8+cORPXr1/H008/jdWrV6O+vh4vvfQSPvjgA5iZmQEA6uvrsWTJEmzatAnXr19HQEAAVq1ahWeeeUarnY0bN2LRokU4c+YMzp49i8OHD2P48OGIjIwEAHh6emLatGk4evTofWNWqVSG/VCIiIyoOYFRq9VwdnaGm5sbzp8/j6ioqPvWz8nJwa1bt2BlZQUAKCwsfGA7zUlQXV2dniKXvi6dzJw4cQLV1dUYMmSIZl1jYyMOHDiAtLQ01NXVaTLcZhYWFrCwsDBYTIa85kPXY9+8eROpqanIycmBiYkJpk+fjoULF2LTpk2aOnl5eVAoFMjLy8O5c+cwdepUBAcHIzY2FgAwa9YsVFRUYOvWrXB1dUVubi4iIiJQUlKCgQMHatpJSUnB+vXr4ejoCCcnJzz99NPYuHEjjh49iieffBLnz5/Hrl27EB0dfd94W5t3JiKSojlz5sDV1RUjR45Ev379UFVVpbl9OjQ0FACwfPlyzJs3D7a2thgzZgzq6upw/PhxXLt2DQkJCYiMjERiYiJiYmKwZMkSVFRUIDU1VaudXbt24erVq3jiiSdgY2OD0tJSLFq0CMOHD4enp6cRet5FGXrO62HU1taKkpISrTJ06FAxffp0UVJS0q5j6DLn1h7opDnaB10zk5WVJQCIc+fOaep89NFHwtnZWbMcHR0tPDw8xJ07dzTrJk+eLKZOnSqEEOLcuXNCJpOJS5cuabX17LPPisWLF2u1U1xc3OKzWLNmjTAzMxO9evUSAHrEHC4RkRBCfPHFF2Ls2LFCoVAIc3Nz4erqKiZNmiROnjypVW/Tpk0iODhYmJubCwcHB/GrX/1KbN++XbP98OHDIigoSJibm4vg4GDx5Zdfal0zs3//fhEaGirs7OyEpaWlGDhwoHj99de71PWThtJtrpnp3bs3AgICtNbJ5XI4Ojq2WG9oarUaNjY2ndrmg1hbW6N///6aZYVCgerqaq06/v7+WqNXCoUCJSUlAIBvv/0WQgh4e3tr7VNXVwdHR0fNsrm5eYtb4vPz8/H2229j7dq1CAkJwblz5xAfHw+FQoGlS5fqrY9ERF3RpEmTMGnSpAfWi4yM1EzHt2bYsGGaZ8oAQEVFhdb2ESNG4JtvvulomD1Gl05muqqrV68aZMqkeZ61vZqve2kmk8laXFDWWp2mpiYAQFNTE0xNTXHixIkW03X3Jm5WVlYtLt5dunQpfve732H27NkAgMDAQKjVarz88stITEzU3JpIRERkaJJLZvLz840dAuRyebe4/kOpVKKxsRHV1dUICwvTad+bN2+2SFhMTU0hhNDpCn0iIqKHJblkhvTH29sbUVFRmDFjBlavXg2lUomffvoJ+/fvR2BgIMaOHXvffV944QW89957UCqVmmmmpUuX4te//nWLUR4iImofT09P/kHYAUxmerisrCwkJydjwYIFuHTpEhwdHTVPsmzLkiVLIJPJsGTJEly6dAl9+/bFCy+8gLfffruTIiciIrpLJrp5ClhbWws7OzvU1NTA1ta2w8e59wJglUplsGtmDN0GERGRFOjy/c2rNImIiEjSmMwQERGRpDGZISIiIkljMkNERESSxmSGiIiIJI3JDBEREUkakxkiIiKSNCYzREREJGlMZoiIiEjSmMwQERGRpDGZISIiIkljMkNERESSxmSGiIiIJI3JDBEREUkakxkiIiKSNCYzREREJGlMZoiIiEjSmMwQERGRpDGZISIiIkljMkNERD1Ceno6goKCIJfLYW9vD6VSiVWrVhmsvZ9//hn9+vWD
TCbD9evXDdYOAb2MHQAREZGhZWZmIiEhAWvWrEF4eDjq6upw8uRJlJaWGqzNmJgYDB48GJcuXTJYG3QXR2aIiKjb27FjB6ZMmYKYmBgMGDAA/v7+mDZtGpKSkrTqZWVlwdfXF5aWlvDx8cHatWu1th89ehRKpRKWlpYYOnQocnNzIZPJUFxcrFVv3bp1uH79OhYuXGjorhE4MtMjqNVqY4dARNRp5HJ5i3UuLi4oKChAZWUlPDw8Wt0vIyMDy5YtQ1paGpRKJYqKihAbGwu5XI7o6Gio1WqMHz8eI0eOxMaNG1FeXo74+PgWxyktLcWKFStw5MgRnD9/Xu/9o1aIbq6mpkYAEDU1NQ91HJVKJQAIAEKlUukpus5po/mYLCwsLD2htOby5cti2LBhAoDw9vYW0dHRYtu2baKxsVFTx93dXWzevFlrv6SkJBEaGiqEECI9PV306dNHqNVqzfZ169YJAKKoqEgIIcTt27fF4MGDRU5OjhBCiLy8PAFAXLt2TW+/03sKXb6/Oc3UjanVashkMmOHQURkdAqFAocPH0ZJSQnmzZuHhoYGREdHIyIiAk1NTfjxxx9x4cIFxMTEwMbGRlOSk5NRVlYGADh9+jSCgoJgbW2tOW5oaKhWO4sXL4avry+mT5/eqf3r6TjN1ENcvXq11aFXIqKeJCAgAAEBAYiLi8PBgwcRFhaGgoIC+Pn5Abg71RQSEqK1j6mpKQBACPHA4+/fvx8lJSX44osvtPZ55JFHkJiYiLfeekuf3aH/j8lMDyGXy5nMEBHdozmBUavVcHZ2hpubG86fP4+oqKj71s/JycGtW7dgZWUFACgsLNSq8+WXX+LWrVua5WPHjuH3v/89/vWvf6F///4G6gkxmSEiom5vzpw5cHV1xciRI9GvXz9UVVUhOTkZffv21UwVLV++HPPmzYOtrS3GjBmDuro6HD9+HNeuXUNCQgIiIyORmJiImJgYLFmyBBUVFUhNTdVq55cJy08//QQA8PX1hb29faf0tSfiNTNERNTtjRo1CoWFhZg8eTK8vb0xadIkWFpaYt++fXB0dAQAzJ49G+vXr0d2djYCAwMRHh6O7OxseHl5AQBsbGywY8cOlJaWQqlUIjEx0aAP3aP2k4n2TAJKWG1tLezs7FBTUwNbW9sOH0etVsPGxgYAoFKpDDJlo+82OiNmIqKerKKiAl5eXigqKkJwcLCxw+lWdPn+7tA007Fjx/DXv/4VP/zwA+rr67W2bd++vSOHJCIiIuoQnaeZtm7diuHDh6O0tBS5ubloaGhAaWkp9u/fDzs7O0PESERERHRfOo/MrFy5Eu+//z7i4uLQu3dvfPjhh/Dy8sIrr7wChUJhiBiJiIi6JE9Pz3bdsk2GpfPITFlZGcaNGwcAsLCw0DyY7bXXXsMnn3yi9wCJiIiI2qJzMtOnTx/cuHEDAODm5obvvvsOAHD9+nXcvHlTv9ERERERPYDO00xhYWHYu3cvAgMDMWXKFMTHx2P//v3Yu3cvnn32WUPESERERHRfOiczaWlpuH37NoC776AwMzPDwYMHMXHiRCxdulTvARIRERG1hc+ZaSc+Z4aIiKjz6PL9rfM1M5mZma2uv3PnDhYvXqzr4YiIiIgeis7JzIIFCzBp0iT85z//0az7v//7Pzz55JP4/PPP9RocERER0YPonMwUFRXh6tWrCAwMxN69e/HRRx/h8ccfR0BAAIqLiw0QIhEREdH96XwBsJeXFw4cOIDXXnsNERERMDU1xWeffYaXXnrJEPERERERtalDb83euXMntmzZgqeeegr29vbIyMjA5cuX9R0bERER0QPpnMy88sormDJlChYtWoQDBw7g5MmTsLCwQGBgIK+ZISIiok6n8zTToUOHcOTIEQQFBQEAXFxcsGvXLnz00Uf4/e9/jylTpug9SCIiIqL70TmZOXHiBCwsLFqsj4uLw6hRo/QSFBEREVF76TzN1Foi02zQoEEPFQx1nvz8fMhkMly/
ft3YoQAAsrOzIZPJWpTmp00TERHdj84jMwDwxRdf4PPPP8cPP/yA+vp6rW3ffvutXgKjnqGxsREymQwAYGtri++//15ru6WlpTHCIiIiCdF5ZGbNmjWYNWsWnJycUFRUhCeffBKOjo44f/48xowZY4gYqRNkZ2fD3t4ee/bsga+vL2xsbBAREYGqqipNnZkzZ2LChAlITU2FQqGAo6Mj4uLi0NDQoKlTX1+PRYsWwc3NDXK5HCEhIcjPz2/Rzs6dO+Hn5wcLCwtUVlYCAGQyGVxcXLQKERHRg+g8MrN27Vp88sknmDZtGjZs2IBFixbhsccew5/+9CetpwLTw1Gr1Z1+jJs3byI1NRU5OTkwMTHB9OnTsXDhQmzatElTJy8vDwqFAnl5eTh37hymTp2K4OBgxMbGAgBmzZqFiooKbN26Fa6ursjNzUVERARKSkowcOBATTspKSlYv349HB0d4eTkBODu+6M8PDzQ2NiI4OBgJCUlQalUPvTnQERE3ZvOycwPP/yAp556CgBgZWWFGzduAAB+97vfYdiwYUhLS9NvhD2Us7Nzp7fZ0NCAjz/+GP379wcAzJ07FytWrNCq4+DggLS0NJiamsLHxwfjxo3Dvn37EBsbi7KyMmzZsgUXL16Eq6srAGDhwoXYvXs3srKysHLlSk07a9eu1dwRBwA+Pj7Izs5GYGAgamtr8eGHH2L48OH497//rUmCiIiIWqNzMuPi4oKff/4ZHh4e8PDwQGFhIYKCglBeXo5u/gJug5PL5RBCaK4h6WzW1taaRAYAFAoFqqurter4+/vD1NRUq05JSQmAu9dLCSHg7e2ttU9dXR0cHR01y+bm5hg8eLBWnWHDhmHYsGGa5eHDh+Pxxx/HX/7yF6xZs+bhO0dERN2WzsnMyJEjsWPHDjz++OOIiYnBa6+9hi+++ALHjx/HxIkTDRFjj6NSqfRyHLVardMIj5mZmdayTCZrkaC2VqepqQkA0NTUBFNTU5w4cUIr4QEAGxsbzb+trKwemLCZmJjgiSeewNmzZ9sdPxER9Uw6JzOJiYlwc3MDALz66qvo06cPDh48iBdeeIEXAOuJXC43dggdolQq0djYiOrqaoSFhT3UsYQQKC4uRmBgoJ6iIyKi7krnu5kGDBig9WySKVOmYM2aNYiKioKPj48+YyOJ8fb2RlRUFGbMmIHt27ejvLwcx44dw6pVq7Br1642933rrbewZ88enD9/HsXFxYiJiUFxcTFeffXVToqeiLq79PR0BAUFQS6Xw97eHkqlEqtWrdJ7O/v27cNTTz2F3r17Q6FQ4PXXX8edO3f03g79l84jM/e7LkalUvGZIISsrCwkJydjwYIFuHTpEhwdHREaGoqxY8e2ud/169fx8ssv48qVK7Czs4NSqcSBAwfw5JNPdlLkRNSdZWZmIiEhAWvWrEF4eDjq6upw8uRJlJaW6rWdkydPYuzYsUhMTMRnn32GS5cu4dVXX0VjYyNSU1P12hb9l0y086rdhIQEAMCHH36I2NhYWFtba7Y1NjbiyJEjMDU1xaFDh/Qa4KVLl/D666/jn//8J27dugVvb29kZmZiyJAh7dq/trYWdnZ2qKmpga2tbYfjUKvVmus+VCqVJKaCpBgzEZEhTJgwAQ4ODsjKymqzXlZWFt555x2Ul5fD09MT8+bNwx/+8AfN9qNHj+KVV17B6dOnERAQgMTEREycOBFFRUUIDg7Gm2++ib179+LYsWOafb766itMmzYN1dXV6N27t8H62N3o8v3d7pGZoqIiAHdHZkpKSmBubq7ZZm5ujqCgICxcuLCDIbfu2rVrGD58OEaMGIF//vOfcHJyQllZGezt7fXaDulOH8/BISIyhNb+cHNxcUFBQQEqKyvh4eHR6n4ZGRlYtmwZ0tLSoFQqUVRUhNjYWMjlckRHR0OtVmP8+PEYOXIkNm7ciPLycsTHx2sdo66ursUshZWVFW7fvo0TJ07gmWee0Vs/6R5CRzNnzhQ1
NTW67tYhr7/+unj66ad12uf27duipqZGUy5cuCAAPHTMKpVKABAAhEqleqhjdRZDxtx8XBYWFpauVlpz+fJlMWzYMAFAeHt7i+joaLFt2zbR2NioqePu7i42b96stV9SUpIIDQ0VQgiRnp4u+vTpI9RqtWb7unXrBABRVFQkhBBiz549wsTERGzevFncuXNHXLx4UTz99NMCQItjU9tqamoE0L7vb50vAM7Kynqo6Rpd/P3vf8fQoUMxefJkODk5QalUIiMjo819UlJSYGdnpynu7u6dEmtPoVarjfYcHCKijlIoFDh8+DBKSkowb948NDQ0IDo6GhEREWhqasKPP/6ICxcuICYmBjY2NpqSnJyMsrIyAMDp06cRFBSkdZlFaGioVjvPPfcc3n33Xbz66quwsLCAt7c3xo0bBwAtHllB+tPua2aMoXmoLiEhAZMnT8bRo0cxf/58pKenY8aMGa3uU1dXh7q6Os1ybW0t3N3dec2MnmK+95hXr16VxOdARD1Le38vHTx4EGFhYdi/fz/8/Pzg4uKCjRs3IiQkRKueqakpvLy8MH/+fJw8eRL79+/XbPv3v/+N4OBgzTUzzYQQqKqqgoODAyoqKuDn54ejR4/iiSee0EsfewKDXDNjDE1NTRg6dKjmMfhKpRKnTp3CunXr7pvMWFhYwMLCojPD7LHkcjmTGSKSLD8/PwD/fcCom5sbzp8/j6ioqPvWz8nJwa1bt2BlZQUAKCwsbLWuTCbTvNZly5YtcHd3x+OPP26AXhDQxZMZhUKh+c/WzNfXF19++aWRIiIiIimaM2cOXF1dMXLkSPTr1w9VVVVITk5G3759NVNFy5cvx7x582Bra4sxY8agrq4Ox48fx7Vr15CQkIDIyEgkJiYiJiYGS5YsQUVFRau3W7/77ruIiIiAiYkJtm/fjj//+c/4/PPPOc1kQDpfM9OZhg8fju+//15r3ZkzZ+57JToREVFrRo0ahcLCQkyePBne3t6YNGkSLC0tsW/fPs2742bPno3169drXnobHh6O7OxseHl5Abj7WpYdO3agtLQUSqUSiYmJrT5075///CfCwsIwdOhQ/OMf/8Df/vY3TJgwoTO72+N06Wtmjh07hqeeegpvvfUWpkyZgqNHjyI2NhaffPLJfYcBf4nPmTHcNTNS+RyIiAyloqICXl5eLa6ZoYeny/d3lx6ZeeKJJ5Cbm4stW7YgICAASUlJ+OCDD9qdyBAREVH316WvmQGA8ePHY/z48cYOg4iIiLqoLp/MEBERdVWenp73fWchdZ4uPc1ERERE9CBMZoiIiEjSmMwQERGRpDGZISIiIkljMkN6k5+fD5lMhuvXrxs7FADAqVOnMGnSJHh6ekImk+GDDz4wdkhERGQATGao22lsbERTUxNu3ryJxx57DH/+85/h4uJi7LCIiMhAmMyQwWRnZ8Pe3h579uyBr68vbGxsEBERgaqqKk2dmTNnYsKECUhNTYVCoYCjoyPi4uLQ0NCgqVNfX49FixbBzc0NcrkcISEhyM/Pb9HOzp074efnBwsLC1RWVuKJJ57Au+++i5deeokvHyUi6sb4nJkeQq1WG+U4N2/eRGpqKnJycmBiYoLp06dj4cKF2LRpk6ZOXl4eFAoF8vLycO7cOUydOhXBwcGIjY0FAMyaNQsVFRXYunUrXF1dkZubi4iICJSUlGDgwIGadlJSUrB+/Xo4OjrCyclJL/0lIqKuj8lMD+Hs7GyUdhsaGvDxxx+jf//+AIC5c+dixYoVWnUcHByQlpYGU1NT+Pj4YNy4cdi3bx9iY2NRVlaGLVu24OLFi3B1dQUALFy4ELt370ZWVhZWrlypaWft2rUICgrq3A4SEZHRMZnpxuRyOYQQkMlkRovB2tpak8gAgEKhQHV1tVYdf39/mJqaatUpKSkBAHz77bcQQsDb21trn7q6Os2bbgHA3NwcgwcPNkQXiIioi2Myowf5+fkYMWIErl27Bnt7e2OHg4aGBqSk
pGDDhg24dOmS5iWdo0ePfuhjq9VqnUZ5zMzMtJZlMlmLR3+3VqepqQkA0NTUBFNTU5w4cUIr4QGgeXs3AFhZWRk1aSMiIuNhMtONNDY2QiaTYcmSJdi4cSMyMjLg4+ODPXv2YNq0afjmm2+gVCqNHaZOlEolGhsbUV1djbCwMGOHQ0REXRDvZjIAY9/Fk5OTgzfffBNjx47FY489hjlz5uD555/H6tWrO/Nj0Atvb29ERUVhxowZ2L59O8rLy3Hs2DGsWrUKu3btanPf+vp6FBcXo7i4GPX19bh06RKKi4tx7ty5ToqeiIg6A5MZA7n3Lp4DBw7ghx9+wMKFC7Xq5OXloaysDHl5ediwYQOys7ORnZ2t2T5r1iwcOnQIW7duxcmTJzF58mRERETg7NmzWu0038Vz6tQpODk5oa6uDpaWllptWVlZ4eDBgwbts6FkZWVhxowZWLBgAQYNGoRf//rXOHLkCNzd3dvc7/Lly1AqlVAqlaiqqkJqaiqUSiVmz57dSZETEVGnEN1cTU2NACBqamoe6jgqlUoAEACESqXS2paXlycAiGvXrgkhhMjKyhIAxLlz5zR1PvroI+Hs7KxZjo6OFh4eHuLOnTuadZMnTxZTp04VQghx7tw5IZPJxKVLl7TaevbZZ8XixYu12ikuLtaqM23aNOHn5yfOnDkjGhsbxddffy2srKyEubn5Q30GD/ociIiI9EWX729eM2MgxryL58MPP0RsbCx8fHwgk8nQv39/zJo1C1lZWXrrHxERUVfBZMZAjHkXT9++ffHVV1/h9u3b+Pnnn+Hq6oo33ngDXl5eD90vIiKirobJTBelj7t4LC0t4ebmhoaGBnz55ZeYMmWKnqMkIiIyPiYzXdS9d/GsXr0aSqUSP/30E/bv34/AwECMHTv2vvseOXIEly5dQnBwMC5duoTly5ejqakJixYt6sQeEBERdQ7ezdSFdfQuntu3b2PJkiXw8/PDb37zG7i5ueHgwYNd4oF+RERE+iYTv7yQo5upra2FnZ0dampqYGtr2+HjqNVqzbUqKpUKcrlcXyFKCj8HIiLqDLp8f3NkhoiIeoT09HQEBQVBLpfD3t4eSqUSq1at0msb//73vzFt2jS4u7vDysoKvr6++PDDD/XaBrXEa2aIiKjby8zMREJCAtasWYPw8HDU1dXh5MmTKC0t1Ws7J06cQN++fbFx40a4u7vjm2++wcsvvwxTU1PMnTtXr23Rf3GaqZ04vXIXPwcikqIJEybAwcHhgc/bysrKwjvvvIPy8nJ4enpi3rx5+MMf/qDZfvToUbzyyis4ffo0AgICkJiYiIkTJ6KoqAjBwcGtHjMuLg6nT5/G/v379dmlbk+X72+OzFCHqdVqY4dARNRCa39kubi4oKCgAJWVlfDw8Gh1v4yMDCxbtgxpaWlQKpUoKipCbGws5HI5oqOjoVarMX78eIwcORIbN25EeXk54uPjHxhPTU0N+vTp89D9ojYY9FnEXUBnvM6gJ7n3c2BhYWHpiqU1ly9fFsOGDRMAhLe3t4iOjhbbtm0TjY2Nmjru7u5i8+bNWvslJSWJ0NBQIYQQ6enpok+fPkKtVmu2r1u3TgAQRUVFrbb7zTffCDMzM/H1118/5G/fnkeX729eAEw6kcvlLZ5kTETU1SkUChw+fBglJSWYN28eGhoaEB0djYiICDQ1NeHHH3/EhQsXEBMTAxsbG01JTk5GWVkZAOD06dMICgqCtbW15rihoaH3bfPUqVN48cUX8ac//QmjR482eB97Mk4zUYeoVCpjh0BEpLOAgAAEBAQgLi4OBw8eRFhYGAoKCuDn5wfg7lRTSEiI1j7Nr5TR5Q+50tJSjBw5ErGxsViyZIn+OkCtYjJDHcILf4lI6poTGLVaDWdnZ7i5ueH8+fOIioq6b/2cnBzcunULVlZWAIDCwsIW9U6dOoWRI0ciOjoab7/9tuE6QBpMZoiIqNubM2cOXF1dMXLkSPTr1w9VVVVITk5G
3759NVNFy5cvx7x582Bra4sxY8agrq4Ox48fx7Vr15CQkIDIyEgkJiYiJiYGS5YsQUVFBVJTU7XaOXXqFEaMGIHnnnsOCQkJuHLlCoC7ozt9+/bt9H73FLxmhoiIur1Ro0ahsLAQkydPhre3NyZNmgRLS0vs27cPjo6OAIDZs2dj/fr1yM7ORmBgIMLDw5GdnQ0vLy8AgI2NDXbs2IHS0lIolUokJia2eOjeX//6V/z444/YtGkTFAqFpjzxxBOd3ueehM+ZaSc+X4WIiH6poqICXl5ebT5nhjqGrzMgIiKiHoPJDBEREUkaLwAmIiLqIE9PTz57qwvgyAwRERFJGpMZIiIikjQmM0RERCRpTGaIiIhI0pjMEBERkaQxmSEiIiJJYzJDREREksZkhoiIiCSNyQwRERFJGpMZIiIikjQmM0RERCRpTGaIiIhI0pjMEBERkaQxmSEiIiJJYzJDREREksZkhoiIiCSNyQwRERFJGpMZIiIikjQmM0RERCRpTGaIiIhI0pjMEBERkaQxmSEioi4pPT0dQUFBkMvlsLe3h1KpxKpVq/TezrFjx/Dss8/C3t4eDg4OeO6551BcXKz3dshwunQyc+fOHSxZsgReXl6wsrLCY489hhUrVqCpqcnYoRERkQFlZmYiISEB8+bNw7///W8cOnQIixYtgkql0ms7N27cwPPPP49HH30UR44cwcGDB2Fra4vnn38eDQ0Nem2LDEh0YcnJycLR0VHs3LlTlJeXi7/+9a/CxsZGfPDBB+0+Rk1NjQAgampqHioWlUolAAgAQqVSPdSxiIiobS+++KKYOXPmA+t9+umnwsfHR1hYWIhBgwaJjz76SGv7kSNHRHBwsLCwsBBDhgwR27dvFwBEUVGREEKIY8eOCQDihx9+0Oxz8uRJAUCcO3dOr30i3ejy/d3LiHnUAx0+fBgvvvgixo0bBwDw9PTEli1bcPz4cSNHZlhqtdrYIRARdRq5XN5inYuLCwoKClBZWQkPD49W98vIyMCyZcuQlpYGpVKJoqIixMbGQi6XIzo6Gmq1GuPHj8fIkSOxceNGlJeXIz4+XusYgwYNwiOPPILMzEy8+eabaGxsRGZmJvz9/e/bLnVBnZBcdVhKSorw8PAQ33//vRBCiOLiYuHk5CQ2b958331u374tampqNOXChQuSG5lpboeFhYWlJ5TWXL58WQwbNkwAEN7e3iI6Olps27ZNNDY2auq4u7u3+D5ISkoSoaGhQggh0tPTRZ8+fYRardZsX7dunQD+OzIjhBDfffed6N+/vzAxMREmJibCx8dHVFZW6vG3OnWELiMzXTqZaWpqEm+88YaQyWSiV69eQiaTiZUrV7a5z7Jly1o9WaSQzNzbBgsLC0tPKW0pKSkRaWlpIjIyUlhaWorRo0eLxsZGUV1dLQAIKysrIZfLNcXCwkI4OTkJIYSYP3++GDFihNbxiouLBfDfZObmzZviySefFDNmzBBHjx4Vhw8fFpMmTRL+/v7i5s2bBvldT+3TbaaZtm3bho0bN2Lz5s3w9/dHcXEx5s+fD1dXV0RHR7e6z+LFi5GQkKBZrq2thbu7e2eFrDdXr15tdeiViKgnCQgIQEBAAOLi4nDw4EGEhYWhoKAAfn5+AO5ONYWEhGjtY2pqCgAQQjzw+Js3b0ZFRQUOHz4MExMTzToHBwf87W9/w0svvaTnHpEhdOlk5n/+53/wxhtvaP4zBQYGorKyEikpKfdNZiwsLGBhYdGZYRqEXC5nMkNEdI/mBEatVsPZ2Rlubm44f/48oqKi7ls/JycHt27dgpWVFQCgsLBQq87NmzdhYmICmUymWde8zDtnpaNL35rd/J/sXqampvwPRkTUzc2ZMwdJSUk4dOgQKisrUVhYiBkzZqBv374IDQ0FACxfvhwpKSn48MMPcebMGZSUlCArKwvvvfceACAyMhImJiaIiYlBaWkpdu3ahdTUVK12Ro8ejWvXriEuLg6nT5/GqVOnMGvW
LPTq1QsjRozo9H5Tx3TpZOaFF17A22+/jX/84x+oqKhAbm4u3nvvPfzmN78xdmhERGRAo0aNQmFhISZPngxvb29MmjQJlpaW2LdvHxwdHQEAs2fPxvr165GdnY3AwECEh4cjOzsbXl5eAAAbGxvs2LEDpaWlUCqVSExMbPHQPR8fH+zYsQMnT55EaGgowsLCcPnyZezevRsKhaLT+00dIxPtmVQ0khs3bmDp0qXIzc1FdXU1XF1dMW3aNPzpT3+Cubl5u45RW1sLOzs71NTUwNbWtsOxqNVq2NjYAABUKpVBpoA6ow0iop6soqICXl5eKCoqQnBwsLHDoTbo8v3dpa+Z6d27Nz744AN88MEHxg6FiIiIuqguPc1ERERE9CBdemSGiIhInzw9Pdt1yzZJC0dmiIiISNKYzBAREZGkMZkhIiIiSWMyIxH5+fmQyWS4fv26sUNpYevWrZDJZJgwYYKxQyEioh6IyQzppLGxUesJzJWVlVi4cCHCwsKMGBUREfVkTGYkKjs7G/b29tizZw98fX1hY2ODiIgIVFVVaerMnDkTEyZMQGpqKhQKBRwdHREXF4eGhgZNnfr6eixatAhubm6Qy+UICQlBfn5+i3Z27twJPz8/WFhYoLKyEsDdxCYqKgpvvfUWHnvssU7rOxER0b2YzEjYzZs3kZqaipycHBw4cAA//PADFi5cqFUnLy8PZWVlyMvLw4YNG5CdnY3s7GzN9lmzZuHQoUPYunUrTp48icmTJyMiIgJnz57VaiclJQXr16/HqVOn4OTkBABYsWIF+vbti5iYmE7pLxERUWv4nBk9UqvVnbp/Q0MDPv74Y/Tv3x8AMHfuXKxYsUKrjoODA9LS0mBqagofHx+MGzcO+/btQ2xsLMrKyrBlyxZcvHgRrq6uAICFCxdi9+7dyMrKwsqVKzXtrF27FkFBQZrjHjp0CJmZmSguLn6IHhMRET08JjN61Pxepc5ibW2tSWQAQKFQoLq6WquOv78/TE1NteqUlJQAAL799lsIIeDt7a21T11dneZFbgBgbm6OwYMHa5Zv3LiB6dOnIyMjA4888ohe+0RERKQrJjN6cO8LIjuTmZmZ1rJMJmvxZMvW6jRfwNvU1ARTU1OcOHFCK+EBtBMzKysryGQyzXJZWRkqKirwwgsvaNY1H7NXr174/vvvtZIsIiIiQ2Iyo2dXr17t8Nuu1Wo1nJ2d9RzR/SmVSjQ2NqK6ulqnu5F8fHw0ozvNlixZghs3buDDDz+Eu7u7vkMlIiK6LyYzeiaXyzuczHQ2b29vREVFYcaMGVi9ejWUSiV++ukn7N+/H4GBgRg7dmyr+1laWiIgIEBrnb29PQC0WE9ERGRovJuph8vKysKMGTOwYMECDBo0CL/+9a9x5MgRjq4QEZFkyEQ3f31obW0t7OzsUFNTA1tb2w4f597rYlQqldboS1vb9NUGERFRT6LL9zdHZoiIiEjSmMwQERGRpDGZISIiIkljMkNERESSxmSGiIiIJI3JDBEREUkakxkiIiKSNCYzRETUJaWnpyMoKAhyuRz29vZQKpVYtWqV3tuRyWQtyscff6z3dshw+DoDIiLqcjIzM5GQkIA1a9YgPDwcdXV1OHnyJEpLSw3SXlZWFiIiIjTLdnZ2BmmHDIMjM0RE1OXs2LEDU6ZMQUxMDAYMGAB/f39MmzYNSUlJWvWysrLg6+sLS0tL+Pj4YO3atVrbjx49CqVSCUtLSwwdOhS5ubmQyWQoLi7Wqmdvbw8XFxdNsbKyMnQXSY84MtNFqdVqY4dARNQpWnt1i4uLCwoKClBZWQkPD49W98vIyMCyZcuQlpYGpVKJoqIixMbGQi6XIzo6Gmq1GuPHj8fIkSOxceNGlJeXIz4+vtVjzZ07F7Nnz4aXlxdiYmLw8ssvw8SEf+9LhujmampqBABRU1PzUMdRqVQCgAAgVCpVu7d1tA0WFhaWnlJac/nyZTFs2DABQHh7e4vo6Gixbds20djYqKnj
7u4uNm/erLVfUlKSCA0NFUIIkZ6eLvr06SPUarVm+7p16wQAUVRUpLXPN998I4qKikRqaqqwtrYWSUlJHf5dTvqhy/c3k5l26oxkppmxf7GwsLCwdGZpS0lJiUhLSxORkZHC0tJSjB49WjQ2Norq6moBQFhZWQm5XK4pFhYWwsnJSQghxPz588WIESO0jldcXCwA7WTml1JTU4Wtre1D/y6nh6PL9zenmboglUpl7BCIiLqEgIAABAQEIC4uDgcPHkRYWBgKCgrg5+cH4O5UU0hIiNY+pqamAAAhRIfaHDZsGGpra3H16lU4Ozs/XAeoUzCZ6YJamz8mIurpmhMYtVoNZ2dnuLm54fz584iKirpv/ZycHNy6dUtzQW9hYeED2ykqKoKlpSXs7e31FjsZFpMZIiLqcubMmQNXV1eMHDkS/fr1Q1VVFZKTk9G3b1+EhoYCAJYvX4558+bB1tYWY8aMQV1dHY4fP45r164hISEBkZGRSExMRExMDJYsWYKKigqkpqZqtbNjxw5cuXIFoaGhsLKyQl5eHhITE/Hyyy/DwsLCGF2nDuCl2gaWn58PmUyG69evGzsUAMAzzzzT6gOixo0bZ+zQiIg0Ro0ahcLCQkyePBne3t6YNGkSLC0tsW/fPjg6OgIAZs+ejfXr1yM7OxuBgYEIDw9HdnY2vLy8AAA2NjbYsWMHSktLoVQqkZiY2OKhe2ZmZli7di1CQ0MxePBgfPjhh1ixYgVWr17d6X2mjpOJjk4qSkRtbS3s7OxQU1MDW1vbDh9HrVbDxsYGwN1rWu6dCmprW35+PkaMGIFr164ZdciysbFRk1TV19dr1v/8888ICgrC+vXrMXPmTKPFR0TUGSoqKuDl5YWioiIEBwcbOxxqgy7f3xyZ6WTZ2dmwt7fHnj174OvrCxsbG0RERKCqqkpTZ+bMmZgwYQJSU1OhUCjg6OiIuLg4NDQ0aOrU19dj0aJFcHNzg1wuR0hICPLz81u0s3PnTvj5+cHCwgKVlZXo06eP1oOh9u7dC2tra0yePLkzPwYiIiK9YTLTAWq1ukXRxc2bN5GamoqcnBwcOHAAP/zwAxYuXKhVJy8vD2VlZcjLy8OGDRuQnZ2N7OxszfZZs2bh0KFD2Lp1K06ePInJkycjIiICZ8+e1WonJSUF69evx6lTp+Dk5NQilszMTLz00ku86JiIiCSLFwB3wMPeqtfQ0ICPP/4Y/fv3B3D3yZMrVqzQquPg4IC0tDSYmprCx8cH48aNw759+xAbG4uysjJs2bIFFy9ehKurKwBg4cKF2L17N7KysrBy5UpNO2vXrkVQUFCrcRw9ehTfffcdMjMzH6o/RERS4enp2eFbtqnrYjLTTnK5HEIIyGSyhz6WtbW1JpEBAIVCgerqaq06/v7+mmclNNcpKSkBAHz77bcQQsDb21trn7q6Os2FcQBgbm6OwYMH3zeOzMxMBAQE4Mknn3yo/hARERkTkxkdtfZAu+ZnHrSXmZmZ1rJMJmvxl0JrdZqamgAATU1NMDU1xYkTJ7QSHgCaC5EBwMrK6r7J182bN7F169YWI0JERERSw2RGR13h2hKlUonGxkZUV1cjLCysQ8f4/PPPUVdXh+nTp+s5OiIios7FC4AlyNvbG1FRUZgxYwa2b9+O8vJyHDt2DKtWrcKuXbvadYzMzExMmDBBa1qKiIhIijgyI1FZWVlITk7GggULcOnSJTg6OiI0NBRjx4594L5nzpzBwYMH8fXXX3dCpERERIbFh+bpQVsPzSMiIiLd8aF5RERE1GMwmSEiIiJJYzJDREREksZkhoiIiCSNyQwRERFJGpMZIiIikjQmM0RERCRpTGaIiIhI0pjMEBERkaQxmSEiIiJJYzJDREREksZkhoiIiCSNyQwRERFJGpMZIiIikjQmM0RERCRpTGaIiKhLSk9PR1BQEORyOezt7aFUKrFq1Sq9txMfH48hQ4bAwsICwcHBej8+GZ5Rk5kDBw7ghRdegKur
K2QyGb766iut7UIILF++HK6urrCyssIzzzyDU6dOGSdYIiLqNJmZmUhISMC8efPw73//G4cOHcKiRYugUqn03pYQAr///e8xdepUvR+bOodRkxm1Wo2goCCkpaW1uv2dd97Be++9h7S0NBw7dgwuLi4YPXo0bty40cmREhFRZ9qxYwemTJmCmJgYDBgwAP7+/pg2bRqSkpK06mVlZcHX1xeWlpbw8fHB2rVrtbYfPXoUSqUSlpaWGDp0KHJzcyGTyVBcXKyps2bNGsTFxeGxxx7rjK6RAfQyZuNjxozBmDFjWt0mhMAHH3yAxMRETJw4EQCwYcMGODs7Y/PmzXjllVc6M9R2U6vVxg6BiEhS5HJ5i3UuLi4oKChAZWUlPDw8Wt0vIyMDy5YtQ1paGpRKJYqKihAbGwu5XI7o6Gio1WqMHz8eI0eOxMaNG1FeXo74+HhDd4eMwKjJTFvKy8tx5coVPPfcc5p1FhYWCA8PxzfffHPfZKaurg51dXWa5draWoPHei9nZ+dObY+ISOqEEC3WLVu2DBMnToSnpye8vb0RGhqKsWPH4re//S1MTO5OKiQlJWH16tWaP3i9vLxQWlqK9PR0REdHY9OmTWhsbMSnn34Ka2tr+Pv74+LFi5gzZ06n9o8Mr8teAHzlyhUALZMDZ2dnzbbWpKSkwM7OTlPc3d0NGidw96+K1k5GIiLqGIVCgcOHD6OkpATz5s1DQ0MDoqOjERERgaamJvz444+4cOECYmJiYGNjoynJyckoKysDAJw+fRpBQUGwtrbWHDc0NNRYXSID6rIjM81kMpnWshCixbp7LV68GAkJCZrl2traTkloABjkwjQiop4sICAAAQEBiIuLw8GDBxEWFoaCggL4+fkBuDvVFBISorWPqakpgNZHfKh76rLJjIuLC4C7IzQKhUKzvrq6us2pHAsLC1hYWBg8vta0Nu9LRET60ZzAqNVqODs7w83NDefPn0dUVNR96+fk5ODWrVuwsrICABQWFnZavNR5umwy4+XlBRcXF+zduxdKpRIAUF9fj4KCAoM8Z4CIiLqOOXPmwNXVFSNHjkS/fv1QVVWF5ORk9O3bVzNVtHz5csybNw+2trYYM2YM6urqcPz4cVy7dg0JCQmIjIxEYmIiYmJisGTJElRUVCA1NbVFW+fOnYNKpcKVK1dw69YtzZ1Ofn5+MDc378xuUwcZNZlRqVQ4d+6cZrm8vBzFxcXo06cPHn30UcyfPx8rV67EwIEDMXDgQKxcuRLW1taIjIw0YtRERGRoo0aNwqeffop169bh559/xiOPPILQ0FDs27cPjo6OAIDZs2fD2toa7777LhYtWgS5XI7AwEDMnz8fAGBjY4MdO3bg1VdfhVKphJ+fH1atWoVJkyZptTV79mwUFBRolpv/gC4vL4enp2en9JcejkwYcVIxPz8fI0aMaLE+Ojoa2dnZEELgrbfeQnp6Oq5du4aQkBB89NFHCAgIaHcbtbW1sLOzQ01NDWxtbfUZPhERSUxFRQW8vLxQVFTEp/12cbp8fxs1mekMTGaIiKgZkxnp0OX7u8vemk1ERETUHl32AmAiIiJ98/T05C3b3RBHZoiIiEjSmMwQERGRpDGZISIiIkljMkNERESSxmSGiIiIJI3JDBEREUkakxkiIiKSNCYzREREJGlMZoiIiEjSmMwQERGRpDGZISIiIkljMkNERESSxmSGiIiIJI3JDBEREUkakxkiIiKSNCYzREREJGlMZoiIiEjSmMwQERGRpPUydgCGJoQAANTW1ho5EiIiImqv5u/t5u/xtnT7ZObGjRsAAHd3dyNHQkRERLq6ceMG7Ozs2qwjE+1JeSSsqakJly9fRu/evSGTyR7qWLW1tXB3d8eFCxdga2urpwi7DvZP2tg/6erOfQPYP6kzVv+EELhx4wZcXV1hYtL2VTHdfmTGxMQE/fr10+sxbW1tu+V/2Gbsn7Sxf9LVnfsGsH9SZ4z+PWhEphkvACYiIiJJ
YzJDREREksZkRgcWFhZYtmwZLCwsjB2KQbB/0sb+SVd37hvA/kmdFPrX7S8AJiIiou6NIzNEREQkaUxmiIiISNKYzBAREZGkMZkhIiIiSetRyczatWvh5eUFS0tLDBkyBP/617/arF9QUIAhQ4bA0tISjz32GD7++OMWdb788kv4+fnBwsICfn5+yM3Nfeh2O0rf/cvIyEBYWBgcHBzg4OCAUaNG4ejRo1p1li9fDplMplVcXFz03jdA//3Lzs5uEbtMJsPt27cfqt2O0nf/nnnmmVb7N27cOE2dzvr56dK3qqoqREZGYtCgQTAxMcH8+fNbrSfVc689/ZPyudee/kn53GtP/7rSuQfo1r/t27dj9OjR6Nu3L2xtbREaGoo9e/a0qNeVzj8AgOghtm7dKszMzERGRoYoLS0V8fHxQi6Xi8rKylbrnz9/XlhbW4v4+HhRWloqMjIyhJmZmfjiiy80db755hthamoqVq5cKU6fPi1WrlwpevXqJQoLCzvcblfqX2RkpPjoo49EUVGROH36tJg1a5aws7MTFy9e1NRZtmyZ8Pf3F1VVVZpSXV2t174Zqn9ZWVnC1tZWK/aqqqqHarcr9e/nn3/W6td3330nTE1NRVZWlqZOZ/z8dO1beXm5mDdvntiwYYMIDg4W8fHxLepI+dxrT/+kfO61p39SPvfa07+ucu51pH/x8fFi1apV4ujRo+LMmTNi8eLFwszMTHz77beaOl3p/GvWY5KZJ598Urz66qta63x8fMQbb7zRav1FixYJHx8frXWvvPKKGDZsmGZ5ypQpIiIiQqvO888/L1566aUOt9tRhujfL925c0f07t1bbNiwQbNu2bJlIigoqOOBt5Mh+peVlSXs7Oz02m5HdcbP7/333xe9e/cWKpVKs64zfn4P8xmGh4e3+mUh5XPvXvfr3y9J6dy71/36J+Vz717t/fkZ69wTQj+fo5+fn3jrrbc0y13p/GvWI6aZ6uvrceLECTz33HNa65977jl88803re5z+PDhFvWff/55HD9+HA0NDW3WaT5mR9rtCEP175du3ryJhoYG9OnTR2v92bNn4erqCi8vL7z00ks4f/78Q/SmJUP2T6VSwcPDA/369cP48eNRVFT0UO12RGf9/DIzM/HSSy9BLpdrrTfkz89Qn6GUz72OkNK5115SPfc6whjnHqCf/jU1NeHGjRta//e6yvl3rx6RzPz0009obGyEs7Oz1npnZ2dcuXKl1X2uXLnSav07d+7gp59+arNO8zE70m5HGKp/v/TGG2/Azc0No0aN0qwLCQnBZ599hj179iAjIwNXrlzBU089hZ9//vkhe/Vfhuqfj48PsrOz8fe//x1btmyBpaUlhg8fjrNnz3a43Y7ojJ/f0aNH8d1332H27Nla6w398zPUZyjlc68jpHTutYeUzz1dGevcA/TTv9WrV0OtVmPKlCmadV3l/LtXt39r9r1kMpnWshCixboH1f/l+vYcU9d2O8oQ/Wv2zjvvYMuWLcjPz4elpaVm/ZgxYzT/DgwMRGhoKPr3748NGzYgISGhQ/3QJd6H6d+wYcMwbNgwzfbhw4fj8ccfx1/+8hesWbOmw+12lCF/fpmZmQgICMCTTz6ptb6zfn6G+AylfO7pQorn3oNI/dzThbHPPaDj/duyZQuWL1+Ov/3tb3ByctL5mJ318wN6yMjMI488AlNT0xYZYXV1dYvMsZmLi0ur9Xv16gVHR8c26zQfsyPtdoSh+tcsNTUVK1euxNdff43Bgwe3GYtcLkdgYKDmLyx9MHT/mpmYmOCJJ57QxN5dfn43b97E1q1bW/xl2Bp9//wM9RlK+dzThRTPvY6Q0rmnC2Oee8DD9W/btm2IiYnB559/rjUiCHSd8+9ePSKZMTc3x5AhQ7B3716t9Xv37sVTTz3V6j6hoaEt6n/99dcYOnQozMzM2qzTfMyOtNsRhuofALz77rtISkrC7t27MXTo0AfGUldXh9OnT0OhUHSg
J60zZP/uJYRAcXGxJvbu8PMDgM8//xx1dXWYPn36A2PR98/PUJ+hlM+99pLqudcRUjr3dGHMcw/oeP+2bNmCmTNnYvPmzVq3kzfrKuefFoNcVtwFNd8mlpmZKUpLS8X8+fOFXC4XFRUVQggh3njjDfG73/1OU7/51tfXXntNlJaWiszMzBa3vh46dEiYmpqKP//5z+L06dPiz3/+831vT7tfu125f6tWrRLm5ubiiy++0Lp98MaNG5o6CxYsEPn5+eL8+fOisLBQjB8/XvTu3VsS/Vu+fLnYvXu3KCsrE0VFRWLWrFmiV69e4siRI+1utyv3r9nTTz8tpk6d2mq7nfHz07VvQghRVFQkioqKxJAhQ0RkZKQoKioSp06d0myX8rnXnv5J+dxrT/+kfO61p3/NjH3udaR/mzdvFr169RIfffSR1v+969eva+p0pfOvWY9JZoQQ4qOPPhIeHh7C3NxcPP7446KgoECzLTo6WoSHh2vVz8/PF0qlUpibmwtPT0+xbt26Fsf861//KgYNGiTMzMyEj4+P+PLLL3VqV5/03T8PDw8BoEVZtmyZps7UqVOFQqEQZmZmwtXVVUycOLHVk7or9m/+/Pni0UcfFebm5qJv377iueeeE998841O7Xbl/gkhxPfffy8AiK+//rrVNjvr56dr31r7f+fh4aFVR8rn3oP6J/Vz70H9k/q5157/n13l3BNCt/6Fh4e32r/o6GitY3al808IIWRC/P+rBomIiIgkqEdcM0NERETdF5MZIiIikjQmM0RERCRpTGaIiIhI0pjMEBERkaQxmSEiIiJJYzJDREREksZkhoiIiCSNyQwR4ZlnnsH8+fONHUarKioqIJPJUFxcrNN+MpkMX331lUFi0tXy5csRHBxs7DCIui0mM0REetSVkiiinoLJDBF1SH19vbFDICICwGSGiP6/O3fuYO7cubC3t4ejoyOWLFmCe1/d5unpieTkZMycORN2dnaIjY0FALz++uvw9vaGtbU1HnvsMSxduhQNDQ2a/ZqnWHJycuDp6Qk7Ozu89NJLuHHjhqZOU1MTVq1ahQEDBsDCwgKPPvoo3n77ba34zp8/jxEjRsDa2hpBQUE4fPiwTv27dOkSpk6dCgcHBzg6OuLFF19ERUWFZvvMmTMxYcIEpKamQqFQwNHREXFxcVp9qaqqwrhx42BlZQUvLy9s3rwZnp6e+OCDDzSfEQD85je/gUwm0yw3a+szIKKOYzJDRACADRs2oFevXjhy5AjWrFmD999/H+vXr9eq8+677yIgIAAnTpzA0qVLAQC9e/dGdnY2SktL8eGHHyIjIwPvv/++1n5lZWX46quvsHPnTuzcuRMFBQX485//rNm+ePFirFq1CkuXLkVpaSk2b94MZ2dnrWMkJiZi4cKFKC4uhre3N6ZNm4Y7d+60q283b97EiBEjYGNjgwMHDuDgwYOwsbFBRESE1ghTXl4eysrKkJeXhw0bNiA7OxvZ2dma7TNmzMDly5eRn5+PL7/8Ep988gmqq6s1248dOwYAyMrKQlVVlWa5PZ8BET0Eg72Pm4gkIzw8XPj6+oqmpibNutdff134+vpqlj08PMSECRMeeKx33nlHDBkyRLO8bNkyYW1tLWprazXr/ud//keEhIQIIYSora0VFhYWIiMjo9XjlZeXCwBi/fr1mnWnTp0SAMTp06fvGwcAkZubK4QQIjMzUwwaNEirf3V1dcLKykrs2bNHCCFEdHS08PDwEHfu3NHUmTx5spg6daoQQojTp08LAOLYsWOa7WfPnhUAxPvvv99qu+39DIjo4XBkhogAAMOGDYNMJtMsh4aG4uzZs2hsbNSsGzp0aIv9vvjiCzz99NNwcXGBjY0Nli5dih9++EGrjqenJ3r37q1ZVigUmhGN06dPo66uDs8++2yb8Q0ePFhrfwBaoyJtOXHiBM6dO4fevXvDxsYGNjY26NOnD27fvo2ysjJNPX9/f5iamrYa5/fff49evXrh
8ccf12wfMGAAHBwc2hVDW58BET2cXsYOgIikQy6Xay0XFhbipZdewltvvYXnn38ednZ22Lp1K1avXq1Vz8zMTGtZJpOhqakJAGBlZdWutu89RnPS1XyMB2lqasKQIUOwadOmFtv69u3brjjFPdcP3et+63+prWMT0cNhMkNEAO4mJr9cHjhwoNZIxS8dOnQIHh4eSExM1KyrrKzUqd2BAwfCysoK+/btw+zZs3ULup0ef/xxbNu2DU5OTrC1te3QMXx8fHDnzh0UFRVhyJAhAIBz587h+vXrWvXMzMy0RrOIyPA4zUREAIALFy4gISEB33//PbZs2YK//OUviI+Pb3OfAQMG4IcffsDWrVtRVlaGNWvWIDc3V6d2LS0t8frrr2PRokX47LPPUFZWhsLCQmRmZj5Md7RERUXhkUcewYsvvoh//etfKC8vR0FBAeLj43Hx4sV2HcPHxwejRo3Cyy+/jKNHj6KoqAgvv/wyrKystKbnPD09sW/fPly5cgXXrl3TWx+I6P6YzBARgLt36ty6dQtPPvkk4uLi8Mc//hEvv/xym/u8+OKLeO211zB37lwEBwfjm2++0dzlpIulS5diwYIF+NOf/gRfX19MnTpVr9eTWFtb48CBA3j00UcxceJE+Pr64ve//z1u3bql00jNZ599BmdnZ/zqV7/Cb37zG8TGxqJ3796wtLTU1Fm9ejX27t0Ld3d3KJVKvfWBiO5PJto74UtERFouXrwId3d3/O///u8DL2AmIsNhMkNE1E779++HSqVCYGAgqqqqsGjRIly6dAlnzpxpcYEvEXUeXgBMRNRODQ0NePPNN3H+/Hn07t0bTz31FDZt2sREhsjIODJDREREksYLgImIiEjSmMwQERGRpDGZISIiIkljMkNERESSxmSGiIiIJI3JDBEREUkakxkiIiKSNCYzREREJGn/D8Bqb13z/Y0mAAAAAElFTkSuQmCC",
      "text/plain": [
       "<Figure size 640x480 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/plain": [
       "1"
      ]
     },
     "execution_count": 60,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from Bio import Phylo\n",
    "from Bio.Phylo.TreeConstruction import DistanceCalculator, DistanceTreeConstructor\n",
    "from Bio import AlignIO\n",
    "\n",
    "\n",
    "sequences = SeqIO.parse(\"my_sequences.fa\", \"fasta\")\n",
    "\n",
    "# Write the sequences to a temporary file (in FASTA format) and read it back\n",
    "with open(\"temp_sequences.fasta\", \"w\") as temp_file:\n",
    "    for i, seq in enumerate(sequences):\n",
    "        temp_file.write(f\">Seq{i+1}\\n{seq}\\n\")\n",
    "\n",
    "alignment = AlignIO.read(\"temp_sequences.fasta\", \"fasta\")\n",
    "\n",
    "# Calculate the distance matrix\n",
    "calculator = DistanceCalculator('identity')\n",
    "distance_matrix = calculator.get_distance(alignment)\n",
    "\n",
    "# Build a tree\n",
    "constructor = DistanceTreeConstructor()\n",
    "tree = constructor.upgma(distance_matrix)  # You can also use other methods like nj (Neighbor-Joining)\n",
    "Phylo.draw(tree)  # This will display a basic tree\n",
    "\n",
    "# Save the tree to a file (e.g., Newick format)\n",
    "Phylo.write(tree, \"output_tree.nwk\", \"newick\")\n"
   ]
  }
 ],
 "metadata": {
  "colab": {
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
