NMPFamsDB

A database of Novel Metagenome Protein Families

A database of Novel Metagenome Protein Clusters

Sample 3300004209

3300004209: Groundwater microbial communities from aquifer - Crystal Geyser CG22_combo_CG10-13_8/21/14_all



Overview

Basic Information
IMG/M Taxon OID: 3300004209
GOLD Reference (Study | Sequencing Project | Analysis Project): Gs0111384 | Gp0110945 | Ga0066649
Sample Name: Groundwater microbial communities from aquifer - Crystal Geyser CG22_combo_CG10-13_8/21/14_all
Sequencing Status: Permanent Draft
Sequencing Center: DOE Joint Genome Institute (JGI)
Published: Y
Use Policy: Open

Dataset Contents
Total Genome Size: 1311645270
Sequencing Scaffolds: 30
Novel Protein Genes: 39
Associated Families: 34

Dataset Phylogeny
Taxonomy Groups | Number of Scaffolds
All Organisms → cellular organisms → Bacteria → Proteobacteria → delta/epsilon subdivisions → Deltaproteobacteria → unclassified Deltaproteobacteria → Deltaproteobacteria bacterium | 1
All Organisms → cellular organisms → Archaea → TACK group → Thaumarchaeota → Nitrosopumilales → Nitrosopumilaceae → Nitrosopumilus | 1
All Organisms → cellular organisms → Archaea | 2
Not Available | 16
All Organisms → cellular organisms → Bacteria → FCB group → Candidatus Marinimicrobia → Candidatus Marinimicrobia bacterium | 1
All Organisms → Viruses → Duplodnaviria → Heunggongvirae → Uroviricota → Caudoviricetes → environmental samples → uncultured Caudovirales phage | 1
All Organisms → cellular organisms → Bacteria → Proteobacteria | 1
All Organisms → cellular organisms → Bacteria → Bacteria incertae sedis → Bacteria candidate phyla → Candidatus Riflebacteria → unclassified Candidatus Riflebacteria → Candidatus Riflebacteria bacterium | 1
All Organisms → cellular organisms → Archaea → DPANN group → Candidatus Huberarchaea → Candidatus Huberarchaeum → Candidatus Huberarchaeum crystalense | 2
All Organisms → cellular organisms → Bacteria → Bacteria incertae sedis → Bacteria candidate phyla | 1
All Organisms → cellular organisms → Bacteria → Bacteria incertae sedis → Bacteria candidate phyla → Candidatus Desantisbacteria → Candidatus Desantisbacteria bacterium CG23_combo_of_CG06-09_8_20_14_all_40_23 | 1
All Organisms → cellular organisms → Archaea → DPANN group → Candidatus Altiarchaeota → Candidatus Altiarchaeales → Candidatus Altiarchaeum → unclassified Candidatus Altiarchaeum → Candidatus Altiarchaeum sp. CG2_30_32_3053 | 1
All Organisms → cellular organisms → Eukaryota → Sar → Alveolata → Ciliophora → Intramacronucleata → Spirotrichea | 1

Ecosystem and Geography

Ecosystem Assignment (GOLD)
Name: Development Of A Pipeline For High-Throughput Recovery Of Near-Complete And Complete Microbial Genomes From Complex Metagenomic Datasets
Type: Environmental
Taxonomy: Environmental → Aquatic → Freshwater → Groundwater → Unclassified → Groundwater → Development Of A Pipeline For High-Throughput Recovery Of Near-Complete And Complete Microbial Genomes From Complex Metagenomic Datasets

Alternative Ecosystem Assignments
Environment Ontology (ENVO): freshwater biome, aquifer, groundwater
Earth Microbiome Project Ontology (EMPO): Free-living → Non-saline → Subsurface (non-saline)

Location Information
Location: USA: Utah: Grand County
Coordinates: Lat. (°) 38.9383 | Long. (°) -110.1342 | Alt. (m) N/A | Depth (m) N/A


Associated Families

Family | Category | Number of Sequences | 3D Structure?
F000212 | Metagenome / Metatranscriptome | 1580 | Y
F000449 | Metagenome / Metatranscriptome | 1126 | Y
F002857 | Metagenome / Metatranscriptome | 525 | Y
F003081 | Metagenome / Metatranscriptome | 508 | Y
F006305 | Metagenome / Metatranscriptome | 376 | Y
F006747 | Metagenome / Metatranscriptome | 365 | Y
F008119 | Metagenome / Metatranscriptome | 338 | Y
F008120 | Metagenome / Metatranscriptome | 338 | Y
F009876 | Metagenome | 311 | Y
F016244 | Metagenome | 248 | Y
F016358 | Metagenome | 247 | Y
F017598 | Metagenome | 239 | N
F017599 | Metagenome / Metatranscriptome | 239 | Y
F019952 | Metagenome / Metatranscriptome | 226 | Y
F028419 | Metagenome / Metatranscriptome | 191 | Y
F029019 | Metagenome | 189 | Y
F030345 | Metagenome | 185 | Y
F042051 | Metagenome / Metatranscriptome | 159 | Y
F043249 | Metagenome / Metatranscriptome | 156 | Y
F063344 | Metagenome | 129 | Y
F065251 | Metagenome / Metatranscriptome | 128 | Y
F067432 | Metagenome / Metatranscriptome | 125 | Y
F071960 | Metagenome | 121 | N
F074391 | Metagenome | 119 | Y
F080881 | Metagenome | 114 | N
F084884 | Metagenome / Metatranscriptome | 112 | Y
F085143 | Metagenome | 111 | Y
F091390 | Metagenome | 107 | Y
F094905 | Metagenome | 105 | N
F096625 | Metagenome | 104 | N
F096627 | Metagenome | 104 | N
F099485 | Metagenome / Metatranscriptome | 103 | Y
F102531 | Metagenome | 101 | N
F104465 | Metagenome | 100 | Y

Associated Scaffolds

Scaffold | Taxonomy | Length
Ga0066649_10034703 | All Organisms → cellular organisms → Bacteria → Proteobacteria → delta/epsilon subdivisions → Deltaproteobacteria → unclassified Deltaproteobacteria → Deltaproteobacteria bacterium | 2756
Ga0066649_10110403 | All Organisms → cellular organisms → Archaea → TACK group → Thaumarchaeota → Nitrosopumilales → Nitrosopumilaceae → Nitrosopumilus | 1498
Ga0066649_10141044 | All Organisms → cellular organisms → Archaea | 1309
Ga0066649_10208870 | Not Available | 1049
Ga0066649_10232837 | All Organisms → cellular organisms → Bacteria → FCB group → Candidatus Marinimicrobia → Candidatus Marinimicrobia bacterium | 985
Ga0066649_10316424 | Not Available | 821
Ga0066649_10353744 | All Organisms → Viruses → Duplodnaviria → Heunggongvirae → Uroviricota → Caudoviricetes → environmental samples → uncultured Caudovirales phage | 766
Ga0066649_10362516 | All Organisms → cellular organisms → Bacteria → Proteobacteria | 754
Ga0066649_10384781 | All Organisms → cellular organisms → Bacteria → Bacteria incertae sedis → Bacteria candidate phyla → Candidatus Riflebacteria → unclassified Candidatus Riflebacteria → Candidatus Riflebacteria bacterium | 727
Ga0066649_10401086 | Not Available | 708
Ga0066649_10405289 | All Organisms → cellular organisms → Archaea | 703
Ga0066649_10409079 | Not Available | 699
Ga0066649_10413702 | All Organisms → cellular organisms → Archaea → DPANN group → Candidatus Huberarchaea → Candidatus Huberarchaeum → Candidatus Huberarchaeum crystalense | 694
Ga0066649_10436388 | Not Available | 671
Ga0066649_10438541 | All Organisms → cellular organisms → Bacteria → Bacteria incertae sedis → Bacteria candidate phyla | 669
Ga0066649_10465951 | Not Available | 643
Ga0066649_10479656 | All Organisms → cellular organisms → Bacteria → Bacteria incertae sedis → Bacteria candidate phyla → Candidatus Desantisbacteria → Candidatus Desantisbacteria bacterium CG23_combo_of_CG06-09_8_20_14_all_40_23 | 631
Ga0066649_10498701 | Not Available | 615
Ga0066649_10505192 | Not Available | 610
Ga0066649_10509258 | Not Available | 607
Ga0066649_10522340 | Not Available | 597
Ga0066649_10523310 | Not Available | 596
Ga0066649_10523557 | Not Available | 596
Ga0066649_10541169 | All Organisms → cellular organisms → Archaea → DPANN group → Candidatus Altiarchaeota → Candidatus Altiarchaeales → Candidatus Altiarchaeum → unclassified Candidatus Altiarchaeum → Candidatus Altiarchaeum sp. CG2_30_32_3053 | 583
Ga0066649_10564711 | Not Available | 567
Ga0066649_10596279 | Not Available | 546
Ga0066649_10601455 | All Organisms → cellular organisms → Eukaryota → Sar → Alveolata → Ciliophora → Intramacronucleata → Spirotrichea | 543
Ga0066649_10605675 | Not Available | 541
Ga0066649_10645986 | Not Available | 518
Ga0066649_10679958 | All Organisms → cellular organisms → Archaea → DPANN group → Candidatus Huberarchaea → Candidatus Huberarchaeum → Candidatus Huberarchaeum crystalense | 500

Sequences

Scaffold ID | Protein ID | Family | Sequence
Ga0066649_10034703 | Ga0066649_100347032 | F043249 | MPKIVGTRERVHQPFYDSLIRVDGSGDLRQGNVGVFGAVQSRSQLFVRQGADVAVSNLTTGGFFPSDQTFVTLAVRVWTYFRFNVESQRTDAQNTTGPVASLAGVTADRIQRVHKLYHQAENQLFWQFIAGDKPQLTTFTAYTPAAGGLDGFFSDTRLPRANNGVPTSAALMRLARPILVPPRQGFQVVAIASPIGQAQGASIIEQLNGAVPNNDPWGASSTGGTVTGTGGTNGLTTTGRDDIEKDIKYLIDGIHSRDVL*
Ga0066649_10110403 | Ga0066649_101104032 | F099485 | MNQKIIHNWQHSPSETQVASNPVKAVSKSLQQTKSLVFTNEDLMGLSWILFEERH*
Ga0066649_10141044 | Ga0066649_101410443 | F065251 | MYLRHFTKGKKKYYYIAKAVRKGTSVIQKSILYIGTADTLYEKLISLKKK*
Ga0066649_10208870 | Ga0066649_102088702 | F063344 | MKEPTYSEYQNIKKHKISIEYSVMFFTIGRCAFFRYFEDDGIEVSPIDRSCVLPMLSDEEIEKLSKFLDETKKEYNKYIF*
Ga0066649_10232837 | Ga0066649_102328371 | F084884 | PTLGILARFQAFFYASAFSQSDGDPPPAPARVTQTVIGA*
Ga0066649_10311488 | Ga0066649_103114881 | F096627 | MKQDAGVFQFDLSLAELYWLAGAFGIASLPLPSDAPSGLSPSQMEARQKNGHASLLTRGLIRPSPGFGWQVERLPAALLQWISSAPSLLRLERIPKDGAAQRIHLFTAGDQGLSLEMDDDTARFVIFQTRRLLQESAVRWLALPAGAKKSTTAHDLPQPITFLPTTWKDSQLASRILKEQGMNTKTAKSTLAWAVSLEWVTALSKVKLEGQRNVLANQFVLCGNAKSIWG
Ga0066649_10316424 | Ga0066649_103164241 | F006305 | LLIFLLISSIKSDSVCTESTLVTELIQDIKDNGKLDCLRRPLNPPSDKIESEDDKKLRLEGAWDTDCAFEADYDWLKSLKETYGLKTGLVDVNGKPVENDFDDQADMCEIVRAVIAGGLLEGAKLDELSPSVVDGIDCPGEGELGQSEICAVSGGAASQRYAWYIFLNGISITATNKPKWELSPDSQKLLDARKDTSS*
Ga0066649_10336744 | Ga0066649_103367443 | F016358 | MIIMQMTVIVNIAQASKSFAKQSPAIKYATLKMTTSKSDIFVLPEWFG*
Ga0066649_10337409 | Ga0066649_103374091 | F096625 | FQYFGFPVHKTGFVTKTFQQSLRFLPLLPAQLSQVIPPLDFPPIGDSYGTNRSLDLCQQTESQLASALSERLSDIQFTIETVHSEQFFADLEKLSHSLHSDHCPEDLGEAALAHHRNLSLWHALIKLEIHTTPDEIKVQFTEWYQRCAAHSTAALASVFATTYLKNNQDKEPIFEEDGAYTFKLILPCTCLDEHLLDKMADHLKDSWIYNPSSETDEVSNPELIPLYLLWFYQIVHPGPHLEFSEYLHALSSPAFTPNHLRLS
Ga0066649_10353744 | Ga0066649_103537441 | F042051 | FGEVSWNDDVFSGSEKKNSKDLFLRLDEGSNEMRLITQPFQYLVHKYKKEGDPGFGQKVNCSAVHGSCPLCAAGDKAKPRWLLGVISRKTGTYKILDVSFAVFSQVRKYARNTARWGDPTKYDIDIVVDKNGGATGYYAVQPIPKEPLSAADQQIKDSVDFDDLKRRVTPPTPDMVQKRIDKINGVTGEAAEAAPTPSGKAAKAATKAAPAPVNMSEEEDESFPAYDGDQAK*
Ga0066649_10362516 | Ga0066649_103625163 | F006747 | TEKTFMSNATLTSAKPPGNRTQLNKQLAQAKTLLRQMRETVEDIEDARTIERAKRANGNKPRIPWAQVKKELQLN*
Ga0066649_10384781 | Ga0066649_103847812 | F009876 | MSAKFLGSASNSGRGILPRRSGWKPLPLFRWIPTEHFKAGIET*
Ga0066649_10401086 | Ga0066649_104010861 | F102531 | IPIQPQKTIAQINQEIDNLVARGVDELEAIRQVGSISIPNYATTPEQIAALRLADQARTADEILQCPYTWCRHNSAISDAILESRDYQPYRAAMTMQTTEGLSAGGHQTSAIIINGEPVFIDLTNNLIITGQQALEQVLINSEKQLTALEMIRLTTNNVWDVINLIPK*
Ga0066649_10405289 | Ga0066649_104052891 | F091390 | MNTKSTKSTTIRINEPTKEKLETLDFVRKHTFDDILTELIDFYEKNKGKKTK*
Ga0066649_10407105 | Ga0066649_104071051 | F096625 | RSLDLCQQTESQLASDLSELTSDLQFTIETVHSEQFFADLKRLSHSLHSDHCPKDLGESALAHHRNLSLWHALNKLEIHTTSDEIQAKFTEWYRSYSAHSTAKLAALFATTYLRTHQDKEPIFEEDGAYTFKLILPCTFLDEHLLEEMADHLKDSWIYNPSSETDEVSNPELLPLYLLWYYQTVHPGPHLEFSEYLHALSSPSFTPNHLRQSQHWFLHALPYPVLPCFESGEG
Ga0066649_10409079 | Ga0066649_104090791 | F071960 | QNQGEAEFPEGFQERTGKTVLELLAMPLGEAIPLLKQGDGSPLSRLLPHRAFEGMLLMPRGADTICATCHEHGACRDWVIGAAEDVCLELDLVEPGEILSYIAAKVIEGDYDHNTSITQIAQEFRADVLITTEVCK*
Ga0066649_10413702 | Ga0066649_104137021 | F019952 | MVQTVQSTRVYPGQSVILLEKPRVLRRILFSIRALTDLTMGHKSHISFDDPSFLTYYILDGPVQQLEAKGEGICQGNIWAHNASSVELLFVMTEILV*
Ga0066649_10432250 | Ga0066649_104322501 | F000212 | DPVCEPPKCHTSCAEPKNAICDVKCEKPECEIKCPDKGCEMFDCPKCVTVCKAPHCVTHCQAPKPECEAVCEEPKCDWKCHKPQCPKPKCELVCENPNCAPKVECCPCAAGGARVAQPFPFFKEAEKNKDCCSCK*
Ga0066649_10436388 | Ga0066649_104363881 | F017598 | VTKLLEPHSKIVGYFLGKDIIHTNRDDITNRLAAHIIKHSQTHWTPALDVLNTTVLGKGHNTRMVTLVVGNTDYQGVLDILTQQPMETLSFLDHRTKRQNINQFDKMLKYHDYIVSHSTAVRLENVHYLDTQALWAHLQPITNTSFCDIFAGRTQGTTYIQCFKEKEPEVATAIQSYLLANFSDSADRPIIGERGGGTVNSSSTGTRTYRGNHSGKSTGKETT
Ga0066649_10438541 | Ga0066649_104385412 | F074391 | MVQTVRSVLIQPGHKLSILEKPLVLSRIFFGICAFAPQTSWYESQISLGGPIFSSYYVMGGPVNYFGARGEGIFQGDIWVRNVSAIDLLYSVTEILH*
Ga0066649_10465951 | Ga0066649_104659511 | F002857 | MKNTKENTGVFKDDETTEMKASMKFLTDRNFVERISLRDGMKRKFTMVKDERTEIIGFDGNTKQGITYTVTENGNLRNFFTASMKCISMLSKFSKDDMFEIELRTKKVGNNIVSFYVVKKL*
Ga0066649_10479656 | Ga0066649_104796562 | F029019 | MKNTEICSSEYIKSEVLYRKMLHKINLLEHQIFEVKKMTNIFWDIRLKKPELACWILEGDSVWWKTAADDISERWKEIDETKDEIDWAIRNMFVSVEREEEYKKKIKNNKKRTQ*
Ga0066649_10498701 | Ga0066649_104987011 | F080881 | MSKNSKSAVALSELAVMPQFAAPVVEITPEVLSMIVMDIDPALIEAAELKEARLDEIRAKKDEAKRLIALYHEIAKTLNPLCDEYDSVEEERKLLKGVLEDALLASRITLANHPKVLENKAQIVKAAPNLMAEFNEKFNAAVIKQTQIAQADNIARMKALDASFAELDKKTAQLS
Ga0066649_10505192 | Ga0066649_105051921 | F008120 | MKNTETSTSECIKAEALHQKILWQINLLGCQISECKKRFTFFYTQVQKKSDVSAGEKNENDISWVEAFNDFVENWEEMENVKDEVDWMIRDIFFAIEREEEEYNKKIKNNKKRTQ*
Ga0066649_10505192 | Ga0066649_105051922 | F104465 | MVREKPDENEKTIKEYNARGERMYKNAINRENAKKTRKIK*
Ga0066649_10509258 | Ga0066649_105092581 | F102531 | YGRQTNVIQPQKTVAGINQEIDKLVANGVDKLEAIRQVGSVSVPDYAATPEQIAQLQLADKARTADEILQCPYTWCRHNSTVSDAILESRGYQPYRAAMMMQTTEGLSAGGHQVSAIIVNGEPVFIDLTNNLIIPGQQALEQILLNSGKQLTALEMIRLTTNNVWDVINLIPK*
Ga0066649_10522340 | Ga0066649_105223401 | F030345 | MELKNEKICIQTIEKRERENDSCWYAITDNQGKRFSCFEDEVAKKLQTNIVNLCKVKYSGKYANVMSVEGYEDNPKVADVNIERKREKNLESLRILKCVALKSAALCSGGQSVNSAEVLTKA
Ga0066649_10523310 | Ga0066649_105233101 | F085143 | MARIDDGFATLIEFAEDSDVQMWEKEVTPPGVSGGGENDTSTMRNTTWRTKSPKGLMSLSEASLVVAYDPAVYNEIITMLNVNQQITITFADGSTLVFWGWIDEFTPGAAAEGSQPTATVKIIPSNQNGSGVETAPQYSAAP*
Ga0066649_10523557 | Ga0066649_105235571 | F008119 | ENVYNMIFVCTEKTNKSITHLVRSVITIGVKRSSTDGTVYFDKINMNIGYINNAGNFTSVSDLDALHAFSTASETYVDLCLQEFVRIMSDYPLVNKRFAVKVNIFAHVDATTTSGQLGMYHTRGSADSYVEVELK*
Ga0066649_10541169 | Ga0066649_105411692 | F000449 | MENKKAGEEDIPSPDFDQEFELRTSAPQEEMERYYARKKASLQRKIRQINKKIKKNIYNPYAGDDEKHKNVLE*
Ga0066649_10564711 | Ga0066649_105647111 | F000449 | MKNKKAGEGISPSPDLDQEFELRTSAPQEEMERYYTRKKASLLRKIRKINKKIKKNIYNPYAGDDEKHKNRK*
Ga0066649_10564711 | Ga0066649_105647112 | F017599 | MQATMKNTKIENEIKKRTPPGSKNIEKYRISTEYTSISYTTPHFLVFRYYCNSETYLCETRKSIVLWWISQAEKEKLLAFEKQKKEKHKGLKVFDGTTMEGL*
Ga0066649_10586203 | Ga0066649_105862032 | F094905 | MNSLYGYIAVLIVVVFLGAGWAVEHDKRITYQAKVEQAGADALAQTEKINAKHREEMQNAEQNTIIATNSIAD
Ga0066649_10596279 | Ga0066649_105962791 | F028419 | FASSIMSVHDINFNIMNDNMVTTRHYLSLNYWDLRNLGSPSNKFLLYEPIITKLSYLYQNNYMSDKFSLSSDPTGKVIITGGYNNMFHIIDADQKLNTQIIIDENNEKIMNTNVIRKINSKGSCFYKKDDPSLTNINFDKRILHQTYSPVENFCHLILLNCIYSYTGALAKKSK*
Ga0066649_10601455 | Ga0066649_106014551 | F003081 | MREEMFEETRFGTEVFAMHVRGVDVIFVFSYLHILKKIHLKNYITS
Ga0066649_10605675 | Ga0066649_106056752 | F016244 | MCIFPYLCFCNYNYKIFKNIIKINKMVLPILPGIIVLAGARILVSYGTHLLRFIVANPKILLGTATVATVADALKEHEKNEQIRNSILQDIYTQNPELAQKIVSAGGFSFHPVENIFQMAVSSAITGLIIYAIIQKI*
Ga0066649_10621776 | Ga0066649_106217761 | F067432 | PHCVTHCQAPKPECEAVCEEPKCDWKCHKPACPKPKCELVCENPNCAPKVECCPCAAGAARVAQPFPMFKESATNKDCCSCK*
Ga0066649_10645986 | Ga0066649_106459861 | F002857 | MENKKRIWGTTQMDEAYDEMIRMEKEKQNGSVFKNDEKNDKKEMEASMKFLSDKNFIERISLRDGMKRKFKLVKDEATEIIGFDGNKKQGITYTVTENGNLRSFFTTSLKCISMLSKLQKDDVFEIELRTKKVGDTLISYHVVKKL*
Ga0066649_10679958 | Ga0066649_106799581 | F008119 | KEVVRKIYVFQQVTNPYVFEGNNTHLLMENPSPSNLINGSDFSTLNSFSNTDSLIVYDMVFVCTEKTDKSITHSVRGVITIGVKRNSGNGNVNFDKINMNIGYIESTGAFTPISNADATHAYSTTSQTYVGLCLQDFVGIVNDYSLAGKRFAVKVNIFAHVDSADT
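
The rows of the Sequences table can be converted to FASTA for downstream tools. The following is a minimal Python sketch, not an NMPFamsDB tool: the helper names `parse_row` and `to_fasta` are hypothetical, and the row layout is an assumption inferred from the table itself (a protein ID equal to the scaffold ID plus a gene index, a family ID of "F" followed by six digits, and an optional trailing "*" on the sequence). The pattern accepts rows with or without " | " column separators.

```python
import re

# Assumed row layout (inferred from the table, not documented by NMPFamsDB):
# scaffold ID, protein ID (scaffold ID plus a gene index), family ID
# ("F" + 6 digits), amino-acid sequence with an optional trailing "*".
# The " | " column separators are optional, so fused rows parse as well.
ROW = re.compile(
    r"^(?P<scaffold>Ga\d+_\d+)\s*\|?\s*"
    r"(?P<protein>(?P=scaffold)\d+)\s*\|?\s*"
    r"(?P<family>F\d{6})\s*\|?\s*"
    r"(?P<seq>[A-Z]+)(?P<stop>\*?)$"
)


def parse_row(line):
    """Split one Sequences row into its fields (hypothetical helper)."""
    match = ROW.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized row: {line[:40]!r}")
    return {
        "scaffold": match["scaffold"],
        "protein": match["protein"],
        "family": match["family"],
        "seq": match["seq"],
        # True when the row's sequence ends with "*".
        "has_stop": match["stop"] == "*",
    }


def to_fasta(rows, width=60):
    """Render parsed rows as FASTA text, wrapping sequences at `width`."""
    lines = []
    for row in rows:
        lines.append(
            f">{row['protein']} family={row['family']} scaffold={row['scaffold']}"
        )
        seq = row["seq"]
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example = (
        "Ga0066649_10505192 | Ga0066649_105051922 | F104465 | "
        "MVREKPDENEKTIKEYNARGERMYKNAINRENAKKTRKIK*"
    )
    print(to_fasta([parse_row(example)]), end="")
```

The trailing "*" is kept as a separate flag rather than written into the FASTA record, since some tools reject it inside a protein sequence. Whether a missing "*" means the gene runs off the end of its scaffold is not stated on this page, so the sketch only records its presence.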




© Pavlopoulos Lab, Bioinformatics & Integrative Biology | B.S.R.C. "Alexander Fleming" | Privacy Notice