NMPFamsDB

A database of Novel Metagenome Protein Families

Metagenome / Metatranscriptome Family F104140



Overview

Basic Information
Family ID F104140
Family Type Metagenome / Metatranscriptome
Number of Sequences 100
Average Sequence Length 49 residues
Representative Sequence MTYSMKSENKLEGASNFRAWKTRIDLILAKNKVLDIVKGKIVEPEFEAGK
Number of Associated Samples 26
Number of Associated Scaffolds 100

Quality Assessment
Transcriptomic Evidence Yes
Most common taxonomic group Unclassified
% of genes with valid RBS motifs 0.00 %
% of genes near scaffold ends (potentially truncated) 0.00 %
% of genes from short scaffolds (< 2000 bps) 0.00 %
Associated GOLD sequencing projects 26
AlphaFold2 3D model prediction Yes
3D model pTM-score 0.43


Most Common Taxonomy
Group Unclassified (100.000 % of family members)
NCBI Taxonomy ID N/A
Taxonomy N/A

Most Common Ecosystem
GOLD Ecosystem Environmental → Terrestrial → Soil → Unclassified → Unclassified → Soil (68.000 % of family members)
Environment Ontology (ENVO) Unclassified (96.000 % of family members)
Earth Microbiome Project Ontology (EMPO) Free-living → Non-saline → Soil (non-saline) (78.000 % of family members)




Structure & Topology

Predicted Topology & Secondary Structure

Classification: Globular
Signal Peptide: No
Secondary Structure distribution: α-helix: 58.97%, β-sheet: 0.00%, Coil/Unstructured: 41.03%

Predicted 3D Structure

pTM-score: 0.43

Low Quality Model:

This family has a low confidence model (pTM < 0.7) and has not been screened against SCOPe or PDB.
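The low-confidence rule stated above (pTM < 0.7) can be expressed as a one-line check. The following is a minimal sketch; the function name `is_low_confidence` and the constant `LOW_CONFIDENCE_PTM` are illustrative assumptions, not part of NMPFamsDB.

```python
# Threshold quoted on this page: models with pTM < 0.7 are flagged as low confidence.
LOW_CONFIDENCE_PTM = 0.7  # hypothetical constant name


def is_low_confidence(ptm: float, threshold: float = LOW_CONFIDENCE_PTM) -> bool:
    """Return True when a model's pTM-score falls strictly below the threshold."""
    return ptm < threshold


# Family F104140 reports a pTM-score of 0.43.
print(is_low_confidence(0.43))
```

A strict comparison is used so that a model scoring exactly 0.7 is not flagged, matching the "pTM < 0.7" wording.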



Gene Neighborhood

Neighboring Pfam domains

Pfam ID | Name | % Frequency in 100 Family Scaffolds
PF14223 | Retrotran_gag_2 | 17.00




Phylogeny

NCBI Taxonomy

Name | Rank | Taxonomy | Distribution
Unclassified | root | N/A | 100.00 %


Environmental Properties

Associated Habitat Types

Habitat | Taxonomy | Distribution
Soil | Environmental → Terrestrial → Soil → Unclassified → Unclassified → Soil | 68.00%
Roots | Host-Associated → Plants → Roots → Unclassified → Unclassified → Roots | 18.00%
Leaf | Host-Associated → Plants → Phylloplane → Unclassified → Unclassified → Leaf | 7.00%
Plant Litter | Environmental → Terrestrial → Plant Litter → Unclassified → Unclassified → Plant Litter | 6.00%
Switchgrass Phyllosphere | Host-Associated → Plants → Phyllosphere → Unclassified → Unclassified → Switchgrass Phyllosphere | 1.00%




Associated Samples

Taxon OID | Sample Name | Habitat Type
3300020588 | Leaf-associated microbial communities from Pinus contorta in Yosemite National Park, California, United States - Lodgepole_Yose_1 | Host-Associated
3300020590 | Leaf-associated microbial communities from Pinus contorta in Yosemite National Park, California, United States - Lodgepole_Yose_2 | Host-Associated
3300020600 | Leaf-associated microbial communities from Pinus contorta in Yosemite National Park, California, United States - Lodgepole_Yose_4 | Host-Associated
3300023017 | Spruce roots microbial communities from Bohemian Forest, Czech Republic - CRU3 | Host-Associated
3300023224 | Spruce roots microbial communities from Bohemian Forest, Czech Republic - CRU4 | Host-Associated
3300023533 | Metatranscriptome of spruce litter microbial communities from Bohemian Forest, Czech Republic - CLE5 (Metagenome Metatranscriptome) | Environmental
3300023561 | Metatranscriptome of spruce roots microbial communities from Bohemian Forest, Czech Republic - CRU2 (Metagenome Metatranscriptome) | Environmental
3300023563 | Metatranscriptome of spruce roots microbial communities from Bohemian Forest, Czech Republic - CRA4 (Metagenome Metatranscriptome) | Environmental
3300023688 | Metatranscriptome of spruce roots microbial communities from Bohemian Forest, Czech Republic - CRE1 (Metagenome Metatranscriptome) | Environmental
3300024123 | Spruce roots microbial communities from Bohemian Forest, Czech Republic - CRU5 | Host-Associated
3300028019 | Spruce roots microbial communities from Bohemian Forest, Czech Republic - CRU1 | Host-Associated
3300030763 | Metatranscriptome of rhizosphere microbial communities from Maridalen valley, Oslo, Norway - NZI5 (Metagenome Metatranscriptome) | Environmental
3300030764 | Metatranscriptome of plant litter microbial communities from Maridalen valley, Oslo, Norway - NLI1 (Metagenome Metatranscriptome) | Environmental
3300030879 | Metatranscriptome of rhizosphere microbial communities from Maridalen valley, Oslo, Norway - NZU1 (Metagenome Metatranscriptome) | Environmental
3300031592 | Metatranscriptome of spruce roots microbial communities from Maridalen valley, Oslo, Norway - NRA4 (Metagenome Metatranscriptome) | Environmental
3300031664 | Metatranscriptome of spruce roots microbial communities from Maridalen valley, Oslo, Norway - NRA5 (Metagenome Metatranscriptome) | Environmental
3300031690 | Metatranscriptome of spruce roots microbial communities from Maridalen valley, Oslo, Norway - NRI1 (Metagenome Metatranscriptome) | Environmental
3300031708 | FICUS49499 Metagenome Czech Republic combined assembly | Environmental
3300031816 | Metatranscriptome of spruce roots microbial communities from Bohemian Forest, Czech Republic - CRI2 (Metagenome Metatranscriptome) (v2) | Environmental
3300032515 | FICUS49499 Metatranscriptome Czech Republic combined assembly (additional data) | Environmental
3300032821 | Metatranscriptome of switchgrass phyllosphere microbial communities from Michigan, USA - G5R1_NF_15MAY2017_LR1 (Metagenome Metatranscriptome) | Host-Associated
3300033544 | Spruce roots microbial communities from Maridalen valley, Oslo, Norway - NRE5 | Host-Associated
3300033545 | Spruce roots microbial communities from Maridalen valley, Oslo, Norway - NRE4 | Host-Associated
3300033546 | Spruce roots microbial communities from Maridalen valley, Oslo, Norway - NRE2 | Host-Associated
3300033547 | Spruce roots microbial communities from Maridalen valley, Oslo, Norway - NRE1 | Host-Associated
3300033548 | Spruce roots microbial communities from Maridalen valley, Oslo, Norway - NRE6 | Host-Associated


Family Sequences

Protein ID | Sample Taxon ID | Habitat | Sequence
Ga0213495_100394131 | 3300020588 | Leaf | MKAENKLGATNFRAWKTRIDLLLAKEDLLRIVKGMVIEPKKMKRN
Ga0213495_100429161 | 3300020588 | Leaf | MKSGNMLEGASNFRAWKLRIDIILTKNKVLDIVTRKVV
Ga0213495_100726501 | 3300020588 | Leaf | MASMKSENKLEGATNFRDWKTRIDLILTKEDILEIVQGKVTEPTDEAGK
Ga0213495_100996242 | 3300020588 | Leaf | MASSMKFENKLERASNLRAXKIRIGIILAKNKVLDIVTGKVVEPTNVIGKEKIKED
Ga0213496_100517942 | 3300020590 | Leaf | MKFMETSMKKENKLEGASNYRGWKKGIELIFAKNKLLDFVQGKFKELSSG
Ga0213498_100936201 | 3300020600 | Leaf | MNIFMNSEYKLKGASNFRAWKTRVDLILARNNGLGT
Ga0213498_101275931 | 3300020600 | Leaf | MKSENKLQGATNFRAWKTGIDLILAKNDVLDIVKGNVTEPSDNEGKAKYKKDDIIS
Ga0224574_1001352 | 3300023017 | Roots | MKSENKIDGASNFRAWKTRIDLILSKNKVPDIVRGKIVKPEFEEGEEKEPQ
Ga0224574_1070651 | 3300023017 | Roots | MTYSMKSENKLEGASNFRAQKTRIDLILAKNKVLDIVKGKIVEPRF
Ga0224575_1021702 | 3300023224 | Roots | MTYSMKSKNKLEGDSNFRAWKTRIDLILAKNKVLDIVKGKIMEPQFEEGKEKEP
Ga0224575_1087471 | 3300023224 | Roots | MKSENKLEGPSNFRAWKKRIELILAKNKVLDIVKGKIMEPQFEEGKEKEPHNVAFMEKFKDNDINGMSII
Ga0247537_1032822 | 3300023533 | Soil | KTKNKLKKASNFRAWKTRIDLILAKNKVLDIVKEKIVVPEFEEGKEKEP
Ga0247518_1147621 | 3300023561 | Soil | MKLENKLDGASNFRAWKTRIDLILAKNKVLDIVKGKIVEPVFEEGKEKEPQNVAVI
Ga0247530_1109101 | 3300023563 | Soil | MKSENKLDGASNFRAWKTRIDLILANNKVLDIVKGKIMEPQFEAGKE
Ga0247522_1095401 | 3300023688 | Soil | MTNSMKSENKLDGASNFRAWKTRIDLILAKNKVLDIVKGKIVKPEFKEGEEKEP
Ga0247522_1124661 | 3300023688 | Soil | MKSENKLDGAFNFRAWKTRIDLILAKNKVLDIAKGKIVKPEFEEGKEKEPQNVAAMEKFKDVDIN
Ga0228600_10013711 | 3300024123 | Roots | MKSENKLDGASNFRAWKTRIDLILAKNKVLDIVKGKIVKPEFKEGEEKEPQNVAA
Ga0228600_10182321 | 3300024123 | Roots | MKSENKLEGDSNFRAXKKRIELIISKNKVLDIMKGKIMEPRFEEGKEKEP
Ga0228600_10231141 | 3300024123 | Roots | MKSENKLDGPSNFRTLKTRIDLILSKNKVLDIVKPEFEEGKE
Ga0224573_10053471 | 3300028019 | Roots | MTYSMKSKNKLEGDSNFRAWKTRIDLILAKNKVLDIMKG
Ga0224573_10115451 | 3300028019 | Roots | SMKSKNKLEGDSNFRAWKTRIDLILAKNKVLDIVKGKIMEPQFEEGKEKEP
Ga0224573_10159761 | 3300028019 | Roots | MKSENKLDGASNFKAWKTRIDLILAKNKVLDIMKGKIVKPKF
Ga0265763_10167761 | 3300030763 | Soil | MKSENKLDGASNFKAWKIRIDLILAKNKALDIVKGKIVKPEFEEGEEKEPQNGAAMEKFK
Ga0265720_10147572 | 3300030764 | Soil | MMNSMKSENKLDGASNFRAWKTRIDLILAKNKVLDIVKGKIVKPEFEEGEEKEL
Ga0265765_10355751 | 3300030879 | Soil | MKSENKLDGASNFRAWKTRIDLILAKNKVLNIVKGKIVKPEFEEGKEKEPQKIATMEKF
Ga0310117_1156981 | 3300031592 | Soil | MKSKNKLDGASNFRAWKTRIDLILAKNKMLDIVKGKIVKS
Ga0310118_1137051 | 3300031664 | Soil | MKSKNKLDGASNFRAWKTRIDLILAKKKVLDIVKPEFE
Ga0310101_1206711 | 3300031690 | Soil | MKSENKLDGASNFRAWKTRIDLILAKNKVLDIVKGKIVKPEFKE
Ga0310686_1000417572 | 3300031708 | Soil | MTYSMKSENKLEGASNFKAWKTRIDLILAKKNVLDIVKGKIVEPQ
Ga0310686_1000787181 | 3300031708 | Soil | MTYSIKSENKLDGASNFRILKTRIDLILAKNKVLDIVKGKIVEPEFEEGKE
Ga0310686_1002411091 | 3300031708 | Soil | MKSENKLEGNPNFRAWKTRIDLILSKIKVLDIVKGKIVKPEFEEGKEKEPRNV
Ga0310686_1006974611 | 3300031708 | Soil | MTYSMKSENKLEGASNVRAWKTRIDLILSKDKVLDIMKGK
Ga0310686_1007984871 | 3300031708 | Soil | RIDLILAKNKLLDIVKGKIVEPQFEEGKEKGPQNIAVMEKFKENDINSISIIVDFVKDHLIP
Ga0310686_1013194231 | 3300031708 | Soil | MTYSMNSEKNLEGASNFRAWKTMIEIILAKKKVLDIVKGKIVEPQFETRKEKEPQNVPT
Ga0310686_1015416012 | 3300031708 | Soil | KLEGYSNFIAWKKRIELIIAKNKVLDIVKGKIMEP
Ga0310686_1024420721 | 3300031708 | Soil | MTYSMDSENKLEGASNFIAWKTRIDLILAKNKVLDIVKGKIMEP
Ga0310686_1025052061 | 3300031708 | Soil | MKSENKLDGASKFRAWKKRIDLILIKNKLMDIVKGKIAKPEFEKGKEKELQNVVAMEKFKED
Ga0310686_1028091731 | 3300031708 | Soil | SYDLLYEIKNKLEGDSNFRAWKTRIYLILAKNKVLDIVKRKIVELEFEEGK
Ga0310686_1039867592 | 3300031708 | Soil | MRSENKLEGVSNFGAWKTRIDLILAKNKVLDMVKGKIMEPQFEAGK
Ga0310686_1040045382 | 3300031708 | Soil | MTYSMKLENKLEGAYSFRAWKTSIDLILAKNKVLNIMKGKIMELEFEEGKEKEP
Ga0310686_1043714551 | 3300031708 | Soil | MKSEKKLEGASNFRAWKRRIDLILAKNKVLDIVKEKNMEPQFEEGKEKEPQNVAIT
Ga0310686_1046529641 | 3300031708 | Soil | MKSENKLDGASKFRAWKTRIDLILAKNKVLDIVKGNIVEPEFEEGKEKEQQNVAVI
Ga0310686_1055488402 | 3300031708 | Soil | MKSEKKLDGASNFRAWKKRIDLILAKNKVLDIVKGNIVKPEFK
Ga0310686_1063428981 | 3300031708 | Soil | MKLENKLQGTSNFKAWKTRIDLILAKNKVLYIVKGKIVEP
Ga0310686_1067218193 | 3300031708 | Soil | NYSMKSENKLDGASNFRAWKTRIELILAKNKVLDIVKGNIVEP
Ga0310686_1072042352 | 3300031708 | Soil | MNYSMQSENKLEGASNLRAWKTRIDLILSKNKFLDIMKKKIVEPQF
Ga0310686_1074044641 | 3300031708 | Soil | MKSKNKLDGASNFRAWKTRIDLILDKNKVLDIVKGNIVEPEFEEGKEKEPHNVAVMEK
Ga0310686_1079850161 | 3300031708 | Soil | MKSENKLDGASNFRSWKTRIDLILAKNKVLKGKNMEPMFEEGKEKEPQNV
Ga0310686_1080278591 | 3300031708 | Soil | MKLENKLEGDSNFRAWKTRIELILAKNKVLDIIKGKIMEPQYET
Ga0310686_1084190761 | 3300031708 | Soil | MTYSMKSENKLEGASNFRAWKTRIDLILAKNKVLDIVKGKIVEPEFEAGK
Ga0310686_1086916561 | 3300031708 | Soil | MKSENKLDGASNFRAWKTRIDLILAKNKVLDIVKQAFEEGKEKEQ
Ga0310686_1091959801 | 3300031708 | Soil | MSYSMKSENKLEGDSKFRAWKTRIDLILSKNKVLDIVKGKIMEPQFEARKEKEPQNM
Ga0310686_1093301862 | 3300031708 | Soil | MKSENKLEGASNFRAWKTRIDLILSKNKVLDIMKGNIMEPQFEE
Ga0310686_1099764721 | 3300031708 | Soil | MKSKNKLEGASNFRAWKTRINLILAKNKVLDIVKGKIAEPHFEV
Ga0310686_1104860141 | 3300031708 | Soil | MTYSMKPENKLEGDSNFRAWKTRIDLILSKNKVLDIVKGKIVEPEVEEGKEKEPQNIG
Ga0310686_1114012911 | 3300031708 | Soil | MTYSMKSKNKLEGDYNFGAWKTRIDLILAKNKALYIMKGKIMEPQVEAGKEKEPQNIV
Ga0310686_1119719251 | 3300031708 | Soil | MTYSIKSKNNLEVASDIRAWKTRIDLILAKNKLLDIMKGKIMETKFEAGKEKEPHNVTVMENFKDNDI
Ga0310686_1125699622 | 3300031708 | Soil | MTYCMKLENKLDGASNFIAWKTRIDLILAKNKVLDIVKGKIVKPEFEEGK
Ga0310686_1126067311 | 3300031708 | Soil | MAYSMKLENKLEGDSNFIACKTRIDLILAKNKVLDIMKGNIVEP
Ga0310686_1126188381 | 3300031708 | Soil | MKSENKLDGASNFRAWKTRIDLILSKNKVLDIVKGKIVEPE
Ga0310686_1129834061 | 3300031708 | Soil | MTYSMKSESKLEGASNFRAWKTRIDLILAKNKVLNIVKGKIMEP
Ga0310686_1132418902 | 3300031708 | Soil | MNSENKLDGASNLRAWKTRIDLILANNKVLDIVKGNIVKLEFKEG
Ga0310686_1132566921 | 3300031708 | Soil | MKSENKLDXASKFKVCKTRIDLILAKNKVLDIVKGKIMKLEFEEGEEKEPQNIATMEKFKDVDINAMSI
Ga0310686_1140262711 | 3300031708 | Soil | MTYSMKSENKLDGASDFRAWKTRIDLILAKNKLLDIVKGKIVKPEFEEG
Ga0310686_1140955531 | 3300031708 | Soil | MNYSMKSKNKLEGASNFIACKKMIDLILSKNKVLDIVKGKIVEQQFEEGKENEP
Ga0310686_1144203081 | 3300031708 | Soil | MKSENKLEWASNFREWKTRIDLILAKNKVLDIVKGKIVKPKFKEGEEKEP
Ga0310686_1145473721 | 3300031708 | Soil | MNYSMKSKKKLEGASKFKAWKTRIDLIPAKKKVLDIVNGKIVEPEFEEGKEKE
Ga0310686_1150643422 | 3300031708 | Soil | MASSMKSKNKIEEASNFRAWKTIIDLILAKNKVLDMVKGKIKELKDDAGK
Ga0310686_1151186631 | 3300031708 | Soil | MKSENELDGASNFRAWKRNIDLILAKNKVLDIVKGKIVEREFEEGK
Ga0310686_1155022242 | 3300031708 | Soil | MNYSMKSEKKLEGASNFKAWRTRIDLILAKNKVLDIVKGKIMEP
Ga0310686_1161813051 | 3300031708 | Soil | SNFRAWKTRIYLILSKNKVLGIMKGKIMELQFEARKEKEPQNVVVMKTKDLA
Ga0310686_1162226841 | 3300031708 | Soil | MKSEKKLEGDSNFIAWNTSIDLILAKNKVLDTVKGGIMEPQFEVGKEKELQTIATM
Ga0310686_1163127411 | 3300031708 | Soil | MNYSMKSENKLEGASNFIAWKTRIDLILAKNKVLDIVKGKIMEPRFDAGKEKEHQNMAIMKTKDL
Ga0310686_1174384792 | 3300031708 | Soil | MTYSMKSENKLEGASNFRAWKTRIDLILAKNKVLDIVKGNIMEPQFEEG
Ga0310686_1179107701 | 3300031708 | Soil | QVKEVMTYSMKSENKLAGVSNFRSWKTMIDIILAKNKVLDIVKGNIVEP
Ga0310686_1180858191 | 3300031708 | Soil | MTYSMKLENKLDGVSNFRAWKKRIDLILAKNKVLDIM
Ga0310686_1183309022 | 3300031708 | Soil | MMKSMKSENKIDGASNLIAWNTRIDLMLAKNKVLDIVKGNIVEP
Ga0310686_1188831422 | 3300031708 | Soil | MTHSMKSKNNLDGASNFGAWKTRIDLILAKNKVLDIVKGKIVKPKFEEGIEKEPQNVAAMENFKDGD
Ga0310686_1190005232 | 3300031708 | Soil | MTYSMKSENKLEGAYNFRAWKSRIGLILAKNKLLDIV
Ga0310686_1190411791 | 3300031708 | Soil | VDQVKEVMTYSMKSKNKLEGASNFRAWKTRIDIILTKNKVLDIVKGKIMEP
Ga0310686_1195421151 | 3300031708 | Soil | MTYSIKSENDLYGASNFKAWKTRIDLILYKNKVLDIVKGNIVEPEFEEGK
Ga0310686_1197780642 | 3300031708 | Soil | MTYSMKSENKIEGASNFKEQKTRIDLTLSKNKVLDIVKGKIMEPWFEAGK
Ga0310686_1198904972 | 3300031708 | Soil | MKSEKKLEGASNFRAWTTKIDLILAKNKVLDIMKGKIVEPKFEERKEKEP
Ga0316042_1268452 | 3300031816 | Soil | MKSENKLDGASNFRAWKTRIDLILAKNKVLDIVKG
Ga0316042_1305442 | 3300031816 | Soil | MTYSMKSENKLDGASNFRAWKTRIDLLLAKNKVLDIVKGKIMEP
Ga0348332_101970011 | 3300032515 | Plant Litter | MKSKNKLEGASNFKAWKTMIDLILAKNKVLDIVKGKIMERQFEEGKEKEPQNVAVME
Ga0348332_105195301 | 3300032515 | Plant Litter | KSKNKLDGASNFKAWKTRIDLILAKNKVLDIVKGKIVKPEFEEEKEKEP
Ga0348332_120770921 | 3300032515 | Plant Litter | MKSENKLDGASNFRAWKTRIDLILDKNKVLDIVKGKIVKPEFKEGEEKEPQNVAA
Ga0348332_134586472 | 3300032515 | Plant Litter | VSNFKTWKTRIDLILAKNKVLDLMKGKIMEPEFEAGKDKEPQNVAVMEKFTDNDINSMSI
Ga0348332_142436491 | 3300032515 | Plant Litter | MKSENKLYGASNFIAWKTRIDLILSKNKVLDIVKGNIVKPEFEEGKE
Ga0348332_145037261 | 3300032515 | Plant Litter | MTNSMKSENRLDGASDFRAWKTRIDLILSKNKVLDIVKGKIVKPKFEEGKEKEPQ
Ga0314719_10147191 | 3300032821 | Switchgrass Phyllosphere | QVKAAMTYSMKLENKLDGAFNFRAWKTRIDLILAKNKVLDIVKGKIVEA
Ga0316215_10178571 | 3300033544 | Roots | MTYSMKSKNKLEGDSNFIAWKTRIDLILAKNKVLDIVKGKIMEPRFEEG
Ga0316215_10278041 | 3300033544 | Roots | MTNSMKSENKLDGASNFRAWKTRIDLILAKNKVLDIVKGKIVKP
Ga0316215_10280091 | 3300033544 | Roots | SMKSKNKLEGDSNFRAWKTRIDLILAKNKVLDIVKGKIMEPRFEEGKEKEP
Ga0316214_10437621 | 3300033545 | Roots | MTYSMKSENKLEGASNFRAWKTRIDLILAKNKVLDIVKGKIMEPQFE
Ga0316213_10147961 | 3300033546 | Roots | MKSENKLEGASNFRAWKTRIDLILAKNKVLDIVKGKIVKPEFKE
Ga0316213_10178361 | 3300033546 | Roots | MTYSIKSKNKLEGDSNFRSWKTRIDLILSKNKVLDIMKGKIVEPRFE
Ga0316212_10554041 | 3300033547 | Roots | MTYSMKSKNKLEGDSNFRAWKTRIDLILAKNKVLDIVKGKIMEPRFEEGK
Ga0316216_10207431 | 3300033548 | Roots | MKLENKLDGASNFRVWKTRIDLILAKNEVLDIVKVKTMKPEFE
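For downstream tools, the family sequences above are often easier to handle as FASTA. The following is a minimal sketch that converts a few of the listed members; the `FamilyMember` type, the `to_fasta` helper, and the header layout are illustrative assumptions rather than an NMPFamsDB export format. The protein IDs are obtained by assuming each row ends its identifier with the 10-digit sample taxon ID listed under Associated Samples.

```python
from typing import Iterable, NamedTuple


class FamilyMember(NamedTuple):
    """One row of the Family Sequences table (hypothetical container)."""
    protein_id: str
    taxon_oid: str
    habitat: str
    sequence: str


def to_fasta(members: Iterable[FamilyMember], width: int = 60) -> str:
    """Render members as FASTA text, wrapping sequences at `width` residues."""
    lines = []
    for m in members:
        # The header layout is an assumption chosen for readability.
        lines.append(f">{m.protein_id} taxon_oid={m.taxon_oid} habitat={m.habitat}")
        for start in range(0, len(m.sequence), width):
            lines.append(m.sequence[start:start + width])
    return "\n".join(lines) + "\n"


# Three rows copied from the table above, including the representative sequence.
MEMBERS = [
    FamilyMember("Ga0213495_100394131", "3300020588", "Leaf",
                 "MKAENKLGATNFRAWKTRIDLLLAKEDLLRIVKGMVIEPKKMKRN"),
    FamilyMember("Ga0310686_1084190761", "3300031708", "Soil",
                 "MTYSMKSENKLEGASNFRAWKTRIDLILAKNKVLDIVKGKIVEPEFEAGK"),
    FamilyMember("Ga0316042_1268452", "3300031816", "Soil",
                 "MKSENKLDGASNFRAWKTRIDLILAKNKVLDIVKG"),
]

print(to_fasta(MEMBERS), end="")
```

Keeping the taxon ID and habitat in the header lets each record be traced back to its sample without a separate lookup table.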




© Pavlopoulos Lab, Bioinformatics & Integrative Biology | B.S.R.C. "Alexander Fleming" | Privacy Notice