NMPFamsDB

A database of Novel Metagenome Protein Families

Metagenome / Metatranscriptome Family F102831


Overview

Basic Information
Family ID F102831
Family Type Metagenome / Metatranscriptome
Number of Sequences 101
Average Sequence Length 51 residues
Representative Sequence CDGWWRSPHTIYMAQGQLYDRQLAEHEMLHDLLQRGDHPPVFQACGV
Number of Associated Samples 80
Number of Associated Scaffolds 101

Quality Assessment
Transcriptomic Evidence Yes
Most common taxonomic group Bacteria
% of genes with valid RBS motifs 2.97 %
% of genes near scaffold ends (potentially truncated) 91.09 %
% of genes from short scaffolds (< 2000 bps) 88.12 %
Associated GOLD sequencing projects 72
AlphaFold2 3D model prediction No
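
The quality percentages above can be converted back into approximate gene counts. The following is a minimal sketch, assuming each percentage is taken over the 101 family sequences reported under Basic Information; the names `FAMILY_SIZE`, `reported_percentages`, and `percent_to_count` are illustrative and not part of NMPFamsDB.

```python
# Assumption: every percentage is a share of the 101 family sequences.
FAMILY_SIZE = 101

# Values copied from the Quality Assessment block.
reported_percentages = {
    "valid RBS motif": 2.97,
    "near scaffold end": 91.09,
    "short scaffold (< 2000 bp)": 88.12,
}


def percent_to_count(percent: float, total: int) -> int:
    """Return the nearest whole count for a percentage of `total`."""
    return round(percent * total / 100)


gene_counts = {
    label: percent_to_count(percent, FAMILY_SIZE)
    for label, percent in reported_percentages.items()
}

for label, count in gene_counts.items():
    print(f"{label}: about {count} of {FAMILY_SIZE} genes")
```

Under this assumption, most members sit near scaffold ends or on short scaffolds, which fits with many of the listed sequences being fragments of differing length.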

Hidden Markov Model
Powered by Skylign

Most Common Taxonomy
Group Bacteria (100.000 % of family members)
NCBI Taxonomy ID 2
Taxonomy All Organisms → cellular organisms → Bacteria

Most Common Ecosystem
GOLD Ecosystem: Environmental → Terrestrial → Soil → Unclassified → Unclassified → Vadose Zone Soil (31.683 % of family members)
Environment Ontology (ENVO): Unclassified (46.535 % of family members)
Earth Microbiome Project Ontology (EMPO): Free-living → Non-saline → Soil (non-saline) (56.436 % of family members)




Multiple Sequence Alignments

Full Alignment
Alignment of all the sequences in the family.



Structure & Topology

Predicted Secondary Structure and Topology

Classification: Globular
Signal Peptide: No
Secondary Structure distribution: α-helix: 51.06%, β-sheet: 0.00%, Coil/Unstructured: 48.94%
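
The secondary structure percentages can be read as residue counts. This is a minimal sketch, assuming the distribution was computed over the 47-residue representative sequence listed under Basic Information (the page does not state this explicitly); the variable names are illustrative.

```python
# Representative sequence copied from the Basic Information block.
REPRESENTATIVE = "CDGWWRSPHTIYMAQGQLYDRQLAEHEMLHDLLQRGDHPPVFQACGV"

# Percentages copied from the secondary structure distribution.
distribution_percent = {
    "alpha_helix": 51.06,
    "beta_sheet": 0.00,
    "coil_unstructured": 48.94,
}

# Assumption: the percentages are shares of the representative sequence length.
length = len(REPRESENTATIVE)
residue_counts = {
    state: round(percent * length / 100)
    for state, percent in distribution_percent.items()
}

for state, count in residue_counts.items():
    print(f"{state}: {count} of {length} residues")
```

The rounded counts add up to the full sequence length, which is consistent with the assumption that the distribution describes the representative sequence.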
Feature Viewer
Sequence: CDGWWRSPHTIYMAQGQLYDRQLAEHEMLHDLLQRGDHPPVFQACGV
Tracks: Sequence, α-helices, β-strands, Coil, SS Conf. score
Powered by Feature Viewer



Gene Neighborhood

Neighboring Pfam domains


Neighboring Clusters of Orthologous Genes (COGs)




Phylogeny

NCBI Taxonomy

Chart labels: All Organisms, Unclassified (100.0%)
Powered by ApexCharts

Associated Scaffolds






Environmental Properties

Associated Habitat Types

Habitat chart labels, in order: Vadose Zone Soil; Grasslands Soil; Agricultural Soil; Soil; Grasslands Soil; Hardwood Forest Soil; Soil; Forest Soil; Corn, Switchgrass And Miscanthus Rhizosphere; Agricultural Soil; Soil; Populus Rhizosphere
Slice percentages shown: 31.7%, 13.9%, 6.9%, 19.8%, 12.9%, 5.9%
Powered by ApexCharts



Associated Samples


Geographical Distribution




Family Sequences

Protein ID | Sample Taxon ID | Habitat | Sequence
C687J29039_101562132 | 3300002243 | Soil | AWYEVPGSRYACPGYSRGCAGWWRPPHTIYMAQASVKERRPVQHEMLHDLLQRSDHPPVFRTCGV*
Ga0066677_104338261 | 3300005171 | Soil | CPAYEGRCDGWWQPRHTIYLAHRWRNDRQLVEHEMLHDLLQRGDHPPVFQACGV*
Ga0066683_106024481 | 3300005172 | Soil | QCDGWWRAPHTIYMAQGRLYDRQVAEHEMLHDLLQRGDHPPVFRACGV*
Ga0066680_108239601 | 3300005174 | Soil | PGSTYACPAHEGRCDGWWRSPHTIYMAQSRLYDRQLVEHEMLHDLLQRGDHPPVFQTCGV
Ga0066678_107554561 | 3300005181 | Soil | EGRCDGWWRSPHTIYMAQGLLYNRRLAEHEMLHDLLQRGDHPPVFQACGV*
Ga0066676_105510281 | 3300005186 | Soil | QCDGWWRSPHTIYMAQGRLYDRRVAEHEMLHDLLQRGDHPPAFQACGV*
Ga0066676_107157942 | 3300005186 | Soil | EGQCDGWWRSPHTIYMAQGLLYNRRLAEHEMLHDLLQRGDHPPVFQACGV*
Ga0066675_108143281 | 3300005187 | Soil | DYPCPAYEGRCDGWWQPPHTIYLAYRWRNDRQLVEHEMLHDLLQRGDHPPVFHACGV*
Ga0070714_1009673181 | 3300005435 | Agricultural Soil | GWWQPPHTIYMAQDQTSNRVLAEHEMLHDLLQRGDHPPVFAACNVSTQREW*
Ga0070711_1007460112 | 3300005439 | Corn, Switchgrass And Miscanthus Rhizosphere | GSSYSCPAYEGRCDGWWRSPHTIYMAQGELYDRQVAEHEMLHDLLQRGDHLPVFQACGV*
Ga0070708_1015163351 | 3300005445 | Corn, Switchgrass And Miscanthus Rhizosphere | GSSYSCPAYEGRCDGWWRSPHTIYMAQGELYDRQVAEHEMLHDLLQRGDHPPVFQACGV*
Ga0066689_107340861 | 3300005447 | Soil | PHTIYMAQGLVYNRRLAEHEMLHDLLQRGGHPPVFQACGV*
Ga0066681_103247801 | 3300005451 | Soil | WYEVPGVDYPCPAYEGRCDGWWQPPHTIYLAYRWRNDRQLVEHEMLHDLLQRGDHPPVFHACGV*
Ga0070707_1002923391 | 3300005468 | Corn, Switchgrass And Miscanthus Rhizosphere | PPHTIYLAHRWRNDRQLVEHEMLHDLLQRGDHPPVFQACGVL*
Ga0070695_1001630093 | 3300005545 | Corn, Switchgrass And Miscanthus Rhizosphere | YEVPGSSYSCPAYEGRCDGWWRSPHTIYMAQGELYDRHVAEHEMLHDLLQRGDHPPVFQACGV*
Ga0066704_106836181 | 3300005557 | Soil | AYEGRCDGWWRSPHTIYMAQGLLYNRRLAEHEMLHDLLQRGDHPPVFQACGV*
Ga0066670_102880222 | 3300005560 | Soil | AYEGRCDGWWHPPHTIYLADRWRADRQLVEHEMLHDLLQRGDHPPVFQACGV*
Ga0066703_105710622 | 3300005568 | Soil | DYPCPAYEGRCDGWWHPPHTIFLAYRWRDDRQLVEHEMLHDLLQRGDHPPVFQACGV*
Ga0079221_100706324 | 3300006804 | Agricultural Soil | GRCEGWWQPPNTIYMASDEIGNRQLAEHEMLHDLLQTGAHPPAFAQCGVLTQKAW*
Ga0079221_105251062 | 3300006804 | Agricultural Soil | MRLRLGWLCWWRSPHTIYMAQGLLYNRRLAEHEMLHDLLQRGDHPPVFQACGV*
Ga0079220_101860791 | 3300006806 | Agricultural Soil | IYLAHRWRNDSRLVEHEMLHDLLQRGDHPPVFQACGVL*
Ga0075426_101495731 | 3300006903 | Populus Rhizosphere | SPHTIYMAQGLLYNRRLAEHEMLHDLLQRGDHPPVFQACGV*
Ga0075426_103985342 | 3300006903 | Populus Rhizosphere | GWWRSPHTIYMAQGLLYNRRLAEHEMLHDLLQRGDHPPVFQACGV*
Ga0099791_101065192 | 3300007255 | Vadose Zone Soil | ELYEVPGSSYWCPAYEGRCDGWWRSPHTIYMSQDQLYDRQLAEHEMLHDLLQRGDHPPVFQACGV*
Ga0099793_103911662 | 3300007258 | Vadose Zone Soil | YEVPGSSYWCPAYEGRCDGWWRSPHTIYMSQDQLYDRQLAEHEMLHDLLQRGDHPPVFQACGV*
Ga0066710_1028523952 | 3300009012 | Grasslands Soil | PHTIYMAQSRLYDRQLAEHEMLHDLLQRGDHPLVFQACGV
Ga0066710_1036683112 | 3300009012 | Grasslands Soil | YPCPAYEGRCEGWWQPPHTIYMAQDQTGNRQLAEHEMLHDLLQRGDHPLVFVACGVATQSAW
Ga0066710_1040691801 | 3300009012 | Grasslands Soil | MAQGLLYNRRLAEHEMLHDLLQRGDHPPVFQACGV
Ga0099829_109685011 | 3300009038 | Vadose Zone Soil | YEGRCDGWWRSPHTIYMAQGLLHNRRLAEHEMLHDLLQRGDHPPVFQACGV*
Ga0066709_1004285023 | 3300009137 | Grasslands Soil | HTIYLAYRWRNDRQLVEHEMLHDLLQRGDHPPVFQACGV*
Ga0066709_1004677121 | 3300009137 | Grasslands Soil | GSSYPCPAYEGRCEGWWQPPHTIYMAQDQTGNRQLAEHEMLHDLLQRGDHPLVFVACGVATQSAW*
Ga0066709_1038907022 | 3300009137 | Grasslands Soil | WWRSPHTIYMAQTRLYNRQLAEHEMLHDLLQRGDHPPVFQACGV*
Ga0134086_101698261 | 3300010323 | Grasslands Soil | PHTIYMAQGLLYNRRLAEHEMLHDLLQRGDHPPVFQACGV*
Ga0134111_103153902 | 3300010329 | Grasslands Soil | PAYEGRCDGWWQPAHTIYLAYRWRNDRQLVEHEMLHDLLQRGDHPPVFQACGV*
Ga0134071_102118082 | 3300010336 | Grasslands Soil | EWYEVPGSSYSCPAHEGRCDGWWRSPHTVYMSQDQLYDRQLAEHEMLHDLLQRGDHPPVFQACGV*
Ga0134062_100078196 | 3300010337 | Grasslands Soil | WWRAPHTIYMAQGRLYDRQVAEHEMLHDLLQRGDHPPVFRACGV*
Ga0137391_110026141 | 3300011270 | Vadose Zone Soil | AYEGQCDGWWRSPHTIYMAQGLLYNRRLAEHEMLHDLLQRGDHPPVFQACGV*
Ga0137388_102642131 | 3300012189 | Vadose Zone Soil | EWYEVPGSSYSCPAYEGRCDGWWRSPHTIYMAQGQLYDRQVAEHEMLHDLLQRGDHPPVFQACGV*
Ga0137364_112630521 | 3300012198 | Vadose Zone Soil | VPGSSYSCPAYEGRCDGWWRSPHTIYMAQTRLNDRQLAEHEMLHDLLQQGDHPPVFQACGV*
Ga0137382_105655281 | 3300012200 | Vadose Zone Soil | EVPGSSYSCPAYEGRCDGWWRAPHTIYMAQNRLYDRQLAEHEMLHDLLQRGDHPPVFQACGV*
Ga0137382_111380811 | 3300012200 | Vadose Zone Soil | AYEGRCDGWWRSPHTIYMAQTRLNDRQLAEHEMLHDLLQRGDHPPVFQACGV*
Ga0137382_111660692 | 3300012200 | Vadose Zone Soil | CDGWWRSPHTIYMAQGQLYDRQLAEHEMLHDLLQRGDHPPVFQACGV*
Ga0137365_101394912 | 3300012201 | Vadose Zone Soil | VPGVDYPCPAYEGRCDGWWQPPHTIYLAYRWRNDRQLVEHEMLHDLLQRGDHPPVFQACGV*
Ga0137374_107285872 | 3300012204 | Vadose Zone Soil | YTNGCAGWWQPPHTIYMAQTRLNDRRLVEHEMLHDLLQRSDHPPGFKTCGV*
Ga0137362_113761291 | 3300012205 | Vadose Zone Soil | CPAYEGRCDGWWRSPHTVYMSQDQLYDRQLAEHEMLHDLLQRGDHPPVFQACGV*
Ga0137381_107639952 | 3300012207 | Vadose Zone Soil | CPAYEGQCDGWWRSPHTIYMAQGLLYNRRLAEHEMLHDLLQRGDHLPVFQACGV*
Ga0137376_116187901 | 3300012208 | Vadose Zone Soil | SYSCPAYEGRCDGWWRSPHTIYMAQTRLNDRQLAEHEMLHDLLQQGDHPPVFQACGV*
Ga0137377_102992211 | 3300012211 | Vadose Zone Soil | GECAGWWQPPHTIYLAETRVHDRLLVEHEMLHDLLQRGDHPPVFQTCGVAVQSAR*
Ga0137370_107420002 | 3300012285 | Vadose Zone Soil | LCDGWWRSPHTIYMAHSRLYDRQLAEHEMLHDLLQRGDHPPVFQACGV*
Ga0137387_100605644 | 3300012349 | Vadose Zone Soil | DGWWRAPHTIYMAQSRLYDRQLAEHEMLHDLLQRGDHPPLFQACGV*
Ga0137387_110939031 | 3300012349 | Vadose Zone Soil | RSPHTIYMAQGRLYDRQLAEHEMLHDLLQRGDHPPVFQACGV*
Ga0137387_113283181 | 3300012349 | Vadose Zone Soil | CDGWWRSPHTIYMAQGLLYNRRLAEHEMLHDLLQRGDHPPVFQACGV*
Ga0137372_101137884 | 3300012350 | Vadose Zone Soil | VDYPCPAYEGRCDGWWQPPHTIYLAYRWRNDRQLVEHEMLHDLLQRGDHPPVFQACGV*
Ga0137386_100272103 | 3300012351 | Vadose Zone Soil | VEWYEVPGVDYPCPAYEGRCDGWWQPPHTIYLAYRWRSDRQLVEHEMVHDLLQRGDHPPEFQACGV*
Ga0137386_105278032 | 3300012351 | Vadose Zone Soil | PAYEGRCEGWWQPPHTIYMAQDQTGNRQLAEHEMLHDLLQRGDHPLVFVACGVATQSAW*
Ga0137386_112346702 | 3300012351 | Vadose Zone Soil | PHTIYMAQGLLYNRRLAEHEMLHDLLQRGDHLPVFQACGV*
Ga0137371_109310092 | 3300012356 | Vadose Zone Soil | GWWQPPHTIYLAYRWRNDRQLVEHEMLHDLLQRGDHPPVFQACGV*
Ga0134036_12646931 | 3300012384 | Grasslands Soil | HTIYLAQGRLYDRQVAEHEMLHDLLQRGDHPPVFRACGV*
Ga0134060_10998461 | 3300012410 | Grasslands Soil | VEWYEVPGSSYSCPAHEGRCDGWWRSPHTVYMSQDQLYDRQLAEHEMLHDLLQRGDHPPVFQACGV*
Ga0137358_110172741 | 3300012582 | Vadose Zone Soil | SPHTIYMARGQLYDRQVAEHEMLHDLLQRGDHPPVFQACGV*
Ga0137416_104179422 | 3300012927 | Vadose Zone Soil | YEVPGSSYSCPAYEGRCDGWWRSPHTIYMAQGQLYDRHLAEHEMLHDLLQRGDHPPVFQACGV*
Ga0137407_103133051 | 3300012930 | Vadose Zone Soil | PHTIYMAQGQLYDRQVAEHEMLHDLLQRGDHPPVFQACGV*
Ga0134081_100577072 | 3300014150 | Grasslands Soil | CDGWWRAPHTIYMAQSRLYDRQVAEHEMLHDLLQRGDHPPVFRACGV*
Ga0134079_102153382 | 3300014166 | Grasslands Soil | EWYEVPGVDYPCPAYEGRCDGWWQPPHTIYLAYRWRNDRQLVEHEMLHDLLQRGDHPPVFHACGV*
Ga0134079_104090551 | 3300014166 | Grasslands Soil | WWHPPHTIFLAYRWRDDRQLVEHEMLHDLLQRGDHPPVFQACGV*
Ga0137420_10174531 | 3300015054 | Vadose Zone Soil | RADGCDGWWRSPHTIYMSQDQLYDRQLAEHEMLHDLLQRGDHPPVFQACGV*
Ga0137420_10812181 | 3300015054 | Vadose Zone Soil | MAQGQRYDRQVAEHEMLHDLLQRGDHPPVFQACGV*
Ga0134089_100184163 | 3300015358 | Grasslands Soil | PHTIYLAETRANDRLLVEHEMLHDLLQRGDHPPAFQACGVAVQSAR*
Ga0134085_100677321 | 3300015359 | Grasslands Soil | HEGRCDGWWRSPHTIYMAQGRLYDRRLAEHEMLHDLLQRGDHPPVFQACGV*
Ga0134112_103825262 | 3300017656 | Grasslands Soil | SCPAYDGECAGWWQPPHTIYLAETRVNDRLLVEHEMLHDLLQRGDHPPVFQTCGVAVQSA
Ga0134112_103937891 | 3300017656 | Grasslands Soil | HTIYMAQGLLYNRRLAEHEMLHDLLQRGDHPPVFQACGV
Ga0134083_104976381 | 3300017659 | Grasslands Soil | APHLIYMAQQRLDDRRLAEHEMLHDLLGRGDHPSVFQACGL
Ga0066655_107620493 | 3300018431 | Grasslands Soil | WCRAPHLIYMAEQRLDDRRLAEHEMLHDLLGRGDHPPVFQACGV
Ga0066667_100112305 | 3300018433 | Grasslands Soil | MAQGRLYDRPLAEHEMLHDLLQRGDHPPVFQACGV
Ga0066667_104282782 | 3300018433 | Grasslands Soil | WWRSPHTIYMAQTRLYNRQLAEHEMLHDLLQRGDHPPVFQACGV
Ga0066667_116425171 | 3300018433 | Grasslands Soil | CDGWWHPPHTIYLADRWRADRQLVEHEMLHDLLQRGDHPPVFRACGV
Ga0066667_119309481 | 3300018433 | Grasslands Soil | LAYRWRNDRQLVEHEMLHDLLQRGDHPPVFHACGV
Ga0066662_101359433 | 3300018468 | Grasslands Soil | YSCPAYDGECAGWWQPPHTIYLAETRVNDRLLVEHEMLHDLVQRGDHPPVFQACGVAVQSAR
Ga0207653_100031953 | 3300025885 | Corn, Switchgrass And Miscanthus Rhizosphere | MAQGELYDRQVAEHEMLHDLLQRGDHPPVFQACGV
Ga0207646_111281711 | 3300025922 | Corn, Switchgrass And Miscanthus Rhizosphere | PAHEGRCDGWWRPPHTIYLAHRWRNDRQLVEHEMLHDLLQRGDHPPVFQACGVL
Ga0209761_10902643 | 3300026313 | Grasslands Soil | YEGQCDGWWRSPHTIYMAQGLLYNRRLAEHEMLHDLLQGGDHPPVFQACGV
Ga0209686_10473691 | 3300026315 | Soil | GWWQPPHTIYLAYRWRNDRQLVEHEMLHDLLQRGDHPPVFHACGV
Ga0209471_11949521 | 3300026318 | Soil | EGRCDGWWHPPHTIFLAYRWRDDRQLVEHEMLHDLLQRGDHPPVFQACGV
Ga0209470_11670392 | 3300026324 | Soil | GSSYSCPAYEGQCDGWWRAPHTIYMAQSRLYDRQLAEHEMLHDLLQRGDHPPVFRACGV
Ga0209266_11464032 | 3300026327 | Soil | GQCDGWWRAPHTIYMAQGRLYDRQVAEHEMLHDLLQRGDHPPVFRACGV
Ga0209266_11960982 | 3300026327 | Soil | HLIYMAEQRLDDRRLAEHEMLHDLLGRGDHPPVFQACGV
Ga0209808_10148201 | 3300026523 | Soil | RSPHTIYMAQGLVYNRRLAEHEMLHDLLQRGGHPPVFQACGV
Ga0209378_10373632 | 3300026528 | Soil | MAQSRLYDRQLAEHEMLHDLLQRGDHPLVFQACGV
Ga0209378_10676241 | 3300026528 | Soil | YPCPAYEGRCEGWWQPPHTIYMAQDQTGNRQLAEHEMLHDLLQRGDHPPVFVACGVATQSAW
Ga0209156_100091649 | 3300026547 | Soil | AYEGRCDGWWHPPHTIYLADRWRADRQLVEHEMLHDLLQRGDHPPVFQACGV
Ga0179593_10839503 | 3300026555 | Vadose Zone Soil | MARGQLYDRQVAEHEMLHDLLQRGDHPPAFQACGV
Ga0209076_11943342 | 3300027643 | Vadose Zone Soil | GSSYTCPAHEGLCDGWWRSPHTIYMAQSRLDDRQLAEHEMLHDLLQRGDHPPVFKACGV
Ga0209178_11467491 | 3300027725 | Agricultural Soil | PAYDGRCEGWWQPPNTIYMASDEIGNRQLAEHEMLHDLLQTGAHPPAFAQCGVLTQKAW
Ga0208989_102755941 | 3300027738 | Forest Soil | HTIYMAQSRLHDRQLAEHEMLHDLLQRGDHPPVFKACGV
Ga0209689_12920601 | 3300027748 | Soil | VDYPCPAYEGRCDGWWQPPHTIYLAYRWRNDRQLVEHEMLHDLLQRGDHPPVFHACGV
Ga0209073_102025852 | 3300027765 | Agricultural Soil | PHAIYLAHRWRNDSRLVEHEMLHDLLQRGDHPPVFQACGVL
Ga0209177_101585101 | 3300027775 | Agricultural Soil | PGPDYPCPAYEGRCDGWWQPPHVIYLASQWRYDRPLVEHEMLHDLLQRGDHPPVFQACGV
Ga0209074_102179792 | 3300027787 | Agricultural Soil | YACPAYDGRCEGWWQPPNTIYMASDEIGNRQLAEHEMLHDLLQTGAHPPAFAQCGVLTQKAW
Ga0209283_100918091 | 3300027875 | Vadose Zone Soil | WWRSPHTIFMSQDQLYDRQLAEHEMLHDLLQRGDHPPAFQACGV
Ga0307468_1013804142 | 3300031740 | Hardwood Forest Soil | MAQDQTSNRVLAEHEMLHDLLQRGDHPPVFAACNVSTQREW
Ga0307471_1000560345 | 3300032180 | Hardwood Forest Soil | SYPCPAYEGRCEGWWQPPHTIYMAQDQTGNRVLAEHEMLHDLLQTGNHPPVFAACGVLSQSEW
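
The rows above could be exported to FASTA for use with alignment or search tools. The following is a minimal sketch using three rows from the table. It assumes that the 10-digit value beginning with 3300 is the sample taxon ID and that everything before it is the protein ID, and it treats a trailing `*` as a stop-codon marker to be stripped. `FamilyMember` and `to_fasta` are hypothetical names, not an NMPFamsDB export format.

```python
from typing import Iterable, NamedTuple


class FamilyMember(NamedTuple):
    protein_id: str
    taxon_id: str
    habitat: str
    sequence: str


# Three rows copied from the Family Sequences table.
# Assumption: the ID / taxon split follows the 10-digit "3300..." pattern.
MEMBERS = [
    FamilyMember("Ga0066683_106024481", "3300005172", "Soil",
                 "QCDGWWRAPHTIYMAQGRLYDRQVAEHEMLHDLLQRGDHPPVFRACGV*"),
    FamilyMember("Ga0066710_1040691801", "3300009012", "Grasslands Soil",
                 "MAQGLLYNRRLAEHEMLHDLLQRGDHPPVFQACGV"),
    FamilyMember("Ga0307468_1013804142", "3300031740", "Hardwood Forest Soil",
                 "MAQDQTSNRVLAEHEMLHDLLQRGDHPPVFAACNVSTQREW"),
]


def to_fasta(members: Iterable[FamilyMember]) -> str:
    """Render members as FASTA text, one header line and one sequence line each."""
    lines = []
    for member in members:
        lines.append(
            f">{member.protein_id} taxon={member.taxon_id} habitat={member.habitat}"
        )
        # Assumption: a trailing '*' marks a stop codon and is not a residue.
        lines.append(member.sequence.rstrip("*"))
    return "\n".join(lines) + "\n"


fasta_text = to_fasta(MEMBERS)
print(fasta_text, end="")
```

Keeping the taxon ID and habitat in the header line is a design choice for traceability; tools that only read the first token of the header will still see the protein ID.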




© Pavlopoulos Lab, Bioinformatics & Integrative Biology | B.S.R.C. "Alexander Fleming" | Privacy Notice