| Basic Information | |
|---|---|
| Family ID | F074774 |
| Family Type | Metagenome / Metatranscriptome |
| Number of Sequences | 119 |
| Average Sequence Length | 48 residues |
| Representative Sequence | MRSKGLHLSEAFMRKRYVMDKKSPEDIAKECGVSVQLIYRQLKKFGLKK |
| Number of Associated Samples | 86 |
| Number of Associated Scaffolds | 119 |
| Quality Assessment | |
|---|---|
| Transcriptomic Evidence | Yes |
| Most common taxonomic group | Unclassified |
| % of genes with valid RBS motifs | 81.36 % |
| % of genes near scaffold ends (potentially truncated) | 19.33 % |
| % of genes from short scaffolds (< 2000 bps) | 57.14 % |
| Associated GOLD sequencing projects | 73 |
| AlphaFold2 3D model prediction | Yes |
| 3D model pTM-score | 0.69 |
| Most Common Taxonomy | |
|---|---|
| Group | Unclassified (82.353 % of family members) |
| NCBI Taxonomy ID | N/A |
| Taxonomy | N/A |
| Most Common Ecosystem | |
|---|---|
| GOLD Ecosystem | Environmental → Aquatic → Marine → Coastal → Unclassified → Aqueous (26.050 % of family members) |
| Environment Ontology (ENVO) | Unclassified (34.454 % of family members) |
| Earth Microbiome Project Ontology (EMPO) | Free-living → Saline → Water (saline) (46.218 % of family members) |
| Predicted Topology & Secondary Structure | |
|---|---|
| Classification | Globular |
| Signal Peptide | No |
| Secondary Structure distribution | α-helix: 41.56%, β-sheet: 0.00%, Coil/Unstructured: 58.44% |
| Pfam ID | Name | % Frequency in 119 Family Scaffolds |
|---|---|---|
| PF11753 | DUF3310 | 39.50 |
| PF12705 | PDDEXK_1 | 9.24 |
| PF00154 | RecA | 3.36 |
| PF04820 | Trp_halogenase | 2.52 |
| PF03819 | MazG | 2.52 |
| PF06737 | Transglycosylas | 2.52 |
| PF01464 | SLT | 2.52 |
| PF02467 | Whib | 1.68 |
| PF01930 | Cas_Cas4 | 1.68 |
| PF14579 | HHH_6 | 1.68 |
| PF02945 | Endonuclease_7 | 1.68 |
| PF00478 | IMPDH | 0.84 |
| PF00383 | dCMP_cyt_deam_1 | 0.84 |
| PF13704 | Glyco_tranf_2_4 | 0.84 |
| PF13640 | 2OG-FeII_Oxy_3 | 0.84 |
| PF09723 | Zn-ribbon_8 | 0.84 |
| PF09738 | LRRFIP | 0.84 |
| PF14890 | Intein_splicing | 0.84 |
| PF05065 | Phage_capsid | 0.84 |
| PF11397 | GlcNAc | 0.84 |
| COG ID | Name | Functional Category | % Frequency in 119 Family Scaffolds |
|---|---|---|---|
| COG0468 | RecA/RadA recombinase | Replication, recombination and repair [L] | 3.36 |
| COG1468 | CRISPR/Cas system-associated exonuclease Cas4, RecB family | Defense mechanisms [V] | 1.68 |
| COG4653 | Predicted phage phi-C31 gp36 major capsid-like protein | Mobilome: prophages, transposons [X] | 0.84 |
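The percent frequencies in the Pfam and COG tables are expressed relative to the 119 family scaffolds, so they can be converted back to approximate scaffold counts. The sketch below shows that arithmetic for a few Pfam rows; the function name `pct_to_count` and the rounding to the nearest integer are assumptions made for illustration, not part of the database.

```python
# Total scaffolds in the family, taken from the table headers above.
FAMILY_SCAFFOLDS = 119


def pct_to_count(pct: float, total: int = FAMILY_SCAFFOLDS) -> int:
    """Convert a percent frequency back to an approximate scaffold count.

    Assumes the published percentage is count / total * 100, rounded to
    two decimals, so rounding to the nearest integer recovers the count.
    """
    return round(pct / 100 * total)


# A few rows copied from the Pfam table above (ID -> % frequency).
pfam_pct = {
    "PF11753": 39.50,  # DUF3310
    "PF12705": 9.24,   # PDDEXK_1
    "PF00154": 3.36,   # RecA
    "PF05065": 0.84,   # Phage_capsid
}

pfam_counts = {pfam_id: pct_to_count(pct) for pfam_id, pct in pfam_pct.items()}
print(pfam_counts)
```

Under this assumption, the smallest listed frequency (0.84) corresponds to a single scaffold.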
| Name | Rank | Taxonomy | Distribution |
|---|---|---|---|
| Unclassified | root | N/A | 82.35 % |
| All Organisms | root | All Organisms | 17.65 % |
Note: Some of these datasets are restricted, as per the data usage policy of the Joint Genome Institute (JGI). Utilizing any of their features below requires obtaining a license from the datasets' corresponding author(s).
| Habitat | Taxonomy | Distribution |
|---|---|---|
| Aqueous | Environmental → Aquatic → Marine → Coastal → Unclassified → Aqueous | 26.05% |
| Freshwater | Environmental → Aquatic → Freshwater → Ice → Glacial Lake → Freshwater | 10.08% |
| Freshwater | Environmental → Aquatic → Freshwater → Lake → Unclassified → Freshwater | 6.72% |
| Salt Marsh | Environmental → Aquatic → Marine → Intertidal Zone → Salt Marsh → Salt Marsh | 5.04% |
| Freshwater To Marine Saline Gradient | Environmental → Aquatic → Marine → Coastal → Unclassified → Freshwater To Marine Saline Gradient | 4.20% |
| Saline Lake | Environmental → Aquatic → Non-Marine Saline And Alkaline → Saline → Unclassified → Saline Lake | 4.20% |
| Freshwater Sediment | Environmental → Aquatic → Freshwater → Sediment → Unclassified → Freshwater Sediment | 2.52% |
| Freshwater Lake | Environmental → Aquatic → Freshwater → Lake → Unclassified → Freshwater Lake | 2.52% |
| Benthic Lake | Environmental → Aquatic → Freshwater → Wetlands → Unclassified → Benthic Lake | 2.52% |
| Seawater | Environmental → Aquatic → Marine → Coastal → Unclassified → Seawater | 2.52% |
| Fracking Water | Environmental → Terrestrial → Deep Subsurface → Unclassified → Unclassified → Fracking Water | 2.52% |
| Freshwater Lake | Environmental → Aquatic → Freshwater → Lentic → Unclassified → Freshwater Lake | 1.68% |
| Freshwater Lentic | Environmental → Aquatic → Freshwater → Lentic → Unclassified → Freshwater Lentic | 1.68% |
| Freshwater And Sediment | Environmental → Aquatic → Freshwater → Lentic → Unclassified → Freshwater And Sediment | 1.68% |
| Marine Plankton | Environmental → Aquatic → Freshwater → Lotic → Unclassified → Marine Plankton | 1.68% |
| Surface Water | Environmental → Aquatic → Freshwater → Groundwater → Unclassified → Surface Water | 1.68% |
| Marine | Environmental → Aquatic → Marine → Oceanic → Unclassified → Marine | 1.68% |
| Marine Sediment | Environmental → Aquatic → Marine → Oceanic → Sediment → Marine Sediment | 1.68% |
| Sediment | Environmental → Aquatic → Marine → Coastal → Sediment → Sediment | 1.68% |
| Estuarine | Environmental → Aquatic → Marine → Unclassified → Unclassified → Estuarine | 1.68% |
| Hydrocarbon Resource Environments | Engineered → Wastewater → Industrial Wastewater → Petrochemical → Unclassified → Hydrocarbon Resource Environments | 1.68% |
| Freshwater | Environmental → Aquatic → Freshwater → Lentic → Epilimnion → Freshwater | 0.84% |
| Lake | Environmental → Aquatic → Freshwater → Lentic → Epilimnion → Lake | 0.84% |
| Lake | Environmental → Aquatic → Freshwater → Lake → Unclassified → Lake | 0.84% |
| Freshwater Lake Sediment | Environmental → Aquatic → Freshwater → Lake → Sediment → Freshwater Lake Sediment | 0.84% |
| Freshwater | Environmental → Aquatic → Freshwater → River → Unclassified → Freshwater | 0.84% |
| Pond Fresh Water | Environmental → Aquatic → Freshwater → Pond → Unclassified → Pond Fresh Water | 0.84% |
| Freshwater | Environmental → Aquatic → Freshwater → Creek → Unclassified → Freshwater | 0.84% |
| Seawater | Environmental → Aquatic → Marine → Inlet → Unclassified → Seawater | 0.84% |
| Sackhole Brine | Environmental → Aquatic → Marine → Coastal → Unclassified → Sackhole Brine | 0.84% |
| Estuarine Water | Environmental → Aquatic → Marine → Unclassified → Unclassified → Estuarine Water | 0.84% |
| Pelagic Marine | Environmental → Aquatic → Marine → Neritic Zone → Unclassified → Pelagic Marine | 0.84% |
| Marine | Environmental → Aquatic → Marine → Neritic Zone → Unclassified → Marine | 0.84% |
| Saline Water | Environmental → Aquatic → Non-Marine Saline And Alkaline → Saline → Unclassified → Saline Water | 0.84% |
| Saline Lake | Environmental → Aquatic → Non-Marine Saline And Alkaline → Saline → Unclassified → Saline Lake | 0.84% |
| Sediment (Intertidal) | Environmental → Aquatic → Sediment → Unclassified → Unclassified → Sediment (Intertidal) | 0.84% |
| Sediment | Engineered → Wastewater → Industrial Wastewater → Mine Water → Unclassified → Sediment | 0.84% |
| Hydrocarbon Resource Environments | Engineered → Biotransformation → Microbial Solubilization Of Coal → Unclassified → Unclassified → Hydrocarbon Resource Environments | 0.84% |
| Taxon OID | Sample Name | Habitat Type | IMG/M Link |
|---|---|---|---|
| 3300000116 | Marine microbial communities from Delaware Coast, sample from Delaware MO Spring March 2010 | Environmental | Open in IMG/M |
| 3300001336 | ML7 | Environmental | Open in IMG/M |
| 3300001605 | Tailings pond microbial communities from Northern Alberta - Syncrude Mildred Lake Settling Basin | Engineered | Open in IMG/M |
| 3300001838 | Marine plankton microbial communities from the Amazon River plume, Atlantic Ocean - RCM33, ROCA_DNA217_0.2um_bLM_C_2a | Environmental | Open in IMG/M |
| 3300001851 | Marine plankton microbial communities from the Amazon River plume, Atlantic Ocean - RCM31, ROCA_DNA206_0.2um_MCP-S_C_3b | Environmental | Open in IMG/M |
| 3300002212 | Freshwater microbial communities from San Paulo Zoo lake, Brazil - JAN 2013 | Environmental | Open in IMG/M |
| 3300002835 | Freshwater microbial communities from Lake Mendota, WI - (Lake Mendota Combined Ray assembly, ASSEMBLY_DATE=20140605) | Environmental | Open in IMG/M |
| 3300002856 | Wastewater microbial communities from Syncrude, Ft. McMurray, Alberta - Tailing Pond Surface TP_surface | Engineered | Open in IMG/M |
| 3300004097 | Pelagic marine sediment microbial communities from the LTER site Helgoland, North Sea, for post-phytoplankton bloom and carbon turnover studies - OSD3 (Helgoland) metaG | Environmental | Open in IMG/M |
| 3300005584 | Freshwater lentic microbial communities from great Laurentian Lakes, MI, USA - Great Lakes metaG HU45MSRF | Environmental | Open in IMG/M |
| 3300005805 | Microbial and algae communities from Cheney Reservoir in Wichita, Kansas, USA | Environmental | Open in IMG/M |
| 3300005827 | Microbial communities from Cathlamet Bay sediment, Columbia River estuary, Oregon - S.188_CBA | Environmental | Open in IMG/M |
| 3300005909 | Saline lake microbial communities from Ace Lake, Antarctica - Antarctic Ace Lake Metagenome 02UKF | Environmental | Open in IMG/M |
| 3300005912 | Saline lake microbial communities from Ace Lake, Antarctica - Antarctic Ace Lake Metagenome 02UKD | Environmental | Open in IMG/M |
| 3300005933 | Saline lake microbial communities from Ace Lake, Antarctica - Antarctic Ace Lake Metagenome 02UKE | Environmental | Open in IMG/M |
| 3300006025 | Aqueous microbial communities from the Delaware River and Bay under freshwater to marine salinity gradient to study organic matter cycling in a time-series - DEBay_Sum_22_D_<0.8_DNA | Environmental | Open in IMG/M |
| 3300006026 | Aqueous microbial communities from the Delaware River and Bay under freshwater to marine salinity gradient to study organic matter cycling in a time-series - DEBay_Sum_29_D_<0.8_DNA | Environmental | Open in IMG/M |
| 3300006734 | Marine viral communities from the Gulf of Mexico - 31_GoM_OMZ_CsCl metaG | Environmental | Open in IMG/M |
| 3300006802 | Aqueous microbial communities from the Delaware River and Bay under freshwater to marine salinity gradient to study organic matter cycling in a time-series - Viral MetaG DEL_Nov_18 | Environmental | Open in IMG/M |
| 3300006875 | Aqueous microbial communities from the Delaware River and Bay under freshwater to marine salinity gradient to study organic matter cycling in a time-series - DEBay_Sum_0.19_N_>0.8_DNA | Environmental | Open in IMG/M |
| 3300007516 | Freshwater microbial communities from Lake Fryxell liftoff mats and glacier meltwater in Antarctica - FRY-01 | Environmental | Open in IMG/M |
| 3300007519 | Freshwater microbial communities from Lake Bonney liftoff mats and glacier meltwater in Antarctica - BON-03 | Environmental | Open in IMG/M |
| 3300007522 | Freshwater microbial communities from Lake Bonney liftoff mats and glacier meltwater in Antarctica - BON-01 | Environmental | Open in IMG/M |
| 3300007523 | Freshwater microbial communities from Lake Fryxell liftoff mats and glacier meltwater in Antarctica - FRY-03 | Environmental | Open in IMG/M |
| 3300007538 | Freshwater to marine saline gradient viral communities from Chesapeake Bay - CB_1508_2 Viral MetaG | Environmental | Open in IMG/M |
| 3300007540 | Freshwater to marine saline gradient viral communities from Chesapeake Bay - CB_1504_2 Viral MetaG | Environmental | Open in IMG/M |
| 3300007541 | Freshwater to marine saline gradient viral communities from Chesapeake Bay - CB_1508_1S Viral MetaG | Environmental | Open in IMG/M |
| 3300007960 | Freshwater to marine saline gradient viral communities from Chesapeake Bay - CB_1508_1D Viral MetaG | Environmental | Open in IMG/M |
| 3300009081 | Freshwater sediment microbial communities from Prairie Pothole Lake near Jamestown, North Dakota, USA - PPLs Lake P7 Core (1) Depth 19-21cm May2015 | Environmental | Open in IMG/M |
| 3300009082 | Freshwater sediment microbial communities from Prairie Pothole Lake near Jamestown, North Dakota, USA - PPLs Lake P8 Core (1) Depth 1-3cm May2015 | Environmental | Open in IMG/M |
| 3300009159 | Freshwater microbial communities from Lake Simoncouche, Canada to study carbon cycling - S_140212_EF_MetaG | Environmental | Open in IMG/M |
| 3300009161 | Freshwater microbial communities from Lake Montjoie, Canada to study carbon cycling - M_130207_XF_MetaG | Environmental | Open in IMG/M |
| 3300010316 | Freshwater to marine salinity gradient microbial communities from Chesapeake Bay, USA - CPBay_Spr_15_0.8_DNA | Environmental | Open in IMG/M |
| 3300010354 | Freshwater to marine salinity gradient microbial communities from Chesapeake Bay, USA - CPBay_Sum_0.6_0.8_DNA | Environmental | Open in IMG/M |
| 3300010388 | Freshwater microbial communities from the surface of the forest pond in Jussy, Geneva, Switzerland - JEBV, may 2015 | Environmental | Open in IMG/M |
| 3300012516 | Freshwater to marine salinity gradient microbial communities from Chesapeake Bay, USA - CPBay_Spr_15_0.2_RNA1 (Metagenome Metatranscriptome) | Environmental | Open in IMG/M |
| 3300012990 | Tailings pond microbial communities from Northern Alberta -TP6_2010 BML May 2015 | Engineered | Open in IMG/M |
| 3300013132 (restricted) | Freshwater microbial communities from Kabuno Bay, South-Kivu, Congo - kab_092012_9.5m | Environmental | Open in IMG/M |
| 3300013232 | Sediment microbial communities from Acid Mine Drainage holding pond in Pittsburgh, PA, USA - S1 | Engineered | Open in IMG/M |
| 3300014962 | Surface water microbial communities from Bangladesh - BaraHaldiaSW0309 | Environmental | Open in IMG/M |
| 3300017818 | Coastal salt marsh microbial communities from the Groves Creek Marsh, Skidaway Island, Georgia - 101401AT metaG (megahit assembly) | Environmental | Open in IMG/M |
| 3300017949 | Coastal salt marsh microbial communities from the Groves Creek Marsh, Skidaway Island, Georgia - 071406AT metaG (megahit assembly) | Environmental | Open in IMG/M |
| 3300017950 | Coastal salt marsh microbial communities from the Groves Creek Marsh, Skidaway Island, Georgia - 041413US metaG (megahit assembly) | Environmental | Open in IMG/M |
| 3300018036 | Coastal salt marsh microbial communities from the Groves Creek Marsh, Skidaway Island, Georgia - 041406US metaG (megahit assembly) | Environmental | Open in IMG/M |
| 3300018041 | Coastal salt marsh microbial communities from the Groves Creek Marsh, Skidaway Island, Georgia - 041407BS metaG (megahit assembly) | Environmental | Open in IMG/M |
| 3300018682 | Metatranscriptome of marine microbial communities from Baltic Sea - GS680_0p1 | Environmental | Open in IMG/M |
| 3300018775 | Metatranscriptome of marine microbial communities from Baltic Sea - GS679_0p8 | Environmental | Open in IMG/M |
| 3300019756 | Freshwater microbial communities from the Broadkill River, Lewes, Delaware, United States - IW6Sep16_MG | Environmental | Open in IMG/M |
| 3300020048 | Microbial communities from Manganika and McQuade lakes, Minnesota, USA Combined Assembly of Gp0225457, Gp0225456, Gp0225455, Gp0225454, Gp0225453, Gp0224915 | Environmental | Open in IMG/M |
| 3300020194 | Coastal salt marsh microbial communities from the Groves Creek Marsh, Skidaway Island, Georgia - 041403US metaG (spades assembly) | Environmental | Open in IMG/M |
| 3300021356 | Coastal seawater microbial communities near Pivers Island, North Carolina, United States - PICO245 | Environmental | Open in IMG/M |
| 3300021375 | Coastal seawater microbial communities near Pivers Island, North Carolina, United States - PICO132 | Environmental | Open in IMG/M |
| 3300021378 | Coastal seawater microbial communities near Pivers Island, North Carolina, United States - PICO131 | Environmental | Open in IMG/M |
| 3300021579 | Freshwater microbial communities from McNutts Creek, Athens, Georgia, United States - 2-17 MG | Environmental | Open in IMG/M |
| 3300021960 | Estuarine water microbial communities from San Francisco Bay, California, United States - C33_9D | Environmental | Open in IMG/M |
| 3300022198 | Freshwater to marine saline gradient viral communities from Chesapeake Bay - CB_1508_1S Viral MetaG (v3) | Environmental | Open in IMG/M |
| 3300022200 | Freshwater to marine saline gradient viral communities from Chesapeake Bay - CB_1504_1 Viral MetaG (v3) | Environmental | Open in IMG/M |
| 3300022856 | Saline water microbial communities from Ace Lake, Antarctica - #797 | Environmental | Open in IMG/M |
| 3300023109 (restricted) | Seawater microbial communities from Saanich Inlet, British Columbia, Canada - SI_122_August2016_10_MG | Environmental | Open in IMG/M |
| 3300024346 | Whole water sample coassembly | Environmental | Open in IMG/M |
| 3300025057 | Marine viral communities from the Gulf of Mexico - 31_GoM_OMZ_CsCl metaG (SPAdes) | Environmental | Open in IMG/M |
| 3300025380 | Saline lake microbial communities from Ace Lake, Antarctica - Antarctic Ace Lake Metagenome 02UKF (SPAdes) | Environmental | Open in IMG/M |
| 3300025425 | Saline lake microbial communities from Ace Lake, Antarctica - Antarctic Ace Lake Metagenome 02UK8 (SPAdes) | Environmental | Open in IMG/M |
| 3300025502 | Saline lake microbial communities from Ace Lake, Antarctica - Antarctic Ace Lake Metagenome 02UKJ (SPAdes) | Environmental | Open in IMG/M |
| 3300025543 | Freshwater to marine saline gradient viral communities from Chesapeake Bay - CB_1504_2 Viral MetaG (SPAdes) | Environmental | Open in IMG/M |
| 3300025646 | Freshwater to marine saline gradient viral communities from Chesapeake Bay - CB_1508_1S Viral MetaG (SPAdes) | Environmental | Open in IMG/M |
| 3300025647 | Freshwater to marine saline gradient viral communities from Chesapeake Bay - CB_1504_1 Viral MetaG (SPAdes) | Environmental | Open in IMG/M |
| 3300025655 | Freshwater to marine saline gradient viral communities from Chesapeake Bay - CB_1508_2 Viral MetaG (SPAdes) | Environmental | Open in IMG/M |
| 3300025732 | Aqueous microbial communities from the Delaware River and Bay under freshwater to marine salinity gradient to study organic matter cycling in a time-series - DEBay_Sum_0.19_N_>0.8_DNA (SPAdes) | Environmental | Open in IMG/M |
| 3300025889 | Aqueous microbial communities from the Delaware River and Bay under freshwater to marine salinity gradient to study organic matter cycling in a time-series - Viral MetaG DEL_Nov_18 (SPAdes) | Environmental | Open in IMG/M |
| 3300027586 | Freshwater lentic microbial communities from great Laurentian Lakes, MI, USA - Great Lakes metaG HU45MSRF (SPAdes) | Environmental | Open in IMG/M |
| 3300027770 | Freshwater microbial communities from Lake Montjoie, Canada to study carbon cycling - M_130207_XF_MetaG (SPAdes) | Environmental | Open in IMG/M |
| 3300027805 | Freshwater and sediment microbial communities from dead zone in Sandusky Bay, Ohio, USA (SPAdes) | Environmental | Open in IMG/M |
| 3300027832 | Freshwater microbial communities from Lake Bonney liftoff mats and glacier meltwater in Antarctica - BON-02 (SPAdes) | Environmental | Open in IMG/M |
| 3300027917 | Marine sediment microbial communities from White Oak River estuary, North Carolina - WOR-2-8_12 (SPAdes) | Environmental | Open in IMG/M |
| 3300027976 | Freshwater microbial communities from Lake Fryxell liftoff mats and glacier meltwater in Antarctica - FRY-01 (SPAdes) | Environmental | Open in IMG/M |
| 3300027983 | Freshwater microbial communities from Lake Fryxell liftoff mats and glacier meltwater in Antarctica - FRY-03 (SPAdes) | Environmental | Open in IMG/M |
| 3300028571 (restricted) | Freshwater microbial communities from meromictic Lake La Cruz, Castile-La Mancha, Spain - LaCruzMarch201714.5m_1 | Environmental | Open in IMG/M |
| 3300031519 | Sea-ice brine microbial communities from Beaufort Sea near Barrow, Alaska, United States - SB 0.2 | Environmental | Open in IMG/M |
| 3300032263 | Coastal sediment microbial communities from Maine, United States - Phippsburg sediment 1 | Environmental | Open in IMG/M |
| 3300034050 | Freshwater microbial communities from Lake Mendota, Madison, Wisconsin, United States - TYMEFLIES-ME07May2013-rr0095 | Environmental | Open in IMG/M |
| 3300034073 | Fracking water microbial communities from deep shales in Oklahoma, United States - MC-6-XL | Environmental | Open in IMG/M |
| 3300034102 | Freshwater microbial communities from Lake Mendota, Madison, Wisconsin, United States - TYMEFLIES-ME17Jul2002-rr0112 | Environmental | Open in IMG/M |
| 3300034104 | Freshwater microbial communities from Lake Mendota, Madison, Wisconsin, United States - TYMEFLIES-ME02Aug2005-rr0120 | Environmental | Open in IMG/M |
| 3300034107 | Freshwater microbial communities from Lake Mendota, Madison, Wisconsin, United States - TYMEFLIES-ME21Apr2017-rr0133 | Environmental | Open in IMG/M |
| 3300034283 | Freshwater microbial communities from Lake Mendota, Madison, Wisconsin, United States - TYMEFLIES-ME07Aug2003-rr0061 | Environmental | Open in IMG/M |
Note: Some of these sequences are restricted, as per the data usage policy of the Joint Genome Institute (JGI). Utilizing any of their features below requires obtaining a license from the datasets' corresponding author(s).
| Protein ID | Sample Taxon ID | Habitat | Sequence |
|---|---|---|---|
| DelMOSpr2010_100876702 | 3300000116 | Marine | VAKNSGLHLSESFLKKRYVMDKRSPEDIAKECGVSVQLIYRQLKKFGLKK* |
| ML7_1000004520 | 3300001336 | Benthic Lake | LPRGSSLHHSEAYLKKRLHLDKKTPQEIAKECNVSIQVIYRQMKKFGLK* |
| ML7_1000224526 | 3300001336 | Benthic Lake | MRSKGLHLSKAYMEKRYISDKKSPEAIAEECGVSVQLIYRQLKKFGLKK* |
| ML7_100796422 | 3300001336 | Benthic Lake | MRSKGLHLSKAYMEKRYIRDKKTPEAIAEECGVSIQLIYRQLKKFGLKK* |
| Draft_1000012860 | 3300001605 | Hydrocarbon Resource Environments | MRSKGLHLSEAFMKKRYVMDKKSPEDIAKECGVSVQLIYRQLKKMGLKR* |
| RCM33_10033765 | 3300001838 | Marine Plankton | MPRGISLHHSEAYLRKRYVMDKKSPEDIAKECNVSVQIIYKQLKKFGLKR* |
| RCM31_100966974 | 3300001851 | Marine Plankton | MAKNAGLHHSEAYLKKRLYIDKKTPEEIAKECGVSVQIIYRQMKKMG |
| metazooDRAFT_13586391 | 3300002212 | Lake | MGIRKLGEDVLMPRGGSLHHSEAFLRKRLHMDKKTPEEVAKECNVSIQVIYRQMKKFGIK |
| B570J40625_1005679623 | 3300002835 | Freshwater | MPRGSSLHHSEAFLRKRLHIDKKTPEEIAKECNVSLQVIYRQMKRFG |
| draft_115200886 | 3300002856 | Hydrocarbon Resource Environments | MARNSGLHLSESYLRKRYVMDKKPIEEIAKECGVSIQIIYRQLAKFGLKK* |
| Ga0055584_1001035957 | 3300004097 | Pelagic Marine | MTSRLHHSAAYLKKRYVMEKKSVDDIAKECGVSIQIIYRQLKKFG |
| Ga0049082_101516612 | 3300005584 | Freshwater Lentic | MRTKGLHLSEAFMKKRYVMERKSPDDIAKECGVSVQLIYRQLKKFGLKR* |
| Ga0079957_100930313 | 3300005805 | Lake | MRSKGLHLSEAFMRKRYVLDKKSPEDIAKECGVSVQIIYRQLKKFGLKR* |
| Ga0074478_19082132 | 3300005827 | Sediment (Intertidal) | MARSNGLHLNEAYLKKRYTMDKKPIEEIAKECGVSIQIIYRQLKKFGLKK* |
| Ga0075112_10060807 | 3300005909 | Saline Lake | MRSIGLHLSEAFLRKRYVMDKKTPEEIAGECGVSVQLIYRQLKKFGLKR* |
| Ga0075109_10968742 | 3300005912 | Saline Lake | MPRGGSLHHSEAFLRKRIHLDKKTPEEVAKECNVSIQLIYRQMKKFGIK* |
| Ga0075118_100064584 | 3300005933 | Saline Lake | MRSIGLHLSEAFLRKRYVMDKKTPEDIAKECEVSVQLIYRQLKKFKLKR* |
| Ga0075474_1000006267 | 3300006025 | Aqueous | MEKRYIRDKKTPEAIAEECGVSVQLIYRQLKKFGLKK* |
| Ga0075478_100247804 | 3300006026 | Aqueous | MRSKGLHLSKAYMEKRYIRDKKTPEAIAEECGVSVQLIYRQLKKFGLKK* |
| Ga0098073_100035012 | 3300006734 | Marine | MKLHTNKEYLRKRYVMQKKSPDEIAKECNVTVQTIYTQLKKFGLRK* |
| Ga0070749_102158601 | 3300006802 | Aqueous | MRSKGLHLSEAFMKKRYVLDKKSPEDIAKECGVSVQLIYRQLKKFGLKR* |
| Ga0075473_100058646 | 3300006875 | Aqueous | MPRGTSLHHSEAYLRKRYIMDKKSPEDIAKECNVSVQIIYKQLKKFGLKK* |
| Ga0105050_100112457 | 3300007516 | Freshwater | MMSTNKLHMSEAFMKKRYVMEKKSPEDIAKECEVSVQLIYRQLKKMGLKR* |
| Ga0105050_100280302 | 3300007516 | Freshwater | MAKNTKLHHSEAYLKKRRYLDKKSPEDIAKECGVSIQIIYRQFKKFGIK* |
| Ga0105050_102997242 | 3300007516 | Freshwater | MRSIGLHLSEAFLRKRYVMDKKTPEDIAKECEVSVQLIYRQLKKFGLKR* |
| Ga0105050_103980684 | 3300007516 | Freshwater | SIGLHLSEAFLRKRYVMDKKTPEEIAKECEVSVQLIYRQLKKFGLKR* |
| Ga0105050_104756152 | 3300007516 | Freshwater | MRSMGLHLSEAFLRKRYVMDKKTPEEIAKECEVSVQLIYRQLKKFKLKR* |
| Ga0105055_101628104 | 3300007519 | Freshwater | MRSIGLHLSEAFLRKRYVMDKKTPEEIAKECEVSVQLIYRQLKKFGLKR* |
| Ga0105053_1002266013 | 3300007522 | Freshwater | MMENNKLHLSEAFMKKRYVMEKKSPDDIANECGVSVQLIYRQLKRFGLKR* |
| Ga0105052_102094382 | 3300007523 | Freshwater | MMSTNKLHMSEAFMKKRYVMDKKSPEDIAKECEVSVQLIYRQLKKMGLKR* |
| Ga0099851_10001904 | 3300007538 | Aqueous | MAKNAGLHHNEAYLRKRLYLDKRTPEQIASECGVSLQIIYRQMKKFGLK* |
| Ga0099851_10100684 | 3300007538 | Aqueous | MPRGSSLHHSEAYLRKRLHIDKKTPEQIAKECNVSVQVIYRQMKKFGIK* |
| Ga0099851_11287603 | 3300007538 | Aqueous | MRSKGLHLSEAFMKKRYVMDKKSPEDIAKECGVSVQLIYRQLKKFGLKK* |
| Ga0099851_12230662 | 3300007538 | Aqueous | MAKNAGLHHSEAYLKKRLHIDKKSPEDVAKECGVSLQIIYRQMKKFGLK* |
| Ga0099851_12364211 | 3300007538 | Aqueous | MAKNSGLHLSEAFMKKRYVLDKKTPEDIAKECGVSVQLIYRQLKKFGLRK* |
| Ga0099851_13138282 | 3300007538 | Aqueous | MAKNSKLHLSEAFMKKRYVMDKKSPEDIAKECGVSVQLIYRQLKKFGLRK* |
| Ga0099847_10270636 | 3300007540 | Aqueous | MRSKGLHLNEAFMKKRYVMDKKSPEDIAKECGVSVQLIYRQLKKFGLKK* |
| Ga0099848_10239765 | 3300007541 | Aqueous | MPRGSSLHHSEAYLRKRLHIDKKTPEQVAKECNVSVQVIYRQMKKFGIKQ* |
| Ga0099848_11387771 | 3300007541 | Aqueous | GSSLHHSEAYLRKRLHIDKKTPEQIAKECNVSVQVIYRQMKKFKIR* |
| Ga0099848_11872352 | 3300007541 | Aqueous | MRSKGLHLSEAFMRKRYVMDKKSPEDIAKECGVSVQLIYRQLKKFGLKK* |
| Ga0099848_12807762 | 3300007541 | Aqueous | MPRGSSLHHSEAYLRKRLHIDKKTPEEVAKECNVSLQIIYRQMKK |
| Ga0099848_12949463 | 3300007541 | Aqueous | MRSLGLHLNESFLRKRYIVDKKSPEDIAKECGVSVQLIYRQLKKFALKK* |
| Ga0099850_10033762 | 3300007960 | Aqueous | MPRGSSLHHSEAYLRKRLHIDKKTPEQVAKECNVSVQVIYRQMKKFGIK* |
| Ga0099850_11747742 | 3300007960 | Aqueous | MRSLGLHLNESFLRKRYIVDKKSPEDIAKECGVSVQLIYRQLKKFGLKK* |
| Ga0105098_100543683 | 3300009081 | Freshwater Sediment | MPRGSSLHHSEAYLRKRLHIDKKTPEQIAKECNVSLQVIYRQMKKFGLKY* |
| Ga0105098_104147682 | 3300009081 | Freshwater Sediment | MSKNLGLHLSETFLRKRYVMDKKSPDDIAKECGVSVQLIYRQLKKFGLKK* |
| Ga0105099_103410412 | 3300009082 | Freshwater Sediment | MRSKGLHLSEAFMKKRYVMDKKSPEDIAKECGVSVQLIYRQLKKMGLRK* |
| Ga0114978_102112014 | 3300009159 | Freshwater Lake | MPRGSTLHHSEAFLRKRILVDKKTPEEIAKECNVSL |
| Ga0114966_103300071 | 3300009161 | Freshwater Lake | MPRGSTLHHSEAFLRKRILVDKKTPEEIAKECNVSLQIIYRQLKKFGLRK* |
| Ga0136655_10536374 | 3300010316 | Freshwater To Marine Saline Gradient | MRSKGLHLSEAFMKKRYVMDKKSPEDIAKECGVSVQLIYRQLK |
| Ga0129333_1000003340 | 3300010354 | Freshwater To Marine Saline Gradient | MRNSGLHLSEAFLKKRYVQERKSPEDIAKECGVSVQLIYRQLKKFGLKK* |
| Ga0129333_100350988 | 3300010354 | Freshwater To Marine Saline Gradient | LNRSFMAKRYIIDKKSPEEIAKECGVSVQIIYRQLKKFGLKK* |
| Ga0129333_114039392 | 3300010354 | Freshwater To Marine Saline Gradient | MRSKGLHLSEAFMRKRYVMDKKSPEDIAKECEVSVQIIYRQLKKFGLKR* |
| Ga0129333_114747143 | 3300010354 | Freshwater To Marine Saline Gradient | MTKNTGFHLNRSFMAKRYIIDKKSPEEIAKECGVSVQIIYRQLKKFGL |
| Ga0136551_100060013 | 3300010388 | Pond Fresh Water | MAKNVGLHHNEAYLKKRLYLDKKTPEQIAVECGVSLQIIYRQMKKFNLK* |
| Ga0129325_11858845 | 3300012516 | Aqueous | MRSKGLHLSEAFMKKRYVMDKKSPEDIAKECGVSVQLIYRQLKKFGLRK* |
| Ga0159060_10083356 | 3300012990 | Hydrocarbon Resource Environments | MRSMGLHLSEAFMKKRYVMEKKSPEDIAKECGVSVQLIYRQLKKFGLRK* |
| (restricted) Ga0172372_102667993 | 3300013132 | Freshwater | MRSKGLHLSEAFMRKRYIVDKKSPEDIAKECEVSVQLIYRQLKKFGLKK* |
| Ga0170573_107806285 | 3300013232 | Sediment | MARSNGLHLNEAYLKKRFVMDKKPIEEIAKECGVSIQIIYRQLKKFGLKK* |
| Ga0134315_10720271 | 3300014962 | Surface Water | MPRGSSLHHSEAYLRKRLHIDKKTPEQIAKECNVSVQIIYRQMKKFNIKY* |
| Ga0134315_10740212 | 3300014962 | Surface Water | VPRGSSLHHSEAFLRKRLFIDKKTPEEIAKECNVSLQIIYRQLKKFGLKK* |
| Ga0181565_100730134 | 3300017818 | Salt Marsh | MTSKLHHSAAYLKKRYVMEKKSVDDIAKECGVSIQIIYRQLKKFGLRR |
| Ga0181584_100001345 | 3300017949 | Salt Marsh | MRSKGLHLSKAYMEKRYIRDKKTPEAIAEECGVSVQLIYRQLKKFGLKK |
| Ga0181607_1001407310 | 3300017950 | Salt Marsh | MPRGSSLHHSEAYLRKRLHIDKKTPEDVAKECNVSLQVIYRQMKKFGLKK |
| Ga0181600_105084611 | 3300018036 | Salt Marsh | NMAKNAGLHHNEAYLRKRLYLDKRTPEQIASECGVSLQIIYRQMKKFGLNK |
| Ga0181601_100246812 | 3300018041 | Salt Marsh | MAKNVGLHHNEAYLRKRLHLDKKTPEQIASECGVSLQIIYRQMKKFGLNK |
| Ga0188851_10054192 | 3300018682 | Freshwater Lake | MAKNMGLHHSEAFLKKRLHLDKKTPEEIAKECNVSLQIIYRQMKKFGLKK |
| Ga0188848_10011325 | 3300018775 | Freshwater Lake | MGLHHSEAFLKKRLHLDKKTPEEIAKECNVSLQIIYRQMKKFGLKK |
| Ga0194023_100000144 | 3300019756 | Freshwater | MEKRYIRDKKTPEAIAEECGVSVQLIYRQLKKFGLKK |
| Ga0207193_1000213142 | 3300020048 | Freshwater Lake Sediment | MAKNAGLHHSESYLRKRLYLDKKTPEQIAIECGVSLQIIYRQLKKFGLK |
| Ga0181597_101270101 | 3300020194 | Salt Marsh | SSLHHSEAYLRKRLHIDKKTPEDVAKECNVSLQVIYRQMKKFGLKK |
| Ga0213858_100019549 | 3300021356 | Seawater | MKLHTNKEYLRKRYVMQKKSPDEIAKECNVTVQTIYTQLKRFGLRK |
| Ga0213869_104681561 | 3300021375 | Seawater | MRSKGLHLSEAFMKKRYVMDKKSPEDIAKECGVSVQLIYRQLKKFGLRK |
| Ga0213861_1000194112 | 3300021378 | Seawater | MRSKGLHLSEAFMKKRYVMDKKSPEDIAKECGVSVQLIYRQLKKFGLKK |
| Ga0213918_10169363 | 3300021579 | Freshwater | QRYVMQKRTPEEIAKECGVSVQIIYKQLNKFGLRR |
| Ga0222715_107008942 | 3300021960 | Estuarine Water | MPRGKGLHLSQPYLKQRYVYDKKSIEEIAKECDVSIQIIYKQLKKFGLKK |
| Ga0196905_10416181 | 3300022198 | Aqueous | PRGSSLHHSEAYLRKRLHIDKKTPEQVAKECNVSVQVIYRQMKKFGIK |
| Ga0196901_100017739 | 3300022200 | Aqueous | MAKNAGLHHNEAYLRKRLYLDKRTPEQIASECGVSLQIIYRQMKKFGLK |
| Ga0196901_10079063 | 3300022200 | Aqueous | MAKNAGLHHSEAYLKKRLHIDKKSPEDVAKECGVSLQIIYRQMKKFGLK |
| Ga0196901_10806813 | 3300022200 | Aqueous | MAKNSKLHLSEAFMKKRYVMDKKSPEDIAKECGVSVQLIYRQLKKFGLRK |
| Ga0222671_10009447 | 3300022856 | Saline Water | MRSIGLHLSEAFLRKRYVMDKKTPEEIAGECGVSVQLIYRQLKKFGLKR |
| (restricted) Ga0233432_101492732 | 3300023109 | Seawater | MKLHTNKEYLRKRYIMEKKSPDEIAKECDVTVQTIYTQLKKFGLRK |
| Ga0244775_100325143 | 3300024346 | Estuarine | MAKNVGLHHSEAYLKKRLYIDKKTPEEIAKECNVSLQIIYRQIKKFGLKK |
| Ga0244775_106394853 | 3300024346 | Estuarine | MAKNMGLHHSEAYLKKRLYLDKKTPEQIATECGVSLQIIYRQMKKFGLR |
| Ga0208018_1026073 | 3300025057 | Marine | MKLHTNKEYLRKRYVMQKKSPDEIAKECNVTVQTIYTQLKKFGLRK |
| Ga0208901_10091271 | 3300025380 | Saline Lake | LSEAFLRKRYVMDKKTPEEIAGECGVSVQLIYRQLKKFGLKR |
| Ga0208646_10065074 | 3300025425 | Saline Lake | MPRGGSLHHSEAFLRKRIHLDKKTPEEVAKECNVSIQLIYRQMKKFGIK |
| Ga0208903_100140427 | 3300025502 | Saline Lake | MRSIGLHLSEAFLRKRYVMDKKTPEDIAKECEVSVQLIYRQLKKFKLKR |
| Ga0208303_10833051 | 3300025543 | Aqueous | EAFMKKRYVMDKKSPEDIAKECGVSVQLIYRQLKKFGLKK |
| Ga0208161_10146812 | 3300025646 | Aqueous | MPRGSSLHHSEAYLRKRLHIDKKTPEQIAKECNVSVQVIYRQMKKFGIK |
| Ga0208160_11154493 | 3300025647 | Aqueous | SKMRSKGLHLSEAFMKKRYVMDKKSPEDIAKECGVSVQLIYRQLKKFGLKK |
| Ga0208160_11341123 | 3300025647 | Aqueous | LEKGYSKMRSKGLHLSEAFMKKRYVMDKKSPEDIAKECGVSVQLIYRQLKKFGLKK |
| Ga0208160_11476391 | 3300025647 | Aqueous | HSEAYLRKRLHIDKKTPEQVAKECNVSVQVIYRQMKKFGIK |
| Ga0208795_10664463 | 3300025655 | Aqueous | MAKNSGLHLSEAFMKKRYVLDKKTPEDIAKECGVSVQLIYRQLKKFGLRK |
| Ga0208784_10030226 | 3300025732 | Aqueous | MPRGTSLHHSEAYLRKRYIMDKKSPEDIAKECNVSVQIIYKQLKKFGLKK |
| Ga0208644_11585421 | 3300025889 | Aqueous | MRSKGLHLSEAFMKKRYVLDKKSPEDIAKECGVSVQLIYRQLKKFGLKR |
| Ga0208966_11597082 | 3300027586 | Freshwater Lentic | MRTKGLHLSEAFMKKRYVMERKSPDDIAKECGVSVQLIYRQLKKFGLKR |
| Ga0209086_102991811 | 3300027770 | Freshwater Lake | MPRGSTLHHSEAFLRKRILVDKKTPEEIAKECNVSLQIIYRQLKKFGLRK |
| Ga0209229_103629261 | 3300027805 | Freshwater And Sediment | VAKNVGLHHSEAYLRKRFHIDKKTPEDIATECGVSVQIIYKQLKKFGLKK |
| Ga0209229_104770862 | 3300027805 | Freshwater And Sediment | VAKNVGLHHSEAYLRKRFHVDKKTPEDIATECGVSVQIIYKQLKKFGLKK |
| Ga0209491_102147533 | 3300027832 | Freshwater | MRSIGLHLSEAFLRKRYVMDKKTPEEIAKECEVSVQLIYRQLKKFGLKR |
| Ga0209536_1000562565 | 3300027917 | Marine Sediment | MRSKGLHLSESFMRKRYVFDKRSPEDIAKECGVSVQLIYRQLKKFGLKK |
| Ga0209536_1027891891 | 3300027917 | Marine Sediment | MRSKGLHLSEAFMRKRYVLDKKSPEDIAKECGVSVQLIYRQLKKFGLKK |
| Ga0209702_100619012 | 3300027976 | Freshwater | MRSIGLHLSEAFLRKRYVMDKKTPEEIAKECEVSVQLIYRQLKKFKLKR |
| Ga0209702_101344272 | 3300027976 | Freshwater | MRSIGLHLSEAFLRKRYVMDKKTPEDIAKECEVSVQLIYRQLKKFGLKR |
| Ga0209284_103135263 | 3300027983 | Freshwater | AFLRKRYVMDKKTPEEIAKECEVSVQLIYRQLKKFGLKR |
| (restricted) Ga0247844_13280702 | 3300028571 | Freshwater | VSRSLGLHLNEAFMKKRYVVDKKSPDDIAKECGVSVQLIYRQLKKFGLRK |
| Ga0307488_101721905 | 3300031519 | Sackhole Brine | RKILQRLTYMKKNNGLHLSQAFMKKRYVMDKKSPEDIAKECGVSVQLIYRQLKKFGLKK |
| Ga0316195_102801721 | 3300032263 | Sediment | MARGSTLHHSEAYLKQRYVMQKRTPEEIAKECNVSVQII |
| Ga0316195_106040341 | 3300032263 | Sediment | MARGSTLHHSEAYLKQRYVMQKRTPEEIAKECNVSVQI |
| Ga0335023_0000591_21544_21693 | 3300034050 | Freshwater | MRSKGLHLSEAFMKKRYVMDKKSPEDIAKECGVSVQLIYRQLKKMGLRK |
| Ga0310130_0012234_66_215 | 3300034073 | Fracking Water | MPRGSSLHHSEAYLRKRLHIDKKTPEQVAKECNVSVQVIYRQMKKFGIK |
| Ga0310130_0042356_849_998 | 3300034073 | Fracking Water | MRSKGLHLSEAFMRKRYITDKKSPEDIAKECGVSIQLIYRQLKKFGLKK |
| Ga0310130_0156362_427_576 | 3300034073 | Fracking Water | MRSKGLHLSEAFMRKRYVMDKKSPEDIAKECGVSVQLIYRQLKKFGLKK |
| Ga0335029_0544194_279_428 | 3300034102 | Freshwater | MRSKGLHLSEAFMRKRYVMDKKSPEDIAKECEVSVQIIYRQLKKFGLKR |
| Ga0335031_0000174_16773_16922 | 3300034104 | Freshwater | MAKNAGLHHSEAYLKKRLYLDKKTPEQIAVECGVSLQIIYRQMKKFGLK |
| Ga0335037_0652192_374_526 | 3300034107 | Freshwater | MAKNVGLHHSEAYLKKRLHIDKKSPEEIAKECNVSLQIIYRQMKKFGLKK |
| Ga0335007_0008717_3096_3245 | 3300034283 | Freshwater | MAKNVGLHHSEAYLKKRLYLDKKTPEQIAVECGVSLQIVYRQMKKFGLK |
| Ga0335007_0020991_41_190 | 3300034283 | Freshwater | MAKNNGLHHNEAYLKKRLHLDKKTPEQIALECGVSLQIIYRQMKKFGLK |
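Some sequences in the table above end with `*`. A reasonable reading, stated here as an assumption rather than something the page confirms, is that `*` marks the translated stop codon and is not a residue. The sketch below strips that marker before measuring length, so entries can be compared against the family's reported average of 48 residues; the helper name `residue_length` is hypothetical.

```python
def residue_length(seq: str) -> int:
    """Length of a protein sequence, ignoring a trailing '*' marker.

    Assumption: '*' denotes the translated stop codon, not an amino acid.
    """
    return len(seq.rstrip("*"))


# Three rows copied from the sequence table above (protein ID -> sequence).
examples = {
    "DelMOSpr2010_100876702": "VAKNSGLHLSESFLKKRYVMDKRSPEDIAKECGVSVQLIYRQLKKFGLKK*",
    "Ga0075474_1000006267": "MEKRYIRDKKTPEAIAEECGVSVQLIYRQLKKFGLKK*",
    "Ga0310130_0156362_427_576": "MRSKGLHLSEAFMRKRYVMDKKSPEDIAKECGVSVQLIYRQLKKFGLKK",
}

lengths = {protein_id: residue_length(seq) for protein_id, seq in examples.items()}
for protein_id, n in lengths.items():
    print(f"{protein_id}\t{n}")
```

Sequences that lack both a leading `M` and a trailing `*` may be truncated at a scaffold end, which would be consistent with the 19.33 % of genes flagged as near scaffold ends, though this sketch does not test for that.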