
What is a FUBAR test?

Published in Evolutionary Analysis

FUBAR, which stands for Fast, Unconstrained Bayesian AppRoximation, is a statistical method used in evolutionary biology to detect selection in protein-coding sequences. Given an alignment of coding sequences and its corresponding phylogeny, it uses a Bayesian approach to infer the rates of nonsynonymous (dN) and synonymous (dS) substitutions at each site of the alignment.

Understanding FUBAR in Detail

FUBAR's primary goal is to identify sites in a protein that are under positive or negative selection. Here's a breakdown:

  • Bayesian Approach: FUBAR uses Bayesian statistics, which means it incorporates prior beliefs about the distribution of dN/dS ratios and updates these beliefs based on the observed data.

  • dN/dS Ratio: The ratio of nonsynonymous to synonymous substitution rates (dN/dS, also denoted as ω) is a key indicator of selection pressure.

    • dN > dS (ω > 1): Suggests positive selection, meaning that changes in the amino acid sequence are favored.
    • dN < dS (ω < 1): Suggests negative (purifying) selection, meaning that changes in the amino acid sequence are disfavored.
    • dN = dS (ω = 1): Suggests neutral evolution, meaning that changes are neither favored nor disfavored.
  • Per-Site Analysis: FUBAR estimates dN/dS on a site-by-site basis, allowing it to pinpoint specific amino acid positions within a protein that are evolving under different selection pressures.
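
The per-site Bayesian update described above can be sketched in a few lines. This is a toy illustration, not FUBAR's actual implementation: the 3 x 3 rate grid, the uniform prior weights, and the per-site likelihood values are all made-up assumptions, and the function names are hypothetical. In a real analysis the likelihoods would come from a codon substitution model evaluated on the phylogeny, and the grid weights would be estimated from the whole alignment.

```python
from itertools import product


def site_posterior(prior_weights, site_likelihoods):
    """Bayes' rule on a discrete grid: posterior is proportional to prior x likelihood."""
    joint = {pt: prior_weights[pt] * site_likelihoods[pt] for pt in prior_weights}
    total = sum(joint.values())
    return {pt: value / total for pt, value in joint.items()}


def selection_probabilities(posterior):
    """Sum posterior mass where dN > dS (positive) and where dN < dS (negative)."""
    p_positive = sum(p for (ds, dn), p in posterior.items() if dn > ds)
    p_negative = sum(p for (ds, dn), p in posterior.items() if dn < ds)
    return p_positive, p_negative


# Hypothetical grid of (dS, dN) rate pairs; the values are chosen only for illustration.
rates = [0.5, 1.0, 2.0]
grid = list(product(rates, rates))

# Assumed uniform prior weight on every grid point.
prior = {pt: 1.0 / len(grid) for pt in grid}

# Made-up likelihoods for one site: the data fit grid points with dN > dS four times better.
likelihood = {(ds, dn): (4.0 if dn > ds else 1.0) for ds, dn in grid}

posterior = site_posterior(prior, likelihood)
p_pos, p_neg = selection_probabilities(posterior)
print(f"P(dN > dS) = {p_pos:.3f}, P(dN < dS) = {p_neg:.3f}")
```

Summing posterior mass over regions of the grid is what turns the per-site rate estimates into a probability statement about selection, rather than a single point estimate of ω.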

Practical Applications of FUBAR

Because FUBAR flags the individual amino acid residues that are under selection, its results are useful in areas such as virology and the study of protein structure and function.

  • Sites under positive selection help to:
    • Identify potential drug-resistance mutations in viruses such as HIV.
    • Understand the adaptive evolution of proteins in response to environmental changes.
  • Sites under negative selection help to:
    • Locate critical functional regions of a protein that are intolerant to change.
    • Determine which residues are important for protein stability or folding.

Example Scenario

Imagine analyzing the gene encoding the hemagglutinin protein of the influenza virus using FUBAR. The analysis may reveal:

  1. Sites under positive selection: These sites may correspond to amino acid positions in the hemagglutinin protein that are evolving rapidly to evade the host's immune system.
  2. Sites under negative selection: These sites may correspond to amino acid positions that are crucial for the protein's structure or function, where mutations are detrimental to the virus's fitness.
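
To make the scenario concrete, the sketch below sorts sites into the two categories from per-site posterior probabilities. Everything in it is an assumption for illustration: the site numbers and probabilities are invented rather than taken from a real hemagglutinin analysis, `classify_site` is a hypothetical helper, and the 0.9 cutoff is one possible choice of threshold that should be adjusted to the study.

```python
def classify_site(p_positive, p_negative, threshold=0.9):
    """Label a site from its posterior probabilities of dN > dS and dN < dS."""
    if p_positive >= threshold:
        return "positive"
    if p_negative >= threshold:
        return "negative"
    return "no strong evidence"


# Invented example values: site number -> (P(dN > dS), P(dN < dS)).
sites = {
    101: (0.97, 0.01),
    102: (0.02, 0.95),
    103: (0.40, 0.35),
}

calls = {site: classify_site(*probs) for site, probs in sites.items()}
for site, call in calls.items():
    print(site, call)
```

Sites that clear neither threshold are left unclassified rather than called neutral, since weak evidence for selection is not evidence of neutrality.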

Key Benefits of FUBAR

  • Fast Computation: As the name suggests, FUBAR is designed to be computationally efficient, making it suitable for analyzing large datasets.

  • Unconstrained: The "Unconstrained" aspect means the method avoids restrictive parametric assumptions about how dN and dS vary across sites. Instead, it approximates that distribution with a large grid of rate classes.

  • Bayesian Inference: Provides a probabilistic framework for estimating dN/dS ratios and quantifying uncertainty in the estimates.