Skip to content

Baseline

rionid.baseline

NONPARAMS_EST

Bases: object

Wrapper class for non-parametric baseline estimation algorithms.

This class provides a unified interface to access various baseline subtraction methods, primarily Penalized Least Squares (PLS) variants.

Parameters:

Name Type Description Default
data array - like

The input 1D spectrum array (e.g., Power Spectral Density) from which the baseline should be estimated.

required
Source code in src/rionid/baseline.py
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
class NONPARAMS_EST(object):
    """
    Wrapper class for non-parametric baseline estimation algorithms.

    This class provides a unified interface to access various baseline subtraction
    methods, primarily Penalized Least Squares (PLS) variants.

    Parameters
    ----------
    data : array-like
        The input 1D spectrum array (e.g., Power Spectral Density) from which
        the baseline should be estimated.
    """
    def __init__(self, data):
        self.data = data

    def pls(self, method, l, **kwargs):
        """
        Executes a Penalized Least Squares (PLS) baseline estimation.

        Parameters
        ----------
        method : str
            The specific PLS algorithm to use. Currently supports 'BrPLS'.
        l : float
            The smoothness parameter (lambda). Higher values result in a stiffer,
            smoother baseline. Typical values range from 10^5 to 10^8 for Schottky spectra.
        **kwargs : dict
            Additional arguments passed to the specific PLS method (e.g., 'ratio', 'nitermax').

        Returns
        -------
        numpy.ndarray
            The estimated baseline array, with the same shape as the input data.

        Raises
        ------
        ValueError
            If the specified `method` is not implemented.
        """
        pls_method = PLS(self.data)
        if method == 'BrPLS':
            return pls_method.BrPLS(l=l, **kwargs)
        else:
            raise ValueError(f"Method {method} not implemented.")

pls(method, l, **kwargs)

Executes a Penalized Least Squares (PLS) baseline estimation.

Parameters:

Name Type Description Default
method str

The specific PLS algorithm to use. Currently supports 'BrPLS'.

required
l float

The smoothness parameter (lambda). Higher values result in a stiffer, smoother baseline. Typical values range from 10^5 to 10^8 for Schottky spectra.

required
**kwargs dict

Additional arguments passed to the specific PLS method (e.g., 'ratio', 'nitermax').

{}

Returns:

Type Description
ndarray

The estimated baseline array, with the same shape as the input data.

Raises:

Type Description
ValueError

If the specified method is not implemented.

Source code in src/rionid/baseline.py
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
def pls(self, method, l, **kwargs):
    """
    Executes a Penalized Least Squares (PLS) baseline estimation.

    Parameters
    ----------
    method : str
        The specific PLS algorithm to use. Currently supports 'BrPLS'.
    l : float
        The smoothness parameter (lambda). Higher values result in a stiffer,
        smoother baseline. Typical values range from 10^5 to 10^8 for Schottky spectra.
    **kwargs : dict
        Additional arguments passed to the specific PLS method (e.g., 'ratio', 'nitermax').

    Returns
    -------
    numpy.ndarray
        The estimated baseline array, with the same shape as the input data.

    Raises
    ------
    ValueError
        If the specified `method` is not implemented.
    """
    pls_method = PLS(self.data)
    if method == 'BrPLS':
        return pls_method.BrPLS(l=l, **kwargs)
    else:
        raise ValueError(f"Method {method} not implemented.")

PLS

Bases: object

Implementation of Penalized Least Squares (PLS) algorithms.

Parameters:

Name Type Description Default
data array - like

The input 1D spectrum array.

required
Source code in src/rionid/baseline.py
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
class PLS(object):
    """
    Implementation of Penalized Least Squares (PLS) algorithms.

    Parameters
    ----------
    data : array-like
        The input 1D spectrum array.
    """
    def __init__(self, data):
        self.data = data

    def BrPLS(self, l, ratio=1e-6, nitermax=50):
        """
        Bayesian reweighted Penalized Least Squares (BrPLS).

        This algorithm estimates the baseline by iteratively reweighting the data points.
        It assumes that the baseline is smooth and that peaks are positive deviations
        from this baseline. It uses a Bayesian approach to update weights based on
        the probability that a point belongs to the background noise versus a peak.

        Reference: Q. Wang et al., NUCL SCI TECH, 33: 148 (2022).

        Parameters
        ----------
        l : float
            Smoothness parameter (lambda). Controls the trade-off between fidelity to
            the data and smoothness of the baseline.
        ratio : float, optional
            Convergence threshold. The iteration stops when the relative change in
            the baseline vector is less than this value. Default is 1e-6.
        nitermax : int, optional
            Maximum number of iterations. Default is 50.

        Returns
        -------
        numpy.ndarray
            The calculated baseline array.
        """
        L, beta = len(self.data), 0.5

        # Construct the difference matrix (2nd order derivative approximation)
        D = sparse.diags([1,-2,1], [0,-1,-2], shape=(L, L-2))
        D = l * D.dot(D.transpose())

        w, z = np.ones(L), self.data.copy()

        for i in range(nitermax):
            # Solve the linear system (W + D)z = Wy
            W = sparse.spdiags(w, 0, L, L)
            Z = W + D
            zt = spsolve(Z, w*self.data)

            # Calculate residuals
            d = self.data - zt

            # Separate positive and negative residuals to estimate noise statistics
            d_pos = d[d > 0]
            d_neg = d[d < 0]

            # Robust mean and sigma estimation (avoid division by zero)
            d_m = np.mean(d_pos) if len(d_pos) > 0 else 1e-6
            d_sigma = np.sqrt(np.mean(d_neg**2)) if len(d_neg) > 0 else 1e-6

            # Bayesian weight update formula using Error Function (erf)
            # This calculates the probability that a point is part of the baseline
            w = 1 / (1 + beta / (1 - beta) * np.sqrt(np.pi / 2) * d_sigma / d_m * 
                     (1 + sp.erf((d / d_sigma - d_sigma / d_m) / np.sqrt(2))) * 
                     np.exp((d / d_sigma - d_sigma / d_m)**2 / 2))

            # Check convergence of the baseline vector z
            if np.sqrt(np.sum((z - zt)**2) / np.sum(z**2)) < ratio: break
            z = zt

            # Update prior probability beta
            if np.abs(beta + np.mean(w) - 1.) < ratio: break
            beta = 1 - np.mean(w)

        return z

BrPLS(l, ratio=1e-06, nitermax=50)

Bayesian reweighted Penalized Least Squares (BrPLS).

This algorithm estimates the baseline by iteratively reweighting the data points. It assumes that the baseline is smooth and that peaks are positive deviations from this baseline. It uses a Bayesian approach to update weights based on the probability that a point belongs to the background noise versus a peak.

Reference: Q. Wang et al., NUCL SCI TECH, 33: 148 (2022).

Parameters:

Name Type Description Default
l float

Smoothness parameter (lambda). Controls the trade-off between fidelity to the data and smoothness of the baseline.

required
ratio float

Convergence threshold. The iteration stops when the relative change in the baseline vector is less than this value. Default is 1e-6.

1e-06
nitermax int

Maximum number of iterations. Default is 50.

50

Returns:

Type Description
ndarray

The calculated baseline array.

Source code in src/rionid/baseline.py
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
def BrPLS(self, l, ratio=1e-6, nitermax=50):
    """
    Bayesian reweighted Penalized Least Squares (BrPLS).

    This algorithm estimates the baseline by iteratively reweighting the data points.
    It assumes that the baseline is smooth and that peaks are positive deviations
    from this baseline. It uses a Bayesian approach to update weights based on
    the probability that a point belongs to the background noise versus a peak.

    Reference: Q. Wang et al., NUCL SCI TECH, 33: 148 (2022).

    Parameters
    ----------
    l : float
        Smoothness parameter (lambda). Controls the trade-off between fidelity to
        the data and smoothness of the baseline.
    ratio : float, optional
        Convergence threshold. The iteration stops when the relative change in
        the baseline vector is less than this value. Default is 1e-6.
    nitermax : int, optional
        Maximum number of iterations. Default is 50.

    Returns
    -------
    numpy.ndarray
        The calculated baseline array.
    """
    L, beta = len(self.data), 0.5

    # Construct the difference matrix (2nd order derivative approximation)
    D = sparse.diags([1,-2,1], [0,-1,-2], shape=(L, L-2))
    D = l * D.dot(D.transpose())

    w, z = np.ones(L), self.data.copy()

    for i in range(nitermax):
        # Solve the linear system (W + D)z = Wy
        W = sparse.spdiags(w, 0, L, L)
        Z = W + D
        zt = spsolve(Z, w*self.data)

        # Calculate residuals
        d = self.data - zt

        # Separate positive and negative residuals to estimate noise statistics
        d_pos = d[d > 0]
        d_neg = d[d < 0]

        # Robust mean and sigma estimation (avoid division by zero)
        d_m = np.mean(d_pos) if len(d_pos) > 0 else 1e-6
        d_sigma = np.sqrt(np.mean(d_neg**2)) if len(d_neg) > 0 else 1e-6

        # Bayesian weight update formula using Error Function (erf)
        # This calculates the probability that a point is part of the baseline
        w = 1 / (1 + beta / (1 - beta) * np.sqrt(np.pi / 2) * d_sigma / d_m * 
                 (1 + sp.erf((d / d_sigma - d_sigma / d_m) / np.sqrt(2))) * 
                 np.exp((d / d_sigma - d_sigma / d_m)**2 / 2))

        # Check convergence of the baseline vector z
        if np.sqrt(np.sum((z - zt)**2) / np.sum(z**2)) < ratio: break
        z = zt

        # Update prior probability beta
        if np.abs(beta + np.mean(w) - 1.) < ratio: break
        beta = 1 - np.mean(w)

    return z