.. _uncertainty_Chapter: ============================ Uncertainties ============================ No calibration is complete without a thorough understanding of the uncertainty in your results and where that uncertainty comes from. This section is a review of the types of the uncertainty present in calibrations, how to access them, and common mistakes made when calculating uncertainties. Types of Uncertainty ==================== Channel Uncertainty (Centroid Uncertainty) ------------------------------------------ Channel uncertainty comes from the inherent statistical uncertainty in the peak fitting process. A peak with a center at channel 1000 looks almost identical to a peak with a center at channel 1000.1, especially if you only saw a few photons from that peak. A channel uncertainty can be converted to a wavelength uncertainty using the function .. math:: u_{x,nm} = f_{cal}'(x) * u_{x,ch} Where :math:`u_{x,nm}` is the uncertainty from peak fitting in nm, :math:`u_{x,ch}` is the uncertainty in channels, :math:`f_{cal}(x)'` is the derivative of the calibration function **evaluated at x**. As an example, consider you know the calibration function is: .. math:: f_{cal}(x) = 4 nm + 1 * 10^{-6} \frac{nm}{{ch}^2} x ^ 2 Which has the derivative .. math:: f_{cal}(x) = 2 * {10^{-6}} \frac{nm}{{ch}^2} x A line at channel 500 with a channel uncertainty of +/- 1 nm would have an effective wavelength uncertainty of: .. math:: u_{x,nm} = 2 * {10^ {-6}} \frac{nm}{{ch}^2} 500 ch * 1 ch u_{x,nm} = 0.001 nm. Literature Wavelength Uncertainty --------------------------------- The calibration lines are not known exactly, and come with uncertainty values attached as well. We do not have to calculate these, as they are given in other publications. Systematic Uncertainty ---------------------- The calibration drifts over time due to thermal expansion of the diffraction grating, small movements of the CCD camera, and other factors. Systematic uncertainty attempts to capture the errors in calibrations that do come from statistical sources. For more information of how systematic uncertainty is estimated, see the section on :ref:`uncertainty:estimation of systematic uncertainty` Calibration Uncertainty ----------------------- Once a calibration has been completed, a scientist needs to know not only what the wavelength is at a certain point, but also *how confident they can be in that wavelength*. The calibration uncertainty represents the +/- at a predicted wavelength, and changes as channel number changes. In areas where you have many calibration lines, the calibration uncertainty is generally smaller. If there are no close calibration lines, you can expect higher calibration uncertainties. To obtain the 1 sigma (close to 68% confidence band) calibration uncertainties, use the :func:`~euv_fitting.calibrate.calibrators.Calibrator.confidence_bands` method after fitting has been completed. .. code-block:: python cb = calibrator.confidence_bands(n_sigma = 1) #or cb = calibrator.confidence_bands(n_sigma = 1) cb = calibrator.cb cb.shape == (2048,) `cb` will be an array of the calibration uncertainty at each channel. Examples and Best Practices =========================== One common task for users of euv_fitting is creating a calibration, applying it to another spectra, and then extracting the wavelength and wavelength uncertainty of peaks in that spectra. This example is a template for completing that process and mentions some mistakes to avoid along the way. First, create a calibration from a Neon and Background file. For more information, see the :ref:`calibration:calibration process` section. .. jupyter-execute:: from euv_fitting.calibrate.utils import SpeReader, CosmicRayFilter from euv_fitting.calibrate.calibrators import Distance_Calibrator CRF = CosmicRayFilter() Ne_spe = SpeReader('./example_data/Example_Neon.SPE') Bkg_spe = SpeReader('./example_data/Example_Background.SPE') Ne_img = CRF.apply(Ne_spe.load_img()) Bkg_img = CRF.apply(Bkg_spe.load_img()) Ne_cal = Distance_Calibrator(Ne_img, 'Ne', num_peaks = 25) #specify element and #peaks Bkg_cal = Distance_Calibrator(Bkg_img, 'Ba', num_peaks = 25) #specify element and #peaks Ne_cal.calibrate() Bkg_cal.calibrate() Ne_cal.join([Bkg_cal]) print('Total Solution Array') print(Ne_cal.solution_arr) Ne_cal.fit() Ne_cal.plot() Ne_cal.residual_plot() Ne_cal.print_info() Immediately, we can identify the: * channel uncertainty (fourth column of the solution array) * literature wavelength uncertainty (second column of the solution array) * systemtatic uncertainty (given in the print out) .. jupyter-execute:: from euv_fitting.calibrate.utils import MultiBatchGaussian import numpy as np import matplotlib.pyplot as plt #apply the calibration function to our channels to get wavelengths channels = np.linspace(0, 2047, 2048) wavelengths = Ne_cal.f3(channels, *Ne_cal.popt) Ir_spe = SpeReader('./example_data/Example_Neon.SPE') Ir_img = CRF.apply(Ir_spe.load_img()) #create multibatchgaussian with wavelength on the x-axis Ir_multi = MultiBatchGaussian(Ir_img, xs = wavelengths, num_peaks = 15) Ir_multi.fit() f, ax = plt.subplots() Ir_multi.plot_fit(ax = ax) ax.set_xlim([8, 10]) centroids = Ir_multi.get_centroids_ascending() print(centroids) Note that because we gave the wavelengths as the x-values when we created the MultiBatchGaussian, centroids (and their centroid uncertainties) are already in nm. The next step is to calculate the **total uncertainty** for each centroid. The total uncertainty has three parts 1. :math:`u_{x,nm}`, the centroid uncertainty 2. :math:`u_{cal}`, the calibration uncertainty 3. :math:`u_{sys}`, the systematic uncertainty. To calculate the total uncertainty, combine all parts in quadrature: .. math:: u_{tot} = \sqrt{{u_x,nm} ^ 2 + {u_cal} ^ 2 + {u_sys} ^ 2} :math:`u_{sys}` is easily gotten through the `cal.systematic_uncertainty` .. jupyter-execute:: sys_unc = Ne_cal.systematic_uncertainty The calibration uncertainty can be found by linearly interpolating the confidence band array. For example, if the calibration uncertainty at channel 1000 is 0.001 nm, but is 0.0012 nm at channel 1001, then a peak with a center at channel 1000.5 would have calibration uncertainty 0.0011 nm. Note that the calibration uncertainty generally changes slowly. .. jupyter-execute:: #the calibration uncertainty changes with wavelength, so we need to calculate it #for each centroid calibration_unc = [np.interp(centroid, wavelengths, Ne_cal.confidence_bands(n_sigma = 1)) \ for centroid in centroids[:, 0]] Finally, we can calculate the total uncertainty for each centroid using the formula above: .. jupyter-execute:: for centroid, centroid_uncertainty, cal_unc in zip(centroids[:, 0], centroids[:, 1], calibration_unc): total_unc = np.sqrt(centroid_uncertainty ** 2 + cal_unc ** 2 + sys_unc ** 2) print(f'{centroid:4.4f} +/- {total_unc:5.4f}') These are the values that should be published in a paper. They take into account not only the fit uncertainty, but also the calibration and systematic uncertainties. Estimation of Systematic Uncertainty ===================================== One measure of goodness-of-fit is the reduced chi-squared (A.K.A. :math:`\chi^2_{\nu}`), defined as: .. math:: \chi^2_{\nu} = \frac{1}{n - m} \sum{\frac{(y_{observed} - y_{calculated})^2}{\sigma ^ 2}} If there are no systematic errors in your experiment, then :math:`\chi^2_{\nu}` should be close to 1. Values significantly less than 1 suggest you are *overestimating* your uncertainties, while values greater than 1 suggest you are *underestimating* your uncertainties. To reduce the chance of publishing over confident uncertainties, euv_fitting adds a systematic uncertainty in quadrature with the channel and literature uncertainties if the :math:`\chi^2_{\nu}` is greather than 1. The systematic uncertainty is slowly increased until :math:`\chi^2_{\nu}` equals 1, as shown below. .. jupyter-execute:: :hide-code: sys_test = np.linspace(0, 0.005, 25) res = [] for sys in sys_test: chi_square = Ne_cal.eval_chi_square(sys, Ne_cal.f3, target = 0) res.append(chi_square) plt.plot(sys_test, res, 'xb') plt.hlines(1, np.min(sys_test), np.max(sys_test), 'r', '--') plt.xlabel('Systematic Uncertainty') plt.ylabel('Reduced Chi Squared') plt.title('Search for Best Systematic Uncertainty') plt.show()