SIMULTANEOUS FITS IN ISIS ON THE EXAMPLE OF GRO J1008–57

Parallel computing and steadily increasing computation speed have led to a new tool for analyzing multiple datasets and datatypes: fitting several datasets simultaneously. With this technique, physically connected parameters of individual data can be treated as a single parameter by implementing this connection into the fit directly. We discuss the terminology, implementation, and possible issues of simultaneous fits based on the X-ray data analysis tool Interactive Spectral Interpretation System (ISIS). While all data modeling tools in X-ray astronomy allow in principle fitting data from multiple data sets individually, the syntax used in these tools is often not well suited for this task.


[...] with unprecedented precision.

Motivation

Most data analysis in X-ray astronomy concentrates on describing single datasets or on characterizing samples using the results of fits to individual datasets. Once a good description of an example dataset is found, the analysis of comparable datasets follows. Finally, the results of all those individual analyses are compared and interpreted.

For instance, a particular parameter is found to depend on other parameters. Instead of going back to the data analysis and fitting this dependency directly to enhance the parameter precision or break degeneracies (feasible through reduced degrees of freedom), the dependency is then analyzed on its own. Alternatively, the former analysis is repeated with this parameter fixed according to the discovered dependency. Furthermore, if parameters cannot be constrained well, it is common to keep those parameters fixed to a certain standard value.

Thus, one cannot gain any physical information from this fixed parameter and, more importantly, systematic effects might arise. The reason for not following more sophisticated approaches is usually a lack of computing power. Implementing parameter correlations or dependencies would require one to analyze all data at the same time. However, since computing power has increased and parallel computation using several computers is possible, this situation has changed. In other words, fitting data simultaneously has become feasible even when large numbers of datasets (e.g., 50-100 pointings at a single source) are to be considered.

In Section 2 we introduce an implementation of simultaneous data analysis into the Interactive Spectral Interpretation System [ISIS, 1], which has been "designed to facilitate the interpretation and analysis of high resolution X-ray spectra".¹ In Section 3, we present ideas for possible applications of simultaneous data analysis and further demonstrate the power of this method on the example of the transient X-ray binary GRO J1008-57 in Section 4. Finally, we discuss questions and issues which arise by comparing advantages and disadvantages of simultaneous fits (Table 1).


Implementation into ISIS

ISIS [1] was developed to fit X-ray spectra, but it can also be used to analyze nearly all kinds of data due to its strong customization capability [2] compared to, e.g., XSPEC [3,4]. For instance, user-defined fit functions, as well as complex correlations between data and models, can be implemented. However, functions to handle these correlations for a large number of parameters and datasets in an easy way are not yet available.

Before we describe the technical realization of simultaneous data analysis in ISIS, we introduce new notations used by the implemented functions.


Terminology

The parameters of a model which is fitted to data act either on all datasets loaded into ISIS or on an individual dataset. By defining parameters for each dataset and tying them to each other, parameters can be linked to multiple datasets, similar to the approach chosen, e.g., in XSPEC.

We call multiple datasets which should be fitted with the same set of parameters a datagroup. The corresponding parameters are called group parameters. Global parameters denote parameters which act on all datagroups.
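These definitions can be summarized in a single fit statistic. The following is only a sketch in a notation introduced for this illustration: the symbols D (data), M (model), σ (uncertainty), q_G (group parameters of datagroup G), and g (global parameters) are assumptions and are not taken from the ISIS implementation. A simultaneous χ² fit then minimizes the sum over all datagroups G, all datasets d within each group, and all data points i:

```latex
\chi^2_\mathrm{tot}\left(\vec{g}, \vec{q}_A, \vec{q}_B, \ldots\right)
  = \sum_{G \in \{A, B, \ldots\}} \; \sum_{d \in G} \; \sum_{i}
    \frac{\left[ D_{d,i} - M_{d,i}\left(\vec{q}_G, \vec{g}\right) \right]^2}
         {\sigma_{d,i}^2}
```

In this form, a group parameter enters only the terms of its own datagroup, whereas a global parameter enters every term of the sum.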

Figure 1 illustrates these definitions. In this example, simultaneous data from n detectors are available which can be described by the same parameters. These datasets define the datagroup A with p free parameters. Another group of data was recorded by m detectors. These datasets define an individual datagroup B with, again, p free parameters. During the analysis of both groups, however, it turns out that a specific parameter seems to be equal for both datagroups within the uncertainties. As a result, the two individual values for this parameter are tied to each other, resulting in a global parameter. This reduces the number of free parameters by one, and the remaining group parameters can be constrained better.

¹ http://space.mit.edu/CXC/ISIS/

Data and analysis functions

Since simultaneous fits can have large numbers of fit parameters connected by a complicated logic, we provide a collection of all functions necessary to initialize and perform simultaneous fits in ISIS.² The initialization of a simultaneous fit is done by

simfit = simultaneous_fit();

where simultaneous_fit returns a structure (Struct_Type), which has to be assigned to a variable, here simfit. The structure contains several functions and fields to handle simultaneous fits. The documentation of each function is available using the help qualifier. Some important functions are described in the following.


simfit.add_data(filenames);

This defines a datagroup and loads the spectra given by filenames, which must be an array of strings. The function also allows data other than spectra to be loaded or defined.

simfit.fit_fun(model);

The string model defines the fit function to be used for all datasets. Here, the placeholder % can be used instead of a component instance. In this case, individual group parameters are applied to each datagroup automatically.


simfit.set_par_fun(parameter, function);

This is probably one of the most useful functions. Like the ISIS intrinsic function, it determines the value of the parameter by the given function. The % placeholder can be used within the string parameter to apply the function to the corresponding parameter of each datagroup. However, the function may contain other parameters or even a single parameter name as well. In the latter case, if the function is also applied to all datagroups (using the %), the single parameter is treated as a global parameter from now on.
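The placeholder and parameter-function mechanism can be illustrated outside of ISIS. The following Python sketch is not the ISIS implementation: the helper names expand_placeholder and resolve, the parameter names such as model(1).efold, and the convention of replacing % by a group ID are all hypothetical and serve only to show how tying one parameter of every datagroup to a single parameter turns the latter into a global parameter.

```python
def expand_placeholder(name, group_ids):
    """Return one parameter name per datagroup by replacing '%' with the group ID."""
    return [name.replace("%", str(group)) for group in group_ids]


def resolve(free_values, parameter_functions):
    """Evaluate all parameter functions on top of the free parameter values."""
    values = dict(free_values)
    for name, function in parameter_functions.items():
        values[name] = function(free_values)
    return values


group_ids = [1, 2, 3]

# Free parameters (hypothetical names): one shared folding energy
# and one flux per datagroup.
free_values = {"efold": 15.0}
for group in group_ids:
    free_values[f"model({group}).flux"] = 1.0 * group

# Tie the folding energy of every datagroup to the single parameter "efold",
# which thereby acts as a global parameter.
parameter_functions = {
    name: (lambda values: values["efold"])
    for name in expand_placeholder("model(%).efold", group_ids)
}

values = resolve(free_values, parameter_functions)
print(values["model(2).efold"])
```

Changing free_values["efold"] and calling resolve again updates the folding energy of all three datagroups at once, which is the behavior expected of a global parameter.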

Because a simultaneous fit results in a large number of parameters, a single call to a fit routine (fit_counts) will take a long time. In the example of the previous section, the final model fitted to the data consists of (n+m)×p parameters, of which only 2p − 1 are free. To reduce the runtime of a fit, three fit routines are implemented within the simultaneous fit structure.
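The parameter count quoted above can be checked with a few lines of bookkeeping. This Python sketch is only an illustration under assumed values (n = 3, m = 2, p = 4, and parameter index 0 chosen as the global one); the function build_parameter_table and its data layout are hypothetical and do not mirror the internal structures of ISIS.

```python
def build_parameter_table(groups, p, global_indices):
    """Map every per-dataset parameter to the free parameter that drives it.

    groups: dict mapping a datagroup name to its number of datasets.
    p: number of model parameters per dataset.
    global_indices: indices of the parameters shared by all datagroups.
    """
    table = {}
    for group, n_datasets in groups.items():
        for dataset in range(n_datasets):
            for k in range(p):
                # Global parameters have one owner for all groups,
                # group parameters have one owner per datagroup.
                owner = "global" if k in global_indices else group
                table[(group, dataset, k)] = (owner, k)
    return table


groups = {"A": 3, "B": 2}  # n = 3 detectors in group A, m = 2 in group B
p = 4
table = build_parameter_table(groups, p, global_indices={0})

n_total = len(table)               # (n + m) * p model parameters
n_free = len(set(table.values()))  # 2p - 1 free parameters
print(n_total, n_free)
```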


simfit.fit_groups(groupID);

Instead of performing a χ² minimization of all parameters and datasets, this function loops over all datagroups and fits only the associated parameters (group parameters). If a group is specified by the optional groupID, then only the group parameters of this particular group are fitted.


simfit.fit_global();

Instead of fitting the group parameters, this function fits the global parameters only. Since all defined datagroups have to be taken into account, the fit usually takes longer than fitting the group parameters.
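The division of labor between fitting group parameters and fitting global parameters can be illustrated with a toy model. The following Python sketch is not the ISIS code: it assumes a straight line with one offset per datagroup (group parameter) and a common slope (global parameter), unit uncertainties, and noise-free data. Alternating the two steps until convergence is an assumption of this sketch, not a documented behavior of the routines described above.

```python
def fit_groups(data, offsets, slope):
    """Update each group's offset with the global slope held fixed."""
    for name, (xs, ys) in data.items():
        # Least-squares solution for a constant: the mean residual of the group.
        offsets[name] = sum(y - slope * x for x, y in zip(xs, ys)) / len(xs)


def fit_global(data, offsets):
    """Return the best global slope with all group offsets held fixed."""
    numerator = sum(
        x * (y - offsets[name])
        for name, (xs, ys) in data.items()
        for x, y in zip(xs, ys)
    )
    denominator = sum(x * x for xs, _ in data.values() for x in xs)
    return numerator / denominator


def chi2(data, offsets, slope):
    """Sum of squared residuals over all datagroups (unit uncertainties)."""
    return sum(
        (y - offsets[name] - slope * x) ** 2
        for name, (xs, ys) in data.items()
        for x, y in zip(xs, ys)
    )


# Two datagroups with different offsets (1.0 and -0.5) but the same slope (2.0).
xs_a = [0.0, 1.0, 2.0, 3.0, 4.0]
xs_b = [1.0, 2.0, 3.0, 4.0, 5.0]
data = {
    "A": (xs_a, [1.0 + 2.0 * x for x in xs_a]),
    "B": (xs_b, [-0.5 + 2.0 * x for x in xs_b]),
}

offsets = {"A": 0.0, "B": 0.0}
slope = 0.0
for _ in range(200):
    fit_groups(data, offsets, slope)   # cheap: each group is fitted on its own
    slope = fit_global(data, offsets)  # needs the data of all groups

print(offsets, slope, chi2(data, offsets, slope))
```

The group step touches only one datagroup at a time, while the global step has to loop over all data, which mirrors the runtime difference noted above.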


Uncertainty calculations

As already mentioned, the runtime of simultaneous fits is increased compared to fitting a single dataset only. Thus, uncertainty calculations of parameters, where a certain parameter's range has to be found corresponding to a change in χ², will be affected dramatically by the high runtime. Furthermore, it is necessary to distinguish between group and global parameters. We recommend computing the uncertainty intervals for each parameter on a different machine by, e.g., using [5] or mpi_fit_pars and the SLmpi module.³ We compared the runtime of a parallel uncertainty calculation in ISIS with a serial approach in XSPEC. Estimating the uncertainties of 10 parameters in parallel (i.e., on 10 cores) is faster by almost a factor of 3 (21 ks vs. 60 ks). Additionally, the calculations in ISIS resulted in a better χ²_red because the parameter ranges being scanned are larger in the parallel approach.

Group parameters depend on a single datagroup only. As a consequence, all other datagroups and therefore all other group parameters can be ignored during the uncertainty calculation. Unfortunately, that is not the case for global parameters. During the analysis of GRO J1008-57 (see Section 4), the uncertainties of the global parameters have been calculated by mapping the χ² landscape of each global parameter with individual fits. Afterwards, every landscape has been interpolated to find the ∆χ² value of interest (e.g., ∆χ² = 2.71 corresponding to the 90% confidence level). In this way the runtime of an uncertainty calculation of a single global parameter could be reduced significantly. Note that depending on the model and amount of data, such a computation can take up to several days.
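The interpolation step can be sketched as follows. This Python example is an illustration under stated assumptions, not the procedure used for GRO J1008-57: the χ² landscape is a hypothetical parabola (best fit 10.0, 1σ width 0.5, minimum χ² of 50.0), the grid spacing is arbitrary, and linear interpolation between neighboring grid points is assumed because the interpolation scheme is not specified in the text.

```python
def crossing(v0, c0, v1, c1, target):
    """Linearly interpolate the parameter value at which chi2 equals target."""
    return v0 + (target - c0) * (v1 - v0) / (c1 - c0)


def confidence_interval(values, chi2, delta=2.71):
    """Return (best value, lower bound, upper bound) for a chi2 landscape.

    A bound is None if the landscape never reaches chi2_min + delta on that side.
    """
    i_min = min(range(len(chi2)), key=chi2.__getitem__)
    target = chi2[i_min] + delta
    lower = upper = None
    # Walk from the minimum towards smaller values until chi2 reaches the target.
    for i in range(i_min, 0, -1):
        if chi2[i - 1] >= target:
            lower = crossing(values[i], chi2[i], values[i - 1], chi2[i - 1], target)
            break
    # Walk from the minimum towards larger values until chi2 reaches the target.
    for i in range(i_min, len(values) - 1):
        if chi2[i + 1] >= target:
            upper = crossing(values[i], chi2[i], values[i + 1], chi2[i + 1], target)
            break
    return values[i_min], lower, upper


# Hypothetical landscape: each grid point stands for one individual fit
# with the global parameter frozen at that value.
grid = [8.0 + 0.05 * i for i in range(81)]
landscape = [50.0 + ((v - 10.0) / 0.5) ** 2 for v in grid]

best, lower, upper = confidence_interval(grid, landscape)
print(best, lower, upper)
```

Since every grid point is an independent fit, the points of such a landscape can be computed on different machines before the interpolation is done.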


Applications

There are various applications of simultaneous fits and data analysis. Besides using all available data to determine specific parameters which seem to be constant over time, more physical questions can be tackled. For instance, a physical property of the object of interest may result in multiple observables:

• the geometry of the accretion column in accreting neutron star X-ray binaries affects the line shape of cyclotron resonant scattering features (CRSF) [6] as well as the pulse profile shape (Falkner et al., in prep.).

Furthermore, instead of deriving physical properties from parameters after fits have been performed, these properties can be directly fitted to the data by implementing the dependency on the model parameters:

• the components in radio maps of jets in active galactic nuclei move with certain velocities. Assuming a constant velocity of the jet components, the velocity itself could be a global fit parameter [7].

• in the sub-critical accretion regime of neutron stars, the spectrum is believed to harden with increasing luminosity [8]. Any possible dependency between power-law shape and luminosity could be fitted simultaneously with multiple spectra.


The Example GRO J1008-57

As an example of a successful simultaneous fit we briefly summarize the results of our analysis of GRO J1008-57 using almost all available X-ray spectra and lightcurves. This transient high-mass X-ray binary consists of a neutron star orbiting a Be-type optical companion. For further details of the system as well as for the results of the analysis see [9] and references therein. Since such sources are only visible for a small fraction of their full orbit, it is challenging to obtain the orbital parameters of transient X-ray binaries by analyzing, e.g., pulse arrival times [10,11]. Thus, an observed shift in the orbital phase with respect to initial orbital parameters can be fitted with either a different orbital period or a different time of periastron passage. This leads to a parameter degeneracy, which can be visualized by a contour map of the χ² landscape of these parameters. The resulting contour map shows that both parameters are statistically degenerate (i.e., the elliptical contour lines are tilted).

The outburst times of the source are, however, clearly connected to the periastron passage. Performing a simultaneous fit of the pulse arrival times and the outburst times reduces the parameter degeneracy and results in much better constrained parameters (by about a factor of 2-3), as seen in the recalculated contour map.

Initial fits of the spectra of three outbursts of GRO J1008-57 in 2005, 2007 and 2011 with an absorbed cutoff power-law and an additional black body component showed that the folding energy E_fold, as well as the black body temperature kT, are independent of time within uncertainties. In particular, it seems that they do not change between different outbursts, i.e., these parameters are constant properties of the source.

Thus, those parameters are set as global parameters using simfit.set_par_fun and their values are determined by all available data. In addition, further parameters can be treated as global ones [see 9, for more details]. Finally, each observation is described by 3 group parameters only (approximately the number of degrees of freedom for each datagroup, since the global parameters contribute only marginally), which are the power-law flux F_PL, the black body flux F_BB, and the photon index Γ. The latter two strongly correlate with F_PL, but show no dependency on the outburst time or shape. This correlation can be fitted to describe the spectrum of GRO J1008-57 by only one parameter: the power-law flux F_PL. The fit is shown in Fig. 2 and its values are given in Section 4.2 of [9].

As already mentioned in Section 2.3, the runtime of uncertainty calculations of the global parameters is increased dramatically. In the case of this analysis, the χ² landscape produced by taking all 68 spectra into account was interpolated to estimate the uncertainties. The calculation for a single global parameter took ∼7 days on 100 CPUs (16320 CPUh).


Outlook

Although the simultaneous fits have already been applied successfully to real data (see Section 4 and [9]), the routines and functions are still under development. We recommend pulling the isisscripts Git repository² regularly to stay up to date.

There are, however, some caveats listed in Table 1 which one should be aware of (as with any routine, not just our ISIS implementation). In particular, the runtime still has to be reduced. One way to achieve this is by performing the fit on multiple CPUs, e.g., one CPU handles one dataset or datagroup. This has not been implemented yet because the dependencies of the datasets on each other require data exchange between the processes on the different machines. Additionally, the question of how to weight the data is currently under discussion. The weight depends on the number of datapoints available in each dataset (or datagroup) as well as their uncertainties, but what does this mean for its importance, i.e., its effect on the model parameters? These remaining issues have to be clarified and the respective solutions will be published in the future.



Table 1

advantages
• fixed parameters can be determined correctly
• complicated parameter correlations can be implemented and tested

disadvantages
• increased runtime of fits and uncertainty calculations
• large