C _{j } = (20) and . When the parameters are constrained to some known values e _{j} , C _{j} is a unit matrix . Each constraint defines a pdf of model parameters (Jackson and Matsu’ura, 1985), , (21) where N _{j} is the number of independent constraint equations, namely the rank of C _{j} , and ||_{D} represents the product of all nonzero eigen values. When there are N _{c } constraint matrices, the joint pdf is given by a product of (21) for j = 1, ..., N _{c} as, , (22) where A is a normalization factor that keeps the integration of (22) for m to unity. Hereafter we consider only one constraint matrix C _{1} , whose rank is equal to N _{m} , for simplicity. Following a Bayes’ theorem, a posterior pdf is obtained by multiplying (2221) and the likelihood function (13), (23) . (24) where, . For given and , the problem is formally identical to the maximum likelihood problem in the equations (13)-(15), and the solution is similarly obtained by maximizing the posterior pdf (23), in other words minimizing the least square residual Res(m ). The best estimates and the variance matrix of parameters are, . (25) . (26) To find the best estimates of and , we integrate (23) for the model parameters, and obtain the marginal likelihood as (27) where || represents the determinant. At the maximum of the marginal likelihood, the variance of data error is calculated as (Yabuki and Matsu’ura 1992), , (28) Following the definition of well-known Akaike’s information criterion (AIC), Akaike (1980) introduced Akaike’s Bayesian information criterion (ABIC) as , . (29) For the current problem, the expression of ABIC is given by (27) and (28) as, . (30) ABIC is a function of the hyperparameter , and its minimum is found numerically. When the constraint is too strong, that is is very large, and the second and the third terms of (30) almost cancel, but the least square residual Res(m _{0} ) in the first term is also large and so ABIC is. When the constraints are too weak, is very small, the residual is small, but the second term becomes large to make a large ABIC. Between these two limits, we can find a minimum of ABIC. In (30) defines the condition number of the matrix in the parenthesis of the third term. The tradeoff between the residual and the condition number of the matrix is essential for ABIC. Cohee and Beroza (1994) compared the residual and the condition number to find an appropriate value of weight of constraint. (3130) gives a statistical background for such a comparison. In other words, ABIC solves the well-known problem of tradeoff between the resolution and accuracy of the solution. Practically, finding the minimum of ABIC is not easy for general inversion problems including nonlinearity and non-Gaussian error distribution. However, at least in principle, we need not worry about this tradeoff by using Bayesian modeling and ABIC. We can use a combination of different constraints using (22) instead of (21). ABIC with two constraints is explained in Fukahata et al . (2004). Figure 6.4 is a typical example of application of ABIC for slip inversion, with prior information that slip distribution is smooth in space and time. The synthetic waveforms from an input slip model (Fig. 6.4(a)) contaminated with artificial noises are inverted for slip models (Figs. 6.4(b)-(f)) using different values of two hyper parameters in space and time, and , respectively. The hyper parameters (weights of constraints) change the roughness of the slip distribution and source time functions. If the input model is consistent with the prior information and smooth to some extent, the slip model based on the minimum ABIC (Fig. 6.4(b)) is similar to the input model. 6.3.3 Comparison in the Frequency Domain The final goal of slip inversion is to obtain a model that can explain seismic waves over a broad frequency band. Accurate strong-motion prediction requires such a broadband seismic source. However, most previous slip models are determined by mainly low frequency (long period) seismic waves. Not many studies discussed validity of their model in the frequency domain. There have been a few studies that solved inversion problems in the frequency domain, such as Olson and Anderson (1988) and Cotton and Campillo (1995). The study of Ji et al (2002a) using wavelet transform instead of Fourier transform is a mixed one in the time and frequency domains (wavelet domain). In this subsection, we discuss the nature of seismic waves at different frequencies and review studies that focused on the frequency contents of seismic waves. A good reference model for seismic wave spectra is the omega square model (Aki 1967; Brune 1970), in which the displacement spectrum is proportional to , (31) where is a corner angular frequency. High-frequency waves attenuate as squared frequency, and lower-frequency waves relative to the corner frequency dominate in displacement records. Since waveform inversion in the time domain gives a solution that explains waves of dominant frequencies, the inversion of displacement data tends to explain these low frequency waves. Since, velocity data have large amplitude around the corner frequency, the model is determined by waves abound this frequency. Similarly, acceleration data have implicit weights above the corner frequency. Naturally, records of seismometers with different instrumental responses (e.g. , Mendoza and Hartzell 1988a), or records with different bandpass filters (e.g. , Ide 1999), result in the apparently different slip distributions. This is one reason of difference between models for the same earthquake, because different studies use different type (displacement, velocity, or acceleration) of data as shown in Table 6.1. To obtain a slip model that is valid over a broad frequency band, we must apply some weighting schemes that weaken this omega square frequency dependence of seismic waves, which is difficult in time domain. Olson and Anderson (1988) proposed the first inversion in the frequency domain. They transformed time domain waves to frequency domain spectra and also converted Green’s functions and slip distribution to the frequency-wavenumber domain. They compared data and predictions at each frequency, which greatly simplifies the problem especially at low frequencies and small wavenumbers, preserving its linearity. For high frequencies and large wavenumbers, the number of parameters increases and some regularization is required. They proposed the minimum length regularization, but did not apply it to the real data. Mendez and Anderson (1991) applied essentially the same method to the 1985 Michoacan earthquake (Mw 7.9) with a different regularization that minimizes an error function based on the discretized version of a wave equation, which is minimized by a propagating rupture at a constant velocity. One shortcoming of the frequency-wavenumber domain transform of slip distribution is that positivity constraint is difficult to apply because the slip at a point is represented by the summation of slip functions at various frequencies and wavelengths. Cotton and Campillo (1995) compared Fourier coefficients of data and model prediction in the frequency domain but they modeled slip distribution in the time domain, in similar form to the nonlinear expression in Fig. 2b. They applied this method to the real data of the Landers earthquake and obtained a slip model which is consistent with other studies. Actually, their method is essentially the same as an ordinary time domain inversion that compares the time domain data d ^{0} and corresponding model predictions d ^{e} and minimizes the least square residual, . This value is an invariant for any orthonormal transformation T . (32) Since Fourier transform is a kind of orthonormal transformation, the comparison of Fourier coefficients in the frequency domain gives an identical residual to the one in the time domain. This property is also known as Parseval’s theorem. Nevertheless, the first evaluation of validity of slip model in the frequency domain was a significant result. Although Fourier transform is popular and its properties are well known, it has an aspect that is not easy to deal with. Fourier transform of a boxcar function in the time domain extends over infinite frequency band, and vice versa. In contrast, orthonormal wavelet transform is a transform between a finite time domain signal and a finite frequency domain spectrum. Ji et al . (2002a) adopted Meyer-Yamada wavelet transform to manifest contribution of finite frequency signal in slip inversion. Their method enables easy re-weighting in each frequency band in the wavelet domain. They applied this method to the Hector Mine earthquake (Mw, 7.1, Ji et al . 2002b). While constructive wave interference occurs at low frequencies, high-frequency seismic waves are incoherent and the wave energies, squared amplitudes, are summed up at observation point. Special methods have been proposed to invert high-frequency incoherent waves and find the origin of these waves. A class of slip inversion developed for high-frequency waves is the envelope inversion. Synthetic envelopes from a source model are calculated using isochrone approximation of Spudich and Frazer (1984) (Zeng et al . 1993), using empirical Green’s function (Hartzell et al . 1996; Kakehi and Irikura 1996, 1997; Kakehi et al . 1996), and based on the scattering theory of Sato et al . (1997) (Nakahara et al . 1998, 1999). In many cases, the sources of high-frequency waves seem to be located near the periphery of large slip, which suggests high-frequency wave generation due to rapid accelerations and decelerations of the rupture front (Madariaga 1977). However, many exceptions exist for this observation and there is no general relation between low-frequency slip distribution and high-frequency energy radiation. Apparently this is a field that requires much future investigation.6.4 EXAMPLE OF SLIP MODEL: THE 1999 CHI-CHI EARTHQUAKE The Chi-Chi, Taiwan, earthquake is an interplate thrust earthquake between the Eurasian and the Philippine Sea plates which occurred on September 21, 1999 (local time). Surface offsets appeared along an 85 km segment of the Chelungpu fault. The maximum offset of about 8 m is located near the northern end where fault trace becomes complex and changes the strike direction to the east (Figure 6.5). Aftershocks are distributed to the east of this surface trace, reflecting the east dipping fault plane. Digital strong-motion records are available from 441 stations, 60 of which are within 20 km from the surface trace, . These stations are maintained by the Taiwan Strong-Motion Instrumentation Program (Lee et al . 2001). Furthermore, coseismic static displacements at about 100 sites are available from GPS survey (Yu et al . 2001). They comprised the best data set in quality and quantity among all the earthquakes before 2005. Several groups determined spatioa spatio-temporal slip model using this data set. Chi et al . (2001) assumed one fault plane and applied a multi-time-window method to invert strong-motion velocities and surface offsets. Wu et al . (2001) used spatio-temporal basis functions and Bayesian modeling to jointly invert strong-motion velocities and GPS displacements. They assumed from one to three fault planes to explain the northern fault bend. Ma et al . (2001) used a multi-time-window method for strong-motion displacements, far-field displacements, and GPS data. They compared two analyses with and without a northern small fault striking to the northeast. The northern small subfaults assumed by Wu et al . (2001) and Ma et al . (2001) were necessary to explain the records at some stations close to the surface offsets, but they are almost independent of the rest of data. All three studies used linear parameterization and solved using NNLS. The large number of parameters from 4000 to 8000 requires regularization to get reasonable solutions. On the other hand, Zeng and Chen (2001) inverted strong-motion and GPS data to determine the distribution of slip, rupture time, and rise time in a nonlinear expression using a genetic algorithm. Their fault plane has a 3D geometry, an extension of surface trace to the down eastward. Ji et al . (2003) applied their wavelet transform inversion method with simulated annealing to resolve slip distribution on a 3D complex fault plane. Fig. 6.5 shows the exact locations of the assumed fault planes, final slip distributions, and propagation behaviors of rupture area of Chi et al . (2001), Ma et al . (2001), and Wu et al . (2001) (models C, M, and W respectively). Models M and W are the single fault models in these papers and the slip data of models C and M are taken from the supplemental CD of the original papers. Although the values of seismic moment are written as 2.7-4.7 x 10^{20} Nm in the original paper, all the seismic potencies calculated from the model parameters are within 1.0-1.15 x 10^{10} m^{3} , which is reasonable because the colors of final slip in Fig. 6.5 are apparently similar. The common features visible in these models are: 1) total potency and rupture area, 2) small slip around the hypocenter, 3) large and deep slip in the northern part, 4) small and shallow slip in the southern part, and 5) the maximum slip of about 20 m. The good agreement between models M and W is probably due to addition of GPS data. In model C, the slip area extending over the northern end of the surface trace is hard to explain. The above features from 1) to 5) are also found in the models of Zeng and Chen (2001) and Ji et al . (2003). The most prominent similarity among the slip histories of these models is the large slip at 40 km north of the hypocenter and 20-24 s after the hypocentral time. The rupture propagation velocity is less than 2 km/s, which is slower than the average shear wave velocity in the source area of 2.5-3.0 km/s. This is also shown by Zeng and Chen (2001) and Ji et al . (2003). However, there are considerable discrepancies such as the timing of slip in the southward. Slip amount within each period does not match well. In Fig. 6.5, the origin time of model M is shifted 2 s to match other models better. This difference is not small but it is not unrealistic because the strong-motion records used in this study do not have accurate timing information and researcher independently corrected them based on other information. When large amounts of data are available as the Chi-Chi earthquake, we observe similarities among models, which are certainly constrained by data regardless of slightly different model assumptions. A dense GPS network is especially useful to constrain spatial distribution of slip. The amount and timing of shallow slip cannot be so very different when the measurements of surface offsets and very near-fault strong-motion records are available. While very near-fault records are only sensitive to the local slip, a good coverage of seismic data is essential to reveal spatio-temporal rupture propagation. All these data are available for the Chi-Chi earthquake. Similarly, we can find common timing and spatial extent of large slip in the models of well-studied events with the sufficient data, such as the Landers, Northridge, Kobe, and Tottori earthquakes (Table 6.1). Nevertheless, we should also acknowledge that considerable differences exist in the details of slip model that is affected by slight difference of the way of analysis even with such good data set. The behavior of small slips, often including those in the initial stage of rupture, is difficult to resolvebe resolved.