The main objective of frequency analysis for annual series is to use probability distributions to relate the magnitude of extreme events to their probability of occurrence (Chow et al., 1988). Flood frequency analysis is the most important statistical technique for understanding the nature and magnitude of data observed over a long period in a river or precipitation system. Furthermore, the flood data are treated as random and are assumed to be independent and identically distributed. It is also assumed that the floods have not been affected by natural or man-made changes.
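The magnitude-probability relationship described above can be illustrated with a minimal sketch. It assumes a Gumbel distribution fitted by the method of moments, which is only one common choice and is not prescribed by the text; the annual peak values and all function names are hypothetical.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_moments(peaks):
    """Estimate Gumbel location and scale from the sample mean and standard deviation."""
    mean = statistics.fmean(peaks)
    std = statistics.stdev(peaks)  # sample standard deviation (n - 1 denominator)
    scale = std * math.sqrt(6) / math.pi
    loc = mean - EULER_GAMMA * scale
    return loc, scale


def gumbel_cdf(loc, scale, x):
    """Non-exceedance probability F(x) of the Gumbel distribution."""
    return math.exp(-math.exp(-(x - loc) / scale))


def gumbel_quantile(loc, scale, return_period):
    """Magnitude of the T-year event, i.e. the x with F(x) = 1 - 1/T."""
    non_exceedance = 1.0 - 1.0 / return_period
    return loc - scale * math.log(-math.log(non_exceedance))


# Hypothetical annual maximum discharges (m^3/s), assumed independent
# and identically distributed as stated in the text.
annual_peaks = [310.0, 455.0, 280.0, 520.0, 390.0, 610.0, 345.0, 470.0, 298.0, 560.0]

loc, scale = fit_gumbel_moments(annual_peaks)
q10 = gumbel_quantile(loc, scale, 10)    # 10-year flood estimate
q100 = gumbel_quantile(loc, scale, 100)  # 100-year flood estimate
print(f"10-year flood: {q10:.1f}, 100-year flood: {q100:.1f}")
```

The return period T and the annual exceedance probability p are linked by T = 1/p, so the 100-year flood is the magnitude exceeded with probability 0.01 in any given year under the fitted model.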
2.2 Frequency Analysis
Because of the importance of flood frequency analysis, many studies have examined its general aspects across various fields. Previous studies have presented research methods related to data structure, probability distributions, and parameter estimation. According to Rao and Hamed (2000), the history of frequency analysis began as early as 1958. To compute the probability of flood discharges, Snyder (1958) developed a method based on the rational method, time of concentration, and unit hydrograph interpretation. Benson (1959) found that the effect of channel slope on flood frequencies ranked second in importance to that of the drainage area. In 1969, Alexander et al. considered the problem of estimating the relationship between the magnitude and frequency of rare floods. In the same year, Kirby discussed the random occurrence of rare floods and the use of the Poisson distribution in flood frequency analysis. A year later, in 1970, White and Reich considered the relationship between flood data and watershed characteristics in small basins in Pennsylvania.
Subsequently, in 1979, Gerard and Karpuk developed a method of frequency analysis that incorporated historical water levels at a site into the probability analysis.
Frequency analysis developed considerably during the 1980s. In 1981, Reich and Renard emphasized the importance of visual interpretation of the observed flood series. In 1982, Crippen developed envelope curves for extreme flood events in the United States. In the same year, Kuczera (1982) found that regional estimates were preferable to estimates based on short record lengths and estimates which combined both location and regional information for large record lengths. Kuczera (1982) also developed an empirical Bayes approach to combine site-specific and regional information. In 1983, Kuczera derived the posterior distribution of the power normal distribution and of the T-year flood. Among other developments in the field, Tasker (1983) examined the effect of serial dependence on the reliability of the T-year event. Smith and Karr (1986) used a Cox regression model for flood frequency analysis, and Smith (1987) developed procedures for estimating the recurrence intervals of large floods. These models allowed time-varying exogenous information to be incorporated into flood frequency analysis.
In the early 1990s, efforts continued to develop at-site flood frequency analysis by seeking new distributions and estimation methods for reliable flood estimates. At the same time, many researchers realized that the lack of long data series limited progress in at-site flood frequency analysis. It was therefore necessary to look for other sources of information and to compare existing approaches rather than develop new ones (Potter, 1987; Bobee et al., 1993a). Regionalization may be the preferable way to improve flood estimates, and this was the direction that research on flood frequency analysis took in the 1990s.
In 1990, Adamowski and Feluch proposed a non-parametric flood frequency analysis method that can also use historical information. Three years later, Vogel et al. (1993) analyzed flood data from 383 sites in the southwestern United States in order to explore the suitability of different distributions for modeling flood frequencies. They used L-moment ratio diagrams to select the distributions.
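As a rough illustration of the quantities plotted on an L-moment ratio diagram, the sketch below computes sample L-moments and the ratios L-skewness and L-kurtosis from unbiased probability weighted moments. It is not the procedure of Vogel et al. (1993); the function name and the sample values are hypothetical.

```python
def sample_l_moment_ratios(data):
    """Return (l1, l2, t3, t4): L-location, L-scale, L-skewness, L-kurtosis."""
    x = sorted(data)
    n = len(x)
    if n < 4:
        raise ValueError("at least four observations are needed")

    # Unbiased probability weighted moments b0..b3 (i is the zero-based rank).
    b0 = sum(x) / n
    b1 = sum(i * x[i] for i in range(n)) / (n * (n - 1))
    b2 = sum(i * (i - 1) * x[i] for i in range(n)) / (n * (n - 1) * (n - 2))
    b3 = sum(i * (i - 1) * (i - 2) * x[i] for i in range(n)) / (
        n * (n - 1) * (n - 2) * (n - 3)
    )

    # L-moments as linear combinations of the probability weighted moments.
    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    l4 = 20 * b3 - 30 * b2 + 12 * b1 - b0
    if l2 == 0:
        raise ValueError("L-scale is zero; the ratios are undefined")
    return l1, l2, l3 / l2, l4 / l2


# Hypothetical annual maximum discharges (m^3/s) for one site.
site_peaks = [310.0, 455.0, 280.0, 520.0, 390.0, 610.0, 345.0, 470.0, 298.0, 560.0]
l1, l2, t3, t4 = sample_l_moment_ratios(site_peaks)
print(f"L-CV = {l2 / l1:.3f}, L-skewness = {t3:.3f}, L-kurtosis = {t4:.3f}")
```

On a ratio diagram, each site contributes one point (t3, t4), and the cloud of points is compared with the theoretical curves of candidate distributions.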
Multivariate extreme value distributions with mixed Gumbel distributions are suitable options to consider when performing flood frequency analysis (Escalante-Sandoval, 1998). Faulkner and Robson (1999) showed that floods in permeable drainage basins can be estimated.
In 2000, GIS, L-moment, and geostatistical methods were used in regional frequency analysis (Daviau et al., 2000). After this, several methods were used in flood frequency analysis. Rulli and Rosso (2002) built an integrated simulation method and made frequency predictions for flash-flood risk assessment by combining stochastic and deterministic methods. Meanwhile, Kjeldsen et al. (2002) conducted a regional flood frequency analysis using the index-flood method. Based on the index-flood method (Dalrymple, 1960), Javelle et al. (2002) developed regional flood duration-frequency (QdF) curves to describe the flood regime of a basin. Furthermore, Chokmani and Ouarda (2004) proposed a physiographical space-based kriging method for regional flood frequency estimation at ungauged sites.
In subsequent years, the development of different methods for flood frequency analysis continued. Ashkar and Mahdi (2003) investigated the generalized probability weighted moments (GPWM) and maximum likelihood (ML) fitting methods for the two-parameter log-logistic (LL) model. They concluded that, although the LL model is not one of the distributions frequently used in hydrology, it merits wider use in hydrological practice. Lee and Maeng (2003) compared and analyzed design floods for six Korean watersheds. They determined the appropriate order of LH-moments by testing the annual maximum flood data for homogeneity, independence, and outliers and by fitting an appropriate distribution to the data. The resulting order of LH-moments was then used to derive appropriate design floods. At the same time, Cunderlik and Burn (2003) introduced a second-order non-stationary approach to pooled flood frequency analysis. In addition, Fowler and Kilsby (2003) carried out a regional frequency analysis of United Kingdom extreme rainfall from 1961 to 2000.
In 2004, a new model for flood frequency analysis based on the extended three-parameter Burr XII distribution was proposed and demonstrated using data from China (Shao et al., 2004). Regional flood frequency analysis can establish a regionalized relationship, which can then be used to estimate flood magnitudes for ungauged and poorly gauged catchments (Jingyi and Hall, 2004).
Subsequently, a study was conducted to update the regional rainfall frequency analysis for the state of Michigan, because rainfall events had been observed to occur more frequently than expected (Trefry et al., 2005). Historical and pooled data were used as part of a multi-method approach to evaluate flood risk (Macdonald et al., 2006).
In the following year, Kysely and Picek (2007a) used regional analysis to improve estimates of the probabilities of extreme events in the Czech Republic. In the same year, Kysely and Picek (2007b) derived probability estimates of heavy precipitation events in a flood-prone central European region with enhanced influence of Mediterranean cyclones. Regional frequency analysis was used to characterize the severity of a flash-flood-generating storm that occurred in Italy and to analyze short-duration annual maximum precipitation (Norbiato et al., 2007). Meanwhile, Ribatet et al. (2007) proposed a regional Bayesian POT model for flood frequency analysis at sites with short record lengths, and Gaal et al. (2007) used the region-of-influence (ROI) approach for the frequency analysis of heavy precipitation data in Slovakia.
Gaal et al. (2008) conducted a study to estimate the design precipitation in the High Tatras region of Slovakia using flood frequency analysis. A region in northern Mexico was selected to apply the multivariate extension of the logistic model with generalized extreme value (GEV) marginals in order to obtain regional at-site flood estimates (Sandoval, 2008). A combination of self-organizing feature maps and fuzzy clustering was used in regional flood frequency analysis (Srinivas et al., 2008).
Regional floods were estimated using copulas and a multivariate version of quantiles, with a focus on the bivariate case (Chebana, F., & Ouarda, T. B. (2009). Index flood-based multivariate regional frequency analysis. Water Resources Research, 45(10).).
Using the Generalized Additive Models for Location, Scale, and Shape (GAMLSS), a tool for modeling time series under nonstationary conditions, a flood frequency model was developed for the Little Sugar Creek watershed (Villarini, G., Smith, J. A., Serinaldi, F., Bales, J., Bates, P. D., & Krajewski, W. F. (2009). Flood frequency analysis for nonstationary annual peak records in an urban drainage basin. Advances in Water Resources, 32(8), 1255-1266.).
Flood frequency analysis was applied to several Mediterranean catchments using a set of systematic data and a set of historical floods (Neppel, L., Renard, B., Lang, M., Ayral, P. A., Coeur, D., Gaume, E., … & Vinet, F. (2010). Flood frequency analysis using historical data: accounting for random and systematic errors. Hydrological Sciences Journal-Journal des Sciences Hydrologiques, 55(2), 192-208.). Lang et al. (2010) extrapolated rating curves using hydraulic modeling and carried out flood frequency analysis within a Bayesian framework that included a multiplicative error (Lang, M., Pobanz, K., Renard, B., Renouf, E., & Sauquet, E. (2010). Extrapolation of rating curves by hydraulic modelling, with application to flood frequency analysis. Hydrological Sciences Journal-Journal des Sciences Hydrologiques, 55(6), 883-898.). Within a regional analysis, the effects of the discordancy detection measure on regional flood probability types and on the accuracy of the estimates were assessed, and the effect of outliers on the identification of regional probability distributions was analyzed (Saf, B. (2010). Assessment of the effects of discordant sites on regional flood frequency analysis. Journal of Hydrology, 380(3-4), 362-375.). A Bayesian approach was used to estimate the parameters of hydrological models with covariates (Ouarda, T. B. M. J., & El-Adlouni, S. (2011). Bayesian nonstationary frequency analysis of hydrological variables. JAWRA Journal of the American Water Resources Association, 47(3), 496-505.). Gilroy and McCuen developed a nonstationary flood frequency analysis method and applied it to adjust for future climate change and urbanization for the Little Patuxent River in Guilford, Maryland (Gilroy, K. L., & McCuen, R. H. (2012). A nonstationary flood frequency analysis method to adjust for future climate change and urbanization. Journal of Hydrology, 414, 40-48.).
A frequency analysis of flood characteristics, including annual peak flow, flood volume, and flood duration, was carried out by applying copulas to model the joint dependence structure of these variables for Upper Godavari River flows in India (Reddy, M. J., & Ganguli, P. (2012). Bivariate flood frequency analysis of Upper Godavari River flows using Archimedean copulas. Water Resources Management, 26(14), 3995-4018.). Two approaches to flood frequency analysis based on generalized additive models for location, scale and shape (GAMLSS) were applied to model the non-stationary annual maximum flood records of 20 continental Spanish rivers (López, J., & Francés, F. (2013). Non-stationary flood frequency analysis in continental Spanish rivers, using climate and reservoir indices as external covariates. In Hydrology and Earth System Sciences Discussions (Vol. 17, No. 8, pp. 3103-3142). European Geosciences Union (EGU).). Three estimation methods were used to estimate the parameters of fifteen different probability distributions based on relatively long records of streamflow data at each site (Rahman, A. S., Rahman, A., Zaman, M. A., Haddad, K., Ahsan, A., & Imteaz, M. (2013). A study on selection of probability distributions for at-site flood frequency analysis in Australia. Natural Hazards, 69(3), 1803-1813.).
Aziz et al. (2014) developed and tested a regional flood frequency analysis (RFFA) model based on artificial neural networks (ANN), which can provide flood quantile estimates that are more accurate than those of the traditional quantile regression technique (QRT). This study used an extensive Australian database (Aziz, K., Rahman, A., Fang, G., & Shrestha, S. (2014). Application of artificial neural networks in regional flood frequency analysis: a case study for Australia. Stochastic Environmental Research and Risk Assessment, 28(3), 541-554.). In the same year, Bezak et al. (2014) compared the annual maximum (AM) and peaks-over-threshold (POT) series for flood frequency analysis using three different parameter estimation techniques and six different probability distributions (Bezak, N., Brilly, M., & Šraj, M. (2014). Comparison between the peaks-over-threshold method and the annual maximum method for flood frequency analysis. Hydrological Sciences Journal, 59(5), 959-977.). Kochanek et al. (2014) presented the results of a national comparison of the main flood frequency analysis methods, applied to daily flow data from more than 1000 gauging stations in France, using local, regional, and local-regional estimation of Gumbel and Generalized Extreme Value (GEV) distributions (Kochanek, K., Renard, B., Arnaud, P., Aubert, Y., Lang, M., Cipriani, T., & Sauquet, E. (2014). A data-based comparison of flood frequency analysis methods used in France. Natural Hazards and Earth System Sciences, 14, p-295.).
The next year, a hybrid-clustering approach was used in conjunction with a flood-index methodology for flood frequency analysis to provide regionalized discharge estimates using a global database (Smith, A., Sampson, C., & Bates, P. (2015). Regional flood frequency analysis at the global scale. Water Resources Research, 51(1), 539-553.). Yan and Moradkhani (2015) suggested a regional Bayesian hierarchical model as an alternative to traditional regional flood frequency analysis. They also used L-moments theory to select the best-fit probability distribution, based on data collected from the Willamette River Basin in the Pacific Northwest, U.S. (