free web stats
Kostia Zuev - Publications


ACM Lecture Notes
 

Statistical Inference
K.M. Zuev
[arXiv stat.AP 1603.04929]

Abstract:
What is Statistics? Opinions vary. In fact, there is a continuous spectrum of attitudes toward statistics ranging from pure theoreticians, proving asymptotic efficiency and searching for most powerful tests, to wild practitioners, blindly reporting p-values and claiming statistical significance for scientifically insignificant results. In these notes statistics is viewed as a branch of mathematical engineering, that studies ways of extracting reliable information from limited data for learning, prediction, and decision making in the presence of uncertainty. The main goals of these notes are: (1) provide a logical introduction to statistical inference, (2) develop statistical thinking and intuitive feel for the subject, and (3) introduce the most fundamental ideas, concepts, and methods of statistics, explain how and why they work, and when they don’t. These ACM lecture notes are based on the courses the author taught at the University of Southern California in 2012 and 2013, and at the California Institute of Technology in 2016.

Refereed Journal Publications
 

18. Transitional Annealed Adaptive Slice Sampling for Gaussian Process Hyper-Parameter Estimation
A. Garbuno-Iñigo, F.A. DiazDelaO, and K.M. Zuev
submitted
[arXiv stat.CO 1509.00349]
Abstract:
Surrogate models have become ubiquitous in science and engineering for their capability of emulating expensive computer codes, necessary to model and investigate complex phenomena. Bayesian emulators based on Gaussian processes adequately quantify the uncertainty that results from the cost of the original simulator, and thus the inability to evaluate it on the whole input space. However, it is common in the literature that only a partial Bayesian analysis is carried out, whereby the underlyig hyper-parameters are estimated via gradient-free optimisation or genetic algorithms, to name a few methods. On the other hand, maximum a posteriori (MAP) estimation could discard important regions of the hyper-parameter space. In this paper, we carry out a more complete Bayesian inference, through combining Slice Sampling with some recently developed Sequential Monte Carlo samplers. The resulting algorithm improves the mixing in the sampling through delayed-rejection, the inclusion of an annealing scheme akin to Asymptotically Independent Markov Sampling and parallelisation via Trainsitional Markov Chain Monte Carlo. Examples related to the estimation of hyper-parameters, as well as examples applied in other contexts of Bayesian inference, are presented. For the purpose of reproducibility, further development, and use in other applications, the code to generate the examples in this paper is freely available for download at this http URL.

17. Rare Event Simulation
J.L. Beck and K.M. Zuev
accepted for publication by the Springer Handbook on Uncertainty Quantification
[arXiv stat.CO 1508.05047]
Abstract:
Rare events are events that are expected to occur infrequently, or more technically, those that have low probabilities (say, order of $10^{-3}$ or less) of occurring according to a probability model. In the context of uncertainty quantification, the rare events often correspond to failure of systems designed for high reliability, meaning that the system performance fails to meet some design or operation specifications. As reviewed in this section, computation of such rare-event probabilities is challenging. Analytical solutions are usually not available for non-trivial problems and standard Monte Carlo simulation is computationally inefficient. Therefore, much research effort has focused on developing advanced stochastic simulation methods that are more efficient. In this section, we address the problem of estimating rare-event probabilities by Monte Carlo simulation, Importance Sampling and Subset Simulation for highly reliable dynamic systems.

16. Gaussian Process Hyper-Parameter Estimation using Parallel Asymptotically Independent Markov Sampling
A. Garbuno-Iñigo, F.A. DiazDelaO, and K.M. Zuev
Computational Statistics & Data Analysis, vol. 103, pp. 367-383, Jun. 2016.
[Web DOI | Paper pdf | arXiv stat.CO 1506.08010]
Abstract:
Gaussian process surrogates of computationally expensive computer codes provide fast statistical approximations to model physical processes. The training of these surrogates depends on the set of design points chosen to run the expensive simulator. However, such training set is bound to be limited and quantifying the resulting uncertainty in the hyper-parameters of the emulator by uni-modal distributions is likely to induce bias. This paper proposes a computationally efficient sampler based on an extension of the Asymptotically Independent Markov Sampling, a recently developed algorithm for Bayesian inference and extended to optimisation problems. Structural uncertainty of the emulator is obtained as a by-product of the Bayesian treatment of the hyper-parameters. Model uncertainty is also acknowledged through numerical stabilisation measures by including a nugget term in the formulation of the probability model. The efficiency of the proposed sampler is illustrated in examples where multi-modal distributions are encountered. For the purposes of reproducibility, further development, and use in other applications, we made the code used to generate the examples in this paper freely available for download at this https URL.

15. Hamiltonian Dynamics of Preferential Attachment
K.M. Zuev, F. Papadopoulos, and D. Krioukov
Journal of Physics A: Mathematical and Theoretical, vol. 49, art. num. 105001, Jan. 2016.
[Web DOI | Paper pdf | arXiv physics.soc-ph 1504.07981]
Abstract:
Prediction and control of network dynamics are grand-challenge problems in network science. The lack of understanding of fundamental laws driving the dynamics of networks is among the reasons why many practical problems of great significance remain unsolved for decades. Here we study the dynamics of networks evolving according to preferential attachment, known to approximate well the large-scale growth dynamics of a variety of real networks. We show that this dynamics is Hamiltonian, thus casting the study of complex networks dynamics to the powerful canonical formalism, in which the time evolution of a dynamical system is described by Hamilton's equations. We derive the explicit form of the Hamiltonian that governs network growth in preferential attachment. This Hamiltonian turns out to be nearly identical to graph energy in the configuration model, which shows that the ensemble of random graphs generated by preferential attachment is nearly identical to the ensemble of random graphs with scale-free degree distributions. In other words, preferential attachment generates nothing but random graphs with power-law degree distribution. The extension of the developed canonical formalism for network analysis to richer geometric network models with non-degenerate groups of symmetries may eventually lead to a system of equations describing network dynamics at small scales.

14. Subset Simulation Method for Rare Event Estimation: An Introduction
K.M. Zuev
In: Beer M. et. al. (Ed.) Encyclopedia of Earthquake Engineering, Springer Berlin Heidelberg, pp. 3671-3691, Oct. 2015.
[Web DOI | Paper pdf | arXiv stat.CO 1505.03506]
Abstract: Subset Simulation is an efficient and elegant method for simulating rare events and estimating the corresponding small tail probabilities. The method was originally developed by Siu-Kui Au and James Beck for estimation of structural reliability of complex civil engineering systems such as tall buildings and bridges at risk from earthquakes. The method turned out to be so powerful and general that over the last decade, Subset Simulation has been successfully applied to reliability problems in geotechnical, aerospace, fire, and nuclear engineering. This paper provides a detailed introductory description of Subset Simulation. A simple and intuitive derivation of the method is given along with the discussion on its implementation. The method is illustrated with several easy-to-understand examples. The reader is assumed to be familiar only with elementary probability theory and statistics.

13. Exponential Random Simplicial Complexes
K.M. Zuev, O. Eisenberg, and D. Krioukov
Journal of Physics A: Mathematical and Theoretical, vol. 48, art. num. 465002, Oct. 2015.
[Web DOI | Paper pdf | arXiv math.ST 1502.05032]
Abstract:
Exponential random graph models have attracted significant research attention over the past decades. These models are maximum-entropy ensembles subject to the constraints that the expected values of a set of graph observables are equal to given values. Here we extend these maximum-entropy ensembles to random simplicial complexes, which are more adequate and versatile constructions to model complex systems in many applications. We show that many random simplicial complex models considered in the literature can be casted as maximum-entropy ensembles under certain constraints. We introduce and analyze the most general random simplicial complex ensemble $\mathbf{\Delta}$ with statistically independent simplices. Our analysis is simplified by the observation that any distribution $\mathbb{P}(O)$ on any collection of objects $\mathcal{O}=\{O\}$, including graphs and simplicial complexes, is maximum-entropy subject to the constraint that the expected value of $-\ln \mathbb{P}(O)$ is equal to the entropy of the distribution. With the help of this observation, we prove that ensemble $\mathbf{\Delta}$ is maximum-entropy subject to the two types of constraints which fix the expected numbers of simplices and their boundaries.

12. Emergence of Soft Communities from Geometric Preferential Attachment
K.M. Zuev, D. Krioukov, M. Boguñá, G. Bianconi
Scientific Reports, vol. 5, art. num. 9421, Apr. 2015.
[Web DOI | Paper pdf |arXiv physics.soc-ph 1501.06835]
Abstract:
All real networks are different, but many have some structural properties in common. There seems to be no consensus on what the most common properties are, but scale-free degree distributions, strong clustering, and community structure are frequently mentioned without question. Surprisingly, there exists no simple generative mechanism explaining all the three properties at once in growing networks. Here we show how latent network geometry coupled with preferential attachment of nodes to this geometry fills this gap. We call this mechanism geometric preferential attachment (GPA), and validate it against the Internet. GPA gives rise to soft communities that provide a different perspective on the community structure in networks. The connections between GPA and cosmological models, including inflation, are also discussed.

11. General Network Reliability Problem and its Efficient Solution by Subset Simulation
K.M. Zuev, S. Wu, and J.L. Beck
Probabilistic Engineering Mechanics, vol. 40, pp. 25-35, Apr. 2015.
[Web DOI | Paper pdf]
Abstract: Complex technological networks designed for distribution of some resource or commodity are a pervasive feature of modern society. Moreover, the dependence of our society on modern technological networks constantly grows. As a result, there is an increasing demand for these networks to be highly reliable in delivering their service. As a consequence, there is a pressing need for efficient computational methods that can uantitatively assess the reliability of technological networks to enhance their design and operation in the presence of uncertainty in their future demand, supply and capacity. In this paper, we propose a stochastic framework for quantitative assessment of the reliability of network service, formulate a general network reliability problem within this framework, and then show how to calculate the service reliability using Subset Simulation, an efficient Markov chain Monte Carlo method, that was originally developed for estimating small failure probabilities of complex dynamic systems. The efficiency of the method is demonstrated with an illustrative example where two small-world network generation models are compared in terms of the reliability of the networks that they produce.

10. The Maximum Number of 3- and 4-Cliques within a Planar Maximally Filtered Graph
J. Birch, A.A. Pantelous, and K.M. Zuev
Physica A, vol. 417, pp 221-229, Jan. 2015.
[Web DOI | Paper pdf | arXiv math-ph 1507.02929]
Abstract: Planar Maximally Filtered Graphs (PMFG) are an important tool for filtering the most relevant information from correlation based networks such as stock market networks. One of the main characteristics of a PMFG is the number of its 3- and 4-cliques. Recently in a few high impact papers it was stated that, based on heuristic evidence, the maximum number of 3- and 4-cliques that can exist in a PMFG with n vertices is $3n-8$ and $n-4$ respectively. In this paper, we prove that this is indeed the case.

9. Global Optimization using the Asymptotically Independent Markov Sampling Method
K.M. Zuev and J.L. Beck
Computers & Structures, vol. 126, pp. 107-119, Sep. 2013.
[Web DOI | Paper pdf]
Abstract:
In this paper, we introduce a new efficient stochastic simulation method, AIMS-OPT, for approximating the set of globally optimal solutions when solving optimization problems such as optimal performance-based design problems. This method is based on Asymptotically Independent Markov Sampling (AIMS), a recently developed advanced simulation scheme originally proposed for Bayesian inference. This scheme combines importance sampling, Markov chain Monte Carlo simulation and annealing for efficient sampling from an arbitrary target distribution over a multi-dimensional space. Instead of a single approximation of the optimal solution, AIMS-OPT produces a set of nearly optimal solutions where the accuracy of the near-optimality is controlled by the user. Having a set of nearly optimal system designs, for example, can be advantageous in many practical cases such as when there exists a whole set of optimal designs or in multi-objective optimization where there is a Pareto optimal set. AIMS-OPT is also useful for efficient exploration of the global sensitivity of the objective function to the design parameters. The efficiency of AIMS-OPT is demonstrated with several examples which have different topologies of the optimal solution sets. Comparison is made with the results of applying Simulated Annealing, a well-known stochastic optimization algorithm, to the three two-dimensional problems.

8. Asymptotically Independent Markov Sampling: a new MCMC Scheme for Bayesian Inference
J.L. Beck and K.M. Zuev
International Journal for Uncertainty Quantification, vol. 3, num. 5, pp. 445-474, 2013.
[Web DOI | Paper pdf | arXiv stat.CO 1110.1880]
Abstract:
In Bayesian inference, many problems can be expressed as the evaluation of the expectation of an uncertain quantity of interest with respect to the posterior distribution based on relevant data. Standard Monte Carlo method is often not applicable because the encountered posterior distributions cannot be sampled directly. In this case, the most popular strategies are the importance sampling method, Markov chain Monte Carlo, and annealing. In this paper, we introduce a new scheme for Bayesian inference, called asymptotically independent Markov sampling (AIMS), which is based on the above methods. We derive important ergodic properties of AIMS. In particular, it is shown that, under certain conditions, the AIMS algorithm produces a uniformly ergodic Markov chain. The choice of the free parameters of the algorithm is discussed and recommendations are provided for this choice, both theoretically and heuristically based. The efficiency of AIMS is demonstrated with three numerical examples, which include both multimodal and higherdimensional target posterior distributions.

7. Bayesian Post-Processor and other Enhancements of Subset Simulation for Estimating Failure Probabilities
K.M. Zuev, J.L. Beck, S.K. Au, and L.S. Katafygiotis
Computers & Structures, vol. 92-93, pp. 283-296, Feb. 2012.
[Web DOI | Paper pdf | arXiv stat.CO 1110.3390]
Abstract:
Estimation of small failure probabilities is one of the most important and challenging computational problems in reliability engineering. The failure probability is usually given by an integral over a high-dimensional uncertain parameter space that is difficult to evaluate numerically. This paper focuses on enhancements to Subset Simulation (SS), proposed by Au and Beck, which provides an efficient algorithm based on MCMC (Markov chain Monte Carlo) simulation for computing small failure probabilities for general high-dimensional reliability problems. First, we analyze the Modified Metropolis algorithm (MMA), an MCMC technique, which is used in SS for sampling from high-dimensional conditional distributions. We present some observations on the optimal scaling of MMA, and develop an optimal scaling strategy for this algorithm when it is employed within SS. Next, we provide a theoretical basis for the optimal value of the conditional failure probability $p_0$, an important parameter one has to choose when using SS. Finally, a Bayesian post-processor SS+ for the original SS method is developed where the uncertain failure probability that one is estimating is modeled as a stochastic variable whose possible values belong to the unit interval. Simulated samples from SS are viewed as informative data relevant to the system's reliability. Instead of a single real number as an estimate, SS+ produces the posterior PDF of the failure probability, which takes into account both prior information and the information in the sampled data. This PDF quantifies the uncertainty in the value of the failure probability and it may be further used in risk analyses to incorporate this uncertainty. The relationship between the original SS and SS+ is also discussed.

6. Modified Metropolis-Hastings Algorithm with Delayed Rejection
K.M. Zuev and L.S. Katafygiotis
Probabilistic Engineering Mechanics, vol. 26, num. 3, pp. 405-412, Jul. 2011.
[Web DOI | Paper pdf]
Abstract:
The development of an efficient MCMC strategy for sampling from complex distributions is a difficult task that needs to be solved for calculating the small failure probabilities encountered in the high-dimensional reliability analysis of engineering systems. Usually different variations of the Metropolis–Hastings algorithm (MH) are used. However, the standard MH algorithm does not generally work in high dimensions, since it leads to very frequent repeated samples. In order to overcome this deficiency one can use the Modified Metropolis–Hastings algorithm (MMH) proposed in Au and Beck in 2001. Another variation of the MH algorithm, called the Metropolis–Hastings algorithm with delayed rejection (MHDR) has been proposed by Tierney and Mira in 1999. The key idea behind the MHDR algorithm is to reduce the correlation between states of the Markov chain. In this paper we combine the ideas of MMH and MHDR and propose a novel modification of the MH algorithm, called the Modified Metropolis–Hastings algorithm with delayed rejection (MMHDR). The efficiency of the new algorithm is demonstrated with a numerical example where MMHDR is used together with Subset simulation for computing small failure probabilities in high dimensions.

5. The Horseracing Simulation Algorithm for Evaluation of Small Failure Probabilities
K.M. Zuev and L.S. Katafygiotis
Probabilistic Engineering Mechanics, vol. 26, num. 2, pp. 157-164, Apr. 2011.
[Web DOI | Paper pdf]
Abstract:
Over the past decade, the civil engineering community has ever more realized the importance and perspective of reliability-based design optimization (RBDO). Since then several advanced stochastic simulation algorithms for computing small failure probabilities encountered in reliability analysis of engineering systems have been developed. In this paper we propose a novel advanced stochastic simulation algorithm for solving high-dimensional reliability problems, called Horseracing Simulation (HRS). The key idea behind HS is as follows. Although the reliability problem itself is high-dimensional, the limit-state function maps this high-dimensional parameter space into a one-dimensional real line. This mapping transforms a highdimensional random parameter vector, which may represent the stochastic input load as well as any uncertain structural parameters, into a random variable with unknown distribution, which represents the uncertain structural response. It turns out that the corresponding cumulative distribution function (CDF) of this random variable of interest can be accurately approximated by empirical CDFs constructed from specially designed samples. The generation of samples is governed by a process of ‘‘racing’’ towards the failure domain, hence the name of the algorithm. The accuracy and efficiency of the new method are demonstrated with a real-life wind engineering example.

4. A Formal Frobenius Theorem and Argument Shift
A.V. Bolsinov and K.M. Zuev
Mathematical Notes, vol. 86, num. 1-2, pp. 10-18, Aug. 2009.
[Web DOI | Paper pdf]
Matematicheskie Zametki,
vol. 86, num. 1, pp. 3-13, 2009.
[Web DOI | Paper pdf]
Abstract:
A formal Frobenius theorem, which is an analog of the classical integrability theorem for smooth distributions, is proved and applied to generalize the argument shift method of A. S. Mishchenko and A. T. Fomenko to finite-dimensional Lie algebras over any field of characteristic zero. A completeness criterion for a commutative set of polynomials constructed by the formal argument shift method is obtained.

3. Geometric Insight into the Challenges of Solving High-Dimensional Reliability Problems
L.S. Katafygiotis and K.M. Zuev
Probabilistic Engineering Mechanics, vol. 23, num. 2-3, pp. 208-218, Apr.-Jul. 2008.
[Web DOI | Paper pdf]
Abstract:
In this paper we adopt a geometric perspective to highlight the challenges associated with solving high-dimensional reliability problems. Adopting a geometric point of view we highlight and explain a range of results concerning the performance of several well-known reliability methods. We start by investigating geometric properties of the $N$-dimensional Gaussian space and the distribution of samples in such a space or in a subspace corresponding to a failure domain. Next, we discuss Importance Sampling (IS) in high dimensions. We provide a geometric understanding as to why IS generally does not work in high dimensions. We furthermore challenge the significance of “design point” when dealing with strongly nonlinear problems. We conclude by showing that for the general high-dimensional nonlinear reliability problems the selection of an appropriate fixed IS density is practically impossible. Next, we discuss the simulation of samples using Markov Chain Monte Carlo (MCMC) methods. Firstly, we provide a geometric explanation as to why the standard Metropolis–Hastings (MH) algorithm does “not work” in high-dimensions. We then explain why the modified Metropolis–Hastings (MMH) algorithm overcomes this problem. A study of the correlation of samples obtained using MMH as a function of different parameters follows. Such study leads to recommendations for fine-tuning the MMH algorithm. Finally, the MMH algorithm is compared with the MCMC algorithm proposed by Katafygiotis and Cheung in 2006 in terms of the correlation of samples they generate.

2. Spectrum of the Laplace-Beltrami Operator on Suspensions of Toric Automorphisms
K.M. Zuev
Sbornik: Mathematics, vol. 197, num. 9, pp. 1297-1308, 2006.
[Web DOI | Paper pdf]
Matematicheskii Sbornik,
vol. 197, num. 9, pp. 43-54, 2006.
[Web DOI | Paper pdf]
Abstract:
The spectrum and the eigenbasis of the Laplace–Beltrami operator on the suspensions of toric automorphisms are investigated. A description in terms of solutions of one-dimensional Schrödinger’s equation is presented.

1. On the case of Integrability of a Geodesic Flow on a Homogeneous Manifold
K.M. Zuev
Moscow University Mathematics Bulletin, num. 2, pp. 13-16, 2006.
[Paper pdf]

 

Refereed Conference Publications
 

12. Asymptotically Independent Markov Sampling: a new MCMC Scheme for Bayesian Inference
K.M. Zuev and J.L. Beck
2nd International Conference on Vulnerability and Risk Analysis and Management &
6th International Symposium on Uncertainty Modelling and Analysis ASCE-ICVRAM-ISUMA-2014
, Liverpool, UK, 2014.
[Paper pdf]

11. Rare Events in Complex Networks and their Efficient Estimation
K.M. Zuev, J.L. Beck, and S. Wu
SIAM Workshop on Network Science NS14, Chicago, IL, USA, 2014.
[Paper pdf]

10. Efficient Estimation of Complex Network Reliability
K.M. Zuev, S. Wu, and J.L. Beck
MIPT 56th annual scientific conference, Dolgoprudny, Russia, 2013.
[Paper pdf]

9. Network Reliability Problem and its Efficient Solution by Subset Simulation
K.M. Zuev, S. Wu, and J.L. Beck
11th International Conference on Structural Safety and Reliability ICOSSAR-2013, New York, NY, USA, 2013.
[Paper pdf]


8. Global Optimization for Performance-Based Design using the Asymptotically Independent Markov Sampling Method
K.M. Zuev and J.L. Beck
11th International Conference on Structural Safety and Reliability ICOSSAR-2013, New York, NY, USA, 2013.
[Paper pdf]


7. Bayesian Post-Processing for Subset Simulation for Decision Making under Risk
K.M. Zuev and L.J. Beck
Asian-Pacific Symposium on Structural Reliability and its Applications APSSRA-2012, Singapore, 2012.
[Paper pdf]


6. On the Optimal Scaling of the Modified Metropolis-Hastings algorithm
K.M. Zuev, J.L. Beck, and L.S. Katafygiotis
11th International Conference on Applications of Statistics and Probability ICASP-2011, Zurich, Switzerland, 2011.
[Paper pdf]


5. Modified Metropolis-Hastings Algorithm with Delayed Rejection for High-Dimensional Reliability Analysis
K.M. Zuev and L.S. Katafygiotis
Computational Methods in Structural Dynamics and Earthquake Engineering COMPDYN-2009, Rhodos, Greece, 2009.
[Paper pdf]


4. Horseracing Simulation Algorithm for Evaluation of Small Failure Probabilities
L.S. Katafygiotis and K.M. Zuev
Computational Methods in Structural Dynamics and Earthquake Engineering COMPDYN-2009, Rhodos, Greece, 2009.
[Paper pdf]


3. Modified Metropolis-Hastings Algorithm with Delayed Rejection
K.M. Zuev and L.S. Katafygiotis
Asian-Pacific Symposium on Structural Reliability and its Applications APSSRA-2008, Hong Kong, China, 2008.
[Paper pdf]


2. Estimation of Small Failure Probabilities in High Dimensions by Adaptive Linked Importance Sampling
K.M. Zuev and L.S. Katafygiotis
Computational Methods in Structural Dynamics and Earthquake Engineering COMPDYN-2007, Rethymno, Crete, Greece, 2007.
[Paper pdf]


1. Geometric Insight into the Challenges of Solving High-Dimensional Reliability Problems
L.S. Katafygiotis and K.M. Zuev
5th International Conference on Computational Stochastic Mechanics CSM-5, Rhodos, Greece, 2006.
[Paper pdf]



Miscellaneous

1. Discussion of paper by F. Miao and M. Ghosn
“Modified Subset Simulation method for reliability analysis of structural systems,” Structural Safety, 33:251–260, 2011

S.K. Au, J.L. Beck, K.M. Zuev, and L.S. Katafygiotis
Structural Safety, vol. 34, no. 1, pp. 379-380, Jan. 2012.
[Web DOI | Paper pdf]

Back to Top