Saturday, 30 November 2013

Weibull analysis

This first post in ComputSimu will explain the basic methodology to calculate Weibull distribution given a data set, in this case a data set provided by a three point bending test for specimens with circular cross section. The Weibull distribution is one of the most used distributions to model life data, due to its extreme flexibility to fit a wide range of data (Normal distribution or logarithmic distribution, for instance) and its applicability modelling different problems (weather forecasting, failure analysis, delivery times, etc...)

For creating the plot and calculate the Weibull distribution parameters we will use a python script (www.python.org). Python is widely extended as scripting language in computational science with an impressive support, which guarantees a constant number of new modules and improvements.
Schema:
  1. Three point bending test
  2. Weibull distribution
  3. Characteristic life
  4. Weibull distribution characteristics
  5. Extra: Gamma function
  6. Code

1) Three point bending test
This test consists in an applied force (F) acting at the center of a beam to assess the fracture resistance. Required formulas:

Beam deflection (double pinned contact):

$Moment(max) =\dfrac{F}{2}\dfrac{L}{2} = \dfrac{FL}{4}$ where \(L = span\) (mm) and \(F = load\) (N)
$Stress(max) = \dfrac{M_t}{\dfrac{I_p}{r}} = \dfrac{M_t}{W_p}$ where \(r = radius\) (mm); \(W_p = W_y = W_z = \dfrac{I_y}{r} = \dfrac{\dfrac{\pi r^4}{4}}{r} = \dfrac{\pi r^3}{4}\)
$Stress(max) = \dfrac{Moment(max)}{Resisting Bending Moment} = \dfrac{\dfrac{FL}{4}}{\dfrac{\pi r^3}{4}} = \dfrac{FL}{\pi r^3}$ (MPa)
We will use Numpy to import the data (there are many options, but this works pretty well). The completed code is on the bottom of the post, but I will explain quickly the most important parts. The first step is import the data into an array. These values are ordered ascendant, which is a must in order to calculate the failure probability, but you can also order your array once the values are imported. Example:

F(N)
333
440
440
...

 path = '../Data_Test.csv'
 aSamples = np.genfromtxt(path, delimiter=',', skip_header=1, dtype='f')
 
Now, we need two functions to create two new arrays with failure probabilities and the failure stresses. To calculate the probability we can choose one of the two methods below depending on the number of samples.

def FailureProb(n, N):
    if(N < 100):
        #Bernard's approximation
        return (n - 0.3) / (N + 0.4)
    else:
        #Mean Ranks
        return n / (N + 1)


def MaxStress(F, L, r):
    # Maximum Stress (cylinder 3 points)
    return (F * L) / (math.pi * r ** 3)

aSamples = np.genfromtxt(path, delimiter=',', skip_header=1, dtype='f')

aFailureProb = aSamples.copy()
aFailureStress = aSamples.copy()

for i in range(len(aFailureProb)):
    aFailureProb[i] = FailureProb((i + 1), float(len(aSamples) + 1))
    aFailureStress[i] = MaxStress(aSamples[i], L, r) 

2) Weibull distribution
The probability density function of a random variable is: \[f(x; \alpha, \beta) = \begin{cases} \frac{\alpha}{\beta} (\frac{x}{\alpha})^{(\alpha - 1)} e^{-(\frac{x}{\beta})^{\alpha}} & \text{if } x \geq 0 \\ 0 & \text{if } x < 0 \end{cases}\]
and the cumulative distribution function is: \[F(x; \alpha, \beta) = \begin{cases} 1-e^{-(\frac{x}{\beta})^{\alpha}} & \text{if } x \geq 0 \\ 0 & \text{if } x < 0 \end{cases}\]
Shape parameter ($\alpha$):
$\alpha$ shows how the failure rate develops in time. We can describe three different situations for the parameter $\alpha$:

  • $\alpha$ < 1: This first situation is characterized by a decreasing failure rate or hazard function.
  • $\alpha$ = 1: The second situation is characterized by a constant failure rate (independent of time), this is called lack of memory. In this situation Weibull distribution is identical to the exponential distribution.
  • $\alpha$ > 1: The third situation is characterized by an increasing failure rate. This represents that the more older is the "element" the more likely it is to fail.
Scale parameter or characteristic life ($\beta$):
$\beta$ is the time/cycle/sample at which 63,2% are expected to fail. This is a constant for all Weibull distribution, regardless the shape parameter.

Plot:

Script:
"""
Weibull:
    1) Probability density function
    2) Cumulative distribution function
"""

import numpy as np
import matplotlib.pyplot as plt
from weibull import weibProbDist, weibCumDist

x = np.linspace(0, 5, 100)
a_alpha = np.array([0.5, 1, 2, 5])
label = r'${\alpha} = %s, {\beta} = %s$'

fig = plt.figure('Weibull')
ax1 = fig.add_subplot(211)
ax2 = fig.add_subplot(212)
for i in a_alpha:
    ax1.plot(x, weibProbDist(x, i, 1), label=label % (i, 1))
    ax2.plot(x, weibCumDist(x, i, 1), label=label % (i, 1))
ax1.set_title('Probability density function', size=12)
ax2.set_title('Cumulative distribution function', size=12)
ax1.legend(loc=1, prop={'size': 10})
ax2.legend(loc=7, prop={'size': 10})

fig.subplots_adjust(hspace=0.5)
plt.show()

3) Characteristic life
To obtain the Weibull parameters the first step is linearise the cumulative distribution functions: $F(x) = 1-e^{-(\frac{x}{\beta})^{\alpha}} \Rightarrow P_f = 1-e^{-(\frac{\sigma}{\sigma_0})^{m}}$ where m = slope and \(\sigma\) = scale (characteristic stress)

$P_s = 1-P_f$
$ln(P_s^{-1}) = ln(e^{(\frac{\sigma}{\sigma_0})^{m}}) \Rightarrow ln[ln(P_s^{-1})] = m \enspace ln(\frac{\sigma}{\sigma_0}) \Rightarrow ln[ln(P_s^{-1})] = m \enspace ln(\sigma) - m \enspace ln(\sigma_0)$

Linear expression:
$\underbrace{ln[ln(P_s^{-1})]}_\text{f(x)} = \underbrace{ m \enspace ln(\sigma)}_\text{mx} - \underbrace{ m \enspace ln(\sigma_0)}_\text{b}$
Maximum likelihood estimation (MLE) - Linear regression:
$m (slope)= \dfrac{Cov(ln(\sigma), ln[ln(P_s^{-1})])}{Var(ln(\sigma))} = \dfrac{\dfrac{1}{n} \sum_{i=1}^{n}(ln(\sigma) - \overline{ln(\sigma)}) (ln[ln(P_s^{-1})] - \overline{ln[ln(P_s^{-1})])}}{\dfrac{1}{n} \sum_{i=1}^{n}(ln(\sigma) - \overline{ln(\sigma)})^2}$
$b (intercept)= \overline{ln[ln(P_s^{-1})])} - m \enspace \overline{ln(\sigma)}$
In order to solve this simple equation we use Scipy and its module called stats which apply the described methodology above.
 #Survival probability = 1 - Failure Probability
 aSurvivalProb = 1 - aFailureProb

 #Linear regression: parameters (m, b)
 aLnPs = np.log(- np.log(aSurvivalProb))
 aLnFS = np.log(aFailureStress / 1000)

 slope, intercept, r_square, s, std_error = stats.linregress(aLnFS, aLnPs)
 regressLine = slope * aLnPs + intercept

 #Characteristic life : Stress
 Stress = np.exp(- (intercept / slope)) * 1000
 Pr_Stress = weibCumDist(Stress, slope, Stress)
 
Finally, we can calculate the parameter $\beta$ with the next expression:
$\beta, \sigma = e^{-(\frac{intercept(b)}{slope(m)})}$

Plotting analysis results:
As described above, the final plot shows how a stress = 125,08 MPa is the characteristic stress whereby the failure probability is equal to 63,2%. The plots below try to be self-explanatory.




4) Weibull distribution characteristics
You will find all characteristic parameters in the Weibull module. These parameters are exported into a .csv file.
$Mean = \beta \enspace \Gamma(1+\dfrac{1}{\alpha})$
$Median = \beta \enspace \ln(2)^{(\frac{1}{\alpha})}$
$Mode = \begin{cases} \beta \enspace (\frac{\alpha - 1}{\alpha})^{\frac{1}{\alpha}} & \text{if } \alpha > 1 \\ 0 & \text{if } \alpha = 1 \end{cases}$
$Variance = \beta^2 \enspace \Gamma(1+\dfrac{2}{\alpha}) - \mu^2$
$Skewness = \beta^2[\Gamma(1+\dfrac{3}{\alpha})+2(\Gamma(1+\dfrac{1}{\alpha}))^3 - 3[\Gamma(1+\dfrac{2}{\beta})][\Gamma(1+\dfrac{1}{\beta})]]$
Skewness
  • $\text{if } \alpha < 3.6: Skewness > 0 $
  • $\text{if }\alpha = 3.6: Skewness = 0 $
  • $\text{if }\alpha > 3.6: Skewness < 0 $

5) Extra: Gamma function

The Weibull module also include one of the possible approximations for the Gamma function, Lanczos approximation. You can find more information about different implementations on the links below:
C++ library + pseudocode
"Calculators and the Gamma Function", Viktor T. Toth (Recommended)

6) Code
import numpy as np
import matplotlib.pyplot as plt
import math
from scipy import stats
from weibull import *
import csv

#inputs:
L, r = float(40), float(4)
path = '../Data_Test.csv'
pathReport = '../WeibullReport.csv'


def FailureProb(n, N):
    if(N < 100):
        #Bernard's approximation
        return (n - 0.3) / (N + 0.4)
    else:
        #Mean Ranks
        return n / (N + 1)


def MaxStress(F, L, r):
    # Maximum Stress (cylinder 3 points)
    return (F * L) / (math.pi * r ** 3)

aSamples = np.genfromtxt(path, delimiter=',', skip_header=1, dtype='f')

aFailureProb = aSamples.copy()
aFailureStress = aSamples.copy()

for i in range(len(aFailureProb)):
    aFailureProb[i] = FailureProb((i + 1), float(len(aSamples) + 1))
    aFailureStress[i] = MaxStress(aSamples[i], L, r)

MeanForces = np.mean(aSamples)
MeanStress = np.mean(aFailureStress)

#Survival probability = 1 - Failure Probability
aSurvivalProb = 1 - aFailureProb

#Linear regression: parameters (m, b)
aLnPs = np.log(- np.log(aSurvivalProb))
aLnFS = np.log(aFailureStress / 1000)

slope, intercept, r_square, s, std_error = stats.linregress(aLnFS, aLnPs)
regressLine = slope * aLnPs + intercept

m = slope
b = intercept

#Characteristic life : Stress
Stress = np.exp(- (intercept / slope)) * 1000
Pr_Stress = weibCumDist(Stress, slope, Stress)

# export results to .csv
Results = [
            ['Parameters', 'Value', 'Units'],
            ['shape (alpha)', slope],
            ['scale (beta)', Stress, 'MPa'],
            ['Mean', weibMean(slope, Stress), 'MPa'],
            ['Median', weibMean(slope, Stress), 'MPa'],
            ['Mode', weibMode(slope, Stress), 'MPa'],
            ['Variance', weibVar(slope, Stress), 'MPa^2'],
            ['Standard Deviation', np.sqrt(weibVar(slope, Stress)), 'MPa'],
            ['Skewness', weibSkew(slope, Stress)]
          ]

with open(pathReport, 'wb') as f:
    writer = csv.writer(f)
    writer.writerows(Results)

#Plots:
fig = plt.figure('Weibull')
ax1 = fig.add_subplot(211)
scatt = ax1.scatter(aLnFS, aLnPs, label='samples', c='DarkTurquoise', s=15)
regres = ax1.plot(aLnPs, regressLine, 'b-', label='Linear regression (MLE)')
ax1.set_ylabel('$Ln[-Ln(Ps)]$')
ax1.set_xlabel('$Ln(\sigma) : Ln(GPa)$')
ax1.set_title('Weibull analysis')
ax1.set_ylim(min(aLnPs) * 1.1, max(aLnPs) * 1.25)
ax1.set_xlim(min(aLnFS) * 1.1, max(aLnFS) * 0.9)
ax1.legend(loc=2, prop={'size': 10})
ax1.text(min(aLnFS) * 1.05, -1, r'$y=%.4f m + %.4f$' % (m, b))
ax1.text(min(aLnFS) * 1.05, -1.5, r'$R^2=%.4f$' % r_square)

x = np.linspace(0, round(Stress * 2, 0), 100)
ax2 = fig.add_subplot(212)
ax2.set_title('Weibull distribution')
ax2.plot(x, weibProbDist(x, slope, Stress), label='Prob. Density function')
ax2.fill_between(x, weibProbDist(x, slope, Stress), 0, color='0.90')
ax2.set_ylabel(r'$f(x, {\alpha}, {\beta})$')
ax2.set_xlabel(r'${\sigma} (MPa)$')
ax3 = ax2.twinx()
ax3.plot(x, weibCumDist(x, slope, Stress), 'r', label='Cum. distr. function')
ax3.fill_between(x, weibCumDist(x, slope, Stress), alpha=0.15, color='r')
ax3.set_ylim(0, 1)
ax3.set_ylabel(r'$F(x) = 1-exp(-(x/{\beta})^{\alpha})$')
ax3.legend(loc=7, prop={'size': 10})
ax2.legend(loc=2, prop={'size': 10})

#Text:
ax3.text(5, 0.7, r'${\alpha} = %.4f, {\beta} = %.4f$' % (m, Stress))
ax3.text(5, 0.6, r'$F(%s) = %s$' % (round(Stress, 3), round(Pr_Stress, 3)),
color='m')

#Arrows:
ax3.arrow(Stress, 0, 0, Pr_Stress - 0.1, head_width=2, head_length=0.1,
ec='m', fc='m')
ax3.arrow(Stress, Pr_Stress, round(Stress * 2, 0) - Stress - 10, 0,
head_width=0.02, head_length=10, ec='m', fc='m')

fig.subplots_adjust(hspace=0.5)
plt.show()
Weibull module:
"""
Weibull distribution:
    1) Probability density function
    2) Cumulative distribution function
    3) Mean
    4) Median
    5) Mode
    6) Variance - Standard deviation
    7) Skewness
    8) Gamma function (Lanczos approximation)
"""

import numpy as np
import cmath as cm


def weibProbDist(x, a, b):
    return (a / b) * (x / b) ** (a - 1) * np.exp(-(x / b) ** a)


def weibCumDist(x, a, b):
    return 1 - np.exp(-(x / b) ** a)


def weibMean(a, b):
    return b * gamma(1 + 1 / a)


def weibMedian(a, b):
    return b * np.log(2) ** (1 / a)


def weibMode(a, b):
    if a > 1:
        return b * ((a - 1) / a) ** (1 / a)
    else:
        return 0


def weibVar(a, b):
    return (b ** 2) * gamma(1 + 2 / a) - weibMean(a, b) ** 2


def weibSkew(a, b):
    ax_1 = gamma(1 + 3 / a) + 2 * (gamma(1 + 1 / a) ** 3)
    ax_2 = 3 * (gamma(1 + 2 / a) * gamma(1 + 1 / a))
    return (b ** 3) * (ax_1 - ax_2)

"""
Gamma function:

Lanczos approximation implementation.
Numerical Recipes in C (2nd ed. Cambridge University Press 1992)

"""
g = 5

p0 = float(1.000000000190015)
p1 = float(76.18009172947146)
p2 = float(-86.50532032941677)
p3 = float(24.01409824083091)
p4 = float(-1.231739572450155)
p5 = float(1.208650973866179e-3)
p6 = float(-5.395239384953e-6)

Pn_Coef = np.array([p0, p1, p2, p3, p4, p5, p6])


def gamma(z):
    z = complex(z)
    if z.real < 0.5:
        #Recursion method:
        return cm.pi / (cm.sin(cm.pi * z) * gamma(1 - z))
    else:
        ax_1 = cm.sqrt(cm.pi * 2) / z
        ax2_sum = Pn_Coef[0]
        for i in range(1, g + 2):
            ax2_sum += Pn_Coef[i] / (z + i)
        t = z + g + 0.5
        ax_3 = (t ** (z + 0.5)) * cm.exp(-t)
        return ((ax_1 * ax2_sum) * ax_3).real