
Generating numpy vectorised values from a truncated Gaussian distribution

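I have a function that generates a value from a truncated normal distribution. It uses a while loop that discards any generated value falling outside the truncation range and draws a replacement, repeating until the value lies within the range.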

def gen_truncated(minimum, maximum, ave, sigma):
    # min=0.9, max=1, 
    x = 0.
    while x < minimum or x > maximum:
        x = np.random.normal(0,1)*sigma+ave

    return x

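How can I vectorise this function so that x is an array of multiple x values, generated in such a way that there is always a while loop regenerating the array elements for as long as the condition x < minimum or x > maximum holds? Is there a vectorised way to compare each element of x against a single number, i.e. minimum or maximum?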

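Edit: What if I have more constraints to satisfy? Ultimately, I want to vectorise the generation of a 4x4 matrix that is produced under multiple constraints; the constraint in gen_truncated() is just one of many. I have a gen_sigma() that first generates 3 values lambda1, lambda2, lambda3. lambda3 then has to satisfy several conditions involving lambda1 and lambda2, otherwise it is redrawn. Once they are valid, all three values are fed into get_tau() to generate 3 values. These tau values in turn have to satisfy further constraints, otherwise they are discarded and generated again until they are valid. Finally, they form a 4x4 matrix called sigma_gen,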

import numpy as np
from numpy.linalg import norm

def gen_sigma(minimum, maximum, ave, sigma):
    lambda1 = gen_truncated(minimum, maximum, ave, sigma)
    lambda2 = gen_truncated(minimum, maximum, ave, sigma)
    lambda3 = gen_truncated(minimum, maximum, ave, sigma)

    while 1+lambda3 < abs(lambda1+lambda2) or 1-lambda3 < abs(lambda2-lambda1):
        lambda3 = gen_truncated(minimum, maximum, ave, sigma)

    tau = get_tau(lambda1, lambda2, lambda3)
    lambdas = [lambda1, lambda2, lambda3]
    while (norm(tau)**2 >
           1-sum([x**2 for x in [lambda1, lambda2, lambda3]]) +
           2*lambda1*lambda2*lambda3) or (z_eta(tau, lambdas) < 0):
        tau = get_tau(lambda1, lambda2, lambda3)

    sigma_gen = np.array([[     1,       0, 0, 0],
                          [tau[0], lambda1, 0, 0],
                          [tau[1], 0, lambda2, 0],
                          [tau[2], 0, 0, lambda3]])

    return sigma_gen

def get_tau(einval1, einval2, einval3):
    max_tau1 = 1 - abs(einval1)
    max_tau2 = 1 - abs(einval2)
    max_tau3 = 1 - abs(einval3)
    tau1 = max_tau1*(2*np.random.uniform(0,1)-1)
    tau2 = max_tau2*(2*np.random.uniform(0,1)-1)
    tau3 = max_tau3*(2*np.random.uniform(0,1)-1)

    return [tau1, tau2, tau3]

def z_eta(t: np.ndarray, l: np.ndarray):
    condition = (norm(t)**4 - 2*norm(t)**2 -
                 2*sum([(l[i]**2)*(2*(t[i]**2-norm(t)**2)) for i in range(3)])+
                 q(l))
    return condition

def q(e: np.ndarray):
    # e are the eigenvalues
    return (1+e[0]+e[1]+e[2])*(1+e[0]-e[1]-e[2])*(1-e[0]+e[1]-e[2])*(1-e[0]-e[1]+e[2])

def create_rotation(angles: np.ndarray) -> np.ndarray:
    "random rotation in PL form"
    # input np.random.normal(0,1,3)*0.06
    rotation = np.eye(4, dtype=complex)
    left = np.array([[ np.cos(angles[0]), np.sin(angles[0]), 0],
                     [-np.sin(angles[0]), np.cos(angles[0]), 0],
                     [                 0,                 0, 1]])
    mid = np.array([[1,                 0,                 0],
                    [0, np.cos(angles[1]), np.sin(angles[1])],
                    [0, -np.sin(angles[1]), np.cos(angles[1])]])
    right = np.array([[ np.cos(angles[2]), np.sin(angles[2]), 0],
                      [-np.sin(angles[2]), np.cos(angles[2]), 0],
                      [                 0,                 0, 1]])
    rotation[1:4,1:4] = left@mid@right

    return rotation

def gen_channel(r1, r2, ave, sigma):
    rand1 = np.random.normal(0,1,3)
    rand2 = np.random.normal(0,1,3)
    channel = create_rotation(rand1*r1)@gen_sigma(0.9, 1, ave, sigma)@\
              create_rotation(rand2*r2)
    return channel

An example run of a channel

gen_channel(0.05, 0.05, 0.98, 0.15)

would give for example

Out[140]: 
array([[ 1.        +0.j,  0.        +0.j,  0.        +0.j,
         0.        +0.j],
       [-0.05828008+0.j,  0.91805971+0.j,  0.14291751+0.j,
        -0.00946994+0.j],
       [-0.00509449+0.j, -0.14170308+0.j,  0.90034613+0.j,
        -0.11548884+0.j],
       [ 0.0467522 +0.j, -0.00851749+0.j,  0.11450963+0.j,
         0.90259637+0.j]])

Now if I want to create, say, 100 of these 4x4 matrices, I have to use a list comprehension, i.e.

np.array([gen_channel(0.05, 0.05, 0.98, 0.15) for i in range(100)])

This runs through all the constraint checks and creates the 4x4 matrices one at a time. My original question was motivated by wanting to vectorise this: rather than comparing one value at a time, generate arrays of values, check the constraints with numpy broadcasting, and end up with a vectorised version of gen_channel that produces 100 such 4x4 matrices without a list comprehension. The list-comprehension approach repeatedly generates single random numbers, which is the bottleneck in its run time. What I want is to generate arrays of random numbers, run the checks on them, and then build the array of 4x4 channels.
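
A rough sketch of what such a batched version could look like for the lambda step only, assuming the same redraw rule as gen_sigma (lambda1 and lambda2 stay fixed, and only lambda3 is redrawn for the rows that fail). The helper names `draw_truncated` and `gen_lambdas_batch` are hypothetical, and the tau constraints and the matrix assembly are left out.

```python
import numpy as np

rng = np.random.default_rng()

def draw_truncated(minimum, maximum, ave, sigma, size):
    """Truncated normal draws: redraw only the out-of-range entries."""
    x = rng.normal(ave, sigma, size)
    bad = (x < minimum) | (x > maximum)
    while bad.any():
        x[bad] = rng.normal(ave, sigma, bad.sum())
        bad = (x < minimum) | (x > maximum)
    return x

def gen_lambdas_batch(minimum, maximum, ave, sigma, n):
    """Return an (n, 3) array whose rows are (lambda1, lambda2, lambda3)."""
    lam = draw_truncated(minimum, maximum, ave, sigma, (n, 3))
    l1, l2 = lam[:, 0], lam[:, 1]

    def violated(l3):
        # Same condition as the while loop in gen_sigma, applied row-wise.
        return (1 + l3 < np.abs(l1 + l2)) | (1 - l3 < np.abs(l2 - l1))

    bad = violated(lam[:, 2])
    while bad.any():
        # Redraw lambda3 only for the rows that still violate the condition.
        lam[bad, 2] = draw_truncated(minimum, maximum, ave, sigma, bad.sum())
        bad = violated(lam[:, 2])
    return lam

lambdas = gen_lambdas_batch(0.9, 1.0, 0.98, 0.15, 100)
```

The loop runs until the slowest row is accepted, so a row whose admissible lambda3 interval is narrow can dominate the run time, just as in the scalar version.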

Question source: stackoverflow

is大龙 2020-03-24 22:47:23
1 answer
  • You can draw a large sample from the original distribution, determine which entries lie in the correct range, and then draw from those:

    # parameters
    ave, sigma = 0,1
    minimum, maximum = 0.9, 1
    
    # draw sample and specify which entries are ok
    a = np.random.normal(ave, sigma, 100000)
    index = (a > minimum) & (a < maximum)
    
    # draw from subset
    np.random.choice(a[index], 1000, replace=False)
    

    Using timeit

    on the np.random.choice code above:

    %%timeit -r 10 -n 10 
    2.51 ms ± 87.5 µs per loop (mean ± std. dev. of 10 runs, 10 loops each)
    

    Looping over the original function:

    %%timeit -r 10 -n 10
    
    for i in range(1000):
        gen_truncated(0.9,1, 0, 1)
    
    88.5 ms ± 1.24 ms per loop (mean ± std. dev. of 10 runs, 10 loops each)
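
The question also asks for a loop that keeps regenerating out-of-range elements. A minimal sketch of that idea, assuming NumPy's `default_rng` generator: draw the whole array at once, then redraw only the entries flagged by a boolean mask until none remain. The name `gen_truncated_vec` and its `size` and `rng` parameters are illustrative and not part of the original post.

```python
import numpy as np

def gen_truncated_vec(minimum, maximum, ave, sigma, size, rng=None):
    """Draw `size` values from N(ave, sigma) restricted to [minimum, maximum]."""
    rng = np.random.default_rng() if rng is None else rng
    x = rng.normal(ave, sigma, size)
    # Boolean mask of the entries that fall outside the allowed range.
    bad = (x < minimum) | (x > maximum)
    while bad.any():
        # Redraw only the rejected entries, then re-check the array.
        x[bad] = rng.normal(ave, sigma, bad.sum())
        bad = (x < minimum) | (x > maximum)
    return x

samples = gen_truncated_vec(0.9, 1.0, 0.98, 0.15, size=1000)
```

Unlike drawing once and subsampling, this cannot run short of accepted values, though the number of passes grows when the interval between minimum and maximum carries little probability mass.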
    

    Answer source: stackoverflow

    2020-03-24 22:47:31