经济学仿真

这是对一个经济市场的模拟，在这个市场中，有许多参与者，每个参与者都有一定的财富水平。在每一个步骤中，两个参与者（由交互函数(interaction function)选择）参与一个交易，在他们之间交换财富（根据交易函数(transaction function)）。其目的是了解人口财富随时间的演变(evolution)。我在2008年访问巴德学院计算机科学系时听说了这个问题。更新：2017年，Uri Wilensky的一个版本开始流行。我们在下面介绍他的版本。

为什么这么有趣？

这是一个使用仿真来对世界建模的例子。这个模型很简单，但捕捉到了复杂世界的某些方面。
许多学生对经济如何运作有着先入为主的观念(preconceptions)，而这些观念将受到这里所示结果的挑战。
它揭示了计算思维、数学思维和统计思维之间的微妙差异(subtle difference)。

人群分布

我们将会用一个还有N个数字的列表来对人群建模，每个数字代表一个参与者的财富值。我们将从高斯分布开始，

平均财富值为100 simoleons，标准差(standard deviation)为平均值的1/5

import random

N = 5000
MU = 100.

population = [random.gauss(mu = MU, sigma=MU/5) for _ in range(N)]

人口统计与可视化

财富在人口分配中有多平均(evenly)？传统方法是基尼系数

。维基百科说，基尼系数(Gini coefficient)是通过这个公式算出来的(假设y值是排过序的)

$G = \frac {2\sum_{i=1}^{n}iy_{i}}{n\sum^{N}{i=1}y{i}} - \frac{n+1}{n}$

基尼系数是0意味着完全平均(每个人有相同数量), 这个值越接近1越不平均(绝大多数的钱在少数人手里)。下面是一些国家基尼系数的值：

Country	Gini coefficient
Sweden	0.250
Canada	0.326
Switzerland	0.337
United States	0.408
Chile	0.521
South Africa	0.631

基尼系数传统上是按收入计算的，但是我们要处理的是财富问题。计算如下

def gini(y):
    y = sorted(y)
    n = len(y)
    numer = 2 * sum((i+1) * y[i] for i in range(n))
    denom = n *sum(y)
    return (numer / denom) - (n + 1) / n

我们将定义函数hist来绘制总体直方图(histogram)。我们的hist包装plt.hist, 但是又一些特定的关键字值：

%matplotlib inline
import matplotlib.pyplot as plt

def hist(population, label='pop', **kwargs):
    label = label + ': G=' + str(round(gini(population), 2))
    h = plt.hist(list(population), bins=30, label=label, **kwargs)
    plt.xlabel('wealth'); plt.ylabel('count');  plt.grid(True)
    plt.legend()

交易

在一笔交易中，两个参与者走到一起，交换他们的一些财富。目前，我们将使用一个节约财富(wealth-conserving)的交易函数，将两个参与者的所有财富放入一个罐(pot)中，然后在两个参与者之间随机、统一地分割：

def random_split(A, B):
    pot = A + B
    share = random.uniform(0, pot)
    return share, pot - share

Canada|0.326|

交互

我们如何决定哪一部分相互影响？我们将定义一个交互函数，在给定人群规模的情况下，随机选择人群中的任意两个参与者（由它们在列表中的索引号表示(denoted by)）。我们将此函数称为anyone，这意味着任何参与者都可以与任何其他参与者交互：

1	def anyone(N): return random.sample(range(N), 2)

仿真

函数simulate接受一个初始人群，调用一个交互函数来选择两个参与者，调用一个交易函数来分割他们的财富，并重复此T次。每次交易后，我们都会生成人群，因此simulate会生成整个仿真历史。

def step(population, transaction=random_split, interaction=anyone):
    i, j = interaction(len(population))
    population[i], population[j] = transaction(population[i], population[j])
    return population

def simulate(population, T, step=step, transaction=random_split, interaction=anyone):
    population=population.copy()
    yield population
    for t in range(T):
        yield step(population, transaction, interaction)

下面是4个参与者交换8次的例子

for pop in simulate([100] * 4, 8):
    print(pop)
  
[100, 100, 100, 100]
[100, 100, 166.07034848304477, 33.92965151695522]
[100, 129.37411310873642, 166.07034848304477, 4.555538408218797]
[100, 21.916292438917807, 166.07034848304477, 112.01335907803742]
[100, 21.916292438917807, 89.32137618847126, 188.7623313726109]
[88.88749715846127, 33.028795280456535, 89.32137618847126, 188.7623313726109]
[88.88749715846127, 0.008026672184755057, 122.34214479674304, 188.7623313726109]
[88.88749715846127, 0.008026672184755057, 147.86216819550745, 163.24230797384647]
[88.88749715846127, 54.73884147345122, 147.86216819550745, 108.51149317257999]

仿真可视化

对于跟多数据，我们需要更好的函数来可视化展示仿真结果

import statistics
def show(population, k=40, percentiles=(1, 10, 50, 90, 99), **kwargs):
    N = len(population)
    start = list(population)
    result = [(t, sorted(pop)) 
              for (t, pop) in enumerate(simulate(population, T=k * N, **kwargs)) 
                                        if t % (N / 10) == 0]
    print('  t  Gini stdev' + (' {:3d}%' * len(percentiles)).format(*percentiles))
    print('_______ ____ _____' + '____' * len(percentiles))
    fmt = '{:7,d} {:.2f} {:5.1f}' + ' {4.0f}' * len(percentiles)
    for (t, pop) in results:
        if t % (k * N //10) == 0:
            data = [percent(pct, pop) for pct in percentiles]
            print(fmt.format(t, gini(pop), statistics.stdev(pop), *data))
            
    plt.title('/'.join(map(str, percentiles)) + 'Percentile Plots')
    times = [t for (t, pop) in results]
    plt.hold(True); plt.xlabel('wealth'); plt.ylabel('time'); plt.grid(True)
    for pct in percentiles:
        line = [percent(pct, pop) for (t, pop) in results]
        plt.plot(line, times)
    plt.show()
    
    plt.title('Histograms')
    R = (min(pop + start), max(pop + start))
    hist(start, 'start', range=R)
    hist(pop, 'end', range=R)
    plt.show()
    
    plt.title('Ordered Curves')
    order = list(range(len(pop)))
    plt.plot(sorted(start), order, label='start')
    plt.plot(sorted(pop), order, label='end')
    plt.xlabel('wealth'); plt.ylabel('order'); plt.grid(True)
    plt.legend()
    
def percent(pct, items):
    return items[min(len(items) - 1, len(items) * pct // 100)]

show(population)
   t    Gini stdev   1%  10%  50%  90%  99%
------- ---- ----- ---- ---- ---- ---- ----
      0 0.11  20.1   54   75  100  126  148
 20,000 0.50  99.4    1   11   71  231  454
 40,000 0.49  98.0    1   11   71  231  452
 60,000 0.50  99.1    1   10   69  233  441
 80,000 0.49  96.6    1   11   71  230  442
100,000 0.50 100.4    1   11   69  230  465
120,000 0.50 101.1    1   10   69  230  468
140,000 0.50 101.8    1   10   69  230  466
160,000 0.50 102.7    1   11   68  227  477
180,000 0.49  98.4    1   11   72  229  450
200,000 0.49  99.0    1   11   70  231  464

输出分为四部分：

打印输出：对初始人口，以及此过程中每20,000个交易，我们打印人口的基尼系数和标准差，以及人口中五个百分点的财富：1%，10%，50%（median）， 90%，99%。

绘图：该图显示与打印信息相同的信息(除了基尼系数)，但沿途又更多数据点。最左边的(蓝色)是1%标记，最右边的(紫色)是99%标记。时间轴是自下而上，99%的线其实位置大概150，随着时间的推移会增加到400以上，这表明最富有的1%的人会变的越来越富有。大约50,000笔交易以后，线几乎直线上升，这表明该系统已经收敛。

直方图：开始与结束人群用直方图绘出。

有序曲线：初始(蓝色),结束(绿色)有序人群曲线表示了财富和顺序数目的关系。最穷的参与者(序数为0)在初始和结束的财富都是0。第2000穷的参与者(中位数下面一点；在40%的位置)，初始时大概有100块，但是在结束人群中只有50块。

结果显示收入不均衡随着时间流逝而增加。何以见得？因为基尼系数随着时间流逝在增加，标准差也在增加。1%和10%的标记在减少（蓝色和橄榄色的线随着时间的增加而向左移动），而90%和99%的标记在增加（浅绿色和紫色的线随着时间的增加而向右移动。

初始人群的影响

初始人群不同，会对最后的结果造成什么影响？下面介绍一个samples函数来对一个分布函数进行n次采样，并将其结果归一化为指定的均值

def samples(distribution, *args, n=N, mu=MU):
    numbers = [distribution(*args) for _ in range(N)]
    return normalize(numbers, mu)

def normalize(numbers, mu):
    numbers = [max(0, n) for n in numbers]
    factor = mu * len(numbers) / sum(numbers)
    return [x * factor for x in numbers]

现在让我们从一个分布函数产生出初始人群。我这里选择均匀分布：

1	show(samples(random.uniform, 0, 200))

   t    Gini stdev   1%  10%  50%  90%  99%
------- ---- ----- ---- ---- ---- ---- ----
      0 0.33  57.9    2   19  102  181  200
 20,000 0.49  97.1    1   11   72  224  447
 40,000 0.50 102.7    1   10   69  229  486
 60,000 0.50 101.1    1   11   68  233  460
 80,000 0.50 101.0    1   10   69  229  490
100,000 0.51 101.7    1   10   67  234  465
120,000 0.50  99.7    1   10   70  229  462
140,000 0.50  99.5    1   11   71  229  444
160,000 0.51 102.6    1   10   66  235  469
180,000 0.50 102.3    1   10   68  229  455
200,000 0.51 102.3    1   10   70  232  472