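A Brief Look at Random Permutation and Random Sampling in pandas

What follows is a complete walkthrough of random permutation and random sampling in pandas.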

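1. Random Permutation in pandas

1.1 The random permutation function

pandas provides the sample() method for selecting rows (or columns) of a DataFrame or Series at random. Calling it with frac=1 returns every row in random order, which is a random permutation. Its signature is: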

df.sample(n=None, frac=None, replace=False, weights=None, random_state=None, axis=None)

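The parameters are:

n : number of items to select; cannot be combined with frac. If neither n nor frac is given, one item is returned;
frac : fraction of the items to select, as a float; cannot be combined with n. Values above 1 are allowed only when replace=True;
replace : whether the same item may be selected more than once (True allows it, False does not);
weights : sampling weight of each item; by default every item is equally likely, and weights that do not sum to 1 are normalized;
random_state : seed for the random number generator, used to make results reproducible;
axis : 0 to select rows of a DataFrame, 1 to select columns. Rows are the default.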
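
The parameter descriptions above can be checked with a short sketch. The frame `demo` and its columns `x` and `w` are hypothetical names introduced only for this illustration; they are not part of the original example.

```python
import pandas as pd

# Hypothetical frame for illustration: "w" holds per-row sampling weights.
demo = pd.DataFrame({"x": range(5), "w": [0, 0, 1, 1, 2]})

# The same random_state gives the same selection on every call.
first = demo.sample(n=3, random_state=42)
second = demo.sample(n=3, random_state=42)

# With replace=False (the default) no row is selected twice.
no_repeat = demo.sample(n=5, random_state=0)

# With replace=True the sample may be larger than the data (frac > 1).
oversampled = demo.sample(frac=2, replace=True, random_state=0)

# weights may name a column; rows with weight 0 are never selected.
weighted = demo.sample(n=2, weights="w", random_state=0)

print(first.index.tolist() == second.index.tolist())
print(len(oversampled))
print(weighted.index.tolist())
```

Passing frac=2 without replace=True raises a ValueError, since a sample without replacement cannot be larger than the data.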

1.2 Example

The following DataFrame is used to demonstrate sample():

import pandas as pd
import numpy as np
df = pd.DataFrame({'A': np.random.randn(4), 'B': np.random.randn(4)})
print(df)

Output (the values differ on every run because the data is generated without a seed):

          A         B
0  0.356516  0.470121
1  0.124784 -0.523918
2 -0.276305 -1.212363
3  0.917276 -0.620956

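Use sample() to select rows and columns at random:

# Randomly select 2 rows from the DataFrame
print(df.sample(n=2, random_state=1))

# Randomly select 50% of the rows from the DataFrame
print(df.sample(frac=0.5, random_state=1))

# Randomly select 2 columns from the DataFrame (it has only 2, so both are returned)
print(df.sample(n=2, axis=1, random_state=1))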

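Output. The rows shown are illustrative, since which rows are chosen depends on the random number generator. With the same random_state, n=2 and frac=0.5 select the same 2 rows of a 4-row DataFrame. The third call returns all 4 rows with both columns, possibly in a different column order:

          A         B
1  0.124784 -0.523918
0  0.356516  0.470121

          A         B
1  0.124784 -0.523918
0  0.356516  0.470121

          A         B
0  0.356516  0.470121
1  0.124784 -0.523918
2 -0.276305 -1.212363
3  0.917276 -0.620956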
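
None of the calls above returns every row, so none of them is a full permutation. A minimal sketch of shuffling all rows with frac=1 follows; the frame `frame` and its values are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical frame for illustration.
frame = pd.DataFrame({"A": [10, 20, 30, 40], "B": list("wxyz")})

# frac=1 selects 100% of the rows, so the result is the same rows in random order.
shuffled = frame.sample(frac=1, random_state=1)

# The original index labels travel with the rows; reset them for a clean 0..n-1 index.
renumbered = shuffled.reset_index(drop=True)

print(shuffled)
print(renumbered)
```

Keeping the original labels in `shuffled` makes it possible to trace each row back to its source position, while `renumbered` is more convenient when the old order no longer matters.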

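2. Random Sampling in pandas

2.1 The random sampling function

Random sampling uses the same DataFrame.sample() method, called with n or with a frac below 1 so that only part of the data is returned: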

df.sample(n=None, frac=None, replace=False, weights=None, random_state=None, axis=None)

The parameters are the same as those described in section 1.1.
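
Section 1.1 notes that sample() also works on a Series. A small sketch, assuming a hypothetical Series `s` with string labels:

```python
import pandas as pd

# Hypothetical Series for illustration.
s = pd.Series([1.5, 2.5, 3.5, 4.5], index=list("abcd"))

# Draw 2 of the 4 values without replacement; labels are kept.
picked = s.sample(n=2, random_state=7)

print(picked)
```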

2.2 Example

The following DataFrame is used to demonstrate random sampling with sample():

import pandas as pd
import numpy as np
df = pd.DataFrame({'A': np.random.randn(4), 'B': np.random.randn(4)})
print(df)

Output (the values differ on every run because the data is generated without a seed):

          A         B
0  0.356516  0.470121
1  0.124784 -0.523918
2 -0.276305 -1.212363
3  0.917276 -0.620956

Use sample() to draw random samples from the data:

# Draw 2 rows from the DataFrame
print(df.sample(n=2, random_state=1))

# Draw 50% of the rows from the DataFrame
print(df.sample(frac=0.5, random_state=1))

# Draw 2 columns from the DataFrame
print(df.sample(n=2, axis=1, random_state=1))

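Output. The rows shown are illustrative, since which rows are chosen depends on the random number generator. With the same random_state, n=2 and frac=0.5 select the same 2 rows of a 4-row DataFrame. The third call returns all 4 rows with both columns, possibly in a different column order:

          A         B
1  0.124784 -0.523918
0  0.356516  0.470121

          A         B
1  0.124784 -0.523918
0  0.356516  0.470121

          A         B
0  0.356516  0.470121
1  0.124784 -0.523918
2 -0.276305 -1.212363
3  0.917276 -0.620956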
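
How the three calls relate to each other can be verified directly. The sketch below uses a hypothetical 4-row frame with fixed values instead of random data, so the checks do not depend on the generated numbers.

```python
import pandas as pd

# Hypothetical 4-row frame for illustration.
frame = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})

# frac=0.5 of 4 rows is 2 rows, so with the same seed both calls pick the same rows.
by_count = frame.sample(n=2, random_state=1)
by_fraction = frame.sample(frac=0.5, random_state=1)
same_rows = by_count.index.tolist() == by_fraction.index.tolist()

# Sampling along axis=1 selects columns and keeps every row.
by_column = frame.sample(n=2, axis=1, random_state=1)

print(same_rows)                  # True
print(by_column.shape)            # (4, 2)
print(sorted(by_column.columns))  # ['A', 'B']
```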

This concludes the walkthrough of random permutation and random sampling in pandas.
