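Note
This chapter quotes extensively from Python for Data Analysis, Third Edition, by Wes McKinney, and is intended for study and discussion only.
This chapter assumes the following imports: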
import numpy as np
import pandas as pd

NumPy Basics: Arrays and Vectorized Computation

NumPy, short for Numerical Python, is one of the most important foundational packages for numerical computing in Python.

The NumPy ndarray: A Multidimensional Array Object

Creating ndarrays: arr1 = np.array(data1)

In [22]: data2 = [[1, 2, 3, 4], [5, 6, 7, 8]]

In [23]: arr2 = np.array(data2)

In [24]: arr2
Out[24]:
array([[1, 2, 3, 4],
       [5, 6, 7, 8]])

In [25]: arr2.ndim # number of array dimensions
Out[25]: 2

In [26]: arr2.shape # shape of the array
Out[26]: (2, 4)
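
The one-line form at the top of this section, arr1 = np.array(data1), can be sketched for the one-dimensional case as follows. The contents of data1 are an illustrative assumption, since these notes do not show them.

```python
import numpy as np

# Hypothetical input: the values in data1 are chosen for illustration only.
data1 = [6, 7.5, 8, 0, 1]

# np.array accepts any sequence-like object and returns a new ndarray.
arr1 = np.array(data1)

print(arr1.ndim)   # number of dimensions
print(arr1.shape)  # size along each dimension
print(arr1.dtype)  # mixed ints and floats are upcast to a float type
```

Unless a dtype is passed explicitly, np.array infers one that can hold every element of the input.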

Arrays filled with 0s, 1s, or True values:

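np.zeros(10) # 1-D array of ten 0.0 values
np.empty((2, 3, 2)) # allocated but uninitialized; contents are arbitrary
np.full((3, 3), True, dtype=bool) # 3x3 array with every element True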
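
As a small sketch of how these constructors relate, the example below adds np.ones and np.zeros_like, which are not shown above. The variable names are illustrative only.

```python
import numpy as np

zeros = np.zeros((2, 3))       # filled with 0.0, dtype float64 by default
ones = np.ones((2, 3))         # filled with 1.0
like = np.zeros_like(ones)     # zeros with the same shape and dtype as `ones`
trues = np.full((3, 3), True)  # constant fill; dtype is inferred as bool

# np.empty only allocates memory, so its contents are arbitrary until assigned.
scratch = np.empty((2, 3, 2))
scratch[:] = 0.0               # initialize before reading any value

print(ones.dtype, trues.dtype, scratch.shape)
```

It is safest to treat the result of np.empty as garbage until every element has been written.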

Data Types for ndarrays

NumPy data types (link)

In [33]: arr1 = np.array([1, 2, 3], dtype=np.float64)

In [34]: arr2 = np.array([1, 2, 3], dtype=np.int32)

In [35]: arr1.dtype
Out[35]: dtype('float64')

In [36]: arr2.dtype
Out[36]: dtype('int32')

In [37]: arr = np.array([1, 2, 3, 4, 5])

In [38]: arr.dtype
Out[38]: dtype('int64')

In [39]: float_arr = arr.astype(np.float64) # cast to another data type

In [40]: float_arr
Out[40]: array([1., 2., 3., 4., 5.])

In [41]: float_arr.dtype
Out[41]: dtype('float64')
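
The session above casts integers to floats. A short sketch of two other common casts follows: float to integer, and numeric strings to float. The sample values are assumptions chosen for illustration.

```python
import numpy as np

# Casting float to int drops the decimal part (truncation toward zero).
arr = np.array([3.7, -1.2, -2.6, 0.5, 12.9, 10.1])
int_arr = arr.astype(np.int32)
print(int_arr)  # [ 3 -1 -2  0 12 10]

# Strings that represent numbers can be converted to a numeric dtype.
numeric_strings = np.array(["1.25", "-9.6", "42"])
float_from_str = numeric_strings.astype(np.float64)
print(float_from_str)

# With default arguments, astype returns a new array even if the dtype is unchanged.
same = arr.astype(arr.dtype)
print(same is arr)  # False
```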

Indexing and Slicing
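
Before the specialized forms below, here is a minimal sketch of basic slicing. The point it illustrates is that a slice is a view on the original array rather than a copy. The array contents are illustrative assumptions.

```python
import numpy as np

arr = np.arange(10)

# A slice is a view on the original data, not a copy.
arr_slice = arr[5:8]
arr_slice[1] = 12345
print(arr[6])  # 12345, the change is visible through the original array

# Call .copy() explicitly when an independent array is needed.
independent = arr[5:8].copy()
independent[:] = 0
print(arr[5:8])  # unchanged by the assignment above

# For a 2-D array, these two forms select the same element.
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(arr2d[0][2] == arr2d[0, 2])
```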

Boolean Indexing

In [100]: names = np.array(["Bob", "Joe", "Will", "Bob", "Will", "Joe", "Joe"])

In [101]: data = np.array([[4, 7], [0, 2], [-5, 6], [0, 0], [1, 2],
   .....:                  [-12, -4], [3, 4]])

In [102]: names
Out[102]: array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'], dtype='<U4')

In [103]: data
Out[103]:
array([[  4,   7],
       [  0,   2],
       [ -5,   6],
       [  0,   0],
       [  1,   2],
       [-12,  -4],
       [  3,   4]])

# Boolean Array
In [104]: names == "Bob"
Out[104]: array([ True, False, False,  True, False, False, False])

In [105]: data[names == "Bob"]
Out[105]:
array([[4, 7],
       [0, 0]])

In [106]: data[names == "Bob", 1:]
Out[106]:
array([[7],
       [0]])

In [107]: data[names == "Bob", 1]
Out[107]: array([7, 0])

In [111]: cond = names == "Bob"

# The ~ operator can be useful when you want to invert a Boolean array referenced by a variable
In [112]: data[~cond]
Out[112]:
array([[  0,   2],
       [ -5,   6],
       [  1,   2],
       [-12,  -4],
       [  3,   4]])

# To select two of the three names, combine multiple Boolean conditions with Boolean arithmetic operators like & (and) and | (or)
In [113]: mask = (names == "Bob") | (names == "Will")

In [114]: mask
Out[114]: array([ True, False,  True,  True,  True, False, False])

In [115]: data[mask]
Out[115]:
array([[ 4,  7],
       [-5,  6],
       [ 0,  0],
       [ 1,  2]])
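
The session above only reads values through Boolean masks. As a hedged sketch, the example below reuses the same names and data arrays to show assignment through a mask, and why the Python keywords and / or cannot replace & and |. The replacement values 0, 7, and 99 are arbitrary.

```python
import numpy as np

names = np.array(["Bob", "Joe", "Will", "Bob", "Will", "Joe", "Joe"])
data = np.array([[4, 7], [0, 2], [-5, 6], [0, 0], [1, 2], [-12, -4], [3, 4]])

# Selecting with a Boolean array produces a copy of the data.
subset = data[names == "Bob"]
subset[:] = 99
print(data[0])  # still [4 7]

# Assigning through a Boolean mask modifies the original array in place.
data[data < 0] = 0

# Whole rows can be set using a one-dimensional Boolean array.
data[names != "Joe"] = 7
print(data)

# The Python keywords `and` / `or` do not work element-wise on arrays.
try:
    (names == "Bob") or (names == "Will")
except ValueError as exc:
    print("ValueError:", exc)
```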

Fancy Indexing

In [120]: arr = np.zeros((8, 4))

In [121]: for i in range(8):
   .....:     arr[i] = i

In [122]: arr
Out[122]:
array([[0., 0., 0., 0.],
       [1., 1., 1., 1.],
       [2., 2., 2., 2.],
       [3., 3., 3., 3.],
       [4., 4., 4., 4.],
       [5., 5., 5., 5.],
       [6., 6., 6., 6.],
       [7., 7., 7., 7.]])

In [123]: arr[[4, 3, 0, 6]]
Out[123]:
array([[4., 4., 4., 4.],
       [3., 3., 3., 3.],
       [0., 0., 0., 0.],
       [6., 6., 6., 6.]])

# Using negative indices selects rows from the end
In [124]: arr[[-3, -5, -7]]
Out[124]:
array([[5., 5., 5., 5.],
       [3., 3., 3., 3.],
       [1., 1., 1., 1.]])
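
The examples above pass a single list of row indices. A short sketch of what happens with two index arrays may help, since the result is often not what people expect. The array built with np.arange here is an illustrative assumption.

```python
import numpy as np

arr = np.arange(32).reshape((8, 4))

# Two index arrays are paired element-wise: (1, 0), (5, 3), (7, 1), (2, 2).
picked = arr[[1, 5, 7, 2], [0, 3, 1, 2]]
print(picked)  # [ 4 23 29 10]

# To get a rectangular block: select rows first, then reorder columns.
block = arr[[1, 5, 7, 2]][:, [0, 3, 1, 2]]
print(block.shape)  # (4, 4)

# Fancy indexing copies the data; changing the result leaves `arr` intact.
picked[0] = -1
print(arr[1, 0])  # 4
```

Unlike basic slicing, fancy indexing always copies the selected data into a new array.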

Pseudorandom Number Generation

Quoted from: NumPy random number generator methods

numpy.random API reference: https://numpy.org/doc/stable/reference/random/index.html

Method           Description
permutation      Return a random permutation of a sequence, or return a permuted range
shuffle          Randomly permute a sequence in place
uniform          Draw samples from a uniform distribution ([0, 1) by default)
integers         Draw random integers from a given low-to-high range
standard_normal  Draw samples from a normal distribution with mean 0 and standard deviation 1
binomial         Draw samples from a binomial distribution
normal           Draw samples from a normal (Gaussian) distribution
beta             Draw samples from a beta distribution
chisquare        Draw samples from a chi-square distribution
gamma            Draw samples from a gamma distribution
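
A few of the methods in the table can be sketched as follows, assuming a generator created with np.random.default_rng. The seed value, sizes, and ranges are arbitrary choices for illustration.

```python
import numpy as np

# A seeded generator gives reproducible draws; the seed value is arbitrary.
rng = np.random.default_rng(seed=12345)

normal_samples = rng.standard_normal((2, 3))     # mean 0, standard deviation 1
uniform_samples = rng.uniform(0.0, 1.0, size=5)  # values in [0, 1)
dice = rng.integers(1, 7, size=10)               # low inclusive, high exclusive

deck = np.arange(5)
shuffled_copy = rng.permutation(deck)  # returns a new permuted array
rng.shuffle(deck)                      # permutes `deck` in place

# Two generators created with the same seed produce the same stream.
rng_a = np.random.default_rng(seed=1)
rng_b = np.random.default_rng(seed=1)
print(np.array_equal(rng_a.standard_normal(3), rng_b.standard_normal(3)))  # True
```

Keeping an explicit generator object, rather than relying on global state, makes it easier to isolate and reproduce random draws.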

Universal Functions

NumPy ufunc API reference: https://numpy.org/doc/stable/reference/ufuncs.html
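
Since this section gives only the reference link, here is a minimal sketch of what universal functions do: they apply an operation element by element without an explicit Python loop. The input values are illustrative assumptions.

```python
import numpy as np

arr = np.arange(5)

# Unary ufuncs transform each element.
roots = np.sqrt(arr)
exps = np.exp(arr)

# Binary ufuncs combine two arrays element by element.
x = np.array([1.5, -2.0, 3.25])
y = np.array([0.5, 4.0, -1.0])
larger = np.maximum(x, y)
print(larger)

# Some ufuncs return several arrays; modf splits fractional and integer parts.
frac, whole = np.modf(np.array([2.5, -1.25]))
print(frac, whole)
```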