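seaborn is a higher-level API built on top of matplotlib. It lets you produce plots with less code than calling matplotlib directly, which makes plotting easier.
I. Distribution plots
1. Kernel density estimate plots
Univariate kernel density estimate plot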
    import matplotlib.pyplot as plt
    import numpy as np
    import seaborn as sns

    # Draw 50 samples from a 2-D normal distribution; x and y are its two coordinates.
    mean, cov = [0, 2], [(1, .5), (.5, 1)]
    x, y = np.random.multivariate_normal(mean, cov, size=50).T

    # fill=True shades the area under the density curve (seaborn 0.11 or later).
    sns.kdeplot(x=x, fill=True)
    plt.show()
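To make it clearer what the curve represents, here is a minimal NumPy sketch of a one-dimensional Gaussian kernel density estimate: one Gaussian bump is centred on every sample and the bumps are averaged. The helper name `gaussian_kde_1d`, the evaluation grid, and the rule-of-thumb bandwidth are assumptions made for illustration; seaborn's own implementation may differ in its details.

```python
import numpy as np


def gaussian_kde_1d(samples, grid, bandwidth):
    """Average one Gaussian bump per sample, evaluated at every grid point."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # z[i, j] is the standardized distance from grid point i to sample j.
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)
    return kernels.mean(axis=1) / bandwidth


rng = np.random.default_rng(0)
samples = rng.normal(loc=0.0, scale=1.0, size=50)
grid = np.linspace(-6.0, 6.0, 601)

# Scott's rule of thumb for one dimension (an assumed bandwidth choice).
bandwidth = samples.std(ddof=1) * len(samples) ** (-1 / 5)

density = gaussian_kde_1d(samples, grid, bandwidth)

# A density should integrate to roughly 1 over a grid that covers the data.
area = float(np.sum(density) * (grid[1] - grid[0]))
```

A smaller bandwidth gives a bumpier curve that follows individual samples; a larger one gives a smoother curve that hides detail.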
Bivariate kernel density estimate plot
    import matplotlib.pyplot as plt
    import pandas as pd
    import seaborn as sns
    from sklearn.datasets import load_iris

    # Load iris into a DataFrame and map the integer targets to species names.
    iris = load_iris()
    d = pd.DataFrame(iris.data, columns=["sepal_length", "sepal_width", "petal_length", "petal_width"])
    d["species"] = iris.target_names[iris.target]

    setosa = d[d.species == "setosa"]
    versicolor = d[d.species == "versicolor"]

    # One filled 2-D density per species; thresh=0.05 leaves the lowest level unshaded.
    sns.kdeplot(data=setosa, x="sepal_length", y="sepal_width",
                cmap="Reds", fill=True, thresh=0.05)
    sns.kdeplot(data=versicolor, x="sepal_length", y="sepal_width",
                cmap="Blues", fill=True, thresh=0.05)
    plt.show()
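2. Joint distribution plot
The joint probability distribution, or joint distribution for short, is the probability distribution of a random vector made up of two or more random variables.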
    import matplotlib.pyplot as plt
    import pandas as pd
    import seaborn as sns
    from sklearn.datasets import load_iris

    # Load iris into a DataFrame and map the integer targets to species names.
    iris = load_iris()
    d = pd.DataFrame(iris.data, columns=["sepal_length", "sepal_width", "petal_length", "petal_width"])
    d["species"] = iris.target_names[iris.target]

    # Joint KDE of sepal length and width in the centre, marginal KDEs on the edges.
    sns.jointplot(data=d, x="sepal_length", y="sepal_width", kind="kde", dropna=True)
    plt.show()
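The definition of a joint distribution can be made concrete with a small NumPy sketch. It estimates a discretized joint distribution of two variables with a 2-D histogram and recovers each marginal by summing over the other variable, which mirrors the centre panel and the edge panels of `jointplot`. The synthetic data and the bin count are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
mean, cov = [0, 2], [(1, .5), (.5, 1)]
x, y = rng.multivariate_normal(mean, cov, size=500).T

# Count samples in a 10 x 10 grid of bins, then normalize to probabilities.
counts, x_edges, y_edges = np.histogram2d(x, y, bins=10)
joint = counts / counts.sum()      # joint[i, j] ~ P(x in bin i, y in bin j)

# Summing the joint distribution over one variable gives the other's marginal.
marginal_x = joint.sum(axis=1)
marginal_y = joint.sum(axis=0)
```

Here `marginal_x` matches the normalized 1-D histogram of `x` on the same bin edges, which is why the marginal plots carry no information about how the two variables depend on each other.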
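3. Pair plot
A pair plot draws every pairwise relationship among a set of variables in a single grid, so it is commonly used to inspect how several variables relate to one another.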
    import matplotlib.pyplot as plt
    import pandas as pd
    import seaborn as sns
    from sklearn.datasets import load_iris

    # Load iris into a DataFrame and map the integer targets to species names.
    iris = load_iris()
    d = pd.DataFrame(iris.data, columns=["sepal_length", "sepal_width", "petal_length", "petal_width"])
    d["species"] = iris.target_names[iris.target]

    # Scatter plots for every pair of features, colored by species.
    sns.pairplot(d, hue="species")
    plt.show()
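II. Regression plots
A simple linear regression model is very easy to fit: lmplot fits one and draws the regression line over the scatter plot.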
    import matplotlib.pyplot as plt
    import pandas as pd
    import seaborn as sns
    from sklearn.datasets import load_iris

    # Load iris into a DataFrame and map the integer targets to species names.
    iris = load_iris()
    d = pd.DataFrame(iris.data, columns=["sepal_length", "sepal_width", "petal_length", "petal_width"])
    d["species"] = iris.target_names[iris.target]

    # Linear fit of sepal width against sepal length for the setosa rows only.
    sns.lmplot(x="sepal_length", y="sepal_width", data=d[d.species == "setosa"])
    plt.show()
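For readers who want the numbers behind the fitted line, the following sketch fits a straight line by ordinary least squares in two equivalent ways: with the closed-form slope and intercept, and with `np.polyfit`. The data are synthetic and the true slope and intercept (0.8 and -0.6) are hypothetical values chosen for illustration; this is not a description of seaborn's internals.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(4.0, 6.0, size=50)
# Hypothetical linear relationship plus noise.
y = 0.8 * x - 0.6 + rng.normal(scale=0.2, size=50)

# Closed-form least squares: slope = cov(x, y) / var(x).
slope = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
intercept = y.mean() - slope * x.mean()

# The same fit through np.polyfit (coefficients come highest power first).
fit_slope, fit_intercept = np.polyfit(x, y, deg=1)

print(f"closed form: y = {slope:.3f} * x + {intercept:.3f}")
print(f"np.polyfit:  y = {fit_slope:.3f} * x + {fit_intercept:.3f}")
```

Both routes minimize the same sum of squared residuals, so they agree up to floating-point error.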
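A polynomial regression model can fit some simple nonlinear trends in a dataset; the order parameter sets the degree of the polynomial.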
    import matplotlib.pyplot as plt
    import pandas as pd
    import seaborn as sns
    from sklearn.datasets import load_iris

    # Load iris into a DataFrame and map the integer targets to species names.
    iris = load_iris()
    d = pd.DataFrame(iris.data, columns=["sepal_length", "sepal_width", "petal_length", "petal_width"])
    d["species"] = iris.target_names[iris.target]

    # Second-order polynomial fit per species; ci=None hides the confidence band.
    sns.lmplot(x="sepal_length", y="sepal_width", hue="species", data=d,
               order=2, ci=None, scatter_kws={"s": 80})
    plt.show()
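As a rough illustration of what a second-order fit buys over a straight line, the sketch below fits both to the same curved data with `np.polyfit` and compares their squared errors. The synthetic data and the coefficients 0.5, -1 and 2 are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-3.0, 3.0, 60)
# Hypothetical curved relationship plus noise.
y = 0.5 * x ** 2 - x + 2.0 + rng.normal(scale=0.3, size=x.size)

quadratic = np.polyfit(x, y, deg=2)   # [a, b, c] for a*x**2 + b*x + c
linear = np.polyfit(x, y, deg=1)      # [m, k] for m*x + k


def sse(coeffs):
    """Sum of squared residuals of the polynomial with these coefficients."""
    return float(np.sum((y - np.polyval(coeffs, x)) ** 2))


print("quadratic SSE:", sse(quadratic))
print("linear SSE:   ", sse(linear))
```

The quadratic fit has the smaller error here because the data really are curved; on data that are close to linear the extra term would mostly fit noise.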
III. Matrix plots
1. Heat map
    import matplotlib.pyplot as plt
    import pandas as pd
    import seaborn as sns
    from sklearn.datasets import load_iris

    # Load iris into a DataFrame and map the integer targets to species names.
    iris = load_iris()
    d = pd.DataFrame(iris.data, columns=["sepal_length", "sepal_width", "petal_length", "petal_width"])
    d["species"] = iris.target_names[iris.target]

    # Correlation is defined only for the numeric columns, so drop the species labels.
    corr = d.drop(columns="species").corr()

    # Diverging colormap centred at 0; annot=True prints each coefficient in its cell.
    sns.heatmap(corr, xticklabels=corr.columns, yticklabels=corr.columns,
                cmap="RdYlGn", center=0, annot=True)
    plt.show()
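The matrix passed to `heatmap` above holds pairwise Pearson correlation coefficients. As a sketch of how such a matrix is obtained, the following NumPy code computes it from the covariance matrix and checks the result against `np.corrcoef`. The three synthetic columns are assumptions chosen so that one pair is strongly positively correlated and another strongly negatively correlated.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = 2.0 * a + rng.normal(scale=0.5, size=200)   # moves with a
c = -a + rng.normal(scale=0.5, size=200)        # moves against a
data = np.column_stack([a, b, c])

# Covariance matrix of the columns.
centered = data - data.mean(axis=0)
cov = centered.T @ centered / (len(data) - 1)

# Divide each covariance by the product of the two standard deviations.
std = np.sqrt(np.diag(cov))
corr = cov / np.outer(std, std)

# The library routine gives the same matrix.
assert np.allclose(corr, np.corrcoef(data, rowvar=False))
```

Every entry lies between -1 and 1 and the diagonal is 1, which is why a diverging colormap centred at 0 is a natural fit for this kind of matrix.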
2. Cluster map
    import matplotlib.pyplot as plt
    import pandas as pd
    import seaborn as sns
    from sklearn.datasets import load_iris

    # Load the four numeric iris features into a DataFrame.
    iris = load_iris()
    d = pd.DataFrame(iris.data, columns=["sepal_length", "sepal_width", "petal_length", "petal_width"])

    # Heat map whose rows and columns are reordered by hierarchical clustering.
    sns.clustermap(d)
    plt.show()