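[Py Translation] Introducing Dash, a Python Framework for Interactive Data Analysis Reports
[Py Translation] Dash User Guide 01-02: Installation and App Layout
[Py Translation] Dash User Guide 03: Introduction to Interactivity
[Py Translation] Dash User Guide 04: Interactive Graphs
[Py Translation] Dash User Guide 05: Callbacks with State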
4. Interactive Graphs
Interactive Visualizations
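The dash_core_components library includes a component called Graph. Graph renders interactive data visualizations using plotly.js, an open-source JavaScript graphing library. Plotly.js supports over 35 chart types and renders charts both as vector-quality SVG and as high-performance WebGL.
The figure argument of the dash_core_components.Graph component takes the same arguments as the figure used by plotly.py, Plotly's open-source Python graphing library. See the plotly.py documentation and gallery for details.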
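As a rough illustration of what a figure looks like, the sketch below builds one as a plain Python dict with the two top-level keys, data and layout, that the examples in this chapter also use. The trace values and the title are made up for illustration; only the overall shape is the point.

```python
import json

# A figure is a dict with a list of traces under "data" and
# display options under "layout". The values here are illustrative.
figure = {
    "data": [
        {
            "x": [1, 2, 3],
            "y": [4, 1, 2],
            "type": "scatter",
            "mode": "markers",
            "name": "Trace 1",
        },
    ],
    "layout": {"title": "A minimal figure", "hovermode": "closest"},
}

# The figure is sent to the browser as JSON, so it must be serializable.
serialized = json.dumps(figure)
print(serialized)
```

A dict like this can be passed as the figure argument of dcc.Graph, in the same way the first example below does.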
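Dash components are described by a set of properties that update reactively. Callback functions can update any of these properties, and some of them are also updated through user interaction. For example, when you pick an option in a dcc.Dropdown component, that component's value property changes.
User interaction can change four properties of the dcc.Graph component: hoverData, clickData, selectedData, and relayoutData. These properties update when you hover over points, click on points, select a region of points in the graph, or zoom the graph.
The example below gives a brief tour of these properties.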
import json
from textwrap import dedent as d

import dash
from dash.dependencies import Input, Output
import dash_core_components as dcc
import dash_html_components as html

app = dash.Dash(__name__)
app.css.append_css(
    {"external_url": "https://codepen.io/chriddyp/pen/bWLwgP.css"})

# Shared style for the <pre> blocks that display the raw event data
styles = {
    'pre': {
        'border': 'thin lightgrey solid',
        'overflowX': 'scroll'
    }
}
app.layout = html.Div([
    dcc.Graph(
        id='basic-interactions',
        figure={
            'data': [
                {
                    'x': [1, 2, 3, 4],
                    'y': [4, 1, 3, 5],
                    'text': ['a', 'b', 'c', 'd'],
                    'customdata': ['c.a', 'c.b', 'c.c', 'c.d'],
                    'name': 'Trace 1',
                    'mode': 'markers',
                    'marker': {'size': 12}
                },
                {
                    'x': [1, 2, 3, 4],
                    'y': [9, 4, 1, 4],
                    'text': ['w', 'x', 'y', 'z'],
                    'customdata': ['c.w', 'c.x', 'c.y', 'c.z'],
                    'name': 'Trace 2',
                    'mode': 'markers',
                    'marker': {'size': 12}
                }
            ]
        }
    ),

    # One panel per interaction property
    html.Div(className='row', children=[
        html.Div([
            dcc.Markdown(d("""
                **Hover Data**

                Mouse over values in the graph.
            """)),
            html.Pre(id='hover-data', style=styles['pre'])
        ], className='three columns'),

        html.Div([
            dcc.Markdown(d("""
                **Click Data**

                Click on points in the graph.
            """)),
            html.Pre(id='click-data', style=styles['pre']),
        ], className='three columns'),

        html.Div([
            dcc.Markdown(d("""
                **Selection Data**

                Choose the lasso or rectangle tool in the graph's menu
                bar and then select points in the graph.
            """)),
            html.Pre(id='selected-data', style=styles['pre']),
        ], className='three columns'),

        html.Div([
            dcc.Markdown(d("""
                **Zoom and Relayout Data**

                Click and drag on the graph to zoom, or click the zoom
                buttons in the graph's menu bar.
                Clicking on legend items also fires this event.
            """)),
            html.Pre(id='relayout-data', style=styles['pre']),
        ], className='three columns')
    ])
])
# Each callback dumps the raw event payload as indented JSON.
# The payload is None (rendered as "null") until the user interacts.
@app.callback(
    Output('hover-data', 'children'),
    [Input('basic-interactions', 'hoverData')])
def display_hover_data(hoverData):
    return json.dumps(hoverData, indent=2)


@app.callback(
    Output('click-data', 'children'),
    [Input('basic-interactions', 'clickData')])
def display_click_data(clickData):
    return json.dumps(clickData, indent=2)


@app.callback(
    Output('selected-data', 'children'),
    [Input('basic-interactions', 'selectedData')])
def display_selected_data(selectedData):
    return json.dumps(selectedData, indent=2)


@app.callback(
    Output('relayout-data', 'children'),
    [Input('basic-interactions', 'relayoutData')])
def display_relayout_data(relayoutData):
    return json.dumps(relayoutData, indent=2)


if __name__ == '__main__':
    app.run_server(debug=True)
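To make the payloads more concrete, here is a minimal sketch of what a clickData value for the graph above might look like, together with a small helper that pulls out the customdata values. The payload is written by hand for illustration (the exact set of keys can vary), and extract_customdata is a hypothetical helper, not part of Dash.

```python
import json

# Hand-written payload shaped like the clickData reported for the third
# point of "Trace 2" in the example above. Treat the keys as illustrative.
sample_click_data = {
    "points": [
        {
            "curveNumber": 1,   # index of the trace ("Trace 2")
            "pointNumber": 2,   # index of the point within that trace
            "x": 3,
            "y": 1,
            "text": "y",
            "customdata": "c.y",
        }
    ]
}


def extract_customdata(event_data):
    """Return the customdata of every point in an event payload.

    Returns an empty list when the payload is None, which is what a
    callback receives before the user has interacted with the graph.
    """
    if not event_data:
        return []
    return [point.get("customdata") for point in event_data.get("points", [])]


print(json.dumps(sample_click_data, indent=2))
print(extract_customdata(sample_click_data))
print(extract_customdata(None))
```

Attaching an identifier to each point through customdata, as the example does, lets a callback look up the underlying record without relying on the displayed x, y, or text values.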
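Update Graphs on Hover
The code below extends the World Indicators example from the previous chapter so that the time series update when you hover over points in the scatter plot.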
import dash
import dash_core_components as dcc
import dash_html_components as html
import plotly.graph_objs as go
import pandas as pd

app = dash.Dash()

df = pd.read_csv(
    'https://gist.githubusercontent.com/chriddyp/'
    'cb5392c35661370d95f300086accea51/raw/'
    '8e0768211f6b747c0db42a9ce9a0937dafcbd8b2/'
    'indicators.csv')

available_indicators = df['Indicator Name'].unique()
app.layout = html.Div([
    # Controls: indicator and axis type for each axis
    html.Div([

        html.Div([
            dcc.Dropdown(
                id='crossfilter-xaxis-column',
                options=[{'label': i, 'value': i} for i in available_indicators],
                value='Fertility rate, total (births per woman)'
            ),
            dcc.RadioItems(
                id='crossfilter-xaxis-type',
                options=[{'label': i, 'value': i} for i in ['Linear', 'Log']],
                value='Linear',
                labelStyle={'display': 'inline-block'}
            )
        ],
        style={'width': '49%', 'display': 'inline-block'}),

        html.Div([
            dcc.Dropdown(
                id='crossfilter-yaxis-column',
                options=[{'label': i, 'value': i} for i in available_indicators],
                value='Life expectancy at birth, total (years)'
            ),
            dcc.RadioItems(
                id='crossfilter-yaxis-type',
                options=[{'label': i, 'value': i} for i in ['Linear', 'Log']],
                value='Linear',
                labelStyle={'display': 'inline-block'}
            )
        ], style={'width': '49%', 'float': 'right', 'display': 'inline-block'})
    ], style={
        'borderBottom': 'thin lightgrey solid',
        'backgroundColor': 'rgb(250, 250, 250)',
        'padding': '10px 5px'
    }),

    # Scatter plot; hoverData is seeded so the time series render on load
    html.Div([
        dcc.Graph(
            id='crossfilter-indicator-scatter',
            hoverData={'points': [{'customdata': 'Japan'}]}
        )
    ], style={'width': '49%', 'display': 'inline-block', 'padding': '0 20'}),
    html.Div([
        dcc.Graph(id='x-time-series'),
        dcc.Graph(id='y-time-series'),
    ], style={'display': 'inline-block', 'width': '49%'}),

    html.Div(dcc.Slider(
        id='crossfilter-year--slider',
        min=df['Year'].min(),
        max=df['Year'].max(),
        value=df['Year'].max(),
        step=None,
        marks={str(year): str(year) for year in df['Year'].unique()}
    ), style={'width': '49%', 'padding': '0px 20px 20px 20px'})
])
@app.callback(
    dash.dependencies.Output('crossfilter-indicator-scatter', 'figure'),
    [dash.dependencies.Input('crossfilter-xaxis-column', 'value'),
     dash.dependencies.Input('crossfilter-yaxis-column', 'value'),
     dash.dependencies.Input('crossfilter-xaxis-type', 'value'),
     dash.dependencies.Input('crossfilter-yaxis-type', 'value'),
     dash.dependencies.Input('crossfilter-year--slider', 'value')])
def update_graph(xaxis_column_name, yaxis_column_name,
                 xaxis_type, yaxis_type,
                 year_value):
    dff = df[df['Year'] == year_value]

    return {
        'data': [go.Scatter(
            x=dff[dff['Indicator Name'] == xaxis_column_name]['Value'],
            y=dff[dff['Indicator Name'] == yaxis_column_name]['Value'],
            text=dff[dff['Indicator Name'] == yaxis_column_name]['Country Name'],
            # customdata carries the country name into hoverData
            customdata=dff[dff['Indicator Name'] == yaxis_column_name]['Country Name'],
            mode='markers',
            marker={
                'size': 15,
                'opacity': 0.5,
                'line': {'width': 0.5, 'color': 'white'}
            }
        )],
        'layout': go.Layout(
            xaxis={
                'title': xaxis_column_name,
                'type': 'linear' if xaxis_type == 'Linear' else 'log'
            },
            yaxis={
                'title': yaxis_column_name,
                'type': 'linear' if yaxis_type == 'Linear' else 'log'
            },
            margin={'l': 40, 'b': 30, 't': 10, 'r': 0},
            height=450,
            hovermode='closest'
        )
    }


def create_time_series(dff, axis_type, title):
    return {
        'data': [go.Scatter(
            x=dff['Year'],
            y=dff['Value'],
            mode='lines+markers'
        )],
        'layout': {
            'height': 225,
            'margin': {'l': 20, 'b': 30, 'r': 10, 't': 10},
            'annotations': [{
                'x': 0, 'y': 0.85, 'xanchor': 'left', 'yanchor': 'bottom',
                'xref': 'paper', 'yref': 'paper', 'showarrow': False,
                'align': 'left', 'bgcolor': 'rgba(255, 255, 255, 0.5)',
                'text': title
            }],
            'yaxis': {'type': 'linear' if axis_type == 'Linear' else 'log'},
            'xaxis': {'showgrid': False}
        }
    }
@app.callback(
    dash.dependencies.Output('x-time-series', 'figure'),
    [dash.dependencies.Input('crossfilter-indicator-scatter', 'hoverData'),
     dash.dependencies.Input('crossfilter-xaxis-column', 'value'),
     dash.dependencies.Input('crossfilter-xaxis-type', 'value')])
def update_x_timeseries(hoverData, xaxis_column_name, axis_type):
    # The hovered point's customdata holds the country name
    country_name = hoverData['points'][0]['customdata']
    dff = df[df['Country Name'] == country_name]
    dff = dff[dff['Indicator Name'] == xaxis_column_name]
    title = '<b>{}</b><br>{}'.format(country_name, xaxis_column_name)
    return create_time_series(dff, axis_type, title)


@app.callback(
    dash.dependencies.Output('y-time-series', 'figure'),
    [dash.dependencies.Input('crossfilter-indicator-scatter', 'hoverData'),
     dash.dependencies.Input('crossfilter-yaxis-column', 'value'),
     dash.dependencies.Input('crossfilter-yaxis-type', 'value')])
def update_y_timeseries(hoverData, yaxis_column_name, axis_type):
    dff = df[df['Country Name'] == hoverData['points'][0]['customdata']]
    dff = dff[dff['Indicator Name'] == yaxis_column_name]
    return create_time_series(dff, axis_type, yaxis_column_name)


if __name__ == '__main__':
    app.run_server()
Hover over points in the scatter plot on the left and the line charts on the right update to match the hovered point.
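The two time-series callbacks above read hoverData['points'][0]['customdata'] directly, which works because the layout seeds hoverData with 'Japan'. As a minimal sketch of a more defensive variant, the hypothetical helpers below fall back to a default when no point has been hovered and mimic the two DataFrame filters on plain dicts. The function names and the sample rows are assumptions made for illustration.

```python
def hovered_country(hover_data, default="Japan"):
    """Pick the country name out of a hoverData payload.

    Falls back to `default` when nothing has been hovered yet or the
    hovered point carries no customdata.
    """
    if not hover_data or not hover_data.get("points"):
        return default
    return hover_data["points"][0].get("customdata", default)


def filter_rows(rows, country, indicator):
    """Plain-Python stand-in for the two DataFrame filters in the callbacks."""
    return [
        row for row in rows
        if row["Country Name"] == country and row["Indicator Name"] == indicator
    ]


# Made-up rows with the same column names as indicators.csv
rows = [
    {"Country Name": "Japan", "Indicator Name": "A", "Year": 2002, "Value": 1.0},
    {"Country Name": "Japan", "Indicator Name": "B", "Year": 2002, "Value": 2.0},
    {"Country Name": "Chile", "Indicator Name": "A", "Year": 2002, "Value": 3.0},
]

country = hovered_country({"points": [{"customdata": "Chile"}]})
print(country, filter_rows(rows, country, "A"))
print(hovered_country(None))
```

With a guard like this, the hoverData seed in the layout becomes optional instead of required.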
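Generic Crossfilter Recipe
The example below shows a generic crossfilter over a six-column dataset. The selection made in each scatter plot filters the underlying dataset.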
import dash
from dash.dependencies import Input, Output
import dash_core_components as dcc
import dash_html_components as html

import numpy as np
import pandas as pd

app = dash.Dash()

# Six columns of random data, offset so each column has its own range
np.random.seed(0)
df = pd.DataFrame({
    'Column {}'.format(i): np.random.rand(50) + i*10
    for i in range(6)})
app.layout = html.Div([
    html.Div(
        dcc.Graph(
            id='g1',
            # if selectedData is not specified then it is initialized as None
            selectedData={'points': [], 'range': None},
            config={'displayModeBar': False}
        ), className='four columns'
    ),
    html.Div(
        dcc.Graph(
            id='g2',
            selectedData={'points': [], 'range': None},
            config={'displayModeBar': False}
        ), className='four columns'),
    html.Div(
        dcc.Graph(
            id='g3',
            selectedData={'points': [], 'range': None},
            config={'displayModeBar': False}
        ), className='four columns')
], className='row')
def highlight(x, y):
    def callback(*selectedDatas):
        index = df.index

        # filter the dataframe by the selected points
        for selected_data in selectedDatas:
            selected_index = [
                p['customdata'] for p in selected_data['points']
                # the first trace that includes all the data
                if p['curveNumber'] == 0
            ]
            # an empty selection places no constraint on the index
            if len(selected_index) > 0:
                index = np.intersect1d(index, selected_index)

        dff = df.iloc[index, :]

        color = 'rgb(125, 58, 235)'

        trace_template = {
            'marker': {
                'color': color,
                'size': 12,
                'line': {'width': 0.5, 'color': 'white'}
            }
        }
        figure = {
            'data': [
                # the first trace displays all of the points
                # it is dimmed by setting opacity to 0.1
                dict({
                    'x': df[x], 'y': df[y], 'text': df.index,
                    'customdata': df.index,
                    'mode': 'markers', 'opacity': 0.1
                }, **trace_template),

                # the second trace is plotted on top of the first trace and
                # displays the filtered points
                dict({
                    'x': dff[x], 'y': dff[y], 'text': dff.index,
                    'mode': 'markers+text', 'textposition': 'top',
                }, **trace_template),
            ],
            'layout': {
                'margin': {'l': 15, 'r': 0, 'b': 15, 't': 5},
                'dragmode': 'select',
                'hovermode': 'closest',
                'showlegend': False
            }
        }

        # Display a rectangle to highlight the previously selected region
        shape = {
            'type': 'rect',
            'line': {
                'width': 1,
                'dash': 'dot',
                'color': 'darkgrey'
            }
        }
        if selectedDatas[0]['range']:
            figure['layout']['shapes'] = [dict({
                'x0': selectedDatas[0]['range']['x'][0],
                'x1': selectedDatas[0]['range']['x'][1],
                'y0': selectedDatas[0]['range']['y'][0],
                'y1': selectedDatas[0]['range']['y'][1]
            }, **shape)]
        else:
            # no selection yet: outline the full extent of the data
            figure['layout']['shapes'] = [dict({
                'type': 'rect',
                'x0': np.min(df[x]),
                'x1': np.max(df[x]),
                'y0': np.min(df[y]),
                'y1': np.max(df[y])
            }, **shape)]

        return figure

    return callback
app.css.append_css({
    'external_url': 'https://codepen.io/chriddyp/pen/bWLwgP.css'})

# app.callback is a decorator, which means that it takes a function
# as its argument.
# highlight is a function "generator": it is a function that returns
# a function.
# In each call below, the graph's own selectedData comes first, because
# the callback draws the selection rectangle from selectedDatas[0].
app.callback(
    Output('g1', 'figure'),
    [Input('g1', 'selectedData'),
     Input('g2', 'selectedData'),
     Input('g3', 'selectedData')]
)(highlight('Column 0', 'Column 1'))

app.callback(
    Output('g2', 'figure'),
    [Input('g2', 'selectedData'),
     Input('g1', 'selectedData'),
     Input('g3', 'selectedData')]
)(highlight('Column 2', 'Column 3'))

app.callback(
    Output('g3', 'figure'),
    [Input('g3', 'selectedData'),
     Input('g1', 'selectedData'),
     Input('g2', 'selectedData')]
)(highlight('Column 4', 'Column 5'))

if __name__ == '__main__':
    app.run_server(debug=True)
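The app.callback(...)(highlight(...)) calls above may look unusual if you have only seen the @ decorator syntax. The sketch below shows the same pattern without Dash: register is a hypothetical stand-in for app.callback, and make_scaler plays the role of highlight. Neither name exists in Dash; they only illustrate how a decorator can be applied to a function returned by another function.

```python
registry = {}


def register(name):
    """Hypothetical stand-in for app.callback: returns a decorator."""
    def decorator(func):
        registry[name] = func
        return func
    return decorator


def make_scaler(factor):
    """A function factory, like highlight(x, y): it returns a function."""
    def scale(value):
        return value * factor
    return scale


# Decorator syntax, for a function defined in place ...
@register("double")
def double(value):
    return value * 2


# ... and the equivalent explicit call, for a function built by a factory.
register("triple")(make_scaler(3))

print(registry["double"](4), registry["triple"](4))
```

Building the callbacks with a factory keeps the three graphs in sync without copying the same function body three times.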
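Click and drag in any of the graphs to filter different regions. On every selection, the three graph callbacks fire with the latest selected region of each graph. The pandas DataFrame is filtered to the selected points, the graphs are redrawn to highlight those points, and the selected region is drawn as a dashed rectangle.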
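The filtering step can be sketched on its own, without Dash or pandas. The hypothetical function below intersects the row indices selected in each graph and treats an empty selection as "no constraint", mirroring the len(selected_index) > 0 check in the example. The payloads are hand-written for illustration.

```python
def filter_by_selections(all_indices, *selections):
    """Intersect the row indices selected in each graph.

    An empty selection places no constraint on the result.
    """
    kept = set(all_indices)
    for selection in selections:
        selected = {
            p["customdata"] for p in selection["points"]
            if p["curveNumber"] == 0  # only the trace that holds every row
        }
        if selected:
            kept &= selected
    return sorted(kept)


def points(indices):
    """Build a minimal selectedData-like payload for the given row indices."""
    return {
        "points": [{"curveNumber": 0, "customdata": i} for i in indices],
        "range": None,
    }


g1 = points([1, 2, 3])
g2 = points([2, 3, 4])
g3 = points([])          # nothing selected in the third graph

print(filter_by_selections(range(6), g1, g2, g3))
```

Because the result is an intersection, adding a selection in another graph can only narrow the set of highlighted rows.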
Note: if you are filtering and visualizing multi-dimensional datasets, a parallel coordinates plot is usually a better fit.
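Current Limitations
Graph interactions in Dash still have some limitations, for example:
- Clicks on points do not accumulate: you cannot build up a set of clicked points, and there is no way to un-select a point. This is being worked on; see https://github.com/plotly/plotly.js/issues/1848.
- It is not currently possible to customize the style of the hover interactions or the selection box. This is also being worked on.
There is a lot you can do with these interactive graph features. If you would like help with a specific problem, open a thread on the Dash community forum.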
The next chapter covers the last core Dash concept: callbacks with dash.dependencies.State. State is useful for UIs that contain forms and buttons.