Building Interactive Dashboards
Interactive Dashboards for Baseball Analytics
Interactive dashboards have revolutionized how baseball organizations, analysts, and fans explore and communicate data-driven insights. Unlike static reports or presentations, interactive dashboards allow users to filter data, drill down into details, explore different time periods, and answer their own questions through intuitive visual interfaces. Modern dashboard technologies enable analysts to transform complex baseball datasets into accessible, actionable insights that drive decision-making at every level of baseball operations.
Understanding Interactive Baseball Dashboards
Interactive dashboards provide dynamic, user-driven exploration of baseball analytics through filtering, sorting, zooming, and drill-down capabilities. They serve multiple audiences within baseball organizations: front office executives need high-level performance summaries and roster comparisons, coaches require detailed player metrics and matchup analysis, scouts want comprehensive player evaluation tools, and players benefit from personalized performance feedback. The best dashboards balance comprehensive functionality with intuitive design, enabling non-technical users to extract insights without analytical expertise.
The ecosystem of dashboard tools spans from lightweight JavaScript libraries for web embedding to full-featured application frameworks. Plotly offers interactive charts with zoom, pan, and hover capabilities that integrate seamlessly with Python and R analytics workflows. Streamlit enables rapid prototyping of data applications with minimal code, making it ideal for analysts who want to quickly share insights with stakeholders. Shiny provides R users with powerful reactive programming capabilities for building sophisticated statistical dashboards. Each tool offers different trade-offs between ease of development, customization flexibility, and deployment complexity.
Modern baseball dashboards leverage advanced features like real-time data updates from Statcast, personalized filtering based on user roles, responsive designs that work on tablets and phones, and embedded predictive models that generate forecasts on demand. Organizations increasingly deploy dashboards internally for decision support and externally for fan engagement, with teams like the Cleveland Guardians and Houston Astros building comprehensive analytics platforms that integrate scouting, player development, and strategic planning functions into unified interfaces.
Benefits of Interactive Baseball Dashboards
- Self-Service Analytics: Dashboards democratize data access, enabling coaches, scouts, and executives to explore data independently without waiting for analyst reports. Users can answer their own questions, test hypotheses, and discover insights through interactive exploration.
- Real-Time Decision Support: Interactive tools can integrate live game data, updating during games to provide immediate strategic insights. Coaches can analyze matchup probabilities, review defensive positioning recommendations, and evaluate bullpen options based on current game context.
- Enhanced Communication: Visualizations communicate complex analytical findings more effectively than spreadsheets or text reports. Interactive elements allow presenters to demonstrate insights dynamically during meetings, adjusting filters and views based on discussion topics.
- Consistency and Reproducibility: Automated dashboards ensure everyone uses the same data sources and calculation methods, eliminating inconsistencies from manual reporting. Updates to underlying data automatically refresh all dashboard views.
- Rapid Iteration: Modern frameworks like Streamlit allow analysts to build functional prototypes in hours rather than weeks, enabling quick feedback cycles and continuous improvement of analytical tools.
- Scalability: Dashboard platforms can serve hundreds of concurrent users across an organization, from major league staff to minor league coaches, providing appropriate data access and views for each role.
Dashboard Design Principles
Effective Dashboard Design: Clear Purpose → Appropriate Metrics → Intuitive Layout → Thoughtful Interactivity → Performance Optimization
The best baseball dashboards start with a clear understanding of user needs and decision contexts. Design dashboards around specific questions users need to answer rather than simply displaying available data. Prioritize the most important information prominently while maintaining access to supporting details through drill-down capabilities. Use consistent color schemes, familiar baseball terminology, and logical groupings to reduce cognitive load. Test dashboards with actual users and iterate based on their feedback to ensure tools truly enhance their workflows.
Key Baseball Metrics to Include
| Dashboard Type | Essential Metrics | Interactive Features |
|---|---|---|
| Player Evaluation | WAR, wRC+, wOBA, FIP, DRS, Sprint Speed, Exit Velocity | Date range filters, comparison tool, percentile rankings |
| Team Performance | Run Differential, Pythag W-L, Team wRC+, Team ERA-, Defensive Runs Saved | Season selection, home/away splits, monthly trends |
| Matchup Analysis | Historical performance vs pitcher/team, platoon splits, pitch type breakdowns | Player selectors, situation filters (count, runners on) |
| Pitching Arsenal | Pitch velocity, spin rate, movement profiles, usage rates, whiff rates by pitch type | Pitch type filters, strike zone heat maps, comparison tool |
| Batted Ball Profile | Launch angle, exit velocity, barrel rate, hard-hit rate, spray charts | Pitch type filters, count filters, outcome highlighting |
| Defensive Positioning | OAA, sprint speed, catch probability, positioning maps | Batter filters, situation filters, field zone analysis |
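Several metrics in the table are simple derived quantities that a dashboard can compute on the fly. As one illustration, the sketch below computes a Pythagorean win-loss record from runs scored and allowed. It assumes the classic exponent of 2 (other exponents are in common use), and the function names and run totals are hypothetical, chosen only for the example.

```python
def pythagorean_win_pct(runs_scored: float, runs_allowed: float,
                        exponent: float = 2.0) -> float:
    """Expected winning percentage from runs scored and runs allowed.

    Assumes the classic exponent of 2; pass a different exponent to
    experiment with other variants of the formula.
    """
    if runs_scored < 0 or runs_allowed < 0:
        raise ValueError("run totals must be non-negative")
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    if scored + allowed == 0:
        return 0.5  # no runs either way: treat as an even team
    return scored / (scored + allowed)


def pythagorean_record(runs_scored: float, runs_allowed: float,
                       games: int = 162, exponent: float = 2.0) -> tuple:
    """Round the expected winning percentage to a (wins, losses) record."""
    wins = round(pythagorean_win_pct(runs_scored, runs_allowed, exponent) * games)
    return wins, games - wins


# Hypothetical team: 800 runs scored, 700 allowed over 162 games
wins, losses = pythagorean_record(800, 700)
print(f"Pythag W-L: {wins}-{losses}")
```

Showing this value next to the actual record lets users see whether a team is out- or under-performing its run differential.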
Python Implementation: Plotly Interactive Charts
import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots
from pybaseball import statcast_batter, playerid_lookup, batting_stats
class BaseballPlotlyDashboard:
    """
    Create interactive baseball visualizations using Plotly.
    """
    def __init__(self):
        """Initialize dashboard components."""
        self.data = None
    def load_season_data(self, year=2023):
        """
        Load season-level batting statistics.
        Parameters:
            year: Season to analyze
        Returns:
            DataFrame with batting stats
        """
        print(f"Loading {year} season data...")
        self.data = batting_stats(year)
        # Filter to qualified batters (502 PA minimum)
        self.data = self.data[self.data['PA'] >= 502].copy()
        return self.data
    def create_interactive_scatter(self, x_metric='BB%', y_metric='K%',
                                   size_metric='HR', color_metric='wOBA'):
        """
        Create interactive scatter plot with hover information.
        Parameters:
            x_metric: Metric for x-axis
            y_metric: Metric for y-axis
            size_metric: Metric for point size
            color_metric: Metric for color scale
        Returns:
            Plotly figure object
        """
        fig = px.scatter(
            self.data,
            x=x_metric,
            y=y_metric,
            size=size_metric,
            color=color_metric,
            hover_name='Name',
            hover_data={
                'Team': True,
                'AVG': ':.3f',
                'HR': True,
                'wOBA': ':.3f',
                'WAR': ':.1f',
                x_metric: ':.1f',
                y_metric: ':.1f'
            },
            title=f'{y_metric} vs {x_metric} (2023 Qualified Batters)',
            labels={
                x_metric: f'{x_metric}',
                y_metric: f'{y_metric}',
                color_metric: f'{color_metric}'
            },
            color_continuous_scale='Viridis'
        )
        # Customize layout
        fig.update_layout(
            font=dict(size=12),
            hovermode='closest',
            width=1000,
            height=700,
            xaxis=dict(gridcolor='lightgray'),
            yaxis=dict(gridcolor='lightgray'),
            plot_bgcolor='white'
        )
        return fig
    def create_player_comparison_radar(self, players_list):
        """
        Create radar chart comparing multiple players.
        Parameters:
            players_list: List of player names to compare
        Returns:
            Plotly figure object
        """
        # Select metrics for comparison (normalized to 0-100 scale)
        metrics = ['AVG', 'OBP', 'SLG', 'wRC+', 'BB%', 'K%']
        fig = go.Figure()
        for player_name in players_list:
            player_data = self.data[self.data['Name'] == player_name]
            if len(player_data) == 0:
                print(f"Player {player_name} not found")
                continue
            # Scale each metric to 0-100 relative to the league maximum
            # (simple max-scaling, not a true percentile rank)
            values = []
            for metric in metrics:
                if metric in ['K%']:  # Lower is better
                    scaled = 100 - (player_data[metric].values[0] /
                                    self.data[metric].max() * 100)
                else:  # Higher is better
                    scaled = (player_data[metric].values[0] /
                              self.data[metric].max() * 100)
                values.append(scaled)
            # Close the radar chart
            values.append(values[0])
            metrics_closed = metrics + [metrics[0]]
            fig.add_trace(go.Scatterpolar(
                r=values,
                theta=metrics_closed,
                fill='toself',
                name=player_name
            ))
        fig.update_layout(
            polar=dict(
                radialaxis=dict(
                    visible=True,
                    range=[0, 100]
                )
            ),
            showlegend=True,
            title='Player Comparison Radar Chart',
            width=800,
            height=600
        )
        return fig
    def create_interactive_leaderboard(self, metric='WAR', top_n=20):
        """
        Create interactive bar chart leaderboard.
        Parameters:
            metric: Metric to rank players by
            top_n: Number of top players to display
        Returns:
            Plotly figure object
        """
        # Sort and select top players
        top_players = self.data.nlargest(top_n, metric)
        # Create horizontal bar chart
        fig = go.Figure(go.Bar(
            x=top_players[metric],
            y=top_players['Name'],
            orientation='h',
            text=top_players[metric].round(1),
            textposition='auto',
            marker=dict(
                color=top_players[metric],
                colorscale='Blues',
                showscale=True,
                colorbar=dict(title=metric)
            ),
            # Plotly hover labels use <br> for line breaks; <extra></extra>
            # hides the secondary trace-name box
            hovertemplate='%{y}<br>' +
                          f'{metric}: %{{x:.2f}}<br>' +
                          'Team: %{customdata[0]}<br>' +
                          '<extra></extra>',
            customdata=top_players[['Team']].values
        ))
        fig.update_layout(
            title=f'Top {top_n} Players by {metric} (2023)',
            xaxis_title=metric,
            yaxis_title='',
            yaxis=dict(autorange='reversed'),
            height=max(500, top_n * 25),
            width=900,
            plot_bgcolor='white',
            xaxis=dict(gridcolor='lightgray')
        )
        return fig
    def create_distribution_histogram(self, metric='wOBA', bins=30):
        """
        Create interactive histogram showing metric distribution.
        Parameters:
            metric: Metric to visualize
            bins: Number of histogram bins
        Returns:
            Plotly figure object
        """
        fig = px.histogram(
            self.data,
            x=metric,
            nbins=bins,
            title=f'Distribution of {metric} (2023 Qualified Batters)',
            labels={metric: metric, 'count': 'Number of Players'},
            marginal='box',  # Add box plot on top
            hover_data={metric: ':.3f'}
        )
        # Add mean and median lines
        mean_val = self.data[metric].mean()
        median_val = self.data[metric].median()
        fig.add_vline(
            x=mean_val,
            line_dash='dash',
            line_color='red',
            annotation_text=f'Mean: {mean_val:.3f}'
        )
        fig.add_vline(
            x=median_val,
            line_dash='dash',
            line_color='green',
            annotation_text=f'Median: {median_val:.3f}'
        )
        fig.update_layout(
            width=900,
            height=600,
            plot_bgcolor='white',
            xaxis=dict(gridcolor='lightgray'),
            yaxis=dict(gridcolor='lightgray')
        )
        return fig
# Example usage
dashboard = BaseballPlotlyDashboard()
data = dashboard.load_season_data(2023)
# Create interactive scatter plot
scatter_fig = dashboard.create_interactive_scatter(
x_metric='BB%',
y_metric='K%',
size_metric='HR',
color_metric='wOBA'
)
scatter_fig.write_html('scatter_plot.html')
scatter_fig.show()
# Compare players
comparison_fig = dashboard.create_player_comparison_radar([
'Ronald Acuña Jr.',  # FanGraphs spells the name with "ñ"; the lookup is an exact string match
'Mookie Betts',
'Freddie Freeman'
])
comparison_fig.write_html('player_comparison.html')
comparison_fig.show()
# Create leaderboard
leaderboard_fig = dashboard.create_interactive_leaderboard(
metric='WAR',
top_n=20
)
leaderboard_fig.write_html('war_leaderboard.html')
leaderboard_fig.show()
# Distribution analysis
dist_fig = dashboard.create_distribution_histogram(
metric='wOBA',
bins=30
)
dist_fig.write_html('woba_distribution.html')
dist_fig.show()
print("\nInteractive Plotly dashboards created successfully!")
print("HTML files generated: scatter_plot.html, player_comparison.html, war_leaderboard.html, woba_distribution.html")
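The radar chart in the class above scales each metric by dividing by the league maximum, which is not the same thing as a percentile rank. If true percentiles are wanted, one option is to rank each player's value against the full population. The sketch below uses only the standard library; the function name `percentile_rank` and the sample values are hypothetical.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(value, population, lower_is_better=False):
    """Percent of the population (0-100) that `value` matches or beats.

    For "higher is better" metrics this counts values <= `value`;
    for "lower is better" metrics (such as K%) it counts values >= `value`.
    """
    ordered = sorted(population)
    if not ordered:
        raise ValueError("population must not be empty")
    if lower_is_better:
        matched_or_beaten = len(ordered) - bisect_left(ordered, value)
    else:
        matched_or_beaten = bisect_right(ordered, value)
    return 100.0 * matched_or_beaten / len(ordered)


# Hypothetical league values, for illustration only
wrc_plus = [90, 110, 130, 150]
k_rates = [10.0, 20.0, 30.0, 40.0]

print(percentile_rank(130, wrc_plus))                        # higher is better
print(percentile_rank(10.0, k_rates, lower_is_better=True))  # lower is better
```

Unlike max-scaling, percentile ranks are not distorted by a single outlier, and metrics that can be negative (such as WAR) still map cleanly onto the 0-100 radial axis.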
Python Implementation: Streamlit Dashboard
import streamlit as st
import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
from pybaseball import batting_stats, pitching_stats, statcast_batter
from pybaseball import playerid_lookup
# Configure Streamlit page
st.set_page_config(
page_title='Baseball Analytics Dashboard',
page_icon='⚾',
layout='wide',
initial_sidebar_state='expanded'
)
# Custom CSS for better styling
st.markdown("""
""", unsafe_allow_html=True)
# Title and description
st.title('⚾ Baseball Analytics Dashboard')
st.markdown('Explore comprehensive baseball statistics and player performance metrics')
# Sidebar filters
st.sidebar.header('Dashboard Controls')
# Year selection
year = st.sidebar.selectbox(
'Select Season',
options=[2023, 2022, 2021, 2020, 2019],
index=0
)
# Dashboard section selection
dashboard_section = st.sidebar.radio(
'Dashboard Section',
['Overview', 'Player Comparison', 'Team Analysis', 'Leaderboards']
)
# Cache data loading for better performance
@st.cache_data
def load_batting_data(season):
    """Load batting statistics for a season."""
    # qual=100 requests hitters with 100+ PA; without it, batting_stats()
    # falls back to the FanGraphs default of qualified hitters only
    data = batting_stats(season, qual=100)
    return data[data['PA'] >= 100].copy()  # Filter to players with 100+ PA
@st.cache_data
def load_pitching_data(season):
    """Load pitching statistics for a season."""
    # qual=20 requests pitchers with 20+ IP rather than qualified pitchers only
    data = pitching_stats(season, qual=20)
    return data[data['IP'] >= 20].copy()  # Filter to pitchers with 20+ IP
# Load data
with st.spinner(f'Loading {year} season data...'):
    batting_data = load_batting_data(year)
    pitching_data = load_pitching_data(year)
# OVERVIEW SECTION
if dashboard_section == 'Overview':
    st.header(f'{year} Season Overview')
    # Key metrics
    col1, col2, col3, col4 = st.columns(4)
    with col1:
        total_players = len(batting_data)
        st.metric('Total Players', total_players)
    with col2:
        avg_woba = batting_data['wOBA'].mean()
        st.metric('Average wOBA', f'{avg_woba:.3f}')
    with col3:
        total_hrs = batting_data['HR'].sum()
        st.metric('Total Home Runs', f'{int(total_hrs):,}')
    with col4:
        avg_war = batting_data['WAR'].mean()
        st.metric('Average WAR', f'{avg_war:.2f}')
    # Interactive scatter plot
    st.subheader('Batting Metrics Scatter Plot')
    col1, col2 = st.columns(2)
    with col1:
        x_axis = st.selectbox(
            'X-Axis Metric',
            ['BB%', 'K%', 'AVG', 'ISO', 'BABIP', 'wRC+'],
            index=0
        )
    with col2:
        y_axis = st.selectbox(
            'Y-Axis Metric',
            ['wOBA', 'SLG', 'OPS', 'WAR', 'wRC+'],
            index=0
        )
    # Create scatter plot
    fig = px.scatter(
        batting_data,
        x=x_axis,
        y=y_axis,
        size='HR',
        color='WAR',
        hover_name='Name',
        hover_data={'Team': True, 'AVG': ':.3f', 'HR': True, 'WAR': ':.1f'},
        title=f'{y_axis} vs {x_axis}',
        color_continuous_scale='Viridis',
        height=600
    )
    fig.update_layout(
        plot_bgcolor='white',
        xaxis=dict(gridcolor='lightgray'),
        yaxis=dict(gridcolor='lightgray')
    )
    st.plotly_chart(fig, use_container_width=True)
    # Top performers
    st.subheader('Top 10 Performers by WAR')
    top_war = batting_data.nlargest(10, 'WAR')[['Name', 'Team', 'AVG', 'HR', 'RBI', 'WAR', 'wOBA', 'wRC+']]
    st.dataframe(top_war, use_container_width=True)
# PLAYER COMPARISON SECTION
elif dashboard_section == 'Player Comparison':
    st.header('Player Comparison Tool')
    # Player selection
    all_players = sorted(batting_data['Name'].unique())
    col1, col2 = st.columns(2)
    with col1:
        player1 = st.selectbox('Select Player 1', all_players, index=0)
    with col2:
        player2 = st.selectbox('Select Player 2', all_players, index=min(1, len(all_players)-1))
    # Get player data
    p1_data = batting_data[batting_data['Name'] == player1].iloc[0]
    p2_data = batting_data[batting_data['Name'] == player2].iloc[0]
    # Display comparison metrics
    st.subheader('Statistical Comparison')
    metrics_to_compare = ['AVG', 'OBP', 'SLG', 'wOBA', 'HR', 'RBI', 'SB', 'WAR', 'wRC+']
    comparison_df = pd.DataFrame({
        'Metric': metrics_to_compare,
        player1: [p1_data[m] for m in metrics_to_compare],
        player2: [p2_data[m] for m in metrics_to_compare]
    })
    st.dataframe(comparison_df, use_container_width=True)
    # Radar chart comparison
    st.subheader('Performance Radar Chart')
    # Normalize metrics for radar chart
    radar_metrics = ['AVG', 'OBP', 'SLG', 'wRC+', 'WAR']
    p1_values = []
    p2_values = []
    for metric in radar_metrics:
        max_val = batting_data[metric].max()
        p1_values.append((p1_data[metric] / max_val) * 100)
        p2_values.append((p2_data[metric] / max_val) * 100)
    # Close the radar chart
    p1_values.append(p1_values[0])
    p2_values.append(p2_values[0])
    radar_metrics_closed = radar_metrics + [radar_metrics[0]]
    fig = go.Figure()
    fig.add_trace(go.Scatterpolar(
        r=p1_values,
        theta=radar_metrics_closed,
        fill='toself',
        name=player1
    ))
    fig.add_trace(go.Scatterpolar(
        r=p2_values,
        theta=radar_metrics_closed,
        fill='toself',
        name=player2
    ))
    fig.update_layout(
        polar=dict(radialaxis=dict(visible=True, range=[0, 100])),
        showlegend=True,
        height=500
    )
    st.plotly_chart(fig, use_container_width=True)
# TEAM ANALYSIS SECTION
elif dashboard_section == 'Team Analysis':
    st.header('Team Performance Analysis')
    # Team selection
    teams = sorted(batting_data['Team'].unique())
    selected_team = st.selectbox('Select Team', teams)
    # Filter data for selected team
    team_data = batting_data[batting_data['Team'] == selected_team]
    # Team metrics
    col1, col2, col3, col4 = st.columns(4)
    with col1:
        team_avg = team_data['AVG'].mean()
        st.metric('Team AVG', f'{team_avg:.3f}')
    with col2:
        team_hrs = team_data['HR'].sum()
        st.metric('Team HRs', int(team_hrs))
    with col3:
        team_war = team_data['WAR'].sum()
        st.metric('Team WAR', f'{team_war:.1f}')
    with col4:
        team_wrc = team_data['wRC+'].mean()
        st.metric('Team wRC+', f'{team_wrc:.0f}')
    # Team roster table
    st.subheader(f'{selected_team} Roster Statistics')
    team_stats = team_data[['Name', 'AVG', 'HR', 'RBI', 'SB', 'WAR', 'wOBA', 'wRC+']].sort_values('WAR', ascending=False)
    st.dataframe(team_stats, use_container_width=True)
    # Distribution of team stats
    st.subheader('Team WAR Distribution')
    fig = px.bar(
        team_stats,
        x='Name',
        y='WAR',
        title=f'{selected_team} WAR by Player',
        color='WAR',
        color_continuous_scale='Blues'
    )
    fig.update_layout(
        xaxis_title='',
        xaxis_tickangle=-45,
        height=500,
        plot_bgcolor='white'
    )
    st.plotly_chart(fig, use_container_width=True)
# LEADERBOARDS SECTION
elif dashboard_section == 'Leaderboards':
    st.header('Statistical Leaderboards')
    # Metric selection
    leaderboard_metric = st.selectbox(
        'Select Metric',
        ['WAR', 'HR', 'RBI', 'AVG', 'OBP', 'SLG', 'wOBA', 'wRC+', 'SB'],
        index=0
    )
    # Number of players
    top_n = st.slider('Number of Players', 10, 50, 20, 5)
    # Get top players
    top_players = batting_data.nlargest(top_n, leaderboard_metric)
    # Create bar chart
    fig = go.Figure(go.Bar(
        x=top_players[leaderboard_metric],
        y=top_players['Name'],
        orientation='h',
        text=top_players[leaderboard_metric].round(2),
        textposition='auto',
        marker=dict(
            color=top_players[leaderboard_metric],
            colorscale='Viridis',
            showscale=True
        )
    ))
    fig.update_layout(
        title=f'Top {top_n} Players by {leaderboard_metric}',
        xaxis_title=leaderboard_metric,
        yaxis=dict(autorange='reversed'),
        height=max(500, top_n * 20),
        plot_bgcolor='white',
        xaxis=dict(gridcolor='lightgray')
    )
    st.plotly_chart(fig, use_container_width=True)
    # Detailed table
    st.subheader(f'Top {top_n} Players - Detailed Statistics')
    detailed_stats = top_players[['Name', 'Team', 'G', 'PA', 'AVG', 'HR', 'RBI', 'SB', 'WAR', 'wOBA', 'wRC+']]
    st.dataframe(detailed_stats, use_container_width=True)
# Footer
st.sidebar.markdown('---')
st.sidebar.markdown('### About')
st.sidebar.info(
'This dashboard provides interactive exploration of baseball statistics. '
'Data sourced from FanGraphs and Baseball Savant via PyBaseball.'
)
# Instructions to run:
# Save this file as baseball_dashboard.py
# Run with: streamlit run baseball_dashboard.py
R Implementation: Shiny Dashboard
# Baseball Analytics Shiny Dashboard
# Save as app.R and run with: shiny::runApp()
library(shiny)
library(shinydashboard)
library(tidyverse)
library(plotly)
library(DT)
library(baseballr)
# Load data function (no explicit caching; each call re-queries FanGraphs)
load_batting_data <- function(year) {
data <- fg_batter_leaders(startseason = year, endseason = year)
data %>% filter(PA >= 100)
}
# UI Definition
ui <- dashboardPage(
skin = "blue",
# Header
dashboardHeader(title = "Baseball Analytics"),
# Sidebar
dashboardSidebar(
sidebarMenu(
menuItem("Overview", tabName = "overview", icon = icon("dashboard")),
menuItem("Player Comparison", tabName = "comparison", icon = icon("users")),
menuItem("Team Analysis", tabName = "teams", icon = icon("baseball-ball")),
menuItem("Leaderboards", tabName = "leaders", icon = icon("trophy"))
),
hr(),
# Controls
selectInput("year", "Season:",
choices = 2019:2023,
selected = 2023),
hr(),
helpText("Interactive baseball analytics dashboard."),
helpText("Data from FanGraphs via baseballr package.")
),
# Main panel
dashboardBody(
tabItems(
# Overview tab
tabItem(
tabName = "overview",
h2("Season Overview"),
# Value boxes
fluidRow(
valueBoxOutput("total_players", width = 3),
valueBoxOutput("avg_woba", width = 3),
valueBoxOutput("total_hrs", width = 3),
valueBoxOutput("avg_war", width = 3)
),
# Scatter plot
fluidRow(
box(
title = "Interactive Scatter Plot",
status = "primary",
solidHeader = TRUE,
width = 12,
fluidRow(
column(4,
selectInput("scatter_x", "X-Axis:",
choices = c("BB%" = "BB.", "K%" = "K.", "AVG", "ISO", "wRC+"),
selected = "BB.")
),
column(4,
selectInput("scatter_y", "Y-Axis:",
choices = c("wOBA", "SLG", "OPS", "WAR", "wRC+"),
selected = "wOBA")
),
column(4,
selectInput("scatter_color", "Color By:",
choices = c("WAR", "HR", "wRC+", "SB"),
selected = "WAR")
)
),
plotlyOutput("scatter_plot", height = "500px")
)
),
# Top performers table
fluidRow(
box(
title = "Top 10 Performers by WAR",
status = "info",
solidHeader = TRUE,
width = 12,
DTOutput("top_performers")
)
)
),
# Player Comparison tab
tabItem(
tabName = "comparison",
h2("Player Comparison"),
fluidRow(
column(6,
selectInput("player1", "Player 1:",
choices = NULL)
),
column(6,
selectInput("player2", "Player 2:",
choices = NULL)
)
),
fluidRow(
box(
title = "Statistical Comparison",
status = "primary",
solidHeader = TRUE,
width = 12,
DTOutput("comparison_table")
)
),
fluidRow(
box(
title = "Performance Radar Chart",
status = "info",
solidHeader = TRUE,
width = 12,
plotlyOutput("radar_chart", height = "500px")
)
)
),
# Team Analysis tab
tabItem(
tabName = "teams",
h2("Team Performance Analysis"),
fluidRow(
column(6,
selectInput("team", "Select Team:",
choices = NULL)
)
),
fluidRow(
valueBoxOutput("team_avg", width = 3),
valueBoxOutput("team_hrs", width = 3),
valueBoxOutput("team_war", width = 3),
valueBoxOutput("team_wrc", width = 3)
),
fluidRow(
box(
title = "Team Roster Statistics",
status = "primary",
solidHeader = TRUE,
width = 12,
DTOutput("team_roster")
)
),
fluidRow(
box(
title = "Team WAR Distribution",
status = "info",
solidHeader = TRUE,
width = 12,
plotlyOutput("team_war_plot", height = "500px")
)
)
),
# Leaderboards tab
tabItem(
tabName = "leaders",
h2("Statistical Leaderboards"),
fluidRow(
column(6,
selectInput("leader_metric", "Select Metric:",
choices = c("WAR", "HR", "RBI", "AVG", "OBP", "SLG", "wOBA", "wRC+", "SB"),
selected = "WAR")
),
column(6,
sliderInput("top_n", "Number of Players:",
min = 10, max = 50, value = 20, step = 5)
)
),
fluidRow(
box(
title = "Leaderboard",
status = "primary",
solidHeader = TRUE,
width = 12,
plotlyOutput("leaderboard_plot", height = "600px")
)
),
fluidRow(
box(
title = "Detailed Statistics",
status = "info",
solidHeader = TRUE,
width = 12,
DTOutput("leaderboard_table")
)
)
)
)
)
)
# Server logic
server <- function(input, output, session) {
# Reactive data loading
batting_data <- reactive({
req(input$year)
withProgress(message = paste('Loading', input$year, 'data...'), {
load_batting_data(input$year)
})
})
# Update player choices when data changes
observe({
data <- batting_data()
players <- sort(unique(data$Name))
updateSelectInput(session, "player1", choices = players, selected = players[1])
updateSelectInput(session, "player2", choices = players, selected = players[2])
teams <- sort(unique(data$Team))
updateSelectInput(session, "team", choices = teams, selected = teams[1])
})
# Overview - Value boxes
output$total_players <- renderValueBox({
valueBox(
nrow(batting_data()),
"Total Players",
icon = icon("users"),
color = "blue"
)
})
output$avg_woba <- renderValueBox({
avg <- mean(batting_data()$wOBA, na.rm = TRUE)
valueBox(
sprintf("%.3f", avg),
"Average wOBA",
icon = icon("chart-line"),
color = "green"
)
})
output$total_hrs <- renderValueBox({
total <- sum(batting_data()$HR, na.rm = TRUE)
valueBox(
format(total, big.mark = ","),
"Total Home Runs",
icon = icon("baseball-ball"),
color = "yellow"
)
})
output$avg_war <- renderValueBox({
avg <- mean(batting_data()$WAR, na.rm = TRUE)
valueBox(
sprintf("%.2f", avg),
"Average WAR",
icon = icon("star"),
color = "red"
)
})
# Scatter plot
output$scatter_plot <- renderPlotly({
data <- batting_data()
plot_ly(
data = data,
x = ~get(input$scatter_x),
y = ~get(input$scatter_y),
type = 'scatter',
mode = 'markers',
marker = list(
size = ~HR,
sizemode = 'diameter',
sizeref = 2,
color = ~get(input$scatter_color),
colorscale = 'Viridis',
showscale = TRUE,
colorbar = list(title = input$scatter_color)
),
text = ~paste(
"Name:", Name,
"<br>Team:", Team,
"<br>AVG:", sprintf("%.3f", AVG),
"<br>HR:", HR,
"<br>WAR:", sprintf("%.1f", WAR)
),
hoverinfo = 'text'
) %>%
layout(
xaxis = list(title = input$scatter_x, gridcolor = 'lightgray'),
yaxis = list(title = input$scatter_y, gridcolor = 'lightgray'),
plot_bgcolor = 'white'
)
})
# Top performers table
output$top_performers <- renderDT({
data <- batting_data() %>%
arrange(desc(WAR)) %>%
head(10) %>%
select(Name, Team, AVG, HR, RBI, WAR, wOBA, `wRC+`)
datatable(
data,
options = list(pageLength = 10, dom = 't'),
rownames = FALSE
) %>%
formatRound(columns = c('AVG', 'wOBA'), digits = 3) %>%
formatRound(columns = c('WAR'), digits = 1)
})
# Player comparison table
output$comparison_table <- renderDT({
req(input$player1, input$player2)
data <- batting_data()
p1 <- data %>% filter(Name == input$player1)
p2 <- data %>% filter(Name == input$player2)
if (nrow(p1) == 0 || nrow(p2) == 0) return(NULL)
metrics <- c('AVG', 'OBP', 'SLG', 'wOBA', 'HR', 'RBI', 'SB', 'WAR', 'wRC+')
comparison <- tibble(
Metric = metrics,
!!input$player1 := sapply(metrics, function(m) p1[[m]]),
!!input$player2 := sapply(metrics, function(m) p2[[m]])
)
datatable(
comparison,
options = list(pageLength = 10, dom = 't'),
rownames = FALSE
)
})
# Radar chart
output$radar_chart <- renderPlotly({
req(input$player1, input$player2)
data <- batting_data()
p1 <- data %>% filter(Name == input$player1)
p2 <- data %>% filter(Name == input$player2)
if (nrow(p1) == 0 || nrow(p2) == 0) return(NULL)
metrics <- c('AVG', 'OBP', 'SLG', 'wRC+', 'WAR')
# Scale each metric to 0-100 relative to the league maximum (not a true percentile)
p1_values <- sapply(metrics, function(m) {
(p1[[m]] / max(data[[m]], na.rm = TRUE)) * 100
})
p2_values <- sapply(metrics, function(m) {
(p2[[m]] / max(data[[m]], na.rm = TRUE)) * 100
})
plot_ly(
type = 'scatterpolar',
fill = 'toself'
) %>%
add_trace(
r = c(p1_values, p1_values[1]),
theta = c(metrics, metrics[1]),
name = input$player1
) %>%
add_trace(
r = c(p2_values, p2_values[1]),
theta = c(metrics, metrics[1]),
name = input$player2
) %>%
layout(
polar = list(
radialaxis = list(
visible = TRUE,
range = c(0, 100)
)
)
)
})
# Team metrics
output$team_avg <- renderValueBox({
req(input$team)
team_data <- batting_data() %>% filter(Team == input$team)
avg <- mean(team_data$AVG, na.rm = TRUE)
valueBox(
sprintf("%.3f", avg),
"Team AVG",
icon = icon("baseball-ball"),
color = "blue"
)
})
output$team_hrs <- renderValueBox({
req(input$team)
team_data <- batting_data() %>% filter(Team == input$team)
total <- sum(team_data$HR, na.rm = TRUE)
valueBox(
total,
"Team HRs",
icon = icon("home"),
color = "yellow"
)
})
output$team_war <- renderValueBox({
req(input$team)
team_data <- batting_data() %>% filter(Team == input$team)
total <- sum(team_data$WAR, na.rm = TRUE)
valueBox(
sprintf("%.1f", total),
"Team WAR",
icon = icon("star"),
color = "green"
)
})
output$team_wrc <- renderValueBox({
req(input$team)
team_data <- batting_data() %>% filter(Team == input$team)
avg <- mean(team_data$`wRC+`, na.rm = TRUE)
valueBox(
sprintf("%.0f", avg),
"Team wRC+",
icon = icon("chart-line"),
color = "red"
)
})
# Team roster
output$team_roster <- renderDT({
req(input$team)
team_data <- batting_data() %>%
filter(Team == input$team) %>%
arrange(desc(WAR)) %>%
select(Name, AVG, HR, RBI, SB, WAR, wOBA, `wRC+`)
datatable(
team_data,
options = list(pageLength = 15),
rownames = FALSE
) %>%
formatRound(columns = c('AVG', 'wOBA'), digits = 3) %>%
formatRound(columns = c('WAR'), digits = 1)
})
# Team WAR plot
output$team_war_plot <- renderPlotly({
req(input$team)
team_data <- batting_data() %>%
filter(Team == input$team) %>%
arrange(desc(WAR))
plot_ly(
data = team_data,
x = ~Name,
y = ~WAR,
type = 'bar',
marker = list(
color = ~WAR,
colorscale = 'Blues',
showscale = TRUE
),
text = ~sprintf("%.1f", WAR),
textposition = 'auto'
) %>%
layout(
xaxis = list(title = '', tickangle = -45),
yaxis = list(title = 'WAR'),
plot_bgcolor = 'white'
)
})
# Leaderboard plot
output$leaderboard_plot <- renderPlotly({
req(input$leader_metric, input$top_n)
top_data <- batting_data() %>%
arrange(desc(!!sym(input$leader_metric))) %>%
head(input$top_n)
plot_ly(
data = top_data,
y = ~Name,
x = ~get(input$leader_metric),
type = 'bar',
orientation = 'h',
marker = list(
color = ~get(input$leader_metric),
colorscale = 'Viridis',
showscale = TRUE
),
text = ~sprintf("%.2f", get(input$leader_metric)),
textposition = 'auto'
) %>%
layout(
yaxis = list(
title = '',
autorange = 'reversed',
categoryorder = "total ascending"
),
xaxis = list(title = input$leader_metric),
plot_bgcolor = 'white'
)
})
# Leaderboard table
output$leaderboard_table <- renderDT({
req(input$leader_metric, input$top_n)
top_data <- batting_data() %>%
arrange(desc(!!sym(input$leader_metric))) %>%
head(input$top_n) %>%
select(Name, Team, G, PA, AVG, HR, RBI, SB, WAR, wOBA, `wRC+`)
datatable(
top_data,
options = list(pageLength = 20),
rownames = FALSE
) %>%
formatRound(columns = c('AVG', 'wOBA'), digits = 3) %>%
formatRound(columns = c('WAR'), digits = 1)
})
}
# Run the application
shinyApp(ui = ui, server = server)
# To run this dashboard:
# 1. Save as app.R
# 2. Ensure required packages are installed
# 3. Run: shiny::runApp()
# 4. Dashboard will open in your browser
Filtering and Drill-Down Capabilities
Effective baseball dashboards implement hierarchical filtering that lets users progressively narrow their analysis. Start with broad filters like season, league, and team, then offer more specific filters for player position, handedness, and home/away splits. Cascade filters so that a selection in one control limits the options in dependent controls: for example, selecting a team should update the player dropdown to show only that team's roster. Provide a reset button that clears all filters and returns to the default view.
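The cascading behavior described above can be sketched independently of any dashboard framework. The example below is a minimal illustration, not a prescribed design: the `Player` record, the helper names, and the roster entries are all hypothetical. A Streamlit or Shiny app would feed the returned option lists into its own dropdown widgets.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Player:
    name: str
    team: str
    position: str


def team_options(players: Sequence[Player]) -> List[str]:
    """Top-level filter: every team present in the data."""
    return sorted({p.team for p in players})


def player_options(players: Sequence[Player],
                   team: Optional[str] = None,
                   position: Optional[str] = None) -> List[str]:
    """Dependent filter: only players matching the parent selections."""
    return sorted(
        p.name for p in players
        if (team is None or p.team == team)
        and (position is None or p.position == position)
    )


def reconcile_selection(selected: Optional[str],
                        options: Sequence[str]) -> Optional[str]:
    """Keep the current selection if still valid, else fall back to the first option."""
    if selected in options:
        return selected
    return options[0] if options else None


def reset_filters() -> dict:
    """Default view: no filters applied."""
    return {"team": None, "position": None, "player": None}


# Hypothetical roster, for illustration only
roster = [
    Player("Avery", "ATL", "OF"),
    Player("Blake", "ATL", "1B"),
    Player("Casey", "LAD", "OF"),
]

filters = reset_filters()
filters["player"] = "Casey"
filters["team"] = "ATL"  # user picks a team
options = player_options(roster, team=filters["team"])
filters["player"] = reconcile_selection(filters["player"], options)
print(options, filters["player"])
```

Reconciling the dependent selection after every parent change avoids the common bug where a stale player from another team stays selected.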
Drill-down functionality enables users to transition from summary views to detailed analysis seamlessly. A team performance dashboard might allow clicking on a team name to navigate to a detailed team page, which in turn enables clicking individual players to see their complete statistical profile. Implement breadcrumb navigation so users always know their current context and can easily return to previous levels. Consider implementing modal windows or expandable panels for quick detail views without losing the current page context.
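One way to model drill-down with breadcrumbs is as a simple navigation trail that grows as the user clicks deeper and is truncated when they jump back. The sketch below is an assumption about how such state could be kept; the `DrillDownState` class and the level names are hypothetical and not tied to any particular framework.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class DrillDownState:
    """Navigation trail of (level, key) pairs, starting at the league summary."""
    trail: List[Tuple[str, Optional[str]]] = field(
        default_factory=lambda: [("League", None)]
    )

    def drill_into(self, level: str, key: str) -> None:
        """Go one level deeper, e.g. from a team row to that team's page."""
        self.trail.append((level, key))

    def go_back_to(self, depth: int) -> None:
        """Jump back to a breadcrumb; depth 0 is the top-level view."""
        if not 0 <= depth < len(self.trail):
            raise IndexError("no breadcrumb at that depth")
        del self.trail[depth + 1:]

    def breadcrumb(self) -> str:
        """Render the trail as a breadcrumb string."""
        return " > ".join(
            level if key is None else f"{level}: {key}"
            for level, key in self.trail
        )


state = DrillDownState()
state.drill_into("Team", "Example Team")
state.drill_into("Player", "Example Player")
print(state.breadcrumb())
state.go_back_to(1)  # click the team breadcrumb
print(state.breadcrumb())
```

Keeping the trail as explicit state means each view can be rendered from the last entry alone, and the breadcrumb is always consistent with what is on screen.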
Performance Considerations
- Data Caching: Cache frequently accessed datasets to reduce database queries and API calls. Streamlit provides the @st.cache_data decorator, Shiny offers reactive caching (for example, bindCache()), and Plotly Dash apps can use Redis-backed caching (for example, through Flask-Caching) in production deployments.
- Lazy Loading: Load detailed data only when users request it rather than fetching everything upfront. Implement pagination for large tables and load charts only when users navigate to their tab or section.
- Aggregation: Pre-aggregate data at appropriate levels rather than aggregating on-demand. Store team-level statistics separately from player-level data, and calculate season totals in advance rather than summing individual games.
- Efficient Queries: Use database indexes on frequently filtered columns (player_id, team, date), write optimized SQL queries with proper WHERE clauses, and avoid SELECT * queries that retrieve unnecessary columns.
- Client-Side Processing: Leverage browser-side processing for filtering and sorting already-loaded data. Plotly charts handle zooming, panning, and hover tooltips in the browser without server round-trips, and DataTables handles client-side pagination efficiently.
- Progressive Enhancement: Display critical information immediately while loading additional features asynchronously. Show loading spinners for long operations and provide estimated completion times when possible.
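Several of the techniques above (indexing, pre-aggregation, pagination, and caching) can be combined in a small sketch. The example uses only the Python standard library, with an in-memory SQLite database standing in for a real data store and `functools.lru_cache` standing in for a framework cache such as Streamlit's @st.cache_data. The table names, column names, sample rows, and function names are all hypothetical.

```python
import sqlite3
from functools import lru_cache

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE game_batting (
        player_id INTEGER, team TEXT, game_date TEXT, ab INTEGER, hits INTEGER
    );
    -- Index the columns the dashboard filters on most often.
    CREATE INDEX idx_batting_player_date ON game_batting (player_id, game_date);
    """
)

# Hypothetical game-level rows, for illustration only.
conn.executemany(
    "INSERT INTO game_batting VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Team X", "2024-04-01", 4, 2),
        (1, "Team X", "2024-04-02", 5, 1),
        (2, "Team Y", "2024-04-01", 3, 0),
        (2, "Team Y", "2024-04-02", 4, 2),
    ],
)

# Pre-aggregate season totals once, instead of summing games on every request.
conn.execute(
    """
    CREATE TABLE season_batting AS
    SELECT player_id, team, SUM(ab) AS ab, SUM(hits) AS hits
    FROM game_batting
    GROUP BY player_id, team
    """
)


@lru_cache(maxsize=256)
def season_line(player_id):
    """Return (ab, hits, avg) from the pre-aggregated table; repeat calls hit the cache."""
    row = conn.execute(
        "SELECT ab, hits FROM season_batting WHERE player_id = ?",
        (player_id,),
    ).fetchone()
    if row is None:
        return None
    ab, hits = row
    avg = round(hits / ab, 3) if ab else 0.0
    return ab, hits, avg


def games_page(player_id, page=0, page_size=50):
    """Fetch one page of game rows, naming only the columns the table view needs."""
    return conn.execute(
        "SELECT game_date, ab, hits FROM game_batting "
        "WHERE player_id = ? ORDER BY game_date LIMIT ? OFFSET ?",
        (player_id, page_size, page * page_size),
    ).fetchall()
```

A cache like this returns stale results after the underlying data changes, so a refresh job would need to rebuild `season_batting` and call `season_line.cache_clear()`.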
Deployment Options
| Platform | Best For | Free Tier | Pros | Cons |
|---|---|---|---|---|
| Streamlit Cloud | Python Streamlit apps | Yes (public apps) | Zero configuration, Git integration, automatic deployment | Limited resources, requires public GitHub repo |
| Heroku | Any framework | No (free tier discontinued in November 2022) | Flexible, supports multiple languages, add-ons ecosystem | No free tier; low-cost Eco dynos sleep after inactivity |
| Shiny Server | R Shiny apps | Open source version | On-premise control, unlimited apps | Requires server management, no built-in authentication |
| shinyapps.io | R Shiny apps | Yes (5 apps, 25 active hours per month) | Easy deployment, no server management | Limited free tier, paid tiers become expensive at scale |
| AWS/GCP/Azure | Production apps | Limited trials | Enterprise scalability, full control, advanced features | Complex setup, requires cloud expertise, ongoing costs |
| Render | Modern web apps | Yes | Auto-deploy from Git, easy SSL, good performance | Free tier has cold starts |
Real-World Application
The Tampa Bay Rays developed an internal Shiny dashboard called "The Lab" that consolidates player evaluation, opponent analysis, and strategic planning tools. Coaches can compare pitcher arsenals against upcoming opponent batters, review defensive positioning recommendations based on recent batted ball data, and evaluate lineup construction options using projected run expectancy models. The dashboard pulls data from Statcast, TrackMan, video analysis systems, and internal scouting databases, providing a unified analytics platform accessible from dugouts, offices, and on the road.
FanGraphs built their public-facing analytics platform on a PHP backend combined with interactive JavaScript charts. Their player pages feature embedded Plotly visualizations showing pitch movement diagrams, batted ball spray charts, and rolling performance trends. Users can filter by date range, pitch type, and game situation, and the charts update dynamically. The platform serves millions of page views monthly while maintaining fast load times through aggressive caching and optimized queries.
The Houston Astros created player development dashboards using Streamlit deployed on internal servers. Minor league coordinators track prospect development trajectories, identifying players whose performance metrics suggest readiness for promotion. The dashboards integrate mechanical analysis from motion capture systems with on-field performance data, helping coaches provide targeted feedback. Automated alerts notify staff when players hit specific development milestones or show concerning performance trends.
Key Takeaways
- Interactive dashboards transform baseball analytics from static reports into dynamic exploration tools that empower users to answer their own questions and discover insights independently.
- Plotly provides excellent interactive charting capabilities for both Python and R, with hover tooltips, zooming, filtering, and export features that enhance data exploration.
- Streamlit enables rapid development of Python-based dashboards with minimal code, making it ideal for analysts who want to quickly share insights without extensive web development expertise.
- Shiny offers R users powerful reactive programming capabilities for building sophisticated statistical dashboards with complex interactivity and real-time updates.
- Effective dashboard design requires balancing comprehensive functionality with intuitive usability, implementing thoughtful filtering and drill-down capabilities while maintaining fast performance.
- Deployment options range from free cloud platforms like Streamlit Cloud and shinyapps.io for prototypes to enterprise infrastructure on AWS, GCP, or Azure for production-grade applications serving large organizations.
- Performance optimization through caching, lazy loading, and efficient queries is essential for dashboards serving many users or working with large datasets like Statcast pitch-level data.
- The best baseball dashboards integrate multiple data sources (Statcast, scouting, video analysis) into unified interfaces that support the complete analytical workflow from exploration to decision-making.