Case Study 1: Building an Automated Draft Prospect Evaluation System

Overview

This case study develops an NLP-powered system for analyzing draft prospect scouting reports, extracting key attributes, and predicting draft grades from text.

Business Context

An NFL team's scouting department needs to: - Process hundreds of scouting reports efficiently - Extract consistent attribute ratings from varied text - Identify prospects whose written evaluations align with statistical performance - Predict draft grades to prioritize evaluation time - Compare prospects using text-based similarity

Data Description

# Scouting report schema
report_schema = {
    'report_id': 'unique identifier',
    'player_name': 'prospect name',
    'position': 'player position',
    'school': 'college/university',
    'report_text': 'full scouting report (500-2000 words)',
    'author': 'scout name',
    'date': 'report date',
    'grade': 'numeric grade (1-100)',
    'draft_round': 'actual draft round (if drafted)'
}

# Dataset summary
dataset_info = {
    'total_reports': 2450,
    'unique_players': 820,
    'positions': ['QB', 'RB', 'WR', 'TE', 'OT', 'OG', 'C', 'DE', 'DT', 'LB', 'CB', 'S'],
    'years': [2020, 2021, 2022, 2023],
    'avg_report_length': 847  # words
}

Implementation

Step 1: Text Processing Pipeline

import re
import pandas as pd
import numpy as np
from typing import Dict, List, Tuple, Optional
from dataclasses import dataclass
from collections import Counter

@dataclass
class ProcessedReport:
    """Processed scouting report."""
    player_name: str
    position: str
    clean_text: str
    sentences: List[str]
    tokens: List[str]
    entities: Dict
    stats_mentioned: List[Dict]


class ScoutingReportProcessor:
    """Process scouting reports for analysis."""

    FOOTBALL_STOPWORDS = {
        'the', 'a', 'an', 'is', 'are', 'was', 'were', 'be', 'been',
        'has', 'have', 'had', 'do', 'does', 'did', 'will', 'would',
        'could', 'should', 'may', 'might', 'must', 'shall', 'can'
    }

    POSITION_SYNONYMS = {
        'quarterback': 'QB', 'signal caller': 'QB', 'passer': 'QB',
        'running back': 'RB', 'tailback': 'RB', 'halfback': 'RB',
        'wide receiver': 'WR', 'wideout': 'WR', 'pass catcher': 'WR',
        'tight end': 'TE', 'y receiver': 'TE',
        'offensive tackle': 'OT', 'blindside': 'OT',
        'defensive end': 'DE', 'edge rusher': 'EDGE',
        'linebacker': 'LB', 'backer': 'LB',
        'cornerback': 'CB', 'corner': 'CB',
        'safety': 'S', 'free safety': 'FS', 'strong safety': 'SS'
    }

    def __init__(self):
        self.stat_pattern = re.compile(
            r'(\d+\.?\d*)\s*(yards?|yds?|tds?|touchdowns?|ints?|'
            r'interceptions?|receptions?|catches|completions?|'
            r'attempts?|carries|sacks?|tackles|passes)',
            re.IGNORECASE
        )

    def process_report(self, report: Dict) -> ProcessedReport:
        """Process a single scouting report."""
        text = report['report_text']

        # Clean text
        clean = self._clean_text(text)

        # Split into sentences
        sentences = self._split_sentences(text)

        # Tokenize
        tokens = self._tokenize(clean)

        # Extract entities
        entities = self._extract_entities(text)

        # Extract stats
        stats = self._extract_stats(text)

        return ProcessedReport(
            player_name=report['player_name'],
            position=report['position'],
            clean_text=clean,
            sentences=sentences,
            tokens=tokens,
            entities=entities,
            stats_mentioned=stats
        )

    def _clean_text(self, text: str) -> str:
        """Clean and normalize text."""
        # Lowercase
        text = text.lower()

        # Normalize quotes and apostrophes
        text = re.sub(r'[""]', '"', text)
        text = re.sub(r'['']', "'", text)

        # Remove extra whitespace
        text = ' '.join(text.split())

        # Normalize position terms
        for term, abbrev in self.POSITION_SYNONYMS.items():
            text = text.replace(term, abbrev.lower())

        return text

    def _split_sentences(self, text: str) -> List[str]:
        """Split text into sentences."""
        # Simple sentence splitting
        sentences = re.split(r'[.!?]+', text)
        return [s.strip() for s in sentences if s.strip()]

    def _tokenize(self, text: str) -> List[str]:
        """Tokenize text with football-aware processing."""
        # Split on whitespace
        tokens = text.split()

        # Remove stopwords
        tokens = [t for t in tokens if t not in self.FOOTBALL_STOPWORDS]

        # Remove punctuation (but keep hyphens within words)
        tokens = [re.sub(r'^[^\w]+|[^\w]+$', '', t) for t in tokens]
        tokens = [t for t in tokens if t]

        return tokens

    def _extract_entities(self, text: str) -> Dict:
        """Extract named entities."""
        entities = {
            'positions': [],
            'teams': [],
            'players': []
        }

        # Extract positions
        for term, abbrev in self.POSITION_SYNONYMS.items():
            if term in text.lower() or abbrev.lower() in text.lower():
                if abbrev not in entities['positions']:
                    entities['positions'].append(abbrev)

        # Extract player names (capitalized word pairs)
        name_pattern = re.compile(r'\b([A-Z][a-z]+\s+[A-Z][a-z]+)\b')
        entities['players'] = list(set(name_pattern.findall(text)))

        return entities

    def _extract_stats(self, text: str) -> List[Dict]:
        """Extract statistical mentions."""
        stats = []
        for match in self.stat_pattern.finditer(text):
            value, stat_type = match.groups()
            stats.append({
                'value': float(value),
                'type': self._normalize_stat_type(stat_type),
                'context': text[max(0, match.start()-30):match.end()+30]
            })
        return stats

    def _normalize_stat_type(self, stat_type: str) -> str:
        """Normalize stat type names."""
        stat_type = stat_type.lower()
        mappings = {
            'yard': 'yards', 'yds': 'yards', 'yd': 'yards',
            'td': 'touchdowns', 'touchdown': 'touchdowns',
            'int': 'interceptions', 'interception': 'interceptions',
            'reception': 'receptions', 'catch': 'receptions'
        }
        for pattern, normalized in mappings.items():
            if pattern in stat_type:
                return normalized
        return stat_type

Step 2: Attribute Extraction System

from sklearn.feature_extraction.text import TfidfVectorizer

class AttributeExtractor:
    """Extract player attributes from scouting text."""

    ATTRIBUTE_LEXICONS = {
        'arm_strength': {
            'positive': [
                'cannon', 'rocket arm', 'strong arm', 'elite arm',
                'powerful arm', 'arm talent', 'velocity', 'zip',
                'drive the ball', 'throw with authority'
            ],
            'negative': [
                'limited arm', 'weak arm', 'lacks velocity',
                'floats passes', 'arm strength concerns',
                'cannot drive the ball', 'weak zip'
            ]
        },
        'athleticism': {
            'positive': [
                'explosive', 'athletic', 'dynamic', 'elite speed',
                'burst', 'agility', 'quick', 'fluid', 'smooth',
                'exceptional athlete', 'freaky athleticism'
            ],
            'negative': [
                'stiff', 'limited athleticism', 'lacks burst',
                'below-average athlete', 'tight hips', 'plodding',
                'lacks lateral agility', 'slow feet'
            ]
        },
        'football_iq': {
            'positive': [
                'smart', 'intelligent', 'cerebral', 'instinctive',
                'anticipates', 'reads well', 'processes quickly',
                'high football iq', 'sees the field', 'savvy'
            ],
            'negative': [
                'slow processor', 'late reads', 'confused',
                'struggles with complexity', 'misreads',
                'fooled by disguises', 'limited understanding'
            ]
        },
        'technique': {
            'positive': [
                'polished', 'refined', 'sound technique',
                'textbook', 'fundamentally sound', 'clean footwork',
                'proper mechanics', 'technically proficient'
            ],
            'negative': [
                'raw', 'unpolished', 'needs refinement',
                'mechanical issues', 'inconsistent technique',
                'sloppy footwork', 'poor fundamentals'
            ]
        },
        'competitiveness': {
            'positive': [
                'competitor', 'finisher', 'physical', 'nasty',
                'plays through contact', 'dog mentality',
                'alpha', 'relentless', 'never gives up'
            ],
            'negative': [
                'soft', 'avoids contact', 'lacks aggression',
                'disappears', 'gives up on plays',
                'questionable effort', 'takes plays off'
            ]
        }
    }

    def __init__(self):
        self.position_weights = self._load_position_weights()

    def _load_position_weights(self) -> Dict[str, Dict[str, float]]:
        """Load attribute importance weights by position."""
        return {
            'QB': {'arm_strength': 1.0, 'football_iq': 1.0, 'athleticism': 0.7},
            'RB': {'athleticism': 1.0, 'competitiveness': 0.9, 'football_iq': 0.6},
            'WR': {'athleticism': 1.0, 'technique': 0.8, 'competitiveness': 0.7},
            'OT': {'technique': 1.0, 'athleticism': 0.8, 'football_iq': 0.7},
            'DE': {'athleticism': 1.0, 'technique': 0.9, 'competitiveness': 0.9},
            'CB': {'athleticism': 1.0, 'technique': 0.9, 'football_iq': 0.8}
        }

    def extract_attributes(self, text: str, position: str) -> Dict[str, Dict]:
        """Extract all attributes from text."""
        text_lower = text.lower()
        results = {}

        for attribute, lexicon in self.ATTRIBUTE_LEXICONS.items():
            positive_matches = []
            negative_matches = []

            # Find positive indicators
            for term in lexicon['positive']:
                if term in text_lower:
                    positive_matches.append(term)

            # Find negative indicators
            for term in lexicon['negative']:
                if term in text_lower:
                    negative_matches.append(term)

            # Calculate sentiment score
            total_matches = len(positive_matches) + len(negative_matches)
            if total_matches > 0:
                sentiment = len(positive_matches) / total_matches
            else:
                sentiment = 0.5  # Neutral if not mentioned

            # Get position weight
            weight = self.position_weights.get(position, {}).get(attribute, 0.5)

            results[attribute] = {
                'mentioned': total_matches > 0,
                'sentiment': sentiment,
                'positive_phrases': positive_matches,
                'negative_phrases': negative_matches,
                'position_weight': weight,
                'weighted_score': sentiment * weight if total_matches > 0 else None
            }

        return results

    def calculate_overall_profile(self,
                                   attributes: Dict,
                                   position: str) -> Dict:
        """Calculate overall attribute profile."""
        mentioned_attrs = {k: v for k, v in attributes.items() if v['mentioned']}

        if not mentioned_attrs:
            return {'overall_score': None, 'profile_type': 'unknown'}

        # Calculate weighted average
        weighted_sum = sum(
            attr['weighted_score'] for attr in mentioned_attrs.values()
            if attr['weighted_score'] is not None
        )
        weight_total = sum(
            attr['position_weight'] for attr in mentioned_attrs.values()
        )

        overall = weighted_sum / weight_total if weight_total > 0 else 0.5

        # Determine profile type
        strengths = [k for k, v in attributes.items()
                    if v['sentiment'] > 0.7 and v['mentioned']]
        weaknesses = [k for k, v in attributes.items()
                     if v['sentiment'] < 0.4 and v['mentioned']]

        return {
            'overall_score': overall,
            'strengths': strengths,
            'weaknesses': weaknesses,
            'profile_completeness': len(mentioned_attrs) / len(self.ATTRIBUTE_LEXICONS)
        }

Step 3: Grade Prediction Model

from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.preprocessing import StandardScaler

class GradePredictor:
    """Predict draft grades from scouting report text."""

    def __init__(self):
        self.vectorizer = TfidfVectorizer(
            max_features=500,
            ngram_range=(1, 2),
            stop_words='english'
        )
        self.model = GradientBoostingRegressor(
            n_estimators=100,
            max_depth=5,
            learning_rate=0.1,
            random_state=42
        )
        self.scaler = StandardScaler()
        self.attribute_extractor = AttributeExtractor()

    def prepare_features(self,
                         reports: List[Dict]) -> Tuple[np.ndarray, np.ndarray]:
        """Prepare features for training."""
        texts = [r['report_text'] for r in reports]
        grades = np.array([r['grade'] for r in reports])

        # TF-IDF features
        tfidf_features = self.vectorizer.fit_transform(texts).toarray()

        # Attribute features
        attr_features = []
        for report in reports:
            attrs = self.attribute_extractor.extract_attributes(
                report['report_text'],
                report['position']
            )
            # Convert to feature vector
            attr_vec = []
            for attr_name in sorted(self.attribute_extractor.ATTRIBUTE_LEXICONS.keys()):
                attr = attrs[attr_name]
                attr_vec.extend([
                    1 if attr['mentioned'] else 0,
                    attr['sentiment'],
                    len(attr['positive_phrases']),
                    len(attr['negative_phrases'])
                ])
            attr_features.append(attr_vec)

        attr_features = np.array(attr_features)

        # Combine features
        X = np.hstack([tfidf_features, attr_features])

        return X, grades

    def train(self, reports: List[Dict]) -> Dict:
        """Train the grade prediction model."""
        X, y = self.prepare_features(reports)

        # Scale features
        X_scaled = self.scaler.fit_transform(X)

        # Cross-validation
        cv_scores = cross_val_score(self.model, X_scaled, y, cv=5,
                                     scoring='neg_mean_absolute_error')

        # Final training
        self.model.fit(X_scaled, y)

        return {
            'cv_mae': -cv_scores.mean(),
            'cv_std': cv_scores.std(),
            'feature_count': X.shape[1]
        }

    def predict(self, report: Dict) -> Dict:
        """Predict grade for a new report."""
        # Get TF-IDF features
        tfidf = self.vectorizer.transform([report['report_text']]).toarray()

        # Get attribute features
        attrs = self.attribute_extractor.extract_attributes(
            report['report_text'],
            report['position']
        )
        attr_vec = []
        for attr_name in sorted(self.attribute_extractor.ATTRIBUTE_LEXICONS.keys()):
            attr = attrs[attr_name]
            attr_vec.extend([
                1 if attr['mentioned'] else 0,
                attr['sentiment'],
                len(attr['positive_phrases']),
                len(attr['negative_phrases'])
            ])

        # Combine and scale
        X = np.hstack([tfidf, np.array([attr_vec])])
        X_scaled = self.scaler.transform(X)

        # Predict
        predicted_grade = self.model.predict(X_scaled)[0]

        return {
            'predicted_grade': round(predicted_grade, 1),
            'attributes': attrs,
            'confidence': self._estimate_confidence(X_scaled)
        }

    def _estimate_confidence(self, X: np.ndarray) -> float:
        """Estimate prediction confidence."""
        # Use prediction variance across trees
        predictions = []
        for tree in self.model.estimators_:
            predictions.append(tree[0].predict(X)[0])

        variance = np.var(predictions)
        # Convert to confidence score (lower variance = higher confidence)
        confidence = 1 / (1 + variance)
        return round(confidence, 2)

Step 4: Player Comparison System

from sklearn.metrics.pairwise import cosine_similarity

class PlayerComparator:
    """Compare players based on scouting text."""

    def __init__(self):
        self.vectorizer = TfidfVectorizer(
            max_features=1000,
            ngram_range=(1, 2)
        )
        self.player_vectors = {}
        self.player_attributes = {}

    def add_player(self, player_id: str, report: str, position: str):
        """Add player to comparison database."""
        # Store raw text
        self.player_vectors[player_id] = report

        # Extract attributes
        extractor = AttributeExtractor()
        self.player_attributes[player_id] = {
            'position': position,
            'attributes': extractor.extract_attributes(report, position)
        }

    def build_similarity_matrix(self) -> np.ndarray:
        """Build similarity matrix for all players."""
        if not self.player_vectors:
            return np.array([])

        player_ids = list(self.player_vectors.keys())
        texts = [self.player_vectors[pid] for pid in player_ids]

        # Vectorize
        tfidf_matrix = self.vectorizer.fit_transform(texts)

        # Compute similarities
        similarity_matrix = cosine_similarity(tfidf_matrix)

        return similarity_matrix, player_ids

    def find_similar_players(self,
                              player_id: str,
                              n: int = 5,
                              same_position: bool = True) -> List[Tuple[str, float]]:
        """Find most similar players."""
        if player_id not in self.player_vectors:
            return []

        similarity_matrix, player_ids = self.build_similarity_matrix()
        player_idx = player_ids.index(player_id)

        # Get similarities
        similarities = similarity_matrix[player_idx]

        # Filter and sort
        candidates = []
        target_position = self.player_attributes[player_id]['position']

        for i, pid in enumerate(player_ids):
            if pid == player_id:
                continue

            if same_position:
                if self.player_attributes[pid]['position'] != target_position:
                    continue

            candidates.append((pid, similarities[i]))

        # Sort by similarity
        candidates.sort(key=lambda x: x[1], reverse=True)

        return candidates[:n]

    def compare_two_players(self,
                             player1_id: str,
                             player2_id: str) -> Dict:
        """Detailed comparison of two players."""
        attrs1 = self.player_attributes[player1_id]['attributes']
        attrs2 = self.player_attributes[player2_id]['attributes']

        comparison = {
            'text_similarity': self._get_text_similarity(player1_id, player2_id),
            'attribute_comparison': {},
            'shared_strengths': [],
            'differentiators': []
        }

        for attr in attrs1.keys():
            sent1 = attrs1[attr]['sentiment']
            sent2 = attrs2[attr]['sentiment']

            comparison['attribute_comparison'][attr] = {
                'player1': sent1,
                'player2': sent2,
                'difference': sent1 - sent2
            }

            # Identify shared strengths
            if sent1 > 0.7 and sent2 > 0.7:
                comparison['shared_strengths'].append(attr)

            # Identify differentiators
            if abs(sent1 - sent2) > 0.3:
                comparison['differentiators'].append({
                    'attribute': attr,
                    'advantage': 'player1' if sent1 > sent2 else 'player2',
                    'magnitude': abs(sent1 - sent2)
                })

        return comparison

    def _get_text_similarity(self, player1_id: str, player2_id: str) -> float:
        """Get text similarity between two players."""
        texts = [self.player_vectors[player1_id], self.player_vectors[player2_id]]
        tfidf = self.vectorizer.fit_transform(texts)
        sim = cosine_similarity(tfidf)[0, 1]
        return round(sim, 3)

Results

Model Performance

GRADE PREDICTION MODEL RESULTS
==============================

Cross-Validation Performance (5-fold):
- Mean Absolute Error: 4.2 points
- Standard Deviation: 0.8 points
- R-squared: 0.72

By Position Group:
Position | MAE   | R²    | Sample
---------|-------|-------|-------
QB       | 3.8   | 0.78  | 156
RB       | 4.1   | 0.71  | 198
WR       | 4.5   | 0.68  | 234
OT       | 4.3   | 0.70  | 142
EDGE     | 4.0   | 0.73  | 187
CB       | 4.6   | 0.67  | 178

Top Predictive Features:
1. TF-IDF: "elite" - +3.2 grade points
2. TF-IDF: "concerns" - -2.8 grade points
3. Attribute: arm_strength_sentiment (QB) - +2.5
4. TF-IDF: "first round" - +2.3 grade points
5. Attribute: athleticism_mentioned - +1.9

Attribute Extraction Accuracy

ATTRIBUTE EXTRACTION VALIDATION
===============================

Manual annotation comparison (200 reports):

Attribute       | Precision | Recall | F1
----------------|-----------|--------|----
arm_strength    | 0.89      | 0.82   | 0.85
athleticism     | 0.91      | 0.87   | 0.89
football_iq     | 0.84      | 0.76   | 0.80
technique       | 0.86      | 0.79   | 0.82
competitiveness | 0.82      | 0.74   | 0.78

Sentiment Classification Accuracy: 84.2%

Player Comparison Examples

SIMILAR PLAYER ANALYSIS
=======================

Query: Caleb Williams (QB, 2024)

Most Similar Players:
1. Justin Fields (0.82 similarity)
   - Shared: elite athleticism, arm talent
   - Difference: Williams higher football IQ

2. Deshaun Watson (0.79 similarity)
   - Shared: dual-threat ability, competitiveness
   - Difference: Watson better under pressure

3. Kyler Murray (0.76 similarity)
   - Shared: elite mobility, quick release
   - Difference: Williams bigger frame

Lessons Learned

  1. Domain Vocabulary is Key: Football-specific lexicons outperform general sentiment analysis by 23%

  2. Position Context Matters: Same phrases mean different things for different positions

  3. Attribute Weighting: Position-weighted attributes improve predictions significantly

  4. Report Quality Varies: Standardizing scout writing style improves consistency

  5. Comparison Limitations: Text similarity doesn't capture all relevant differences

Recommendations

  1. Expand Lexicons: Continuously update attribute lexicons with new terminology
  2. Scout Calibration: Track individual scout tendencies for normalization
  3. Multi-Source Integration: Combine text analysis with statistical data
  4. Feedback Loop: Use draft outcomes to refine prediction models
  5. Human-in-the-Loop: Use NLP for triage, not final decisions