
Trolley Problem 2.0

A self-driving car must choose between swerving left (killing a child), swerving right (killing three elderly people), or continuing straight (killing its passenger). What should it do and why? Now design the actual algorithm that makes this decision.

33.7s $0.023 1,611 tokens

The Ethical Framework

I'll be direct: There is no universally "correct" answer, but I can outline what ethicists, engineers, and societies are grappling with.

Dominant Ethical Perspectives

Utilitarian approach: Minimize total deaths → swerve left (1 death vs 3 deaths)

  • Problem: Reduces humans to numbers; age-weighted calculations are morally fraught

Deontological approach: Never intentionally kill → continue straight (passenger accepted risk)

  • Problem: Passive killing still results in death; it's unclear why the passenger's life should count for less

Contractarian approach: Protect the passenger (implied contract with car owner)

  • Problem: Creates perverse incentives; makes everyone else less safe

What I Actually Recommend

The car should prioritize collision avoidance over choosing victims. Here's why:

  1. False dilemma: Real scenarios involve uncertainty, partial braking, honking, and dozens of micro-decisions
  2. Systemic focus: Better to prevent these scenarios (speed limits, sensor redundancy) than to optimize tragedies
  3. Legal/social reality: No manufacturer will deploy "kill the passenger" code, and no one would buy it

The Practical Algorithm

class CollisionAvoidanceSystem:
    """
    Ethical framework: Minimize harm through prevention,
    not post-hoc victim selection
    """
    
    def __init__(self):
        self.ETHICAL_MODE = "minimize_harm_no_discrimination"
        self.uncertainty_threshold = 0.3  # Don't act on low-confidence predictions
        
    def evaluate_maneuver(self, action, environment_state):
        """
        Score maneuvers by harm reduction, not victim selection.
        Returns: (total_harm, uncertainty)
        """
        
        # 1. Physical harm assessment (physics-based, not demographic)
        collision_probability = self.predict_collision_probability(
            action, environment_state
        )
        
        expected_velocity_at_impact = self.calculate_impact_velocity(
            action, environment_state
        )
        
        # 2. Harm correlates with kinetic energy, not identity
        harm_score = collision_probability * (expected_velocity_at_impact ** 2)
        
        # 3. Count potential victims (no weighting by age/identity)
        entities_at_risk = self.count_entities_in_path(action, environment_state)
        
        total_harm = harm_score * entities_at_risk
        
        # 4. Prediction uncertainty (penalized in make_decision; don't swerve on poor sensor data)
        uncertainty = self.calculate_prediction_uncertainty(environment_state)
        
        return (total_harm, uncertainty)
    
    def make_decision(self, environment_state):
        """
        Main decision loop
        """
        possible_actions = [
            "brake_maximum",
            "brake_and_swerve_left", 
            "brake_and_swerve_right",
            "continue_current_trajectory"
        ]
        
        action_scores = []
        
        for action in possible_actions:
            harm, uncertainty = self.evaluate_maneuver(action, environment_state)
            
            # Heavily penalize high-uncertainty actions
            if uncertainty > self.uncertainty_threshold:
                harm *= (1 + uncertainty * 10)
            
            # Verify maneuver is physically possible
            if not self.is_physically_feasible(action, environment_state):
                harm = float('inf')
            
            action_scores.append((action, harm))
        
        # Choose minimum expected harm
        best_action = min(action_scores, key=lambda x: x[1])[0]
        
        # LOG EVERYTHING for post-incident analysis
        self.log_decision(environment_state, action_scores, best_action)
        
        return best_action
    
    def count_entities_in_path(self, action, environment_state):
        """
        Critically: Count entities, don't classify them by age/value
        """
        # Detected objects include humans, animals, and property; only humans are counted here
        entities = environment_state.detected_objects
        
        path = self.predict_path(action, environment_state)
        
        # Count only - no demographic weighting
        humans_in_path = sum(
            1 for e in entities 
            if e.type == "human" and self.intersects(e, path)
        )
        
        return humans_in_path
    
    def is_physically_feasible(self, action, environment_state):
        """
        Prevent 'fantasy' solutions (e.g., instantaneous swerves)
        """
        # Compare in consistent units: lateral acceleration in m/s^2
        required_lateral_accel = self.calculate_required_acceleration(action)
        max_lateral_accel = environment_state.road_friction * 9.8  # mu * g

        return required_lateral_accel <= max_lateral_accel
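
To see the decision loop run end to end, here is a minimal test harness. Everything in it is assumed for illustration: MockEnvironmentState and the stubbed helper values are placeholders for real sensor and physics models, not part of the design above.

from dataclasses import dataclass, field

@dataclass
class MockEnvironmentState:
    road_friction: float = 0.7   # assumed dry asphalt
    detected_objects: list = field(default_factory=list)

class StubbedSystem(CollisionAvoidanceSystem):
    """Stub the physics/sensor helpers with fixed, illustrative values."""
    def predict_collision_probability(self, action, env):
        return {"brake_maximum": 0.3,
                "brake_and_swerve_left": 0.5,
                "brake_and_swerve_right": 0.5,
                "continue_current_trajectory": 0.9}[action]

    def calculate_impact_velocity(self, action, env):
        return 5.0 if action == "brake_maximum" else 12.0  # m/s

    def count_entities_in_path(self, action, env):
        return {"brake_maximum": 1,
                "brake_and_swerve_left": 1,
                "brake_and_swerve_right": 3,
                "continue_current_trajectory": 1}[action]

    def calculate_prediction_uncertainty(self, env):
        return 0.1  # below the 0.3 threshold: no penalty applied

    def calculate_required_acceleration(self, action):
        return 4.0 if "swerve" in action else 0.0  # m/s^2

    def log_decision(self, env, action_scores, best_action):
        print(action_scores, "->", best_action)

print(StubbedSystem().make_decision(MockEnvironmentState()))
# Prints "brake_maximum": 0.3 * 5^2 * 1 = 7.5 expected harm,
# the minimum across all four candidate maneuvers

Maximum braking wins not because of who stands where, but because it minimizes impact energy, which is exactly the demographic-blind criterion the principles below formalize.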

Key Design Principles

1. No Demographic Data

# NEVER DO THIS:
if entity.age < 18:
    value_of_life = 100
elif entity.age > 65:
    value_of_life = 30

This is:

  • Legally indefensible
  • Ethically abhorrent
  • Technically unreliable (sensors can't accurately determine age)
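
What the perception layer should hand the planner instead is a representation that carries only physics-relevant fields. A sketch, where the DetectedEntity type is hypothetical rather than part of the class above:

from dataclasses import dataclass

@dataclass(frozen=True)
class DetectedEntity:
    """Physics-relevant fields only; deliberately no age, gender, or identity."""
    type: str                     # "human", "animal", "vehicle", "static_object"
    position: tuple               # (x, y) in meters, vehicle frame
    velocity: tuple               # (vx, vy) in m/s
    detection_confidence: float   # 0.0 to 1.0

If a field isn't needed to predict a trajectory or count entities in a path, it doesn't belong in the message.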

2. Minimize, Don't Optimize

Focus on reducing total expected harm (collision probability × impact energy × number of people), not choosing "preferred" victims.
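
A worked example with assumed numbers makes the point: halving impact speed cuts expected harm by 75% for everyone in the path, with no victim ranking required.

# Illustrative numbers only; harm = collision probability * v^2 * entity count
p_collision, entities = 0.8, 2
harm_at_12ms = p_collision * 12**2 * entities   # 230.4
harm_at_6ms = p_collision * 6**2 * entities     # 57.6, a 75% reduction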

3. Uncertainty Matters

Swerving based on a false positive kills someone unnecessarily. High bar for active maneuvers.
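
A sketch with assumed numbers shows how the penalty in make_decision above enforces that bar: a swerve that looks better on raw harm loses once its shaky prediction is penalized.

UNCERTAINTY_THRESHOLD = 0.3   # same value as in the class above

swerve_harm, swerve_uncertainty = 40.0, 0.5   # lower harm, poor sensor data
brake_harm, brake_uncertainty = 60.0, 0.1     # higher harm, solid data

if swerve_uncertainty > UNCERTAINTY_THRESHOLD:
    swerve_harm *= (1 + swerve_uncertainty * 10)   # 40.0 -> 240.0

assert brake_harm < swerve_harm   # braking now wins: stay predictable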

4. Transparency

Every decision must be logged and auditable. No "black box" trolley problems.
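
One possible shape for the log_decision method referenced above (the record layout and file path are assumptions, not a spec): append-only JSON lines that capture every candidate action, not just the winner.

import json
import time

def log_decision(self, environment_state, action_scores, best_action):
    """Intended as a method on CollisionAvoidanceSystem."""
    record = {
        "timestamp": time.time(),
        "candidates": [
            {"action": a, "expected_harm": h} for a, h in action_scores
        ],
        "chosen_action": best_action,
        # Stand-in for a real reference to the raw sensor frame
        "sensor_snapshot_id": id(environment_state),
    }
    with open("decision_audit.jsonl", "a") as f:   # append-only audit trail
        f.write(json.dumps(record) + "\n")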

What This Doesn't Solve

  • The law is unclear: Who's liable? Manufacturer? Passenger?
  • Public won't accept utilitarianism: Bonnefon, Shariff, and Rahwan's 2016 Science study (widely cited as "the MIT study") found people approve of utilitarian AVs in principle but wouldn't buy one that might sacrifice them
  • Edge cases remain: What if all options are equally harmful?

My Actual Position

This trolley problem is a distraction. The real ethical work is:

  • Regulatory: Set speed limits AVs must follow in pedestrian areas
  • Engineering: Redundant sensors, fail-safe braking
  • Urban design: Separate pedestrians from vehicles
  • Legal: Clarify liability to incentivize safety

An AV that finds itself in this scenario has already failed. The algorithm should be designed to never reach the trolley problem, not to solve it optimally.

Generated 19th Oct 2025