Mahjong AI: Research & Strategy

February 9, 2026

1. Objective

To develop a web application capable of identifying Mahjong tiles (HK Style Taiwan 16-tile variant) from user-uploaded photos and accurately calculating the "Fan" score according to local rules.

2. Vision System: Technology Comparison

Criteria | YOLO (v10/v11)               | Multimodal LLM (GPT-4o)
Accuracy | Very High (Deterministic)    | High, but prone to counting errors
Latency  | <100 ms (Local)              | 2-5 s (API)
Cost     | Zero (Client-side)           | Recurring (Per-call)
Effort   | High (Data Tagging required) | Low (Immediate Start)
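
Because the LLM route is prone to counting errors, its output can be sanity-checked before it reaches the scoring engine. The sketch below is a minimal illustration, not a finished validator. The tile codes ("1m", "9p", "E", ...) and the function name are assumptions made for this example, and the default of 17 tiles assumes a completed hand with no kongs and with flower tiles excluded.

```python
from collections import Counter

# Hypothetical tile codes (an assumption for this sketch):
# suited tiles are "<rank><suit>" with suits m/p/s, honors are single letters.
SUITED = {f"{rank}{suit}" for suit in "mps" for rank in range(1, 10)}
HONORS = {"E", "S", "W", "N", "C", "F", "P"}  # four winds + three dragons
VALID_TILES = SUITED | HONORS


def check_detections(tiles, expected_count=17):
    """Return a list of problems found in a recognized hand (empty if plausible)."""
    problems = []

    # Any code outside the known tile set is a recognition error.
    unknown = sorted(set(tiles) - VALID_TILES)
    if unknown:
        problems.append(f"unknown tile codes: {unknown}")

    # A completed 16-tile hand holds 17 tiles (assuming no kongs or flowers).
    if len(tiles) != expected_count:
        problems.append(f"expected {expected_count} tiles, got {len(tiles)}")

    # A set contains only four copies of each suited or honor tile.
    for tile, n in sorted(Counter(tiles).items()):
        if n > 4:
            problems.append(f"{tile} appears {n} times (max 4)")

    return problems
```

A non-empty result could prompt the user to retake the photo or correct the tiles by hand rather than scoring a misread hand.
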
🛡️ Hybrid Roadmap Strategy:
  1. Phase 1 (Prototype): Launch with GPT-4o-vision to enable immediate photo upload functionality.
  2. Phase 2 (Data Collection): Use prototype images to train a custom YOLO model.
  3. Phase 3 (Production): Deploy client-side YOLO for high-accuracy, low-latency recognition with no per-call API costs.
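
For Phase 2, the collected images need labels in the format a YOLO trainer reads. A minimal sketch follows, assuming the common YOLO text-label convention of one line per object: a class index followed by the box center and size, each normalized to the image dimensions. The function name and the pixel-box input are assumptions for illustration, and the boxes themselves are assumed to come from manual annotation.

```python
def to_yolo_label(class_id, box, image_size):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to a YOLO label line.

    image_size is (width, height) in pixels. The result is
    "<class> <x_center> <y_center> <width> <height>", normalized to 0..1.
    """
    x_min, y_min, x_max, y_max = box
    img_w, img_h = image_size

    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h

    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Example: a tile of (hypothetical) class 3 in a 1000x800 photo.
print(to_yolo_label(3, (100, 200, 300, 400), (1000, 800)))
```

Each image would get one text file holding one such line per tile.
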

3. Scoring Logic Engine

Constraint: LLMs (GPT/Claude) are probabilistic and often fail at the recursive hand decomposition and Fan arithmetic required for 16-tile Mahjong hands.

Solution: Implement a deterministic Python Rule Engine using a shanten algorithm modified for HKTW rules. The LLM will serve only as the "Translator" that explains the resulting Fan breakdown to the user.
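
The core of such an engine is a deterministic check that a hand decomposes into melds. The sketch below is one possible starting point, not the engine itself. It is a plain backtracking decomposition rather than a shanten calculation, and it only tests the standard shape of five melds plus one pair in 17 tiles. Kongs, flowers, special hands, and the Fan table are left out. The tile codes and function names are assumptions for this example.

```python
from functools import lru_cache

# Hypothetical tile codes: "1m".."9m", "1p".."9p", "1s".."9s", plus honors.
HONORS = ["E", "S", "W", "N", "C", "F", "P"]


def tile_index(tile):
    """Map a tile code to 0..33 (three suits of nine ranks, then seven honors)."""
    if tile in HONORS:
        return 27 + HONORS.index(tile)
    rank, suit = int(tile[0]), tile[1]
    return "mps".index(suit) * 9 + (rank - 1)


@lru_cache(maxsize=None)
def _splits_into_melds(counts):
    """True if the remaining tile counts split entirely into pungs and chows."""
    first = next((i for i, n in enumerate(counts) if n), None)
    if first is None:
        return True  # nothing left to place
    c = list(counts)

    # Option 1: use the lowest remaining tile in a pung (three of a kind).
    if c[first] >= 3:
        c[first] -= 3
        if _splits_into_melds(tuple(c)):
            return True
        c[first] += 3

    # Option 2: use it as the lowest tile of a chow (suited tiles, rank 1-7).
    if first < 27 and first % 9 <= 6 and c[first + 1] and c[first + 2]:
        for j in (first, first + 1, first + 2):
            c[j] -= 1
        return _splits_into_melds(tuple(c))

    return False


def is_standard_win(tiles):
    """Check the five-melds-plus-pair shape for a 17-tile hand."""
    if len(tiles) != 17:
        return False
    counts = [0] * 34
    for t in tiles:
        counts[tile_index(t)] += 1
    if max(counts) > 4:
        return False

    # Try every possible pair, then decompose what is left.
    for i in range(34):
        if counts[i] >= 2:
            counts[i] -= 2
            ok = _splits_into_melds(tuple(counts))
            counts[i] += 2
            if ok:
                return True
    return False
```

Because the function always returns the same answer for the same tiles, its output (and later the Fan breakdown built on top of it) can be handed to the LLM purely for explanation, with no arithmetic left for the model to get wrong.
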

4. Test & Quality Assurance Plan

A. Recognition Validation

B. Calculation Validation

C. Data Sourcing

5. Next Steps
