Building a Scrabble Companion with Gemini 2.0
When my kids and I play Scrabble, keeping track of scores and validating moves can sometimes take away from the fun of the game itself. This inspired me to build an AI-powered companion that could handle these tasks while adding an educational twist. Here’s how I built it and what I learned along the way.
The Vision
I wanted to create two applications that would work together:
A moderator app that uses AI to capture and validate game states
A companion app that provides real-time game insights and AI-powered move explanations
Companion App — Adjusting board position in the camera
Technical Architecture
Technical architecture diagram for the Scrabble companion project
Flutter & Firebase: The Foundation
The project is built on Flutter for both apps, with Firebase handling real-time synchronization. This combination provides:
Cross-platform compatibility
Real-time data sync between moderator and companion apps
Reliable state management
Smooth animations for game state visualization
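As a sketch of how the two apps stay in sync, the companion app can simply listen to a Firestore stream and rebuild on every change. The collection and field names here (`sessions`, `moves`, `playedAt`) are illustrative, not the project's actual schema:

```dart
import 'package:cloud_firestore/cloud_firestore.dart';
import 'package:flutter/material.dart';

// Hypothetical sketch: the companion app listening to a shared game session.
class MoveFeed extends StatelessWidget {
  final String sessionId;
  const MoveFeed({super.key, required this.sessionId});

  @override
  Widget build(BuildContext context) {
    // Every move the moderator app writes shows up here in real time.
    final moves = FirebaseFirestore.instance
        .collection('sessions')
        .doc(sessionId)
        .collection('moves')
        .orderBy('playedAt')
        .snapshots();

    return StreamBuilder<QuerySnapshot>(
      stream: moves,
      builder: (context, snapshot) {
        if (!snapshot.hasData) return const CircularProgressIndicator();
        return ListView(
          children: [
            for (final doc in snapshot.data!.docs)
              ListTile(
                title: Text(doc['word'] as String),
                trailing: Text('${doc['score']} pts'),
              ),
          ],
        );
      },
    );
  }
}
```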
AI Integration Stack
Computer Vision with Gemini
For board state capture, I implemented:
Image processing to identify board state
AI-powered OCR to recognize letters and positions
Position validation using board rules
While the accuracy isn’t perfect yet, it provides a solid foundation for game state capture.
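The position-validation step runs entirely client-side, since board rules are deterministic. A minimal sketch (the real rules also check adjacency to existing words; names here are illustrative):

```dart
// Minimal placement validation for a 15x15 Scrabble board: bounds,
// collisions with existing tiles, and single-line placement.
class TilePlacement {
  final String letter;
  final int row, col;
  TilePlacement(this.letter, this.row, this.col);
}

bool isValidPlacement(
  List<TilePlacement> newTiles,
  Set<String> occupied, // existing cells encoded as 'row,col'
) {
  if (newTiles.isEmpty) return false;

  // All tiles must be on the board and on empty squares.
  for (final t in newTiles) {
    final inBounds = t.row >= 0 && t.row < 15 && t.col >= 0 && t.col < 15;
    if (!inBounds || occupied.contains('${t.row},${t.col}')) return false;
  }

  // All tiles must share a row or a column.
  final sameRow = newTiles.every((t) => t.row == newTiles.first.row);
  final sameCol = newTiles.every((t) => t.col == newTiles.first.col);
  return sameRow || sameCol;
}
```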
Move Analysis with Multiple LLMs
I experimented with several LLM providers, all integrated through Vertex AI for consistency. Here’s how the integration works:
import 'dart:io';

import 'package:firebase_vertexai/firebase_vertexai.dart';

class GeminiService {
  late final GenerativeModel _model;
  final ImageStorageService _imageStorage = ImageStorageService();
  final FirebaseService _firebaseService = FirebaseService();

  GeminiService() {
    // Initialize with Gemini 2.0 Flash
    _model = FirebaseVertexAI.instance
        .generativeModel(model: 'gemini-2.0-flash');
  }

  Future<Map<String, dynamic>> analyzeBoardImage(
    String sessionId,
    String imagePath,
  ) async {
    try {
      // Read current image bytes
      final currentImageBytes = await File(imagePath).readAsBytes();

      // Get board state
      final boardState =
          await _firebaseService.getBoardState(sessionId).first;
      final isFirstMove = boardState.isEmpty;

      if (isFirstMove) {
        // Handle first move analysis
        final response = await _model.generateContent([
          Content.multi([
            TextPart(_constructInitialBoardPrompt()),
            DataPart('image/jpeg', currentImageBytes),
          ]),
        ]);
        return {
          'status': 'success',
          'type': 'initial',
          'data': _parseGeminiResponse(response.text!, true),
        };
      } else {
        // Compare with previous state
        final response = await _model.generateContent([
          Content.multi([
            TextPart(_constructImageComparisonPrompt(boardState)),
            DataPart('image/jpeg', currentImageBytes),
          ]),
        ]);
        return {
          'status': 'success',
          'type': 'move',
          'data': _parseGeminiResponse(response.text!, false),
        };
      }
    } catch (e) {
      return {
        'status': 'error',
        'message': e.toString(),
      };
    }
  }

  // Prompt construction for initial board analysis
  String _constructInitialBoardPrompt() {
    return '''
You are analyzing an image of an initial Scrabble board move.
Accurately identify all visible letters and their positions.
Return ONLY a JSON with this format:
{
  "board": [
    {
      "letter": "A",
      "row": 7,
      "col": 7,
      "points": 1
    }
  ]
}
''';
  }
}
Each LLM provider offered different strengths:
DeepSeek: Provided detailed move explanations with strategic insights
Gemini 2.0 Flash: Excellent balance of speed and accuracy, particularly for image analysis
The current implementation uses Gemini 2.0 Flash through Firebase’s Vertex AI SDK, which provides seamless integration with other Firebase services and excellent performance for both text and image analysis.
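To keep providers swappable, the model calls sit behind a small interface so the rest of the app never depends on a specific SDK. A sketch of that pattern (interface and class names are illustrative, not the project's exact code):

```dart
import 'package:firebase_vertexai/firebase_vertexai.dart';

// Illustrative provider abstraction: each LLM backend implements the same
// interface, so swapping DeepSeek for Gemini is a one-line change.
abstract class LlmProvider {
  Future<String> complete(String prompt);
}

class GeminiProvider implements LlmProvider {
  // Wraps the Firebase Vertex AI SDK call shown above.
  @override
  Future<String> complete(String prompt) async {
    final model = FirebaseVertexAI.instance
        .generativeModel(model: 'gemini-2.0-flash');
    final response = await model.generateContent([Content.text(prompt)]);
    return response.text ?? '';
  }
}

class LlmService {
  LlmProvider provider;
  LlmService(this.provider);

  Future<String> generateExplanation(String prompt) =>
      provider.complete(prompt);
}
```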
Move Explanations and Voice Synthesis
The move explanation system combines LLM analysis with voice synthesis:
class AIService {
  final LLMService _llmService;
  bool _isTtsInitialized = false;

  AIService(this._llmService);

  Future<String> generateMoveExplanation(
    String playerName,
    Move move,
    int currentScore,
  ) async {
    final prompt = '''
Explain this Scrabble move played by $playerName:
- Word: ${move.word}
- Score: ${move.score} points
- Tiles: ${move.tiles.map((t) => '${t.letter}(${t.points})').join(', ')}
- Current total: $currentScore points
Keep it brief but informative in 2 sentences maximum.
''';
    return await _llmService.generateExplanation(prompt);
  }

  Future<List<int>> convertToSpeech(String text, AppLanguage language) async {
    if (!_isTtsInitialized) {
      await _initializeTts();
    }
    final targetVoice = language == AppLanguage.english
        ? 'en-US-Wavenet-I' // English male voice
        : 'fr-FR-Wavenet-D'; // French male voice
    final params = TtsParamsGoogle(
      voice: targetVoice,
      audioFormat: AudioOutputFormatGoogle.linear16,
      text: text,
    );
    final ttsResponse = await TtsGoogle.convertTts(params);
    return ttsResponse.audio.buffer.asUint8List().toList();
  }
}
On the voice side, I:
Implemented Google Cloud TTS for bilingual support (French/English)
Tested ElevenLabs for more natural voice quality
Created a flexible voice provider system for easy switching between services
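That "flexible voice provider system" is the same abstraction pattern as the LLM layer; a sketch with illustrative names:

```dart
// Illustrative TTS abstraction: Google Cloud TTS and ElevenLabs each
// implement one interface, so switching providers is a one-line change.
enum AppLanguage { english, french }

abstract class VoiceProvider {
  Future<List<int>> synthesize(String text, AppLanguage language);
}

class SpeechService {
  VoiceProvider provider; // e.g. a GoogleTtsProvider or ElevenLabsProvider
  SpeechService(this.provider);

  Future<List<int>> announceMove(String explanation, AppLanguage language) {
    return provider.synthesize(explanation, language);
  }
}
```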
The biggest challenge was achieving reliable board state capture. Some strategies I implemented:
Grid overlay for better image alignment
Multiple image processing attempts
Manual correction capabilities
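"Multiple image processing attempts" amounts to a retry loop around the analysis call; a generic sketch (attempt count and delay are illustrative defaults):

```dart
// Generic retry helper for flaky image-analysis calls: retries on any
// exception with a short delay, then rethrows once attempts are exhausted.
Future<T> withRetries<T>(
  Future<T> Function() action, {
  int maxAttempts = 3,
  Duration delay = const Duration(seconds: 1),
}) async {
  for (var attempt = 1; ; attempt++) {
    try {
      return await action();
    } catch (_) {
      if (attempt >= maxAttempts) rethrow;
      await Future.delayed(delay);
    }
  }
}

// Usage:
// final result =
//     await withRetries(() => gemini.analyzeBoardImage(sessionId, path));
```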
Prompt Engineering Techniques
One of the most interesting aspects of this project was crafting effective prompts. Here’s what I learned:
Board State Analysis Prompts
For capturing board state, specificity and constraints were crucial:
String _constructImageComparisonPrompt(Map<String, dynamic> previousState) {
  return '''
Compare these two Scrabble board images: the first is the previous state, the second is after a move.
Identify ONLY new letters that appear in the second image.
Previous board state for reference:
${jsonEncode(previousState)}

Return ONLY a JSON object in exactly this format:
{
  "word": "EXAMPLE",
  "score": 15,
  "newLetters": [
    {
      "letter": "A",
      "row": 7,
      "col": 7,
      "points": 1
    }
  ]
}
Rules:
- Use 0-based indices (0-14) for coordinates
- All coordinates must be within the 15x15 grid
- Return ONLY the JSON, no explanatory text
- If no valid word was played, return: {"word": "", "score": 0, "newLetters": []}
''';
}
Move Explanation Prompts
For move explanations, I found that “role-playing” and context-setting improved results:
String createMoveExplanationPrompt(String playerName, Move move, int currentScore) {
  return '''
You are an enthusiastic Scrabble commentator.
Explain this move by $playerName:
- Word: ${move.word}
- Score for this move: ${move.score} points
- Tiles placed: ${move.tiles.map((t) => '${t.letter}(${t.points})').join(', ')}
- Current total after this move: $currentScore

Focus on:
1. Strategic value of the move
2. Clever use of board multipliers
3. Point calculation highlights
Keep it brief but engaging in 2 sentences.
''';
}
Key Learnings
1. Structured Output
Always specify exact output format
Use JSON for structured data
Include example responses in prompts
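Even with "Return ONLY the JSON" instructions, models occasionally wrap their output in markdown fences or add a sentence of prose, so the parser should be defensive. A sketch of the extraction step (my actual `_parseGeminiResponse` also validates the fields):

```dart
import 'dart:convert';

// Defensive JSON extraction for LLM responses: keeps only the outermost
// {...} span, which drops markdown fences and any stray surrounding prose.
Map<String, dynamic> parseModelJson(String raw) {
  final text = raw.trim();
  final start = text.indexOf('{');
  final end = text.lastIndexOf('}');
  if (start == -1 || end <= start) {
    throw const FormatException('No JSON object found in model output');
  }
  return jsonDecode(text.substring(start, end + 1)) as Map<String, dynamic>;
}
```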
2. Context Management
Provide relevant game state
Include previous moves when needed
Set clear role and tone expectations
3. Multilingual Considerations
Maintain same structure across languages
Adapt cultural references appropriately
Keep consistent tone and expertise level
This approach to prompt engineering resulted in more consistent and reliable responses, while maintaining the engaging and educational aspect of the game.
Real-time Sync
Firebase made real-time synchronization straightforward, but required careful planning for:
State consistency across devices
Handling network interruptions
Managing game session lifecycle
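For network interruptions, Firestore's built-in offline persistence covers most cases: reads are served from the local cache and queued writes sync once the connection returns. The configuration is a one-time setting (this is the standard `cloud_firestore` API):

```dart
import 'package:cloud_firestore/cloud_firestore.dart';

// Enable offline persistence so a dropped connection mid-game doesn't
// lose moves; they sync automatically once the device is back online.
void configureFirestore() {
  FirebaseFirestore.instance.settings = const Settings(
    persistenceEnabled: true,
    cacheSizeBytes: Settings.CACHE_SIZE_UNLIMITED,
  );
}
```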
Future Improvements
1. Enhanced Image Recognition
Implementing better board detection algorithms
Adding support for different board layouts
Improving accuracy in various lighting conditions
2. Advanced AI Features
Move suggestion capabilities
Strategy analysis
Learning patterns from gameplay
3. Voice Synthesis
Exploring ElevenLabs integration
Adding more language support
Improving natural speech patterns
Testing Notes
For developers interested in testing the board recognition features, I highly recommend using ScrabbleCam. It provides a consistent way to test board capture functionality without needing a physical board.
Conclusion
Building this project has been a fantastic journey in combining AI technologies with real-world gaming. The most rewarding part has been seeing my kids’ reactions to AI-powered move explanations and how it adds a new dimension to our game.
Check out the code on GitHub to explore the implementation details or contribute to the project!
Technical Stack Summary
Flutter for cross-platform development
Firebase for real-time synchronization
Vertex AI for LLM integration
Google Cloud TTS for voice synthesis
Computer vision for board state capture
This project is open source and available on GitHub. Feel free to explore, contribute, or adapt it for your own AI experiments!