Chapter 11: Error Handling and Reporting
In this chapter, we’ll implement error handling that tells the user what went wrong and where, so that problems in the source file are quick to find and fix.
Error Categories
Our assembler has three categories of errors:
- Scanner Errors: Problems reading source characters
- Parse Errors: Problems understanding the syntax
- Code Generation Errors: Problems during assembly
The Error Type Hierarchy
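pub enum AssemblerError {
    Scanner(ScannerError),
    Parse(ParseError),
    CodeGen(CodeGenError),
    Io { message: String },
    Multiple(Vec<AssemblerError>),
}
This unified type lets us handle errors consistently throughout the pipeline. Besides the three stage-specific variants, Io carries failures that come from reading or writing files, and Multiple bundles several errors collected during a single run.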
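One common way to make a unified error type pleasant to use is to implement `From` for each stage error, so that the `?` operator converts automatically. The following is a minimal sketch under that assumption; the simplified `ScannerError` and `ParseError` structs and the `scan`, `parse`, and `assemble` functions are hypothetical stand-ins, not the assembler's real definitions.

```rust
// Simplified stand-ins for the chapter's stage-specific error types.
#[derive(Debug, PartialEq)]
pub struct ScannerError {
    pub line: usize,
    pub column: usize,
    pub character: char,
}

#[derive(Debug, PartialEq)]
pub struct ParseError {
    pub message: String,
}

#[derive(Debug, PartialEq)]
pub enum AssemblerError {
    Scanner(ScannerError),
    Parse(ParseError),
}

// These conversions are what allow `?` to lift a stage error
// into the unified type.
impl From<ScannerError> for AssemblerError {
    fn from(e: ScannerError) -> Self {
        AssemblerError::Scanner(e)
    }
}

impl From<ParseError> for AssemblerError {
    fn from(e: ParseError) -> Self {
        AssemblerError::Parse(e)
    }
}

// Hypothetical scanner: rejects '?' and otherwise returns the characters.
fn scan(source: &str) -> Result<Vec<char>, ScannerError> {
    for (i, c) in source.chars().enumerate() {
        if c == '?' {
            return Err(ScannerError { line: 1, column: i + 1, character: c });
        }
    }
    Ok(source.chars().collect())
}

// Hypothetical parser: rejects empty input and otherwise counts tokens.
fn parse(tokens: &[char]) -> Result<usize, ParseError> {
    if tokens.is_empty() {
        Err(ParseError { message: "empty input".to_string() })
    } else {
        Ok(tokens.len())
    }
}

// Each `?` converts the stage error into AssemblerError via `From`.
pub fn assemble(source: &str) -> Result<usize, AssemblerError> {
    let tokens = scan(source)?;
    let count = parse(&tokens)?;
    Ok(count)
}
```

With this in place, callers only ever match on `AssemblerError`, regardless of which stage failed.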
Scanner Errors
pub enum ScannerError {
    UnknownCharacter {
        line: usize,
        column: usize,
        character: char,
    },
    UnknownDirective {
        line: usize,
        column: usize,
        directive: String,
    },
    NumberExpected {
        line: usize,
        column: usize,
        symbol: char,
    },
    UnterminatedString {
        line: usize,
        column: usize,
        quote: char,
    },
}
Example Messages
error: unknown character '?'
 --> game.s:5:9
   |
 5 |     lda ?
   |         ^

error: unterminated string
 --> game.s:12:9
    |
 12 |     .db "Hello
    |         ^^^^^^ missing closing quote
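The line and column in these messages have to come from somewhere. A minimal sketch of how a scanner might track a 1-based position while walking the source is shown below. The `check_characters` function and its set of allowed characters are assumptions made for illustration, not the scanner built earlier in the book.

```rust
#[derive(Debug, PartialEq)]
pub enum ScannerError {
    UnknownCharacter {
        line: usize,
        column: usize,
        character: char,
    },
}

/// Walks the source once and reports the first character outside an
/// (assumed) allowed set, together with its 1-based line and column.
pub fn check_characters(source: &str) -> Result<(), ScannerError> {
    let mut line = 1;
    let mut column = 1;
    for c in source.chars() {
        if c == '\n' {
            // A newline starts the next line at column 1.
            line += 1;
            column = 1;
            continue;
        }
        // Hypothetical character set for an assembly dialect.
        let allowed = c.is_ascii_alphanumeric()
            || " \t.,:;#$%()+-*/<>=_\"'".contains(c);
        if !allowed {
            return Err(ScannerError::UnknownCharacter {
                line,
                column,
                character: c,
            });
        }
        column += 1;
    }
    Ok(())
}
```

Counting columns per `char` keeps the position correct for multi-byte characters, although a tab still counts as a single column here.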
Parse Errors
pub enum ParseError {
    UnexpectedToken {
        expected: String,
        found: TokenKind,
        location: Location,
    },
    UnexpectedEof {
        location: Location,
    },
    InvalidOperand {
        message: String,
        location: Location,
    },
    InvalidExpression {
        message: String,
        location: Location,
    },
    InvalidDirective {
        message: String,
        location: Location,
    },
    InvalidLabel {
        message: String,
        location: Location,
    },
}
Example Messages
error: unexpected token: expected ')', found ','
 --> game.s:15:14
    |
 15 |     lda (0x80,y)
    |              ^ expected ')' for indirect addressing

error: invalid operand: indexed indirect only supports X register
 --> game.s:20:9
    |
 20 |     sta (0x80,y)
    |         ^^^^^^^^ use (zp),y for indirect indexed mode
Code Generation Errors
pub enum CodeGenError {
    UndefinedSymbol {
        name: String,
        location: Location,
    },
    DuplicateSymbol {
        name: String,
        first: Location,
        second: Location,
    },
    InvalidAddressingMode {
        mnemonic: Mnemonic,
        mode: AddressingMode,
        location: Location,
    },
    BranchOutOfRange {
        offset: i64,
        location: Location,
    },
    ValueOutOfRange {
        value: i64,
        max: i64,
        location: Location,
    },
    CircularInclude {
        path: String,
        location: Location,
    },
    EvaluationError {
        message: String,
        location: Location,
    },
}
Example Messages
error: undefined symbol 'sprite_ptr'
 --> game.s:42:9
    |
 42 |     lda sprite_ptr
    |         ^^^^^^^^^^ symbol not defined

error: duplicate symbol 'main'
 --> game.s:30:1
    |
 10 | main:
    | ---- first defined here
    |
 30 | main:
    | ^^^^ redefined here

error: branch target out of range (offset -200)
 --> game.s:55:5
    |
 55 |     bne far_label
    |     ^^^^^^^^^^^^^ target is 200 bytes away (max: -128 to +127)
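The branch error comes from a small piece of arithmetic. On the 6502, a relative branch is two bytes long and its signed 8-bit offset is measured from the address of the following instruction. The sketch below shows that check in isolation; `branch_offset` is a hypothetical helper, and the `CodeGenError` here is trimmed to a single variant without a `location` field.

```rust
#[derive(Debug, PartialEq)]
pub enum CodeGenError {
    BranchOutOfRange { offset: i64 },
}

/// Computes the operand byte for a relative branch.
///
/// `branch_addr` is the address of the branch opcode and `target` is the
/// address of the label. The offset is relative to the instruction that
/// follows the two-byte branch.
pub fn branch_offset(branch_addr: u16, target: u16) -> Result<i8, CodeGenError> {
    let next_instruction = i64::from(branch_addr) + 2;
    let offset = i64::from(target) - next_instruction;
    if (-128..=127).contains(&offset) {
        Ok(offset as i8)
    } else {
        Err(CodeGenError::BranchOutOfRange { offset })
    }
}
```

Doing the subtraction in `i64` matches the `offset: i64` field above and avoids any wrap-around before the range check runs.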
Formatting Errors
The Diagnostic Structure
pub struct Diagnostic {
    pub severity: Severity,
    pub message: String,
    pub location: Option<Location>,
    pub file: Option<String>,
}

pub enum Severity {
    Error,
    Warning,
    Note,
}
Formatting with Source Context
impl Diagnostic {
    pub fn format(&self, source: &str) -> String {
        let severity = match self.severity {
            Severity::Error => "error",
            Severity::Warning => "warning",
            Severity::Note => "note",
        };
        let mut output = String::new();
        if let Some(loc) = &self.location {
            // Header line
            if let Some(file) = &self.file {
                output.push_str(&format!(
                    "{}: {}:{}:{}: {}\n",
                    severity, file, loc.line, loc.column, self.message
                ));
            } else {
                output.push_str(&format!(
                    "{}: [{}:{}]: {}\n",
                    severity, loc.line, loc.column, self.message
                ));
            }
            // Show source context
            let lines: Vec<&str> = source.lines().collect();
            if loc.line > 0 && loc.line <= lines.len() {
                let line = lines[loc.line - 1];
                let line_num = format!("{}", loc.line);
                let padding = " ".repeat(line_num.len());
                output.push_str(&format!(" {} |\n", padding));
                output.push_str(&format!(" {} | {}\n", line_num, line));
                // Underline
                let underline_padding = " ".repeat(loc.column.saturating_sub(1));
                let underline = "^".repeat(loc.length.max(1));
                output.push_str(&format!(
                    " {} | {}{}\n",
                    padding, underline_padding, underline
                ));
            }
        } else {
            output.push_str(&format!("{}: {}\n", severity, self.message));
        }
        output
    }
}
Error Recovery
Rather than stopping at the first error, we continue parsing to find more issues.
Parser Error Recovery
// Both functions are methods on the parser.
fn synchronize(&mut self) {
    // Skip to the next line
    while !self.is_at_end() {
        if self.check(TokenKind::NewLine) {
            self.advance();
            return;
        }
        self.advance();
    }
}

pub fn parse(&mut self) -> Result<Program, Vec<ParseError>> {
    let mut program = Program::new();
    while !self.is_at_end() {
        self.skip_empty_lines();
        if self.is_at_end() { break; }
        match self.parse_line() {
            Ok(stmts) => program.statements.extend(stmts),
            Err(e) => {
                self.errors.push(e);
                self.synchronize(); // Skip to next line
            }
        }
    }
    if self.errors.is_empty() {
        Ok(program)
    } else {
        Err(std::mem::take(&mut self.errors))
    }
}
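To see line-level recovery in action without the full parser, here is a toy version that works directly on source lines. The grammar (a statement is valid when its first word is three ASCII letters) and the simplified `ParseError` are assumptions made purely for this sketch. Moving on to the next line plays the role of `synchronize`.

```rust
#[derive(Debug, PartialEq)]
pub struct ParseError {
    pub line: usize,
    pub message: String,
}

// Toy grammar: the first word of a line must be a three-letter mnemonic.
fn parse_line(line_no: usize, text: &str) -> Result<String, ParseError> {
    let mnemonic = text.split_whitespace().next().unwrap_or("");
    if mnemonic.len() == 3 && mnemonic.chars().all(|c| c.is_ascii_alphabetic()) {
        Ok(mnemonic.to_ascii_lowercase())
    } else {
        Err(ParseError {
            line: line_no,
            message: format!("invalid mnemonic '{}'", mnemonic),
        })
    }
}

/// Parses every line, collecting all errors instead of stopping at the first.
pub fn parse(source: &str) -> Result<Vec<String>, Vec<ParseError>> {
    let mut statements = Vec::new();
    let mut errors = Vec::new();
    for (index, text) in source.lines().enumerate() {
        if text.trim().is_empty() {
            continue; // Skip empty lines
        }
        match parse_line(index + 1, text) {
            Ok(stmt) => statements.push(stmt),
            // Recording the error and continuing with the next line is
            // the recovery step.
            Err(e) => errors.push(e),
        }
    }
    if errors.is_empty() {
        Ok(statements)
    } else {
        Err(errors)
    }
}
```

For the input `"lda #1\n12 34\n\nxx\nrts\n"` this reports two errors, on lines 2 and 4, and the valid `rts` on line 5 is still parsed in between.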
Assembler Error Collection
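fn pass2(&mut self, program: &Program) -> Result<(), AssemblerError> {
    for stmt in &program.statements {
        match stmt {
            Statement::Instruction(instr) => {
                if let Err(e) = self.emit_instruction(instr) {
                    self.errors.push(e); // Collect, don't stop
                }
            }
            // ...
        }
    }
    // Report what was collected rather than returning Ok unconditionally.
    // This assumes self.errors is a Vec<AssemblerError>.
    if self.errors.is_empty() {
        Ok(())
    } else {
        Err(AssemblerError::Multiple(std::mem::take(&mut self.errors)))
    }
}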
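Once errors are collected, they have to be handed back to the caller in some form. One possible approach, sketched here with a trimmed-down `AssemblerError` whose `CodeGen` payload is just a `String`, is to wrap the collection in `Multiple` and count the leaves for a final summary line. The `from_collected`, `count`, and `summary` helpers are hypothetical names.

```rust
#[derive(Debug)]
pub enum AssemblerError {
    CodeGen(String), // Simplified stand-in payload
    Io { message: String },
    Multiple(Vec<AssemblerError>),
}

impl AssemblerError {
    /// Turns a list of collected errors into a single result:
    /// no errors is Ok, one error is returned as-is, more are bundled.
    pub fn from_collected(mut errors: Vec<AssemblerError>) -> Result<(), AssemblerError> {
        match errors.len() {
            0 => Ok(()),
            1 => Err(errors.remove(0)),
            _ => Err(AssemblerError::Multiple(errors)),
        }
    }

    /// Counts individual errors, looking inside nested `Multiple` values.
    pub fn count(&self) -> usize {
        match self {
            AssemblerError::Multiple(inner) => inner.iter().map(|e| e.count()).sum(),
            _ => 1,
        }
    }
}

/// Builds the closing line of a report, for example "Found 3 errors."
pub fn summary(error: &AssemblerError) -> String {
    let n = error.count();
    format!("Found {} error{}.", n, if n == 1 { "" } else { "s" })
}
```

Counting recursively means that a `Multiple` nested inside another `Multiple` does not distort the total.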
Warnings
Some issues don’t prevent assembly but should be reported:
// Zero page address used with absolute mode
pub fn warn_zp_as_absolute(&self, addr: u16, location: Location) {
    if addr <= 0xFF {
        eprintln!(
            "warning: [{}:{}]: address 0x{:02X} could use zero page mode",
            location.line, location.column, addr
        );
    }
}

// Unused symbol
pub fn warn_unused_symbols(&self) {
    for sym in self.symbols.unreferenced_symbols() {
        eprintln!(
            "warning: [{}:{}]: symbol '{}' is defined but never used",
            sym.defined_at.line, sym.defined_at.column, sym.name
        );
    }
}
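The unused-symbol warning relies on the symbol table knowing which symbols were ever looked up. A minimal sketch of one way to track that is a reference counter bumped on every lookup. The `SymbolTable` and `Symbol` shapes below are assumptions for illustration and are simpler than the table used by the assembler.

```rust
use std::collections::BTreeMap;

#[derive(Debug)]
pub struct Symbol {
    pub name: String,
    pub value: u16,
    pub line: usize,
    pub references: usize,
}

#[derive(Default)]
pub struct SymbolTable {
    // BTreeMap keeps iteration order stable, so warnings come out sorted.
    symbols: BTreeMap<String, Symbol>,
}

impl SymbolTable {
    pub fn define(&mut self, name: &str, value: u16, line: usize) {
        self.symbols.insert(
            name.to_string(),
            Symbol { name: name.to_string(), value, line, references: 0 },
        );
    }

    /// Resolves a symbol and records that it was used.
    pub fn lookup(&mut self, name: &str) -> Option<u16> {
        let sym = self.symbols.get_mut(name)?;
        sym.references += 1;
        Some(sym.value)
    }

    /// Symbols that were defined but never resolved through `lookup`.
    pub fn unreferenced_symbols(&self) -> Vec<&Symbol> {
        self.symbols.values().filter(|s| s.references == 0).collect()
    }
}
```

The trade-off of counting inside `lookup` is that it needs `&mut self`; a design that keeps lookups immutable would have to record references in a separate pass.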
CLI Error Formatting
use std::path::Path;

fn format_error(error: &ParseError, source: &str, file: &Path) -> String {
    let loc = error.location();
    let lines: Vec<&str> = source.lines().collect();
    // Header: the message followed by file:line:column
    let mut output = format!(
        "error: {}\n --> {}:{}:{}\n",
        error, file.display(), loc.line, loc.column
    );
    if loc.line > 0 && loc.line <= lines.len() {
        let line = lines[loc.line - 1];
        let line_num = format!("{}", loc.line);
        let padding = " ".repeat(line_num.len());
        output.push_str(&format!(" {} |\n", padding));
        output.push_str(&format!(" {} | {}\n", line_num, line));
        // Underline the offending span
        output.push_str(&format!(
            " {} | {}{}\n",
            padding,
            " ".repeat(loc.column.saturating_sub(1)),
            "^".repeat(loc.length.max(1))
        ));
    }
    output
}
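The formatter above depends on two things from `ParseError`: a `location()` accessor and a `Display` implementation for the `{}` placeholder. A minimal sketch of both follows, using a reduced set of variants and a `String` in place of `TokenKind` for the `found` field. The `Location` fields and the `header` helper are assumptions inferred from how they are used in this chapter.

```rust
use std::fmt;

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Location {
    pub line: usize,
    pub column: usize,
    pub length: usize,
}

#[derive(Debug)]
pub enum ParseError {
    UnexpectedToken { expected: String, found: String, location: Location },
    UnexpectedEof { location: Location },
    InvalidOperand { message: String, location: Location },
}

impl ParseError {
    /// Every variant stores a location, so one or-pattern covers them all.
    pub fn location(&self) -> Location {
        match self {
            ParseError::UnexpectedToken { location, .. }
            | ParseError::UnexpectedEof { location }
            | ParseError::InvalidOperand { location, .. } => *location,
        }
    }
}

impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ParseError::UnexpectedToken { expected, found, .. } => {
                write!(f, "unexpected token: expected {}, found {}", expected, found)
            }
            ParseError::UnexpectedEof { .. } => write!(f, "unexpected end of file"),
            ParseError::InvalidOperand { message, .. } => {
                write!(f, "invalid operand: {}", message)
            }
        }
    }
}

/// Builds the first two lines of a report for the given file name.
pub fn header(error: &ParseError, file: &str) -> String {
    let loc = error.location();
    format!("error: {}\n --> {}:{}:{}", error, file, loc.line, loc.column)
}
```

Keeping the message text in `Display` and the position in `location()` lets the same error be rendered by the CLI, by tests, or by an editor integration without duplicating wording.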
Complete Error Output Example
error: unexpected token: expected identifier, found Number
 --> game.s:5:10
   |
 5 |     .equ 123 456
   |          ^^^ expected identifier after .equ

error: undefined symbol 'plyer_x'
 --> game.s:15:9
    |
 15 |     lda plyer_x
    |         ^^^^^^^ did you mean 'player_x'?

error: branch target out of range (offset -150)
 --> game.s:42:5
    |
 42 |     bne start
    |     ^^^^^^^^^ target is too far away

Found 3 errors.
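The "did you mean" hint in the second message can be produced by comparing the unknown name against every defined symbol. The chapter does not say how the hint is computed, so the sketch below swaps in a standard technique, Levenshtein edit distance, with an assumed cutoff of two edits. Both `edit_distance` and `suggest` are hypothetical helpers.

```rust
/// Levenshtein distance: the number of single-character insertions,
/// deletions, and substitutions needed to turn `a` into `b`.
pub fn edit_distance(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] holds the distance between a[..i-1] and b[..j].
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for i in 1..=a.len() {
        // current[0] = i: deleting i characters yields the empty string.
        let mut current = vec![i; b.len() + 1];
        for j in 1..=b.len() {
            let substitution = prev[j - 1] + usize::from(a[i - 1] != b[j - 1]);
            current[j] = substitution.min(prev[j] + 1).min(current[j - 1] + 1);
        }
        prev = current;
    }
    prev[b.len()]
}

/// Returns the closest known name within two edits, if there is one.
/// The cutoff of 2 is an arbitrary choice for this sketch.
pub fn suggest<'a>(unknown: &str, known: &[&'a str]) -> Option<&'a str> {
    known
        .iter()
        .copied()
        .map(|name| (edit_distance(unknown, name), name))
        .filter(|(distance, _)| *distance <= 2)
        .min_by_key(|(distance, _)| *distance)
        .map(|(_, name)| name)
}
```

A cutoff keeps the assembler from suggesting unrelated names; without it, every undefined symbol would get some suggestion, however distant.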
Summary
In this chapter, we implemented comprehensive error handling:
- Three error categories: Scanner, parser, and code generation
- Location tracking: Every error records its line and column, and the report adds the file name
- Source context: Show the offending line with an underline
- Error recovery: Continue after errors to find more issues
- Warnings: Non-fatal issues like unused symbols
In the next chapter, we’ll build the command-line interface for the assembler.