Project Metrics Guide¶
Overview¶
The metrics script (`scripts/metrics.py`) provides comprehensive code quality and test coverage analysis for the AppImage Updater project. It generates a detailed report covering source code metrics, test coverage, cyclomatic complexity, and risk analysis.
Usage¶
Running the Metrics Script¶
```bash
# Using task runner (recommended)
task metrics

# Or directly with uv
uv run python scripts/metrics.py
```
The script takes approximately 11-13 seconds to run, as it executes the full test suite with coverage analysis.
Understanding the Metrics Report¶
The report is organized into several sections, each providing different insights into code quality.
1. Source Code Metrics¶
```text
Source Code (src/)
  Total files: 126
  Maximum lines in a file: 1082
  Average lines per file: 178
  Total SLOC: 16818
  Average code paths per file: 36.5
  Maximum code paths in a file: 255
  Code duplication score: 9.94/10
  Top 5 files with most imports:
    src/appimage_updater/core/update_operations.py (20 imports)
    ...
```
What these metrics mean:
- Total files: Number of Python source files in `src/`
- Maximum lines in a file: Largest file size (helps identify files that may need splitting)
- Average lines per file: Mean file size (an average of 150-250 lines per file is manageable)
- Total SLOC: Source Lines of Code, excluding comments and blank lines (see the sketch below)
- Average code paths per file: Mean cyclomatic complexity across all files
- Maximum code paths in a file: Highest complexity in any single file
- Code duplication score: Measured by pylint (10/10 is perfect; below 8/10 indicates significant duplication)
- Top 5 files with most imports: Files with many dependencies (may indicate coupling issues)
Interpretation:
- Files over 500 lines may benefit from refactoring
- Average code paths > 50 suggests high complexity
- Duplication score < 8.0 indicates copy-paste code that should be refactored
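As a concrete illustration of the SLOC definition above, here is a minimal counting sketch (illustrative only; the metrics script's exact counting rules may differ):

```python
# Illustrative SLOC counter: count lines that are neither blank nor
# comment-only. This approximates what the report calls SLOC; docstrings
# and other edge cases are handled differently by real tools.
from pathlib import Path


def count_sloc(path: Path) -> int:
    sloc = 0
    for line in path.read_text(encoding="utf-8").splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            sloc += 1
    return sloc


total = sum(count_sloc(p) for p in Path("src").rglob("*.py"))
print(f"Total SLOC: {total}")
```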
2. Test Code Metrics¶
```text
Test Code (tests/)
  Total test files: 85
  Total SLOC: 19322
  Test breakdown:
    Unit: 1215
    Functional: 81
    Integration: 31
    E2E: 54
    Regression: 35
  Source files (SLOC > 20) without tests:
    src/appimage_updater/ui/output/html_formatter.py (SLOC: 197)
    ...
```
What these metrics mean:
- Total test files: Number of test files across all test types
- Total SLOC: Lines of test code (more test code than source code is common and healthy)
- Test breakdown: Count of test functions by category
    - Unit tests: Test individual functions/classes in isolation
    - Functional tests: Test complete features or workflows
    - Integration tests: Test interaction between components
    - E2E tests: End-to-end tests of complete user scenarios
    - Regression tests: Tests that prevent previously fixed bugs from returning
- Source files without tests: Files that have no test file importing them (based on import detection)
Interpretation:
- Test SLOC > Source SLOC is a good sign (indicates thorough testing)
- Files without tests are candidates for adding test coverage
- A healthy mix of test types provides comprehensive coverage
Important Note about "Without tests": This list uses an import-detection heuristic (checking whether any test file imports the module). It is an approximation: a file listed here may still have some coverage if tests exercise it indirectly, and an imported file may still be poorly covered. See the coverage distribution for actual coverage data.
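A minimal sketch of such a heuristic, with a hypothetical `module_name` helper (not the script's actual implementation):

```python
# Hypothetical sketch of an import-detection heuristic: a source module
# counts as "tested" if its dotted import path appears anywhere in the
# test suite's text. Indirect imports are missed, which is why the
# report treats this list as an approximation.
from pathlib import Path


def module_name(src_file: Path) -> str:
    # e.g. src/appimage_updater/core/update_operations.py
    #   -> appimage_updater.core.update_operations
    return ".".join(src_file.relative_to("src").with_suffix("").parts)


test_text = "\n".join(
    p.read_text(encoding="utf-8") for p in Path("tests").rglob("test_*.py")
)

untested = [
    str(f) for f in Path("src").rglob("*.py")
    if module_name(f) not in test_text
]
print("\n".join(untested))
```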
3. Risk Analysis¶
```text
Top 5 highest risk files (high complexity + low coverage):
  src/appimage_updater/repositories/sourceforge/repository.py (complexity: 10, coverage: 0.0%, risk: 10.0)
  src/appimage_updater/core/update_operations.py (complexity: 9, coverage: 0.0%, risk: 9.0)
  ...
```
What these metrics mean:
- Risk score: Calculated as `complexity × (100 - coverage) / 100`
- High risk files: Complex code with low test coverage (most likely to contain bugs)
Interpretation:
- Risk score > 5.0: High priority for adding tests
- Risk score > 10.0: Critical - complex code with no safety net
- Focus testing efforts on high-risk files first for maximum impact
Formula explanation:
- A file with complexity 10 and 0% coverage has risk = 10 × (100 - 0) / 100 = 10.0
- A file with complexity 10 and 80% coverage has risk = 10 × (100 - 80) / 100 = 2.0
- Testing reduces risk proportionally to coverage
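The formula is simple enough to restate as code; this sketch just mirrors the arithmetic above:

```python
def risk_score(complexity: float, coverage_pct: float) -> float:
    """Risk = complexity * (100 - coverage) / 100, as defined in the report."""
    return complexity * (100 - coverage_pct) / 100


assert risk_score(10, 0) == 10.0   # complex and untested: maximum risk
assert risk_score(10, 80) == 2.0   # 80% coverage cuts the risk by 80%
```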
4. Cyclomatic Complexity¶
```text
Cyclomatic Complexity
  Top 5 most complex files:
    src/appimage_updater/repositories/sourceforge/repository.py (max: 10)
    src/appimage_updater/core/update_operations.py (max: 9)
    ...
  Files with complexity > 5: 11
  Total code paths: 3761
```
What these metrics mean:
- Cyclomatic complexity: Measures the number of independent paths through code
- Max complexity: Highest complexity of any function in the file
- Files with complexity > 5: Files containing functions that may be hard to test
- Total code paths: Sum of all code paths across the entire codebase
Complexity ratings:
- 1-5 (A): Simple, easy to understand and test
- 6-10 (B): Moderate complexity, may need refactoring
- 11-20 (C): High complexity, should be refactored
- 21+ (D/E/F): Very high complexity, difficult to maintain
Interpretation:
- Functions with complexity > 10 should be broken into smaller functions
- Files with many B-rated functions are refactoring candidates
- Total code paths helps calculate test/path ratio (see Summary)
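These letter grades follow radon's complexity ranking. As a quick way to inspect a single file yourself, here is a minimal sketch using radon's Python API (illustrative; the metrics script's actual invocation may differ):

```python
# Rank each function in one file with radon's public API. cc_visit()
# parses the source into one block per function/method; cc_rank() maps
# a complexity number to the A-F letter grades described above.
from radon.complexity import cc_rank, cc_visit

with open("src/appimage_updater/core/update_operations.py") as f:
    source = f.read()

for block in sorted(cc_visit(source), key=lambda b: -b.complexity):
    print(f"{block.name}: complexity {block.complexity} (rank {cc_rank(block.complexity)})")
```

The same numbers are available from the command line with `radon cc src/ -s -a`.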
5. Code Coverage¶
```text
Code Coverage
  1735 passed, 3 xfailed in 12.51s
  Overall coverage: 70.7%
  Coverage distribution:
    100%   : 46 files
    90-99% : 24 files
    80-89% : 11 files
    70-79% : 6 files
    60-69% : 7 files
    50-59% : 8 files
    40-49% : 9 files
    30-39% : 9 files
    20-29% : 2 files
    10-19% : 1 files
    0-9%   : 3 files
```
What these metrics mean:
- Test results: Number of tests passed/failed/xfailed (expected failures)
- Overall coverage: Percentage of code lines executed during tests
- Coverage distribution: Number of files in each coverage range
Coverage targets:
- 100%: Perfect coverage (46 files achieved this)
- 90-99%: Excellent coverage
- 80-89%: Good coverage
- 70-79%: Acceptable coverage
- 60-69%: Needs improvement
- Below 60%: Priority for adding tests
Interpretation:
- Overall coverage > 70% is good, > 80% is excellent
- Focus on files with 0-9% coverage first (highest impact)
- Files with 100% coverage are well-tested and safe to refactor
Why "Without tests" differs from "0-9% coverage": These measure different things:
- "Without tests": Uses import detection heuristic (may miss indirect imports)
- "0-9% coverage": Actual line execution from pytest (ground truth)
- A file can be imported but have low coverage if tests don't exercise much code
- A file may have tests but not appear in "without tests" if it's imported indirectly
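For the coverage side of that comparison, per-file percentages can be read from a JSON coverage report. This is a sketch assuming a `coverage.json` file produced by coverage.py's `coverage json` command; the metrics script may obtain its data differently:

```python
# Bucket per-file coverage percentages into the 10-point ranges shown
# in the distribution table above. Assumes coverage.json from coverage.py.
import json
from collections import Counter

with open("coverage.json") as f:
    files = json.load(f)["files"]

buckets = Counter()
for info in files.values():
    pct = info["summary"]["percent_covered"]
    low = int(pct // 10) * 10
    # Files at exactly 100% get their own bucket, matching the report
    buckets["100%" if pct == 100 else f"{low}-{low + 9}%"] += 1

for label in ["100%"] + [f"{n}-{n + 9}%" for n in range(90, -10, -10)]:
    print(f"{label:>7} : {buckets.get(label, 0)} files")
```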
6. Summary Statistics¶
```text
=== Summary ===
Source files: 126 | Test files: 85 | Tests: 1416
Code paths: 3761
Test/Path ratio: 0.38 (1416 tests / 3761 code paths)
Coverage: 70.7%
```
What these metrics mean:
- Source files vs Test files: Ratio of production code to test code files
- Tests: Total number of test functions across all test types
- Code paths: Total cyclomatic complexity (sum of all function complexities)
- Test/Path ratio: Tests per code path (indicates test thoroughness)
- Coverage: Overall percentage of code executed by tests
Interpretation:
- Test/Path ratio (see the sketch after this list):
    - < 0.3: Insufficient tests for code complexity
    - 0.3-0.5: Reasonable test coverage
    - \> 0.5: Excellent test coverage
- Ideal ratios:
    - Test files ≈ 50-70% of source files
    - Test SLOC ≥ Source SLOC
    - Coverage > 70%
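A minimal restatement of the ratio, using the figures from the sample summary above:

```python
def test_path_ratio(test_count: int, code_paths: int) -> float:
    """Tests per code path, as shown in the summary section."""
    return test_count / code_paths


# 1416 tests / 3761 code paths, from the sample report
assert round(test_path_ratio(1416, 3761), 2) == 0.38
```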
Metrics Script Architecture¶
The script follows a two-phase architecture for maintainability and extensibility:
Phase 1: Data Gathering¶
All metrics are collected into structured dataclasses:
- `gather_complexity_metrics()`: Radon integration for cyclomatic complexity
- `gather_source_metrics()`: Source code analysis (files, SLOC, imports)
- `gather_test_metrics()`: Test code analysis and untested-file detection
- `gather_coverage_metrics()`: Pytest integration with coverage
- `gather_risk_metrics()`: Risk assessment (complexity × low coverage)
- `gather_all_metrics()`: Orchestrates all gathering
Phase 2: Report Generation¶
Formatted output from collected data:
- `report_source_metrics()`: Display source code metrics
- `report_test_metrics()`: Display test metrics
- `report_risk_metrics()`: Display risk analysis
- `report_complexity_metrics()`: Display complexity metrics
- `report_coverage_metrics()`: Display coverage metrics
- `report_summary()`: Display summary statistics
- `generate_report()`: Orchestrates all reporting
Benefits of This Architecture¶
- Separation of Concerns: Data gathering is independent of presentation
- Testability: Each phase can be tested independently
- Reusability: Data can be used for multiple output formats (JSON, HTML, etc.)
- Maintainability: Changes to one phase don't affect the other
- Extensibility: Easy to add new metrics or output formats
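A minimal sketch of the pattern (the function names come from the script; the `SourceMetrics` fields and function bodies here are illustrative):

```python
# Phase 1 gathers data into plain dataclasses; phase 2 renders them.
# SourceMetrics and both function bodies are illustrative only.
from dataclasses import dataclass


@dataclass
class SourceMetrics:
    total_files: int
    total_sloc: int


def gather_source_metrics() -> SourceMetrics:
    # Phase 1: collect data only -- no printing here.
    return SourceMetrics(total_files=126, total_sloc=16818)


def report_source_metrics(metrics: SourceMetrics) -> None:
    # Phase 2: presentation only -- no data collection here.
    print(f"Total files: {metrics.total_files}")
    print(f"Total SLOC: {metrics.total_sloc}")


report_source_metrics(gather_source_metrics())
```

Because the dataclasses are the only contract between the two phases, a JSON or HTML reporter can reuse the gathered data unchanged.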
Dependencies¶
The metrics script requires these tools:
- pytest: For running tests and generating coverage data
- radon: For cyclomatic complexity analysis
- pylint: For code duplication detection (optional)
All dependencies are included in the project's `pyproject.toml`.
Interpreting Results for Action¶
High Priority Actions¶
- Files with 0-9% coverage: Add basic tests first
- High risk files (risk > 10): Add tests to complex, untested code
- Functions with complexity > 10: Refactor into smaller functions
- Code duplication < 8.0: Identify and eliminate duplicate code
Medium Priority Actions¶
- Files with 10-49% coverage: Expand existing tests
- Files with complexity 6-10: Consider refactoring if difficult to test
- Files > 500 lines: Consider splitting into smaller modules
Maintenance Goals¶
- Overall coverage: Maintain > 70%, target > 80%
- Test/Path ratio: Maintain > 0.3, target > 0.4
- Code duplication: Keep > 9.0/10
- Complexity: Keep all functions at A-rating (complexity ≤ 5)
Continuous Improvement¶
Run metrics regularly to track progress:
```bash
# Before making changes
task metrics > metrics_before.txt

# After making changes
task metrics > metrics_after.txt

# Compare results
diff metrics_before.txt metrics_after.txt
```
This helps ensure code quality improvements over time.