traininglogs
A pipeline for turning workout notes into a queryable, chartable training history.
Why this exists
I lift weights. I write down what I do, and later I want to answer questions: am I getting stronger? am I hammering the same muscle group three days in a row? what was my squat doing six months ago? did the deload actually do anything?
Spreadsheets work until they don't. Off-the-shelf apps assume a workout shape that isn't mine — they don't model myo-reps, partial reps, or the difference between a warmup set and a working set the way I think about them. The shape I want is: write the session the way I'd write it in a notebook, and have a system parse it into something strict enough to query and chart.
That's traininglogs. It is small on purpose.
The shape of the system
Five things, in order:
Each step is a boundary. Markdown is what I write. Pydantic is what the rest of the system trusts. Postgres is the source of truth for everything queryable. The API and the dashboard both read from Postgres; nothing writes back through them.
This separation means I can rewrite any one stage without disturbing the others. The parser is currently rule-based. An LLM-driven capture flow is planned to replace it ("Track B" below); the Pydantic model and the database schema are the stable contract between stages.
The data model
A session is one workout on one day. A session has many exercises. An exercise has sets (strength or activity) and optionally warmup sets. A strength set may carry a failure technique (myo-reps, lengthened-partials, static hold, drop set) when it's taken to RPE 10.
TrainingSession
├── session_id, date, program, phase, week, focus
├── is_deload_week, session_duration_minutes, weight_unit
└── exercises[]
├── number, name, target_muscle_groups, rep_tempo, exercise_type
├── current_goal (weight, sets, rep_range, rest)
├── warmup_sets[] (number, weight_kg, rep_count)
└── sets[] (discriminated union on set_type)
├── StrengthSet (set_type = "strength")
│ ├── number, weight_kg, rep_count {full, partial}
│ ├── unilateral_rep_count? {left: {full, partial}, right: {full, partial}}
│ ├── rpe?, rep_quality?
│ └── failure_technique? (discriminated union)
│ ├── MyoReps → mini_sets[]
│ ├── LLP → partial_rep_count
│ ├── Static → hold_duration_seconds
│ └── DropSet → drop_sets[]
└── ActivitySet (set_type = "activity")
├── number, rpe?
├── duration_seconds?, distance_meters?, heart_rate_bpm?
Sets are first-class; warmups are lower-resolution because I rarely care about them later. RPE is optional because I don't always log it. The model is permissive at the edges and strict in the middle — exactly the parts I query.
A few invariants the model enforces:
failure_techniqueis only valid on sets whererpe == 10.- Exercise
numberfields must be sequential starting at 1. weekmust be between 1 andprogram_length_weeks.- RPE is a whole or half step between 1 and 10.
The parser
Markdown in, TrainingSession out. Two stages: parser/extract.py walks the markdown into an intermediate dict; parser/parse.py turns that dict into a plain dict shaped directly for Pydantic. The processor injects the session_id, runs lbs→kg conversion if needed, then calls TrainingSession.model_validate() — that's where validation fires. One model layer, no translation step.
The parser is rule-based. There is no LLM in the hot path. This is a deliberate choice: parsing must be deterministic and free, and my markdown is regular enough that rules win.
TrainingSession after the fact, with a clarification loop for anything ambiguous. It will replace the rule parser, not live alongside it.Storage
Two artifacts, one source of truth.
- PostgreSQL is the source of truth. Four tables:
sessions,exercises,working_sets,warmup_sets. Foreign keys cascade on delete.failure_techniqueis stored asJSONB— its shape matches the discriminated union in the Pydantic model, withtechnique_typeas the discriminator. No ORM; raw SQL viapsycopg2. Notable columns:sessions.weight_unit;exercises.exercise_type;working_sets.set_type,duration_seconds,distance_meters,heart_rate_bpm, unilateral rep columns (left_reps_full,left_reps_partial,right_reps_full,right_reps_partial). - JSON snapshots are written to
output_training_logs_json/alongside the markdown, after the DB insert succeeds. They come fromsession.model_dump(mode="json"). Cheap, diffable, version-controllable — useful for sanity-checking the parser without touching the database.
The processor is DB-first, JSON-second: a session_id collision raises SystemExit rather than silently overwriting. Session IDs are derived from the file path (YYYY-MM-DD-sha256[:6]), so a collision means the same file was already processed.
The API
FastAPI, read-only. Three endpoints, mapped to the questions I actually ask:
| Endpoint | Answers |
|---|---|
GET /sessions | What did I do, and when? (filterable by phase, week, date range) |
GET /sessions/{id} | Full detail of one session, exercises and sets included. |
GET /exercises/{name}/history | How has this lift moved over time? Every working set, every session. |
Every request requires an X-Api-Key header. The app fails at startup if API_KEY is unset. Single key, no users table, no OAuth — it's mine and the dashboard's.
The dashboard
The dashboard is a single static HTML file rebuilt from the database by scripts/build_dashboard.py and written to docs/index.html. Clean white background, Inter for body text, JetBrains Mono for labels and numbers. Six sections:
| Section | Purpose |
|---|---|
| Overview | Total sessions · current plan · current phase · last session date |
| Session Timeline | Sessions grouped by ISO week; color-coded by focus; links to source files on GitHub |
| Strength Progression | Top weight per day per exercise (actual kg, not e1RM); goal weight as a faded dashed red line; set-level note shown in side panel |
| Weekly Load | Total kg moved per phase/week; deload weeks in red |
| Personal Bests | Heaviest set per highlight exercise — weight, reps, date |
| Program | Auto-discovered from inputs/programs/*/program.md; alias read from YAML frontmatter |
The data flow at build time:
The HTML embeds all data as inline JSON and renders charts with Chart.js. No server is required to view it. traininglogs log --publish stages, commits, and pushes the file to the website repo, where a GitHub Action deploys it.
Typography and palette
Inter for body text; JetBrains Mono for labels, numbers, and monospace elements. Background #ffffff; ink #111827; accent #dc2626 (red) for emphasis, deload bars, and the current-phase hero stat; muted #6b7280 for secondary text.
Program auto-discovery
Each program lives at inputs/programs/<slug>/program.md. The file must have a YAML frontmatter block at the top with an alias field (e.g. alias: Strength & Hypertrophy). build_dashboard.py scans all matching paths at build time and renders a link for each. No code change is needed to add a new program — drop the directory and file, set the alias, rebuild.
Rules of the road
- Never hand-edit
docs/index.html— it is fully regenerated on every build. - Always rebuild against the real DB before judging output. Synthetic data lies.
- e1RM (Epley formula) is intentionally absent — the chart plots actual top-set weight. Do not reintroduce it.
Versioning
Each session carries data_model_version and data_model_type fields in both the JSON snapshot and the DB row. Schema and parser changes are tracked in CHANGELOG.md (Keep a Changelog format). The app version is stamped in the eyebrow of this document and auto-synced by CI on merge to main.
Schema changes to date have been additive (new columns with defaults). db/schema.sql is the authoritative schema; db/db.py:apply_schema() is run explicitly by migration/import scripts (not automatically on app startup). Breaking migrations will be documented here when they land.
What's next
- Track B — AI capture. A PWA captures free-form workout notes offline. On reconnect, Claude (Haiku 4.5) maps the draft to
TrainingSession, asks clarifying questions if the mapping is incomplete, confirms the parsed summary with me, and only then writes to the DB. Replaces the rule parser. - Cloud sync. Supabase as a remote replica of the local Postgres. Writing locally remains the source of truth; the cloud copy exists so the dashboard build and the API can run from anywhere.
- Programs. A layer above sessions: a training plan describes intended sessions (exercise, target sets, target weight), and logged sessions reconcile against the plan. This is where "did I follow the program?" becomes a query.
Each of these will get its own section here when it lands. This document is meant to be the one place that explains the system as it is today, not as it was or as it might be.