Penmark

Penmark is a Rust toolkit for creating, playing, and analyzing Nikoli-style pen-and-paper puzzles. Through a unified view of pen-and-paper puzzles, a new constraint-programming engine written from scratch, and a portable WASM library, Penmark aims to be the best-in-class toolkit for these puzzles in every regard.

This book documents how Penmark is structured, why it’s structured that way, and how to use it. It’s updated regularly throughout development; changes often appear here before being implemented.

What’s in this book

  • Core concepts — the conceptual layers (genre / puzzle / game), grid substrates, regions, and the rule vocabulary that every constraint draws from.
  • Implementations — the trait surface (Puzzle, Game, the four action traits), tagged serialization, dispatching, the propagation engine, and the Puzzlehound dataset.
  • Solver, Generator, Grader, Enumerator — one section per action.
  • Frontends — CLI verbs and the directory layout that holds it together.
  • Roadmap — MVP scope, performance state, open work.

Chapters are mostly self-contained; cross-links handle the cases where one needs to lean on another. Skim the library-structure and genre-puzzle-game chapters first if you want orientation; otherwise jump wherever.

Penmark — Business Plan

Version 0.9 — strategic clarity document, not a pitch deck


Penmark will be the new home for setters and solvers of pen-and-paper logic puzzles. With a best-in-class logic engine, an integrated web and mobile experience, and a marketplace and platform connecting fans to their favorite creators, Penmark will become a dominant force in the puzzle world. Penmark aims to become the Bandcamp of puzzles, letting fans directly support the livelihoods of the creators behind the puzzles they love.


Market and incumbent analysis

The relevant market is the Cracking the Cryptic orbit and the broader Nikoli-style logic puzzle community. It is small, but extraordinarily passionate, with a clear willingness to pay (existing setter Patreon revenue demonstrates this) and a clear under-served need: no integrated marketplace exists, and no embeddable player exists either.

Sudokupad is the incumbent. Its strengths are a polished free web experience, deep integration with the CtC YouTube ecosystem, and trust within the variant-Sudoku community. Its weaknesses are real and structural: the mobile apps are paid, falling behind on updates, and a recurring source of complaints. It has no creator profiles, no follows, no microblogs, no discovery, no payment infrastructure, and no embeddable player. Sharing happens via Discord links and DMs. Setters monetize off-platform through Patreon and Ko-fi, decoupled entirely from where their work is consumed.

Sudokumaker is editor-focused with no marketplace, no community, and no embed.

Adjacent niches — Cross+A, Logic Masters Deutschland, Janko.at — are either community-only with no commerce, specialist tools, or aimed at a different (often older or more puzzle-construction-purist) audience. None offer embeds.

The market today is small not because the audience is small but because the rails to monetize don’t exist. Top setters earn a few hundred to a few thousand dollars a month from Patreon — meaningful side income, not a livelihood. Penmark’s purpose is not to take a percentage of an existing pie but to grow the pie by 100x — to make full-time puzzle setting a viable career for the people this community is already built on. Without that ambition, take rates don’t matter; strategy doesn’t matter; Penmark is just another tool layered onto a constrained market. The mission is the multiplier.


The product

Penmark’s engine is the foundation: a constraint-programming substrate that grades, solves, plays, edits, and generates, designed so new puzzle genres are cheap to add. Sudoku, Slitherlink, and Akari today; Nurikabe, Heyawake, Hashiwokakero, Masyu, and the rest of the Nikoli family within reach on the same architecture. The advantage is breadth — one engine, every Nikoli-style puzzle, an integrated authoring loop on top.

On top of the engine, four product surfaces.

The player — the puzzle-playing software in any of its deployments, distinct from “players,” the people who use it — has three modes. A web destination at penmark.com, where every puzzle has a shareable URL that anyone can play (no account required, like Sudokupad), wrapped in browse, discovery, profiles, and the marketplace. A native mobile app shipped via Capacitor and Fastlane to iOS and Android. And — the wedge — an embeddable iframe that any setter can paste onto their existing website, free, no commitment, with a “Powered by Penmark” footer, a setter-profile link, and a “get the rest of this pack on Penmark” CTA when the puzzle is part of a paid pack. The embed makes setter websites potential distribution surfaces — a foot-in-the-door bet, more constrained in scale than Bandcamp’s or itch.io’s web-wide presence, but enough to seed the niche if the conversion mechanics work.

The editor is for creating puzzles. Browser-based, no exporting required, instant publish — both to the marketplace and to embeds placed elsewhere. Creating and publishing happen in the same tool, in the same place where the audience already is.

The marketplace and social layer is the business. Profiles, paid and free packs, follows, microblogs, monthly support tiers. The marketplace generates revenue; the social layer is what makes the marketplace work — without follows, microblogs, and discovery, a paid pack is just a link, and links can be hosted anywhere.

The account model is unified, free, and role-fluid. Penmark has people accounts, not role-specific accounts: any account can be used to play, set, solve on stream, or publish at scale. Casual fans are players; creators are setters; performers who solve in front of an audience are streamer-solvers; companies and labels like GMPuzzles are publishers, running curated packs from many setters. People hold multiple roles at once, or move between them over time. Everyone gets the same primitives — profile, microblog, puzzle registry, embed-code generator, basic stats, marketplace publishing. Bulk uploads are gated lightly to deter IP theft; otherwise nothing is locked behind a tier.


Channel — how people hear about Penmark

Awareness is two distinct problems. Setters and solvers find Penmark through different paths, and the strategy has to work both flywheels deliberately.

Setters become aware through three paths. Direct outreach from the founder, in the early days — manual and unscalable, but the only way to seed the first ten. Setter-to-setter word of mouth, once those first ten are happy. And ambient visibility — embeds on respected setter websites that other setters notice, and Penmark’s name appearing across CtC-orbit communities: Reddit’s r/sudoku, the CtC Discord, the YouTube setter community, niche Twitter. Ambient visibility cannot be rushed. It requires showing up over months not weeks, commenting thoughtfully on others’ work, sharing progress openly, slowly becoming someone the community knows by name. The trust earned this way cannot be faked. Search is a smaller channel that pays off late but pays off, since there is currently no obvious result for setters Googling “host my puzzles” or “puzzle marketplace.”

Solvers become aware through the embed and through Penmark URLs. Every embedded puzzle on a setter’s website is an impression for Penmark — footer link, setter-profile link, and, when relevant, a “get the rest of this pack on Penmark” CTA. Penmark URLs shared in Discord, Reddit, and YouTube descriptions land solvers directly on penmark.com, where the rest of the platform is right there. Secondary solver channels are word of mouth in puzzle Discords and subreddits, mentions in Cracking the Cryptic videos when a setter’s pack is being solved, and — once it ships — mobile app store discovery.

The two flywheels compound when they intersect. A setter publishes on Penmark or embeds on their site. Solvers play, see the brand, click through, follow the setter on Penmark, subscribe to support them. The setter sees real engagement and tells another setter. The second setter publishes. Repeat. The growth model is: every embed and every shared URL generates both setter referrals and solver clicks from the same primitive. Slow to start; self-sustaining once turning.

Channels deferred, then opened. In v1, no paid acquisition, no SEO content farms, no influencer deals, no launch-day press push. The niche is small enough that hype-style marketing reads as inauthentic and damages the only growth lever available at this stage. Once the niche is seeded — Penmark a real platform with real setters publishing real packs — the calculus changes. Growing the market beyond the CtC orbit will require paid ads and direct outreach: those are the tools that take a niche to 100x, and patience alone is not. Patience is for the founding period; the marketing engine is for the growth period.


Wedge and differentiation

A useful distinction: the wedge is how Penmark gets its first setters; the differentiators are why those setters stay, grow, and recommend it.

The wedge: distribution as embeddable infrastructure

The primary wedge is the iframe embed. The first ask to a setter is not “migrate your work and publish exclusively here” — it’s “paste this iframe on your existing site, free, no commitment, your existing puzzles work in it.” That’s the lowest-commitment first touch in the market and the highest-leverage move for Penmark, because every embed is a piece of distribution: a “Powered by Penmark” footer, a link to the setter’s Penmark profile, and a “get the rest of this pack on Penmark” CTA when relevant. Setter websites with Penmark embeds become Penmark’s marketing channel.

For setters without personal websites — which is most variant-Sudoku setters in the CtC orbit — the parallel low-friction path is publishing directly to Penmark with shareable URLs, the same way Sudokupad links circulate today. Same low commitment, same conversion mechanics, just funneling through penmark.com instead of the setter’s own domain.

Differentiators (why setters stay, grow, and recommend Penmark)

Creator economics and identity. Profiles, packs, follows, microblogs, paid support. A setter’s Penmark page becomes their home, replacing the awkward Patreon-plus-Twitter-plus-Discord triangulation. Creators stay where they feel like they have a page, not a payload.

Polished mobile experience. Sudokupad’s paid, neglected mobile app is a real opening. A free, polished, frequently-updated Penmark mobile app would win on UX alone for the meaningful subset of CtC fans who want to solve on phones and tablets but currently don’t because the available app is mediocre. Mobile ships in Phase 3, after the embed and the marketplace are working.

Multi-genre breadth, growing. Sudoku is the entry point but the engine architecture means the long-term position is being the home for all Nikoli-style puzzles. No competitor has this. It also strengthens the embed pitch — “embed any Nikoli-style puzzle with one tag” is cleaner than “embed a Sudoku” and nobody else can offer it.


Beachhead audience

Variant-Sudoku enthusiasts in the CtC orbit, specifically: active solvers who follow a small set of setters, setters currently publishing free puzzles and monetizing via Patreon, and the Discord-and-link-sharing crowd who would benefit from in-platform discovery. Small but passionate — the right shape for a marketplace that doesn’t need millions of users to start working, even if it needs many more to fulfill the mission.

Streamer-solvers are the demand engine. In the CtC orbit, fans show up because Simon and Mark are solving; they engage with the setter retrospectively. Setters are supply, but streamers are demand. Penmark’s pitch to a streamer-solver is simple: you want your audience to support the setters whose puzzles you solve. Penmark makes that one click. The streamer doesn’t need to do anything beyond linking to Penmark in their video description — fans see the puzzle, the setter’s profile, the support button, all in one place. Penmark becomes the easy answer to the “how can I support these creators” comment that streamers already field. This is the highest-leverage acquisition lever in the niche, and it costs Penmark nothing beyond making sure the streamer-to-fan-to-setter path is friction-free.

The embed wedge makes the supply side of this beachhead more tractable than it would otherwise be. Recruiting the first ten setters as embed-only adopters is a much easier conversation than recruiting them as marketplace publishers. Once embedded — or published directly to Penmark — the marketplace conversion happens organically: they already trust the player, so the next step is a small one rather than a leap.

Going broader from day one (mobile casuals, education, classic Nikoli demographics in Japan) is expensive and unfocused for a solo bootstrapper. For v1, Sudoku means Sudoku and only Sudoku. Other genres come once Sudoku has traction.


Business model

The model is intentionally simple, in the spirit of free-as-possible-to-encourage-growth. Three revenue streams, each scaling differently and compounding together.

Flat percentage on pack sales. Taken from setters at the point of transaction. Bandcamp’s 15% digital cut (dropping to 10% past $5K in sales) is a useful benchmark; landing somewhere in the 10–15% range is more generous than Apple’s 30% while still sustainable. Avoid tiered or volume-based percentages early — they’re complicated to communicate and don’t change behavior at small scale.

Monthly creator support tier. Patreon-style, with the same flat percentage. Backers get a feed of microblog updates and any creator-designated perks (early access, supporters-only packs).

Penmark Premium. A direct solver subscription at $5/month or $50/year. Premium subscribers get unlimited hints and no ads. Free users get a hint cap per puzzle and modest, non-intrusive ads. Premium is the friction-removal product — solvers who play often will pay to remove the soft ceiling on hints and the ad presence; casual solvers stay free. Crucially, this revenue stream doesn’t depend on pack sales — it scales with raw user activity, compounding with the 100x mission. Ships in Phase 2 alongside payment infrastructure.

Free puzzles must remain a first-class concept. Solvers can play any puzzle on Penmark or via embeds without paying anything. Creators can publish free packs alongside paid ones — exactly the Bandcamp model where artists release free tracks, name-your-price albums, and paid albums under one roof. The free tier’s limits (hint cap, light ads) are calibrated to be felt by heavy users only — never inside the play view, never blocking access. The goal is to make Premium worth paying for without making free feel hostile.

Pay-what-you-want pricing is worth supporting from launch. Deeply on-brand for the Bandcamp framing and lowers the friction for setters experimenting with paid work for the first time.

The embed is free and stays free. The temptation to introduce a paid “remove Penmark branding” tier should be resisted early — the branded embeds are the marketing channel, and pricing them away optimizes against your own distribution. The embed monetizes indirectly, by funneling solvers back to Penmark where the marketplace and Premium subscription live.

Ads exist only as part of the freemium structure. Surfaced to free-tier users as a soft conversion lever toward Premium, never as platform-wide noise and never inside the active solving experience. Calibrated to be felt by heavy users without making free feel hostile.

The unit economics now have multiple compounding streams. Pack sales: at a 10–15% take, $1K/month of personal income from packs alone requires roughly $7K–10K of GMV per month — 700–1,000 pack purchases at $10 each. Premium subscriptions: at $5/month, 200 paying subscribers contribute $1K/month directly to Penmark; this scales linearly with engaged-solver count rather than pack volume. Creator support: additional flat-percentage take on monthly Patreon-style subscriptions to setters. These three streams describe ramen profitability for one person; the 100x ambition implies a market many times larger, where many setters earn full-time and Penmark scales to a small team.
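The arithmetic above can be made concrete. A minimal Rust sketch, using only the illustrative numbers already quoted in this plan (10–15% take, $10 packs, $5/month Premium, a hypothetical $1K/month income target) — these are assumptions for illustration, not forecasts:

```rust
/// GMV needed per month to yield `income` at a flat `take` rate.
fn gmv_needed(income: f64, take: f64) -> f64 {
    income / take
}

fn main() {
    let income_target = 1_000.0; // hypothetical $1K/month of platform income

    // Pack sales: at a 10–15% take, $1K of income needs roughly
    // $6.7K–$10K of GMV — i.e. 700–1,000 pack purchases at $10 each.
    let gmv_at_15 = gmv_needed(income_target, 0.15);
    let gmv_at_10 = gmv_needed(income_target, 0.10);
    let packs_at_ten_dollars = gmv_at_10 / 10.0;

    // Premium: $5/month subscribers contributing the same $1K directly,
    // scaling with engaged-solver count rather than pack volume.
    let subscribers_needed = income_target / 5.0;

    println!("GMV at 15% take: ${gmv_at_15:.0}");            // ≈ $6,667
    println!("GMV at 10% take: ${gmv_at_10:.0}");            // $10,000
    println!("$10 packs at 10% take: {packs_at_ten_dollars:.0}"); // 1,000
    println!("Premium subs for $1K: {subscribers_needed:.0}");    // 200
}
```

The point of writing it down is the shape, not the digits: pack income divides by take rate before it divides by price, while Premium income divides by price alone — which is why the two streams scale with different inputs.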

The strategic implication of three streams: Penmark is not a pure marketplace dependent on creator success. Premium subscriptions give a direct B2C revenue base that survives even when the marketplace is still warming up. This de-risks the business model and makes solver acquisition (not just setter recruitment) materially important.


Go-to-market — solo bootstrapper edition

With no team and no growth budget, Penmark’s path is patient, supply-side, and embedded in the community.

Lead with the embed (or direct Penmark publishing), not the marketplace. When approaching the first ten setters, the offer is: “I built a puzzle player that you can either embed on your site or publish through, free, no commitment. Your existing puzzles work in it.” This is a much smaller ask than “come publish exclusively on my new platform” and gets to yes faster. Once a setter is hosting puzzles via Penmark in either mode, the marketplace conversation later (when Phase 2 ships) is “you already trust the player; here’s the marketplace where you can sell packs” — a much warmer conversation than starting from cold.

Hand-recruit the first ten. Personally identify them, personally reach out, personally onboard them. Do the embed setup yourself if needed. The first ten matter enormously; treat them as design partners, not customers. Talk to them weekly.

Make setter onboarding frictionless. The marketplace is supply-constrained, not demand-constrained. The editor needs to be excellent. The publishing flow needs to be one click. Imports from existing puzzle formats need to work. Reduce every gram of friction between a setter having a finished puzzle and that puzzle being live — both as embed and as marketplace listing.

Track embed-to-marketplace conversion from week one. This is the single most important early metric. If solvers click through from embedded puzzles to Penmark profiles and end up subscribing or buying packs, the wedge works. If they don’t, the embed is being too polite — strengthen the CTAs, add a more visible “Penmark profile” link, surface the buy-this-pack call earlier in the solve flow.

Cultivate streamer-solver relationships early. The streamer-solvers in the CtC orbit are a small group; over months of being a CtC-orbit citizen, get on their radar. The goal is not transactional sponsorship — it’s making sure that when a streamer wants to point fans toward “support this setter,” Penmark is the natural answer. This is the highest-leverage demand-side relationship Penmark has, and it needs deliberate cultivation, not just hoping it happens.

Ship the mobile app once embeds and the marketplace are working. The web destination is mobile-responsive, which covers the immediate need. The native mobile app becomes a depth-of-engagement play once there’s marketplace content density worth a flagship experience. This is why mobile sits in Phase 3, not Phase 2.

What not to do: don’t launch with multiple genres, don’t build an algorithmic feed, don’t add gamification or social-game mechanics. None of these earn trust in this community, and trust is the only growth lever available at this stage.


Roadmap

Penmark ships in six phases over roughly twenty-four months — engine foundation; embeddable player and accounts; marketplace, social layer, and Premium; mobile; multi-genre expansion; sustainability decision.

Phase 0 — engine and editor solidified. Largely done. The constraint-programming substrate handles Sudoku, Slitherlink, and Akari. This is the foundation everything sits on.

Phase 1 — embeddable player, direct Penmark hosting, and people accounts (next 3–6 months). Ship the iframe embed: clean, fast, mobile-responsive, with mandatory branding, setter-profile link, and pack CTA. Ship penmark.com as a play surface with shareable URLs that work like Sudokupad’s. Build the unified people-account experience: claim a profile, register puzzles, generate embed codes, see basic play stats. Hand-recruit five to ten setters; personally onboard them via whichever mode fits (embed for those with sites, Penmark URLs for those without). Goal: live setter presence either embedded or hosted, and qualitative validation that the player is at parity or better than what the setter currently uses.

Phase 2 — marketplace, social layer, and Penmark Premium (months 6–9). Paid and free packs, microblogs, follows, monthly creator support tiers, payment integration (Stripe or similar), pay-what-you-want pricing, and the launch of Penmark Premium ($5/month or $50/year for unlimited hints and no ads) alongside the free tier with hint cap and light ads. Open signups. Phase 1 setters become first publishers. Begin streamer-solver outreach in earnest. Goal: the loop “embed/URL → click through → solver subscribes or buys → setter gets paid” works end to end, and free-to-Premium conversion is measurable.

Phase 3 — mobile app (months 9–12). Capacitor + Fastlane to iOS and Android. Launch as the better, free, well-maintained alternative to Sudokupad’s mobile apps. Goal: become the recommended answer in CtC-orbit threads about mobile.

Phase 4 — multi-genre expansion and market growth (months 12–18). Slitherlink and Akari go live to the public. Add one or two more genres opportunistically. The pitch shifts from “Sudoku marketplace” to “the home for Nikoli-style puzzles.” Begin paid acquisition and direct outreach to puzzle-curious audiences outside the CtC orbit. This is the phase where the marketing engine turns on.

Phase 5 — sustainability decision (months 18–24). Ramen profitability either exists or it doesn’t. Decide whether to raise, hire, stay solo, or pivot. Funding is on the table if mission acceleration justifies it; bootstrap remains plan A.


Operating model

Solo and bootstrapped to start. Funding is on the table if mission acceleration justifies it, but the primary plan is bootstrap. Costs are mostly hosting, payment-processing fees, mobile developer account fees, and time. The financial model isn’t complicated — what matters is the discipline.

Define ramen profitability concretely. Pick a monthly personal-income number that covers rent and basic costs, then work backward across all three revenue streams (pack take, creator-support take, Penmark Premium subscriptions) to a milestone for “Penmark works.” Without a number, it’s hard to know whether to keep going or pivot.

The embed introduces a hosting-cost watch item. Embedded plays consume bandwidth and compute even when they don’t generate revenue. At small scale this is negligible; at large scale (thousands of plays per day across many embeds) it can become a real line. Build modest cost monitoring early so that if an embed goes viral the costs don’t surprise. CDN-friendly embed architecture matters more for this reason than for performance alone.

The most important operational habit is constant conversation with setters and streamer-solvers. Ten substantive conversations with passionate users will shape the product more than a hundred hours of solo design. The temptation as a solo founder is to build in isolation; resist it. Talk to people more than you build.

Resist feature creep. The product surface area is already large (engine + editor + player + embed + marketplace + social + Premium + mobile). The bar for any new feature should be: does this make a setter more likely to embed or publish, a streamer more likely to recommend, or a player more likely to come back tomorrow (and convert to Premium)? Features that don’t pass that bar get cut.

Mission-success implies a small team eventually — hosting at scale, payment disputes at scale, support and moderation at scale. Cross that bridge when the volume forces it. The lean operating model is the v1 model, not the v3 model.


Risks and open questions

The risks worth losing sleep over.

Creator cold start, even with the embed. The embed wedge lowers the bar but doesn’t eliminate it. Without pre-existing community connections, breaking into the CtC orbit as a credible host is still the single biggest risk. The embed makes the first conversation easier but doesn’t solve getting to that first conversation. Hand-recruiting and patient community presence over months are the only answers.

Embed succeeds, marketplace doesn’t. If solvers play embedded puzzles but never click through to Penmark, you’ve built free infrastructure with hosting costs and no revenue. Mitigation: track conversion from week one, tighten CTAs aggressively if conversion is weak, be willing to make the embed slightly less convenient if the marketplace funnel needs more pressure.

Premium conversion is too low. The freemium math depends on a non-trivial fraction of free users upgrading. Most freemium apps see 2–5% conversion at steady state; below ~1% the model doesn’t pay for itself. Mitigation: calibrate hint cap and ad intensity as soft conversion levers, A/B test with real users, raise the friction (carefully) if conversion lags.

The market doesn’t grow. The 100x ambition assumes the CtC-orbit audience can be converted into paying fans of multiple creators at much higher rates than today, and that paid acquisition in Phase 4+ can bring in new puzzle fans from outside the orbit. If either fails — if willingness-to-pay caps lower than expected, or if non-orbit audiences turn out to be expensive to acquire — Penmark works as a niche tool but the mission fails. The mitigation is the cumulative effect of every other choice in this plan: lower friction to pay, give creators real homes, make supporting feel personal, make new creators emerge because the rails exist.

Cracking the Cryptic builds their own marketplace. They have the audience, trust, setter relationships, and have already shipped two sudoku apps; “CTC Marketplace” would be a one-quarter project for them and could win by default through audience alone. Operating instinct says they’re not the type — they’re solvers and YouTube creators, not platform builders — but include this as a hedge. Mitigation: stay CTC-friendly. Don’t compete with them on solving content; complement them.

Sudokupad responding. They could ship an embed feature in a few weeks if they wanted. Historically unlikely given the mobile-app neglect signal, but possible. The defensible asset isn’t the embed itself but the relationships with setters who adopted Penmark first. Speed in shipping a credible embed and recruiting launch setters matters partly for this reason.

Scope creep on a solo founder. Engine, editor, player, embed, mobile, marketplace, social, Premium, multi-genre. That’s a lot of surface for one person. The roadmap is sequenced to ship one wedge at a time, but the temptation to build adjacent things will be constant. Holding the line on phasing is the single biggest discipline question.

Solo burnout. Bootstrapped, niche, slow validation, many months of low signal. Build in personal sustainability — runway buffer, time off, social outlets outside the project. Most solo bootstrap failures aren’t strategic; they’re emotional.

The open questions worth resolving soon, but not blocking on now.

Freemium calibration. What’s the right hint cap per puzzle for the free tier — 3? 5? Per-genre? Per-difficulty? And what’s the right ad intensity — interstitial between sessions only, or also a banner outside the play view? Both need to be tuned through real user behavior, not chosen up front. Likely starting position: 3 hints per puzzle, light interstitial ads between sessions only, never inside the play view. Aim for >2% free-to-Premium conversion at steady state.

Premium and creator support bundling. If a solver pays $5/month to support a specific setter, do they also get Premium features (unlimited hints, no ads) bundled in? Generous interpretation: yes, support-tier pays for both; cleaner: separate. Tentative position: keep separate for clarity early on; consider bundling later if it drives more support-tier conversions.

v1 embed scope. Tentative position: the v1 embed includes a full play experience, mandatory Penmark branding, a setter-profile link, and a setter-configurable “buy this pack on Penmark” CTA when the embedded puzzle is part of a paid pack. Defer creator analytics (plays, completions, time-to-solve) to v1.5 — useful but not needed for the wedge to work, and adding scope. Worth validating with the first launch setters before locking in.

Take rate. Pick something defensible (10% or 15%) and commit; don’t agonize.

Pay-what-you-want. Likely yes from launch — very on-brand and lowers setter experimentation friction.

iOS-first vs. simultaneous mobile launch. Capacitor makes simultaneous more feasible than native, but iOS-first probably still wins for a solo founder (smaller surface, more willing-to-pay audience).

Bulk-upload gating mechanism. Gate at what threshold, verified how? Small uploads remain frictionless; bulk uploads need just enough friction to deter scraping and re-uploading a publisher’s catalog. Likely a quiet rate-limit plus manual review for first-time bulk uploaders. To be designed properly closer to publisher onboarding.

Moderation and IP at scale. Open marketplaces eventually face spam, plagiarism, floods of AI-generated puzzles, copyright disputes between setters, and scrape-and-republish attacks. The plan defers a real moderation policy until volume forces it; this is acceptable for v1, but it is genuine ongoing operational design that will need a position before publishers like GMPuzzles trust the platform with their catalogs.

Smaller open questions. Legal structure (LLC vs. sole prop); light-touch policy for puzzle copyright disputes (only escalate when needed); language scope (English-first — Japanese is a tempting later market but not a v1 fight).

The questions you do not need to answer yet: educational and B2B variants, exact pricing of every premium feature beyond hints/ads, whether to offer creator analytics beyond basic plays/completions, anything beyond Phase 2.


Summary of strategic positions

These are the commitments to preserve if the document ever compresses to a single page.

The mission is to grow the puzzle market by 100x, not to take a percentage of an existing one. The point is to make full-time puzzle setting a viable career for the people this community is already built on. That ambition is the multiplier behind every other choice.

This is a small-business mission — not a lifestyle business, not a venture rocket. Penmark aims to be a profitable, real, growing operation in a niche-then-broader market. Bootstrap with funding optionality; solo with small-team optionality; patient marketing in v1, paid acquisition and outreach post-seeding.

Penmark is infrastructure first, destination second. The engine and embeddable player deploy across the web; the marketplace and social layer live at the destination they funnel into. Penmark URLs work like Sudokupad’s for setters without sites; embeds work for setters with them.

The wedge is the embed (or direct Penmark publishing), not the marketplace. The first ask to a setter is “host your puzzles here, free, no commitment,” not “migrate to my platform.” It is the sharpest competitive lever available.

Streamer-solvers are the demand engine. Setters are supply; streamers drive fans toward setters. The pitch to streamers is: Penmark makes supporting your setters one click for your audience.

The differentiators — creator economics, polished mobile experience, multi-genre breadth — are reasons setters stay and grow, not reasons they show up. Acquisition runs on the wedge; retention runs on the differentiators.

Accounts are unified, free, and role-fluid. Penmark has people accounts; players, setters, streamer-solvers, and publishers all use the same primitives. Bulk uploads gated for IP protection; nothing else locked behind a tier.

Three revenue streams, compounding. Flat take on pack sales, flat take on monthly creator support, and Penmark Premium ($5/month or $50/year) for unlimited hints and no ads. Free remains genuinely playable — hint caps and light ads only, calibrated as soft conversion levers toward Premium. The embed stays free and Penmark-branded; no paid “remove branding” tiers.

Sudoku is the beachhead. Multi-genre is the long-term moat.

The biggest risk is creator cold start. The highest-leverage early action is hand-recruiting and personally onboarding the first ten setters. The marketplace conversation comes later, on a warmer base. CTC building their own marketplace is a hedge-risk; stay CTC-friendly.

Solo and bootstrapped means saying no to almost every feature, talking to setters and streamer-solvers constantly, and shipping one wedge at a time without skipping ahead.

Background

Why Penmark exists and what came before. Read these before the technical chapters if you want context; skip them if you already know the puzzle-engine landscape.

Motivation

Why Penmark exists, what gap it’s filling, and what it unlocks once it does. To be fleshed out.

Why one engine

The shape of the existing ecosystem — Tatham’s collection, PuzzleKit, per-puzzle solvers, SudokuPad, pzv.jp, F-Puzzles — and what none of them cover end-to-end.

Why now

What changed: Rust + WASM let one codebase serve CLI, desktop, and browser without rewrites; modern CSP techniques handle this catalog cleanly; large public puzzle datasets exist to mine.

What this enables

The tools, datasets, and feedback loops Penmark makes possible once the core engine is real — authoring, generation, grading, discovery, cross-genre infrastructure.

Non-goals

What’s intentionally out of scope, so scope creep has somewhere to be pushed back against.

Prior art

Penmark sits at an intersection of two ecosystems, and we want to be honest about both — the people doing the same engineering work, and the people serving the same users.

On the engine side (libraries that solve / grade / generate logic puzzles):

On the player / editor side (where Penmark’s WASM embed and ordo’s archive land):

  • SudokuPad (https://sudokupad.app) — best-in-class Sudoku player + variant authoring; the bar for in-browser puzzle UX.
  • pzv.jp / Puzz.link (http://pzv.jp/) — canonical multi-puzzle player with the URL-encoded share format that everyone consumes.
  • F-Puzzles (https://f-puzzles.com/) — Sudoku-variant editor + player popular with constructors.

Penmark’s ambition is to be one library covering both halves: the engine work that Tatham, PuzzleKit, and copris-puzzle do, plus the embeddable player + editor that SudokuPad / pzv.jp / F-Puzzles ship. One core crate, one rule vocabulary, three frontends. None of the above does all of it; whether we end up doing all of it is what implementation will decide.

Puzzle families

Different puzzle genres need fundamentally different algorithms, not just different rules. Penmark today is a constraint-satisfaction engine — built around Puzzle = (Substrate, Vec<Constraint>, &'static Genre) and a Game<'p> whose state is per-coord candidate sets. That covers one large family natively. Several other families share substantial plumbing (substrate, geometry, parsing, dataset infra, frontends) but need different solvers, graders, and state models entirely. A few are different enough that hosting them in penmark would be a different project.

This chapter maps the landscape so we can be specific about what penmark covers, what it could cover with focused work, and what’s out of scope.

How this chapter classifies

Genres are grouped by what algorithm solves them, since that’s the axis that decides reuse. Rules and aesthetics matter for play; they don’t determine whether penmark’s Propagator and FastSolver apply.

For each family, this chapter lists:

  • What — examples and the canonical solver shape.
  • Reusable from penmark today — concrete subsystems that transfer without modification.
  • Missing — subsystems that would need to be added.

The same vocabulary recurs across families. Reusable pieces are almost always the static layer: Substrate + Coord + Region + Mark + PuzzleMeta + tagged envelope + parsing dispatch + dataset infra + bench infra + the eframe paint pipeline + the Tier / GradeStatus taxonomy. What varies is the rule layer and the solver track sitting on top.

1. Constraint satisfaction

Penmark’s home family.

  • Examples: Sudoku, Akari, Slitherlink, Nurikabe, Heyawake, Star Battle, Kakuro, Masyu, Hitori, KenKen, Yajilin, Tents, Skyscrapers, Nonograms, Numberlink, Battleships, Einstein/zebra logic puzzles, cryptarithmetic, futoshiki.
  • Solver shape: backtracking DFS + arc-consistency propagation, or SAT / CP-SAT for harder instances. Domains are small (2–9 marks per coord); the propagator narrows them via per-rule reasoning until every coord is singleton.

Reusable from penmark today: everything. Akari is wired end-to-end; Sudoku and Slitherlink are partway. Adding a new genre is a Genre impl plus the rule list its build_constraints emits. See the Adding a genre chapter.

Missing: more Rule variants for genres that aren’t yet expressible — for instance, Hashi’s “edge multiplicity 0/1/2” needs a small extension to DegreeIn, and Yajilin’s arrow-with-count clue needs a directional region shape. The genre expansion plan tracks which rule kinds each upcoming genre promotes from stub to real.

2. State-space planning

Sequential games. State changes with every move; a “solution” is a plan, not an assignment.

  • Examples: Rush Hour, Sokoban, Klotski, the 15-puzzle and other n-puzzles, Tower of Hanoi, Lights Out, peg solitaire, Atomix, Lunar Lockout, Bloxorz, Rubik’s cube.
  • Solver shape: BFS for shortest-plan (when state space is small), IDA* with a heuristic and pattern databases for larger ones, parent-pointer reconstruction for the move sequence. State is typically a packed bitboard or a small tuple, hashed into a visited-set.

Reusable from penmark today:

  • Substrate + Material for the static board (walls, exits, special cells).
  • Coord + Region + grid geometry helpers for legality checks (“which cells does this car occupy after sliding”).
  • Tagged-JSON envelope, content-hash, PuzzleMeta.
  • The half of Genre that describes “what does the level look like”: NAME, LAYERS, MARK_RENDER, default_substrate, parse_text, to_janko_text, dataset / parsing dispatch.
  • Tier / GradeStatus for difficulty bucketing.
  • CLI scaffolding, dataset loaders, eframe grid paint pipeline.

Missing:

  • A planning sister-trait (call it PlanningRules) parallel to the CSP-rules half of Genre, with associated State, Move, legal_moves, apply, unapply, is_goal, optional heuristic, and a canonicalize for visited-set keys.
  • Generic BfsPlanner<P> and IdaStarPlanner<P> returning Plan = Vec<P::Move>.
  • A planning-shaped grader (min-plan-length → Tier).
  • A planning-shaped generator (sample layout, run BFS, accept if min-length is in target band).
  • An eframe view for stepping through / animating a plan.

The apply/unapply discipline is conceptually the same as FastGame’s trial-1 rollback, but the trait surface differs enough that sharing one impl isn’t worth it — better to keep CSP and planning as parallel solver families.
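To make the shape of that parallel solver family concrete, here is a minimal sketch of a planning sister-trait plus a generic shortest-plan BFS. Every name here (`PlanningRules`, `bfs_plan`, the toy `TokenLine` genre) is hypothetical, chosen to mirror the list above, not penmark's actual API:

```rust
use std::collections::{HashSet, VecDeque};

// Hypothetical sketch of the planning sister-trait described above.
trait PlanningRules {
    type State: Clone;
    type Move: Clone;
    fn legal_moves(&self, s: &Self::State) -> Vec<Self::Move>;
    fn apply(&self, s: &Self::State, m: &Self::Move) -> Self::State;
    fn is_goal(&self, s: &Self::State) -> bool;
    /// Visited-set key (the `canonicalize` from the list above).
    fn canonicalize(&self, s: &Self::State) -> u64;
}

/// Shortest plan by BFS; each queue entry carries its plan-so-far, which is
/// simpler (if more memory-hungry) than parent-pointer reconstruction.
fn bfs_plan<P: PlanningRules>(rules: &P, start: P::State) -> Option<Vec<P::Move>> {
    let mut seen = HashSet::new();
    seen.insert(rules.canonicalize(&start));
    let mut queue = VecDeque::from([(start, Vec::new())]);
    while let Some((s, plan)) = queue.pop_front() {
        if rules.is_goal(&s) {
            return Some(plan);
        }
        for m in rules.legal_moves(&s) {
            let next = rules.apply(&s, &m);
            if seen.insert(rules.canonicalize(&next)) {
                let mut p = plan.clone();
                p.push(m.clone());
                queue.push_back((next, p));
            }
        }
    }
    None
}

/// Toy genre: a token on cells 0..=9 that may step +2 or -3; goal is cell 7.
struct TokenLine;

impl PlanningRules for TokenLine {
    type State = i32;
    type Move = i32;
    fn legal_moves(&self, &s: &i32) -> Vec<i32> {
        [2, -3].into_iter().filter(|d| (0..=9).contains(&(s + d))).collect()
    }
    fn apply(&self, &s: &i32, &m: &i32) -> i32 { s + m }
    fn is_goal(&self, &s: &i32) -> bool { s == 7 }
    fn canonicalize(&self, &s: &i32) -> u64 { s as u64 }
}

fn main() {
    let plan = bfs_plan(&TokenLine, 0).expect("goal reachable");
    println!("shortest plan: {:?} ({} moves)", plan, plan.len());
}
```

An `IdaStarPlanner` would reuse the same trait surface, swapping the queue for depth-limited DFS plus the optional heuristic.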

See the Roadmap for whether this is planned.

3. Exact cover and packing

Find a subset of placements that covers every cell exactly once.

  • Examples: pentominoes, polyomino tilings, tangram, Soma cube, Eternity II, jigsaws, peg-grid placement puzzles, Sudoku itself when reformulated as exact cover.
  • Solver shape: Knuth’s Algorithm X with Dancing Links (DLX). Expressible as CSP but DLX dominates by an order of magnitude on this shape, because the “choose one constraint to satisfy next, branch over the rows that satisfy it” loop is unusually well-matched to the data structure.

Reusable from penmark today:

  • Substrate, region, coord, geometry — exactly the same vocabulary for cells / edges / vertices.
  • Mark works for piece-id labelling.
  • Parsing, dataset infra, eframe paint, CLI scaffolding.
  • Generation patterns (cancel: AtomicBool, time budgets, seeds).

Missing:

  • A DLX solver. Penmark’s Propagator can technically encode exact cover via ExactCount { mark, n: 1 } per cell + per-piece ExactCount, but it’s slow compared to native DLX.
  • A piece-placement vocabulary in Region (arbitrary shape + rotation / reflection orbit).
  • A grader axis for packing puzzles — uniqueness of solution is the usual target, but “how forced is each placement” is the analogue of human technique grading.

This is a real gap. Pentominoes-shaped puzzles are popular and penmark’s current family doesn’t serve them well; a DLX track alongside the CSP track is plausible future work.
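For a feel of why the exact-cover loop is so mechanical, here is a minimal Algorithm X over plain bitmask rows. This is an illustrative sketch, not penmark code, and deliberately not DLX: real dancing links replace the scans below with O(1) unlinking, but the search shape (pick the uncovered column with the fewest candidate rows, branch on them) is identical. The example matrix is Knuth's standard six-row instance.

```rust
/// Minimal Algorithm X sketch over bitmask rows (bit k = column k).
fn exact_cover(rows: &[u32], universe: u32, covered: u32, sol: &mut Vec<usize>) -> bool {
    if covered == universe {
        return true; // every column covered exactly once
    }
    // MRV: find the uncovered column with the fewest non-conflicting rows.
    let mut best: Option<Vec<usize>> = None;
    for c in 0..32 {
        let bit = 1u32 << c;
        if universe & bit == 0 || covered & bit != 0 {
            continue;
        }
        let cand: Vec<usize> = rows
            .iter()
            .enumerate()
            .filter(|&(_, &r)| r & bit != 0 && r & covered == 0)
            .map(|(i, _)| i)
            .collect();
        if best.as_ref().map_or(true, |b| cand.len() < b.len()) {
            best = Some(cand);
        }
    }
    // Branch over the rows that cover the chosen column; an empty candidate
    // list means the column is uncoverable and we backtrack.
    for i in best.unwrap() {
        sol.push(i);
        if exact_cover(rows, universe, covered | rows[i], sol) {
            return true;
        }
        sol.pop();
    }
    false
}

fn main() {
    // Knuth's classic 7-column example; the unique solution is rows 1, 3, 5.
    let rows = [0b1001001, 0b0001001, 0b1011000, 0b0110100, 0b1100110, 0b1000010];
    let mut sol = Vec::new();
    assert!(exact_cover(&rows, 0b1111111, 0, &mut sol));
    sol.sort();
    println!("solution rows: {:?}", sol);
}
```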

4. Word and dictionary puzzles

Constraint satisfaction with a lexical domain.

  • Examples: crosswords (American, cryptic, themeless), word ladders, Scrabble (the legal-move enumerator), Boggle, fill-in puzzles, anagram puzzles.
  • Solver shape: still CSP — slot variables, intersection constraints — but the domain is a 100k–500k-entry dictionary, not a small mark set. Propagation is bitset-AND over per-(length, position, letter) precomputed filter masks; branching is at the cell level rather than the slot level to deduplicate redundant propagation across candidate words sharing a letter at the crossing. See Orca’s writeup for the canonical modern implementation.
  • For playable crosswords (find the right fill given clue scores), the problem becomes MAP inference rather than satisfaction — loopy belief propagation over the slot graph with per-slot answer distributions from a neural clue→answer scorer (Berkeley Crossword Solver, Dr. Fill). That’s a different solver again.

Reusable from penmark today:

  • Substrate (grid with black squares), coord, region (a slot is a row/col slice of cells), parsing, eframe paint.
  • The shape of Propagator: arc-consistency loop + per-arc cache + bitset domain reasoning. The architecture transfers; only the domain backing changes.
  • Tagged-JSON, dataset infra, CLI scaffolding, the Tier taxonomy.
  • Constraint’s shape is right (“variables crossing at this coord must agree on letter”); just a new Rule variant (LetterAgreement or similar) is needed.

Missing:

  • A large-domain domain store. FastGame hardcodes u32 per coord because mark domains are tiny; crosswords need heap-allocated bitsets sized to the wordlist (~52 KB per slot for a 411k-word dictionary).
  • The (length, position, letter) → bitset filter index. This is the data structure that makes propagation fast.
  • A variable model where slots, not coords, are the primary variables. Today penmark’s variables are coords. Crosswords need region-as-variable, with cell-level branching on top.
  • A clue→answer scoring layer for the MAP-inference flavor — that’s a separate concern (NLP), and grid-fillers like Orca skip it entirely.

A crossword genre is genuinely possible inside penmark, but the propagator-internal work is non-trivial.
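A toy version of that filter index makes the mechanism concrete. This is a hypothetical sketch with a five-word dictionary and `u64` masks; a real wordlist needs heap-allocated bitsets, but the bitset-AND propagation step is the same:

```rust
use std::collections::HashMap;

// Hypothetical (length, position, letter) -> bitset filter index.
// Word i in the dictionary occupies bit i of each mask.
fn build_index(words: &[&str]) -> HashMap<(usize, usize, u8), u64> {
    let mut index: HashMap<(usize, usize, u8), u64> = HashMap::new();
    for (i, w) in words.iter().enumerate() {
        for (pos, b) in w.bytes().enumerate() {
            *index.entry((w.len(), pos, b)).or_insert(0) |= 1 << i;
        }
    }
    index
}

/// Candidate words for a slot: AND the precomputed masks of every known
/// letter. This is the whole propagation step for a crossword slot.
fn candidates(
    index: &HashMap<(usize, usize, u8), u64>,
    len: usize,
    known: &[(usize, u8)],
    all: u64,
) -> u64 {
    known.iter().fold(all, |acc, &(pos, b)| {
        acc & index.get(&(len, pos, b)).copied().unwrap_or(0)
    })
}

fn main() {
    let words = ["crane", "crate", "slate", "cruel", "stone"];
    let index = build_index(&words);
    // Slot of length 5 with 'c' at position 0 and 'a' at position 2:
    // only "crane" (bit 0) and "crate" (bit 1) survive.
    let mask = candidates(&index, 5, &[(0, b'c'), (2, b'a')], (1 << words.len()) - 1);
    println!("{:05b}", mask); // prints "00011"
}
```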

5. Path and graph puzzles

Find a path or sequence of moves through a graph.

  • Examples: mazes, Hashiwokakero (Hashi — bridges between islands), river-crossing puzzles (missionaries + cannibals), the classic shortest-path puzzles. Hashi is partly CSP (constraint: bridge counts at each island) and partly graph (the bridges must form a connected planar embedding).
  • Solver shape: graph search (Dijkstra, A*) for the pure-pathfinding cases; CSP with topological constraints for Hashi and friends. Mazes are a degenerate planning family with single-cell moves.

Reusable from penmark today:

  • Substrate + coord + region: the graph is implicit in the geometry.
  • For Hashi: Constraint mostly works (bridge-count is ExactCount); Connected { mark } already exists for the global “all islands reachable” check.
  • For mazes: a pure-pathfinding solver doesn’t really need penmark’s machinery, but the parsing / rendering / dataset infra still lands cleanly.

Missing:

  • For Hashi specifically: an edge-multiplicity rule (0/1/2 bridges per slot) and a planarity / non-crossing constraint. The latter is awkward in pure CSP; today’s Propagator would handle it via lazy cuts at the leaf, similar to the Path rule.
  • For mazes: not really anything — they’re so simple that a 50-line BFS is the whole solver. The question is whether penmark adds value over just embedding mazes as a degenerate planning genre.

6. Hidden information and probabilistic puzzles

CSP under uncertainty.

  • Examples: Minesweeper, Battleship (solving), Mastermind, Hangman, Wordle (the strategic-guesser variant). The solver enumerates hidden states consistent with the observations made so far, then picks the maximum-information probe.
  • Solver shape: CSP for consistency checking + expected-information-gain or expected-regret reasoning over the model space on top. Often Monte Carlo if exact enumeration is infeasible.

Reusable from penmark today:

  • The CSP layer for consistency checking — very directly. Each observation narrows the candidate-state set; that’s exactly what Propagator does.
  • Substrate, region, mark vocabulary, parsing, frontends.

Missing:

  • A probability layer. Game::status() is InProgress / Solved / Contradicted — a discrete verdict. Probabilistic puzzles need per-coord probability distributions and an expected-value computation over the consistent-models space.
  • A “next probe” selector that scores candidate moves by information gain, not just legality.
  • A grader that distinguishes “always solvable with logic” from “requires guessing” — this is the famous Minesweeper grader question and is genuinely hard.

A Minesweeper genre that just shows the deduction is feasible today; a Minesweeper player with optimal-guess support would need real new machinery.
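A toy illustration of that probability layer, under the stated brute-force assumption: enumerate every hidden state consistent with the clues, then take per-cell frequencies. The function name and clue shape ("exactly k mines among these cells") are hypothetical:

```rust
/// Per-cell mine probability by full enumeration of consistent models.
/// Brute force over 2^n assignments: fine for a frontier of a couple
/// dozen cells, Monte Carlo territory beyond that. Illustrative sketch.
fn mine_probabilities(n: usize, clues: &[(Vec<usize>, u32)]) -> Vec<f64> {
    let mut counts = vec![0u64; n];
    let mut models = 0u64;
    for assign in 0u32..1 << n {
        // A model is consistent if every clue's cell set holds exactly k mines.
        let ok = clues.iter().all(|(cells, k)| {
            cells.iter().filter(|&&c| assign >> c & 1 == 1).count() as u32 == *k
        });
        if ok {
            models += 1;
            for c in 0..n {
                counts[c] += (assign >> c & 1) as u64;
            }
        }
    }
    counts.iter().map(|&c| c as f64 / models as f64).collect()
}

fn main() {
    // Three unknown cells; clue A: exactly 1 mine in {0,1};
    // clue B: exactly 2 mines in {1,2}. Forces x1 = x2 = 1, x0 = 0.
    let p = mine_probabilities(3, &[(vec![0, 1], 1), (vec![1, 2], 2)]);
    println!("{:?}", p); // prints "[0.0, 1.0, 1.0]"
}
```

A "next probe" selector would sit on top of this, scoring each unknown cell by the expected information gained from probing it.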

7. Adversarial two-player games

  • Examples: Chess, Go, Checkers, Othello, Hex, Connect 4, Nim (when played, not when computed via Sprague–Grundy), TwixT.
  • Solver shape: minimax + alpha-beta, MCTS, neural evaluation. State model is similar to planning but the objective is game-theoretic value (best move under optimal opponent), not goal-reaching.

Reusable from penmark today:

  • Substrate / coord / region / parsing / eframe / dataset / CLI scaffolding. Same level-description machinery.
  • The tagged-JSON envelope.

Missing:

  • Essentially the entire algorithmic layer: search trees with two alternating players, evaluation functions, transposition tables, iterative deepening, opening books, endgame tablebases. Almost no overlap with CSP propagation.
  • A player-identity layer in the game model.
  • For neural-eval games: a totally separate ML stack.

This is realistically a sister project, not an extension. Penmark could host the board and the move format for an adversarial game, but the engine work is far enough afield that mashing it in would dilute the rest of the codebase.

8. Synthesis and programming-game puzzles

Build a program meeting a spec.

  • Examples: SpaceChem, Opus Magnum, Shenzhen IO, TIS-100, Manufactoria, Human Resource Machine, Lightbot, and the rest of the Zachtronics catalog. Often optimized along multiple axes (cycles, area, ops).
  • Solver shape: program synthesis — superoptimization, SMT-guided search, RL, neural code generation. The “puzzle” is an input/output spec; the “solution” is a program meeting it.

Reusable from penmark today: very little. Maybe the CLI scaffolding and PuzzleMeta.

Missing: everything else. Different state model, different search techniques, different evaluation criteria, different authoring tools, different visualization needs.

Out of scope.

9. Combinatorial game theory

  • Examples: Nim and its many variants, Wythoff, Hackenbush, Domineering, Toads-and-Frogs.
  • Solver shape: Sprague–Grundy values for impartial games; surreal-number arithmetic for partizan games. Largely closed-form rather than search-based for the classical instances.

Reusable from penmark today: nothing meaningful.

Missing: the entire mathematical apparatus.

Niche, mathematically beautiful, and disjoint from everything else in this catalog. Out of scope.

Summary table

Family | Penmark fit | What’s reusable | What’s missing
------ | ----------- | --------------- | ---------------
1. Constraint satisfaction | Native | Everything | Per-genre rule variants
2. State-space planning | Adjacent | Substrate / level half of Genre / Tier | Planning trait + BFS/IDA* + plan-length grader + plan generator
3. Exact cover / packing | Adjacent | Substrate, region, parsing, frontends | DLX solver, piece-orbit region kind
4. Word and dictionary | Adjacent | Substrate, propagator architecture, frontends | Large-domain store, filter index, slot-as-variable model, optional clue scorer
5. Path / graph | Mixed | Substrate, Connected rule | Edge-multiplicity, planarity (Hashi); maze BFS overlap is small
6. Hidden / probabilistic | Adjacent | CSP layer for consistency check | Probability layer, info-gain selector, guessing-aware grader
7. Adversarial | Sister project | Level / paint / parsing | Minimax / MCTS / eval / player model
8. Synthesis | Out of scope | CLI scaffolding | Everything else
9. Combinatorial game theory | Out of scope | Nothing | Everything else

“Native” means penmark hosts it today. “Adjacent” means a real but contained body of new work — a sibling solver track or a new domain store, not a rewrite. “Sister project” means the level layer transfers but the engine doesn’t. “Out of scope” means penmark is the wrong home.

The recurring lesson: penmark’s level-description machinery (substrate, coord, region, parsing, dataset, paint, CLI) is genre-agnostic and pays for itself across every family that has a board. The rule and solver layers are family-specific. The clean extension story is to keep the level layer as the shared core and add sister rule-traits and sister solver families as new puzzle classes land — not to force every problem through the CSP propagator.

Core concepts

The conceptual model Penmark works in — the substrate / domain / mark / region / constraint vocabulary, the genre / puzzle / game decomposition, and the four actions (solving, grading, generating, enumerating) defined as traits. The rest of the book assumes these terms.

Library structure

One core crate, two shipped frontends, one planned:

  • CLI — penmark <verb> <genre> for every operation: solve, generate, scaffold, grade, enumerate, profile-eval/init, import, list-impls. The actual verb set lives in the CLI chapter.
  • eframe desktop app — local GUI for the same operations: build a puzzle, watch it solve, browse the dataset, browse the db.
  • WASM (not built yet) — eventual wasm-bindgen surface so the website at ordo/edit can call the same engine in-browser. Solving + grading + library introspection only; DB and enumeration stay CLI-side. The wire format on both sides is the tagged envelope described in the Tagged serialization chapter.

CLI and eframe call into the same core directly via Rust types — see the Dispatching chapter for the routing approach.

Grids and Coords

Almost every Nikoli + Sudoku-variant puzzle lives on a square grid; the only outlier worth treating specially is hex. So the standard layout is a square cell grid with three companion sub-grids over the same coordinate system:

Layer | Dimensions | Used by
----- | ---------- | -------
Cells | N × M | Sudoku, Akari, Nurikabe, Heyawake, Hitori, Masyu
Edges | (N+1) × M horizontals + N × (M+1) verticals | Slitherlink, fence rules
Corners | (N+1) × (M+1) | rare; Masyu pearls have corner-flavored constraints

A single puzzle can host constraints across any mixture of sub-grids. Most genres use one; Slitherlink uses cells (for clues) and both edge grids (for the loop). The four sub-grids and a universal coordinate type are library-defined:

/// Reference enum for *talking about* layers — used by
/// `Region::AllOf(Layer)`, `Coord::layer()`, `Substrate::grid(layer)`.
pub enum Layer {
    /// `N × M` cell grid. Default for almost every puzzle.
    Cells,
    /// `(N+1) × M` horizontal edges (between vertically-adjacent cells).
    EdgesH,
    /// `N × (M+1)` vertical edges (between horizontally-adjacent cells).
    EdgesV,
    /// `(N+1) × (M+1)` corners.
    Corners,
}

/// A position in any sub-grid. The library's universal coordinate
/// type — every Nikoli/Sudoku-variant genre shares it. Hex puzzles
/// and other non-square layouts live behind a separate `HexPuzzle`
/// trait family with their own coord type.
pub enum Coord {
    Cell   { r: usize, c: usize },
    EdgeH  { r: usize, c: usize },
    EdgeV  { r: usize, c: usize },
    Corner { r: usize, c: usize },
}

Layer is the discriminator that constraints and regions use to say which sub-grid they’re talking about; Coord is the value-level position they resolve to. A Slitherlink puzzle can carry cell clues alongside edge state without ambiguity because every constraint ultimately bottoms out at Coord values.

Coordinates. Internal: (row, col) zero-indexed from the top-left, so (0, 0) is the upper-left cell — matches array indexing and how every renderer naturally walks.

External (URLs, shared puzzle strings, REPL output) is layer-prefixed and chess-up-numbered: <layer>:<col><row>, where the column is a bijective base-26 letter sequence (a, b, …, z, aa, ab, …, az, ba, …) and the row is a 1-indexed integer counted from the bottom up (so cell:a1 is bottom-left, matching chess and Sudoku convention). The bijective base-26 scheme handles arbitrarily wide boards — a 30-column Akari grid uses a..z then aa..ad; the chess 8-column ceiling does not apply.
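Bijective base-26 differs from ordinary base-26 in having no zero digit: "a" is 1, "z" is 26, "aa" is 27. A self-contained sketch of the two conversions (function names are illustrative, not penmark's):

```rust
/// Bijective base-26 column letters ("a", "z", "aa", ...) -> 1-indexed column.
fn col_number(s: &str) -> usize {
    s.bytes().fold(0, |acc, b| acc * 26 + (b - b'a' + 1) as usize)
}

/// Inverse: 1-indexed column -> letters. Subtracting 1 before extracting
/// each digit is what makes the encoding bijective (there is no zero digit).
fn col_letters(mut n: usize) -> String {
    let mut out = Vec::new();
    while n > 0 {
        n -= 1;
        out.push(b'a' + (n % 26) as u8);
        n /= 26;
    }
    out.reverse();
    String::from_utf8(out).unwrap()
}

fn main() {
    // The 30-column Akari example from the text: columns 27..=30 are aa..ad.
    for n in [1, 26, 27, 30] {
        println!("{} -> {}", n, col_letters(n));
    }
    assert_eq!(col_number("ad"), 30);
}
```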

Each layer has its own grid dimensions and is addressed within that layer’s own coordinate space:

Prefix | Layer | Grid | Example
------ | ----- | ---- | -------
cell: | Cells | N × M | cell:a1 = bottom-left cell
edgeh: | EdgesH | (N+1) × M | edgeh:a1 = bottommost horizontal edge of column a
edgev: | EdgesV | N × (M+1) | edgev:a1 = leftmost vertical edge of row 1
corner: | Corners | (N+1) × (M+1) | corner:a1 = bottom-left corner

The cell: prefix may be omitted (cells are the default layer), so a1 and cell:a1 parse identically. The other prefixes are required.

A human-friendlier alternate form references neighboring cells — edgeh(a1,a2) for the horizontal edge between cells a1 and a2, edgev(a1,b1) for the vertical edge between a1 and b1, corner(a1,b1,a2,b2) for the corner shared by those four cells. Both forms parse to the same Coord; the canonical <layer>:<col><row> is what the wire format uses, and the alternate form is sugar for hand-typed input.

Grid storage

Layer contents — substrate Material markings, per-coord candidate bitmasks, render glyph caches — all share one storage type: a flat row-major Grid<T>, in core::grid.

pub struct Grid<T> {
    rows: usize,
    cols: usize,
    data: Vec<T>,
}

impl<T: Default + Copy> Grid<T> {
    pub fn new(rows: usize, cols: usize) -> Self;        // filled with default
    pub fn get(&self, r: usize, c: usize) -> T;
    pub fn set(&mut self, r: usize, c: usize, v: T);
    pub fn iter(&self) -> impl Iterator<Item = ((usize, usize), T)>;
    pub fn data(&self) -> &[T];
}

Flat storage gives O(1) index, no hash overhead. The Default + Copy bound covers every layer payload the catalog reaches for — Material and Mark are both small Copy enums that fit it trivially.
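A plausible backing for those signatures, shown only to make the row-major layout concrete; the index arithmetic is the assumption here, and the real implementation lives in core::grid:

```rust
// Sketch of a flat row-major backing. The index arithmetic is the whole
// trick: position (r, c) lives at data[r * cols + c], so get/set are one
// multiply-add away from the Vec, with no hashing.
pub struct Grid<T> {
    rows: usize,
    cols: usize,
    data: Vec<T>,
}

impl<T: Default + Copy> Grid<T> {
    pub fn new(rows: usize, cols: usize) -> Self {
        Grid { rows, cols, data: vec![T::default(); rows * cols] }
    }
    fn idx(&self, r: usize, c: usize) -> usize {
        debug_assert!(r < self.rows && c < self.cols);
        r * self.cols + c
    }
    pub fn get(&self, r: usize, c: usize) -> T {
        self.data[self.idx(r, c)]
    }
    pub fn set(&mut self, r: usize, c: usize, v: T) {
        let i = self.idx(r, c);
        self.data[i] = v;
    }
}

fn main() {
    let mut g: Grid<u8> = Grid::new(2, 3);
    g.set(1, 2, 7); // the last cell of a 2×3 grid sits at flat index 5
    assert_eq!(g.get(1, 2), 7);
    assert_eq!(g.get(0, 0), 0); // untouched cells hold T::default()
}
```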

Substrates

A substrate records what positions exist in a puzzle and what’s at them. One struct per puzzle, four optional layer grids inside, one Material per position.

// core::substrate

/// How a puzzle's grid is sized. Most genres use `Rect`. Sudoku
/// uses `Boxed` so the substrate carries the box partition that
/// constraint emission needs (per-box rectangles).
pub enum Dimensions {
    /// `rows × cols` rectangle. Akari, slither, most genres.
    Rect  { rows: usize, cols: usize },
    /// Sudoku-style: side = `box_w × box_h`, partitioned into
    /// `box_h` rows × `box_w` cols of boxes.
    Boxed { box_w: u8, box_h: u8 },
}

impl Dimensions {
    pub fn rows(&self) -> usize { /* … */ }
    pub fn cols(&self) -> usize { /* … */ }
}

/// The structural shape of a puzzle. Every `Puzzle` carries one
/// reachable via `substrate()`. Direct fields per layer; the
/// populated ones describe which positions exist and what's at each.
pub struct Substrate {
    pub dims: Dimensions,

    /// `None` = the genre doesn't use this layer. `Some(grid)` =
    /// the layer is in use; grid contents give per-position Material.
    pub cells:    Option<Grid<Material>>,   // rows × cols
    pub edges_h:  Option<Grid<Material>>,   // (rows + 1) × cols
    pub edges_v:  Option<Grid<Material>>,   // rows × (cols + 1)
    pub corners:  Option<Grid<Material>>,   // (rows + 1) × (cols + 1)
}

/// What's *at* a position — the substance, structurally.
#[repr(u8)]
pub enum Material {
    /// Decision-target position. The default; what `Grid::new` fills with.
    #[default]
    Floor,
    /// Excluded; blocks LOS rays. Akari walls, Hitori shaded givens.
    Wall,
    /// Excluded; doesn't block LOS. Akari holes, Samurai padded gaps.
    Hole,
    /// Excluded; exists for rendering only. Slitherlink clue cells,
    /// Killer-cage corner labels, Kropki edge dots, Picross hint
    /// rows/cols outside the play grid.
    LabelOnly,
}

Material is the structural classification of a position — fixed when the substrate is built, immutable during play. Four variants, each with a different effect on what the engine and renderer do at that coord:

  • Floor is a decision target: the player or solver will commit a mark here. It’s the default value of Material (and what Grid::new fills with), so a fresh grid is fully playable. The vast majority of positions in most genres are Floor.
  • Wall is excluded and blocks LOS rays: not a decision target (no mark ever lives here), and rays walked from neighboring cells stop at the wall position. Akari walls, Hitori shaded givens. The renderer paints them as solid blockers.
  • Hole is excluded but transparent to LOS: also not a decision target, but rays pass straight through. This is the mechanism by which non-rectangular puzzles fit a flat R × C bounding box — Samurai gap cells, Iraka holes; see below.
  • LabelOnly is excluded, transparent to LOS, and exists for rendering: the position is structurally there so the renderer can paint a label / glyph / decoration anchored at it, but the engine never allocates decision state for it. Slitherlink clue cells, Killer-cage corner sums, Kropki edge dots, Picross hint rows/cols. The label value lives in puzzle-level data (e.g. cell_clue(), variant payloads); Material::LabelOnly is just the structural marker.

Material answers “what’s structurally here?”, not “what value lives here?”. A Floor cell can hold any mark in its domain; Wall, Hole, and LabelOnly cells can’t hold any mark at all.
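The four variants reduce to two independent boolean questions, which is how engine code tends to branch on them. A sketch with illustrative helper names (the predicates are assumptions, not penmark API; the real LOS check lives on Substrate):

```rust
// The Material enum from the text, plus two illustrative predicates.
#[derive(Clone, Copy, PartialEq, Debug)]
enum Material {
    Floor,
    Wall,
    Hole,
    LabelOnly,
}

/// "Will a mark ever live here?" Only Floor is a decision target.
fn is_decision_target(m: Material) -> bool {
    matches!(m, Material::Floor)
}

/// "Does an LOS ray stop here?" Only Wall blocks; Hole and LabelOnly
/// are transparent, which is what lets rays cross Samurai-style gaps.
fn blocks_los(m: Material) -> bool {
    matches!(m, Material::Wall)
}

fn main() {
    use Material::*;
    for m in [Floor, Wall, Hole, LabelOnly] {
        println!("{:?}: target={} blocks={}", m, is_decision_target(m), blocks_los(m));
    }
}
```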

Edges H and V stay separate fields because their dimensions differ and their geometric meaning is orientation-specific (Coord::EdgeH vs Coord::EdgeV); renderer and constraint code dispatches on orientation.

Substrate is uniformly typed across genres — every puzzle holds one, accessed the same way. What varies per genre is which fields are populated and what the contents are. Walls and holes shape what coords exist before any rule fires; the substrate is structurally prior.

Per-coord queries live as inherent methods:

impl Substrate {
    pub fn rows(&self) -> usize;  // forwards to dims.rows()
    pub fn cols(&self) -> usize;  // forwards to dims.cols()

    /// Direct lookup by Layer reference. `None` if the genre
    /// doesn't populate this layer.
    pub fn grid(&self, layer: Layer) -> Option<&Grid<Material>>;

    /// `Material` at `coord`; `Floor` (defensive default) for
    /// unmapped layers.
    pub fn material(&self, coord: Coord) -> Material;
    pub fn set_material(&mut self, coord: Coord, m: Material);

    /// Whether the cell at raw `(r, c)` blocks LOS rays
    /// (`Material::Wall`).
    pub fn blocks_los(&self, r: usize, c: usize) -> bool;

    /// Every `Floor` position across every populated layer — the
    /// decision-target set.
    pub fn coords(&self)    -> impl Iterator<Item = Coord> + '_;
    /// Every populated-layer position regardless of material —
    /// renderer-facing.
    pub fn positions(&self) -> impl Iterator<Item = Coord> + '_;
    /// Iterate `(Layer, &Grid)` for every populated layer.
    pub fn layers(&self)    -> impl Iterator<Item = (Layer, &Grid<Material>)> + '_;

    // ── Material-keyed iterators ──
    /// Iterate every cell whose Material equals `target`. Akari uses
    /// this for "every floor cell" and "every hole cell" sweeps
    /// during constraint emission.
    pub fn cells_by_material(&self, target: Material)
        -> impl Iterator<Item = Coord> + '_;

    // ── Geometry helpers ──
    pub fn edges_of_cell(&self, r: usize, c: usize) -> [Coord; 4];
    pub fn edges_of_vertex(&self, r: usize, c: usize) -> Vec<Coord>;
    pub fn vertices_of_cell(&self, r: usize, c: usize) -> [Coord; 4];
    pub fn vertices_of_edge(&self, edge: Coord) -> [Coord; 2];
    pub fn cells_of_edge(&self, edge: Coord) -> Vec<Coord>;
    pub fn cells_of_vertex(&self, r: usize, c: usize) -> Vec<Coord>;

    /// Iterate every vertex paired with its incident edges. Slither
    /// uses this for the per-vertex `DegreeIn{0, 2}` constraint emit.
    pub fn vertices_with_edges(&self)
        -> impl Iterator<Item = (Coord, Vec<Coord>)> + '_;

    /// Walk maximal `Floor` segments along `axis`. Walls flush the
    /// current segment; non-floor materials other than `Wall` (Hole,
    /// LabelOnly) are transparent. Akari uses this for the
    /// per-row / per-col `AtMost(Bulb, 1)` segment constraints.
    pub fn for_each_floor_segment(
        &self,
        axis: Axis,
        on_segment: impl FnMut(Vec<Coord>),
    );
}

pub enum Axis { Horizontal, Vertical }
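The segment walk in for_each_floor_segment is worth seeing in miniature. A standalone sketch of one row (my own simplification of the doc comment's rule, not the real implementation): Wall flushes the current segment, Hole and LabelOnly are transparent, Floor accumulates.

```rust
#[derive(Clone, Copy)]
enum Material { Floor, Wall, Hole, LabelOnly }

/// Maximal Floor segments along one row: Wall flushes the current
/// segment; Hole and LabelOnly are skipped without flushing, so a
/// segment continues across them; Floor columns accumulate.
fn floor_segments(row: &[Material]) -> Vec<Vec<usize>> {
    let mut segments = Vec::new();
    let mut current = Vec::new();
    for (c, &m) in row.iter().enumerate() {
        match m {
            Material::Floor => current.push(c),
            Material::Wall => {
                if !current.is_empty() {
                    segments.push(std::mem::take(&mut current));
                }
            }
            Material::Hole | Material::LabelOnly => {} // transparent
        }
    }
    if !current.is_empty() {
        segments.push(current);
    }
    segments
}

fn main() {
    use Material::*;
    // F F W F H F  ->  segments {0,1} and {3,5}: the wall splits,
    // the hole does not.
    let row = [Floor, Floor, Wall, Floor, Hole, Floor];
    assert_eq!(floor_segments(&row), vec![vec![0, 1], vec![3, 5]]);
}
```

This is the shape Akari's per-row / per-col AtMost(Bulb, 1) emission relies on: one constraint per returned segment.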

Per-genre allocations

Genres build their substrate via Genre::default_substrate(dims), returning whatever shape is canonical for that genre:

// Akari 7×7 — every cell starts as Floor; constructor calls
// set_wall / set_hole / add_clue to install the puzzle's walls,
// holes, and clue values.
Substrate {
    dims: Dimensions::Rect { rows: 7, cols: 7 },
    cells: Some(Grid::filled(7, 7, Material::Floor)),
    edges_h: None, edges_v: None, corners: None,
}

// Sudoku 9×9 — Boxed dims so build_constraints can read box_w/box_h.
Substrate {
    dims: Dimensions::Boxed { box_w: 3, box_h: 3 },
    cells: Some(Grid::filled(9, 9, Material::Floor)),
    edges_h: None, edges_v: None, corners: None,
}

// Slitherlink 5×5 — cells (clue containers, all LabelOnly) + edges
// (every edge a decision target).
Substrate {
    dims: Dimensions::Rect { rows: 5, cols: 5 },
    cells:   Some(Grid::filled(5, 5, Material::LabelOnly)),
    edges_h: Some(Grid::filled(6, 5, Material::Floor)),
    edges_v: Some(Grid::filled(5, 6, Material::Floor)),
    corners: None,
}

// Samurai 21×21 — cells with 72 Hole positions for the gap regions.
Substrate {
    dims: Dimensions::Rect { rows: 21, cols: 21 },
    cells: Some(grid_with_padding_holes()),
    edges_h: None, edges_v: None, corners: None,
}

The “all Floor” grids cost a byte per position — 81 bytes for sudoku, trivial. Dense storage is correct because Floor is the dominant state; the previous HashMap-of-walls representation was a sparseness optimization for a non-sparse situation.

Non-rectangular shapes via Material::Hole

Hole is the mechanism by which non-rectangular puzzles fit the flat R × C model. Pad the bounding rectangle, mark the gaps as Material::Hole, and the substrate machinery treats them correctly: not in coords(), not painted as decision cells, LOS rays pass straight through.

This handles every “multiple sub-grids in a bigger rectangle” topology in the puzz.link / canonical roster:

  • Samurai Sudoku — five 9×9s arranged in an X, fitted into a 21×21 bounding box with 72 padded gap cells.
  • Gattai-8 — eight 9×9s edge-to-edge, larger bounding box, gaps where rows or cols don’t line up.
  • Shogun — a 5×5 of 9×9s (45×45 bounding), regular grid of gaps.
  • Sumo — four 9×9s overlapping at a center 9×9.

In every case, the cells-appearing-in-multiple-regions pattern that already powers Sudoku’s 27 standard regions just keeps working: the overlap zones are coords that appear in two sub-grids’ constraint regions (one row from sub-grid A, one row from sub-grid B), and the propagator handles the shared-cell case naturally — no special multi-grid coordinate machinery, no new Layout type.

What this doesn’t cover: genuinely non-rectangular topologies — hex grids, polar grids, Möbius strips, toruses. Those need new Layer variants (a hypothetical Layer::HexCells) and a different geometric vocabulary in RegionShape. Until then, flat R × C plus Hole covers every multi-grid puzzle on the current catalog without architectural concessions.
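The Samurai padding above can be checked numerically. The sketch below derives the 72 gap cells by subtracting five 9×9 sub-grids (arranged in an X, per the Samurai description) from the 21×21 bounding box; the origin list and the function name are illustrative, not the library's actual `grid_with_padding_holes()` implementation:

```rust
// Sketch: derive Samurai's 72 gap cells as "bounding box minus the
// five 9×9 sub-grids." The origin list is the assumed canonical
// Samurai layout: top-left, top-right, center, bottom-left,
// bottom-right.
fn samurai_hole_coords() -> Vec<(usize, usize)> {
    let origins = [(0, 0), (0, 12), (6, 6), (12, 0), (12, 12)];
    let covered = |r: usize, c: usize| {
        origins
            .iter()
            .any(|&(or, oc)| (or..or + 9).contains(&r) && (oc..oc + 9).contains(&c))
    };
    (0..21)
        .flat_map(|r| (0..21).map(move |c| (r, c)))
        .filter(|&(r, c)| !covered(r, c))
        .collect()
}

fn main() {
    // Four 6×3 / 3×6 padding bands — 72 Hole positions total.
    assert_eq!(samurai_hole_coords().len(), 72);
    println!("ok");
}
```

The four uncovered bands (rows 0–5 × cols 9–11, rows 15–20 × cols 9–11, and their transposes) contribute 18 cells each, giving the 72 cited above.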

Future genres

Heyawake rooms, Hashi islands, Killer cages, Star Battle anchors — all deferred to the constraint list. A Heyawake room is a Region referenced by every constraint targeting it; the room layout ships as the region’s coord set. Hashi islands are Pin-style constraints anchoring bridge endpoints; the island position is just a Floor cell that happens to have constraints rooted at it.

Material::LabelOnly covers the recurring “structurally exists, holds a label, never decides” pattern across Sudoku variants and Picross-family genres:

  • Killer Sudoku — corner positions in the Corners layer carry cage-sum labels. Material::LabelOnly at the cage’s anchor corner; the sum value lives in the variant payload.
  • Kropki Sudoku — edge positions in EdgesH / EdgesV host white/black dots between adjacent cells. The edges layer is all LabelOnly (no edge marks); dot kind comes from the variant payload.
  • Sandwich / Little Killer / Quadruple clues — extra rows or columns of LabelOnly corner/edge anchors outside the cell grid.
  • Picross — the bounding box pads the cell grid with hint rows on top and hint columns on the left; those padding cells are LabelOnly, holding the hint sequences. The actual play grid is Floor cells in the bottom-right rectangle.

Extending Material further is reserved for genuinely new structural categories (a hypothetical “portal” cell that warps LOS, a “tinted” non-blocker that affects scoring) that no current or near-future genre needs.

Domains and Marks

Marks are what the player puts on the board — a Bulb here, the Sudoku digit 7 there, this edge Drawn. Each move either commits a mark at a coord or eliminates one from the set still possible there. The puzzle is solved when every coord’s set has narrowed to a single mark and every rule still holds.

The set of marks still possible at a coord at a given moment is its candidate set; the domain is the candidate set a fresh game starts with — every mark the genre permits at that position. This chapter covers both — the type of the values themselves (Mark) and how a candidate set narrows during play.

What’s at a coord

Every Floor coord starts a game with a full domain. For Sudoku, that’s 1..=9 per cell; for Akari, {Bulb, Empty} per floor cell; for Slitherlink, {Drawn, NotDrawn} per edge. Walls and holes have no domain — they aren’t decision targets.

Play narrows the candidate set. Each move eliminates a mark from a coord, or commits to a single one, and the propagator cascades the elimination through constraints: a Sudoku row’s Distinct rule removes the same digit from the other eight cells the moment one is committed; an Akari bulb removes Bulb from every cell its rays reach.

A coord ends up in one of three states:

  • Open — candidate set has more than one mark left. Play continues.
  • Solved — candidate set narrowed to a singleton. The remaining mark is the answer at this coord.
  • Contradicted — candidate set is empty. No legal mark; the engine reports a violation and the puzzle (or the current branch of the search) is unsolvable.

A puzzle is solved when every Floor coord is Solved and every Goal constraint is Satisfied (covered in the next chapter).

Mark

Every decision in the catalog is either binary (Akari bulb / empty, Slitherlink drawn / not-drawn, Nurikabe shaded / unshaded) or N-state numeric (Sudoku digits, Skyscrapers, Hashiwokakero bridge counts).

pub enum Mark {
    /// Binary mark — `false` / `true`. Per-puzzle prose labels
    /// ("Bulb" / "Empty", "Drawn" / "NotDrawn") are render concerns,
    /// not type concerns.
    Binary(bool),
    /// N-state numeric mark. The valid range is whatever the puzzle's
    /// `full_domain(coord)` emits — there's no library-fixed
    /// convention. Sudoku uses `1..=9` (matches the printed glyph);
    /// Hashiwokakero bridge slots use `0..=2` (zero bridges is a
    /// real value). The `u8` carries the value as displayed.
    Numeric(u8),
}

Keeping Binary distinct from Numeric(u8) is deliberate. Collapsing both into a single u8 would force binary genres to pick a convention (is bulb 0 or 1?) and propagator code to match-and-decode at every boundary. With the split, LOS sweeps and count constraints over Binary marks dispatch on the type itself, and Sudoku arithmetic stays unambiguous.

Numeric(u8) carries the displayed value — a Sudoku 5 is Numeric(5), not Numeric(4) zero-indexed. The u8 is wide enough for every digit any genre prints and narrow enough that a per-coord candidate set still fits in a small bitmask, which the fast solver relies on.
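A minimal sketch of the bitmask representation the fast solver relies on — bit `v` set means `Numeric(v)` is still a candidate. The type and method names here are illustrative, not the engine's actual API; the three-state classification mirrors the Open / Solved / Contradicted states above:

```rust
// Sketch: a per-coord candidate set as a u16 bitmask.
#[derive(Debug, PartialEq)]
enum CoordState { Open, Solved(u8), Contradicted }

#[derive(Clone, Copy)]
struct Candidates(u16);

impl Candidates {
    /// Full Sudoku domain: bits 1..=9 set.
    fn full_sudoku() -> Self { Candidates(0b11_1111_1110) }
    /// Eliminate one mark from the set.
    fn eliminate(&mut self, v: u8) { self.0 &= !(1u16 << v); }
    /// Classify: Open (>1 left), Solved (singleton), Contradicted (empty).
    fn state(self) -> CoordState {
        match self.0.count_ones() {
            0 => CoordState::Contradicted,
            1 => CoordState::Solved(self.0.trailing_zeros() as u8),
            _ => CoordState::Open,
        }
    }
}

fn main() {
    let mut c = Candidates::full_sudoku();
    assert_eq!(c.state(), CoordState::Open);
    for v in 1..=8 { c.eliminate(v); }   // Distinct-style eliminations
    assert_eq!(c.state(), CoordState::Solved(9));
    c.eliminate(9);                      // eliminating the last mark
    assert_eq!(c.state(), CoordState::Contradicted);
    println!("ok");
}
```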

Constraints, Roles, Rules and Regions

Every constraint is a role + one or more regions + a rule:

  • A role says how the engine reads the rule’s outcome: Goal (must end Satisfied for the puzzle to be solved) or Forbidden (must never be Violated; Pending and Satisfied are both fine during play).
  • A list of regions says where the rule applies. Most rules are single-region (regions.len() == 1); a few — Clone for Quoits-style ring duplications, future cross-region equality constraints — bind multiple regions and treat them collectively. The constraint owns its regions (no separate region registry elsewhere on the puzzle), so the renderer iterates puzzle.constraints().iter().flat_map(|c| &c.regions) to find every paintable region — combined with Region.presentation (below), this collapses what was per-genre paint code into a single uniform pass.
  • A rule says what holds over those regions — drawn from the small shared vocabulary defined below.

A puzzle is solved iff every Goal constraint is Satisfied and no constraint is Violated.

There is no library-wide “every coord decided” rule baked into the solved check. Whether a genre needs that depends on what kind of puzzle it is, and the asymmetry is real:

  • Value-filling puzzles (Sudoku, Hidato, KenKen, Numberlink) are solved when every coord is decided to a specific value; they include a Goal + All + Decided constraint to enforce it.
  • Lighting / shading / loop puzzles (Akari, Slitherlink, Nurikabe, Heyawake) are solved when their structural goal holds; unmarked cells are fine, and Decided is omitted.

In Akari you can solve while leaving non-bulb cells unmarked because the goal is “every cell lit” (ExactCount(Bulb, 1) per LosCross), not “every cell decided.” In Sudoku the opposite is true: every cell must commit, and that requirement is itself a constraint, not a property of the engine.

A killer cage is Sum { target: 23, domain: 1..=9 } on a Set of cells. A Sudoku row is Distinct on a Row. A 3×3 box is Distinct on a Rect. A thermo is Increasing on a Path. Slitherlink’s loop is Path { mark: Binary(true), closed: true } on Union([AllOf(EdgesH), AllOf(EdgesV)]). A Quoits clone group is Clone over multiple Set-shape regions, one per ring — the only case in the current vocabulary where regions.len() > 1.

pub struct Constraint {
    pub role: Role,

    /// Owned regions the rule applies over. Length-1 for single-
    /// region rules (the common case); length-N for multi-region
    /// rules like `Clone`.
    pub regions: Vec<Region>,

    pub rule: Rule,
}

Constraints are identified by their slice index in puzzle.constraints() — that’s the propagator’s cache key, the violation report’s locator, the trace’s per-step constraint reference. There’s no separate id field because the slice index already does the job: stable within a puzzle (the list is built by Genre::build_constraints and rebuilt — in the same canonical order — on every mutation), O(1) to look up, no boilerplate at construction.

Constraint exposes shorthands for the two common cases:

impl Constraint {
    pub fn one(role: Role, region: Region, rule: Rule) -> Self;
    pub fn many(role: Role,
                regions: impl IntoIterator<Item = Region>,
                rule: Rule) -> Self;
    /// `Role::Goal` single-region — the common case.
    pub fn goal(region: Region, rule: Rule)      -> Self;
    /// `Role::Forbidden` single-region.
    pub fn forbidden(region: Region, rule: Rule) -> Self;
}

build_constraints implementations use goal / forbidden to keep the emission sites compact.
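The shape of a Sudoku emission site can be sketched as follows. The types here are pared-down stand-ins for the structs this chapter defines (regions hold their shape inline, only the variants used are present), kept just complete enough to run; the constraint inventory itself — 9 rows + 9 cols + 9 boxes of `Distinct`, plus the `Goal` + `All` + `Decided` completion constraint — is exactly what the chapter describes for a value-filling genre:

```rust
#![allow(dead_code)]
// Sketch of a Sudoku `build_constraints` body using the `goal`
// shorthand. Stand-in types; not the library's real definitions.
enum Layer { Cells }
enum RegionShape {
    All,
    Row { layer: Layer, r: usize },
    Col { layer: Layer, c: usize },
    Rect { layer: Layer, top: usize, left: usize, height: usize, width: usize },
}
enum Rule { Distinct, Decided }
enum Role { Goal }
struct Constraint { role: Role, shape: RegionShape, rule: Rule }

fn goal(shape: RegionShape, rule: Rule) -> Constraint {
    Constraint { role: Role::Goal, shape, rule }
}

fn build_sudoku_constraints() -> Vec<Constraint> {
    let mut out = Vec::new();
    for i in 0..9 {
        out.push(goal(RegionShape::Row { layer: Layer::Cells, r: i }, Rule::Distinct));
        out.push(goal(RegionShape::Col { layer: Layer::Cells, c: i }, Rule::Distinct));
    }
    for br in 0..3 {
        for bc in 0..3 {
            out.push(goal(
                RegionShape::Rect {
                    layer: Layer::Cells,
                    top: br * 3, left: bc * 3, height: 3, width: 3,
                },
                Rule::Distinct,
            ));
        }
    }
    // Value-filling genre: "every cell decided" is itself a constraint.
    out.push(goal(RegionShape::All, Rule::Decided));
    out
}

fn main() {
    // 9 rows + 9 cols + 9 boxes + 1 Decided = 28, before any givens.
    assert_eq!(build_sudoku_constraints().len(), 28);
    println!("ok");
}
```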

pub enum Role {
    /// Must be `Satisfied` for the puzzle to be solved. *Drives*
    /// the propagator toward solution: forced moves come from
    /// `Goal` constraints. Most rules are Goals — Sudoku
    /// `Distinct`, Akari per-cell `ExactCount(Bulb, 1)`, Killer
    /// `Sum`, Slitherlink loop `Path`.
    Goal,
    /// Must never be `Violated`, but doesn't drive solution. Used
    /// for **negative invariants** — pure validity gates the
    /// engine watches without trying to satisfy. The propagator
    /// treats `Violated` `Forbidden` constraints as
    /// `Status::Contradicted`, but doesn't emit forced moves from
    /// them.
    ///
    /// Examples that earn their `Forbidden` role:
    ///
    /// - Nurikabe **no-2×2** of shaded cells: a `NoTwoByTwo` rule
    ///   the puzzle must never violate, but no shaded cell is
    ///   *forced* by it.
    /// - Hashi **no-cross** between bridges: bridge edges may not
    ///   intersect, but the rule doesn't tell you which bridges to
    ///   draw.
    /// - Akari **at-most-one-bulb-per-LosCross**: paired with the
    ///   `Goal: AtLeastOne` half — the lower bound drives, the
    ///   upper bound gates.
    /// - Per-coord upper-bound caps in any genre.
    ///
    /// The `Forbidden` role is what lets these be expressed in the
    /// existing rule vocabulary without inventing new "negative"
    /// rule kinds. `NoTwoByTwo` is just a rule; it's the role that
    /// distinguishes "drive shading" from "watch for violations."
    Forbidden,
}

Regions

A region names a set of coords a constraint applies to, plus an optional render hint. Listing every coord by hand works but bloats the constraint list — a Sudoku row is nine cells, a 9×9 box is nine more, the entire grid has 27 such regions before any variants enter the picture. So RegionShape carries a small vocabulary of named shorthands; the engine expands each to its concrete coord list at evaluation time.

pub struct Region {
    /// What coords this region covers — the geometric vocabulary.
    pub shape: RegionShape,

    /// Optional render hint. `None` for purely structural regions
    /// (Sudoku rows / columns are solver-only — the row constraint
    /// fires `Distinct` but no row outline gets drawn). `Some(_)`
    /// for regions the renderer should depict — see the next
    /// chapter, *Decorations, Themes and Colors*, for what
    /// `RegionPresentation` carries.
    pub presentation: Option<RegionPresentation>,
}

pub enum RegionShape {
    /// Every coord across every layer the puzzle's `Substrate`
    /// populates. Reach for `AllOf(Layer)` instead unless you really
    /// mean every layer.
    All,
    /// Every coord of one specific layer. Slitherlink's loop is
    /// `Union([AllOf(EdgesH), AllOf(EdgesV)])`.
    AllOf(Layer),

    // Geometric, layer-parameterized: rows, columns, rects,
    // diagonals.
    Row { layer: Layer, r: usize },
    Col { layer: Layer, c: usize },
    Rect { layer: Layer, top: usize, left: usize,
           height: usize, width: usize },
    Diagonal { layer: Layer, from: Coord,
               dir: DiagDir, length: Option<usize> },

    // Cell-only LOS / adjacency. Take a `Coord` whose variant must
    // be `Coord::Cell` (validated at constraint construction);
    // `Los` walks until it hits a wall, `LosCross` is the union of
    // all four rays from a cell *including* the source.
    Los { from: Coord, dir: OrthoDir },
    LosCross { from: Coord },
    Neighbors4 { from: Coord },

    // Universal fallbacks — explicit Vec<Coord>, work over any
    // layer. The escape hatch for arbitrary cages, hand-authored
    // configurations, direction-sensitive rules.
    Set(Vec<Coord>),
    Path(Vec<Coord>),

    // Set algebra over shapes. "The row except this cell," etc.
    Union(Vec<RegionShape>),
    Intersect(Vec<RegionShape>),
    Diff(Box<RegionShape>, Box<RegionShape>),
}

pub enum OrthoDir { Up, Down, Left, Right }
pub enum DiagDir  { UpLeft, UpRight, DownLeft, DownRight }
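The set-algebra variants resolve recursively to coord lists. A minimal sketch of "the row except this cell" via `Diff` — this is a toy resolver over a three-variant subset (with `Row` carrying its width directly, since the sketch has no substrate to read dims from); the engine's resolver covers the full enum:

```rust
// Sketch: resolving Diff(Row, Set) — "the row except this cell."
#[derive(Clone, PartialEq)]
struct Coord { r: usize, c: usize }

enum RegionShape {
    Row { r: usize, cols: usize },
    Set(Vec<Coord>),
    Diff(Box<RegionShape>, Box<RegionShape>),
}

fn resolve(shape: &RegionShape) -> Vec<Coord> {
    match shape {
        RegionShape::Row { r, cols } => (0..*cols).map(|c| Coord { r: *r, c }).collect(),
        RegionShape::Set(coords) => coords.clone(),
        RegionShape::Diff(a, b) => {
            let exclude = resolve(b);
            resolve(a).into_iter().filter(|x| !exclude.contains(x)).collect()
        }
    }
}

fn main() {
    // Row 3 of a 9-wide grid, minus cell (3, 4): eight coords remain.
    let shape = RegionShape::Diff(
        Box::new(RegionShape::Row { r: 3, cols: 9 }),
        Box::new(RegionShape::Set(vec![Coord { r: 3, c: 4 }])),
    );
    let coords = resolve(&shape);
    assert_eq!(coords.len(), 8);
    assert!(!coords.contains(&Coord { r: 3, c: 4 }));
    println!("ok");
}
```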

Convenience builders sugar over the common cases:

impl Region {
    /// Bare region with no presentation — the common case for
    /// solver-only regions (Sudoku rows, Akari LosCross sets).
    pub fn shape(shape: RegionShape) -> Self {
        Self { shape, presentation: None }
    }

    pub fn all() -> Self;
    pub fn all_of(layer: Layer) -> Self;
    pub fn row(layer: Layer, r: usize) -> Self;
    pub fn col(layer: Layer, c: usize) -> Self;
    pub fn rect(layer: Layer, top: usize, left: usize, h: usize, w: usize) -> Self;
    pub fn cells_at(coords: impl IntoIterator<Item = Coord>) -> Self;
    /// Singleton-cell region — the common case for `Pin` constraints.
    pub fn single_cell(r: usize, c: usize) -> Self;
    pub fn los_cross(at: Coord) -> Self;
    pub fn neighbors4(from: Coord) -> Self;
    pub fn path(coords: Vec<Coord>) -> Self;
    pub fn union(parts: Vec<RegionShape>) -> Self;
    // ... one per RegionShape variant.
}

The with_presentation chain builder, plus the full set of construction examples, lives in the next chapter alongside the RegionPresentation definition.

Why Layer shows up twice. A RegionShape::Row { layer, r } declares which layer the row lives on; once the engine resolves that region to Vec<Coord>, every emitted Coord already carries the same information in its variant tag (Coord::Cell, Coord::EdgeV, …). The duplication is intentional: pre-resolution RegionShape values are symbolic, and Row(3) alone would be ambiguous (cells? edges? corners?), so the Layer field disambiguates the symbolic form. Post-resolution, the layer moves from explicit field to implicit variant tag. The cell-only shorthands (Los, LosCross, Neighbors4) skip the field because their semantics — wall-blocking, orthogonal adjacency — are only defined over cells; they take a Coord whose variant must be Coord::Cell, validated at constraint construction. Same currency (Coord) everywhere; one type for callers to remember.

Rules and violations

Each Rule variant is paired 1:1 with a Violation variant carrying the offending coords / counts. The mirroring is intentional: the engine pattern-matches violations to format prose, highlight cells, and grade without re-running the rule. Every rule has one discriminant; the engine’s dispatch key is just Rule, no sub-discriminants.

One file per rule. Each Rule variant owns its entire implementation in core/src/rules/<rule>.rs: tightness, mark-vocabulary requirement, structural validate, the slow-path consistency check, propagator-side combined eval+force-deduction, OR-tools CP-SAT encoding (gated behind #[cfg(feature = "ortools")] — used as a test oracle and bench baseline, not a production solver path), grader techniques, and (for rules with an incremental cache) the per-constraint aggregate’s build and apply/undo updates. Each file defines a marker ZST (e.g. pub struct ExactCount;) that impls crate::rules::RuleImpl.

The trait surface itself is intentionally small. Two methods do the actual work:

  • check(args, coords, game) -> ConstraintOutcome — required. Universal consistency check; reads candidate masks via Game, no aggregate access, no force-emission. Used by SimpleGame::status, generation validation, tests, and as the slow-path fallback inside the propagator. The CSP textbook’s check.
  • propagate(args, ci, prop, out) -> Option<ConstraintOutcome> — optional. Combined eval + force-emission for the AC-3 worklist. Reads the rule’s aggregate (if any), pushes implied moves into out. Returns Some(status) when the rule computed status, None to defer to the slow-path check. The Option<_> makes the asymmetry explicit: check is always definitive; propagate may shrug.

Plus the supporting methods (extract, tightness, mark_requirement, validate, techniques, build_aggregate), all defaulted except extract. Central dispatch sites (eval_rule in algorithm/eval.rs, the propagator’s propagate_at, and ortools::encode_constraint) are exhaustive matches that delegate to the rule’s trait method — adding a new variant is a compile error at every dispatch site, not a panic at runtime.
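The check/propagate asymmetry can be sketched in miniature. Signatures here are simplified stand-ins (candidate bitmasks passed directly instead of through `Game` and the propagator), and the toy `Distinct` is illustrative — but the contract is the one described above: `check` is always definitive; `propagate` may return `None` to defer to the slow path:

```rust
// Sketch of the two-method RuleImpl surface.
#[derive(Debug, PartialEq)]
enum ConstraintOutcome { Satisfied, Pending, Violated }

trait RuleImpl {
    /// Required: universal slow-path consistency check.
    fn check(&self, candidates: &[u16]) -> ConstraintOutcome;
    /// Optional: fast-path eval + force-emission. Default shrugs.
    fn propagate(&self, _candidates: &[u16]) -> Option<ConstraintOutcome> {
        None
    }
}

/// Toy `Distinct` over candidate bitmasks: a mask with exactly one
/// bit set is a committed coord.
struct Distinct;

impl RuleImpl for Distinct {
    fn check(&self, candidates: &[u16]) -> ConstraintOutcome {
        let committed: Vec<u16> =
            candidates.iter().copied().filter(|m| m.count_ones() == 1).collect();
        for (i, a) in committed.iter().enumerate() {
            if committed[i + 1..].contains(a) {
                return ConstraintOutcome::Violated; // two coords share a mark
            }
        }
        if committed.len() == candidates.len() {
            ConstraintOutcome::Satisfied
        } else {
            ConstraintOutcome::Pending // undecided coords remain
        }
    }
}

fn main() {
    let r = Distinct;
    assert_eq!(r.check(&[0b10, 0b100, 0b1110]), ConstraintOutcome::Pending);
    assert_eq!(r.check(&[0b10, 0b10]), ConstraintOutcome::Violated);
    assert_eq!(r.propagate(&[0b10, 0b10]), None); // defers to check
    println!("ok");
}
```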

Propagator caches. Rules that maintain an incremental cache (ExactCount/AtMost/AtLeastOne/Pin/DegreeIn/ParityEven share a MarkAggregate; DistinctSequences and Path { closed: true } each have their own) plug into a single per-constraint Vec<AggregateState> on the propagator. AggregateState is a tagged union with variants None / Mark / DistinctSeq / PathClosed; the rule’s RuleImpl::build_aggregate produces the matching variant, and per-apply / per-undo updates dispatch through the rule module. CellMin is the exception: its storage is a shared SoA of counter state keyed by counter name (not by constraint), so it lives directly on the propagator with rule-side helpers in rules/cell_min.rs.

OR-tools is not on the prod path. The encode_ortools function each rule provides is for the test oracle (OrToolsSolver, used in core/src/tests/puzzles/* to cross-check FastSolver) and bench baselining. FastSolver is the production solver. Some Rule variants — Sum, Increasing, MinDifference, Polyomino, BoundedSize, ValueGroupedSize — currently have no FastSolver aggregate and return Pending from check; in production those rules are caught by trial-1 and slow-walk verification, not by an external CP-SAT call.

New puzzles get new rules by adding variants here, paired with the matching Violation variant. There is no escape hatch — if a puzzle needs a rule the vocabulary doesn’t cover, that’s a signal to extend the vocabulary, not work around it.

pub enum Rule {
    /// Marks in the region are pairwise distinct.
    /// Sudoku rows/cols/boxes, Hitori, Latin squares.
    Distinct,

    /// Marks in the region sum to `target`, each value drawn from
    /// `domain`. Killer cages, Arrow sums, Sandwich Sudoku.
    Sum { target: i64, domain: RangeInclusive<u8> },

    // ... Increasing, MinDifference, AtLeastOne, AtMost,
    // ExactCount — the rest of the counting / ordering family.

    /// The single coord in the region is committed to `mark`. The
    /// canonical shape of every clue-shaped given — Sudoku givens,
    /// Star Battle pre-placed stars, Hidato endpoints. Region must
    /// contain exactly one coord. Distinguished from
    /// `ExactCount { mark, n: 1 }` because the propagator can
    /// shortcut to "set candidates to {mark}" without going through
    /// the count machinery, and the grader treats `Pin` as "given"
    /// (which the renderer wants for distinguishing givens from
    /// deductions).
    Pin { mark: Mark },

    /// Count of `mark`-committed coords lies in `allowed`.
    /// Generalises `ExactCount`; covers Slitherlink/Masyu/Yajilin
    /// vertex degree `{0, 2}`. `allowed` must be sorted and deduped.
    DegreeIn { mark: Mark, allowed: AllowedCounts },

    // ─── Structural / shape rules ──────────────────────────────

    /// The coords marked `mark` form a single path under
    /// substrate-natural adjacency. `closed = true` for cycles;
    /// Slitherlink loop is `Path { mark: Binary(true), closed: true }`
    /// over edge coords with corner-sharing adjacency.
    Path { mark: Mark, closed: bool },

    // ... Connected, Polyomino, BoundedSize, NoTwoByTwo — the
    // other structural / shape rules.

    // ─── Completion + value-grouping ──────────────────────────

    /// Every coord in the region has its candidate set narrowed to
    /// a singleton — the *value* doesn't matter, only that the cell
    /// is committed. With `Goal` role, this is "every cell decided."
    /// Sudoku, Hidato, KenKen, Numberlink need it; Akari does not.
    Decided,

    // ... ValueGroupedSize — Fillomino's "components of value v have
    // size v."

    // ─── Per-cell + cross-rule shapes ──────────────────────────

    /// For every coord in the region, the named per-cell counter at
    /// that coord must be at least `min`. Counters are declared by
    /// the puzzle via `Puzzle::cell_counters`. The propagator forces
    /// the counter's target mark at the unique remaining contributor
    /// when only one is left in the coord's reach.
    CellMin { counter: String, min: u32 },

    // ... LoopFreezesCounts — cross-rule freeze for Slitherlink-style
    // "loop closed AND each clue's count satisfied."
}

/// Sorted, deduped list of allowed counts for `Rule::DegreeIn`.
/// Inline up to 4 entries — covers Slitherlink's `{0, 2}` and any
/// future genre that picks among a handful of values.
pub type AllowedCounts = SmallVec<[u8; 4]>;

pub enum Violation {
    /// `Distinct` violated: two coords share a mark.
    Distinct { a: Coord, b: Coord, mark: Mark },

    /// `Sum` exceeded its target.
    SumOver { target: i64, actual: i64 },
    /// `Sum`'s max possible total is below target.
    SumUnder { target: i64, max_possible: i64 },

    // ... ExactCountOver, ExactCountUnder, AtLeastOneStarved,
    // AtMostOver, OrderBroken, DegreeOutOfAllowed — the rest of
    // the counting / ordering violations.

    /// `Pin` violated: the pinned coord lost `mark` from its
    /// candidate set (the player or solver eliminated the pinned
    /// value, contradicting the given).
    PinLost { coord: Coord, mark: Mark },

    // ... Disconnected, PolyominoIsolated, BadPath, OverSize,
    // HasTwoByTwo — the structural-rule violations.

    /// `Decided` violated: a coord has no candidates left
    /// (irrecoverable empty domain).
    EmptyDomain { coord: Coord },

    // ... ValueGroupedSizeMismatch — completion / value-grouping
    // violations.
}

/// Discriminant-only mirror of `Rule`. Used by techniques to declare
/// which rule kinds they need present in `puzzle.constraints()` to be
/// applicable; the grader filters out techniques whose required
/// kinds don't appear, once per puzzle.
pub enum RuleKind {
    Distinct, Sum, Increasing, MinDifference,
    AtLeastOne, AtMost, ExactCount, Pin, DegreeIn,
    Connected, Polyomino, Path, BoundedSize, NoTwoByTwo,
    Decided, ValueGroupedSize,
    CellMin, LoopFreezesCounts,
}

impl Rule {
    pub fn kind(&self) -> RuleKind { /* trivial match */ }
}
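The grader's once-per-puzzle filter over `RuleKind` can be sketched as a set-containment check. `Technique` and its `requires` field are illustrative stand-ins for however the grader actually declares a technique's required rule kinds:

```rust
// Sketch: keep only techniques whose required RuleKinds all appear
// in the puzzle's constraint list.
use std::collections::HashSet;

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
enum RuleKind { Distinct, Sum, Pin, Path }

struct Technique {
    name: &'static str,
    requires: &'static [RuleKind],
}

fn applicable<'a>(
    techniques: &'a [Technique],
    present: &HashSet<RuleKind>,
) -> Vec<&'a str> {
    techniques
        .iter()
        .filter(|t| t.requires.iter().all(|k| present.contains(k)))
        .map(|t| t.name)
        .collect()
}

fn main() {
    let techniques = [
        Technique { name: "naked-single", requires: &[RuleKind::Distinct] },
        Technique { name: "cage-split", requires: &[RuleKind::Sum, RuleKind::Distinct] },
        Technique { name: "loop-parity", requires: &[RuleKind::Path] },
    ];
    // A classic Sudoku's kinds: Distinct + Pin, no Sum or Path.
    let present: HashSet<_> = [RuleKind::Distinct, RuleKind::Pin].into_iter().collect();
    assert_eq!(applicable(&techniques, &present), vec!["naked-single"]);
    println!("ok");
}
```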

Outcomes

What a constraint’s eval returns when the engine queries it, and the puzzle-wide rollup that follows from walking every constraint.

pub enum ConstraintOutcome {
    /// Every requirement of this constraint is met by current marks.
    Satisfied,
    /// Constraint isn't decidable yet — needs more state.
    Pending,
    /// Constraint can never be satisfied from this state.
    Violated(Violation),
}

pub enum Status {
    InProgress,
    Solved,
    /// The first constraint to evaluate as `Violated` in
    /// `puzzle.constraints()` order. The engine short-circuits on
    /// the first violation it finds — when several constraints are
    /// violated simultaneously, only one is reported, and "first"
    /// is the slice index of the constraint that produced it.
    /// Callers that want to enumerate every violation walk
    /// `puzzle.constraints()` themselves.
    Contradicted(Violation),
}

Move — the unit of change that narrows a candidate set, applied between eval rounds — lives in the Propagation engine chapter, where the moves-from-state derivation actually happens.

Givens are constraints, too

The constraint list is the puzzle's complete rulebook: every rule the solution must satisfy. Region rules (Sudoku row Distinct, Akari LosCross ExactCount(Bulb, 1)); invariants (Nurikabe NoTwoByTwo); shape rules (Path { closed: true } for the Slitherlink loop); and the givens themselves. A Sudoku given "cell (3, 4) = 7" is just Pin { mark: Numeric(7) } over a single-coord region. An Akari clued wall translates to ExactCount over the wall's visible floor cells (the wall position itself isn't a coord, but its visibility region is). A killer cage is Sum over its cell set.

There is no separate "givens" surface on Puzzle — the constraint list is the canonical form. Genres still keep raw author-input fields (Akari.clues: HashMap<(r, c), u8>, Sudoku.clues: HashMap<(r, c), u8>) as editor scaffolding: the form a UI mutates when a player clicks "place a clue here." But every such field is translated into constraints at construction (populate_constraints); the constraint list is what every downstream consumer reads.

The Pin rule (see the Rules section above) is the workhorse for clue-shaped givens. Single-coord region, mark carries the pinned value. The propagator narrows the coord’s candidate set to {mark} immediately at game construction; a PinLost violation fires if the player or solver later eliminates that mark. Sudoku givens, Star Battle pre-placed stars, Hidato endpoints — all Pin.

Givens that aren’t Pin-shaped are still constraints, just under a different rule. Akari clue values are ExactCount over visible floors. Slitherlink cell clues are ExactCount over the cell’s four edges. Killer cage sums are Sum. Hashi island bridge counts are ExactCount over outgoing edges. Same story every time: the editor mutates raw fields, the constructor emits constraints, the engine reads constraints.
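The editor-field-to-constraint translation can be sketched for the Sudoku case: a `clues: HashMap<(r, c), u8>` becomes one Pin per given. The stand-in types below are illustrative; the canonical sort reflects the earlier point that constraint slice indices are identity, so the list must rebuild in the same order every time:

```rust
// Sketch: Sudoku givens → Pin constraints.
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum Mark { Numeric(u8) }

#[derive(Debug, PartialEq)]
struct PinConstraint { r: usize, c: usize, mark: Mark }

fn pins_from_clues(clues: &HashMap<(usize, usize), u8>) -> Vec<PinConstraint> {
    let mut out: Vec<_> = clues
        .iter()
        .map(|(&(r, c), &v)| PinConstraint { r, c, mark: Mark::Numeric(v) })
        .collect();
    // Canonical order: HashMap iteration order is unstable, but the
    // constraint slice index is the constraint's identity, so the
    // rebuilt list must come out the same every time.
    out.sort_by_key(|p| (p.r, p.c));
    out
}

fn main() {
    let clues = HashMap::from([((3, 4), 7), ((0, 0), 5)]);
    let pins = pins_from_clues(&clues);
    assert_eq!(pins.len(), 2);
    assert_eq!(pins[1], PinConstraint { r: 3, c: 4, mark: Mark::Numeric(7) });
    println!("ok");
}
```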

Mark / rule compatibility

Mark collapses to Binary | Numeric(u8), so compatibility is runtime-checked rather than gated by trait bounds. Sum, Increasing, and MinDifference are meaningful only over coords carrying Numeric marks; constructing one over a Binary region errors at constraint validation. Distinct, AtLeastOne, ExactCount, Pin, DegreeIn, and the structural rules (Connected, Polyomino, Path, BoundedSize, NoTwoByTwo) work for both mark variants.
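The runtime gate can be sketched as a small match at validation time. The enum subsets and error type below are illustrative stand-ins, but the partition is the one stated above: arithmetic rules demand Numeric marks, everything else accepts both:

```rust
// Sketch: constructing Sum over a Binary region errors at validation.
#[derive(Debug, PartialEq, Clone, Copy)]
enum MarkKind { Binary, Numeric }

#[derive(Debug)]
enum Rule { Distinct, Sum, Increasing, MinDifference }

fn validate(rule: &Rule, region_marks: MarkKind) -> Result<(), String> {
    let needs_numeric =
        matches!(rule, Rule::Sum | Rule::Increasing | Rule::MinDifference);
    if needs_numeric && region_marks == MarkKind::Binary {
        return Err(format!("{:?} requires Numeric marks", rule));
    }
    Ok(())
}

fn main() {
    assert!(validate(&Rule::Sum, MarkKind::Numeric).is_ok());
    assert!(validate(&Rule::Distinct, MarkKind::Binary).is_ok());
    assert!(validate(&Rule::Sum, MarkKind::Binary).is_err());
    println!("ok");
}
```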

Growing the vocabulary

When a new puzzle wants a rule the existing vocabulary doesn’t cover, the discipline is to extend the vocabulary, not work around it: add a variant on Rule, the paired variant on Violation, and a matcher in the engine. There is no escape hatch. Slitherlink’s “edges form a single closed loop” earned a slot via Path { mark: Binary(true), closed: true }; Masyu’s pearl rules will earn theirs the day Masyu lands.

Decorations, Themes and Colors

How regions look on the screen. The previous chapter introduced Region.presentation: Option<RegionPresentation>. None marks solver-only regions; Some(_) marks anything the renderer should depict (Sudoku 3×3 boxes, Killer cages, thermos, Quoits clones). This chapter unpacks what RegionPresentation carries.

/// Optional render hint attached to a region. Renderers walk
/// `puzzle.constraints().iter().flat_map(|c| &c.regions)` and paint
/// every region whose `presentation` is `Some(_)`.
pub struct RegionPresentation {
    /// Border on the perimeter (between this region and any coord
    /// not in it). Cell-region only; edge / corner regions ignore.
    pub border: Option<BorderStyle>,

    /// Fill tone. `None` = transparent. `Some(color)` = solid tint.
    /// Two regions with the same color render with the same shade.
    pub fill: Option<Color>,

    /// Optional label rendered in the region. Cage sums, room
    /// counts, sandwich totals.
    pub label: Option<RegionLabel>,

    /// Whether the region's cells get a decorative shading tone —
    /// the visual rhythm Sudoku 3×3 box parity uses. Independent of
    /// `fill`; shaded cells get a frontend-default tint, fill is for
    /// puzzle-author-chosen colors.
    pub shaded: bool,

    /// Decorative overlay drawn on top of the region's coords.
    /// `None` = no overlay (the common case).
    pub decoration: Option<Decoration>,
}

pub enum BorderStyle { Solid, Dashed, Dotted }

pub enum RegionLabel {
    Number(u32),
    DigitList(Vec<u8>),
    Text(String),
}

Decorations

Decorative overlays — cross-cutting visual primitives that recur across variant puzzles. Each variant carries an Option<Color>: Some(_) for puzzle-author-chosen color, None lets the frontend pick.

pub enum Decoration {
    Ring        { color: Option<Color> },   // Quoits clone groups
    LineThrough { color: Option<Color> },   // palindromes, thermos, German whispers
    Arrow       { color: Option<Color> },   // sudoku arrow constraints
    Dot         { color: Option<Color> },   // Kropki edges
    QuadCircle  { color: Option<Color> },   // Quoits-style four-cell anchors
}

The five variants exhaust the cross-cutting visual primitives in the catalog so far — adding a new one is a deliberate extension, not a casual addition.

Colors

Three states, mapping to how puzzles actually want colors specified:

pub enum Color {
    Hex(u32),
    Named(NamedColor),
}

pub enum NamedColor {
    Red, Orange, Yellow, Green, Blue, Purple, Pink, Brown, Grey,
    // Closed set; add a variant only when a real puzzle needs more.
}

  • None (an Option<Color> that’s None) — the puzzle didn’t care. Frontend picks however it likes; renderers may differ. Use for autogen Killer cages where the constructor doesn’t bother picking, or for decorations where any reasonable color works.
  • Color::Hex(rgb) — exact RGB. Editor or constructor has a specific color in mind; frontends render that color (possibly with a theme transform). Two frontends in the same theme agree exactly.
  • Color::Named(name) — semantic color from a small closed palette. Themes resolve to their palette’s appropriate shade. The puzzle’s “pink” is pink everywhere; the exact shade adapts per theme.

Theme adaptation

Frontends are free to transform Hex colors for accessibility or theme constraints — a dark-mode renderer might lightness-flip every Hex; a high-contrast theme might quantize to a distinguishable subset. Named colors are resolved per theme to the theme’s matching palette entry. None lets the frontend do whatever fits its theme. This means two themes show the same puzzle with different absolute colors — correct, not a bug.
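A per-theme resolver can be sketched as one match over the three states. Everything here is illustrative — the naive invert standing in for a "lightness flip" and the palette values are frontend choices, not library behavior:

```rust
// Sketch: resolve Option<Color> to a concrete RGB per theme.
#[derive(Clone, Copy)]
enum NamedColor { Pink, Grey }

enum Color { Hex(u32), Named(NamedColor) }

struct Theme { dark: bool }

impl Theme {
    fn resolve(&self, color: Option<&Color>) -> u32 {
        match color {
            // None: the puzzle didn't care; frontend default.
            None => if self.dark { 0x222222 } else { 0xEEEEEE },
            // Hex: exact color, possibly theme-transformed
            // (naive invert standing in for a lightness flip).
            Some(Color::Hex(rgb)) => {
                if self.dark { 0xFFFFFF - rgb } else { *rgb }
            }
            // Named: per-theme palette lookup.
            Some(Color::Named(n)) => match (self.dark, n) {
                (false, NamedColor::Pink) => 0xFFC0CB,
                (true, NamedColor::Pink) => 0xB05070,
                (false, NamedColor::Grey) => 0x888888,
                (true, NamedColor::Grey) => 0xAAAAAA,
            },
        }
    }
}

fn main() {
    let light = Theme { dark: false };
    let dark = Theme { dark: true };
    assert_eq!(light.resolve(Some(&Color::Hex(0x102030))), 0x102030);
    assert_eq!(dark.resolve(Some(&Color::Hex(0x102030))), 0xFFFFFF - 0x102030);
    // Same puzzle, different absolute colors per theme — by design.
    assert_ne!(light.resolve(Some(&Color::Named(NamedColor::Pink))),
               dark.resolve(Some(&Color::Named(NamedColor::Pink))));
    println!("ok");
}
```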

Construction

The chained-builder pattern lets a constraint pin only the presentation fields it cares about; everything else falls back to the field’s default.

impl Region {
    /// Chain a presentation onto a region. The `..Default::default()`
    /// pattern lets a constraint pin only the fields it cares about
    /// (Sudoku boxes set border + shaded; Killer cages set border +
    /// label; nothing else needs to populate every field).
    pub fn with_presentation(mut self, p: RegionPresentation) -> Self {
        self.presentation = Some(p);
        self
    }
}

Construction reads naturally:

// Sudoku 3×3 box at (br, bc) with parity shading.
Region::rect(Layer::Cells, br * 3, bc * 3, 3, 3)
    .with_presentation(RegionPresentation {
        border: Some(BorderStyle::Solid),
        shaded: (br + bc) % 2 == 1,
        ..Default::default()
    })

// Sudoku row — solver-only, no presentation.
Region::row(Layer::Cells, r)

// Killer cage with dashed border + sum label + a chosen color.
Region::cells_at(cage.cells.clone())
    .with_presentation(RegionPresentation {
        border: Some(BorderStyle::Dashed),
        label: Some(RegionLabel::Number(cage.sum)),
        fill: Some(Color::Named(NamedColor::Yellow)),
        ..Default::default()
    })

// Thermo: line drawn through the path, bulb implied by the path's
// first coord. Pairs with `Rule::Increasing` over the same Path.
Region::path(thermo.path.clone())
    .with_presentation(RegionPresentation {
        decoration: Some(Decoration::LineThrough {
            color: Some(Color::Named(NamedColor::Grey)),
        }),
        ..Default::default()
    })

// Quoits clone group — two rings sharing a Pink color, paired
// across regions inside one `Rule::Clone` constraint.
Region::cells_at(ring_a_cells.clone())
    .with_presentation(RegionPresentation {
        decoration: Some(Decoration::Ring {
            color: Some(Color::Named(NamedColor::Pink)),
        }),
        ..Default::default()
    })

// Kropki edge: a small dot between two adjacent cells.
Region::cells_at(vec![Coord::Cell { r, c }, Coord::Cell { r, c: c + 1 }])
    .with_presentation(RegionPresentation {
        decoration: Some(Decoration::Dot { color: None }),
        ..Default::default()
    })

Genres, Puzzles, Games

Now that the vocabulary’s in place — Substrate, Layer, Coord, Mark, Region, Constraint, Move — we can talk about how Penmark actually represents puzzles.

A genre is a runtime value: a &'static Genre fn-pointer table. A puzzle is a Puzzle — a Substrate plus a Vec<Constraint>, tagged with its genre handle. A game is Marks on a puzzle.

A genre is a value, not a type. The Genre struct (core/src/genre.rs) is a table of fn pointers (build_constraints, cell_clue, …) plus identity constants and frontend defaults (name, display_name, mark_render, layers). One &'static Genre literal ships per genre (AKARI, SLITHER, SUDOKU, TAKUZU) and they all live in the GENRES slice. The per-genre logic itself is authored as an impl GenreTrait for AkariGenre block on a zero-sized marker — the macro genre_static! then coerces those monomorphized items into the struct’s fn-pointer slots. Authors think in trait methods; callers dispatch through the runtime handle.

A puzzle is a value of Puzzle — a single non-generic struct that carries its genre as a field:

pub struct Puzzle {
    pub genre: &'static Genre,
    rows: usize,
    cols: usize,
    substrate: Substrate,
    constraints: Vec<Constraint>,
    meta: PuzzleMeta,
}

Per-genre type aliases still exist (pub type Sudoku = Puzzle;) as shorthand, but they’re vestigial — Sudoku and Puzzle are the same type. Constructors live as associated functions on the marker struct: SudokuGenre::new(box_w, box_h, givens) returns a Puzzle whose .genre is set to crate::genre::SUDOKU.

A puzzle carries a Substrate (which positions exist, where walls and holes are) and a Vec<Constraint> (every rule the solution must satisfy, with givens included as Pin-shaped constraints). There’s no separate “clue map” or “wall set” field — those live as Material::Wall / Material::Hole in the substrate and as constraint entries in the rule list.

Editing a puzzle (add_clue, clear_clue, set_wall, set_hole, set_floor, set_material) goes through methods on Puzzle that mutate the substrate and rebuild the constraint list by dispatching through puzzle.genre.build_constraints(substrate, clues). Rebuilds are microseconds at human-interaction sizes, so we always rebuild rather than apply surgical updates.

A game is someone’s in-progress play — which marks have been committed, which candidates eliminated, where the propagator has narrowed things. A game is a value of Game<'p>, holding Marks (per-coord candidate sets) plus a borrow back to the static Puzzle. The borrow lifetime makes “this game is bound to this puzzle” a compile-time guarantee.

In prose, lowercase puzzle keeps its colloquial sense — “this Sudoku puzzle is hard.” Capital Puzzle always refers to the struct.

Solving, Grading, Generating, Enumerating

The four operations Penmark performs on a puzzle.

  • Solving — finding solutions. Solver trait; pluggable per-solver impls. Takes &Puzzle.
  • Grading — assigning a difficulty by simulating a human solver step by step. Grader trait; one shipped impl (StandardGrader). Takes &Puzzle.
  • Generating — producing a uniquely-solvable puzzle. Lives as genre.generate (an fn pointer on the runtime handle) rather than a separate trait, since generation needs intimate access to per-genre structural state.
  • Enumerating — streaming uniquely-solvable puzzles of a given size. Same shape: genre.enumerate.

Solving and grading are pluggable because real users want to compare solver / grader implementations side-by-side (the bench matrix shows FastSolver vs SimpleSolver vs OrToolsSolver). Generating and enumerating aren’t pluggable in the same way — each genre has at most one generation pipeline that makes sense for it. Folding them into the genre handle keeps the surface lean.

Pluggable solvers and graders

The algorithm traits (Solver, Grader) are non-generic and take &Puzzle. Dispatch on a puzzle’s genre happens at runtime through its &'static Genre handle, not by monomorphization.

The shipped set:

  • FastSolver — DFS + AC-3 propagation + depth-1 trial-1 sweep, driven through a Propagator that wraps a FastGame<'p> (bitmask domains, per-constraint aggregates, per-cell counters).
  • SimpleSolver — brute-force DFS over SimpleGame’s HashMap<Coord, Vec<Mark>> candidate sets. The “obvious correct” reference every faster impl is cross-checked against.
  • StandardGrader — fires named human-pattern techniques in priority order against the same Propagator engine, with trial-1 as backward escalation. Techniques are rule-kind-driven and apply across genres.

impl Solver for FastSolver { /* … */ }
impl Grader for StandardGrader { /* … */ }

Adding a new solver impl is one trait impl plus a row in the bench matrix. No per-genre wiring needed.
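The shape of that pluggability, reduced to a toy. Every type and signature below is a stand-in (Penmark's real Solver trait takes &Puzzle, not this two-digit toy), but the structure is the point: a non-generic trait, a brute-force reference impl, and a faster impl that gets cross-checked against it, exactly as the bench matrix does with FastSolver vs SimpleSolver.

```rust
// Toy "puzzle": find decimal digits a < b with a + b = target_sum.
struct Puzzle { target_sum: u8 }

trait Solver {
    fn solve(&self, p: &Puzzle) -> Option<(u8, u8)>;
}

/// The "obviously correct" exhaustive reference.
struct SimpleSolver;
impl Solver for SimpleSolver {
    fn solve(&self, p: &Puzzle) -> Option<(u8, u8)> {
        for a in 0..=9u8 {
            for b in 0..=9u8 {
                if a < b && a + b == p.target_sum {
                    return Some((a, b));
                }
            }
        }
        None
    }
}

/// A shortcut impl: one arithmetic step instead of a double loop.
/// Must agree with SimpleSolver on every input.
struct FastSolver;
impl Solver for FastSolver {
    fn solve(&self, p: &Puzzle) -> Option<(u8, u8)> {
        let a = p.target_sum.saturating_sub(9); // smallest feasible `a`
        let b = p.target_sum - a;
        (a < b && b <= 9).then_some((a, b))
    }
}
```

Cross-checking the two over the whole input space is one loop in a test, which is what makes a reference impl worth shipping.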

Genre-handle generators and enumerators

Generation and enumeration are fn pointers on the Genre struct, populated by the macro that builds each per-genre static from its GenreTrait impl:

pub struct Genre {
    pub generate: fn(&GenConfig, &AtomicBool) -> Option<Puzzle>,
    pub enumerate: fn(&EnumConfig, &AtomicBool,
                      &mut dyn FnMut(&Puzzle),
                      &mut dyn FnMut(EnumStats)) -> String,
    /* … */
}

// Dispatch at a call site:
let p = (puzzle.genre.generate)(&cfg, &cancel);

Both default to a universal recipe (scaffold_solve_derive / walk_canonical) that uses genre.material_vocab to know what to scatter and genre.derive_clue_at to derive a ClueValue per clueable cell from a bare-solve completion. The default derive_clue_at body dispatches through genre.clue_pattern (the Pin / Count / ConnectedSize declarative shortcut) and wraps the u8 result in ClueValue::Number; genres whose clues don’t fit that closed set (Tapa, future Yajilin) override derive_clue_at directly. Akari uses both defaults verbatim. Sudoku and Slitherlink override generate because they need a genre-specific structural seed (sudoku: random diagonal; slither: connected inside-region) before the bare-solve can find a useful completion.
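A sketch of the default derive_clue_at recipe: dispatch on a declarative clue pattern, compute a u8, and wrap it in ClueValue::Number. The Solution grid, the flat row-major layout, and the orthogonal-neighbor Count below are simplified stand-ins, not Penmark's actual types.

```rust
#[derive(Clone, Copy)]
enum CluePattern {
    Pin,                  // sudoku-style: the clue is the solved value itself
    Count { target: u8 }, // akari/slither-style: count `target` nearby
}

#[derive(Debug, PartialEq)]
enum ClueValue { Number(u8) }

/// Toy bare-solve completion: one solved value per cell, row-major.
struct Solution { rows: usize, cols: usize, vals: Vec<u8> }
impl Solution {
    fn at(&self, r: usize, c: usize) -> u8 { self.vals[r * self.cols + c] }
}

fn derive_clue_at(pat: CluePattern, sol: &Solution, r: usize, c: usize) -> Option<ClueValue> {
    let v = match pat {
        CluePattern::Pin => sol.at(r, c),
        CluePattern::Count { target } => {
            // Count orthogonal neighbors whose solved value equals `target`.
            let mut n = 0u8;
            for (dr, dc) in [(-1i64, 0i64), (1, 0), (0, -1), (0, 1)] {
                let (nr, nc) = (r as i64 + dr, c as i64 + dc);
                if nr >= 0 && nc >= 0
                    && (nr as usize) < sol.rows && (nc as usize) < sol.cols
                    && sol.at(nr as usize, nc as usize) == target
                {
                    n += 1;
                }
            }
            n
        }
    };
    Some(ClueValue::Number(v))
}
```

A genre whose clues are not a single number (Tapa's run sequences) bypasses this path entirely by overriding derive_clue_at.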

See the Generator and Enumerator chapters for full configuration surfaces and extension points.

Implementations

The Rust trait surface that ties the conceptual model to running code: Puzzle for the static puzzle data, Game for in-progress play, tagged serialization for round-tripping puzzles and games across the wire, dispatching for routing genres at runtime, the shared propagation engine, and the Puzzlehound dataset that feeds everything else.

Genre and Puzzle

Penmark splits puzzle representation into two pieces:

  • Genre — a runtime fn-pointer table (&'static Genre). Hosts every per-genre rule the dispatch path needs: domain shape, render style, how to build the constraint list from a substrate + clue grid, parsers, generators, enumerators. One static literal per genre (AKARI, SLITHER, SUDOKU, TAKUZU); they all live in the GENRES slice.
  • Puzzle — a single non-generic struct holding the canonical state (the runtime genre handle, rows, cols, substrate, constraints, meta).

Per-genre logic is authored as an impl GenreTrait for <X>Genre block on a zero-sized marker. The genre_static! macro coerces those monomorphized methods and constants into the runtime Genre struct’s fn-pointer fields. Authors think in trait methods; callers dispatch through puzzle.genre.method(args).

The split keeps the value-carrying struct dimension-uniform across genres while letting per-genre code stay co-located in core/src/puzzles/<genre>/. Heterogeneous storage and iteration (“every genre”) are now trivial — just walk &[&Genre].

The Genre struct

pub struct Genre {
    // ── Identity ────────────────────────────────────────────────
    pub name: &'static str,            // "sudoku", "akari", …
    pub display_name: &'static str,    // "Sudoku" — shown in UIs
    pub baseline_version: u32,         // wire-format schema version

    // ── Frontend defaults ──────────────────────────────────────
    pub mark_render: MarkRenderStyle,  // Lamp / Glyph / EdgeStroke
    pub layers: &'static [Layer],      // cells / edges / corners

    // ── Slider ranges ──────────────────────────────────────────
    pub gen_budget_range: (f32, f32),
    pub gen_initial_budget_secs: f32,
    pub enum_budget_range: (f32, f32),
    pub enum_initial_budget_secs: f32,
    pub enum_empty_message: &'static str,

    // ── Generator default ──────────────────────────────────────
    pub default_gen_density: f32,

    // ── Puzzle-free fn pointers ─────────────────────────────────
    pub default_substrate: fn(Dimensions) -> Substrate,
    pub dims_for: fn(usize, usize) -> Dimensions,
    pub build_constraints:
        fn(&Substrate, &Grid<Option<ClueValue>>) -> Vec<Constraint>,
    pub cell_counters: fn() -> &'static [CellCounter],
    pub format_value: fn(u8) -> String,
    pub emits_variant_directives_in_janko: fn() -> bool,
    pub variant_aliases: fn() -> &'static [(&'static str, &'static str)],
    pub material_vocab: fn() -> MaterialVocab,
    pub clue_host_material: fn() -> Material,
    pub clue_pattern: fn() -> CluePattern,
    pub parse_external_line: fn(&str) -> Option<String>,

    // ── Puzzle-aware fn pointers ────────────────────────────────
    pub full_domain: fn(&Puzzle, Coord) -> Vec<Mark>,
    pub format_value_for: fn(&Puzzle, u8) -> String,
    pub cell_clue: fn(&Puzzle, usize, usize) -> Option<u8>,
    pub extra_constraints: fn(&Puzzle) -> Vec<Constraint>,
    pub empty_at: fn(Dimensions) -> Puzzle,
    pub extract_clues: fn(&Puzzle) -> Grid<Option<ClueValue>>,
    pub on_clue_added: fn(&mut Puzzle, usize, usize, &ClueValue),
    pub derive_clue_at:
        fn(&Puzzle, &Solution, Coord) -> Option<ClueValue>,
    pub clue_domain: fn(&Puzzle, Coord) -> Vec<u8>,
    pub sample: fn() -> Puzzle,
    pub overlays:
        fn(&Puzzle, &Marks, &Status) -> Vec<Overlay>,
    pub generate:
        fn(&GenConfig, &AtomicBool) -> Option<Puzzle>,
    pub scaffold_passes: fn() -> &'static [ScaffoldPass],
    pub clue_post_filters: fn() -> &'static [CluePostFilter],
    pub apply_cli_density: fn(f32, &mut GenConfig),
    pub enumerate: fn(
        &EnumConfig,
        &AtomicBool,
        &mut dyn FnMut(&Puzzle),
        &mut dyn FnMut(EnumStats),
    ) -> String,
    pub parse_text: fn(&str) -> Result<Puzzle, ParseError>,
    pub from_janko_text: fn(&str) -> Result<Puzzle, ParseError>,
    pub to_janko_text: fn(&Puzzle) -> String,
    pub from_puzzlink_url: fn(&str) -> Result<Puzzle, ParseError>,
    pub to_puzzlink_url: fn(&Puzzle) -> Result<String, ParseError>,
}

pub static AKARI:   &Genre = genre_static!(AkariGenre);
pub static SLITHER: &Genre = genre_static!(SlitherlinkGenre);
pub static SUDOKU:  &Genre = genre_static!(SudokuGenre);
pub static TAKUZU:  &Genre = genre_static!(TakuzuGenre);

pub static GENRES: &[&Genre] = &[AKARI, SLITHER, SUDOKU, TAKUZU];

pub fn by_name(name: &str) -> Option<&'static Genre> {
    GENRES.iter().copied().find(|g| g.name == name)
}

Call sites dispatch through the field:

let p = (puzzle.genre.parse_text)(s)?;
let dom = (puzzle.genre.full_domain)(&puzzle, coord);
if puzzle.genre.name == "sudoku" { /* … */ }

GenreTrait — the authoring interface

GenreTrait (in crate::puzzle::genre) is the trait per-genre markers implement. Its const items and method signatures mirror the Genre struct’s field types. The genre_static! macro fills in each &'static Genre literal by copying the monomorphized items out of the marker’s impl:

pub trait GenreTrait: 'static + Sized {
    const NAME: &'static str;
    const DISPLAY_NAME: &'static str;
    const LAYERS: &'static [Layer];
    const MARK_RENDER: MarkRenderStyle;
    const BASELINE_VERSION: u32 = 1;
    const DEFAULT_GEN_DENSITY: f32 = 0.22;
    /* …slider const ranges… */

    /// Runtime handle — must match the literal in `crate::genre`.
    /// Default methods that construct a `Puzzle` use it to set the
    /// `genre` field.
    const GENRE: &'static crate::genre::Genre;

    fn full_domain(p: &Puzzle, coord: Coord) -> Vec<Mark>;
    fn default_substrate(dims: Dimensions) -> Substrate;
    fn build_constraints(
        substrate: &Substrate,
        clues: &Grid<Option<ClueValue>>,
    ) -> Vec<Constraint>;
    fn clue_pattern() -> CluePattern;
    fn sample() -> Puzzle;
    /* …~25 more methods, most with sensible defaults… */
}

Required items (no default): the consts NAME, DISPLAY_NAME, LAYERS, MARK_RENDER, and GENRE, plus the methods full_domain, default_substrate, build_constraints, clue_pattern, and sample. Everything else has a default — see core/src/puzzle.rs for the full trait.

Clue payload — ClueValue

Per-cell clues thread through build_constraints, extract_clues, add_clue, and derive_clue_at as a closed enum:

pub enum ClueValue {
    Number(u8),    // sudoku digit, akari/slither count, takuzu cell value
    Runs(Vec<u8>), // Tapa run-length sequences
    // future Yajilin arrows, Masyu pearls go here as new variants
}

The four shipped genres only ever emit ClueValue::Number(_); Puzzle::add_clue accepts impl Into<ClueValue> so existing u8 literal call sites continue to work via impl From<u8> for ClueValue. Multi-byte payloads (Tapa runs, Yajilin arrows when it lands) carry their own variants.
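The u8 convenience path is a standard From/Into pairing. A self-contained sketch, with a free function add_clue standing in for Puzzle::add_clue:

```rust
#[derive(Debug, PartialEq)]
pub enum ClueValue {
    Number(u8),    // single-number clues
    Runs(Vec<u8>), // Tapa-style run sequences
}

// This one impl is what keeps bare-u8 call sites compiling.
impl From<u8> for ClueValue {
    fn from(v: u8) -> Self { ClueValue::Number(v) }
}

/// Stand-in for `Puzzle::add_clue`'s `impl Into<ClueValue>` parameter.
fn add_clue(v: impl Into<ClueValue>) -> ClueValue {
    v.into()
}
```

Both `add_clue(5u8)` and `add_clue(ClueValue::Runs(vec![1, 3]))` compile: the first goes through From\<u8\>, the second through the blanket From\<T\> for T.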

Genre::cell_clue (returning Option<u8>) is unchanged — it’s the display shorthand that renderers (the eframe painter, the CLI Janko writer) use to fold a clue down to a single glyph. Multi-byte payloads return their first byte. The typed ClueValue channel is for construction, canonical reconstruction, and round-tripping; rendering stays on the cheap u8 path.

Generation extension points (default-implemented hooks)

  • material_vocab() — non-Floor materials available on the cells layer. Drives the universal generator’s density sliders.
  • clue_host_material() — substrate material that holds clue cells (sudoku/takuzu: Floor; akari: Wall; slither: LabelOnly). Drives Puzzle::is_clue_target and the editor’s clue-cycle gate.
  • clue_domain(p, coord) — valid clue values; empty means no clue allowed at this coord.
  • clue_pattern() — declarative shortcut for the “given a solution, what’s the clue here?” question. Three canonical shapes: Pin (sudoku), Count{target, region} (akari/slither), ConnectedSize{target} (nurikabe). The default derive_clue_at body dispatches on this enum and wraps the result in ClueValue::Number. Genres whose clues don’t fit any of these shapes (Tapa runs) override derive_clue_at directly and ignore clue_pattern.
  • derive_clue_at(p, sol, coord) -> Option<ClueValue> — inverse of cell_clue. The walker calls it once per clueable cell to derive the canonical clue set from a bare-solve completion. Default body forwards through clue_pattern(); the escape valve for novel clue shapes.
  • scaffold_passes() — named structural pre-passes (akari ships six wall styles + hole layouts).
  • clue_post_filters() — value-filters applied after clue derivation (akari ships only_evens etc).
  • apply_cli_density(d, cfg) — map the CLI’s --density flag into the right GenConfig slot for this genre.
  • dims_for(rows, cols) — sudoku rounds rows to a square factor pair and returns Boxed{box_w, box_h}.

See the Generator chapter for the full configuration surface.

Implementations live in core/src/puzzles/<genre>/mod.rs. The marker struct is pub struct <Genre>Genre; — ZST, all logic on the trait impl block.

The Puzzle struct

pub struct Puzzle {
    pub genre: &'static crate::genre::Genre,
    rows: usize,
    cols: usize,
    substrate: Substrate,
    constraints: Vec<Constraint>,
    meta: PuzzleMeta,
}

Same six fields across every genre. Serialization delegates to WireDelta so the on-disk JSON elides any constraint / substrate cell that the genre’s build_constraints + default_substrate would produce on its own; custom constraints land in constraints_added. Deserialize is not implemented directly — rehydration needs the runtime &'static Genre handle, which serde can’t thread through. Callers go through Puzzle::from_tagged_json (envelope dispatch on the genre tag) or crate::puzzle_wire::from_wire_delta(genre, wire).
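The elision principle behind WireDelta, reduced to a toy: serialize only the positions where the actual value differs from what the genre's defaults would rebuild, and rehydrate by replaying that diff over a freshly built default. This is the idea only, not the actual wire encoding.

```rust
/// Record only the (index, value) pairs that differ from the default.
fn to_delta(actual: &[u8], default: &[u8]) -> Vec<(usize, u8)> {
    actual.iter().zip(default)
        .enumerate()
        .filter(|(_, (a, d))| a != d)
        .map(|(i, (a, _))| (i, *a))
        .collect()
}

/// Rebuild the default, then replay the diff over it.
fn from_delta(default: &[u8], delta: &[(usize, u8)]) -> Vec<u8> {
    let mut out = default.to_vec();
    for &(i, v) in delta {
        out[i] = v;
    }
    out
}
```

Note that from_delta needs the default, which is why deserialization can't be a plain serde impl here: rebuilding the default substrate and constraints requires the runtime genre handle.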

Puzzle exposes a uniform API that covers most engine and frontend needs:

impl Puzzle {
    // ── Construction ──
    pub fn from_state(
        genre: &'static Genre,
        rows: usize, cols: usize,
        substrate: Substrate,
        constraints: Vec<Constraint>,
    ) -> Self;

    // ── Reads ──
    pub fn rows(&self) -> usize;
    pub fn cols(&self) -> usize;
    pub fn substrate(&self) -> &Substrate;
    pub fn constraints(&self) -> &[Constraint];
    pub fn meta(&self) -> &PuzzleMeta;

    pub fn coords(&self) -> impl Iterator<Item = Coord> + '_;
    pub fn material(&self, r: usize, c: usize) -> Material;
    pub fn cell_clue(&self, r: usize, c: usize) -> Option<u8>;
    pub fn cell_clues(&self) -> impl Iterator<Item = ((usize, usize), u8)> + '_;
    pub fn is_play_target(&self, coord: Coord) -> bool;
    pub fn is_clue_target(&self, coord: Coord) -> bool;

    pub fn is_floor(&self, r: usize, c: usize) -> bool;
    pub fn is_wall(&self, r: usize, c: usize)  -> bool;
    pub fn is_hole(&self, r: usize, c: usize)  -> bool;
    pub fn walls(&self) -> impl Iterator<Item = (usize, usize)> + '_;
    pub fn holes(&self) -> impl Iterator<Item = (usize, usize)> + '_;
    pub fn box_size(&self) -> u8;  // sqrt(rows), sudoku-shaped

    // Geometry helpers (forward to substrate)
    pub fn edges_of_cell(&self, r: usize, c: usize) -> [Coord; 4];
    pub fn edges_of_vertex(&self, r: usize, c: usize) -> Vec<Coord>;
    pub fn vertices_of_cell(&self, r: usize, c: usize) -> [Coord; 4];
    pub fn vertices_of_edge(&self, edge: Coord) -> [Coord; 2];
    pub fn cells_of_edge(&self, edge: Coord) -> Vec<Coord>;
    pub fn cells_of_vertex(&self, r: usize, c: usize) -> Vec<Coord>;

    // ── Mutation (always rebuilds constraints via puzzle.genre) ──
    pub fn add_clue(&mut self, r: usize, c: usize, v: impl Into<ClueValue>);
    pub fn clear_clue(&mut self, r: usize, c: usize);
    pub fn set_material(&mut self, coord: Coord, m: Material);
    pub fn set_floor(&mut self, r: usize, c: usize);
    pub fn set_wall(&mut self, r: usize, c: usize);
    pub fn set_hole(&mut self, r: usize, c: usize);
    pub fn substrate_mut(&mut self) -> &mut Substrate;
    pub fn constraints_mut(&mut self) -> &mut Vec<Constraint>;

    // ── Genre-derived (dispatch through self.genre) ──
    pub fn full_domain(&self, coord: Coord) -> std::vec::IntoIter<Mark>;
    pub fn cell_counters(&self) -> &'static [CellCounter];
    pub fn format_value(&self, v: u8) -> String;

    // ── Wire format ──
    pub fn to_tagged_json(&self) -> String;
    pub fn from_tagged_json(s: &str) -> Result<Self, TagError>;
    pub fn to_tagged_json_with_marks(&self, marks: &Marks) -> String;
    pub fn from_tagged_json_with_marks(s: &str)
        -> Result<(Self, Option<Marks>), TagError>;
    pub fn to_janko(&self) -> String;
    pub fn content_hash(&self) -> String;
}

Editing model. Constraints are canonical state but never edited directly during normal use. Mutating the puzzle (add_clue, set_wall, …) updates the substrate and re-derives the constraint list from (self.genre.build_constraints)(substrate, clues). Builds run in single-digit microseconds across genres at human-interaction sizes, so we don’t bother with surgical updates.
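The mutate-then-rebuild pattern in miniature. The Material enum, the string "constraints", and build_constraints below are toy stand-ins for the genre-dispatched rebuild; the structure (derived state recomputed wholesale after every edit, never patched) is the point.

```rust
#[derive(Clone, Copy, PartialEq)]
enum Material { Floor, Wall }

struct Puzzle {
    cells: Vec<Material>,
    constraints: Vec<String>, // derived — never edited directly
}

impl Puzzle {
    fn new(n: usize) -> Self {
        let cells = vec![Material::Floor; n];
        let constraints = Self::build_constraints(&cells);
        Puzzle { cells, constraints }
    }

    // Stand-in for `(self.genre.build_constraints)(substrate, clues)`:
    // one toy constraint per wall cell.
    fn build_constraints(cells: &[Material]) -> Vec<String> {
        cells.iter().enumerate()
            .filter(|(_, m)| **m == Material::Wall)
            .map(|(i, _)| format!("wall@{i}"))
            .collect()
    }

    fn set_wall(&mut self, i: usize) {
        self.cells[i] = Material::Wall;
        // Rebuilds are cheap at human-interaction sizes, so always
        // rebuild rather than patch the constraint list surgically.
        self.constraints = Self::build_constraints(&self.cells);
    }
}
```

The payoff of this shape is that constraints can never drift out of sync with the substrate: there is exactly one function that produces them.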

constraints_mut() is exposed for use cases that need it (variant constraint authoring, debugging) but isn’t the normal path.

Per-genre aliases and constructors

Each genre exposes a friendly type alias (vestigial since the collapse — they’re just shorthand for Puzzle) and inherent constructors on its marker:

pub type Sudoku = Puzzle;

impl SudokuGenre {
    pub fn new(box_w: u8, box_h: u8,
               givens: HashMap<(usize, usize), u8>) -> Puzzle { /* … */ }
    pub fn empty(box_size: u8) -> Puzzle { /* … */ }
}

So SudokuGenre::new(3, 3, givens) returns a Puzzle whose .genre is set to crate::genre::SUDOKU. The constructor builds an empty puzzle, calls set_wall / set_hole / add_clue as needed, and the substrate-then-rebuild path keeps constraints in sync.

Akari constructors take (rows, cols, walls, clues, holes); Slitherlink takes (rows, cols, clues).

Why a struct of fn pointers instead of a trait?

The pre-collapse design had trait Genre and Puzzle<G: Genre>. Heterogeneous iteration over every genre meant a for_each_genre! macro expanded at 11+ call sites across the workspace; runtime genre lookup needed a parallel GenreEntry trait object slice; loading a puzzle without knowing its genre at compile time meant an AnyPuzzle enum.

Collapsing the trait into a value-typed struct paid off three ways:

  • One iteration mechanism. Every “for every genre” call site becomes for g in GENRES { … }. Both for_each_genre! and the GenreEntry object slice deleted.
  • One puzzle type. Puzzle carries its genre as a field; the generic parameter is gone everywhere. AnyPuzzle deleted.
  • Editor-friendly. Threading a runtime &'static Genre through gesture-codec, variant, and apply paths is mechanical; threading <G> through them is friction. The collapse unblocked the universal-editor work.

The cost is a small indirection (an fn-pointer call) on the per-genre dispatch path, which doesn’t show up in benchmarks (propagation work dominates), and a one-time macro that copies monomorphized items into the runtime literal. The full design lives in penmark/notes/genre-collapse.md.
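The trait-to-fn-pointer collapse, shown in miniature. The types below are drastically simplified, but the mechanics are the real ones: per-genre logic lives on a zero-sized marker's trait impl, fn items coerce to fn pointers when copied into the struct literal (what genre_static! automates), and heterogeneous iteration becomes a plain slice walk.

```rust
trait GenreTrait {
    const NAME: &'static str;
    fn domain_size() -> u32;
}

/// The runtime handle: plain data plus fn pointers.
struct Genre {
    name: &'static str,
    domain_size: fn() -> u32,
}

struct SudokuGenre; // ZST marker — all behavior, no data
impl GenreTrait for SudokuGenre {
    const NAME: &'static str = "sudoku";
    fn domain_size() -> u32 { 9 }
}

struct AkariGenre;
impl GenreTrait for AkariGenre {
    const NAME: &'static str = "akari";
    fn domain_size() -> u32 { 2 } // bulb / empty
}

// What one `genre_static!` expansion amounts to: consts copy across,
// monomorphized fn items coerce to fn pointers.
static SUDOKU: &Genre = &Genre { name: SudokuGenre::NAME, domain_size: SudokuGenre::domain_size };
static AKARI: &Genre = &Genre { name: AkariGenre::NAME, domain_size: AkariGenre::domain_size };

// "Every genre" needs no macro — just walk the slice.
static GENRES: &[&Genre] = &[SUDOKU, AKARI];

fn by_name(name: &str) -> Option<&'static Genre> {
    GENRES.iter().copied().find(|g| g.name == name)
}
```

The generic parameter never appears: callers hold a &'static Genre and pay one fn-pointer indirection per dispatch.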

Presentation declarations

Every frontend (eframe today, WASM/web/SVG/terminal later) needs to know two things to paint a puzzle: what’s at each position structurally (Floor? Wall? Hole?) and what player marks mean visually (lamp? digit? edge stroke?). The first answer reads from Substrate (per-position Material); the second is a small closed enum (MarkRenderStyle) declared per-genre as a const on the GenreTrait impl and surfaced on the runtime handle as genre.mark_render. A frontend implements one paint pass per Material variant and one per MarkRenderStyle variant; new genres land without touching the renderer.

The substrate side deliberately doesn’t carry given information (clue values, pinned digits) — that comes from walking the constraint list at paint time, not from the per-coord material. Cell clues are read uniformly via Puzzle::cell_clue(r, c) which forwards to (self.genre.cell_clue)(self, r, c).

pub enum MarkRenderStyle {
    /// Cell-substrate, binary marks. `Binary(true)` is a token
    /// (lamp / bulb / star), `Binary(false)` is a small X.
    /// Akari, Star Battle.
    Lamp,
    /// Cell-substrate, numeric marks rendered via
    /// `genre.format_value`. Sudoku, KenKen.
    Glyph,
    /// Cell-substrate, binary marks, tri-state visual: wall fill,
    /// explicit empty fill, and a third "unknown" tone for cells
    /// with no committed mark. Nurikabe, LITS, Hitori, Tapa,
    /// Yin-Yang.
    Fill,
    /// Cell-substrate, numeric marks where `Numeric(tag)` is a
    /// region tag; the renderer derives a partition. Fillomino,
    /// Suguru.
    Partition,
    /// Edge-substrate, binary marks. `Binary(true)` is a thick
    /// line, `false` is an X cross. Slitherlink, Masyu, Yajilin
    /// loop edges.
    EdgeStroke,
    /// Edge-substrate, numeric marks: `Numeric(1|2)` is one or
    /// two parallel lines. Hashi.
    BridgeCount,
    /// Corner-substrate, binary marks. Shingoki and other
    /// vertex-mark genres.
    CornerCircle,
    /// Standard pass paints nothing; the genre's overlay handles
    /// all mark visualisation.
    Custom,
}

MarkRenderStyle is a semantic commitment (“Akari marks ARE lamps”) — what a lamp looks like (yellow disc with rays in eframe, a <div class="lamp"> in web, a * in a terminal) is the frontend’s call. The genre declares what; frontends decide how.

The four small read accessors a frontend uses, in paint order:

  • puzzle.material(r, c) -> Material — paint backgrounds.
  • puzzle.cell_clue(r, c) -> Option<u8> — paint clue text.
  • puzzle.is_play_target(coord) -> bool — gate input dispatch.
  • puzzle.format_value(v) -> String — render a small int as the genre’s glyph. Forwards to (self.genre.format_value)(v).

Game<'p>

Holds the candidate sets per coord. Borrows its puzzle for its entire lifetime — the borrow checker enforces “this game is bound to this puzzle.” Cloning a Game copies only the candidate state plus a reference; the static puzzle is never duplicated.

pub trait Game<'p>: Sized + Clone {

    /// Fresh game bound to a puzzle — every coord carries its full
    /// domain.
    fn new(puzzle: &'p Puzzle) -> Self;

    /// The puzzle this game is playing.
    fn puzzle(&self) -> &'p Puzzle;

    /// Current candidate set at a coordinate.
    fn candidates(&self, coord: Coord) -> impl Iterator<Item = Mark> + '_;

    /// Apply a move. No validity checks — the caller is responsible.
    /// `Move` is `Copy` and small (≤ 32 bytes), so we pass by value:
    /// no indirection on the DFS hot path, and the data lives in
    /// registers across the call.
    fn apply(&mut self, mv: Move);

    /// Undo the most recently applied move. Implementations track the
    /// history themselves (per-game undo stack), so the caller doesn't
    /// pass the move back — pop-the-stack semantics. Cheaper at the
    /// call site (one fewer parameter to thread through DFS frames)
    /// and lets fast impls undo via diff-records they already keep
    /// for incremental propagator state.
    fn undo(&mut self);

    // ─── Derived ─────────────────────────────────────────────────

    /// `Contradicted(v)` if any constraint is `Violated` — `v` is
    /// the violation reported by the *first* such constraint in
    /// `puzzle.constraints()` order. Else `Solved` if every `Goal`
    /// constraint is `Satisfied`. Else `InProgress`.
    fn status(&self) -> Status;

    // ─── Player-mark accessors ───────────────────────────────────

    /// Iterate every coord currently committed to a single mark,
    /// paired with that mark. Skips undecided and contradicted
    /// coords. Default impl walks `puzzle().coords()` and yields
    /// singletons; overrides can stream from a denser store.
    fn marks(&self) -> impl Iterator<Item = (Coord, Mark)> + '_ { /* default */ }

    /// Read the committed mark at `coord`. `None` for undecided or
    /// empty-domain coords. Default impl runs in O(domain size)
    /// via `candidates()`.
    fn get(&self, coord: Coord) -> Option<Mark> { /* default */ }

    /// Force `coord`'s candidate set to `{mark}` — the player UI
    /// path. Doesn't go through the apply/undo log; player edits
    /// aren't solver moves. No default: the trait's
    /// `apply(Move::Commit)` only works for multi-candidate
    /// coords, so it can't express "overwrite a previously-
    /// committed value" — a common player gesture.
    fn set(&mut self, coord: Coord, mark: Mark);

    /// Restore `coord` to its full domain — the player's "clear
    /// this cell" gesture. Like `set`, not undo-stack-tracked.
    fn clear(&mut self, coord: Coord);

    // ─── Serialization (game-with-givens) ────────────────────────

    /// Snapshot the game's current state: every puzzle coord paired
    /// with its remaining candidate list. Forgiving by design — a
    /// caller can save partial state and round-trip it. The
    /// matching wire shape lives in `core::tagged::Marks` (see the
    /// *Tagged serialization* chapter).
    fn to_marks(&self) -> Marks { /* default impl walks coords */ }

    /// Build a fresh game and replay each saved candidate
    /// restriction into it. Coords absent from `marks` are left at
    /// their full domain — partial saves stay loadable. Saved
    /// marks outside a coord's full domain are silently dropped;
    /// the engine's domain is the source of truth.
    fn from_marks(puzzle: &'p Puzzle, marks: &Marks) -> Self { /* default impl */ }
}

Saved-game serialization round-trips through Puzzle’s tagged-JSON methods (to_tagged_json_with_marks / from_tagged_json_with_marks, see the Tagged serialization chapter); the loader then rebuilds a Game in two steps because the borrow checker won’t let one function return both the owned puzzle and a game that borrows it:

let (puzzle, marks) = Puzzle::from_tagged_json_with_marks(s)?;
let game = match marks {
    Some(m) => FastGame::from_marks(&puzzle, &m),
    None    => FastGame::new(&puzzle),
};

Tagged serialization and dispatch

Every serialized puzzle (and every serialized game) carries its own genre tag. The wire format is the same in every direction (db row, data/puzzles/<genre>.jsonl.gz line, penmark import input, future WASM IPC) — a self-describing JSON envelope:

{
  "genre": "akari",
  "data":  { "rows": 7, "cols": 7, "walls": [...], ... },
  "marks": {                            // optional — present iff this is a saved game
    "cell:00": ["bulb", "empty"],
    "cell:01": ["empty"],
    ...
  }
}

The marks field is the saved-game extension: present means “this is the puzzle plus where the player is now”; absent means “bare puzzle.” A serialized game always carries the givens — data is the full puzzle definition, every time. There is no marks-only wire form, by design: a save-game blob is portable on its own and needs no separate puzzle reference to render or play.

After the trait→struct collapse there is exactly one path into the format:

  • Puzzle::to_tagged_json / from_tagged_json — for every caller, whether or not it knows the genre yet. from_tagged_json reads the "genre" field from the JSON envelope, looks the name up via core::genre::by_name, and constructs a Puzzle whose .genre field is that runtime handle. The pre-collapse AnyPuzzle enum (one arm per genre) is gone — the runtime handle carries everything it used to encode in its discriminant.
  • Puzzle::to_tagged_json_with_marks / from_tagged_json_with_marks — the saved-game pair. The latter returns (Puzzle, Option<Marks>).

impl Puzzle {
    pub fn to_tagged_json(&self) -> String;
    pub fn from_tagged_json(s: &str) -> Result<Self, TagError>;

    pub fn to_tagged_json_with_marks(&self, marks: &Marks) -> String;
    pub fn from_tagged_json_with_marks(s: &str)
        -> Result<(Self, Option<Marks>), TagError>;
}

/// Errors from tagged-JSON parsing.
pub enum TagError {
    Json(serde_json::Error),
    UnknownGenre(String),
    GenreMismatch { expected: &'static str, found: String },
}

Round-trip property tests pin the agreement:

  • Puzzle: for every Puzzle value p, p.to_tagged_json() parses back to a Puzzle whose .genre and storage equal the original.
  • Game: for every (puzzle, sequence of valid moves), building the game, applying the moves, calling to_tagged_json_with_marks, parsing via from_tagged_json_with_marks, and rebuilding via Game::from_marks produces a game whose status and candidates(c) match the original at every coord.

Adding a new genre = one &'static Genre literal + one entry in the GENRES slice (see Adding a genre). The serialization paths require no changes — they look up by name.

How variants ride inside

Variants of an existing genre — Killer Sudoku, Thermo, Sandwich, Sudoku-X, German Whispers — are not new genres. They’re additional constraints stacked onto the same puzzle’s constraints vector, so the genre tag on the wire stays the same regardless of how exotic the puzzle gets.

canonical/sudoku/variants.rs is the worked example:

  • A VariantTag enum names every recognised variant (Killer, Thermometer, Sandwich, AntiKnight, NonConsecutive, …).
  • A Constraints struct carries one Vec<…> field per variant with payload (killer: Vec<KillerCage>, thermometer: Vec<Thermometer>, …), each #[serde(default, skip_serializing_if = "Vec::is_empty")] so absent variants vanish from the JSON.
  • A TOGGLE_VARIANTS const lists the payload-less ones (Diagonals, AntiKnight, NonConsecutive, …); they ride in a variants: Vec<VariantTag> list on the puzzle.
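The skip-if-empty behavior can be illustrated without serde. This std-only sketch (simplified cage/thermometer payloads, and a hypothetical `present_fields` helper standing in for the serde attribute machinery) shows how empty variant vectors vanish from the serialized shape:

```rust
// Illustrative only: the real Constraints struct derives serde with
// #[serde(default, skip_serializing_if = "Vec::is_empty")]. The field
// payloads here are simplified stand-ins for KillerCage / Thermometer.

#[derive(Debug, Default)]
struct Constraints {
    killer: Vec<(Vec<(u8, u8)>, u32)>, // (cells, total) per cage
    thermometer: Vec<Vec<(u8, u8)>>,   // bulb-to-tip paths
}

/// Hypothetical helper: which fields would survive serialization.
fn present_fields(c: &Constraints) -> Vec<&'static str> {
    let mut out = Vec::new();
    if !c.killer.is_empty() { out.push("killer"); }
    if !c.thermometer.is_empty() { out.push("thermometer"); }
    out
}

fn main() {
    // Vanilla Sudoku: no variant fields appear at all.
    let mut c = Constraints::default();
    assert!(present_fields(&c).is_empty());
    // Add one Killer cage: only "killer" shows up.
    c.killer.push((vec![(0, 0), (0, 1)], 23));
    assert_eq!(present_fields(&c), ["killer"]);
    println!("{:?}", present_fields(&c));
}
```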

A Killer-plus-Thermo-plus-Sudoku-X puzzle round-trips as:

{
  "genre": "sudoku",
  "data": {
    "box_size": 3,
    "givens": { ... },
    "variants": ["diagonals"],
    "constraints": {
      "killer":      [{ "cells": [...], "total": 23 }, ...],
      "thermometer": [{ "bulb": [3,3], "path": [...] }, ...]
    }
  }
}

Two clean properties fall out:

  • The dispatch tag never grows. Every Sudoku — vanilla, Killer, Thermo, all of them — serializes as "genre": "sudoku". The variant universe expands inside Sudoku’s data shape. New variant = new field on Constraints + new VariantTag arm; no change to the genre registry, no change to the db schema, no change to the import pipeline.
  • Querying for variants stays cheap without normalising. The db row’s data JSON is searchable by tag (data->'variants' @> '["diagonals"]' in Postgres, json_extract(data, '$.variants') in SQLite) — no separate puzzle_variants join table required unless query patterns demand it. If “show me every Killer” earns a denormalised indexed column, it’s the same write-side helper pattern as genre: extract variants[] into a sibling text array.

What this rules out: “Killer Sudoku is its own genre with its own dispatch arm.” That’d contradict the constraint-composition thesis the rest of the design is built on, and force every cross-genre caller to know that Killer is just-Sudoku-but-different. The genre is the rule vocabulary the puzzle uses; variants are constraints stacked on top.

What this unifies

  • Database — puzzles(id, genre TEXT, data JSON, rows, cols, difficulty, hard_score, ...) where data is the tagged blob and genre is denormalised from the tag for indexing. Loader is Puzzle::from_tagged_json(row.data)?. A CHECK constraint or a write-side helper enforces that the column and the blob’s tag agree.

  • data/puzzles/<genre>.jsonl.gz — one tagged blob per line. The per-genre file split stays (keeps file sizes / query patterns reasonable), but each line is self-describing — a stray Sudoku blob in the Akari file is rejected at read time via the tag / expected-genre check.
  • penmark import — reads puzzlehound’s collated output, parses each line via Puzzle::from_tagged_json, validates the resulting .genre.name against the target file’s expected genre, appends. The “what genre is this?” question is asked once (during the tag lookup) and the rest of the pipeline operates on the resulting Puzzle.
  • Saved games — Puzzle::to_tagged_json_with_marks(&marks) produces the same envelope with a marks field appended. A save / share / URL-hash carries everything needed to render and play in one blob; the givens travel with the marks, so a recipient can play with no prior context.
  • WASM (when it lands) — the JS side hands a tagged blob to Rust, which parses through Puzzle::from_tagged_json[_with_marks] and dispatches per puzzle.genre. No parallel envelope format, no out-of-band genre header, no separate “puzzle channel” and “game-state channel”.

The thing this doesn’t do is force every caller to think about dispatch. Puzzle::from_tagged_json(blob) returns a Puzzle that already knows its genre; method calls go through the fn-pointer field, not through a match.

Dispatching

Every “for every genre” call site iterates the GENRES slice declared in core/src/genre.rs:

pub static GENRES: &[&Genre] = &[AKARI, SLITHER, SUDOKU, TAKUZU];

CLI, eframe, and the bench harness all walk this slice; there is no parallel hand-maintained list to keep in sync. Adding a genre means one &'static Genre literal + one append to the slice (see Adding a genre).

The runtime Genre struct answers every per-genre question the harnesses used to out-source: parser, default density, dimension shape, generation pipeline, painter overlays, etc. Call sites dispatch through the fn-pointer fields:

// CLI command dispatch:
let Some(genre) = core::genre::by_name(&args.genre) else {
    bail!("unknown genre: {}", args.genre);
};
let p = (genre.from_janko_text)(&input)?;
let out = (genre.to_puzzlink_url)(&p)?;

// Iterating every genre:
for genre in core::genre::GENRES {
    println!("{} — {}", genre.name, genre.display_name);
}

penmark list-impls walks GENRES and prints genre.name per entry; the technique list is queried through core::algorithm::grader::technique_names(), which is itself genre-neutral now that techniques are object-safe.

The pre-collapse design had a for_each_genre! macro expanded at 11+ call sites across the workspace plus a parallel GenreEntry trait-object slice for runtime iteration. Both deleted. The full rationale lives in penmark/notes/genre-collapse.md.

Propagation engine

Underneath Solver, Generator, and Grader is a single propagation engine that runs the same way for every genre. It has one currency — Move — and two roles for the rule machinery, kept clean.

pub enum Move {
    /// Eliminate `mark` from `coord`'s candidate set.
    Eliminate { coord: Coord, mark: Mark },
    /// Narrow `coord`'s candidates to exactly `mark`.
    Commit    { coord: Coord, mark: Mark },
}

A Move is the unit of change. Eliminate is fine-grained (drop one mark from a coord’s candidate set); Commit is coarse (collapse to a singleton). Commit is conceptually equivalent to eliminating every other mark, but it’s its own variant because (a) it’s the player’s natural gesture — “place a bulb here” — and (b) the engine can shortcut it as a one-bit-set bitmask write instead of N eliminations. Game::apply and Game::undo both consume Moves.
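A minimal sketch of the bitmask shortcut, with illustrative names (the real FastGame state is richer than a bare mask):

```rust
// Per-coord candidate sets as u32 bitmasks. Eliminate clears one bit;
// Commit overwrites the whole mask in a single write — conceptually the
// same as eliminating every other mark, but one operation instead of N.

type Mask = u32;

fn eliminate(mask: Mask, bit: u32) -> Mask {
    mask & !(1 << bit)
}

fn commit(bit: u32) -> Mask {
    1 << bit // one-bit-set write
}

fn main() {
    let start: Mask = 0b1111; // four candidate marks
    // Commit to mark 2 in one write...
    let committed = commit(2);
    // ...equals eliminating marks 0, 1, 3 one at a time.
    let mut m = start;
    for b in [0, 1, 3] {
        m = eliminate(m, b);
    }
    assert_eq!(committed, m);
    println!("{committed:#06b}");
}
```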

The two roles, then:

  • Constraints are predicates. They report Satisfied, Pending, or Violated(reason) against the current state. They don’t produce moves; the engine never asks them to.
  • Propagator-tagged techniques produce moves. Techniques flagged Phase::Propagation are the cheap, rule-shape-specific propagators that emit Deduction { observed, moves }. Naked singles, bulb-clears-LOS, clue-saturation, etc. live here.

This split means a constraint’s eval is a yes/no/maybe check; the work of deriving what to do about it lives in propagators. If a propagator’s pass produces moves, those moves change the state, and the next pass evaluates constraints again. Fixed-point iteration handles correctness without Forces plumbing on the constraint side.

pub enum Phase {
    /// Cheap forward inference — runs in the inner propagation
    /// loop after every move, to fixpoint. Akari LOS clearing,
    /// Sudoku naked singles, clue-saturation,
    /// single-lighter-must-be-bulb.
    Propagation,
    /// Higher-cost pattern matching — runs in the outer grader
    /// loop only when propagation has stalled. X-Wing, Hidden
    /// Pairs, Akari clue-cover-by-elimination, paired-clue
    /// interactions.
    Pattern,
}

/// Object-safe: the grader holds a `Vec<Box<dyn Technique>>`.
pub trait Technique: Send + Sync {
    fn name(&self) -> &'static str;
    fn rule_kinds(&self) -> &'static [RuleKind];
    fn fire_once(&self, ci: usize, c: &Constraint, prop: &Propagator)
        -> Vec<TechniqueMatch> { Vec::new() } // default: no matches
    fn fire_all(&self, prop: &Propagator) -> Vec<TechniqueMatch>;
}

The loop

propagate(puzzle, game) -> Result<Vec<SolveStep>, Violation>:
    loop:
        before = game.move_count()
        for each Technique t with PHASE = Propagation, applicable to puzzle:
            run t.fire on builder; apply resulting Deduction; record steps
        for each Constraint c in puzzle.constraints():
            if c.eval(game) == Violated(v):
                return Err(v)
        if game.move_count() == before:
            return Ok(steps)   # fixpoint reached

Each pass: run all applicable propagators, applying any moves they produce; then check every constraint for a violation; loop until no move is made. Constraints are pure predicates; propagators are the move-producing engine.
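The loop above can be rendered as a self-contained Rust sketch. Everything here is a stand-in (a toy propagator, a toy violation predicate, Vec<u32> masks for state), but the shape — propagators emit moves, constraints only report, iterate to fixpoint — is the same:

```rust
#[derive(Debug, PartialEq)]
enum Outcome {
    Fixpoint(usize),           // total moves made before stalling
    Violated(&'static str),    // a constraint reported a violation
}

/// Toy propagator: eliminate mark 0 from any coord that still has
/// another candidate. Returns how many moves it made.
fn drop_mark_zero(state: &mut Vec<u32>) -> usize {
    let mut moves = 0;
    for mask in state.iter_mut() {
        if *mask & 1 != 0 && *mask != 1 {
            *mask &= !1;
            moves += 1;
        }
    }
    moves
}

/// Toy constraint: a pure predicate, never produces moves.
fn no_empty_domain(state: &Vec<u32>) -> Option<&'static str> {
    if state.iter().any(|&m| m == 0) { Some("empty candidate set") } else { None }
}

fn propagate(
    propagators: &[fn(&mut Vec<u32>) -> usize],
    check: fn(&Vec<u32>) -> Option<&'static str>,
    state: &mut Vec<u32>,
) -> Outcome {
    let mut total = 0;
    loop {
        let before = total;
        for p in propagators {
            total += p(state); // run every applicable propagator
        }
        if let Some(reason) = check(state) {
            return Outcome::Violated(reason); // constraints are predicates
        }
        if total == before {
            return Outcome::Fixpoint(total); // no new moves: fixpoint
        }
    }
}

fn main() {
    let mut state = vec![0b11, 0b101, 0b1];
    let out = propagate(&[drop_mark_zero], no_empty_domain, &mut state);
    assert_eq!(out, Outcome::Fixpoint(2));
    assert_eq!(state, vec![0b10, 0b100, 0b1]);
    println!("{state:?}");
}
```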

How callers use this

Three callers weave propagate into their own outer loops, each adapting the engine to a different goal.

Solver fires the generic Propagator — the same set of forced-move derivations the engine knows about, batched and unnamed for speed. There’s no per-move trace; the solver just wants the answer. When propagation stalls without a solution or contradiction, the solver branches (MRV-ordered open coord, trial-1 with intra-sweep cascade) and recurses.

Grader fires the same forced-move logic, but as named Phase::Propagation techniques, one at a time, so each move lands in the trace tagged with the technique that derived it. The profile-driven scorer reads those tags to assign difficulty. When propagation stalls, the grader walks Phase::Pattern techniques in priority order, fires the first one that hits, and re-enters propagation. When no Pattern technique fires either, it escalates through backward fallbacks (clue-cover, trial-1, trial-2/N), each itself a Pattern-phase technique recorded in the trace. See the Grader chapter for the full layering.

Discovery loop runs propagation to the stall point, captures the state as a specimen (board layout, prior moves, the trial-search-forced move that no technique predicted), applies that move, and continues. The harness mines specimens that expose gaps in the technique library.

Three solvers, one contract

FastSolver’s body is: propagate to fixpoint → trial-1 sweep (MRV order, intra-sweep cascade) → if still in progress, branch on the first MRV-ordered open coord. The propagation step uses the generic Propagator, which derives forced moves from each rule kind’s cached aggregate state — no genre-specific code. This is the production solve path for every genre.

SimpleSolver is the deliberate brute-force baseline: HashMap<Coord, Vec<Mark>> candidates, no propagation, full DFS. Used for cross-checks on small puzzles and as the bench reference every faster impl beats badly.

The OR-Tools wrapper in algorithm/ortools/ translates each constraint into CP-SAT primitives and asks Google’s solver for solutions. It’s slower than FastSolver and not on the production path; its job is to be a totally independent oracle. When the three disagree on solve(p, k).len() or on the solution sets, the bug is on our side.

Puzzle dataset gathered by Puzzlehound

The engine reads puzzles from per-genre canonical files — penmark/data/puzzles/<genre>.jsonl.gz, one tagged JSON blob per line in the format described in Tagged serialization and dispatch. The per-line shape is whatever Puzzle::to_tagged_json() produces; FastSolver, StandardGrader, the eframe dataset browser, the benchmarks, and the test fixtures all consume this format and only this format.

The per-genre file split is for query patterns and reasonable file sizes, not for type discrimination — every line is self-describing, so a stray Sudoku blob in the Akari file is rejected at read time with a TagError::GenreMismatch, never silently decoded into the wrong type.

Strict mode at the boundary: every field in a canonical record maps to a typed variant struct, or the record is rejected at write time. No opaque escape hatch — if it can’t be encoded, it isn’t stored. Downstream consumers never branch on “this field might not be there”.

Where the puzzles come from: puzzlehound

Penmark doesn’t fetch puzzles. puzzlehound does — a sibling project at threeemojis/puzzlehound/ (Python, has its own README) that discovers, fetches, and collates puzzles from registered sources (logic-masters.de, Cracking the Cryptic spreadsheets, gmpuzzles RSS, swaroopg92 Atom, tdoku benchmarks, …) into a form penmark can consume directly.

penmark import reads puzzlehound’s collated output, parses each record with Puzzle::from_tagged_json, validates puzzle.genre.name against the target file’s genre, and appends as a tagged blob. From there the dataset-reading verbs (solve, grade-canon, profile-eval) take over. Adding a new upstream source is puzzlehound’s problem; adding a new genre to canonical-ize into is penmark’s — one new entry in the GENRES slice + one new data/puzzles/<genre>.jsonl.gz file.

Solver

Solver finds solutions to a puzzle. Three live impls today, one contract.

/// One valid solution: the candidate set per coord narrowed to a
/// singleton, expressed as a flat coord → mark map. Genre-specific
/// readers (e.g., "extract bulb positions") project from this.
pub type Solution = HashMap<Coord, Mark>;

/// Up to `max_solutions` solutions, in unspecified order. Length
/// ≤ `max_solutions`; equal to the true solution count if fewer.
/// The generator's uniqueness gate calls `solve(p, 2)` and checks
/// `len == 1`.
pub type SolutionSet = Vec<Solution>;

pub trait Solver {
    type Config: Default + Serialize + DeserializeOwned;
    const NAME: &'static str;
    fn new(config: Self::Config) -> Self;
    fn solve(&self, puzzle: &Puzzle, max_solutions: usize) -> SolutionSet;
}

/// Cross-puzzle solver options. Per-impl Configs typically embed
/// this with `#[serde(flatten)]`.
pub struct SolverParams {
    pub time_budget: Duration,
}
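The uniqueness gate mentioned in the SolutionSet doc comment is a two-solution probe: ask for at most two and accept only when exactly one comes back. A self-contained sketch with a stub solver (simplified types, not penmark's):

```rust
// Simplified stand-ins: the real Solution is a coord → mark map and
// the real Solver trait carries a Config and NAME.
type Solution = Vec<u32>;

trait Solver {
    fn solve(&self, puzzle: &[u32], max_solutions: usize) -> Vec<Solution>;
}

/// Stub: pretends every puzzle longer than one cell has two solutions.
struct StubSolver;

impl Solver for StubSolver {
    fn solve(&self, puzzle: &[u32], max_solutions: usize) -> Vec<Solution> {
        let count = if puzzle.len() > 1 { 2 } else { 1 };
        (0..count.min(max_solutions)).map(|i| vec![i as u32]).collect()
    }
}

/// The generator's uniqueness gate: solve(p, 2) and check len == 1.
/// Asking for at most 2 is enough — a second solution, any second
/// solution, already disqualifies the puzzle.
fn is_unique(solver: &impl Solver, puzzle: &[u32]) -> bool {
    solver.solve(puzzle, 2).len() == 1
}

fn main() {
    assert!(is_unique(&StubSolver, &[7]));
    assert!(!is_unique(&StubSolver, &[7, 7]));
    println!("ok");
}
```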

One CSP engine for every genre

The generic FastSolver, driven through Propagator over FastGame, is the single solve path for every genre. The rule vocabulary — DegreeIn, CellMin, LoopFreezesCounts, plus the per-cell counter machinery the engine maintains automatically — is expressive enough that no per-genre solver override is needed. Per-genre fast solvers (e.g. a future AkariFastSolver that registers fast genre-specific propagators while keeping the same outer DFS shape) are post-MVP; the generic impl earns its keep on every benchmark we run today.

OR-Tools as the sanity-check oracle

A separate algorithm/ortools/ impl wraps Google OR-Tools’ CP-SAT behind the same Solver trait. It’s not on the hot path — FastSolver outperforms it for everything we ship. What it provides is a totally different code path with the same contract: every generated puzzle and every imported dataset can be cross-checked against an industry-grade CSP solver to confirm the answer set matches. When FastSolver and OR-Tools disagree, the bug is in our engine, full stop. This is cheap fallout from going hard on the CSP-engine path: once the rule vocabulary maps cleanly onto CP-SAT primitives, getting a second opinion is mostly plumbing.

Cross-checks

FastSolver, SimpleSolver, and the OR-Tools wrapper all implement the same Solver trait. The benchmark harness runs them in the same Criterion invocation on a fixed puzzle set; the differential tests run them over a fuzzed corpus and assert solve(p, k).len() agrees and the returned sets are equal as sets when fewer than max_solutions exist. SimpleSolver is the brute-force baseline, OR-Tools is the independent oracle, FastSolver is the production path. Three impls, one contract, three angles on every disagreement.

Fast solver

A report on the architecture of the FastSolver (penmark-core), the optimizations that landed, the ones we tried and reverted, and ideas we discussed but haven’t pursued.

The engine lives under core/src/algorithm/ in two families:

  • algorithm/fast/ — the production family.
    • game.rs — FastGame, the bitmask-backed CSP state (per-coord candidate bitmasks + apply/undo + slow status()).
    • propagator.rs — Propagator, the speed: every incremental cache (per-constraint aggregates, per-cell counters, dirty queue, region indices, trial order) plus the forced_moves_at / eval_constraint_at machinery.
    • solver.rs — FastSolver, the search loop. Drives a Propagator wrapping a FastGame.
  • algorithm/simple/ — the brute-force reference family.
    • game.rs — SimpleGame, a HashMap<Coord, Vec<Mark>>-backed minimal Game impl. No bitmask, no caches.
    • solver.rs — SimpleSolver, plain DFS over SimpleGame. The correctness baseline — every other solver should beat it badly.
  • algorithm/geometry.rs — region/reach helpers shared between the two families.
  • algorithm/eval.rs — generic constraint evaluator with both a fast eval_constraint (assumes pre-resolved regions) and a slow eval_constraint_full (handles LosCross/Neighbors4/CellMin by walking the puzzle on the fly).
  • algorithm/grader.rs — StandardGrader, the human-techniques grader. Drives a Propagator (same engine as FastSolver) and fires named techniques in priority order. Techniques are generic, keyed off Rule kind + cached aggregate state: exact-count-zero, exact-count-saturated, exact-count-forced, at-most-saturated, at-least-one-witness, cell-min-witness, plus trial-1 as backward escalation. Any genre using these rule kinds gets the techniques for free; genres needing bespoke patterns can append their own Technique impls.

1. What the solver does

At every DFS node:

  1. Forward propagation to fixpoint. AC-3-style worklist over constraints, plus a parallel dirty-cell drain for CellMin-style counter rules. Continues until both queues are empty.
  2. Trial-1 sweep with intra-sweep cascade. Walk open coords in degree-descending order (game.trial_order()). At each coord, hypothesize every candidate; any commit that propagates to a contradiction is recorded as a forced elimination. Eliminations are applied immediately and followed by an inline forward- propagation drain, before moving to the next coord.
  3. Branch. If still InProgress, pick the first trial_order-ranked open coord and branch over its candidates. For binary domains (Akari) “smallest domain” MRV is degenerate (every open cell has 2 candidates) — degree is the heuristic that actually shrinks the search tree.

The undo log inside FastGame records every applied move; on exit from each DFS frame we pop exactly the count of moves applied at this level.
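The pop-exactly-what-this-frame-applied discipline can be sketched like so (Log and undo_to are illustrative names, not the real undo-stack API, and the entries here are whole-mask snapshots rather than the real bit-level deltas):

```rust
// Each DFS frame records the stack depth on entry and pops back to it
// on exit, restoring exactly the moves applied at this level.

struct Log {
    stack: Vec<(usize, u32)>, // (coord index, previous mask)
}

impl Log {
    fn apply(&mut self, masks: &mut [u32], idx: usize, new_mask: u32) {
        self.stack.push((idx, masks[idx])); // remember what we clobber
        masks[idx] = new_mask;
    }
    fn undo_to(&mut self, masks: &mut [u32], checkpoint: usize) {
        while self.stack.len() > checkpoint {
            let (idx, prev) = self.stack.pop().unwrap();
            masks[idx] = prev;
        }
    }
}

fn main() {
    let mut masks = vec![0b11, 0b11];
    let mut log = Log { stack: Vec::new() };
    let checkpoint = log.stack.len(); // entering a DFS frame
    log.apply(&mut masks, 0, 0b01);
    log.apply(&mut masks, 1, 0b10);
    log.undo_to(&mut masks, checkpoint); // leaving the frame
    assert_eq!(masks, vec![0b11, 0b11]); // state fully restored
    println!("restored");
}
```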

The same three-phase shape lives in the grader, except:

  • the grader runs in an outer loop instead of recursing;
  • the grader keeps a step trace tagged by RuleKind;
  • the grader never branches (trial-1 is its terminal technique).

The shared propagation routines are intentionally close to literal copies — keeping them divergent for tiny gains is more cost than it’s worth.


2. The FastGame / Propagator split

FastGame is intentionally minimal: per-coord candidate bitmasks, the coord_index / mark_to_bit lookups, the undo_stack, and an empty_count counter that lets status() short-circuit O(1) when no domain is empty. Apply and undo do the bit-flip + push/pop the undo log + maintain empty_count. That’s it. status() walks all constraints with a slow per-region resolver — fine for tests and the brute-force Simple solver, slow on the hot path.

Propagator<'g, 'p> borrows &'g mut FastGame<'p> and owns every incremental cache. Its constructor builds coord_constraints, region_indices, trial_order, the per-counter inverse-reach maps, the lifted CellMin rule table, the per-constraint aggregates, and the per-cell counters — all from the game’s current candidate masks. After that, the propagator is the sole driver of apply / undo: each call flips the game’s bits, then walks the incremental updates against the (pre, post) mask delta.

The propagator keeps a parallel transition_log so undo can reverse cache deltas without inspecting the game’s undo stack — two stacks, kept in sync by the rule “only the propagator drives moves during a solve.” Anyone holding a &mut FastGame (player UI, parser tests, future slow grader) can still apply moves; they just don’t get the cache.

3. State that makes propagation cheap

The hot loops touch these structures:

  • Bitmask candidates. Vec<u32> per-coord, one bit per mark in the puzzle’s universe (cap 32 marks, see MAX_MARKS). Apply / undo become mask & !bit / mask | bit. Iteration is a BitIter on the mask. Replaces the original Vec<Vec<Mark>>.
  • Static region indices. Each constraint’s region is resolved once at construction to a Vec<usize> of dense coord indices. Removes the per-eval coord → idx HashMap lookup.
  • First-class LOS regions. Puzzle::los_blocks lets the generic region walker handle Los, LosCross, and Neighbors4 natively — the genre doesn’t reimplement ray walking.
  • Per-constraint aggregates. Each constraint caches committed_with_mark and possible_with_mark for its primary mark. apply / undo keep these in sync; eval_constraint_at reads them in O(1) instead of rescanning the region. Rules without a primary mark (Decided, Distinct, NoTwoByTwo) carry a zero target mask — the update is a no-op and the generic scan handles them.
  • Per-cell counters with a dirty-cell queue. CellMin (the Akari “lit at least once” rule) is lifted out of the AC-3 worklist and onto a dedicated coord-indexed counter with a dirty_cells Vec. A 100×100 Akari has thousands of CellMin rules; routing them through AC-3 would put thousands of constraints in the worklist. Routing through the counter + dirty-cell queue costs a single push per affected coord.
  • trial_order. Coords sorted by descending degree (regular constraints + CellMin rules touching them). Drives the trial-1 sweep order and DFS branch ordering — the branch coord is the first one in trial_order with two or more candidates.
  • FxHashMap for coord_index and mark_to_bit. Cheap hashing replacing the std SipHash default; not visible in any one profile but compounds across construction.
  • Undo log of bit-level deltas. AddBack { idx, bit } and Restore { idx, mask } entries. apply pushes; undo pops.
  • Reusable Vec<Move> buffer. forced_moves_at writes into the caller’s buffer instead of returning a fresh allocation per call. Same trick for cell_min_forced_moves_at.
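The mask iteration mentioned above can be written as an iterator over set bits — a plausible shape for BitIter, not necessarily the real one:

```rust
// Iterate the set bits of a candidate mask: take the lowest set bit
// via trailing_zeros, then clear it with the mask & (mask - 1) trick.

struct BitIter(u32);

impl Iterator for BitIter {
    type Item = u32;
    fn next(&mut self) -> Option<u32> {
        if self.0 == 0 {
            return None;
        }
        let bit = self.0.trailing_zeros();
        self.0 &= self.0 - 1; // clear lowest set bit
        Some(bit)
    }
}

fn main() {
    // Mask 0b1011 = marks 0, 1, and 3 are still candidates.
    let marks: Vec<u32> = BitIter(0b1011).collect();
    assert_eq!(marks, vec![0, 1, 3]);
    println!("{marks:?}");
}
```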

Every one of those is the result of a profiling round (mostly samply call-tree inspection, plus a couple of pprof FlameGraph runs) showing one specific hot frame and chasing it.


4. What worked, in order

Headline numbers below are wall-clock medians from the solve_one example, plus cargo bench summaries. Where two numbers appear, the left is “before the change” and the right is “after.”

Compile profile.

  • lto = "fat" and codegen-units = 1 in the workspace release profile. Modest but free win.

Profiling instrumentation.

  • Optional --pprof flag on the CLI for FlameGraph output.
  • Switched to samply for higher-fidelity call-tree timing on macOS.
  • Throttled Instant::now() sampling inside the propagation loop to 1-in-1000 calls; the original per-iter timestamping was measurably visible in the profile.

Data-structure swaps.

  • HashSet<usize> → Vec<bool> for the AC-3 in_worklist and the dirty-cell in_dirty flag. Hashing was a profile hot spot.
  • std::collections::HashMap → FxHashMap (rustc-hash) for small-key tables.
  • Region indices materialized once at construction (Tier 1).
  • LOS / LosCross / Neighbors4 handled by the generic walker via los_blocks (Tier 2).
  • Per-cell counters with a dirty-cell queue (Tier 3) — replaces a per-coord linear scan that the original AtLeastOne path did.
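The HashSet → Vec<bool> swap above looks roughly like this (illustrative Worklist type, assuming dense constraint indices):

```rust
// Membership tracking for a dedup'd worklist: with dense indices, a
// flag vector replaces a HashSet — no hashing, one bounds-checked read.

struct Worklist {
    queue: Vec<usize>,
    in_worklist: Vec<bool>, // indexed by dense constraint index
}

impl Worklist {
    fn new(n: usize) -> Self {
        Worklist { queue: Vec::new(), in_worklist: vec![false; n] }
    }
    fn push(&mut self, ci: usize) {
        if !self.in_worklist[ci] {
            self.in_worklist[ci] = true;
            self.queue.push(ci);
        }
    }
    fn pop(&mut self) -> Option<usize> {
        let ci = self.queue.pop()?;
        self.in_worklist[ci] = false;
        Some(ci)
    }
}

fn main() {
    let mut wl = Worklist::new(4);
    wl.push(2);
    wl.push(2); // duplicate push is a no-op
    assert_eq!(wl.queue.len(), 1);
    assert_eq!(wl.pop(), Some(2));
    assert_eq!(wl.pop(), None);
    println!("ok");
}
```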

Bitmask candidates. Per-coord u32 instead of Vec<Mark>.

  • This was the biggest single mid-stage win on dense puzzles: the inner candidate loops collapse to a popcount + bit test, and apply/undo become two ops.

status() aggregate-driven evaluation. The “is the puzzle solved / contradicted / in progress?” check was rescanning every constraint. Reading the cached aggregates first short-circuits the common cases. (The variant that also maintained per-constraint “unsatisfied” / “starved” flags was tried and reverted — see §5.)

MRV trial ordering. Sorting trial-1’s coord walk by descending degree (trial_order) was the largest single win on the grader. On 100×100: ~31 s → ~2 s once trial-1 was MRV-ordered. The intuition is that high-degree cells are the ones whose hypothesis collapses the search space fastest; they surface forced eliminations sooner, so trial-1 returns sooner with progress.
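Building the degree-descending order is a one-line sort over precomputed degrees; a sketch with made-up degree values:

```rust
use std::cmp::Reverse;

/// Sort coord indices by descending degree (number of constraints and
/// CellMin rules touching each coord). This drives both the trial-1
/// sweep order and the DFS branch pick. Degrees here are invented.
fn trial_order(degrees: &[usize]) -> Vec<usize> {
    let mut order: Vec<usize> = (0..degrees.len()).collect();
    order.sort_by_key(|&i| Reverse(degrees[i]));
    order
}

fn main() {
    // Coord 2 touches the most constraints, so it is tried first.
    let degrees = [1, 4, 7, 2];
    assert_eq!(trial_order(&degrees), vec![2, 1, 3, 0]);
    println!("{:?}", trial_order(&degrees));
}
```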

Intra-sweep propagation in trial-1. When trial-1 finds an elimination, drain forward propagation in place before moving to the next coord, instead of returning to the outer loop. This avoids the round-trip “trial-1 returns → outer re-runs forward → trial-1 restarts at coord 0.” On 100×100: ~2 s → ~280 ms after this single change. The cost of the inline drain is dwarfed by the work it removes.

MRV branching + trial-1 inside the solver. Lifting the trial-1 sweep and trial_order-driven branch pick out of the grader and into the solver was the last big win:

| size | solver before | solver now | grader |
|------|---------------|------------|--------|
| 50×50 | ~2 s | 3.7 ms | 4.5 ms |
| 100×100 | (didn’t terminate cleanly) | 43 ms | 64 ms |
| 17×17 corpus (criterion) | | ~30% faster on naive | |

The solver now beats the grader on big puzzles, which makes sense: solver stops at the first solution, grader records every step.


5. What we tried and reverted

status() with cached cell_min_unsatisfied / starved counters. Idea: maintain bookkeeping so status() is O(1) by counting how many CellMin rules are currently unsatisfied or starved. Result: helped 100×100 by ~12% but regressed 50×50 by ~10% and the dataset by ~11%. The per-apply maintenance cost swallowed the per-status() win on workloads where status() isn’t actually the bottleneck. Reverted. The cached aggregate fast-path remained; only the extra counters were dropped.

Batch trial-1 eliminations across coords without intervening propagation. Idea: skip the intra-sweep propagation drain and instead apply every elimination found on the full sweep at once. Result: 5–50× slowdown. The reason is subtle: forward propagation between trials was doing a lot of cheap deduction. If you batch, those cheap deductions surface as additional trial-1 hypotheses on the next sweep — paying for them at trial-1 prices instead of forward-propagation prices. Reverted.

Per-floor AtLeastOne constraints. Original encoding for Akari’s “every floor must be lit” rule. Each floor became its own AtLeastOne constraint, which means a 100×100 puzzle pushed thousands of constraint visits through AC-3 per cycle. Replaced by a single CellMin rule routed through the dirty-cell queue. Several construct.rs constraint-count tests broke and were updated to the new counts.

Janko parser without _ for holes. Bug discovered during a round-trip test: to_janko writes _ but parse only accepted - and .. Fixed by adding _ to the hole branch.


6. Things we discussed but haven’t done

Generic clue-cover and lit-witness propagators for Akari. Open question: is there a clean way to express these as genre-supplied propagation rules instead of baking them into the generic constraint walker? Today the only Akari-specific propagation is the CellMin rule; clue-cover (a clue with n unlit neighbors that all need bulbs is fully determined) and lit-witness (the only remaining floor that can light a forced cell must be a bulb) are subsumed by trial-1 today, but a direct propagator would cut the trial-1 cost.

MAC / GAC / AC-2001. Discussed as standard upgrades to the AC-3 worklist. The user leaned toward “make the naive solver as fast as possible” instead — the trial-1 + MRV + intra-sweep work that followed delivered most of the wins these would have. AC-2001 in particular is the natural next step if propagation throughput becomes the bottleneck again.

Trial-2 / trial-N. Higher-order trial search (commit two hypotheses, look for contradictions). Cost goes up combinatorially and most useful on harder puzzles where trial-1 stalls. Not yet needed — trial-1 + branching solves every dataset puzzle in the budget.

Smarter branch heuristics. Today the branch coord is the first trial_order entry with ≥ 2 candidates. Possible refinements: domain-size tiebreak (degenerate for binary, useful for richer domains), least-constraining-value ordering on the candidate side, random restarts. None implemented.

Refactor the grader to be pure human techniques. The optimization machinery now lives in Propagator, separated from FastGame — that work is done. The remaining piece is teaching the grader to drive a different engine: instead of running AC-3 to fixpoint, it should fire one human technique at a time, recording each step under its proper name. That refactor is the next planned change.

Parallelism. rayon is a feature flag and is wired into a couple of construction paths but not the solve loop itself. Per-puzzle parallelism (across the corpus) is the obvious win; per-solve parallelism (multiple branches concurrently) is a much bigger lift and not on the table.

WASM and the editor surface. The editor (ordo/edit) is supposed to consume the same core through WASM. The solver compiles to WASM today but no measurements have been taken — we don’t know where the slow path is in the browser yet.

External solvers

penmark ships its own FastSolver (see the Fast solver chapter for internals) and treats external solvers as feature-gated oracles — useful for cross-validation, useful as fallbacks for genres where writing a domain-specific propagator is hard, but not on the production solve path.

This doc captures what we tried and what we learned, since most of it isn’t obvious from the source.

Headline numbers

100×100 Akari (release build, single thread, scaffold-derived clued puzzles at 0.42 wall density, cached at /tmp/penmark_compare_100x100.json):

| solver | wall-clock | gap to FastSolver |
|--------|------------|-------------------|
| FastSolver (production) | ~10 ms | |
| OrToolsSolver (CP-SAT, tuned) | ~28 ms | 2.8× |
| OrToolsSolver (CP-SAT, defaults) | ~52 ms | 5.2× |
| OrToolsSolver (initial port) | ~115 ms | 11.4× |

compare_ortools (in core/examples/) reproduces this table; profile_ortools breaks the OR-tools call into model-build vs CP-SAT-solve and runs a 30+ parameter sweep so the next round of tuning has a measurement harness ready.

The takeaway is structural: FastSolver is a hand-rolled Akari propagator (per-cell lit_by counter, LOS-cross-aware AC-3, trial-1 with MRV, bitmask state). A general CP solver can’t replicate that without knowing about the puzzle’s lighting structure; it just sees clauses and cardinality constraints. The “10-100×” pitch from CP-SAT marketing assumes a naive backtracker as the baseline, not a tuned propagator.

OR-tools journey

Initial port: 115 ms at 100×100

The first cut used a one-hot encoding: one BoolVar per (coord, mark) pair with add_exactly_one over each coord. For Akari (binary domain) that’s 2× the variables it needs. CP-SAT’s default parameters were also fighting us — the portfolio of subsolvers, presolve, probing, and linearization all charge fixed setup costs that don’t pay off on a 7,000-constraint Boolean satisfaction model.

Win 1: single-bool encoding for binary domains (115 ms → 45 ms, 2.5×)

When every coord shares a 2-mark domain, allocate one BoolVar per coord and use !v as the indicator for the negative mark. BoolVar implements Not so the negation is free. No exactly_one constraints, half the variables. OneHot stays as a fallback for wider domains (Sudoku-style numeric marks).

Win 2: SatParameters tuning (45 ms → 33 ms)

Methodically benchmarked each top-level parameter against the 100×100 corpus. The CP-SAT primer is emphatic about not modifying top-level params — the portfolio of subsolvers is what makes CP-SAT fast — but that advice is tuned for optimization workloads. For pure satisfaction at our scale:

| param | savings |
|-------|---------|
| cp_model_presolve = false | ~7 ms |
| cp_model_probing_level = 0 | ~3 ms |
| linearization_level = 0 | ~2 ms |
| symmetry_level = 0 | ~3 ms |
| use_sat_inprocessing = false | ~1 ms |
| use_phase_saving = false | small |
| stop_after_first_solution = true | small |
| num_search_workers = 1 | (avoids 8-worker overhead) |

These are deliberately against the primer’s guidance because at our scale the portfolio’s fixed setup cost outweighs its wins.

Win 3: specialised constraint encodings (33 ms → 28 ms)

The CP-SAT modelling page points out boolean constraints (add_and, add_or, add_at_most_one, add_exactly_one) are recognised as native clause forms by the SAT presolve, while add_eq on a linear sum routes through the LP relaxation — costlier and less helpful for boolean inputs. Specialising ExactCount { n } over k literals:

  • n = 0 → add_and(negations)
  • n = 1 → add_exactly_one
  • n = k - 1 → add_exactly_one(negations)
  • n = k → add_and(literals)
  • otherwise → linear add_eq (the prior path)

For Akari this hits the 0/1/2/3/4 clue cases that dominate the constraint count.
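The specialisation table reads naturally as a match. A sketch, with `Encoding` as a hypothetical stand-in for the CP-SAT calls the real encode_constraint emits:

```rust
// Sketch of the ExactCount { n } specialisation over k literals.
// `Encoding` names the chosen constraint form; the real code emits
// the corresponding CP-SAT call instead of returning an enum.

#[derive(Debug, PartialEq)]
enum Encoding {
    AndNegations,  // n = 0: every literal false
    ExactlyOne,    // n = 1
    ExactlyOneNeg, // n = k - 1: exactly one literal false
    AndLiterals,   // n = k: every literal true
    LinearEq,      // fallback: add_eq on the linear sum
}

fn encode_exact_count(n: usize, k: usize) -> Encoding {
    match n {
        0 => Encoding::AndNegations,
        1 => Encoding::ExactlyOne,
        _ if n + 1 == k => Encoding::ExactlyOneNeg, // n == k - 1, underflow-safe
        _ if n == k => Encoding::AndLiterals,
        _ => Encoding::LinearEq,
    }
}
```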

Things that didn’t work

subsolvers = ["fixed"] — went down to 21 ms but broke 1×1 (returned empty solution sets). Single-strategy fixed-order DFS skips edge cases the default subsolver handles.

Warm-start via penmark’s Propagator — left as opt-in OrToolsSolverConfig::warm_start (default false). The propagator covers ~99.8% of cells in 5 ms at 100×100. Hinting those to CP-SAT helps at small sizes (5×5–17×17, ~10-20% faster) but tanks at 50×50+ (5× regression): the missing 0.2% (10 cells out of 5832) trips up CP-SAT’s hint-driven search badly. Verified the propagator’s deductions are correct (0/5822 disagreements with FastSolver), so it’s a CP-SAT behaviour issue with high-coverage-but-incomplete hints. Tried add_and pin, fix_variables_to_their_hinted_value, presolve toggling — none recovered the regression. Useful at small sizes; opt-in.

Pinning warm-start cells via add_and instead of hint — 200 ms at 100×100, 6× regression. Without presolve to simplify them, the unit clauses from a single add_and([v1, v2, ..., v5822]) slow down CP-SAT’s search loop. With presolve on, still regressed because the model becomes disconnected fragments that the LP doesn’t help with.

Disabling cp_model_use_sat_presolve only — wash. The other presolve phases dominate.

disable_constraint_expansion — crashed with SolutionIsFeasible assertion failures. Constraint expansion is required for our encoding.

Changing search branching — LP_SEARCH, PSEUDO_COST_SEARCH, and PORTFOLIO_WITH_QUICK_RESTART_SEARCH were all within 0.5 ms of each other. AUTOMATIC_SEARCH (default) wins by being marginally faster on average and not having edge cases.

linearization_level = 0 + lighting=add_ge — 33 ms vs 28 ms with add_or. add_or is recognised as a clause; add_ge(sum, 1) goes through the linear path even with linearization off.

What about pure SAT?

Briefly explored: a SatSolver that translated the same constraint list to CNF + cardinality constraints via the rustsat library and fed it to BatSat / CaDiCaL / Kissat / MiniSat. Numbers at 100×100:

| backend | wall-clock | gap to FastSolver |
|---|---|---|
| Kissat | ~16 ms | 1.6× |
| BatSat | ~18 ms | 1.8× |
| MiniSat | ~18 ms | 1.8× |
| CaDiCaL | ~25 ms | 2.4× |

Pure SAT genuinely beat OR-tools’ general-purpose CP solver on this problem class — Akari is essentially Boolean satisfaction with cardinality, and a modern CDCL solver (Kissat especially) plus rustsat’s totalizer cardinality encoding landed within 60% of the hand-rolled engine. CaDiCaL underperformed despite its reputation — the problem is too small for its strengths (incremental clause learning across many calls, lazy clause minimisation).

The SatSolver lived in the codebase for a few commits before being removed to keep the maintenance surface small; the experiment validated the “SAT-as-IR” approach for genres where we’d otherwise have to write a domain-specific propagator (Slitherlink’s “single closed loop”, Nurikabe’s connectivity, etc.). If we ever bring it back, the encoding is documented in git history (search for algorithm::rustsat).

When external solvers would actually win

At our scale they don’t. Scenarios where they would:

  • Optimization variants. CP-SAT’s branch-and-bound shines with objectives. min unlit cells, max bulbs — the LP relaxation actually earns its keep. FastSolver has no objective machinery.
  • Genres with hard-to-propagate constraints. Slitherlink’s “single closed loop”, Nurikabe’s “all white cells connected”, Yajilin’s path rules. Writing a CP propagator for these is months of work; CP-SAT handles them via lazy clauses with reasonable performance even on first try. Pure SAT (Kissat etc.) was particularly strong here since these genres reduce naturally to CDCL with cardinality constraints — if the SAT path becomes load-bearing, that’s the point to bring rustsat back.
  • Really hard hand-crafted puzzles. Where FastSolver’s trial-1 stalls and DFS branching explodes. CDCL clause learning starts paying off. We haven’t actually hit one in the corpus; if we do, escalation to the external solver becomes useful.
  • Multi-genre infrastructure that doesn’t justify per-genre tuning. If the goal is “solve any reasonable CSP in ≤ 1 s”, CP-SAT or rustsat are the right call. penmark’s bet is the opposite: domain knowledge per genre, with the external solvers as backstops.

Maintenance posture

  • OrToolsSolver lives behind --features ortools. Default builds don’t pull in cp_sat (which links a system OR-tools install plus libprotobuf).
  • encode_constraint is exhaustive on Rule — adding a new variant is a compile error here, not a panic at run time. Every variant in the enum has either an eager encoding (boolean-friendly rules) or a lazy cut (global rules).
  • The criterion akari/solver and slitherlink/solver groups under the consolidated penmark bench include OR-tools as a sibling series when the ortools feature is on, so head-to-head ratios show up in the report alongside FastSolver.
  • Per-puzzle cross-check tests (ortools_agrees_with_fast_* in each genre’s tests.rs) make divergence a unit-test failure, not a silent corpus drift.

Build setup

cargo … --features ortools Just Works on Apple Silicon Homebrew:

brew install or-tools
cargo bench -p penmark-core --features ortools --bench penmark -- akari/solver
cargo bench -p penmark-core --features ortools --bench penmark -- slitherlink/solver

The wiring:

  • penmark/.cargo/config.toml sets ORTOOLS_PREFIX (read by cp_sat’s build.rs) and CXXFLAGS (read by cc-rs when cp_sat compiles its C++ wrapper, so abseil + protobuf headers are reachable).
  • core/build.rs adds the protobuf link line that cp_sat doesn’t emit itself, plus rpaths for both dylibs so binaries don’t need DYLD_LIBRARY_PATH at run time.

Both auto-detect via brew --prefix <pkg> if the env var isn’t set, so the hardcoded /opt/homebrew/... defaults are just a fallback. To use a non-Homebrew install (Linux, Intel Mac, source build), export ORTOOLS_PREFIX / PROTOBUF_PREFIX / CXXFLAGS in your shell — those override the [env] defaults.

If cargo build --features ortools fails with 'absl/...' file not found, your CXXFLAGS doesn’t include the abseil headers. If it fails with Undefined symbols ... google::protobuf::..., the build script didn’t find protobuf — set PROTOBUF_PREFIX.

Rule encoding strategy

CP-SAT 0.4’s Rust bindings expose boolean primitives (add_or, add_at_most_one, add_exactly_one), linear constraints over multi-interval domains (add_linear_constraint), and reified constraints via only_enforce_if. They do not expose add_circuit, add_table, or add_inverse — which leaves the “single closed loop” / “all marked cells connected” family of rules without a clean direct encoding.

The compromise: eager + lazy cuts.

Eager encodings (CP-SAT enforces directly):

| Rule variant | Encoding |
|---|---|
| Distinct | per-mark add_at_most_one over region |
| Sum | add_eq over Σ value · indicator |
| Increasing | pairwise add_lt on per-coord value expressions |
| MinDifference | reified add_or([pos, neg]) over pos → a-b≥n, neg → b-a≥n |
| AtLeastOne | add_or |
| AtMost { n } | add_at_most_one (n=1) or add_le |
| ExactCount { n } | specialised: add_and (n∈{0,k}), add_exactly_one (n∈{1,k-1}), add_eq otherwise |
| DegreeIn { allowed } | add_linear_constraint(sum, allowed-as-singleton-intervals) |
| NoTwoByTwo | per 2×2 block, add_or of 4 negated indicators |
| Polyomino | per coord with mark, add_or([!self, neighbor_indicators…]) |
| Decided | no-op (Binary singleton / OneHot exactly_one already enforces) |
| CellMin | per region cell, add_ge over reach-set indicators (or add_or for min=1) |
| LoopFreezesCounts | per (cells, target), eager ExactCount over cells |

Lazy cuts (CP-SAT solves, post-check, add add_or no-good clause if violated, re-solve until satisfied or infeasible):

  • Connected { mark } — find connected components of marked coords; if more than 1, cut the smallest component as “this exact set must not all be marked together.”
  • Path { mark, closed } — partition drawn-edge subgraph into connected components; for closed, each must be a degree-2 cycle and there must be exactly one; for open, exactly one component with two degree-1 endpoints. Cut violating shapes.
  • BoundedSize { mark, max } — cut any component larger than max.
  • ValueGroupedSize — for each numerically-marked coord, BFS over same-value neighbors; cut if size ≠ value.

The lazy-cut loop terminates because each cut removes at least one boolean assignment from the feasible space. The worst case is exponential in the number of cuts, but in practice well-constrained puzzles converge in a handful of iterations.
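The shape of that loop can be sketched with the solver and the global post-check abstracted as closures. Names and types here are illustrative, not penmark's API — the real loop drives CP-SAT and emits add_or no-good clauses:

```rust
// Sketch of the lazy-cut loop: solve, post-check the global rule,
// add a no-good cut on violation, re-solve until satisfied or
// infeasible. Cuts are represented as forbidden full assignments
// for simplicity; the real cuts are clauses.

#[derive(Debug, PartialEq)]
enum Outcome {
    Satisfied(Vec<bool>),
    Infeasible,
}

fn lazy_cut_loop(
    mut solve: impl FnMut(&[Vec<bool>]) -> Option<Vec<bool>>, // cuts -> assignment
    check: impl Fn(&[bool]) -> Option<Vec<bool>>,             // assignment -> violating cut
    max_iters: usize,
) -> Outcome {
    let mut cuts: Vec<Vec<bool>> = Vec::new();
    for _ in 0..max_iters {
        match solve(&cuts) {
            None => return Outcome::Infeasible, // no assignment avoids every cut
            Some(assignment) => match check(&assignment) {
                None => return Outcome::Satisfied(assignment), // global rule holds
                Some(cut) => cuts.push(cut), // forbid this exact shape, retry
            },
        }
    }
    Outcome::Infeasible // iteration cap reached; treat as exhausted
}
```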

The slitherlink bench shows the lazy path is workable: 4×4 puzzles solve in ~600µs–2ms, 6×6 in ~3-7ms (vs FastSolver’s ~100µs–1ms on 4×4, ~1-13ms on 6×6 — comparable, and OR-tools even beats FastSolver on a hard 6×6 case).

When a future genre needs a rule whose lazy-check encoding is prohibitively slow at the target sizes, that’s the signal to reach for a different external solver (rustsat with Kissat, say) rather than to push the lazy-cut harder.

Generator

Penmark generates puzzles through genre.generate (an fn pointer on the runtime Genre handle). There’s no separate Generator trait — generation is a per-genre function, with a universal default recipe that genres override only when their structural space requires bespoke setup.

impl GenreTrait for SudokuGenre {
    fn generate(
        cfg: &GenConfig,
        cancel: &AtomicBool,
    ) -> Option<Puzzle> {
        // Genre-specific impl — sudoku needs diagonal seeding for
        // solution diversity since the universal recipe's bare-solve
        // returns the same canonical solution every time.
        sudoku_generate(cfg, cancel)
    }
    // …
}

The default behind genre.generate is scaffold_solve_derive:

  1. Scaffold — pin a structural pre-pass (genre.scaffold_passes, selected by name via cfg.scaffold) if any, then sprinkle per-(layer, material) density across remaining Floor cells.
  2. Bare-solveFastSolver finds one valid completion of the scaffolded substrate. Skip + retry if no solution exists.
  3. Derive clues — for each cell whose genre.clue_domain is non-empty, call genre.derive_clue_at(p, sol, coord) and place the returned ClueValue on the puzzle. The default derive_clue_at body dispatches through genre.clue_pattern (Pin, Count{target, region}, or ConnectedSize) and wraps the u8 result in ClueValue::Number; genres whose clue shape doesn’t fit (Tapa runs, future Yajilin arrows) override derive_clue_at directly.
  4. Reduce — drop clues by symmetry orbit while uniqueness holds, stopping at the cfg.clue_density[Layer::Cells] floor. (Or apply genre.clue_post_filters instead, when cfg.clue_filter is set.)
  5. VerifyFastSolver.solve(p, 2).len() == 1 gates the return. Loop on failure until cfg.time_budget elapses or cancel flips.

Akari uses this default verbatim. Sudoku and slither override generate because each invents its own structural seed (sudoku: random diagonal; slither: connected inside-region) before the bare-solve can find a useful completion. Their helpers live in core/src/puzzles/<genre>/generate.rs as crate-private <genre>_generate functions; the GenreTrait::generate impl is a one-line forwarder.

GenConfig

Universal config for every genre’s generate:

pub struct GenConfig {
    pub dims: Dimensions,                        // Rect or Boxed
    pub material_density: HashMap<(Layer, Material), f32>,
    pub clue_density: HashMap<Layer, f32>,        // Reduce floor; default 1.0
    pub symmetry: Symmetry,
    pub seed: Option<u64>,
    pub time_budget: Duration,
    pub scaffold: Option<String>,                 // Genre::scaffold_passes name
    pub clue_filter: Option<String>,              // Genre::clue_post_filters name
    pub extra: HashMap<String, f32>,              // genre-specific knobs
}

extra is the escape hatch for knobs that don’t map onto material or clue density — slitherlink uses extra["inside_density"] to drive loop topology, for example. Genres document the keys they read.

Extension points

Two hooks let a genre tune the universal recipe without overriding generate wholesale:

genre.scaffold_passes()

A static slice of named structural pre-passes. Each runs before sprinkle and pins materials in a deterministic or topologically-constrained pattern. Akari ships six (walled, grid, maze, layered, holes_donut, holes_islands); sudoku and slither ship none.

pub struct ScaffoldPass {
    pub name: &'static str,
    pub apply: fn(&mut Puzzle, Dimensions, &mut dyn rand::RngCore),
    pub respects_symmetry: bool,
}

The user picks one by name via cfg.scaffold. Sprinkle is additive over the pre-pass — it only writes Floor cells, leaving the pre-placed walls / holes alone.

genre.clue_post_filters()

Value-filters that run after clue derivation. When cfg.clue_filter is set, the named filter replaces the density-driven reduce pass. Akari ships seven (only_0..only_4, only_evens, only_odds); sudoku and slither ship none.

pub struct CluePostFilter {
    pub name: &'static str,
    pub allows: fn(value: u8) -> bool,
}

CLI mapping

Each genre exposes two CLI-friendly knobs through GenreTrait:

trait GenreTrait: 'static + Sized {
    const DEFAULT_GEN_DENSITY: f32 = 0.22;
    fn apply_cli_density(density: f32, cfg: &mut GenConfig) { /* default */ }
}

DEFAULT_GEN_DENSITY is the value the CLI’s --density flag falls back to when omitted (akari 0.22, sudoku 0.36, slither 0.45). The runtime Genre struct surfaces both as genre.default_gen_density and genre.apply_cli_density for call-site dispatch. apply_cli_density writes the user’s value into the right GenConfig slot — material density for akari, clue_density[Cells] for sudoku, extra["inside_density"] for slither. Adding a 4th genre with a fundamentally new knob means overriding both.

CLI

penmark generate <genre> --rows R --cols C --count N --density D --seed S --budget-ms B [--no-db] looks up the genre handle by name, calls (genre.generate)(cfg, &cancel) in a loop with per-iteration seed offsets, prints each puzzle in Janko format, and (unless --no-db) inserts into the local SQLite library.

penmark scaffold <genre> --rows R --cols C prints (genre.empty_at)(dims) as a sanity check.

Grader

Grader assigns a difficulty label to a puzzle by simulating a human solver step by step.

pub trait Grader {
    type Config: Default + Serialize + DeserializeOwned;
    const NAME: &'static str;
    fn new(config: Self::Config) -> Self;
    fn grade(&self, puzzle: &Puzzle) -> GradeResult;
}

/// Cross-puzzle grader options. Per-impl Configs typically embed
/// this with `#[serde(flatten)]`.
pub struct GraderParams {
    pub time_budget: Duration,
    pub profile: String,         // grading profile name
    pub max_trial_depth: Option<u32>,
}

StandardGrader works for every genre by firing named human-pattern techniques in priority order; the techniques are generic over rule kind, so genres that share rule shapes share deductions. Genres with bespoke patterns can append their own Technique impls to the list.

Forward logic: techniques and families

A technique is a small named function over (puzzle, game) that observes a DeductionBuilder — recording which cells the pattern matched and what moves they force — and returns nothing. The grader owns the builder’s lifecycle; the technique just describes what it saw. Techniques are grouped into families by their underlying rule shape — Sudoku families like Naked Singles, Hidden Pairs, X-Wing; Akari families like LosCross-singles, clue-saturation, mutual-visibility-elim. Each family is one module; each technique is one function in it.

The Technique trait is defined above (see “Propagation engine”). A technique declares PHASE: Propagation for cheap per-move fixpoint inference, Pattern for the higher-cost matchers that only run when propagation stalls. The Grading section here focuses on Pattern-phase techniques and how the grader sequences them; Propagation-phase techniques participate in the inner propagation loop instead.

TARGETS examples (across both phases):

| Technique | PHASE | TARGETS | Fires on… |
|---|---|---|---|
| Naked Singles (Sudoku) | Propagation | &[RuleKind::Distinct] | Any Sudoku-shape puzzle |
| Bulb-clears-LOS (Akari) | Propagation | &[RuleKind::ExactCount] | Akari |
| Hidden Pairs / X-Wing (Sudoku) | Pattern | &[RuleKind::Distinct] | Sudoku, when propagation stalls |
| Hidden Cage Sum (Killer) | Pattern | &[RuleKind::Sum] | Killer / Sandwich / Frame — skipped on vanilla |
| Bulb-Lit Single (Akari) | Pattern | &[RuleKind::ExactCount] | Akari (clue or floor cells) |
| Trial-1 / Trial-2 / clue-cover | Pattern | &[] | Universal backward fallbacks |

The grader sets up the builder, calls fire, then converts the builder into a Deduction (or skips the step if nothing was recorded). Techniques themselves carry no bookkeeping — no DeductionBuilder::new() at the top, no into_deduction() at the bottom, no Option<…> plumbing. A technique just observes.

Deduction is a library type carrying both halves of the result:

pub struct Deduction {
    /// Cells the pattern matched on — used by the renderer for
    /// highlighting and by traces for explanation. Doesn't affect
    /// solve correctness.
    pub observed: Vec<Coord>,
    /// The moves the technique wants applied.
    pub moves: Vec<Move>,
}

DeductionBuilder is the shared accumulator, library-defined and used identically across every technique in every genre. It handles three pieces of bookkeeping — deduplicating moves (multiple observers may target the same coord), recording an observer only when its moves were novel, and producing an Option<Deduction> based on whether anything was new:

pub struct DeductionBuilder { /* observers, moves, seen-coord set */ }

impl DeductionBuilder {
    pub fn new() -> Self;

    /// Record that `observer` produced these `moves`. Moves whose
    /// target coord was already added by an earlier observer are
    /// dropped. If at least one move actually landed, `observer` is
    /// appended to the observed list; otherwise this is a no-op.
    pub fn observe(
        &mut self,
        observer: Coord,
        moves: impl IntoIterator<Item = Move>,
    );

    /// Pair-observer variant — for techniques whose deduction comes
    /// from two interacting cells (e.g., paired clues, X-Wing's two
    /// rows). Both observers land in the list iff the pair produced
    /// novel moves.
    pub fn observe_pair(
        &mut self,
        a: Coord, b: Coord,
        moves: impl IntoIterator<Item = Move>,
    );

    /// `Some(Deduction)` if any move was recorded, else `None`. Called
    /// by the grader after `fire`, never by the technique.
    pub fn into_deduction(self) -> Option<Deduction>;
}

A canonical technique impl: just the pattern, nothing else.

fn fire<'g, G: Game<'g>>(
    puzzle: &Puzzle, game: &G, b: &mut DeductionBuilder,
) {
    for bulb in bulbs(game) {
        b.observe(bulb, los_cross_from(puzzle, bulb)
            .filter(|&c| game.candidates(c).any(|m| m == Mark::Binary(true)))
            .map(|c| Move::Eliminate { coord: c, mark: Mark::Binary(true) }));
    }
}

The grader calls each registered technique’s fire against a fresh DeductionBuilder, then drains it with into_deduction():

// Inside the grader's forward-logic loop, once per technique:
let mut b = DeductionBuilder::new();
TechniqueX::fire(puzzle, game, &mut b);
if let Some(d) = b.into_deduction() {
    /* apply d.moves, record one SolveStep per move */
}

When grading begins, the grader walks puzzle.constraints() once to compute the set of present RuleKinds, then filters the genre’s registered technique list down to those whose TARGETS are either empty or share at least one kind with that set. The filtered list is iterated in registration order until none fire. Each fire produces one or more SolveSteps in the trace, tagged with the technique name.

This is the “concatenating constraints” pitch from earlier paying its dues at runtime. A vanilla Sudoku grade never invokes any killer-cage technique — they’re filtered out before the inner loop. Adding Killer cages to a puzzle expands its constraint set, which expands the filter’s positive set, which lets the killer-cage techniques back in. The cost of an inapplicable technique is one HashSet::contains check at puzzle-load time, not one fire-and-bail per inner-loop iteration.
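The filter itself is a one-pass set check. A minimal sketch, with RuleKind trimmed to a few illustrative variants and a hypothetical `TechniqueInfo` standing in for the registered technique list:

```rust
use std::collections::HashSet;

// Sketch of the load-time technique filter: keep a technique when its
// TARGETS list is empty (universal fallback) or shares at least one
// RuleKind with the puzzle's present constraint set.

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum RuleKind { Distinct, Sum, ExactCount }

struct TechniqueInfo {
    name: &'static str,
    targets: &'static [RuleKind],
}

fn applicable(
    techniques: &[TechniqueInfo],
    present: &HashSet<RuleKind>,
) -> Vec<&'static str> {
    techniques
        .iter()
        .filter(|t| t.targets.is_empty() || t.targets.iter().any(|k| present.contains(k)))
        .map(|t| t.name)
        .collect()
}
```

On a vanilla Sudoku (present = {Distinct}), a Sum-targeted killer-cage technique drops out of the list before the inner loop ever runs, while the empty-TARGETS fallbacks stay in.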

Backward logic: when forward stalls

When no technique fires, the grader escalates through fallbacks in order. Each fallback is itself a “technique” (named, recorded in the trace, scorable in the profile) — the difference between forward and backward is just which bucket the technique lives in.

  1. Clue-cover analysis (clue-cover). Walks unsatisfied clues and looks for minimum-cover deductions structural to the constraint shape: a clue needing N more bulbs whose remaining LOS has exactly N candidates forces all of them; two clues sharing candidates forcing each other; an ExactCount whose region’s remaining capacity equals its target. Polynomial-time backward reasoning — cheap, no search.

  2. Trial search at depth 1 (trial-1). Pick a candidate move, commit it, run forward + clue-cover. If it reaches contradiction, the opposite move is forced.

  3. Trial search at iterating depths (trial-2, trial-3, …). Iterative deepening up to a configurable wall-clock budget (default 10s). At depth d, the grader runs forward logic + clue-cover plus recursive trial search at depths < d; there is no artificial depth cap beyond the time budget.

When the budget expires, GradeResult::status = TimedOut and the trace is whatever was reached up to that point. The puzzle may still be solvable by deduction at deeper trial — these get bucketed for study but filtered out of curated sets.

Tiers and scores live in profiles, not on techniques

Techniques don’t carry their own tier or score. Both live in a grading profile — a runtime-loadable table mapping technique name → tier → score. The profile is the unit of tuning.

pub struct GradingProfile {
    pub name: String,
    pub entries: Vec<GradingEntry>,
}

pub struct GradingEntry {
    pub technique: String,        // kebab-case technique name
    pub tier: Tier,
    pub score: u64,
}

pub enum Tier { Trivial, Easy, Medium, Hard, Extreme }

GradingProfile is Serialize + Deserialize and loads from TOML. The thesis is “difficulty is relative; recalibration must be cheap” — compiling Rust to retune a curve would be the opposite of cheap.

Profiles live in two places, with different cadences:

  • Shipped defaults, embedded via include_str! at build time from penmark/core/src/puzzles/<genre>/profiles/<name>.toml. Editing one of these does require a rebuild — they’re a maintainer task, the curve we ship by default. Source-of-truth in the repo so they version with the techniques they reference.
  • User overrides, runtime-loaded from ~/.config/penmark/profiles/<genre>/<name>.toml (or --profile-file=<path> on the CLI). These take precedence over the embedded default of the same name. Editing one is a TOML save, no compile, and the next penmark grade picks it up.

The grader walks the user dir first, falling back to the embedded default. So the everyday calibration loop — tweak weights, re-grade the db, eyeball the distribution — runs entirely against user TOMLs. The embedded copies only change when we ship a new release with a different default curve.
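The precedence rule is a two-step lookup. A minimal sketch, with in-memory maps standing in for the user config directory and the include_str! table (names illustrative):

```rust
use std::collections::HashMap;

// Sketch of profile resolution: user overrides shadow the embedded
// default of the same name; anything else falls through to the
// compiled-in copy.

fn load_profile<'a>(
    name: &str,
    user_dir: &'a HashMap<String, String>, // ~/.config/penmark/profiles/<genre>/
    embedded: &'a HashMap<String, String>, // include_str! defaults
) -> Option<&'a String> {
    user_dir.get(name).or_else(|| embedded.get(name))
}
```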

penmark profile-init <genre> writes the embedded default to the user dir, giving you a starting TOML to edit instead of building one from scratch.

A profile TOML reads like:

name = "nikoli-strict"

[[entries]]
technique = "naked-singles"
tier      = "trivial"
score     = 1

[[entries]]
technique = "hidden-pairs"
tier      = "easy"
score     = 4

[[entries]]
technique = "x-wing"
tier      = "hard"
score     = 25

Each genre ships one or more default profiles in its source tree (under puzzles/<genre>/profiles/) so out-of-the-box runs work without any user setup. Editing a default means editing a TOML file in the repo, not Rust source.

Why outside the technique impl: difficulty is relative. Calling naked-singles “Trivial” only means something next to where x-wing lands. Pulling tier and score onto each technique would force every recalibration to ripple through impls; keeping them in a TOML file lets you re-cut the whole curve, save the file, re-grade, repeat — no compile in the loop.

penmark profile-eval <genre> --profile=<name> runs the active profile against the db and prints the difficulty distribution, so calibration has a tight feedback loop without any code changes.

Profile loading fails loudly on mismatch. TOML technique names and Rust Technique::NAME constants are decoupled, so a rename in code would silently invalidate stale TOML entries. The grader validates the profile against the genre’s registered techniques at load time and errors immediately with the offending name(s) and the list of valid names. Same check on a registered technique that has no entry. No silent fallbacks — fix the profile or the rename.
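A sketch of that cross-check, assuming the grader has both name lists in hand (the signature and error format are illustrative):

```rust
use std::collections::HashSet;

// Sketch of load-time profile validation: error on profile entries
// naming unknown techniques AND on registered techniques with no
// entry, reporting the offending names plus the valid list.

fn validate(profile_names: &[&str], registered: &[&str]) -> Result<(), String> {
    let reg: HashSet<&str> = registered.iter().copied().collect();
    let prof: HashSet<&str> = profile_names.iter().copied().collect();

    let unknown: Vec<&str> =
        profile_names.iter().copied().filter(|n| !reg.contains(n)).collect();
    let unscored: Vec<&str> =
        registered.iter().copied().filter(|n| !prof.contains(n)).collect();

    if unknown.is_empty() && unscored.is_empty() {
        Ok(())
    } else {
        Err(format!(
            "unknown in profile: {:?}; registered but unscored: {:?}; valid names: {:?}",
            unknown, unscored, registered
        ))
    }
}
```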

Adding a new technique means: write the Rust impl, register it, add an entry to the default profile TOML. Three edits.

GradeResult

pub struct GradeResult {
    /// How grading concluded. `Tier` is purely difficulty; operational
    /// state lives here.
    pub status: GradeStatus,
    /// Highest tier any step reached. Meaningful in every status —
    /// for `Solved` it's the puzzle's final difficulty; for `TimedOut`
    /// or `Unsolvable` it's the deepest tier reached during the
    /// attempt.
    pub difficulty: Tier,
    /// Sum of per-step scores under the active grading profile.
    pub hard_score: u64,
    /// Ordered trace of the steps the grader took.
    pub steps: Vec<SolveStep>,
}

pub enum GradeStatus {
    /// Forward + backward logic completed; puzzle solved.
    Solved,
    /// The grader's time budget expired before reaching a solution.
    /// The trace is whatever was reached.
    TimedOut,
    /// The grader proved no solution exists (backward logic forced a
    /// contradiction at the root).
    Unsolvable,
}

pub struct SolveStep {
    pub technique: &'static str, // technique name (forward or backward)
    pub observed: Vec<Coord>,    // cells the pattern matched on
    pub mv: Move,                // the deduction's move
    pub depth: u32,              // 0 forward, ≥1 trial search
}

Tier is purely a difficulty bucket; “we ran out of budget” is GradeStatus::TimedOut, “the puzzle is broken” is GradeStatus::Unsolvable. Both persist alongside puzzles in the db, indexed separately for curation queries — “show me solved Hard puzzles” and “show me puzzles that timed out at any tier” are two different filters.

Enumerator

Enumeration is genre.enumerate — an fn pointer on the runtime &'static Genre handle that streams uniquely-solvable puzzles for a given size, calling on_puzzle for each hit and on_stats periodically for progress UIs.

pub enumerate: fn(
    &EnumConfig,
    &AtomicBool,
    &mut dyn FnMut(&Puzzle),
    &mut dyn FnMut(EnumStats),
) -> String,

The default behind genre.enumerate is walk_canonical — a true canonical-form walker shared across every genre, driven entirely by the runtime Genre handle (genre.material_vocab, genre.derive_clue_at, genre.clue_domain, genre.empty_at, genre.build_constraints). No genre overrides it today.

Clue derivation per cell goes through genre.derive_clue_at(p, sol, coord) -> Option<ClueValue>. The default body dispatches through genre.clue_pattern (the canonical Pin / Count / ConnectedSize shortcut for u8 genres); future multi-byte clue genres (Tapa, Yajilin) override derive_clue_at directly. See the ClueValue discussion in Traits → Puzzle for the full picture.

The walker has three nested phases:

  1. Layouts. Walk the cell-layer material space at the given size, in canonical-form order under D4 (square) / D2 (non-square) lex-min. Genres with empty material_vocab (sudoku, slitherlink) collapse to a single all-floor layout.
  2. Completions. Bare-solve each layout to enumerate the legal full assignments under the genre’s interior rules (no clues active). Skip completions whose maximally-clued puzzle isn’t uniquely solvable — no clue subset of an ambiguous full-clue set can be unique.
  3. Clue subsets. For each accepted completion, walk lex-ascending k-combinations of clueable cells for k = 1, 2, … Smallest-puzzle-first ordering. Stabilizer-based dedup filters out clue arrangements that are D4-images of each other within a single layout’s orbit, before invoking the solver.
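The lex-min membership test behind the phase-1 canonical ordering (and the D4 image comparisons in the phase-3 dedup) can be sketched over a plain square grid. `Vec<Vec<u8>>` is an illustrative stand-in for the cell-layer material grid; the real walker works on the substrate and also handles the D2 non-square case:

```rust
// Sketch: a square layout is canonical iff it equals the lexicographic
// minimum over its eight D4 images (four rotations, each with an
// optional reflection). Vec's derived Ord gives row-major lex order.

fn rotate(g: &Vec<Vec<u8>>) -> Vec<Vec<u8>> {
    // 90-degree clockwise rotation: out[r][c] = in[n-1-c][r].
    let n = g.len();
    (0..n).map(|r| (0..n).map(|c| g[n - 1 - c][r]).collect()).collect()
}

fn reflect(g: &Vec<Vec<u8>>) -> Vec<Vec<u8>> {
    // Horizontal mirror: reverse each row.
    g.iter().map(|row| row.iter().rev().copied().collect()).collect()
}

fn d4_canonical(g: &Vec<Vec<u8>>) -> Vec<Vec<u8>> {
    let mut best = g.clone();
    let mut cur = g.clone();
    for _ in 0..4 {
        cur = rotate(&cur); // after 4 rotations cur is back to g
        for img in [cur.clone(), reflect(&cur)] {
            if img < best {
                best = img;
            }
        }
    }
    best
}

fn is_canonical(g: &Vec<Vec<u8>>) -> bool {
    *g == d4_canonical(g)
}
```

The walker never materialises non-canonical layouts in full; this predicate is the acceptance test its enumeration order is built around.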

A monotonicity short-circuit runs alongside the subset walk: if a candidate subset M contains some previously-confirmed unique subset R as a strict superset, M must also be unique (extra clues are consistent with the same bare-solve completion, so adding them can’t introduce a second solution). The walker tracks recorded unique masks in popcount-bucketed Vec<u64> per completion and skips the solver call when M is subsumed. Solving was ~90% of per-subset cost, so this typically yields an order-of-magnitude speedup once minimal-unique anchors fill in at low k.
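The subsumption test is a scan over the lower popcount buckets. A sketch with `u64` clue masks (one bit per clueable cell), mirroring the popcount-bucketed storage described above:

```rust
// Sketch of the monotonicity short-circuit: a candidate clue mask M is
// skipped when some previously-recorded unique mask R is a strict
// subset of M (R & M == R). Buckets are indexed by popcount so only
// strictly smaller masks are ever scanned.

struct UniqueMasks {
    by_popcount: Vec<Vec<u64>>, // by_popcount[k] = recorded masks with k bits set
}

impl UniqueMasks {
    fn new(bits: usize) -> Self {
        UniqueMasks { by_popcount: vec![Vec::new(); bits + 1] }
    }

    fn record(&mut self, mask: u64) {
        self.by_popcount[mask.count_ones() as usize].push(mask);
    }

    /// True iff some recorded mask with strictly fewer bits is a
    /// subset of `m` — in which case `m` is unique without solving.
    fn subsumes(&self, m: u64) -> bool {
        let k = m.count_ones() as usize;
        self.by_popcount[..k].iter().flatten().any(|&r| r & m == r)
    }
}
```

Because the walk goes smallest-k first, minimal unique subsets land in the low buckets early, and most larger candidates are answered by this bit test instead of a solver call.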

See notes/canonical-enumeration.md for the full design rationale, soundness arguments for the completions skip, and follow-on directions (value-relabeling-aware dedup, beautification).

EnumConfig

pub struct EnumConfig {
    pub dims: Dimensions,
    pub limit: usize,             // 0 = unlimited
    pub time_budget: Duration,
    pub holes: bool,              // include Material::Hole in the cell alphabet
    pub max_bare_solutions: usize, // cap on completions per layout (default 10_000)
}

pub struct EnumStats {
    pub layouts_examined: u64,
    pub total_layouts: u64,
    pub completions_examined: u64,
    pub subsets_examined: u64,
    pub unique: u64,
}

holes widens akari from 2-state (floor/wall) to 3-state (floor/wall/hole). Other genres don’t declare Hole on the cell layer, so the flag is a no-op for them. Default is true; set to false for larger akari grids where the 3-state space blows up.

max_bare_solutions caps the bare-solve completion enumeration per layout. Sudoku 4×4 has exactly 288 completions; slither 4×4 has thousands. The cap mostly exists so a pathological scaffold with millions of legal completions can’t hang the walker.

The return type is String — a free-form “stop reason” the caller echoes back ("exhausted", "limit", "deadline", "cancelled", or "bad-config:<why>").

Why callbacks, not iterators

Every real caller wants two things an iterator handles awkwardly:

  • Progress reporting, since canonical-form scans can examine millions of layouts even at small sizes; silent UIs are broken.
  • Early exit on multiple conditions — output limit, wall-clock budget, user cancel. Returning an iterator forces every consumer to re-thread those exit conditions through the pull loop; baking them into the contract keeps callers trivial.

cancel is a separate &AtomicBool rather than a config field so EnumConfig stays Serialize + DeserializeOwned — runtime cancellation has no business in a serialized config.

Discovery and coverage (planned)

The discovery + coverage feedback loop is described elsewhere — pair the enumerator with the technique library and the trial-search oracle to mine specimens (small puzzles annotated with what was hard about them), then run every technique against every persisted specimen for per-technique hit counts and coverage residue.

The CLI verbs discover and coverage are placeholders for that loop; the mining loop runs ad-hoc against Genre::enumerate until those land.

Frontends

The two shipped surfaces: the penmark CLI for batch operations and the eframe desktop app for interactive use. WASM is planned but not yet built.

CLI

penmark <verb> <genre> [options]. The genre is a required positional, never a flag — every command knows up front which genre it’s running, and a missing genre is an immediate error rather than a subtle dispatch into the wrong defaults.

The verb set is still in flux; what’s wired today:

| Verb | Purpose |
|---|---|
| solve | Solve every puzzle in a canonical dataset; report counts and timings. |
| grade | Grade one puzzle against a profile and print the trace. |
| grade-canon | Grade every puzzle in a canonical dataset against a profile; print difficulty distribution. |
| generate | Full pipeline — scaffold + solve + derive + reduce. Wired for akari, sudoku, slitherlink. |
| scaffold | Phase 1 only — wall/hole structural pass, no solver, no clues. |
| enumerate | Walk uniquely-solvable puzzles of a given size for a genre. |
| import | Read puzzlehound’s collated output and write canonical records to data/puzzles/<genre>.jsonl.gz. |
| profile-init | Copy the embedded default profile to ~/.config/penmark/profiles/<genre>/ for editing. |
| profile-eval | Run a profile against a canonical dataset; print the difficulty distribution. |
| list-impls | Print the hand-maintained list of genres + algorithms. |

The verbs discover (capture specimens at every trial-search step) and coverage (per-technique hit-count report against persisted specimens) are described in the Enumerator chapter but haven’t shipped as CLI verbs yet. The mining loop runs ad-hoc against the enumerator until those land.

Config knobs are clap flags on the universal Cmd enum. There are no per-genre CLI files — GenericCli is a single non-generic impl that handles every command for every genre by dispatching through the runtime &'static Genre handle (genre.default_gen_density, genre.apply_cli_density, genre.parse_text, …). Adding a new option means one clap field on Cmd::Generate (or wherever) plus reading it from GenericCli::generate.

Parallelism is op-by-op: generate and the enumerator fan out via rayon::par_iter across puzzles in the work queue; one puzzle’s solve/grade is sequential. --threads=N is consistent where it applies and ignored on single-puzzle ops.
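The fan-out shape is worth a sketch. The real code uses rayon::par_iter; to stay dependency-free this sketch stands in std::thread::scope, and `solve_one` is replaced by a dummy computation. Only the structure is the point: the queue is split across threads, and each puzzle's solve stays sequential.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Split a work queue of "puzzles" across N threads; each item is
/// processed sequentially inside its thread. Dependency-free stand-in
/// for the rayon::par_iter fan-out described in the text.
fn solve_all(puzzles: &[u32], threads: usize) -> usize {
    let solved = AtomicUsize::new(0);
    let t = threads.max(1);
    let chunk = ((puzzles.len() + t - 1) / t).max(1); // ceil division
    thread::scope(|s| {
        for batch in puzzles.chunks(chunk) {
            let solved = &solved;
            s.spawn(move || {
                for &p in batch {
                    let _answer = p * 2; // stand-in for a sequential solve
                    solved.fetch_add(1, Ordering::Relaxed);
                }
            });
        }
    });
    solved.load(Ordering::Relaxed)
}
```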

Directory layout

Penmark lives at threeemojis/penmark/ — its own top-level project, not nested under ordo. The repo is a Cargo workspace:

threeemojis/penmark/
├── Cargo.toml              — workspace manifest
├── core/                   — engine: traits, library types, all algorithms
├── cli/                    — `penmark` binary
├── eframe/                 — local desktop app
├── data/                   — canonical per-genre datasets + import inputs
│   ├── puzzles/<genre>.jsonl.gz   — what the engine reads
│   └── dailyakari/                — one source's raw download script
└── scripts/                — bench-all, fetch-datasets, …

Inside core/src/:

core/src/
├── lib.rs                  — crate root, re-exports
├── coord.rs                — Coord, OrthoDir, DiagDir, wire encoding
├── substrate.rs            — Substrate, Material, MaterialVocab, Dimensions
├── mark.rs                 — Mark
├── region.rs               — Region + resolution against a puzzle
├── constraint.rs           — Constraint, Role, Rule, Violation, RuleKind
├── outcome.rs              — Move, ConstraintOutcome, Status
├── puzzle.rs               — Puzzle struct + GenreTrait (implementation interface)
├── genre.rs                — runtime Genre struct + per-genre &'static Genre handles
├── overlay.rs              — Overlay vocabulary (Highlight / Conflict / LoopFill / …)
├── render.rs               — MarkRenderStyle (Lamp / Glyph / EdgeStroke)
├── tagged.rs               — tagged-JSON wire format + Marks
├── game.rs                 — Game trait
├── cell_counter.rs         — CellCounter declarations
├── symmetry.rs             — orbit-based canonical-form helpers
├── db.rs                   — SQLite-backed persistence
├── algorithm/
│   ├── mod.rs              — Solver, Grader traits, shared *Params
│   ├── eval.rs             — generic constraint evaluator
│   ├── geometry.rs         — region / reach geometry helpers
│   ├── generate.rs         — GenConfig + scaffold_solve_derive
│   ├── enumerate.rs        — EnumConfig + walk_canonical
│   ├── profile.rs          — GradingProfile loading + validation
│   ├── fast/               — FastGame + Propagator + FastSolver
│   ├── simple/             — SimpleGame + SimpleSolver (brute-force ref)
│   ├── grader/             — StandardGrader + Technique trait + standard techniques
│   └── ortools/            — OR-Tools CP-SAT wrapper (cross-check oracle)
└── puzzles/                — one subdirectory per genre
    ├── pzv.rs              — shared pzv-encoding helpers
    ├── store.rs            — shared per-genre JSONL store
    ├── akari/              — mod + parse + pzv + scaffold + variants + tests
    ├── sudoku/             — mod + parse + pzv + generate + tests
    └── slitherlink/        — mod + parse + pzv + generate + tests

Per-genre directories vary in shape — some genres ship a generate.rs (when the universal recipe needs an override), some ship a scaffold.rs (akari’s wall-style + clue-filter slices). The required files are just:

  • mod.rs — with the GenreTrait impl on the marker struct.
  • parse.rs — the text-format parser.

Enumeration runs through the universal algorithm::enumerate::walk_canonical for every genre — there’s no per-genre enumerate.rs to write.

Adding a new genre is a new directory under core/src/puzzles/ plus a one-line entry each in cli/src/main.rs::genres() and eframe/src/main.rs’s genre list — see the Adding a genre chapter.

Inside the frontends:

cli/src/
├── main.rs                 — clap setup, dispatches to GenericCli
├── common.rs               — shared CLI plumbing (load_dataset, summaries)
├── generic.rs              — GenericCli — one impl handles every genre via &'static Genre
├── genre.rs                — GenreCli trait (dyn dispatch surface)
├── grade_dataset.rs        — grade against canonical datasets
├── import.rs               — puzzlehound output → canonical pipeline
└── profile.rs              — profile-init / profile-eval

eframe/src/
├── main.rs                 — egui app shell + genre routing
├── common.rs               — shared widget code
├── genre.rs                — GenreApp trait
├── games.rs                — universal app behavior dispatching on puzzle.genre
├── painter.rs              — universal puzzle painter (dispatches on genre.mark_render)
├── harness/                — solve/generate/grade harness panels
├── dataset_browser.rs      — browse data/puzzles/<genre>.jsonl.gz
└── db_browser.rs           — browse the SQLite db

Both frontends carried per-genre files pre-collapse; they’re now single implementations that dispatch on the &'static Genre handle carried by each Puzzle. The fn-pointer table in Genre answers every question they used to outsource to per-genre code.

Inside the bench harness:

core/benches/
├── penmark.rs              — single bench file, dispatches all genres
├── common/mod.rs           — solver_bench<G>, grader_bench<G>,
│                             generator_bench<G>, pipeline_bench<G>,
│                             load_fixtures<G>, load_dataset_corpus<G>
└── fixtures/<R>x<C>.json   — checked-in akari fixture corpora

cargo bench --bench penmark -- akari/solver filters via criterion’s name match.

Adding a new puzzle genre

End-to-end checklist for landing a new genre. Akari, sudoku, slitherlink, and takuzu are the worked examples — open them side by side when following along. Takuzu is the smallest of them (single mod.rs + parse.rs + tests.rs, no pzv codec, no scaffold passes), so start there if you’re after the minimum viable genre.

The post-collapse architecture means a genre is now mostly one file: core/src/puzzles/<genre>/mod.rs with a GenreTrait impl on a zero-sized marker. Per-genre CLI files, eframe files, and bench files no longer exist — the universal GenericCli, eframe app, and bench harness all dispatch through the runtime &'static Genre handle, so adding a genre to the live binaries is one entry in the GENRES slice plus a few re-exports.


TL;DR

Two scenarios:

  1. The genre’s rules already have Rule variants — common case once Connected, NoTwoByTwo, ValueGroupedSize etc. land. You’re writing the genre marker + GenreTrait impl + parsers + a GenreTrait::generate override (when the universal recipe isn’t enough) + tests + UI wiring + bench wiring. No engine changes.

  2. You need a new Rule variant — same as (1), plus you add the variant to Rule, encode it in eval.rs, optionally land a propagator fast path, and land an OR-tools encoding (the match in algorithm::ortools::solver is exhaustive — your build won’t compile without it).

The order below assumes (1); the “New Rule variant” sidebar folds in (2).


File map

A complete genre touches:

penmark/
├── core/
│   ├── src/
│   │   ├── lib.rs                        ← add `pub mod <genre>;` under puzzles
│   │   ├── genre.rs                      ← add `pub static MYGENRE: &Genre = genre_static!(MyGenre);`
│   │   │                                    + append to `GENRES` slice
│   │   └── puzzles/
│   │       └── <genre>/
│   │           ├── mod.rs                ← marker struct + impl GenreTrait + type alias + constructors
│   │           ├── parse.rs              ← Janko text format parser + writer
│   │           ├── pzv.rs                ← puzz.link / pzv URL parser
│   │           ├── generate.rs           ← (optional) Genre::generate override
│   │           ├── scaffold.rs           ← (optional) ScaffoldPass + CluePostFilter slices
│   │           ├── tests.rs              ← unit tests + ortools cross-checks
│   │           └── inspect.rs            ← (optional) solution accessors
│   └── benches/penmark.rs                ← add `register_<genre>(c)` call
├── cli/
│   └── src/main.rs                       ← genres come from `core::genre::GENRES` — no manual list
├── eframe/
│   └── src/main.rs                       ← same — iterates `GENRES`
└── data/
    └── puzzles/<genre>.jsonl.gz          ← canonical per-genre store, populated by
                                            `penmark import`. Nothing to commit by hand.

There’s no per-genre enumerate.rs — the universal walk_canonical covers enumeration for every genre using only the existing trait surface (material_vocab, clue_pattern, clue_domain, empty_at, build_constraints). See the Enumerator chapter for the walker’s algorithm.

The optional files only exist when the genre needs them:

  • generate.rs — only if the universal scaffold_solve_derive recipe doesn’t produce diverse output. Sudoku and slither override because they invent topology before the bare-solve can run; akari uses the default.
  • scaffold.rs — only if you ship ScaffoldPass / CluePostFilter slices. Akari ships six wall styles and seven clue filters; sudoku and slither ship none.
  • inspect.rs — only when your solution shape (e.g. akari bulb positions) needs accessors that don’t fit the generic Solution.

Step-by-step

Throughout, <Genre> stands for the title-case name (e.g. Masyu) and <genre> for the kebab-case slug.

1. Pick your substrates

Look at the substrate layers in core/src/substrate.rs — Cells, EdgesH, EdgesV, Corners. Akari uses just Cells; slitherlink uses EdgesH + EdgesV + Corners. Most loop genres need edges + corners; most cell-marking genres need just cells.

2. Pick your rules

Walk core/src/constraint.rs::Rule and pick the variants whose semantics match your puzzle.

| Genre | Rules |
|---|---|
| akari | ExactCount, AtMost, AtLeastOne, CellMin, Decided |
| slitherlink | ExactCount, DegreeIn, Path { closed: true } |
| sudoku | Pin, Distinct |
| masyu (planned) | Path { closed: true }, DegreeIn, plus MasyuBlack / MasyuWhite |
| nurikabe (planned) | ValueGroupedSize, Connected, NoTwoByTwo |

If every rule already exists and has an eval.rs arm that’s not stubbed Pending, you’re in scenario (1). Otherwise see the New Rule variant sidebar.

3. Genre module: core/src/puzzles/<genre>/mod.rs

Define the genre marker, the (vestigial) type alias, impl GenreTrait, and constructors. The trait surface is large but most methods have sensible defaults. Required:

pub type MyPuzzle = Puzzle;  // vestigial; just shorthand for Puzzle

#[derive(Clone, Copy, Debug, Default)]
pub struct MyGenre;

impl GenreTrait for MyGenre {
    const NAME: &'static str = "mygenre";
    const DISPLAY_NAME: &'static str = "MyGenre";
    const LAYERS: &'static [Layer] = &[Layer::Cells];
    const MARK_RENDER: MarkRenderStyle = MarkRenderStyle::Glyph;

    /// Runtime handle — must match the literal added in genre.rs.
    const GENRE: &'static crate::genre::Genre = crate::genre::MYGENRE;

    fn full_domain(p: &Puzzle, _coord: Coord) -> Vec<Mark> {
        // For binary genres: vec![Mark::Binary(false), Mark::Binary(true)]
        // For numeric: (1..=p.rows() as u8).map(Mark::Numeric).collect()
    }

    fn default_substrate(dims: Dimensions) -> Substrate { /* ... */ }

    fn build_constraints(
        substrate: &Substrate,
        clues: &Grid<Option<ClueValue>>,
    ) -> Vec<Constraint> {
        // Walk substrate + clue grid. For u8 genres, use the
        // convenience helper `for_each_clue_byte` which
        // pattern-matches `ClueValue::Number(_)` for you.
    }

    fn clue_pattern() -> CluePattern {
        // Pin / Count{target,region} / ConnectedSize{target}.
        // The convenience layer over `derive_clue_at` — see below.
    }

    fn sample() -> Puzzle { /* small demo puzzle */ }
}

The macro in core/src/genre.rs coerces these monomorphized items into the runtime Genre struct’s fn-pointer fields:

// core/src/genre.rs
pub static MYGENRE: &Genre = genre_static!(MyGenre);

pub static GENRES: &[&Genre] = &[AKARI, SLITHER, SUDOKU, TAKUZU, MYGENRE];

Optional overrides used by various genres:

  • material_vocab() — non-Floor materials available on the cells layer (akari: [Wall, Hole]).
  • clue_host_material() — substrate material that holds clue cells (default Floor; akari overrides to Wall; slither to LabelOnly). Drives Puzzle::is_clue_target.
  • clue_domain(p, coord)Vec<u8> of valid clue values at the coord; empty means no clue allowed there.
  • cell_clue(p, r, c) — read non-Pin clues from constraints (akari’s ExactCount over Neighbors4, slither’s over edges). Returns the display shorthand: a single u8. Multi-byte payloads fold to their first byte.
  • extract_clues(p) — substrate-filtered clue grid (akari only walls hold clues). Returns Grid<Option<ClueValue>>; u8 genres emit ClueValue::Number(_).
  • on_clue_added(p, r, c, v) — promote material on clue add (akari promotes Floor → Wall). v: &ClueValue.
  • derive_clue_at(p, sol, coord) -> Option<ClueValue> — inverse of cell_clue. The walker calls this once per clueable cell to derive the canonical clue from a bare-solve completion. Default body dispatches through clue_pattern() and wraps in ClueValue::Number. Override directly when your clue shape doesn’t fit Pin / Count / ConnectedSize (Tapa run sequences, future Yajilin arrows).
  • dims_for(rows, cols) — sudoku rounds to nearest factor pair and returns Boxed{box_w, box_h}.
  • format_value(v) — sudoku overrides for hex / base-36 on big grids.
  • overlays(p, marks, status) — paint overlays (akari lit cells, sudoku candidates + conflicts, slither loop fill at solve).
  • from_janko_text / from_puzzlink_url / to_puzzlink_url — text-format parsers and writers.
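The Count clue pattern's inverse relationship (cell_clue reads the clue back out; derive_clue_at computes it from a completion) is easy to illustrate in isolation. A minimal sketch of akari-style count derivation, assuming a toy boolean solution grid rather than Penmark's actual Puzzle / Solution types:

```rust
/// Derive a Count-style clue from a solved grid: the clue at (r, c)
/// is the number of orthogonally adjacent `true` cells (lamps, in the
/// akari reading). Types and names here are illustrative stand-ins.
fn derive_count_clue(solution: &[Vec<bool>], r: usize, c: usize) -> u8 {
    let (rows, cols) = (solution.len() as isize, solution[0].len() as isize);
    let mut n = 0u8;
    for (dr, dc) in [(-1isize, 0isize), (1, 0), (0, -1), (0, 1)] {
        let (nr, nc) = (r as isize + dr, c as isize + dc);
        if nr >= 0 && nr < rows && nc >= 0 && nc < cols
            && solution[nr as usize][nc as usize]
        {
            n += 1;
        }
    }
    n
}
```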

Multi-byte clue payloads. The four shipped genres carry single-byte clues, so ClueValue::Number(_) is the only variant they emit. Genres whose clues are richer (Tapa’s run-length sequences, future Yajilin arrows) use ClueValue::Runs(_) or a new variant, and override derive_clue_at directly instead of relying on clue_pattern. The ClueValue enum is closed — adding a variant is one enum entry plus exhaustive-match warnings pointing to every site that needs an arm.

4. Generator hook

The universal GenreTrait::generate default is scaffold_solve_derive: scatter materials per material_vocab, bare-solve, derive clues per clue_pattern, reduce by orbit. Akari uses this verbatim.

Override GenreTrait::generate when the recipe doesn’t produce diverse output. The override forwards to a crate-private helper:

fn generate(cfg: &GenConfig, cancel: &AtomicBool) -> Option<Puzzle> {
    generate::mygenre_generate(cfg, cancel)
}

The helper in core/src/puzzles/<genre>/generate.rs reads cfg.dims, cfg.seed, cfg.time_budget, plus whatever cfg.extra / cfg.clue_density keys make sense for the genre. Sudoku reads cfg.clue_density[Cells] as target_givens; slither reads cfg.extra["inside_density"].

Two CLI knobs you’ll usually also override:

const DEFAULT_GEN_DENSITY: f32 = 0.36;  // CLI's --density default
fn apply_cli_density(d: f32, cfg: &mut GenConfig) {
    // Map --density into the right cfg slot for your genre.
}

If you ship structural pre-passes (border ring, maze carving, hole-pin patterns) or clue value filters (only-evens etc), declare them in core/src/puzzles/<genre>/scaffold.rs and override scaffold_passes() / clue_post_filters(). See core/src/puzzles/akari/scaffold.rs for the template.

5. Parser: core/src/puzzles/<genre>/parse.rs

Two formats per genre:

  • Janko — small-grid text format from janko.at. Useful for hand-built fixtures and dataset smoke tests.
  • puzz.link / pzv — URL format used by puzz.link. Lives in pzv.rs; puzzlekit datasets ship in this format.

Round-trip in tests: parse(s).to_janko() == s for at least a 3×3 plus a corner case.
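The round-trip invariant itself is format-agnostic. A toy version, assuming a whitespace-separated grid rather than the real Janko grammar — the point is only the parse(s) then write-back equality the tests pin down:

```rust
/// Toy grid parser/writer in the shape the round-trip tests require.
/// NOT the real Janko grammar; cells are whitespace-separated tokens.
fn parse(s: &str) -> Vec<Vec<String>> {
    s.lines()
        .map(|line| line.split_whitespace().map(str::to_string).collect())
        .collect()
}

fn to_text(grid: &[Vec<String>]) -> String {
    grid.iter()
        .map(|row| row.join(" "))
        .collect::<Vec<_>>()
        .join("\n")
}
```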

6. Tests: core/src/puzzles/<genre>/tests.rs

Minimum:

  • coord_count_formula_holds_for_various_sizes — coord enumeration matches grid math.
  • empty_NxN_constraint_count — right number of Constraints for an empty puzzle.
  • parse_janko_round_trip over a fixed string.
  • parse_puzzlink_round_trip from a real-corpus URL.
  • simple_and_fast_agree_on_* over a hand-built puzzle whose solution count is known.
  • ortools_agrees_with_fast_* (#[cfg(feature = "ortools")]) — at least one cross-check. Catches engine drift.

7. Module registration: core/src/lib.rs

One line: add pub mod <genre>; in the pub mod puzzles { ... } block.

8. Genre registration: core/src/genre.rs

Two edits:

pub static MYGENRE: &Genre = genre_static!(MyGenre);

pub static GENRES: &[&Genre] = &[AKARI, SLITHER, SUDOKU, TAKUZU, MYGENRE];

That’s it for CLI and eframe — both iterate GENRES at startup and dispatch on the runtime handle. There is no separate genres list to maintain in cli/src/main.rs or eframe/src/main.rs.

9. Frontend rendering

The eframe app’s painter dispatches on puzzle.genre.mark_render (Lamp / Glyph / EdgeStroke / …); clicks dispatch on (mode, mark_render); overlays come from (puzzle.genre.overlays)(puzzle, marks, status).

If your genre needs a render style none of the existing variants cover, add a MarkRenderStyle variant in core/src/render.rs and a paint pass in eframe/src/painter.rs::paint_marks. Most new genres won’t.

10. Bench registration: core/benches/penmark.rs

The bench harness iterates GENRES and dispatches solver / grader / generator / pipeline benchmarks for each. Most genres don’t need manual wiring beyond appearing in GENRES; genres that ship a hand-curated corpus instead of generated fixtures register that corpus in core/benches/common/mod.rs. Filter the bench matrix via criterion’s name match: cargo bench --bench penmark -- mygenre.

11. Dataset coverage — automatic

The pr_report harness in penmark-eval walks data/puzzles/<genre>.jsonl.gz end-to-end (FastSolver + grader, bucketed by rows × cols) for every PR run. Once your genre has records under data/puzzles/<genre>.jsonl.gz, the PR comment picks it up automatically — no per-genre test file to maintain. Earlier iterations of penmark kept hand-rolled <genre>_dataset_smoke.rs tests for this; they were retired in favor of the per-genre dataset phase in the eval pipeline.

12. Sanity check

cargo build --workspace --features ortools
cargo test -p penmark-core --features ortools
cargo bench --bench penmark -- mygenre  # smokes the bench matrix
cargo run -p penmark-cli -- list-impls  # shows the new genre
cargo run -p penmark-cli -- generate mygenre --rows R --cols C

New Rule variant sidebar

If your genre needs a constraint shape no existing Rule covers, add one before doing the steps above. The change crosses several files because adding a variant breaks every exhaustive match.

  1. Variant: add to Rule enum in core/src/constraint.rs. Add the matching RuleKind variant. Add a Violation variant — used by eval to surface which constraint failed.
  2. Eval: add an arm in core/src/algorithm/eval.rs::eval_rule. At minimum return ConstraintOutcome::Pending until decided, plus the obvious-violation cases. Slow-path-only is fine for a first cut.
  3. Tightness: add a tightness arm in Rule::tightness. High tightness means “prunes a lot per commit”; the solver tries it first.
  4. Propagator (optional): core/src/algorithm/fast/propagator.rs has per-rule fast paths for forced_moves_at. Eval-only is fine; trial-1 picks up the slack. Add a fast path later if benches demand.
  5. OR-tools: core/src/algorithm/ortools/solver.rs::encode_constraint is exhaustive on Rule — your build won’t compile until you add an arm. Two patterns:
    • Eager: pick the closest CP-SAT primitive (add_or / add_at_most_one / add_linear_constraint). Use for anything expressible as linear / boolean over per-coord indicators.
    • Lazy cut: push a LazyCheck variant. CP-SAT 0.4 doesn’t have direct primitives for graph-shape rules (Connected, Path, BoundedSize). solve_inner calls each lazy check after every CP-SAT response and adds an add_or no-good cut on violation.
  6. Grader (optional): if your rule yields a recognisable human technique, add a Technique impl in core/src/algorithm/grader/. Most rules don’t add new techniques — the existing ones cover most patterns generically.
  7. Docs: update the encoding table in External solvers.

Definition of done

  • cargo test -p penmark-core --features ortools passes.
  • cargo bench --bench penmark -- <genre> -- --test smokes.
  • cargo run -p penmark-cli -- list-impls shows the new genre.
  • cargo run -p penmark-cli -- generate <genre> --rows R --cols C produces a valid puzzle.
  • At least one ortools_agrees_with_fast_* test in core/src/puzzles/<genre>/tests.rs::solver.
  • eframe app loads and a sample puzzle renders + solves interactively.

Cross-references

  • Generator chapter — GenConfig, genre.generate, scaffold passes, clue filters.
  • Fast solver — internals before writing a propagator fast path.
  • External solvers — OR-Tools encoding strategy table.

Roadmap

What the MVP covers, where performance currently sits, and what’s open.

MVP

Three genres on one engine, two frontends. Mostly landed.

Genres in scope

| Genre | Mark | Substrates | Variants | Constraint shapes used |
|---|---|---|---|---|
| Akari | Binary | Cells | classic + holes (Iraka) | ExactCount, CellMin |
| Slitherlink | Binary | Cells, EdgesH, EdgesV | classic | ExactCount, DegreeIn, Path { closed }, LoopFreezesCounts |
| Sudoku | Numeric | Cells | classic 9×9 + several variants via canonical-import path | Distinct, Decided |

For each genre:

  • Concrete Puzzle impl with constructor that populates the constraints Vec at build time. ✅
  • Propagators per RuleKind the genre uses, riding the shared Propagator engine. ✅
  • FastSolver + StandardGrader end-to-end. ✅
  • OR-Tools cross-check via algorithm/ortools/. ✅
  • Enumerator<G> impl walking the genre’s canonical structural space, with shared EnumStop semantics. ✅
  • Starter grading profile shipped as TOML, validated against the registered technique list at load time. ✅
  • ASCII renderer for terminal output. ✅
  • Generator: shipped for Akari; not for Slitherlink/Sudoku yet (those load from the canonical dataset).

Frontends in scope

  • CLI — see the CLI chapter for the live verb set. Genre routing is hand-wired per verb (no registry).
  • eframe desktop app — one window, per-genre interactive views in eframe/src/games/, plus harness/ (solve / generate / grade panels), dataset_browser.rs, and db_browser.rs. Per-genre rendering lives in each genre’s module directly; no Render trait abstraction (it’ll earn one if/when the catalog grows enough to make the overlap clear).

Explicitly out of scope, for now

  • More genres: Nurikabe, Heyawake, Hitori, Tapa, Yajilin, Masyu, Hashiwokakero, Numberlink, Fillomino, etc.
  • Sudoku variants beyond what the canonical importer already understands (Killer, Sandwich, Thermo, Arrow, Sudoku-X, German Whispers, Renban, Kropki, XV are partially modelled but not uniformly graded).
  • Pattern-phase techniques beyond a starter handful.
  • Hex / non-square substrates.
  • Backward-logic refinements (clue-cover beyond the basic case).
  • A Render trait abstracted across genres.
  • The wasm crate, wasm-bindgen surface, and the ordo/edit page that consumes them. The wire-format work (tagged puzzle and game serialization) is in scope and described above; the WASM bindings themselves and whichever registry shape they want are what’s still to land.

Genre expansion

Which genres ship next and why, revisited against what’s actually landed. The original sequencing (masyu → nurikabe → nurimisaki → heyawake → yajilin) was written when adding a genre meant ~7 files across core/, cli/, and eframe/, and when the OR-Tools encoder didn’t yet cover the rules that nurikabe-family genres need. Both of those have changed. This chapter rewrites the plan on top of the current infra.

What shipped vs. the plan

| Genre | Original plan | Actual |
|---|---|---|
| akari | done at plan time | done |
| slitherlink | done at plan time | done |
| sudoku | not on plan | shipped (variant-CtC market fit; jumped the queue) |
| takuzu | not on plan | shipped (worked example for the “smallest possible genre” path) |
| masyu | Phase 1 | not started |
| nurikabe | Phase 2 | not started |
| nurimisaki | Phase 3 | not started |
| heyawake | Phase 4 | not started |
| yajilin | Phase 5 | not started |
| hashiwokakero | deferred | now in scope (named in the business plan) |

Sudoku displaced masyu because the business plan makes variant-Sudoku in the CtC orbit the beachhead audience. Takuzu displaced nurikabe because shipping a third tiny genre (single mod.rs + parse.rs + tests.rs, no pzv codec, no scaffold passes) validated the universal-genre architecture and gave the Adding a genre chapter a minimum-viable example to point at.

What changed in the infra

Three concrete shifts since the original plan was written.

Frontend wiring is one line

Adding a genre to CLI + eframe + bench is now one entry in the core::genre::GENRES slice, not three full files. The post-collapse architecture in Adding a genre spells this out: all three harnesses iterate the slice and dispatch through the runtime &'static Genre handle; per-genre cli/src/<genre>.rs and eframe/src/<genre>.rs no longer exist. The wiring cost per genre dropped from “a day or two” to “an hour.”

OR-Tools already encodes the stubbed rules

core/src/algorithm/ortools/solver.rs::encode_constraint is exhaustive on Rule and has arms for every variant currently stubbed in eval.rs:

| Rule | eval.rs (FastSolver) | OR-Tools encoder |
|---|---|---|
| Connected | Pending | implemented |
| NoTwoByTwo | Pending | implemented |
| ValueGroupedSize | Pending | implemented |
| Polyomino | Pending | implemented |
| BoundedSize | Pending | implemented |
| Path { closed: false } | Pending | implemented |
| Sum / Increasing / MinDifference | Pending | implemented |

This is the part that re-orders the roadmap. A genre whose rules are stubbed in eval.rs is not blocked from shipping — OR-Tools solves it correctly, the universal walker enumerates it, and trial-1 carries forces in the absence of a fast eval arm. The oracle works before the fast path does. Fast-path arms can land later when bench numbers force them.

Universal walker, generator, and tests

walk_canonical removed per-genre enumerators (~200 LOC each). scaffold_solve_derive covers most generation. Per-genre dataset-smoke tests are copy-paste. The dataset infra (puzzlekit-dataset repo, /private/tmp/penmark-data/) is in place. The cost of “adding a genre that uses already-shipped rules” is now mostly: write build_constraints, write the pzv parser, write tests.

The new ordering optimizes for two things: cheapest first (stay in flow, ship visible progress), and then heaviest infra-leverage (the genre that promotes the most stubs to real fast-path eval). This is a different ordering than the original plan, which optimized for engine-reuse compounding alone.

Phase 1 — masyu (cheapest delta)

Still the right first ship. Slitherlink-minus-clues with two new local rules (MasyuBlack, MasyuWhite). Reuses every existing slitherlink primitive: Path { closed: true }, DegreeIn, union-find, parse_url_frame, the pzv decode4Cell family. Eval-only first cut for the two new rules; trial-1 carries forces. OR-Tools needs new exhaustive arms for the two variants, but they’re local and small.

Why this ordering still holds: it’s the genre with the smallest possible delta against shipped code, so it’s the ideal “warm up the new architecture” ship and proves the once-per-genre cost is actually as low as the Adding a genre chapter claims.

Ship target: a few days. Definition of done unchanged from the original plan — 50-puzzle masyu dataset smoke passes, ortools cross-check, eframe pane renders.

Phase 2 — hashiwokakero

New addition from the business plan. Hashi is named explicitly as “within reach on the same architecture.” Engine-wise it sits between slither (edges) and nurikabe (Connected). It needs:

  • An edge-multiplicity rule — the existing edge substrate carries binary drawn/not drawn; Hashi needs {0, 1, 2} per edge slot. Either a new Mark::Numeric on edges (cleanest) or a new Rule::EdgeMultiplicity { allowed: [0,1,2] } over edges (smaller diff but encodes a marker as a constraint).
  • ExactCount per island over its incident edges (already shipped).
  • Connected { mark: Bridge } over the bridge edge set (stubbed in eval.rs; needs a real arm — see below).
  • A planarity / non-crossing cut: bridges can’t cross. Lazy cut pattern, same shape as the slither loop closure.
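The crossing test the lazy cut would run per bridge pair is tiny. A sketch under an assumed axis-aligned bridge representation (the structs are illustrative, not Penmark types): a horizontal bridge and a vertical bridge cross exactly when each one's fixed coordinate falls strictly inside the other's span — endpoints don't count, since those are islands.

```rust
/// Illustrative bridge shapes: axis-aligned segments between islands.
struct HBridge { row: i32, c1: i32, c2: i32 } // c1 < c2
struct VBridge { col: i32, r1: i32, r2: i32 } // r1 < r2

/// Strict-interior overlap on both axes; touching an endpoint island
/// is legal, so the comparisons are strict.
fn crosses(h: &HBridge, v: &VBridge) -> bool {
    v.col > h.c1 && v.col < h.c2 && h.row > v.r1 && h.row < v.r2
}
```

On violation, the lazy check would add a no-good cut forbidding that bridge pair, the same shape as the slither loop-closure cut.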

The Hashi solver work doubles as the forcing function for promoting Connected from stub to real. Once Connected has a fast eval arm, nurikabe / nurimisaki / heyawake / yajilin all benefit.

Why earlier than Phase 3+: the business plan explicitly calls Hashi out as part of the “Nikoli family within reach” pitch. CTC solves Hashi occasionally on YouTube. It’s a different shape of puzzle from the cell-coloring family, which broadens the engine demo without demanding the whole nurikabe-family stub-to-real push at once.

Ship target: ~2 weeks. Most of the time goes to the planarity cut and to making Connected fast enough for trial-1 to be useful.

Phase 3 — nurikabe (the high-leverage stub-to-real push)

This was Phase 2 in the original plan, and the underlying argument hasn’t changed: implementing real fast-path eval for Connected, NoTwoByTwo, and ValueGroupedSize unlocks four later genres simultaneously. What’s changed is that Hashi (Phase 2 above) already paid for Connected’s real arm, which makes nurikabe’s incremental cost smaller than originally estimated.

Three rules need real eval.rs arms after Hashi:

  • NoTwoByTwo { mark } — slow-path is trivial (walk every 2×2); fast-path is per-2×2 aggregate with force-when-three-and-one-undecided.
  • ValueGroupedSize — designed for nurikabe; per-value-seed BFS over committed cells of that mark, asserting size == value.
  • Connected (already done in Phase 2 if Hashi shipped first; otherwise here).
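The NoTwoByTwo slow path really is as trivial as the bullet claims. A self-contained sketch, using an illustrative `Option<bool>` cell encoding (`Some(true)` = committed black, `None` = undecided) rather than Penmark's Mark type:

```rust
/// Walk every 2×2 window; a violation is four committed black cells.
/// The fast-path force (three committed black + one undecided ⇒ that
/// cell must be white) walks the same windows.
fn no_two_by_two_violated(grid: &[Vec<Option<bool>>]) -> bool {
    let (rows, cols) = (grid.len(), grid[0].len());
    for r in 0..rows.saturating_sub(1) {
        for c in 0..cols.saturating_sub(1) {
            let window = [
                grid[r][c], grid[r][c + 1],
                grid[r + 1][c], grid[r + 1][c + 1],
            ];
            if window.iter().all(|m| *m == Some(true)) {
                return true;
            }
        }
    }
    false
}
```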

The pzv work is the standard decode4Cell family.

Ship target: ~3 weeks. Most of the time is the propagator work for the three rules.

Phase 4 — nurimisaki, heyawake (proves Phase 3 transports)

Both reuse Connected + NoTwoByTwo from Phase 3. Both ship fast on top of that base.

  • Nurimisaki adds one new local rule: Cape { at, len } — a line-of-sight ray of len-1 white cells terminated by black or a wall, with the cape’s other three neighbors black or wall. Eval-only first cut.
  • Heyawake adds room geometry + adjacency. No new Rule variants — encodes via ExactCount per numbered room + AtMost(1) per adjacent black pair + per-straight-run Forbidden(ExactCount(black, k)) for the room-boundary rule. Lots of constraints, no new engine work.

Land them in the same phase because nurimisaki proves Phase 3’s stub promotions transport, and heyawake stress-tests the constraint count on existing rules.

Ship target: ~2 weeks combined.

Phase 5 — yajilin (the prize)

Original Phase 5; still the hardest. Cell-only Model A ({Black, WhiteLoop, Numbered} per cell, loop recovered from WhiteLoop set via Connected + DegreeIn(2)) reuses everything in place by then. Model B (edges + cells coupled biconditional) deferred until benches force it.

If Phase 3’s Connected arm proves not strong enough as a “single closed loop” check on cells, add a cell-substrate Path { closed: true } variant. The slither edge-based version demonstrates the algorithm is tractable.

Ship target: ~3 weeks.

Phases 1-5 in roughly ten weeks

The estimate is honest if focused: masyu (3-5 days) + hashi (2 weeks) + nurikabe (3 weeks) + nurimisaki+heyawake (2 weeks) + yajilin (3 weeks). About 10 weeks of focused work, less if some genres ship faster than estimated, more if Connected’s fast eval arm turns into a multi-week project on its own.

This is materially faster than the original plan’s pace because of the architecture collapse: each genre is mostly one file in core/ plus three Vec entries elsewhere, and OR-Tools solves correctly today for every rule shape we care about. The gating work is per-rule fast-path eval, not per-genre wiring.

Deferred (still out of scope)

  • shakashaka — needs Mark::Triangle(Empty, NW, NE, SW, SE) (a five-valued mark; the engine currently assumes Binary/Numeric marks) and a RegionIsRectangle geometric rule with no existing analog. Revisit when there’s a concrete demand signal.
  • lits — tetromino placement is a different paradigm than per-coord candidate sets. Would need Rule::OneOfShapes and same-shape adjacency across rooms. Tractable but architecturally distinct.
  • dbchoco — region-pairing congruence; needs shape-comparison primitives.
  • Crossword, mazes, hidden-information genres — covered in the Puzzle families chapter; sister-track work, not in-scope for the CSP roadmap.

Locked decisions (carry over)

  • Trial-1 + OR-Tools oracle carries any genre as a “slow but correct” first cut. Fast-path eval lands when benches demand.
  • New genres are one-file-in-core/puzzles/<genre>/ + three Vec entries (CLI / eframe / bench). No per-genre frontend files.
  • pzv URL framing goes through core/src/puzzles/pzv.rs::parse_url_frame.
  • Each genre lands with a 50-puzzle dataset smoke test against the puzzlekit / puzzlink corpus.

Why this matters for the business plan

The business plan makes engine breadth a long-term moat: “the home for all Nikoli-style puzzles” is the embed-pitch differentiator no competitor can offer. The Phase 1 beachhead is variant-Sudoku — done — but Phase 4 of the business roadmap (“multi-genre expansion and market growth”) is the phase that turns the engine into a real pitch. Shipping six more genres in the next ~10 weeks moves Phase 4 from “long-term moat” to “ready to demo when the marketplace ships in the business plan’s Phase 2.”

The cheap-then-leveraged ordering (masyu → hashi → nurikabe → nurimisaki+heyawake → yajilin) means the engine demo gets visibly broader every couple of weeks, which is the showing-up-in-CtC-orbit cadence the business plan’s go-to-market section depends on.

Performance roadmap

The originally-planned tiers all landed inside FastGame + Propagator. FastSolver solves a 100×100 Akari in ~42 ms; StandardGrader grades the same in ~210 ms. See the Fast solver chapter for the full optimization log — what worked, what was reverted, and what’s still open.

The ideas that landed:

  • Region indices cached once at construction. Each constraint resolves its Region to a Vec<usize> of dense coord indices in Propagator::region_indices; the hot path walks the slice directly with candidates[i] access.
  • First-class LOS / ray substrates. Region::Los, Region::LosCross, and Region::Neighbors4 resolve natively via puzzle.substrate().blocks_los(r, c) — a direct read on the cells grid (Material::Wall blocks, Floor and Hole pass through), no trait-method dispatch.
  • Per-constraint aggregates. Each constraint caches committed_with_mark / possible_with_mark; apply/undo maintain them; forced_moves_at reads them in O(1).
  • Per-cell counters (Tier 3, the big one). Puzzles declare named counters via Puzzle::cell_counters(); the propagator maintains them through apply/undo via precomputed inverse-reach. Rule::CellMin { counter, min } propagates cell-by-cell through a dirty-cells queue instead of riding the AC-3 worklist.
  • Bitmask candidates. Per-coord u32, one bit per mark in the puzzle’s universe. Apply / undo are mask & !bit / mask | bit.
  • MRV trial ordering + intra-sweep propagation in trial-1.
  • LTO=fat + codegen-units=1 in release.
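
The bitmask-candidate item is small enough to show whole. A minimal sketch, assuming a u32 universe and illustrative free-function names (the real engine inlines these on FastGame):

```rust
// Illustrative function names; one u32 mask per coord, one bit per mark.
fn apply(mask: u32, mark: u32) -> u32 {
    mask & !(1 << mark) // prune one mark: mask & !bit
}

fn undo(mask: u32, mark: u32) -> u32 {
    mask | (1 << mark) // restore it exactly: mask | bit
}

fn is_decided(mask: u32) -> bool {
    mask.count_ones() == 1 // a singleton candidate set
}

fn main() {
    let universe = 0b111; // a three-mark universe: bits 0, 1, 2
    let pruned = apply(universe, 1);
    assert_eq!(pruned, 0b101);
    assert_eq!(undo(pruned, 1), universe); // undo is the exact inverse
    assert!(!is_decided(pruned));
    assert!(is_decided(apply(pruned, 2))); // only mark 0 remains
    println!("ok");
}
```

Because undo is the exact inverse of apply, backtracking needs no saved snapshots, just the reverse move list.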

Still open: fused substrate-aware updates (a move that walks a ray AND emits forced moves on those same cells, in one pass), MAC / GAC / AC-2001, trial-2/N, smarter branch heuristics, parallelism in the solve loop, WASM measurement.

Virtuous cycle

This chapter describes how the four operations Penmark performs on a puzzle — generate, solve, grade, enumerate — and the discovery loop that connects them fit together as one feedback system rather than four standalone tools.

The loop

flowchart LR
    GEN["Generate<br/><sub>Genre::generate</sub><br/><sub>scaffold → bare-solve →<br/>derive clues → reduce</sub>"]
    PUZ[(Puzzle corpus)]
    SLOW["Slow solve<br/><sub>SimpleSolver,<br/>OR-Tools CP-SAT</sub><br/><sub>reference truth</sub>"]
    FAST["Fast solve<br/><sub>FastSolver</sub><br/><sub>AC-3 + trial-1<br/>+ Propagator</sub>"]
    GRADE["Grade<br/><sub>StandardGrader</sub><br/><sub>fire named Techniques<br/>in priority order</sub>"]
    ENUM["Enumerate<br/><sub>Genre::enumerate</sub><br/><sub>stream unique puzzles<br/>at a given size</sub>"]
    DIFF["Difficulty<br/>label"]
    DISCOVER["Discover<br/><sub>mining pass over corpus:<br/>capture contradictions,<br/>cluster proof-shapes</sub>"]
    HUMAN["Human curation<br/><sub>inspect candidates,<br/>encode as Technique impl</sub>"]
    CATALOG[("Grader<br/>technique<br/>catalog")]

    GEN -->|new puzzle| PUZ
    ENUM -->|stream| PUZ

    PUZ --> SLOW
    PUZ --> FAST
    SLOW -->|cross-check| FAST

    FAST --> GRADE
    GRADE --> DIFF
    DIFF -->|filter / re-roll| GEN

    PUZ --> DISCOVER
    FAST -->|trial-k probes,<br/>contradiction proofs| DISCOVER
    DISCOVER --> HUMAN
    HUMAN -->|new Technique impl| CATALOG
    CATALOG --> GRADE
    CATALOG -->|finer-grained<br/>difficulty control| GEN

    classDef gen fill:#1f3b73,stroke:#3b5fa8,color:#fff
    classDef solve fill:#2d5a3d,stroke:#4a8a5a,color:#fff
    classDef grade fill:#5a4a2d,stroke:#8a7a4a,color:#fff
    classDef discover fill:#5a2d4a,stroke:#8a4a7a,color:#fff
    classDef store fill:#3a3a3a,stroke:#6a6a6a,color:#fff
    class GEN,ENUM gen
    class SLOW,FAST solve
    class GRADE,DIFF grade
    class DISCOVER,HUMAN discover
    class PUZ,CATALOG store

The diagram’s node groups are colored by role:

  • blue — sources of puzzles (generate, enumerate).
  • green — solvers (slow reference + fast production).
  • gold — grading (the technique catalog plus the resulting difficulty label).
  • purple — the discovery loop that grows the catalog.
  • grey — passive stores (the puzzle corpus, the technique catalog).

How each edge actually flows

Generate → corpus

genre.generate runs the universal scaffold_solve_derive recipe (or a genre-specific override): pick structural materials, bare-solve a completion with FastSolver, derive clues via genre.derive_clue_at (the typed ClueValue-producing hook; defaults through genre.clue_pattern for u8 genres), then reduce by symmetry orbit until clue density hits the floor. The output is one uniquely-solvable Puzzle (with its genre field set to the runtime handle). Repeated calls under varied seeds populate a corpus file (the per-genre data/puzzles/<genre>.jsonl[.gz]).
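
A minimal sketch of the scaffold → bare-solve → derive → reduce shape, with toy closures standing in for the real Genre hooks (all names and the Vec-of-clue-indices representation here are illustrative):

```rust
// Hedged sketch: the recipe as a pipeline over closures. A "puzzle" here is
// just a Vec of values and clues are indices into it; the real hooks are
// Genre methods producing typed ClueValues.
fn scaffold_solve_derive<P>(
    scaffold: impl Fn() -> P,
    bare_solve: impl Fn(P) -> P,
    derive_clues: impl Fn(&P) -> Vec<usize>,
    is_unique: impl Fn(&P, &[usize]) -> bool,
) -> (P, Vec<usize>) {
    let solution = bare_solve(scaffold());
    let mut clues = derive_clues(&solution);
    // Reduce: drop clues one at a time while uniqueness survives.
    let mut i = 0;
    while i < clues.len() {
        let mut trial = clues.clone();
        trial.remove(i);
        if is_unique(&solution, &trial) {
            clues = trial; // clue was redundant
        } else {
            i += 1; // clue is load-bearing, keep it
        }
    }
    (solution, clues)
}

fn main() {
    // Toy instance: the "solution" is [1, 2, 3]; uniqueness holds as long
    // as at least two clue indices remain.
    let (sol, clues) = scaffold_solve_derive(
        || vec![1, 2, 3],
        |p| p,
        |p: &Vec<i32>| (0..p.len()).collect(),
        |_, cs| cs.len() >= 2,
    );
    assert_eq!(sol, vec![1, 2, 3]);
    assert_eq!(clues.len(), 2); // reduced to the minimum that stays unique
    println!("ok");
}
```

The real recipe reduces by symmetry orbit (dropping whole orbits of clues at once) rather than one index at a time, and stops at a clue-density floor.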

genre.enumerate is the same idea wholesale — instead of “give me one good puzzle for this config” it streams every uniquely-solvable puzzle at a given size. Either path lands in the same on-disk corpus.

Corpus → solve

Two solvers run in parallel against the same corpus, for different reasons:

  • Slow solve is the reference. SimpleSolver is brute-force DFS over HashMap<Coord, Vec<Mark>>; the obvious-correct impl every faster solver gets cross-checked against. OrToolsSolver (CP-SAT, behind the ortools feature) is the “real CP solver” point of comparison — answers the question “how close is our bespoke propagator to a production CP engine on this corpus?”.
  • Fast solve is production. FastSolver runs DFS plus AC-3 propagation through a Propagator over a FastGame<'p> (bitmask domains, per-constraint aggregates, per-cell counters), with a depth-1 trial-1 sweep at every node. This is what the grader and generator call internally.

The cross-check edge between them is what keeps the fast solver honest. Every simple_and_fast_agree_* test asserts that the two return the same solution set on a fixed puzzle; the corpus-wide bench harness asserts the same across thousands of puzzles.
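
The agreement contract can be sketched as a generic harness. The solver stand-ins below are toy closures, not SimpleSolver/FastSolver, and the harness name is illustrative:

```rust
// Hedged sketch: slow reference and fast solver must produce the same
// solution set; order is not part of the contract, so compare sorted.
fn assert_agree<P, S: Ord + std::fmt::Debug>(
    slow: impl Fn(&P) -> Vec<S>,
    fast: impl Fn(&P) -> Vec<S>,
    puzzle: &P,
) {
    let mut a = slow(puzzle);
    let mut b = fast(puzzle);
    a.sort();
    b.sort();
    assert_eq!(a, b, "solvers disagree on this puzzle");
}

fn main() {
    // Toy "puzzle": find all x below n with x * x == n - 1.
    let slow = |n: &u32| (0..*n).filter(|x| x * x == n - 1).collect::<Vec<_>>();
    let fast = |n: &u32| {
        // A shortcut route to the same answer set.
        let r = ((*n - 1) as f64).sqrt() as u32;
        if r * r == n - 1 { vec![r] } else { vec![] }
    };
    assert_agree(slow, fast, &10); // both find x = 3
    println!("ok");
}
```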

Fast solve → grade

StandardGrader fires named human-pattern Techniques in priority order against the same Propagator engine FastSolver uses. Techniques are rule-kind-driven (Propagation-phase ones participate in the inner fixpoint loop; Pattern-phase ones run when propagation stalls), so genres sharing rule shapes share deductions. The output is a difficulty label assigned by counting which technique tiers were needed to close the puzzle.
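
A sketch of that two-phase firing order, with Technique reduced to a toy closure over a toy u32 state (the real trait carries rule-kind metadata and runs against the Propagator):

```rust
// Toy stand-in: a named technique that either makes progress or doesn't.
struct Technique {
    name: &'static str,
    tier: u32,
    fire: Box<dyn Fn(&mut u32) -> bool>, // true = made progress
}

/// Fires propagation-phase techniques to fixpoint; when they stall, tries
/// pattern-phase techniques in priority order. Returns the highest tier used.
fn grade(mut state: u32, propagation: &[Technique], pattern: &[Technique]) -> u32 {
    let mut max_tier = 0;
    loop {
        // Inner fixpoint loop over propagation-phase techniques.
        let mut progressed = true;
        while progressed {
            progressed = false;
            for t in propagation {
                if (t.fire)(&mut state) {
                    max_tier = max_tier.max(t.tier);
                    progressed = true;
                }
            }
        }
        // Stalled: first pattern-phase technique that fires restarts the loop.
        match pattern.iter().find(|t| (t.fire)(&mut state)) {
            Some(t) => max_tier = max_tier.max(t.tier),
            None => return max_tier, // nothing fires: closed (or stuck)
        }
    }
}

fn main() {
    let propagation = vec![Technique {
        name: "singles",
        tier: 1,
        fire: Box::new(|s: &mut u32| if *s >= 2 { *s -= 2; true } else { false }),
    }];
    let pattern = vec![Technique {
        name: "x-wing",
        tier: 3,
        fire: Box::new(|s: &mut u32| if *s == 1 { *s = 0; true } else { false }),
    }];
    assert_eq!(grade(5, &propagation, &pattern), 3); // pattern tier was needed
    assert_eq!(grade(4, &propagation, &pattern), 1); // propagation alone closes it
    println!("ok");
}
```

The difficulty label then falls out of `max_tier`: the highest technique tier that had to fire before the puzzle closed.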

Difficulty → generate

The generator’s cfg.time_budget and cfg.clue_density knobs control raw structural shape, but the grader’s difficulty label is what makes the generator useful. A “produce a hard 6×6 sudoku” workflow is the generator producing candidates, the grader rating each, and the harness re-rolling anything that doesn’t land in the requested tier. Without the grader the generator is just a uniqueness oracle.
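
The re-roll workflow reduces to a small loop. Generator and grader are toy closures here, and the harness name is illustrative:

```rust
// Hedged sketch: generator proposes, grader rates, harness re-rolls until
// the label lands in the requested tier or the roll budget runs out.
fn generate_at_tier<P>(
    mut generate: impl FnMut(u64) -> P,
    grade: impl Fn(&P) -> u32,
    want_tier: u32,
    max_rolls: u64,
) -> Option<P> {
    for seed in 0..max_rolls {
        let candidate = generate(seed);
        if grade(&candidate) == want_tier {
            return Some(candidate); // landed in the requested tier
        }
        // otherwise re-roll with the next seed
    }
    None // budget exhausted without a hit
}

fn main() {
    // Toy: the "puzzle" is its seed, and its tier is seed % 4.
    let p = generate_at_tier(|seed| seed, |p: &u64| (*p % 4) as u32, 3, 100).unwrap();
    assert_eq!(p, 3); // first seed whose grade is tier 3
    println!("ok");
}
```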

Corpus + fast solve → discover

The discovery loop is the part of this diagram that doesn’t yet exist as production code but is the point of the whole exercise. The shape: run trial-k preprocessing across the corpus at depth k > 1, capture each contradiction’s minimal-variable-set proof, cluster the proofs by structural shape (modulo symmetry), and surface the top-N candidate patterns.

The patterns that cluster cleanly are candidate techniques. Each one is a deduction the existing grader catalog doesn’t have a name for, surfaced from real puzzles where it would have mattered.
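
A toy sketch of the clustering step, canonicalising proofs modulo translation only (the real pass would also mod out rotation and reflection, and proofs would carry more than coords):

```rust
use std::collections::HashMap;

type Coord = (i32, i32);

/// Translate a proof's coords so the minimum corner sits at the origin,
/// then sort so identical shapes hash identically.
fn canonical(mut proof: Vec<Coord>) -> Vec<Coord> {
    let min_r = proof.iter().map(|c| c.0).min().unwrap();
    let min_c = proof.iter().map(|c| c.1).min().unwrap();
    for c in proof.iter_mut() {
        *c = (c.0 - min_r, c.1 - min_c);
    }
    proof.sort();
    proof
}

/// Bucket proofs by canonical shape; return the top-n shapes by frequency.
fn top_shapes(proofs: Vec<Vec<Coord>>, n: usize) -> Vec<(Vec<Coord>, usize)> {
    let mut buckets: HashMap<Vec<Coord>, usize> = HashMap::new();
    for p in proofs {
        *buckets.entry(canonical(p)).or_insert(0) += 1;
    }
    let mut out: Vec<_> = buckets.into_iter().collect();
    out.sort_by(|a, b| b.1.cmp(&a.1));
    out.truncate(n);
    out
}

fn main() {
    // Two translated copies of an L-shaped proof plus one singleton: the
    // L-shape clusters cleanly and surfaces first.
    let proofs = vec![
        vec![(0, 0), (1, 0), (1, 1)],
        vec![(5, 5), (6, 5), (6, 6)],
        vec![(9, 9)],
    ];
    let top = top_shapes(proofs, 1);
    assert_eq!(top[0].1, 2); // the L-shape appeared twice
    assert_eq!(top[0].0, vec![(0, 0), (1, 0), (1, 1)]);
    println!("ok");
}
```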

Discover → human → catalog

A human inspects the candidates. Patterns that read like something a sudoku setter would call a technique (“locked candidate,” “sashimi X-wing,” etc.) get encoded as Technique impls and added to the grader’s catalog. Patterns that turn out to be 12-variable monsters with no human-recognisable shape get discarded — they don’t pollute the catalog.

The grader stays a catalog of hand-coded named techniques. The discovery loop is a helper for the human writing techniques, not a generator of them. That matters for three reasons:

  1. The grader API is unchanged; difficulty calibration stays anchored to human-recognisable techniques.
  2. Solver internals (aux-var state, proof shapes, search bookkeeping) never have to be human-readable.
  3. Auto-discovered candidates that aren’t interpretable get filtered at human review.

Catalog → generate (the long edge)

This is the edge that closes the cycle. A grader catalog with finer-grained tiers gives the generator finer-grained difficulty control: “produce a puzzle that needs technique X but not technique Y” becomes expressible. That produces a richer corpus that, on the next mining pass, surfaces a different set of candidate patterns — including some that the previous catalog was masking by always firing first.

Why this framing matters

Each operation in isolation is unremarkable: a CP solver, a DFS generator, a rule-firing grader, a difficulty tagger. The interesting part is that the four operations share the same propagation engine, so each improvement compounds:

  • A faster FastSolver makes the generator produce more candidates per second, makes trial-k tractable for larger k, and makes the grader reach harder puzzles within the same time budget.
  • A bigger grader catalog makes the generator’s difficulty output sharper and makes the discovery loop’s clustering pass see fewer false-positive “new” patterns.
  • A bigger corpus makes both the bench harness more meaningful and the discovery loop more likely to surface low-frequency patterns.

The cycle’s bottleneck moves around. When the existing catalog covers most of the easy patterns, the discovery loop becomes the rate-limiting step. When the discovery loop’s canonicalisation surfaces clean clusters, the human-curation step becomes the rate-limiting step. When the catalog is rich enough that the generator has fine-grained control, the next unlock is harder corpora that exercise the long tail of techniques.

Where we are in this picture

The blue / green / gold parts (generate, both solvers, grader) are all in tree and exercised by the bench matrix and the canonical corpus. The purple discovery loop is not yet built — every “Technique” in the grader catalog is hand-written, with the discovery side filled in by human intuition rather than corpus mining.

The aux-vars / sudoku-triads experiment on claude/penmark-algorithmic-speedups-9tMvr was a pre-discovery attempt to enrich the propagator’s primitive set so the discovery loop, when built, would have a richer vocabulary to cluster proofs in. The integration didn’t pay off on the current sudoku corpus (~2× slowdown, identical DFS node count across difficulty tiers — trial-1 already subsumes the patterns triads catch). The full writeup is in penmark/notes/aux-vars-postmortem.md; the takeaway is that trial-1 is the floor on the current corpus, so future discovery work should target either genuinely-hard puzzles (where trial-1 doesn’t close the gap and the search tree grows) or other genres entirely.

Status

Four genres (Akari, Sudoku, Slitherlink, Takuzu) run end-to-end on a single CSP engine, on CLI + eframe. OR-Tools wired in as a cross-check oracle. Canonical/wire layer ingests external puzzle formats via puzzlehound. Tagged-serialization (puzzle blobs) and the saved-game wire format (puzzle + marks envelope) both ship, with round-trip tests across every genre via the single Puzzle::from_tagged_json[_with_marks] path. The Tier 1 presentation surface (MarkRenderStyle, Material, material, cell_clue, is_play_target, format_value on Puzzle; marks / get / set / clear on Game) lives on the trait, and eframe consumes it (per-genre HashMap mark stores collapsed into the generic tagged::Marks).

Open frontiers — designs documented in this book that haven’t yet landed in the code:

  • Substrate reification + Pin constraint. Documented in the Substrates and Constraints chapters; not yet implemented. The big move: collapse SUBSTRATES / coords() / coord_bg() / los_blocks() / rows() / cols() into one substrate() accessor returning a Substrate struct with four direct layer fields. Bg becomes Material; the discriminator enum renames SubstrateLayer. Givens collapse into the constraint list as Pin (single-coord) or ExactCount / Sum (count-shaped); per-genre clues: HashMap fields stay as editor scaffolding only.

  • Constraint owns its regions + Region.presentation. Documented in the Constraints chapter (Regions section) and Decorations, Themes and Colors; not yet implemented. region: Region becomes regions: Vec<Region> on Constraint — owned by the constraint, no separate region registry. Region wraps a RegionShape plus an optional RegionPresentation (border / fill / label / shaded / decoration). The renderer’s per-genre paint code collapses to a single walk: puzzle.constraints().iter().flat_map(|c| &c.regions), painting whatever presentation is set.

  • Generic grid widget. Once Region.presentation and substrate land, eframe’s per-genre draw_grid / paint_edge collapse onto a single widget that dispatches on MARK_RENDER for marks and walks constraints for region presentations.

  • WASM bindings, discovery / coverage CLI verbs, expanding the genre catalog, and deciding the registry shape the website will want — all open work.
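
The constraint-owned-regions renderer walk from the list above can be sketched with illustrative types. These mirror the design sketch (Region, RegionPresentation, regions on Constraint), not shipped code, and the painted output is reduced to a fill name plus cell count:

```rust
// Design-sketch stand-ins: a region optionally carries its own presentation.
struct RegionPresentation {
    fill: &'static str, // stand-in for the real border/fill/label fields
}

struct Region {
    cells: Vec<(usize, usize)>,
    presentation: Option<RegionPresentation>,
}

struct Constraint {
    regions: Vec<Region>,
}

/// The whole per-genre paint path collapses to this one walk: every
/// constraint's regions, painting whatever presentation is set.
fn paint(constraints: &[Constraint]) -> Vec<(&'static str, usize)> {
    constraints
        .iter()
        .flat_map(|c| &c.regions)
        .filter_map(|r| {
            r.presentation
                .as_ref()
                .map(|p| (p.fill, r.cells.len())) // "paint" = record fill + size
        })
        .collect()
}

fn main() {
    let cs = vec![Constraint {
        regions: vec![
            Region {
                cells: vec![(0, 0), (0, 1)],
                presentation: Some(RegionPresentation { fill: "gray" }),
            },
            Region { cells: vec![(1, 1)], presentation: None }, // nothing to paint
        ],
    }];
    assert_eq!(paint(&cs), vec![("gray", 2)]);
    println!("ok");
}
```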

Shape, traits, and module layout will keep iterating; this book is the current snapshot, not a contract.