Penmark
Penmark is a Rust toolkit for creating, playing, and analyzing Nikoli-style pen-and-paper puzzles. Through a unified view of pen-and-paper puzzles, a new constraint-programming engine written from scratch, and a portable WASM library, Penmark aims to be the best-in-class toolkit for these puzzles in every regard.
This book documents how Penmark is structured, why it’s structured that way, and how to use it. It’s updated regularly throughout development; changes often appear here before being implemented.
What’s in this book
- Core concepts — the conceptual layers (genre / puzzle / game), grid substrates, regions, and the rule vocabulary that every constraint draws from.
- Implementations — the trait surface (Puzzle, Game, the four action traits), tagged serialization, dispatching, the propagation engine, and the Puzzlehound dataset.
- Solver, Generator, Grader, Enumerator — one section per action.
- Frontends — CLI verbs and the directory layout that holds it together.
- Roadmap — MVP scope, performance state, open work.
Chapters are mostly self-contained; cross-links handle the cases where one needs to lean on another. Skim the library-structure and genre-puzzle-game chapters first if you want orientation; otherwise jump in wherever.
Penmark — Business Plan
Version 0.9 — strategic clarity document, not a pitch deck
Penmark will be the new home for setters and solvers of pen-and-paper logic puzzles. With a best-in-class logic engine, an integrated web and mobile experience, and a marketplace and platform connecting fans to their favorite creators, Penmark will become a dominant force in the puzzle world. Penmark aims to become the Bandcamp of puzzles, allowing fans to directly support the livelihoods of the creators behind the puzzles they love.
Market and incumbent analysis
The relevant market is the Cracking the Cryptic orbit and the broader Nikoli-style logic puzzle community. It is small, but extraordinarily passionate, with a clear willingness to pay (existing setter Patreon revenue demonstrates this) and a clear under-served need: no integrated marketplace exists, and no embeddable player exists either.
Sudokupad is the incumbent. Its strengths are a polished free web experience, deep integration with the CtC YouTube ecosystem, and trust within the variant-Sudoku community. Its weaknesses are real and structural: the mobile apps are paid, falling behind on updates, and a recurring source of complaints. It has no creator profiles, no follows, no microblogs, no discovery, no payment infrastructure, and no embeddable player. Sharing happens via Discord links and DMs. Setters monetize off-platform through Patreon and Ko-fi, decoupled entirely from where their work is consumed.
Sudokumaker is editor-focused with no marketplace, no community, and no embed.
Adjacent niches — Cross+A, Logic Masters Deutschland, Janko.at — are either community-only with no commerce, narrowly specialized, or aimed at a different (often older or more puzzle-construction-purist) audience. None offer embeds.
The market today is small not because the audience is small but because the rails to monetize don’t exist. Top setters earn a few hundred to a few thousand dollars a month from Patreon — meaningful side income, not a livelihood. Penmark’s purpose is not to take a percentage of an existing pie but to grow the pie by 100x — to make full-time puzzle setting a viable career for the people this community is already built on. Without that ambition, take rates don’t matter; strategy doesn’t matter; Penmark is just another tool layered onto a constrained market. The mission is the multiplier.
The product
Penmark’s engine is the foundation: a constraint-programming substrate that grades, solves, plays, edits, and generates, designed so new puzzle genres are cheap to add. Sudoku, Slitherlink, and Akari today; Nurikabe, Heyawake, Hashiwokakero, Masyu, and the rest of the Nikoli family within reach on the same architecture. The advantage is breadth — one engine, every Nikoli-style puzzle, an integrated authoring loop on top.
On top of the engine, four product surfaces.
The player — the puzzle-playing software in any of its deployments, distinct from “players,” the people who use it — has three modes. A web destination at penmark.com, where every puzzle has a shareable URL that anyone can play (no account required, like Sudokupad), wrapped in browse, discovery, profiles, and the marketplace. A native mobile app shipped via Capacitor and Fastlane to iOS and Android. And — the wedge — an embeddable iframe that any setter can paste onto their existing website, free, no commitment, with a “Powered by Penmark” footer, a setter-profile link, and a “get the rest of this pack on Penmark” CTA when the puzzle is part of a paid pack. The embed makes setter websites potential distribution surfaces — a foot-in-the-door bet, more constrained in scale than Bandcamp’s or itch.io’s web-wide presence, but enough to seed the niche if the conversion mechanics work.
The editor is for creating puzzles. Browser-based, no exporting required, instant publish — both to the marketplace and to embeds placed elsewhere. Creating and publishing happen in the same tool, in the same place where the audience already is.
The marketplace and social layer is the business. Profiles, paid and free packs, follows, microblogs, monthly support tiers. The marketplace generates revenue; the social layer is what makes the marketplace work — without follows, microblogs, and discovery, a paid pack is just a link, and links can be hosted anywhere.
The account model is unified, free, and role-fluid. Penmark has people accounts, not role-specific accounts: any account can be used to play, set, solve on stream, or publish at scale. Casual fans are players; creators are setters; performers who solve in front of an audience are streamer-solvers; companies and labels like GMPuzzles are publishers, running curated packs from many setters. People hold multiple roles at once, or move between them over time. Everyone gets the same primitives — profile, microblog, puzzle registry, embed-code generator, basic stats, marketplace publishing. Bulk uploads are gated lightly to deter IP theft; otherwise nothing is locked behind a tier.
Channel — how people hear about Penmark
Awareness is two distinct problems. Setters and solvers find Penmark through different paths, and the strategy has to work both flywheels deliberately.
Setters become aware through three paths. Direct outreach from the founder, in the early days — manual and unscalable, but the only way to seed the first ten. Setter-to-setter word of mouth, once those first ten are happy. And ambient visibility — embeds on respected setter websites that other setters notice, and Penmark’s name appearing across CtC-orbit communities: Reddit’s r/sudoku, the CtC Discord, the YouTube setter community, niche Twitter. Ambient visibility cannot be rushed. It requires showing up over months, not weeks: commenting thoughtfully on others’ work, sharing progress openly, slowly becoming someone the community knows by name. The trust earned this way cannot be faked. Search is a smaller channel that pays off late but pays off, since there is currently no obvious result for setters Googling “host my puzzles” or “puzzle marketplace.”
Solvers become aware through the embed and through Penmark URLs. Every embedded puzzle on a setter’s website is an impression for Penmark — footer link, setter-profile link, and, when relevant, a “get the rest of this pack on Penmark” CTA. Penmark URLs shared in Discord, Reddit, and YouTube descriptions land solvers directly on penmark.com, where the rest of the platform is right there. Secondary solver channels are word of mouth in puzzle Discords and subreddits, mentions in Cracking the Cryptic videos when a setter’s pack is being solved, and — once it ships — mobile app store discovery.
The two flywheels compound when they intersect. A setter publishes on Penmark or embeds on their site. Solvers play, see the brand, click through, follow the setter on Penmark, subscribe to support them. The setter sees real engagement and tells another setter. The second setter publishes. Repeat. The growth model is: every embed and every shared URL generates both setter referrals and solver clicks from the same primitive. Slow to start; self-sustaining once turning.
Channels deferred, then opened. In v1, no paid acquisition, no SEO content farms, no influencer deals, no launch-day press push. The niche is small enough that hype-style marketing reads as inauthentic and damages the only growth lever available at this stage. Once the niche is seeded — Penmark a real platform with real setters publishing real packs — the calculus changes. Growing the market beyond the CtC orbit will require paid ads and direct outreach: those are the tools that take a niche to 100x, and patience alone is not. Patience is for the founding period; the marketing engine is for the growth period.
Wedge and differentiation
A useful distinction: the wedge is how Penmark gets its first setters; the differentiators are why those setters stay, grow, and recommend it.
The wedge: distribution as embeddable infrastructure
The primary wedge is the iframe embed. The first ask to a setter is not “migrate your work and publish exclusively here” — it’s “paste this iframe on your existing site, free, no commitment, your existing puzzles work in it.” That’s the lowest-commitment first touch in the market and the highest-leverage move for Penmark, because every embed is a piece of distribution: a “Powered by Penmark” footer, a link to the setter’s Penmark profile, and a “get the rest of this pack on Penmark” CTA when relevant. Setter websites with Penmark embeds become Penmark’s marketing channel.
For setters without personal websites — which is most variant-Sudoku setters in the CtC orbit — the parallel low-friction path is publishing directly to Penmark with shareable URLs, the same way Sudokupad links circulate today. Same low commitment, same conversion mechanics, just funneling through penmark.com instead of the setter’s own domain.
Differentiators (why setters stay, grow, and recommend Penmark)
Creator economics and identity. Profiles, packs, follows, microblogs, paid support. A setter’s Penmark page becomes their home, replacing the awkward Patreon-plus-Twitter-plus-Discord triangulation. Creators stay where they feel like they have a page, not a payload.
Polished mobile experience. Sudokupad’s paid, neglected mobile app is a real opening. A free, polished, frequently-updated Penmark mobile app would win on UX alone for the meaningful subset of CtC fans who want to solve on phones and tablets but currently don’t because the available app is mediocre. Mobile ships in Phase 3, after the embed and the marketplace are working.
Multi-genre breadth, growing. Sudoku is the entry point but the engine architecture means the long-term position is being the home for all Nikoli-style puzzles. No competitor has this. It also strengthens the embed pitch — “embed any Nikoli-style puzzle with one tag” is cleaner than “embed a Sudoku” and nobody else can offer it.
Beachhead audience
Variant-Sudoku enthusiasts in the CtC orbit, specifically: active solvers who follow a small set of setters, setters currently publishing free puzzles and monetizing via Patreon, and the Discord-and-link-sharing crowd who would benefit from in-platform discovery. Small but passionate — the right shape for a marketplace that doesn’t need millions of users to start working, even if it needs many more to fulfill the mission.
Streamer-solvers are the demand engine. In the CtC orbit, fans show up because Simon and Mark are solving; they engage with the setter retrospectively. Setters are supply, but streamers are demand. Penmark’s pitch to a streamer-solver is simple: you want your audience to support the setters whose puzzles you solve. Penmark makes that one click. The streamer doesn’t need to do anything beyond linking to Penmark in their video description — fans see the puzzle, the setter’s profile, the support button, all in one place. Penmark becomes the easy answer to the “how can I support these creators” comment that streamers already field. This is the highest-leverage acquisition lever in the niche, and it costs Penmark nothing beyond making sure the streamer-to-fan-to-setter path is friction-free.
The embed wedge makes the supply side of this beachhead more tractable than it would otherwise be. Recruiting the first ten setters as embed-only adopters is a much easier conversation than recruiting them as marketplace publishers. Once embedded — or published directly to Penmark — the marketplace conversion happens organically: they already trust the player, so the next step is a small one rather than a leap.
Going broader from day one (mobile casuals, education, classic Nikoli demographics in Japan) is expensive and unfocused for a solo bootstrapper. At v1, Sudoku means Sudoku only; other genres come once Sudoku has traction.
Business model
The model is intentionally simple, in the spirit of free-as-possible-to-encourage-growth. Three revenue streams, each scaling differently and compounding together.
Flat percentage on pack sales. Taken from setters at the point of transaction. Bandcamp’s 15% (post-$5K-revenue threshold) is a useful benchmark; landing somewhere in the 10–15% range is more generous than Apple’s 30% while still sustainable. Avoid tiered or volume-based percentages early — they’re complicated to communicate and don’t change behavior at small scale.
Monthly creator support tier. Patreon-style, with the same flat percentage. Backers get a feed of microblog updates and any creator-designated perks (early access, supporters-only packs).
Penmark Premium. A direct solver subscription at $5/month or $50/year. Premium subscribers get unlimited hints and no ads. Free users get a hint cap per puzzle and modest, non-intrusive ads. Premium is the friction-removal product — solvers who play often will pay to remove the soft ceiling on hints and the ad presence; casual solvers stay free. Crucially, this revenue stream doesn’t depend on pack sales — it scales with raw user activity, compounding with the 100x mission. Ships in Phase 2 alongside payment infrastructure.
Free puzzles must remain a first-class concept. Solvers can play any puzzle on Penmark or via embeds without paying anything. Creators can publish free packs alongside paid ones — exactly the Bandcamp model where artists release free tracks, name-your-price albums, and paid albums under one roof. The free tier’s limits (hint cap, light ads) are calibrated to be felt by heavy users only — never inside the play view, never blocking access. The goal is to make Premium worth paying for without making free feel hostile.
Pay-what-you-want pricing is worth supporting from launch. Deeply on-brand for the Bandcamp framing and lowers the friction for setters experimenting with paid work for the first time.
The embed is free and stays free. The temptation to introduce a paid “remove Penmark branding” tier should be resisted early — the branded embeds are the marketing channel, and pricing them away optimizes against your own distribution. The embed monetizes indirectly, by funneling solvers back to Penmark where the marketplace and Premium subscription live.
Ads exist only as part of the freemium structure. Surfaced to free-tier users as a soft conversion lever toward Premium, never as platform-wide noise and never inside the active solving experience. Calibrated to be felt by heavy users without making free feel hostile.
The unit economics now have multiple compounding streams. Pack sales: at a 10–15% take, $1K/month of personal income from packs alone requires roughly $7K–10K of GMV per month — 700–1,000 pack purchases at $10 each. Premium subscriptions: at $5/month, 200 paying subscribers contribute $1K/month directly to Penmark; this scales linearly with engaged-solver count rather than pack volume. Creator support: additional flat-percentage take on monthly Patreon-style subscriptions to setters. These three streams describe ramen profitability for one person; the 100x ambition implies a market many times larger, where many setters earn full-time and Penmark scales to a small team.
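The figures above can be sanity-checked with a few lines of arithmetic. This is an illustrative sketch only — the take rates, $10 average pack price, and $5 subscription are the plan's worked examples, not commitments:

```python
# Illustrative arithmetic behind the unit-economics figures above.
# All inputs are the plan's own example numbers, not forecasts.

def gmv_for_income(monthly_income: float, take_rate: float) -> float:
    """Monthly GMV needed to yield a given platform income at a flat take rate."""
    return monthly_income / take_rate

TARGET = 1_000          # $1K/month of income from pack sales
AVG_PACK_PRICE = 10     # dollars per pack

for take in (0.10, 0.15):
    gmv = gmv_for_income(TARGET, take)
    packs = gmv / AVG_PACK_PRICE
    print(f"take {take:.0%}: ${gmv:,.0f} GMV/month (~{packs:,.0f} packs at $10)")

# Premium: 200 subscribers at $5/month contribute $1K/month directly.
print(f"Premium: ${200 * 5:,}/month from 200 subscribers")
```

At a 15% take the exact requirement is about $6.7K of GMV (roughly 670 packs), which the plan rounds to the $7K–10K range.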
The strategic implication of three streams: Penmark is not a pure marketplace dependent on creator success. Premium subscriptions give a direct B2C revenue base that survives even when the marketplace is still warming up. This de-risks the business model and makes solver acquisition (not just setter recruitment) materially important.
Go-to-market — solo bootstrapper edition
With no team and no growth budget, Penmark’s path is patient, supply-side, and embedded in the community.
Lead with the embed (or direct Penmark publishing), not the marketplace. When approaching the first ten setters, the offer is: “I built a puzzle player that you can either embed on your site or publish through, free, no commitment. Your existing puzzles work in it.” This is a much smaller ask than “come publish exclusively on my new platform” and gets to yes faster. Once a setter is hosting puzzles via Penmark in either mode, the marketplace conversation later (when Phase 2 ships) is “you already trust the player; here’s the marketplace where you can sell packs” — a much warmer conversation than starting from cold.
Hand-recruit the first ten. Personally identify them, personally reach out, personally onboard them. Do the embed setup yourself if needed. The first ten matter enormously; treat them as design partners, not customers. Talk to them weekly.
Make setter onboarding frictionless. The marketplace is supply-constrained, not demand-constrained. The editor needs to be excellent. The publishing flow needs to be one click. Imports from existing puzzle formats need to work. Reduce every gram of friction between a setter having a finished puzzle and that puzzle being live — both as embed and as marketplace listing.
Track embed-to-marketplace conversion from week one. This is the single most important early metric. If solvers click through from embedded puzzles to Penmark profiles and end up subscribing or buying packs, the wedge works. If they don’t, the embed is being too polite — strengthen the CTAs, add a more visible “Penmark profile” link, surface the buy-this-pack call earlier in the solve flow.
Cultivate streamer-solver relationships early. The streamer-solvers in the CtC orbit are a small group; over months of being a CtC-orbit citizen, get on their radar. The goal is not transactional sponsorship — it’s making sure that when a streamer wants to point fans toward “support this setter,” Penmark is the natural answer. This is the highest-leverage demand-side relationship Penmark has, and it needs deliberate cultivation, not just hoping it happens.
Ship the mobile app once embeds and the marketplace are working. The web destination is mobile-responsive, which covers the immediate need. The native mobile app becomes a depth-of-engagement play once there’s marketplace content density worth a flagship experience. This is why mobile sits in Phase 3, not Phase 2.
What not to do: don’t launch with multiple genres, don’t build an algorithmic feed, don’t add gamification or social-game mechanics. None of these earn trust in this community, and trust is the only growth lever available at this stage.
Roadmap
Penmark ships in five phases over roughly twenty-four months — engine foundation, embeddable infrastructure, marketplace and Premium, mobile, multi-genre breadth.
Phase 0 — engine and editor solidified. Largely done. The constraint-programming substrate handles Sudoku, Slitherlink, and Akari. This is the foundation everything sits on.
Phase 1 — embeddable player, direct Penmark hosting, and people accounts (next 3–6 months). Ship the iframe embed: clean, fast, mobile-responsive, with mandatory branding, setter-profile link, and pack CTA. Ship penmark.com as a play surface with shareable URLs that work like Sudokupad’s. Build the unified people-account experience: claim a profile, register puzzles, generate embed codes, see basic play stats. Hand-recruit five to ten setters; personally onboard them via whichever mode fits (embed for those with sites, Penmark URLs for those without). Goal: live setter presence either embedded or hosted, and qualitative validation that the player is at parity with, or better than, what the setter currently uses.
Phase 2 — marketplace, social layer, and Penmark Premium (months 6–9). Paid and free packs, microblogs, follows, monthly creator support tiers, payment integration (Stripe or similar), pay-what-you-want pricing, and the launch of Penmark Premium ($5/month or $50/year for unlimited hints and no ads) alongside the free tier with hint cap and light ads. Open signups. Phase 1 setters become first publishers. Begin streamer-solver outreach in earnest. Goal: the loop “embed/URL → click through → solver subscribes or buys → setter gets paid” works end to end, and free-to-Premium conversion is measurable.
Phase 3 — mobile app (months 9–12). Capacitor + Fastlane to iOS and Android. Launch as the better, free, well-maintained alternative to Sudokupad’s mobile apps. Goal: become the recommended answer in CtC-orbit threads about mobile.
Phase 4 — multi-genre expansion and market growth (months 12–18). Slitherlink and Akari go live to the public. Add one or two more genres opportunistically. The pitch shifts from “Sudoku marketplace” to “the home for Nikoli-style puzzles.” Begin paid acquisition and direct outreach to puzzle-curious audiences outside the CtC orbit. This is the phase where the marketing engine turns on.
Phase 5 — sustainability decision (months 18–24). Ramen profitability either exists or it doesn’t. Decide whether to raise, hire, stay solo, or pivot. Funding is on the table if mission acceleration justifies it; bootstrap remains plan A.
Operating model
Solo and bootstrapped to start. Funding is on the table if mission acceleration justifies it, but the primary plan is bootstrap. Costs are mostly hosting, payment-processing fees, mobile developer account fees, and time. The financial model isn’t complicated — what matters is the discipline.
Define ramen profitability concretely. Pick a monthly personal-income number that covers rent and basic costs, then work backward across all three revenue streams (pack take, creator-support take, Penmark Premium subscriptions) to a milestone for “Penmark works.” Without a number, it’s hard to know whether to keep going or pivot.
The embed introduces a hosting-cost watch item. Embedded plays consume bandwidth and compute even when they don’t generate revenue. At small scale this is negligible; at large scale (thousands of plays per day across many embeds) it can become a real line. Build modest cost monitoring early so that if an embed goes viral the costs don’t surprise. CDN-friendly embed architecture matters more for this reason than for performance alone.
The most important operational habit is constant conversation with setters and streamer-solvers. Ten substantive conversations with passionate users will shape the product more than a hundred hours of solo design. The temptation as a solo founder is to build in isolation; resist it. Talk to people more than you build.
Resist feature creep. The product surface area is already large (engine + editor + player + embed + marketplace + social + Premium + mobile). The bar for any new feature should be: does this make a setter more likely to embed or publish, a streamer more likely to recommend, or a player more likely to come back tomorrow (and convert to Premium)? Features that don’t pass that bar get cut.
Mission-success implies a small team eventually — hosting at scale, payment disputes at scale, support and moderation at scale. Cross that bridge when the volume forces it. The lean operating model is the v1 model, not the v3 model.
Risks and open questions
The risks worth losing sleep over.
Creator cold start, even with the embed. The embed wedge lowers the bar but doesn’t eliminate it. Without pre-existing community connections, breaking into the CtC orbit as a credible host is still the single biggest risk. The embed makes the first conversation easier but doesn’t solve getting to that first conversation. Hand-recruiting and patient community presence over months are the only answers.
Embed succeeds, marketplace doesn’t. If solvers play embedded puzzles but never click through to Penmark, you’ve built free infrastructure with hosting costs and no revenue. Mitigation: track conversion from week one, tighten CTAs aggressively if conversion is weak, be willing to make the embed slightly less convenient if the marketplace funnel needs more pressure.
Premium conversion is too low. The freemium math depends on a non-trivial fraction of free users upgrading. Most freemium apps see 2–5% conversion at steady state; below ~1% the model doesn’t pay for itself. Mitigation: calibrate hint cap and ad intensity as soft conversion levers, A/B test with real users, raise the friction (carefully) if conversion lags.
The market doesn’t grow. The 100x ambition assumes the CtC-orbit audience can be converted into paying fans of multiple creators at much higher rates than today, and that paid acquisition in Phase 4+ can bring in new puzzle fans from outside the orbit. If either fails — if willingness-to-pay caps lower than expected, or if non-orbit audiences turn out to be expensive to acquire — Penmark works as a niche tool but the mission fails. The mitigation is the cumulative effect of every other choice in this plan: lower friction to pay, give creators real homes, make supporting feel personal, make new creators emerge because the rails exist.
Cracking the Cryptic builds their own marketplace. They have the audience, trust, setter relationships, and have already shipped two Sudoku apps; “CTC Marketplace” would be a one-quarter project for them and could win by default through audience alone. Operating instinct says they’re not the type — they’re solvers and YouTube creators, not platform builders — but include this as a hedge. Mitigation: stay CTC-friendly. Don’t compete with them on solving content; complement them.
Sudokupad responding. They could ship an embed feature in a few weeks if they wanted. Historically unlikely given the mobile-app neglect signal, but possible. The defensible asset isn’t the embed itself but the relationships with setters who adopted Penmark first. Speed in shipping a credible embed and recruiting launch setters matters partly for this reason.
Scope creep on a solo founder. Engine, editor, player, embed, mobile, marketplace, social, Premium, multi-genre. That’s a lot of surface for one person. The roadmap is sequenced to ship one wedge at a time, but the temptation to build adjacent things will be constant. Holding the line on phasing is the single biggest discipline question.
Solo burnout. Bootstrapped, niche, slow validation, many months of low signal. Build in personal sustainability — runway buffer, time off, social outlets outside the project. Most solo bootstrap failures aren’t strategic; they’re emotional.
The open questions worth resolving soon, but not blocking on now.
Freemium calibration. What’s the right hint cap per puzzle for the free tier — 3? 5? Per-genre? Per-difficulty? And what’s the right ad intensity — interstitial between sessions only, or also a banner outside the play view? Both need to be tuned through real user behavior, not chosen up front. Likely starting position: 3 hints per puzzle, light interstitial ads between sessions only, never inside the play view. Aim for >2% free-to-Premium conversion at steady state.
Premium and creator support bundling. If a solver pays $5/month to support a specific setter, do they also get Premium features (unlimited hints, no ads) bundled in? Generous interpretation: yes, support-tier pays for both; cleaner: separate. Tentative position: keep separate for clarity early on; consider bundling later if it drives more support-tier conversions.
v1 embed scope. Tentative position: the v1 embed includes a full play experience, mandatory Penmark branding, a setter-profile link, and a setter-configurable “buy this pack on Penmark” CTA when the embedded puzzle is part of a paid pack. Defer creator analytics (plays, completions, time-to-solve) to v1.5 — useful but not needed for the wedge to work, and adding scope. Worth validating with the first launch setters before locking in.
Take rate. Pick something defensible (10% or 15%) and commit; don’t agonize.
Pay-what-you-want. Likely yes from launch — very on-brand and lowers setter experimentation friction.
iOS-first vs. simultaneous mobile launch. Capacitor makes simultaneous more feasible than native, but iOS-first probably still wins for a solo founder (smaller surface, more willing-to-pay audience).
Bulk-upload gating mechanism. Gate at what threshold, verified how? Small uploads remain frictionless; bulk uploads need just enough friction to deter scraping and re-uploading a publisher’s catalog. Likely a quiet rate-limit plus manual review for first-time bulk uploaders. Design this properly closer to publisher onboarding.
Moderation and IP at scale. Open marketplaces eventually face spam, plagiarism, floods of AI-generated puzzles, copyright disputes between setters, and scrape-and-republish attacks. The plan defers a real moderation policy until volume forces it; this is acceptable for v1 but is genuine ongoing operational design that will need a position before publishers like GMPuzzles trust the platform with their catalogs.
Legal structure (LLC vs. sole prop). Light-touch policy for puzzle copyright disputes (only escalate when needed). Language scope (English-first; Japanese is a tempting later market but not a v1 fight).
The questions you do not need to answer yet: educational and B2B variants, exact pricing of every premium feature beyond hints/ads, whether to offer creator analytics beyond basic plays/completions, anything beyond Phase 2.
Summary of strategic positions
These are the commitments to preserve if the document ever compresses to a single page.
The mission is to grow the puzzle market by 100x, not to take a percentage of an existing one. The point is to make full-time puzzle setting a viable career for the people this community is already built on. That ambition is the multiplier behind every other choice.
This is a small-business mission — not a lifestyle business, not a venture rocket. Penmark aims to be a profitable, real, growing operation in a niche-then-broader market. Bootstrap with funding optionality; solo with small-team optionality; patient marketing in v1, paid acquisition and outreach post-seeding.
Penmark is infrastructure first, destination second. The engine and embeddable player deploy across the web; the marketplace and social layer live at the destination they funnel into. Penmark URLs work like Sudokupad’s for setters without sites; embeds work for setters with them.
The wedge is the embed (or direct Penmark publishing), not the marketplace. The first ask to a setter is “host your puzzles here, free, no commitment,” not “migrate to my platform.” It is the sharpest competitive lever available.
Streamer-solvers are the demand engine. Setters are supply; streamers drive fans toward setters. The pitch to streamers is: Penmark makes supporting your setters one click for your audience.
The differentiators — creator economics, polished mobile experience, multi-genre breadth — are reasons setters stay and grow, not reasons they show up. Acquisition runs on the wedge; retention runs on the differentiators.
Accounts are unified, free, and role-fluid. Penmark has people accounts; players, setters, streamer-solvers, and publishers all use the same primitives. Bulk uploads gated for IP protection; nothing else locked behind a tier.
Three revenue streams, compounding. Flat take on pack sales, flat take on monthly creator support, and Penmark Premium ($5/month or $50/year) for unlimited hints and no ads. Free remains genuinely playable — hint caps and light ads only, calibrated as soft conversion levers toward Premium. The embed stays free and Penmark-branded; no paid “remove branding” tiers.
Sudoku is the beachhead. Multi-genre is the long-term moat.
The biggest risk is creator cold start. The highest-leverage early action is hand-recruiting and personally onboarding the first ten setters. The marketplace conversation comes later, on a warmer base. CTC building their own marketplace is a risk to hedge against; stay CTC-friendly.
Solo and bootstrapped means saying no to almost every feature, talking to setters and streamer-solvers constantly, and shipping one wedge at a time without skipping ahead.
Background
Why Penmark exists and what came before. Read these before the technical chapters if you want context; skip them if you already know the puzzle-engine landscape.
Motivation
Why Penmark exists, what gap it’s filling, and what it unlocks once it does. To be fleshed out.
Why one engine
The shape of the existing ecosystem — Tatham’s collection, PuzzleKit, per-puzzle solvers, SudokuPad, pzv.jp, F-Puzzles — and what none of them cover end-to-end.
Why now
What changed: Rust + WASM let one codebase serve CLI, desktop, and browser without rewrites; modern CSP techniques handle this catalog cleanly; large public puzzle datasets exist to mine.
What this enables
The tools, datasets, and feedback loops Penmark makes possible once the core engine is real — authoring, generation, grading, discovery, cross-genre infrastructure.
Non-goals
What’s intentionally out of scope, so scope creep has somewhere to be pushed back against.
Prior art
Penmark sits at an intersection of two ecosystems, and we want to be honest about both — the people doing the same engineering work, and the people serving the same users.
On the engine side (libraries that solve / grade / generate logic puzzles):
- Simon Tatham’s Portable Puzzle Collection — https://www.chiark.greenend.org.uk/~sgtatham/puzzles/ — the long-running C library + GUI for ~40 logic puzzles, with per-puzzle generators and solvers. The closest peer in scope.
- PuzzleKit — https://github.com/SmilingWayne/PuzzleKit — Python toolkit for Nikoli-style puzzles; we use the dataset at https://github.com/SmilingWayne/puzzlekit-dataset for benchmarking and corpus seeding.
- Per-puzzle solvers — every Sudoku solver since the 90s, Slitherlink solvers in the SAT-encoding tradition, MiniSat / Glucose / Z3 used directly. The “I wrote a solver for X” universe.
On the player / editor side (where Penmark’s WASM embed and ordo’s archive land):
- SudokuPad — https://sudokupad.app — best-in-class Sudoku player + variant authoring; the bar for in-browser puzzle UX.
- pzv.jp / Puzz.link — http://pzv.jp/ — canonical multi-puzzle player with the URL-encoded share format that everyone consumes.
- F-Puzzles — https://f-puzzles.com/ — Sudoku-variant editor + player popular with constructors.
Penmark’s ambition is to be one library covering both halves: the engine work that Tatham, PuzzleKit, and copris-puzzle do, plus the embeddable player + editor that SudokuPad / pzv.jp / F-Puzzles ship. One core crate, one rule vocabulary, three frontends. None of the above does all of it; whether we end up doing all of it is what implementation will decide.
Puzzle families
Different puzzle genres need fundamentally different algorithms,
not just different rules. Penmark today is a constraint-satisfaction
engine — built around Puzzle = (Substrate, Vec<Constraint>, &'static Genre)
and a Game<'p> whose state is per-coord candidate sets. That covers
one large family natively. Several other families share substantial
plumbing (substrate, geometry, parsing, dataset infra, frontends)
but need different solvers, graders, and state models entirely. A
few are different enough that hosting them in penmark would be a
different project.
This chapter maps the landscape so we can be specific about what penmark covers, what it could cover with focused work, and what’s out of scope.
How this chapter classifies
Genres are grouped by what algorithm solves them, since that’s the
axis that decides reuse. Rules and aesthetics matter for play; they
don’t determine whether penmark’s Propagator and FastSolver
apply.
For each family, this chapter lists:
- What — examples and the canonical solver shape.
- Reusable from penmark today — concrete subsystems that transfer without modification.
- Missing — subsystems that would need to be added.
The same vocabulary recurs across families. Reusable pieces are
almost always the static layer: Substrate + Coord + Region +
Mark + PuzzleMeta + tagged envelope + parsing dispatch +
dataset infra + bench infra + the eframe paint pipeline + the
Tier / GradeStatus taxonomy. What varies is the rule layer
and the solver track sitting on top.
1. Constraint satisfaction
Penmark’s home family.
- Examples: Sudoku, Akari, Slitherlink, Nurikabe, Heyawake, Star Battle, Kakuro, Masyu, Hitori, KenKen, Yajilin, Tents, Skyscrapers, Nonograms, Numberlink, Battleships, Einstein/zebra logic puzzles, cryptarithmetic, futoshiki.
- Solver shape: backtracking DFS + arc-consistency propagation, or SAT / CP-SAT for harder instances. Domains are small (2–9 marks per coord); the propagator narrows them via per-rule reasoning until every coord is singleton.
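As a deliberately tiny illustration of that solver shape — not Penmark's actual `Propagator`/`FastSolver`, just the skeleton they share with every CSP solver — here is naked-single propagation over all-different groups with DFS branching, domains as `u32` bitmasks:

```rust
// Illustrative sketch only — not Penmark code. Domains are u32 bitmasks
// over marks; `groups` are all-different regions (e.g. Sudoku rows).
// Propagation eliminates naked singles to a fixpoint; the solver then
// branches DFS-style on the smallest open domain.

fn propagate(domains: &mut [u32], groups: &[Vec<usize>]) -> bool {
    loop {
        let mut changed = false;
        for g in groups {
            for &i in g {
                let d = domains[i];
                if d == 0 {
                    return false; // empty domain: contradiction
                }
                if d.count_ones() == 1 {
                    // naked single: remove its value from the rest of the group
                    for &j in g {
                        if j != i && domains[j] & d != 0 {
                            domains[j] &= !d;
                            changed = true;
                        }
                    }
                }
            }
        }
        if !changed {
            return true; // fixpoint reached, no contradiction
        }
    }
}

fn solve(mut domains: Vec<u32>, groups: &[Vec<usize>]) -> Option<Vec<u32>> {
    if !propagate(&mut domains, groups) {
        return None;
    }
    // pick the smallest non-singleton domain to branch on
    let branch = domains
        .iter()
        .enumerate()
        .filter(|&(_, &d)| d.count_ones() > 1)
        .min_by_key(|&(_, &d)| d.count_ones());
    let (i, d) = match branch {
        None => return Some(domains), // every coord singleton: solved
        Some((i, &d)) => (i, d),
    };
    let mut rest = d;
    while rest != 0 {
        let bit = rest & rest.wrapping_neg(); // lowest candidate value
        rest &= rest - 1;
        let mut trial = domains.clone();
        trial[i] = bit;
        if let Some(solution) = solve(trial, groups) {
            return Some(solution);
        }
    }
    None
}
```

On a 2×2 Latin square with one given, propagation alone reaches the unique solution; larger instances exercise the branching. The real engine replaces the naked-single rule with per-rule reasoning and adds rollback instead of cloning.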
Reusable from penmark today: everything. Akari is wired
end-to-end; Sudoku and Slitherlink are partway. Adding a new genre
is a Genre impl plus the rule list its build_constraints emits.
See the Adding a genre chapter.
Missing: more Rule variants for genres that aren’t yet
expressible — for instance, Hashi’s “edge multiplicity 0/1/2”
needs a small extension to DegreeIn, and Yajilin’s
arrow-with-count clue needs a directional region shape. The genre
expansion plan tracks which rule kinds each upcoming genre
promotes from stub to real.
2. State-space planning
Sequential games. State changes with every move; a “solution” is a plan, not an assignment.
- Examples: Rush Hour, Sokoban, Klotski, the 15-puzzle and other n-puzzles, Tower of Hanoi, Lights Out, peg solitaire, Atomix, Lunar Lockout, Bloxorz, Rubik’s cube.
- Solver shape: BFS for shortest-plan (when state space is small), IDA* with a heuristic and pattern databases for larger ones, parent-pointer reconstruction for the move sequence. State is typically a packed bitboard or a small tuple, hashed into a visited-set.
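The BFS half of that shape fits in a few lines. This is a hypothetical generic planner in the spirit of the `BfsPlanner<P>` the Missing list imagines — nothing here is Penmark code:

```rust
use std::collections::{HashMap, VecDeque};
use std::hash::Hash;

// Hypothetical generic BFS planner sketch: shortest plan via
// parent-pointer reconstruction over a hashed visited set.
fn bfs_plan<S, M, FM, FA, FG>(start: S, moves: FM, apply: FA, is_goal: FG) -> Option<Vec<M>>
where
    S: Clone + Eq + Hash,
    M: Clone,
    FM: Fn(&S) -> Vec<M>,
    FA: Fn(&S, &M) -> S,
    FG: Fn(&S) -> bool,
{
    // visited set doubles as the parent map: state -> (predecessor, move)
    let mut parent: HashMap<S, Option<(S, M)>> = HashMap::new();
    parent.insert(start.clone(), None);
    let mut queue = VecDeque::from([start]);
    while let Some(state) = queue.pop_front() {
        if is_goal(&state) {
            // walk parent pointers back to the start
            let mut plan = Vec::new();
            let mut cur = state;
            while let Some(Some((prev, mv))) = parent.get(&cur).cloned() {
                plan.push(mv);
                cur = prev;
            }
            plan.reverse();
            return Some(plan);
        }
        for mv in moves(&state) {
            let next = apply(&state, &mv);
            if !parent.contains_key(&next) {
                parent.insert(next.clone(), Some((state.clone(), mv)));
                queue.push_back(next);
            }
        }
    }
    None
}
```

IDA* swaps the queue for iterative-deepening DFS plus a heuristic; the parent-pointer reconstruction and the canonicalized visited keys carry over.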
Reusable from penmark today:
- `Substrate` + `Material` for the static board (walls, exits, special cells).
- `Coord` + `Region` + grid geometry helpers for legality checks (“which cells does this car occupy after sliding”).
- Tagged-JSON envelope, content-hash, `PuzzleMeta`.
- The half of `Genre` that describes “what does the level look like”: `NAME`, `LAYERS`, `MARK_RENDER`, `default_substrate`, `parse_text`, `to_janko_text`, dataset / parsing dispatch.
- `Tier` / `GradeStatus` for difficulty bucketing.
- CLI scaffolding, dataset loaders, eframe grid paint pipeline.
Missing:
- A planning sister-trait (call it `PlanningRules`) parallel to the CSP-rules half of `Genre`, with associated `State`, `Move`, `legal_moves`, `apply`, `unapply`, `is_goal`, optional `heuristic`, and a `canonicalize` for visited-set keys.
- Generic `BfsPlanner<P>` and `IdaStarPlanner<P>` returning `Plan = Vec<P::Move>`.
- A planning-shaped grader (min-plan-length → `Tier`).
- A planning-shaped generator (sample layout, run BFS, accept if min-length is in target band).
- An eframe view for stepping through / animating a plan.
The apply/unapply discipline is conceptually the same as
FastGame’s trial-1 rollback, but the trait surface differs
enough that sharing one impl isn’t worth it — better to keep CSP
and planning as parallel solver families.
See the Roadmap for whether this is planned.
3. Exact cover and packing
Find a subset of placements that covers every cell exactly once.
- Examples: pentominoes, polyomino tilings, tangram, Soma cube, Eternity II, jigsaws, peg-grid placement puzzles, Sudoku itself when reformulated as exact cover.
- Solver shape: Knuth’s Algorithm X with Dancing Links (DLX). Expressible as CSP but DLX dominates by an order of magnitude on this shape, because the “choose one constraint to satisfy next, branch over the rows that satisfy it” loop is unusually well-matched to the data structure.
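The dancing-links structure is what makes Knuth’s version fast, but the branching discipline itself is small enough to sketch with plain sets (illustrative only — a real DLX track would use the linked structure for O(1) undo):

```rust
use std::collections::HashSet;

// Set-based Algorithm X sketch. `rows[i]` lists the columns (constraints)
// that placement i covers; `cols` is the set of still-uncovered columns.
// Returns true when `partial` holds the row indices of an exact cover.
fn algorithm_x(rows: &[Vec<usize>], cols: &HashSet<usize>, partial: &mut Vec<usize>) -> bool {
    // choose the uncovered column with the fewest candidate rows
    let col = match cols.iter().min_by_key(|&&c| {
        rows.iter().filter(|r| r.contains(&c)).count()
    }) {
        None => return true, // every column covered: solution in `partial`
        Some(&c) => c,
    };
    for (i, row) in rows.iter().enumerate() {
        if !row.contains(&col) {
            continue;
        }
        // a usable row may only touch still-uncovered columns
        if row.iter().any(|c| !cols.contains(c)) {
            continue;
        }
        let remaining: HashSet<usize> =
            cols.iter().copied().filter(|c| !row.contains(c)).collect();
        partial.push(i);
        if algorithm_x(rows, &remaining, partial) {
            return true;
        }
        partial.pop(); // backtrack
    }
    false
}
```

Run against Knuth’s classic seven-column example, it recovers the unique cover; the “fewest candidate rows first” heuristic is the same one DLX applies over its column headers.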
Reusable from penmark today:
- Substrate, region, coord, geometry — exactly the same vocabulary for cells / edges / vertices.
- `Mark` works for piece-id labelling.
- Parsing, dataset infra, eframe paint, CLI scaffolding.
- Generation patterns (`cancel: AtomicBool`, time budgets, seeds).
Missing:
- A DLX solver. Penmark’s `Propagator` can technically encode exact cover via `ExactCount { mark, n: 1 }` per cell + per-piece `ExactCount`, but it’s slow compared to native DLX.
- A piece-placement vocabulary in `Region` (arbitrary shape + rotation / reflection orbit).
- A grader axis for packing puzzles — uniqueness of solution is the usual target, but “how forced is each placement” is the analogue of human technique grading.
This is a real gap. Pentominoes-shaped puzzles are popular and penmark’s current family doesn’t serve them well; a DLX track alongside the CSP track is plausible future work.
4. Word and dictionary puzzles
Constraint satisfaction with a lexical domain.
- Examples: crosswords (American, cryptic, themeless), word ladders, Scrabble (the legal-move enumerator), Boggle, fill-in puzzles, anagram puzzles.
- Solver shape: still CSP — slot variables, intersection constraints — but the domain is a 100k–500k-entry dictionary, not a small mark set. Propagation is bitset-AND over per-(length, position, letter) precomputed filter masks; branching is at the cell level rather than the slot level to deduplicate redundant propagation across candidate words sharing a letter at the crossing. See Orca’s writeup for the canonical modern implementation.
- For playable crosswords (find the right fill given clue scores), the problem becomes MAP inference rather than satisfaction — loopy belief propagation over the slot graph with per-slot answer distributions from a neural clue→answer scorer (Berkeley Crossword Solver, Dr. Fill). That’s a different solver again.
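The precomputed filter index described above can be sketched at toy scale — single-`u64` bitsets, so at most 64 words; a real wordlist needs heap-allocated multi-word bitsets per slot. The struct and method names here are hypothetical:

```rust
use std::collections::HashMap;

// Toy sketch of the (length, position, letter) -> bitset filter index.
// Words of the same length share one bit-space; candidates for a partly
// filled slot are the AND of the fixed-letter masks.
struct FilterIndex {
    words: Vec<&'static str>,
    masks: HashMap<(usize, usize, u8), u64>, // (len, pos, letter) -> bitset
}

impl FilterIndex {
    fn new(words: Vec<&'static str>) -> Self {
        let mut masks = HashMap::new();
        for (i, w) in words.iter().enumerate() {
            for (pos, b) in w.bytes().enumerate() {
                *masks.entry((w.len(), pos, b)).or_insert(0u64) |= 1 << i;
            }
        }
        FilterIndex { words, masks }
    }

    /// Candidates matching a pattern like "c.t" ('.' = open cell).
    fn candidates(&self, pattern: &str) -> Vec<&'static str> {
        let len = pattern.len();
        let mut bits = u64::MAX;
        for (pos, b) in pattern.bytes().enumerate() {
            if b != b'.' {
                // one AND per fixed letter — this is the whole propagation step
                bits &= self.masks.get(&(len, pos, b)).copied().unwrap_or(0);
            }
        }
        // restrict to words of the right length
        let mut len_bits = 0u64;
        for (i, w) in self.words.iter().enumerate() {
            if w.len() == len {
                len_bits |= 1 << i;
            }
        }
        bits &= len_bits;
        (0..self.words.len())
            .filter(|&i| bits >> i & 1 == 1)
            .map(|i| self.words[i])
            .collect()
    }
}
```

Crossing-slot propagation is then repeated ANDs of these masks as letters get pinned — which is why the index, not the search, is where the speed lives.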
Reusable from penmark today:
- Substrate (grid with black squares), coord, region (a slot is a row/col slice of cells), parsing, eframe paint.
- The shape of `Propagator`: arc-consistency loop + per-arc cache + bitset domain reasoning. The architecture transfers; only the domain backing changes.
- Tagged-JSON, dataset infra, CLI scaffolding, the `Tier` taxonomy.
- `Constraint`’s shape is right (“variables crossing at this coord must agree on letter”); just a new `Rule` variant (`LetterAgreement` or similar) is needed.
Missing:
- A large-domain domain store. `FastGame` hardcodes `u32` per coord because mark domains are tiny; crosswords need heap-allocated bitsets sized to the wordlist (~52 KB per slot for a 411k-word dictionary).
- The `(length, position, letter) → bitset` filter index. This is the data structure that makes propagation fast.
- A variable model where slots, not coords, are the primary variables. Today penmark’s variables are coords. Crosswords need region-as-variable, with cell-level branching on top.
- A clue→answer scoring layer for the MAP-inference flavor — that’s a separate concern (NLP), and grid-fillers like Orca skip it entirely.
A crossword genre is genuinely possible inside penmark, but the propagator-internal work is non-trivial.
5. Path and graph search
Find a path or sequence of moves through a graph.
- Examples: mazes, Hashiwokakero (Hashi — bridges between islands), river-crossing puzzles (missionaries + cannibals), the classic shortest-path puzzles. Hashi is partly CSP (constraint: bridge counts at each island) and partly graph (the bridges must form a connected planar embedding).
- Solver shape: graph search (Dijkstra, A*) for the pure-pathfinding cases; CSP with topological constraints for Hashi and friends. Mazes are a degenerate planning family with single-cell moves.
Reusable from penmark today:
- Substrate + coord + region: the graph is implicit in the geometry.
- For Hashi: `Constraint` mostly works (bridge-count is `ExactCount`); `Connected { mark }` already exists for the global “all islands reachable” check.
- For mazes: a pure-pathfinding solver doesn’t really need penmark’s machinery, but the parsing / rendering / dataset infra still lands cleanly.
Missing:
- For Hashi specifically: an edge-multiplicity rule (0/1/2 bridges per slot) and a planarity / non-crossing constraint. The latter is awkward in pure CSP; today’s `Propagator` would handle it via lazy cuts at the leaf, similar to the `Path` rule.
- For mazes: not really anything — they’re so simple that a 50-line BFS is the whole solver. The question is whether penmark adds value over just embedding mazes as a degenerate planning genre.
6. Hidden information and probabilistic puzzles
CSP under uncertainty.
- Examples: Minesweeper, Battleship (solving), Mastermind, Hangman, Wordle (the strategic-guesser variant). The solver enumerates hidden states consistent with the observations made so far, then picks the maximum-information probe.
- Solver shape: CSP for consistency checking + expected-information-gain or expected-regret reasoning over the model space on top. Often Monte Carlo if exact enumeration is infeasible.
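The “maximum-information probe” step reduces to scoring each candidate probe by the Shannon entropy of its outcome distribution over the currently-consistent hidden states. A generic sketch (hypothetical helper, not a Penmark API — `outcome` stands in for whatever observation the game returns):

```rust
use std::collections::HashMap;

// Score each probe by the entropy of its outcome distribution over the
// consistent hidden states; higher entropy = more expected information.
fn best_probe<S, P: Clone, O: Eq + std::hash::Hash>(
    states: &[S],
    probes: &[P],
    outcome: impl Fn(&S, &P) -> O,
) -> Option<P> {
    let score = |p: &P| -> f64 {
        // count how the consistent states partition by observed outcome
        let mut counts: HashMap<O, usize> = HashMap::new();
        for s in states {
            *counts.entry(outcome(s, p)).or_insert(0) += 1;
        }
        let n = states.len() as f64;
        counts
            .values()
            .map(|&c| {
                let q = c as f64 / n;
                -q * q.log2() // Shannon entropy term
            })
            .sum()
    };
    probes
        .iter()
        .cloned()
        .max_by(|a, b| score(a).partial_cmp(&score(b)).unwrap())
}
```

For a hidden value in `{0..3}` and threshold probes “is it < k?”, the even split (k = 2) wins — one bit of information versus ~0.81 for the lopsided splits. Exact enumeration of `states` is what gets replaced by Monte Carlo sampling when the model space is large.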
Reusable from penmark today:
- The CSP layer for consistency checking — very directly. Each observation narrows the candidate-state set; that’s exactly what `Propagator` does.
- Substrate, region, mark vocabulary, parsing, frontends.
Missing:
- A probability layer. `Game::status()` is `InProgress / Solved / Contradicted` — a binary verdict. Probabilistic puzzles need per-coord probability distributions and an expected-value computation over the consistent-models space.
- A “next probe” selector that scores candidate moves by information gain, not just legality.
- A grader that distinguishes “always solvable with logic” from “requires guessing” — this is the famous Minesweeper grader question and is genuinely hard.
A Minesweeper genre that just shows the deduction is feasible today; a Minesweeper player with optimal-guess support would need real new machinery.
7. Adversarial two-player games
- Examples: Chess, Go, Checkers, Othello, Hex, Connect 4, Nim (when played, not when computed via Sprague–Grundy), TwixT.
- Solver shape: minimax + alpha-beta, MCTS, neural evaluation. State model is similar to planning but the objective is game-theoretic value (best move under optimal opponent), not goal-reaching.
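To make “game-theoretic value” concrete in its smallest form, here is plain negamax on one-pile Nim (take 1–3 stones, last move wins) — no alpha-beta, no transposition table, and certainly not Penmark code:

```rust
// Negamax value for the player to move: +1 = forced win, -1 = forced loss.
// One-pile Nim, take 1..=3 stones per move, taking the last stone wins.
fn negamax(pile: u32) -> i32 {
    if pile == 0 {
        return -1; // no move available: the previous player took the last stone
    }
    (1..=3.min(pile))
        .map(|take| -negamax(pile - take)) // opponent's value, negated
        .max()
        .unwrap()
}
```

The classic result falls out: positions with `pile % 4 == 0` are losses for the player to move. Everything a real engine adds — pruning, move ordering, evaluation at a depth cutoff — layers onto this recursion, which is why the chapter calls the family a sister project rather than an extension of the propagator.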
Reusable from penmark today:
- Substrate / coord / region / parsing / eframe / dataset / CLI scaffolding. Same level-description machinery.
- The tagged-JSON envelope.
Missing:
- Essentially the entire algorithmic layer: search trees with two alternating players, evaluation functions, transposition tables, iterative deepening, opening books, endgame tablebases. Almost no overlap with CSP propagation.
- A player-identity layer in the game model.
- For neural-eval games: a totally separate ML stack.
This is realistically a sister project, not an extension. Penmark could host the board and the move format for an adversarial game, but the engine work is far enough afield that mashing it in would dilute the rest of the codebase.
8. Synthesis and programming-game puzzles
Build a program meeting a spec.
- Examples: SpaceChem, Opus Magnum, Shenzhen IO, TIS-100, Manufactoria, Human Resource Machine, Lightbot, the various Zachtronics catalog. Often optimized along multiple axes (cycles, area, ops).
- Solver shape: program synthesis — superoptimization, SMT-guided search, RL, neural code generation. The “puzzle” is an input/output spec; the “solution” is a program meeting it.
Reusable from penmark today: very little. Maybe the CLI
scaffolding and PuzzleMeta.
Missing: everything else. Different state model, different search techniques, different evaluation criteria, different authoring tools, different visualization needs.
Out of scope.
9. Combinatorial game theory
- Examples: Nim and its many variants, Wythoff, Hackenbush, Domineering, Toads-and-Frogs.
- Solver shape: Sprague–Grundy values for impartial games; surreal-number arithmetic for partizan games. Largely closed-form rather than search-based for the classical instances.
Reusable from penmark today: nothing meaningful.
Missing: the entire mathematical apparatus.
Niche, mathematically beautiful, and disjoint from everything else in this catalog. Out of scope.
Summary table
| Family | Penmark fit | What’s reusable | What’s missing |
|---|---|---|---|
| 1. Constraint satisfaction | Native | Everything | Per-genre rule variants |
| 2. State-space planning | Adjacent | Substrate / level half of Genre / Tier | Planning trait + BFS/IDA* + plan-length grader + plan generator |
| 3. Exact cover / packing | Adjacent | Substrate, region, parsing, frontends | DLX solver, piece-orbit region kind |
| 4. Word and dictionary | Adjacent | Substrate, propagator architecture, frontends | Large-domain store, filter index, slot-as-variable model, optional clue scorer |
| 5. Path / graph | Mixed | Substrate, Connected rule | Edge-multiplicity, planarity (Hashi); maze BFS overlap is small |
| 6. Hidden / probabilistic | Adjacent | CSP layer for consistency check | Probability layer, info-gain selector, guessing-aware grader |
| 7. Adversarial | Sister project | Level / paint / parsing | Minimax / MCTS / eval / player model |
| 8. Synthesis | Out of scope | CLI scaffolding | Everything else |
| 9. Combinatorial game theory | Out of scope | Nothing | Everything else |
“Native” means penmark hosts it today. “Adjacent” means a real but contained body of new work — a sibling solver track or a new domain store, not a rewrite. “Sister project” means the level layer transfers but the engine doesn’t. “Out of scope” means penmark is the wrong home.
The recurring lesson: penmark’s level-description machinery (substrate, coord, region, parsing, dataset, paint, CLI) is genre-agnostic and pays for itself across every family that has a board. The rule and solver layers are family-specific. The clean extension story is to keep the level layer as the shared core and add sister rule-traits and sister solver families as new puzzle classes land — not to force every problem through the CSP propagator.
Core concepts
The conceptual model Penmark works in — the substrate / domain / mark / region / constraint vocabulary, the genre / puzzle / game decomposition, and the four actions (solving, grading, generating, enumerating) defined as traits. The rest of the book assumes these terms.
Library structure
One core crate, two shipped frontends, one planned:
- CLI — `penmark <verb> <genre>` for every operation: solve, generate, scaffold, grade, enumerate, profile-eval/init, import, list-impls. The actual verb set lives in the CLI chapter.
- eframe desktop app — local GUI for the same operations: build a puzzle, watch it solve, browse the dataset, browse the db.
- WASM (not built yet) — eventual wasm-bindgen surface so the website at `ordo/edit` can call the same engine in-browser. Solving-grading + library introspection only; DB and enumeration stay CLI-side. The wire format on both sides is the tagged envelope described in the Tagged serialization chapter.
CLI and eframe call into the same core directly via Rust types — see the Dispatching chapter for the routing approach.
Grids and Coords
Almost every Nikoli + Sudoku-variant puzzle lives on a square grid; the only outlier worth treating specially is hex. So the standard layout is a square cell grid with three companion sub-grids over the same coordinate system:
| Layer | Dimensions | Used by |
|---|---|---|
| Cells | N × M | Sudoku, Akari, Nurikabe, Heyawake, Hitori, Masyu |
| Edges | (N+1) × M horizontals + N × (M+1) verticals | Slitherlink, fence rules |
| Corners | (N+1) × (M+1) | rare; Masyu pearls have corner-flavored constraints |
A single puzzle can host constraints across any mixture of sub-grids. Most genres use one; Slitherlink uses cells (for clues) and both edge grids (for the loop). The four sub-grids and a universal coordinate type are library-defined:
/// Reference enum for *talking about* layers — used by
/// `Region::AllOf(Layer)`, `Coord::layer()`, `Substrate::grid(layer)`.
pub enum Layer {
/// `N × M` cell grid. Default for almost every puzzle.
Cells,
/// `(N+1) × M` horizontal edges (between vertically-adjacent cells).
EdgesH,
/// `N × (M+1)` vertical edges (between horizontally-adjacent cells).
EdgesV,
/// `(N+1) × (M+1)` corners.
Corners,
}
/// A position in any sub-grid. The library's universal coordinate
/// type — every Nikoli/Sudoku-variant genre shares it. Hex puzzles
/// and other non-square layouts live behind a separate `HexPuzzle`
/// trait family with their own coord type.
pub enum Coord {
Cell { r: usize, c: usize },
EdgeH { r: usize, c: usize },
EdgeV { r: usize, c: usize },
Corner { r: usize, c: usize },
}
Layer is the discriminator that constraints and regions use to say
which sub-grid they’re talking about; Coord is the value-level
position they resolve to. A Slitherlink puzzle can carry cell clues
alongside edge state without ambiguity because every constraint
ultimately bottoms out at Coord values.
Coordinates. Internal: (row, col) zero-indexed from the top-left, so (0, 0) is the upper-left cell — matches array indexing and how every renderer naturally walks.
External (URLs, shared puzzle strings, REPL output) is layer-prefixed
and chess-up-numbered: <layer>:<col><row>, where the column is a
bijective base-26 letter sequence (a, b, …, z, aa, ab, …, az, ba,
…) and the row is a 1-indexed integer counted from the bottom up (so
cell:a1 is bottom-left, matching chess and Sudoku convention). The
bijective base-26 scheme handles arbitrarily wide boards — a
30-column Akari grid uses a..z then aa..ad; the chess 8-column
ceiling does not apply.
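The bijective base-26 scheme described above can be sketched as a pair of conversions (illustrative, not Penmark's parser — function names are hypothetical):

```rust
// Bijective base-26 column letters: 0 -> "a", 25 -> "z", 26 -> "aa",
// 27 -> "ab", ... There is no zero digit, hence the `n / 26 - 1` step.
fn col_letters(mut n: usize) -> String {
    let mut out = Vec::new();
    loop {
        out.push(b'a' + (n % 26) as u8);
        if n < 26 {
            break;
        }
        n = n / 26 - 1; // shift out the digit just emitted
    }
    out.reverse();
    String::from_utf8(out).unwrap()
}

// Inverse: "a" -> 0, "z" -> 25, "aa" -> 26, "ad" -> 29.
fn col_index(s: &str) -> usize {
    s.bytes().fold(0, |acc, b| acc * 26 + (b - b'a') as usize + 1) - 1
}
```

So the 30th column of a wide Akari grid is `ad` (index 29), as the text says, and the scheme round-trips for arbitrarily wide boards.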
Each layer has its own grid dimensions and is addressed within that layer’s own coordinate space:
| Prefix | Layer | Grid | Example |
|---|---|---|---|
| `cell:` | Cells | N × M | `cell:a1` = bottom-left cell |
| `edgeh:` | EdgesH | (N+1) × M | `edgeh:a1` = bottommost horizontal edge of column a |
| `edgev:` | EdgesV | N × (M+1) | `edgev:a1` = leftmost vertical edge of row 1 |
| `corner:` | Corners | (N+1) × (M+1) | `corner:a1` = bottom-left corner |
The cell: prefix may be omitted (cells are the default layer), so
a1 and cell:a1 parse identically. The other prefixes are required.
A human-friendlier alternate form references neighboring cells —
edgeh(a1,a2) for the horizontal edge between cells a1 and a2,
edgev(a1,b1) for the vertical edge between a1 and b1,
corner(a1,b1,a2,b2) for the corner shared by those four cells.
Both forms parse to the same Coord; the canonical
<layer>:<col><row> is what the wire format uses, and the alternate
form is sugar for hand-typed input.
Grid storage
Layer contents — substrate Material markings, per-coord candidate
bitmasks, render glyph caches — all share one storage type: a flat
row-major Grid<T>, in core::grid.
pub struct Grid<T> {
rows: usize,
cols: usize,
data: Vec<T>,
}
impl<T: Default + Copy> Grid<T> {
pub fn new(rows: usize, cols: usize) -> Self; // filled with default
pub fn get(&self, r: usize, c: usize) -> T;
pub fn set(&mut self, r: usize, c: usize, v: T);
pub fn iter(&self) -> impl Iterator<Item = ((usize, usize), T)>;
pub fn data(&self) -> &[T];
}
Flat storage gives O(1) index, no hash overhead. The Default + Copy bound covers every layer payload the catalog reaches for —
Material and Mark are both small Copy enums that fit it
trivially.
Substrates
A substrate records what positions exist in a puzzle and
what’s at them. One struct per puzzle, four optional layer grids
inside, one Material per position.
// core::substrate
/// How a puzzle's grid is sized. Most genres use `Rect`. Sudoku
/// uses `Boxed` so the substrate carries the box partition that
/// constraint emission needs (per-box rectangles).
pub enum Dimensions {
/// `rows × cols` rectangle. Akari, slither, most genres.
Rect { rows: usize, cols: usize },
/// Sudoku-style: side = `box_w × box_h`, partitioned into
/// `box_h` rows × `box_w` cols of boxes.
Boxed { box_w: u8, box_h: u8 },
}
impl Dimensions {
pub fn rows(&self) -> usize { /* … */ }
pub fn cols(&self) -> usize { /* … */ }
}
/// The structural shape of a puzzle. Every `Puzzle` carries one
/// reachable via `substrate()`. Direct fields per layer; the
/// populated ones describe which positions exist and what's at each.
pub struct Substrate {
pub dims: Dimensions,
/// `None` = the genre doesn't use this layer. `Some(grid)` =
/// the layer is in use; grid contents give per-position Material.
pub cells: Option<Grid<Material>>, // rows × cols
pub edges_h: Option<Grid<Material>>, // (rows + 1) × cols
pub edges_v: Option<Grid<Material>>, // rows × (cols + 1)
pub corners: Option<Grid<Material>>, // (rows + 1) × (cols + 1)
}
/// What's *at* a position — the substance, structurally.
#[repr(u8)]
pub enum Material {
/// Decision-target position. The default; what `Grid::new` fills with.
#[default]
Floor,
/// Excluded; blocks LOS rays. Akari walls, Hitori shaded givens.
Wall,
/// Excluded; doesn't block LOS. Akari holes, Samurai padded gaps.
Hole,
/// Excluded; exists for rendering only. Slitherlink clue cells,
/// Killer-cage corner labels, Kropki edge dots, Picross hint
/// rows/cols outside the play grid.
LabelOnly,
}
Material is the structural classification of a position —
fixed when the substrate is built, immutable during play. Four
variants, each with a different effect on what the engine and
renderer do at that coord:
- `Floor` is a decision target: the player or solver will commit a mark here. It’s the default value of `Material` (and what `Grid::new` fills with), so a fresh grid is fully playable. The vast majority of positions in most genres are `Floor`.
- `Wall` is excluded and blocks LOS rays: not a decision target (no mark ever lives here), and rays walked from neighboring cells stop at the wall position. Akari walls, Hitori shaded givens. The renderer paints them as solid blockers.
- `Hole` is excluded but transparent to LOS: also not a decision target, but rays pass straight through. This is the mechanism by which non-rectangular puzzles fit a flat R × C bounding box — Samurai gap cells, Iraka holes; see below.
- `LabelOnly` is excluded, transparent to LOS, and exists for rendering: the position is structurally there so the renderer can paint a label / glyph / decoration anchored at it, but the engine never allocates decision state for it. Slitherlink clue cells, Killer-cage corner sums, Kropki edge dots, Picross hint rows/cols. The label value lives in puzzle-level data (e.g. `cell_clue()`, variant payloads); `Material::LabelOnly` is just the structural marker.
Material answers “what’s structurally here?”, not “what value lives
here?”. A Floor cell can hold any mark in its domain; Wall,
Hole, and LabelOnly cells can’t hold any mark at all.
Edges H and V stay separate fields because their dimensions differ
and their geometric meaning is orientation-specific (Coord::EdgeH
vs Coord::EdgeV); renderer and constraint code dispatches on
orientation.
Substrate is uniformly typed across genres — every puzzle
holds one, accessed the same way. What varies per genre is which
fields are populated and what the contents are. Walls and holes
shape what coords exist before any rule fires; the substrate is
structurally prior.
Per-coord queries live as inherent methods:
impl Substrate {
pub fn rows(&self) -> usize; // forwards to dims.rows()
pub fn cols(&self) -> usize; // forwards to dims.cols()
/// Direct lookup by Layer reference. `None` if the genre
/// doesn't populate this layer.
pub fn grid(&self, layer: Layer) -> Option<&Grid<Material>>;
/// `Material` at `coord`; `Floor` (defensive default) for
/// unmapped layers.
pub fn material(&self, coord: Coord) -> Material;
pub fn set_material(&mut self, coord: Coord, m: Material);
/// Whether the cell at raw `(r, c)` blocks LOS rays
/// (`Material::Wall`).
pub fn blocks_los(&self, r: usize, c: usize) -> bool;
/// Every `Floor` position across every populated layer — the
/// decision-target set.
pub fn coords(&self) -> impl Iterator<Item = Coord> + '_;
/// Every populated-layer position regardless of material —
/// renderer-facing.
pub fn positions(&self) -> impl Iterator<Item = Coord> + '_;
/// Iterate `(Layer, &Grid)` for every populated layer.
pub fn layers(&self) -> impl Iterator<Item = (Layer, &Grid<Material>)> + '_;
// ── Material-keyed iterators ──
/// Iterate every cell whose Material equals `target`. Akari uses
/// this for "every floor cell" and "every hole cell" sweeps
/// during constraint emission.
pub fn cells_by_material(&self, target: Material)
-> impl Iterator<Item = Coord> + '_;
// ── Geometry helpers ──
pub fn edges_of_cell(&self, r: usize, c: usize) -> [Coord; 4];
pub fn edges_of_vertex(&self, r: usize, c: usize) -> Vec<Coord>;
pub fn vertices_of_cell(&self, r: usize, c: usize) -> [Coord; 4];
pub fn vertices_of_edge(&self, edge: Coord) -> [Coord; 2];
pub fn cells_of_edge(&self, edge: Coord) -> Vec<Coord>;
pub fn cells_of_vertex(&self, r: usize, c: usize) -> Vec<Coord>;
/// Iterate every vertex paired with its incident edges. Slither
/// uses this for the per-vertex `DegreeIn{0, 2}` constraint emit.
pub fn vertices_with_edges(&self)
-> impl Iterator<Item = (Coord, Vec<Coord>)> + '_;
/// Walk maximal `Floor` segments along `axis`. Walls flush the
/// current segment; non-floor materials other than `Wall` (Hole,
/// LabelOnly) are transparent. Akari uses this for the
/// per-row / per-col `AtMost(Bulb, 1)` segment constraints.
pub fn for_each_floor_segment(
&self,
axis: Axis,
on_segment: impl FnMut(Vec<Coord>),
);
}
pub enum Axis { Horizontal, Vertical }
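The segment-walk semantics from the `for_each_floor_segment` doc comment — walls flush, `Hole`/`LabelOnly` transparent — can be illustrated standalone over a single row (hypothetical free function, mirroring the documented behavior rather than the library code):

```rust
// Maximal Floor segments along one row: Wall flushes the current segment;
// Hole and LabelOnly are transparent (they don't break the segment but
// aren't part of it).
#[derive(Clone, Copy)]
enum Material {
    Floor,
    Wall,
    Hole,
    LabelOnly,
}

fn floor_segments(row: &[Material]) -> Vec<Vec<usize>> {
    let mut segments = Vec::new();
    let mut current = Vec::new();
    for (i, &m) in row.iter().enumerate() {
        match m {
            Material::Floor => current.push(i),
            Material::Wall => {
                if !current.is_empty() {
                    segments.push(std::mem::take(&mut current)); // flush
                }
            }
            Material::Hole | Material::LabelOnly => {} // transparent: skip
        }
    }
    if !current.is_empty() {
        segments.push(current);
    }
    segments
}
```

So `Floor Floor Wall Floor Hole Floor` yields two segments — `[0, 1]` and `[3, 5]` — which is exactly the shape Akari’s per-row `AtMost(Bulb, 1)` emission wants: one constraint per segment, holes bridged, walls respected.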
Per-genre allocations
Genres build their substrate via Genre::default_substrate(dims),
returning whatever shape is canonical for that genre:
// Akari 7×7 — every cell starts as Floor; constructor calls
// set_wall / set_hole / add_clue to install the puzzle's walls,
// holes, and clue values.
Substrate {
dims: Dimensions::Rect { rows: 7, cols: 7 },
cells: Some(Grid::filled(7, 7, Material::Floor)),
edges_h: None, edges_v: None, corners: None,
}
// Sudoku 9×9 — Boxed dims so build_constraints can read box_w/box_h.
Substrate {
dims: Dimensions::Boxed { box_w: 3, box_h: 3 },
cells: Some(Grid::filled(9, 9, Material::Floor)),
edges_h: None, edges_v: None, corners: None,
}
// Slitherlink 5×5 — cells (clue containers, all LabelOnly) + edges
// (every edge a decision target).
Substrate {
dims: Dimensions::Rect { rows: 5, cols: 5 },
cells: Some(Grid::filled(5, 5, Material::LabelOnly)),
edges_h: Some(Grid::filled(6, 5, Material::Floor)),
edges_v: Some(Grid::filled(5, 6, Material::Floor)),
corners: None,
}
// Samurai 21×21 — cells with 72 Hole positions for the gap regions.
Substrate {
dims: Dimensions::Rect { rows: 21, cols: 21 },
cells: Some(grid_with_padding_holes()),
edges_h: None, edges_v: None, corners: None,
}
The “all Floor” grids cost a byte per position — 81 bytes for sudoku, trivial. Dense storage is correct because Floor is the dominant state; the previous HashMap-of-walls representation was a sparseness optimization for a non-sparse situation.
Non-rectangular shapes via Material::Hole
Hole is the mechanism by which non-rectangular puzzles fit the
flat R × C model. Pad the bounding rectangle, mark the gaps as
Material::Hole, and the substrate machinery treats them
correctly: not in coords(), not painted as decision cells, LOS
rays pass straight through.
This handles every “multiple sub-grids in a bigger rectangle” topology in the puzz.link / canonical roster:
- Samurai Sudoku — five 9×9s arranged in an X, fitted into a 21×21 bounding box with 72 padded gap cells.
- Gattai-8 — eight 9×9s edge-to-edge, larger bounding box, gaps where rows or cols don’t line up.
- Shogun — a 5×5 of 9×9s (45×45 bounding), regular grid of gaps.
- Sumo — four 9×9s overlapping at a center 9×9.
In every case, the cells-appearing-in-multiple-regions pattern that
already powers Sudoku’s 27 standard regions just keeps working: the
overlap zones are coords that appear in two sub-grids’ constraint
regions (one row from sub-grid A, one row from sub-grid B), and the
propagator handles the shared-cell case naturally — no special
multi-grid coordinate machinery, no new Layout type.
What this doesn’t cover: genuinely non-rectangular topologies —
hex grids, polar grids, Möbius strips, toruses. Those need new
Layer variants (a hypothetical Layer::HexCells) and a different
geometric vocabulary in RegionShape. Until then, flat R × C plus
Hole covers every multi-grid puzzle on the current catalog
without architectural concessions.
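The hole-mask derivation for the Samurai layout follows directly from the five sub-grid origins. The sketch below is illustrative, not the library's API: `Material` is stubbed to two variants, and `samurai_materials` is a hypothetical stand-in for the `grid_with_padding_holes()` helper shown earlier. It just demonstrates the arithmetic behind the 72 padded gap cells.

```rust
// Stand-in for the substrate's Material (assumed shape, two variants only).
#[derive(Clone, Copy, PartialEq)]
enum Material { Floor, Hole }

// Build the 21×21 Samurai material grid: a cell covered by no 9×9
// sub-grid becomes a Hole; everything else is Floor.
fn samurai_materials() -> Vec<Vec<Material>> {
    // Top-left corners of the five 9×9 sub-grids in the X layout.
    let origins = [(0, 0), (0, 12), (12, 0), (12, 12), (6, 6)];
    let covered = |r: usize, c: usize| {
        origins
            .iter()
            .any(|&(or, oc)| r >= or && r < or + 9 && c >= oc && c < oc + 9)
    };
    (0..21)
        .map(|r| {
            (0..21)
                .map(|c| if covered(r, c) { Material::Floor } else { Material::Hole })
                .collect()
        })
        .collect()
}

fn main() {
    let grid = samurai_materials();
    let holes = grid.iter().flatten().filter(|&&m| m == Material::Hole).count();
    // 441 cells minus 369 unique sub-grid cells (5 × 81 − 4 × 9 overlaps).
    assert_eq!(holes, 72);
    println!("{}", holes);
}
```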
Future genres
Heyawake rooms, Hashi islands, Killer cages, Star Battle anchors —
all defer to the constraint list. A Heyawake room is a Region
referenced by every constraint targeting it; the room layout ships
as the region’s coord set. Hashi islands are Pin-style constraints
anchoring bridge endpoints; the island position is just a Floor
cell that happens to have constraints rooted at it.
Material::LabelOnly covers the recurring “structurally exists,
holds a label, never decides” pattern across Sudoku variants and
Picross-family genres:
- Killer Sudoku — corner positions in the Corners layer carry cage-sum labels. Material::LabelOnly at the cage’s anchor corner; the sum value lives in the variant payload.
- Kropki Sudoku — edge positions in EdgesH / EdgesV host white/black dots between adjacent cells. The edges layer is all LabelOnly (no edge marks); dot kind comes from the variant payload.
- Sandwich / Little Killer / Quadruple clues — extra rows or columns of LabelOnly corner/edge anchors outside the cell grid.
- Picross — the bounding box pads the cell grid with hint rows on top and hint columns on the left; those padding cells are LabelOnly, holding the hint sequences. The actual play grid is Floor cells in the bottom-right rectangle.
Extending Material further is reserved for genuinely new
structural categories (a hypothetical “portal” cell that warps
LOS, a “tinted” non-blocker that affects scoring) that no current
or near-future genre needs.
Domains and Marks
Marks are what the player puts on the board — a Bulb here,
the Sudoku digit 7 there, this edge Drawn. Each move either
commits a mark at a coord or eliminates one from the set still
possible there. The puzzle is solved when every coord’s set
has narrowed to a single mark and every rule still holds.
The set of marks still possible at a coord at a given moment is
its candidate set; the domain is the candidate set a fresh
game starts with — every mark the genre permits at that position.
This chapter covers both — the type of the values themselves
(Mark) and how a candidate set narrows during play.
What’s at a coord
Every Floor coord starts a game with a full domain. For Sudoku,
that’s 1..=9 per cell; for Akari, {Bulb, Empty} per floor cell;
for Slitherlink, {Drawn, NotDrawn} per edge. Walls and holes
have no domain — they aren’t decision targets.
Play narrows the candidate set. Each move eliminates a mark from a
coord, or commits to a single one, and the propagator cascades the
elimination through constraints: a Sudoku row’s Distinct rule
removes the same digit from the other eight cells the moment one
is committed; an Akari bulb removes Bulb from every cell its
rays reach.
A coord ends up in one of three states:
- Open — candidate set has more than one mark left. Play continues.
- Solved — candidate set narrowed to a singleton. The remaining mark is the answer at this coord.
- Contradicted — candidate set is empty. No legal mark; the engine reports a violation and the puzzle (or the current branch of the search) is unsolvable.
A puzzle is solved when every Floor coord is Solved and
every Goal constraint is Satisfied (covered in the next
chapter).
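The three-state classification is a one-liner once candidate sets are stored as bitmasks (one bit per still-possible mark, as the Mark section below suggests). A minimal sketch with assumed names:

```rust
// Hypothetical names; the real engine's types differ.
enum CoordState { Open, Solved, Contradicted }

// Classify a coord's candidate set, stored as a bitmask with one bit
// per still-possible mark.
fn classify(candidates: u16) -> CoordState {
    match candidates.count_ones() {
        0 => CoordState::Contradicted, // empty domain: no legal mark
        1 => CoordState::Solved,       // singleton: this coord is decided
        _ => CoordState::Open,         // play continues
    }
}

fn main() {
    assert!(matches!(classify(0b1_1111_1111), CoordState::Open)); // nine candidates
    assert!(matches!(classify(0b0_0001_0000), CoordState::Solved));
    assert!(matches!(classify(0), CoordState::Contradicted));
    println!("ok");
}
```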
Mark
Every decision in the catalog is either binary (Akari bulb / empty, Slitherlink drawn / not-drawn, Nurikabe shaded / unshaded) or N-state numeric (Sudoku digits, Skyscrapers, Hashiwokakero bridge counts).
pub enum Mark {
/// Binary mark — `false` / `true`. Per-puzzle prose labels
/// ("Bulb" / "Empty", "Drawn" / "NotDrawn") are render concerns,
/// not type concerns.
Binary(bool),
/// N-state numeric mark. The valid range is whatever the puzzle's
/// `full_domain(coord)` emits — there's no library-fixed
/// convention. Sudoku uses `1..=9` (matches the printed glyph);
/// Hashiwokakero bridge slots use `0..=2` (zero bridges is a
/// real value). The `u8` carries the value as displayed.
Numeric(u8),
}
Keeping Binary distinct from Numeric(u8) is deliberate.
Collapsing both into a single u8 would force binary genres to pick
a convention (is bulb 0 or 1?) and propagator code to
match-and-decode at every boundary. With the split, LOS sweeps and
count constraints over Binary marks dispatch on the type itself,
and Sudoku arithmetic stays unambiguous.
Numeric(u8) carries the displayed value — a Sudoku 5 is
Numeric(5), not Numeric(4) zero-indexed. The u8 is wide enough
for every digit any genre prints and narrow enough that a per-coord
candidate set still fits in a small bitmask, which the fast solver
relies on.
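To make the bitmask point concrete, here is a sketch of the candidate-set operations for Numeric marks, with bit d standing for digit d. The helper names (`full_domain`, `eliminate`) are illustrative, not the library's:

```rust
// Candidate set for digits lo..=hi as a u16 bitmask (bit d = digit d possible).
fn full_domain(lo: u8, hi: u8) -> u16 {
    (lo..=hi).fold(0u16, |m, d| m | (1 << d))
}

// A Distinct-style elimination: clear one digit's bit.
fn eliminate(mask: u16, digit: u8) -> u16 {
    mask & !(1 << digit)
}

fn main() {
    let mut cell = full_domain(1, 9); // fresh Sudoku cell
    cell = eliminate(cell, 7);        // a peer committed 7
    assert_eq!(cell.count_ones(), 8);
    assert_eq!(cell & (1 << 7), 0);   // 7 is gone
    println!("{}", cell.count_ones());
}
```

Since every printed digit fits in a u8, the whole candidate set fits comfortably in a machine word, which is what makes per-move propagation cheap.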
Constraints, Roles, Rules and Regions
Every constraint is a role + one or more regions + a rule:
- A role says how the engine reads the rule’s outcome: Goal (must end Satisfied for the puzzle to be solved) or Forbidden (must never be Violated; Pending and Satisfied are both fine during play).
- A list of regions says where the rule applies. Most rules are single-region (regions.len() == 1); a few — Clone for Quoits-style ring duplications, future cross-region equality constraints — bind multiple regions and treat them collectively. The constraint owns its regions (no separate region registry elsewhere on the puzzle), so the renderer iterates puzzle.constraints().iter().flat_map(|c| &c.regions) to find every paintable region — combined with Region.presentation (above), this collapses what was per-genre paint code into a single uniform pass.
- A rule says what holds over those regions — drawn from the small shared vocabulary defined below.
A puzzle is solved iff every Goal constraint is Satisfied and
no constraint is Violated.
There is no library-wide “every coord decided” rule baked into the solved check. Whether a genre needs that depends on what kind of puzzle it is, and the asymmetry is real:
- Value-filling puzzles (Sudoku, Hidato, KenKen, Numberlink) are solved when every coord is decided to a specific value; they include a Goal + All + Decided constraint to enforce it.
- Lighting / shading / loop puzzles (Akari, Slitherlink, Nurikabe, Heyawake) are solved when their structural goal holds; unmarked cells are fine, and Decided is omitted.
In Akari you can solve while leaving non-bulb cells unmarked because
the goal is “every cell lit” (ExactCount(Bulb, 1) per LosCross),
not “every cell decided.” In Sudoku the opposite is true: every cell
must commit, and that requirement is itself a constraint, not a
property of the engine.
A killer cage is Sum { target: 23, domain: 1..=9 } on a Set of
cells. A Sudoku row is Distinct on a Row. A 3×3 box is Distinct
on a Rect. A thermo is Increasing on a Path. Slitherlink’s loop
is Path { mark: Binary(true), closed: true } on
Union([AllOf(EdgesH), AllOf(EdgesV)]). A Quoits clone group is
Clone over multiple Set-shape regions, one per ring — the only
case in the current vocabulary where regions.len() > 1.
pub struct Constraint {
pub role: Role,
/// Owned regions the rule applies over. Length-1 for single-
/// region rules (the common case); length-N for multi-region
/// rules like `Clone`.
pub regions: Vec<Region>,
pub rule: Rule,
}
Constraints are identified by their slice index in
puzzle.constraints() — that’s the propagator’s cache key, the
violation report’s locator, the trace’s per-step constraint
reference. There’s no separate id field because the slice index
already does the job: stable within a puzzle (the list is built by
Genre::build_constraints and rebuilt — in the same canonical
order — on every mutation), O(1) to look up, no boilerplate at
construction.
Constraint exposes shorthands for the two common cases:
impl Constraint {
pub fn one(role: Role, region: Region, rule: Rule) -> Self;
pub fn many(role: Role,
regions: impl IntoIterator<Item = Region>,
rule: Rule) -> Self;
/// `Role::Goal` single-region — the common case.
pub fn goal(region: Region, rule: Rule) -> Self;
/// `Role::Forbidden` single-region.
pub fn forbidden(region: Region, rule: Rule) -> Self;
}
build_constraints implementations use goal / forbidden to keep
the emission sites compact.
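A self-contained sketch of what an emission site looks like with the shorthands. All types here are stubbed down to the shapes this chapter defines (only the variants the example needs), so the block stands alone; the real definitions carry more variants and fields.

```rust
// Stubbed shapes of the chapter's types, just enough for the example.
#[derive(Debug, PartialEq)]
enum Role { Goal }
#[derive(Debug, PartialEq)]
enum Layer { Cells }
#[derive(Debug, PartialEq)]
enum RegionShape { Row { layer: Layer, r: usize } }
#[derive(Debug, PartialEq)]
struct Region { shape: RegionShape }
#[derive(Debug, PartialEq)]
enum Rule { Distinct }
struct Constraint { role: Role, regions: Vec<Region>, rule: Rule }

impl Constraint {
    // The `goal` shorthand: Role::Goal, single region.
    fn goal(region: Region, rule: Rule) -> Self {
        Self { role: Role::Goal, regions: vec![region], rule }
    }
}
impl Region {
    fn row(layer: Layer, r: usize) -> Self {
        Self { shape: RegionShape::Row { layer, r } }
    }
}

fn main() {
    // Nine Goal Distinct constraints, one per Sudoku row.
    let constraints: Vec<Constraint> = (0..9)
        .map(|r| Constraint::goal(Region::row(Layer::Cells, r), Rule::Distinct))
        .collect();
    assert_eq!(constraints.len(), 9);
    assert!(constraints.iter().all(|c| c.regions.len() == 1));
    println!("{}", constraints.len());
}
```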
pub enum Role {
/// Must be `Satisfied` for the puzzle to be solved. *Drives*
/// the propagator toward solution: forced moves come from
/// `Goal` constraints. Most rules are Goals — Sudoku
/// `Distinct`, Akari per-cell `ExactCount(Bulb, 1)`, Killer
/// `Sum`, Slitherlink loop `Path`.
Goal,
/// Must never be `Violated`, but doesn't drive solution. Used
/// for **negative invariants** — pure validity gates the
/// engine watches without trying to satisfy. The propagator
/// treats `Violated` `Forbidden` constraints as
/// `Status::Contradicted`, but doesn't emit forced moves from
/// them.
///
/// Examples that earn their `Forbidden` role:
///
/// - Nurikabe **no-2×2** of shaded cells: a `NoTwoByTwo` rule
/// the puzzle must never violate, but no shaded cell is
/// *forced* by it.
/// - Hashi **no-cross** between bridges: bridge edges may not
/// intersect, but the rule doesn't tell you which bridges to
/// draw.
/// - Akari **at-most-one-bulb-per-LosCross**: paired with the
/// `Goal: AtLeastOne` half — the lower bound drives, the
/// upper bound gates.
/// - Per-coord upper-bound caps in any genre.
///
/// The `Forbidden` role is what lets these be expressed in the
/// existing rule vocabulary without inventing new "negative"
/// rule kinds. `NoTwoByTwo` is just a rule; it's the role that
/// distinguishes "drive shading" from "watch for violations."
Forbidden,
}
Regions
A region names a set of coords a constraint applies to, plus an
optional render hint. Listing every coord by hand works but bloats
the constraint list — a Sudoku row is nine cells, a 9×9 box is nine
more, the entire grid has 27 such regions before any variants enter
the picture. So RegionShape carries a small vocabulary of named
shorthands; the engine expands each to its concrete coord list at
evaluation time.
pub struct Region {
/// What coords this region covers — the geometric vocabulary.
pub shape: RegionShape,
/// Optional render hint. `None` for purely structural regions
/// (Sudoku rows / columns are solver-only — the row constraint
/// fires `Distinct` but no row outline gets drawn). `Some(_)`
/// for regions the renderer should depict — see the next
/// chapter, *Decorations, Themes and Colors*, for what
/// `RegionPresentation` carries.
pub presentation: Option<RegionPresentation>,
}
pub enum RegionShape {
/// Every coord across every layer the puzzle's `Substrate`
/// populates. Reach for `AllOf(Layer)` instead unless you really
/// mean every layer.
All,
/// Every coord of one specific layer. Slitherlink's loop is
/// `Union([AllOf(EdgesH), AllOf(EdgesV)])`.
AllOf(Layer),
// Geometric, layer-parameterized: rows, columns, rects,
// diagonals.
Row { layer: Layer, r: usize },
Col { layer: Layer, c: usize },
Rect { layer: Layer, top: usize, left: usize,
height: usize, width: usize },
Diagonal { layer: Layer, from: Coord,
dir: DiagDir, length: Option<usize> },
// Cell-only LOS / adjacency. Take a `Coord` whose variant must
// be `Coord::Cell` (validated at constraint construction);
// `Los` walks until it hits a wall, `LosCross` is the union of
// all four rays from a cell *including* the source.
Los { from: Coord, dir: OrthoDir },
LosCross { from: Coord },
Neighbors4 { from: Coord },
// Universal fallbacks — explicit Vec<Coord>, work over any
// layer. The escape hatch for arbitrary cages, hand-authored
// configurations, direction-sensitive rules.
Set(Vec<Coord>),
Path(Vec<Coord>),
// Set algebra over shapes. "The row except this cell," etc.
Union(Vec<RegionShape>),
Intersect(Vec<RegionShape>),
Diff(Box<RegionShape>, Box<RegionShape>),
}
pub enum OrthoDir { Up, Down, Left, Right }
pub enum DiagDir { UpLeft, UpRight, DownLeft, DownRight }
Convenience builders sugar over the common cases:
impl Region {
/// Bare region with no presentation — the common case for
/// solver-only regions (Sudoku rows, Akari LosCross sets).
pub fn shape(shape: RegionShape) -> Self {
Self { shape, presentation: None }
}
pub fn all() -> Self;
pub fn all_of(layer: Layer) -> Self;
pub fn row(layer: Layer, r: usize) -> Self;
pub fn col(layer: Layer, c: usize) -> Self;
pub fn rect(layer: Layer, top: usize, left: usize, h: usize, w: usize) -> Self;
pub fn cells_at(coords: impl IntoIterator<Item = Coord>) -> Self;
/// Singleton-cell region — the common case for `Pin` constraints.
pub fn single_cell(r: usize, c: usize) -> Self;
pub fn los_cross(at: Coord) -> Self;
pub fn neighbors4(from: Coord) -> Self;
pub fn path(coords: Vec<Coord>) -> Self;
pub fn union(parts: Vec<RegionShape>) -> Self;
// ... one per RegionShape variant.
}
The with_presentation chain builder, plus the full set of
construction examples, lives in the next chapter alongside the
RegionPresentation definition.
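The set-algebra shapes resolve with ordinary set operations. A simplified sketch, with coords reduced to (r, c) pairs and only the three shapes the example needs, of "the row except this cell":

```rust
use std::collections::BTreeSet;

// Toy subset of RegionShape for illustration only.
enum Shape {
    Row(usize),
    Set(Vec<(usize, usize)>),
    Diff(Box<Shape>, Box<Shape>),
}

// Expand a symbolic shape to its concrete coord set.
fn resolve(shape: &Shape, cols: usize) -> BTreeSet<(usize, usize)> {
    match shape {
        Shape::Row(r) => (0..cols).map(|c| (*r, c)).collect(),
        Shape::Set(v) => v.iter().copied().collect(),
        Shape::Diff(a, b) => resolve(a, cols)
            .difference(&resolve(b, cols))
            .copied()
            .collect(),
    }
}

fn main() {
    // "Row 3 except cell (3, 4)" on a 9-wide grid.
    let row_except = Shape::Diff(
        Box::new(Shape::Row(3)),
        Box::new(Shape::Set(vec![(3, 4)])),
    );
    let coords = resolve(&row_except, 9);
    assert_eq!(coords.len(), 8);
    assert!(!coords.contains(&(3, 4)));
    println!("{}", coords.len());
}
```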
Why Layer shows up twice. A RegionShape::Row { layer, r }
declares which layer the row lives on; once the engine resolves
that region to Vec<Coord>, every emitted Coord already carries
the same information as its variant tag (Coord::Cell,
Coord::EdgeV, …). The duplication is intentional: pre-resolution
RegionShape values are symbolic — Row(3) alone would be
ambiguous (cells? edges? corners?), so the Layer field
disambiguates the symbolic form. Post-resolution, layer moves from
explicit field to implicit variant tag. The cell-only shorthands
(Los, LosCross, Neighbors4) skip the field because their
semantics — wall-blocking, orthogonal adjacency — are only defined
over cells; they take a Coord whose variant must be Coord::Cell,
validated at constraint construction. Same currency (Coord)
everywhere; one type for callers to remember.
Rules and violations
Each Rule variant is paired 1:1 with a Violation variant carrying
the offending coords / counts. The mirroring is intentional: the
engine pattern-matches violations to format prose, highlight cells,
and grade without re-running the rule. Every rule has one
discriminant; the engine’s dispatch key is just Rule, no
sub-discriminants.
One file per rule. Each Rule variant owns its entire
implementation in core/src/rules/<rule>.rs: tightness,
mark-vocabulary requirement, structural validate, the slow-path
consistency check, propagator-side combined eval+force-deduction,
OR-tools CP-SAT encoding (gated behind #[cfg(feature = "ortools")]
— used as a test oracle and bench baseline, not a production
solver path), grader techniques, and (for rules with an
incremental cache) the per-constraint aggregate’s build and
apply/undo updates. Each file defines a marker ZST (e.g.
pub struct ExactCount;) that impls crate::rules::RuleImpl.
The trait surface itself is intentionally small. Two methods do the actual work:
- check(args, coords, game) -> ConstraintOutcome — required. Universal consistency check; reads candidate masks via Game, no aggregate access, no force-emission. Used by SimpleGame::status, generation validation, tests, and as the slow-path fallback inside the propagator. The CSP textbook’s check.
- propagate(args, ci, prop, out) -> Option<ConstraintOutcome> — optional. Combined eval + force-emission for the AC-3 worklist. Reads the rule’s aggregate (if any), pushes implied moves into out. Returns Some(status) when the rule computed status, None to defer to the slow-path check. The Option<_> makes the asymmetry explicit: check is always definitive; propagate may shrug.
Plus the supporting methods (extract, tightness,
mark_requirement, validate, techniques, build_aggregate),
all defaulted except extract. Central dispatch sites
(eval_rule in algorithm/eval.rs, the propagator’s
propagate_at, and ortools::encode_constraint) are exhaustive
matches that delegate to the rule’s trait method — adding a new
variant is a compile error at every dispatch site, not a panic at
runtime.
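The propagate-then-fall-back shape reduces to an `Option` chain. A sketch with assumed stand-in functions (the real signatures take the rule args, constraint index, and propagator state):

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Outcome { Satisfied, Pending }

// Fast path: answers only when an incremental aggregate exists,
// otherwise shrugs with None.
fn propagate_fast(have_aggregate: bool) -> Option<Outcome> {
    if have_aggregate { Some(Outcome::Satisfied) } else { None }
}

// Slow path: always definitive (here trivially Pending).
fn check_slow() -> Outcome {
    Outcome::Pending
}

// Dispatch: try propagate, fall back to check.
fn eval(have_aggregate: bool) -> Outcome {
    propagate_fast(have_aggregate).unwrap_or_else(check_slow)
}

fn main() {
    assert_eq!(eval(true), Outcome::Satisfied);
    assert_eq!(eval(false), Outcome::Pending); // deferred to the slow path
    println!("ok");
}
```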
Propagator caches. Rules that maintain an incremental cache
(ExactCount/AtMost/AtLeastOne/Pin/DegreeIn/ParityEven share a
MarkAggregate; DistinctSequences and Path { closed: true } each
have their own) plug into a single per-constraint
Vec<AggregateState> on the propagator. AggregateState is a tagged
union with variants None / Mark / DistinctSeq / PathClosed; the
rule’s RuleImpl::build_aggregate produces the matching variant, and
per-apply / per-undo updates dispatch through the rule module. CellMin
is the exception: its storage is a shared SoA of counter state keyed by
counter name (not by constraint), so it lives directly on the
propagator with rule-side helpers in rules/cell_min.rs.
OR-tools is not on the prod path. The encode_ortools function
each rule provides is for the test oracle (OrToolsSolver, used in
core/src/tests/puzzles/* to cross-check FastSolver) and bench
baselining. FastSolver is the production solver. Some Rule
variants — Sum, Increasing, MinDifference, Polyomino,
BoundedSize, ValueGroupedSize — currently have no FastSolver
aggregate and return Pending from check; in production those
rules are caught by trial-1 and slow-walk verification, not by an
external CP-SAT call.
New puzzles get new rules by adding variants here, paired with the
matching Violation variant. There is no escape hatch — if a puzzle
needs a rule the vocabulary doesn’t cover, that’s a signal to extend
the vocabulary, not work around it.
pub enum Rule {
/// Marks in the region are pairwise distinct.
/// Sudoku rows/cols/boxes, Hitori, Latin squares.
Distinct,
/// Marks in the region sum to `target`, each value drawn from
/// `domain`. Killer cages, Arrow sums, Sandwich Sudoku.
Sum { target: i64, domain: RangeInclusive<u8> },
// ... Increasing, MinDifference, AtLeastOne, AtMost,
// ExactCount — the rest of the counting / ordering family.
/// The single coord in the region is committed to `mark`. The
/// canonical shape of every clue-shaped given — Sudoku givens,
/// Star Battle pre-placed stars, Hidato endpoints. Region must
/// contain exactly one coord. Distinguished from
/// `ExactCount { mark, n: 1 }` because the propagator can
/// shortcut to "set candidates to {mark}" without going through
/// the count machinery, and the grader treats `Pin` as "given"
/// (which the renderer wants for distinguishing givens from
/// deductions).
Pin { mark: Mark },
/// Count of `mark`-committed coords lies in `allowed`.
/// Generalises `ExactCount`; covers Slitherlink/Masyu/Yajilin
/// vertex degree `{0, 2}`. `allowed` must be sorted and deduped.
DegreeIn { mark: Mark, allowed: AllowedCounts },
// ─── Structural / shape rules ──────────────────────────────
/// The coords marked `mark` form a single path under
/// substrate-natural adjacency. `closed = true` for cycles;
/// Slitherlink loop is `Path { mark: Binary(true), closed: true }`
/// over edge coords with corner-sharing adjacency.
Path { mark: Mark, closed: bool },
// ... Connected, Polyomino, BoundedSize, NoTwoByTwo — the
// other structural / shape rules.
// ─── Completion + value-grouping ──────────────────────────
/// Every coord in the region has its candidate set narrowed to
/// a singleton — the *value* doesn't matter, only that the cell
/// is committed. With `Goal` role, this is "every cell decided."
/// Sudoku, Hidato, KenKen, Numberlink need it; Akari does not.
Decided,
// ... ValueGroupedSize — Fillomino's "components of value v have
// size v."
// ─── Per-cell + cross-rule shapes ──────────────────────────
/// For every coord in the region, the named per-cell counter at
/// that coord must be at least `min`. Counters are declared by
/// the puzzle via `Puzzle::cell_counters`. The propagator forces
/// the counter's target mark at the unique remaining contributor
/// when only one is left in the coord's reach.
CellMin { counter: String, min: u32 },
// ... LoopFreezesCounts — cross-rule freeze for Slitherlink-style
// "loop closed AND each clue's count satisfied."
}
/// Sorted, deduped list of allowed counts for `Rule::DegreeIn`.
/// Inline up to 4 entries — covers Slitherlink's `{0, 2}` and any
/// future genre that picks among a handful of values.
pub type AllowedCounts = SmallVec<[u8; 4]>;
pub enum Violation {
/// `Distinct` violated: two coords share a mark.
Distinct { a: Coord, b: Coord, mark: Mark },
/// `Sum` exceeded its target.
SumOver { target: i64, actual: i64 },
/// `Sum`'s max possible total is below target.
SumUnder { target: i64, max_possible: i64 },
// ... ExactCountOver, ExactCountUnder, AtLeastOneStarved,
// AtMostOver, OrderBroken, DegreeOutOfAllowed — the rest of
// the counting / ordering violations.
/// `Pin` violated: the pinned coord lost `mark` from its
/// candidate set (the player or solver eliminated the pinned
/// value, contradicting the given).
PinLost { coord: Coord, mark: Mark },
// ... Disconnected, PolyominoIsolated, BadPath, OverSize,
// HasTwoByTwo — the structural-rule violations.
/// `Decided` violated: a coord has no candidates left
/// (irrecoverable empty domain).
EmptyDomain { coord: Coord },
// ... ValueGroupedSizeMismatch — completion / value-grouping
// violations.
}
/// Discriminant-only mirror of `Rule`. Used by techniques to declare
/// which rule kinds they need present in `puzzle.constraints()` to be
/// applicable; the grader filters out techniques whose required
/// kinds don't appear, once per puzzle.
pub enum RuleKind {
Distinct, Sum, Increasing, MinDifference,
AtLeastOne, AtMost, ExactCount, Pin, DegreeIn,
Connected, Polyomino, Path, BoundedSize, NoTwoByTwo,
Decided, ValueGroupedSize,
CellMin, LoopFreezesCounts,
}
impl Rule {
pub fn kind(&self) -> RuleKind { /* trivial match */ }
}
Outcomes
What a constraint’s eval returns when the engine queries it, and the puzzle-wide rollup that follows from walking every constraint.
pub enum ConstraintOutcome {
/// Every requirement of this constraint is met by current marks.
Satisfied,
/// Constraint isn't decidable yet — needs more state.
Pending,
/// Constraint can never be satisfied from this state.
Violated(Violation),
}
pub enum Status {
InProgress,
Solved,
/// The first constraint to evaluate as `Violated` in
/// `puzzle.constraints()` order. The engine short-circuits on
/// the first violation it finds — when several constraints are
/// violated simultaneously, only one is reported, and "first"
/// is the slice index of the constraint that produced it.
/// Callers that want to enumerate every violation walk
/// `puzzle.constraints()` themselves.
Contradicted(Violation),
}
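The rollup described in the Status doc comment can be sketched as a plain walk. Types are stubbed, and the Violation payload is simplified to the violating constraint's slice index; the real engine carries the full Violation value:

```rust
#[derive(Clone, Copy, PartialEq)]
enum Role { Goal, Forbidden }
#[derive(Clone, Copy, PartialEq)]
enum Outcome { Satisfied, Pending, Violated }
#[derive(Debug, PartialEq)]
enum Status { InProgress, Solved, Contradicted(usize) } // usize = slice index

// Walk constraint outcomes in slice order: short-circuit on the first
// Violated; Solved only when every Goal is Satisfied.
fn status(evals: &[(Role, Outcome)]) -> Status {
    for (i, &(_, o)) in evals.iter().enumerate() {
        if o == Outcome::Violated {
            return Status::Contradicted(i); // first violation wins
        }
    }
    let all_goals_met = evals
        .iter()
        .all(|&(role, o)| role != Role::Goal || o == Outcome::Satisfied);
    if all_goals_met { Status::Solved } else { Status::InProgress }
}

fn main() {
    use Outcome::*;
    use Role::*;
    // A Pending Forbidden constraint doesn't block Solved.
    assert_eq!(status(&[(Goal, Satisfied), (Forbidden, Pending)]), Status::Solved);
    assert_eq!(status(&[(Goal, Pending)]), Status::InProgress);
    assert_eq!(status(&[(Goal, Satisfied), (Forbidden, Violated)]), Status::Contradicted(1));
    println!("ok");
}
```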
Move — the unit of change that narrows a candidate set, applied
between eval rounds — lives in the Propagation engine chapter,
where the moves-from-state derivation actually happens.
Givens are constraints, too
The constraint list holds every rule the solution must satisfy:
region rules (Sudoku row Distinct, Akari LosCross
ExactCount(Bulb, 1)); invariants (Nurikabe NoTwoByTwo); shape
rules (Path { closed: true } for the slitherlink loop); and the
givens themselves. A sudoku given “cell (3, 4) = 7” is just
Pin { mark: Numeric(7) } over a single-coord region. An akari clued
wall translates to ExactCount over the wall’s visible floor cells
(the wall position itself isn’t a coord, but its visibility region
is). A killer cage is Sum over its cell set.
There is no separate “givens” surface on Puzzle — the constraint
list is the canonical form. Genres still keep raw author-input fields
(Akari.clues: HashMap<(r, c), u8>, Sudoku.clues: HashMap<(r, c), u8>) as editor scaffolding: the form a UI
mutates when a player clicks “place a clue here.” But every such
field is translated into constraints at construction (populate_constraints); the constraint list is what every downstream
consumer reads.
The Pin rule (see the Rules section above) is the workhorse for
clue-shaped givens. Single-coord region, mark carries the pinned
value. The propagator narrows the coord’s candidate set to
{mark} immediately at game construction; a PinLost violation
fires if the player or solver later eliminates that mark. Sudoku
givens, Star Battle pre-placed stars, Hidato endpoints — all Pin.
Givens that aren’t Pin-shaped are still constraints, just under a
different rule. Akari clue values are ExactCount over visible
floors. Slitherlink cell clues are ExactCount over the cell’s four
edges. Killer cage sums are Sum. Hashi island bridge counts are
ExactCount over outgoing edges. Same story every time: the editor
mutates raw fields, the constructor emits constraints, the engine
reads constraints.
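The editor-field-to-constraint translation step can be sketched in a few lines. `Pin` here is a stand-in for the full `Constraint` with a `Rule::Pin` over a single-cell region; the sort stands in for the canonical rebuild order the text requires:

```rust
use std::collections::HashMap;

// Stand-in for Constraint::goal(Region::single_cell(r, c), Rule::Pin { mark }).
#[derive(Debug, PartialEq)]
struct Pin { r: usize, c: usize, digit: u8 }

// Translate a raw clue map (the editor-scaffolding field) into
// Pin-shaped constraints, in a deterministic canonical order.
fn clues_to_pins(clues: &HashMap<(usize, usize), u8>) -> Vec<Pin> {
    let mut pins: Vec<Pin> = clues
        .iter()
        .map(|(&(r, c), &digit)| Pin { r, c, digit })
        .collect();
    pins.sort_by_key(|p| (p.r, p.c)); // canonical order across rebuilds
    pins
}

fn main() {
    let mut clues = HashMap::new();
    clues.insert((3, 4), 7); // the "cell (3, 4) = 7" given from the text
    clues.insert((0, 0), 5);
    let pins = clues_to_pins(&clues);
    assert_eq!(pins.len(), 2);
    assert_eq!(pins[0], Pin { r: 0, c: 0, digit: 5 });
    println!("{}", pins.len());
}
```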
Mark / rule compatibility
Mark collapses to Binary | Numeric(u8), so compatibility is
runtime-checked rather than gated by trait bounds. Sum,
Increasing, and MinDifference are meaningful only over coords
carrying Numeric marks; constructing one over a Binary region
errors at constraint validation. Distinct, AtLeastOne,
ExactCount, Pin, DegreeIn, and the structural rules
(Connected, Polyomino, Path, BoundedSize, NoTwoByTwo)
work for both mark variants.
Growing the vocabulary
When a new puzzle wants a rule the existing vocabulary doesn’t
cover, the discipline is to extend the vocabulary, not work around
it: add a variant on Rule, the paired variant on Violation,
and a matcher in the engine. There is no escape hatch. Slitherlink’s
“edges form a single closed loop” earned a slot via
Path { mark: Binary(true), closed: true }; Masyu’s pearl rules
will earn theirs the day Masyu lands.
Decorations, Themes and Colors
How regions look on the screen. The previous chapter introduced
Region.presentation: Option<RegionPresentation> — None for
solver-only regions, Some(_) for anything the renderer should
depict (Sudoku 3×3 boxes, Killer cages, thermos, Quoits clones).
This chapter unpacks what RegionPresentation carries.
/// Optional render hint attached to a region. Renderers walk
/// `puzzle.constraints().iter().flat_map(|c| &c.regions)` and paint
/// every region whose `presentation` is `Some(_)`.
pub struct RegionPresentation {
/// Border on the perimeter (between this region and any coord
/// not in it). Cell-region only; edge / corner regions ignore.
pub border: Option<BorderStyle>,
/// Fill tone. `None` = transparent. `Some(color)` = solid tint.
/// Two regions with the same color render with the same shade.
pub fill: Option<Color>,
/// Optional label rendered in the region. Cage sums, room
/// counts, sandwich totals.
pub label: Option<RegionLabel>,
/// Whether the region's cells get a decorative shading tone —
/// the visual rhythm Sudoku 3×3 box parity uses. Independent of
/// `fill`; shaded cells get a frontend-default tint, fill is for
/// puzzle-author-chosen colors.
pub shaded: bool,
/// Decorative overlay drawn on top of the region's coords.
/// `None` = no overlay (the common case).
pub decoration: Option<Decoration>,
}
pub enum BorderStyle { Solid, Dashed, Dotted }
pub enum RegionLabel {
Number(u32),
DigitList(Vec<u8>),
Text(String),
}
Decorations
Decorative overlays — cross-cutting visual primitives that recur
across variant puzzles. Each variant carries an Option<Color>:
Some(_) for puzzle-author-chosen color, None lets the frontend
pick.
pub enum Decoration {
Ring { color: Option<Color> }, // Quoits clone groups
LineThrough { color: Option<Color> }, // palindromes, thermos, German whispers
Arrow { color: Option<Color> }, // sudoku arrow constraints
Dot { color: Option<Color> }, // Kropki edges
QuadCircle { color: Option<Color> }, // Quoits-style four-cell anchors
}
The five variants exhaust the cross-cutting visual primitives in the catalog so far — adding a new one is a deliberate extension, not a casual addition.
Colors
Three states, matching the three ways puzzles actually want colors specified:
pub enum Color {
Hex(u32),
Named(NamedColor),
}
pub enum NamedColor {
Red, Orange, Yellow, Green, Blue, Purple, Pink, Brown, Grey,
// Closed set; add a variant only when a real puzzle needs more.
}
- None (an Option<Color> that’s None) — the puzzle didn’t care. Frontend picks however it likes; renderers may differ. Use for autogen Killer cages where the constructor doesn’t bother picking, or for decorations where any reasonable color works.
- Color::Hex(rgb) — exact RGB. Editor or constructor has a specific color in mind; frontends render that color (possibly with a theme transform). Two frontends in the same theme agree exactly.
- Color::Named(name) — semantic color from a small closed palette. Themes resolve to their palette’s appropriate shade. The puzzle’s “pink” is pink everywhere; the exact shade adapts per theme.
Theme adaptation
Frontends are free to transform Hex colors for accessibility or
theme constraints — a dark-mode renderer might lightness-flip every
Hex; a high-contrast theme might quantize to a distinguishable
subset. Named colors are resolved per theme to the theme’s
matching palette entry. None lets the frontend do whatever fits
its theme. This means two themes show the same puzzle with
different absolute colors — correct, not a bug.
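A theme's resolution step can be sketched as one match over the three states. The `Theme` struct and its fields are assumptions for illustration; only the resolution logic mirrors the text:

```rust
#[derive(Clone, Copy)]
enum NamedColor { Pink }
#[derive(Clone, Copy)]
enum Color { Hex(u32), Named(NamedColor) }

// Hypothetical theme: a default tint plus a palette entry per NamedColor.
struct Theme { default_tint: u32, pink: u32 }

fn resolve(theme: &Theme, color: Option<Color>) -> u32 {
    match color {
        None => theme.default_tint,   // puzzle didn't care: frontend's choice
        Some(Color::Hex(rgb)) => rgb, // exact RGB (a theme may transform it)
        Some(Color::Named(NamedColor::Pink)) => theme.pink, // palette lookup
    }
}

fn main() {
    let dark = Theme { default_tint: 0x333333, pink: 0xcc6699 };
    assert_eq!(resolve(&dark, None), 0x333333);
    assert_eq!(resolve(&dark, Some(Color::Hex(0xff0000))), 0xff0000);
    assert_eq!(resolve(&dark, Some(Color::Named(NamedColor::Pink))), 0xcc6699);
    println!("ok");
}
```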
Construction
The chained-builder pattern lets a constraint pin only the presentation fields it cares about; everything else falls back to the field’s default.
impl Region {
/// Chain a presentation onto a region. The `..Default::default()`
/// pattern lets a constraint pin only the fields it cares about
/// (Sudoku boxes set border + shaded; Killer cages set border +
/// label; nothing else needs to populate every field).
pub fn with_presentation(mut self, p: RegionPresentation) -> Self {
self.presentation = Some(p);
self
}
}
Construction reads naturally:
// Sudoku 3×3 box at (br, bc) with parity shading.
Region::rect(Layer::Cells, br * 3, bc * 3, 3, 3)
.with_presentation(RegionPresentation {
border: Some(BorderStyle::Solid),
shaded: (br + bc) % 2 == 1,
..Default::default()
})
// Sudoku row — solver-only, no presentation.
Region::row(Layer::Cells, r)
// Killer cage with dashed border + sum label + a chosen color.
Region::cells_at(cage.cells.clone())
.with_presentation(RegionPresentation {
border: Some(BorderStyle::Dashed),
label: Some(RegionLabel::Number(cage.sum)),
fill: Some(Color::Named(NamedColor::Yellow)),
..Default::default()
})
// Thermo: line drawn through the path, bulb implied by the path's
// first coord. Pairs with `Rule::Increasing` over the same Path.
Region::path(thermo.path.clone())
.with_presentation(RegionPresentation {
decoration: Some(Decoration::LineThrough {
color: Some(Color::Named(NamedColor::Grey)),
}),
..Default::default()
})
// Quoits clone group — two rings sharing a Pink color, paired
// across regions inside one `Rule::Clone` constraint.
Region::cells_at(ring_a_cells.clone())
.with_presentation(RegionPresentation {
decoration: Some(Decoration::Ring {
color: Some(Color::Named(NamedColor::Pink)),
}),
..Default::default()
})
// Kropki edge: a small dot between two adjacent cells.
Region::cells_at(vec![Coord::Cell{r,c}, Coord::Cell{r,c+1}])
.with_presentation(RegionPresentation {
decoration: Some(Decoration::Dot { color: None }),
..Default::default()
})
Genres, Puzzles, Games
Now that the vocabulary’s in place — Substrate, Layer, Coord,
Mark, Region, Constraint, Move — we can talk about how
Penmark actually represents puzzles.
A genre is a runtime value: a &'static Genre fn-pointer table. A puzzle is a Puzzle — a Substrate plus a Vec<Constraint>, tagged with its genre handle. A game is Marks on a puzzle.
A genre is a value, not a type. The Genre struct
(core/src/genre.rs) is a table of fn pointers plus identity
constants (name, display_name, mark_render, layers,
build_constraints, cell_clue, …). One &'static Genre literal
ships per genre (AKARI, SLITHER, SUDOKU, TAKUZU) and they
all live in the GENRES slice. The per-genre logic itself is
authored as an impl GenreTrait for AkariGenre block on a
zero-sized marker — the macro genre_static! then coerces those
monomorphized items into the struct’s fn-pointer slots. Authors
think in trait methods; callers dispatch through the runtime handle.
A puzzle is a value of Puzzle — a single non-generic struct
that carries its genre as a field:
pub struct Puzzle {
pub genre: &'static Genre,
rows: usize,
cols: usize,
substrate: Substrate,
constraints: Vec<Constraint>,
meta: PuzzleMeta,
}
Per-genre type aliases still exist (pub type Sudoku = Puzzle;) as
shorthand, but they’re vestigial — Sudoku and Puzzle are the
same type. Constructors live as associated functions on the marker
struct: SudokuGenre::new(box_w, box_h, givens) returns a
Puzzle whose .genre is set to crate::genre::SUDOKU.
A puzzle carries a Substrate (which positions exist, where walls
and holes are) and a Vec<Constraint> (every rule the solution must
satisfy, with givens included as Pin-shaped constraints). There’s
no separate “clue map” or “wall set” field — those live as
Material::Wall / Material::Hole in the substrate and as
constraint entries in the rule list.
Editing a puzzle (add_clue, clear_clue, set_wall, set_hole,
set_floor, set_material) goes through methods on Puzzle that
mutate the substrate and rebuild the constraint list by dispatching
through puzzle.genre.build_constraints(substrate, clues). Rebuilds
are microseconds at human-interaction sizes, so we always rebuild
rather than apply surgical updates.
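That rebuild-on-edit loop can be sketched with toy stand-ins — the Substrate, Constraint, and build_constraints below are simplified placeholders for illustration, not Penmark's real types:

```rust
#[derive(Clone, Debug, PartialEq)]
struct Constraint(String);

#[derive(Default)]
struct Substrate {
    clues: Vec<((usize, usize), u8)>,
}

struct Puzzle {
    substrate: Substrate,
    constraints: Vec<Constraint>,
}

// Stand-in for `genre.build_constraints`: derive the whole rule list
// from the substrate every time, rather than patching it.
fn build_constraints(s: &Substrate) -> Vec<Constraint> {
    s.clues
        .iter()
        .map(|((r, c), v)| Constraint(format!("pin {},{} = {}", r, c, v)))
        .collect()
}

impl Puzzle {
    fn add_clue(&mut self, r: usize, c: usize, v: u8) {
        // 1. mutate the substrate…
        self.substrate.clues.push(((r, c), v));
        // 2. …then rebuild the constraint list wholesale, no surgery.
        self.constraints = build_constraints(&self.substrate);
    }
}
```

The wholesale rebuild trades a few microseconds per edit for never having to reason about which constraints a given mutation invalidates.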
A game is someone’s in-progress play — which marks have been
committed, which candidates eliminated, where the propagator has
narrowed things. A game is a value of Game<'p>, holding Marks
(per-coord candidate sets) plus a borrow back to the static
Puzzle. The borrow lifetime makes “this game is bound to this
puzzle” a compile-time guarantee.
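The compile-time binding can be illustrated with toy types — Puzzle and Game below are simplified stand-ins, not the real trait surface:

```rust
struct Puzzle {
    cells: usize,
}

/// A game borrows its puzzle for its whole lifetime, so the compiler
/// rejects any attempt to keep a `Game` alive past its `Puzzle`.
struct Game<'p> {
    puzzle: &'p Puzzle,
    candidates: Vec<Vec<u8>>, // per-coord candidate sets
}

impl<'p> Game<'p> {
    fn new(puzzle: &'p Puzzle) -> Self {
        // every coord starts at its full domain (here: {1, 2, 3})
        Game {
            puzzle,
            candidates: vec![vec![1, 2, 3]; puzzle.cells],
        }
    }
}
```

Dropping the puzzle while a game still borrows it is a compile error, which is exactly the "this game is bound to this puzzle" guarantee.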
In prose, lowercase puzzle keeps its colloquial sense — “this
Sudoku puzzle is hard.” Capital Puzzle always refers to the
struct.
Solving, Grading, Generating, Enumerating
The four operations Penmark performs on a puzzle.
- Solving — finding solutions. Solver trait; pluggable per-solver
impls. Takes &Puzzle.
- Grading — assigning a difficulty by simulating a human solver
step by step. Grader trait; one shipped impl (StandardGrader).
Takes &Puzzle.
- Generating — producing a uniquely-solvable puzzle. Lives as
genre.generate (an fn pointer on the runtime handle) rather than a
separate trait, since generation needs intimate access to per-genre
structural state.
- Enumerating — streaming uniquely-solvable puzzles of a given
size. Same shape: genre.enumerate.
Solving and grading are pluggable because real users want to
compare solver / grader implementations side-by-side (the bench
matrix shows FastSolver vs SimpleSolver vs OrToolsSolver).
Generating and enumerating aren’t pluggable in the same way — each
genre has at most one generation pipeline that makes sense for it.
Folding them into the genre handle keeps the surface lean.
Pluggable solvers and graders
The algorithm traits (Solver, Grader) are non-generic and take
&Puzzle. Dispatch on a puzzle’s genre happens at runtime through
its &'static Genre handle, not by monomorphization.
The shipped set:
- FastSolver — DFS + AC-3 propagation + depth-1 trial-1 sweep,
driven through a Propagator that wraps a FastGame<'p> (bitmask
domains, per-constraint aggregates, per-cell counters).
- SimpleSolver — brute-force DFS over SimpleGame's
HashMap<Coord, Vec<Mark>> candidate sets. The "obviously correct"
reference every faster impl is cross-checked against.
- StandardGrader — fires named human-pattern techniques in priority
order against the same Propagator engine, with trial-1 as backward
escalation. Techniques are rule-kind-driven and apply across genres.
impl Solver for FastSolver { /* … */ }
impl Grader for StandardGrader { /* … */ }
Adding a new solver impl is one trait impl plus a row in the bench matrix. No per-genre wiring needed.
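The "one trait impl plus a bench row" shape can be sketched with simplified stand-ins — the Solver trait and Puzzle below are toy versions for illustration, not the real signatures:

```rust
struct Puzzle {
    size: usize,
}

// Toy stand-in for the real non-generic trait taking `&Puzzle`.
trait Solver {
    fn name(&self) -> &'static str;
    fn solve(&self, p: &Puzzle) -> Option<Vec<u8>>;
}

// Adding a solver is just another impl — no per-genre wiring.
struct NaiveSolver;

impl Solver for NaiveSolver {
    fn name(&self) -> &'static str {
        "naive"
    }
    fn solve(&self, p: &Puzzle) -> Option<Vec<u8>> {
        Some(vec![0; p.size]) // placeholder "search"
    }
}

// A bench matrix compares impls side-by-side through trait objects.
fn bench_names(solvers: &[&dyn Solver]) -> Vec<&'static str> {
    solvers.iter().map(|s| s.name()).collect()
}
```

Because the trait is non-generic, a slice of trait objects is enough to run every solver against the same puzzle set.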
Genre-handle generators and enumerators
Generation and enumeration are fn pointers on the Genre struct,
populated by the macro that builds each per-genre static from its
GenreTrait impl:
pub struct Genre {
pub generate: fn(&GenConfig, &AtomicBool) -> Option<Puzzle>,
pub enumerate: fn(&EnumConfig, &AtomicBool,
&mut dyn FnMut(&Puzzle),
&mut dyn FnMut(EnumStats)) -> String,
/* … */
}
// Dispatch at a call site:
let p = (puzzle.genre.generate)(&cfg, &cancel);
Both default to a universal recipe (scaffold_solve_derive /
walk_canonical) that uses genre.material_vocab to know what to
scatter and genre.derive_clue_at to derive a ClueValue per
clueable cell from a bare-solve completion. The default
derive_clue_at body dispatches through genre.clue_pattern
(the Pin / Count / ConnectedSize declarative shortcut) and
wraps the u8 result in ClueValue::Number; genres whose clues
don’t fit that closed set (Tapa, future Yajilin) override
derive_clue_at directly. Akari uses both defaults verbatim.
Sudoku and slither override generate because they need a
genre-specific structural seed (sudoku: random diagonal; slither:
connected inside-region) before the bare-solve can find a useful
completion.
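The pattern-dispatch at the heart of the default derive_clue_at can be sketched with simplified types — the CluePattern arms below are reduced to their arithmetic, whereas the real Count and ConnectedSize shapes carry target/region payloads:

```rust
#[derive(Debug, PartialEq)]
enum ClueValue {
    Number(u8),
}

// Simplified: the real shapes carry `target` / `region` payloads.
enum CluePattern {
    Pin,
    Count,
    ConnectedSize,
}

/// Derive a clue for one cell from a solved grid, per the declared pattern.
fn derive_clue(pattern: &CluePattern, solved_value: u8, region: &[u8]) -> ClueValue {
    let n = match pattern {
        // sudoku: the clue is the solved digit itself
        CluePattern::Pin => solved_value,
        // akari/slither: count the marked cells in the target region
        CluePattern::Count => region.iter().filter(|&&v| v != 0).count() as u8,
        // nurikabe-style: size of the connected group (stubbed as region size)
        CluePattern::ConnectedSize => region.len() as u8,
    };
    ClueValue::Number(n)
}
```

A genre whose clues don't reduce to one of these closed shapes bypasses the dispatch entirely, which is the override path the text describes for Tapa.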
See the Generator and Enumerator chapters for full configuration surfaces and extension points.
Implementations
The Rust trait surface that ties the conceptual model to running
code: Puzzle for the static puzzle data, Game for in-progress
play, tagged serialization for round-tripping puzzles and games
across the wire, dispatching for routing genres at runtime, the
shared propagation engine, and the Puzzlehound dataset that feeds
everything else.
Genre and Puzzle
Penmark splits puzzle representation into two pieces:
- Genre — a runtime fn-pointer table (&'static Genre). Hosts every
per-genre rule the dispatch path needs: domain shape, render style,
how to build the constraint list from a substrate + clue grid,
parsers, generators, enumerators. One static literal per genre
(AKARI, SLITHER, SUDOKU, TAKUZU); they all live in the GENRES slice.
- Puzzle — a single non-generic struct holding the canonical state
(the runtime genre handle, rows, cols, substrate, constraints, meta).
Per-genre logic is authored as an impl GenreTrait for <X>Genre
block on a zero-sized marker. The genre_static! macro coerces those
monomorphized methods and constants into the runtime Genre
struct’s fn-pointer fields. Authors think in trait methods; callers
dispatch through puzzle.genre.method(args).
The split keeps the value-carrying struct dimension-uniform across
genres while letting per-genre code stay co-located in
core/src/puzzles/<genre>/. Heterogeneous storage and iteration
(“every genre”) are now trivial — just walk &[&Genre].
The Genre struct
pub struct Genre {
// ── Identity ────────────────────────────────────────────────
pub name: &'static str, // "sudoku", "akari", …
pub display_name: &'static str, // "Sudoku" — shown in UIs
pub baseline_version: u32, // wire-format schema version
// ── Frontend defaults ──────────────────────────────────────
pub mark_render: MarkRenderStyle, // Lamp / Glyph / EdgeStroke
pub layers: &'static [Layer], // cells / edges / corners
// ── Slider ranges ──────────────────────────────────────────
pub gen_budget_range: (f32, f32),
pub gen_initial_budget_secs: f32,
pub enum_budget_range: (f32, f32),
pub enum_initial_budget_secs: f32,
pub enum_empty_message: &'static str,
// ── Generator default ──────────────────────────────────────
pub default_gen_density: f32,
// ── Puzzle-free fn pointers ─────────────────────────────────
pub default_substrate: fn(Dimensions) -> Substrate,
pub dims_for: fn(usize, usize) -> Dimensions,
pub build_constraints:
fn(&Substrate, &Grid<Option<ClueValue>>) -> Vec<Constraint>,
pub cell_counters: fn() -> &'static [CellCounter],
pub format_value: fn(u8) -> String,
pub emits_variant_directives_in_janko: fn() -> bool,
pub variant_aliases: fn() -> &'static [(&'static str, &'static str)],
pub material_vocab: fn() -> MaterialVocab,
pub clue_host_material: fn() -> Material,
pub clue_pattern: fn() -> CluePattern,
pub parse_external_line: fn(&str) -> Option<String>,
// ── Puzzle-aware fn pointers ────────────────────────────────
pub full_domain: fn(&Puzzle, Coord) -> Vec<Mark>,
pub format_value_for: fn(&Puzzle, u8) -> String,
pub cell_clue: fn(&Puzzle, usize, usize) -> Option<u8>,
pub extra_constraints: fn(&Puzzle) -> Vec<Constraint>,
pub empty_at: fn(Dimensions) -> Puzzle,
pub extract_clues: fn(&Puzzle) -> Grid<Option<ClueValue>>,
pub on_clue_added: fn(&mut Puzzle, usize, usize, &ClueValue),
pub derive_clue_at:
fn(&Puzzle, &Solution, Coord) -> Option<ClueValue>,
pub clue_domain: fn(&Puzzle, Coord) -> Vec<u8>,
pub sample: fn() -> Puzzle,
pub overlays:
fn(&Puzzle, &Marks, &Status) -> Vec<Overlay>,
pub generate:
fn(&GenConfig, &AtomicBool) -> Option<Puzzle>,
pub scaffold_passes: fn() -> &'static [ScaffoldPass],
pub clue_post_filters: fn() -> &'static [CluePostFilter],
pub apply_cli_density: fn(f32, &mut GenConfig),
pub enumerate: fn(
&EnumConfig,
&AtomicBool,
&mut dyn FnMut(&Puzzle),
&mut dyn FnMut(EnumStats),
) -> String,
pub parse_text: fn(&str) -> Result<Puzzle, ParseError>,
pub from_janko_text: fn(&str) -> Result<Puzzle, ParseError>,
pub to_janko_text: fn(&Puzzle) -> String,
pub from_puzzlink_url: fn(&str) -> Result<Puzzle, ParseError>,
pub to_puzzlink_url: fn(&Puzzle) -> Result<String, ParseError>,
}
pub static AKARI: &Genre = genre_static!(AkariGenre);
pub static SLITHER: &Genre = genre_static!(SlitherlinkGenre);
pub static SUDOKU: &Genre = genre_static!(SudokuGenre);
pub static TAKUZU: &Genre = genre_static!(TakuzuGenre);
pub static GENRES: &[&Genre] = &[AKARI, SLITHER, SUDOKU, TAKUZU];
pub fn by_name(name: &str) -> Option<&'static Genre> {
GENRES.iter().copied().find(|g| g.name == name)
}
Call sites dispatch through the field:
let p = (puzzle.genre.parse_text)(s)?;
let dom = (puzzle.genre.full_domain)(&puzzle, coord);
if puzzle.genre.name == "sudoku" { /* … */ }
GenreTrait — the authoring interface
GenreTrait (in crate::puzzle::genre) is the trait per-genre
markers implement. Its const items and method signatures mirror
the Genre struct’s field types. The genre_static! macro fills
in each &'static Genre literal by copying the monomorphized items
out of the marker’s impl:
pub trait GenreTrait: 'static + Sized {
const NAME: &'static str;
const DISPLAY_NAME: &'static str;
const LAYERS: &'static [Layer];
const MARK_RENDER: MarkRenderStyle;
const BASELINE_VERSION: u32 = 1;
const DEFAULT_GEN_DENSITY: f32 = 0.22;
/* …slider const ranges… */
/// Runtime handle — must match the literal in `crate::genre`.
/// Default methods that construct a `Puzzle` use it to set the
/// `genre` field.
const GENRE: &'static crate::genre::Genre;
fn full_domain(p: &Puzzle, coord: Coord) -> Vec<Mark>;
fn default_substrate(dims: Dimensions) -> Substrate;
fn build_constraints(
substrate: &Substrate,
clues: &Grid<Option<ClueValue>>,
) -> Vec<Constraint>;
fn clue_pattern() -> CluePattern;
fn sample() -> Puzzle;
/* …~25 more methods, most with sensible defaults… */
}
Required methods (no default): NAME, DISPLAY_NAME, LAYERS,
MARK_RENDER, GENRE, full_domain, default_substrate,
build_constraints, clue_pattern, sample. Everything else has
a default — see core/src/puzzle.rs for the full trait.
Clue payload — ClueValue
Per-cell clues thread through build_constraints, extract_clues,
add_clue, and derive_clue_at as a closed enum:
pub enum ClueValue {
Number(u8), // sudoku digit, akari/slither count, takuzu cell value
Runs(Vec<u8>), // Tapa run-length sequences
// future Yajilin arrows, Masyu pearls go here as new variants
}
The four shipped genres only ever emit ClueValue::Number(_);
Puzzle::add_clue accepts impl Into<ClueValue> so existing
u8 literal call sites continue to work via
impl From<u8> for ClueValue. Multi-byte payloads (Tapa runs,
Yajilin arrows when it lands) carry their own variants.
Genre::cell_clue: Option<u8> is unchanged — it’s the display
shorthand renderers (eframe painter, CLI Janko writer) use to
fold a clue down to a single glyph. Multi-byte payloads return
their first byte. The typed ClueValue channel is for
construction / canonical reconstruction / round-trip; rendering
stays on the cheap u8 path.
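The u8-compatibility bridge is a one-liner; a sketch, with the enum reduced to the two shipped variants and normalize standing in for an add_clue-style signature:

```rust
#[derive(Debug, PartialEq)]
enum ClueValue {
    Number(u8),    // sudoku digit, akari/slither count, takuzu cell value
    Runs(Vec<u8>), // Tapa run-length sequences
}

// Lets bare-`u8` call sites keep working unchanged.
impl From<u8> for ClueValue {
    fn from(v: u8) -> Self {
        ClueValue::Number(v)
    }
}

// Stand-in for a method taking `impl Into<ClueValue>`.
fn normalize(clue: impl Into<ClueValue>) -> ClueValue {
    clue.into()
}
```

Callers that already hold a ClueValue pass it through untouched (Into is reflexive), so one signature serves both the legacy u8 path and the typed path.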
Generation extension points (default-implemented hooks)
- material_vocab() — non-Floor materials available on the cells
layer. Drives the universal generator's density sliders.
- clue_host_material() — substrate material that holds clue cells
(sudoku/takuzu: Floor; akari: Wall; slither: LabelOnly). Drives
Puzzle::is_clue_target and the editor's clue-cycle gate.
- clue_domain(p, coord) — valid clue values; empty means no clue
allowed at this coord.
- clue_pattern() — declarative shortcut for the "given a solution,
what's the clue here?" question. Three canonical shapes: Pin
(sudoku), Count { target, region } (akari/slither),
ConnectedSize { target } (nurikabe). The default derive_clue_at
body dispatches on this enum and wraps the result in
ClueValue::Number. Genres whose clues don't fit any of these shapes
(Tapa runs) override derive_clue_at directly and ignore
clue_pattern.
- derive_clue_at(p, sol, coord) -> Option<ClueValue> — inverse of
cell_clue. The walker calls it once per clueable cell to derive the
canonical clue set from a bare-solve completion. Default body
forwards through clue_pattern(); the escape valve for novel clue
shapes.
- scaffold_passes() — named structural pre-passes (akari ships six
wall styles + hole layouts).
- clue_post_filters() — value filters applied after clue derivation
(akari ships only_evens etc.).
- apply_cli_density(d, cfg) — maps the CLI's --density flag into
the right GenConfig slot for this genre.
- dims_for(rows, cols) — sudoku rounds rows to a square factor pair
and returns Boxed { box_w, box_h }.
See the Generator chapter for the full configuration surface.
Implementations live in core/src/puzzles/<genre>/mod.rs. The
marker struct is pub struct <Genre>Genre; — ZST, all logic on the
trait impl block.
The Puzzle struct
pub struct Puzzle {
pub genre: &'static crate::genre::Genre,
rows: usize,
cols: usize,
substrate: Substrate,
constraints: Vec<Constraint>,
meta: PuzzleMeta,
}
Same six fields across every genre. Serialization delegates to
WireDelta so the on-disk JSON elides any constraint / substrate
cell that the genre’s build_constraints + default_substrate
would produce on its own; custom constraints land in
constraints_added. Deserialize is not implemented directly —
rehydration needs the runtime &'static Genre handle, which serde
can’t thread through. Callers go through
Puzzle::from_tagged_json (envelope dispatch on the genre tag)
or crate::puzzle_wire::from_wire_delta(genre, wire).
Puzzle exposes a uniform API that covers most engine and frontend
needs:
impl Puzzle {
// ── Construction ──
pub fn from_state(
genre: &'static Genre,
rows: usize, cols: usize,
substrate: Substrate,
constraints: Vec<Constraint>,
) -> Self;
// ── Reads ──
pub fn rows(&self) -> usize;
pub fn cols(&self) -> usize;
pub fn substrate(&self) -> &Substrate;
pub fn constraints(&self) -> &[Constraint];
pub fn meta(&self) -> &PuzzleMeta;
pub fn coords(&self) -> impl Iterator<Item = Coord> + '_;
pub fn material(&self, r: usize, c: usize) -> Material;
pub fn cell_clue(&self, r: usize, c: usize) -> Option<u8>;
pub fn cell_clues(&self) -> impl Iterator<Item = ((usize, usize), u8)> + '_;
pub fn is_play_target(&self, coord: Coord) -> bool;
pub fn is_clue_target(&self, coord: Coord) -> bool;
pub fn is_floor(&self, r: usize, c: usize) -> bool;
pub fn is_wall(&self, r: usize, c: usize) -> bool;
pub fn is_hole(&self, r: usize, c: usize) -> bool;
pub fn walls(&self) -> impl Iterator<Item = (usize, usize)> + '_;
pub fn holes(&self) -> impl Iterator<Item = (usize, usize)> + '_;
pub fn box_size(&self) -> u8; // sqrt(rows), sudoku-shaped
// Geometry helpers (forward to substrate)
pub fn edges_of_cell(&self, r: usize, c: usize) -> [Coord; 4];
pub fn edges_of_vertex(&self, r: usize, c: usize) -> Vec<Coord>;
pub fn vertices_of_cell(&self, r: usize, c: usize) -> [Coord; 4];
pub fn vertices_of_edge(&self, edge: Coord) -> [Coord; 2];
pub fn cells_of_edge(&self, edge: Coord) -> Vec<Coord>;
pub fn cells_of_vertex(&self, r: usize, c: usize) -> Vec<Coord>;
// ── Mutation (always rebuilds constraints via puzzle.genre) ──
pub fn add_clue(&mut self, r: usize, c: usize, v: impl Into<ClueValue>);
pub fn clear_clue(&mut self, r: usize, c: usize);
pub fn set_material(&mut self, coord: Coord, m: Material);
pub fn set_floor(&mut self, r: usize, c: usize);
pub fn set_wall(&mut self, r: usize, c: usize);
pub fn set_hole(&mut self, r: usize, c: usize);
pub fn substrate_mut(&mut self) -> &mut Substrate;
pub fn constraints_mut(&mut self) -> &mut Vec<Constraint>;
// ── Genre-derived (dispatch through self.genre) ──
pub fn full_domain(&self, coord: Coord) -> std::vec::IntoIter<Mark>;
pub fn cell_counters(&self) -> &'static [CellCounter];
pub fn format_value(&self, v: u8) -> String;
// ── Wire format ──
pub fn to_tagged_json(&self) -> String;
pub fn from_tagged_json(s: &str) -> Result<Self, TagError>;
pub fn to_tagged_json_with_marks(&self, marks: &Marks) -> String;
pub fn from_tagged_json_with_marks(s: &str)
-> Result<(Self, Option<Marks>), TagError>;
pub fn to_janko(&self) -> String;
pub fn content_hash(&self) -> String;
}
Editing model. Constraints are canonical state but never edited
directly during normal use. Mutating the puzzle (add_clue,
set_wall, …) updates the substrate and re-derives the constraint
list from (self.genre.build_constraints)(substrate, clues). Builds
run in single-digit microseconds across genres at human-interaction
sizes, so we don’t bother with surgical updates.
constraints_mut() is exposed for use cases that need it (variant
constraint authoring, debugging) but isn’t the normal path.
Per-genre aliases and constructors
Each genre exposes a friendly type alias (vestigial since the
collapse — they’re just shorthand for Puzzle) and inherent
constructors on its marker:
pub type Sudoku = Puzzle;
impl SudokuGenre {
pub fn new(box_w: u8, box_h: u8,
givens: HashMap<(usize, usize), u8>) -> Puzzle { /* … */ }
pub fn empty(box_size: u8) -> Puzzle { /* … */ }
}
So SudokuGenre::new(3, 3, givens) returns a Puzzle whose
.genre is set to crate::genre::SUDOKU. The constructor builds
an empty puzzle, calls set_wall / set_hole / add_clue as
needed, and the substrate-then-rebuild path keeps constraints in
sync.
Akari constructors take (rows, cols, walls, clues, holes);
Slitherlink takes (rows, cols, clues).
Why a struct of fn pointers instead of a trait?
The pre-collapse design had trait Genre and Puzzle<G: Genre>.
Heterogeneous iteration over every genre meant a for_each_genre!
macro expanded at 11+ call sites across the workspace; runtime
genre lookup needed a parallel GenreEntry trait object slice;
loading a puzzle without knowing its genre at compile time meant
an AnyPuzzle enum.
Collapsing the trait into a value-typed struct paid off three ways:
- One iteration mechanism. Every "for every genre" call site
becomes for g in GENRES { … }. Both for_each_genre! and the
GenreEntry object slice deleted.
- One puzzle type. Puzzle carries its genre as a field; the
generic parameter is gone everywhere. AnyPuzzle deleted.
- Editor-friendly. Threading a runtime &'static Genre through
gesture-codec, variant, and apply paths is mechanical; threading
<G> through them is friction. The collapse unblocked the
universal-editor work.
The cost is a small indirection (an fn-pointer call) on the
per-genre dispatch path, which doesn’t show up in benchmarks
(propagation work dominates), and a one-time macro that copies
monomorphized items into the runtime literal. The full design lives
in penmark/notes/genre-collapse.md.
Presentation declarations
Every frontend (eframe today, WASM/web/SVG/terminal later) needs to
know two things to paint a puzzle: what’s at each position
structurally (Floor? Wall? Hole?) and what player marks mean
visually (lamp? digit? edge stroke?). The first answer reads from
Substrate (per-position Material); the second is a small closed
enum (MarkRenderStyle) declared per-genre as a const on the
GenreTrait impl and surfaced on the runtime handle as
genre.mark_render. A frontend implements one paint pass per
Material variant and one per MarkRenderStyle variant; new
genres land without touching the renderer.
The substrate side deliberately doesn’t carry given information
(clue values, pinned digits) — that comes from walking the
constraint list at paint time, not from the per-coord material.
Cell clues are read uniformly via Puzzle::cell_clue(r, c) which
forwards to (self.genre.cell_clue)(self, r, c).
pub enum MarkRenderStyle {
/// Cell-substrate, binary marks. `Binary(true)` is a token
/// (lamp / bulb / star), `Binary(false)` is a small X.
/// Akari, Star Battle.
Lamp,
/// Cell-substrate, numeric marks rendered via
/// `genre.format_value`. Sudoku, KenKen.
Glyph,
/// Cell-substrate, binary marks, tri-state visual: wall fill,
/// explicit empty fill, and a third "unknown" tone for cells
/// with no committed mark. Nurikabe, LITS, Hitori, Tapa,
/// Yin-Yang.
Fill,
/// Cell-substrate, numeric marks where `Numeric(tag)` is a
/// region tag; the renderer derives a partition. Fillomino,
/// Suguru.
Partition,
/// Edge-substrate, binary marks. `Binary(true)` is a thick
/// line, `false` is an X cross. Slitherlink, Masyu, Yajilin
/// loop edges.
EdgeStroke,
/// Edge-substrate, numeric marks: `Numeric(1|2)` is one or
/// two parallel lines. Hashi.
BridgeCount,
/// Corner-substrate, binary marks. Shingoki and other
/// vertex-mark genres.
CornerCircle,
/// Standard pass paints nothing; the genre's overlay handles
/// all mark visualisation.
Custom,
}
MarkRenderStyle is a semantic commitment (“Akari marks ARE
lamps”) — what a lamp looks like (yellow disc with rays in eframe,
a <div class="lamp"> in web, a * in a terminal) is the
frontend’s call. The genre declares what; frontends decide how.
The four small read accessors a frontend uses, in paint order:
- puzzle.material(r, c) -> Material — paint backgrounds.
- puzzle.cell_clue(r, c) -> Option<u8> — paint clue text.
- puzzle.is_play_target(coord) -> bool — gate input dispatch.
- puzzle.format_value(v) -> String — render a small int as the
genre's glyph. Forwards to (self.genre.format_value)(v).
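The "one arm per variant" commitment looks roughly like this terminal-flavoured sketch — the enum is trimmed to four variants, and the glyph choices are this hypothetical frontend's, not Penmark's:

```rust
// Trimmed stand-in for the full MarkRenderStyle enum.
enum MarkRenderStyle {
    Lamp,
    Glyph,
    Fill,
    EdgeStroke,
}

/// Render a committed mark as a terminal glyph. `value != 0` plays
/// the role of `Binary(true)` in the simplified binary cases.
fn paint_mark(style: &MarkRenderStyle, value: u8) -> String {
    let on = value != 0;
    match style {
        // the genre says "marks are lamps"; the frontend picks the look
        MarkRenderStyle::Lamp => String::from(if on { "*" } else { "x" }),
        MarkRenderStyle::Glyph => value.to_string(),
        MarkRenderStyle::Fill => String::from(if on { "#" } else { "." }),
        MarkRenderStyle::EdgeStroke => String::from(if on { "|" } else { "x" }),
    }
}
```

A new genre that reuses an existing style lands with zero renderer changes; only a genuinely new visual semantics forces a new enum variant and one new arm per frontend.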
Game<'p>
Holds the candidate sets per coord. Borrows its puzzle for its
entire lifetime — the borrow checker enforces “this game is bound
to this puzzle.” Cloning a Game copies only the candidate state
plus a reference; the static puzzle is never duplicated.
pub trait Game<'p>: Sized + Clone {
/// Fresh game bound to a puzzle — every coord carries its full
/// domain.
fn new(puzzle: &'p Puzzle) -> Self;
/// The puzzle this game is playing.
fn puzzle(&self) -> &'p Puzzle;
/// Current candidate set at a coordinate.
fn candidates(&self, coord: Coord) -> impl Iterator<Item = Mark> + '_;
/// Apply a move. No validity checks — the caller is responsible.
/// `Move` is `Copy` and small (≤ 32 bytes), so we pass by value:
/// no indirection on the DFS hot path, and the data lives in
/// registers across the call.
fn apply(&mut self, mv: Move);
/// Undo the most recently applied move. Implementations track the
/// history themselves (per-game undo stack), so the caller doesn't
/// pass the move back — pop-the-stack semantics. Cheaper at the
/// call site (one fewer parameter to thread through DFS frames)
/// and lets fast impls undo via diff-records they already keep
/// for incremental propagator state.
fn undo(&mut self);
// ─── Derived ─────────────────────────────────────────────────
/// `Contradicted(v)` if any constraint is `Violated` — `v` is
/// the violation reported by the *first* such constraint in
/// `puzzle.constraints()` order. Else `Solved` if every `Goal`
/// constraint is `Satisfied`. Else `InProgress`.
fn status(&self) -> Status;
// ─── Player-mark accessors ───────────────────────────────────
/// Iterate every coord currently committed to a single mark,
/// paired with that mark. Skips undecided and contradicted
/// coords. Default impl walks `puzzle().coords()` and yields
/// singletons; overrides can stream from a denser store.
fn marks(&self) -> impl Iterator<Item = (Coord, Mark)> + '_ { /* default */ }
/// Read the committed mark at `coord`. `None` for undecided or
/// empty-domain coords. Default impl runs in O(domain size)
/// via `candidates()`.
fn get(&self, coord: Coord) -> Option<Mark> { /* default */ }
/// Force `coord`'s candidate set to `{mark}` — the player UI
/// path. Doesn't go through the apply/undo log; player edits
/// aren't solver moves. No default: the trait's
/// `apply(Move::Commit)` only works for multi-candidate
/// coords, so it can't express "overwrite a previously-
/// committed value" — a common player gesture.
fn set(&mut self, coord: Coord, mark: Mark);
/// Restore `coord` to its full domain — the player's "clear
/// this cell" gesture. Like `set`, not undo-stack-tracked.
fn clear(&mut self, coord: Coord);
// ─── Serialization (game-with-givens) ────────────────────────
/// Snapshot the game's current state: every puzzle coord paired
/// with its remaining candidate list. Forgiving by design — a
/// caller can save partial state and round-trip it. The
/// matching wire shape lives in `core::tagged::Marks` (see the
/// *Tagged serialization* chapter).
fn to_marks(&self) -> Marks { /* default impl walks coords */ }
/// Build a fresh game and replay each saved candidate
/// restriction into it. Coords absent from `marks` are left at
/// their full domain — partial saves stay loadable. Saved
/// marks outside a coord's full domain are silently dropped;
/// the engine's domain is the source of truth.
fn from_marks(puzzle: &'p Puzzle, marks: &Marks) -> Self { /* default impl */ }
}
Saved-game serialization round-trips through Puzzle’s tagged-JSON
methods (to_tagged_json_with_marks / from_tagged_json_with_marks,
see the Tagged serialization chapter); the loader then rebuilds
a Game in two steps because the borrow checker won’t let one
function return both the owned puzzle and a game that borrows it:
let (puzzle, marks) = Puzzle::from_tagged_json_with_marks(s)?;
let game = match marks {
Some(m) => FastGame::from_marks(&puzzle, &m),
None => FastGame::new(&puzzle),
};
Tagged serialization and dispatch
Every serialized puzzle (and every serialized game) carries its own
genre tag. The wire format is the same in every direction (db row,
data/puzzles/<genre>.jsonl.gz line, penmark import input, future
WASM IPC) — a self-describing JSON envelope:
{
"genre": "akari",
"data": { "rows": 7, "cols": 7, "walls": [...], ... },
"marks": { // optional — present iff this is a saved game
"cell:00": ["bulb", "empty"],
"cell:01": ["empty"],
...
}
}
The marks field is the saved-game extension: present means “this
is the puzzle plus where the player is now”; absent means “bare
puzzle.” A serialized game always carries the givens — data
is the full puzzle definition, every time. There is no marks-only
wire form, by design: a save-game blob is portable on its own and
needs no separate puzzle reference to render or play.
After the trait→struct collapse there is exactly one path into the format:
- Puzzle::to_tagged_json / from_tagged_json — for every caller,
whether or not it knows the genre yet. from_tagged_json reads the
"genre" field from the JSON envelope, looks the name up via
core::genre::by_name, and constructs a Puzzle whose .genre field
is that runtime handle. The pre-collapse AnyPuzzle enum (one arm
per genre) is gone — the runtime handle carries everything it used
to encode in its discriminant.
- Puzzle::to_tagged_json_with_marks / from_tagged_json_with_marks —
the saved-game pair. The latter returns (Puzzle, Option<Marks>).
impl Puzzle {
pub fn to_tagged_json(&self) -> String;
pub fn from_tagged_json(s: &str) -> Result<Self, TagError>;
pub fn to_tagged_json_with_marks(&self, marks: &Marks) -> String;
pub fn from_tagged_json_with_marks(s: &str)
-> Result<(Self, Option<Marks>), TagError>;
}
/// Errors from tagged-JSON parsing.
pub enum TagError {
Json(serde_json::Error),
UnknownGenre(String),
GenreMismatch { expected: &'static str, found: String },
}
Round-trip property tests pin the agreement:
- Puzzle: for every Puzzle value p, p.to_tagged_json() parses back
to a Puzzle whose .genre and storage equal the original.
- Game: for every (puzzle, sequence of valid moves), building the
game, applying the moves, calling to_tagged_json_with_marks, parsing
via from_tagged_json_with_marks, and rebuilding via
Game::from_marks produces a game whose status and candidates(c)
match the original at every coord.
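In the spirit of the puzzle round-trip property, a toy check — the wire format below is an illustrative stand-in, not Penmark's tagged JSON:

```rust
#[derive(Debug, PartialEq, Clone)]
struct Puzzle {
    genre: String,
    rows: usize,
    cols: usize,
}

// Toy "tagged" encoding: the genre name travels with the data.
fn to_tagged(p: &Puzzle) -> String {
    format!("{}:{}x{}", p.genre, p.rows, p.cols)
}

fn from_tagged(s: &str) -> Option<Puzzle> {
    let (genre, dims) = s.split_once(':')?;
    let (r, c) = dims.split_once('x')?;
    // a real decoder would also look `genre` up in the registry here
    Some(Puzzle {
        genre: genre.to_string(),
        rows: r.parse().ok()?,
        cols: c.parse().ok()?,
    })
}
```

The property is simply from_tagged(to_tagged(p)) == Some(p) for every p, plus rejection of blobs that don't carry a readable tag.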
Adding a new genre = one &'static Genre literal + one entry in
the GENRES slice (see Adding a genre). The serialization paths
require no changes — they look up by name.
How variants ride inside
Variants of an existing genre — Killer Sudoku, Thermo, Sandwich,
Sudoku-X, German Whispers — are not new genres. They’re additional
constraints stacked onto the same puzzle’s constraints vector, so
the genre tag on the wire stays the same regardless of how exotic
the puzzle gets.
canonical/sudoku/variants.rs is the worked example:
- A VariantTag enum names every recognised variant (Killer,
Thermometer, Sandwich, AntiKnight, NonConsecutive, …).
- A Constraints struct carries one Vec<…> field per variant with
payload (killer: Vec<KillerCage>, thermometer: Vec<Thermometer>, …),
each #[serde(default, skip_serializing_if = "Vec::is_empty")] so
absent variants vanish from the JSON.
- A TOGGLE_VARIANTS const lists the payload-less ones (Diagonals,
AntiKnight, NonConsecutive, …); they ride in a
variants: Vec<VariantTag> list on the puzzle.
A Killer-plus-Thermo-plus-Sudoku-X puzzle round-trips as:
{
"genre": "sudoku",
"data": {
"box_size": 3,
"givens": { ... },
"variants": ["diagonals"],
"constraints": {
"killer": [{ "cells": [...], "total": 23 }, ...],
"thermometer": [{ "bulb": [3,3], "path": [...] }, ...]
}
}
}
Two clean properties fall out:
- The dispatch tag never grows. Every Sudoku — vanilla, Killer,
Thermo, all of them — serializes as "genre": "sudoku". The variant
universe expands inside Sudoku's data shape. New variant = new
field on Constraints + new VariantTag arm; no change to the genre
registry, no change to the db schema, no change to the import
pipeline.
- Querying for variants stays cheap without normalising. The db
row's data JSON is searchable by tag
(data->'variants' @> '["diagonals"]' in Postgres,
json_extract(data, '$.variants') in SQLite) — no separate
puzzle_variants join table required unless query patterns demand
it. If "show me every Killer" earns a denormalised indexed column,
it's the same write-side helper pattern as genre: extract
variants[] into a sibling text array.
What this rules out: “Killer Sudoku is its own genre with its own dispatch arm.” That’d contradict the constraint-composition thesis the rest of the design is built on, and force every cross-genre caller to know that Killer is just-Sudoku-but-different. The genre is the rule vocabulary the puzzle uses; variants are constraints stacked on top.
What this unifies
- Database — `puzzles(id, genre TEXT, data JSON, rows, cols, difficulty, hard_score, ...)` where `data` is the tagged blob and `genre` is denormalised from the tag for indexing. Loader is `Puzzle::from_tagged_json(row.data)?`. A `CHECK` constraint or a write-side helper enforces that the column and the blob’s tag agree.
- `data/puzzles/<genre>.jsonl.gz` — one tagged blob per line. The per-genre file split stays (keeps file sizes / query patterns reasonable), but each line is self-describing — a stray Sudoku blob in the Akari file is rejected at read time via the tag / expected-genre check.
- `penmark import` — reads puzzlehound’s collated output, parses each line via `Puzzle::from_tagged_json`, validates the resulting `.genre.name` against the target file’s expected genre, appends. The “what genre is this?” question is asked once (during the tag lookup) and the rest of the pipeline operates on the resulting `Puzzle`.
- Saved games — `Puzzle::to_tagged_json_with_marks(&marks)` produces the same envelope with a `marks` field appended. A save / share / URL-hash carries everything needed to render and play in one blob; the givens travel with the marks, so a recipient can play with no prior context.
- WASM (when it lands) — the JS side hands a tagged blob to Rust, which parses through `Puzzle::from_tagged_json[_with_marks]` and dispatches per `puzzle.genre`. No parallel envelope format, no out-of-band genre header, no separate “puzzle channel” and “game-state channel”.
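The read-time tag / expected-genre check can be sketched as follows. This is illustrative only — the real pipeline goes through `Puzzle::from_tagged_json` with full deserialization; here a naive string scan (`tag_of`, `check_expected`, both hypothetical helpers) stands in, so the example stays dependency-free:

```rust
/// Pull the value of `"genre":"<name>"` out of a canonical jsonl line.
/// Naive: assumes the writer emits this exact key once, unescaped.
fn tag_of(line: &str) -> Option<&str> {
    let key = "\"genre\":\"";
    let start = line.find(key)? + key.len();
    let rest = &line[start..];
    let end = rest.find('"')?;
    Some(&rest[..end])
}

/// The boundary check: a blob whose tag disagrees with the file's
/// expected genre is rejected before any genre-specific decoding runs.
fn check_expected(line: &str, expected: &str) -> Result<(), String> {
    match tag_of(line) {
        Some(tag) if tag == expected => Ok(()),
        Some(tag) => Err(format!("genre mismatch: got {tag}, expected {expected}")),
        None => Err("missing genre tag".to_string()),
    }
}

fn main() {
    let line = r#"{"genre":"sudoku","data":{}}"#;
    assert!(check_expected(line, "sudoku").is_ok());
    // A stray Sudoku blob in the Akari file fails at read time.
    assert!(check_expected(line, "akari").is_err());
}
```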
The thing this doesn’t do is force every caller to think about
dispatch. Puzzle::from_tagged_json(blob) returns a Puzzle that
already knows its genre; method calls go through the fn-pointer
field, not through a match.
Dispatching
Every “for every genre” call site iterates the GENRES slice
declared in core/src/genre.rs:
```rust
pub static GENRES: &[&Genre] = &[AKARI, SLITHER, SUDOKU, TAKUZU];
```
CLI, eframe, and the bench harness all walk this slice; there is no
parallel hand-maintained list to keep in sync. Adding a genre means
one &'static Genre literal + one append to the slice (see Adding
a genre).
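A minimal sketch of the pattern — the field set beyond `name` / `display_name` is illustrative (the real `Genre` carries fn-pointer fields for parsing, painting, generation, etc.), but the slice-plus-`by_name` shape is what the call sites above rely on:

```rust
// Illustrative stand-in for the runtime Genre struct; only the two
// fields the iteration example uses, plus one made-up scalar.
pub struct Genre {
    pub name: &'static str,
    pub display_name: &'static str,
    pub default_density: f64, // hypothetical field
}

pub static AKARI: &Genre = &Genre { name: "akari", display_name: "Akari", default_density: 0.42 };
pub static SUDOKU: &Genre = &Genre { name: "sudoku", display_name: "Sudoku", default_density: 0.25 };

// The single registry every "for every genre" call site walks.
pub static GENRES: &[&Genre] = &[AKARI, SUDOKU];

pub fn by_name(name: &str) -> Option<&'static Genre> {
    GENRES.iter().copied().find(|g| g.name == name)
}

fn main() {
    for g in GENRES {
        println!("{} — {}", g.name, g.display_name);
    }
    assert!(by_name("akari").is_some());
    assert!(by_name("kakuro").is_none()); // not registered
}
```

Adding a genre is exactly the advertised two steps: one more `&'static Genre` literal, one more entry in the slice.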
The runtime Genre struct answers every per-genre question the
harnesses used to out-source: parser, default density, dimension
shape, generation pipeline, painter overlays, etc. Call sites
dispatch through the fn-pointer fields:
```rust
// CLI command dispatch:
let Some(genre) = core::genre::by_name(&args.genre) else {
    bail!("unknown genre: {}", args.genre);
};
let p = (genre.from_janko_text)(&input)?;
let out = (genre.to_puzzlink_url)(&p)?;

// Iterating every genre:
for genre in core::genre::GENRES {
    println!("{} — {}", genre.name, genre.display_name);
}
```
penmark list-impls walks GENRES and prints genre.name per
entry; the technique list is queried through
core::algorithm::grader::technique_names(), which is itself genre-
neutral now that techniques are object-safe.
The pre-collapse design had a for_each_genre! macro expanded at
11+ call sites across the workspace plus a parallel GenreEntry
trait-object slice for runtime iteration. Both deleted. The full
rationale lives in penmark/notes/genre-collapse.md.
Propagation engine
Underneath Solver, Generator, and Grader is a single propagation
engine that runs the same way for every genre. It has one currency
— Move — and two roles for the rule machinery, kept clean.
```rust
pub enum Move {
    /// Eliminate `mark` from `coord`'s candidate set.
    Eliminate { coord: Coord, mark: Mark },
    /// Narrow `coord`'s candidates to exactly `mark`.
    Commit { coord: Coord, mark: Mark },
}
```
A Move is the unit of change. Eliminate is fine-grained (drop
one mark from a coord’s candidate set); Commit is coarse (collapse
to a singleton). Commit is conceptually equivalent to eliminating
every other mark, but it’s its own variant because (a) it’s the
player’s natural gesture — “place a bulb here” — and (b) the engine
can shortcut it as a one-bit-set bitmask write instead of N
eliminations. Game::apply and Game::undo both consume Moves.
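The bitmask shortcut can be shown concretely. This sketch collapses `Mark` to its pre-mapped bit (the real `Move` carries a `Mark` and the game maps it through `mark_to_bit`); the point is that `Commit` is a single whole-mask write while `Eliminate` clears one bit:

```rust
#[derive(Clone, Copy)]
enum Move {
    Eliminate { coord: usize, bit: u32 }, // drop one mark
    Commit { coord: usize, bit: u32 },    // collapse to a singleton
}

/// Apply a move to bitmask-backed candidate sets, returning the
/// pre-move mask so an undo log can restore it later.
fn apply(masks: &mut [u32], m: Move) -> u32 {
    match m {
        Move::Eliminate { coord, bit } => {
            let pre = masks[coord];
            masks[coord] = pre & !bit; // one candidate fewer
            pre
        }
        Move::Commit { coord, bit } => {
            let pre = masks[coord];
            masks[coord] = bit; // one-bit-set write, not N eliminations
            pre
        }
    }
}

fn main() {
    let mut masks = vec![0b111u32, 0b011];
    apply(&mut masks, Move::Eliminate { coord: 0, bit: 0b001 });
    assert_eq!(masks[0], 0b110);
    apply(&mut masks, Move::Commit { coord: 1, bit: 0b010 });
    assert_eq!(masks[1], 0b010);
}
```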
The two roles, then:
- Constraints are predicates. They report `Satisfied`, `Pending`, or `Violated(reason)` against the current state. They don’t produce moves; the engine never asks them to.
- Propagator-tagged techniques produce moves. Techniques flagged `Phase::Propagation` are the cheap, rule-shape-specific propagators that emit `Deduction { observed, moves }`. Naked singles, bulb-clears-LOS, clue-saturation, etc. live here.
This split means a constraint’s eval is a yes/no/maybe check; the
work of deriving what to do about it lives in propagators. If a
propagator’s pass produces moves, those moves change the state, and
the next pass evaluates constraints again. Fixed-point iteration
handles correctness without Forces plumbing on the constraint
side.
```rust
pub enum Phase {
    /// Cheap forward inference — runs in the inner propagation
    /// loop after every move, to fixpoint. Akari LOS clearing,
    /// Sudoku naked singles, clue-saturation,
    /// single-lighter-must-be-bulb.
    Propagation,
    /// Higher-cost pattern matching — runs in the outer grader
    /// loop only when propagation has stalled. X-Wing, Hidden
    /// Pairs, Akari clue-cover-by-elimination, paired-clue
    /// interactions.
    Pattern,
}

/// Object-safe: the grader holds a `Vec<Box<dyn Technique>>`.
pub trait Technique: Send + Sync {
    fn name(&self) -> &'static str;
    fn rule_kinds(&self) -> &'static [RuleKind];
    fn fire_once(&self, ci: usize, c: &Constraint, prop: &Propagator)
        -> Vec<TechniqueMatch> {
        Vec::new() // default: no matches
    }
    fn fire_all(&self, prop: &Propagator) -> Vec<TechniqueMatch>;
}
```
The loop
`propagate(puzzle, game) -> Result<Vec<SolveStep>, Violation>`:

```text
loop:
    before = game.move_count()
    for each Technique t with PHASE = Propagation, applicable to puzzle:
        run t.fire on builder; apply resulting Deduction; record steps
    for each Constraint c in puzzle.constraints():
        if c.eval(game) == Violated(v):
            return Err(v)
    if game.move_count() == before:
        return Ok(steps)  # fixpoint reached
```
Each pass: run all applicable propagators, applying any moves they produce; then check every constraint for a violation; loop until no move is made. Constraints are pure predicates; propagators are the move-producing engine.
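The same fixpoint shape, shrunk to a runnable toy: one propagator (naked singles inside a single all-different group), one constraint check (no empty domain), looped until a pass makes no move. Everything here is illustrative; the real engine is generic over rule kinds and techniques:

```rust
/// Toy propagator: wherever a cell is a singleton, eliminate its mark
/// from every other cell in the (single, implicit) all-different group.
/// Returns how many moves it made this pass.
fn naked_singles(masks: &mut [u32]) -> usize {
    let mut moves = 0;
    for i in 0..masks.len() {
        if masks[i].count_ones() == 1 {
            let bit = masks[i];
            for j in 0..masks.len() {
                if j != i && masks[j] & bit != 0 {
                    masks[j] &= !bit;
                    moves += 1;
                }
            }
        }
    }
    moves
}

/// Fixpoint loop in the shape of `propagate`: propagate, check the
/// constraint for violation, stop when a pass produces no move.
fn propagate(masks: &mut [u32]) -> Result<(), &'static str> {
    loop {
        let moved = naked_singles(masks);
        if masks.iter().any(|&m| m == 0) {
            return Err("violated: empty domain");
        }
        if moved == 0 {
            return Ok(()); // fixpoint reached
        }
    }
}

fn main() {
    // Three cells over marks {1,2,3} as bits; cell 0 is a given.
    let mut masks = vec![0b001u32, 0b011, 0b111];
    propagate(&mut masks).unwrap();
    assert_eq!(masks, vec![0b001, 0b010, 0b100]);
}
```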
How callers use this
Three callers weave propagate into their own outer loops, each
adapting the engine to a different goal.
Solver fires the generic Propagator — the same set of
forced-move derivations the engine knows about, batched and
unnamed for speed. There’s no per-move trace; the solver just
wants the answer. When propagation stalls without a solution or
contradiction, the solver branches (MRV-ordered open coord,
trial-1 with intra-sweep cascade) and recurses.
Grader fires the same forced-move logic, but as named
Phase::Propagation techniques, one at a time, so each move lands
in the trace tagged with the technique that derived it. The
profile-driven scorer reads those tags to assign difficulty. When
propagation stalls, the grader walks Phase::Pattern techniques
in priority order, fires the first one that hits, and re-enters
propagation. When no Pattern technique fires either, it escalates
through backward fallbacks (clue-cover, trial-1, trial-2/N),
each itself a Pattern-phase technique recorded in the trace. See
the Grader chapter for the full layering.
Discovery loop runs propagation to the stall point, captures the state as a specimen (board layout, prior moves, the trial-search-forced move that no technique predicted), applies that move, and continues. The harness mines specimens that expose gaps in the technique library.
Three solvers, one contract
FastSolver’s body is: propagate to fixpoint → trial-1 sweep
(MRV order, intra-sweep cascade) → if still in progress, branch on
the first MRV-ordered open coord. The propagation step uses the
generic Propagator, which derives forced moves from each rule
kind’s cached aggregate state — no genre-specific code. This is the
production solve path for every genre.
SimpleSolver is the deliberate brute-force baseline:
HashMap<Coord, Vec<Mark>> candidates, no propagation, full DFS.
Used for cross-checks on small puzzles and as the bench reference
every faster impl beats badly.
The OR-Tools wrapper in algorithm/ortools/ translates each
constraint into CP-SAT primitives and asks Google’s solver for
solutions. It’s slower than FastSolver and not on the production
path; its job is to be a totally independent oracle. When the three
disagree on solve(p, k).len() or on the solution sets, the bug is
on our side.
Puzzle dataset gathered by Puzzlehound
The engine reads puzzles from per-genre canonical files —
penmark/data/puzzles/<genre>.jsonl.gz, one tagged JSON blob per
line in the format described in Tagged serialization and dispatch.
The per-line shape is whatever Puzzle::to_tagged_json() produces;
FastSolver, StandardGrader, the eframe dataset browser, the
benchmarks, and the test fixtures all consume this format and only
this format.
The per-genre file split is for query patterns and reasonable file
sizes, not for type discrimination — every line is self-describing,
so a stray Sudoku blob in the Akari file is rejected at read time
with a TagError::GenreMismatch, never silently decoded into the
wrong type.
Strict mode at the boundary: every field in a canonical record maps to a typed variant struct, or the record is rejected at write time. No opaque escape hatch — if it can’t be encoded, it isn’t stored. Downstream consumers never branch on “this field might not be there”.
Where the puzzles come from: puzzlehound
Penmark doesn’t fetch puzzles. puzzlehound does — a sibling
project at threeemojis/puzzlehound/ (Python, has its own README)
that discovers, fetches, and collates puzzles from registered
sources (logic-masters.de, Cracking the Cryptic spreadsheets,
gmpuzzles RSS, swaroopg92 Atom, tdoku benchmarks, …) into a form
penmark can consume directly.
penmark import reads puzzlehound’s collated output, parses each
record with Puzzle::from_tagged_json, validates puzzle.genre.name
against the target file’s genre, and appends as a tagged blob. From
there the dataset-reading verbs (solve, grade-canon,
profile-eval) take over. Adding a new upstream source is
puzzlehound’s problem; adding a new genre to canonical-ize into is
penmark’s — one new entry in the GENRES slice + one new
data/puzzles/<genre>.jsonl.gz file.
Solver
Solver finds solutions to a puzzle. Three live impls today, one
contract.
```rust
/// One valid solution: the candidate set per coord narrowed to a
/// singleton, expressed as a flat coord → mark map. Genre-specific
/// readers (e.g., "extract bulb positions") project from this.
pub type Solution = HashMap<Coord, Mark>;

/// Up to `max_solutions` solutions, in unspecified order. Length
/// ≤ `max_solutions`; equal to the true solution count if fewer.
/// The generator's uniqueness gate calls `solve(p, 2)` and checks
/// `len == 1`.
pub type SolutionSet = Vec<Solution>;

pub trait Solver {
    type Config: Default + Serialize + DeserializeOwned;
    const NAME: &'static str;
    fn new(config: Self::Config) -> Self;
    fn solve(&self, puzzle: &Puzzle, max_solutions: usize) -> SolutionSet;
}

/// Cross-puzzle solver options. Per-impl Configs typically embed
/// this with `#[serde(flatten)]`.
pub struct SolverParams {
    pub time_budget: Duration,
}
```
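The uniqueness gate mentioned in the doc-comment, reduced to its core logic as a sketch. The solver is a stand-in closure and `Solution` is flattened to a `Vec` of coord/mark pairs, so this compiles without the real trait; the real call is `solver.solve(&puzzle, 2)`:

```rust
/// Ask for at most two solutions; the puzzle is unique iff exactly
/// one comes back. Capping at 2 means the solver can stop early
/// instead of enumerating everything.
fn is_unique<F>(solve: F) -> bool
where
    F: Fn(usize) -> Vec<Vec<(usize, u8)>>, // max_solutions -> solutions
{
    solve(2).len() == 1
}

fn main() {
    // Stand-in solvers: one unique puzzle, one ambiguous puzzle.
    let unique = |_max: usize| vec![vec![(0usize, 1u8)]];
    let ambiguous =
        |max: usize| vec![vec![(0usize, 1u8)], vec![(0, 2)]][..max.min(2)].to_vec();
    assert!(is_unique(unique));
    assert!(!is_unique(ambiguous));
}
```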
One CSP engine for every genre
The generic FastSolver, driven through Propagator over FastGame,
is the single solve path for every genre. The rule vocabulary —
DegreeIn, CellMin, LoopFreezesCounts, plus the per-cell counter
machinery the engine maintains automatically — is expressive enough
that no per-genre solver override is needed. Per-genre fast solvers
(e.g. a future AkariFastSolver that registers fast genre-specific
propagators while keeping the same outer DFS shape) are post-MVP;
the generic impl earns its keep on every benchmark we run today.
OR-Tools as the sanity-check oracle
A separate algorithm/ortools/ impl wraps Google OR-Tools’ CP-SAT
behind the same Solver trait. It’s not on the hot path —
FastSolver outperforms it for everything we ship. What it provides
is a totally different code path with the same contract: every
generated puzzle and every imported dataset can be cross-checked
against an industry-grade CSP solver to confirm the answer set
matches. When FastSolver and OR-Tools disagree, the bug is in our
engine, full stop. A side benefit of going hard on the CSP-engine path:
once the rule vocabulary maps cleanly onto CP-SAT primitives, getting a
second opinion is mostly plumbing.
Cross-checks
FastSolver, SimpleSolver, and the OR-Tools wrapper all implement
the same Solver trait. The benchmark harness runs them in the same
Criterion invocation on a fixed puzzle set; the differential tests
run them over a fuzzed corpus and assert solve(p, k).len() agrees
and the returned sets are equal as sets when fewer than
max_solutions exist. SimpleSolver is the brute-force baseline,
OR-Tools is the independent oracle, FastSolver is the production
path. Three impls, one contract, three angles on every disagreement.
Fast solver
A report on the architecture of the FastSolver (penmark-core),
the optimizations that landed, the ones we tried and reverted, and
ideas we discussed but haven’t pursued.
The engine lives under core/src/algorithm/ in two families:
- `algorithm/fast/` — the production family.
  - `game.rs` — `FastGame`, the bitmask-backed CSP state (per-coord candidate bitmasks + apply/undo + slow `status()`).
  - `propagator.rs` — `Propagator`, the speed: every incremental cache (per-constraint aggregates, per-cell counters, dirty queue, region indices, trial order) plus the `forced_moves_at` / `eval_constraint_at` machinery.
  - `solver.rs` — `FastSolver`, the search loop. Drives a `Propagator` wrapping a `FastGame`.
- `algorithm/simple/` — the brute-force reference family.
  - `game.rs` — `SimpleGame`, a `HashMap<Coord, Vec<Mark>>`-backed minimal `Game` impl. No bitmask, no caches.
  - `solver.rs` — `SimpleSolver`, plain DFS over `SimpleGame`. The correctness baseline — every other solver should beat it badly.
- `algorithm/geometry.rs` — region/reach helpers shared between the two families.
- `algorithm/eval.rs` — generic constraint evaluator with both a fast `eval_constraint` (assumes pre-resolved regions) and a slow `eval_constraint_full` (handles `LosCross` / `Neighbors4` / `CellMin` by walking the puzzle on the fly).
- `algorithm/grader.rs` — `StandardGrader`, the human-techniques grader. Drives a `Propagator` (same engine as `FastSolver`) and fires named techniques in priority order. Techniques are generic, keyed off `RuleKind` + cached aggregate state: `exact-count-zero`, `exact-count-saturated`, `exact-count-forced`, `at-most-saturated`, `at-least-one-witness`, `cell-min-witness`, plus `trial-1` as backward escalation. Any genre using these rule kinds gets the techniques for free; genres needing bespoke patterns can append their own `Technique` impls.
1. What the solver does
At every DFS node:
- Forward propagation to fixpoint. AC-3-style worklist over constraints, plus a parallel dirty-cell drain for `CellMin`-style counter rules. Continues until both queues are empty.
- Trial-1 sweep with intra-sweep cascade. Walk open coords in degree-descending order (`game.trial_order()`). At each coord, hypothesize every candidate; any commit that propagates to a contradiction is recorded as a forced elimination. Eliminations are applied immediately and followed by an inline forward-propagation drain, before moving to the next coord.
- Branch. If still `InProgress`, pick the first `trial_order`-ranked open coord and branch over its candidates. For binary domains (Akari) “smallest domain” MRV is degenerate (every open cell has 2 candidates) — degree is the heuristic that actually shrinks the search tree.
The undo log inside FastGame records every applied move; on exit
from each DFS frame we pop exactly the count of moves applied at
this level.
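The trial-1 sweep plus per-hypothesis undo can be sketched as a toy. Names and the "propagation" (just the empty-domain check after a toy all-different elimination) are illustrative; what carries over is the shape: checkpoint the undo log, commit a hypothesis, roll back, and keep only the forced elimination:

```rust
struct Game {
    masks: Vec<u32>,
    undo: Vec<(usize, u32)>, // (coord, pre-move mask)
}

impl Game {
    fn commit(&mut self, coord: usize, bit: u32) {
        self.undo.push((coord, self.masks[coord]));
        self.masks[coord] = bit;
    }
    fn eliminate(&mut self, coord: usize, bit: u32) {
        self.undo.push((coord, self.masks[coord]));
        self.masks[coord] &= !bit;
    }
    /// Pop exactly the moves applied since `mark` — the per-frame undo.
    fn rollback_to(&mut self, mark: usize) {
        while self.undo.len() > mark {
            let (coord, pre) = self.undo.pop().unwrap();
            self.masks[coord] = pre;
        }
    }
}

fn contradicts(g: &Game) -> bool {
    g.masks.iter().any(|&m| m == 0) // some domain emptied
}

/// Trial-1 at one coord: hypothesize each candidate; if the toy
/// propagation (eliminate the same mark at `other`) contradicts,
/// the elimination at `coord` is forced and survives the rollback.
fn trial_at(g: &mut Game, coord: usize, other: usize) {
    let cands = g.masks[coord];
    let (mut bit, mut rest) = (1u32, cands);
    while rest != 0 {
        if rest & 1 != 0 {
            let mark = g.undo.len(); // checkpoint
            g.commit(coord, bit);
            g.eliminate(other, bit);
            let bad = contradicts(g);
            g.rollback_to(mark);
            if bad {
                g.eliminate(coord, bit); // forced elimination stays
            }
        }
        rest >>= 1;
        bit <<= 1;
    }
}

fn main() {
    // coord 1 only holds mark 0b01, so hypothesizing 0b01 at coord 0
    // starves coord 1 — trial-1 forces 0b01 out of coord 0.
    let mut g = Game { masks: vec![0b11, 0b01], undo: vec![] };
    trial_at(&mut g, 0, 1);
    assert_eq!(g.masks[0], 0b10);
    assert_eq!(g.masks[1], 0b01);
}
```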
The same three-phase shape lives in the grader, except:
- the grader runs in an outer loop instead of recursing;
- the grader keeps a step trace tagged by `RuleKind`;
- the grader never branches (trial-1 is its terminal technique).
The shared propagation routines are intentionally close to literal copies — keeping them divergent for tiny gains is more cost than it’s worth.
2. The FastGame / Propagator split
FastGame is intentionally minimal: per-coord candidate bitmasks,
the coord_index / mark_to_bit lookups, the undo_stack, and an
empty_count counter that lets status() short-circuit O(1) when
no domain is empty. Apply and undo do the bit-flip + push/pop the
undo log + maintain empty_count. That’s it. status() walks all
constraints with a slow per-region resolver — fine for tests and
the brute-force Simple solver, slow on the hot path.
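The `empty_count` trick is small enough to show whole. A minimal sketch (struct and method names are stand-ins for the real `FastGame`): maintain the count of empty domains on every mask write, so the "contradicted?" half of `status()` never scans:

```rust
struct FastGame {
    masks: Vec<u32>,
    empty_count: usize, // how many coords currently have no candidates
}

impl FastGame {
    /// Write a new mask, keeping `empty_count` in sync. Returns the
    /// pre-move mask so undo can call `write` again to restore it.
    fn write(&mut self, coord: usize, new: u32) -> u32 {
        let pre = self.masks[coord];
        self.empty_count += (new == 0) as usize;
        self.empty_count -= (pre == 0) as usize;
        self.masks[coord] = new;
        pre
    }
    fn contradicted(&self) -> bool {
        self.empty_count > 0 // O(1), no constraint walk
    }
}

fn main() {
    let mut g = FastGame { masks: vec![0b11, 0b01], empty_count: 0 };
    let pre = g.write(1, 0); // empty a domain
    assert!(g.contradicted());
    g.write(1, pre); // undo restores the count too
    assert!(!g.contradicted());
}
```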
Propagator<'g, 'p> borrows &'g mut FastGame<'p> and owns
every incremental cache. Its constructor builds coord_constraints,
region_indices, trial_order, the per-counter inverse-reach
maps, the lifted CellMin rule table, the per-constraint
aggregates, and the per-cell counters — all from the game’s current
candidate masks. After that, the propagator is the sole driver of
apply / undo: each call flips the game’s bits, then walks the
incremental updates against the (pre, post) mask delta.
The propagator keeps a parallel transition_log so undo can
reverse cache deltas without inspecting the game’s undo stack —
two stacks, kept in sync by the rule “only the propagator drives
moves during a solve.” Anyone holding a &mut FastGame (player
UI, parser tests, future slow grader) can still apply moves; they
just don’t get the cache.
3. State that makes propagation cheap
The hot loops touch these structures:
- Bitmask candidates. `Vec<u32>` per-coord, one bit per mark in the puzzle’s universe (cap 32 marks, see `MAX_MARKS`). Apply / undo become `mask & !bit` / `mask | bit`. Iteration is a `BitIter` on the mask. Replaces the original `Vec<Vec<Mark>>`.
- Static region indices. Each constraint’s region is resolved once at construction to a `Vec<usize>` of dense coord indices. Removes the per-eval `coord → idx` HashMap lookup.
- First-class LOS regions. `Puzzle::los_blocks` lets the generic region walker handle `Los`, `LosCross`, and `Neighbors4` natively — the genre doesn’t reimplement ray walking.
- Per-constraint aggregates. Each constraint caches `committed_with_mark` and `possible_with_mark` for its primary mark. `apply` / `undo` keep these in sync; `eval_constraint_at` reads them in O(1) instead of rescanning the region. Rules without a primary mark (`Decided`, `Distinct`, `NoTwoByTwo`) carry a zero target mask — the update is a no-op and the generic scan handles them.
- Per-cell counters with a dirty-cell queue. `CellMin` (the Akari “lit at least once” rule) is lifted out of the AC-3 worklist and onto a dedicated coord-indexed counter with a `dirty_cells` Vec. A 100×100 Akari has thousands of `CellMin` rules; routing them through AC-3 would put thousands of constraints in the worklist. Routing through the counter + dirty-cell queue costs a single push per affected coord.
- `trial_order`. Coords sorted by descending degree (regular constraints + `CellMin` rules touching them). Drives the trial-1 sweep order and DFS branch ordering — the branch coord is the first one in `trial_order` with two or more candidates.
- `FxHashMap` for `coord_index` and `mark_to_bit`. Cheap hashing replacing the std SipHash default; not visible in any one profile but compounds across construction.
- Undo log of bit-level deltas. `AddBack { idx, bit }` and `Restore { idx, mask }` entries. `apply` pushes; `undo` pops.
- Reusable `Vec<Move>` buffer. `forced_moves_at` writes into the caller’s buffer instead of returning a fresh allocation per call. Same trick for `cell_min_forced_moves_at`.
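One plausible shape for the `BitIter` mentioned above — iterate the set bits of a candidate mask with `trailing_zeros` instead of walking a `Vec<Mark>` (a sketch; the real type may differ):

```rust
/// Iterator over the set-bit indices of a u32 candidate mask.
struct BitIter(u32);

impl Iterator for BitIter {
    type Item = u32; // bit index = mark index
    fn next(&mut self) -> Option<u32> {
        if self.0 == 0 {
            return None;
        }
        let i = self.0.trailing_zeros();
        self.0 &= self.0 - 1; // clear lowest set bit
        Some(i)
    }
}

fn main() {
    // Mask 0b1011 holds marks 0, 1, and 3.
    let marks: Vec<u32> = BitIter(0b1011).collect();
    assert_eq!(marks, vec![0, 1, 3]);
}
```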
Every one of those is the result of a profiling round (mostly samply call-tree inspection, plus a couple of pprof FlameGraph runs) showing one specific hot frame and chasing it.
4. What worked, in order
Headline numbers below are wall-clock medians from the solve_one
example, plus cargo bench summaries. Where two numbers appear,
the left is “before the change” and the right is “after.”
Compile profile.
- `lto = "fat"` and `codegen-units = 1` in the workspace release profile. Modest but free win.

Profiling instrumentation.
- Optional `--pprof` flag on the CLI for FlameGraph output.
- Switched to samply for higher-fidelity call-tree timing on macOS.
- Throttled `Instant::now()` sampling inside the propagation loop to 1-in-1000 calls; the original per-iter timestamping was measurably visible in the profile.

Data-structure swaps.
- `HashSet<usize>` → `Vec<bool>` for the AC-3 `in_worklist` and the dirty-cell `in_dirty` flag. Hashing was a profile hot spot.
- `std::collections::HashMap` → `FxHashMap` (rustc-hash) for small-key tables.
- Region indices materialized once at construction (Tier 1).
- LOS / LosCross / Neighbors4 handled by the generic walker via `los_blocks` (Tier 2).
- Per-cell counters with a dirty-cell queue (Tier 3) — replaces a per-coord linear scan that the original AtLeastOne path did.
Bitmask candidates. Per-coord `u32` instead of `Vec<Mark>`. This was the biggest single mid-stage win on dense puzzles: the inner candidate loops collapse to a popcount + bit test, and apply/undo become two ops.
`status()` aggregate-driven evaluation. The “is the puzzle solved / contradicted / in progress?” check was rescanning every constraint. Reading the cached aggregates first short-circuits the common cases. (The variant that also maintained per-constraint “unsatisfied” / “starved” flags was tried and reverted — see §5.)
MRV trial ordering. Sorting trial-1’s coord walk by descending
degree (trial_order) was the largest single win on the grader.
On 100×100: ~31 s → ~2 s once trial-1 was MRV-ordered. The
intuition is that high-degree cells are the ones whose hypothesis
collapses the search space fastest; they surface forced
eliminations sooner, so trial-1 returns sooner with progress.
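The ordering itself is one stable sort. A sketch (degree values are made up; in the engine the degree is the count of constraints plus `CellMin` rules touching each coord):

```rust
/// Degree-descending trial order: visit the most-constrained coords
/// first. A stable sort keeps coord index as the implicit tiebreak.
fn trial_order(degree: &[usize]) -> Vec<usize> {
    let mut order: Vec<usize> = (0..degree.len()).collect();
    order.sort_by(|&a, &b| degree[b].cmp(&degree[a]));
    order
}

fn main() {
    let degree = [2, 5, 5, 1];
    // Coords 1 and 2 tie at degree 5; stability keeps 1 before 2.
    assert_eq!(trial_order(&degree), vec![1, 2, 0, 3]);
}
```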
Intra-sweep propagation in trial-1. When trial-1 finds an elimination, drain forward propagation in place before moving to the next coord, instead of returning to the outer loop. This avoids the round-trip “trial-1 returns → outer re-runs forward → trial-1 restarts at coord 0.” On 100×100: ~2 s → ~280 ms after this single change. The cost of the inline drain is dwarfed by the work it removes.
MRV branching + trial-1 inside the solver. Lifting the
trial-1 sweep and trial_order-driven branch pick out of the
grader and into the solver was the last big win:
| size | solver before | solver now | grader |
|---|---|---|---|
| 50×50 | ~2 s | 3.7 ms | 4.5 ms |
| 100×100 | (didn’t terminate cleanly) | 43 ms | 64 ms |
| 17×17 corpus (criterion) | — | ~30% faster on naive | — |
The solver now beats the grader on big puzzles, which makes sense: solver stops at the first solution, grader records every step.
5. What we tried and reverted
status() with cached cell_min_unsatisfied / starved
counters. Idea: maintain bookkeeping so status() is O(1) by
counting how many CellMin rules are currently unsatisfied or
starved. Result: helped 100×100 by ~12% but regressed 50×50 by
~10% and the dataset by ~11%. The per-apply maintenance cost
swallowed the per-status() win on workloads where status() isn’t
actually the bottleneck. Reverted. The cached aggregate
fast-path remained; only the extra counters were dropped.
Batch trial-1 eliminations across coords without intervening propagation. Idea: skip the inline propagation drain and instead apply every elimination found on the full sweep at once. Result: 5–50× slowdown. The reason is subtle: forward propagation between trials was doing a lot of cheap deduction. If you batch, those cheap deductions surface as additional trial-1 hypotheses on the next sweep — paying for them at trial-1 prices instead of forward-propagation prices. Reverted.
Per-floor AtLeastOne constraints. Original encoding for
Akari’s “every floor must be lit” rule. Each floor became its own
AtLeastOne constraint, which means a 100×100 puzzle pushed
thousands of constraint visits through AC-3 per cycle. Replaced by
a single CellMin rule routed through the dirty-cell queue.
Several construct.rs constraint-count tests broke and were updated
to the new counts.
Janko parser without `_` for holes. Bug discovered during a round-trip test: `to_janko` writes `_` but `parse` only accepted `-` and `.`. Fixed by adding `_` to the hole branch.
6. Things we discussed but haven’t done
Generic clue-cover and lit-witness propagators for Akari.
Open question: is there a clean way to express these as
genre-supplied propagation rules instead of baking them into the
generic constraint walker? Today the only Akari-specific
propagation is the CellMin rule; clue-cover (a clue with n
unlit neighbors that all need bulbs is fully determined) and
lit-witness (the only remaining floor that can light a forced cell
must be a bulb) are subsumed by trial-1 today, but a direct
propagator would cut the trial-1 cost.
MAC / GAC / AC-2001. Discussed as standard upgrades to the AC-3 worklist. We leaned toward “make the naive solver as fast as possible” instead — the trial-1 + MRV + intra-sweep work that followed delivered most of the wins these would have. AC-2001 in particular is the natural next step if propagation throughput becomes the bottleneck again.
Trial-2 / trial-N. Higher-order trial search (commit two hypotheses, look for contradictions). Cost goes up combinatorially and most useful on harder puzzles where trial-1 stalls. Not yet needed — trial-1 + branching solves every dataset puzzle in the budget.
Smarter branch heuristics. Today the branch coord is the first
trial_order entry with ≥ 2 candidates. Possible refinements:
domain-size tiebreak (degenerate for binary, useful for richer
domains), least-constraining-value ordering on the candidate side,
random restarts. None implemented.
Refactor the grader to be pure human techniques. The
optimization machinery now lives in Propagator, separated from
FastGame — that work is done. The remaining piece is teaching
the grader to drive a different engine: instead of running AC-3
to fixpoint, it should fire one human technique at a time,
recording each step under its proper name. That refactor is the
next planned change.
Parallelism. rayon is a feature flag and is wired into a
couple of construction paths but not the solve loop itself. Per-
puzzle parallelism (across the corpus) is the obvious win; per-
solve parallelism (multiple branches concurrently) is a much
bigger lift and not on the table.
WASM and the editor surface. The editor (ordo/edit) is
supposed to consume the same core through WASM. The solver compiles
to WASM today but no measurements have been taken — we don’t know
where the slow path is in the browser yet.
External solvers
penmark ships its own FastSolver (see the Fast solver
chapter for internals) and treats
external solvers as feature-gated oracles — useful for cross-validation,
useful as fallbacks for genres where writing a domain-specific propagator is
hard, but not on the production solve path.
This doc captures what we tried and what we learned, since most of it isn’t obvious from the source.
Headline numbers
100×100 Akari (release build, single thread, scaffold-derived clued puzzles
at 0.42 wall density, cached at /tmp/penmark_compare_100x100.json):
| solver | wall-clock | gap to FastSolver |
|---|---|---|
| FastSolver (production) | ~10 ms | 1× |
| OrToolsSolver (CP-SAT, tuned) | ~28 ms | 2.8× |
| OrToolsSolver (CP-SAT, defaults) | ~52 ms | 5.2× |
| OrToolsSolver (initial port) | ~115 ms | 11.4× |
compare_ortools (in core/examples/) reproduces this table; profile_ortools
breaks the OR-tools call into model-build vs CP-SAT-solve and runs a 30+
parameter sweep so the next round of tuning has a measurement harness ready.
The takeaway is structural: FastSolver is a hand-rolled Akari propagator
(per-cell lit_by counter, LOS-cross-aware AC-3, trial-1 with MRV, bitmask
state). A general CP solver can’t replicate that without knowing about the
puzzle’s lighting structure; it just sees clauses and cardinality constraints.
The “10-100×” pitch from CP-SAT marketing assumes a naive backtracker as the
baseline, not a tuned propagator.
OR-tools journey
Initial port: 115 ms at 100×100
The first cut used a one-hot encoding: one BoolVar per (coord, mark) pair
with add_exactly_one over each coord. For Akari (binary domain) that’s 2×
the variables it needs. CP-SAT’s default parameters were also fighting us —
the portfolio of subsolvers, presolve, probing, and linearization all charge
fixed setup costs that don’t pay off on a 7,000-constraint Boolean satisfaction
model.
Win 1: single-bool encoding for binary domains (115 ms → 45 ms, 2.5×)
When every coord shares a 2-mark domain, allocate one BoolVar per coord and
use !v as the indicator for the negative mark. BoolVar implements Not so
the negation is free. No exactly_one constraints, half the variables.
OneHot stays as a fallback for wider domains (Sudoku-style numeric marks).
Win 2: SatParameters tuning (45 ms → 33 ms)
Methodically benchmarked each top-level parameter against the 100×100 corpus. The CP-SAT primer is emphatic about not modifying top-level params — the portfolio of subsolvers is what makes CP-SAT fast — but that advice is tuned for optimization workloads. For pure satisfaction at our scale:
| param | savings |
|---|---|
| `cp_model_presolve = false` | ~7 ms |
| `cp_model_probing_level = 0` | ~3 ms |
| `linearization_level = 0` | ~2 ms |
| `symmetry_level = 0` | ~3 ms |
| `use_sat_inprocessing = false` | ~1 ms |
| `use_phase_saving = false` | small |
| `stop_after_first_solution = true` | small |
| `num_search_workers = 1` | (avoids 8-worker overhead) |
These are deliberately against the primer’s guidance because at our scale the portfolio’s fixed setup cost outweighs its wins.
Win 3: specialised constraint encodings (33 ms → 28 ms)
The CP-SAT modelling page points out boolean constraints (add_and,
add_or, add_at_most_one, add_exactly_one) are recognised as native
clause forms by the SAT presolve, while add_eq on a linear sum routes
through the LP relaxation — costlier and less helpful for boolean inputs.
Specialising `ExactCount { n }` over `k` literals:
- `n = 0` → `add_and`(negations)
- `n = 1` → `add_exactly_one`
- `n = k - 1` → `add_exactly_one`(negations)
- `n = k` → `add_and`(literals)
- otherwise → linear `add_eq` (the prior path)
For Akari this hits the 0/1/2/3/4 clue cases that dominate the constraint count.
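The dispatch above can be sketched as an enum-returning chooser. The actual CP-SAT calls are elided; the variants just name the native constraint each case maps to, and the arm order (with degenerate `k = 1` cases not distinguished) is illustrative:

```rust
#[derive(Debug, PartialEq)]
enum Encoding {
    AndOfNegations,    // n = 0: every literal false
    ExactlyOne,        // n = 1
    ExactlyOneNegated, // n = k - 1: exactly one literal false
    AndOfLiterals,     // n = k: every literal true
    LinearEq,          // otherwise: fall back to linear add_eq
}

/// Pick the cheapest native encoding for ExactCount { n } over k
/// boolean literals.
fn choose(n: usize, k: usize) -> Encoding {
    match n {
        0 => Encoding::AndOfNegations,
        1 => Encoding::ExactlyOne,
        _ if n == k => Encoding::AndOfLiterals,
        _ if n + 1 == k => Encoding::ExactlyOneNegated,
        _ => Encoding::LinearEq,
    }
}

fn main() {
    assert_eq!(choose(0, 4), Encoding::AndOfNegations);
    assert_eq!(choose(3, 4), Encoding::ExactlyOneNegated);
    assert_eq!(choose(4, 4), Encoding::AndOfLiterals);
    assert_eq!(choose(2, 5), Encoding::LinearEq);
}
```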
Things that didn’t work
subsolvers = ["fixed"] — went down to 21 ms but broke 1×1 (returned
empty solution sets). Single-strategy fixed-order DFS skips edge cases the
default subsolver handles.
Warm-start via penmark’s Propagator — left as opt-in
OrToolsSolverConfig::warm_start (default false). The propagator covers
~99.8% of cells in 5 ms at 100×100. Hinting those to CP-SAT helps at small
sizes (5×5–17×17, ~10-20% faster) but tanks at 50×50+ (5× regression):
the missing 0.2% (10 cells out of 5832) trips up CP-SAT’s hint-driven
search badly. Verified the propagator’s deductions are correct (0/5822
disagreements with FastSolver), so it’s a CP-SAT behaviour issue with
high-coverage-but-incomplete hints. Tried add_and pin,
fix_variables_to_their_hinted_value, presolve toggling — none recovered
the regression. Useful at small sizes; opt-in.
Pinning warm-start cells via add_and instead of hint — 200 ms at
100×100, 6× regression. Without presolve to simplify them, the unit clauses
from a single add_and([v1, v2, ..., v5822]) slow down CP-SAT’s search
loop. With presolve on, still regressed because the model becomes
disconnected fragments that the LP doesn’t help with.
Disabling cp_model_use_sat_presolve only — wash. The other presolve
phases dominate.
disable_constraint_expansion — crashed with
SolutionIsFeasible assertion failures. Constraint expansion is required
for our encoding.
Changing search branching — LP_SEARCH, PSEUDO_COST_SEARCH,
PORTFOLIO_WITH_QUICK_RESTART_SEARCH were all within 0.5 ms of each other.
AUTOMATIC_SEARCH (default) wins by being marginally faster on average and
not having edge cases.
linearization_level = 0 + lighting=add_ge — 33 ms vs 28 ms with
add_or. add_or is recognised as a clause; add_ge(sum, 1) goes through
the linear path even with linearization off.
What about pure SAT?
Briefly explored: a SatSolver that translated the same constraint list to
CNF + cardinality constraints via the rustsat
library and fed it to BatSat / CaDiCaL / Kissat / MiniSat. Numbers at 100×100:
| backend | wall-clock | gap to FastSolver |
|---|---|---|
| Kissat | ~16 ms | 1.6× |
| BatSat | ~18 ms | 1.8× |
| MiniSat | ~18 ms | 1.8× |
| CaDiCaL | ~25 ms | 2.4× |
Pure SAT genuinely beat OR-tools’ general-purpose CP solver on this problem class — Akari is essentially Boolean satisfaction with cardinality, and a modern CDCL solver (Kissat especially) plus rustsat’s totalizer cardinality encoding landed within 60% of the hand-rolled engine. CaDiCaL underperformed despite its reputation — the problem is too small for its strengths (incremental clause learning across many calls, lazy clause minimisation).
The SatSolver lived in the codebase for a few commits before being removed
to keep the maintenance surface small; the experiment validated the
“SAT-as-IR” approach for genres where we’d otherwise have to write a
domain-specific propagator (Slitherlink’s “single closed loop”, Nurikabe’s
connectivity, etc.). If we ever bring it back, the encoding is documented
in git history (search for algorithm::rustsat).
When external solvers would actually win
At our scale they don’t. Scenarios where they would:
- Optimization variants. CP-SAT’s branch-and-bound shines with objectives. `min unlit cells`, `max bulbs` — the LP relaxation actually earns its keep. `FastSolver` has no objective machinery.
- Genres with hard-to-propagate constraints. Slitherlink’s “single closed loop”, Nurikabe’s “all white cells connected”, Yajilin’s path rules. Writing a CP propagator for these is months of work; CP-SAT handles them via lazy clauses with reasonable performance even on first try. Pure SAT (Kissat etc.) was particularly strong here since these genres reduce naturally to CDCL with cardinality constraints — if the SAT path becomes load-bearing, that’s the point to bring `rustsat` back.
- Really hard hand-crafted puzzles. Where `FastSolver`’s trial-1 stalls and DFS branching explodes, CDCL clause learning starts paying off. We haven’t actually hit one in the corpus; if we do, escalation to the external solver becomes useful.
- Multi-genre infrastructure that doesn’t justify per-genre tuning. If the goal is “solve any reasonable CSP in ≤ 1 s”, CP-SAT or rustsat are the right call. penmark’s bet is the opposite: domain knowledge per genre, with the external solvers as backstops.
Maintenance posture
- `OrToolsSolver` lives behind `--features ortools`. Default builds don’t pull in `cp_sat` (which links a system OR-tools install plus libprotobuf).
- `encode_constraint` is exhaustive on `Rule` — adding a new variant is a compile error here, not a panic at run time. Every variant in the enum has either an eager encoding (boolean-friendly rules) or a lazy cut (global rules).
- The criterion `akari/solver` and `slitherlink/solver` groups under the consolidated `penmark` bench include OR-tools as a sibling series when the `ortools` feature is on, so head-to-head ratios show up in the report alongside `FastSolver`.
- Per-puzzle cross-check tests (`ortools_agrees_with_fast_*` in each genre’s `tests.rs`) make divergence a unit-test failure, not a silent corpus drift.
Build setup
With Homebrew on Apple Silicon, `cargo … --features ortools` Just Works:
brew install or-tools
cargo bench -p penmark-core --features ortools --bench penmark -- akari/solver
cargo bench -p penmark-core --features ortools --bench penmark -- slitherlink/solver
The wiring:
- `penmark/.cargo/config.toml` sets `ORTOOLS_PREFIX` (read by cp_sat’s build.rs) and `CXXFLAGS` (read by cc-rs when cp_sat compiles its C++ wrapper, so abseil + protobuf headers are reachable).
- `core/build.rs` adds the protobuf link line that cp_sat doesn’t emit itself, plus rpaths for both dylibs so binaries don’t need `DYLD_LIBRARY_PATH` at run time.
Both auto-detect via brew --prefix <pkg> if the env var isn’t set, so
the hardcoded /opt/homebrew/... defaults are just a fallback. To use
a non-Homebrew install (Linux, Intel Mac, source build), export
ORTOOLS_PREFIX / PROTOBUF_PREFIX / CXXFLAGS in your shell — those
override the [env] defaults.
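As an illustration of that wiring, here is a sketch of what the `[env]` block can look like. The paths below are the hypothetical Homebrew fallbacks described above, not a copy of the repo’s actual file:

```toml
# Illustrative sketch only; the real config auto-detects via `brew --prefix`,
# and these hardcoded paths are just the fallback.
[env]
ORTOOLS_PREFIX = "/opt/homebrew/opt/or-tools"    # read by cp_sat's build.rs
PROTOBUF_PREFIX = "/opt/homebrew/opt/protobuf"   # read by core/build.rs
CXXFLAGS = "-I/opt/homebrew/opt/abseil/include -I/opt/homebrew/opt/protobuf/include"
```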
If cargo build --features ortools fails with 'absl/...' file not found, your CXXFLAGS doesn’t include the abseil headers. If it
fails with Undefined symbols ... google::protobuf::..., the build
script didn’t find protobuf — set PROTOBUF_PREFIX.
Rule encoding strategy
CP-SAT 0.4’s Rust bindings expose boolean primitives (add_or,
add_at_most_one, add_exactly_one), linear constraints over
multi-interval domains (add_linear_constraint), and reified
constraints via only_enforce_if. They do not expose
add_circuit, add_table, or add_inverse — which leaves the
“single closed loop” / “all marked cells connected” family of
rules without a clean direct encoding.
The compromise: eager + lazy cuts.
Eager encodings (CP-SAT enforces directly):
| Rule variant | Encoding |
|---|---|
| Distinct | per-mark add_at_most_one over region |
| Sum | add_eq over Σ value · indicator |
| Increasing | pairwise add_lt on per-coord value expressions |
| MinDifference | reified add_or([pos, neg]) over pos → a-b≥n, neg → b-a≥n |
| AtLeastOne | add_or |
| AtMost { n } | add_at_most_one (n=1) or add_le |
| ExactCount { n } | specialised: add_and (n∈{0,k}), add_exactly_one (n∈{1,k-1}), add_eq otherwise |
| DegreeIn { allowed } | add_linear_constraint(sum, allowed-as-singleton-intervals) |
| NoTwoByTwo | per 2×2 block, add_or of 4 negated indicators |
| Polyomino | per coord with mark, add_or([!self, neighbor_indicators…]) |
| Decided | no-op (Binary singleton / OneHot exactly_one already enforces) |
| CellMin | per region cell, add_ge over reach-set indicators (or add_or for min=1) |
| LoopFreezesCounts | per (cells, target), eager ExactCount over cells |
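The ExactCount specialisation is worth spelling out, since it is pure case analysis on the target n against the region size k. A sketch, with a hypothetical `EncodingChoice` enum standing in for the actual cp_sat calls:

```rust
/// Hypothetical stand-in for the encoder's decision; the real code
/// emits cp_sat's add_and / add_exactly_one / add_eq directly.
#[derive(Debug, PartialEq)]
enum EncodingChoice {
    AllFalse,   // n == 0: add_and over negated indicators
    AllTrue,    // n == k: add_and over indicators
    ExactlyOne, // n == 1: add_exactly_one
    AllButOne,  // n == k-1: add_exactly_one over negated indicators
    Linear,     // otherwise: add_eq on the indicator sum
}

fn exact_count_encoding(n: usize, k: usize) -> EncodingChoice {
    match n {
        0 => EncodingChoice::AllFalse,
        _ if n == k => EncodingChoice::AllTrue,
        1 => EncodingChoice::ExactlyOne,
        _ if n + 1 == k => EncodingChoice::AllButOne,
        _ => EncodingChoice::Linear,
    }
}
```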
Lazy cuts (CP-SAT solves; we post-check, add an `add_or` no-good clause if violated, and re-solve until satisfied or infeasible):
- `Connected { mark }` — find connected components of marked coords; if there is more than one, cut the smallest component as “this exact set must not all be marked together.”
- `Path { mark, closed }` — partition the drawn-edge subgraph into connected components; for closed, each must be a degree-2 cycle and there must be exactly one; for open, exactly one component with two degree-1 endpoints. Cut violating shapes.
- `BoundedSize { mark, max }` — cut any component larger than `max`.
- `ValueGroupedSize` — for each numerically-marked coord, BFS over same-value neighbors; cut if size ≠ value.
The lazy-cut loop terminates because each cut removes at least one assignment from the feasible boolean space. Worst case is exponential, but in practice convergence for well-constrained puzzles takes a handful of iterations.
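To make the Connected cut concrete, here is a self-contained sketch of the check on toy (row, col) tuples. The real implementation works over Penmark’s Coord and the solver’s cut plumbing, so treat this as the shape of the idea, not the code:

```rust
use std::collections::HashSet;

/// Group marked cells into orthogonally-connected components
/// (illustrative types; the real code walks Penmark's grid substrate).
fn components(marked: &HashSet<(i32, i32)>) -> Vec<Vec<(i32, i32)>> {
    let mut seen = HashSet::new();
    let mut out = Vec::new();
    for &start in marked {
        if !seen.insert(start) {
            continue;
        }
        // Flood-fill one component.
        let mut comp = vec![start];
        let mut stack = vec![start];
        while let Some((r, c)) = stack.pop() {
            for next in [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)] {
                if marked.contains(&next) && seen.insert(next) {
                    comp.push(next);
                    stack.push(next);
                }
            }
        }
        out.push(comp);
    }
    out
}

/// If connectivity is violated, the cut is the smallest component:
/// "this exact set must not all be marked together."
fn connectivity_cut(marked: &HashSet<(i32, i32)>) -> Option<Vec<(i32, i32)>> {
    let comps = components(marked);
    if comps.len() <= 1 {
        None // already connected, no cut needed
    } else {
        comps.into_iter().min_by_key(|c| c.len())
    }
}
```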
The slitherlink bench shows the lazy path is workable: 4×4 puzzles solve in ~600 µs–2 ms and 6×6 in ~3–7 ms (vs FastSolver’s ~100 µs–1 ms on 4×4 and ~1–13 ms on 6×6) — comparable, and OR-tools even beats FastSolver on a hard 6×6 case.
When a future genre needs a rule whose lazy-check encoding is prohibitively slow at the target sizes, that’s the signal to reach for a different external solver (rustsat with Kissat, say) rather than to push the lazy-cut harder.
Generator
Penmark generates puzzles through genre.generate (an fn pointer on
the runtime Genre handle). There’s no separate Generator trait —
generation is a per-genre function, with a universal default recipe
that genres override only when their structural space requires
bespoke setup.
impl GenreTrait for SudokuGenre {
fn generate(
cfg: &GenConfig,
cancel: &AtomicBool,
) -> Option<Puzzle> {
// Genre-specific impl — sudoku needs diagonal seeding for
// solution diversity since the universal recipe's bare-solve
// returns the same canonical solution every time.
sudoku_generate(cfg, cancel)
}
// …
}
The default behind genre.generate is scaffold_solve_derive:
- Scaffold — pin a structural pre-pass (`genre.scaffold_passes`, selected by name via `cfg.scaffold`) if any, then sprinkle per-`(layer, material)` density across remaining `Floor` cells.
- Bare-solve — `FastSolver` finds one valid completion of the scaffolded substrate. Skip + retry if no solution exists.
- Derive clues — for each cell whose `genre.clue_domain` is non-empty, call `genre.derive_clue_at(p, sol, coord)` and place the returned `ClueValue` on the puzzle. The default `derive_clue_at` body dispatches through `genre.clue_pattern` (`Pin`, `Count { target, region }`, or `ConnectedSize`) and wraps the `u8` result in `ClueValue::Number`; genres whose clue shape doesn’t fit (Tapa runs, future Yajilin arrows) override `derive_clue_at` directly.
- Reduce — drop clues by symmetry orbit while uniqueness holds, stopping at the `cfg.clue_density[Layer::Cells]` floor. (Or apply `genre.clue_post_filters` instead, when `cfg.clue_filter` is set.)
- Verify — `FastSolver.solve(p, 2).len() == 1` gates the return. Loop on failure until `cfg.time_budget` elapses or `cancel` flips.
Akari uses this default verbatim. Sudoku and slither override
generate because each invents its own structural seed
(sudoku: random diagonal; slither: connected inside-region) before
the bare-solve can find a useful completion. Their helpers live in
core/src/puzzles/<genre>/generate.rs as crate-private
<genre>_generate functions; the GenreTrait::generate impl is a
one-line forwarder.
GenConfig
Universal config for every genre’s generate:
pub struct GenConfig {
pub dims: Dimensions, // Rect or Boxed
pub material_density: HashMap<(Layer, Material), f32>,
pub clue_density: HashMap<Layer, f32>, // Reduce floor; default 1.0
pub symmetry: Symmetry,
pub seed: Option<u64>,
pub time_budget: Duration,
pub scaffold: Option<String>, // Genre::scaffold_passes name
pub clue_filter: Option<String>, // Genre::clue_post_filters name
pub extra: HashMap<String, f32>, // genre-specific knobs
}
extra is the escape hatch for knobs that don’t map onto material
or clue density — slitherlink uses extra["inside_density"] to drive
loop topology, for example. Genres document the keys they read.
Extension points
Two hooks let a genre tune the universal recipe without overriding
generate wholesale:
genre.scaffold_passes()
A static slice of named structural pre-passes. Each runs before
sprinkle and pins materials in a deterministic or topologically-
constrained pattern. Akari ships six (walled, grid, maze,
layered, holes_donut, holes_islands); sudoku and slither ship
none.
pub struct ScaffoldPass {
pub name: &'static str,
pub apply: fn(&mut Puzzle, Dimensions, &mut dyn rand::RngCore),
pub respects_symmetry: bool,
}
The user picks one by name via cfg.scaffold. Sprinkle is additive
over the pre-pass — it only writes Floor cells, leaving the
pre-placed walls / holes alone.
genre.clue_post_filters()
Value-filters that run after clue derivation. When
cfg.clue_filter is set, the named filter replaces the
density-driven reduce pass. Akari ships seven (only_0..only_4,
only_evens, only_odds); sudoku and slither ship none.
pub struct CluePostFilter {
pub name: &'static str,
pub allows: fn(value: u8) -> bool,
}
CLI mapping
Each genre exposes two CLI-friendly knobs through GenreTrait:
trait GenreTrait: 'static + Sized {
const DEFAULT_GEN_DENSITY: f32 = 0.22;
fn apply_cli_density(density: f32, cfg: &mut GenConfig) { /* default */ }
}
DEFAULT_GEN_DENSITY is the value the CLI’s --density flag falls
back to when omitted (akari 0.22, sudoku 0.36, slither 0.45). The
runtime Genre struct surfaces both as genre.default_gen_density
and genre.apply_cli_density for call-site dispatch.
apply_cli_density writes the user’s value into the right GenConfig
slot — material density for akari, clue_density[Cells] for sudoku,
extra["inside_density"] for slither. Adding a 4th genre with a
fundamentally new knob means overriding both.
CLI
penmark generate <genre> --rows R --cols C --count N --density D --seed S --budget-ms B [--no-db] looks up the genre handle by name,
calls (genre.generate)(cfg, &cancel) in a loop with per-iteration
seed offsets, prints each puzzle in Janko format, and (unless
--no-db) inserts into the local SQLite library.
penmark scaffold <genre> --rows R --cols C prints
(genre.empty_at)(dims) as a sanity check.
Grader
Grader assigns a difficulty label to a puzzle by simulating a
human solver step by step.
pub trait Grader {
type Config: Default + Serialize + DeserializeOwned;
const NAME: &'static str;
fn new(config: Self::Config) -> Self;
fn grade(&self, puzzle: &Puzzle) -> GradeResult;
}
/// Cross-puzzle grader options. Per-impl Configs typically embed
/// this with `#[serde(flatten)]`.
pub struct GraderParams {
pub time_budget: Duration,
pub profile: String, // grading profile name
pub max_trial_depth: Option<u32>,
}
StandardGrader works for every genre by firing named human-pattern
techniques in priority order; the techniques are generic over rule
kind, so genres that share rule shapes share deductions. Genres
with bespoke patterns can append their own Technique impls to the
list.
Forward logic: techniques and families
A technique is a small named function over (puzzle, game) that
observes a DeductionBuilder — recording which cells the pattern
matched and what moves they force — and returns nothing. The grader
owns the builder’s lifecycle; the technique just describes what it
saw. Techniques are grouped into families by their underlying
rule shape — Sudoku families like Naked Singles, Hidden Pairs,
X-Wing; Akari families like LosCross-singles, clue-saturation,
mutual-visibility-elim. Each family is one module; each technique
is one function in it.
The Technique trait is defined above (see “Propagation engine”).
A technique declares PHASE: Propagation for cheap per-move
fixpoint inference, Pattern for the higher-cost matchers that
only run when propagation stalls. The Grading section here focuses
on Pattern-phase techniques and how the grader sequences them;
Propagation-phase techniques participate in the inner
propagation loop instead.
TARGETS examples (across both phases):
| Technique | PHASE | TARGETS | Fires on… |
|---|---|---|---|
| Naked Singles (Sudoku) | Propagation | &[RuleKind::Distinct] | Any Sudoku-shape puzzle |
| Bulb-clears-LOS (Akari) | Propagation | &[RuleKind::ExactCount] | Akari |
| Hidden Pairs / X-Wing (Sudoku) | Pattern | &[RuleKind::Distinct] | Sudoku, when propagation stalls |
| Hidden Cage Sum (Killer) | Pattern | &[RuleKind::Sum] | Killer / Sandwich / Frame — skipped on vanilla |
| Bulb-Lit Single (Akari) | Pattern | &[RuleKind::ExactCount] | Akari (clue or floor cells) |
| Trial-1 / Trial-2 / clue-cover | Pattern | &[] | Universal backward fallbacks |
The grader sets up the builder, calls fire, then converts the
builder into a Deduction (or skips the step if nothing was
recorded). Every technique skips the bookkeeping — no
DeductionBuilder::new() at the top, no into_deduction() at the
bottom, no Option<…> plumbing. Just observe.
Deduction is a library type carrying both halves of the result:
pub struct Deduction {
/// Cells the pattern matched on — used by the renderer for
/// highlighting and by traces for explanation. Doesn't affect
/// solve correctness.
pub observed: Vec<Coord>,
/// The moves the technique wants applied.
pub moves: Vec<Move>,
}
DeductionBuilder is the shared accumulator, library-defined and
used identically across every technique in every genre. It handles
three pieces of bookkeeping — deduplicating moves (multiple
observers may target the same coord), recording an observer only
when its moves were novel, and producing an Option<Deduction>
based on whether anything was new:
pub struct DeductionBuilder { /* observers, moves, seen-coord set */ }
impl DeductionBuilder {
pub fn new() -> Self;
/// Record that `observer` produced these `moves`. Moves whose
/// target coord was already added by an earlier observer are
/// dropped. If at least one move actually landed, `observer` is
/// appended to the observed list; otherwise this is a no-op.
pub fn observe(
&mut self,
observer: Coord,
moves: impl IntoIterator<Item = Move>,
);
/// Pair-observer variant — for techniques whose deduction comes
/// from two interacting cells (e.g., paired clues, X-Wing's two
/// rows). Both observers land in the list iff the pair produced
/// novel moves.
pub fn observe_pair(
&mut self,
a: Coord, b: Coord,
moves: impl IntoIterator<Item = Move>,
);
/// `Some(Deduction)` if any move was recorded, else `None`. Called
/// by the grader after `fire`, never by the technique.
pub fn into_deduction(self) -> Option<Deduction>;
}
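A minimal runnable sketch of the dedup contract described in those doc comments, with cut-down Coord and Move stand-ins (the real Move carries marks and layers):

```rust
use std::collections::HashSet;

type Coord = (u8, u8); // simplified stand-in

#[derive(Clone, Debug, PartialEq)]
struct Move {
    coord: Coord, // target cell
}

#[derive(Debug)]
struct Deduction {
    observed: Vec<Coord>,
    moves: Vec<Move>,
}

#[derive(Default)]
struct DeductionBuilder {
    observed: Vec<Coord>,
    moves: Vec<Move>,
    seen: HashSet<Coord>, // target coords already claimed
}

impl DeductionBuilder {
    fn observe(&mut self, observer: Coord, moves: impl IntoIterator<Item = Move>) {
        let mut landed = false;
        for m in moves {
            // Drop moves whose target coord an earlier observer claimed.
            if self.seen.insert(m.coord) {
                self.moves.push(m);
                landed = true;
            }
        }
        // Observer is recorded only if at least one move was novel.
        if landed {
            self.observed.push(observer);
        }
    }

    fn into_deduction(self) -> Option<Deduction> {
        if self.moves.is_empty() {
            None
        } else {
            Some(Deduction { observed: self.observed, moves: self.moves })
        }
    }
}
```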
A canonical technique impl: just the pattern, nothing else.
fn fire<'g, G: Game<'g>>(
puzzle: &Puzzle, game: &G, b: &mut DeductionBuilder,
) {
for bulb in bulbs(game) {
b.observe(bulb, los_cross_from(puzzle, bulb)
.filter(|&c| game.candidates(c).any(|m| m == Mark::Binary(true)))
.map(|c| Move::Eliminate { coord: c, mark: Mark::Binary(true) }));
}
}
The grader calls each registered technique’s fire against a fresh
DeductionBuilder, then drains it with into_deduction():
// Inside the grader's forward-logic loop, once per technique:
let mut b = DeductionBuilder::new();
TechniqueX::fire(puzzle, game, &mut b);
if let Some(d) = b.into_deduction() {
/* apply d.moves, record one SolveStep per move */
}
When grading begins, the grader walks puzzle.constraints() once
to compute the set of present RuleKinds, then filters the
genre’s registered technique list down to those whose TARGETS
are either empty or share at least one kind with that set. The
filtered list is iterated in registration order until none fire.
Each fire produces one or more SolveSteps in the trace, tagged
with the technique name.
This is the “concatenating constraints” pitch from earlier paying
its dues at runtime. A vanilla Sudoku grade never invokes any
killer-cage technique — they’re filtered out before the inner
loop. Adding Killer cages to a puzzle expands its constraint set,
which expands the filter’s positive set, which lets the
killer-cage techniques back in. The cost of an inapplicable
technique is one HashSet::contains check at puzzle-load time,
not one fire-and-bail per inner-loop iteration.
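The load-time filter reduces to plain set logic. A sketch with a cut-down RuleKind and technique table (the names and variants here are illustrative):

```rust
use std::collections::HashSet;

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
enum RuleKind {
    Distinct,
    Sum,
    ExactCount,
}

struct Technique {
    name: &'static str,
    targets: &'static [RuleKind], // empty = universal
}

/// Keep a technique iff its TARGETS are empty or share at least one
/// kind with the puzzle's present constraint kinds.
fn applicable<'a>(
    registered: &'a [Technique],
    present: &HashSet<RuleKind>,
) -> Vec<&'a str> {
    registered
        .iter()
        .filter(|t| t.targets.is_empty() || t.targets.iter().any(|k| present.contains(k)))
        .map(|t| t.name)
        .collect()
}
```

With a vanilla constraint set, a cage-sum technique never reaches the inner loop; adding a Sum constraint lets it back in.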
Backward logic: when forward stalls
When no technique fires, the grader escalates through fallbacks in order. Each fallback is itself a “technique” (named, recorded in the trace, scorable in the profile) — the difference between forward and backward is just which bucket the technique lives in.
- Clue-cover analysis (`clue-cover`). Walks unsatisfied clues and looks for minimum-cover deductions structural to the constraint shape: a clue needing N more bulbs whose remaining LOS has exactly N candidates forces all of them; two clues sharing candidates forcing each other; an `ExactCount` whose region’s remaining capacity equals its target. Polynomial-time backward reasoning — cheap, no search.
- Trial search at depth 1 (`trial-1`). Pick a candidate move, commit it, run forward + clue-cover. If it reaches contradiction, the opposite move is forced.
- Trial search at iterating depths (`trial-2`, `trial-3`, …). Iterative deepening up to a configurable wall-clock budget (default 10 s). At depth d, the grader runs forward + clue-cover, then recursive trial-search at depths < d. No artificial depth cap beyond the time budget.
When the budget expires, GradeResult::status = TimedOut and the
trace is whatever was reached up to that point. The puzzle may still
be solvable by deduction at deeper trial — these get bucketed for
study but filtered out of curated sets.
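The first clue-cover pattern (“a clue needing N more bulbs whose remaining LOS has exactly N candidates forces all of them”) is just a counting argument; a toy sketch of that core:

```rust
/// Toy minimum-cover check for one clue: `needed` bulbs still required,
/// `candidates` the open cells in its line of sight. If they match
/// exactly, every candidate is forced. (The real code walks Penmark's
/// constraint graph; this is only the counting step.)
fn forced_by_cover(needed: usize, candidates: &[(u8, u8)]) -> Option<Vec<(u8, u8)>> {
    if needed > 0 && candidates.len() == needed {
        Some(candidates.to_vec()) // all candidates must be bulbs
    } else {
        None // slack remains: no deduction from this clue alone
    }
}
```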
Tiers and scores live in profiles, not on techniques
Techniques don’t carry their own tier or score. Both live in a grading profile — a runtime-loadable table mapping technique name → tier → score. The profile is the unit of tuning.
pub struct GradingProfile {
pub name: String,
pub entries: Vec<GradingEntry>,
}
pub struct GradingEntry {
pub technique: String, // kebab-case technique name
pub tier: Tier,
pub score: u64,
}
pub enum Tier { Trivial, Easy, Medium, Hard, Extreme }
GradingProfile is Serialize + Deserialize and loads from TOML.
The thesis is “difficulty is relative; recalibration must be cheap”
— compiling Rust to retune a curve would be the opposite of cheap.
Profiles live in two places, with different cadences:
- Shipped defaults, embedded via `include_str!` at build time from `penmark/core/src/puzzles/<genre>/profiles/<name>.toml`. Editing one of these does require a rebuild — they’re a maintainer task, the curve we ship by default. Source-of-truth lives in the repo so they version with the techniques they reference.
- User overrides, runtime-loaded from `~/.config/penmark/profiles/<genre>/<name>.toml` (or `--profile-file=<path>` on the CLI). These take precedence over the embedded default of the same name. Editing one is a TOML save, no compile, and the next `penmark grade` picks it up.
The grader walks the user dir first, falling back to the embedded default. So the everyday calibration loop — tweak weights, re-grade the db, eyeball the distribution — runs entirely against user TOMLs. The embedded copies only change when we ship a new release with a different default curve.
penmark profile-init <genre> writes the embedded default to the
user dir, giving you a starting TOML to edit instead of building
one from scratch.
A profile TOML reads like:
name = "nikoli-strict"
[[entries]]
technique = "naked-singles"
tier = "trivial"
score = 1
[[entries]]
technique = "hidden-pairs"
tier = "easy"
score = 4
[[entries]]
technique = "x-wing"
tier = "hard"
score = 25
Each genre ships one or more default profiles in its source tree
(under puzzles/<genre>/profiles/) so out-of-the-box runs work
without any user setup. Editing a default means editing a TOML file
in the repo, not Rust source.
Why outside the technique impl: difficulty is relative. Calling
naked-singles “Trivial” only means something next to where x-wing
lands. Pulling tier and score onto each technique would force every
recalibration to ripple through impls; keeping them in a TOML file
lets you re-cut the whole curve, save the file, re-grade, repeat —
no compile in the loop.
penmark profile-eval <genre> --profile=<name> runs the active
profile against the db and prints the difficulty distribution, so
calibration has a tight feedback loop without any code changes.
Profile loading fails loudly on mismatch. TOML technique names
and Rust Technique::NAME constants are decoupled, so a rename in
code would silently invalidate stale TOML entries. The grader
validates the profile against the genre’s registered techniques at
load time and errors immediately with the offending name(s) and the
list of valid names. Same check on a registered technique that has
no entry. No silent fallbacks — fix the profile or the rename.
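The validation itself is a symmetric set difference over names. A sketch, with a hypothetical error-string format:

```rust
use std::collections::HashSet;

/// Validate profile entry names against the genre's registered
/// technique names. Errors list both unknown profile entries and
/// registered techniques with no entry; no silent fallback either way.
fn validate_profile(profile_names: &[&str], registered: &[&str]) -> Result<(), String> {
    let prof: HashSet<&str> = profile_names.iter().copied().collect();
    let reg: HashSet<&str> = registered.iter().copied().collect();
    let mut unknown: Vec<&str> = prof.difference(&reg).copied().collect();
    let mut missing: Vec<&str> = reg.difference(&prof).copied().collect();
    if unknown.is_empty() && missing.is_empty() {
        return Ok(());
    }
    unknown.sort();
    missing.sort();
    let mut valid: Vec<&str> = registered.to_vec();
    valid.sort();
    Err(format!(
        "profile mismatch: unknown entries {unknown:?}; techniques without an entry {missing:?}; valid names {valid:?}"
    ))
}
```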
Adding a new technique means: write the Rust impl, register it, add an entry to the default profile TOML. Three edits.
GradeResult
pub struct GradeResult {
/// How grading concluded. `Tier` is purely difficulty; operational
/// state lives here.
pub status: GradeStatus,
/// Highest tier any step reached. Meaningful in every status —
/// for `Solved` it's the puzzle's final difficulty; for `TimedOut`
/// or `Unsolvable` it's the deepest tier reached during the
/// attempt.
pub difficulty: Tier,
/// Sum of per-step scores under the active grading profile.
pub hard_score: u64,
/// Ordered trace of the steps the grader took.
pub steps: Vec<SolveStep>,
}
pub enum GradeStatus {
/// Forward + backward logic completed; puzzle solved.
Solved,
/// The grader's time budget expired before reaching a solution.
/// The trace is whatever was reached.
TimedOut,
/// The grader proved no solution exists (backward logic forced a
/// contradiction at the root).
Unsolvable,
}
pub struct SolveStep {
pub technique: &'static str, // technique name (forward or backward)
pub observed: Vec<Coord>, // cells the pattern matched on
pub mv: Move, // the deduction's move
pub depth: u32, // 0 forward, ≥1 trial search
}
Tier is purely a difficulty bucket; “we ran out of budget” is
GradeStatus::TimedOut, “the puzzle is broken” is
GradeStatus::Unsolvable. Both persist alongside puzzles in the
db, indexed separately for curation queries — “show me solved
Hard puzzles” and “show me puzzles that timed out at any tier” are
two different filters.
Enumerator
Enumeration is genre.enumerate — an fn pointer on the runtime
&'static Genre handle that streams uniquely-solvable puzzles for
a given size, calling on_puzzle for each hit and on_stats
periodically for progress UIs.
pub enumerate: fn(
&EnumConfig,
&AtomicBool,
&mut dyn FnMut(&Puzzle),
&mut dyn FnMut(EnumStats),
) -> String,
The default behind genre.enumerate is walk_canonical — a true
canonical-form walker shared across every genre, driven entirely by
the runtime Genre handle (genre.material_vocab,
genre.derive_clue_at, genre.clue_domain, genre.empty_at,
genre.build_constraints). No genre overrides it today.
Clue derivation per cell goes through genre.derive_clue_at(p, sol, coord) -> Option<ClueValue>. The default body dispatches through
genre.clue_pattern (the canonical Pin / Count /
ConnectedSize shortcut for u8 genres); future multi-byte clue
genres (Tapa, Yajilin) override derive_clue_at directly. See the
ClueValue discussion in Traits → Puzzle for the full picture.
The walker has three nested phases:
- Layouts. Walk the cell-layer material space at the given size, in canonical-form order under D4 (square) / D2 (non-square) lex-min. Genres with empty `material_vocab` (sudoku, slitherlink) collapse to a single all-floor layout.
- Completions. Bare-solve each layout to enumerate the legal full assignments under the genre’s interior rules (no clues active). Skip completions whose maximally-clued puzzle isn’t uniquely solvable — no clue subset of an ambiguous full-clue set can be unique.
- Clue subsets. For each accepted completion, walk lex-ascending k-combinations of clueable cells for k = 1, 2, … Smallest-puzzle-first ordering. Stabilizer-based dedup filters out clue arrangements that are D4-images of each other within a single layout’s orbit, before invoking the solver.
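Canonical-form order under D4 means keeping, of a square layout’s eight images (four rotations, each optionally mirrored), only the lexicographically smallest. A toy sketch over a `Vec<Vec<u8>>` grid (square grids only; the real walker also handles the D2 non-square case):

```rust
/// Rotate a square grid 90° clockwise: new[r][c] = old[n-1-c][r].
fn rot90(g: &Vec<Vec<u8>>) -> Vec<Vec<u8>> {
    let n = g.len();
    (0..n).map(|r| (0..n).map(|c| g[n - 1 - c][r]).collect()).collect()
}

/// Mirror horizontally (reflect columns).
fn mirror(g: &Vec<Vec<u8>>) -> Vec<Vec<u8>> {
    g.iter().map(|row| row.iter().rev().copied().collect()).collect()
}

/// Lex-min over the 8 images of the D4 group. A layout is canonical
/// iff it equals its own canonical form, so a walker that only visits
/// canonical layouts sees each orbit exactly once.
fn canonical(g: &Vec<Vec<u8>>) -> Vec<Vec<u8>> {
    let mut best = g.clone();
    let mut cur = g.clone();
    for _ in 0..4 {
        cur = rot90(&cur); // after 4 steps cur is back to g
        for img in [cur.clone(), mirror(&cur)] {
            if img < best {
                best = img;
            }
        }
    }
    best
}
```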
A monotonicity short-circuit runs alongside the subset walk: if
a candidate subset M contains some previously-confirmed unique
subset R as a strict superset, M must also be unique (extra clues
are consistent with the same bare-solve completion, so adding them
can’t introduce a second solution). The walker tracks recorded
unique masks in popcount-bucketed Vec<u64> per completion and
skips the solver call when M is subsumed. Solving was ~90% of
per-subset cost, so this typically yields an order-of-magnitude
speedup once minimal-unique anchors fill in at low k.
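The subsumption test itself is bit arithmetic: a recorded mask R is a strict subset of candidate M exactly when `R & M == R` and R has fewer set bits. A sketch of the popcount-bucketed store:

```rust
/// Recorded unique clue-subsets, bucketed by popcount so the scan only
/// touches masks that could be strict subsets of the candidate.
struct UniqueMasks {
    by_popcount: Vec<Vec<u64>>, // index = number of set bits
}

impl UniqueMasks {
    fn new(max_bits: usize) -> Self {
        Self { by_popcount: vec![Vec::new(); max_bits + 1] }
    }

    fn record(&mut self, mask: u64) {
        self.by_popcount[mask.count_ones() as usize].push(mask);
    }

    /// True if some recorded unique subset R is a strict subset of `m`;
    /// then `m` is unique too and the solver call can be skipped.
    fn subsumes(&self, m: u64) -> bool {
        let k = m.count_ones() as usize;
        self.by_popcount[..k] // strict subsets have fewer bits
            .iter()
            .flatten()
            .any(|&r| r & m == r)
    }
}
```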
See notes/canonical-enumeration.md for the full design rationale,
soundness arguments for the completions skip, and follow-on
directions (value-relabeling-aware dedup, beautification).
EnumConfig
pub struct EnumConfig {
pub dims: Dimensions,
pub limit: usize, // 0 = unlimited
pub time_budget: Duration,
pub holes: bool, // include Material::Hole in the cell alphabet
pub max_bare_solutions: usize, // cap on completions per layout (default 10_000)
}
pub struct EnumStats {
pub layouts_examined: u64,
pub total_layouts: u64,
pub completions_examined: u64,
pub subsets_examined: u64,
pub unique: u64,
}
holes widens akari from 2-state (floor/wall) to 3-state
(floor/wall/hole). Other genres don’t declare Hole on the cell
layer, so the flag is a no-op for them. Default is true; set to
false for larger akari grids where the 3-state space blows up.
max_bare_solutions caps the bare-solve completion enumeration
per layout. Sudoku 4×4 has exactly 288 completions; slither 4×4
has thousands. The cap mostly exists so a pathological scaffold
with millions of legal completions can’t hang the walker.
The return type is String — a free-form “stop reason” the caller
echoes back ("exhausted", "limit", "deadline", "cancelled",
or "bad-config:<why>").
Why callbacks, not iterators
Every real caller wants two things an iterator handles awkwardly:
- Progress reporting, since canonical-form scans can examine millions of layouts even at small sizes; silent UIs are broken.
- Early exit on multiple conditions — output limit, wall-clock budget, user cancel. Returning an iterator forces every consumer to re-thread those exit conditions through the pull loop; baking them into the contract keeps callers trivial.
cancel is a separate &AtomicBool rather than a config field so
EnumConfig stays Serialize + DeserializeOwned — runtime
cancellation has no business in a serialized config.
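The callback contract can be exercised end to end with a toy walker: stream items, honor a limit and the cancel flag, and return the stop reason as a `String`, mirroring the real fn pointer’s shape (toy item space, hypothetical helper name):

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Toy enumerate: walks bitmasks of `bits` cells, reports each to
/// `on_item`, and stops on limit, exhaustion, or cancellation, echoing
/// the reason back as a String like the real `genre.enumerate` does.
fn enumerate_masks(
    bits: u32,
    limit: usize, // 0 = unlimited
    cancel: &AtomicBool,
    on_item: &mut dyn FnMut(u64),
) -> String {
    let mut emitted = 0usize;
    for mask in 0..(1u64 << bits) {
        if cancel.load(Ordering::Relaxed) {
            return "cancelled".into();
        }
        on_item(mask);
        emitted += 1;
        if limit != 0 && emitted >= limit {
            return "limit".into();
        }
    }
    "exhausted".into()
}
```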
Discovery and coverage (planned)
The discovery + coverage feedback loop is described elsewhere — pair the enumerator with the technique library and the trial-search oracle to mine specimens (small puzzles annotated with what was hard about them), then run every technique against every persisted specimen for per-technique hit counts and coverage residue.
The CLI verbs discover and coverage are placeholders for that
loop; the mining loop runs ad-hoc against Genre::enumerate until
those land.
Frontends
The two shipped surfaces: the penmark CLI for batch operations and
the eframe desktop app for interactive use. WASM is planned but not
yet built.
CLI
penmark <verb> <genre> [options]. The genre is a required
positional, never a flag — every command knows up front which
genre it’s running, and a missing genre is an immediate error rather
than a subtle dispatch into the wrong defaults.
The verb set is still in flux; what’s wired today:
| Verb | Purpose |
|---|---|
| `solve` | Solve every puzzle in a canonical dataset; report counts and timings. |
| `grade` | Grade one puzzle against a profile and print the trace. |
| `grade-canon` | Grade every puzzle in a canonical dataset against a profile; print the difficulty distribution. |
| `generate` | Full pipeline — scaffold + solve + derive + reduce. Wired for akari, sudoku, slitherlink. |
| `scaffold` | Phase 1 only — wall/hole structural pass, no solver, no clues. |
| `enumerate` | Walk uniquely-solvable puzzles of a given size for a genre. |
| `import` | Read puzzlehound’s collated output and write canonical records to `data/puzzles/<genre>.jsonl.gz`. |
| `profile-init` | Copy the embedded default profile to `~/.config/penmark/profiles/<genre>/` for editing. |
| `profile-eval` | Run a profile against a canonical dataset; print the difficulty distribution. |
| `list-impls` | Print the hand-maintained list of genres + algorithms. |
The verbs discover (capture specimens at every trial-search step)
and coverage (per-technique hit-count report against persisted
specimens) are described in the Enumerator chapter but
haven’t shipped as CLI verbs yet. The mining loop runs ad-hoc against
the enumerator until those land.
Config knobs are clap flags on the universal Cmd enum. There are
no per-genre CLI files — GenericCli is a single non-generic impl
that handles every command for every genre by dispatching through
the runtime &'static Genre handle (genre.default_gen_density,
genre.apply_cli_density, genre.parse_text, …). Adding a new
option means one clap field on Cmd::Generate (or wherever) plus
reading it from GenericCli::generate.
Parallelism is op-by-op: generate and the enumerator fan out
via rayon::par_iter across puzzles in the work queue; one puzzle’s
solve/grade is sequential. --threads=N is consistent where it
applies and ignored on single-puzzle ops.
Directory layout
Penmark lives at threeemojis/penmark/ — its own top-level project,
not nested under ordo. The repo is a Cargo workspace:
threeemojis/penmark/
├── Cargo.toml — workspace manifest
├── core/ — engine: traits, library types, all algorithms
├── cli/ — `penmark` binary
├── eframe/ — local desktop app
├── data/ — canonical per-genre datasets + import inputs
│ ├── puzzles/<genre>.jsonl.gz — what the engine reads
│ └── dailyakari/ — one source's raw download script
└── scripts/ — bench-all, fetch-datasets, …
Inside core/src/:
core/src/
├── lib.rs — crate root, re-exports
├── coord.rs — Coord, OrthoDir, DiagDir, wire encoding
├── substrate.rs — Substrate, Material, MaterialVocab, Dimensions
├── mark.rs — Mark
├── region.rs — Region + resolution against a puzzle
├── constraint.rs — Constraint, Role, Rule, Violation, RuleKind
├── outcome.rs — Move, ConstraintOutcome, Status
├── puzzle.rs — Puzzle struct + GenreTrait (implementation interface)
├── genre.rs — runtime Genre struct + per-genre &'static Genre handles
├── overlay.rs — Overlay vocabulary (Highlight / Conflict / LoopFill / …)
├── render.rs — MarkRenderStyle (Lamp / Glyph / EdgeStroke)
├── tagged.rs — tagged-JSON wire format + Marks
├── game.rs — Game trait
├── cell_counter.rs — CellCounter declarations
├── symmetry.rs — orbit-based canonical-form helpers
├── db.rs — SQLite-backed persistence
├── algorithm/
│ ├── mod.rs — Solver, Grader traits, shared *Params
│ ├── eval.rs — generic constraint evaluator
│ ├── geometry.rs — region / reach geometry helpers
│ ├── generate.rs — GenConfig + scaffold_solve_derive
│ ├── enumerate.rs — EnumConfig + walk_canonical
│ ├── profile.rs — GradingProfile loading + validation
│ ├── fast/ — FastGame + Propagator + FastSolver
│ ├── simple/ — SimpleGame + SimpleSolver (brute-force ref)
│ ├── grader/ — StandardGrader + Technique trait + standard techniques
│ └── ortools/ — OR-Tools CP-SAT wrapper (cross-check oracle)
└── puzzles/ — one subdirectory per genre
├── pzv.rs — shared pzv-encoding helpers
├── store.rs — shared per-genre JSONL store
├── akari/ — mod + parse + pzv + scaffold + variants + tests
├── sudoku/ — mod + parse + pzv + generate + tests
└── slitherlink/ — mod + parse + pzv + generate + tests
Per-genre directories vary in shape — some genres ship a generate.rs
(when the universal recipe needs an override), some ship a
scaffold.rs (akari’s wall-style + clue-filter slices). The required
files are just mod.rs (with the GenreTrait impl on the marker struct) and
parse.rs. Enumeration runs through the universal `algorithm::enumerate::walk_canonical` for every genre — there’s no per-genre `enumerate.rs` to write.
Adding a new genre is a new directory under core/src/puzzles/ plus
a one-line entry each in cli/src/main.rs::genres() and
eframe/src/main.rs’s genre list — see the Adding a genre chapter.
Inside the frontends:
cli/src/
├── main.rs — clap setup, dispatches to GenericCli
├── common.rs — shared CLI plumbing (load_dataset, summaries)
├── generic.rs — GenericCli — one impl handles every genre via &'static Genre
├── genre.rs — GenreCli trait (dyn dispatch surface)
├── grade_dataset.rs — grade against canonical datasets
├── import.rs — puzzlehound output → canonical pipeline
└── profile.rs — profile-init / profile-eval
eframe/src/
├── main.rs — egui app shell + genre routing
├── common.rs — shared widget code
├── genre.rs — GenreApp trait
├── games.rs — universal app behavior dispatching on puzzle.genre
├── painter.rs — universal puzzle painter (dispatches on genre.mark_render)
├── harness/ — solve/generate/grade harness panels
├── dataset_browser.rs — browse data/puzzles/<genre>.jsonl.gz
└── db_browser.rs — browse the SQLite db
Both frontends carried per-genre files before the collapse; they’re now
single implementations that dispatch on the `&'static Genre` handle
carried by each `Puzzle`. The fn-pointer table in `Genre` answers
every question they used to outsource to per-genre code.
Inside the bench harness:
core/benches/
├── penmark.rs — single bench file, dispatches all genres
├── common/mod.rs — solver_bench<G>, grader_bench<G>,
│ generator_bench<G>, pipeline_bench<G>,
│ load_fixtures<G>, load_dataset_corpus<G>
└── fixtures/<R>x<C>.json — checked-in akari fixture corpora
cargo bench --bench penmark -- akari/solver filters via criterion’s
name match.
Adding a new puzzle genre
End-to-end checklist for landing a new genre. Akari, sudoku, slitherlink, and takuzu are the worked examples — open them side by side when following along. Takuzu is the smallest of them (single mod.rs + parse.rs + tests.rs, no pzv codec, no scaffold passes), so start there if you’re after the minimum viable genre.
The post-collapse architecture means a genre is now mostly one
file: core/src/puzzles/<genre>/mod.rs with a GenreTrait impl on
a zero-sized marker. Per-genre CLI files, eframe files, and bench
files no longer exist — the universal GenericCli, eframe app, and
bench harness all dispatch through the runtime &'static Genre
handle, so adding a genre to the live binaries is one entry in the
GENRES slice plus a few re-exports.
TL;DR
Two scenarios:
1. The genre’s rules already have `Rule` variants — the common case once `Connected`, `NoTwoByTwo`, `ValueGroupedSize`, etc. land. You’re writing the genre marker + `GenreTrait` impl + parsers + a `GenreTrait::generate` override (when the universal recipe isn’t enough) + tests + UI wiring + bench wiring. No engine changes.
2. You need a new `Rule` variant — same as (1), plus you add the variant to `Rule`, encode it in `eval.rs`, optionally land a propagator fast path, and land an OR-Tools encoding (the match in `algorithm::ortools::solver` is exhaustive — your build won’t compile without it).
The order below assumes (1); the “New Rule variant” sidebar
folds in (2).
File map
A complete genre touches:
penmark/
├── core/
│ ├── src/
│ │ ├── lib.rs ← add `pub mod <genre>;` under puzzles
│ │ ├── genre.rs ← add `pub static MYGENRE: &Genre = genre_static!(MyGenre);`
│ │ │ + append to `GENRES` slice
│ │ └── puzzles/
│ │ └── <genre>/
│ │ ├── mod.rs ← marker struct + impl GenreTrait + type alias + constructors
│ │ ├── parse.rs ← Janko text format parser + writer
│ │ ├── pzv.rs ← puzz.link / pzv URL parser
│ │ ├── generate.rs ← (optional) Genre::generate override
│ │ ├── scaffold.rs ← (optional) ScaffoldPass + CluePostFilter slices
│ │ ├── tests.rs ← unit tests + ortools cross-checks
│ │ └── inspect.rs ← (optional) solution accessors
│ └── benches/penmark.rs ← add `register_<genre>(c)` call
├── cli/
│ └── src/main.rs ← genres come from `core::genre::GENRES` — no manual list
├── eframe/
│ └── src/main.rs ← same — iterates `GENRES`
└── data/
└── puzzles/<genre>.jsonl.gz ← canonical per-genre store, populated by
`penmark import`. Nothing to commit by hand.
There’s no per-genre enumerate.rs — the universal walk_canonical
covers enumeration for every genre using only the existing trait
surface (material_vocab, clue_pattern, clue_domain, empty_at,
build_constraints). See the Enumerator chapter for the walker’s
algorithm.
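As an illustration of the walker’s dedupe-by-canonical-form step, here is a self-contained sketch of orbit-based canonicalization — the lexicographic minimum over the eight dihedral transforms of a square grid. It stands in for `symmetry.rs`; the grid type and helper names here are illustrative, not Penmark’s actual API.

```rust
/// Rotate a square grid 90° clockwise: rotated[r][c] = g[n-1-c][r].
fn rotate(g: &Vec<Vec<u8>>) -> Vec<Vec<u8>> {
    let n = g.len();
    (0..n).map(|r| (0..n).map(|c| g[n - 1 - c][r]).collect()).collect()
}

/// Mirror a grid left-right.
fn mirror(g: &Vec<Vec<u8>>) -> Vec<Vec<u8>> {
    g.iter().map(|row| row.iter().rev().copied().collect()).collect()
}

/// Canonical form: the lexicographically smallest of the 8 dihedral
/// transforms. Two puzzles in the same symmetry orbit map to the same
/// canonical form, so the walker emits each orbit exactly once.
pub fn canonical_form(g: &Vec<Vec<u8>>) -> Vec<Vec<u8>> {
    let mut best = g.clone();
    let mut cur = g.clone();
    for flip in 0..2 {
        if flip == 1 {
            cur = mirror(g);
        }
        for _ in 0..4 {
            cur = rotate(&cur); // 4 rotations per reflection = all 8 transforms
            if cur < best {
                best = cur.clone();
            }
        }
    }
    best
}
```

Deduping a stream of enumerated grids then reduces to keeping a `HashSet` of canonical forms.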
The optional files only exist when the genre needs them:
- `generate.rs` — only if the universal `scaffold_solve_derive` recipe doesn’t produce diverse output. Sudoku and slither override because they invent topology before the bare-solve can run; akari uses the default.
- `scaffold.rs` — only if you ship `ScaffoldPass` / `CluePostFilter` slices. Akari ships six wall styles and seven clue filters; sudoku and slither ship none.
- `inspect.rs` — only when your solution shape (e.g. akari bulb positions) needs accessors that don’t fit the generic `Solution`.
Step-by-step
Substitute <Genre> for the title-case name (e.g. Masyu) and
<genre> for the kebab-case slug.
1. Pick your substrates
Look at core/src/coord.rs::Substrate — Cells, EdgesH,
EdgesV, Corners. Akari uses just Cells; slitherlink uses
EdgesH + EdgesV + Corners. Most loop genres need edges + corners;
most cell-marking genres need just cells.
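To make the substrate picture concrete, here is a standalone sketch of coord counts per layer for an R×C cell grid, using the usual slitherlink convention (horizontal edges sit on the R+1 corner rows, vertical edges on the R cell rows). The `Layer` enum and formulas below are illustrative assumptions, not Penmark’s actual `Substrate` API.

```rust
/// Illustrative substrate layers for an R×C cell grid.
#[derive(Clone, Copy)]
pub enum Layer {
    Cells,
    EdgesH,
    EdgesV,
    Corners,
}

/// How many coords each layer contributes, under the slitherlink-style
/// convention described above.
pub fn coord_count(layer: Layer, rows: usize, cols: usize) -> usize {
    match layer {
        Layer::Cells => rows * cols,                 // one per cell
        Layer::EdgesH => (rows + 1) * cols,          // horizontal edge slots
        Layer::EdgesV => rows * (cols + 1),          // vertical edge slots
        Layer::Corners => (rows + 1) * (cols + 1),   // lattice points
    }
}
```

A 3×3 slitherlink under this convention carries 9 cells, 12 + 12 edge slots, and 16 corners; the `coord_count_formula_holds_for_various_sizes` test in each genre checks exactly this kind of arithmetic.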
2. Pick your rules
Walk core/src/constraint.rs::Rule and pick the variants whose
semantics match your puzzle.
| Genre | Rules |
|---|---|
| akari | ExactCount, AtMost, AtLeastOne, CellMin, Decided |
| slitherlink | ExactCount, DegreeIn, Path { closed: true } |
| sudoku | Pin, Distinct |
| masyu (planned) | Path { closed: true }, DegreeIn, plus MasyuBlack / MasyuWhite |
| nurikabe (planned) | ValueGroupedSize, Connected, NoTwoByTwo |
If every rule already exists and has an eval.rs arm that’s not
stubbed Pending, you’re in scenario (1). Otherwise see the New
Rule variant sidebar.
3. Genre module: core/src/puzzles/<genre>/mod.rs
Define the genre marker, the (vestigial) type alias, impl GenreTrait,
and constructors. The trait surface is large but most methods have
sensible defaults. Required:
```rust
pub type MyPuzzle = Puzzle; // vestigial; just shorthand for Puzzle

#[derive(Clone, Copy, Debug, Default)]
pub struct MyGenre;

impl GenreTrait for MyGenre {
    const NAME: &'static str = "mygenre";
    const DISPLAY_NAME: &'static str = "MyGenre";
    const LAYERS: &'static [Layer] = &[Layer::Cells];
    const MARK_RENDER: MarkRenderStyle = MarkRenderStyle::Glyph;
    /// Runtime handle — must match the literal added in genre.rs.
    const GENRE: &'static crate::genre::Genre = crate::genre::MYGENRE;

    fn full_domain(p: &Puzzle, _coord: Coord) -> Vec<Mark> {
        // For binary genres: vec![Mark::Binary(false), Mark::Binary(true)]
        // For numeric: (1..=p.rows() as u8).map(Mark::Numeric).collect()
    }

    fn default_substrate(dims: Dimensions) -> Substrate { /* ... */ }

    fn build_constraints(
        substrate: &Substrate,
        clues: &Grid<Option<ClueValue>>,
    ) -> Vec<Constraint> {
        // Walk substrate + clue grid. For u8 genres, use the
        // convenience helper `for_each_clue_byte` which
        // pattern-matches `ClueValue::Number(_)` for you.
    }

    fn clue_pattern() -> CluePattern {
        // Pin / Count{target,region} / ConnectedSize{target}.
        // The convenience layer over `derive_clue_at` — see below.
    }

    fn sample() -> Puzzle { /* small demo puzzle */ }
}
```
The macro in core/src/genre.rs coerces these monomorphized items
into the runtime Genre struct’s fn-pointer fields:
```rust
// core/src/genre.rs
pub static MYGENRE: &Genre = genre_static!(MyGenre);
pub static GENRES: &[&Genre] = &[AKARI, SLITHER, SUDOKU, TAKUZU, MYGENRE];
```
Optional overrides used by various genres:
- `material_vocab()` — non-Floor materials available on the cells layer (akari: `[Wall, Hole]`).
- `clue_host_material()` — substrate material that holds clue cells (default `Floor`; akari overrides to `Wall`; slither to `LabelOnly`). Drives `Puzzle::is_clue_target`.
- `clue_domain(p, coord)` — `Vec<u8>` of valid clue values at the coord; empty means no clue allowed there.
- `cell_clue(p, r, c)` — read non-Pin clues from constraints (akari’s `ExactCount` over `Neighbors4`, slither’s over edges). Returns the display shorthand: a single `u8`. Multi-byte payloads fold to their first byte.
- `extract_clues(p)` — substrate-filtered clue grid (akari: only walls hold clues). Returns `Grid<Option<ClueValue>>`; u8 genres emit `ClueValue::Number(_)`.
- `on_clue_added(p, r, c, v)` — promote material on clue add (akari promotes Floor → Wall). `v: &ClueValue`.
- `derive_clue_at(p, sol, coord) -> Option<ClueValue>` — inverse of `cell_clue`. The walker calls this once per clueable cell to derive the canonical clue from a bare-solve completion. The default body dispatches through `clue_pattern()` and wraps in `ClueValue::Number`. Override directly when your clue shape doesn’t fit `Pin` / `Count` / `ConnectedSize` (Tapa run sequences, future Yajilin arrows).
- `dims_for(rows, cols)` — sudoku rounds to the nearest factor pair and returns `Boxed{box_w, box_h}`.
- `format_value(v)` — sudoku overrides for hex / base-36 on big grids.
- `overlays(p, marks, status)` — paint overlays (akari lit cells, sudoku candidates + conflicts, slither loop fill at solve).
- `from_janko_text` / `from_puzzlink_url` / `to_puzzlink_url` — text-format parsers and writers.
Multi-byte clue payloads. The four shipped genres carry single-byte clues, so
`ClueValue::Number(_)` is the only variant they emit. Genres whose clues are richer (Tapa’s run-length sequences, future Yajilin arrows) use `ClueValue::Runs(_)` or a new variant, and override `derive_clue_at` directly instead of relying on `clue_pattern`. The `ClueValue` enum is closed — adding a variant is one enum entry plus exhaustive-match errors pointing to every site that needs an arm.
4. Generator hook
The universal GenreTrait::generate default is scaffold_solve_derive:
scatter materials per material_vocab, bare-solve, derive clues per
clue_pattern, reduce by orbit. Akari uses this verbatim.
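The opening “scatter materials” move can be sketched standalone. Nothing below is Penmark API — `scatter_walls` and the `Lcg` are hypothetical — but it shows the shape of the step: a seeded, reproducible pass that turns a density knob into structural material.

```rust
/// Minimal seeded RNG so the scatter is reproducible under a fixed seed
/// (constants are Knuth's MMIX LCG). A sketch, not Penmark's RNG.
pub struct Lcg(u64);

impl Lcg {
    pub fn new(seed: u64) -> Self {
        Lcg(seed)
    }
    /// Next value in [0, 1).
    pub fn next_f32(&mut self) -> f32 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        ((self.0 >> 40) as f32) / ((1u64 << 24) as f32)
    }
}

/// Scatter walls at the given density: each cell independently becomes a
/// wall with probability `density`. Same seed, same scaffold.
pub fn scatter_walls(rows: usize, cols: usize, density: f32, seed: u64) -> Vec<Vec<bool>> {
    let mut rng = Lcg::new(seed);
    (0..rows)
        .map(|_| (0..cols).map(|_| rng.next_f32() < density).collect())
        .collect()
}
```

The real recipe then bare-solves the scaffold and derives clues; this is only its first move.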
Override GenreTrait::generate when the recipe doesn’t produce diverse
output. The override forwards to a crate-private helper:
```rust
fn generate(cfg: &GenConfig, cancel: &AtomicBool) -> Option<Puzzle> {
    generate::mygenre_generate(cfg, cancel)
}
```
The helper in core/src/puzzles/<genre>/generate.rs reads cfg.dims,
cfg.seed, cfg.time_budget, plus whatever cfg.extra /
cfg.clue_density keys make sense for the genre. Sudoku reads
cfg.clue_density[Cells] as target_givens; slither reads
cfg.extra["inside_density"].
Two CLI knobs you’ll usually also override:
```rust
const DEFAULT_GEN_DENSITY: f32 = 0.36; // CLI's --density default

fn apply_cli_density(d: f32, cfg: &mut GenConfig) {
    // Map --density into the right cfg slot for your genre.
}
```
If you ship structural pre-passes (border ring, maze carving,
hole-pin patterns) or clue value filters (only-evens etc), declare
them in core/src/puzzles/<genre>/scaffold.rs and override
scaffold_passes() / clue_post_filters(). See
core/src/puzzles/akari/scaffold.rs for the template.
5. Parser: core/src/puzzles/<genre>/parse.rs
Two formats per genre:
- Janko — small-grid text format from janko.at. Useful for hand-built fixtures and dataset smoke tests.
- puzz.link / pzv — URL format used by puzz.link. Lives in
pzv.rs; puzzlekit datasets ship in this format.
Round-trip in tests: parse(s).to_janko() == s for at least a 3×3
plus a corner case.
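A minimal stand-in for the round-trip shape (a hypothetical `parse` / `to_janko` pair over a binary grid; real `parse.rs` files also carry headers and clue digits):

```rust
/// Parse a minimal Janko-style body: '.' = floor, '#' = wall, one row per
/// line. Illustrative only — not any genre's actual parse.rs.
pub fn parse(s: &str) -> Vec<Vec<bool>> {
    s.lines()
        .map(|line| line.chars().map(|ch| ch == '#').collect())
        .collect()
}

/// Inverse writer: the round-trip test asserts to_janko(parse(s)) == s.
pub fn to_janko(g: &[Vec<bool>]) -> String {
    g.iter()
        .map(|row| row.iter().map(|&w| if w { '#' } else { '.' }).collect::<String>())
        .collect::<Vec<_>>()
        .join("\n")
}
```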
6. Tests: core/src/puzzles/<genre>/tests.rs
Minimum:
- `coord_count_formula_holds_for_various_sizes` — coord enumeration matches grid math.
- `empty_NxN_constraint_count` — the right number of `Constraint`s for an empty puzzle.
- `parse_janko_round_trip` over a fixed string.
- `parse_puzzlink_round_trip` from a real-corpus URL.
- `simple_and_fast_agree_on_*` over a hand-built puzzle whose solution count is known.
- `ortools_agrees_with_fast_*` (`#[cfg(feature = "ortools")]`) — at least one cross-check. Catches engine drift.
7. Module registration: core/src/lib.rs
One line: add pub mod <genre>; in the pub mod puzzles { ... }
block.
8. Genre registration: core/src/genre.rs
Two edits:
```rust
pub static MYGENRE: &Genre = genre_static!(MyGenre);
pub static GENRES: &[&Genre] = &[AKARI, SLITHER, SUDOKU, TAKUZU, MYGENRE];
```
That’s it for CLI and eframe — both iterate GENRES at startup and
dispatch on the runtime handle. There is no separate genres list to
maintain in cli/src/main.rs or eframe/src/main.rs.
9. Frontend rendering
The eframe app’s painter dispatches on puzzle.genre.mark_render
(Lamp / Glyph / EdgeStroke / …); clicks dispatch on
(mode, mark_render); overlays come from
(puzzle.genre.overlays)(puzzle, marks, status).
If your genre needs a render style none of the existing variants
cover, add a MarkRenderStyle variant in core/src/render.rs and
a paint pass in eframe/src/painter.rs::paint_marks. Most new
genres won’t.
10. Bench registration: core/benches/penmark.rs
The bench harness iterates GENRES and dispatches solver / grader /
generator / pipeline benchmarks for each. Most genres don’t need
manual wiring beyond appearing in GENRES; genres that ship a
hand-curated corpus instead of generated fixtures register that
corpus in core/benches/common/mod.rs. Filter the bench matrix via
criterion’s name match: cargo bench --bench penmark -- mygenre.
11. Dataset coverage — automatic
The pr_report harness in penmark-eval walks
data/puzzles/<genre>.jsonl.gz end-to-end (FastSolver + grader,
bucketed by rows × cols) for every PR run. Once your genre has
records under data/puzzles/<genre>.jsonl.gz, the PR comment
picks it up automatically — no per-genre test file to maintain.
Earlier iterations of penmark kept hand-rolled
<genre>_dataset_smoke.rs tests for this; they were retired in
favor of the per-genre dataset phase in the eval pipeline.
12. Sanity check
```shell
cargo build --workspace --features ortools
cargo test -p penmark-core --features ortools
cargo bench --bench penmark -- mygenre        # smokes the bench matrix
cargo run -p penmark-cli -- list-impls        # shows the new genre
cargo run -p penmark-cli -- generate mygenre --rows R --cols C
```
New Rule variant sidebar
If your genre needs a constraint shape no existing Rule covers,
add one before doing the steps above. The change crosses several
files because adding a variant breaks every exhaustive match.
- Variant: add to the `Rule` enum in `core/src/constraint.rs`. Add the matching `RuleKind` variant. Add a `Violation` variant — used by eval to surface which constraint failed.
- Eval: add an arm in `core/src/algorithm/eval.rs::eval_rule`. At minimum return `ConstraintOutcome::Pending` until decided, plus the obvious-violation cases. Slow-path-only is fine for a first cut.
- Tightness: add a tightness arm in `Rule::tightness`. High tightness means “prunes a lot per commit”; the solver tries it first.
- Propagator (optional): `core/src/algorithm/fast/propagator.rs` has per-rule fast paths for `forced_moves_at`. Eval-only is fine; trial-1 picks up the slack. Add a fast path later if benches demand.
- OR-Tools: `core/src/algorithm/ortools/solver.rs::encode_constraint` is exhaustive on `Rule` — your build won’t compile until you add an arm. Two patterns:
  - Eager: pick the closest CP-SAT primitive (`add_or` / `add_at_most_one` / `add_linear_constraint`). Use for anything expressible as linear / boolean over per-coord indicators.
  - Lazy cut: push a `LazyCheck` variant. CP-SAT 0.4 doesn’t have direct primitives for graph-shape rules (`Connected`, `Path`, `BoundedSize`). `solve_inner` calls each lazy check after every CP-SAT response and adds an `add_or` no-good cut on violation.
- Grader (optional): if your rule yields a recognisable human technique, add a `Technique` impl in `core/src/algorithm/grader/`. Most rules don’t add new techniques — the existing ones cover most patterns generically.
- Docs: update the encoding table in External solvers.
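The lazy-cut pattern in the OR-Tools bullet can be sketched standalone, with a brute-force enumerator standing in for CP-SAT. Everything here (`solve_with_lazy_cuts`, the no-good representation) is illustrative, not the real `solve_inner`:

```rust
/// Shape of the lazy-cut loop: ask the base solver for an assignment,
/// run the lazy check on it, and on violation ban exactly that assignment
/// (a no-good cut) and re-solve. A brute-force search over n boolean vars
/// stands in for the CP-SAT response here.
pub fn solve_with_lazy_cuts(
    n_vars: usize,
    lazy_ok: impl Fn(&[bool]) -> bool,
) -> Option<Vec<bool>> {
    let mut no_goods: Vec<Vec<bool>> = Vec::new();
    loop {
        // Stand-in for the CP-SAT response: first assignment not yet banned.
        // Returns None (unsat) when every assignment has been cut.
        let sol = (0..(1u32 << n_vars))
            .map(|bits| (0..n_vars).map(|i| (bits >> i) & 1 == 1).collect::<Vec<bool>>())
            .find(|a| !no_goods.contains(a))?;
        if lazy_ok(&sol) {
            return Some(sol); // lazy check passed: genuine solution
        }
        no_goods.push(sol); // the add_or no-good: never propose this again
    }
}
```

A real implementation bans only the violating sub-assignment rather than the whole solution, which is what makes the cut useful on large models; the control flow is the same.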
Definition of done
- `cargo test -p penmark-core --features ortools` passes.
- `cargo bench --bench penmark -- <genre> -- --test` smokes.
- `cargo run -p penmark-cli -- list-impls` shows the new genre.
- `cargo run -p penmark-cli -- generate <genre> --rows R --cols C` produces a valid puzzle.
- At least one `ortools_agrees_with_fast_*` test in `core/src/puzzles/<genre>/tests.rs::solver`.
- eframe app loads and a sample puzzle renders + solves interactively.
Cross-references
- Generator chapter — `GenConfig`, `genre.generate`, scaffold passes, clue filters.
- Fast solver — internals before writing a propagator fast path.
- External solvers — OR-Tools encoding strategy table.
Roadmap
What the MVP covers, where performance currently sits, and what’s open.
MVP
Three genres on one engine, two frontends. Mostly landed.
Genres in scope
| Genre | Mark | Substrates | Variants | Constraint shapes used |
|---|---|---|---|---|
| Akari | Binary | Cells | classic + holes (Iraka) | ExactCount, CellMin |
| Slitherlink | Binary | Cells, EdgesH, EdgesV | classic | ExactCount, DegreeIn, Path { closed }, LoopFreezesCounts |
| Sudoku | Numeric | Cells | classic 9×9 + several variants via canonical-import path | Distinct, Decided |
For each genre:
- Concrete `Puzzle` impl with a constructor that populates the `constraints` Vec at build time. ✅
- Propagators per `RuleKind` the genre uses, riding the shared `Propagator` engine. ✅
- `FastSolver` + `StandardGrader` end-to-end. ✅
- OR-Tools cross-check via `algorithm/ortools/`. ✅
- `Enumerator<G>` impl walking the genre’s canonical structural space, with shared `EnumStop` semantics. ✅
- Starter grading profile shipped as TOML, validated against the registered technique list at load time. ✅
- ASCII renderer for terminal output. ✅
- Generator: shipped for Akari; not for Slitherlink/Sudoku yet (those load from the canonical dataset).
Frontends in scope
- CLI — see the CLI chapter for the live verb set. Genre routing is hand-wired per verb (no registry).
- eframe desktop app — one window, per-genre interactive views in `eframe/src/games/`, plus `harness/` (solve / generate / grade panels), `dataset_browser.rs`, and `db_browser.rs`. Per-genre rendering lives in each genre’s module directly; no `Render` trait abstraction (it’ll earn one if/when the catalog grows enough to make the overlap clear).
Explicitly out of scope, for now
- More genres: Nurikabe, Heyawake, Hitori, Tapa, Yajilin, Masyu, Hashiwokakero, Numberlink, Fillomino, etc.
- Sudoku variants beyond what the canonical importer already understands (Killer, Sandwich, Thermo, Arrow, Sudoku-X, German Whispers, Renban, Kropki, XV are partially modelled but not uniformly graded).
- Pattern-phase techniques beyond a starter handful.
- Hex / non-square substrates.
- Backward-logic refinements (clue-cover beyond the basic case).
- A `Render` trait abstracted across genres.
- The `wasm` crate, `wasm-bindgen` surface, and the `ordo/edit` page that consumes them. The wire-format work (tagged puzzle and game serialization) is in scope and described above; the WASM bindings themselves and whichever registry shape they want are what’s still to land.
Genre expansion
Which genres ship next and why, revisited against what’s actually
landed. The original sequencing (masyu → nurikabe → nurimisaki →
heyawake → yajilin) was written when adding a genre meant ~7 files
across core/, cli/, and eframe/, and when the OR-Tools
encoder didn’t yet cover the rules that nurikabe-family genres
need. Both of those have changed. This chapter rewrites the plan
on top of the current infra.
What shipped vs. the plan
| Genre | Original plan | Actual |
|---|---|---|
| akari | done at plan time | done |
| slitherlink | done at plan time | done |
| sudoku | not on plan | shipped (variant-CtC market fit; jumped the queue) |
| takuzu | not on plan | shipped (worked example for the “smallest possible genre” path) |
| masyu | Phase 1 | not started |
| nurikabe | Phase 2 | not started |
| nurimisaki | Phase 3 | not started |
| heyawake | Phase 4 | not started |
| yajilin | Phase 5 | not started |
| hashiwokakero | deferred | now in scope (named in the business plan) |
Sudoku displaced masyu because the business plan
makes variant-Sudoku in the CtC orbit the beachhead audience.
Takuzu displaced nurikabe because shipping a third tiny genre
(single mod.rs + parse.rs + tests.rs, no pzv codec, no
scaffold passes) validated the universal-genre architecture and
gave the Adding a genre chapter a minimum-viable example to
point at.
What changed in the infra
Three concrete shifts since the original plan was written.
Frontend wiring is one line
Adding a genre to CLI + eframe + bench is now one entry in the
core::genre::GENRES slice, not three full files. The post-collapse
architecture in Adding a genre spells this out: all three
harnesses iterate the slice and dispatch through the runtime
&'static Genre handle; per-genre cli/src/<genre>.rs and
eframe/src/<genre>.rs no longer exist. The wiring cost per genre
dropped from “a day or two” to “an hour.”
OR-Tools already encodes the stubbed rules
core/src/algorithm/ortools/solver.rs::encode_constraint is
exhaustive on Rule and has arms for every variant currently
stubbed in eval.rs:
| Rule | eval.rs (FastSolver) | OR-Tools encoder |
|---|---|---|
| Connected | Pending | implemented |
| NoTwoByTwo | Pending | implemented |
| ValueGroupedSize | Pending | implemented |
| Polyomino | Pending | implemented |
| BoundedSize | Pending | implemented |
| Path { closed: false } | Pending | implemented |
| Sum / Increasing / MinDifference | Pending | implemented |
This is the part that re-orders the roadmap. A genre whose rules
are stubbed in eval.rs is not blocked from shipping — OR-Tools
solves it correctly, the universal walker enumerates it, and
trial-1 carries forces in the absence of a fast eval arm. The
oracle works before the fast path does. Fast-path arms can land
later when bench numbers force them.
Universal walker, generator, and tests
walk_canonical removed per-genre enumerators (~200 LOC each).
scaffold_solve_derive covers most generation. Per-genre
dataset-smoke tests are copy-paste. The dataset infra
(puzzlekit-dataset repo, /private/tmp/penmark-data/) is in
place. The cost of “adding a genre that uses already-shipped
rules” is now mostly: write build_constraints, write the pzv
parser, write tests.
Recommended order
The new ordering optimizes for two things: cheapest first (stay in flow, ship visible progress), and then heaviest infra-leverage (the genre that promotes the most stubs to real fast-path eval). This is a different ordering than the original plan, which optimized for engine-reuse compounding alone.
Phase 1 — masyu (cheapest delta)
Still the right first ship. Slitherlink-minus-clues with two new
local rules (MasyuBlack, MasyuWhite). Reuses every existing
slitherlink primitive: Path { closed: true }, DegreeIn,
union-find, parse_url_frame, the pzv decode4Cell family.
Eval-only first cut for the two new rules; trial-1 carries
forces. OR-Tools needs new exhaustive arms for the two
variants, but they’re local and small.
Why this ordering still holds: it’s the genre with the smallest possible delta against shipped code, so it’s the ideal “warm up the new architecture” ship and proves the once-per-genre cost is actually as low as the Adding a genre chapter claims.
Ship target: a few days. Definition of done unchanged from the original plan — 50-puzzle masyu dataset smoke passes, ortools cross-check, eframe pane renders.
Phase 2 — hashiwokakero
New addition from the business plan. Hashi is named explicitly as “within reach on the same architecture.” Engine-wise it sits between slither (edges) and nurikabe (Connected). It needs:
- An edge-multiplicity rule — the existing edge substrate carries binary drawn / not drawn; Hashi needs `{0, 1, 2}` per edge slot. Either a new `Mark::Numeric` on edges (cleanest) or a new `Rule::EdgeMultiplicity { allowed: [0, 1, 2] }` over edges (smaller diff but encodes a marker as a constraint).
- `ExactCount` per island over its incident edges (already shipped).
- `Connected { mark: Bridge }` over the bridge edge set (stubbed in `eval.rs`; needs a real arm — see below).
- A planarity / non-crossing cut: bridges can’t cross. Lazy cut pattern, same shape as the slither loop closure.
The Hashi solver work doubles as the forcing function for
promoting Connected from stub to real. Once Connected has a
fast eval arm, nurikabe / nurimisaki / heyawake / yajilin all
benefit.
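A one-shot union-find connectivity check — the core of any real `Connected` arm — fits in a few lines. This is illustrative: the real propagator keeps union-find state incrementally rather than rebuilding it per call.

```rust
/// Find with path halving.
fn find(parent: &mut Vec<usize>, mut x: usize) -> usize {
    while parent[x] != x {
        parent[x] = parent[parent[x]]; // path halving
        x = parent[x];
    }
    x
}

/// Are all cells carrying the mark in one component under 4-adjacency?
/// Vacuously true when nothing is marked.
pub fn is_connected(marked: &[Vec<bool>]) -> bool {
    if marked.is_empty() {
        return true;
    }
    let (rows, cols) = (marked.len(), marked[0].len());
    let mut parent: Vec<usize> = (0..rows * cols).collect();
    for r in 0..rows {
        for c in 0..cols {
            if !marked[r][c] {
                continue;
            }
            // Union with marked neighbors above and to the left; wrapping_sub
            // makes the r=0 / c=0 cases fail the bounds check below.
            for (nr, nc) in [(r.wrapping_sub(1), c), (r, c.wrapping_sub(1))] {
                if nr < rows && nc < cols && marked[nr][nc] {
                    let a = find(&mut parent, r * cols + c);
                    let b = find(&mut parent, nr * cols + nc);
                    parent[a] = b;
                }
            }
        }
    }
    let mut root = None;
    for r in 0..rows {
        for c in 0..cols {
            if !marked[r][c] {
                continue;
            }
            let f = find(&mut parent, r * cols + c);
            match root {
                None => root = Some(f),
                Some(g) if g != f => return false, // second component found
                _ => {}
            }
        }
    }
    true
}
```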
Why earlier than Phase 3+: the business plan explicitly calls Hashi out as part of the “Nikoli family within reach” pitch. CTC solves Hashi occasionally on YouTube. It’s a different shape of puzzle from the cell-coloring family, which broadens the engine demo without demanding the whole nurikabe-family stub-to-real push at once.
Ship target: ~2 weeks. Most of the time goes to the planarity cut and to making Connected fast enough for trial-1 to be useful.
Phase 3 — nurikabe (the high-leverage stub-to-real push)
This was Phase 2 in the original plan, and the underlying
argument hasn’t changed: implementing real fast-path eval for
Connected, NoTwoByTwo, and ValueGroupedSize unlocks four
later genres simultaneously. What’s changed is that Hashi
(Phase 2 above) already paid for Connected’s real arm, which
makes nurikabe’s incremental cost smaller than originally
estimated.
Three rules need real eval.rs arms after Hashi:
- `NoTwoByTwo { mark }` — slow-path is trivial (walk every 2×2); fast-path is a per-2×2 aggregate with force-when-three-and-one-undecided.
- `ValueGroupedSize` — designed for nurikabe; per-value-seed BFS over committed cells of that mark, asserting size == value.
- `Connected` (already done in Phase 2 if Hashi shipped first; otherwise here).
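The `NoTwoByTwo` slow path plus its force rule is small enough to sketch in full. The types here are illustrative — the real arm works on candidate bitmasks, not a `Cell` enum:

```rust
#[derive(Clone, Copy, PartialEq)]
pub enum Cell {
    Black,
    White,
    Undecided,
}

/// Walk every 2×2 window. Four blacks is a violation; three blacks plus
/// one undecided cell forces that cell to White. Returns (no violation
/// found, coords forced to White).
pub fn no_two_by_two(grid: &[Vec<Cell>]) -> (bool, Vec<(usize, usize)>) {
    let (mut ok, mut forced) = (true, Vec::new());
    for r in 0..grid.len().saturating_sub(1) {
        for c in 0..grid[r].len().saturating_sub(1) {
            let window = [(r, c), (r, c + 1), (r + 1, c), (r + 1, c + 1)];
            let blacks = window.iter().filter(|&&(y, x)| grid[y][x] == Cell::Black).count();
            let undecided: Vec<_> = window
                .iter()
                .copied()
                .filter(|&(y, x)| grid[y][x] == Cell::Undecided)
                .collect();
            if blacks == 4 {
                ok = false; // violation: a fully black 2×2
            }
            if blacks == 3 && undecided.len() == 1 {
                forced.push(undecided[0]); // must be White
            }
        }
    }
    (ok, forced)
}
```

The fast-path version maintains the per-window black count as an aggregate updated in `apply`/`undo`, instead of rescanning every window.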
The pzv work is the standard decode4Cell family.
Ship target: ~3 weeks. Most of the time is the propagator work for the three rules.
Phase 4 — nurimisaki, heyawake (proves Phase 3 transports)
Both reuse Connected + NoTwoByTwo from Phase 3. Both ship
fast on top of that base.
- Nurimisaki adds one new local rule: `Cape { at, len }`. LOS ray of length `len-1` white, terminated by black/wall, other three neighbors black/wall. Eval-only first cut.
- Heyawake adds room geometry + adjacency. No new `Rule` variants — it encodes via `ExactCount` per numbered room + `AtMost(1)` per adjacent black pair + a per-straight-run `Forbidden(ExactCount(black, k))` for the room-boundary rule. Lots of constraints, no new engine work.
Land them in the same phase because nurimisaki proves Phase 3’s stub promotions transport, and heyawake stress-tests the constraint count on existing rules.
Ship target: ~2 weeks combined.
Phase 5 — yajilin (the prize)
Original Phase 5; still the hardest. Cell-only Model A
({Black, WhiteLoop, Numbered} per cell, loop recovered from
WhiteLoop set via Connected + DegreeIn(2)) reuses everything in
place by then. Model B (edges + cells coupled biconditional)
deferred until benches force it.
If Phase 3’s Connected arm proves not strong enough as a
“single closed loop” check on cells, add a cell-substrate
Path { closed: true } variant. The slither edge-based version
demonstrates the algorithm is tractable.
Ship target: ~3 weeks.
Phases 1-5 in roughly two months
Estimate is honest if focused: masyu (3-5 days) + hashi (2 weeks)
+ nurikabe (3 weeks) + nurimisaki+heyawake (2 weeks) + yajilin
(3 weeks). About 10 weeks of focused work, less if some genres
ship faster than estimated, more if `Connected`’s fast eval arm
turns into a multi-week project on its own.
This is materially faster than the original plan’s pace because
of the architecture collapse: each genre is mostly one file in
core/ plus three Vec entries elsewhere, and OR-Tools solves
correctly today for every rule shape we care about. The gating
work is per-rule fast-path eval, not per-genre wiring.
Deferred (still out of scope)
- shakashaka — needs `Mark::Triangle(Empty, NW, NE, SW, SE)` (a cardinality-5 mark; the engine assumes Binary/Numeric) and a `RegionIsRectangle` geometric rule with no analog. Revisit when there’s a concrete demand signal.
- lits — tetromino placement is a different paradigm than per-coord candidate sets. Would need `Rule::OneOfShapes` and same-shape adjacency across rooms. Tractable but architecturally distinct.
- Crossword, mazes, hidden-information genres — covered in the Puzzle families chapter; sister-track work, not in-scope for the CSP roadmap.
Locked decisions (carry over)
- Trial-1 + OR-Tools oracle carries any genre as a “slow but correct” first cut. Fast-path eval lands when benches demand.
- New genres are one file in `core/puzzles/<genre>/` plus three Vec entries (CLI / eframe / bench). No per-genre frontend files.
- pzv URL framing goes through `core/src/puzzles/pzv.rs::parse_url_frame`.
Why this matters for the business plan
The business plan makes engine breadth a long-term moat: “the home for all Nikoli-style puzzles” is the embed-pitch differentiator no competitor can offer. The Phase 1 beachhead is variant-Sudoku — done — but Phase 4 of the business roadmap (“multi-genre expansion and market growth”) is the phase that turns the engine into a real pitch. Shipping six more genres in the next ~10 weeks moves Phase 4 from “long-term moat” to “ready to demo when the marketplace ships in the business plan’s Phase 2.”
The cheap-then-leveraged ordering (masyu → hashi → nurikabe → nurimisaki+heyawake → yajilin) means the engine demo gets visibly broader every couple of weeks, which is the showing-up-in-CtC-orbit cadence the business plan’s go-to-market section depends on.
Performance roadmap
The originally-planned tiers all landed inside FastGame +
Propagator. FastSolver solves a 100×100 Akari in ~42 ms;
StandardGrader grades the same in ~210 ms. See the Fast solver
chapter for the full optimization log — what worked, what was
reverted, and what’s still open.
The ideas that landed:
- Region indices cached once at construction. Each constraint resolves its `Region` to a `Vec<usize>` of dense coord indices in `Propagator::region_indices`; the hot path walks the slice directly with `candidates[i]` access.
- First-class LOS / ray substrates. `Region::Los`, `Region::LosCross`, and `Region::Neighbors4` resolve natively via `puzzle.substrate().blocks_los(r, c)` — a direct read on the cells grid (`Material::Wall` blocks, `Floor` and `Hole` pass through), no trait-method dispatch.
- Per-constraint aggregates. Each constraint caches `committed_with_mark` / `possible_with_mark`; `apply`/`undo` maintain them; `forced_moves_at` reads them in O(1).
- Per-cell counters (Tier 3, the big one). Puzzles declare named counters via `Puzzle::cell_counters()`; the propagator maintains them through `apply`/`undo` via precomputed inverse-reach. `Rule::CellMin { counter, min }` propagates cell-by-cell through a dirty-cells queue instead of riding the AC-3 worklist.
- Bitmask candidates. Per-coord `u32`, one bit per mark in the puzzle’s universe. Apply / undo are `mask & !bit` / `mask | bit`.
- MRV trial ordering + intra-sweep propagation in trial-1.
- LTO=fat + codegen-units=1 in release.
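The bitmask-candidate idea in miniature — a hypothetical standalone `Candidates` type, not the propagator’s actual representation:

```rust
/// One u32 per coord, one bit per mark in the puzzle's universe.
/// Eliminating a mark is `mask & !bit`; undo is `mask | bit`. A cell is
/// decided when exactly one bit survives, contradicted at zero.
#[derive(Clone, Copy)]
pub struct Candidates(pub u32);

impl Candidates {
    /// All n_marks candidates open (n_marks < 32).
    pub fn full(n_marks: u32) -> Self {
        Candidates((1u32 << n_marks) - 1)
    }
    pub fn eliminate(&mut self, mark: u32) {
        self.0 &= !(1 << mark);
    }
    /// Undo of eliminate.
    pub fn restore(&mut self, mark: u32) {
        self.0 |= 1 << mark;
    }
    /// Some(mark) when exactly one candidate remains.
    pub fn decided(&self) -> Option<u32> {
        (self.0.count_ones() == 1).then(|| self.0.trailing_zeros())
    }
    pub fn contradiction(&self) -> bool {
        self.0 == 0
    }
}
```

`count_ones` / `trailing_zeros` compile to single popcount / ctz instructions, which is why the hot path prefers this over a `Vec<Mark>` per coord.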
Still open: fused substrate-aware updates (a move that walks a ray AND emits forced moves on those same cells, in one pass), MAC / GAC / AC-2001, trial-2/N, smarter branch heuristics, parallelism in the solve loop, WASM measurement.
Virtuous cycle
How the four operations Penmark performs on a puzzle — generate, solve, grade, enumerate — and the discovery loop that connects them, fit together as one feedback system rather than four standalone tools.
The loop
```mermaid
flowchart LR
GEN["Generate<br/><sub>Genre::generate</sub><br/><sub>scaffold → bare-solve →<br/>derive clues → reduce</sub>"]
PUZ[(Puzzle corpus)]
SLOW["Slow solve<br/><sub>SimpleSolver,<br/>OR-Tools CP-SAT</sub><br/><sub>reference truth</sub>"]
FAST["Fast solve<br/><sub>FastSolver</sub><br/><sub>AC-3 + trial-1<br/>+ Propagator</sub>"]
GRADE["Grade<br/><sub>StandardGrader</sub><br/><sub>fire named Techniques<br/>in priority order</sub>"]
ENUM["Enumerate<br/><sub>Genre::enumerate</sub><br/><sub>stream unique puzzles<br/>at a given size</sub>"]
DIFF["Difficulty<br/>label"]
DISCOVER["Discover<br/><sub>mining pass over corpus:<br/>capture contradictions,<br/>cluster proof-shapes</sub>"]
HUMAN["Human curation<br/><sub>inspect candidates,<br/>encode as Technique impl</sub>"]
CATALOG[("Grader<br/>technique<br/>catalog")]
GEN -->|new puzzle| PUZ
ENUM -->|stream| PUZ
PUZ --> SLOW
PUZ --> FAST
SLOW -->|cross-check| FAST
FAST --> GRADE
GRADE --> DIFF
DIFF -->|filter / re-roll| GEN
PUZ --> DISCOVER
FAST -->|trial-k probes,<br/>contradiction proofs| DISCOVER
DISCOVER --> HUMAN
HUMAN -->|new Technique impl| CATALOG
CATALOG --> GRADE
CATALOG -->|finer-grained<br/>difficulty control| GEN
classDef gen fill:#1f3b73,stroke:#3b5fa8,color:#fff
classDef solve fill:#2d5a3d,stroke:#4a8a5a,color:#fff
classDef grade fill:#5a4a2d,stroke:#8a7a4a,color:#fff
classDef discover fill:#5a2d4a,stroke:#8a4a7a,color:#fff
classDef store fill:#3a3a3a,stroke:#6a6a6a,color:#fff
class GEN,ENUM gen
class SLOW,FAST solve
class GRADE,DIFF grade
class DISCOVER,HUMAN discover
class PUZ,CATALOG store
```
The nodes are colored by role — the four “boxes a user runs” plus the passive stores:
- blue — sources of puzzles (generate, enumerate).
- green — solvers (slow reference + fast production).
- gold — grading (the technique catalog plus the resulting difficulty label).
- purple — the discovery loop that grows the catalog.
- grey — passive stores (the puzzle corpus, the technique catalog).
How each edge actually flows
Generate → corpus
`genre.generate` runs the universal scaffold_solve_derive
recipe (or a genre-specific override): pick structural materials,
bare-solve a completion with `FastSolver`, derive clues via
`genre.derive_clue_at` (the typed `ClueValue`-producing hook;
defaults through `genre.clue_pattern` for `u8` genres), then reduce
by symmetry orbit until clue density hits the floor. The output is one uniquely-solvable
`Puzzle` (with its `genre` field set to the runtime handle).
Repeated calls under varied seeds populate a corpus file (the
per-genre `data/puzzles/<genre>.jsonl[.gz]`).
`genre.enumerate` is the same recipe run exhaustively — instead of
“give me one good puzzle for this config” it streams every
uniquely-solvable puzzle at a given size. Either path lands in
the same on-disk corpus.
Corpus → solve
Two solvers run in parallel against the same corpus, for different reasons:
- Slow solve is the reference. `SimpleSolver` is brute-force DFS over `HashMap<Coord, Vec<Mark>>`; the obvious-correct impl every faster solver gets cross-checked against. `OrToolsSolver` (CP-SAT, behind the `ortools` feature) is the “real CP solver” point of comparison — it answers the question “how close is our bespoke propagator to a production CP engine on this corpus?”.
- Fast solve is production. `FastSolver` runs DFS plus AC-3 propagation through a `Propagator` over a `FastGame<'p>` (bitmask domains, per-constraint aggregates, per-cell counters), with a depth-1 trial-1 sweep at every node. This is what the grader and generator call internally.
The cross-check edge between them is what keeps `FastSolver`
honest. Every `simple_and_fast_agree_*` test asserts the two
return the same solution set on a fixed puzzle; the corpus-wide
bench harness asserts the same across thousands of puzzles.
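The shape of that agreement test can be illustrated with toy stand-ins. Here two self-contained n-queens counters play the roles of `SimpleSolver` (obvious-correct brute force) and `FastSolver` (bitmask DFS); the function names and the n-queens domain are invented for the sketch, not Penmark code.

```rust
// Toy stand-ins for the cross-check: `simple_solve` is the obvious-correct
// brute force, `fast_solve` the bit-twiddled production path. The test asserts
// identical answers across a whole parameter sweep, never just one input.

/// Count n-queens solutions by brute-force DFS with pairwise checks.
fn simple_solve(n: usize) -> usize {
    fn rec(n: usize, rows: &mut Vec<usize>) -> usize {
        if rows.len() == n {
            return 1;
        }
        let r = rows.len();
        let mut total = 0;
        for c in 0..n {
            let ok = rows
                .iter()
                .enumerate()
                .all(|(pr, &pc)| pc != c && pr.abs_diff(r) != pc.abs_diff(c));
            if ok {
                rows.push(c);
                total += rec(n, rows);
                rows.pop();
            }
        }
        total
    }
    rec(n, &mut Vec::new())
}

/// Count n-queens solutions with bitmask column/diagonal domains.
fn fast_solve(n: usize) -> usize {
    fn rec(n: u32, cols: u32, diag1: u32, diag2: u32) -> usize {
        if cols.count_ones() == n {
            return 1;
        }
        let full = (1u32 << n) - 1;
        let mut free = full & !(cols | diag1 | diag2);
        let mut total = 0;
        while free != 0 {
            let bit = free & free.wrapping_neg(); // lowest free column
            free &= free - 1;
            total += rec(n, cols | bit, (diag1 | bit) << 1, (diag2 | bit) >> 1);
        }
        total
    }
    rec(n as u32, 0, 0, 0)
}

fn main() {
    // The cross-check: the fast path must agree with the reference everywhere.
    for n in 1..=8 {
        assert_eq!(simple_solve(n), fast_solve(n), "solvers disagree at n={n}");
    }
}
```

The sweep matters more than any single case: a bitmask bug that only bites at one board size is exactly what a one-puzzle test misses.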
Fast solve → grade
`StandardGrader` fires named human-pattern Techniques in
priority order against the same `Propagator` engine `FastSolver`
uses. Techniques are rule-kind-driven (Propagation-phase ones
participate in the inner fixpoint loop; Pattern-phase ones run
when propagation stalls), so genres sharing rule shapes share
deductions. The output is a difficulty label assigned by
counting which technique tiers were needed to close the puzzle.
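A minimal sketch of that grading loop, with invented types (`Technique`, `grade`, and an abstract "remaining cells" state) standing in for the real catalog: fire the cheapest applicable technique, restart from the top after any progress, and label the puzzle with the highest tier that was ever required.

```rust
// Illustrative sketch, not Penmark's API: each technique reports how much
// progress it made on a (deliberately abstract) solve state. The grader
// walks the catalog in priority order, restarting from the cheapest tier
// after every success; the difficulty label is the highest tier ever needed.

#[allow(dead_code)]
struct Technique {
    name: &'static str,
    tier: u8,
    fire: fn(&mut u32) -> u32, // mutates the state, returns progress made
}

/// `remaining` abstracts "unsolved cells"; Some(tier) on success, None if the
/// whole catalog stalls before the puzzle closes.
fn grade(mut remaining: u32, catalog: &[Technique]) -> Option<u8> {
    let mut max_tier = 0;
    'outer: while remaining > 0 {
        for t in catalog {
            if (t.fire)(&mut remaining) > 0 {
                max_tier = max_tier.max(t.tier);
                continue 'outer; // progress: retry cheaper techniques first
            }
        }
        return None; // every technique stalled
    }
    Some(max_tier)
}

fn main() {
    let catalog = [
        Technique {
            name: "bulk-singles",
            tier: 1,
            fire: |s: &mut u32| if *s >= 10 { *s -= 10; 10 } else { 0 },
        },
        Technique {
            name: "chain",
            tier: 2,
            fire: |s: &mut u32| if *s >= 1 { *s -= 1; 1 } else { 0 },
        },
    ];
    assert_eq!(grade(20, &catalog), Some(1)); // tier 1 alone closes it
    assert_eq!(grade(25, &catalog), Some(2)); // the last 5 cells need tier 2
    assert_eq!(grade(3, &catalog[..1]), None); // tier 1 alone stalls
}
```

The real grader's Propagation-phase / Pattern-phase split is not modeled here; only the priority-order, retry-from-cheapest control flow is.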
Difficulty → generate
The generator’s `cfg.time_budget` and `cfg.clue_density` knobs
control raw structural shape, but the grader’s difficulty
label is what makes the generator useful. A “produce a hard
6×6 sudoku” workflow is the generator producing candidates, the
grader rating each, and the harness re-rolling anything that
doesn’t land in the requested tier. Without the grader the
generator is just a uniqueness oracle.
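That re-roll harness is a short loop. Everything below is a stub — `gen` is a seed hash posing as `Genre::generate`, `grade` a modulus posing as the grader — but the control flow is the point: generate, rate, discard until the label matches or the budget runs out.

```rust
// Hypothetical harness shape: keep rolling seeds until the graded difficulty
// lands in the requested tier, bounded by an attempt budget.
fn reroll<F, G>(mut gen: F, grade: G, target: u8, max_attempts: u32) -> Option<u64>
where
    F: FnMut(u32) -> u64, // seed -> candidate puzzle (stub for Genre::generate)
    G: Fn(u64) -> u8,     // candidate -> difficulty tier (stub for the grader)
{
    (0..max_attempts)
        .map(|seed| gen(seed))
        .find(|&candidate| grade(candidate) == target)
}

fn main() {
    // Stub "generator": a seed hash. Stub "grader": difficulty = puzzle % 5.
    let gen = |seed: u32| (seed as u64).wrapping_mul(2654435761);
    let grade = |p: u64| (p % 5) as u8;

    let hard = reroll(gen, grade, 3, 1000).expect("budget exhausted");
    assert_eq!(grade(hard), 3);

    // An unreachable tier exhausts the budget and reports failure.
    assert_eq!(reroll(gen, grade, 7, 1000), None);
}
```

The `None` branch is what "re-rolling anything that doesn't land in the requested tier" looks like at the API boundary: the caller decides whether to widen the tier, raise the budget, or give up.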
Corpus + fast solve → discover
The discovery loop is the part of this diagram that doesn’t
yet exist as production code but is the point of the whole
exercise. The shape: run trial-k preprocessing across the corpus
at depth k > 1, capture each contradiction’s minimal-variable-set proof, cluster the proofs by structural shape (modulo
symmetry), and surface the top-N candidate patterns.
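Since the mining pass doesn't exist yet, the following is only a shape sketch under stated assumptions: each proof is reduced to the set of cells it touched, canonicalised modulo translation so structurally identical proofs from different board positions fall into the same bucket. The real pass would also quotient by rotation/reflection and carry the rule kinds involved; `canonical_shape` and `cluster` are invented names.

```rust
use std::collections::HashMap;

type Coord = (i32, i32);

/// Translate a proof's cell set so its minimal row/col is (0, 0), then sort —
/// a cheap canonical form modulo translation only.
fn canonical_shape(cells: &[Coord]) -> Vec<Coord> {
    let min_r = cells.iter().map(|&(r, _)| r).min().unwrap_or(0);
    let min_c = cells.iter().map(|&(_, c)| c).min().unwrap_or(0);
    let mut shape: Vec<Coord> = cells.iter().map(|&(r, c)| (r - min_r, c - min_c)).collect();
    shape.sort();
    shape
}

/// Bucket proofs by canonical shape; return (shape, frequency), most frequent
/// first, so the top-N candidate patterns are the head of the list.
fn cluster(proofs: &[Vec<Coord>]) -> Vec<(Vec<Coord>, usize)> {
    let mut counts: HashMap<Vec<Coord>, usize> = HashMap::new();
    for proof in proofs {
        *counts.entry(canonical_shape(proof)).or_insert(0) += 1;
    }
    let mut out: Vec<_> = counts.into_iter().collect();
    out.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(&b.0)));
    out
}

fn main() {
    let proofs = vec![
        vec![(0, 0), (0, 1), (1, 0)],
        vec![(5, 5), (5, 6), (6, 5)], // the same L-tromino, translated
        vec![(2, 2), (2, 3)],
    ];
    let clusters = cluster(&proofs);
    assert_eq!(clusters[0], (vec![(0, 0), (0, 1), (1, 0)], 2));
}
```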
The patterns that cluster cleanly are candidate techniques. Each one is a deduction the existing grader catalog doesn’t have a name for, surfaced from real puzzles where it would have mattered.
Discover → human → catalog
A human inspects the candidates. Patterns that read like
something a sudoku setter would call a technique (“locked
candidate,” “sashimi X-wing,” etc.) get encoded as Technique
impls and added to the grader’s catalog. Patterns that turn out
to be 12-variable monsters with no human-recognisable shape get
discarded — they don’t pollute the catalog.
The grader stays a catalog of hand-coded named techniques. The discovery loop is a helper for the human writing techniques, not a generator of them. That matters for three reasons:
- The grader API is unchanged; difficulty calibration stays anchored to human-recognisable techniques.
- Solver internals (aux-var state, proof shapes, search bookkeeping) never have to be human-readable.
- Auto-discovered candidates that aren’t interpretable get filtered at human review.
Catalog → generate (the long edge)
This is the edge that closes the cycle. A grader catalog with finer-grained tiers gives the generator finer-grained difficulty control: “produce a puzzle that needs technique X but not technique Y” becomes expressible. That produces a richer corpus that, on the next mining pass, surfaces a different set of candidate patterns — including some that the previous catalog was masking by always firing first.
Why this framing matters
Each operation in isolation is unremarkable: a CP solver, a DFS generator, a rule-firing grader, a difficulty tagger. The interesting part is that all four share the same propagation engine, so each improvement compounds:
- A faster `FastSolver` makes the generator produce more candidates per second, makes trial-k tractable for larger k, and makes the grader reach harder puzzles within the same time budget.
- A bigger grader catalog makes the generator’s difficulty output sharper and makes the discovery loop’s clustering pass see fewer false-positive “new” patterns.
- A bigger corpus makes both the bench harness more meaningful and the discovery loop more likely to surface low-frequency patterns.
The cycle’s bottleneck moves around. When the existing catalog covers most of the easy patterns, the discovery loop becomes the rate-limiting step. When the discovery loop’s canonicalisation surfaces clean clusters, the human-curation step becomes the rate-limiting step. When the catalog is rich enough that the generator has fine-grained control, the next unlock is harder corpora that exercise the long tail of techniques.
Where we are in this picture
The blue / green / gold parts (generate, both solvers, grader) are all in-tree and exercised by the bench matrix and the canonical corpus. The purple discovery loop is not yet built — every “Technique” in the grader catalog is hand-written from scratch, with the discovery side filled in by human intuition rather than corpus mining.
The aux-vars / sudoku-triads experiment on
claude/penmark-algorithmic-speedups-9tMvr was a pre-discovery
attempt to enrich the propagator’s primitive set so the
discovery loop, when built, would have a richer vocabulary to
cluster proofs in. The integration didn’t pay off on the
current sudoku corpus (~2× slowdown, identical DFS node count
across difficulty tiers — trial-1 already subsumes the patterns
triads catch). The full writeup is in
penmark/notes/aux-vars-postmortem.md;
the takeaway is that trial-1 is the floor on the current
corpus, so future discovery work should target either
genuinely-hard puzzles (where trial-1 doesn’t close the gap and
the search tree grows) or other genres entirely.
Status
Four genres (Akari, Sudoku, Slitherlink, Takuzu) run end-to-end on a
single CSP engine, on CLI + eframe. OR-Tools wired in as a
cross-check oracle. Canonical/wire layer ingests external puzzle
formats via puzzlehound. Tagged-serialization (puzzle blobs) and
the saved-game wire format (puzzle + marks envelope) both ship,
with round-trip tests across every genre via the single
`Puzzle::from_tagged_json[_with_marks]` path. The Tier 1
presentation surface (`MarkRenderStyle`, `Material`, `material`,
`cell_clue`, `is_play_target`, `format_value` on `Puzzle`;
`marks` / `get` / `set` / `clear` on `Game`) lives on the trait,
and eframe consumes it (per-genre `HashMap` mark stores collapsed
into the generic `tagged::Marks`).
Open frontiers — designs documented in this book that haven’t yet landed in the code:
- Substrate reification + `Pin` constraint. Documented in the Substrates and Constraints chapters; not yet implemented. The big move: collapse `SUBSTRATES` / `coords()` / `coord_bg()` / `los_blocks()` / `rows()` / `cols()` into one `substrate()` accessor returning a `Substrate` struct with four direct layer fields. `Bg` becomes `Material`; the discriminator enum renames `Substrate` → `Layer`. Givens collapse into the constraint list as `Pin` (single-coord) or `ExactCount` / `Sum` (count-shaped); per-genre `clues: HashMap` fields stay as editor scaffolding only.
- Constraint owns its regions + `Region.presentation`. Documented in the Constraints chapter (Regions section) and in Decorations, Themes and Colors; not yet implemented. `region: Region` becomes `regions: Vec<Region>` on `Constraint` — owned by the constraint, no separate region registry. `Region` wraps a `RegionShape` plus an optional `RegionPresentation` (border / fill / label / shaded / decoration). The renderer’s per-genre paint code collapses to a single walk: `puzzle.constraints().iter().flat_map(|c| &c.regions)`, painting whatever `presentation` is set.
- Generic grid widget. Once `Region.presentation` and substrate land, eframe’s per-genre `draw_grid` / `paint_edge` collapse onto a single widget that dispatches on `MARK_RENDER` for marks and walks `constraints` for region presentations.
- WASM bindings, discovery / coverage CLI verbs, expanding the genre catalog, and deciding the registry shape the website will want — all open work.
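The regions-on-constraints design above reduces the renderer to one iterator chain. A sketch with illustrative stand-in types — none of this is the final Penmark API, and the string "painting" stands in for real draw calls:

```rust
// Illustrative stand-ins only — not the final Penmark types. The point is the
// single walk: no per-genre paint code, just constraints -> regions ->
// whatever presentation each region carries.

struct RegionPresentation {
    fill: &'static str, // stand-in for border / fill / label / shaded / decoration
}

struct Region {
    cells: Vec<(usize, usize)>,
    presentation: Option<RegionPresentation>,
}

struct Constraint {
    regions: Vec<Region>,
}

/// One walk replaces every per-genre draw_grid / paint_edge body.
fn paint(constraints: &[Constraint]) -> Vec<String> {
    constraints
        .iter()
        .flat_map(|c| &c.regions)
        .filter_map(|r| {
            r.presentation
                .as_ref()
                .map(|p| format!("fill {} cells with {}", r.cells.len(), p.fill))
        })
        .collect()
}

fn main() {
    let constraints = vec![
        Constraint {
            regions: vec![
                Region {
                    cells: vec![(0, 0), (0, 1)],
                    presentation: Some(RegionPresentation { fill: "cage" }),
                },
                // Purely structural region, no presentation: the walk skips it.
                Region { cells: vec![(1, 1)], presentation: None },
            ],
        },
        Constraint {
            regions: vec![Region {
                cells: vec![(2, 2)],
                presentation: Some(RegionPresentation { fill: "shade" }),
            }],
        },
    ];
    assert_eq!(
        paint(&constraints),
        vec!["fill 2 cells with cage", "fill 1 cells with shade"]
    );
}
```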
Shape, traits, and module layout will keep iterating; this book is the current snapshot, not a contract.