Shipping a real product is the fastest path to honesty. Everything that seemed like a solid assumption on a whiteboard gets tested against actual customers, actual orders, and actual factory runs. Here's what the first 100 WeMkr orders actually taught us — what worked, what surprised us, what broke, and what we're still figuring out.
What Worked Immediately
The AI prompt system performed better than we expected on first tries. We anticipated needing to iterate heavily on generation quality before it was reliably useful. Instead, customers who described their ideas in plain language got usable, DFM-compatible designs on the first generation more often than not. The specific language customers used mattered less than we thought — the system was more robust to variation in phrasing than our internal testing suggested.
The DFM validation layer caught real problems — not theoretical ones. In the first 100 orders, it flagged and corrected 14 designs that would have resulted in factory rejections or bad proofs. Most of those corrections were line weight or color wall issues that wouldn't have been obvious to anyone without factory experience.
The payment model worked. We use authorize-then-capture: payment is authorized at checkout and only captured after the customer approves the final production proof. We expected this to feel confusing or create friction. Instead, it reduced abandonment at the proof stage — customers felt less financially committed before seeing the real proof photo, which made them more willing to start the process.
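To make the authorize-then-capture flow concrete, here is a minimal in-memory sketch. The names `Order`, `PaymentState`, `approve_proof`, and `reject_proof` are hypothetical and exist only for illustration; the real flow runs through the payment processor rather than local state. In Stripe, this pattern typically corresponds to a PaymentIntent created with `capture_method="manual"` and captured later, though exactly how WeMkr wires it up is not described here.

```python
from dataclasses import dataclass
from enum import Enum


class PaymentState(Enum):
    AUTHORIZED = "authorized"  # funds held at checkout, nothing charged yet
    CAPTURED = "captured"      # customer approved the proof; funds charged
    RELEASED = "released"      # hold dropped; customer is never charged


@dataclass
class Order:
    order_id: str
    amount_cents: int
    state: PaymentState = PaymentState.AUTHORIZED
    captured_cents: int = 0


def approve_proof(order: Order) -> Order:
    """Capture the held funds once the customer approves the production proof."""
    if order.state is not PaymentState.AUTHORIZED:
        raise ValueError(f"cannot capture order in state {order.state.value}")
    order.captured_cents = order.amount_cents
    order.state = PaymentState.CAPTURED
    return order


def reject_proof(order: Order) -> Order:
    """Release the hold if the customer declines at the proof stage."""
    if order.state is not PaymentState.AUTHORIZED:
        raise ValueError(f"cannot release order in state {order.state.value}")
    order.state = PaymentState.RELEASED
    return order
```

The point of the sketch is that money only moves in `approve_proof`. Every other path leaves `captured_cents` at zero, which is what lets a customer start an order without feeling committed.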
What Surprised Us
Logo uploads were harder than free-form descriptions. Logos carry years of brand intent and fine graphic detail, exactly the kind of detail that doesn't survive enamel translation. Our first logo pipeline took too many liberties in the simplification step. We've since rebuilt it with a dedicated approval step between simplification and generation, so customers can confirm what's being preserved before the design is finalized.
Customers want to refine, not restart. We built the refinement flow expecting it to be an edge case. It ran on 67% of all orders. Customers wanted incremental changes — "make the border thicker," "switch to silver hardware," "move the text down" — not full regenerations. That loop is now a core part of how we think about the product, not an afterthought.
Gold was the default. We hadn't realized how strongly the training data skewed toward gold hardware until nearly every generated design came out with gold borders unless the prompt explicitly specified otherwise. We built a neutralization step that surfaces hardware color as a deliberate choice rather than a default. Orders since then have had a more balanced distribution.
What Broke
- The proof email had a broken image link in certain email clients. The proof photo URL was being generated correctly but rendered incorrectly downstream. 11 orders were affected in weeks one and two before we caught it during a routine QA pass and patched it the same day.
- Rate limiting could be bypassed via IP spoofing. A pen test we ran after week three surfaced this. We fixed it by replacing IP-only controls with a combination of user-agent fingerprinting and account-level rate limits.
- One $0.00 order made it into the production database. The order had a legitimate customer ID, a real design, and a Stripe session — but the captured amount was zero. We've since added charge capture validation that blocks fulfillment for any order below the expected threshold.
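The account-level half of the rate-limiting fix can be sketched as a sliding-window limiter keyed by account ID instead of client IP. This is an illustrative sketch, not our production code: the class name `AccountRateLimiter` and its parameters are hypothetical, and the user-agent fingerprinting half of the fix is not shown.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class AccountRateLimiter:
    """Sliding-window limiter keyed by account ID rather than client IP."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # account_id -> timestamps of recent allowed requests, oldest first
        self._hits = defaultdict(deque)

    def allow(self, account_id: str, now: Optional[float] = None) -> bool:
        """Return True and record the request if the account is under its limit."""
        now = time.monotonic() if now is None else now
        hits = self._hits[account_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

Because the key is the authenticated account, changing or spoofing the source IP does not reset the counter.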
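The charge capture validation that came out of the $0.00 order can be sketched as a single gate in front of fulfillment. The names `CaptureRecord` and `can_fulfill` are hypothetical, and treating the order's expected total as the threshold is an assumption made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaptureRecord:
    order_id: str
    expected_cents: int  # what the order should have cost
    captured_cents: int  # what the payment processor reports as captured


def can_fulfill(record: CaptureRecord, min_cents: int = 1) -> bool:
    """Allow fulfillment only if the captured amount is positive and covers the expected total."""
    if record.captured_cents < min_cents:
        # Catches zero-amount captures even when the expected total is also zero.
        return False
    return record.captured_cents >= record.expected_cents
```

A check like this deliberately ignores whether the order has a valid customer ID, design, or payment session, since the zero-dollar order had all three.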
What We're Still Working On
Generation quality consistency is the hardest ongoing problem. The 90th percentile of outputs is excellent — clean, factory-ready, matching what the customer described. The 10th percentile is not. The failure modes are known (overly detailed fills, occasional symmetry artifacts, rare misinterpretation of the concept brief), but improving the floor without degrading the ceiling is slow, careful work.
Sticker add-ons are the next product. The manufacturing pipeline for stickers is simpler in some ways and more complex in others. DFM constraints are different. Factory qualification is underway.
Photo upload case 4 — a photo combined with a style instruction ("make this look like a retro travel badge") — is partially handled but not well. Cases 1 through 3 are solid: logo, reference image, plain photo. Case 4 has an interpretation problem we haven't fully resolved yet.
Alpha is about finding out what's true. We found out a lot. The things that broke were fixable. The surprises were mostly good ones. The gaps that remain are real work, not wishful backlog. Building continues.