Why factories beat static fixtures
Hardcoded fixtures create brutal coupling: change one User field and 47 tests break. Factories generate data on demand with sensible defaults and surgical overrides where it matters. A login test only needs valid email/password; the rest (createdAt, address, preferences) can be faker noise. Three key wins: 1) Isolated tests without order dependencies, 2) Explicit setup of what's relevant (user.withRole('admin')), 3) Centralized maintenance when schema evolves.
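To make the defaults-plus-overrides idea concrete, here is a minimal hand-rolled sketch with no library involved. The User shape, the buildUser name, and the default values are assumptions for illustration, not anything prescribed above.

```typescript
// Hypothetical User shape, for illustration only.
interface User {
  id: number;
  email: string;
  password: string;
  role: "user" | "admin";
  createdAt: Date;
}

let userSeq = 0; // sequence counter so generated IDs never collide

// Every field gets a valid default; callers override only what the test is about.
function buildUser(overrides: Partial<User> = {}): User {
  const id = ++userSeq;
  return {
    id,
    email: `user${id}@example.com`,
    password: "correct-horse-battery",
    role: "user",
    createdAt: new Date(),
    ...overrides,
  };
}

// A login test states only the two fields it depends on.
const loginUser = buildUser({ email: "login@example.com", password: "s3cret!" });
```

If a new required field lands on User, only buildUser changes; every test that calls it keeps compiling.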
Python's Factory Boy and Ruby's FactoryBot popularized the pattern. In JS, libraries like Fishery (TypeScript-first) or Rosie give you fluent DSLs. The secret is traits: instead of copy-pasting userFactory() 20 times with tweaks, define .pending(), .verified(), .suspended() and compose them. PHPUnit has no built-in factories, but data providers plus custom builders work the same way.
Common mistakes: stuffing business logic into factories (that belongs in the entity), generating colliding IDs (use sequences or UUIDs), and not cleaning up side effects (if a factory persists data, your tearDown must delete it).
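One way trait composition might look without any library is an immutable factory whose trait methods each return a new factory. This is a sketch under assumptions: the User fields, the class name, and withParams are all hypothetical, and none of it is Fishery's or Rosie's actual API.

```typescript
type Status = "pending" | "verified" | "suspended";

// Hypothetical User shape, for illustration only.
interface User {
  id: number;
  email: string;
  status: Status;
  role: "user" | "admin";
}

class UserFactory {
  private static seq = 0;

  constructor(private readonly params: Partial<User> = {}) {}

  // Each trait returns a new factory, so chaining never mutates the shared base.
  private withParams(extra: Partial<User>): UserFactory {
    return new UserFactory({ ...this.params, ...extra });
  }

  pending(): UserFactory { return this.withParams({ status: "pending" }); }
  verified(): UserFactory { return this.withParams({ status: "verified" }); }
  suspended(): UserFactory { return this.withParams({ status: "suspended" }); }
  admin(): UserFactory { return this.withParams({ role: "admin" }); }

  // Precedence: defaults, then accumulated traits, then per-call overrides.
  build(overrides: Partial<User> = {}): User {
    const id = ++UserFactory.seq;
    return {
      id,
      email: `user${id}@example.com`,
      status: "verified",
      role: "user",
      ...this.params,
      ...overrides,
    };
  }
}

const userFactory = new UserFactory();
const suspendedAdmin = userFactory.admin().suspended().build();
```

Because traits only accumulate params, .admin().suspended() and .suspended().admin() build equivalent users, and the base userFactory stays untouched.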
Anatomy of a well-designed factory
Typical structure: sensible defaults (faker for names, valid enums, null-safe foreign keys), composable traits (withOrders, expired, premium), lazy evaluation (createdAt resolves at build time, not at factory definition), and hooks (afterBuild for derived calculations, afterCreate for external API calls in integration tests).
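Lazy evaluation and an afterBuild hook can be sketched in a few lines. The defineFactory helper and the Order shape below are hypothetical and library-free; the point is only that defaults come from a function called on every build, and the hook runs after overrides are merged.

```typescript
// Hypothetical Order shape, for illustration only. Amounts are in cents.
interface Order {
  id: number;
  items: number[];
  total: number;
  createdAt: Date;
}

// defaults is a function, so it is re-evaluated on every build (lazy evaluation).
// afterBuild runs last and can fill in derived fields.
function defineFactory<T extends object>(
  defaults: (seq: number) => T,
  afterBuild: (obj: T) => void = () => {},
) {
  let seq = 0;
  return {
    build(overrides: Partial<T> = {}): T {
      const obj = { ...defaults(++seq), ...overrides };
      afterBuild(obj);
      return obj;
    },
  };
}

const orderFactory = defineFactory<Order>(
  (seq) => ({ id: seq, items: [1000, 250], total: 0, createdAt: new Date() }),
  (order) => {
    // Derived calculation: total always matches the final item list.
    order.total = order.items.reduce((sum, cents) => sum + cents, 0);
  },
);
```

Had createdAt been a plain object literal at definition time, every order in the suite would share one timestamp frozen at import.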
TypeScript example with Fishery:
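import { Factory } from 'fishery';
import { faker } from '@faker-js/faker';
// In Fishery, traits are methods on a Factory subclass that return this.params(...).
class UserFactory extends Factory<User> {
  admin() { return this.params({ role: 'admin' }); }
}
const userFactory = UserFactory.define(({ sequence }) => ({
  id: sequence,
  email: faker.internet.email(),
  role: 'user',
  createdAt: new Date(), // evaluated on each build
}));
Now userFactory.admin().build() gives you an admin, and userFactory.build({ email: 'test@foo.com' }) applies a surgical override. The sequence gives every built object a unique ID within a test run.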
For relationships, use lazy attributes. If Order has a customerId, the Order factory can accept an optional customer or build a default one with customerFactory.build(). This avoids a combinatorial explosion of fixtures.
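A rough sketch of that optional-parent pattern, using plain functions instead of a factory library. The Customer and Order shapes and the buildCustomer/buildOrder names are assumptions for illustration.

```typescript
// Hypothetical shapes, for illustration only.
interface Customer { id: number; name: string; }
interface Order { id: number; customerId: number; totalCents: number; }

let customerSeq = 0;
let orderSeq = 0;

function buildCustomer(overrides: Partial<Customer> = {}): Customer {
  const id = ++customerSeq;
  return { id, name: `Customer ${id}`, ...overrides };
}

interface OrderOptions {
  customer?: Customer;
  overrides?: Partial<Order>;
}

function buildOrder({ customer, overrides = {} }: OrderOptions = {}): Order {
  // The default customer is built lazily, only when the caller did not pass one.
  const owner = customer ?? buildCustomer();
  return { id: ++orderSeq, customerId: owner.id, totalCents: 1999, ...overrides };
}

// Two orders for the same customer, without a dedicated fixture for that case.
const ada = buildCustomer({ name: "Ada" });
const adaOrders = [buildOrder({ customer: ada }), buildOrder({ customer: ada })];
```

A test that does not care who owns the order just calls buildOrder() and gets a valid customerId for free.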
Factories for APIs and network mocks
Testing frontends or API integrations means mocking HTTP responses. Factories shine here: define responseFactory with status 200, default headers, body matching OpenAPI/GraphQL schema. Traits for errors: .notFound(), .serverError(), .rateLimited().
Libraries like MSW (Mock Service Worker) integrate well: intercept fetch(), return responseFactory.success({ data: userFactory.buildList(5) }). Your tests stay semantic: 'when API returns empty list' uses .build({ data: [] }); 'when user is banned' uses userFactory.banned().
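A response factory with error traits could look roughly like this. It is a library-free sketch: MockResponse and the body shapes are assumptions, and the resulting objects would still need to be handed to whatever interception layer you use (MSW's own handler API is deliberately not shown here).

```typescript
// Hypothetical response shape, for illustration only.
interface MockResponse {
  status: number;
  headers: Record<string, string>;
  body: unknown;
}

// Fresh header object per response so tests cannot leak mutations into each other.
const jsonHeaders = (): Record<string, string> => ({ "content-type": "application/json" });

const responseFactory = {
  success(body: unknown = { data: [] }): MockResponse {
    return { status: 200, headers: jsonHeaders(), body };
  },
  notFound(): MockResponse {
    return { status: 404, headers: jsonHeaders(), body: { error: "Not Found" } };
  },
  serverError(): MockResponse {
    return { status: 500, headers: jsonHeaders(), body: { error: "Internal Server Error" } };
  },
  rateLimited(retryAfterSeconds = 30): MockResponse {
    return {
      status: 429,
      headers: { ...jsonHeaders(), "retry-after": String(retryAfterSeconds) },
      body: { error: "Too Many Requests" },
    };
  },
};

// "when API returns empty list" reads the same in the test as in the spec.
const emptyList = responseFactory.success({ data: [] });
```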
Real case: GraphQL with paginated nodes. Your factory generates edges with cursor, pageInfo with calculated hasNextPage, and nodes matching schema. Jest snapshots validate shape; factories give you variety (empty page, one item, full page, partial error).
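For the paginated case, a connection factory might be sketched as below. It assumes a Relay-style edges/pageInfo layout and uses plain `cursor:<index>` strings; real servers often use opaque cursors, so treat the cursor format and the buildConnection name as illustrative.

```typescript
interface PageInfo { hasNextPage: boolean; endCursor: string | null; }
interface Edge<T> { cursor: string; node: T; }
interface Connection<T> { edges: Edge<T>[]; pageInfo: PageInfo; }

// Builds one page out of a full node list; hasNextPage is calculated, never hardcoded.
function buildConnection<T>(allNodes: T[], pageSize: number, offset = 0): Connection<T> {
  const page = allNodes.slice(offset, offset + pageSize);
  const edges = page.map((node, i) => ({ cursor: `cursor:${offset + i}`, node }));
  const last = edges[edges.length - 1];
  return {
    edges,
    pageInfo: {
      hasNextPage: offset + page.length < allNodes.length,
      endCursor: last ? last.cursor : null,
    },
  };
}

const nodes = ["a", "b", "c", "d", "e"];
const emptyPage = buildConnection<string>([], 2);
const fullPage = buildConnection(nodes, 2);     // more pages follow
const lastPage = buildConnection(nodes, 2, 4);  // one item, nothing after it
```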
Pro trick: export factories from __tests__/factories for cross-suite reuse. One userFactory used in 30 files beats 30 slightly different copies.
Deterministic seeders and non-flaky tests
Random faker causes flakiness: test passes with 'John Doe' but fails with 'X Æ A-12' because your regex validator breaks. Solution: fixed seed per test. In beforeEach: faker.seed(12345). Now each run generates the same pseudorandom sequence. CI and local give identical results.
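Why a fixed seed removes the flakiness is easier to see with a tiny seeded generator standing in for faker. The sketch below uses mulberry32, a small public-domain PRNG, as a swapped-in technique; buildUsername is a hypothetical helper. With faker itself, the equivalent is the faker.seed(12345) call in beforeEach described above.

```typescript
// mulberry32: a small deterministic PRNG. Same seed in, same sequence out.
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical "fake data" helper that draws from an injected generator
// instead of Math.random(), so the caller controls the seed.
function buildUsername(rand: () => number): string {
  return `user_${Math.floor(rand() * 1_000_000)}`;
}

// Re-seeding before each test replays the identical sequence on every machine.
function namesForSeed(seed: number, count: number): string[] {
  const rand = mulberry32(seed);
  return Array.from({ length: count }, () => buildUsername(rand));
}

const runLocal = namesForSeed(12345, 5);
const runCi = namesForSeed(12345, 5);
```

runLocal and runCi hold the same five usernames, which is exactly the property that makes a failing input reproducible.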
Alternative: hardcoded values for sensitive fields (email always 'test@example.com'), faker only for cosmetic filler (bio, avatar URL). Some teams use snapshot factories: the first run generates JSON fixtures, and subsequent commits reuse them.
For integration tests hitting a real DB, factories must clean up after themselves. In Jest: afterEach(async () => { await cleanupUsers(createdIds); }). Track generated IDs in a Set and delete everything at the end of each test. Testcontainers + factories = paradise: each test gets an ephemeral Postgres, factories populate it, the test runs, the container dies.
Edge cases: date factories. If your logic depends on '3 days ago', use subDays(new Date(), 3) (from date-fns) in the factory, not a hardcoded timestamp that will go stale. Same with timezones: factories should generate UTC or your app's TZ consistently, never a mix.
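The ID-tracking idea can be sketched with an in-memory Map standing in for the users table. Everything here is an assumption for illustration: a real suite would await an actual DB client inside createUser and cleanupUsers, which is why the afterEach above is async, while this stand-in stays synchronous.

```typescript
interface UserRow { id: number; email: string; }

// In-memory stand-in for a users table; a real test would call a DB client here.
const usersTable = new Map<number, UserRow>();
const createdIds = new Set<number>();
let userSeq = 0;

// "create" persists the row and remembers its ID for later cleanup.
function createUser(overrides: Partial<UserRow> = {}): UserRow {
  const id = ++userSeq;
  const row: UserRow = { id, email: `user${id}@example.com`, ...overrides };
  usersTable.set(row.id, row);
  createdIds.add(row.id);
  return row;
}

// Call from afterEach: delete only what the factories created, then reset the tracker.
function cleanupUsers(ids: Set<number>): void {
  ids.forEach((id) => usersTable.delete(id));
  ids.clear();
}
```

Deleting by tracked ID rather than truncating the table leaves seed data and rows owned by other suites alone.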
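A relative-date factory might look like the sketch below. It swaps date-fns for plain epoch-millisecond arithmetic to stay dependency-free, which means a "day" is exactly 24 hours rather than a local calendar day; the Subscription shape and option names are hypothetical. Accepting now as an option keeps assertions deterministic.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical Subscription shape, for illustration only.
interface Subscription { id: number; startedAt: Date; expiresAt: Date; }

interface SubscriptionOptions {
  now?: Date;             // injectable clock; defaults to the real current time
  startedDaysAgo?: number;
  lengthDays?: number;
}

let subscriptionSeq = 0;

// Dates are computed relative to "now" at build time, so they never go stale.
function buildSubscription(options: SubscriptionOptions = {}): Subscription {
  const { now = new Date(), startedDaysAgo = 3, lengthDays = 30 } = options;
  const startedAt = new Date(now.getTime() - startedDaysAgo * DAY_MS);
  const expiresAt = new Date(startedAt.getTime() + lengthDays * DAY_MS);
  return { id: ++subscriptionSeq, startedAt, expiresAt };
}

// Pinning "now" to a UTC instant makes the result independent of the machine's timezone.
const fixedNow = new Date("2024-03-10T12:00:00.000Z");
const subscription = buildSubscription({ now: fixedNow });
```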