When I started building A11Y CAT, I was not trying to create another tool that simply throws accessibility results onto a screen and pretends that the job is finished. I wanted something more useful than that. I wanted a tool that could help me understand what was actually happening on a page. I wanted to inspect accessibility issues, compare results, spot patterns, and learn where automation helps and where it starts to fall apart.

But as soon as you start building an accessibility scanner seriously, one question comes up very quickly: should I rely on an external accessibility API, or should I build the scanner around local engines and browser based testing?

At first, the API route looks attractive. You send a URL, receive a report, and avoid some of the pain of building the logic yourself. But once I looked closer, I realised that using a paid external API as the core engine of A11Y CAT would not be the right choice for me.

The attraction of using WAVE API
WAVE is a strong accessibility tool. I already see value in WAVE because it gives clear visual feedback and makes some accessibility issues easier to understand. For colour contrast, the WAVE API can return useful contrast information. Its documentation explains that detailed report types can include contrast data and CSS selector information for identifying where an issue appears on the page (see the WAVE API report details documentation). That sounds exactly like the kind of evidence a scanner needs:
- foreground colour
- background colour
- contrast ratio
- selector
- issue category
- supporting evidence
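
To make that list concrete, here is roughly the data shape I have in mind for a single contrast finding. This is my own hypothetical TypeScript shape, not the WAVE API's actual response schema.

```typescript
// Hypothetical shape for one contrast finding. Field names are
// illustrative and do not mirror the WAVE API's real response schema.
interface ContrastFinding {
  foreground: string;          // e.g. "#ffffff"
  background: string;          // e.g. "#1a2b4c"
  ratio: number;               // computed contrast ratio, e.g. 3.92
  selector: string;            // CSS selector locating the element
  category: "error" | "alert"; // issue category
  evidence: string;            // human readable supporting note
}
```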
So yes, technically, I could use it. The problem is not whether WAVE is useful. The problem is whether it is the right foundation for a tool that I want to run frequently, debug repeatedly, and potentially make available to others without creating a cost problem.
The cost problem
WAVE API is not free at scale. New WAVE API accounts currently include free credits, but after that the API uses a credit model. The public WAVE API documentation explains that requests consume credits and that more detailed reports can cost more credits per request (see WAVE API pricing and credits).

That means every scan becomes a cost decision. If I am only testing a few pages occasionally, that is manageable. But A11Y CAT is not just a one page checker in my mind. The whole point is to inspect pages, rerun scans, compare results, debug failures, and improve the scanner over time. That means scans are not rare. They are part of the development workflow. If the tool was ever used by other people, the risk would become bigger. Every user scan could become my bill. That is not a good foundation for the kind of tool I want to build.
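
To see how quickly credits add up during development, here is a deliberately rough back of envelope calculation. Every number in it is hypothetical; real credit costs are on the WAVE pricing page.

```typescript
// All numbers here are hypothetical, purely to show how development
// usage compounds. Check the WAVE API pricing page for real costs.
const creditsPerDetailedScan = 3;  // assumed: detailed reports cost more
const rescansPerDebugSession = 20; // rerunning while chasing one bug
const debugSessionsPerWeek = 10;

const creditsPerWeek =
  creditsPerDetailedScan * rescansPerDebugSession * debugSessionsPerWeek;

console.log(creditsPerWeek); // 600 credits per week, for one developer
```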
The API key problem
There is also a basic security problem. I cannot safely put a paid API key inside a browser extension. If I put a WAVE API key directly into A11Y CAT, someone could inspect the extension, extract the key, and use it. That could burn through credits entirely outside my control.
The correct architecture would be:
Browser extension → my backend → WAVE API

That means I would need to build and maintain a backend, protect the API key, add rate limiting, probably add authentication, and monitor usage. That is possible. But it changes the project. Suddenly I am not just building a browser based accessibility tool. I am building an API proxy, quota system, and cost control layer. For what I need now, that is not the right burden to take on.
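
For completeness, here is a minimal sketch of what that proxy layer might look like, assuming a Node.js backend with Express. The WAVE endpoint and query parameters follow my reading of the public docs and should be verified; rate limiting, authentication, and quota tracking are deliberately left out.

```typescript
import express from "express";

// Minimal proxy sketch. Assumes Express, Node 18+ (global fetch), and a
// WAVE_API_KEY environment variable. Endpoint and parameters follow my
// reading of the WAVE API docs and should be verified before use.
const app = express();

app.get("/scan", async (req, res) => {
  const url = String(req.query.url ?? "");
  if (!url.startsWith("http")) {
    res.status(400).json({ error: "A valid url parameter is required" });
    return;
  }

  // The key stays on the server; the extension never sees it.
  const wave = new URL("https://wave.webaim.org/api/request");
  wave.searchParams.set("key", process.env.WAVE_API_KEY ?? "");
  wave.searchParams.set("url", url);
  wave.searchParams.set("reporttype", "2"); // more detail costs more credits

  const response = await fetch(wave);
  res.status(response.status).json(await response.json());
});

app.listen(3000);
```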
The API limitation
Even if I used WAVE API, it would still not remove the need for careful review. Accessibility tools can assist evaluation, but they cannot determine accessibility by themselves. W3C explains that evaluation tools can support testing, but human judgement is still required because tools cannot check every accessibility requirement automatically (see the W3C guidance on selecting accessibility evaluation tools). That matters because I do not want A11Y CAT to give fake confidence. I do not want the scanner to say, “this page passed”, when in reality it only checked what automation can see. A scanner can find some real issues. It can support manual review. It can surface patterns. It can provide evidence. But it cannot replace keyboard testing, screen reader testing, visual inspection, accessibility tree inspection, or understanding the user journey.
Colour contrast is a good example
Colour contrast looks simple until you try to build it properly. A flat button with white text on a dark blue background is straightforward. But real websites are rarely that simple. Real pages involve:
- text over images
- text over gradients
- semi transparent overlays
- inherited backgrounds
- text shadows
- sticky headers
- hover states
- focus states
- disabled states
- dark mode
- animation states
Even axe-core has to do significant work for colour contrast. Deque’s API documentation notes that rules such as color-contrast can be expensive because they inspect many elements and their computed styles, handle overlapping elements and hierarchy, resolve selectors, and deduplicate results (see the axe-core API documentation).

So the correct answer is not simply “use an API”. The correct answer is to build a scanner that understands what it can prove, what it cannot prove, and what should be sent to manual review.
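
For flat, opaque colours, the contrast maths itself is not the hard part. Here is a sketch of the WCAG 2 relative luminance and contrast ratio formulas; everything in the list above is really about deciding which two colours to feed into it.

```typescript
// WCAG 2 relative luminance and contrast ratio for flat, opaque sRGB
// colours. The hard part is not this maths; it is deciding which two
// colours actually apply to a given piece of text.
function relativeLuminance([r, g, b]: [number, number, number]): number {
  const [rs, gs, bs] = [r, g, b].map((channel) => {
    const c = channel / 255;
    return c <= 0.03928 ? c / 12.92 : ((c + 0.055) / 1.055) ** 2.4;
  });
  return 0.2126 * rs + 0.7152 * gs + 0.0722 * bs;
}

function contrastRatio(
  fg: [number, number, number],
  bg: [number, number, number],
): number {
  const [l1, l2] = [relativeLuminance(fg), relativeLuminance(bg)]
    .sort((a, b) => b - a);
  return (l1 + 0.05) / (l2 + 0.05);
}

// White text on dark blue: ≈ 14.05, well above the 4.5:1 AA threshold
// for normal text.
console.log(contrastRatio([255, 255, 255], [26, 43, 76]).toFixed(2));
```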
Why axe-core still makes the most sense
For my use case, axe-core should remain the main accessibility engine. Axe-core is widely used, designed for automated accessibility testing, and already includes rules for common issues such as missing accessible names and colour contrast. Deque documents the button name rule as checking whether buttons have a discernible accessible name (see the Deque button name rule). Deque also documents the colour contrast rule as checking whether foreground and background colours meet WCAG 2 AA contrast thresholds (see the Deque colour contrast rule). That gives A11Y CAT a solid base layer.

But the important lesson is this: using axe-core is not enough. It has to be implemented correctly. If A11Y CAT takes axe results and then adds weak custom logic on top, it can create false positives. For example, a button like this should not be flagged as missing a name:

```html
<button>
  <span>Search Jobs</span>
</button>
```

The text inside the span is still valid button text. If the scanner flags that as a missing button name, the problem is probably not axe-core. The problem is likely the custom implementation around axe-core (one way to verify this is sketched after the list below). That is why A11Y CAT needs to keep these layers separate:
- raw axe-core findings
- A11Y CAT interpretation
- additional scanner heuristics
- manual review guidance
Those are not the same thing, and they should not be reported as if they are.
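
One way to keep the interpretation layer honest is to check custom name logic against dom-accessibility-api, which implements the accessible name computation. A sketch, assuming a jsdom test environment:

```typescript
import { JSDOM } from "jsdom";
import { computeAccessibleName } from "dom-accessibility-api";

// A button whose only content is a span still has an accessible name,
// so the scanner must never flag it as unnamed.
const dom = new JSDOM(`<button><span>Search Jobs</span></button>`);
const button = dom.window.document.querySelector("button")!;

console.log(computeAccessibleName(button)); // "Search Jobs"
```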
The best technical route: axe-core plus Playwright
The strongest route for A11Y CAT is:
- axe-core inside a real browser
- validated with Playwright
- supported by an evidence and classification layer

Playwright’s accessibility testing documentation explains how Playwright can run accessibility checks with axe-core and analyse pages in a browser context. That matters because accessibility scanning needs the rendered page. It needs to see the DOM after JavaScript has run, after styles are applied, and after components are visible.
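
A minimal sketch of that shape using the @axe-core/playwright package, which is what Playwright's accessibility testing documentation describes. The URL is a placeholder.

```typescript
import { test, expect } from "@playwright/test";
import AxeBuilder from "@axe-core/playwright";

test("baseline: no unexpected axe-core violations", async ({ page }) => {
  await page.goto("https://example.com"); // placeholder URL

  // Run axe-core against the fully rendered page, limited to WCAG 2
  // A and AA rules.
  const results = await new AxeBuilder({ page })
    .withTags(["wcag2a", "wcag2aa"])
    .analyze();

  // Violations are what axe confirmed; results.incomplete holds the
  // checks axe could not decide, which feed the manual review layer.
  expect(results.violations).toEqual([]);
});
```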
For A11Y CAT, this route gives the best balance:
- no per scan API cost
- no exposed paid API key
- real rendered browser testing
- reliable axe-core baseline
- control over classification and reporting
- better debugging through Playwright tests
- clearer proof that the scanner is not inventing issues
I do not just want a report. I want to debug why the report says what it says. I want to inspect selectors, compare DOM states, understand the accessibility tree, and prove that A11Y CAT is not creating false positives. Playwright helps with that because I can test the actual scanner behaviour, not just the final output.
Where Pa11y and IBM Equal Access fit
I would not ignore other tools completely. Pa11y is useful because it runs accessibility tests from the command line or Node.js, which makes it helpful for automated testing and regression workflows (see the Pa11y GitHub project). IBM Equal Access is also worth considering as a comparison engine. IBM describes its Equal Access Accessibility Checker as a set of open source tools for automated accessibility checking in browser and build environments (see the IBM Equal Access Accessibility Checker). But I would use these as supporting layers, not as the core product.
A sensible setup would be:
- main scanner: axe-core
- browser and rendered testing: Playwright
- extra regression checks: Pa11y
- optional comparison engine: IBM Equal Access
- manual validation: keyboard, screen reader, visual checks, and accessibility tree inspection
That is stronger than blindly trusting one tool.
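
As one example of the supporting layer, Pa11y's programmatic API is small enough to wire into a regression check. A sketch, with a placeholder URL:

```typescript
import pa11y from "pa11y";

// Second opinion pass: run Pa11y on the same page and log whatever it
// reports, so the output can be compared against the axe-core baseline.
const results = await pa11y("https://example.com"); // placeholder URL

for (const issue of results.issues) {
  console.log(`${issue.code}: ${issue.message} (${issue.selector})`);
}
```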
What A11Y CAT should do with contrast
For colour contrast, A11Y CAT should not pretend that every result has the same certainty.
The scanner should classify contrast output clearly:
- confirmed contrast failure
- potential contrast issue
- manual visual review required
- not fully testable because of a complex background
That classification is more useful than dumping everything into a single “failed” bucket. It also protects the credibility of the tool. If the scanner cannot prove something, it should say so. That is not weakness. That is accuracy.
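
A sketch of that classification layer. axe-core already separates confirmed violations from incomplete results (the checks it could not decide, such as text over an image), and that split maps naturally onto these categories; the category names themselves are A11Y CAT's, not axe-core's.

```typescript
// Minimal local types; axe-core's real result objects carry more fields.
interface AxeRuleResult {
  id: string;
  nodes: unknown[];
}
interface AxeResults {
  violations: AxeRuleResult[];
  incomplete: AxeRuleResult[];
}

// Category names are A11Y CAT's own, not axe-core's.
type ContrastVerdict = "confirmed failure" | "manual visual review required";

function classifyContrast(
  results: AxeResults,
): Map<ContrastVerdict, AxeRuleResult[]> {
  const isContrast = (r: AxeRuleResult) => r.id === "color-contrast";
  return new Map<ContrastVerdict, AxeRuleResult[]>([
    // axe proved these fail the WCAG 2 AA thresholds.
    ["confirmed failure", results.violations.filter(isContrast)],
    // axe could not resolve the effective colours, e.g. a complex
    // background, so these need a human.
    ["manual visual review required", results.incomplete.filter(isContrast)],
  ]);
}
```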
Why I am not choosing the API first route
The API first route is tempting because it feels faster. But for A11Y CAT, it creates the wrong dependency. I would be depending on a paid external service for something the tool should be able to do locally most of the time. I would have to manage credits. I would have to protect the key. I would have to build a backend. I would have to limit usage. I would have to explain why users cannot scan freely. And I would still need axe-core, Playwright, and manual review anyway. So the better choice is not to build A11Y CAT around WAVE API. The better choice is to build a local scanner that is technically honest, then optionally use WAVE API later as an internal comparison layer when I really need it.
The best choice for A11Y CAT
For where I am now, the best route is:
- axe-core as the main accessibility engine
- Playwright to test and prove scanner behaviour
- dom-accessibility-api for accessible name checks where needed
- a local colour contrast evidence layer
- manual review categories for things automation cannot prove
- optional WAVE API only for internal comparison
This gives me a scanner I can run repeatedly without per scan API costs. It also gives me more control over accuracy. That matters because the biggest danger with accessibility tooling is not just missing one issue. The bigger danger is building a tool that looks confident but is technically wrong.

If the scanner flags valid markup, developers lose trust. If the scanner hides uncertainty, auditors get false confidence. If the scanner mixes confirmed failures with guesses, the report becomes noisy.

So my goal is not just to build a scanner that finds issues. My goal is to build a scanner that explains what it found, how it found it, and where its limits are. That is the real lesson. Automation is useful, but only when it is honest. For A11Y CAT, the most honest path is not a paid API first architecture. It is a browser based, axe-core driven, Playwright tested scanner with clear evidence, clear limitations, and no fake certainty.
Useful resources
- WAVE API
- WAVE API report details
- axe-core API documentation
- Deque button name rule
- Deque colour contrast rule
- Playwright accessibility testing
- Pa11y
- IBM Equal Access Accessibility Checker

