WRITTEN BY
Irakli B.

Why Hiring a CRO Agency Is a Procurement Decision, Not a Vibe Check

Figuring out how to choose a CRO agency is less about who has the slickest deck and more about who has a method that holds up under scrutiny. Every agency shows you the winning case studies. Very few will tell you how many tests lost, how they sized the sample, or what they did when the "redesign" tanked revenue. This guide is for founders and growth leads evaluating three to five agencies who need to make the hire defensible internally — not just exciting.

Why Most CRO Agency Pitches Look Identical

If you've taken three discovery calls this week, you've already noticed the pattern. Case studies with big percentages. A testing roadmap slide. A retainer quote somewhere between $6,000 and $15,000 a month. The pitches blur together because they're selling the same story.

The differences show up about 90 days in. One agency is running statistically sound tests tied to revenue on your P&L. The other is redesigning your hero section every two weeks and calling it optimisation. By the time you notice, you've spent $30,000 and your conversion rate is exactly where it started.

Think of it like hiring a contractor to renovate your kitchen. Every contractor shows you photos of finished kitchens. The question that matters is how they handle the plumbing when the wall comes down and there's a leak nobody planned for. CRO is the same. The work happens when tests lose, research surprises you, or a "winning" variant doesn't replicate on the P&L.

The seven questions below are designed to surface the method, not the marketing. Ask them in the order given. You'll learn more in 20 minutes than most founders learn in three months of working with the wrong agency.
Pro Tip:
Run the same questions across every agency you interview. Consistency is how you compare answers. If you ask Agency A about win rate and Agency B about pricing, you can't actually compare them. Use one scorecard, seven questions, and rate each response 1-5.
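A spreadsheet works fine for this, but as a minimal sketch the scorecard could look like the following in Python. The agency names and ratings are made-up placeholders, and averaging the seven ratings is an assumption about how you might combine them, not a rule.

```python
from statistics import mean

# Hypothetical ratings (1-5) for the seven questions, in the order asked.
# Agency names and scores are illustrative placeholders, not real data.
scorecard = {
    "Agency A": [4, 5, 4, 3, 5, 4, 4],
    "Agency B": [5, 2, 2, 3, 1, 3, 2],
    "Agency C": [3, 4, 4, 4, 4, 3, 5],
}


def summarise(ratings):
    """Return the average rating and the weakest single answer."""
    if len(ratings) != 7:
        raise ValueError("expected one rating per question (seven in total)")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must be between 1 and 5")
    return round(mean(ratings), 2), min(ratings)


def rank_agencies(cards):
    """Sort agencies by average rating, highest first."""
    summaries = {name: summarise(r) for name, r in cards.items()}
    return sorted(summaries.items(), key=lambda item: item[1][0], reverse=True)


for name, (average, weakest) in rank_agencies(scorecard):
    print(f"{name}: average {average}, weakest answer {weakest}")
```

Keeping the weakest answer next to the average is deliberate: a single 1 on sample sizing or tool ownership can matter more than a strong overall score.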

Question 1 - What's Your Win Rate Over the Last 12 Months?

The first question to ask a CRO agency is their overall win rate across every test they ran in the last 12 months - not just the ones that made it into the deck. A healthy win rate sits somewhere between 20% and 35%. Anything above 50% is either a fresh agency with five tests under its belt or someone cherry-picking.

Here's why the range matters. CRO is a science, and real science includes losing hypotheses. An agency with a 70% win rate is probably testing safe, low-impact changes (button colour, copy tweaks) or calling flat tests "wins" because the variant didn't lose. Neither moves revenue.

A good answer sounds like this: "Over the last 12 months we ran 47 tests across 11 clients. 14 were winners, 9 were losers, and 24 were flat or inconclusive. Our winners averaged a 6.8% uplift in revenue per visitor."

That answer includes volume, breakdown, and the metric. A weak answer sounds like: "We have a 92% success rate." Ask what they mean by success and watch the room go quiet.
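To sanity-check an answer like the one above, the arithmetic is simple enough to run yourself. This is a small Python sketch using the figures from the sample answer; the function name is illustrative.

```python
def outcome_shares(winners, losers, flat):
    """Return each outcome as a percentage of all tests run."""
    total = winners + losers + flat
    if total == 0:
        raise ValueError("no tests recorded")
    return {
        "total": total,
        "win_rate": round(100 * winners / total, 1),
        "loss_rate": round(100 * losers / total, 1),
        "flat_rate": round(100 * flat / total, 1),
    }


# Figures from the sample "good answer": 47 tests, 14 winners, 9 losers, 24 flat.
shares = outcome_shares(winners=14, losers=9, flat=24)
print(shares)  # win_rate is 29.8, inside the 20-35% healthy range
```

If the numbers an agency quotes don't add up to the total, or the win rate only works when flats are counted as wins, that is worth a follow-up question.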
Quick Note:
"Flat" is not a dirty word. Flat tests are still learning. They tell you the lever you pulled doesn't move the needle, which saves you from building a whole strategy around a dead end. Agencies that hide flats are hiding the map of what didn't work.

Question 2 - How Do You Size Samples and Call Significance?

This is the question that separates testing programmes from theatre. Ask how they calculate sample size before a test starts and when they decide a test is "done."

A proper answer includes three numbers: baseline conversion rate, minimum detectable effect (MDE), and statistical power (usually 80%), alongside the significance threshold the result will be held to (commonly 95% confidence). The agency should plug these into a sample size calculator and commit to a test duration before the test goes live - typically two to four full business cycles, and never less than two weeks.
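For a feel of what the calculator is doing, here is a rough Python sketch using a common normal-approximation formula for comparing two conversion rates. The 3% baseline, 10% relative MDE, 5% significance level, and 5,000 daily visitors are assumptions chosen for illustration, and real calculators may use slightly different formulas.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variant(baseline, relative_mde, power=0.80, alpha=0.05):
    """Approximate visitors needed per variant for a two-sided test
    comparing two conversion rates (normal approximation)."""
    p1 = baseline
    p2 = baseline * (1 + relative_mde)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_power) ** 2 * variance / (p2 - p1) ** 2)


def planned_duration_days(per_variant, daily_visitors, variants=2, minimum_days=14):
    """Days needed to reach the sample, rounded up to full weeks,
    with a two-week floor."""
    days = ceil(per_variant * variants / daily_visitors)
    full_weeks = ceil(days / 7)
    return max(full_weeks * 7, minimum_days)


# Assumed inputs: 3% baseline, 10% relative MDE, 5,000 visitors a day.
n = sample_size_per_variant(baseline=0.03, relative_mde=0.10)
duration = planned_duration_days(n, daily_visitors=5000)
print(f"Visitors per variant: {n}")
print(f"Planned duration: {duration} days")
```

The point is not the exact figure but that both numbers exist before launch. Halving the MDE roughly quadruples the sample, which is why low-traffic stores cannot reliably detect small lifts.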

If the answer is "we let it run until we see a winner," run. That's called peeking, and it inflates false positives. It's the CRO equivalent of flipping a coin 100 times and stopping the count the moment heads is ahead. You'll always find a "winner" that doesn't hold up.
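The effect of peeking is easy to demonstrate with a simulation. The Python sketch below runs A/A tests, where both arms are identical and any "winner" is a false positive, and compares checking after every batch of visitors with checking once at the planned end. The 5% conversion rate, batch size, and number of looks are arbitrary assumptions.

```python
import random
from math import sqrt
from statistics import NormalDist

Z_CRIT = NormalDist().inv_cdf(0.975)  # two-sided 5% threshold


def is_significant(conv_a, conv_b, n):
    """Two-proportion z-test with equal visitors (n) in each arm."""
    pooled = (conv_a + conv_b) / (2 * n)
    if pooled in (0, 1):
        return False
    se = sqrt(2 * pooled * (1 - pooled) / n)
    return abs(conv_a - conv_b) / n / se > Z_CRIT


def simulate_aa_test(rng, rate=0.05, batch=100, looks=20):
    """Run one A/A test (no real difference) and report whether peeking
    after every batch, or checking once at the end, would call a winner."""
    conv_a = conv_b = 0
    peeked_winner = False
    for look in range(1, looks + 1):
        conv_a += sum(rng.random() < rate for _ in range(batch))
        conv_b += sum(rng.random() < rate for _ in range(batch))
        if is_significant(conv_a, conv_b, look * batch):
            peeked_winner = True
    final_winner = is_significant(conv_a, conv_b, looks * batch)
    return peeked_winner, final_winner


rng = random.Random(42)
runs = 500
results = [simulate_aa_test(rng) for _ in range(runs)]
peeking_rate = sum(p for p, _ in results) / runs
fixed_rate = sum(f for _, f in results) / runs
print(f"False winners when peeking: {peeking_rate:.0%}")
print(f"False winners at the planned end: {fixed_rate:.0%}")
```

Both approaches use the same 95% threshold. The only difference is how often the result is checked, and that alone is enough to push the false-winner rate well above the nominal 5%.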

Imagine testing whether a new menu at your café sells more espressos. You don't declare victory on Tuesday lunch because three extra people ordered one. You run the menu for a full month across weekday mornings, weekend rushes, and rainy Sundays. Sample size is just the ecommerce version of giving the menu enough shifts to prove itself.

Question 3 - What Happens When a Test Loses?

Ask the agency to walk you through the last test they ran that lost. What was the hypothesis, what happened, and what did they do next? The answer tells you everything about their process.

A mature agency treats losers as data. They'll tell you exactly which assumption in the hypothesis was wrong, what the post-test analysis revealed, and how it changed the next test. A less mature agency will say "we'll run another variant" or change the subject to a winner.

Over a 12-month programme, losing tests are often more valuable than winning ones. They eliminate bad ideas, refine customer understanding, and prevent you from scaling a change that would have quietly cost you money. Any agency that treats losers as failures rather than inputs is running optimisation theatre, not a programme.

Think of it like an emergency room. A good doctor doesn't just celebrate the patients who walked out fine - they do a post-mortem on the complications, because that's where the lessons live. Same with tests. The wins feel good. The losses teach you how your customers actually behave.
Important Update:
Losing tests should have a written post-mortem. Ask to see the template. If every test ends with "we'll iterate," there's no learning loop. A proper loser analysis documents the original hypothesis, what the data showed, what the agency thinks went wrong, and what it means for future tests on similar pages.

Question 4 - How Does Research Feed the Testing Roadmap?

A testing roadmap without research is a guessing roadmap. Ask what research methods the agency uses and how those inputs become hypotheses on the testing calendar.

The answer should cover a mix: quantitative data (GA4 funnel analysis, Shopify reports, heatmaps, session recordings) and qualitative data (customer surveys, review mining, user testing, support ticket analysis). The best agencies combine at least three methods before writing a single hypothesis.

Here's what a weak answer looks like: "We follow best practices and look at your analytics." Best practices are the average of what worked for other stores. Your store is not average, and your customers aren't average, so best practices are a starting bias, not a strategy.

A strong answer sounds like: "We start with a two-week research sprint - review mining on your top 200 reviews, five user tests, a funnel audit in GA4, and a heatmap on your top three landing pages. That produces roughly 25 to 40 friction points, which we turn into prioritised hypotheses." Concrete, time-bound, and evidence-led.

Question 5 - Who Owns the Testing Tool Contract?

Small question, big implications. CRO testing tools like VWO, Convert, or AB Tasty aren't cheap - paid plans run $500 to $3,000+ per month depending on traffic. Ask whether the contract sits in your name or the agency's.

If the agency owns the contract, two things happen when you leave. You lose access to every test result, every audience segment, every heatmap and recording you've built up. And the agency keeps the leverage - they can quietly bundle the tool cost into the retainer and mark it up 30-50%.

The right setup is simple. The tool contract is in your name, on your billing. The agency has admin access, they install it, they run it, they build everything inside it. If the relationship ends, you keep 18 months of test history, segments, and recordings. The only thing that leaves is the agency.

This applies to analytics tools too. GA4 property, Hotjar, Microsoft Clarity, survey tools - all in your name. A good agency will tell you this before you ask. It's a signal they're used to working with sophisticated clients who know how the industry works.

Question 6 - How Do You Score and Prioritise Hypotheses?

Every CRO agency should be using a scoring framework - ICE, PIE, or RICE are the common ones. Ask which they use and how it works in practice.

ICE scores each hypothesis on Impact, Confidence, and Ease (1-10 each). PIE scores on Potential, Importance, and Ease. RICE adds Reach - how many users the test will affect. The framework itself matters less than whether they use one consistently and whether you'll see the scores.

Ask for a sample prioritised backlog. A good agency will show you 15-30 scored hypotheses with notes on the research input behind each one. A weaker agency will send you a three-item roadmap based on "what we usually test first."

The scoring isn't gospel - it's a conversation starter. A hypothesis scoring 24 on ICE should be tested before one scoring 16, but the founder's gut check matters too. If the top-scoring test conflicts with brand guidelines or a product launch, it gets deprioritised. What you want is a system that forces the conversation to be explicit instead of "trust us."
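As a rough illustration of what a scored backlog looks like, here is a small Python sketch of ICE. It assumes the three 1-10 ratings are summed (so the maximum is 30), which matches scores like 24 and 16; some teams multiply or average instead. The hypotheses and ratings are invented examples.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    name: str
    impact: int      # 1-10
    confidence: int  # 1-10
    ease: int        # 1-10
    research_input: str

    @property
    def ice(self):
        # Assumption: the three ratings are summed (maximum 30).
        return self.impact + self.confidence + self.ease


# Invented backlog entries for illustration only.
backlog = [
    Hypothesis("Show delivery dates on product pages", 8, 8, 8, "review mining"),
    Hypothesis("Shorten the checkout form", 6, 5, 5, "funnel audit"),
    Hypothesis("Add size guide to mobile product pages", 7, 6, 8, "user tests"),
]

prioritised = sorted(backlog, key=lambda h: h.ice, reverse=True)
for h in prioritised:
    print(f"{h.ice:>2}  {h.name}  (from {h.research_input})")
```

Whatever the formula, the useful part is that every row carries its ratings and its research source, so you can challenge a specific number rather than the whole roadmap.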

If you want to pressure-test your own hypothesis backlog or compare how agencies score ideas, the Weblics CRO framework breakdown walks through ICE, PIE, and RICE side by side.
Reminder:
Prioritisation scores should be visible to you. If the agency holds the scoring spreadsheet and only sends you the top three tests each quarter, you can't challenge the logic. Ask for read access to the full backlog. Real agencies share it. Performance theatres don't.

Question 7 - Do You Measure Uplift or Revenue on the P&L?

This is the question that catches most agencies off guard. Ask how they define a successful test: uplift on a metric, or revenue on the profit and loss statement.

The honest answer is that uplift and revenue can disagree. A variant might increase conversion rate by 8% but reduce average order value by 12%, which nets out to roughly 5% less revenue per visitor. Another variant might lift add-to-cart by 20% but tank checkout completion. If the agency only measures the metric closest to their intervention, they'll "win" while your P&L stays flat or goes backwards.

A strong answer ties test results to revenue per visitor (RPV) - the metric that captures both conversion rate and AOV in one number. Even better if they look at it alongside new customer revenue versus returning, because a lift driven by existing buyers doesn't grow the business the same way.
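The arithmetic behind RPV is worth seeing once. This Python sketch treats RPV as conversion rate multiplied by AOV and applies an 8% conversion lift together with a 12% AOV drop. The 2.5% baseline conversion rate and $80 AOV are assumed figures; the relative result does not depend on them.

```python
def revenue_per_visitor(conversion_rate, average_order_value):
    """RPV = share of visitors who buy x what the average buyer spends."""
    return conversion_rate * average_order_value


def relative_change(new, old):
    return (new - old) / old


# Assumed control: 2.5% conversion rate, $80 average order value.
control = revenue_per_visitor(0.025, 80.0)
# Variant: conversion rate up 8%, average order value down 12%.
variant = revenue_per_visitor(0.025 * 1.08, 80.0 * 0.88)

change = relative_change(variant, control)
print(f"Control RPV: ${control:.2f}")
print(f"Variant RPV: ${variant:.2f}")
print(f"RPV change: {change:+.1%}")
```

A report that only showed the conversion rate would call this variant a winner, while each visitor is actually worth about 5% less.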

Here's the GPS analogy. A turn-by-turn app that tells you you're making great time while you're headed to the wrong city is useless. Uplift on a single metric without revenue context is the same thing - technically accurate, strategically meaningless.

CRO Agency Red Flags and How to Spot Them Early

A few answers should end the conversation before the second call. These are the CRO agency red flags worth memorising.

Guaranteed results. Anyone promising a specific uplift percentage before seeing your data is either lying or doesn't understand how testing works. CRO outcomes depend on traffic volume, baseline, seasonality, and a dozen variables the agency can't see on a sales call. The only legitimate guarantee is a process guarantee - "we'll run X tests in Y days" or "we'll refund if we don't hit a process milestone."

"We'll redesign your site." Redesigns are not CRO. They're creative projects dressed up as optimisation. A full redesign removes the baseline you'd need to measure whether anything actually worked. Good agencies test inside the existing site and only recommend a redesign after the data says the current structure is fundamentally broken.

No research phase. If the agency wants to start testing in week one, they're guessing. A proper CRO programme has a two-to-four-week discovery phase before the first test goes live. Anyone skipping it is selling activity, not outcomes.

Flat retainers with no deliverable cadence. A $10,000 monthly retainer should come with a specific cadence: X tests per quarter, Y research outputs, Z reporting rhythms. If the contract just says "ongoing CRO services," you'll spend six months wondering what you're paying for.

Case studies without context. Percentages are useless without traffic, duration, and test design. A "23% uplift" on a site doing 500 visitors a month is noise. Ask for the denominators. Any agency unwilling to share them is hiding something.
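To see why the denominators matter, here is a hedged Python sketch of a two-proportion z-test on a low-traffic example. It assumes the 500 monthly visitors were split evenly and that the arms produced 13 and 16 orders, which works out to a 23% uplift; those counts are invented for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for the difference between two conversion rates."""
    rate_a, rate_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Assumed split of 500 monthly visitors: 250 per arm, 13 vs 16 orders.
control_orders, variant_orders, visitors_per_arm = 13, 16, 250
uplift = variant_orders / control_orders - 1
p_value = two_proportion_p_value(
    control_orders, visitors_per_arm, variant_orders, visitors_per_arm
)
print(f"Reported uplift: {uplift:.0%}")
print(f"p-value: {p_value:.2f}")
```

Three extra orders produce a headline-worthy percentage, yet a difference this size shows up by chance more often than not at this traffic level.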
Pro Tip:
The best agencies will disqualify themselves. If an agency tells you your traffic is too low for rigorous testing, or that you need a research phase before a test plan, or that they don't do redesigns - that's a trust signal. Agencies saying yes to everything are telling you what you want to hear, which is rarely what you need.

FAQ

Do you have any questions left?

Here are the answers for you

How much does a CRO agency cost?

Most reputable CRO agencies charge between $5,000 and $20,000 per month on retainer, with enterprise engagements running higher. Pricing usually correlates with test volume, research depth, and traffic tier. Fixed-fee project work for audits runs $3,000 to $10,000. If a retainer is under $3,000/month, the agency likely can't afford the hours a real testing programme requires - someone is cutting corners, usually on research.

CRO agency vs in-house team: which is better?

An agency makes sense when you're doing under $10M annually and can't justify a full-time CRO hire (which costs $120K-$180K all-in). In-house starts to make sense above $10M, or when testing velocity needs to be faster than an agency retainer supports. Many teams run hybrid - agency for research and strategy, in-house for execution.

How long until a CRO agency shows results?

Realistically, 90 to 120 days before the first revenue-moving winner. The first 30 days go to research and baseline setup. Two to five tests then run over the following 60-90 days. Anyone promising results in 30 days is either testing tiny changes that don't affect revenue or calling flat tests wins. Compounding impact shows up in months 6-12.

Does a CRO agency need access to my Shopify admin?

Yes, and no. They need analytics access (GA4, Shopify reports), theme access to deploy test code, and read access to orders and customers. They should not need full admin access - a staff account with theme editing and reports permissions is enough. Good agencies will tell you the minimum access required and explain why.

What's the best CRO agency for Shopify specifically?

Look for agencies that specialise in Shopify rather than generalists serving WordPress, Magento, and Shopify interchangeably. Shopify-specific agencies understand the theme structure, the checkout constraints (especially Shop Pay and Shopify Plus checkout extensibility), and app ecosystem trade-offs. Ask how many Shopify stores they've worked on in the last 12 months.

Can I run a CRO programme without an agency?

You can, but the ceiling is lower. Solo operators can run 2-4 tests per quarter with careful discipline. Agencies typically run 8-15. The value of an agency isn't execution speed - it's the accumulated pattern recognition across 50+ stores, which compresses the learning curve. If you're under $500K/year in revenue, DIY is fine. Above that, the agency ROI usually clears.

What's included in each plan?

Every plan includes complete care-driven CRO - what varies is testing capacity and analysis depth.

All Plans Include:

Onboarding (First 5 days):

  • Founder interviews & business deep-dive
  • Comprehensive technical website audit
  • Customer psychology analysis (ICP, 5 WHYs, SWOT)
  • AI-trained buyer personas creation
  • Ad creatives audit
  • Marketing ecosystem review

Ongoing (Continuous):

  • Psychology-first hypothesis generation
  • Conversion-focused UX/UI design
  • Strategic copywriting
  • Shopify development & implementation
  • A/B testing & QA
  • Transparent reporting & documentation
  • Strategy meetings (weekly or bi-weekly)

What Changes by Tier:

  • Tests per month: 2, 4, 6, or 8 A/B tests
  • Meeting frequency: Bi-weekly (Starter) or Weekly (Growth+)
  • Analysis depth: Post-purchase surveys, support analysis, inventory strategy, KPI planning, quarterly planning (varies by tier)

Bonus (Growth+): Comprehensive email marketing audit from specialist partners

What's the difference between Flexible and Scale plans?

Flexible plans give you complete control over costs. You pay for the essential CRO work - strategy, hypothesis generation, analysis, A/B test management, and project management - whilst design, development, and QA are billed separately at $70 per hour, only when you need them.

This is perfect if you have an in-house design or development team, or if you want to manage exactly what gets built and when. You're not locked into paying for services you don't need.

Scale plans include everything - strategy, analysis, design, development, QA, and implementation - in one predictable monthly retainer. No surprises, no separate invoices, just complete care-driven CRO delivered autonomously.

Choose Flexible if: You have internal resources or want precise cost control
Choose Scale if: You want fully autonomous, hands-off CRO with everything included

How do your pricing tiers work?

Transparent pricing based on your monthly traffic.

We charge based on traffic volume because testing capacity and statistical significance directly correlate with session count. The more traffic you have, the faster we can run tests and deliver results.

Pricing:

  • Starter (50K-75K sessions): $1,650/mo - 2 tests
  • Growth (75K-150K sessions): $3,500/mo - 4 tests
  • Scale (150K-350K sessions): $6,600/mo - 6 tests
  • Enterprise (350K+ sessions): $10,700/mo - 8 tests

No long-term contracts. Cancel anytime.
Every plan includes our 30-day profitability guarantee.

Not sure which plan fits?
Book a discovery call - I'll help you find the perfect match for your business.

What's your CRO process?

Our battle-tested frameworks and systems validate every hypothesis before we build.

Phase 1: Onboarding (First 5 days)

  • Deep-dive into your business, customers, and psychology
  • Comprehensive technical audit
  • 25+ care-driven optimisation hypotheses
  • Custom roadmap delivered

Phase 2: Operational (Continuous)

  • Validate hypotheses through AI-trained buyer personas
  • Ask: "Does this genuinely serve customer needs - not manipulate?"
  • Design, develop, and implement winning tests
  • Rigorous QA across all devices
  • Launch and monitor

Phase 3: Ongoing Analysis (Monthly)

  • Behavioural segmentation & data analysis
  • Post-purchase survey analysis (Growth+ plans)
  • Support ticket insights analysis (Growth+ plans)
  • Inventory strategy (Growth+ plans)
  • Monthly KPI planning (Growth+ plans)
  • Quarterly strategic planning (Scale+ plans)

Do you use AI?

Yes - but as an addition to our battle-tested frameworks, not the foundation.

We've built a proprietary AI system that validates every hypothesis against your actual buyer personas before we build anything. This ensures we only create optimisations your customers will genuinely respond to.

How it works:

  1. Our frameworks identify conversion opportunities
  2. We generate psychology-first hypotheses
  3. AI-trained buyer personas validate each hypothesis
  4. We ask: "Does this genuinely serve customer needs - not manipulate?"
  5. Only validated hypotheses get built and tested

This approach achieves an 84% test success rate versus a 45% industry average - because we validate against your buyer personas before building, not after.

AI enhances our care-driven methodology. It doesn't replace genuine customer understanding.

What if I need more than my plan includes?

Simply upgrade to the next tier for more included tests and enhanced ongoing analysis.

We're completely flexible - scale up or down based on your business needs. No penalties, no long-term lock-ins.

Want to discuss expanding your plan? Your dedicated CRO manager can adjust your package anytime.

Can I cancel anytime?

Yes. No long-term contracts. Cancel anytime.

We earn your business every single month through results - not by trapping you in contracts.

If we don't make you profitable within 30 days, you pay nothing more until we deliver. That's our guarantee.

Most clients stay because care-driven CRO compounds month after month - each winning test keeps generating revenue whilst new tests add even more. But you're never locked in.

We're confident our results will speak for themselves.

How involved do I need to be?

Zero micromanagement required. We operate completely autonomously.

We're an extension of your business - making decisions with your profit margins AND mission in mind, not billable hours.

Your involvement:

  • Initial onboarding: 2-3 hours (interviews, strategy alignment)
  • Weekly/bi-weekly meetings: 30-60 minutes (strategy updates, results review)
  • Ad-hoc questions: Slack chat for quick questions

We handle everything else:

  • Hypothesis generation
  • Design and copywriting
  • Development and implementation
  • QA across all devices
  • A/B test management
  • Data analysis and reporting

You focus on running your business. We focus on adding $50K+ monthly to your revenue.

That's the partnership.

What tools/platforms do you use?

We integrate with your existing tools—no forced changes.

Analytics: Shopify Analytics, Microsoft Clarity, GA4
Testing: Intelligems
Management: ClickUp, Figma, Slack

Your data stays in your systems. We integrate seamlessly.

How do you ensure my data is secure?

We sign NDAs before any work begins. Your data is protected - always.

Security measures:

  • Non-Disclosure Agreement (NDA) signed upfront
  • Limited access permissions (only what's necessary)
  • Data stored in your systems (we don't migrate your data)
  • Team access restricted to assigned personnel only
  • Regular security audits

We treat your business like our own - that includes protecting your data like it's our own.

You maintain full control over all access permissions and can revoke them anytime.

What results can I expect?

Guaranteed profitability in 30 days. $50K+ monthly revenue boost within 60 days.

Tangible outcomes:

  • Increased conversion rates (50-100%+ improvements common)
  • Higher average order values
  • Improved ROAS (return on ad spend)
  • Enhanced customer lifetime value
  • Sustainable, compounding revenue growth

But more than numbers - you'll understand your customers deeply, remove friction authentically, and build genuine relationships that compound revenue month after month.

Our 84% hypothesis success rate means tests consistently work.

Real client results:

  • ForKeeps Merch: $2.3M added revenue (+70% conversion rate)
  • Organic Muscle: 128% conversion rate increase
  • CKitchen: $1.1M added revenue over 22 months
  • Mayven Studios: 50% conversion increase in 2 months

How long should I work with you?

For as long as care-driven CRO continues delivering massive ROI - which typically compounds over 6+ months.

Why long-term partnerships work:

  • Each winning test keeps generating revenue permanently
  • New tests stack on top of previous wins
  • Deeper customer understanding leads to better hypotheses
  • Compounding effects multiply over time

Typical timeline:

  • Months 1-3: Foundation + initial wins ($50K+ monthly added)
  • Months 4-6: Compounding effects visible (wins multiply)
  • Months 7-12: Sustainable growth system established
  • 12+ months: Category-leading conversion rates achieved

Most clients stay 12-24+ months because results compound. But there's no lock-in - cancel anytime.

We earn your business every month through genuine results, not contracts.

How do I get started?

Three simple steps:

Step 1: Book a Discovery Call
A 30-minute conversation to discuss your traffic, goals, and biggest challenges. We'll explore if we're a good fit and map out your path to $50K+ monthly revenue growth.

Step 2: Get Your Free Audit
We'll conduct a comprehensive CRO audit of your website, deliver 25+ psychology-first hypotheses, and show you exactly where your biggest revenue opportunities are.

Step 3: Choose Your Plan & Launch
Select the plan that fits your traffic and business needs. We'll onboard you within 5 days and have your first A/B test live within 10 days.

Ready to grow with care-driven CRO?

Or have more questions? Email us: garyk@weblics.agency