May 12, 2026

We Sent 47,000 Cold Emails in Q1 2026. Here's What Actually Worked.

Most cold email content tells you what should work. This isn't that.

This is a breakdown of 47,000 outbound emails sent across 14 B2B campaigns over a 90-day period — the actual reply rates, the deliverability disasters, the sequences that booked meetings, and the approaches that flatlined. If you're running cold outreach at any volume, the patterns here will save you months of trial and error.

The short version: deliverability decides everything before the copy even matters, automated prospecting dramatically outperforms manual list-building on quality (not just speed), and the SDR-shaped hole most companies think they need can be filled with the right system and a fraction of the budget.

---

The Setup: How We Ran Automated Prospecting at Scale

Before a single email went out, we spent three weeks on infrastructure and targeting. This is where most teams get it backwards — they build sequences first and fix the plumbing later. We did the opposite.

Prospect sourcing used automated prospecting across LinkedIn company data, job postings, and founder profiles. The filters weren't just industry + headcount. We targeted signals: companies that had posted a VP of Sales role in the last 60 days (growth mode, budget unlocked), SaaS companies crossing Series A (team scaling, outreach tooling decisions pending), and agencies with 5–25 employees that showed signs of doing outreach manually (Calendly links in email signatures, LinkedIn activity from founders doing their own DMs).

Out of every 1,000 raw leads pulled, roughly 340 made it through our ICP scoring process with a score of 70+. The bottom 660 never got emailed. That 34% pass rate sounds strict, but it was the right call: tighter, higher-scored pools earned 2.4x the reply rate of campaigns sent to broad, low-quality lists.
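For a concrete picture of that gate, here is a minimal sketch. The `Lead` class, its `icp_score` field, and the sample companies are hypothetical; only the 70-point pass mark comes from the numbers above, and how the score itself is computed is out of scope here.

```python
from dataclasses import dataclass

ICP_THRESHOLD = 70  # pass mark quoted in the post


@dataclass
class Lead:
    company: str
    icp_score: int  # hypothetical 0-100 score from an upstream scoring model


def split_by_icp(leads, threshold=ICP_THRESHOLD):
    """Return (qualified, rejected); only qualified leads enter a sequence."""
    qualified = [lead for lead in leads if lead.icp_score >= threshold]
    rejected = [lead for lead in leads if lead.icp_score < threshold]
    return qualified, rejected


# Made-up sample data, purely for illustration.
leads = [Lead("Acme SaaS", 82), Lead("Broad Co", 55), Lead("Agency Nine", 70)]
qualified, rejected = split_by_icp(leads)
pass_rate = len(qualified) / len(leads)
```

The point of keeping the rejected list around rather than discarding it is that it can be re-scored later if a signal changes.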

Domain setup used 18 sending domains across 6 root domains, with 3 mailboxes per domain. All domains were aged a minimum of 45 days before any volume. SPF, DKIM, and DMARC were enforced on every single one. This isn't optional in 2026 — it's the floor.
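One way to sanity-check the DMARC half of that floor is to parse the record and confirm the policy is not `p=none`. This is a sketch under assumptions: reading "enforced" as a `quarantine` or `reject` policy is our interpretation, the example.com record is made up, and fetching the TXT record from DNS is left out.

```python
def parse_dmarc(record):
    """Split a DMARC TXT record into a dict of lowercase tag -> value."""
    tags = {}
    for part in record.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            tags[key.strip().lower()] = value.strip()
    return tags


def is_enforced(record):
    """True when the record is DMARC1 and the policy is quarantine or reject.

    Treating those two policies as "enforced" is an assumption of this sketch.
    """
    tags = parse_dmarc(record)
    policy = tags.get("p", "").lower()
    return tags.get("v") == "DMARC1" and policy in {"quarantine", "reject"}


# Hypothetical records for a placeholder domain.
strict = "v=DMARC1; p=reject; rua=mailto:dmarc@example.com"
monitor_only = "v=DMARC1; p=none"
```

Here `is_enforced(strict)` is true and `is_enforced(monitor_only)` is false, since a monitoring-only policy asks receivers to take no action on failures.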

---

Deliverability: The Unglamorous Thing That Determined 80% of Results

Here's a number most cold email guides won't give you: of the 47,000 emails sent, we estimate roughly 41,200 reached the inbox. The remaining ~12% landed in spam or bounced. That gap — 5,800 emails that no one ever saw — is the difference between a campaign that funds itself and one that doesn't.
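The arithmetic behind that gap, using only the two figures quoted above (the inbox count is itself an estimate):

```python
sent = 47_000
inboxed = 41_200  # estimated inbox placement from the post

lost = sent - inboxed
lost_share = lost / sent

print(f"{lost:,} emails unseen ({lost_share:.1%} of sends)")
# prints "5,800 emails unseen (12.3% of sends)"
```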

What moved inbox placement most:

1. Warmup wasn't a checkbox. It was ongoing. We kept warmup running on all domains throughout the campaign at a 30% ratio of warmup to outbound sends. Most teams stop warming once a domain "passes," but deliverability degrades under sustained volume. Continuous warmup offset that degradation and kept spam complaint rates below 0.08% across the quarter.

2. Send patterns mimicked human behavior. No campaign sent more than 40 emails per mailbox per day. Sends were distributed across a 9am–5pm local-time window with random delay intervals between 90 and 480 seconds. Batch sends — where 200 emails go out in a 15-minute window — are one of the fastest ways to trip provider filters. For a deeper dive on this, see our email warmup and domain rotation playbook.

3. Bounce rates were managed in real time. Any domain crossing 3% bounce rate was paused immediately, its remaining sequence contacts were re-validated, and sending resumed only after the list was cleaned. Two domains never recovered — both had been used for a previous campaign with sloppy list hygiene. They were retired.

4. Subject line diversity mattered more than A/B testing. Running the same subject line across 2,000 sends is a signal. We rotated across 6 subject line variants per campaign, randomized at send time, with no single variant exceeding 20% of volume. This reduced pattern-matching by spam filters and gave us honest performance data per variant.
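Rules 2 through 4 translate fairly directly into code. The sketch below is one possible shape, not the tooling we ran: the function names are hypothetical, the thresholds are the ones quoted above, and the subject rotation uses a shuffled round-robin deck (our choice here) so that six variants stay near 17% each instead of drifting the way pure random picks can.

```python
import random

MAX_PER_MAILBOX_PER_DAY = 40        # daily cap per mailbox
MIN_DELAY_S, MAX_DELAY_S = 90, 480  # random gap between sends, in seconds
WINDOW_S = 8 * 60 * 60              # 9am-5pm local time, in seconds
BOUNCE_PAUSE_THRESHOLD = 0.03       # pause a domain above a 3% bounce rate


def plan_send_offsets(n_emails, rng):
    """Seconds after 9am at which one mailbox sends each email today."""
    count = min(n_emails, MAX_PER_MAILBOX_PER_DAY)
    # Random start that leaves room for worst-case gaps inside the window.
    t = rng.randint(0, WINDOW_S - count * MAX_DELAY_S)
    offsets = []
    for _ in range(count):
        offsets.append(t)
        t += rng.randint(MIN_DELAY_S, MAX_DELAY_S)
    return offsets


def should_pause(sent, bounced):
    """True once a domain's bounce rate crosses the pause threshold."""
    return sent > 0 and bounced / sent > BOUNCE_PAUSE_THRESHOLD


def assign_subjects(n_sends, variants, rng):
    """Deal variants evenly, then shuffle so the order looks random."""
    deck = [variants[i % len(variants)] for i in range(n_sends)]
    rng.shuffle(deck)
    return deck


rng = random.Random(7)  # seeded only to make the example repeatable
offsets = plan_send_offsets(100, rng)  # capped at 40 sends
subjects = assign_subjects(2000, ["A", "B", "C", "D", "E", "F"], rng)
```

Asking for 100 sends still yields 40 offsets, and no variant in `subjects` exceeds 334 of 2,000 sends, which keeps each under the 20% ceiling.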

---

The Sequences That Actually Booked Meetings

Across 14 campaigns, three sequence structures consistently outperformed the others.

The 3-touch problem-first sequence was the top performer, generating 71% of all booked meetings:

- Email 1 (Day 0): One specific pain point, one social proof data point, one clear ask. Under 90 words. No features. No "I came across your profile."
- Email 2 (Day 4): A case study reference or specific outcome, in one sentence. "Worth a quick look?" Not a full re-pitch.
- Email 3 (Day 9): A permission-to-close. "Should I close this out, or is the timing just off?" Low pressure, high response rate.

This structure outperformed longer sequences (5+ touches) by 38% on reply rate and 52% on positive reply rate. The extra touches in longer sequences did pull in additional replies, but they were increasingly negative ("stop emailing me"), which tanks deliverability and poisons the domain.
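As data, the 3-touch structure is small enough to fit in a few lines. This is an illustrative sketch: the `Touch` class, the `schedule` helper, and the word-count check are hypothetical names, while the day offsets (0, 4, 9) and the under-90-words rule come from the sequence described above.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Touch:
    day_offset: int  # days after the first email
    purpose: str


PROBLEM_FIRST = (
    Touch(0, "pain point + proof point + one ask"),
    Touch(4, "one-sentence case study or outcome"),
    Touch(9, "permission-to-close"),
)

FIRST_TOUCH_MAX_WORDS = 90  # first email stays under this


def schedule(start, sequence=PROBLEM_FIRST):
    """Return (send_date, purpose) pairs for a prospect starting on `start`."""
    return [(start + timedelta(days=t.day_offset), t.purpose) for t in sequence]


def first_touch_ok(body):
    """True when the first email is under the word limit."""
    return len(body.split()) < FIRST_TOUCH_MAX_WORDS


plan = schedule(date(2026, 1, 5))
```

A prospect started on January 5 gets emails on January 5, 9, and 14. The sketch ignores weekends and holidays, which a real scheduler would likely skip.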

The contrast opener worked particularly well for competitive markets. First line: "Most [role] tools promise [generic benefit]. Ours does [specific differentiator] instead." No pleasantries, no company backstory, no "I hope this finds you well." The reply rate on contrast openers was 4.1% vs. 2.3% for generic benefit-led openers.

The hyper-specific personalization tier — reserved for top 10% ICP scores — used custom first lines referencing a recent hire, a funding round, or a specific blog post by the prospect. These generated a 7.2% reply rate. The personalization was generated automatically at scale through our platform's enrichment layer, not written by hand.

---

What Killed Reply Rates (The Honest Version)

Long first emails. Any email over 120 words in the first touch saw a 40% drop in reply rate. Prospects aren't reading long cold emails. They're skimming for a reason to respond or a reason to delete. Give them one thing.

Feature-forward copy. "Our platform uses advanced AI to..." is not a hook. Pain-forward copy — "Your team is probably spending 12+ hours a week on manual prospecting that delivers inconsistent results" — converts at 3x the rate of feature copy. Sell the outcome, not the mechanism.

Mismatched ICP. The two worst-performing campaigns targeted companies that scored 50–65 on our ICP model. We ran them as a test. The results confirmed what the model predicted: low fit = low interest. The reply rates (0.9% and 1.1%) would have been embarrassing enough without the deliverability damage those campaigns caused. Low-engagement sends hurt sender reputation. A bad list doesn't just waste budget — it actively harms your good campaigns.

Sending to role-based addresses. Role-based addresses such as info@, hello@, and sales@ were excluded from every sequence. They generate bounces, out-of-office loops, and low engagement signals. Always target named contacts.
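A filter for this takes a few lines. The sketch below only knows the three prefixes named above; a production list would likely be longer, and the function names and sample addresses are hypothetical.

```python
ROLE_LOCAL_PARTS = {"info", "hello", "sales"}  # the three prefixes named above


def is_role_based(email):
    """True when the part before the @ is a known role prefix."""
    local = email.strip().lower().split("@", 1)[0]
    return local in ROLE_LOCAL_PARTS


def named_contacts(emails):
    """Keep only addresses that look like they belong to a person."""
    return [e for e in emails if not is_role_based(e)]


# Made-up addresses on a placeholder domain.
sample = ["info@example.com", "dana.lee@example.com", "Sales@example.com"]
kept = named_contacts(sample)
```

Here `kept` contains only `dana.lee@example.com`; the comparison is case-insensitive, so `Sales@` is caught too.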

---

The SDR Replacement Math

The honest question underneath every cold email automation discussion is: what does this cost compared to a human SDR?

A mid-market SDR in 2026 costs $80K–$110K fully loaded (salary, benefits, tools, management overhead). Ramp time is 3–4 months. Productive output is roughly 40–60 personalized outreach touchpoints per day before quality drops. At that rate, a single SDR might touch 800–1,200 unique prospects per month.

The 47,000-email system we ran touched approximately 9,400 unique prospects per month. Personalization quality — measured by positive reply rate on top-tier accounts — was comparable because the automated enrichment and sequence generation matched or exceeded what a junior SDR produces. Cost to run the system: under $1,200/month including tooling.
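To put the two side by side on one metric, here is the cost per unique prospect touched, computed from the ranges above. Assumptions are stated in the comments: it spreads the SDR's annual cost evenly over 12 months, ignores ramp time, and treats the "under $1,200" system cost as a full $1,200. It measures cost per prospect touched, not cost per meeting.

```python
def cost_per_prospect(monthly_cost, prospects_per_month):
    """Monthly spend divided by unique prospects touched that month."""
    return monthly_cost / prospects_per_month


# SDR: $80K-$110K fully loaded per year, 800-1,200 prospects per month.
sdr_best = cost_per_prospect(80_000 / 12, 1_200)   # cheapest SDR, highest reach
sdr_worst = cost_per_prospect(110_000 / 12, 800)   # priciest SDR, lowest reach

# Automated system: upper bound of $1,200/month, ~9,400 prospects per month.
system = cost_per_prospect(1_200, 9_400)

print(f"SDR: ${sdr_best:.2f} to ${sdr_worst:.2f} per prospect")
# prints "SDR: $5.56 to $11.46 per prospect"
print(f"System: ${system:.2f} per prospect")
# prints "System: $0.13 per prospect"
```

Even against the best-case SDR figure, the system comes out more than 40 times cheaper per prospect touched on these inputs.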

We covered this in more depth in the SDR replacement cost breakdown, but the summary is: the math isn't close. The bottleneck isn't volume or personalization anymore. It's ICP accuracy and deliverability execution.

---

What to Take From This

If you're running cold outreach right now, the highest-leverage changes you can make aren't in your copy — they're in your infrastructure and targeting:

1. Run ICP scoring before you build a single sequence. Bad lists don't just waste effort; they damage deliverability for your good campaigns.

2. Treat warmup as ongoing, not one-time. Sustained volume needs sustained warmup to offset degradation.

3. Keep first-touch emails under 100 words. Length is not persuasion.

4. Use 3-touch sequences by default. More touches generate more noise, not more meetings.

5. Rotate subject lines at volume. Pattern diversity reduces spam filter triggers.

The teams booking the most meetings with cold email in 2026 aren't writing better copy than everyone else. They're running cleaner infrastructure, targeting tighter ICPs, and letting automated prospecting handle the volume so their human judgment goes where it actually matters.

If you want to see how OnyxSend handles the prospecting, scoring, sequencing, and deliverability monitoring in a single system — take a look at what's included.
