The Art of API Migration: Switching Providers Without Downtime

APIScout Team
Tags: migration, api integration, best practices, vendor switching, reliability

Switching API providers is the project nobody wants. It's risky, time-consuming, and usually triggered by something painful — a price hike, an outage, a deprecation notice. But done right, a migration can be smooth, zero-downtime, and even improve your system. Here's the playbook.

Why Companies Migrate

| Trigger | Frequency | Urgency |
|---|---|---|
| Price increase | Common | Medium (negotiate first) |
| Better alternative exists | Common | Low (plan carefully) |
| Reliability issues | Occasional | High (after major incident) |
| Feature gap | Occasional | Medium (evaluate alternatives) |
| Acquisition/deprecation | Rare | High (forced migration) |
| Compliance requirement | Rare | High (regulatory deadline) |
| Vendor lock-in escape | Occasional | Low (strategic decision) |

The Migration Playbook

Phase 1: Assessment (1-2 weeks)

Before writing any code, answer these questions:

## Migration Assessment Checklist

### Current State
- [ ] Document all endpoints you use (not all available — just what you call)
- [ ] List all data stored with the current provider
- [ ] Map all webhook handlers and event types
- [ ] Identify SDK usage across your codebase
- [ ] Check contractual obligations (notice period, data export rights)
- [ ] Measure current performance baselines (latency, uptime, error rate)

### Target State
- [ ] Verify feature parity for YOUR use cases
- [ ] Test target provider's API with your actual data shapes
- [ ] Compare pricing at your usage level (not just list price)
- [ ] Check SDK quality (types, error handling, documentation)
- [ ] Verify compliance requirements (SOC 2, GDPR, etc.)

### Migration Scope
- [ ] Estimate code changes (endpoints, models, error handling)
- [ ] Identify data migration needs (users, subscriptions, history)
- [ ] List integration points (webhooks, SDKs, admin dashboards)
- [ ] Assess team training needs
- [ ] Set rollback criteria
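
The "measure current performance baselines" item is easier to act on with numbers in hand. Here is a minimal sketch, assuming you have already exported per-call samples from your logs or APM tool. The `CallSample` shape and the nearest-rank percentile method are illustrative assumptions, not something either provider supplies.

```typescript
// Hypothetical sample shape: one record per call to the current provider.
interface CallSample {
  latencyMs: number;
  ok: boolean; // false for timeouts, 5xx responses, or thrown errors
}

interface Baseline {
  p50Ms: number;
  p99Ms: number;
  errorRate: number; // fraction between 0 and 1
}

// Nearest-rank percentile over an ascending-sorted array.
function percentile(sortedAsc: number[], p: number): number {
  if (sortedAsc.length === 0) return NaN;
  const rank = Math.ceil((p / 100) * sortedAsc.length);
  return sortedAsc[Math.max(0, rank - 1)];
}

function computeBaseline(samples: CallSample[]): Baseline {
  const latencies = samples.map((s) => s.latencyMs).sort((a, b) => a - b);
  const failures = samples.filter((s) => !s.ok).length;
  return {
    p50Ms: percentile(latencies, 50),
    p99Ms: percentile(latencies, 99),
    errorRate: samples.length === 0 ? 0 : failures / samples.length,
  };
}
```

Record the result before any migration work starts, so the same figures can later be compared against the new provider and against your rollback criteria.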

Phase 2: Abstraction Layer (1 week)

If you don't already have one, add an abstraction layer:

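// Assumed setup (package imports are real; the env var names are illustrative)
import sgMail from '@sendgrid/mail';
import { Resend } from 'resend';

sgMail.setApiKey(process.env.SENDGRID_API_KEY ?? '');
const resend = new Resend(process.env.RESEND_API_KEY);

// Create an interface that abstracts the provider
interface SendEmailParams {
  to: string;
  subject: string;
  html: string;
  from?: string;
}

interface EmailProvider {
  sendEmail(params: SendEmailParams): Promise<{ id: string }>;

  getEmailStatus(id: string): Promise<'delivered' | 'bounced' | 'pending'>;
}

// Current provider implementation
class SendGridProvider implements EmailProvider {
  async sendEmail(params: SendEmailParams) {
    const response = await sgMail.send({
      to: params.to,
      from: params.from ?? 'hello@company.com',
      subject: params.subject,
      html: params.html,
    });
    // SendGrid returns the message ID in a response header
    return { id: response[0].headers['x-message-id'] };
  }

  async getEmailStatus(id: string) { /* ... */ }
}

// New provider implementation (write alongside, don't replace yet)
class ResendProvider implements EmailProvider {
  async sendEmail(params: SendEmailParams) {
    // Resend reports failures in `error` instead of throwing, so check it
    const { data, error } = await resend.emails.send({
      to: params.to,
      from: params.from ?? 'hello@company.com',
      subject: params.subject,
      html: params.html,
    });
    if (error || !data) {
      throw new Error(`Resend send failed: ${error?.message ?? 'no data returned'}`);
    }
    return { id: data.id };
  }

  async getEmailStatus(id: string) { /* ... */ }
}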

Key principle: Make the switch a configuration change, not a code change.
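
One way to make that principle concrete is a small factory that picks the implementation from a configuration value. This is a sketch under assumptions: the `createEmailProvider` helper, the registry shape, and the idea of reading the name from an `EMAIL_PROVIDER` environment variable are all hypothetical, and the interface is trimmed so the snippet stands alone.

```typescript
// Trimmed copy of the provider interface so this sketch is self-contained.
interface EmailProvider {
  sendEmail(params: {
    to: string;
    subject: string;
    html: string;
    from?: string;
  }): Promise<{ id: string }>;
}

type ProviderName = 'sendgrid' | 'resend';

function isProviderName(value: string): value is ProviderName {
  return value === 'sendgrid' || value === 'resend';
}

// `registry` maps each name to a factory; real code would construct
// the SendGrid-backed and Resend-backed implementations here.
function createEmailProvider(
  configuredName: string | undefined,
  registry: Record<ProviderName, () => EmailProvider>,
  fallback: ProviderName = 'sendgrid',
): EmailProvider {
  const name = (configuredName ?? fallback).toLowerCase();
  if (!isProviderName(name)) {
    // Fail loudly on typos instead of silently using the wrong vendor
    throw new Error(`Unknown email provider "${configuredName}"`);
  }
  return registry[name]();
}
```

With this in place, the rest of the codebase asks the factory for an `EmailProvider` and never imports a vendor SDK directly. Passing something like `process.env.EMAIL_PROVIDER` as the first argument turns the switch into a deploy-time setting.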

Phase 3: Parallel Running (1-2 weeks)

Run both providers simultaneously to verify behavior:

class DualEmailProvider implements EmailProvider {
  constructor(
    private primary: EmailProvider,   // Current (SendGrid)
    private secondary: EmailProvider, // New (Resend)
    private shadowPercent: number = 10, // % of traffic to shadow
  ) {}

  async sendEmail(params: Parameters<EmailProvider['sendEmail']>[0]) {
    // Always send through primary
    const result = await this.primary.sendEmail(params);

    // Shadow a sample of sends through secondary. The call is not awaited,
    // so a slow or failing secondary never delays or breaks the real send.
    if (Math.random() * 100 < this.shadowPercent) {
      void this.secondary
        .sendEmail({
          ...params,
          to: `shadow-test+${Date.now()}@company.com`, // Don't email real users!
        })
        .then((shadowResult) => this.logComparison(result, shadowResult))
        .catch((error) => this.logShadowError(error));
    }

    return result;
  }

  // Status lookups stay on the primary, which sent the real message
  getEmailStatus(id: string) {
    return this.primary.getEmailStatus(id);
  }

  private logComparison(primary: { id: string }, secondary: { id: string }) {
    // Compare response times, formats, behavior
    console.log('Shadow comparison:', { primary, secondary });
  }

  private logShadowError(error: unknown) {
    console.error('Shadow send failed:', error);
  }
}

Shadow testing rules:

  • Never send shadow traffic to real users
  • Use test/sandbox endpoints or internal addresses
  • Compare response formats, latency, error handling
  • Run for at least 1 week before switching
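
To compare behavior in a way you can chart over the week, it may help to reduce each shadow call to a few numbers. The sketch below assumes you time each provider call yourself and record whether it succeeded. The `CallOutcome` shape and the 250 ms latency budget are placeholders to adjust.

```typescript
// Hypothetical record of one call, captured around each provider call.
interface CallOutcome {
  ok: boolean;
  latencyMs: number;
  errorCode?: string;
}

interface ShadowDiff {
  outcomeMismatch: boolean; // one side succeeded, the other failed
  latencyDeltaMs: number;   // secondary minus primary
  slowerBeyondBudget: boolean;
}

function compareOutcomes(
  primary: CallOutcome,
  secondary: CallOutcome,
  latencyBudgetMs = 250, // assumed tolerance, tune for your API
): ShadowDiff {
  const latencyDeltaMs = secondary.latencyMs - primary.latencyMs;
  return {
    outcomeMismatch: primary.ok !== secondary.ok,
    latencyDeltaMs,
    slowerBeyondBudget: latencyDeltaMs > latencyBudgetMs,
  };
}

// Roll many diffs into the two figures worth watching during the shadow period.
function summarizeShadowRun(diffs: ShadowDiff[]) {
  if (diffs.length === 0) return { mismatchRate: 0, meanLatencyDeltaMs: 0 };
  const mismatches = diffs.filter((d) => d.outcomeMismatch).length;
  const totalDelta = diffs.reduce((sum, d) => sum + d.latencyDeltaMs, 0);
  return {
    mismatchRate: mismatches / diffs.length,
    meanLatencyDeltaMs: totalDelta / diffs.length,
  };
}
```

A mismatch rate that stays near zero and a stable latency delta across the shadow period are reasonable signals that the new provider is ready for real traffic.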

Phase 4: Data Migration

// Data migration depends on category:

// PAYMENT MIGRATION (Stripe → other)
// Most complex — must migrate:
// - Customer records
// - Payment methods (usually NOT portable — re-collect)
// - Subscription data
// - Transaction history (for your records, not the new provider)

// EMAIL MIGRATION (SendGrid → Resend)
// Moderate — migrate:
// - DNS records (SPF, DKIM, DMARC)
// - Sender verification
// - Template mappings
// - Suppression lists (bounces, unsubscribes, spam complaints)

// AUTH MIGRATION (Auth0 → Clerk)
// Complex — migrate:
// - User accounts (password hashes may not be portable)
// - Social connections
// - MFA settings
// - Session management
// - RBAC policies

// SEARCH MIGRATION (Algolia → Typesense)
// Moderate — migrate:
// - Index data (re-index from your database)
// - Search configuration (relevance, synonyms, filters)
// - API query format changes
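
As one concrete case, a suppression list usually needs cleaning before it is imported elsewhere. The following is a sketch only: the `SuppressionEntry` shape is hypothetical (real exports differ per provider), and the batching helper assumes the target provider accepts imports in chunks of a size you choose.

```typescript
// Hypothetical export row; map your provider's real export format onto this.
interface SuppressionEntry {
  email: string;
  reason: 'bounce' | 'unsubscribe' | 'complaint';
}

// Lowercase, trim, drop malformed rows, and keep the first entry per address.
function normalizeSuppressions(entries: SuppressionEntry[]): SuppressionEntry[] {
  const byEmail = new Map<string, SuppressionEntry>();
  for (const entry of entries) {
    const email = entry.email.trim().toLowerCase();
    if (!email.includes('@')) continue; // skip obviously malformed rows
    if (!byEmail.has(email)) byEmail.set(email, { email, reason: entry.reason });
  }
  return Array.from(byEmail.values());
}

// Split a list into fixed-size batches for bulk import calls.
function chunk<T>(items: T[], size: number): T[][] {
  if (size <= 0) throw new Error('size must be positive');
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```

Importing the cleaned list before any real traffic moves matters, because mailing a previously bounced or unsubscribed address from the new provider hurts deliverability from the first send.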

Phase 5: Traffic Cutover

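// Gradual traffic shift using feature flags.
// `featureFlag` and `getPhase` stand in for your flag service and rollout config.
class MigratingEmailProvider implements EmailProvider {
  // Message IDs are provider-specific, so remember which provider sent each one.
  // In-memory for brevity; real code should persist this with the message record.
  private sentVia = new Map<string, EmailProvider>();

  constructor(
    private old: EmailProvider,
    private new_: EmailProvider,
  ) {}

  async sendEmail(params: Parameters<EmailProvider['sendEmail']>[0]) {
    // Feature flag controls rollout
    const useNew = await featureFlag.isEnabled('use-resend', {
      percent: getPhase(), // 0% → 10% → 25% → 50% → 100%
    });

    if (useNew) {
      try {
        return this.track(await this.new_.sendEmail(params), this.new_);
      } catch (error) {
        // Fallback to old provider on error during migration.
        // Caveat: if the new provider accepted the message before erroring
        // (for example, a timeout), the fallback can produce a duplicate send.
        console.error('New provider failed, falling back:', error);
      }
    }

    return this.track(await this.old.sendEmail(params), this.old);
  }

  getEmailStatus(id: string) {
    return (this.sentVia.get(id) ?? this.old).getEmailStatus(id);
  }

  private track(result: { id: string }, provider: EmailProvider) {
    this.sentVia.set(result.id, provider);
    return result;
  }
}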

// Rollout schedule:
// Day 1: 0% (shadow testing only)
// Day 3: 10% (early adopters, monitor closely)
// Day 5: 25% (broader testing)
// Day 7: 50% (half traffic)
// Day 10: 100% (full migration)
// Day 40: Remove old provider code (after 30 days at 100% with the old provider kept as a fallback)
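
A percentage rollout works best when the same user or tenant lands on the same provider on every request. The sketch below shows one way a percentage flag could bucket traffic deterministically and how the schedule might be encoded as data. It is an assumption about how such a flag behaves, not the API of any particular feature-flag service, and the hash is an FNV-1a-style function applied to UTF-16 code units.

```typescript
// Rollout schedule as [startDay, percent] pairs, ascending by day.
const ROLLOUT_SCHEDULE: Array<[number, number]> = [
  [1, 0],
  [3, 10],
  [5, 25],
  [7, 50],
  [10, 100],
];

// Percent of traffic the new provider should receive on a given day.
function percentForDay(day: number): number {
  let percent = 0;
  for (const [startDay, p] of ROLLOUT_SCHEDULE) {
    if (day >= startDay) percent = p;
  }
  return percent;
}

// FNV-1a-style 32-bit hash; any stable hash would do here.
function stableHash(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force unsigned
}

// True when this key falls inside the current rollout percentage.
// The same key always lands in the same bucket, so a key that is
// enabled at 10% stays enabled at 25%, 50%, and 100%.
function useNewProvider(stableKey: string, percent: number): boolean {
  return stableHash(stableKey) % 100 < percent;
}
```

Keying on a user or account ID rather than a random draw per request also makes incidents easier to debug, since you can tell which provider handled a given customer's traffic.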

Phase 6: Cleanup

## Post-Migration Checklist

- [ ] Old provider SDK removed from dependencies
- [ ] Old provider env vars removed
- [ ] Feature flags cleaned up
- [ ] Old webhook endpoints decommissioned
- [ ] DNS records updated (email: SPF/DKIM)
- [ ] Monitoring updated for new provider
- [ ] Team documentation updated
- [ ] Old provider account downgraded or closed
- [ ] Data export from old provider (for records)
- [ ] Runbook updated with new provider procedures

Category-Specific Migration Guides

Payment Provider Migration

Difficulty: Very High

Key challenges:
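- Payment methods usually can't be moved by you directly (stored tokens are provider-specific; ask both providers about a PCI-compliant card-data transfer, otherwise cards must be re-collected)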
- Active subscriptions need careful handling
- PCI compliance during transition
- Financial reconciliation

Approach:
1. New users → new provider immediately
2. Existing users → dual-write during transition
3. Subscription renewal → migrate at next billing cycle
4. Communicate to customers about re-entering payment info
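
Steps 1, 3, and 4 of this approach can be sketched as a small routing decision. Everything here is illustrative: the `SubscriptionState` fields are hypothetical names for data you would track yourself, and dual-writing (step 2) is left out.

```typescript
type BillingProvider = 'old' | 'new';

// Hypothetical per-subscription state tracked in your own database.
interface SubscriptionState {
  currentProvider: BillingProvider;
  hasPaymentMethodOnNewProvider: boolean;
}

// Step 1: new users go to the new provider immediately.
function providerForNewSignup(): BillingProvider {
  return 'new';
}

// Step 3: decide at renewal time. A subscription moves only once a payment
// method exists on the new provider, and it never moves back.
function providerForRenewal(sub: SubscriptionState): BillingProvider {
  if (sub.currentProvider === 'new') return 'new';
  return sub.hasPaymentMethodOnNewProvider ? 'new' : 'old';
}

// Step 4: who still needs to hear from you about re-entering payment info.
function needsPaymentInfoReminder(sub: SubscriptionState): boolean {
  return sub.currentProvider === 'old' && !sub.hasPaymentMethodOnNewProvider;
}
```

Deciding at renewal time keeps each customer on exactly one provider per billing period, which simplifies reconciliation.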

Auth Provider Migration

Difficulty: High

Key challenges:
- Password hashes may use different algorithms
- Social connection tokens need re-authorization
- Active sessions during cutover
- MFA device re-enrollment

Approach:
1. Bulk import users (most auth providers support this)
2. Force password reset for users with non-portable hashes
3. Social logins: re-link on next login
4. Cut over login page, not sessions (existing sessions stay valid)
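
The first three steps amount to a per-user decision, which could be sketched as below. The set of importable hash algorithms is an input you would fill in from the target provider's import documentation. The `LegacyUser` shape and the action names are hypothetical.

```typescript
// Hypothetical shape of a user record exported from the old auth provider.
interface LegacyUser {
  id: string;
  passwordHashAlgorithm: string | null; // null when the user only uses social login
  socialProviders: string[];
}

type ImportAction =
  | 'import-hash'
  | 'force-password-reset'
  | 'relink-social-on-next-login';

// Decide what has to happen for one user during the bulk import.
function planUserImport(
  user: LegacyUser,
  importableAlgorithms: ReadonlySet<string>, // from the target provider's docs
): ImportAction[] {
  const actions: ImportAction[] = [];
  if (user.passwordHashAlgorithm !== null) {
    const portable = importableAlgorithms.has(user.passwordHashAlgorithm.toLowerCase());
    actions.push(portable ? 'import-hash' : 'force-password-reset');
  }
  if (user.socialProviders.length > 0) {
    actions.push('relink-social-on-next-login');
  }
  return actions;
}
```

Running this over the full export before the migration gives you a count of users who will hit a forced reset, which is worth knowing before you write the customer communication.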

Email Provider Migration

Difficulty: Medium

Key challenges:
- DNS propagation for SPF/DKIM
- IP reputation with new provider
- Suppression list transfer
- Template format differences

Approach:
1. Set up DNS records for new provider alongside old
2. Warm up new provider's sending reputation
3. Import suppression lists
4. Migrate templates
5. Switch traffic gradually
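
Warm-up (step 2) usually means capping daily volume on the new provider and raising the cap over time. The sketch below generates such a schedule. The starting volume and growth factor are placeholder assumptions, not recommended values, so defer to whatever warm-up guidance the new provider publishes.

```typescript
// Build a list of daily send caps that grows geometrically until it
// reaches the target volume. All numbers are caller-supplied assumptions.
function warmupSchedule(
  startPerDay: number,
  targetPerDay: number,
  growthFactor = 2,
): number[] {
  if (startPerDay <= 0) throw new Error('startPerDay must be positive');
  if (growthFactor <= 1) throw new Error('growthFactor must be greater than 1');

  const caps: number[] = [];
  let cap = startPerDay;
  while (cap < targetPerDay) {
    caps.push(Math.floor(cap));
    cap *= growthFactor;
  }
  caps.push(targetPerDay); // final day sends at full volume
  return caps;
}

// Example: warmupSchedule(100, 1000) yields [100, 200, 400, 800, 1000]
```

Sends above the day's cap can keep going through the old provider, which fits the gradual traffic switch in step 5.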

Search Provider Migration

Difficulty: Medium

Key challenges:
- Query syntax differences
- Relevance tuning needs re-work
- Re-indexing all data
- Search analytics continuity

Approach:
1. Re-index from your source of truth (database, not old index)
2. A/B test search quality before full switch
3. Map old query syntax to new
4. Monitor search metrics after switch
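
For the A/B comparison in step 2, one simple signal is how much the top results overlap between the old and new engines for the same query. This sketch computes overlap at k from two ranked lists of document IDs. It is a rough proxy, not a full relevance evaluation, and the function name is made up for this example.

```typescript
// Fraction of the top-k results shared between two ranked ID lists.
// Returns 1 when both lists are empty (both engines agree on "no results").
function overlapAtK(oldIds: string[], newIds: string[], k: number): number {
  const oldTop = new Set(oldIds.slice(0, k));
  const newTop = new Set(newIds.slice(0, k));
  const denominator = Math.max(oldTop.size, newTop.size);
  if (denominator === 0) return 1;

  let shared = 0;
  newTop.forEach((id) => {
    if (oldTop.has(id)) shared += 1;
  });
  return shared / denominator;
}
```

Running this over a sample of real queries from your logs and reviewing the lowest-overlap ones by hand tends to surface the query-syntax and relevance differences listed above. Low overlap is not automatically bad, since the new ranking may be better.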

Rollback Plan

Every migration needs a rollback plan:

// Rollback criteria (define BEFORE starting)
const ROLLBACK_CRITERIA = {
  errorRate: 0.05,      // >5% error rate
  latencyP99: 2000,     // >2s P99 latency
  downtime: 60,         // >60 seconds downtime
  dataLoss: 0,          // Any data loss = immediate rollback
};

// Rollback procedure
async function rollback() {
  // 1. Switch feature flag to 0% (all traffic to old provider)
  await featureFlag.disable('use-resend'); // the same flag that gates the cutover

  // 2. Verify old provider is handling traffic
  await healthCheck.verify('old-provider');

  // 3. Alert team
  await alert('API migration rolled back — investigating');

  // 4. Do NOT delete new provider setup (may resume later)
}
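
The criteria are only useful if something checks them. Below is a minimal sketch of that check, assuming your monitoring can supply the four observed values. The `MigrationMetrics` field names are hypothetical and the limits mirror the thresholds above.

```typescript
interface MigrationMetrics {
  errorRate: number;       // fraction between 0 and 1
  latencyP99Ms: number;
  downtimeSeconds: number;
  dataLossEvents: number;
}

// Same thresholds as the rollback criteria above.
const ROLLBACK_LIMITS: MigrationMetrics = {
  errorRate: 0.05,
  latencyP99Ms: 2000,
  downtimeSeconds: 60,
  dataLossEvents: 0,
};

// Returns the list of breached criteria; an empty list means keep going.
function rollbackReasons(
  observed: MigrationMetrics,
  limits: MigrationMetrics = ROLLBACK_LIMITS,
): string[] {
  const reasons: string[] = [];
  if (observed.errorRate > limits.errorRate) reasons.push('error rate above limit');
  if (observed.latencyP99Ms > limits.latencyP99Ms) reasons.push('P99 latency above limit');
  if (observed.downtimeSeconds > limits.downtimeSeconds) reasons.push('downtime above limit');
  if (observed.dataLossEvents > limits.dataLossEvents) reasons.push('data loss detected');
  return reasons;
}
```

Evaluating this on a timer during each rollout phase, and triggering the rollback procedure whenever the list is non-empty, takes the judgment call out of an incident.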

Common Mistakes

| Mistake | Impact | Fix |
|---|---|---|
| Big-bang cutover | All-or-nothing, no rollback | Gradual traffic shift |
| No abstraction layer | Migration requires changing every file | Build abstraction first |
| Skipping parallel running | Bugs found in production | Shadow test for 1+ week |
| Forgetting webhook migration | Missing events after switch | Migrate webhooks BEFORE cutover |
| Migrating data, not re-syncing | Stale data in new provider | Re-sync from source of truth |
| No rollback plan | Can't recover if migration fails | Define rollback criteria upfront |
| Rushing to delete old provider | No fallback if issues emerge | Keep old provider active for 30 days |

Compare API providers for easy migration on APIScout — feature parity checks, migration guides, and vendor comparison tools.
