Apify Production Checklist
Overview
Complete checklist for deploying Actors to the Apify platform and integrating them into production applications. Covers Actor configuration, scheduling, monitoring, alerting, and rollback.
Prerequisites
- Actor tested locally with
apify run apify loginconfigured with production token- Familiarity with
apify-core-workflow-aandapify-deploy-integration
Pre-Deployment Checklist
Actor Configuration
- [ ]
.actor/actor.jsonhas correctname,title,description - [ ]
INPUT_SCHEMA.jsonvalidates all required inputs - [ ]
Dockerfileuses pinned base image version (apify/actor-node:20, notlatest) - [ ]
package-lock.jsoncommitted (deterministic installs) - [ ] Memory set appropriately (start at 1024MB, tune after profiling)
- [ ] Timeout set with buffer (2x expected runtime)
Code Quality
- [ ]
Actor.main()wraps entry point (handles init/exit/errors) - [ ]
failedRequestHandlerlogs failures without crashing Actor - [ ] Input validation at Actor start (
if (!input?.startUrls) throw ...) - [ ] No hardcoded URLs, credentials, or magic numbers
- [ ] Proxy configured for target sites that block datacenter IPs
- [ ]
maxRequestsPerCrawlset to prevent runaway costs
Data Output
- [ ] Dataset schema documented (consistent field names)
- [ ]
SUMMARYkey-value store record saved with run stats - [ ] Large payloads chunked (9MB dataset push limit)
- [ ] PII sanitized before storage
Instructions
Step 1: Deploy Actor
# Build and push to Apify platform
apify push
# Verify the build succeeded
apify builds ls
# Test on platform with production-like input
apify actors call username/my-actor \
--input='{"startUrls":[{"url":"https://target.com"}],"maxItems":10}'
Step 2: Configure Scheduling
import { ApifyClient } from 'apify-client';
const client = new ApifyClient({ token: process.env.APIFY_TOKEN });
// Create a scheduled task (cron)
const schedule = await client.schedules().create({
name: 'daily-product-scrape',
cronExpression: '0 6 * * *', // Daily at 6 AM UTC
isEnabled: true,
actions: [{
type: 'RUN_ACTOR',
actorId: 'username/my-actor',
runInput: {
body: JSON.stringify({
startUrls: [{ url: 'https://target.com/products' }],
maxItems: 5000,
}),
contentType: 'application/json',
},
runOptions: {
memory: 2048,
timeout: 3600,
build: 'latest',
},
}],
});
console.log(`Schedule created: ${schedule.id}`);
Or configure in Apify Console: Actors > Your Actor > Schedules.
Step 3: Set Up Webhooks for Monitoring
// Create webhook for run completion alerts
const webhook = await client.webhooks().create({
eventTypes: ['ACTOR.RUN.SUCCEEDED', 'ACTOR.RUN.FAILED', 'ACTOR.RUN.TIMED_OUT'],
condition: { actorId: 'ACTOR_ID' },
requestUrl: 'https://your-server.com/api/apify-webhook',
payloadTemplate: JSON.stringify({
eventType: '{{eventType}}',
actorId: '{{actorId}}',
runId: '{{actorRunId}}',
status: '{{resource.status}}',
datasetId: '{{resource.defaultDatasetId}}',
startedAt: '{{resource.startedAt}}',
finishedAt: '{{resource.finishedAt}}',
}),
});
Step 4: Monitor Runs
// Check recent runs for failures
async function checkActorHealth(actorId: string, lookbackHours = 24) {
const { items: runs } = await client.actor(actorId).runs().list({
limit: 50,
desc: true,
});
const cutoff = new Date(Date.now() - lookbackHours * 3600_000);
const recentRuns = runs.filter(r => new Date(r.startedAt) > cutoff);
const stats = {
total: recentRuns.length,
succeeded: recentRuns.filter(r => r.status === 'SUCCEEDED').length,
failed: recentRuns.filter(r => r.status === 'FAILED').length,
timedOut: recentRuns.filter(r => r.status === 'TIMED-OUT').length,
totalCostUsd: recentRuns.reduce((sum, r) => sum + (r.usageTotalUsd ?? 0), 0),
};
const successRate = stats.total > 0
? ((stats.succeeded / stats.total) * 100).toFixed(1)
: 'N/A';
console.log(`Actor: ${actorId}`);
console.log(`Last ${lookbackHours}h: ${stats.total} runs, ${successRate}% success`);
console.log(`Failed: ${stats.failed}, Timed out: ${stats.timedOut}`);
console.log(`Total cost: $${stats.totalCostUsd.toFixed(4)}`);
if (stats.failed > 0) {
console.warn('ALERT: Failed runs detected!');
}
return stats;
}
Step 5: Implement Rollback
# List available builds
apify builds ls
# Roll back to a previous build
curl -X POST \
-H "Authorization: Bearer $APIFY_TOKEN" \
"https://api.apify.com/v2/acts/ACTOR_ID?build=BUILD_NUMBER"
# Or redeploy from a git tag
git checkout v1.2.3
apify push
Step 6: Cost Guard
// Set up a cost guard that aborts runs exceeding budget
async function runWithCostGuard(
actorId: string,
input: Record<string, unknown>,
maxCostUsd: number,
) {
const run = await client.actor(actorId).start(input);
// Poll every 30 seconds
const pollInterval = setInterval(async () => {
const status = await client.run(run.id).get();
const cost = status.usageTotalUsd ?? 0;
if (cost > maxCostUsd) {
console.error(`Cost guard: $${cost.toFixed(4)} exceeds $${maxCostUsd}. Aborting.`);
await client.run(run.id).abort();
clearInterval(pollInterval);
}
}, 30_000);
const finished = await client.run(run.id).waitForFinish();
clearInterval(pollInterval);
return finished;
}
Production Alert Conditions
| Alert | Condition | Severity |
|-------|-----------|----------|
| Run failed | status === 'FAILED' | P1 |
| Run timed out | status === 'TIMED-OUT' | P2 |
| Low yield | Dataset items < expected threshold | P2 |
| High cost | usageTotalUsd > budget | P2 |
| Consecutive failures | 3+ failures in a row | P1 |
| No runs in window | Schedule didn't trigger | P1 |
Error Handling
| Issue | Cause | Solution |
|-------|-------|----------|
| Build fails on platform | Local deps differ | Commit package-lock.json |
| Schedule not firing | Cron syntax error | Validate at crontab.guru |
| Webhook not received | URL not reachable | Use ngrok for testing; check HTTPS |
| Memory exceeded | Workload too large | Increase memory or reduce concurrency |
| Unexpected cost spike | No maxRequestsPerCrawl | Always set an upper bound |
Resources
Next Steps
For version upgrades, see apify-upgrade-migration.