Explain the COPY command in Amazon Redshift.
💡 Model Answer
The COPY command is Amazon Redshift's bulk data loader. It loads data from external sources such as Amazon S3, DynamoDB, Amazon EMR, or remote hosts over SSH into a Redshift table. It is designed for high throughput: the compute node slices read the input in parallel, and rows are distributed according to the target table's distribution style. Key features include:
- Parallel loading: Slices read multiple input files simultaneously, so splitting the data into several similarly sized files helps throughput.
- Format support: CSV, JSON, Avro, Parquet, ORC, etc.
- Compression handling: Supports gzip, bzip2, lzop, and Zstandard (zstd) compressed input files.
- Security: Uses IAM roles or temporary credentials.
- Error handling: Options like ACCEPTINVCHARS, MAXERROR, and TRUNCATECOLUMNS; rejected rows are logged in the STL_LOAD_ERRORS system table.
Typical syntax:
COPY table_name
FROM 's3://bucket/path'
IAM_ROLE 'arn:aws:iam::account-id:role/RedshiftCopyRole'
FORMAT AS CSV
DELIMITER ','
IGNOREHEADER 1;
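As a rough illustration of the compression and error-handling options listed above, the sketch below extends the typical syntax. The table name `sales_staging` and the bucket `example-bucket` are hypothetical, and the error threshold of 10 is an arbitrary choice for the example, not a recommendation.

```sql
-- Hypothetical table and bucket names; the role ARN is the same placeholder as above.
COPY sales_staging
FROM 's3://example-bucket/sales/'   -- key prefix: every object under it is loaded
IAM_ROLE 'arn:aws:iam::account-id:role/RedshiftCopyRole'
FORMAT AS CSV
IGNOREHEADER 1
GZIP                 -- input files are gzip-compressed
MAXERROR 10          -- tolerate up to 10 rejected rows before the load fails
TRUNCATECOLUMNS      -- truncate over-long CHAR/VARCHAR values instead of rejecting the row
ACCEPTINVCHARS '?';  -- replace invalid UTF-8 characters with '?'

-- Inspect the most recent rejected rows after a load.
SELECT starttime, filename, line_number, colname, err_reason
FROM stl_load_errors
ORDER BY starttime DESC
LIMIT 10;
```

In an interview answer it is worth noting the trade-off: a nonzero MAXERROR lets a load finish despite a few bad rows, but those rows are skipped, so checking STL_LOAD_ERRORS afterwards is part of the workflow.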
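The command is preferred over INSERT for large datasets because it loads data in parallel across the cluster's slices, whereas row-by-row INSERT statements are processed one at a time and can be prohibitively slow. It also provides error-handling options suited to bulk loads.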
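To make the parallelism point concrete, here is a minimal sketch that loads a set of split, gzip-compressed files listed in a manifest. The table name, bucket, and manifest path are hypothetical; the role ARN reuses the placeholder from the typical syntax.

```sql
-- Hypothetical names: sales_staging, example-bucket, sales.manifest.
-- The manifest is a JSON file in S3 that lists the exact objects to load,
-- for example several similarly sized parts of one export.
COPY sales_staging
FROM 's3://example-bucket/manifests/sales.manifest'
IAM_ROLE 'arn:aws:iam::account-id:role/RedshiftCopyRole'
FORMAT AS CSV
GZIP
MANIFEST;  -- treat the FROM path as a manifest, not as a data file or key prefix
```

Because the listed files are separate objects, the slices can typically read them concurrently, while a single large compressed file gives the cluster much less to divide up.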