Technical Entry Level

You receive a dataset with missing values, duplicates, and inconsistent formatting. Walk me through your data cleaning process.

Quick Tip

Always document your cleaning decisions. Future you (or your colleague) needs to know why you dropped those 200 rows.

What good answers include

Strong answers follow a systematic approach: profile the data first (counts, distributions, null rates), assess data quality issues column by column, choose a handling strategy (imputation, deletion, or flagging) based on the goal of the analysis, document every cleaning decision, and validate the cleaned data against known benchmarks, such as row counts or totals from a trusted source. The best candidates also mention reproducibility: the cleaning should live in a script that can be rerun, not in manual edits.
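A candidate could sketch the steps above in pandas. The following is a minimal illustration, not a definitive pipeline. The column names (`customer_id`, `city`, `amount`), the sample rows, and the choice of median imputation are all assumptions made for the example.

```python
import pandas as pd


def profile(df):
    """Per-column summary: dtype, share of nulls, number of distinct values."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "null_rate": df.isna().mean(),
        "n_unique": df.nunique(),
    })


def clean(df):
    """Return (cleaned DataFrame, list of logged cleaning decisions)."""
    log = []
    out = df.copy()  # never mutate the raw data

    # Normalize formatting first, so that near-duplicates become exact duplicates.
    out["city"] = out["city"].str.strip().str.title()

    before = len(out)
    out = out.drop_duplicates()
    log.append(f"dropped {before - len(out)} exact duplicate rows")

    # Flag missing values before imputing, so the imputation stays visible.
    # Median imputation is an assumption here; the right choice depends on the analysis goal.
    out["amount_was_missing"] = out["amount"].isna()
    median = out["amount"].median()
    out["amount"] = out["amount"].fillna(median)
    n_imputed = int(out["amount_was_missing"].sum())
    log.append(f"imputed {n_imputed} missing 'amount' values with median {median}")

    return out, log


# Hypothetical raw data with all three problems from the question.
raw = pd.DataFrame({
    "customer_id": [1, 2, 2, 3, 4],
    "city": ["london", " London", "London ", "PARIS", "paris"],
    "amount": [10.0, 20.0, 20.0, None, 40.0],
})

print(profile(raw))
cleaned, log = clean(raw)

# Validate: the issues targeted above should be gone.
assert cleaned["amount"].isna().sum() == 0
assert not cleaned.duplicated().any()

for entry in log:
    print(entry)
```

Keeping the decisions in `log` next to the code is one simple way to address both documentation and reproducibility: rerunning the script on the same raw file reproduces the same cleaned data and the same record of what was changed.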

What interviewers are looking for

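This question tests practical data skills. A commonly cited rule of thumb is that analysts spend as much as 80% of their time on data preparation and quality work. Red flags: candidates who assume data is always clean, or who delete rows with missing values without considering what that does to the analysis.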
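To see why deleting missing values without thought is a red flag, consider a small sketch in pandas. The survey-style columns and values below are hypothetical. Each column is missing only one value, yet dropping every row that contains any null discards half the data.

```python
import pandas as pd

# Hypothetical data: each column has a null in a different row.
df = pd.DataFrame({
    "age":    [34, None, 29, 41, 52, 38],
    "income": [52000, 61000, None, 48000, 75000, 58000],
    "region": ["N", "S", "E", None, "W", "N"],
})

per_column_null_rate = df.isna().mean()  # share of nulls in each column
rows_kept = len(df.dropna())             # rows that survive dropping any row with a null
share_lost = 1 - rows_kept / len(df)

print(per_column_null_rate)
print(f"rows kept: {rows_kept} of {len(df)}, share lost: {share_lost:.0%}")
```

A candidate who checks numbers like these before choosing between deletion, imputation, and flagging is showing the kind of judgment this question is designed to surface.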
