Missing values are among the most common obstacles in data cleaning, and the right fix depends on the dataset and on what you plan to do with it. This ChatGPT prompt helps data professionals and beginners alike by having the model ask about your dataset and goals first, then recommend techniques tailored to your answers. It is designed to produce personalized guidance on everything from basic imputation to machine learning-based methods, with practical code examples and tool suggestions.
Prompt
You will act as an expert data scientist to guide me through the process of cleaning a dataset with missing values. Explain the steps clearly and provide examples of techniques such as imputation, deletion, or advanced methods like machine learning-based approaches. Tailor your response to my communication style, which is concise and practical, and ensure the explanation is beginner-friendly yet thorough. Additionally, suggest tools or libraries (e.g., Pandas, NumPy, Scikit-learn) that can be used for each step and provide code snippets where applicable.
**In order to get the best possible response, please ask me the following questions:**
1. What type of dataset are you working with (e.g., structured, unstructured, time-series)?
2. What is the size of your dataset (number of rows and columns)?
3. What percentage of the data is missing? Are the missing values spread randomly or concentrated in specific columns?
4. Do you have any domain-specific knowledge that might influence how missing values should be handled?
5. Are there any constraints (e.g., time, computational resources) that I should consider when recommending methods?
6. What is the goal of your analysis (e.g., predictive modeling, descriptive statistics)?
7. Are you familiar with any programming languages or tools (e.g., Python, R, Excel)?
8. Do you prefer a step-by-step guide or a high-level overview of the process?
9. Should I focus on specific techniques (e.g., mean imputation, KNN imputation, or advanced methods)?
10. Are there any ethical or privacy considerations related to the dataset that I should be aware of?
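
For readers who want a feel for the techniques the prompt names before running it, here is a minimal Python sketch using pandas and scikit-learn. It is not part of the prompt and is not what ChatGPT will necessarily return. The toy DataFrame and its column names (`age`, `income`, `score`) are made up for illustration, and `n_neighbors=2` is an arbitrary choice for such a small table.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer, SimpleImputer

# Hypothetical toy data; the columns and values are invented for this sketch.
df = pd.DataFrame({
    "age": [25.0, np.nan, 31.0, 40.0, np.nan],
    "income": [50000.0, 62000.0, np.nan, 58000.0, 45000.0],
    "score": [0.7, 0.4, 0.9, np.nan, 0.5],
})

# Step 1: profile the missingness (fraction of missing values per column).
missing_share = df.isna().mean()

# Step 2a: deletion. Drop every row that contains at least one missing value.
dropped = df.dropna()

# Step 2b: mean imputation. Replace each gap with its column's mean.
mean_imputed = pd.DataFrame(
    SimpleImputer(strategy="mean").fit_transform(df),
    columns=df.columns,
)

# Step 2c: KNN imputation. Fill each gap from the most similar rows.
# n_neighbors=2 is an arbitrary value picked for this tiny example.
knn_imputed = pd.DataFrame(
    KNNImputer(n_neighbors=2).fit_transform(df),
    columns=df.columns,
)

print(missing_share)
print(f"rows before deletion: {len(df)}, after: {len(dropped)}")
print(mean_imputed)
print(knn_imputed)
```

Deletion is the simplest option but discards most of this toy table, which is why questions 2 and 3 (dataset size and how much is missing) matter when choosing a method. KNN imputation works on raw distances, so on real data with columns on very different scales you would likely want to scale the features first.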