11. How do you merge or join two DataFrames?
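Use pd.merge(df1, df2, on='key_column', how='inner'). The how option sets the join type:
⦁ how='inner' (default) keeps only keys present in both DataFrames (intersection).
⦁ how='left', how='right', or how='outer' keeps all rows from the left, the right, or both DataFrames, respectively.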
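As a minimal sketch of how the join types differ, the example below merges two small, made-up DataFrames. The names customers, orders, and customer_id are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical example data; column names are illustrative only.
customers = pd.DataFrame({"customer_id": [1, 2, 3], "name": ["Ann", "Bob", "Cy"]})
orders = pd.DataFrame({"customer_id": [2, 3, 4], "amount": [50, 75, 20]})

# Inner join: only customer_id values found in both frames (2 and 3).
inner = pd.merge(customers, orders, on="customer_id", how="inner")

# Left join: every customer is kept; customers without an order get NaN in "amount".
left = pd.merge(customers, orders, on="customer_id", how="left")

# Outer join: union of keys from both frames (1, 2, 3, 4).
outer = pd.merge(customers, orders, on="customer_id", how="outer")

print(inner)
print(left)
print(outer)
```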
12. What is the difference between .loc[] and .iloc[] in Pandas?
⦁ .loc[] selects data by label (index names).
⦁ .iloc[] selects data by integer position (0-based).
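The difference is easiest to see on a DataFrame whose index labels are not integers. The sketch below uses a made-up score column with string labels. Note that label slices with .loc[] include the end label, while position slices with .iloc[] exclude the end position.

```python
import pandas as pd

# Hypothetical data with string index labels.
df = pd.DataFrame({"score": [90, 85, 70]}, index=["a", "b", "c"])

by_label = df.loc["b", "score"]   # row labelled "b", column "score"
by_position = df.iloc[1, 0]       # second row, first column (same cell)

label_slice = df.loc["a":"b"]     # label slice includes "b": 2 rows
position_slice = df.iloc[0:1]     # position slice excludes index 1: 1 row

print(by_label, by_position)
print(label_slice)
print(position_slice)
```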
13. How do you handle duplicates in a DataFrame?
Use df.duplicated() to find duplicates and df.drop_duplicates() to remove them.
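A short sketch on made-up data: duplicated() returns a boolean mask marking repeats of an earlier row, and drop_duplicates() can also be limited to certain columns through its subset parameter.

```python
import pandas as pd

# Hypothetical data: rows 0 and 2 are identical.
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Ann", "Ann"],
    "city": ["Oslo", "Rome", "Oslo", "Paris"],
})

# True for rows that repeat an earlier row across all columns.
dup_mask = df.duplicated()

# Drop fully duplicated rows, keeping the first occurrence.
deduped = df.drop_duplicates()

# Treat rows as duplicates when only "name" matches.
by_name = df.drop_duplicates(subset="name", keep="first")

print(dup_mask.tolist())
print(deduped)
print(by_name)
```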
14. Explain how to deal with outliers in data.
Detect outliers using statistical methods like IQR or Z-score, then either remove, cap, or transform them depending on context.
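One possible way to apply the IQR method is sketched below on made-up values. The 1.5 multiplier is the common rule-of-thumb fence, an assumption here rather than a fixed requirement; whether to remove or cap depends on the data.

```python
import pandas as pd

# Hypothetical measurements with one obvious outlier.
values = pd.Series([10, 12, 11, 13, 12, 95])

q1 = values.quantile(0.25)
q3 = values.quantile(0.75)
iqr = q3 - q1

# Conventional fences: 1.5 * IQR beyond the quartiles (an assumed threshold).
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

is_outlier = (values < lower) | (values > upper)

removed = values[~is_outlier]                     # option 1: drop outliers
capped = values.clip(lower=lower, upper=upper)    # option 2: cap at the fences

print(is_outlier.tolist())
print(removed.tolist())
print(capped.tolist())
```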
15. What is data normalization and how can it be done in Python?
Normalization rescales data to a standard range (e.g., 0 to 1). It can be done with sklearn’s MinMaxScaler or manually with (x - min) / (max - min).
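The manual formula can be sketched with NumPy as follows; the input array is made up. MinMaxScaler with its default feature range should give the same values for a single column, but only the manual version is shown here.

```python
import numpy as np

# Hypothetical feature values.
x = np.array([2.0, 4.0, 6.0, 10.0])

x_min, x_max = x.min(), x.max()

# Min-max normalization: (x - min) / (max - min).
# Assumes x_max != x_min; a constant column would divide by zero.
normalized = (x - x_min) / (x_max - x_min)

print(normalized)
```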
16. Describe different data types in Python.
Common types: int, float, str, bool, list, tuple, dict, set, NoneType.
17. How do you convert data types in Pandas?
Use df['col'].astype(new_type) to convert columns, e.g., astype('int') or astype('category').
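A brief sketch on made-up columns. The last step uses pd.to_numeric with errors="coerce", which goes beyond astype() and is included here as one option for columns that contain unparseable values, since astype() would raise an error on them.

```python
import pandas as pd

# Hypothetical data where every column starts out as strings.
df = pd.DataFrame({
    "qty": ["1", "2", "3"],
    "grade": ["a", "b", "a"],
    "price": ["9.5", "n/a", "3"],
})

df["qty"] = df["qty"].astype("int64")          # strings to integers
df["grade"] = df["grade"].astype("category")   # repeated labels to categorical

# "n/a" cannot be parsed, so it is converted to NaN instead of raising.
df["price"] = pd.to_numeric(df["price"], errors="coerce")

print(df.dtypes)
```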
18. What are Python dictionaries and how are they useful?
Collections of key-value pairs with unique, hashable keys (insertion order is preserved since Python 3.7), useful for fast lookups, mapping, and structured data storage.
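A small sketch of typical dictionary use, with made-up keys and values: lookup by key, a default for missing keys, and counting occurrences.

```python
# Hypothetical price table.
prices = {"apple": 0.5, "pear": 0.75}
prices["kiwi"] = 1.2                  # add a new key-value pair

apple_price = prices["apple"]         # direct lookup by key
missing = prices.get("mango", 0.0)    # default value when the key is absent
keys_in_order = list(prices)          # keys come back in insertion order

# Counting occurrences is a common mapping task.
word_counts = {}
for word in ["a", "b", "a"]:
    word_counts[word] = word_counts.get(word, 0) + 1

print(apple_price, missing, keys_in_order, word_counts)
```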
19. How do you write efficient loops in Python?
Use list comprehensions, generator expressions, and built-in functions instead of traditional loops, or leverage libraries like NumPy for vectorization.
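The sketch below computes the same squares four ways, on a made-up list, to show what each alternative looks like next to an explicit loop.

```python
import numpy as np

numbers = list(range(1, 6))  # [1, 2, 3, 4, 5]

# Traditional loop, for comparison.
squares_loop = []
for n in numbers:
    squares_loop.append(n * n)

# List comprehension: same result in one expression.
squares_comp = [n * n for n in numbers]

# Generator expression with a built-in: no intermediate list is built.
total = sum(n * n for n in numbers)

# NumPy vectorization: the operation is applied to the whole array at once.
squares_np = np.array(numbers) ** 2

print(squares_comp, total, squares_np)
```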
20. Explain error handling in Python with try-except.
Wrap code that might raise an error in a try: block and handle specific exceptions in except: blocks, so that errors are managed gracefully instead of crashing the program.
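A minimal sketch, using a hypothetical helper named parse_int: only the exceptions that int() is expected to raise are caught, and the else: branch runs when no exception occurred.

```python
def parse_int(text, default=None):
    """Hypothetical helper: convert text to int, or return default on failure."""
    try:
        value = int(text)
    except (ValueError, TypeError):
        # ValueError: text such as "abc"; TypeError: input such as None.
        return default
    else:
        # Runs only when the try block raised no exception.
        return value


print(parse_int("42"))
print(parse_int("abc"))
print(parse_int(None, default=0))
```

Catching specific exception types, rather than using a bare except:, avoids hiding unrelated bugs.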