1. What is data skew? (A sketch of one common mitigation, salting, follows this list.)
2. What happens when a worker node dies?
3. How would you configure Spark for 1 TB of data? (A sample configuration follows this list.)
4. Which is better for ETL, EMR or Glue?
5. Which cluster manager do you use, and why?
6. Which business problems have you worked on?
7. What types of requests do you get from the BI and data science teams?
8. What types of records and columns does your database contain?
9. How does the Agile methodology work in your organization?
10. Where have you used stored procedures, views, and triggers?
11. How much time is generally required for a job processing 1 GB of data?
12. How many jobs are you currently maintaining?
13. What is the difference between cluster mode and client mode?
14. What optimization techniques do you use?
15. What errors have you faced in Glue jobs?
16. What is a DAG?
17. Have you faced an out-of-memory issue, and how did you resolve it?
18. On a daily basis, how many jobs do you create or handle?
19. How much data do you handle in a month? (A counter-question to my statement that we extract data twice a week from an RDS MySQL server.)
20. What libraries do you use in AWS Glue, and how do you install external libraries when needed? (A sketch follows this list.)
21. How does your team work, and what is your key role in it?
22. What transformations have you performed on data?
23. Did you personally write the script for every job, and how did you test them?
24. How much time do you generally spend on a single dataset, and what is the average dataset size?
25. How do you monitor job execution and worker nodes in Glue? (A sketch follows this list.)
26. What is an IAM policy?
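For question 1 (and the optimization question 14), one common answer is salting a skewed join key so a single hot key spreads across partitions. This is a minimal PySpark sketch; the table and column names (`events`, `user_dims`, `user_id`) and the bucket count are illustrative assumptions, not a fixed recipe.

```python
# Sketch for Q1/Q14: mitigating data skew on a hot join key by "salting".
# Table and column names (events, user_dims, user_id) are illustrative.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("skew-salting-demo").getOrCreate()

SALT_BUCKETS = 16  # tune to the degree of skew

events = spark.table("events")      # large table, skewed on user_id
dims = spark.table("user_dims")     # smaller dimension table

# Add a random salt to the skewed side so one hot key
# is split across SALT_BUCKETS partitions instead of one.
salted_events = events.withColumn("salt", (F.rand() * SALT_BUCKETS).cast("int"))

# Replicate the dimension side so every salt value has matching rows.
salted_dims = dims.crossJoin(
    spark.range(SALT_BUCKETS).withColumnRenamed("id", "salt")
)

joined = salted_events.join(salted_dims, ["user_id", "salt"]).drop("salt")
```

On Spark 3+, adaptive query execution (`spark.sql.adaptive.skewJoin.enabled`) can split many skewed join partitions automatically, so manual salting is mainly a fallback for cases AQE does not catch.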
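For question 3, a hedged starting point rather than a definitive answer: the numbers below are assumptions to tune against the actual cluster size, file format, and shuffle volume. The partition count comes from the rough rule of ~128 MB per partition (1 TB / 128 MB ≈ 8,000).

```python
# Sketch for Q3: one plausible starting configuration for a ~1 TB batch job.
# Every value here is an assumption to benchmark and adjust, not a recipe.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("one-tb-etl")
    # ~1 TB input at ~128 MB per partition is on the order of 8,000 partitions.
    .config("spark.sql.shuffle.partitions", "8000")
    .config("spark.executor.memory", "16g")
    .config("spark.executor.cores", "4")
    .config("spark.executor.memoryOverhead", "4g")
    # AQE coalesces small shuffle partitions and splits skewed ones at runtime.
    .config("spark.sql.adaptive.enabled", "true")
    .config("spark.sql.adaptive.skewJoin.enabled", "true")
    .getOrCreate()
)
```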
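For question 20, AWS Glue can install extra Python packages from PyPI through the documented `--additional-python-modules` default argument. In the sketch below the job name, IAM role ARN, script path, and package pins are placeholders.

```python
# Sketch for Q20: attaching external Python packages to a Glue job via the
# --additional-python-modules argument. Job name, role, script location,
# and package versions are placeholders.
import boto3

glue = boto3.client("glue")
glue.create_job(
    Name="example-etl-job",  # placeholder name
    Role="arn:aws:iam::123456789012:role/GlueJobRole",  # placeholder role
    Command={
        "Name": "glueetl",
        "ScriptLocation": "s3://example-bucket/scripts/etl.py",
        "PythonVersion": "3",
    },
    GlueVersion="4.0",
    DefaultArguments={
        # Glue installs these from PyPI when the job starts.
        "--additional-python-modules": "pyarrow==14.0.1,openpyxl",
    },
)
```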
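For question 25, job runs can be polled through the Glue API, and per-executor metrics go to CloudWatch when metrics are enabled on the job. A minimal sketch, assuming a job named `example-etl-job`:

```python
# Sketch for Q25: checking the state of recent Glue job runs
# (the job name is a placeholder).
import boto3

glue = boto3.client("glue")
resp = glue.get_job_runs(JobName="example-etl-job", MaxResults=10)
for run in resp["JobRuns"]:
    print(run["Id"], run["JobRunState"], run.get("ErrorMessage", "-"))
```

Setting the `--enable-metrics` default argument on the job publishes executor CPU and memory metrics to CloudWatch, which covers the "worker nodes" half of the question.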