Interviewed with a lead engineer on Spark, although the recruiter had said it would be on Python. Not a big deal. I was asked to do a group-by aggregation. I did a groupBy and sum and joined the results. The interviewer asked if I could do it without a join, so I did it with a sum over a partition. The interviewer asked if there was another way. I added a withColumn statement and put the groupBy statement as the 2nd argument. The interviewer told me it was starting to look good, although that's not possible in PySpark. The recruiter let me know I failed the tech screen.
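For readers unfamiliar with the first two approaches mentioned in this review, here is a minimal sketch. It assumes the task was to attach a per-group total to every row, which the review does not state outright. To stay runnable without a Spark cluster, it uses SQL through Python's built-in sqlite3 module rather than PySpark; the `sales` table and its columns are made up for illustration, and window functions assume SQLite 3.25 or newer. In PySpark the window form is usually written with `F.sum(...).over(Window.partitionBy(...))`.

```python
import sqlite3

# Hypothetical data: a made-up `sales` table, not from the interview.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 10), ("east", 5), ("west", 7)],
)

# Approach 1: aggregate with GROUP BY, then join the totals back to each row.
join_rows = conn.execute("""
    SELECT s.region, s.amount, t.total
    FROM sales AS s
    JOIN (SELECT region, SUM(amount) AS total
          FROM sales
          GROUP BY region) AS t
      ON s.region = t.region
    ORDER BY s.region, s.amount
""").fetchall()

# Approach 2: no join, a sum over a partition (window function).
window_rows = conn.execute("""
    SELECT region, amount,
           SUM(amount) OVER (PARTITION BY region) AS total
    FROM sales
    ORDER BY region, amount
""").fetchall()

print(join_rows)
print(window_rows)
```

Both queries return one row per input row with the group total attached; the window version avoids the self-join.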
Interview questions [1]
Question 1
What is the difference between an inner join and a left join?
How would you dedup a table?
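A minimal sketch of possible answers to these two questions, not necessarily what the interviewer was looking for. It uses Python's built-in sqlite3 module; the `users`, `orders`, and `events` tables are hypothetical, and `ROW_NUMBER()` assumes SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables for illustration only.
    CREATE TABLE users  (id INTEGER, name TEXT);
    CREATE TABLE orders (user_id INTEGER, item TEXT);
    INSERT INTO users  VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO orders VALUES (1, 'book');

    CREATE TABLE events (id INTEGER, updated INTEGER, val TEXT);
    INSERT INTO events VALUES
        (1, 1, 'old'), (1, 2, 'new'), (2, 1, 'only'), (2, 1, 'only');
""")

# Inner join: only users that have a matching order.
inner_rows = conn.execute("""
    SELECT u.name, o.item
    FROM users AS u
    INNER JOIN orders AS o ON o.user_id = u.id
    ORDER BY u.id
""").fetchall()

# Left join: every user; unmatched users get NULL (None) for order columns.
left_rows = conn.execute("""
    SELECT u.name, o.item
    FROM users AS u
    LEFT JOIN orders AS o ON o.user_id = u.id
    ORDER BY u.id
""").fetchall()

# Dedup option 1: drop rows that are exact duplicates.
distinct_rows = conn.execute("""
    SELECT DISTINCT id, updated, val
    FROM events
    ORDER BY id, updated
""").fetchall()

# Dedup option 2: keep one row per key, here the most recently updated.
latest_rows = conn.execute("""
    SELECT id, updated, val
    FROM (SELECT *,
                 ROW_NUMBER() OVER (PARTITION BY id
                                    ORDER BY updated DESC) AS rn
          FROM events)
    WHERE rn = 1
    ORDER BY id
""").fetchall()
```

Which dedup option fits depends on whether "duplicate" means an identical row or a repeated key, which is worth asking the interviewer to clarify.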
I applied through a recruiter. The process took 3 weeks. I interviewed at Walt Disney Company (San Francisco, CA) in Mar 2026
Interview
The interview process started with a recruiter screening discussing my background, Spark/PySpark experience, SQL skills, and large-scale ETL pipeline development. The next round was a technical interview focused on Python, SQL, data modeling, and distributed data processing concepts. I was asked about Spark optimizations, partitioning, joins, and handling large datasets in production environments. There was also discussion around real-world data pipeline troubleshooting and system design. T
This whole process was a nightmare. Every single person I would be working with greeted me without any video, and I could barely get any kind of introduction from them. It started with the HM, who was nice enough, then a technical round with disembodied voices, then senior leadership.
A building has 100 floors. One of the floors is the highest floor an egg can be dropped from without breaking.
If an egg is dropped from above that floor, it will break. If it is dropped from that floor or below, it will be completely undamaged and you can drop the egg again.
Given two eggs, find the highest floor an egg can be dropped from without breaking, with as few drops as possible.
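One well-known way to approach this puzzle, sketched here as an illustration rather than the answer the interviewer expected: drop the first egg at shrinking intervals (n, then n-1 floors higher, and so on), and once it breaks, scan the remaining gap floor by floor with the second egg. The smallest n with n(n+1)/2 >= 100 is 14, so the worst case is 14 drops. The function names and the `survives` callback below are hypothetical.

```python
from typing import Callable, Tuple


def min_drops(floors: int) -> int:
    """Smallest n such that n + (n-1) + ... + 1 >= floors."""
    n = 0
    while n * (n + 1) // 2 < floors:
        n += 1
    return n


def find_highest_safe_floor(
    floors: int, survives: Callable[[int], bool]
) -> Tuple[int, int]:
    """Return (highest safe floor, number of drops used) with two eggs.

    `survives(f)` is a hypothetical callback: True if an egg dropped
    from floor f does not break.
    """
    step = min_drops(floors)
    safe = 0   # highest floor known to be safe so far
    drops = 0
    while step > 0 and safe < floors:
        floor = min(safe + step, floors)
        drops += 1
        if survives(floor):
            # First egg intact: jump up by one floor fewer next time.
            safe = floor
            step -= 1
        else:
            # First egg broke: scan the gap upward with the second egg.
            for f in range(safe + 1, floor):
                drops += 1
                if not survives(f):
                    return f - 1, drops
            return floor - 1, drops
    return safe, drops


if __name__ == "__main__":
    answer, used = find_highest_safe_floor(100, lambda f: f <= 37)
    print(answer, used)
```

Shrinking the interval by one after each surviving drop keeps the total (first-egg drops plus the linear scan) bounded by the same n no matter where the egg breaks.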