Manages and optimizes large datasets to support machine learning and AI applications. Requires proficiency in data storage technologies and distributed computing.