– A list is a data structure in Python that holds an ordered collection of items. It allows you to store multiple values in a single variable, and the items within a list are indexed, meaning each item can be accessed by its position (index). Lists are mutable, so their content can be changed after they are created.
– The map() function in Python applies a given function to each item in an iterable (such as a list or tuple) and returns an iterator that yields the results. It is useful when you want to perform an operation on each item of a sequence, such as transforming or processing data.
– A data lake is a centralized repository that allows you to store structured, semi-structured, and unstructured data at scale. Unlike traditional data warehouses, data lakes store raw, unprocessed data, which can later be transformed and analyzed as needed.
– A data pipeline is a series of steps or stages used to collect, process, and transform data before loading it into a storage system, database, or data warehouse. It automates data workflows from extraction to transformation to loading (ETL).