
Answer:
A business intelligence (BI) analyst collects, processes, and analyzes business data to provide actionable insights. They use tools and techniques like dashboards, reports, and visualizations to inform decision-making and improve business strategies.
Answer:
A pivot table is used to summarize, analyze, and explore datasets, allowing users to reorganize and aggregate data in different ways. It’s especially useful in tools like Excel to quickly extract meaningful insights.
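The aggregation a pivot table performs can be sketched in plain Python. This is a minimal illustration, not how Excel implements it; the `rows` data and the `pivot_sum` helper are hypothetical names introduced here.

```python
from collections import defaultdict

# Hypothetical sales records (illustrative data only).
rows = [
    {"region": "East", "product": "A", "sales": 100},
    {"region": "East", "product": "B", "sales": 150},
    {"region": "West", "product": "A", "sales": 200},
    {"region": "West", "product": "B", "sales": 50},
    {"region": "East", "product": "A", "sales": 25},
]


def pivot_sum(records, index, columns, values):
    """Sum `values` for every (index, columns) pair, like a pivot table."""
    table = defaultdict(lambda: defaultdict(float))
    for rec in records:
        table[rec[index]][rec[columns]] += rec[values]
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {row: dict(cols) for row, cols in table.items()}


pivot = pivot_sum(rows, index="region", columns="product", values="sales")
for region, totals in sorted(pivot.items()):
    print(region, totals)
```

Swapping the `index` and `columns` arguments reorganizes the same data by product instead of by region, which is the "reorganize and aggregate" idea in miniature.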
Answer:
A Markov chain is a mathematical model of a system that moves from one state to another according to fixed transition probabilities. It’s used in predictive modeling when the next state depends only on the current state, not on the earlier history (the Markov property).
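As a rough sketch of the idea, the snippet below advances a probability distribution over states one step at a time. The two-state weather model, its probabilities, and the `step` helper are assumptions made up for illustration.

```python
# Hypothetical two-state weather model; the probabilities are made up.
transition = {
    "sunny": {"sunny": 0.9, "rainy": 0.1},
    "rainy": {"sunny": 0.5, "rainy": 0.5},
}


def step(distribution, transition):
    """Advance a probability distribution over states by one time step."""
    nxt = {state: 0.0 for state in transition}
    for current, p_current in distribution.items():
        for target, p_move in transition[current].items():
            # Only the current state matters, never the earlier history.
            nxt[target] += p_current * p_move
    return nxt


# Start from a day that is certainly sunny and look two days ahead.
dist = {"sunny": 1.0, "rainy": 0.0}
for _ in range(2):
    dist = step(dist, transition)
print(dist)
```

Each call to `step` uses only the current distribution, which is exactly the "no memory" assumption described above.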
Answer:
Data warehousing refers to the process of collecting, storing, and managing large amounts of data from different sources in a central repository, typically for reporting and analysis. It supports decision-making by organizing the data and making it accessible.
Answer:
A dashboard is a visual representation of key metrics and data points, providing an at-a-glance view of business performance. Dashboards are often interactive and allow stakeholders to make data-driven decisions quickly.
Answer:
Data governance refers to the management of data availability, usability, integrity, and security within an organization. It includes policies, standards, and procedures to ensure data is accurate, consistent, and properly utilized across the enterprise.
Answer:
Clustering is a type of unsupervised learning where the data is grouped into clusters based on similarity. The goal is to find natural groupings within the data, such as customer segmentation. Algorithms like K-means and DBSCAN are commonly used.
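To make the grouping step concrete, here is a minimal K-means sketch in plain Python. The sample points, the fixed starting centroids, and the `kmeans` helper are assumptions for illustration; library implementations add smarter initialization and convergence checks.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Plain K-means: alternate assignment and centroid-update steps."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its points.
        updated = []
        for old, members in zip(centroids, clusters):
            if members:
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                updated.append(old)  # keep the centroid of an empty cluster
        centroids = updated
    return centroids, clusters


# Hypothetical 2-D data with two visually separate groups.
points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9), (8.5, 9.5)]
centroids, clusters = kmeans(points, centroids=[(0, 0), (10, 10)])
print(centroids)
print(clusters)
```

The starting centroids are fixed here so the result is repeatable; in practice they are usually chosen at random or with a seeding heuristic.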
Answer:
Feature scaling puts features on a comparable scale by normalizing them (for example, min-max scaling to [0, 1]) or standardizing them (zero mean, unit variance). Without scaling, features with larger ranges can dominate, which skews results, especially in distance-based algorithms like K-means or KNN.
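The two common approaches can be sketched as follows. The `incomes` values and both helper names are hypothetical, and the sketch assumes the feature is not constant (otherwise the denominators would be zero).

```python
import statistics


def min_max_scale(values):
    """Normalize values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]  # assumes hi > lo


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # assumes std > 0
    return [(v - mean) / std for v in values]


# Hypothetical feature with a large numeric range.
incomes = [30_000, 45_000, 60_000, 90_000]
print(min_max_scale(incomes))
print(standardize(incomes))
```

After either transformation, an income column no longer outweighs a feature such as age simply because its raw numbers are larger.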
Answer:
A Receiver Operating Characteristic (ROC) curve is a graphical representation of the diagnostic ability of a binary classifier as its decision threshold varies. It plots the true positive rate (sensitivity) against the false positive rate (1 - specificity). The area under the ROC curve (AUC) summarizes performance across all thresholds: 0.5 corresponds to random guessing and 1.0 to ranking every positive above every negative.
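A small sketch of how the curve and its area can be computed by hand. The labels and scores are made-up data, `roc_points` and `auc` are hypothetical helper names, and the code assumes no two examples share the same score (ties would need to be grouped).

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points obtained by sweeping the decision threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    # Visit examples from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, label in ranked:
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical classifier output: 1 = positive class, 0 = negative class.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(labels, scores)
print(curve)
print(auc(curve))
```

Here one negative (score 0.7) outranks one positive (score 0.6), so 8 of the 9 positive-negative pairs are ordered correctly and the AUC comes out at 8/9.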
Answer:
Data wrangling is the process of cleaning, restructuring, and enriching raw data into a usable format for analysis. It includes handling missing values, removing outliers, merging datasets, and transforming data.
Answer:
Smoothing techniques in time-series analysis (like moving averages or exponential smoothing) reduce noise in the data so that underlying trends are easier to identify, which can make forecasts more accurate.
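Both techniques named above fit in a few lines. This is a minimal sketch: the `demand` series, the window size, and the smoothing factor `alpha` are assumptions chosen for illustration.

```python
def moving_average(series, window):
    """Simple moving average over a trailing window of fixed size."""
    return [sum(series[i - window + 1 : i + 1]) / window
            for i in range(window - 1, len(series))]


def exponential_smoothing(series, alpha):
    """Simple exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_(t-1)."""
    smoothed = [series[0]]  # seed with the first observation
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical noisy demand series.
demand = [10, 12, 9, 14, 11, 13]
print(moving_average(demand, window=3))
print(exponential_smoothing(demand, alpha=0.5))
```

A larger `window` or a smaller `alpha` smooths more aggressively, at the cost of reacting more slowly to real changes in the series.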
Answer:
Data bias refers to systematic errors in how data is collected or represented, which lead to incorrect conclusions. It can arise from non-representative sampling, flawed measurement or collection processes, or historical biases already embedded in the data.
Answer:
An ensemble method combines predictions from multiple models to improve accuracy and robustness. Common ensemble techniques include bagging (e.g., Random Forest), boosting (e.g., Gradient Boosting), and stacking.
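The core idea of combining predictions can be sketched with hard (majority) voting, the simplest combiner. This is not bagging, boosting, or stacking themselves; the three rule-based classifiers, their thresholds, and the `message` features are hypothetical.

```python
from collections import Counter


# Three hypothetical weak classifiers; thresholds are made up for illustration.
def model_a(x):
    return "spam" if x["links"] > 3 else "ham"


def model_b(x):
    return "spam" if x["caps_ratio"] > 0.5 else "ham"


def model_c(x):
    return "spam" if x["length"] < 20 else "ham"


def majority_vote(models, x):
    """Return the label predicted by the largest number of models."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


message = {"links": 5, "caps_ratio": 0.2, "length": 12}
prediction = majority_vote([model_a, model_b, model_c], message)
print(prediction)
```

Two of the three models vote "spam" for this message, so the ensemble predicts "spam" even though `model_b` disagrees; that tolerance for individual errors is where the robustness comes from.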
Answer:
Advantages:
Disadvantages: