National Institute of Technology Rourkela

राष्ट्रीय प्रौद्योगिकी संस्थान राउरकेला

ଜାତୀୟ ପ୍ରଯୁକ୍ତି ପ୍ରତିଷ୍ଠାନ ରାଉରକେଲା

An Institute of National Importance

Syllabus

Course Details

Subject {L-T-P / C} : CS6315 : Data Science {3-0-0 / 3}

Subject Nature : Theory

Coordinator : Prof. Bibhudatta Sahoo

Syllabus

Mathematical Foundations: Introduction to data, summarization, Probability and Statistics: distributions, Inference and Markov Chains, Linear Algebra and Vector Calculus
Data Warehousing: Data Preprocessing, Warehouse Architecture, ETL, OLAP, Data Lakes, Big Data Pipeline, Job Scheduling
Descriptive Modeling: k-means, hierarchical clustering, DBSCAN, Outliers, Association Rule Mining: Apriori and FP-Tree, Objective Measures of Interestingness
Predictive Modeling: Regression, Decision Tree, SVM, Ensemble of Classifiers, CNN, R-CNN, RNN, LSTM, GRU, Advanced predictive models
Time Series Data Analysis: Introduction to Time Series, Correlation, Forecasting (Univariate): Autoregressive Moving Average (ARMA) models, Autoregressive Integrated Moving Average (ARIMA) models, SARIMA, Prophet Model, Forecasting (Multivariate): MARS, Deep Learning Architectures, Forecasting (spatio-temporal)
Advanced Topics: Recommender Systems: Content-based methods, Collaborative approaches, Web & Social Media Analytics: Information Retrieval, Link analysis, Text Mining, Security and Privacy, Data Governance
Data Visualization and Condensation: Introduction to Data Visualization, Basic charts and dashboards, Descriptive Statistics, Dimensions and Measures, Visual analytics, Dashboard design & principles, Advanced design components/principles: Enhancing the power of dashboards, Special chart types

Course Objectives

  • Extract valuable information from data for use in strategic decision making, product development, trend analysis, and forecasting.
  • Apply quantitative modeling and data analysis techniques to solve real-world business problems, communicate findings, and present results effectively using data visualization techniques.
  • Employ cutting-edge tools and technologies to analyze Big Data.

Course Outcomes

1. Develop an in-depth understanding of the key technologies in data science and business analytics: data mining, machine learning, visualization techniques, predictive modeling, and statistics.
2. Practice problem analysis and decision-making.
3. Gain practical, hands-on experience with statistical programming languages and big data tools through coursework and applied research experiences.
4. Apply data science concepts and methods to solve problems in real-world contexts and communicate these solutions effectively.

Essential Reading

  • Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Introduction to Data Mining, Pearson
  • Laura Igual and Santi Seguí, Introduction to Data Science, Springer

Supplementary Reading

  • Davy Cielen, Arno Meysman, Mohamed Ali, Introducing Data Science, Manning
  • Andreas François Vermeulen, Practical Data Science, Apress