Equip yourself with the essentials of informed decision-making with this practical guide to mastering data-driven modeling and extracting actionable, meaningful patterns from the vast sea of modern data.
Table of Contents
Preface
1. Fundamentals of Data Analysis and Preprocessing
Sudipta Hazra and Arindam Mondal
1.1 Introduction
1.2 Data Preprocessing
1.2.1 Issues with Data
1.2.1.1 Excessive Data
1.2.1.2 Too Little Data
1.2.1.3 Splintered Data
1.2.2 Setting Up for DA
1.2.2.1 Recognizing the Types of Data
1.2.2.2 Preparing Data for Detailed DA
1.3 Strategies for Preparing Data
1.3.1 Transforming Data
1.3.1.1 Filtering Data
1.3.1.2 Data Arranging
1.3.1.3 Editing Data
1.3.1.4 Modeling Noise
1.3.2 Information Compilation
1.3.2.1 Data Visualization
1.3.2.2 Data Elimination
1.3.2.3 Data Selection
1.3.2.4 Analysis of Principal Components
1.3.2.5 Data Sampling
1.3.3 Production of Novel Information
1.3.3.1 Including Extra Features
1.3.3.2 Data Fusion
1.3.3.3 Time–Series Analysis
1.3.3.4 Information Modeling
1.3.3.5 Dimensional Analysis
1.4 Real-World Applications
1.4.1 Machine Learning and Predictive Analytics
1.4.2 Healthcare and Biomedical Research
1.4.3 Financial Analysis and Risk Management
1.4.4 Marketing and Customer Analytics
1.4.5 Supply Chain Management and Logistics
1.4.6 Environmental Monitoring and Sustainability
1.5 Conclusion
References
2. Advanced Data Control Methods for Data-Driven Modeling: Techniques, Challenges, and Future Directions
Aarushi Chatterjee and Souvik Ganguli
2.1 Introduction
2.2 Related Works
2.2.1 Data Quality and Preprocessing
2.2.2 Data Governance and Control in Distributed Systems
2.2.3 Data Privacy and Security
2.2.4 Model Predictive Control and Data-Driven Approaches
2.2.5 Data Drift and Adaptive Control
2.3 Data Control Architecture in Modeling
2.3.1 Centralized versus Decentralized Data Control
2.3.1.1 Centralized Data Control
2.3.1.2 Decentralized Data Control
2.3.2 Automated Data Governance
2.3.2.1 Metadata Management
2.3.2.2 Data Provenance and Lineage
2.3.2.3 Policy Enforcement Engines
2.3.3 Real-Time Data Control in Streaming and Dynamic Systems
2.3.3.1 Windowing and Stream Processing
2.3.3.2 Adaptive Sampling and Real-Time Data Filtering
2.3.3.3 Real-Time Model Retraining
2.3.4 Emerging Trends in Data Control Architecture
2.3.4.1 Federated Learning for Data Control
2.3.4.2 Blockchain for Data Integrity and Control
2.4 Advanced Techniques for Data Control
2.4.1 Data-Driven Control Strategies
2.4.1.1 Model Predictive Control
2.4.1.2 RL for Data-Driven Control
2.4.1.3 Adaptive Control Systems
2.4.2 Control of Streaming Data
2.4.2.1 Sliding Windows and Stream Processing Frameworks
2.4.2.2 Approximate Query Processing
2.4.2.3 Online Learning for Streaming Data
2.4.3 Handling Dynamic and Evolving Data Environments
2.4.3.1 Adaptive Learning Models
2.4.3.2 Handling Data Drift and Concept Drift
2.4.4 Advanced Real-Time Data Governance
2.4.4.1 Automated Policy Enforcement
2.4.4.2 Dynamic Access Control
2.5 Challenges in Data Control for Modeling
2.5.1 Scalability Issues
2.5.1.1 Data Volume and Velocity
2.5.1.2 Horizontal versus Vertical Scaling
2.5.2 Data Drift and Concept Drift
2.5.2.1 Types of Drift
2.5.2.2 Challenges in Detecting Drift
2.5.2.3 Model Adaptation
2.5.3 Real-Time Data Control
2.5.3.1 Latency Issues
2.5.3.2 Synchronization and Consistency
2.5.4 Data Privacy and Security
2.5.4.1 Data Anonymization and Differential Privacy
2.5.4.2 Data Encryption and Secure Computation
2.5.5 Collaborative Data Control
2.5.5.1 Data Sharing Across Organizations
2.5.5.2 Version Control and Auditing
2.6 Best Practices for Data Control in Data-Driven Modeling
2.6.1 Data Versioning and Auditing
2.6.1.1 Data Versioning
2.6.1.2 Auditing
2.6.2 Collaborative Data Control
2.6.2.1 Role-Based Access Control
2.6.2.2 Data Sharing and Federation
2.6.3 Metadata Management for Governance and Provenance
2.6.3.1 Automated Metadata Generation
2.6.3.2 Data Provenance and Lineage Tracking
2.6.4 Automation in Data Governance
2.6.4.1 Automated Policy Enforcement
2.6.4.2 Automated Compliance Monitoring
2.7 Case Studies in Data Control Methods
2.7.1 Real-Time Data Control in AVs
2.7.2 Data Governance and Privacy in Healthcare
2.7.3 Collaborative Data Sharing in Financial Services
2.7.4 Data Control in Smart Energy Grids
2.7.5 Big Data Control in E-Commerce
2.8 Future Directions in Data Control
2.8.1 Decentralized and Distributed Data Control
2.8.1.1 Edge Computing and Data Control at the Edge
2.8.1.2 Blockchain for Decentralized Data Control
2.8.2 Privacy-Preserving Data Control
2.8.2.1 Differential Privacy
2.8.2.2 Homomorphic Encryption and Secure Computation
2.8.3 Real-Time Adaptive Data Control
2.8.3.1 AI-Driven Data Control
2.8.3.2 Context-Aware Data Control
2.8.4 Federated Learning and Collaborative Data Control
2.8.4.1 Federated Learning at Scale
2.8.4.2 Federated Governance and Data Control
2.8.5 Quantum Computing and Its Impact on Data Control
2.8.5.1 Quantum Cryptography for Data Security
2.8.5.2 Quantum Machine Learning for Data Control
2.9 Concluding Remarks
References
3. Machine Learning Algorithms for Data-Driven Modeling
Souryadip Ghosh, Indrani Mukherjee and Suparna Biswas
3.1 Introduction
3.2 What is Machine Learning?
3.3 Classification of Machine Learning Methods
3.3.1 Supervised Learning
3.3.2 Unsupervised Learning
3.3.3 Reinforcement Learning
3.4 Supervised Machine Learning
3.4.1 Decision Tree for Classification
3.4.2 C4.5
3.4.3 CART
3.4.4 CHAID
3.4.5 Iterative Dichotomizer 3
3.5 Support Vector Machine
3.5.1 SVM for Linear Classification
3.5.2 SVM for Nonlinear Classification
3.5.3 Kernel
3.5.4 Unsupervised Machine Learning
3.5.5 Clustering
3.5.6 K-Means
3.6 Hierarchical Clustering
3.6.1 Methodologies for Determining the Optimal Number of Clusters
3.6.2 Dimensionality Reduction
3.6.3 t-Distributed Stochastic Neighbor Embedding
3.6.4 Multidimensional Scaling
3.7 Principal Component Analysis
3.8 Conclusion
Bibliography
4. Neural Networks and Deep Learning in Data-Driven Modeling
Tanishka Chakraborty, Indrani Mukherjee and Suparna Biswas
4.1 Introduction
4.2 Basic Concept of Neural Network and Deep Learning
4.2.1 Characteristics of Neural Network
4.2.2 Characteristics of Deep Learning
4.3 Applications of Neural Networks and Deep Learning in Data-Driven Modeling
4.3.1 Image Recognition
4.3.2 Natural Language Processing
4.3.3 Time–Series Prediction
4.3.4 Recommender Systems
4.3.5 Anomaly Detection
4.3.6 Generative Adversarial Networks
4.3.7 Autonomous Driving
4.3.8 Health Monitoring Using Wearable Devices
4.3.9 Attention Mechanisms in NLP
4.3.10 Brain–Computer Interface
4.3.11 Fault Diagnosis in Industrial Systems
4.3.12 Speech Recognition
4.3.13 Cybersecurity Applications
4.3.14 Energy Consumption Forecasting
4.3.15 Human Activity Recognition
4.4 Techniques of Neural Networks and Deep Learning in Data-Driven Modeling
4.4.1 Convolutional Neural Networks
4.4.2 Recurrent Neural Networks
4.4.3 Long Short-Term Memory Networks
4.4.4 Autoencoders
4.4.5 Generative Adversarial Networks
4.4.6 Deep Reinforcement Learning
4.4.7 Transfer Learning
4.4.8 Data Augmentation
4.5 Methods of Neural Networks and Deep Learning in Data-Driven Modeling
4.5.1 Backpropagation
4.5.2 Data Augmentation
4.5.3 Hyperparameter Optimization
4.5.4 Ensemble Learning
4.5.5 Attention Mechanisms
4.5.6 Capsule Networks
4.5.7 Neuroevolution
4.6 Conclusion
Bibliography
5. Advances in Time-Series Analysis: Techniques and Applications for Predictive Forecasting
A. UmaDevi, Jagendra Singh, Shrinwantu Raha, Nazeer Shaik, Anil V. Turukmane and Ishaan Singh
5.1 Introduction
5.1.1 Definition and Conceptual Framework
5.1.2 Importance and Applications
5.2 Foundational Techniques in TSA
5.2.1 AR Models
5.2.2 MA Models
5.2.3 ARIMA Models
5.2.4 Exponential Smoothing Methods
5.2.5 Seasonal Decomposition of Time Series
5.2.6 State Space Models and Kalman Filtering
5.2.7 Spectral Analysis and Fourier Transform
5.2.8 ML Techniques
5.3 Applications of TSA
5.3.1 Economic and Financial Forecasting
5.3.2 Healthcare and Epidemiology
5.4 Future Directions and Emerging Trends
5.4.1 Deep Learning and Neural Networks
5.4.2 Probabilistic Forecasting
5.4.3 Anomaly Detection and Outlier Analysis
5.4.4 Interpretable and Explainable Models
5.4.5 Multivariate and High-Dimensional TSA
5.4.6 Integration with Domain-Specific Knowledge
5.4.7 Ethical and Fair TSA
5.4.8 Automated ML for Time Series
5.4.9 Continuous Learning and Model Adaptation
5.5 Conclusion
References
6. Ensemble Methods for Data-Driven Modeling in Agriculture and Applications
Khalil Ahmed, Mithilesh Kumar Dubey, Kajal and Devendra Kumar Pandey
6.1 Introduction
6.1.1 Data Analysis Solutions for Data Modeling in Agriculture
6.2 Data-Driven Agriculture Cycle
6.3 Cloud-Based Event and Data Management in Data-Driven Modeling
6.4 Ensemble Methods for Data-Driven Modeling in Agriculture
6.4.1 Random Forest
6.4.2 Gradient-Boosting Machines
6.4.2.1 Loss Function
6.4.2.2 Weak Learners
6.4.2.3 Additive Model
6.4.3 AdaBoost
6.4.3.1 XGBoost
6.4.4 Bagging
6.4.5 Boosting
6.5 Applications of Data Modeling in Agriculture
6.5.1 Field and Resource Management
6.5.2 Environmental Sustainability and Food Safety
6.5.3 Crop Yield Prediction
6.5.4 Agriculture Market and Associated Risk Management
6.6 Conclusion and Future Directions
References
7. Artificial Intelligence–Enabled Ensemble Machine Learning Approaches for Solanaceae Crops
Kajal, Mithilesh Kumar Dubey, Khalil Ahmed and Devendra Kumar Pandey
7.1 Introduction
7.2 Overview of Solanaceae Crops
7.3 Data Modeling in Agriculture
7.3.1 Life Cycle of Data Modeling
7.3.1.1 Conceptual Data Model
7.3.1.2 Logical Data Model
7.4 Ensemble Machine Learning Methods in Sustainable Farming
7.4.1 Basic Ensemble Learning Techniques
7.4.1.1 Max Voting
7.4.1.2 Averaging
7.4.1.3 Weighted Average
7.4.2 Advanced Ensemble Learning Techniques
7.4.2.1 Stacking
7.4.2.2 Blending
7.4.2.3 Boosting
7.5 Application of Data Modeling and Ensemble Learning in Solanaceae Crops
7.5.1 Disease Detection and Diagnosis
7.5.2 Yield Prediction and Optimization
7.5.3 Supply Chain Optimization
7.6 Conclusion and Future Directions
References
8. Dynamic Multitask Transfer Learning with Adaptive Feature Sharing for Heterogeneous Data and Continual Learning
Toufique Ahammad Gazi
Introduction
Methodology
Conclusion
References
9. Forecasting Solar Power Generation in the Future by ARIMA Approach and Stationary Transformation
Sudeep Samanta
Introduction
Conclusion
References
10. Prognosticating Plays: ANN-Enabled Score Projection with the Help of FIS
Susmit Chakraborty and Sourish Harh
10.1 Introduction
10.2 System Model
10.3 ANFIS Controller
10.3.1 Layer 1
10.3.2 Layer 2
10.3.3 Layer 3
10.3.4 Layer 4
10.3.5 Layer 5
10.4 Results and Analysis
10.4.1 Data Preprocessing in Jupyter Notebook
10.4.2 ANFIS Model Building in MATLAB 2020A
10.4.3 Score Predictor Model Evaluation
10.5 Conclusion
References
11. Designing a PID Controller for the Two-Area LFC Problem Using Gradient Descent–Based Linear Regression
Susmit Chakraborty and Arindam Mondal
11.1 Introduction
11.2 Plant Model
11.3 PID Controller
11.4 LR Model
11.5 Result Analysis
11.5.1 ML Phase in Jupyter Notebook
11.5.2 Simulation Phase in MATLAB
11.6 Conclusion
Appendix
References
12. Implementing PID Controllers for Data-Driven Recognizing for a Nonlinear System
Susmit Chakraborty and Sagnik Agasti
12.1 Introduction
12.2 System Model
12.3 Nonlinear System
12.4 ML Engine
12.5 Result Analysis
12.6 Conclusion
References
13. Temporal Resilience Redux: BiLSTM for Short-Term Load Forecasting in Deep Learning Domain
Ritu K. R.
13.1 Introduction
13.2 Literature Review
13.3 Recurrent Neural Networks and LSTM
13.3.1 Architecture and Functioning of LSTM
13.3.2 LSTM versus RNN
13.4 Bidirectional LSTM
13.4.1 Bidirectional, Multilayer Stacked LSTM NN
13.4.2 Multilayer Stacked LSTM Bidirectional NN for Short-Term Load Forecasting
13.4.3 Multilayer BiLSTM Stacked NN
13.4.4 Load Forecasting of Multilayer Stacked BiLSTM
13.5 Experimental Settings
13.6 Conclusion
References
Index