Applied Computer Vision Through Artificial Intelligence

Edited by Jasminder Kaur Sandhu, Abhishek Kumar, Rakesh Sahu, and Sachin Ahuja
Copyright: 2025 | Status: Published
ISBN: 9781394272594 | Hardcover | 494 pages
Price: $225 USD

One Line Description
Master the cutting-edge field of computer vision and artificial intelligence with this accessible guide to the applications of machine learning and deep learning for real-world solutions in robotics, healthcare, and autonomous systems.

Audience
Researchers, academics, engineers, developers, and industry professionals working in computer science, artificial intelligence, and data science.

Description
Applied Computer Vision through Artificial Intelligence provides a thorough and accessible exploration of how machine learning and deep learning are driving breakthroughs in computer vision. This book brings together contributions from leading experts to present state-of-the-art techniques, tools, and frameworks, while demonstrating this technology’s applications in healthcare, autonomous systems, surveillance, robotics, and other real-world domains. By blending theory with hands-on insights, this volume equips readers with the knowledge needed to understand, design, and implement AI-powered vision solutions.

Structured to serve both academic and professional audiences, the book not only covers cutting-edge algorithms and methodologies but also addresses pressing challenges, ethical considerations, and future research directions. It serves as a comprehensive reference for researchers, engineers, practitioners, and graduate students, making it an indispensable resource for anyone looking to apply artificial intelligence to solve complex computer vision problems in today’s data-driven world.

Author / Editor Details
Jasminder Kaur Sandhu, PhD is a professor and the Head of the Department of Machine Learning and Data Science at IILM University. With over 13 years of academic and research experience, she has published more than 70 research papers in reputed international journals. Her research interests include machine learning, ensemble modelling, artificial intelligence, wireless sensor networks, and soft computing.

Abhishek Kumar, PhD is a professor and the Assistant Director of the Computer Science and Engineering Department at Chandigarh University, Punjab, with over 13 years of teaching experience. He is an award-winning researcher who has published more than 170 peer-reviewed papers in reputed international journals. His research interests span artificial intelligence, renewable energy systems, image processing, and data mining.

Rakesh Sahu, PhD is a dedicated academician and researcher with over a decade of experience. He has made significant contributions as a post-doctoral scholar at IIT Bombay and as a faculty member at esteemed institutions, where his work focuses on Himalayan glacier dynamics. His research interests include glacier mapping, modelling, and climate change.

Sachin Ahuja, PhD has an illustrious academic and research career, marked by numerous impactful contributions. An accomplished editor, he has contributed to numerous books and served as a guest editor for special issues in reputed international journals. His research focuses on artificial intelligence, machine learning, and data mining.

Table of Contents
Preface
1. An Overview of Medical Diagnostics through Artificial Intelligence-Powered Histopathological Imaging and Video Analysis
Atul Rathore, Praveen Lalwani, Pooja Lalwani and Rabia Musheer
1.1 Introduction
1.1.1 A Focus on Digital Image and Video Analysis
1.1.2 Overview of Research Article
1.1.2.1 Comparison Between Different Techniques/Comparative Analysis Among the Techniques Available
1.1.2.2 Overview of Data Preprocessing and Meta-Heuristic Algorithms
1.1.3 The Organizational of the Research Article
1.2 Background
1.2.1 Difficulties with Feature Selection
1.3 Preliminaries
1.3.1 Selection of Features (FS)
1.3.2 Classification
1.3.2.1 Support Vector Machine
1.3.2.2 Naïve Bayes
1.3.2.3 ANN
1.3.3 Meta-Heuristic Algorithms in FS
1.3.3.1 Genetic Algorithm
1.3.3.2 Cuckoo Search Optimization
1.3.3.3 BAT Algorithm
1.3.3.4 Grey Wolf Optimizer
1.3.3.5 Harris Hawk Optimization
1.3.3.6 Transition from Exploration to Exploitation
1.4 Experimental Results
1.4.1 Challenges in the Application of a Metaheuristic Algorithm for Classification and Prediction of Medical Disease
1.4.2 Summary of the Review
1.5 Conclusion
References
2. Generative Adversarial Networks: Theory and Application in Synthesis
Manoj Kumar Pandey, Priyanka Gupta, Triveni Lal Pal and Ayush Kumar Agrawal
2.1 Introduction
2.2 Ideologies of GAN
2.3 Architecture of GAN
2.4 Applications of GAN
2.4.1 Image Processing and Computer Vision
2.4.2 Healthcare and Medical Imaging
2.4.3 Natural Language Processing (NLP)
2.4.4 Video and Animation
2.4.5 Gaming and Entertainment
2.4.6 Cybersecurity and Anomaly Detection
2.4.7 Fashion and Retail
2.4.8 Art and Creativity
2.5 Conclusion
References
3. From Pixels to Predictions: Deep Learning for Glaucoma Detection
Tushar Verma, Sachin Ahuja and Jasminder Kaur Sandhu
3.1 Introduction
3.1.1 Glaucoma
3.1.2 Detection of Glaucoma
3.1.3 Deep Learning
3.1.4 Glaucoma Detection Using Deep Learning
3.2 Literature Review
3.2.1 Glaucoma Classification
3.2.2 Glaucoma Detecting
3.3 Problem Statement
3.4 Hybrid Approach for Glaucoma Detection
3.5 Result and Discussion
3.5.1 Confusion Matrix has been Obtained During Testing that is Shown Below for 4 Models
3.6 Conclusion
3.7 Future Scope
References
4. Advancements in Computer Vision for Object Detection and Recognition using DenseNet Deep Learning Model
N. Deepa, Padmapriya L., Priyadarshini V. and Shree Harini S.
4.1 Introduction
4.2 Literature Survey
4.2.1 Application of Principles
4.3 Proposed System
4.4 Results and Discussion
4.5 Conclusion
References
5. Deep Learning-Based Detection of Cyber Extortion
Mohana Preya R., Ramya M. and A. Abdhur Rahman
5.1 Introduction
5.2 Related Works
5.3 Existing System
5.4 Proposed System
5.5 System Architecture
5.6 Methodology
5.6.1 Data Collection and Preprocessing
5.6.2 Feature Extraction
5.6.3 Voice Processing
5.6.4 Model Architecture
5.6.4.1 Text Vectorization Layer
5.6.4.2 Embedding Layer
5.6.4.3 Bidirectional LSTM Layer
5.6.4.4 Dense Layers
5.6.4.5 Dropout Regularization
5.6.5 Evaluation
5.6.5.1 Precision
5.6.5.2 Recall
5.6.5.3 F1 Score
5.6.5.4 Accuracy
5.7 Results and Discussion
5.8 Conclusion
5.9 Future Work
References
6. GANs Unleashed: From Theory to Synthetic Realities
Rakhi Chauhan, Priya Batta and Km Meenakshi
6.1 Introduction
6.2 Related Works
6.2.1 Accurate Representation of the Density
6.2.2 Classification/Regression
6.2.3 Computer Algorithms for Image Synthesis
6.2.4 Computer Algorithms Synthesize Pictures
6.3 Limitations that are Enforced by GAN
6.4 Conclusion
References
7. RFID and Computer Vision-Enhanced Automotive Authentication Verification System
V. Vidya Lakshmi, Sowmya M. B., Archanaa R., Shreenidhi G. and Naveena R.
7.1 Introduction
7.2 Literature Survey
7.3 Proposed System
7.4 Working
7.5 Block Diagram
7.6 Hardware Components
7.7 Result
7.8 Conclusion
Bibliography
8. Synergizing Ensemble Learning Techniques for Robust Emotion Detection using EEG Signals
Pulkit Dwivedi, Jasminder Kaur Sandhu and Rakesh Sahu
8.1 Introduction
8.1.1 Overview of EEG-Based Emotion Detection
8.1.2 Motivation for Using Ensemble Learning
8.2 Ensemble Learning Techniques
8.2.1 Random Forest Classifier
8.2.2 AdaBoost Classifier
8.2.3 Gradient Boosting Classifier
8.2.4 CatBoost Classifier
8.2.5 XGBoost Classifier
8.2.6 Extra Trees Classifier
8.3 Methodology
8.3.1 Data Collection and Preprocessing
8.3.2 Implementation Details
8.4 Experimental Results
8.4.1 Impact of Different Ensemble Techniques on Emotion Detection Accuracy
8.4.2 Robustness and Reliability
8.5 Discussion
8.5.1 Advantages of Ensemble Methods in EEG Emotion Detection
8.5.2 Future Directions
8.6 Conclusion
9. Understanding the Unseen: Explainability in Deep Learning for Computer Vision
Apoorva Jain, Jasminder Kaur Sandhu and Pulkit Dwivedi
9.1 Introduction
9.1.1 An Overview of the Success of Deep Learning in Computer Vision
9.1.2 The Importance of Interpretability and Explainability
9.2 The Need for Interpretation in Computer Vision
9.3 Understanding Interpretability in Deep Learning
9.4 Visualization Techniques
9.5 Maps of the Headland
9.6 Model Simplification
9.7 Meaning of Function
9.8 Feature Importance
9.9 Methods Based on Prototypes
9.10 Challenges and Future Directions
9.11 Conclusion
9.12 Future Vision
References
10. Prefatory Study on Landslide Susceptibility Modeling Based on Binary Random Forest Classifier
Arpitha G. A. and Choodarathnakara A. L.
10.1 Introduction
10.2 Materials and Methodology
10.2.1 Region of Study
10.2.2 Preparation of Dataset
10.2.3 Random Forest
10.2.4 Evaluation of Landslide Susceptibility Model
10.3 Result Analysis
10.3.1 10-Fold Cross-Validation
10.3.2 Feature Selection
10.3.3 LSM by Binary RF Model
10.4 Conclusion
References
11. Improving Digital Interactions using Augmented Reality and Computer Vision
Priya Batta and Rakhi Chauhan
11.1 Introduction
11.2 Literature Survey
11.3 Methodology
11.4 Results
11.5 Conclusion and Future Scope
References
12. The Evolutionary Dynamics of Machine Learning and Deep Learning Architectures in Computer Vision
Palvadi Srinivas Kumar
12.1 Introduction to Computer Vision and Its Evolution
12.2 Foundations of Machine Learning in Computer Vision
12.3 Rise of Deep Learning in Computer Vision
12.4 Key Architectures and Techniques in Deep Learning for Computer Vision
12.5 CNN Architectures
12.5.1 Inception
12.5.2 ResNet (Residual Network)
12.5.3 DenseNet
12.6 Transfer Learning and Fine-Tuning
12.7 Object Detection, Image Segmentation, and Image Classification
12.7.1 Visual Geometry Group (VGG)
12.7.2 MobileNet
12.7.3 Transfer Learning and Fine-Tuning
12.7.4 Mask R-CNN
12.7.5 DeepLab
12.7.6 EfficientNet
12.8 Evolution of Image Processing Models
12.8.1 Progression of Deep Learning (DL) Architectures
12.8.2 Recent Advancements in Computer Vision Research
12.8.3 Integration of Multimodal Learning
12.8.4 Continual Learning and Lifelong Adaptation
12.8.5 Ethical Considerations and Responsible AI
12.8.6 Robustness and Adversarial Defense
12.8.7 Interpretability and Explainability
12.8.8 Domain-Specific Adaptation and Transfer Learning
12.8.9 Human-Centric Vision Systems
12.9 Challenges and Future Directions
12.9.1 Challenges
12.9.1.1 Interpretability
12.9.1.2 Robustness
12.9.1.3 Scalability
12.9.1.4 Interpretability
12.9.1.5 Robustness
12.9.1.6 Scalability
12.9.1.7 Interpretability
12.9.1.8 Robustness
12.9.1.9 Scalability
12.9.2 Future Directions
12.9.2.1 Multimodal Learning
12.9.2.2 Self-Supervised Learning
12.9.2.3 Incorporating Domain Knowledge
12.9.2.4 Multimodal Learning
12.9.2.5 Self-Supervised Learning
12.9.2.6 Incorporating Domain Knowledge
12.9.2.7 Multimodal Learning
12.9.2.8 Self-Supervised Learning
12.9.2.9 Incorporating Domain Knowledge
12.10 Applications and Impacts
12.10.1 Autonomous Driving
12.10.2 Medical Imaging
12.10.3 Surveillance and Security
12.10.4 Societal Impacts
12.10.5 Retail and E-Commerce
12.10.6 Agriculture
12.10.7 Art and Creative Industries
12.10.8 Accessibility
12.10.9 Environmental Monitoring
12.10.10 Industrial Quality Control
12.10.11 Augmented Reality (AR) and Virtual Reality (VR)
12.10.12 Smart Cities
12.10.13 Education
12.10.14 Humanitarian Aid and Disaster Response
12.11 Conclusion
References
13. Real-World Applications: Transforming Industries with Computer Vision
Seema B. Rathod, Pallavi H. Dhole and Sivaram Ponnusamy
13.1 Introduction
13.1.1 Definition and Brief History of Computer Vision
13.1.2 Importance of Computer Vision in Modern Industries
13.1.3 Purpose and Structure of the Paper
13.2 Healthcare
13.2.1 Medical Imaging Analysis
13.2.1.1 Use in Early Disease Detection (e.g., Cancer, Diabetic Retinopathy)
13.2.1.2 Case Studies and Statistics on Improved Diagnostic Accuracy
13.2.2 Robotic Surgery
13.2.2.1 Enhancements in Precision and Patient Outcomes
13.2.3 Patient Monitoring
13.2.3.1 Continuous Monitoring Systems and their Benefits
13.3 Manufacturing
13.3.1 Quality Control and Defect Detection
13.3.1.1 Automated Visual Inspection Systems
13.3.1.2 Case Studies on Efficiency and Waste Reduction
13.3.2 Predictive Maintenance
13.3.2.1 Early Detection of Machinery Issues
13.3.2.2 Impact on Reducing Downtime and Extending Machinery Lifespan
13.4 Retail
13.4.1 Personalized Shopping Experiences
13.4.1.1 Visual Search and Recommendation Systems
13.4.2 Inventory Management and Loss Prevention
13.4.2.1 Real-Time Stock Monitoring
13.4.2.2 Theft Detection Systems
13.4.3 Cashier-Less Checkout Systems
13.4.3.1 Technology Behind and Benefits of Seamless Shopping Experiences
13.5 Automotive
13.5.1 Autonomous Vehicles
13.5.1.1 Role of Computer Vision in Navigation and Obstacle Detection
13.5.1.2 Impact on Road Safety and Traffic Management
13.5.2 Case Studies
13.5.2.1 Examples of Companies and Technologies Leading the Way
13.6 Agriculture
13.6.1 Precision Farming
13.6.1.1 Monitoring Plant Health and Soil Conditions
13.6.1.2 Pest Infestation Detection
13.6.1.3 Benefits in Crop Yield and Resource Management
13.6.2 Sustainable Farming Practices
13.6.2.1 Examples of Successful Implementations
13.7 Security and Surveillance
13.7.1 Public Safety Enhancements
13.7.1.1 Facial Recognition and Behavioral Analysis
13.7.1.2 Real-Time Crime Prevention
13.7.2 Law Enforcement Support
13.7.2.1 Case Studies and Statistics on Crime Reduction
13.8 Challenges and Future Directions
13.8.1 Technical Challenges
13.8.1.1 Limitations in Current Computer Vision Technologies
13.8.2 Ethical Considerations
13.8.2.1 Privacy Concerns and Data Security
13.9 Future Trends
13.9.1 Emerging Technologies and Potential Future Applications
13.10 Conclusion
13.10.1 Overall Impact of Computer Vision on Various Industries
13.10.2 Final Thoughts on the Future of Computer Vision in Industry Transformation
References
14. Revolutionizing Vision Perception with Multimodal Fusion Technologies
Priya Batta, Rakhi Chauhan and Gagandeep Kaur
14.1 Introduction
14.2 Literature Survey
14.3 Methodology
14.4 Results and Discussions
14.5 Conclusion and Future Scope
References
15. Object Detection and Localization: Identifying and Pinpointing With Precision
Seema B. Rathod, Pallavi H. Dhole and Sivaram Ponnusamy
15.1 Introduction
15.1.1 Importance of Object Detection and Localization in Computer Vision
15.1.2 Applications Across Various Domains
15.1.3 Overview of Challenges and Goals of the Paper
15.2 Background and Literature Review
15.2.1 Historical Perspective and Evolution of Object Detection Techniques
15.2.2 Overview of Traditional Methods vs. Deep Learning Approaches
15.2.3 Review of Key Advancements in Convolutional Neural Networks (CNNs) and their Impact on Object Detection
15.3 Methodologies and Techniques
15.3.1 Overview of State-of-the-Art Object Detection Algorithms
15.3.2 Detailed Explanation of Each Methodology: Architecture and Training Process
15.3.3 Discussion on Handling Challenges: Occlusions, Scale Variations, and Complex Backgrounds
15.4 Evaluation Metrics and Benchmarks
15.4.1 Introduction to Evaluation Metrics for Object Detection
15.4.2 Benchmark Datasets Commonly Used for Evaluating Object Detection Models
15.4.3 Comparative Analysis of Performance Across Different Algorithms and Datasets
15.5 Applications and Case Studies
15.5.1 Real-World Applications of Object Detection and Localization
15.5.2 Case Studies Illustrating Successful Implementations and Their Impact
15.6 Challenges and Future Directions
15.6.1 Current Challenges in Object Detection and Localization
15.6.2 Emerging Trends and Future Research Directions
15.7 Conclusion
References
16. Uncertainty Estimation in Deep Learning Based Computer Vision
Palvadi Srinivas Kumar
16.1 Introduction
16.2 Basics of Uncertainty
16.2.1 Sorts of Uncertainty
16.2.2 Hypothetical Establishments and Suggestions for PC Vision Tasks
16.3 Uncertainty Estimation Techniques
16.3.1 Bayesian Deep Learning Approaches
16.3.2 Variational Inference and Monte Carlo Dropout
16.3.3 Ensemble Methods and Their Application in Uncertainty Estimation
16.3.4 Gaussian Processes for Uncertainty Estimation
16.3.5 Calibration Methods
16.3.6 Active Learning Strategies
16.3.7 Integration with Domain Knowledge
16.3.8 Dropout as Bayesian Approximation
16.3.9 Uncertainty-Aware Loss Functions
16.3.10 Meta-Learning for Uncertainty Estimation
16.3.11 Interpretability and Visualization of Uncertainty
16.3.12 Ethical Considerations and Deployment Challenges
16.4 Uncertainty in Object Detection
16.5 Challenges and Considerations in Detecting Objects with Uncertain Predictions
16.5.1 Ambiguity in Object Boundaries
16.5.2 Scale and Perspective Variability
16.5.3 Limited Training Data
16.6 Case Studies and Practical Examples
16.6.1 Autonomous Driving Systems
16.6.2 Medical Imaging
16.6.3 Surveillance and Security
16.7 Uncertainty in Semantic Segmentation
16.8 Pixel-Wise Uncertainty Estimation Techniques
16.9 Incorporating Uncertainty Into Segmentation Models for Improved Performance
16.10 Practical Implications and Case Studies
16.11 Uncertainty in Image Classification
16.12 Applications and Case Studies
16.13 Evaluating Uncertainty Estimates
16.14 Future Directions and Challenges
16.14.1 Advanced Bayesian Techniques
16.14.2 Multimodal Fusion
16.14.3 Uncertainty in Reinforcement Learning
16.14.4 Ethical and Fair Uncertainty Estimation
16.14.5 Large-Scale Deployment and Efficiency
16.14.6 Interpretable Uncertainty Quantification
16.14.7 Transferability and Generalization
16.14.8 Human-Centric Design
16.14.9 Meta-Learning and Active Learning
16.14.10 Benchmarking and Standardization
16.14.11 Adversarial Robustness
16.14.12 Real-Time Applications
16.14.13 Continual Learning and Concept Drift
16.14.14 Integration with Decision-Making Systems
16.14.15 Multimodal Uncertainty Fusion
16.15 Conclusion
Bibliography
17. Overcoming Occlusions in Visual Data using Long Short-Term Memory Networks (LSTMs)
Sivaram Ponnusamy, K. Swaminathan, Nandha Gopal S. M., Ambika Jaiswal and Suhashini Chaurasia
17.1 Introduction
17.1.1 Impacts of LSTMs on Proposed Framework
17.2 Literature Survey
17.3 Proposed System
17.4 Results and Discussion
17.5 Conclusion
References
18. Transformative Role of Machine Learning and Deep Learning Architecture in Computer Vision
Neetu Amlani, Swapnil Deshpande, Suhashini Chaurasia, Ambika Jaiswal and Sivaram Ponnusamy
18.1 Introduction
18.2 Literature Review
18.3 Methodology
18.3.1 Significant Lexicons
18.3.2 Briefly Describing the Applications of CV
18.3.3 Types of Machine Learning
18.4 Conclusion
References
19. A Comprehensive Analysis of Deep Learning and Machine Learning for Semantic Segmentation, and Object Detection in Machine and Robotic Vision
Pragati V. Thawani, Prafulla E. Ajmre, Suhashini Chaurasia and Sivaram Ponnusamy
19.1 Introduction
19.2 Machine Learning/Deep Learning Algorithms
19.3 Object Detection, Semantic Segmentation, and Human Action Recognition Methods
19.4 Human and Computer Vision Systems
19.5 Case Studies
19.6 Challenges
19.7 Conclusion
References
20. From Theoretical Foundations to Data Synthesis: Advanced Applications of Generative Adversarial Networks (GANs)
Pulkit Dwivedi, Jasminder Kaur Sandhu and Apoorva Jain
20.1 Introduction
20.2 Theoretical Foundations of GANs
20.2.1 Basics of Generative Adversarial Networks (GANs)
20.2.2 Mathematical Formulation
20.2.3 Training Dynamics
20.3 Applications of GANs in Synthesis
20.3.1 Image Synthesis
20.3.2 Text-to-Image Synthesis
20.3.3 Video Synthesis
20.3.4 Data Augmentation
20.3.5 3D Object Synthesis
20.4 Case Studies and Practical Implementations
20.4.1 Case Study 1: Medical Imaging for Tumor Detection
20.4.2 Case Study 2: Autonomous Driving Simulation
20.4.3 Case Study 3: Artistic Style Transfer
20.5 Implementation of GANs for Synthetic Image Generation
20.5.1 Explanation of Key Steps in the Implementation
20.5.2 Performance Evaluation
20.5.3 Comparison with Traditional Methods
20.5.4 Result Analysis
20.6 Transfer Learning in GANs
20.6.1 Benefits of Transfer Learning in GANs
20.6.2 How Transfer Learning Works in GANs
20.6.3 Applications of Transfer Learning in GANs
20.6.4 Challenges and Future Directions
20.7 Advanced Training Techniques for GANs
20.7.1 Wasserstein GANs (WGANs) and Gradient Penalty
20.7.2 Spectral Normalization
20.7.3 Label Smoothing and Noisy Labels
20.7.4 Feature Matching
20.7.5 Multi-Scale Discriminators
20.7.6 Self-Attention Mechanisms
20.7.7 Progressive Growing of GANs
20.8 Security Implications of GANs
20.8.1 Adversarial Attacks Powered by GANs
20.8.2 Deepfakes and the Proliferation of Digital Manipulation
20.8.3 Data Privacy and the Risk of Model Inversion Attacks
20.8.4 Forgery, Fraud, and Identity Theft
20.8.5 Cybersecurity Risks and the Dual Use of GANs
20.8.6 Ethical and Legal Implications
20.9 GANs for Sustainable AI Development
20.9.1 Resource Efficiency of GANs
20.9.2 Sustainability in Data-Driven Applications
20.9.3 Minimizing the Environmental Impact of AI
20.9.4 GANs in Green AI Initiatives
20.10 Challenges and Future Directions
20.10.1 Current Challenges in GAN Research and Applications
20.10.2 Ethical Considerations
20.10.3 Potential Future Directions and Research Areas
20.11 Conclusion
References
21. Optimization Techniques in Training Deep Neural Networks for Vision
Shantanu Bindewari, Sumit Singh Dhanda and Anand Singh
21.1 Introduction to Deep Neural Networks for Vision
21.1.1 Overview of Neural Networks in Computer Vision
21.1.2 Training Challenges in Deep Neural Networks
21.1.3 Importance of Optimization Techniques
21.2 Fundamentals of Optimization in Neural Networks
21.2.1 Gradient Descent
21.2.2 Learning Rate and Its Impact on Training
21.2.3 Loss Functions in Vision Tasks
21.3 Advanced Gradient-Based Optimization Techniques
21.3.1 Momentum-Based Optimizers
21.3.2 Optimizers of Adaptive Learning Rates
21.3.3 Methods of Second-Order Optimization
21.3.4 Benefits and Drawbacks of Optimizers Based on Gradients for Vision Tasks
21.4 Regularization Techniques for Vision Models
21.4.1 Regularization, L1 and L2
21.4.2 Dropout: Its Significance in Avoiding Overfitting
21.4.3 Normalization of Batches
21.4.4 Data Enrichment for Visual Tasks
21.5 Learning Rate Schedules and Optimizers for Efficient Training
21.5.1 Fixed and Step Decay Schedules
21.5.2 Cyclical Learning Rates
21.5.3 Warm Restarts and Cosine Annealing
21.5.4 Combining Schedules with Adaptive Optimizers
21.6 Techniques for Handling Vanishing and Exploding Gradients
21.6.1 Weight Initialization Techniques
21.6.2 Gradient Clipping
21.6.3 Use of Residual Networks
21.7 Model Compression and Optimization for Inference
21.7.1 Pruning Techniques
21.7.2 Quantization Techniques
21.7.3 Knowledge Distillation
21.8 Transfer Learning and Fine-Tuning Techniques
21.8.1 Pretrained Models in Vision Tasks
21.8.2 Fine-Tuning on Custom Datasets
21.8.3 Domain Adaptation and Transfer Learning Optimization
21.9 Hyperparameter Tuning and Optimization Techniques
21.9.1 Grid Search and Random Search
21.9.2 Bayesian Optimization
21.9.3 Population-Based Training
21.10 Case Studies and Applications
21.10.1 Case Study: Image Classification with CNNs
21.10.2 Case Study: Object Detection and Segmentation
21.10.3 Case Study: Vision Transformers and Advanced Architectures
References
About the Editors
Index
