EXPLORE SCALABLE AND COST-EFFECTIVE AI DEPLOYMENTS, INCLUDING DISTRIBUTED TRAINING, MODEL SERVING, AND REAL-TIME INFERENCE ON HUMAN TASKS

International Journal of Advances in Engineering Research 24 (1):7-27 (2022)

Abstract

The rapid growth of Artificial Intelligence (AI) has driven demand for scalable, efficient, and cost-effective deployment solutions. Such solutions are crucial for handling the increasing computational demands and complexity of AI models in human-centric tasks such as real-time image classification, speech recognition, and natural language processing. This paper explores scalable AI deployment methodologies under three main topics: distributed training, model serving, and real-time inference. Optimized deployment pipelines, parallel processing, and cloud infrastructure are essential for balancing performance against cost. The study offers a detailed analysis of these technologies, examining their cost-effectiveness, their suitability for real-world settings, and their capacity to handle large datasets. It also provides a comparative study of the approaches based on cost, efficiency, and scalability parameters, with tables highlighting the differences between them. A survey of pertinent literature covering the years 2003 to 2022 gives context for the advancement of AI deployment technology.

Analytics

Added to PP
2025-03-09
