
Dialnet


Multimodal Generative Artificial Intelligence Tackles Visual Problems in Chemistry

    1. [1] University of Texas at Austin

      University of Texas at Austin

      United States

  • Published in: Journal of Chemical Education, ISSN 0021-9584, Vol. 101, No. 7, 2024, pp. 2716-2729
  • Language: English
  • Abstract
    • The introduction of multimodal capabilities in large language models (LLMs) marks a significant advancement in the field of artificial intelligence (AI). In particular, the ability to process and interpret visual data, including the complex graphs and plots frequently encountered in chemistry, expands the potential of these models. Integrating text and image processing allows multimodal AI to tackle a broader range of problems, especially in areas where visual information is central to understanding and solving them. This study examines GPT-4’s image input capabilities, focusing on its accuracy in interpreting and solving chemistry problems that depend on graphical information, including chemical diagrams, structures, and tabular data, and on its utility as an interactive, conversational tutor in chemistry education. The research assesses the consistency of the AI’s responses to visual data of varying quality and its ability to parse handwritten problems and answers. Further, the study examines GPT-4’s capacity for molecular structure analysis and spectral data interpretation, both vital for advanced problem-solving in chemistry. Through this analysis, we demonstrate how the image processing capabilities of GPT-4 could be leveraged for pedagogical purposes, particularly in undergraduate chemistry courses. In addition, we provide advice on prompt development to improve response quality.

