GenAI (Generative Artificial Intelligence) is a branch of AI (Artificial Intelligence) built on deep learning. GenAI generates text, code, image, or video output in response to a prompt, based on the training data it has analyzed and the patterns it has learned. Although GenAI can create new content, that content is not always accurate. Incorrect information confidently presented as fact is termed AI, GenAI, or model hallucination: a phenomenon where the model makes a factual error without realizing it.
A common occurrence of hallucination is when a GenAI model produces a summary from limited information and fabricates details to fill the gaps. For example, if a model is prompted to summarize how Informatica handles a particular process in MDM (Master Data Management), but the model was trained only on general information and has no access to Informatica's portal, it will produce a generalized summary that may not match Informatica's actual approach. This is a common case of GenAI hallucination.
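To make this concrete, here is a minimal Python sketch contrasting an ungrounded prompt with one grounded in retrieved source text. The `call_model` function and the excerpt are hypothetical placeholders, not a real Informatica document or any vendor's API; the point is only that supplying verifiable context and permitting an "I don't know" answer leaves less room for fabricated detail.

```python
def call_model(prompt: str) -> str:
    """Hypothetical placeholder for whatever GenAI model or provider you use."""
    raise NotImplementedError("Wire this up to your own model or provider.")

# Ungrounded: the model relies only on what it memorized during training,
# so gaps tend to be filled with plausible-sounding but fabricated detail.
ungrounded_prompt = "Summarize how Informatica handles a particular process in MDM."

# Grounded: the prompt carries an excerpt from an authoritative source and
# explicitly allows the model to decline when the excerpt is insufficient.
source_excerpt = "...text retrieved from the vendor's own documentation..."
grounded_prompt = (
    "Using ONLY the excerpt below, summarize how the vendor handles this process. "
    "If the excerpt does not cover something, answer 'not stated in the provided "
    "source' instead of guessing.\n\n"
    f"Excerpt:\n{source_excerpt}"
)

print(ungrounded_prompt)
print(grounded_prompt)
```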
GenAI models learn underlying patterns by analyzing massive datasets and use those patterns to generate new content. Lapses at any stage of this pipeline can lead to hallucination. The following are a few potential causes at various stages:
- Data Ingestion: Feeding datasets that contain biased, faulty, ambiguous, or misleading information.
- Preprocessing: Improper data cleansing, inconsistent formatting, and oversimplification can remove necessary detail and nuance from the dataset (a basic cleansing sketch follows this list).
- Model Architecture: A model too simple to represent nuanced relationships between words cannot capture deeper patterns.
- Training Process: Constrained training on a limited dataset can lead to gaps in understanding.
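To illustrate the preprocessing point above, here is a small, hypothetical sketch of the kind of basic quality checks a training or fine-tuning dataset might pass through before ingestion. The field name, length threshold, and checks are assumptions for illustration, not a prescribed pipeline; real pipelines add large-scale deduplication, bias audits, and source verification.

```python
import re

def basic_quality_filter(records: list[dict]) -> list[dict]:
    """Drop records that are empty, near-duplicate, or inconsistently formatted."""
    seen = set()
    cleaned = []
    for record in records:
        text = (record.get("text") or "").strip()
        # Skip empty or near-empty entries that add noise rather than signal.
        if len(text) < 20:
            continue
        # Normalize whitespace so formatting differences don't hide duplicates.
        normalized = re.sub(r"\s+", " ", text).lower()
        if normalized in seen:
            continue
        seen.add(normalized)
        cleaned.append({**record, "text": text})
    return cleaned

sample = [
    {"text": "Master data management consolidates records from many systems."},
    {"text": "Master   data management consolidates records from many systems."},
    {"text": "ok"},  # too short and ambiguous to be useful
]
print(basic_quality_filter(sample))  # keeps only the first record
```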
When left unchecked, GenAI hallucinations can have negative impacts. For example, website content written by a GenAI tool that lacks information about the services the site actually offers may exaggerate their performance and downplay their limitations. This has real-world consequences: misinformation, loss of trust, and damage to reputation.
GenAI hallucinations can be realistic and convincing, which makes them challenging to identify. Be cautious when consuming content generated by GenAI tools. Knowing the model's architecture and the scope of its training data helps set fair expectations. Watch for factual errors, contradictions within the generated text, lack of coherence, or claims that do not appear logically sound. We can mitigate hallucinations to a certain extent by:
- Removing ambiguity by providing clear instructions
- Verifying the accuracy of the output against reliable sources (a minimal sketch of this, combined with clear instructions, follows this list)
- Using GenAI models where data quality is prioritized and hallucination-reduction measures are implemented.
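The sketch below illustrates the first two points: an explicit, ambiguity-free instruction paired with a naive post-hoc check that flags generated sentences sharing few words with a trusted reference text. The overlap heuristic, the threshold, and the `generate` placeholder are assumptions for illustration; real verification would involve a human reviewer or a more robust fact-checking step against reliable sources.

```python
def generate(prompt: str) -> str:
    """Hypothetical placeholder for the GenAI model call."""
    raise NotImplementedError("Connect to your own model or provider.")

def flag_unsupported_sentences(output: str, reference: str, min_overlap: float = 0.5) -> list[str]:
    """Flag output sentences that share few words with the reference text.

    A crude heuristic, not a fact checker: flagged sentences simply deserve
    manual verification against a reliable source.
    """
    reference_words = set(reference.lower().split())
    flagged = []
    for sentence in output.split("."):
        words = set(sentence.lower().split())
        if not words:
            continue
        overlap = len(words & reference_words) / len(words)
        if overlap < min_overlap:
            flagged.append(sentence.strip())
    return flagged

# A clear, scoped instruction leaves less room for the model to improvise.
prompt = (
    "List the data-quality checks described in the passage below. "
    "Do not add checks that the passage does not mention.\n\n"
    "Passage: <paste trusted source text here>"
)
# output = generate(prompt)
# print(flag_unsupported_sentences(output, reference="<same trusted source text>"))
```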
GenAI hallucinations are a current limitation of the technology. With critical thinking, the impact of factual missteps can be minimized while the capabilities of GenAI models are still put to good use. As AI technology evolves, better accuracy and improved hallucination detection can be expected.