By Alok Mehta, CIO Business Systems at Kemper, and Mila Beryozkin, Principal Data Management Consultant at Discover Financial Services
Generative AI (Gen AI) continues to evolve rapidly, and its integration into the Software Development Lifecycle (SDLC) has opened up significant opportunities. From generating test cases and algorithms to producing synthetic data, many technology professionals have already embraced Gen AI.
But there are pitfalls. With every powerful tool comes a great deal of responsibility, and Gen AI is no different. In this article, we delve into specific examples of the challenges associated with AI-generated assets and the crucial role software engineers play in promoting responsible AI adoption.
Imagine an algorithm generated by a large language model (LLM) to automate loan approvals. The algorithm is trained on historical loan data, which it uses to approve or reject applications. If the historical data contains biases (for instance, if it reflects past discriminatory practices against certain demographic groups), the model will inadvertently learn and perpetuate those biases, unfairly rejecting applications from individuals in those groups.
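To make the mechanism concrete, here is a minimal sketch of how this happens. Everything in it is an assumption for illustration: the data is synthetic, and the logistic-regression model stands in for whatever a real lending system would use.

```python
# Illustrative sketch: a model trained on biased historical loan
# decisions learns to reproduce the bias. All data is synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000
income = rng.normal(60, 15, n)     # applicant income (in $k)
group = rng.integers(0, 2, n)      # 0 = majority, 1 = minority

# Historical approvals: driven by income, but with an unfair
# penalty applied to group 1 (the bias baked into the data).
logit = 0.1 * (income - 55) - 1.5 * group
approved = rng.random(n) < 1 / (1 + np.exp(-logit))

# Train on the biased labels, with group membership (or a proxy
# such as zip code) available as a feature.
X = np.column_stack([income, group])
model = LogisticRegression().fit(X, approved)

# Score two applicants identical except for group: the learned
# model perpetuates the historical penalty.
print(model.predict_proba([[60, 0], [60, 1]])[:, 1])
# -> roughly [0.62, 0.27]; the group-1 applicant scores far lower
```

Nothing in this pipeline is malicious; the model simply optimizes against labels that encode past discrimination.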
Let’s explore another example, this time from health insurance: an AI system developed to assess claims or set premiums based on individual health data. If the system is trained on data that predominantly represents one demographic (a specific age group, gender, or ethnicity), it may be less accurate or less fair for individuals outside that demographic, unfairly assigning higher premiums or denying claims more often for groups it rarely saw during training. This illustrates the ethical issue of data bias in AI: the system’s outputs inadvertently reflect and perpetuate inequalities present in the training data.
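One practical safeguard is to slice outcomes by segment before deployment. The sketch below is illustrative only; the column names, toy data, and tolerance are our assumptions, not any insurer’s actual schema or policy.

```python
# Minimal fairness-audit sketch: compare a claims model's denial
# rate across demographic groups. Column names are illustrative.
import pandas as pd

def denial_rate_by_group(df, group_col="age_band", decision_col="denied"):
    """Denial rate per group; large gaps warrant investigation."""
    return df.groupby(group_col)[decision_col].mean()

# Example usage with toy data:
claims = pd.DataFrame({
    "age_band": ["18-34", "18-34", "65+", "65+", "65+"],
    "denied":   [0,       0,       1,     1,     0],
})
rates = denial_rate_by_group(claims)
gap = rates.max() - rates.min()
if gap > 0.2:   # illustrative tolerance; a real one is policy-driven
    print(f"Warning: denial-rate gap of {gap:.0%} across groups")
```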
Another example of potential bias could arise in autonomous vehicles. Imagine an AI system trained predominantly on driving data from urban environments in one region. It might not perform effectively in rural areas or in regions with different traffic patterns and road conditions, so vehicles could drive less safely or efficiently in those underrepresented environments. This underscores the importance of diverse, comprehensive training data, especially when safety-critical decisions are involved.
These scenarios highlight bias, a critical concern with Generative AI. On the surface, generated algorithms and automated systems may seem benign, yet each poses ethical problems that IT professionals must be aware of, recognize, and address. The question is how to practice responsible AI from an SDLC standpoint. Several key steps help:
- Data Profiling: This involves understanding the characteristics of the data used to train the AI model. Collaborating with subject matter experts for data insights is crucial, as is using profiling tools to build a comprehensive understanding of the data (a minimal profiling sketch follows this list).
- Data Diversification: Ensuring the training data for AI models covers a wide spectrum of scenarios and represents real-life diversity is vital to preventing biased decision-making.
- Data Governance: One of the most important aspects of knowing your data is having robust data governance in place, especially for managing and maintaining data lineage, metadata, and quality.
- Transparency: Maintaining openness about the functioning and decision-making process of the AI system helps in building trust and allows for better oversight.
- Explanation: Documenting and clearly explaining the workings of AI systems is necessary for accountability and understanding (see the explainability sketch below).
- Workflow: Implementing human checks and balances alongside AI decisions ensures a balance between automated efficiency and human judgment (a routing sketch follows below).
- Auditing: Regular auditing of AI systems and the data they use, preferably automated, is important to identify and address any issues proactively.
- Testing: Conducting comprehensive testing, including unit, system, functional, negative, and regression testing, ensures the reliability and safety of AI systems (a pytest-style sketch follows below).
- AI Governance Framework: Establishing an AI governance framework in collaboration with stakeholders from legal, compliance, data privacy, and technology helps set standards and guidelines for responsible AI usage.
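To ground the data profiling and diversification steps, here is one minimal check: compare the training data’s demographic mix against a population benchmark and flag underrepresented segments. The column names and benchmark figures are assumptions for illustration.

```python
# Profiling sketch: flag segments underrepresented in training
# data relative to a benchmark. All names are illustrative.
import pandas as pd

def coverage_report(train, col, benchmark, min_ratio=0.5):
    """Compare training-set shares to a population benchmark."""
    observed = train[col].value_counts(normalize=True)
    report = pd.DataFrame({"expected": pd.Series(benchmark)})
    report["observed"] = observed.reindex(report.index, fill_value=0.0)
    report["ratio"] = report["observed"] / report["expected"]
    report["underrepresented"] = report["ratio"] < min_ratio
    return report

# Example: region mix in training data vs. the served population.
train = pd.DataFrame({"region": ["urban"] * 90 + ["rural"] * 10})
print(coverage_report(train, "region", {"urban": 0.6, "rural": 0.4}))
# rural: observed 0.10 vs. expected 0.40 -> flagged
```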
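For the explanation step, model-agnostic tools can document which inputs drive a system’s decisions. This sketch uses scikit-learn’s permutation importance on a placeholder model and dataset; the real artifact would be the resulting report, kept with the system’s documentation.

```python
# Explainability sketch: report which features drive a model's
# decisions via permutation importance (model-agnostic).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

# Placeholder model and data standing in for a real system.
X, y = make_classification(n_samples=500, n_features=5, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
for i, score in enumerate(result.importances_mean):
    print(f"feature_{i}: importance {score:.3f}")
```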
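The workflow step can start as a simple confidence gate: the model decides only the clear-cut cases, and everything else is routed to a person. The thresholds below are illustrative; in practice they would be set by policy and reviewed regularly.

```python
# Human-in-the-loop sketch: auto-decide only when the model is
# confident; route ambiguous cases to a human reviewer.
APPROVE_ABOVE = 0.90   # illustrative, policy-set thresholds
REJECT_BELOW = 0.10

def route_decision(approval_prob):
    """Return 'approve', 'reject', or 'human_review'."""
    if approval_prob >= APPROVE_ABOVE:
        return "approve"
    if approval_prob <= REJECT_BELOW:
        return "reject"
    return "human_review"   # ambiguous cases go to a person

for p in (0.95, 0.50, 0.05):
    print(p, "->", route_decision(p))
```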
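Finally, for the testing step, bias checks can live alongside ordinary unit tests so they run on every build. In this pytest-style sketch, the scoring function is a stub standing in for the real model, and the tolerance is an assumed policy value.

```python
# Testing sketch (pytest style): a fairness test plus a negative
# test. score_applicant is a stub for the model under test.
import pytest

def score_applicant(applicant):
    """Stub scoring function; returns an approval score in [0, 1]."""
    if applicant["income"] is None or applicant["income"] < 0:
        raise ValueError("invalid income")
    return min(1.0, applicant["income"] / 100)

def test_equal_treatment():
    # Applicants identical except for the protected attribute
    # should receive (nearly) identical scores.
    a = score_applicant({"income": 60, "group": "A"})
    b = score_applicant({"income": 60, "group": "B"})
    assert abs(a - b) < 0.05   # assumed policy tolerance

def test_rejects_malformed_input():
    # Negative test: bad input must fail loudly, not silently.
    with pytest.raises(ValueError):
        score_applicant({"income": -1, "group": "A"})
```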
These steps are integral to mitigating risks associated with AI, ensuring ethical usage, and maximizing the technology’s benefits responsibly.
In conclusion, the integration of Generative AI into the SDLC presents transformative opportunities but also significant ethical challenges. It is crucial for IT professionals to practice responsible AI through data profiling, diversification, and transparency. Thorough documentation and explanation of AI systems, coupled with regular auditing and diverse testing, are essential. A robust AI governance framework and human oversight further safeguard against bias and support ethical, effective AI deployment in sectors such as finance, healthcare, and automotive. This approach helps harness the benefits of AI while mitigating its potential risks.
Lyudmila (Mila) Beryozkin serves as the Principal Data Management Consultant at Discover Financial Services. She has extensive expertise and interest in Enterprise Data Management, the SDLC, and AI/ML technologies.