Efficiency Improvement in Software Development Using Generative AI
To reduce the effort required for data integration, we developed a language model for our client that searches existing program code more effectively and generates code suggestions.

Challenge
Research and development of new drugs generates large volumes of data at every stage. In clinical studies in particular, our client had to work with data that were poorly standardized and poorly documented. Integrating non-standardized data requires significant effort, and the necessary program code was written by hand from scratch each time. The goal was an approach with high automation potential that would accelerate this process and reduce fragmentation of the codebase.
Approach
Large language models now not only understand natural language very well but can also read and generate code in a variety of programming languages. They can also be used to "translate" natural-language descriptions into code. Because the specific programming language used by our client is underrepresented in typical training data, we further trained (fine-tuned) a pre-trained language model on the client's own program code and documentation. This gave the model a semantic understanding of the client-specific codebase. Integrated into a search engine, the model can now find similar code components or generate new ones based on existing examples.
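The search step can be illustrated with a minimal sketch. It is not the client implementation: the snippet names and descriptions are hypothetical, and the `embed` function is a toy token-count stand-in for the vectors a fine-tuned model would produce. Only the overall pattern (embed the query, embed each indexed snippet, rank by cosine similarity) reflects the approach described above.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase token counts.
    Assumption: a real system would use vectors from the fine-tuned model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, snippets: dict[str, str], top_k: int = 2) -> list[tuple[str, float]]:
    """Rank indexed snippets by similarity to the query, best first."""
    q = embed(query)
    scored = [(name, cosine(q, embed(text))) for name, text in snippets.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_k]


# Hypothetical index: snippet name -> its documentation text
SNIPPETS = {
    "convert_visit_dates": "Convert visit dates from text to ISO 8601 format",
    "map_lab_units": "Map laboratory result units to standard units",
    "merge_patient_tables": "Merge patient demographics tables by subject id",
}

results = search("standardize units of lab results", SNIPPETS)
best_name, best_score = results[0]
print(best_name, round(best_score, 3))
```

The toy embedding only matches identical tokens, so here the query hits `map_lab_units` through the shared word "units" alone. A model with semantic understanding of the codebase would be expected to also relate "lab" to "laboratory" and "standardize" to "standard", which is what makes the search semantic rather than keyword-based.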
Result
Combining a tailored language model with semantic search significantly accelerated the data integration process and improved its quality. New program code no longer has to be written from the first line, because existing code can be reused to a high degree. Developers can quickly find, reuse, and even automatically adapt functionality that has already been implemented. Reusing code also curbs the growing fragmentation of the codebase caused by new functionality and duplicated structures.