Bridging Data Management and Knowledge Discovery in the Life Sciences
Karl Kugler*,1, Maria Mercedes Tejada2, Christian Baumgartner2, Bernhard Tilg2, Armin Graber2, Bernhard Pfeifer2,*
Identifiers and Pagination:
Year: 2008
First Page: 28
Last Page: 36
Publisher ID: TOBIOIJ-2-28
Article History:
Received Date: 17/04/2008
Revision Received Date: 03/06/2008
Acceptance Date: 04/06/2008
Electronic publication date: 25/07/2008
Collection year: 2008
open-access license: This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International Public License (CC-BY 4.0), a copy of which is available at: https://creativecommons.org/licenses/by/4.0/legalcode. This license permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
In this work we present an application for integrating and analyzing life science data, built on a biomedical data warehouse system and on in-house tools that support knowledge discovery tasks. Knowledge discovery is a process in which several steps must be coupled to answer a specific question. To create such a combination of steps, a data miner uses our in-house knowledge discovery tool, KD3, to assemble functional objects into a data mining workflow. The generated workflows can be reused for other purposes simply by supplying new data and re-parameterizing the functional objects they contain. Workflows also guide the data integration and aggregation tasks, which were defined and implemented using a publicly available open-source tool. As a proof of concept, intelligent query models were designed and tested for the identification of genotype-phenotype correlations in Marfan Syndrome. We show that, with our application, a data miner can readily develop new knowledge discovery algorithms that clinical researchers may later use to retrieve medically relevant information.
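The idea of assembling parameterizable functional objects into a reusable workflow can be illustrated with a minimal Python sketch. This is not KD3's actual design or API, which the abstract does not describe. The class names `FunctionalObject` and `Workflow`, the helper functions, and the sample records are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class FunctionalObject:
    """Hypothetical workflow step: a named function plus its parameters."""
    name: str
    func: Callable[..., Any]
    params: Dict[str, Any] = field(default_factory=dict)

    def run(self, data: Any) -> Any:
        # Apply the step's function to the incoming data with its parameters.
        return self.func(data, **self.params)


@dataclass
class Workflow:
    """Hypothetical workflow: an ordered chain of functional objects."""
    steps: List[FunctionalObject]

    def run(self, data: Any) -> Any:
        # Feed the output of each step into the next one.
        for step in self.steps:
            data = step.run(data)
        return data

    def with_params(self, overrides: Dict[str, Dict[str, Any]]) -> "Workflow":
        # Return a copy of the workflow with some steps re-parameterized.
        return Workflow([
            FunctionalObject(s.name, s.func, {**s.params, **overrides.get(s.name, {})})
            for s in self.steps
        ])


def select_field(records, key):
    """Illustrative step: extract one field from each record that has it."""
    return [r[key] for r in records if key in r]


def keep_above(values, threshold):
    """Illustrative step: keep values strictly above a threshold."""
    return [v for v in values if v > threshold]


workflow = Workflow([
    FunctionalObject("select", select_field, {"key": "score"}),
    FunctionalObject("filter", keep_above, {"threshold": 0.5}),
])

# Synthetic records, not data from the study.
records = [{"score": 0.2}, {"score": 0.7}, {"score": 0.9}, {"other": 1}]

result = workflow.run(records)
stricter = workflow.with_params({"filter": {"threshold": 0.8}}).run(records)
```

In this sketch, reuse amounts to calling `run` with new data or deriving a re-parameterized copy with `with_params`, which leaves the original workflow unchanged.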