API proposal for applicability domain estimation
An API proposal that attempts to unify different approaches to applicability domain (AD) estimation.
Applicability domain in the OpenTox framework:
- An applicability domain procedure is an OpenTox Algorithm.
- An applicability domain "model" is created by POSTing a dataset URI to an applicability domain algorithm URI. This creates an ot:Model with type ota:ApplicabilityDomain and returns an "AD-model" URI.
- Alternatively, for an AD embedded in a predictive model, just declare an additional rdf:type of ota:ApplicabilityDomain for the model.
- An applicability domain estimation is done by POSTing a dataset to the "AD-model" URI. This generates another dataset with an extra feature telling whether the corresponding compound belongs to the applicability domain (or, in fuzzy terms, to what degree it belongs to that set).
- For models with an embedded AD, POSTing a dataset to the model generates both prediction results and AD estimates.
- All models provide the estimation results as specified below.
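The two POST steps above can be sketched as a small client. This is only an illustration under stated assumptions: the form parameter name dataset_uri, the text/uri-list response, and all function names are assumptions rather than part of this proposal, and asynchronous task handling is ignored.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_post(target_uri, dataset_uri):
    """Build a POST request carrying a dataset URI as a form parameter.

    The parameter name "dataset_uri" is an assumption for this sketch.
    """
    body = urlencode({"dataset_uri": dataset_uri}).encode("ascii")
    headers = {
        "Accept": "text/uri-list",
        "Content-Type": "application/x-www-form-urlencoded",
    }
    return Request(target_uri, data=body, headers=headers, method="POST")


def post_for_uri(target_uri, dataset_uri, opener=urlopen):
    """POST the dataset URI and return the URI found in the response body."""
    with opener(build_post(target_uri, dataset_uri)) as response:
        return response.read().decode("utf-8").strip()


def estimate_domain(ad_algorithm_uri, training_dataset_uri,
                    query_dataset_uri, opener=urlopen):
    """Create an AD model, then apply it to a query dataset.

    Returns the URI of the dataset that holds the AD estimates.
    """
    # Step 1: dataset -> AD algorithm, which answers with an "AD-model" URI.
    ad_model_uri = post_for_uri(ad_algorithm_uri, training_dataset_uri, opener)
    # Step 2: dataset -> AD model, which answers with a result dataset URI.
    return post_for_uri(ad_model_uri, query_dataset_uri, opener)
```

The opener argument exists only so the flow can be exercised without a live service; for a model with an embedded AD, a single call to post_for_uri on the model URI would correspond to the combined prediction and AD step.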
Applicability domain RDF representation:
A predictive model can be assigned an external or an embedded applicability domain:
- In case of AD external to the model:
@prefix ot: <http://www.opentox.org/api/1.1#> .
@prefix ota: <http://www.opentox.org/algorithmTypes.owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
</model/mlr-model> ot:hasDomain </model/leverage-ad-model> .
</model/mlr-model> rdf:type ot:Model .
</model/mlr-model> ot:algorithm </algorithm/mlr> .
</algorithm/mlr> rdf:type ot:Algorithm .
</algorithm/mlr> rdf:type ota:Regression .
</model/leverage-ad-model> rdf:type ot:Model .
</model/leverage-ad-model> ot:algorithm </algorithm/leverage> .
</algorithm/leverage> rdf:type ot:Algorithm .
</algorithm/leverage> rdf:type ota:ApplicabilityDomain .
- In case of AD embedded in the model:
@prefix ot: <http://www.opentox.org/api/1.1#> .
@prefix ota: <http://www.opentox.org/algorithmTypes.owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
<lazar-model> ot:hasDomain <lazar-model> .
<lazar-model> rdf:type ot:Model .
<lazar-model> ot:algorithm </algorithm/lazar> .
</algorithm/lazar> rdf:type ot:Algorithm .
</algorithm/lazar> rdf:type ota:ApplicabilityDomain .
</algorithm/lazar> rdf:type ota:LazyLearning .
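A client could distinguish the two cases by looking at where ot:hasDomain points. The following is a minimal sketch, assuming the RDF has already been parsed into (subject, predicate, object) tuples of full URI strings; the function name and the example.org URIs are hypothetical.

```python
OT = "http://www.opentox.org/api/1.1#"
HAS_DOMAIN = OT + "hasDomain"  # property proposed in this document


def domain_kind(triples, model_uri):
    """Classify the applicability domain of a model.

    Returns "embedded" if ot:hasDomain points back to the model itself,
    "external" if it points to another model, and "none" if the model
    declares no domain at all.
    """
    domains = {o for s, p, o in triples if s == model_uri and p == HAS_DOMAIN}
    if not domains:
        return "none"
    if domains == {model_uri}:
        return "embedded"
    return "external"


# Hypothetical URIs mirroring the two RDF examples above.
triples = [
    ("http://example.org/model/mlr-model", HAS_DOMAIN,
     "http://example.org/model/leverage-ad-model"),
    ("http://example.org/model/lazar-model", HAS_DOMAIN,
     "http://example.org/model/lazar-model"),
]
```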
Results from applicability domain estimation
- By analogy with ot:predictedVariables, which specifies the features where prediction results are stored, one can specify which features hold the result of AD estimation (suggestions for better property names than ot:adMembership and ot:adMetric are welcome!):
@prefix ot: <http://www.opentox.org/api/1.1#> .
# the estimated value, e.g. leverage
ot:Model ot:adMetric ot:Feature .
# the decision for AD membership, based on the estimated value, e.g. "out-of-domain" if leverage > threshold
# have to agree on the value type: boolean, numeric, string, nominal?
ot:Model ot:adMembership ot:Feature .
and subsequently use the same ot:dataEntry and ot:FeatureValue RDF constructions, used elsewhere to specify property values, to specify AD results as well:
@prefix ot: <http://www.opentox.org/api/1.1#> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix : <http://ambit.uni-plovdiv.bg:8080/ambit2/> .
@prefix ota: <http://www.opentox.org/algorithmTypes.owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ac: <http://ambit.uni-plovdiv.bg:8080/ambit2/compound/> .
@prefix ad: <http://ambit.uni-plovdiv.bg:8080/ambit2/dataset/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix af: <http://ambit.uni-plovdiv.bg:8080/ambit2/feature/> .
ad:1 a ot:Dataset ;
    ot:dataEntry
        [ a ot:DataEntry ;
          ot:compound ac:1 ;
          ot:values
              [ a ot:FeatureValue ;
                ot:feature af:1 ;
                ot:value "3.14"^^xsd:double
              ] ;
          ot:values
              [ a ot:FeatureValue ;
                ot:feature af:9999 ;
                ot:value "0.0"^^xsd:double
              ]
        ] .
af:1
a ot:Feature , ot:NumericFeature ;
dc:title "MLR-prediction" ;
ot:hasSource <http://opentox.ntua.gr/model/mlr> ;
ot:units "" .
af:9999
a ot:Feature , ot:NumericFeature ;
dc:title "AD-leverage" ;
ot:hasSource <http://opentox.ntua.gr/model/leverage-ad> ;
ot:units "" .
ac:1
a ot:Compound .
ot:NumericFeature
a owl:Class ;
rdfs:subClassOf ot:Feature .
ot:DataEntry
a owl:Class .
ot:hasSource
a owl:ObjectProperty .
ot:units
a owl:DatatypeProperty .
ot:values
a owl:ObjectProperty .
ot:compound
a owl:ObjectProperty .
dc:title
a owl:AnnotationProperty .
ot:feature
a owl:ObjectProperty .
ot:Dataset
a owl:Class .
dc:description
a owl:AnnotationProperty .
ot:dataEntry
a owl:ObjectProperty .
ot:Compound
a owl:Class .
dc:identifier
a owl:AnnotationProperty .
ot:FeatureValue
a owl:Class .
ot:Feature
a owl:Class .
dc:type
a owl:AnnotationProperty .
ot:value
a owl:DatatypeProperty .
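To make the two kinds of AD result concrete, here is a rough sketch of how a leverage value (a candidate for ot:adMetric) and the derived decision (a candidate for ot:adMembership) might be computed. It assumes the simplest possible case, a regression on a single descriptor with an intercept, and it assumes the commonly used warning threshold of 3p/n with p = 2 model parameters. Neither the threshold nor the boolean value type is fixed by this proposal, and the function names are hypothetical.

```python
def leverage(train_x, x):
    """Leverage of a query value x for a one-descriptor model with intercept.

    Uses h = 1/n + (x - mean)^2 / Sxx, where mean and Sxx are computed
    from the training descriptor values.
    """
    n = len(train_x)
    mean = sum(train_x) / n
    sxx = sum((v - mean) ** 2 for v in train_x)
    if sxx == 0:
        raise ValueError("training descriptor values must not all be equal")
    return 1.0 / n + (x - mean) ** 2 / sxx


def ad_estimate(train_x, x, threshold=None):
    """Return the AD metric and a boolean membership decision.

    A compound is treated as in-domain when its leverage does not exceed
    the threshold. The default threshold 3 * p / n (p = 2: slope and
    intercept) is an assumption of this sketch.
    """
    if threshold is None:
        threshold = 3.0 * 2 / len(train_x)
    metric = leverage(train_x, x)
    return {"adMetric": metric, "adMembership": metric <= threshold}
```

For training values 1 to 5, a query at 3 sits at the centre of the training data and is in-domain, while a query at 10 has a leverage far above the threshold and is flagged as out-of-domain.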
When the AD is embedded in the model itself, AD results are represented in the same way, except that ot:hasSource for the features representing predicted values and AD estimates points to the same ot:Model object:
ad:1 a ot:Dataset ;
    ot:dataEntry
        [ a ot:DataEntry ;
          ot:compound ac:1 ;
          ot:values
              [ a ot:FeatureValue ;
                ot:feature af:lazar_prediction ;
                ot:value "1.0"^^xsd:double
              ] ;
          ot:values
              [ a ot:FeatureValue ;
                ot:feature af:10000 ;
                ot:value "0.666"^^xsd:double
              ]
        ] .
af:10000
a ot:Feature , ot:NumericFeature ;
dc:title "AD-lazar" ;
ot:hasSource <http://in-silico.ch/model/lazar> ;
ot:units "" .
af:lazar_prediction
a ot:Feature , ot:NumericFeature ;
dc:title "prediction-lazar" ;
ot:hasSource <http://in-silico.ch/model/lazar> ;
ot:units "".
ac:1
a ot:Compound .
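On the client side, the only way to tell the two result layouts apart is therefore the ot:hasSource of the features. A minimal sketch of that check follows, assuming the feature metadata has already been read into a plain dictionary; the dictionary layout and function name are assumptions, while the feature names and source URIs are taken from the two examples.

```python
def shares_source(features, prediction_feature, ad_feature):
    """Return True if both features were produced by the same model.

    `features` maps a feature name to a dict holding its ot:hasSource URI.
    A shared source indicates an AD embedded in the predictive model.
    """
    return (features[prediction_feature]["hasSource"]
            == features[ad_feature]["hasSource"])


# Feature metadata copied from the external-AD and embedded-AD examples.
features = {
    "af:1": {"title": "MLR-prediction",
             "hasSource": "http://opentox.ntua.gr/model/mlr"},
    "af:9999": {"title": "AD-leverage",
                "hasSource": "http://opentox.ntua.gr/model/leverage-ad"},
    "af:lazar_prediction": {"title": "prediction-lazar",
                            "hasSource": "http://in-silico.ch/model/lazar"},
    "af:10000": {"title": "AD-lazar",
                 "hasSource": "http://in-silico.ch/model/lazar"},
}
```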

