[OTDev] LogP modeling challenge
Nina Jeliazkova jeliazkova.nina at gmail.com
Wed Feb 23 11:32:30 CET 2011
On 23 February 2011 12:21, Egon Willighagen <egon.willighagen at gmail.com> wrote:

> Hej Nina,
>
> On Wed, Feb 23, 2011 at 10:53 AM, Nina Jeliazkova
> <jeliazkova.nina at gmail.com> wrote:
> > In an exercise to reproduce an ECOSAR model, I've found the current
> > implementation of XLogP (CDK) performs a bit differently compared to
> > KOWWIN [1].
> >
> > http://tinyurl.com/xlogp-kowwin
> >
> > This makes it hard to reproduce the ECOSAR model, since it depends on LogP.
> > Also, LogP is an important parameter in many toxicity prediction models.
>
> Yeah, this nicely reflects how 'reproducible' QSAR modeling is. Most
> models are so numerically unstable that exchanging a variable with a
> highly correlated one (KOWWIN and CDK LogP's) ruins the prediction...
> says more about the QSAR model than the LogP descriptors...
>
> > This is why I think it is an opportunity for everybody within OpenTox
> > (and outside) to create a better LogP prediction model from a dataset
> > which has been recently made available to OpenTox. Models would
> > preferably be built via the OpenTox API, but not necessarily (in that
> > case we could consider wrapping the models into the OpenTox API later).
> >
> > Models can then be validated by the OpenTox validation service at
> > ALU-Freiburg and the best one(s) selected.
>
> Does OpenTox also provide the training data?
>
> > The dataset is available via the OpenTox dataset service (several
> > formats via the HTTP Accept:mime-type header)
> >
> > http://apps.ideaconsult.net:8080/ambit2/dataset/181563
>
> This is data to be used as training data? Do you have information on
> how it was curated? How tautomers were selected? Etc...
>
> It's about 2300 compounds with experimental LogP values... what's the
> license?
>
> wget --accept application/rdf+xml
> http://apps.ideaconsult.net:8080/ambit2/dataset/181563/metadata
>
> did not reveal license/copyright or modify/redistribution rights...

We need to further clarify the exact license. The data was sent to us
recently to be used within the OpenTox project, for a modeling exercise
similar to this one (the data is claimed to be mostly public).

The structures were originally in an SDfile; I don't have information on
how exactly they were selected. I guess this could be another interesting
experiment: to find out how/if the models change if different tautomers
are used.

As the original exercise is to reproduce the ECOSAR model, we can hardly do
that without a LogP model comparable to KOWWIN. (In)validating the KOWWIN
model could of course be part of the exercise :)

Nina

> Egon
>
> --
> Dr E.L. Willighagen
> Postdoctoral Researcher
> Institutet för miljömedicin
> Karolinska Institutet
> Homepage: http://egonw.github.com/
> LinkedIn: http://se.linkedin.com/in/egonw
> Blog: http://chem-bla-ics.blogspot.com/
> PubList: http://www.citeulike.org/user/egonw/tag/papers
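
For anyone who wants to try the content negotiation mentioned in the thread (choosing the dataset format via the HTTP Accept header), here is a minimal sketch in Python. It only uses the standard library; the helper names `build_request` and `fetch` are made up for illustration, and the only MIME type shown is the `application/rdf+xml` one quoted above. Whether the service still answers at this address is not something this sketch assumes.

```python
import urllib.request

# Dataset metadata URL as quoted in the thread.
METADATA_URL = "http://apps.ideaconsult.net:8080/ambit2/dataset/181563/metadata"


def build_request(url, mime_type):
    """Build a GET request that asks the server for one representation.

    The format is selected through the HTTP Accept header, not through
    the URL (hypothetical helper, not part of any OpenTox client).
    """
    return urllib.request.Request(url, headers={"Accept": mime_type})


def fetch(url, mime_type, timeout=30):
    """Send the request and return the raw response body as bytes."""
    request = build_request(url, mime_type)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


# Example call (needs network access, so it is left commented out):
# rdf_bytes = fetch(METADATA_URL, "application/rdf+xml")
```

With wget, the equivalent is to pass the header explicitly, e.g. `wget --header="Accept: application/rdf+xml" <url>`; wget's `--accept` option filters downloads by file name pattern and does not set the HTTP Accept header.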