[OTDev] Luca Settimo
Nina Jeliazkova jeliazkova.nina at gmail.com
Wed Sep 7 16:07:20 CEST 2011
Dear Luca,

The structures (and data) are meant to be accessed via the OpenTox REST web
services API, not through database calls. As explained at
http://ambit.uni-plovdiv.bg/downloads/ambit2/ , the ambit2.war file has to be
deployed to a servlet container and the data accessed via API calls. The
content of the download page is also accessible at
http://ambit.sourceforge.net/download_ambitrest.html .

For the API documentation and publications, see:

http://opentox.org/dev/apis/api-1.2
http://www.jcheminf.com/content/2/1/7
http://www.jcheminf.com/content/3/1/18
http://ambit.sourceforge.net/api.html
http://ambit.sourceforge.net/ambit_services.html

On 7 September 2011 16:36, <luca_settimo at vrtx.com> wrote:

> Dear Opentox support and dear Vedrin
> I thank you for the answer that you gave me last month.
> We had some difficulty to get the structures from the database dump
> http://ambit.uni-plovdiv.bg/downloads/ambit2/db/ambit2-2011051401.7z
> that you sent me some time ago
> My colleague (Pat Walters) tried to load the database into MySQL, he said
> that there are about 50 tables and we don't see any documentation. Do
> you know if there is a description or an entity relationship diagram
> available?
>
> There's a table called "structures", that has 478,009 structures in SDF
> format and 70,646 in INC (INChI?) format.
> There's another table called "chemicals" with 118,726 SMILES
>
> The other tables contain descriptors and data, but we are not sure how it
> all fits together.

If you would like to study the database schema, a version of it can be found
in the Prototype Database deliverable (p. 12), along with other information:
http://www.opentox.org/data/documents/development/opentoxreports/opentoxreportd32

The updated document for the final database is under review by partners and
will hopefully be published soon.

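For the REST route described at the start of this reply, a request for the
dataset list might be put together as below. This is only a minimal sketch in
Python and not part of the original thread: the helper names
`build_dataset_request` and `fetch_text` are hypothetical, the `text/uri-list`
media type is an assumption based on the OpenTox API documentation linked
above, and the base URL and `max` parameter are the ones quoted in this
message. The hosted service asked for an OpenTox login, which this sketch does
not handle.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Service address quoted in this thread; it may no longer be reachable.
BASE_URL = "https://ambit.uni-plovdiv.bg:8443/ambit2"


def build_dataset_request(base_url=BASE_URL, max_entries=100, accept="text/uri-list"):
    """Build (but do not send) a GET request for the dataset list.

    The Accept header selects the representation; "text/uri-list" is an
    assumed media type for a plain list of dataset URIs.
    """
    query = urlencode({"max": max_entries})
    url = f"{base_url.rstrip('/')}/dataset?{query}"
    return Request(url, headers={"Accept": accept}, method="GET")


def fetch_text(request, timeout=30):
    """Send a prepared request and return the response body as text."""
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Building the request separately from sending it makes it possible to inspect
the URL and headers before contacting the server, for example
`build_dataset_request(max_entries=10).full_url`.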
Information about the schema can be found at
https://ambit.svn.sourceforge.net/svnroot/ambit/trunk/ambit2-all/ambit2-db/src/main/resources/ambit2/db/sql/ambit2.mwb
(MySQL Workbench file) and in the book chapter [1].

If you only want to access structures (and/or data), and not necessarily
install the OpenTox web services, it might be more appropriate to use the
pre-installed service at
https://ambit.uni-plovdiv.bg:8443/ambit2/dataset
<https://ambit.uni-plovdiv.bg:8443/ambit2/dataset?max=100>
and just download the structures in the preferred format.

Best regards,
Nina Jeliazkova

[1] Jeliazkova N., Jaworska J., Worth A. (2010) Chapter 17. Open Source Tools
for Read-Across and Category Formation. In M. Cronin & J. Madden (Eds.),
In Silico Toxicology: Principles and Applications
<http://www.rsc.org/publishing/ebooks/2010/9781849730044.asp>
(pp. 408-445). Cambridge, UK: RSC Publishing

> Please include Pat Walters in the reply if you could help thanks
>
> Thanks
> Luca
>
> From: Vedrin Jeliazkov <vedrin.jeliazkov at gmail.com>
> To: luca_settimo at vrtx.com
> Cc: opentox development mailing list <development at opentox.org>
> Date: 04/08/2011 13:29
> Subject: Re: Luca Settimo
>
> Hi Luca,
>
> > could you give me some more info on the databases that you collected for
> > AMBIT?
>
> The database dump that is available at
> http://ambit.uni-plovdiv.bg/downloads/ambit2/db/ambit2-2011051401.7z
> contains the following datasets:
>
> ECHA list of pre-registered substances (143835 entries)
> ChemIDplus (structures for 80468 chemicals from the ECHA list of pre-registered substances)
> Chemical Identifier Resolver (structures for 72985 chemicals from the ECHA list of pre-registered substances)
> ChemDraw (structures for 22519 chemicals from the ECHA list of pre-registered substances)
> CPDBAS (1547 entries)
> DBPCAN (209 entries)
> EPAFHM (617 entries)
> FDAMDD (1216 entries)
> HPVCSI (3548 entries)
> HPVISD (1006 entries)
> IRISTR (544 entries)
> KIERBL (278 entries)
> NCTRER (232 entries)
> NTPBSI (2330 entries)
> NTPHTS (1408 entries)
> ISSCAN (1150 entries)
> ISSMIC (151 entries)
> ISSSTY (232 entries)
> TOXCST (320 entries)
> TXCST2 (960 entries)
> ECETOC Technical Report No. 66 Skin irritation and corrosion Reference Chemicals data base (1995) (176 entries)
> Local Lymph Node Data for the Evaluation of Skin Sensitization - Compilation of historical data (Dermatitis Vol 16 No 4 2005) (209 entries)
> Local Lymph Node Data for the Evaluation of Skin Sensitization - Second compilation (Dermatitis Vol 21 No 1 2010) (108 entries)
> Bioconcentration factor (BCF) Gold Standard Database (1130 entries)
> Benchmark Data Set for pKa Prediction of Monoprotic Small Molecules the SMARTS Way (185 entries)
> Benchmark Data Set for In Silico Prediction of Ames Mutagenicity (6512 entries)
> Bursi AMES Toxicity Dataset (4337 entries)
> EPI_AOP (818 entries)
> EPI_BCF (685 entries)
> EPI_BioHC (175 entries)
> EPI_Biowin (1263 entries)
> EPI_Boil_Pt (5890 entries)
> EPI_Henry (1829 entries)
> EPI_KM (631 entries)
> EPI_KOA (308 entries)
> EPI_Kowwin (15809 entries)
> EPI_Melt_Pt (10051 entries)
> EPI_PCKOC (788 entries)
> EPI_VP (3037 entries)
> EPI_WaterFrag (5764 entries)
> EPI_Wskowwin (2348 entries)
> TOXCST_ACEA (320 entries)
> TOXCST_Attagene (320 entries)
> TOXCST_BioSeek (320 entries)
> TOXCST_Cellumen (320 entries)
> TOXCST_CellzDirect (320 entries)
> TOXCST_Gentronix (320 entries)
> TOXCST_NCGC (320 entries)
> TOXCST_Novascreen (320 entries)
> TOXCST_Solidus (320 entries)
> TOXCST_ToxRefDB (320 entries)
> ECBPRS (structures and data for 80410 chemicals from the ECHA list of pre-registered substances)
> OPSIN (structures for 78458 chemicals from the ECHA list of pre-registered substances)
>
> You can also access all of the above mentioned datasets at
> https://ambit.uni-plovdiv.bg:8443/ambit2/dataset after you login with
> your OpenTox username and password at
> https://ambit.uni-plovdiv.bg:8443/ambit2/opentoxuser (You can register
> as an OpenTox user at http://www.opentox.org/join_form if you haven't
> already).
>
> In addition to these datasets, you could access at the same location
> the PubChem Structures + Assays dataset (473965 entries), which is not
> included in the MySQL dump that is available for download in order to
> keep it more compact.
>
> Please note that some additional datasets (not listed above, but
> available in the DB) are accessible only by OpenTox partners, due to
> specific licensing requirements and agreements.
>
> > Are you aware of this paper?
> > [http://dx.doi.org/10.1016/j.taap.2009.08.022]
> >
> > Perhaps you will find very useful Table 1 because it shows all databases
> > for tox that are available in the literature. Which of these
> > do you have?
>
> As you can see from the list above, there's some degree of overlap
> between the references in Table 1 of this paper and the datasets
> included in the OpenTox DB, but both have entries that are absent in
> the other list. One major obstacle for including some of the sources
> that you mention is the lack of computer-readable bulk download for
> them.
> In addition, the AMBIT database is evolving continuously (even
> as I write these lines) and it can be somehow hard to tell what's
> included and what's not -- all registered users with sufficient
> privileges can add datasets at any time. In general, the OpenTox
> framework (and AMBIT as one particular implementation of the OpenTox
> API) provides the infrastructure to store and process relevant data in
> a more or less similar way as the Apache HTTP server acts for making
> available web site content. It's up to the users to upload whatever
> datasets, algorithms, models, etc..., they like to use or make
> available to others. So, in essence, the OpenTox DB is a kind of
> starting reference point, with particular emphasis on datasets that
> are relevant to the European REACH legislation, mainly due to the
> specific context of the OpenTox project. However, the OpenTox
> framework was designed in a generic way, to enable its use in other
> domains as well. It's up to the users to install, populate, run,
> maintain their own instances of OpenTox services. Furthermore, due to
> the common API, these services could be linked together and rely on
> each other for executing specific tasks (e.g. an algorithm provided by
> service A can be used to build a model by service B, using training
> dataset available at service C; the model at service B could be
> validated by service D and used to predict properties for a dataset
> hosted at service E, etc). You can have all of these running on a
> single box, or on a private cluster, or as (distributed) services that
> you offer to the public to use.
>
> > So Barry told me that you have a linux version of
> > tox-create/tox-predict? Is that true?
>
> See my previous and Micha's mail for a detailed answers to these
> questions. The apps are platform independent and can run on any OS.
> ToxPredict and its dependencies are Java-based, ToxCreate and its
> dependencies are Ruby-based.
>
> As a somehow easier first step you might want to try the OpenTox
> virtual appliance, which has all of these apps pre-installed for you
> on a recent version of Linux:
>
> http://ambit.uni-plovdiv.bg/downloads/opentox/Opentox%20Virtual%20Appliance%20DC.ova
>
> Please note that this is a large file (2730474496 bytes). Its md5
> checksum which you could check to ensure that no errors have occurred
> while downloading it is: 1530bb83e88c3c646bcbac3183745bab
>
> You could import and run the appliance in VirtualBox
> (http://www.virtualbox.org/).
>
> Let us know if we can be of further assistance.
>
> Kind regards,
> Vedrin
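
The checksum comparison suggested in the quoted message for the virtual
appliance download could be done along these lines. This is a minimal sketch
in Python, not something taken from the original thread: the function names
`md5_of_file` and `matches_published` are illustrative, and the local file
name in the usage note is an assumption. The expected size and checksum are
the values quoted above.

```python
import hashlib
import os

# Size and MD5 checksum quoted in the message for the appliance file.
EXPECTED_SIZE = 2730474496
EXPECTED_MD5 = "1530bb83e88c3c646bcbac3183745bab"


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks.

    Chunked reading keeps memory use low for a multi-gigabyte download.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected_size=EXPECTED_SIZE, expected_md5=EXPECTED_MD5):
    """Compare the file size first (cheap), then the MD5 digest."""
    if os.path.getsize(path) != expected_size:
        return False
    return md5_of_file(path) == expected_md5.lower()
```

Assuming the file was saved locally as "Opentox Virtual Appliance DC.ova",
calling `matches_published("Opentox Virtual Appliance DC.ova")` returns True
only if both the size and the checksum agree with the published values. An
MD5 match of this kind guards against download errors, which is the purpose
stated in the message.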