Download descriptor extraction tools, protein-ligand complexes, and pre-extracted descriptors

| Name | Type | Description | Added |
| --- | --- | --- | --- |
| PDBbind15PlcSample | Molecules | The PDBbind database aims to provide a comprehensive collection of experimentally measured binding affinity data for all types of biomolecular complexes deposited in the Protein Data Bank (PDB). It thus links the energetic and structural information of these complexes, which is helpful for computational and statistical studies of molecular recognition in biological systems. | 2016-05-19 20:25:00 |
| ScorePDBbind14RefTrTs | Descriptors | The 2014 refined set of PDBbind protein-ligand complexes, characterized by sixteen different descriptor sets. The data also include experimental binding affinities collected from the literature by the curators of PDBbind. This data set is well suited for training and evaluating machine-learning scoring functions for binding affinity prediction and ligand ranking. It is composed of 3446 protein-ligand complexes characterized by more than 2700 descriptors from 16 different scoring functions and interaction modeling tools. | 2016-05-20 08:18:05 |
| ScorePDBbind14PrTr | Descriptors | The 2014 primary training set of PDBbind protein-ligand complexes, characterized by sixteen different descriptor sets. The data also include experimental binding affinities collected from the literature by the curators of PDBbind. This data set is well suited for training machine-learning scoring functions for binding affinity prediction and ligand ranking. It is composed of 3251 protein-ligand complexes characterized by more than 2700 descriptors from 16 different scoring functions and interaction modeling tools. The primary training set is a subset of the refined set of PDBbind 2014. (The refined set can be found in DDB under the name ScorePDBbind14RefTrTs.) | 2016-05-20 08:25:32 |
| ScorePDBbind14CrTs | Descriptors | The 2014 core test set of PDBbind protein-ligand complexes, characterized by sixteen different descriptor sets. The data also include experimental binding affinities collected from the literature by the curators of PDBbind. This data set is well suited for testing machine-learning scoring functions for binding affinity prediction and ligand ranking. It is composed of 195 protein-ligand complexes grouped into 65 clusters of protein families. Each cluster contains three complexes whose ligands bind to the same protein but with very different binding affinities. The complexes are characterized by more than 2700 descriptors from 16 different scoring functions and interaction modeling tools. The test data ScorePDBbind14CrTs is a subset of the refined complexes of PDBbind 2014 and can be used for testing ML SFs trained on the non-overlapping primary training set ScorePDBbind14PrTr (its DDB name). | 2016-05-20 08:40:13 |
| XANLDRSADGC_xgboost | SF | An XGBoost-based SF fitted to the primary set of PDBbind 2014 and tested on the core set of the same PLC database. | 2016-05-20 08:44:55 |
| XANLDRSADGC_filter | Filter | A filter built to detect noisy descriptors derived using 11 different tools. | 2016-05-20 08:53:28 |
| ScorePDBbind15CrTs | Molecules | The 2015 core test set of PDBbind protein-ligand complexes, characterized by sixteen different descriptor sets. The data also include experimental binding affinities collected from the literature by the curators of PDBbind. This data set is well suited for testing machine-learning scoring functions for binding affinity prediction and ligand ranking. It is composed of 195 protein-ligand complexes grouped into 65 clusters of protein families. Each cluster contains three complexes whose ligands bind to the same protein but with very different binding affinities. The complexes are characterized by more than 2700 descriptors from 16 different scoring functions and interaction modeling tools. The test data ScorePDBbind14CrTs is a subset of the refined complexes of PDBbind 2015 and can be used for testing ML SFs trained on the non-overlapping primary training set ScorePDBbind14PrTr (its DDB name). The 2014 and 2015 PDBbind core test sets are identical. | 2016-05-20 09:07:47 |
| nnscoreSminaDpocketDescs | Descriptors | Descriptor types that might include descriptors for protein targets only. | 2016-05-26 06:29:04 |
| allDescriptors_Xgboost_Pr14Cr14 | SF | XGBoost fitted to Pr14 characterized by all descriptors and tested on Cr14. | 2016-05-26 07:09:24 |
| allDesc_BRT | SF | BRT on all descriptors, Pr14/Cr14. | 2016-05-26 07:46:25 |
| XGBOOST_with_shuffle | SF | | 2016-05-26 22:06:27 |
| dnn_xag1030 | SF | | 2016-11-08 03:31:32 |
| many1036 | SF | | 2016-11-08 03:40:40 |
| svm1102 | SF | | 2016-11-08 04:02:51 |
| test_file | Descriptors | test_descript | 2017-06-16 17:41:51 |
| -d | Descriptors | | 2017-10-31 03:06:05 |
| -d | Descriptors | | 2018-01-04 03:50:32 |
| ID1 | Descriptors | | 2018-01-04 15:29:48 |
| -u | Filter | | 2018-01-16 18:55:41 |
| -u | Filter | | 2018-01-16 19:57:27 |
| -d | Descriptors | | 2018-01-17 14:34:00 |
| -d | Descriptors | | 2018-01-18 02:27:57 |
| -u | Filter | | 2018-01-27 04:22:32 |
| -d | Descriptors | | 2018-02-06 09:57:06 |
| tt.mol2 | Software | tt.mol2 | 2018-02-06 13:05:15 |
| -d | Descriptors | | 2018-02-16 01:28:57 |
| -d | Descriptors | | 2018-03-13 05:30:08 |
| -d | Descriptors | | 2018-03-16 19:00:40 |
| -d | Descriptors | | 2018-03-27 03:56:16 |
| -d | Descriptors | | 2018-03-27 04:10:16 |
| -u | Filter | | 2018-03-27 04:26:05 |
| -d | Descriptors | | 2018-04-06 13:50:44 |
| -d | Descriptors | | 2018-04-18 14:18:50 |
| -d | Descriptors | | 2018-04-26 12:03:59 |
| -d | Descriptors | | 2018-06-13 14:59:34 |
| PDBBind 2017 Rule of 3 | Molecules | A subset of PDBBind 2017 containing only protein-fragment complexes. I filtered PDBBind using the cutoff MW <= 300 Da, similar to the Rule of Three for fragments. | 2018-07-09 17:55:46 |
| -d | Descriptors | | 2018-07-09 18:01:19 |
| pdbbind300 | Molecules | PDBBind dataset filtered using the Rule of 3. | 2018-07-09 18:05:58 |
| -d | Descriptors | | 2018-07-09 18:06:16 |
| Extract_fragments | Descriptors | | 2018-07-09 18:06:51 |
| | Molecules | | 2018-07-10 02:10:32 |
| pdbbind300train | Molecules | PDBBind 2017 protein-fragment dataset. | 2018-07-10 02:29:28 |
| scoring_rfscore_fragments | SF | | 2018-07-10 02:58:34 |
| test_train_300_desc | Descriptors | | 2018-07-10 04:17:55 |
| nooverlap_scoring | SF | | 2018-07-10 04:30:44 |
| nooverlap_scoring | SF | | 2018-07-10 04:57:10 |
| -d | Descriptors | | 2018-07-11 20:03:38 |
| -d | Descriptors | | 2018-07-11 20:06:15 |
| -d | Descriptors | | 2018-09-08 13:36:56 |
| testing_marcos | Descriptors | | 2018-10-31 10:20:07 |
| -d | Descriptors | | 2018-12-14 04:06:38 |
| -d | Descriptors | | 2018-12-14 04:14:00 |
| -d | Descriptors | | 2018-12-14 04:17:43 |
| -d | Descriptors | | 2019-01-04 14:43:02 |
| -d | Descriptors | | 2019-02-07 11:04:28 |
| -d | Descriptors | | 2019-07-08 06:05:13 |
| -d | Descriptors | | 2019-07-14 09:23:22 |
| -d | Descriptors | | 2019-07-14 09:27:08 |
| -d | Descriptors | | 2019-07-14 09:29:46 |
| -d | Descriptors | | 2019-07-16 12:23:30 |
| -d | Descriptors | | 2019-08-27 07:05:21 |
| -d | Descriptors | | 2019-10-09 20:06:08 |
| -d | Descriptors | | 2019-10-22 08:11:06 |
| Retest_Ecfp | SF | Retest_Ecfp | 2019-12-31 19:52:17 |
| r3 | Descriptors | r3 | 2020-03-16 06:50:34 |
| test_ddb_14 | SF | | 2020-03-22 23:24:57 |
| Lets_test_if_email_is_working | SF | | 2020-03-23 00:05:01 |
| email_test2 | SF | | 2020-03-23 00:07:59 |
| test_email2 | SF | | 2020-03-23 01:58:01 |
| test_email3 | SF | | 2020-03-23 02:17:58 |
| filter_test | Filter | | 2020-03-23 02:32:01 |
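
The 2014 entries above describe a refined set of 3446 complexes, a primary training set of 3251 complexes, and a non-overlapping core test set of 195 complexes. These sizes are consistent with the primary training set being the refined set with the core set removed, although the listing does not state this outright. The following is a minimal sketch of that split in Python, under that assumption. The function name `split_refined_set` and the placeholder complex IDs are hypothetical and are not part of DDB or PDBbind.

```python
def split_refined_set(refined_ids, core_ids):
    """Split a refined set into disjoint (primary_train, core_test) ID sets.

    Assumes the core test set is a subset of the refined set and that the
    primary training set is everything in the refined set outside the core.
    """
    refined = set(refined_ids)
    core = set(core_ids)

    # Every core complex should also appear in the refined set.
    missing = core - refined
    if missing:
        raise ValueError(f"{len(missing)} core IDs are not in the refined set")

    # Removing the core complexes guarantees no train/test overlap.
    primary = refined - core
    return primary, core


# Placeholder IDs sized like the 2014 sets in the listing (not real PDB codes).
refined_ids = [f"cplx{i:04d}" for i in range(3446)]
core_ids = refined_ids[:195]

primary, core = split_refined_set(refined_ids, core_ids)
print(len(primary), len(core), primary.isdisjoint(core))  # 3251 195 True
```

Checking disjointness before fitting is a cheap guard against leaking test complexes into training, which would inflate the apparent accuracy of a scoring function evaluated on the core set.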