Dataset of 140 {Protein Chain, Ligand Residue} pairs used in the study of the relationship between protein-protein sequence identity and ligand-ligand molecular similarity.

PDB Chain Ligand
1a17      CST
1a17      FMN
1a7d      CFO
1a7e      OFO
1a7v  A   HEM
1aca      COA
1aca      PLY
1ae7      SO4
1agm      ACR
1agm      ASL
1agm      MAN
1aig  M   BCL
1aig  M   BPH
1aig  M   U10
1aij  M   LDA
1alu      SO4
1alu      TAR
1aok  B   ACT
1ar1  A   HEA
1ar1  A   LDA
1au1  A   FUC
1au1  A   GLC
1au1  A   MAN
1ayp  A   INB
1ayx      TRS
1b4w  A   BOG
1b5l      SO4
1bcf  A   HEM
1bdj  B   SO4
1be3  C   HEM
1bk6  A   ALA
1bk6  A   ARG
1bk6  A   LYS
1bk6  A   VAL
1bk9      BU1
1bk9      PBP
1bmf  A   ANP
1bp2      MPD
1buc  A   FAD
1byt      4NC
1cb8  A   GOL
1d8d  A   FII
1d8e  A   ILE
1d8e  A   LYS
1d8e  A   MET
1db4  A   SIN
1dcn  D   AS1
1dcy  A   I3N
1dky  A   LEU
1dog      AS1
1dog      AS2
1dog      GLC
1dog      NOJ
1ecm  A   TSA
1ee4  A   LEU
1ee4  A   PRO
1egc  A   CO8
1elr  A   ACE
1elr  A   ASP
1elr  A   GLU
1elr  A   MET
1elr  A   VAL
1elw  A   GLY
1elw  A   ILE
1elw  A   PRO
1elw  A   THR
1elw  A   TRS
1fap  B   RAP
1fdk      GLE
1fgj  A   HEC
1fgj  A   HEM
1fpp  A   PO4
1fuo  A   CIT
1fup  A   PMA
1gah      NAG
1gai      GAC
1hmo  A   OXY
1hrs      PP9
1ivh  A   COS
1jsw  A   GLC
1kny  A   APC
1kny  A   KAN
1kvo  A   OAP
1mab  A   ATP
1mps  M   SPN
1mro  B   COM
1mro  B   TP7
1nbb  A   NBN
1ng1      ACY
1ng1      EDO
1nsg  B   RPX
1ocz  A   AZI
1pcr  M   SPO
1pcr  M   PO4
1ppa      ANL
1ppr  M   CLA
1ppr  M   DGD
1ppr  M   PID
1prc  L   UQ1
1prc  M   BPB
1prc  M   MQ7
1prc  M   NS1
1prc  M   SO4
1pss  M   CRT
1qbq  A   MSE
1qgu  D   EDO
1qgu  D   MO2
1qle  A   PC2
1qle  C   PC2
1qov  M   CDL
1qsa  A   ACT
1qsa  A   GOL
1qsa  A   SO4
1rcc      BET
1sqc      LDA
1vnc      AZI
1vns      SO4
256b  A   SO4
2erl      EOH
2fap  B   RAD
2hmz  A   ACT
2hmz  A   FEA
2lig  A   ASP
2lig  A   PHN
2lig  A   SO4
2mhr      AZI
2mhr      FEO
2mhr      SO4
2prc  L   UQ2
2prc  M   7MQ
2prc  M   BCB
2prc  M   NS5
2sqc  A   C8E
3bct      URE
4prc  L   SMA
4rcr  M   BOG
5p2p  A   DHG
5prc  L   ATZ
6prc  L   CEB
7prc  L   CET
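
The chain column in the table above is blank for some entries, so each row has either two or three whitespace-separated fields. A minimal sketch of reading the table in Python is given below. It assumes the layout shown here (a four-character PDB code beginning with a digit, an optional chain identifier, then the ligand residue code). The names `Pair` and `parse_pairs` are illustrative only and are not part of the original study.

```python
from typing import List, NamedTuple, Optional


class Pair(NamedTuple):
    """One {protein chain, ligand residue} pair (hypothetical helper type)."""
    pdb: str
    chain: Optional[str]  # None when the chain column is blank
    ligand: str


def parse_pairs(text: str) -> List[Pair]:
    """Parse rows of the form 'PDB [Chain] Ligand' into Pair records."""
    pairs = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:
            # Chain column is blank: only PDB code and ligand are present.
            pdb, ligand = fields
            chain = None
        elif len(fields) == 3:
            pdb, chain, ligand = fields
        else:
            continue  # blank lines or surrounding prose
        # Assumption: PDB codes are four characters and start with a digit.
        # This also skips the 'PDB Chain Ligand' header row.
        if len(pdb) != 4 or not pdb[0].isdigit():
            continue
        pairs.append(Pair(pdb, chain, ligand))
    return pairs


# A few rows copied from the table above.
sample = """PDB Chain Ligand
1a17      CST
1a7v  A   HEM
1prc  L   UQ1
1prc  M   BPB
"""

pairs = parse_pairs(sample)
for p in pairs:
    print(p.pdb, p.chain or "-", p.ligand)
```

Splitting on whitespace rather than on fixed column positions is a deliberate choice here: it still works if the column spacing was altered when the table was copied.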

More details of the proteins and ligands described by these reference codes can be found in Dr Roman Laskowski's PDBsum database at EBI.

The definitions used for the bit strings describing the ligand residues are here.