alvaDesc – Python

The alvaDescCLIWrapper package can be used to access alvaDesc functionalities from Python (3.5 or higher).

In order to work, the package requires a licensed version of alvaDesc installed on the same computer.
Minimum alvaDesc version: 1.0.14


1. Calculate two descriptors for three molecules on Windows:

from alvadesccliwrapper.alvadesc import AlvaDesc

aDesc = AlvaDesc('C:\\Program Files\\Alvascience\\alvaDesc\\alvaDescCLI.exe')  # Windows default alvaDescCLI.exe location
aDesc.set_input_SMILES(['C#N', 'CCCC', 'CC(=O)OC1=CC=CC=C1C(=O)O'])
if not aDesc.calculate_descriptors(['MW', 'AMW']):
    print('Error: ' + aDesc.get_error())

The result is a list of lists of floats, one inner list per molecule, holding the requested descriptors:

['MW', 'AMW']
[[58.14, 4.15285714285714], [180.17, 8.57952380952381]]

Molecule                    MW        AMW
CCCC                        58.14     4.15285714285714
CC(=O)OC1=CC=CC=C1C(=O)O    180.17    8.57952380952381
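
Because the output is a plain list of lists, it can be convenient to pair each row with its molecule and each value with its descriptor name. The following is a minimal sketch that uses the values shown above, hard-coded in place of a real get_output() call. The helper name rows_to_records is hypothetical and is not part of the wrapper.

```python
def rows_to_records(molecules, descriptor_names, rows):
    """Pair each molecule with a {descriptor name: value} dictionary."""
    if len(molecules) != len(rows):
        raise ValueError('expected one row of values per molecule')
    return {
        mol: dict(zip(descriptor_names, row))
        for mol, row in zip(molecules, rows)
    }


# Values copied from the example output above. In real use they would come
# from aDesc.get_output_descriptors() and aDesc.get_output().
descriptor_names = ['MW', 'AMW']
rows = [[58.14, 4.15285714285714], [180.17, 8.57952380952381]]
molecules = ['CCCC', 'CC(=O)OC1=CC=CC=C1C(=O)O']

records = rows_to_records(molecules, descriptor_names, rows)
print(records['CCCC']['MW'])
```

The length check is there so that a mismatch between the molecule list and the returned rows is reported instead of being silently truncated by zip.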

2. Calculate all descriptors for an input file on Linux:

from alvadesccliwrapper.alvadesc import AlvaDesc

aDesc = AlvaDesc('/usr/bin/alvaDescCLI')  # Linux default alvaDescCLI location
aDesc.set_input_file('./myfile.sdf', 'MDL')  # input file and its format, as in example 3
if not aDesc.calculate_descriptors('ALL'):
    print('Error: ' + aDesc.get_error())
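
Since the alvaDescCLI location differs between operating systems, a script meant to run on more than one machine may want to pick the path at run time. The sketch below is one possible way to do that, assuming the Windows and Linux default locations quoted in examples 1 and 2. The names DEFAULT_CLI_PATHS and default_cli_path are hypothetical, and any other system is expected to supply its path explicitly.

```python
import platform

# Default alvaDescCLI locations quoted in examples 1 and 2 on this page.
# Other systems are deliberately left out: pass their path explicitly.
DEFAULT_CLI_PATHS = {
    'Windows': 'C:\\Program Files\\Alvascience\\alvaDesc\\alvaDescCLI.exe',
    'Linux': '/usr/bin/alvaDescCLI',
}


def default_cli_path(system=None):
    """Return the default alvaDescCLI path for the given OS name.

    `system` uses the names returned by platform.system(); when omitted,
    the current operating system is used.
    """
    system = system or platform.system()
    try:
        return DEFAULT_CLI_PATHS[system]
    except KeyError:
        raise ValueError('no default alvaDescCLI path known for ' + repr(system))


print(default_cli_path('Linux'))
```

The returned string can then be passed to the AlvaDesc constructor in place of a hard-coded path.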

3. Calculate the ECFP fingerprint with size 1024 saving the result to a text file on macOS:

from alvadesccliwrapper.alvadesc import AlvaDesc

aDesc = AlvaDesc('/Applications/')  # macOS default alvaDescCLI location
aDesc.set_input_file('./myfile.sdf', 'MDL')
aDesc.set_output_file('./myoutput.txt')  # example output path (assumed); results are written here
if not aDesc.calculate_fingerprint('ECFP', 1024):
    print('Error: ' + aDesc.get_error())
# the result is in the output file, so get_output() is not available here
# print('Results: ' + aDesc.get_output())

Notes on set_output_file:

  • When set_output_file is used, the results are saved to the specified file and are not available through the get_output function.
  • set_output_file writes the output in the alvaDesc standard format (which can be affected by alvaDesc settings). Do not use this function if you need a specific output file format.
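
When a specific file format is needed, one option is to skip set_output_file and write the values returned by the wrapper yourself. The sketch below writes a simple CSV with Python's standard csv module. The helper name write_descriptors_csv and the NAME column header are assumptions, and the sample values from example 1 stand in for real get_output_molecule_names(), get_output_descriptors() and get_output() results.

```python
import csv
import io


def write_descriptors_csv(handle, molecule_names, descriptor_names, rows):
    """Write a header line plus one line per molecule to an open text file."""
    writer = csv.writer(handle, lineterminator='\n')
    writer.writerow(['NAME'] + list(descriptor_names))
    for name, row in zip(molecule_names, rows):
        writer.writerow([name] + list(row))


# Sample values taken from example 1. An in-memory buffer is used here so
# the sketch runs without touching the disk; a file opened with
# open(path, 'w', newline='') works the same way.
buffer = io.StringIO()
write_descriptors_csv(
    buffer,
    ['CCCC', 'CC(=O)OC1=CC=CC=C1C(=O)O'],
    ['MW', 'AMW'],
    [[58.14, 4.15285714285714], [180.17, 8.57952380952381]],
)
print(buffer.getvalue())
```

Writing the file in your own code keeps the layout independent of alvaDesc settings.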

4. Convert descriptor output to NumPy / Pandas:
The results returned by get_output can be converted to a NumPy matrix or a Pandas dataframe.

import numpy as np
import pandas as pd
from alvadesccliwrapper.alvadesc import AlvaDesc

aDesc = AlvaDesc()  # Windows is the default
aDesc.set_input_SMILES(['C#N', 'CCCC', 'CC(=O)OC1=CC=CC=C1C(=O)O'])
if not aDesc.calculate_descriptors(['AMW', 'MW', 'nBT']):
    print('Error: ' + aDesc.get_error())
else:
    res_out = aDesc.get_output()
    # get molecule names according to alvaDescCLI standard
    res_mol_names = aDesc.get_output_molecule_names()
    res_desc_names = aDesc.get_output_descriptors()

    # NumPy array of array and matrix
    numpy_array_of_array = np.array([np.array(xs) for xs in res_out])
    numpy_matrix = np.matrix(res_out)  # NumPy matrix
    print('NumPy matrix')
    print(numpy_matrix)

    # Pandas dataframe
    pandas_df = pd.DataFrame(res_out)
    pandas_df.columns = res_desc_names
    pandas_df.insert(loc=0, column='NAME', value=res_mol_names)
    print('Pandas dataframe')
    print(pandas_df)

The result is:

NumPy matrix
[[ 4.1529 58.14 13. ]
[ 8.5795 180.17 21. ]]
Pandas dataframe
0 Molecule1 4.1529 58.14 13.0
1 Molecule2 8.5795 180.17 21.0

More examples are available in the documentation included in the alvaDescCLIWrapper zip file.

