Tuesday, December 18, 2012

1-Click Docking

"Docking has never been easier!"

This is the phrase we use when talking about our latest feature: 1-Click Docking. It might sound like a marketing phrase, but it's true. At least I'm not aware of any simpler solution for molecular docking. Here are my reasons:

1. It is online
Desktop applications can't be simpler by definition: think about downloading, installing and running software. 1-Click Docking is online, so you go to this URL and it is immediately there, ready to use.

2. JavaScript Editor
You don't need to wait for Java-based molecule sketchers to load. Again, you go to 1-Click Docking and ChemWriter, the JavaScript editor, is immediately there, so you can start drawing your ligand.

3. Select a target
We have integrated scPDB, which allows you to select a target from 10,000 target structures. No need to download, upload, select binding site, etc. (Note: upload is also possible, but requires registration)

The idea of 1-Click Docking came from our user feedback, which can be summarized with a single word: simplify! They told us that functionalities accessible under the "Screen" tab at mcule.com are really useful, but only for those who know how to use them. So to eliminate the barrier, we decided to introduce 1-Click Applications, which are extremely easy to use.

With 1-Click Docking you only need to draw a ligand, select a target and click on Dock. Then you start browsing the results. It is ideal for a first insight into ligand-target interactions and affinity. In some cases that's enough to test your idea. If you need more, take the next step and learn how to use the Docking (Vina) workflow step.

Have a nice docking!

Friday, November 16, 2012

About this StartupSauna thing...

Mcule has been selected as one of the most promising startups from Northern Europe, the Baltics and Russia. As a result, we are now at StartupSauna in Helsinki, Finland, for a one-month training. We have been through a lot, improved our presentation skills and got very useful feedback about our business model, products, etc. Here is a nice video showing a little bit of the essence of the whole thing:

Two more weeks are still to go, and we are very excited about it. Of course we keep working on mcule (we have just put out a sizeable release), but this period is very important for us in terms of how we should build our business, marketing strategies, etc. The atmosphere here in Helsinki is great and accelerates the transformation of "ideas" into viable companies. It would be great to have more such incubators for startup companies.

Thursday, November 8, 2012

Major release

Subscription packages, user molecule upload, docking visualization and more!

It has been a while since the last major mcule release, but now it is time for a big announcement! We are really proud of the outcome, so here are the new features:

1. Subscription packages

OK, here is something brand new in this field: you can now rent a single tool for a single project! This means that, in contrast to the typical annual licensing schemes of modelling software companies, we offer our packages for 1, 3, 6, and 12 month periods. You don't have the budget for maintaining an annual license for a tool? No problem, subscribe only when you really need it. Subscribing is super-easy: it can be done in 1 minute!

We have signed partnership agreements with BioSolveIT and ChemAxon to distribute their products at mcule.com. As a result, you can now subscribe to the drug discovery tools of these software companies here. Besides, we are also introducing two mcule packages, but let's go step by step.

FTrees Visual Similarities is a great tool for scaffold hopping. It not only identifies diverse, novel active scaffolds but also shows why the query and target molecules are similar.

Physicochemical properties calculated by the popular Calculator plugins of ChemAxon are widely used and are known to give excellent correlation with experimental data. More than 100 precalculated ChemAxon properties for the whole mcule database are available in the "Properties Access" package option (Academic users: it's free, so go to the Pricing page, apply, and we will switch it on for you). To calculate ChemAxon properties with all possible settings, you should subscribe to the "Properties Access and Calculator" package option.

The first mcule package is Database Access. It gives you access to two things: (i) the Product filter, and (ii) exporting chemical supplier data. The Product filter is very useful for filtering by supplier names, catalogs, stock availability, price, delivery time, etc. For example, it can help to filter out unreliable suppliers and virtual compounds.

The second mcule package is Docking (Vina). The 100k package option allows you to dock up to 100,000 molecules each month. We are planning to introduce a 1M package later on.

For more information about the subscription packages and package options check out our Pricing page. To learn more about the individual drug discovery tools available at mcule.com, check the mcule documentation.

2. User molecule upload

Go to the "Collections" tab, and you will find a new button: "Molecule upload". From now on, all the cool stuff available at mcule.com is no longer limited to the mcule database. If you have an in-house database or have designed some virtual libraries, you can upload them and run the searches and screens on your own libraries. To learn more about user collections, again, check the mcule documentation.

3. Docking visualization

We have found and implemented a great JavaScript 3D visualizer: GLMol. After docking, a "Visualize pose" link will appear, which will display something like this:

Residue labeling has also been implemented. We continue working to further improve this visualization, but the user experience of the current version is already better than that of any other WebGL molecule viewer (of course I'm biased, but still...). To get the best out of this, we recommend running mcule in the Chrome browser.

4. New tab: "Applications"

Under this tab, we will introduce a lot of useful stuff. First: Property calculator. Draw a molecule or enter its identifier, then click on "Calculate" and you will see all mcule properties listed immediately. This works for molecules not in the mcule database too.

5. Documentation

I have already referred to our documentation several times in this post. This is because it has been significantly extended, and you can find most of the basic information about the mcule system there, together with some nice screenshots. If you run into trouble, you might also want to check the FAQ page, where the most typical questions and answers are listed.

6. Flexibility

This is not as obvious as it should be (or it is probably better if you don't notice any of this), but we have completely redesigned the screening workflow technology behind the scenes. Why? To be able to take it to the next step: scale the system up to the sky. Now you can really put any workflow step before or after any other, and this makes the mcule technology really unique. We integrate this whole bunch of tools so flexibly that you can set up whatever workflow you want.

What's next?

A lot of stuff as always, but I don't think docking a few million compounds will remain an issue for too long. Only for the very few people around the world who are not mcule users yet. Do you know any? Please tell them to join mcule! We honor the loyalty of our users, especially those who bring new people. Soon you will find out how much!

Wednesday, September 26, 2012

Ten presentations later...

Summer is over, so after attending conferences and having some holiday, we are back at the mcule headquarters. We have made lots of new connections, talked to many people and got great feedback on the current status of mcule.

ACS was very intensive as always. Ferenc and I had 4 oral and 5 poster presentations, which you can check here. The most interesting topics for me were the ones trying to offer new strategies to improve the productivity of the drug discovery pipelines:

Physicochemical property profiles
One significant direction is to improve the physicochemical properties (or at least not significantly worsen them) during optimization. The presentations of Ernesto Freire and George Keseru emphasized that hits and leads should be optimized enthalpically rather than entropically (i.e. more H-bonds, less hydrophobicity) to get more favourable ADMET profiles (less toxicity, more selectivity, etc.). It is argued that clinical failures are primarily due to the unsuitable pharmacokinetic profile of the drug candidates (too hydrophobic, not soluble, etc.). The reason for the failures is either toxicity (because of the lack of selectivity) or the lack of effect (again caused by the small therapeutic window).

Phenotypic screening
The other interesting direction sounds somewhat contradictory to the previous one. Phenotypic screening is about finding molecules that show the exact therapeutic effect we need. It is somewhat against the target-based approach, as it is not looking for compounds acting on a single target. On the contrary, the more targets (involved in the systems biology of the disease) it hits, the better. This approach says just the opposite of the first direction. Non-selective ligands are welcome here, and to have an effect on multiple targets one would probably go for something hydrophobic, which will lack specificity. Chris Lipinski argues that most successful drugs hit many targets (e.g. kinase inhibitors).

I'm very much looking forward to seeing which direction will deliver more new drugs to the market in the future.

Philly vs. Vienna

Right after the ACS I attended the EuroQSAR conference in Vienna: nice meeting, great people, interesting talks. Interesting case studies and some new approaches have been presented there. Some of the major software companies were also exhibiting, and we discussed potential integration strategies with them. We are currently discussing the details with many of them, so the list of tools waiting for integration is getting longer and longer. Which is a good thing.

In the meantime, the mcule team has been working hard, and a major release is coming up! Short list of what you can expect to appear on the mcule drug discovery platform soon: first subscription packages, improved visualization (incl. binding site and docking poses), user molecule upload, bulk exact search, product filter, and more.

Monday, August 27, 2012


Right after the ACS National Meeting we are now in Vienna, attending the 19th EuroQSAR Symposium with the following poster:


Friday, August 17, 2012

Mcule in C&EN pick of the day!

Our first ACS presentation is featured in the C&EN pick of the day video! (mcule is at 1:10)


Join us on Sunday!

"Application of automated and validated virtual screening workflows: A hand-tool for medicinal chemists to generate and/or evaluate ideas”
Session: When Chemists and Computers Collide: Putting Cheminformatics in the Hands of Medicinal Chemists
August 19, 2012 from 10:45 am to 11:05 am Philadelphia Marriott Downtown, Room: 302/303

Thursday, August 16, 2012

ACS, Philadelphia, August 19-23

Only a few days left and we will give 4 oral presentations and present 5 posters at the 244th ACS National Meeting.
You can check and download them here:

Schedule for the oral presentations:

1. “Application of automated and validated virtual screening workflows: A hand-tool
for medicinal chemists to generate and/or evaluate ideas”
August 19, 2012 from 10:45 am to 11:05 am
Philadelphia Marriott Downtown, Room: 302/303

2. “Scaling drug discovery pipelines at mcule.com”
August 20, 2012 from 10:20 am to 10:50 am
Pennsylvania Convention Center, Room: 118 B

3. “Identifying novel JAK1 inhibitors by structure-based virtual screening: A case
study of using drug discovery tools at mcule.com”
August 22, 2012 from 10:45 am to 11:15 am
Pennsylvania Convention Center, Room: 117

4. “Evaluation of data quality in currently available compound libraries”
August 22, 2012 from 11:55 am to 12:15 pm
Philadelphia Marriott Downtown, Room: Conference Room 307

Schedule for the posters:

1. “Exploring the chemical space of histamine receptor ligands using drug discovery
tools at mcule.com”
August 19, 2012 from 7:00 pm to 9:00 pm
Pennsylvania Convention Center, Room: Hall D

2. “Application of automated and validated virtual screening workflows: A hand-tool
for medicinal chemists to generate and/or evaluate ideas”
August 20, 2012 from 8:00 pm to 10:00 pm
Pennsylvania Convention Center, Room: Hall D

3. “Evaluation of data quality in currently available compound libraries”
August 20, 2012 from 8:00 pm to 10:00 pm
Pennsylvania Convention Center, Room: Hall D

4. “Integration strategies for virtual and experimental screening. A case study on c-Jun N-terminal kinase 3 (JNK3)”
August 21, 2012 from 6:00 pm to 8:00 pm
Pennsylvania Convention Center, Room: Hall G

5. “Identifying TRPV1 modulators by structure-based virtual screening: A case study of using drug discovery tools at mcule.com”
August 21, 2012 from 6:00 pm to 8:00 pm
Pennsylvania Convention Center, Room: Hall G

We hope to see you there and that you will find our projects interesting!
Stay tuned for more information with mcule on:

Twitter: http://twitter.com/mculecom
Facebook: http://facebook.com/mculecom

The mcule team

Monday, July 30, 2012

mcule webinar on YouTube

Watch the mcule introductory webinar in full length on our YouTube channel:

This is the 1st mcule webinar introducing the most important features of mcule.com. Robert gives an overview of the functionalities and shows how to use them for special use cases.

Wednesday, July 18, 2012

mcule webinar

We will soon hold a webinar introducing the most important features of mcule.com. It will give an overview of the functionalities and we will show how to use them for special use cases. You will be able to ask questions by chatting during and after the presentation. If you have a special topic you would like to be covered on this webinar, please let us know: info@mcule.com.

You can register for the webinar by clicking on one of the following links (please select the one with the more convenient time for your time zone):

25th of July 2012, 10:00AM (London, UK), 11:00AM (Berlin, Germany), 5:00PM (Beijing, China) and 6:00PM (Tokyo, Japan)

25th of July 2012, 11:00AM (West Coast, USA) and 2:00PM (East Coast, USA)

We hope to meet you at the webinar!

Tuesday, July 3, 2012

New features at mcule.com: Diversity selection and SMARTS query filter

Some new features have been activated at mcule.com. Yay!

Here is the result of the last few weeks' development: Diversity selection, SMARTS query filter and target upload for Docking (Vina). See the details below with some short videos.

Diversity selection
With this new filter you can select a diverse subset of a molecule collection. Diversity selection can be performed with two chemical fingerprints, using the Tanimoto coefficient as the similarity metric. You can set the maximum similarity (minimum diversity) of the output collection, and the maximum number of output molecules. The monthly limit for Diversity selection is set to 10,000. If you would like to apply this filter to larger collections, contact us!

For a more detailed description of the Diversity selection filter, click here.
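
One common way to do this kind of selection is a greedy pass: keep a molecule only if its Tanimoto similarity to everything already kept stays at or below the chosen maximum. The sketch below assumes that fingerprints are given as Python sets of on-bit indices. The function names and the toy fingerprints are made up for illustration, and this is not necessarily how the mcule filter works internally.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        return 0.0
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def diverse_subset(fingerprints, max_similarity, max_molecules):
    """Greedy selection: keep a molecule only if it is at most
    `max_similarity` similar to every molecule kept so far."""
    kept = []
    for name, fp in fingerprints:
        if len(kept) >= max_molecules:
            break
        if all(tanimoto(fp, kept_fp) <= max_similarity for _, kept_fp in kept):
            kept.append((name, fp))
    return [name for name, _ in kept]


# Toy input: hypothetical fingerprints, not real molecules.
library = [
    ("mol-1", {1, 2, 3, 4}),
    ("mol-2", {1, 2, 3, 5}),    # Tanimoto 0.6 to mol-1, so it is rejected at 0.5
    ("mol-3", {7, 8, 9}),       # shares no bits with mol-1
    ("mol-4", {3, 4, 10, 11}),  # Tanimoto 1/3 to mol-1
]
selected = diverse_subset(library, max_similarity=0.5, max_molecules=10)
print(selected)
```

The result of a greedy pass depends on the input order, so sorting the collection first (for example by a score) changes which representatives survive.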

SMARTS query filtering
Substructure searching is limited to single, well-defined query molecules. Want to define a more complex query? Use the new SMARTS query filter at mcule.com! You can easily paste your SMARTS queries into the input form (max. 5 SMARTS per filter are allowed currently) and specify whether the output molecules should or should not contain any or all of your queries. Note that there is no input molecule limit for the SMARTS query filter! Sounds like a useful filter? Give it a try!

For more information about the SMARTS query filter, click here.
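
The "any or all" and "should or should not" options combine as plain boolean logic. The sketch below shows only that combination step. The `matches` argument is a hypothetical stand-in for a real SMARTS matcher from a cheminformatics toolkit; here it is faked with a substring test on SMILES text purely so the example runs, which is not real SMARTS matching. All names are assumptions.

```python
MAX_QUERIES = 5  # the post mentions a current limit of 5 SMARTS per filter


def passes_smarts_filter(molecule, queries, matches, mode="any", keep_matching=True):
    """Return True if `molecule` survives the filter.

    mode="any": one matching query is enough; mode="all": every query must match.
    keep_matching=False inverts the filter (the molecule must NOT match).
    """
    if mode not in ("any", "all"):
        raise ValueError("mode must be 'any' or 'all'")
    if len(queries) > MAX_QUERIES:
        raise ValueError("too many queries for one filter")
    hits = [matches(molecule, query) for query in queries]
    matched = any(hits) if mode == "any" else all(hits)
    return matched if keep_matching else not matched


def fake_match(smiles, pattern):
    """Placeholder matcher: substring test only, NOT real SMARTS matching."""
    return pattern in smiles


molecules = ["CCO", "CC(=O)O", "c1ccccc1"]
queries = ["C(=O)", "O"]

contains_any = [m for m in molecules if passes_smarts_filter(m, queries, fake_match, "any")]
contains_all = [m for m in molecules if passes_smarts_filter(m, queries, fake_match, "all")]
contains_none = [
    m for m in molecules
    if passes_smarts_filter(m, queries, fake_match, "any", keep_matching=False)
]
print(contains_any, contains_all, contains_none)
```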

User target upload for Docking (Vina)
Besides the comprehensive 10k database of prepared protein structures, user upload has been enabled for the Docking (Vina) filter. You can upload any target structure in mol2, pdb or pdbqt format. Preparation is optional. You can add information, such as name, organism, PDB ID, etc., to any uploaded target file. Uploaded target structures will be available from the “Select target” menu and will be searchable by this additional information.

Any feedback or questions?

Thursday, June 21, 2012

Most important criteria of a chemical supplier database

I recently wrote about our software business model and how mcule.com can serve as an online, integrated drug discovery platform for different users by offering attractive, long- and short-term software tool subscriptions. In order to get the most out of these tools, they need to be integrated with a high quality molecule database. It is equally important to make sure we have a properly prepared molecule database optimized for the searching/screening tools.

Chemical correctness. We have spent many months on our molecule registration system to filter out problematic structures and retain only the highest-quality compounds. You can find more information about our registration system here.

Purity. The typical required minimum purity for screening compounds is 90%, or preferably 95%. For some natural products, this limit might be a little lower (80-85%). These values are pretty standard and most chemical suppliers fulfil these criteria. Stereochemical purity is, however, a different story. Suppliers typically send their available compounds in SDF format, but unfortunately stereochemical information is not properly stored in these datafiles. Furthermore, the SDF v2000 format is unable to represent the stereochemistry in several cases. We therefore ask our suppliers specific stereochemical questions, to make sure their molecules are processed properly.

Identity. NMR and LC-MS spectra should be attached to the shipment or at least accessible upon request.

Acquisition rate. This heavily depends on whether the screening database is up-to-date, how quickly the screening can be completed and whether we have instant availability data on the individual compounds when ordering. Worst case scenario: a company has a chemical supplier database updated annually, screening takes a few weeks (e.g. to screen 1 million compounds by structure-based docking), and they send the IDs of virtual hits to chemical suppliers. The acquisition rate might be <50%. On the one hand, this is a waste of resources (human and computational) spent on the evaluation of non-purchasable compounds. On the other hand, these resources could have been used for compounds that were not included in the screening database because it was out-of-date. A frequently updated database of purchasable compounds integrated with the screening tools themselves is therefore a much better choice. Having instant availability data on the compounds when ordering could further improve the acquisition rate to close to 100%.

Delivery time. These molecules are intended to serve as interesting chemical starting points for new/early phase drug discovery projects. Therefore it is crucial to get these molecules delivered quickly. New and early stage projects try to exploit all available hit sources. To be able to compete with e.g. in-house high-throughput screening, and to have an impact on project decisions, hits coming from external resources should ideally arrive before the in-house experiments start. Otherwise the thinking of medicinal chemists will be primarily shaped by the hits already sitting on their table. Everything else will be sitting on the bench. First hits will have priority, and therefore quick delivery plays a crucial role in this competition. The typical industry standard for delivery is 2 weeks.

Single package delivery. Molecule procurement can be very painful. It is definitely not something a scientist would like to deal with. It is therefore very important to offer the possibility to deliver compounds produced by multiple chemical suppliers as a single package to the door of the customer. Instead of negotiating with the representatives of chemical suppliers and dealing with customs clearance, a scientist should do what he/she is best at - science.

Price. Another important aspect of course is the price of the compounds. Since several compounds are typically offered by multiple suppliers, it is important to find the best deal. Besides the price per compound, other factors determining the final price include the number of compounds ordered per supplier, minimum order fees of suppliers, delivery costs, delivery format and required delivery time. It is important to have a good algorithm that compares all possibilities and finds the best choice for the customer.
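
To see why comparing all possibilities matters, here is a minimal brute-force sketch. It assumes a simplified cost model that is not taken from any real supplier: each supplier charges the larger of the order subtotal and its minimum order value, plus a flat delivery cost. Supplier names, compound IDs and prices are invented, and delivery format and delivery time are left out.

```python
from itertools import product


def order_cost(assignment, offers, suppliers):
    """Total cost of buying each compound from the supplier given in `assignment`."""
    subtotals = {}
    for compound, supplier in assignment.items():
        subtotals[supplier] = subtotals.get(supplier, 0) + offers[compound][supplier]
    total = 0
    for supplier, subtotal in subtotals.items():
        terms = suppliers[supplier]
        # Assumed model: pay at least the minimum order value, plus delivery.
        total += max(subtotal, terms["minimum_order"]) + terms["delivery"]
    return total


def cheapest_order(offers, suppliers):
    """Try every compound-to-supplier assignment and return (cost, assignment)."""
    compounds = sorted(offers)
    choices = [sorted(offers[compound]) for compound in compounds]
    best = None
    for combo in product(*choices):
        assignment = dict(zip(compounds, combo))
        cost = order_cost(assignment, offers, suppliers)
        if best is None or cost < best[0]:
            best = (cost, assignment)
    return best


# Invented example data: price per compound at each supplier that offers it.
offers = {
    "cmpd-A": {"S1": 40, "S2": 55},
    "cmpd-B": {"S1": 70, "S2": 50},
    "cmpd-C": {"S2": 30},
}
suppliers = {
    "S1": {"minimum_order": 100, "delivery": 60},
    "S2": {"minimum_order": 50, "delivery": 40},
}

best_cost, best_assignment = cheapest_order(offers, suppliers)
naive = {c: min(offers[c], key=offers[c].get) for c in offers}  # cheapest price per compound
naive_cost = order_cost(naive, offers, suppliers)
print(best_cost, best_assignment, naive_cost)
```

In this toy case, picking the lowest list price for each compound splits the order across two suppliers and ends up more expensive than buying everything from one of them. Exhaustive enumeration grows exponentially with the number of compounds, so a real order of any size would need a smarter search.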

Thursday, June 7, 2012

ACS presentations 2.0 (YouTube videos)

Some people requested a non-Silverlight version of our ACS presentations. We contacted ACS to get permission for preparing a different format for the talks. We uploaded them on YouTube, here are the videos: 

Wednesday, May 30, 2012

New release! Live-docking, property filtering, new table view

After the first release of mcule, which included basic searching functionalities, we have been working hard on the integration of several new features. I'm pleased to introduce:

Docking (Vina)

Yes! Docking is part of the release pack, and we believe that this will really make a difference! There are a few docking servers available on the net, but have you ever heard about live-docking? Check this out:

After you have launched the docking, you can start browsing the results as soon as the first few calculations are finished.
Another cool feature is that you can select your target from among ~10k PDB structures prepared for docking. Many thanks to the group of Prof. Didier Rognan for allowing the integration of sc-PDB [Meslamani J, Rognan D, Kellenberger E. sc-PDB: a database for identifying variations and multiplicity of 'druggable' binding sites in proteins. Bioinformatics. 2011 May 1;27(9):1324-6.] with mcule. This is a great benefit for our users, as you can now directly select targets by searching for PDB ID, Protein name, Organism name, UniProt Name/Accession ID/Taxonomic ID. The centre of the binding site is also determined automatically, based on the position of the co-crystallized ligand. All these protein structures have been automatically prepared for docking. Besides selecting your macromolecule from the sc-PDB database, you can upload your own structures as well.
For docking, we use the latest version of the open-source docking tool AutoDock Vina [Trott O, Olson AJ. AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J Comput Chem. 2010 Jan 30;31(2):455-61.]. We were quite satisfied with Vina in the past, as it was involved in our GPCR fragment library design protocol, which yielded a 100% hit rate.
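
A simple way to derive a binding-site centre from a co-crystallized ligand is to take the geometric centre of its atom coordinates. The sketch below assumes exactly that definition (the post does not say which one mcule uses), and the coordinates are invented for illustration.

```python
def binding_site_centre(ligand_coords):
    """Geometric centre (unweighted mean) of the ligand atom coordinates."""
    if not ligand_coords:
        raise ValueError("ligand has no atoms")
    count = len(ligand_coords)
    # zip(*coords) regroups the (x, y, z) tuples into one sequence per axis.
    return tuple(sum(axis) / count for axis in zip(*ligand_coords))


# Invented coordinates (in angstroms) for a three-atom "ligand".
ligand = [(10.0, 4.0, -2.0), (12.0, 6.0, -2.0), (14.0, 5.0, 1.0)]
centre = binding_site_centre(ligand)
print(centre)
```

A centre like this is what a docking search box would be positioned on; AutoDock Vina, for instance, takes the box centre through its center_x, center_y and center_z options.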


There is a property filter, where you can set minimum and maximum boundaries. Available properties are either calculated from the InChI string (molar mass, number of atoms, heavy atoms, hydrogens, heteroatoms, stereocentres, cis-trans double bonds), or by OpenBabel (logP, PSA, molar refractivity, number of rings, rotatable bonds, H-bond donors, H-bond acceptors) [O'Boyle NM, Banck M, James CA, Morley C, Vandermeersch T, Hutchison GR. Open Babel: An open chemical toolbox. J Cheminform. 2011 Oct 7;3:33.]. Besides these, rule-of-five and rule-of-three violations are also calculated.
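
As a rough illustration of how such a filter works, the sketch below counts rule-of-five violations (molar mass above 500, logP above 5, more than 5 H-bond donors, more than 10 H-bond acceptors) and applies optional minimum and maximum boundaries. The property names and the two property sets are hypothetical; in practice the values would come from a toolkit such as OpenBabel rather than being typed in by hand.

```python
RULE_OF_FIVE_LIMITS = {
    "molar_mass": 500.0,       # g/mol
    "logp": 5.0,
    "h_bond_donors": 5,
    "h_bond_acceptors": 10,
}


def rule_of_five_violations(props):
    """Count how many rule-of-five limits the property set exceeds."""
    return sum(1 for name, limit in RULE_OF_FIVE_LIMITS.items() if props[name] > limit)


def within_bounds(props, bounds):
    """bounds maps a property name to (minimum, maximum); None means unbounded."""
    for name, (minimum, maximum) in bounds.items():
        value = props[name]
        if minimum is not None and value < minimum:
            return False
        if maximum is not None and value > maximum:
            return False
    return True


# Hypothetical property sets, made up for the example.
small = {"molar_mass": 180.2, "logp": 1.2, "h_bond_donors": 1, "h_bond_acceptors": 4}
large = {"molar_mass": 612.7, "logp": 5.8, "h_bond_donors": 2, "h_bond_acceptors": 7}

bounds = {"molar_mass": (150.0, 350.0), "logp": (None, 3.0)}
passing = [
    name for name, props in (("small", small), ("large", large))
    if within_bounds(props, bounds)
]
print(rule_of_five_violations(small), rule_of_five_violations(large), passing)
```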

Table View

We have designed a new view, which makes browsing the results even more efficient.

In the new table view you can set which properties (including screening results, e.g. docking scores) you would like to display. Now, you might have seen molecule databases displayed in a useful and transparent way on the web before (there are not many good examples though), but I doubt you have ever been able to sort such a database as blindingly fast as you can from now on at mcule.com. Believe it or not, sorting many hundreds of thousands or even millions of molecules can be done in a few seconds (!). Remember that you are using a website and not a desktop application.


There is another filter, called Sampler. It can be used to cut the database at a certain threshold and retain only the top x number of molecules. By default it retains the top molecules highest ranked by the previous filter, but it can be set to sample the molecule collection randomly.
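
The two modes of such a sampler can be sketched in a few lines. This assumes the input list is already ordered best-first by the previous filter; the function name, its parameters and the molecule IDs are invented for illustration.

```python
import random


def sampler(ranked_ids, top_n, randomly=False, seed=None):
    """Keep `top_n` molecules: the best-ranked ones by default,
    or a random sample if `randomly` is True."""
    if randomly:
        rng = random.Random(seed)  # a seed makes the random pick reproducible
        return rng.sample(ranked_ids, min(top_n, len(ranked_ids)))
    return ranked_ids[:top_n]


# Invented IDs, assumed to be sorted best-first by an earlier workflow step.
ranked = ["mol-7", "mol-2", "mol-9", "mol-4", "mol-1"]
top_two = sampler(ranked, 2)
random_two = sampler(ranked, 2, randomly=True, seed=42)
print(top_two, random_two)
```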

It is also important to mention that more advanced features are provided for free with some limitations. The maximum number of molecules for docking is currently set to 500 per month. Sampler is an efficient way to reduce the number of input molecules for such limited filters. We will provide subscriptions with far fewer limits on the website soon. Until then, contact us if you need more!

OK. So now you can go and do some docking if you like, but remember that many other features are underway, so there is more to come! Keep looking at this blog for more!

Monday, May 21, 2012

Mcule presentations and exhibition at ChemAxon UGM

We are attending the ChemAxon User Group Meeting held in Budapest (Hotel Novotel) between 21-23 May 2012. We will deliver a talk about mcule in the Partner Session (01:45pm-03:15pm, 22nd of May), and will be exhibiting, so meet us during the coffee breaks in the exhibition area!

Why are we attending? If you have already attended any of the ChemAxon UGMs, you will know that this is a very exciting event with special social programs. This is probably reason enough in itself, but even more importantly, we have started the integration of ChemAxon tools into mcule! This means that soon you will be able to subscribe to ChemAxon packages at mcule.com.

So, come and meet us if you will be around!

Wednesday, May 9, 2012

Video presentations of ACS, San Diego are now online

If you missed our presentations at the 243rd ACS National Meeting in San Diego, you can watch them online here:
  1. Mcule.com: A public web service for drug discovery
  2. Registration system of mcule: InChI is the key
You can check about 400 other presentations online here.

Metformin for treating blindness

I found this nice example for polypharmacology on Science Blog about a new indication of metformin.

Here is metformin:

The story in a nutshell:

"University of Texas Medical Branch at Galveston researchers have discovered that ... metformin, which is commonly used to control blood sugar levels in type 2 diabetes, also substantially reduced the effects of uveitis, an inflammation of the tissues just below the outer surface of the eyeball. Uveitis causes 10 to 15 percent of all cases of blindness in the United States. The only treatment now available for the disorder is steroid therapy, which has serious side effects and cannot be used long-term."

I found an ancient article suggesting that metformin acts as a weak histamine agonist, but only histamine H1 and H2 receptors were available at that time. This paper shows that metformin can increase gastric acid levels which is probably associated with a weak H2 stimulation. It would be interesting to see if metformin has got a significant level of H4 affinity. Since H4 antagonism has been shown to reduce inflammation, H4 affinity of metformin might be the missing link here.

In fact, we and others have already found several guanidine containing H4 ligands, see some examples here:

This compound was found by our large-scale structure-based H4 screen.

This is agmatine, published as a low affinity H4 ligand in this paper.

This is VUF8430, another H4 ligand published in this paper.

Anyone interested in measuring the H4 affinity of metformin? :)

Monday, May 7, 2012

Software for all at mcule.com

In the previous blog post, I commented on some aspects of the “Federation of Independent Researchers” – an interesting initiative for smaller players of the pharma/biotech industry. Among the comments on the original post in Derek Lowe’s In the Pipeline, there were a few about the software needs of individuals and small companies. For example:

"I'm intrigued by the idea relating to computational chemistry software and finding a way for small companies, particularly startups, to get access to sophisticated modeling and docking software."

Sophisticated modelling tools are expensive. No question. In fact, industrial annual subscriptions range from $5k to $200k. Open source is free. So it is quite logical to suggest putting together a software package from open source modelling components. As the comment continues:

"… all the underlying force fields and QM models have been published … it would just take a team of dedicated programmers and computational chemists time and passion to create it"

There are several passionate open-source chemoinformaticians with great expertise, so this part is OK. But time is always an issue, as pointed out by Rajarshi Guha:

"It just needs somebody with the time and expertise to implement them. And the combination of these two (in the absence of funding) is not always easy to find."

The other problem is that open source tools are generally not developed systematically enough to provide a complete solution. The development is typically governed by the contributors’ (academic) projects, and open-source code is generated on the side for problems the contributors have to solve anyway. Because of that, there will always be missing components. So it looks like providing a complete solution would need several passionate developers working on this full-time, systematically: putting together the pieces, adding some glue where needed and writing the missing components to complete the jigsaw puzzle. And this is what mcule.com is doing: we integrate. This is a large jigsaw puzzle though. In the absence of funding, this needs a business model. Many people have asked what the business model behind mcule.com is. So here are some thoughts:

The puzzle wouldn’t be complete without the commercial tools that have been developed for years and reached a level that makes them superior for several tasks compared to open-source ones. We negotiate good prices with software developers and provide them at mcule.com on a subscription basis.

  1. The first main difference from standard commercial tool licenses is that we provide subscriptions for 1, 3, 6 and 12 months. This will allow people to subscribe to a tool for a single project only, without paying for an annual license. We think that this will attract smaller companies and individual consultants, who can’t afford to maintain long-term licenses but want to get tools for single projects. This will be possible at mcule.com.
  2. We provide full IT infrastructure: you do the clicks at mcule.com, we run the calculations automatically in the cloud. No hardware investments, no maintenance costs, no need to pay for a system administrator, etc.
  3. Licenses are not CPU limited. What we limit is the maximum number of molecules that these tools can be applied to. This is much easier to plan for than the number of CPUs. Let’s imagine someone wants to run a large-scale docking. I’m not sure he/she will be able to calculate how long the calculations will run on X number of CPUs. But he/she will definitely have a better idea of how many molecules will be screened.

One commenter on the thread said:

"I'm thinking of some kind of virtual server, or remote desktop style operation. Your individual contractor can connect from wherever, and have full access to a range of tools, then transfer their data back to their own location for safekeeping. You would need some kind of central server farm somewhere, but this could probably be hosted on one of the increasing number of cloud services floating around the net these days."

Looks like we can read thoughts. This is exactly what we do.

This is a whole new business model though, not just for us, but most importantly for software vendors. So then comes the question from one of the commenters on Derek Lowe’s blog post:

“Can we propose an alternative business model to software vendors?”

Honestly, we weren’t sure about that at the beginning. But now we can say: Yes, we can! We are very close to signing agreements with some of the big players in the modelling software market. Why is this interesting for them? For various reasons. Most importantly, they have been unable to reach long-tail users so far. Big pharma has a large budget, can make long-term decisions, has the IT infrastructure in-house, people for maintenance, etc. How much of this is available on the other side? None. So what will a start-up biotech say to an offer of a single tool license alone for $20k? No, thanks. So, how about this other offer: the same tool for a single project (1 month license), with complete IT infrastructure, unlimited CPU, no maintenance, no installation, ready to use, for $5k? I think that’s something that can work, but we will see what our users say. I think it is competitive with maintaining significant IT resources, spending days finding free tools, installing them, writing the missing components, etc., and it is definitely competitive with buying an annual license alone for $20k.
I really liked DeepDyve’s idea of renting out scientific papers for a few days as an alternative to annual journal subscriptions. I think mcule provides something similar by offering 1 month subscriptions for software. Plus we offer a lot more. Here are the plans:

1. Open community
Since we use some resources developed by the open-source community, we try to give something back. So, people can use open-source tools at mcule.com free of charge, with some limitations, provided that the resulting molecule collections will be public too. This might be an option for people working on non-profit projects, e.g. on neglected diseases.

2. Academic users
Significantly reduced prices compared to industrial ones. Resulting collections can be private.

3. Small companies, consultants
Short-term licenses and different bundles/packages can be attractive for small industrial users. We also remove a significant burden from their shoulders by providing the IT infrastructure and everything else ready to use.

4. Big pharma
Long-term licenses are also available. We provide enterprise solutions to adjust this plan to the actual needs of the companies. They might be interested in the following components: large range of tools, validated screening workflows, large purchasable compound library and easy, single package compound ordering.

What do you think about the above plans? Would any of these plans be interesting for you? I would be very interested to hear opinions!

Besides software, all the users get access to a high quality, up-to-date molecule database, which is the other main component of mcule.com. I will write about this in more detail in the next post!

Wednesday, May 2, 2012


I read an interesting blog post at Derek Lowe’s In the Pipeline about a proposal for a “Federation of Independent Scientists”.
Some background: drug discovery, and thus the pharma industry, is in big trouble (we all know about the increasing costs and the small number of approved drugs). Given the extremely large economic contribution of the pharma industry, the consequences are very serious. Unfortunately, drug discovery projects typically span 10-15 years, so even if we change something at the front, we won’t see the difference at the end for a while. Nevertheless, everybody agrees that something should be done differently. Different companies give different answers to the problem. Some of them, like AstraZeneca and Pfizer, chose to shut down several sites to cut costs, thus increasing the number of unemployed scientists, and this is the point where it becomes a social problem. Since the chances of finding a place at another big pharma are relatively low these days, and the number of available academic positions is limited, some of these people join small biotechs, found a company themselves or become consultants. While these plans have several advantages, being alone or playing in a small team always limits the resources for getting the job done. Thus the proposal of Mrs. McGreevy:
“What about a voluntary association of independent research scientists?”
Proposed names for this association were: "Federation of Independent Scientists" or "SCA - Society for Chemistry in America" (opposite of ACS).
Members of the association could basically get:

  1. Group rates on health and life insurance
  2. Group rates on access to journals and library services
  3. Online community for support and networking
  4. Support for grant writing
  5. Marketplace (advertising and bidding for contracts)
  6. Special rates for other resources like HTS libraries

I think using group power to negotiate with suppliers on the above products/services is definitely a good idea. On the other hand, it looks to me as if ACS has already addressed most of these points: at least they are offering group rates for insurance and ACS publications, plus free networking and career opportunities. To make it clear, I’m not saying these problems have been solved already. In fact, there are a lot of non-ACS journals, and access to the ACS publications is still too expensive, especially for unemployed people. All I’m saying is that before starting a new association, why not try urging ACS to take further steps? Building up a new association is hard work, and unemployed people don’t have years to wait for it to evolve. ACS probably has enough power to negotiate with suppliers of any kind, and can probably provide better discounts on its own publications if it is pushed. The question is: do the supporters of the idea have enough power to persuade ACS to make such changes? ACS currently has about 160,000 members. I think if this initiative can gather ~10,000 supporters, then it can make a difference. I may be a little naive, but if enough people demand it, I don’t think ACS could ignore them.

Besides this, several of the commenters mentioned already available solutions for some of the listed problems. In particular, DeepDyve can be an alternative to annual journal subscriptions. DeepDyve provides a rental service for scientific papers for $0.99. Renting means: viewing is allowed, downloading is not. While the list of accessible journals is impressive, I don’t see any ACS journals. Again it might sound naive, but why not urge ACS and DeepDyve to start negotiating and create an even more attractive rental model for ACS members?

Some of the commenters brought up another major problem, namely the need for modelling software. Most small biotech companies and consultants need some tools to work with, but prices of commercial software are high and they are not the only expense: there are also hardware infrastructure, maintenance, data and software integration, etc. Looks like mcule.com was a good idea! I will write about our solution for these problems in an upcoming post. Stay tuned!

Thursday, April 26, 2012

Some thoughts about ACS in San Diego

I attended the ACS a few times before, but this one was special. Not just because this time the meeting was held in San Diego (it was great to return to this beautiful city and meet some friends and colleagues I hadn’t seen for a while), but even more importantly because we announced the release of the beta version of mcule.com (go to mcule.com and enter your email address to get access). We put down the basics. Now the question was where to go from here. We certainly had a lot of ideas about where we could go, but we needed (and still need) some feedback from our future users on these plans. In general, feedback was positive about the beta version and people found some of our plans attractive. This is good.

People who have already tested mcule.com will probably see the difference from other chemical database hosting websites, but those who haven’t seen it yet were asking why we are different. So here is why:
1. First, we provide purchasable molecules AND screening tools. There are some services for either of the two individually, but such an integrated system of these two components makes mcule.com unique.
2. Data quality: we have spent several months developing a rigorous registration system that yields a high-quality database. These days, when large amounts of chemical and biological data are accessible, the emphasis is on the quality rather than the quantity of databases.
3. Top IT technology behind mcule.com allows many things that are not available anywhere else. For example, flexible collection management for millions of compounds is not a trivial task at all but it is possible on mcule.com.

Another very important piece of feedback, coming from many computational experts and non-experts, was the following:

- You provide high quality, purchasable molecules. OK. And you will provide a bunch of searching/screening tools. Nice. Now, how will I know which tool is the best for a particular problem?

So it looks like providing a large haystack of molecules and different pitchforks might be attractive for experts, but even they want to know which tools are best for finding the needle. Computational chemists, especially those working in the pharma industry, simply don’t have time to make large-scale assessments of tools. Looks like we have to take this extra step. And we are happy to do that! Therefore, one of our primary focuses for the next few months is to build up automated screening workflows, optimized and validated using reference molecules. We have already done this for a few case studies, even with experimental validation. For example, you can check out our ACS poster on GPCR fragment library selection with 100% hit rate. Such validated, preset workflows will be provided for our users. We will also allow automated validation of screening workflows, to assess their predictive power, on any target for which reference molecules are available. The ultimate goal is something like this: the user goes to mcule.com, enters a target name and gets potential ligands coming out of a validated virtual screening workflow. We are working on it …
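To make "validated by using reference molecules" a bit more concrete, here is a minimal sketch of one widely used measure of predictive power, the enrichment factor. This is an assumption for illustration only: the post does not say which metric the mcule workflows use, and the function name, the molecule IDs and the `top_fraction` parameter are all hypothetical.

```python
from typing import Iterable, Sequence


def enrichment_factor(ranked_ids: Sequence[str],
                      reference_ids: Iterable[str],
                      top_fraction: float = 0.01) -> float:
    """Enrichment factor of a ranked screening result.

    EF = (share of reference molecules in the top slice)
         / (share of reference molecules in the whole library).
    EF > 1 means the workflow ranks known actives better than random picking.
    """
    if not ranked_ids:
        raise ValueError("ranked_ids must not be empty")
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")

    reference = set(reference_ids)
    n_total = len(ranked_ids)
    n_top = max(1, round(n_total * top_fraction))

    hits_total = sum(1 for mol_id in ranked_ids if mol_id in reference)
    if hits_total == 0:
        raise ValueError("no reference molecules found in the ranked list")
    hits_top = sum(1 for mol_id in ranked_ids[:n_top] if mol_id in reference)

    # Same ratio as above, rearranged so integer products are divided once.
    return (hits_top * n_total) / (n_top * hits_total)


if __name__ == "__main__":
    # Hypothetical library of 100 molecules, best-scored first.
    ranked = [f"mol-{i}" for i in range(100)]
    known_actives = {"mol-0", "mol-1", "mol-50", "mol-99"}
    print(enrichment_factor(ranked, known_actives, top_fraction=0.1))
```

In this toy case two of the four reference molecules land in the top ten, so the top slice is five times richer in actives than the library as a whole.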

PS: You can find all our ACS presentations here.

Monday, March 19, 2012

We are going to the ACS Meeting!

The 243rd ACS National Meeting & Exposition is one of the world's largest scientific conferences of the year, held March 25-29 in San Diego. It brings together academics and practitioners from all over the world and provides a venue for exchanging, learning and exploring new ideas. We are excited to be part of this great event!

We will give 2 oral presentations.
One is about mcule.com, explaining what mcule can offer to our users and suppliers, focusing on the integrated system of screening tools and cloud technology.

In the other presentation we would like to summarize how InChI is implemented into the mcule registration system and how it is used effectively with our supplier database and open registration services.

In addition, we will present 2 case studies of using drug discovery tools on mcule.com, with the help of the following posters:

1. Seeking novel JAK1 inhibitors

Janus kinase 1 (JAK1) is a human tyrosine kinase protein critical for cytokine signalling. It was recently reported that JAK1 plays a critical role in metastasis formation by enabling cell contractions. This makes JAK1 a very attractive cancer drug target. The available JAK1 crystal structures prompted us to set up a structure-based screening protocol against JAK1. The mcule supplier database and the molecular library of the National Cancer Institute (NCI) were screened against JAK1 by using this protocol. Several promising JAK1 inhibitor candidates were identified by these virtual screens and the most interesting compounds will be tested by biological in vitro assays.
The virtual screens were conducted using cheminformatic tools available at mcule.com and are accessible for everyone who is interested. Mcule also provides several other virtual screening tools and filtering techniques making the screening protocols flexible and customizable.

2. GPCR fragment design

Several class “A” G-protein-coupled receptor (GPCR) crystal structures have been recently published. Similar structural elements of these GPCRs suggest that GPCR ligands might share structural similarities. Mcule developed a virtual screening protocol to assess GPCR-likeness of candidate molecules by structure-based docking. Screening criteria for GPCR-likeness were set by using GPCR bioactivity data from the ChEMBL database. Fragmented druglike molecules from the ChEMBL database and a GPCR library of our chemistry partner were screened for GPCR-likeness with the developed protocol. Ligands with high GPCR-likeness were selected for further pharmacological characterization.
The GPCR-likeness assessment protocol was developed by using cheminformatic tools available at mcule.com and it will become accessible for the public to assess the GPCR-likeness of subsets of the mcule database or external libraries. Mcule provides several other virtual screening tools and filtering techniques making the GPCR-likeness assessment protocol flexible and customizable.

We expect many people to be interested in our projects!
See you there and stay tuned for more information on Twitter and Facebook!

Tuesday, January 31, 2012

What's going on ...

The mcule project was started last June with very ambitious goals. Several months have passed and you could ask: what’s happening behind the scenes? And when can I finally start searching/screening/ordering, etc.? Well, a lot of things have happened since last June.

Laying down the foundations took more time than we expected. But the good news is: we are almost there now. So here is a short summary of what we have done and what is currently being done, and most importantly: what you can expect in the upcoming first releases of mcule.

Fundamental rules for the registration system have been set and implemented. This will guarantee that our vendor database will contain high quality molecules, and ID ambiguities will be close to zero. If you read the last posts of this blog you might be aware that our molecule registration system is based on InChI. While it is a very powerful tool for structure registration, it needs some adjustments to make sure no corner cases are missed. Vendor companies provide their structural data in SDF format, which suffers from far more limitations. It seems that no molecule representation will be sufficient on its own to represent the whole chemical space (or even just the accessible part of it). It is just too complex. Another source of ambiguities derives from drawing errors. SDF writing rules of molecule sketchers (e.g. setting the chiral flag automatically) can also result in potential stereochemical ambiguities. What we can do is define specific structural checks to filter out potentially problematic molecules. By inspecting these problematic cases, new preparation steps can be defined to facilitate automatic registration. We now have more than 50 such checks/preparations in our registration system.
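The idea of routing molecules through a list of structural checks can be sketched as follows. This is only an illustration under stated assumptions, not our actual registration code: the three checks, the record fields (`atom_count`, `chiral_flag`, `stereo_bonds`, `either_bonds`) and the function names are hypothetical, and a real system would derive such fields from the SDF records with a cheminformatics toolkit.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Record = Dict[str, Any]
# A check returns None when the record passes, or a short problem description.
Check = Callable[[Record], Optional[str]]


def check_not_empty(rec: Record) -> Optional[str]:
    return None if rec.get("atom_count", 0) > 0 else "empty structure"


def check_chiral_flag(rec: Record) -> Optional[str]:
    # A sketcher may set the chiral flag automatically even when no
    # stereo bonds were drawn, which leaves the stereochemistry ambiguous.
    if rec.get("chiral_flag") and rec.get("stereo_bonds", 0) == 0:
        return "chiral flag set but no stereo bonds drawn"
    return None


def check_either_bonds(rec: Record) -> Optional[str]:
    if rec.get("either_bonds", 0) > 0:
        return "undefined (either) stereo bond"
    return None


# Hypothetical subset; the post mentions more than 50 checks/preparations.
CHECKS: List[Check] = [check_not_empty, check_chiral_flag, check_either_bonds]


def triage(records: List[Record]) -> Tuple[List[str], Dict[str, List[str]]]:
    """Split records into auto-registrable IDs and IDs flagged for inspection."""
    accepted: List[str] = []
    flagged: Dict[str, List[str]] = {}
    for rec in records:
        problems = []
        for check in CHECKS:
            message = check(rec)
            if message is not None:
                problems.append(message)
        if problems:
            flagged[rec["id"]] = problems
        else:
            accepted.append(rec["id"])
    return accepted, flagged


if __name__ == "__main__":
    batch = [
        {"id": "a", "atom_count": 12, "chiral_flag": False, "stereo_bonds": 0, "either_bonds": 0},
        {"id": "b", "atom_count": 20, "chiral_flag": True, "stereo_bonds": 0, "either_bonds": 0},
    ]
    print(triage(batch))
```

Keeping each check as a small function that returns a reason makes it easy to inspect the flagged cases and to add a new check or preparation step when a new corner case turns up.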

The first large-scale test of our registration system, involving 2.5 M compounds, has been completed. Results are promising: 97% of the molecules could be processed automatically, while only 3% were held back for further inspection. With some slight changes we can further reduce the share of retained molecules to 0.1-0.5%. We are putting on the final touches and will run the registration on the first few million vendor compounds again. This will hopefully provide you with a reasonable compound deck to start your searches on.
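To put these percentages into absolute numbers, here is a quick back-of-the-envelope calculation. It uses only the figures quoted above (2.5 M compounds, 3%, 0.5% and 0.1%); the function name is just for illustration.

```python
from fractions import Fraction

TOTAL_COMPOUNDS = 2_500_000


def retained_count(total: int, percent: str) -> int:
    """Number of molecules held back for inspection at a given percentage.

    The percentage is passed as a string so that Fraction keeps it exact
    (0.1 cannot be represented exactly as a binary float).
    """
    return int(total * Fraction(percent) / 100)


if __name__ == "__main__":
    for percent in ("3", "0.5", "0.1"):
        print(f"{percent}% of {TOTAL_COMPOUNDS:,} = "
              f"{retained_count(TOTAL_COMPOUNDS, percent):,} molecules")
```

So the current 3% corresponds to 75,000 molecules needing inspection, and the targeted range corresponds to between 2,500 and 12,500.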

At the same time, the first searching/screening filters have been integrated with the test database. Simple searches (exact, similarity, substructure) will be available in the first release, while more complex searches (e.g. docking, pharmacophore searches) will appear among the filters soon. You will be able to feed these engines with one of the leading JavaScript editors; alternatively, you can search by MCULE ID, InChI or SMILES.
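For readers less familiar with similarity searching, here is a minimal sketch of how such a search is commonly done: each molecule is reduced to a fingerprint (a set of structural features) and compared with the Tanimoto coefficient. This illustrates the general technique only, not the mcule implementation; the fingerprints below are made-up sets of integers, the IDs are hypothetical, and the 0.7 threshold is an arbitrary assumption. Substructure search needs graph matching and is not covered here.

```python
from typing import Dict, FrozenSet, List, Set, Tuple, Union

Fingerprint = Union[Set[int], FrozenSet[int]]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient: shared features divided by all features."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def similarity_search(query: Fingerprint,
                      library: Dict[str, Fingerprint],
                      threshold: float = 0.7) -> List[Tuple[str, float]]:
    """Return (molecule ID, similarity) pairs at or above the threshold,
    most similar first."""
    scored = ((mol_id, tanimoto(query, fp)) for mol_id, fp in library.items())
    hits = [(mol_id, score) for mol_id, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    # Toy fingerprints: each integer stands for one structural feature.
    library = {
        "mol-A": {1, 2, 3, 4},
        "mol-B": {1, 2, 3, 5},
        "mol-C": {1, 2, 3, 4, 5},
    }
    print(similarity_search({1, 2, 3, 4}, library))
```

An exact search is the special case where only identical structures count as hits; in practice this is better done by comparing canonical identifiers such as InChI than by comparing fingerprints, since different molecules can share a fingerprint.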

As to the web interface, we hope that you will get an experience that will make mcule.com your default searching/screening tool. You get (i) informative molecule index pages containing (among others) molecule IDs, properties, vendor and ordering information, (ii) drag and drop filters to build more complex screening workflows, (iii) flexibly manageable molecule collections/hitlists quickly displayed in list/grid views, and more.

We expect to send out the invitation for the private beta version soon, so please stay tuned!