Community TSC/D2OL Project News: 2006 Project Goals, etc.

Postby Jwb52z » Fri Feb 17, 2006 10:25 am

Adam Hughes
MTSD

February 10, 2006 10:32 AM

Greetings everyone,

My first visit to TRI (February 3 – 6) is now in the books, and I wanted to take a few minutes here to update the community on the activities from that weekend. We were able to accomplish a great deal of knowledge transfer from Charles and Wolfgang to me, which will be very important as we move forward. Of more immediate concern to the science and community of the project, we established a strong set of goals for the coming year that will help to revitalize the project and give us a firm foundation for future improvements.

First and foremost, this project is about finding drugs that can help fight diseases. You all have done a lot of work over the last few years in crunching the candidates and targets that we have given you, and your efforts have whittled the field down to several top conformers. While we need to keep working on the current libraries and targets in order to find all of the promising conformers that we can, we realize that a large percentage of the solution space has already been visited. It is for this reason that our top priority for this quarter is the release of a new target and new candidate libraries. To that end, we have identified a promising new TSC target and two large libraries of commercially-available drug candidates. I am currently working on formatting the associated files for use with our software.

While the science of drug discovery is the central focus of the project, contributor statistics provide tangible, up-to-date metrics for users to gauge their activity and the contributions they have made to the search for drug treatments. To make tracking our progress a better, more meaningful experience for you (and us!), we are planning to update the statistics processing code to improve both accuracy and currency. I am working with Sengent to make sure we have access to a stable, up-to-date code base before I begin my development work.

Finally, this transition period in the project’s life cycle provides a good opportunity to step back and look at where we’ve been and where we would like to go in the future. Fairly soon, I will be releasing Vision and Values statements for the project, which will help guide our actions moving forward. These will be living documents, and I encourage you to participate in shaping the project by giving me your thoughts on these topics. Then, later in the year, we plan to publish a review article detailing CommunityTSC’s accomplishments and our place in the historical and current Distributed Computing landscape for computational chemistry. This will serve as the perfect celebratory capstone on the first phase of the project as we turn the page and move forward.

As you can see, we had a busy weekend, and there is a lot of work to do in the coming year. I’m confident, though, that we’ll be able to make important improvements to the project and continue to do good science, and I look forward to working with the talented people at TRI and all of you in our dedicated user communities. Please don’t hesitate to contact me with any questions or concerns.

Thank you,
Adam

Charles Beckius

February 10, 2006 04:57 PM

I have implemented the new tasks with all the structures that DID migrate. We're taking a chance here with respect to what the assignment engine will do when it can't find the pdbq file. I was starving y'all, and as Adam is working on how to load new libraries, it seemed foolish to sweat the old library so much. Everyone needs WORK!

--------------------
Charles Beckius
TSC Forum Administrator
Jwb52z
 
Posts: 997
Joined: Tue Aug 30, 2005 10:56 pm

Postby Jwb52z » Thu Feb 23, 2006 5:27 pm

Adam Hughes
MTSD

February 19, 2006 03:03 PM

Vision Statement

Community TSC will be a premier distributed-computing platform for conducting drug-discovery research, enabling participants to make meaningful contributions in the fight against childhood diseases. The combination of the latest, most promising computational technologies and scientific knowledge will form the backbone of a community in which participant ownership and engagement are of paramount importance.


Values and Objectives

Value: The Community TSC project is dedicated to fighting childhood diseases through
the use of the latest drug-discovery knowledge and computer simulation
technologies.
Objectives:
• Actively search for new and promising target proteins.
• Maintain fresh candidate libraries.
• Evaluate novel simulation techniques, and implement them in project software where appropriate.


Value: The Community TSC project recognizes the importance of high-quality software
and hardware systems in the modern drug-discovery landscape.
Objectives:
• Perform routine maintenance and updating of software to ensure its viability and compatibility with industry standards.
• Regularly enhance the software to improve its function and the user experience.
• Continue to monitor the hardware systems to provide a stable computing environment.
• Evaluate other software and hardware options where appropriate.


Value: The Community TSC project values the vital contributions of a vibrant, engaged
user community.
Objectives:
• Provide robust day-to-day software and hardware support to the user community.
• Facilitate community involvement and ownership through frequent, honest communication of project goals and status.
• Aggressively seek means of expanding the user base.


Value: The Community TSC project is committed to advancing the science of
computer-aided drug discovery.
Objectives:
• Establish and nurture relationships with drug-discovery researchers in academia and other research venues.
• Share ideas and results through journal articles, presentations, and private communications.

Adam Hughes
MTSD

February 23, 2006 07:43 AM

All,

As you may have noticed, there have been some problems in loading tasks over the last couple of weeks since the new NAS was installed. These problems arose because many of the files on the old NAS could not be copied to the new NAS. Because you all have done such a great job and found so many promising conformers, we feel that the enormous time and effort a complete replication of the old NAS data would require is better spent on more worthwhile pursuits, like the deployment of new libraries. However, we’re still a month or so away from that point, and we DO want to continue to find any good remaining conformers from the present setup in the meantime. With that in mind, here is our plan for going forward.

Charles has switched out the new NAS and re-installed the old NAS. This change should eliminate the warnings you are seeing, such as “loaded 9 of 0 tasks”, etc. It will also allow your computations to continue to span the entire existing problem space, and not just a small segment of it. I am working on formatting new libraries (and a new target) for use with this project, and I expect to have that work done sometime around the end of March. At that point, the new tasks will have been loaded on the new NAS, and we will remove the old NAS and re-install the new NAS. The old tasks will be purged from the database, and you can start crunching on completely fresh material. Note that while the old TASKS will be gone, THE OLD RESULTS WILL BE SAVED, so that they’re available for future laboratory tests.

What all this means to you is that you can continue to do valuable work by loading existing tasks and helping us to fill any holes that we still have in our solution space. It also means that you should be thinking about uploading any cached results within the next month or so. We want to make sure that ALL of your results are counted and saved, and the best way to do this is for you to upload before the new NAS comes online permanently.

Thanks for your time and efforts, and please let me know if you have any concerns or questions.

Postby Jwb52z » Sun Mar 05, 2006 11:37 pm

Adam Hughes

March 02, 2006 10:33 AM

Greetings everyone,

I wanted to take a few minutes to update you on the projected schedule for switching over, permanently, from the old NAS to the new NAS. Library and target preparation is progressing well, and we plan to have the new tasks loaded on the new NAS by April 3 (more details later on the nature of the targets and libraries). We will then unplug the old NAS and replace it with the new NAS. At this point, the first phase (call it Phase I) of the project will be completed, and we will be starting fresh. All stats from Phase I will be saved and available in a static format, and a Hall of Fame will be formed from the top producers. The project stats will then be reset.

With this in mind, you need to be sure to upload any cached results you have by April 2 to ensure that your stats and the results themselves are saved and counted. Remember, Phase II starts on April 3, so Phase I results turned over after that date won’t count toward your new stats.

If anything happens to affect the planned schedule, I’ll announce it immediately. In the meantime, let me know if you have questions or concerns.

Thank you,
Adam

Postby Jwb52z » Wed Mar 29, 2006 1:38 pm

Adam Hughes

March 20, 2006 08:43 AM

Hi everyone,

I just wanted to remind you that Phase II of the TSC Project begins on Monday, April 3. At that time, the new structure NAS will replace the old one, and we will begin working on the new Rheb target using our new candidate drug libraries. After a break-in period during which we make sure things are running smoothly, I'll begin to re-introduce our existing targets, also running against the new drug libraries. Once again, it is important to TURN OVER ALL OUTSTANDING RESULTS by the end of April 2.

In preparation for the switchover, I have temporarily capped the maximum queue size at 500 tasks per user. This is done completely on the server side, and it simply prevents the server from handing out more than 500 tasks to any user at any one time. The purpose of this move is to reduce the number of outstanding tasks as we approach the switchover date so that the results/stats servers don't get completely swamped on that last weekend. I anticipate further lowering the queue size to around 100 this coming weekend, and then to something close to zero during the final weekend. April 1-2 should be used to finish any remaining tasks and make sure the results are uploaded.
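
For anyone curious how a server-side queue cap of this kind behaves, here is a minimal sketch in Python. It is only an illustration, not the project's actual assignment engine: the function name tasks_to_assign and its parameters are hypothetical, and it assumes the cap counts tasks that are assigned but not yet returned.

```python
def tasks_to_assign(requested: int, outstanding: int, max_queue: int = 500) -> int:
    """Return how many new tasks the server may hand to one user.

    requested   -- number of tasks the client is asking for
    outstanding -- tasks already assigned to the user and not yet returned
    max_queue   -- server-side cap on outstanding tasks per user (hypothetical default)
    """
    if requested < 0 or outstanding < 0 or max_queue < 0:
        raise ValueError("counts must be non-negative")
    # Room left under the cap; never negative, even if the cap was just lowered
    # below what the user already holds.
    room = max(0, max_queue - outstanding)
    return min(requested, room)


if __name__ == "__main__":
    # A user holding 450 tasks who asks for 200 more only gets the remaining room.
    print(tasks_to_assign(requested=200, outstanding=450))
```

Under these assumptions, lowering the cap never takes tasks away from a client. It only stops new assignments until enough results have been returned.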

Thanks for your continued support!

Adam

Postby Jwb52z » Tue Apr 04, 2006 12:04 am

Adam Hughes

March 30, 2006 06:43 PM

Phase I of the CommunityTSC project ends on April 2, 2006. On Monday,
April 3, Phase II starts with new libraries and a new TSC target.
As well, updated server software will be deployed to better correlate
the tasks assigned to your node(s) with the statistics credited to you.

To prepare for this new chapter of the project, please be sure to upload
all of your results this coming weekend. Monday will be too late!
After this weekend, you will need to flush any remaining tasks to ensure
that your client is working on the latest material.

We appreciate all of your work during Phase I, and we look forward to
continued collaboration. Stay tuned for more announcements soon!

Adam Hughes

April 02, 2006 06:17 PM

I want to give you a brief rundown of what should happen during the switchover tomorrow:

By 8AM EDT, I will gather and archive the final statistics from Phase I for the purpose of creating the Hall of Fame (which won't appear for a few days). After that I will take the project servers down, and then Charles will switch to the new NAS. Finally, I'll bring the servers back up and we should be back in business. I'll make an announcement when everything is running again, which I expect to be before 10AM.

Initially, I'll put the queue size at 100, just in case something goes wrong. I don't want to hand out thousands of bad data points to you all. When things look OK, I'll bump it back up to the full 2000.

Thanks for your efforts during Phase I and your enthusiasm for Phase II.

Adam

Adam Hughes

April 03, 2006 10:48 AM

Phase II tasks are now available for downloading. The queue size is currently set at 100 but will be increased once we get past any early bumps we might encounter.

I'll monitor the systems pretty closely the next few hours and let you know if there are problems.

Thanks to Charles for doing the physical NAS switch this morning.

Thanks!
Adam

Postby Jwb52z » Thu Apr 06, 2006 10:20 am

Adam Hughes

April 06, 2006 10:37 AM

All,

There has been a hard drive failure on the Member Services machine, which is the cause of the problems we are experiencing in, well, member services and stats updates. Charles is working to get the server back to a functional state. Running integrity checks and recovery is a slow process, but we'll keep you updated as we have more information.

Thanks,
Adam

Postby Jwb52z » Sat Apr 22, 2006 8:17 pm

Adam Hughes

April 20, 2006 09:03 PM

All,

After working through several iterations of statistics-file manipulations, I think that restoring the nodes that were CREATED after 1/24 and before 4/5 programmatically is not practical at this point. However, I HAVE been able to successfully add a couple of the nodes in "by hand", so I think this is the approach we'll have to take. It may take a little longer to get everything back, but it is less risky. So, what I need from you, for any node that you created from 1/24 to 4/5, is

1) Node ID
2) Member name
3) Registration e-mail

Please send this information to me in a private message or in an e-mail message to ahughes@childhooddiseases.org.

If you've already created new nodes, that's fine, and you can continue to run with the old nodes. To recoup the statistics from your "lost" nodes, though, please send the information requested above.

Note that I will be unavailable on Friday evening, so these updates probably won't be started until Saturday (4/22).

Thanks,
Adam

Postby Jwb52z » Wed May 03, 2006 9:57 am

Adam Hughes

May 02, 2006 07:24 PM

Now that things have stabilized a bit after the start of Phase II and the Member Services crash in April, it's a good time to review recent events and the project status, as well as to lay out our plans for the coming weeks and months.

April 2006 Recap
================

1) Phase I of the project was completed on April 3 as we turned the page
to Phase II. This new period was marked by:

a) A new TSC (Rheb) target.
b) Two new candidate drug libraries.
c) Statistics server updates to prevent duplicate submission of results.

2) We experienced a hardware crash on April 5 that affected all aspects of
Member Services and Statistics. Task downloads and uploads were
unaffected, so the important work of finding drug candidates continued.
A temporary software workaround allowed the statistics server to continue.

The Member Services system was recovered during the last three weeks
of the month through the following steps:

a) New hardware was purchased and brought online.
b) The Member Services database was restored from the quarterly backup
of January 24, 2006.
c) Nodes that were created after January 24 and before the crash
were manually re-created in the Member Services database.

Further, new data backup protocols have been established
to hopefully make recovery a bit easier and more reproducible in
the future should the unthinkable happen again.

3) A "Hall of Fame" was created as a historical record of all the
hard work that went into Phase I. The statistics for nodes, members,
and teams have been preserved for posterity, and you all have our
hearty congratulations and deep gratitude for your devotion
to this project.

Now, moving forward ...

Planned Activities
==================
1) After more than a week at 1000, the maximum queue size has been bumped
up to 2000, where it will remain unless problems crop up. You should
be getting your full complement of tasks.

2) Starting on or about this coming weekend (May 6-7), the "old" targets
will be re-introduced to the system, only this time we'll be running
them against the new drug candidates. Expect to see about one new/old
target a week until we're back up to full speed.

3) More statistics server changes are planned. In particular, a "turnaround"
time limit will be instituted to encourage more timely uploading of
results and to discourage the accumulation of large amounts of results
that need to be uploaded. The initial time limit will be one month,
and we'll adjust from there as we gain more experience. This means
that the results for each task need to be uploaded within a month
of the assignment date or statistics won't be credited to the node
that did the work. If something comes in late, the docking results
will be collected, but the stats won't be incremented.

Obviously, since I'm just mentioning this now, I'm not going to turn
this feature on tomorrow. My plan is to install this change around
the middle of June.
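
To make the turnaround rule concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the statistics server's code: the names TURNAROUND_LIMIT and process_result are hypothetical, "one month" is modeled as 31 days, and a result uploaded exactly at the limit is assumed to still earn credit.

```python
from datetime import datetime, timedelta

# Assumption: "one month" is modeled as 31 days.
TURNAROUND_LIMIT = timedelta(days=31)


def process_result(assigned_at: datetime, uploaded_at: datetime,
                   limit: timedelta = TURNAROUND_LIMIT) -> tuple:
    """Return (saved, credited) for one uploaded task result.

    The docking result is always saved. Statistics are credited only if the
    upload arrives within the limit, measured from the assignment date.
    """
    if uploaded_at < assigned_at:
        raise ValueError("a result cannot be uploaded before it was assigned")
    credited = (uploaded_at - assigned_at) <= limit
    return True, credited


if __name__ == "__main__":
    assigned = datetime(2006, 6, 15)
    print(process_result(assigned, datetime(2006, 7, 10)))  # returned within the limit
    print(process_result(assigned, datetime(2006, 7, 20)))  # returned late
```

The point of the sketch is that lateness affects only the credit flag. A late result is still collected.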

As always, your questions and comments are welcomed.

Thanks!
Adam

Postby Jwb52z » Sat Jul 22, 2006 2:22 am

Adam Hughes

July 08, 2006 08:42 PM

It's been a couple of months since I've checked in with a project status update, so I thought you might all be interested in a recap of recent events and a preview of what's coming up.

May - June 2006
===============
Phase II continued to roll along, with more than 3 million candidate results returned to date. From the beginning of Phase II through the end of June, more than 709 nodes came online to join the cause, and we now have over 35,000 registered nodes. Thank you all for your overwhelming interest and support!

During May and June, further updates were made to the statistics server to ensure more reliable and accurate tracking of your work. As a result of some of this development, task time limits were installed on July 5, with tasks currently expiring 31 days after they have been assigned. In addition, software was written to provide up-to-date candidate coverage information on the Statistics Snapshot page. This data can be used to help us decide when to retire a certain target/candidate library combination.

Upcoming
========
Unfortunately, due to some of the problems encountered with the statistics servers and the nature of the fixes and enhancements, I didn't get around to deploying the "old" targets with the new candidate libraries at the rate I wanted to. So getting these deployed is the next order of business, and you'll probably see at least one of these targets re-appear within the next week or so, paired with brand new candidates.

Also, you may recall from earlier announcements that we plan to publish an article about this project and the fine work you have all been doing for the last several years. I will be working on this project during the next month (or three!), and I'll keep you updated on our progress here. Being able to share our work with others is important, and these types of communications will help us in that regard.

Finally, I plan to begin looking into how we're going to eventually update our client, which I know is high on many of your wish lists. Changing the client presents fairly significant challenges, but I'm confident we'll get things ironed out and develop a flexible protocol that will allow us to respond to client-side issues. Here, too, I'll do my best to keep you in the loop.

As always, your questions and comments are welcomed.

Thanks!
Adam

Postby Jwb52z » Wed Nov 08, 2006 12:20 pm

October Project Status
Adam Hughes

posted October 08, 2006 06:47 PM
--------------------------------------------------------------------------------
Our last project status update was in July, so it's high time for another one. I expect development activity to pick up in the next few months, so I'll plan on publishing updates at least monthly, if not more frequently.

July-September 2006
===================
You guys have exceeded your own lofty standards of crunching in the last few months. Since the last project update on July 5, you have returned about 4 million candidate results, and you're now uploading between 35,000 and 45,000 candidates a day. This is outstanding and you are to be commended on your dedication to this project!

In early July, we had just "turned on" candidate time limits, and I hadn't had much time to evaluate the impact on the project by the time of the last status update. Over the last few months, I've monitored the results coming in to see how many "expired" tasks are being returned. In the first week or so of time limits, some nodes were returning lots of expired tasks, but that seems to have become less of a problem as people have gotten the hang of downloading manageable chunks of tasks to work on. Currently, less than 1% of all returned results are expired. And remember, even if a task is expired, it is still saved, but it does not count toward a node's statistics.

Along these lines, I'd like to know how you all feel about the time limits now that we've had them for a while. Is 31 days working OK? Are there problems that the time limits are causing you? I've had a smattering of feedback, but no landslide of protest or yippees. Let me know if we need to change something.

I still have some of the old targets queued up and ready to be deployed with new candidates, but the coverage statistics show that we've still got a while to go before we have really complete coverage of the targets deployed right now. In general, I'd like to keep a couple of targets active until we've pretty well exhausted them, and then move on (or back) to other targets.

With regard to client updates, you may have noticed the new forum area to discuss which features we'll put into the client update slated for 2007. This is the first step toward making sure that we get the features that are most important to the biggest segment of our users.

Finally, preparation work for the project review article continues, but it's going a little slower than I hoped and expected. I'm still committed to finishing the article, though, and I'll let you know how it's coming.

Upcoming
========
On the server side of the project, I have a few things lined up for the not-too-distant future. One is a password utility that will allow you to reset your project password on your own. As I get ready to roll out these items, I may be asking for a few volunteers to try them out before they go live.

In the next couple of months, I'll continue to be asking for your input on features for the next release of the client, scheduled for 2007. The more input you provide, and the more discussion we can generate, the better the final product will suit your needs. Please plan to visit the requirements forum often!

I'll continue to monitor the coverage statistics and switch out targets when appropriate. Something we may want to consider is that the switchouts represent natural breakpoints in the project. We could take statistics snapshots at these times and save them as we did with the Hall of Fame, but without the overall statistics reset. At some point we could possibly have two sets of running stats: all-time and current. This way, we could have "seasons" in which to do our crunching. Let me know your thoughts.
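
As a rough picture of how two sets of running stats might work, here is a minimal sketch in Python. Nothing here reflects an existing implementation: the class NodeStats, its fields, and the season names are all hypothetical, and it assumes a season snapshot resets only the current counter.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class NodeStats:
    """Hypothetical per-node counters: an all-time total plus a per-season total."""
    all_time: int = 0
    current: int = 0
    seasons: Dict[str, int] = field(default_factory=dict)

    def credit(self, results: int) -> None:
        """Add credited results to both running totals."""
        if results < 0:
            raise ValueError("results must be non-negative")
        self.all_time += results
        self.current += results

    def close_season(self, name: str) -> None:
        """Snapshot the current total under a season name, then reset it.

        The all-time total is deliberately left untouched.
        """
        if name in self.seasons:
            raise ValueError(f"season {name!r} already closed")
        self.seasons[name] = self.current
        self.current = 0


if __name__ == "__main__":
    stats = NodeStats()
    stats.credit(120)
    stats.close_season("season-1")   # a target switchout marks the breakpoint
    stats.credit(30)
    print(stats.all_time, stats.current, stats.seasons)
```

In this sketch, a switchout closes a season without touching the all-time figure, which is the difference from the Phase I reset.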

As always, your questions and comments are welcomed.

Thanks!
Adam

November Project Status
Adam Hughes

November 05, 2006 10:34 AM
--------------------------------------------------------------------------------
Just a quick project status update as we head into the last couple months of the year. Nothing major here, but I wanted to make sure we're all up to date.

October 2006
============
The big activity for October, aside from your continued production in crunching the candidates, was the beginning of requirements gathering for the updated client, scheduled to be released sometime in 2007. You've provided lots of great ideas, and I'll start work on estimating these tasks soon.

I also began the process of updating our server-side programs to run on the latest version of JBoss. Some of the internal JBoss servers have already been upgraded, and I'll be working on the others in the next month or so.

Upcoming
========
I'd like to keep the requirements-gathering forums active for another month or so, and then figure out which are the most important features, and which are most feasible for implementation next year. Make sure you get any "must-haves" submitted pretty soon.

JBoss server upgrades will continue, hopefully further stabilizing our environment.

As always, your questions and comments are welcomed.

Thanks!
Adam