IBM Life Sciences Framework

Pharmaceutical companies are facing the challenge of improving the productivity of the drug discovery and clinical trials process, creating and sharing knowledge across the silos of that process, and integrating applications and data in enterprise-wide development efforts. Biotech, research, and medical organizations face similar challenges of collaboration and sharing of applications and data.

To help address these challenges, the IBM Life Sciences Framework uses industry-standard technologies (J2EE, XML, Web services, etc.) and protocols and data representations from standards efforts such as the I3C (Interoperable Informatics Infrastructure Consortium), OMG-LSR (Object Management Group-Life Sciences Research), HL7 (Health Level 7), and the Bio* projects. The framework addresses the integration of applications, data, and user interfaces.

The Convergence of Life Sciences and Information Technology
The application of information technology to the life sciences is essential to the progress of medical research, the drug discovery process, and the realization of improved health care. Remarkable innovation is occurring in this area. In "Creating a Bioinformatics Nation," Lincoln Stein compares bioinformatics today to the old city-states of Italy. A fundamental challenge is determining how to put this innovation to productive and timely use while letting it continue to proceed in parallel. The results of the individual "city-states" (individuals, departments, companies, products) can be amplified through the use of common interfaces. From the IT perspective, this is a problem of coarse-grained, loosely coupled integration of applications and data, and the use of broad industry and community standards will enable it.

This article provides a simple example of accessing a Web service called XEMBL. XEMBL is a means of accessing the EMBL nucleotide sequence database, which is publicly available and kept at the European Bioinformatics Institute (EBI). In a future article we'll look at using a UDDI registry to allow service providers and requesters to share services in this sample application.

XML Vocabularies in the Life Sciences
XML as a standard data exchange format provides a great deal of power to support program-to-program interaction. It's also the obvious choice for several reasons. First, XML allows for the creation of tags to define the semantics of a particular XML vocabulary, so we can build vocabularies intended to be useful in specific domains such as math, chemistry, and genomics.

Second, the creation of tag sets in XML is relatively simple and fast. Because the structure of XML is standard, the data is self-describing and can be interpreted and parsed by machines.

Third, tags delimit the content and syntax to allow us to build data structures of arbitrary size and complexity, and the data is delivered in text format. Text is very good for exchanging information across diverse platforms, since virtually every system can handle it.

Finally, XML has broad industry support: all the major vendors are accepting it as a standard for exchanging data. Many vocabularies are emerging across the drug discovery domain, infrastructure support is available to translate among them, and ontologies are being developed to allow for machine-to-machine interaction in this space.

Figure 1 shows some of the XML vocabularies in the life sciences.
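
To make the "self-describing" point concrete, here is a minimal Java sketch that outlines an XML document using nothing but the tags it finds. The tiny gene vocabulary in the sample is made up for illustration and is not one of the vocabularies in Figure 1; the class and method names are likewise hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Illustrative sketch only: walks any well-formed XML document and reports
// each leaf element as "path = text", without knowing the vocabulary in advance.
class XmlOutlineSketch {

    static List<String> outline(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        List<String> lines = new ArrayList<>();
        walk(doc.getDocumentElement(), "", lines);
        return lines;
    }

    // Depth-first traversal; only elements with no child elements are reported.
    private static void walk(Element element, String parentPath, List<String> lines) {
        String path = parentPath + "/" + element.getTagName();
        boolean hasChildElements = false;
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                hasChildElements = true;
                walk((Element) child, path, lines);
            }
        }
        if (!hasChildElements) {
            lines.add(path + " = " + element.getTextContent().trim());
        }
    }

    public static void main(String[] args) throws Exception {
        // Made-up vocabulary, used only to exercise the traversal.
        String xml = "<gene><name>example-gene</name>"
                + "<sequence length=\"8\">agctagct</sequence></gene>";
        for (String line : outline(xml)) {
            System.out.println(line);
        }
    }
}
```

The same traversal works unchanged on a document in any other vocabulary, which is the property that makes XML attractive for program-to-program exchange.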

XML, Web Services, Etc.
Web services are another set of standards with broad industry support. Some fun sample Web services can be found at XMethods (www.xmethods.org). WebSphere has strong support for Web services in WebSphere Application Server (WAS), in WebSphere Studio Application Developer (WSAD), and across the product line.

Figure 2 depicts something that's probably pretty familiar to you - an n-tier architecture. Clients use Web browsers or Internet-enabled applications to communicate with an application server. WAS is an example of a Tier-1/Tier-2 middleware layer. The Web services support built into WAS and tools such as WSAD help with application integration.

The applications running in the middle tier access databases using SQL queries via JDBC. IBM DiscoveryLink is a convenient way to access distributed heterogeneous data sources. It uses sophisticated query optimization to access those data sources and helps with data integration.

Large multinational pharmaceutical companies want to provide a consistent look and feel for their enterprise-wide applications. Mergers, widely separated development groups, and outsourcing of development projects tend to make this more difficult. A portal, such as WebSphere Portal, helps with integrating the user interface.

One of the organizations working on standards for interoperability in the life sciences is the I3C. IBM is one of its founding members and helped to develop interoperability demonstrations for the 2001 and 2002 BIO conferences. Figure 3 shows the configuration developed for BIO 2002.

The figure shows the client applications interoperating with one another using a common XML vocabulary, called BSML, to represent gene sequences. SOAP is used to invoke Web services on the application server. The Web services use JDBC and SQL to make queries on the data sources. IBM DiscoveryLink allows the application to present a complex query to a federation of distributed heterogeneous databases.

A Simple Example of a Web Service in the Life Sciences
Here you'll see a simple example of invoking a Web service, XEMBL, in the life sciences. XEMBL and the Open Bibliographic Query Service, both hosted at the EBI, are examples of the loosely coupled, services-based (r)evolution that is occurring.

The XEMBL Web service takes two parameters - the accession number for the sequence of interest and the XML format that you want returned. The accession number is a unique identifier for a sequence record. Currently XEMBL supports BSML and AGAVE XML formats for the result. In this example we'll ask the service for the BSML format.
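
As a rough illustration of what such a two-parameter call looks like on the wire, the following Java sketch builds a SOAP 1.1 request envelope with the JDK's DOM and transformer APIs. Only the SOAP envelope namespace is standard here. The service namespace, the operation name getSequence, and the parameter names accession and format are assumptions made for this sketch; the real names come from the XEMBL WSDL and are not given in this article.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Illustrative sketch only: builds a request envelope but does not send it.
class XemblRequestSketch {

    // Standard SOAP 1.1 envelope namespace.
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";
    // Hypothetical service namespace; the real one is defined by the service's WSDL.
    static final String SERVICE_NS = "urn:example:xembl";

    static String buildEnvelope(String accession, String format) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().newDocument();

        Element envelope = doc.createElementNS(SOAP_NS, "soap:Envelope");
        doc.appendChild(envelope);
        Element body = doc.createElementNS(SOAP_NS, "soap:Body");
        envelope.appendChild(body);

        // Hypothetical operation and parameter names.
        Element call = doc.createElementNS(SERVICE_NS, "x:getSequence");
        body.appendChild(call);
        Element accessionParam = doc.createElementNS(SERVICE_NS, "x:accession");
        accessionParam.setTextContent(accession);
        call.appendChild(accessionParam);
        Element formatParam = doc.createElementNS(SERVICE_NS, "x:format");
        formatParam.setTextContent(format);
        call.appendChild(formatParam);

        // Serialize the DOM tree; special characters in values are escaped for us.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // "EXAMPLE01" is a placeholder, not a real accession number.
        System.out.println(buildEnvelope("EXAMPLE01", "Bsml"));
    }
}
```

Building the envelope through DOM rather than string concatenation keeps the request well-formed even if a parameter value contains characters such as & or <. In practice a generated client proxy hides all of this behind an ordinary method call.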

Listing 1 shows the code for a Java client that invokes the XEMBL Web service. We used IBM's Java IDE, WSAD, to develop this sample (see Figure 4). WSAD makes it easy to develop client-side code as well as the full suite of Web services, including services to run on WAS, the WSDL to describe those services, and the use of UDDI to publish and discover those services.

It's possible to develop an application by browsing a UDDI registry and importing services from it. Import the WSDL into your project, then create a skeleton JavaBean and generate a Java client proxy and a sample application from the WSDL document. You can then easily test the code using the integrated debugger.

Listing 2 shows a portion of the response. It's an XML document containing the nucleotide sequence we requested. The "agct"s that you see in the element are the alphabet soup of life. Also in the document are the history and annotations on that sequence. Annotations mark locations of biologically important parts of the sequence.
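
Once the response arrives, pulling the sequence out of it is ordinary XML processing. The Java sketch below parses a small hand-written BSML-like document and counts the bases in it. It assumes the sequence text sits in an element named Seq-data; treat that element name, the sample document, and the class and method names as assumptions for illustration rather than a description of Listing 2.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Illustrative sketch only: extracts a nucleotide sequence from a BSML-like
// document and tallies how often each base occurs.
class BsmlSequenceSketch {

    // Assumed element name for the sequence text.
    static final String SEQUENCE_ELEMENT = "Seq-data";

    static String extractSequence(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList nodes = doc.getElementsByTagName(SEQUENCE_ELEMENT);
        if (nodes.getLength() == 0) {
            throw new IllegalArgumentException("no " + SEQUENCE_ELEMENT + " element found");
        }
        // Remove line breaks and spaces, and normalize case.
        return nodes.item(0).getTextContent()
                .replaceAll("\\s+", "")
                .toLowerCase(Locale.ROOT);
    }

    static Map<Character, Integer> countBases(String sequence) {
        Map<Character, Integer> counts = new TreeMap<>();
        for (char base : sequence.toCharArray()) {
            counts.merge(base, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        // Hand-written sample, not actual XEMBL output.
        String xml = "<Bsml><Definitions><Sequences><Sequence id=\"example\">"
                + "<Seq-data>agct\n  AGCT</Seq-data>"
                + "</Sequence></Sequences></Definitions></Bsml>";
        String sequence = extractSequence(xml);
        System.out.println(sequence);
        System.out.println(countBases(sequence));
    }
}
```

A real client would pass the string returned by the Web service to extractSequence instead of the inline sample.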

Listing 3 shows a simple Perl script that uses the SOAP toolkit SOAP::Lite for Perl. It uses the WSDL file from the EBI site to create the service. Perl IDEs help users include Web services in their Perl code. In these IDEs, a Web services popup wizard can be presented to the user. The user can point to the WSDL for the Web service he or she is interested in (in this case the WSDL is on the EBI site), and the Perl IDE can then assist in selecting the method for that Web service and setting the parameters to it. The script can then be tested quickly in the IDE using the Web service.

Summary
In this article we touched on some of the challenges facing the life sciences industry. For more on this see "The Life Sciences Revolution" article elsewhere in this issue. We've discussed the need for standards in addressing these challenges; this includes technology standards (J2EE, XML, Web services, etc.) as well as domain standards (coming from the I3C, OMG-LSR, HL7, the Bio* projects, etc.). Finally, we looked at some code to access a Web service at EBI.

References

  • Stein, L. (2002). "Creating a bioinformatics nation." Nature. May.
    www.nature.com/cgi-taf/DynaPage.taf?file=/nature/journal/v417/n6885/full/417119a_fs.html&content_filetype=PDF
  • XEMBL:
    www.ebi.ac.uk/xembl
  • Open Bibliographic Query Service:
    http://industry.ebi.ac.uk/openBQS
  • SOAP::Lite for Perl:
    www.soaplite.com
  • developerWorks:
    www.ibm.com/developerworks/webservices
  • I3C:
    www.i3c.org
  • Pacholski, P. (2002). "Web Services Programming with WebSphere Studio Application Developer." April.
    www7b.software.ibm.com/wsdd/library/techarticles/0204_pacholski/pacholski.html
  • XMethods:
    www.xmethods.org

    IBM Products and Solutions

  • WebSphere Application Server:
    www.ibm.com/software/webservers/appserv
  • WebSphere Studio Application Developer:
    www.ibm.com/software/ad/studioappdev
  • WebSphere Portal:
    www.ibm.com/software/webservers/portal
  • IBM Life Sciences Framework:
    www.ibm.com/solutions/lifesciences/framework/index.html
  • IBM DiscoveryLink:
    www.ibm.com/solutions/lsframework

    For More Information

  • IBM Life Sciences:
    www.ibm.com/solutions/lifesciences
  • IBM Life Sciences Framework Presentation:
    www.ibm.com/solutions/lifesciences/pdf/FW_Sales_presentation_V2.pdf
  • Informatics: A Key to Success in the Life Sciences Industry:
    www.ibm.com/solutions/lifesciences/pdf/Informatics.PDF

About Michael Niemi
Mike Niemi is a Senior Software Engineer in the IBM Life Sciences organization. He has held various positions in software and hardware development over the past 24 years (not counting the 3 years he spent sailing in the Caribbean). This includes development in Voice Systems, WebSphere Application Server and SiteAnalyzer, TCP/IP while in the Research Division, and mainframes.
