Click here to close now.




















Welcome!

IBM Cloud Authors: Carmen Gonzalez, Elizabeth White, XebiaLabs Blog, Liz McMillan, Pat Romanski

Blog Feed Post

Benchmarking RRE and SAS

by Thomas Dinsmore Regular readers of this blog may be familiar with our ongoing effort to benchmark Revolution R Enterprise (RRE) across a range of use cases and on different platforms.  We take these benchmarks seriously at Revolution Analytics, and constantly seek to improve the performance of our software.  Previously, we shared results from a performance test conducted by Allstate.   In that test, RRE ran a GLM analysis in five minutes; SAS took five hours to complete the same task.  A reader objected that the test was unfair because SAS ran on a single machine, while RRE ran on a five node cluster.  It's a fair point, except that given the software in question (PROC GLM in SAS/STAT) the performance would be the same on five nodes or a million nodes, since PROC GLM can scale up but not out. Arguing that the Allstate benchmark was "apples to oranges", SAS responded by publishing its own apples to orange benchmark.  In this benchmark, SAS demonstrated that its new HPGENSELECT procedure is very fast when it runs on a 144 node grid with 2,304 cores.  As noted in the paper, this performance is only possible if you license more software, since HPGENSELECT can only run in Distributed mode if the customer licenses SAS High Performance Statistics. We will be happy to stipulate that PROC HPGENSELECT runs faster on 2,304 cores than RRE on 20 cores. As a matter of best practices, software benchmarks should run in comparable hardware environments, so that we can attribute performance differences to the software alone and not to differences in available computing resources.   Consequently, we engaged an outside vendor with experience running SAS in clustered environments to perform an "apples to apples" benchmark of RRE vs. SAS.  The consultant used a clustered computing environment consisting of five four-core commodity servers (with 16G RAM each) running CentOS, Ethernet connections and a separate NFS Server. We tested RRE 7 versus SAS Release 9.4, with Base SAS, SAS/STAT and SAS Grid Manager.  (We did not test with SAS High Performance Statistics because we could find no vendors with experience using this new software.  We note that more than two years into General Availability, SAS appears to have no public reference customers for this software.)  In our experience, when customers ask how we perform compared to SAS, they are most interested in how we compare with the SAS software they already use. To test Revolution R Enterprise ScaleR, we first deployed IBM Platform LSF and Platform MPI Release 9 on the grid, then installed Revolution R Enterprise Release 7 on each node.  SAS Grid Manager uses an OEM version of IBM Platform LSF that cannot run concurrently with the standard version from IBM, so we configured the environment and ran the tests sequentially.   To simplify test replication across different environments, we used data manufactured through a random process.  The time needed to manufacture the data is not included in the benchmark results.  Prior to running the actual tests, we loaded the randomized data into each software product’s native file system: for SAS, a SAS Data Set; for Revolution R Enterprise, an XDF file. Although we have benchmarked Revolution R Enterprise on data sets as large as a billion rows, typical data sets used by even the largest enterprises tend to be much smaller.  We chose to perform the tests on wide files of 591 columns and row counts ranging from 100,000 to 5,000,000, file sizes that represent what we consider to be typical for many analysts.  We also ran scoring tests on “narrow” files of 21 columns with row counts ranging up to 50,000,000. Rather than comparing performance on a single task, we prepared a list of multiple tasks, then wrote programs in SAS and RRE to implement the test.  Readers will find the benchmarking scripts here, on Git together with a script to produce the manufactured data. To implement a fair test, we asked the SAS consultant to review the SAS programs and enable them for best performance in the clustered computing environment. Detailed results of the benchmark test are shown here, in our published white paper RRE ran the tasks forty-two times faster than SAS on the larger data set RRE outperformed SAS on every task The RRE performance advantage ranged from 10X to 300X The RRE advantage increased when we tested on larger data sets SAS’ new HP PROC, where available, only marginally improved SAS performance We invite readers to use the scripts in your own environment; let us know the results you achieve. Revolution Analytics Whitepapers: Revolution R Enterprise: Faster Than SAS 

Read the original blog entry...

More Stories By David Smith

David Smith is Vice President of Marketing and Community at Revolution Analytics. He has a long history with the R and statistics communities. After graduating with a degree in Statistics from the University of Adelaide, South Australia, he spent four years researching statistical methodology at Lancaster University in the United Kingdom, where he also developed a number of packages for the S-PLUS statistical modeling environment. He continued his association with S-PLUS at Insightful (now TIBCO Spotfire) overseeing the product management of S-PLUS and other statistical and data mining products.<

David smith is the co-author (with Bill Venables) of the popular tutorial manual, An Introduction to R, and one of the originating developers of the ESS: Emacs Speaks Statistics project. Today, he leads marketing for REvolution R, supports R communities worldwide, and is responsible for the Revolutions blog. Prior to joining Revolution Analytics, he served as vice president of product management at Zynchros, Inc. Follow him on twitter at @RevoDavid

@ThingsExpo Stories
Akana has announced the availability of the new Akana Healthcare Solution. The API-driven solution helps healthcare organizations accelerate their transition to being secure, digitally interoperable businesses. It leverages the Health Level Seven International Fast Healthcare Interoperability Resources (HL7 FHIR) standard to enable broader business use of medical data. Akana developed the Healthcare Solution in response to healthcare businesses that want to increase electronic, multi-device access to health records while reducing operating costs and complying with government regulations.
Containers are not new, but renewed commitments to performance, flexibility, and agility have propelled them to the top of the agenda today. By working without the need for virtualization and its overhead, containers are seen as the perfect way to deploy apps and services across multiple clouds. Containers can handle anything from file types to operating systems and services, including microservices. What are microservices? Unlike what the name implies, microservices are not necessarily small, but are focused on specific tasks. The ability for developers to deploy multiple containers – thous...
The Internet of Things is in the early stages of mainstream deployment but it promises to unlock value and rapidly transform how organizations manage, operationalize, and monetize their assets. IoT is a complex structure of hardware, sensors, applications, analytics and devices that need to be able to communicate geographically and across all functions. Once the data is collected from numerous endpoints, the challenge then becomes converting it into actionable insight.
WebRTC services have already permeated corporate communications in the form of videoconferencing solutions. However, WebRTC has the potential of going beyond and catalyzing a new class of services providing more than calls with capabilities such as mass-scale real-time media broadcasting, enriched and augmented video, person-to-machine and machine-to-machine communications. In his session at @ThingsExpo, Luis Lopez, CEO of Kurento, will introduce the technologies required for implementing these ideas and some early experiments performed in the Kurento open source software community in areas ...
With major technology companies and startups seriously embracing IoT strategies, now is the perfect time to attend @ThingsExpo in Silicon Valley. Learn what is going on, contribute to the discussions, and ensure that your enterprise is as "IoT-Ready" as it can be! Internet of @ThingsExpo, taking place Nov 3-5, 2015, at the Santa Clara Convention Center in Santa Clara, CA, is co-located with 17th Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world. The Internet of Things (IoT) is the most profound change in personal an...
WebRTC has had a real tough three or four years, and so have those working with it. Only a few short years ago, the development world were excited about WebRTC and proclaiming how awesome it was. You might have played with the technology a couple of years ago, only to find the extra infrastructure requirements were painful to implement and poorly documented. This probably left a bitter taste in your mouth, especially when things went wrong.
The 3rd International WebRTC Summit, to be held Nov. 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA, announces that its Call for Papers is now open. Topics include all aspects of improving IT delivery by eliminating waste through automated business models leveraging cloud technologies. WebRTC Summit is co-located with 15th International Cloud Expo, 6th International Big Data Expo, 3rd International DevOps Summit and 2nd Internet of @ThingsExpo. WebRTC (Web-based Real-Time Communication) is an open source project supported by Google, Mozilla and Opera that aims to enable bro...
Through WebRTC, audio and video communications are being embedded more easily than ever into applications, helping carriers, enterprises and independent software vendors deliver greater functionality to their end users. With today’s business world increasingly focused on outcomes, users’ growing calls for ease of use, and businesses craving smarter, tighter integration, what’s the next step in delivering a richer, more immersive experience? That richer, more fully integrated experience comes about through a Communications Platform as a Service which allows for messaging, screen sharing, video...
The 17th International Cloud Expo has announced that its Call for Papers is open. 17th International Cloud Expo, to be held November 3-5, 2015, at the Santa Clara Convention Center in Santa Clara, CA, brings together Cloud Computing, APM, APIs, Microservices, Security, Big Data, Internet of Things, DevOps and WebRTC to one location. With cloud computing driving a higher percentage of enterprise IT budgets every year, it becomes increasingly important to plant your flag in this fast-expanding business opportunity. Submit your speaking proposal today!
SYS-CON Events announced today that the "Second Containers & Microservices Expo" will take place November 3-5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. Containers and microservices have become topics of intense interest throughout the cloud developer and enterprise IT communities.
In his session at @ThingsExpo, Lee Williams, a producer of the first smartphones and tablets, will talk about how he is now applying his experience in mobile technology to the design and development of the next generation of Environmental and Sustainability Services at ETwater. He will explain how M2M controllers work through wirelessly connected remote controls; and specifically delve into a retrofit option that reverse-engineers control codes of existing conventional controller systems so they don't have to be replaced and are instantly converted to become smart, connected devices.
SYS-CON Events announced today that IceWarp will exhibit at the 17th International Cloud Expo®, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. IceWarp, the leader of cloud and on-premise messaging, delivers secured email, chat, documents, conferencing and collaboration to today's mobile workforce, all in one unified interface
SYS-CON Events announced today that Micron Technology, Inc., a global leader in advanced semiconductor systems, will exhibit at the 17th International Cloud Expo®, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. Micron’s broad portfolio of high-performance memory technologies – including DRAM, NAND and NOR Flash – is the basis for solid state drives, modules, multichip packages and other system solutions. Backed by more than 35 years of technology leadership, Micron's memory solutions enable the world's most innovative computing, consumer,...
SYS-CON Events announced today the Containers & Microservices Bootcamp, being held November 3-4, 2015, in conjunction with 17th Cloud Expo, @ThingsExpo, and @DevOpsSummit at the Santa Clara Convention Center in Santa Clara, CA. This is your chance to get started with the latest technology in the industry. Combined with real-world scenarios and use cases, the Containers and Microservices Bootcamp, led by Janakiram MSV, a Microsoft Regional Director, will include presentations as well as hands-on demos and comprehensive walkthroughs.
SYS-CON Events announced today that Pythian, a global IT services company specializing in helping companies leverage disruptive technologies to optimize revenue-generating systems, has been named “Bronze Sponsor” of SYS-CON's 17th Cloud Expo, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. Founded in 1997, Pythian is a global IT services company that helps companies compete by adopting disruptive technologies such as cloud, Big Data, advanced analytics, and DevOps to advance innovation and increase agility. Specializing in designing, imple...
17th Cloud Expo, taking place Nov 3-5, 2015, at the Santa Clara Convention Center in Santa Clara, CA, will feature technical sessions from a rock star conference faculty and the leading industry players in the world. Cloud computing is now being embraced by a majority of enterprises of all sizes. Yesterday's debate about public vs. private has transformed into the reality of hybrid cloud: a recent survey shows that 74% of enterprises have a hybrid cloud strategy. Meanwhile, 94% of enterprises are using some form of XaaS – software, platform, and infrastructure as a service.
With the Apple Watch making its way onto wrists all over the world, it’s only a matter of time before it becomes a staple in the workplace. In fact, Forrester reported that 68 percent of technology and business decision-makers characterize wearables as a top priority for 2015. Recognizing their business value early on, FinancialForce.com was the first to bring ERP to wearables, helping streamline communication across front and back office functions. In his session at @ThingsExpo, Kevin Roberts, GM of Platform at FinancialForce.com, will discuss the value of business applications on wearable ...
Consumer IoT applications provide data about the user that just doesn’t exist in traditional PC or mobile web applications. This rich data, or “context,” enables the highly personalized consumer experiences that characterize many consumer IoT apps. This same data is also providing brands with unprecedented insight into how their connected products are being used, while, at the same time, powering highly targeted engagement and marketing opportunities. In his session at @ThingsExpo, Nathan Treloar, President and COO of Bebaio, will explore examples of brands transforming their businesses by t...
SYS-CON Events announced today that HPM Networks will exhibit at the 17th International Cloud Expo®, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. For 20 years, HPM Networks has been integrating technology solutions that solve complex business challenges. HPM Networks has designed solutions for both SMB and enterprise customers throughout the San Francisco Bay Area.
While many app developers are comfortable building apps for the smartphone, there is a whole new world out there. In his session at @ThingsExpo, Narayan Sainaney, Co-founder and CTO of Mojio, will discuss how the business case for connected car apps is growing and, with open platform companies having already done the heavy lifting, there really is no barrier to entry.